Your curated collection of saved posts and media
@XFreeze @elonmusk https://t.co/p07G80NEXj
@elonmusk @imPenny2x @neuralink Teslabot the ultimate neuralink accessory...
NEW paper worth reading. (bookmark it) Autonomous research systems usually prove themselves on cherry-picked wins, human-framed topics, or a handful of preset tasks. FARS runs the full loop at scale instead. Stage-specific agents handle ideation, planning, experimentation, and writing over a shared workspace that records proposals, code, logs, results, and manuscripts. Its first public deployment produced 166 complete papers across 67 fine-grained AI/ML topics, and it kept the failures in the corpus rather than curating a highlight reel. Why it matters. 282 volunteer reviews over 140 papers give an honest read. FARS can produce review-worthy artifacts, while the same reviews expose recurring failure modes in narrow scope, methodology, and integrity. Paper: https://t.co/f6pMG0hYAA Learn to build effective AI agents in our academy: https://t.co/LRnpZN7L4c
Great paper on managing agent skills. Skill libraries keep growing, and picking the right skills has become a bottleneck for coding agents. The defaults are to expose the agent to the whole skill collection, or retrieve skills with embeddings and rerankers. Both treat the choice as independent picks. SkillComposer treats composition as one joint decision over which skills, how many, and in what order. A constrained autoregressive decoder over skill identifiers produces the full plan in a single pass, so dependencies between successive skills fall out naturally. On SkillsBench with GPT-5.2-Codex and Gemini-3-Pro-Preview, it lifts pass rate by +23.1 and +18.2pp over no-skill, beats top-3 retrieval, and matches the gold-skill upper bound at lower prompt-token cost. Paper: https://t.co/ovbQf07Mmk Learn to build effective AI agents in our academy: https://t.co/1e8RZKs4uX
We have an internal Slack bot showing leaderboards for ARC Prize 2026. Lots of new leaderboard positions after the 1st place templates were open sourced yesterday. https://t.co/AlQXF2wfu6
I've watched every AI wave since the first chatbots. @clickup's Brain² is the first one that does not make me pick a model or start over each session. It already knows my work. https://t.co/IIFtRreGqB
New block in Notion: HTML. Build interactive HTML right on your Notion page. Ask AI to turn your content into interactive explainers, prototypes, or diagrams. Share with your team to use and tinker together. https://t.co/y6ojkjdZkv
All of this stuff feels like we are taking steps back. I just don't see it the same way these folks are seeing it. https://t.co/XXVXQpozSZ
@dpetrou https://t.co/65yydWI5Ka
https://t.co/DqV6cQUD9f
Lol. With my agent setup (using loops and automations), I already hit limits with Opus 4.8 on my Max plan. Fable 5 is essentially unusable, not to mention that it's nerfed. This is the most confusing AI launch of all time. https://t.co/6XHp7CF6VS
Not really excited about this nerfed and limited Fable 5. One of the most confusing AI launches of all time. But we carry on. https://t.co/pYkAruxV8L
Fable 5 is back. https://t.co/9RTGUCcPHy
I'm going to try the new @NVIDIAAI Nemotron-3-Nano-30B-A3B and compare it to Qwen 3.6 35B in agentic workflows. https://t.co/z9cnRBOo1c
I am working on building a vector database of all the top podcasts in the world. I think there's a ton of alpha in there, and you can query it through a natural language interface. Currently in beta, head over to Rallies chat to try it. If you have a favorite podcast, you can comment it and I'll add it to our database.
I'm 17, and with my cofounder (@NgenfueGlynn), we just killed traditional BI tools. We built Seleci because founders waste hours every week opening 8 different tools just to know if their business is healthy. Seleci connects Stripe, Google Analytics, HubSpot, Intercom, PostHog, QuickBooks, Mailchimp, and Sentry into one intelligence layer. Four specialized AI agents, Revenue, Growth, Churn, and Product, analyze your live data and deliver a morning briefing so you wake up knowing exactly where your business stands.
- Daily morning briefings
- Smart anomaly alerts (before things go wrong)
- Automated reports, no manual work
- One dashboard replacing a dozen tools
- Actionable insights, not just charts
Try Seleci at https://t.co/RiURyaCcEQ. Comment "Seleci" and I'll give you a free 14-day Seleci Pro key.
Your AI coding agents are blind. They can't see organic cursor drifts, easing curves, multi-agent coordination, or how a beautiful canvas actually feels alive. I built AnimSpec (https://t.co/XqAfPqwf4j) to fix that. Watch me take a screen recording of the @MagicPathAI multi-agent workspace, turn it into a perfect prompt, and have both MagicPath and @claudeai Design rebuild the exact same organic animations.
Even I don't understand what this means. Bus stop advertising in San Francisco. https://t.co/EoLsXU5nJ2
Check it out on @huggingface https://t.co/bHsHo4fd5g
Plain and simple. https://t.co/w4IxRThmoW
> Be Jonny Kim > Born to South Korean immigrants in Los Angeles > Grows up in an intensely abusive household, constantly full of fear > The night before he graduates high school, his father threatens the family with a gun > Police arrive, a shootout happens, and his father is killed > Decides he wants to protect people so he enlists in the Navy at 18 > Survives Hell Week and becomes a Navy SEAL > Deploys to Iraq twice as a combat medic, sniper, and point man > Completes over 100 combat operations under fire > Earns a Silver Star and a Bronze Star for saving wounded comrades > Watches his close friends die in battle and realizes he wants to heal people, not just fight > Leaves active duty to get a degree in Mathematics from USD > Auditions for medical school and gets accepted into Harvard > Graduates from Harvard Medical School as an M.D. in 2016 > Starts his residency in emergency medicine at Massachusetts General Hospital > Gets bored of being a regular doctor and applies to NASA > Selected as 1 of only 12 candidates out of 18,300 applicants > Becomes a NASA Astronaut in 2020 > Decides space isn't enough, so he joins Navy flight school to face his fear of flying > Earns his wings as a fully certified military pilot and naval flight surgeon > Launches into space on a rocket to the International Space Station > Logs 245 days in orbit, traveling 104 million miles around the Earth before returning home > Returns to Earth as a SEAL, a Harvard Doctor, an Aviator, and an Astronaut at just 41 years old And Jonny Kim is still the most humble guy on the planet who makes everyone else's resume look blank. Jonny Kim is badass.
KFC has rebuilt its entire identity around the bucket, and JKR is calling it the 'Bucketverse'. A new 3D wordmark, custom type (Kentucky Fried Serif and Kentucky Fried Sans) and a redrawn Colonel, rolling out across 150+ countries: https://t.co/c0MCQA2m5k https://t.co/PqLEr6k7zZ

We took a 30B model and split it in two to write tokens in parallel instead of one at a time. Introducing Nemotron-Labs-TwoTower: a diffusion language model from NVIDIA Research adapted from Nemotron-3-Nano-30B-A3B. Here's how it works: one half holds the context, the other writes the tokens, with both reusing the pretrained model instead of training a new one from scratch. We found it kept 98.7% of the original model's quality at 2.42× faster generation.
📣 I'll be in Seoul next week to present one main conference paper and four workshop papers at ICML! I'll also be on a panel at the https://t.co/D3wwI18H7o alignment workshop! Reach out if you are around and want to chat about uncertainty, reliability, or AI evals! Details ⬇️
Paper 1: Towards a Science of AI Agent Reliability
  Main conference: Thursday (July 9) • 14:30–16:15 in Hall A • Poster #3408
  Workshop on Failure Modes in Agentic AI (FAGEN): Friday (July 10) • 10:10–11:00 and 14:40–15:30 in Grand Ballroom 104-105
  https://t.co/HAKHzASrOZ
  🧵 https://t.co/uQCpPIiXSJ
Paper 2: Log Analysis is Necessary for Credible Evaluation of AI Agents
  Workshop on Failure Modes in Agentic AI (FAGEN): Friday (July 10) • 10:10–11:00 and 14:40–15:30 in Grand Ballroom 104-105
  https://t.co/2xKsB4oMaU
  🧵 https://t.co/StcdxiRuXi
Paper 3: Open-World Evaluations for Measuring Frontier AI Capabilities
  Workshop on Agents in the Wild (AIWILD): Saturday (July 11) • 11:10–12:00 and 16:10–17:00 in Hall B2
  https://t.co/nq9iJtBGLs
  🧵 https://t.co/tTblfaNqld
Paper 4: Life After Benchmark Saturation: A Case Study of CORE-Bench
  Workshop on Agents in the Wild (AIWILD): Saturday (July 11) • 11:10–12:00 and 16:10–17:00 in Hall B2
  https://t.co/NtEyYrSlF9
  🧵 https://t.co/w7Pphsd6ko
🗣️ Panel on the AI capability–reliability gap
  https://t.co/D3wwI18H7o Seoul Alignment Workshop: Monday (July 6)
  https://t.co/iBxqhTQmVf
Also, my advisor @random_walker is going to deliver a keynote on Thursday (July 9) at 13:30 in Hall C: https://t.co/qAO4ZjhZxX. Don't miss it!

📢 1) We have a few papers that advance the state of the art of AI agent evaluation. Details and links in Stephan's post. 2) AI agent evaluation has quickly become a distinct discipline. We're working on a paper titled "Emerging trends in AI agent evaluation" that extracts best practices for this community. 3) I'm giving an invited talk at ICML, addressing anxiety about supposedly imminent Recursive Self Improvement and the question of what will remain for humans to work on (especially scientists, researchers, software engineers). I hope to make it provocative but cautiously optimistic. https://t.co/rYHlxPGEXY (I also plan to share the ideas from the talk as essays on the AI as Normal Technology newsletter.)
Launch FLARE-AI: Flaw & Incident Reporting for AI. Reporting AI flaws is broken: forms are scattered, non-standardized, and reports get siloed instead of reaching everyone who needs them (think universal jailbreaks). We built one open-source place to fix that: create your report + route. Open source + free. 🧵/

PyTorch Foundation is a Gold Sponsor of Agentic AI Summit 2026. Matt White, CTO of PyTorch Foundation, will lead “The Open Agentic Stack,” on building AI systems with open source, open standards, and composability. https://t.co/r5XsdDBGar https://t.co/45C2zpWHyB

@DennisHerzog18 @Teach2Breach Robot company @proceptionAI from my robot. :-) https://t.co/GMGBfwPLhF

Love the Agentic MapReduce approach. First learned about it in DocETL and it works well across many kinds of tasks even beyond security. Check out this git repo for more details https://t.co/CGokMbJmAB
Introducing Devin Security Swarm. A more cost-effective and accurate way to find security vulnerabilities in complex codebases, based on a new architecture: Agentic MapReduce.
lots of new stuff cooking in the DocETL project -- we are building an AI-SQL interface; friendly for agents like claude code / codex to use as a tool; starting to work with open source LLMs...star and stay tuned!! https://t.co/aBrSVjZs8y
I've joined @OpenAI to work on Codex. @ajambrosino and team have built a very good app! It's the first coding agent GUI that got me out of the terminal. Excited to help make it even better, especially as it goes beyond software engineers. Also delighted to get to work with old friends @gpeal8 @tarstarr again.
Build and test web apps right in the browser with @code! Pages can now request camera, location, and device access with per-site approval prompts. The Agents window (Preview) also introduces guided onboarding tours to help you get started faster. https://t.co/9t6cY7VOH3