Adaptive retrieval is the way to go, and the RouteRAG paper shows why. Let's talk about it:

RAG systems have a retrieval problem. The default approach to multi-hop reasoning relies on fixed retrieval pipelines: fetch text, maybe graph data, and hope everything comes back in one shot. But complex questions and real-world tasks require adaptive retrieval. Sometimes you need text. Sometimes you need relational structure from a graph. Sometimes both. And it's no secret that graph retrieval is expensive, so retrieving it unnecessarily wastes compute.

This new research introduces RouteRAG, an RL-based framework that teaches LLMs to make adaptive retrieval decisions during reasoning: when to retrieve, what source to retrieve from, and when to stop. The model learns a unified generation policy through two-stage training.

> Stage 1 optimizes for answer correctness, establishing reasoning capability.
> Stage 2 adds an efficiency reward that discourages unnecessary retrieval, teaching the model to balance accuracy against computational cost.

The action space includes three retrieval modes: passage-only, graph-only, or hybrid. The model selects dynamically based on the evolving needs of the query. Text retrieval works well for simple questions; graph retrieval shines for multi-hop reasoning. The policy learns when each is appropriate.

Results across five QA benchmarks: RouteRAG-7B achieves 60.6 average F1, outperforming Search-R1 (56.8 F1) despite being trained on only 10k examples versus 170k. On multi-hop datasets like 2Wiki, it reaches 64.6 F1 compared to 58.9 for Search-R1.

The efficiency gains are also substantial. RouteRAG-7B reduces average retrieval turns by 20% compared to training without the efficiency reward, while actually improving accuracy by 1.1 F1 points. We get the best of both worlds: fewer retrieval calls and better answers.

And here is something exciting: small models approach large-model performance. RouteRAG with Qwen2.5-3B surpasses several graph-based RAG systems built on GPT-4o-mini, suggesting that improving the retrieval policy can be as impactful as scaling the backbone.

Teaching models when and what to retrieve through RL yields more efficient and accurate multi-hop reasoning than scaling training data or model size alone.

Paper: https://t.co/a4J6oAX0GC

Learn to build RAG and effective AI Agents in our academy: https://t.co/zQXQt0PMbG
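For intuition, here is a minimal Python sketch of the decision loop the post describes. The action names, the retriever stubs, and the shape of the efficiency penalty are illustrative assumptions, not RouteRAG's actual interface:

```python
# Minimal sketch of an adaptive-retrieval episode, assuming a policy callable
# that emits (action, text). All names here are illustrative, not the paper's API.
from enum import Enum

class Action(Enum):
    PASSAGE = "passage"  # text-only retrieval: cheap, good for simple hops
    GRAPH = "graph"      # graph-only retrieval: expensive, good for relations
    HYBRID = "hybrid"    # retrieve from both sources
    ANSWER = "answer"    # stop retrieving and emit the final answer

def retrieve_passages(query):
    return [f"passage about: {query}"]   # stub for a dense/BM25 retriever

def retrieve_subgraph(query):
    return [f"triples about: {query}"]   # stub for a knowledge-graph retriever

def run_episode(question, policy, max_turns=8):
    """Roll out the unified generation policy: each step emits either a
    retrieval action plus a sub-query, or the final answer."""
    context, turns = [], 0
    while turns < max_turns:
        action, text = policy(question, context)
        if action is Action.ANSWER:
            return text, turns
        if action in (Action.PASSAGE, Action.HYBRID):
            context += retrieve_passages(text)
        if action in (Action.GRAPH, Action.HYBRID):
            context += retrieve_subgraph(text)
        turns += 1
    return "best-effort answer", turns

def stage2_reward(f1, turns, lam=0.05):
    """Stage 1 would optimize F1 alone; Stage 2 subtracts a per-turn cost so
    the policy learns to skip retrieval it does not need (lam is a guess)."""
    return f1 - lam * turns
```

An RL trainer would score rollouts of run_episode with something like stage2_reward and update the policy (the paper's exact algorithm and penalty are in the link above); the key point is that the retrieval mode is an action the model chooses each turn, not a fixed pipeline.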

OpenSkills v1.3.0 is out. The Universal Skills loader for AI Coding Agents. Now you can:
• Use symlinks with your skills
• Install skills from local paths & private git repos
• Sync to any .md file (--output flag)
• Run fully headless in CI/CD (--yes)
npm i -g openskills
https://t.co/Rt0le0Akxy
Jamie Dimon says soft skills like emotional intelligence and communication are vital as AI eliminates roles https://t.co/XBm2hITjVp @jpmorgan @fortunemagazine
At this small buyout firm, talking about AI for cost-cutting is off-limits https://t.co/tOUS8W7n7y @nicollsanddimes @businessinsider
After 3 tech layoffs, I knew I had to lean into being a founder https://t.co/Bp0QhYDrKu @TimSParadis @businessinsider
DeepCode: Open Agentic Coding

AI coding agents still can't reliably turn research papers into working code. The best LLM agents achieve only 42% replication scores on scientific papers, while human PhD experts hit 72%. But the problem isn't model capability. This new paper suggests it might have to do with information management.

It introduces DeepCode, an open agentic coding framework that treats repository synthesis as a channel optimization problem: maximizing task-relevant signal under finite context budgets.

How does this work? Scientific papers are high-entropy specifications with scattered multimodal constraints, equations, pseudocode, and hyperparameters. Naive approaches that concatenate raw documents with a growing code history cause channel saturation, where redundant tokens mask critical algorithmic details and the signal-to-noise ratio collapses.

DeepCode addresses this through four orchestrated information operations:

1) Source compression via blueprint distillation: a planning agent transforms unstructured papers into structured implementation blueprints with file hierarchies, component specifications, and verification protocols.

2) Structured indexing using stateful code memory: as files are generated, the system maintains compact memory entries of the evolving codebase to preserve cross-file consistency without context saturation.

3) Conditional knowledge injection via RAG: the system bridges implicit specification gaps by pulling standard implementation patterns from external knowledge bases.

4) Closed-loop error correction: a validation agent treats runtime execution feedback as corrective signals to identify and fix bugs iteratively.

The results on OpenAI's PaperBench benchmark are impressive. DeepCode achieves a 73.5% replication score, a 70% relative improvement over the best LLM agent baseline (o1 at 43.3%). It decisively outperforms commercial agents: Cursor at 58.4%, Claude Code at 58.7%, and Codex at 40.0%.

Most notably, DeepCode surpasses human experts. On a 3-paper subset evaluated by ML PhD students from Berkeley, Cambridge, and Carnegie Mellon, humans scored 72.4%; DeepCode scored 75.9%.

Principled information-flow management yields significantly larger performance gains than merely scaling model size or context length. The framework is fully open source.

Paper: https://t.co/LXVKsxOXfi

Learn to build effective AI Agents here: https://t.co/JBU5beIoD0
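For intuition, here is a rough Python sketch of the finite-context-budget idea behind those four operations. Everything in it (CodeMemory, build_context, the signature filter, the 12k character budget) is an illustrative assumption, not DeepCode's actual API:

```python
# Sketch: rebuild the agent prompt each step from compressed views of the
# paper and the repo, so context cost stays bounded as the codebase grows.
from dataclasses import dataclass, field

@dataclass
class CodeMemory:
    """Stateful code memory (operation 2): one compact entry per generated
    file, kept for cross-file consistency instead of replaying full sources."""
    entries: dict = field(default_factory=dict)

    def update(self, path: str, source: str):
        # Keep only high-signal lines; a real system would store richer summaries.
        keep = ("import ", "def ", "class ")
        sig = [ln for ln in source.splitlines() if ln.lstrip().startswith(keep)]
        self.entries[path] = "\n".join(sig)

    def render(self) -> str:
        return "\n".join(f"## {p}\n{s}" for p, s in self.entries.items())

def build_context(blueprint: str, memory: CodeMemory, rag_snippets: list,
                  feedback: str, budget: int = 12000) -> str:
    """Compose one generation prompt from the four information operations:
    distilled blueprint (1), code-memory index (2), retrieved implementation
    patterns (3), and validator execution feedback (4), truncated to a hard
    budget to avoid channel saturation."""
    parts = [blueprint, memory.render(), *rag_snippets, feedback]
    return "\n\n".join(p for p in parts if p)[:budget]
```

The point of the sketch: no raw paper text or full code history ever enters the prompt, only distilled, indexed, or retrieved views of it, which is what keeps the signal-to-noise ratio from collapsing.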

Sign if you know/remember https://t.co/bfJo4lplht
Make sure to set up the all-new X and Grok widgets on your lock screen for instant access. https://t.co/tkkryFdh3t
WOW! An absolute ocean of Chileans have flooded the streets to celebrate the end of socialist rule in their country. A capitalist revolution is ensuing all over the Americas. https://t.co/B5RGsIY2AM
@Wizarab10 Ohh!! Oh babyyyyy Sir Dickson you like Native very well oo Do you wear Jeans and shorts? https://t.co/YoCGEmfj1O
Amazon pulls AI recap from Fallout TV show after it made several mistakes https://t.co/PYvJIJuqZW @liv_mcmahon @bbctech @bbc
With AI, @MIT researchers teach a robot to build furniture by just asking https://t.co/YVPz0y1ILE @therobotreport https://t.co/Pgnft1Unj4
@youwouldntpost it begins... https://t.co/jcrQiTbVbS
We're voting for this cover!! Thanks @roxannevarza @ClaraChappaz and @itsrwanman for this match! Story: "Thankfully we have @demishassabis in the photo." & @ClementDelangue @julien_c @Thom_Wolf @arthurmensh @ptrkprz @ylecun @antonosika @FabianHedin @ASMLcompany @Lovable https://t.co/u8npbbj1qG
Do LLMs really understand? To cut through the noise around AI, we brought in two experts shaping the field: Yann LeCun (@ylecun) and Adam Brown of @GoogleDeepMind. Watch them in conversation with @JannaLevin about the true intelligence of artificial minds. https://t.co/LnzQGPUIwA https://t.co/f6gbwJSELF
It was a joy to watch this substantive discussion. @JannaLevin for @PioneerWorks_ talking to @ylecun and Adam Brown of Google DeepMind, about many things AI! And a surprise plant in the audience toward the end shouts out questions about machine consciousness! https://t.co/gIWX6FtPv8
I've never had a reason to really try Grok so I've missed out on all the insanity. It turns out that @grok thinks I'm @emollick. I'm flattered to be confused with Ethan, but also... what? How does a bug like this even arise?! https://t.co/pp6WcNXDi7

Update: I asked it what my username is and it answered correctly (after looking at 19 webpages?!), and now I can't reproduce the bug even in a fresh session. Oh well. You know what they say: everyone can be @emollick for 15 minutes. https://t.co/rSflSk5zT3
Embarking on a massive redesign for @braingridai with the new brand. @braingridai crafted a plan, broke down the tasks, and provided detailed prompts for each one. Now Claude Code is pulling tasks one by one and building them. https://t.co/XzlSxsRCu3
UK charities brace for £300m Christmas donation slump, with children's and homelessness charities among worst affected https://t.co/BB4KQZnF0J
Let them suffer. Provide directly to your own communities; these charlatans just redirect your money without your knowing. Follow the trail: charity starts at home. https://t.co/CaQVsdqFJi

@nathan_ga19 me and the boys banning Indians from X app https://t.co/FK4naa2SZX
Your weekend just got lighter. Upgrade what you need today and split the payments with zero stress. Lucred's got you covered. #LucredFinance #StressFreeShopping #BNPL #NigeriaFintech https://t.co/GJTLwXVfIA
We're hiring 2 summer interns on the NVIDIA Brev team! Apply here: https://t.co/h7iswNxJCU https://t.co/F33d8DBq1U
@Tim_Dettmers "in other words, AGI, then that intelligence can improve itself, leading to a runaway effect. This idea comes from Oxford-based philosophers who brought these concepts to the Bay Area." Amen. By the way, David Krakauer imo understands intelligence very well (see images and our recent coverage of him). Most of the ideology around "AGI" conceives of intelligence as an abstract monomaniacal algorithm with a single goal, something which no serious cognitive science (other than connectionists, LOL) would give any credence to. As @MelMitchell1 says, it's multi-scale, multi-objective and multi-domain. I just finished reading this book, https://t.co/9fLwfAy5Hu by Adam Becker, and you would really enjoy reading chapter 2, "Machines of Loving Grace".

Exports are only 20% of China's GDP, and the U.S. accounts for just 15% of those exports (about 3% of GDP). The Chinese domestic market is way bigger than people realize.
This picture was taken on Mars this week. https://t.co/Dd0KM8z4yC
"Generative AI can become a partner that accelerates human growth." An interview with Sakana AI CEO David Ha (@hardmaru), from a Nikkei Business article sponsored by Google Cloud, is now available. He discusses how to combine AI models suited to each business task, and how Japanese companies can adopt AI in ways that play to their strengths.

AI that adapts to organizational culture: "The way Japan's large companies work today was built up over decades, and there are real reasons their organizations took the shape they did. Demanding that such organizations go 'flat' is unreasonable. What matters is for AI to support them and work alongside them."

Aiming to be a partner that accelerates human growth: "One goal of generative AI is to become a Companion that bridges people and computers... The next challenge is for AI to carry a deep understanding of the work and to solve problems while facing the same direction as the people it supports."

Sakana AI's mission: "Sakana AI is a startup doing R&D on AI models and AI agents. We are currently working on two things: developing energy-efficient AI models from Japan, and developing AI agents that can skillfully combine multiple AI models to execute sophisticated processes automatically."

Choosing and combining AI models: "Each AI model has different strengths and weaknesses. We advocate a multi-foundation-model approach that combines multiple AIs to deliver the functionality you need... Which generative AI to choose depends on the company's business domain and the task it is applied to. You have to understand the users and the know-how of the work, select the AI models accordingly, and then evaluate how appropriate the models' answers are. That process is the key that determines whether AI adoption succeeds."

Full article: https://t.co/RSSy8bFbLa

Applied Research Engineer https://t.co/FuEoI2xrzS