@dair_ai
Adaptive retrieval is the way to go! And this RouteRAG paper shows why. Let's talk about it: RAG systems have a retrieval problem. The default approach to multi-hop reasoning today relies on fixed retrieval pipelines. It typically involves fetching text + maybe graph data, and hope everything is retrieved in one-shot. But the reality is that complex questions and real-world tasks require adaptive retrieval. Sometimes you need text. Sometimes you need relational structure from a graph. Sometimes it will be great to use both. And it's not secrete that graph retrieval is expensive, so retrieving it unnecessarily wastes compute. This new research introduces RouteRAG, an RL-based framework that teaches LLMs to make adaptive retrieval decisions during reasoning. When to retrieve, what source to retrieve from, and when to stop. The model learns a unified generation policy through two-stage training. > Stage 1 optimizes for answer correctness, establishing reasoning capability. > Stage 2 adds an efficiency reward that discourages unnecessary retrieval, teaching the model to balance accuracy against computational cost. The action space includes three retrieval modes: passage-only, graph-only, or hybrid. The model dynamically selects based on evolving query needs. Text retrieval works well for simple questions. Graph retrieval shines for multi-hop reasoning. The policy learns when each is appropriate. Results across five QA benchmarks: RouteRAG-7B achieves 60.6 average F1, outperforming Search-R1 (56.8 F1) despite being trained on only 10k examples versus 170k. On multi-hop datasets like 2Wiki, it reaches 64.6 F1 compared to 58.9 for Search-R1. The efficiency gains are also substantial. RouteRAG-7B reduces average retrieval turns by 20% compared to training without the efficiency reward, while actually improving accuracy by 1.1 F1 points. So we get best of both worlds: fewer retrieval calls and better answers. And here is something exciting: Small models also approach large model performance. RouteRAG with Qwen2.5-3B surpasses several graph-based RAG systems built on GPT-4o-mini, suggesting that improving the retrieval policy can be as impactful as scaling the backbone. Teaching models when and what to retrieve through RL yields more efficient and accurate multi-hop reasoning than scaling training data or model size alone. Paper: https://t.co/a4J6oAX0GC Learn to build RAG and effective AI Agents in our academy: https://t.co/zQXQt0PMbG