@omarsar0
RARE: Retrieval-Augmented Reasoning for LLMs

Extends the rStar reasoning framework to enhance the reasoning accuracy and factual reliability of LLMs.

It uses a Monte Carlo Tree Search (MCTS) framework with explicit retrieval-augmented reasoning to produce multiple candidate reasoning trajectories.

A retrieval-augmented factuality scorer then evaluates the factual accuracy of each trajectory. The system selects the trajectory with the highest factuality score as the final answer.

On medical reasoning tasks, RARE (which uses Llama 3.1) surpasses larger models such as GPT-4. On commonsense reasoning tasks, RARE outperforms Claude-3.5 Sonnet and GPT-4o-mini, achieving performance competitive with GPT-4o.

Note that this is a test-time compute framework, which means there is no need for additional training or fine-tuning of the underlying LLM. The underlying LLM can be any open-source model.

The authors plan to release code and datasets soon.
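
To make the final selection step concrete, here is a minimal sketch of picking the trajectory with the highest factuality score. This is not the authors' code: `Trajectory`, `factuality_score`, `select_answer`, and `toy_checker` are hypothetical names, the "fraction of supported steps" score is an assumed scoring rule, and a fixed set of facts stands in for real retrieval. The MCTS generation of candidates is not shown.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Trajectory:
    steps: List[str]  # intermediate reasoning statements
    answer: str       # final answer proposed by this trajectory


def factuality_score(traj: Trajectory, is_supported: Callable[[str], bool]) -> float:
    """Assumed scoring rule: fraction of steps the checker marks as supported."""
    if not traj.steps:
        return 0.0
    supported = sum(1 for step in traj.steps if is_supported(step))
    return supported / len(traj.steps)


def select_answer(candidates: List[Trajectory], is_supported: Callable[[str], bool]) -> str:
    """Return the answer of the trajectory with the highest factuality score."""
    if not candidates:
        raise ValueError("no candidate trajectories")
    best = max(candidates, key=lambda t: factuality_score(t, is_supported))
    return best.answer


# Toy stand-in for retrieval-backed checking: a fixed set of "known facts".
KNOWN_FACTS = {
    "aspirin inhibits COX enzymes",
    "COX enzymes produce prostaglandins",
}


def toy_checker(statement: str) -> bool:
    # Hypothetical checker: a statement counts as supported only on exact match.
    return statement in KNOWN_FACTS


candidates = [
    Trajectory(["aspirin inhibits COX enzymes", "COX enzymes produce prostaglandins"], "A"),
    Trajectory(["aspirin blocks histamine receptors"], "B"),
]

print(select_answer(candidates, toy_checker))  # prints "A"
```

In a real system the checker would presumably compare each statement against retrieved evidence rather than a hard-coded set; the selection logic stays the same either way.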