@arankomatsuzaki
ReSum: Long-Horizon Web Agents Without Context Limits
• Problem: ReAct hits context limits in long searches (32k tokens)
• Solution: ReSum periodically compresses history → compact reasoning states
• ReSumTool-30B: specialized summarizer extracts key evidence & gaps
• ReSum-GRPO (RL): trains agents to adapt summaries into reasoning
• +4.5% over ReAct baseline, +8.2% with RL across web search benchmarks
abs: https://t.co/QRkfu2w6TN (7/7)
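
A rough sketch of the "periodically compress history" idea, for intuition only. This is not the paper's code: `resum_loop`, `step`, `summarize`, `count_tokens`, the `ANSWER:` prefix, and the whitespace token count are all hypothetical stand-ins. A real agent would call an LLM for `step` and a summarizer model (such as ReSumTool-30B) for `summarize`.

```python
from typing import Callable, List, Optional, Tuple


def count_tokens(text: str) -> int:
    """Crude stand-in for a tokenizer: counts whitespace-separated words."""
    return len(text.split())


def resum_loop(
    question: str,
    step: Callable[[List[str]], str],        # hypothetical: one agent turn
    summarize: Callable[[List[str]], str],   # hypothetical: summarizer call
    token_budget: int,
    max_steps: int,
) -> Tuple[Optional[str], List[str]]:
    """Run an agent loop, compressing the context whenever it exceeds the budget."""
    context = [question]
    for _ in range(max_steps):
        output = step(context)
        context.append(output)
        # Assumed convention: the agent marks its final answer with "ANSWER:".
        if output.startswith("ANSWER:"):
            return output, context
        # Over budget: replace the full history with the question plus a summary,
        # then keep exploring from that compact state.
        if sum(count_tokens(c) for c in context) > token_budget:
            context = [question, summarize(context)]
    return None, context
```

The design point is that the loop restarts from `[question, summary]` rather than truncating old turns, so evidence gathered early can survive as long as the summarizer keeps it.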