🐦 Twitter Post Details

@omarsar0

What does it take to build the best cost-efficient deep research agent?

Current Deep Research systems optimize for multi-hop retrieval: find scattered facts, chain them together, and then return an answer.

They feel more like efficient web crawlers than researchers who synthesize evidence into defensible arguments.

But real research requires intent decomposition, planning, cross-source verification, reflection, and structured report writing.

This report introduces Step-DeepResearch, a 32B-parameter end-to-end Deep Research agent that rivals OpenAI's and Gemini's proprietary systems at a fraction of the cost.

It reframes training from predicting the next token to deciding the next atomic action. Four atomic capabilities form the foundation: planning and task decomposition, deep information seeking, reflection and verification, and report generation.

They propose a progressive training pipeline from agentic mid-training through supervised fine-tuning to reinforcement learning. Mid-training injects domain knowledge and tool-calling ability across a 128K context. SFT composes atomic capabilities into end-to-end research trajectories. RL with a Checklist-style Judger reward optimizes for rubric compliance in real web environments.

On Scale AI's ResearchRubrics benchmark, Step-DeepResearch scores 61.42, comparable to OpenAI DeepResearch and Gemini DeepResearch. In expert human evaluations on their new ADR-Bench (Chinese Deep Research scenarios), it outperforms larger models such as MiniMax-M2, GLM-4.6, and DeepSeek-V3.2.

The architecture is surprisingly simple: a single ReAct-style agent with no multi-agent orchestration or heavyweight workflows. All complexity is internalized through training.

Why does this work matter?

Medium-sized models can achieve expert-level Deep Research when trained on the right atomic capabilities. The most cost-effective path isn't more parameters or elaborate workflows. It's better data and progressive capability composition.

Paper: https://t.co/am4PjHNfVc

Learn to build effective AI agents in our academy: https://t.co/JBU5beIoD0
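The post describes two mechanisms worth making concrete: a single ReAct-style loop that picks one atomic action per step, and a checklist-style judger that scores a report against rubric items. The sketch below is a rough illustration under stated assumptions, not the paper's actual interfaces: `react_agent`, `checklist_reward`, the `search`/`final_report` action names, and the substring-matching "judge" stub are all hypothetical (the paper presumably uses an LLM judge).

```python
# Minimal ReAct-style single-agent loop, as described in the post:
# thought -> action -> observation, repeated until the agent emits a report.
# All names here are illustrative, not the paper's API.

def react_agent(task, model, tools, max_steps=8):
    """Run a ReAct loop where the model decides one atomic action per step."""
    trajectory = [("task", task)]
    for _ in range(max_steps):
        thought, action, arg = model(trajectory)   # decide the next atomic action
        trajectory.append(("thought", thought))
        if action == "final_report":               # report generation ends the loop
            trajectory.append(("report", arg))
            return arg, trajectory
        observation = tools[action](arg)           # e.g. search, fetch, verify
        trajectory.append(("observation", observation))
    return None, trajectory                        # step budget exhausted

def checklist_reward(report, checklist):
    """Checklist-style judger reward: fraction of rubric items the report
    satisfies. The judge is stubbed as case-insensitive substring matching."""
    if not checklist:
        return 0.0
    hits = sum(1 for item in checklist if item.lower() in report.lower())
    return hits / len(checklist)
```

In an RL setup like the one the post outlines, `checklist_reward` (with a real judge model in place of the substring stub) would score each finished trajectory's report, and that scalar would drive the policy update.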

Media 1

📊 Media Metadata

{
  "media": [
    {
      "url": "https://crmoxkoizveukayfjuyo.supabase.co/storage/v1/object/public/media/posts/2005378485842477298/media_0.jpg?",
      "media_url": "https://crmoxkoizveukayfjuyo.supabase.co/storage/v1/object/public/media/posts/2005378485842477298/media_0.jpg?",
      "type": "photo",
      "filename": "media_0.jpg"
    }
  ],
  "processed_at": "2025-12-31T02:48:16.589368",
  "pipeline_version": "2.0"
}

🔧 Raw API Response

{
  "type": "tweet",
  "id": "2005378485842477298",
  "url": "https://x.com/omarsar0/status/2005378485842477298",
  "twitterUrl": "https://twitter.com/omarsar0/status/2005378485842477298",
  "text": "What does it take to build the best cost-efficient deep research agent?\n\nCurrent Deep Research systems optimize for multi-hop retrieval: find scattered facts, chain them together, and then return an answer.\n\nThey feel more like efficient web crawlers than researchers who synthesize evidence into defensible arguments.\n\nBut real research requires intent decomposition, planning, cross-source verification, reflection, and structured report writing.\n\nThis report introduces Step-DeepResearch, a 32B parameter end-to-end Deep Research agent that rivals OpenAI and Gemini's proprietary systems at a fraction of the cost.\n\nReframe training from predicting the next token to deciding the next atomic action. Four atomic capabilities form the foundation: planning and task decomposition, deep information seeking, reflection and verification, and report generation.\n\nThey propose a progressive training pipeline from agentic mid-training through supervised fine-tuning to reinforcement learning. Mid-training injects domain knowledge and tool-calling ability across 128K context. SFT composes atomic capabilities into end-to-end research trajectories. RL with a Checklist-style Judger reward optimizes for rubric compliance in real web environments.\n\nOn Scale AI's ResearchRubrics benchmark, Step-DeepResearch scores 61.42, comparable to OpenAI DeepResearch and Gemini DeepResearch. In expert human evaluations on their new ADR-Bench (Chinese Deep Research scenarios), it outperforms larger models like MiniMax-M2, GLM-4.6, and DeepSeek-V3.2.\n\nThe architecture is surprisingly simple: a single ReAct-style agent with no multi-agent orchestration or heavyweight workflows. All complexity is internalized through training.\n\nWhy does this work matter?\n\nMedium-sized models can achieve expert-level Deep Research when trained on the right atomic capabilities. The most cost-effective path isn't more parameters or elaborate workflows. It's better data and progressive capability composition.\n\nPaper: https://t.co/am4PjHNfVc\n\nLearn to build effective AI agents in our academy: https://t.co/JBU5beIoD0",
  "source": "Twitter for iPhone",
  "retweetCount": 52,
  "replyCount": 19,
  "likeCount": 305,
  "quoteCount": 0,
  "viewCount": 22879,
  "createdAt": "Sun Dec 28 20:41:03 +0000 2025",
  "lang": "en",
  "bookmarkCount": 304,
  "isReply": false,
  "inReplyToId": null,
  "conversationId": "2005378485842477298",
  "displayTextRange": [
    0,
    296
  ],
  "inReplyToUserId": null,
  "inReplyToUsername": null,
  "author": {
    "type": "user",
    "userName": "omarsar0",
    "url": "https://x.com/omarsar0",
    "twitterUrl": "https://twitter.com/omarsar0",
    "id": "3448284313",
    "name": "elvis",
    "isVerified": false,
    "isBlueVerified": true,
    "verifiedType": null,
    "profilePicture": "https://pbs.twimg.com/profile_images/939313677647282181/vZjFWtAn_normal.jpg",
    "coverPicture": "https://pbs.twimg.com/profile_banners/3448284313/1565974901",
    "description": "",
    "location": "DAIR.AI Academy",
    "followers": 282031,
    "following": 752,
    "status": "",
    "canDm": true,
    "canMediaTag": true,
    "createdAt": "Fri Sep 04 12:59:26 +0000 2015",
    "entities": {
      "description": {
        "urls": []
      },
      "url": {}
    },
    "fastFollowersCount": 0,
    "favouritesCount": 34254,
    "hasCustomTimelines": true,
    "isTranslator": true,
    "mediaCount": 4420,
    "statusesCount": 16895,
    "withheldInCountries": [],
    "affiliatesHighlightedLabel": {},
    "possiblySensitive": false,
    "pinnedTweetIds": [
      "2006004138220605920"
    ],
    "profile_bio": {
      "description": "Building @dair_ai • Prev: Meta AI, Elastic, PhD • New cohort: https://t.co/GZMhf39NRs",
      "entities": {
        "description": {
          "urls": [
            {
              "display_url": "dair-ai.thinkific.com/courses/claude…",
              "expanded_url": "https://dair-ai.thinkific.com/courses/claude-code-for-everyone-2",
              "indices": [
                62,
                85
              ],
              "url": "https://t.co/GZMhf39NRs"
            }
          ],
          "user_mentions": [
            {
              "id_str": "0",
              "indices": [
                9,
                17
              ],
              "name": "",
              "screen_name": "dair_ai"
            }
          ]
        },
        "url": {
          "urls": [
            {
              "display_url": "dair.ai",
              "expanded_url": "https://www.dair.ai/",
              "indices": [
                0,
                23
              ],
              "url": "https://t.co/XQto5ypkSM"
            }
          ]
        }
      }
    },
    "isAutomated": false,
    "automatedBy": null
  },
  "extendedEntities": {
    "media": [
      {
        "display_url": "pic.twitter.com/cyF5HEWpmY",
        "expanded_url": "https://twitter.com/omarsar0/status/2005378485842477298/photo/1",
        "ext_media_availability": {
          "status": "Available"
        },
        "features": {
          "large": {
            "faces": [
              {
                "h": 92,
                "w": 92,
                "x": 211,
                "y": 491
              }
            ]
          },
          "orig": {
            "faces": [
              {
                "h": 92,
                "w": 92,
                "x": 211,
                "y": 491
              }
            ]
          }
        },
        "id_str": "2005378481916665856",
        "indices": [
          297,
          320
        ],
        "media_key": "3_2005378481916665856",
        "media_results": {
          "id": "QXBpTWVkaWFSZXN1bHRzOgwAAQoAARvUiRq22uAACgACG9SJG6DaAPIAAA==",
          "result": {
            "__typename": "ApiMedia",
            "id": "QXBpTWVkaWE6DAABCgABG9SJGrba4AAKAAIb1IkboNoA8gAA",
            "media_key": "3_2005378481916665856"
          }
        },
        "media_url_https": "https://pbs.twimg.com/media/G9SJGrba4AAwu1b.jpg",
        "original_info": {
          "focus_rects": [
            {
              "h": 818,
              "w": 1460,
              "x": 0,
              "y": 0
            },
            {
              "h": 1460,
              "w": 1460,
              "x": 0,
              "y": 0
            },
            {
              "h": 1664,
              "w": 1460,
              "x": 0,
              "y": 0
            },
            {
              "h": 1716,
              "w": 858,
              "x": 301,
              "y": 0
            },
            {
              "h": 1716,
              "w": 1460,
              "x": 0,
              "y": 0
            }
          ],
          "height": 1716,
          "width": 1460
        },
        "sizes": {
          "large": {
            "h": 1716,
            "w": 1460
          }
        },
        "type": "photo",
        "url": "https://t.co/cyF5HEWpmY"
      }
    ]
  },
  "card": null,
  "place": {},
  "entities": {
    "urls": [
      {
        "display_url": "arxiv.org/abs/2512.20491",
        "expanded_url": "https://arxiv.org/abs/2512.20491",
        "indices": [
          1990,
          2013
        ],
        "url": "https://t.co/am4PjHNfVc"
      },
      {
        "display_url": "dair-ai.thinkific.com",
        "expanded_url": "https://dair-ai.thinkific.com/",
        "indices": [
          2066,
          2089
        ],
        "url": "https://t.co/JBU5beIoD0"
      }
    ]
  },
  "quoted_tweet": null,
  "retweeted_tweet": null,
  "isLimitedReply": false,
  "article": null
}