@omarsar0
Great read for AI devs. (bookmark it)

LLM agents are slow. The bottleneck in complex agentic systems today is planning: plan generation alone can take 25+ seconds per task request, and that compounds fast at scale.

Analysis of a real-world dataset shows that about 30% of requests received by LLM-driven agents are semantically identical or similar.

This new paper introduces AgentReuse, a plan reuse mechanism that caches and retrieves previously generated plans based on semantic similarity.

Two requests like "Book a ticket from Hefei to Beijing for tomorrow" and "Book a ticket from Changsha to Shanghai for Friday" differ in parameters but share an identical task structure. The plan steps are the same; only the key values change.

Building on this insight, AgentReuse separates intent from parameters. It extracts the key parameters (time, origin, destination), classifies the intent, and performs similarity matching on the parameter-stripped request. When a match exists, it injects the new parameters into the cached plan and executes directly.

On a real-world dataset of 2,664 task requests, AgentReuse achieves a 93% effective plan reuse rate, an F1 score of 0.9718, and an accuracy of 0.9459. Latency drops by 93.12% compared to no caching and by 60.61% compared to GPTCache.

The overhead is minimal: ~100MB of additional VRAM, less than 1MB of memory per request, and under 10ms of processing latency per request.

Plan generation that previously took 25-30 seconds becomes a cache lookup. Agents don't need to regenerate plans for structurally similar tasks. Semantic caching at the plan level, not the response level, unlocks a massive latency reduction while preserving accuracy for dynamic, real-time information.

I am sure this can inspire more general patterns that speed up coding agents. Remains to be seen, but it seems like a cool idea to apply in that domain.

Paper: https://t.co/oIF1o44Zrl

Learn to build effective AI agents in our academy: https://t.co/JBU5beIoD0
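The separate-intent-from-parameters idea above can be sketched in a few lines. Everything here (the regex slot extractors, the dict cache keyed on the stripped template, the placeholder syntax) is my own illustrative assumption, not the paper's actual implementation:

```python
# Hedged sketch of plan-level caching in the spirit of AgentReuse.
# Regexes, cache structure, and placeholder syntax are illustrative
# assumptions, not the paper's implementation.
import re

# Illustrative parameter slots for the travel-booking examples in the post.
PATTERNS = {
    "origin": re.compile(r"from (\w+)"),
    "destination": re.compile(r"to (\w+)"),
    "date": re.compile(r"for (\w+)"),
}

def strip_params(request: str):
    """Split a request into a parameter-free template plus its key values."""
    params, template = {}, request
    for name, pat in PATTERNS.items():
        m = pat.search(template)
        if m:
            params[name] = m.group(1)
            template = template.replace(m.group(1), f"<{name}>", 1)
    return template, params

class PlanCache:
    """Cache plans keyed by the parameter-stripped request template."""
    def __init__(self, planner):
        self.planner = planner  # the slow LLM plan generator
        self.cache = {}         # template -> parameterized plan steps

    def get_plan(self, request: str):
        template, params = strip_params(request)
        if template not in self.cache:  # miss: pay the planning cost once
            self.cache[template] = self.planner(template)
        # hit: inject the new parameters into the cached plan steps
        return [step.format(**params) for step in self.cache[template]]
```

A real system would match templates by embedding similarity rather than exact string equality, but the miss/hit structure is the same: the expensive planner runs once per task structure, and structurally similar requests become cache lookups.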