🐦 Twitter Post Details


@omarsar0

Google just published a banger guide on effective context engineering for multi-agent systems.

Pay attention to this one, AI devs! (bookmark it)

Here are my key takeaways:

Context windows aren't the bottleneck. Context engineering is.

For more complex and long-horizon problems, context management cannot be treated as a simple "string manipulation" problem.

The default approach to handling context in agent systems today remains stuffing everything into the prompt. More history, more tokens, more confusion. Most teams treat context as a string concatenation problem.

But raw context dumps create three critical failures:

> cost explosion from repetitive information
> performance degradation from "lost in the middle" effects
> increased hallucination rates when agents misattribute actions across a system

Context management becomes an architectural concern alongside storage and compute. This means explicit transformations replace ad-hoc string concatenation. Agents receive the minimum required context by default and explicitly request additional information via tools.

Google's Agent Development Kit (ADK) is clearly thinking deeply about context management. It introduces a tiered architecture that treats context as "a compiled view over a stateful system" rather than a prompt-stuffing activity.

What does this look like?

1) Structure: The Tiered Model

The framework separates storage from presentation across four distinct layers:

1) Working Context handles ephemeral per-invocation views.
2) Session maintains the durable event log, capturing every message, tool call, and control signal.
3) Memory provides searchable, long-lived knowledge that outlives single sessions.
4) Artifacts manage large binary data through versioned references rather than inline embedding.

How does context compilation actually work? It works through ordered LLM Flows with explicit processors.
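The four-layer separation above can be sketched as plain data structures. This is a minimal illustrative sketch under my own naming, not the actual ADK API — every class and method here is hypothetical:

```python
from dataclasses import dataclass, field

@dataclass
class Event:
    role: str      # "user", "assistant", or "tool"
    content: str

@dataclass
class Session:
    """Durable event log: every message, tool call, and control signal."""
    events: list[Event] = field(default_factory=list)

@dataclass
class Memory:
    """Searchable, long-lived knowledge that outlives single sessions."""
    entries: dict[str, str] = field(default_factory=dict)

    def search(self, query: str) -> list[str]:
        # Naive keyword match stands in for real similarity search.
        return [v for v in self.entries.values() if query.lower() in v.lower()]

@dataclass
class ArtifactStore:
    """Large payloads live here; prompts see only versioned handles."""
    blobs: dict[str, bytes] = field(default_factory=dict)

    def put(self, name: str, data: bytes) -> str:
        handle = f"artifact://{name}#v{len(self.blobs)}"
        self.blobs[handle] = data
        return handle  # only this short reference enters the prompt

@dataclass
class WorkingContext:
    """Ephemeral per-invocation view, compiled from the tiers above."""
    messages: list[Event] = field(default_factory=list)
```

The key design point is that only WorkingContext is ever shown to the model; the other three tiers are storage that gets compiled into it per invocation.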
A contents processor performs three operations: selection filters irrelevant events, transformation flattens events into properly-roled Content objects, and injection writes formatted history into the LLM request.

The contents processor is essentially the bridge between a session and the working context.

The architecture implements prefix caching by dividing context into stable prefixes (instructions, identity, summaries) and variable suffixes (latest turns, tool outputs). On top of that, a static_instruction primitive guarantees immutability for system prompts, preserving cache validity across invocations.

2) Agentic Management of What Matters Now

Once the structure is in place, the core challenge becomes relevance: what belongs in the active window right now?

ADK answers this through collaboration between human-defined architecture and agentic decision-making. Engineers define where data lives and how it's summarized. Agents decide dynamically when to "reach" for specific memory blocks or artifacts.

For large payloads, ADK applies a handle pattern. A 5MB CSV or massive JSON response lives in artifact storage, not the prompt. Agents see only lightweight references by default. When raw data is needed, they call LoadArtifactsTool for temporary expansion. Once the task completes, the artifact is offloaded again. This turns a permanent context tax into precise, on-demand access.

For long-term knowledge, the MemoryService provides two retrieval patterns:

1) Reactive recall: agents recognize knowledge gaps and explicitly search the corpus.
2) Proactive recall: pre-processors run similarity search on user input, injecting relevant snippets before model invocation.

Agents recall exactly the snippets needed for the current step rather than carrying every conversation they've ever had.

All of this reminds me of the tiered approach to Claude Skills, which does improve the efficient use of context in Claude Code.
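The selection → transformation → injection pipeline can be sketched in plain Python. All function names, event types, and the request shape below are illustrative assumptions, not ADK's actual processor interface:

```python
# Hypothetical sketch of a contents processor's three operations.

def select(events, max_turns=10):
    """Selection: drop irrelevant events, keep only recent dialogue turns."""
    relevant = [e for e in events if e["type"] != "control_signal"]
    return relevant[-max_turns:]

def transform(events):
    """Transformation: flatten events into properly-roled content objects."""
    role_map = {"user_message": "user", "model_reply": "assistant",
                "tool_call": "tool"}
    return [{"role": role_map[e["type"]], "content": e["payload"]}
            for e in events]

def inject(request, contents):
    """Injection: write the formatted history into the LLM request."""
    request["contents"] = contents
    return request

events = [
    {"type": "control_signal", "payload": "turn_start"},
    {"type": "user_message", "payload": "Summarize the Q3 report."},
    {"type": "tool_call", "payload": "load_artifact('q3.csv')"},
    {"type": "model_reply", "payload": "Revenue grew 12% quarter over quarter."},
]
request = inject({"model": "some-model"}, transform(select(events)))
```

Note how the control signal never reaches the request: the session keeps everything, but the compiled view is filtered and re-roled before the model sees it.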
3) Multi-agent Context

Single-agent systems suffer from context bloat. In multi-agent systems, this problem amplifies further, easily leading to "context explosion" as you incorporate more sub-agents.

For multi-agent coordination to work effectively, ADK provides two patterns. Agents-as-tools treats specialized agents as callables that receive focused prompts without ancestral history. Agent Transfer enables full control handoffs where sub-agents inherit session views. The include_contents parameter controls context flow, defaulting to the full working context or providing only the new prompt.

What prevents hallucination during agent handoffs? The solution is conversation translation. Prior Assistant messages convert to narrative context with attribution tags. Tool calls from other agents are explicitly marked. Each agent assumes the Assistant role without misattributing the broader system's history to itself.

Lastly, you don't need to use Google ADK to apply these insights. I think they apply across the board when building multi-agent systems.

(image courtesy of nano banana pro)
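The conversation-translation step during a handoff can be sketched roughly like this. The function, the history format, and the attribution-tag style are all assumptions for illustration, not ADK's actual handoff mechanics:

```python
# Hedged sketch: prior turns from OTHER agents become attributed narrative
# context, so the receiving agent never mistakes them for its own output.

def translate_for_handoff(history, receiving_agent):
    """Only the receiving agent's own prior turns keep the assistant role;
    every other agent's turn is converted to tagged user-side context."""
    translated = []
    for turn in history:
        if turn["role"] == "user":
            translated.append({"role": "user", "content": turn["content"]})
        elif turn.get("agent") == receiving_agent:
            # The receiving agent keeps its own history as "assistant".
            translated.append({"role": "assistant", "content": turn["content"]})
        else:
            # Attribution tag prevents misattribution across the system.
            label = "said" if turn["role"] == "assistant" else "called tool"
            translated.append({
                "role": "user",
                "content": f"[{turn['agent']}] {label}: {turn['content']}",
            })
    return translated

history = [
    {"role": "user", "content": "Book me a flight to Tokyo."},
    {"role": "assistant", "agent": "planner",
     "content": "I'll delegate to the booking agent."},
    {"role": "tool", "agent": "planner",
     "content": "transfer_to(booking_agent)"},
]
view = translate_for_handoff(history, receiving_agent="booking_agent")
```

In this toy handoff the booking agent sees the planner's message and tool call as explicitly attributed narrative, never as its own Assistant turns.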

Media 1

📊 Media Metadata

{
  "media": [
    {
      "url": "https://crmoxkoizveukayfjuyo.supabase.co/storage/v1/object/public/media/posts/1997348089888374918/media_0.jpg?",
      "media_url": "https://crmoxkoizveukayfjuyo.supabase.co/storage/v1/object/public/media/posts/1997348089888374918/media_0.jpg?",
      "type": "photo",
      "filename": "media_0.jpg"
    }
  ],
  "processed_at": "2025-12-08T13:23:08.476203",
  "pipeline_version": "2.0"
}

🔧 Raw API Response

{
  "type": "tweet",
  "id": "1997348089888374918",
  "url": "https://x.com/omarsar0/status/1997348089888374918",
  "twitterUrl": "https://twitter.com/omarsar0/status/1997348089888374918",
  "text": "Google just published a banger guide on effective context engineering for multi-agent systems.\n\nPay attention to this one, AI devs! (bookmark it)\n\nHere are my key takeaways:\n\nContext windows aren't the bottleneck. Context engineering is.\n\nFor more complex and long-horizon problems, context management cannot be treated as a simple \"string manipulation\" problem.\n\nThe default approach to handling context in agent systems today remains stuffing everything into the prompt. More history, more tokens, more confusion. Most teams treat context as a string concatenation problem.\n\nBut raw context dumps create three critical failures:\n\n> cost explosion from repetitive information\n> performance degradation from \"lost in the middle\" effects\n> increase in hallucination rates when agents misattribute actions across a system\n\nContext management becomes an architectural concern alongside storage and compute. This means that explicit transformations replace ad-hoc string concatenation. Agents receive the minimum required context by default and explicitly request additional information via tools.\n\nIt seems that Google's Agent Development Kit is really thinking deeply about context management. It introduces a tiered architecture that treats context as \"a compiled view over a stateful system\" rather than a prompt-stuffing activity.\n\nWhat does this look like?\n\n1) Structure: The Tiered Model\n\nThe framework separates storage from presentation across four distinct layers:\n\n1) Working Context handles ephemeral per-invocation views.\n2) Session maintains the durable event log, capturing every message, tool call, and control signal.\n3) Memory provides searchable, long-lived knowledge outliving single sessions.\n4) Artifacts manage large binary data through versioned references rather than inline embedding.\n\nHow does context compilation actually work? It works through ordered LLM Flows with explicit processors. 
A contents processor performs three operations: selection filters irrelevant events, transformation flattens events into properly-roled Content objects, and injection writes formatted history into the LLM request.\n\nThe contents processor is essentially the bridge between a session and the working context.\n\nThe architecture implements prefix caching by dividing context into stable prefixes (instructions, identity, summaries) and variable suffixes (latest turns, tool outputs). On top of that, a static_instruction primitive guarantees immutability for system prompts, preserving cache validity across invocations.\n\n2) Agentic Management of What Matters Now\n\nOnce you figure out the structure, the core challenge then becomes relevance.\n\nYou need to figure out what belongs in the active window right now.\n\nADK answers this through collaboration between human-defined architecture and agentic decision-making. Engineers define where data lives and how it's summarized. Agents decide dynamically when to \"reach\" for specific memory blocks or artifacts.\n\nFor large payloads, ADK applies a handle pattern. A 5MB CSV or massive JSON response lives in artifact storage, not the prompt. Agents see only lightweight references by default. When raw data is needed, they call LoadArtifactsTool for temporary expansion. Once the task completes, the artifact offloads. This turns permanent context tax into precise, on-demand access.\n\nFor long-term knowledge, the MemoryService provides two retrieval patterns:\n\n1) Reactive recall: agents recognize knowledge gaps and explicitly search the corpus.\n\n2) Proactive recall: pre-processors run similarity search on user input, injecting relevant snippets before model invocation. 
Agents recall exactly the snippets needed for the current step rather than carrying every conversation they've ever had.\n\nAll of this reminds me of the tiered approach to Claude Skills, which does improve the efficient use of context in Claude Code.\n\n3) Multi-agent Context\n\nSingle-agent systems suffer from context bloat. When building multi-agents, this problem amplifies further, which easily leads to \"context explosion\" as you incorporate more sub-agents.\n\nFor multi-agent coordination to work effectively, ADK provides two patterns. Agents-as-tools treats specialized agents as callables receiving focused prompts without an ancestral history. Agent Transfer, which enables full control handoffs where sub-agents inherit session views. The include_contents parameter controls context flow, defaulting to full working context or providing only the new prompt.\n\nWhat prevents hallucination during agent handoffs? The solution is conversation translation. Prior Assistant messages convert to narrative context with attribution tags. Tool calls from other agents are explicitly marked. Each agent assumes the Assistant role without misattributing the broader system's history to itself.\n\nLastly, you don't need to use Google ADK to apply these insights. I think these could apply across the board when building multi-agent systems.\n\n(image courtesy of nano banana pro)",
  "source": "Twitter for iPhone",
  "retweetCount": 158,
  "replyCount": 27,
  "likeCount": 887,
  "quoteCount": 7,
  "viewCount": 68939,
  "createdAt": "Sat Dec 06 16:51:08 +0000 2025",
  "lang": "en",
  "bookmarkCount": 1410,
  "isReply": false,
  "inReplyToId": null,
  "conversationId": "1997348089888374918",
  "displayTextRange": [
    0,
    273
  ],
  "inReplyToUserId": null,
  "inReplyToUsername": null,
  "author": {
    "type": "user",
    "userName": "omarsar0",
    "url": "https://x.com/omarsar0",
    "twitterUrl": "https://twitter.com/omarsar0",
    "id": "3448284313",
    "name": "elvis",
    "isVerified": false,
    "isBlueVerified": true,
    "verifiedType": null,
    "profilePicture": "https://pbs.twimg.com/profile_images/939313677647282181/vZjFWtAn_normal.jpg",
    "coverPicture": "https://pbs.twimg.com/profile_banners/3448284313/1565974901",
    "description": "Building @dair_ai • Ex Meta AI, Elastic, PhD • New cohort: https://t.co/xw2XQ0z8up",
    "location": "DAIR.AI Academy",
    "followers": 278411,
    "following": 727,
    "status": "",
    "canDm": true,
    "canMediaTag": false,
    "createdAt": "Fri Sep 04 12:59:26 +0000 2015",
    "entities": {
      "description": {
        "urls": [
          {
            "display_url": "dair-ai.thinkific.com/courses/buildi…",
            "expanded_url": "https://dair-ai.thinkific.com/courses/building-effective-ai-agents-2",
            "url": "https://t.co/xw2XQ0z8up",
            "indices": [
              59,
              82
            ]
          }
        ]
      },
      "url": {
        "urls": [
          {
            "display_url": "dair.ai",
            "expanded_url": "https://www.dair.ai/",
            "url": "https://t.co/XQto5ypkSM",
            "indices": [
              0,
              23
            ]
          }
        ]
      }
    },
    "fastFollowersCount": 0,
    "favouritesCount": 33796,
    "hasCustomTimelines": true,
    "isTranslator": false,
    "mediaCount": 4362,
    "statusesCount": 16685,
    "withheldInCountries": [],
    "affiliatesHighlightedLabel": {},
    "possiblySensitive": false,
    "pinnedTweetIds": [
      "1997717251546583103"
    ],
    "profile_bio": {},
    "isAutomated": false,
    "automatedBy": null
  },
  "extendedEntities": {
    "media": [
      {
        "display_url": "pic.x.com/ReZtE3KdTt",
        "expanded_url": "https://x.com/omarsar0/status/1997348089888374918/photo/1",
        "id_str": "1997348085408911360",
        "indices": [
          274,
          297
        ],
        "media_key": "3_1997348085408911360",
        "media_url_https": "https://pbs.twimg.com/media/G7gBgFOa0AAjk2I.jpg",
        "type": "photo",
        "url": "https://t.co/ReZtE3KdTt",
        "ext_media_availability": {
          "status": "Available"
        },
        "features": {
          "large": {
            "faces": []
          },
          "medium": {
            "faces": []
          },
          "small": {
            "faces": []
          },
          "orig": {
            "faces": []
          }
        },
        "sizes": {
          "large": {
            "h": 768,
            "w": 1408,
            "resize": "fit"
          },
          "medium": {
            "h": 655,
            "w": 1200,
            "resize": "fit"
          },
          "small": {
            "h": 371,
            "w": 680,
            "resize": "fit"
          },
          "thumb": {
            "h": 150,
            "w": 150,
            "resize": "crop"
          }
        },
        "original_info": {
          "height": 768,
          "width": 1408,
          "focus_rects": [
            {
              "x": 0,
              "y": 0,
              "w": 1371,
              "h": 768
            },
            {
              "x": 72,
              "y": 0,
              "w": 768,
              "h": 768
            },
            {
              "x": 119,
              "y": 0,
              "w": 674,
              "h": 768
            },
            {
              "x": 264,
              "y": 0,
              "w": 384,
              "h": 768
            },
            {
              "x": 0,
              "y": 0,
              "w": 1408,
              "h": 768
            }
          ]
        },
        "media_results": {
          "result": {
            "media_key": "3_1997348085408911360"
          }
        }
      }
    ]
  },
  "card": null,
  "place": {},
  "entities": {
    "hashtags": [],
    "symbols": [],
    "urls": [],
    "user_mentions": []
  },
  "quoted_tweet": null,
  "retweeted_tweet": null,
  "article": null
}