@omarsar0
Another banger paper from Microsoft. Why it's a big deal: it teaches reasoning models to compress their own chain-of-thought mid-generation.

The most interesting finding isn't the 2-3x memory savings or the doubled throughput. It's that when the model erases a reasoning block after summarizing it, the deleted information keeps leaking forward through the KV cache representations, forming an implicit second channel that accounts for 15 percentage points of accuracy. The model is, in some meaningful sense, remembering things it can no longer see.

If context management turns out to be a teachable skill (and 30K training examples seem to be enough), then the bottleneck for long-horizon agents may be less about architecture and more about the right training data, which is a very different kind of problem from the one most people are working on.

If it helps, below is my research agent's visual summary of the paper (it highlights the key parts, at least).
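For intuition, here's a rough sketch of the summarize-then-erase loop as I read it. All the names (`Block`, `compress_context`, the stub summarizer) are mine for illustration, not the paper's API, and the real system does this inside the model's own generation rather than in wrapper code:

```python
# Minimal sketch of summarize-then-erase context management (illustrative only).
from dataclasses import dataclass, field

@dataclass
class Block:
    text: str          # full reasoning for one step
    summary: str = ""  # short replacement emitted when the block is erased

@dataclass
class Context:
    blocks: list = field(default_factory=list)

    def visible_text(self) -> str:
        # Erased blocks contribute only their summary to the visible prompt.
        return "\n".join(b.summary or b.text for b in self.blocks)

def summarize(block: Block) -> str:
    # Stand-in for the model compressing its own reasoning block.
    return block.text.split(".")[0] + ". (summary)"

def compress_context(ctx: Context, keep_last: int = 1) -> None:
    # Replace every block except the most recent one with its summary.
    # In a real serving stack the erased tokens could also be evicted from
    # the KV cache; the paper's striking result is that when they are *not*
    # evicted, their cached representations keep leaking information forward.
    for b in ctx.blocks[:-keep_last]:
        if not b.summary:
            b.summary = summarize(b)

ctx = Context()
for step in ["Set up the equation. Expand both sides carefully.",
             "Solve for x. Check the boundary cases.",
             "Verify the answer against the original constraints."]:
    ctx.blocks.append(Block(text=step))
    compress_context(ctx)
    print(f"--- visible context after step ---\n{ctx.visible_text()}\n")
```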