🐦 Twitter Post Details

@omarsar0

LLM agents loop, drift, and get stuck on hard reasoning tasks up to 30% of the time.

Current fixes are either too blunt (hard step limits) or too expensive (LLM-as-judge adding 10-15% overhead per step).

New research proposes a smarter middle ground.

The work introduces the Cognitive Companion, a parallel monitoring architecture with two variants: an LLM-based monitor and a novel Probe-based monitor that detects reasoning degradation from the model's own hidden states at zero inference overhead.

The Probe-based Companion trains a simple logistic regression classifier on hidden states from layer 28. It reads the model's internal representations during the existing forward pass, requiring no additional model calls. A single matrix multiplication is all it takes to flag when reasoning quality is declining.

Why does it matter?

The LLM-based Companion reduced repetition on loop-prone tasks by 52-62% with roughly 11% overhead. The Probe-based variant achieved a mean effect size of +0.471 with zero measured overhead and AUROC 0.840 on cross-validated detection.

But the results also reveal an important nuance: companions help on loop-prone and open-ended tasks while showing neutral or negative effects on structured tasks. Models below 3B parameters also struggled to act on companion guidance at all.

This suggests the future isn't universal monitoring but selective activation, deploying cognitive companions only where reasoning degradation is a real risk.

Paper: https://t.co/K2vqDADwU8

Learn to build effective AI agents in our academy: https://t.co/1e8RZKs4uX
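
To make the probe idea in the post concrete, here is a minimal sketch of a logistic regression probe over hidden states. It is not the paper's implementation. The synthetic vectors stand in for layer-28 activations, and the names `train_probe` and `probe_score`, the 16-dimensional width, the 0.5 threshold, and the training settings are all assumptions chosen for illustration.

```python
import numpy as np


def sigmoid(z):
    # Clip to keep np.exp well inside floating-point range.
    return 1.0 / (1.0 + np.exp(-np.clip(z, -50.0, 50.0)))


def train_probe(hidden, labels, lr=0.1, steps=500):
    """Fit logistic regression weights with plain gradient descent.

    hidden: (n_samples, d) array of hidden-state vectors (hypothetical data).
    labels: (n_samples,) array, 1.0 = degraded reasoning, 0.0 = healthy.
    """
    n, d = hidden.shape
    w = np.zeros(d)
    b = 0.0
    for _ in range(steps):
        p = sigmoid(hidden @ w + b)
        grad = p - labels              # gradient of log loss w.r.t. the logits
        w -= lr * (hidden.T @ grad) / n
        b -= lr * grad.mean()
    return w, b


def probe_score(hidden, w, b):
    """Score hidden states with one matrix-vector product plus a sigmoid."""
    return sigmoid(hidden @ w + b)


# Synthetic stand-in for hidden states: two clusters separated along a
# random direction. Real activations would come from the model's forward pass.
rng = np.random.default_rng(0)
d = 16
direction = rng.normal(size=d)
healthy = rng.normal(size=(200, d)) - direction
degraded = rng.normal(size=(200, d)) + direction
X = np.vstack([healthy, degraded])
y = np.concatenate([np.zeros(200), np.ones(200)])

w, b = train_probe(X, y)
scores = probe_score(X, w, b)

# Assumed decision rule: flag a step when the probe score exceeds 0.5.
flags = scores > 0.5
accuracy = (flags == y.astype(bool)).mean()
print(f"training accuracy on synthetic data: {accuracy:.3f}")
```

At inference time only `probe_score` would run, which is why the post describes the cost as a single matrix multiplication on activations the forward pass already produced. How the flag is turned into guidance for the agent is outside the scope of this sketch.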

📊 Media Metadata

{
  "media": [
    {
      "type": "photo",
      "url": "https://crmoxkoizveukayfjuyo.supabase.co/storage/v1/object/public/media/posts/2045139481779696027/media_0.png",
      "filename": "media_0.png"
    }
  ],
  "processed_at": "2026-04-17T14:04:59.666693",
  "pipeline_version": "2.0"
}

🔧 Raw API Response

{
  "type": "tweet",
  "id": "2045139481779696027",
  "url": "https://x.com/omarsar0/status/2045139481779696027",
  "twitterUrl": "https://twitter.com/omarsar0/status/2045139481779696027",
  "text": "LLM agents loop, drift, and get stuck on hard reasoning tasks up to 30% of the time.\n\nCurrent fixes are either too blunt (hard step limits) or too expensive (LLM-as-judge adding 10-15% overhead per step).\n\nNew research proposes a smarter middle ground.\n\nThe work introduces the Cognitive Companion, a parallel monitoring architecture with two variants: an LLM-based monitor and a novel Probe-based monitor that detects reasoning degradation from the model's own hidden states at zero inference overhead.\n\nThe Probe-based Companion trains a simple logistic regression classifier on hidden states from layer 28. It reads the model's internal representations during the existing forward pass, requiring no additional model calls. A single matrix multiplication is all it takes to flag when reasoning quality is declining.\n\nWhy does it matter?\n\nThe LLM-based Companion reduced repetition on loop-prone tasks by 52-62% with roughly 11% overhead. The Probe-based variant achieved a mean effect size of +0.471 with zero measured overhead and AUROC 0.840 on cross-validated detection.\n\nBut the results also reveal an important nuance: companions help on loop-prone and open-ended tasks while showing neutral or negative effects on structured tasks. Models below 3B parameters also struggled to act on companion guidance at all.\n\nThis suggests the future isn't universal monitoring but selective activation, deploying cognitive companions only where reasoning degradation is a real risk.\n\nPaper: https://t.co/K2vqDADwU8\n\nLearn to build effective AI agents in our academy: https://t.co/1e8RZKs4uX",
  "source": "Twitter for iPhone",
  "retweetCount": 1,
  "replyCount": 1,
  "likeCount": 4,
  "quoteCount": 0,
  "viewCount": 303,
  "createdAt": "Fri Apr 17 13:57:03 +0000 2026",
  "lang": "en",
  "bookmarkCount": 4,
  "isReply": false,
  "inReplyToId": null,
  "conversationId": "2045139481779696027",
  "displayTextRange": [
    0,
    277
  ],
  "inReplyToUserId": null,
  "inReplyToUsername": null,
  "author": {
    "type": "user",
    "userName": "omarsar0",
    "url": "https://x.com/omarsar0",
    "twitterUrl": "https://twitter.com/omarsar0",
    "id": "3448284313",
    "name": "elvis",
    "isVerified": false,
    "isBlueVerified": true,
    "verifiedType": null,
    "profilePicture": "https://pbs.twimg.com/profile_images/939313677647282181/vZjFWtAn_normal.jpg",
    "coverPicture": "https://pbs.twimg.com/profile_banners/3448284313/1565974901",
    "description": "",
    "location": "DAIR.AI Academy",
    "followers": 298465,
    "following": 811,
    "status": "",
    "canDm": true,
    "canMediaTag": true,
    "createdAt": "Fri Sep 04 12:59:26 +0000 2015",
    "entities": {
      "description": {
        "urls": []
      },
      "url": {}
    },
    "fastFollowersCount": 0,
    "favouritesCount": 35446,
    "hasCustomTimelines": true,
    "isTranslator": true,
    "mediaCount": 4600,
    "statusesCount": 17642,
    "withheldInCountries": [],
    "affiliatesHighlightedLabel": {},
    "possiblySensitive": false,
    "pinnedTweetIds": [
      "2044769798845079665"
    ],
    "profile_bio": {
      "description": "Building @dair_ai • Prev: Meta AI, Elastic, PhD • New AI learning portal: https://t.co/1e8RZKs4uX",
      "entities": {
        "description": {
          "urls": [
            {
              "display_url": "academy.dair.ai",
              "expanded_url": "https://academy.dair.ai/",
              "indices": [
                74,
                97
              ],
              "url": "https://t.co/1e8RZKs4uX"
            }
          ],
          "user_mentions": [
            {
              "id_str": "",
              "indices": [
                9,
                17
              ],
              "name": "",
              "screen_name": "dair_ai"
            }
          ]
        },
        "url": {
          "urls": [
            {
              "display_url": "dair.ai",
              "expanded_url": "https://www.dair.ai/",
              "indices": [
                0,
                23
              ],
              "url": "https://t.co/XQto5ypSIk"
            }
          ]
        }
      }
    },
    "isAutomated": false,
    "automatedBy": null
  },
  "extendedEntities": {
    "media": [
      {
        "display_url": "pic.twitter.com/DNL8dGLA1T",
        "expanded_url": "https://twitter.com/omarsar0/status/2045139481779696027/photo/1",
        "ext_media_availability": {
          "status": "Available"
        },
        "features": {
          "large": {
            "faces": [
              {
                "h": 210,
                "w": 210,
                "x": 112,
                "y": 1436
              }
            ]
          },
          "orig": {
            "faces": [
              {
                "h": 210,
                "w": 210,
                "x": 112,
                "y": 1436
              }
            ]
          }
        },
        "id_str": "2045139478340386816",
        "indices": [
          278,
          301
        ],
        "media_key": "3_2045139478340386816",
        "media_results": {
          "id": "QXBpTWVkaWFSZXN1bHRzOgwAAQoAARxhy4Uam0AACgACHGHLheea8ZsAAA==",
          "result": {
            "__typename": "ApiMedia",
            "id": "QXBpTWVkaWE6DAABCgABHGHLhRqbQAAKAAIcYcuF55rxmwAA",
            "media_key": "3_2045139478340386816"
          }
        },
        "media_url_https": "https://pbs.twimg.com/media/HGHLhRqbQAApSPf.png",
        "original_info": {
          "focus_rects": [
            {
              "h": 902,
              "w": 1610,
              "x": 0,
              "y": 0
            },
            {
              "h": 1610,
              "w": 1610,
              "x": 0,
              "y": 0
            },
            {
              "h": 1794,
              "w": 1574,
              "x": 36,
              "y": 0
            },
            {
              "h": 1794,
              "w": 897,
              "x": 404,
              "y": 0
            },
            {
              "h": 1794,
              "w": 1610,
              "x": 0,
              "y": 0
            }
          ],
          "height": 1794,
          "width": 1610
        },
        "sizes": {
          "large": {
            "h": 1794,
            "w": 1610
          }
        },
        "type": "photo",
        "url": "https://t.co/DNL8dGLA1T"
      }
    ]
  },
  "card": null,
  "place": {},
  "entities": {
    "hashtags": [],
    "symbols": [],
    "urls": [
      {
        "display_url": "arxiv.org/abs/2604.13759",
        "expanded_url": "https://arxiv.org/abs/2604.13759",
        "indices": [
          1487,
          1510
        ],
        "url": "https://t.co/K2vqDADwU8"
      },
      {
        "display_url": "academy.dair.ai",
        "expanded_url": "https://academy.dair.ai/",
        "indices": [
          1563,
          1586
        ],
        "url": "https://t.co/1e8RZKs4uX"
      }
    ],
    "user_mentions": []
  },
  "quoted_tweet": null,
  "retweeted_tweet": null,
  "isLimitedReply": false,
  "communityInfo": null,
  "article": null
}