@ChrisLaubAI
BREAKING: Alibaba tested 18 AI coding agents on 100 real codebases, each spanning 233 days.

they failed spectacularly.

turns out passing tests once is easy. maintaining code for 8 months without breaking everything is where AI completely collapses.

SWE-CI is the first benchmark that measures long-term code maintenance instead of one-shot bug fixes. each task tracks 71 consecutive commits of real evolution.

75% of models break previously working code during maintenance. only Claude Opus 4.5 and 4.6 stay above a 50% zero-regression rate. every other model accumulates technical debt that compounds with every single iteration.

here's the brutal part:
- HumanEval and SWE-bench measure "does it work right now"
- SWE-CI measures "does it still work after 8 months of changes"

agents optimized for snapshot testing write brittle code that passes tests today but becomes completely unmaintainable tomorrow.

they built EvoScore to weight later iterations heavier than early ones. agents that sacrifice code quality for quick wins get punished when the consequences compound.

the AI coding narrative just got more honest. most models can write code. almost none can maintain it.
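the EvoScore idea, scoring later iterations heavier than early ones, might look roughly like this. a minimal sketch with assumed linear weights over per-iteration pass rates; `evo_score` and the weighting scheme are illustrative, not the benchmark's actual formula:

```python
# hypothetical sketch of an EvoScore-style metric: later iterations
# count more, so early wins can't hide late regressions.
# the linear weights are an assumption, not SWE-CI's published formula.

def evo_score(iteration_pass_rates):
    """Weighted mean of per-iteration pass rates, with the weight
    growing linearly with the iteration index (1, 2, 3, ...)."""
    weights = [i + 1 for i in range(len(iteration_pass_rates))]
    total = sum(weights)
    return sum(w * r for w, r in zip(weights, iteration_pass_rates)) / total

# an agent that starts strong but regresses scores worse than a steady one:
declining = evo_score([1.0, 0.9, 0.5, 0.2])  # early wins, late breakage
steady = evo_score([0.7, 0.7, 0.7, 0.7])     # consistent maintenance
```

under this weighting, the declining agent lands well below the steady one even though its unweighted average is similar, which is exactly the "quick wins get punished" effect described above.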