@rasbt
@anupbhat30 You can tune hparams such that GQA and MLA have roughly the same KV cache size for each model size, but yeah, the question is which one has the better modeling performance at the same size. I think the jury is still out, although rumor has it that MLA doesn't do that well at small sizes. Unfortunately, there's no ablation study across sizes to say anything more concrete.
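As a rough sketch of what "tuning hparams so the KV cache sizes match" could look like: GQA caches K and V for each KV head (2 · n_kv_heads · head_dim per token per layer), while MLA caches a compressed latent (plus a decoupled RoPE key, as in DeepSeek-V2). The specific numbers below are hypothetical, not from any released config:

```python
def gqa_kv_cache_per_token(n_layers, n_kv_heads, head_dim):
    # GQA stores K and V for each KV head: 2 * n_kv_heads * head_dim per layer
    return n_layers * 2 * n_kv_heads * head_dim

def mla_kv_cache_per_token(n_layers, kv_latent_dim, rope_head_dim):
    # MLA stores one compressed KV latent plus a decoupled RoPE key per layer
    return n_layers * (kv_latent_dim + rope_head_dim)

# Hypothetical small-model settings
n_layers, head_dim = 24, 64
mla = mla_kv_cache_per_token(n_layers, kv_latent_dim=256, rope_head_dim=32)

# Pick the GQA KV-head count whose cache size comes closest to MLA's
n_kv_heads = round(mla / (n_layers * 2 * head_dim))
gqa = gqa_kv_cache_per_token(n_layers, n_kv_heads, head_dim)
print(n_kv_heads, gqa, mla)  # values per token, in elements (multiply by dtype bytes)
```

With these numbers the two caches land in the same ballpark, which is the point: cache size can be equalized by construction, so any remaining gap between the two is down to modeling performance.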