🐦 Twitter Post Details

@rasbt

While waiting for DeepSeek V4 we got two very strong open-weight LLMs from India yesterday.

There are two size flavors, Sarvam 30B and Sarvam 105B (both reasoning models).

Interestingly, the smaller 30B model uses “classic” Grouped Query Attention (GQA), whereas the larger 105B variant switched to DeepSeek-style Multi-Head Latent Attention (MLA).

As I wrote in my earlier analyses, both are popular attention variants to reduce KV cache size (the longer the context, the more you save compared to regular attention).

MLA is more complicated to implement, but it can give you better modeling performance if we go by the ablation studies in the 2024 DeepSeek V2 paper (as far as I know, this is still the most recent apples-to-apples comparison).

Speaking of modeling performance, the 105B model is on par with LLMs of similar size: gpt-oss 120B and Qwen3-Next (80B). Sarvam is better on some tasks and worse on others, but roughly the same on average.

It’s not the strongest coder in SWE-Bench Verified terms, but it is surprisingly good at agentic reasoning and task completion (Tau2). It’s even better than DeepSeek R1 0528.

As for the smaller Sarvam 30B, perhaps the most comparable model is Nemotron 3 Nano 30B, which is slightly ahead in coding per SWE-Bench Verified and agentic reasoning (Tau2) but slightly worse in some other aspects (LiveCodeBench v6, BrowseComp).

Unfortunately, Qwen3-30B-A3B is missing in the benchmarks, which, as far as I know, is the most popular model of that size class. Interestingly, though, the Sarvam team compared their 30B model to Qwen3-30B-A3B in a computational performance analysis, where they found that Sarvam gets 20-40% more tokens/sec throughput compared to Qwen3 due to code and kernel optimizations.

Anyways, one thing that is not captured by the benchmarks above is Sarvam’s good performance on Indian languages. According to a judge model, the Sarvam team found that their model is preferred 90% of the time compared to others when it comes to Indian texts. Since they built and trained the tokenizer from scratch as well, Sarvam also comes with a 4 times higher token efficiency on Indian languages.
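
To make the KV cache point above a bit more concrete, here is a minimal back-of-the-envelope sketch. All numbers in it (layer count, head counts, head dimension, latent sizes) are hypothetical placeholders and not Sarvam's actual configuration. The MLA formula assumes the cache stores one compressed latent vector plus one small decoupled RoPE key per token and layer, which is how I understand the DeepSeek V2 description; the 512 and 64 values are borrowed from that setting purely for illustration.

```python
def kv_cache_bytes(n_layers, context_len, elems_per_token_per_layer,
                   bytes_per_elem=2, batch_size=1):
    """Total KV cache size in bytes (bytes_per_elem=2 assumes bf16/fp16)."""
    return (n_layers * context_len * elems_per_token_per_layer
            * bytes_per_elem * batch_size)


def mha_elems(n_heads, head_dim):
    # Regular multi-head attention: one key and one value vector per head.
    return 2 * n_heads * head_dim


def gqa_elems(n_kv_groups, head_dim):
    # GQA: query heads share keys/values, so only n_kv_groups K/V pairs are cached.
    return 2 * n_kv_groups * head_dim


def mla_elems(latent_dim, rope_dim):
    # MLA (assumption, see lead-in): cache a compressed KV latent
    # plus a decoupled RoPE key instead of full per-head keys and values.
    return latent_dim + rope_dim


# Hypothetical configuration, chosen only to illustrate the scaling.
N_LAYERS = 48
variants = {
    "MHA": mha_elems(n_heads=32, head_dim=128),
    "GQA": gqa_elems(n_kv_groups=8, head_dim=128),
    "MLA": mla_elems(latent_dim=512, rope_dim=64),
}

GIB = 1024 ** 3
for context_len in (4_096, 32_768, 131_072):
    sizes = {name: kv_cache_bytes(N_LAYERS, context_len, elems)
             for name, elems in variants.items()}
    saved_vs_mha = (sizes["MHA"] - sizes["GQA"]) / GIB
    row = ", ".join(f"{name}: {size / GIB:.2f} GiB" for name, size in sizes.items())
    print(f"context={context_len:>7}: {row} (GQA saves {saved_vs_mha:.2f} GiB vs. MHA)")
```

The ratio between the variants stays constant, but since the cache grows linearly with the context length, the absolute memory saved grows with it, which is the "the longer the context, the more you save" part.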

📊 Media Metadata

{
  "media": [
    {
      "type": "photo",
      "url": "https://crmoxkoizveukayfjuyo.supabase.co/storage/v1/object/public/media/posts/2030313119487037906/media_0.jpg",
      "filename": "media_0.jpg"
    }
  ],
  "processed_at": "2026-03-08T14:24:36.267513",
  "pipeline_version": "2.0"
}

🔧 Raw API Response

{
  "type": "tweet",
  "id": "2030313119487037906",
  "url": "https://x.com/rasbt/status/2030313119487037906",
  "twitterUrl": "https://twitter.com/rasbt/status/2030313119487037906",
  "text": "While waiting for DeepSeek V4 we got two very strong open-weight LLMs from India yesterday.\n\nThere are two size flavors, Sarvam 30B and Sarvam 105B model (both reasoning models).\n\nInterestingly, the smaller 30B model uses “classic” Grouped Query Attention (GQA), whereas the larger 105B variant switched to DeepSeek-style Multi-Head Latent Attention (MLA).\n\nAs I wrote about in my analyses before, both are popular attention variants to reduce KV cache size (the longer the context, the more you save compared to regular attention).\n\nMLA is more complicated to implement, but it can give you better modeling performance if we go by the ablation studies in the 2024 DeepSeek V2 paper (as far as I know, this is still the most recent apples-to-apples comparison).\n\nSpeaking of modeling performance, the 105B model is on par with LLMs of similar size: gpt-oss 120B and Qwen3-Next (80B). Sarvam is better on some tasks and worse on others, but roughly the same on average.\n\nIt’s not the strongest coder in SWE-Bench Verified terms, but it is surprisingly good at agentic reasoning and task completion (Tau2). It’s even better than Deepseek R1 0528.\n\nConsidering the smaller Sarvam 30B, the perhaps most comparable model to the 30B model is Nemotron 3 Nano 30B, which is slightly ahead in coding per SWE-Bench Verified and agentic reasoning (Tau2) but slightly worse in some other aspects (Live Code Bench v6, BrowseComp).\n\nUnfortunately, Qwen3-30B-A3B is missing in the benchmarks, which is, as far as I know, is the most popular model of that size class. Interestingly, though, the Sarvam team compared their 30B model to Qwen3-30B-A3B on a computational performance analysis, where they found that Sarvam gets 20-40% more tokens/sec throughput compared to Qwen3 due to code and kernel optimizations.\n\nAnyways, one thing that is not captured by the benchmarks above is Sarvam’s good performance on Indian languages. \nAccording to a judge model, the Sarvam team found that their model is preferred 90% of the time compared to others when it comes to Indian texts. (Since they built and trained the tokenizer from scratch as well, Sarvam also comes with a 4 times higher token efficiency on Indian languages.",
  "source": "Twitter for iPhone",
  "retweetCount": 637,
  "replyCount": 39,
  "likeCount": 3821,
  "quoteCount": 31,
  "viewCount": 214808,
  "createdAt": "Sat Mar 07 16:02:23 +0000 2026",
  "lang": "en",
  "bookmarkCount": 1273,
  "isReply": false,
  "inReplyToId": null,
  "conversationId": "2030313119487037906",
  "displayTextRange": [
    0,
    274
  ],
  "inReplyToUserId": null,
  "inReplyToUsername": null,
  "author": {
    "type": "user",
    "userName": "rasbt",
    "url": "https://x.com/rasbt",
    "twitterUrl": "https://twitter.com/rasbt",
    "id": "865622395",
    "name": "Sebastian Raschka",
    "isVerified": false,
    "isBlueVerified": true,
    "verifiedType": null,
    "profilePicture": "https://pbs.twimg.com/profile_images/1661187442043486209/a3E4t1eV_normal.jpg",
    "coverPicture": "https://pbs.twimg.com/profile_banners/865622395/1742309979",
    "description": "",
    "location": "United States",
    "followers": 404595,
    "following": 1133,
    "status": "",
    "canDm": false,
    "canMediaTag": true,
    "createdAt": "Sun Oct 07 02:06:16 +0000 2012",
    "entities": {
      "description": {
        "urls": []
      },
      "url": {}
    },
    "fastFollowersCount": 0,
    "favouritesCount": 24208,
    "hasCustomTimelines": true,
    "isTranslator": false,
    "mediaCount": 2078,
    "statusesCount": 19467,
    "withheldInCountries": [],
    "affiliatesHighlightedLabel": {},
    "possiblySensitive": false,
    "pinnedTweetIds": [
      "1999847254367117736"
    ],
    "profile_bio": {
      "description": "ML/AI research engineer. Ex stats professor.\nAuthor of \"Build a Large Language Model From Scratch\" (https://t.co/O8LAAMRzzW) & reasoning (https://t.co/5TueQKx2Fk)",
      "entities": {
        "description": {
          "hashtags": [],
          "symbols": [],
          "urls": [
            {
              "display_url": "amzn.to/4fqvn0D",
              "expanded_url": "https://amzn.to/4fqvn0D",
              "indices": [
                100,
                123
              ],
              "url": "https://t.co/O8LAAMRzzW"
            },
            {
              "display_url": "mng.bz/lZ5B",
              "expanded_url": "https://mng.bz/lZ5B",
              "indices": [
                138,
                161
              ],
              "url": "https://t.co/5TueQKx2Fk"
            }
          ],
          "user_mentions": []
        },
        "url": {
          "urls": [
            {
              "display_url": "sebastianraschka.com",
              "expanded_url": "https://sebastianraschka.com",
              "indices": [
                0,
                23
              ],
              "url": "https://t.co/HrtQQ5tgJl"
            }
          ]
        }
      }
    },
    "isAutomated": false,
    "automatedBy": null
  },
  "extendedEntities": {
    "media": [
      {
        "allow_download_status": {
          "allow_download": true
        },
        "display_url": "pic.twitter.com/0uqmLxofRE",
        "expanded_url": "https://twitter.com/rasbt/status/2030313119487037906/photo/1",
        "ext_media_availability": {
          "status": "Available"
        },
        "features": {
          "large": {
            "faces": [
              {
                "h": 416,
                "w": 416,
                "x": 592,
                "y": 442
              }
            ]
          },
          "orig": {
            "faces": [
              {
                "h": 832,
                "w": 832,
                "x": 1184,
                "y": 884
              }
            ]
          }
        },
        "id_str": "2030312841207574528",
        "indices": [
          275,
          298
        ],
        "media_key": "3_2030312841207574528",
        "media_results": {
          "id": "QXBpTWVkaWFSZXN1bHRzOgwAAQoAARwtHsWjF3AACgACHC0fBm3XEdIAAA==",
          "result": {
            "__typename": "ApiMedia",
            "id": "QXBpTWVkaWE6DAABCgABHC0exaMXcAAKAAIcLR8GbdcR0gAA",
            "media_key": "3_2030312841207574528"
          }
        },
        "media_url_https": "https://pbs.twimg.com/media/HC0exaMXcAAdqg8.jpg",
        "original_info": {
          "focus_rects": [
            {
              "h": 2266,
              "w": 4046,
              "x": 0,
              "y": 198
            },
            {
              "h": 4046,
              "w": 4046,
              "x": 0,
              "y": 0
            },
            {
              "h": 4096,
              "w": 3593,
              "x": 0,
              "y": 0
            },
            {
              "h": 4096,
              "w": 2048,
              "x": 716,
              "y": 0
            },
            {
              "h": 4096,
              "w": 4046,
              "x": 0,
              "y": 0
            }
          ],
          "height": 4096,
          "width": 4046
        },
        "sizes": {
          "large": {
            "h": 2048,
            "w": 2023
          }
        },
        "type": "photo",
        "url": "https://t.co/0uqmLxofRE"
      }
    ]
  },
  "card": null,
  "place": {},
  "entities": {
    "hashtags": [],
    "symbols": [],
    "urls": [],
    "user_mentions": []
  },
  "quoted_tweet": {
    "type": "tweet",
    "id": "2029965547824431356",
    "url": "https://x.com/pratykumar/status/2029965547824431356",
    "twitterUrl": "https://twitter.com/pratykumar/status/2029965547824431356",
    "text": "📢 Open-sourcing the Sarvam 30B and 105B models! Trained from scratch with all data, model research and inference optimisation done in-house, these models punch above their weight in most global benchmarks plus excel in Indian languages.\n\nGet the weights at Hugging Face and AIKosh. Thanks to the good folks at SGLang for day 0 support, vLLM support coming soon. Links, benchmark scores, examples, and more in our blog - https://t.co/DcCG3zlN8p",
    "source": "Twitter for iPhone",
    "retweetCount": 1243,
    "replyCount": 200,
    "likeCount": 6717,
    "quoteCount": 175,
    "viewCount": 645124,
    "createdAt": "Fri Mar 06 17:01:16 +0000 2026",
    "lang": "en",
    "bookmarkCount": 1228,
    "isReply": false,
    "inReplyToId": null,
    "conversationId": "2029965547824431356",
    "displayTextRange": [
      0,
      273
    ],
    "inReplyToUserId": null,
    "inReplyToUsername": null,
    "author": {
      "type": "user",
      "userName": "pratykumar",
      "url": "https://x.com/pratykumar",
      "twitterUrl": "https://twitter.com/pratykumar",
      "id": "1530518342049890307",
      "name": "Pratyush Kumar",
      "isVerified": false,
      "isBlueVerified": true,
      "verifiedType": null,
      "profilePicture": "https://pbs.twimg.com/profile_images/1530522280824295426/PR-Tv-wj_normal.jpg",
      "coverPicture": "https://pbs.twimg.com/profile_banners/1530518342049890307/1653740011",
      "description": "",
      "location": "",
      "followers": 26470,
      "following": 167,
      "status": "",
      "canDm": true,
      "canMediaTag": true,
      "createdAt": "Sat May 28 11:56:36 +0000 2022",
      "entities": {
        "description": {
          "urls": []
        },
        "url": {}
      },
      "fastFollowersCount": 0,
      "favouritesCount": 132,
      "hasCustomTimelines": true,
      "isTranslator": false,
      "mediaCount": 70,
      "statusesCount": 135,
      "withheldInCountries": [],
      "affiliatesHighlightedLabel": {},
      "possiblySensitive": false,
      "pinnedTweetIds": [],
      "profile_bio": {
        "description": "Building Sarvam AI",
        "entities": {
          "description": {
            "hashtags": [],
            "symbols": [],
            "urls": [],
            "user_mentions": []
          },
          "url": {
            "urls": [
              {
                "display_url": "dashboard.sarvam.ai",
                "expanded_url": "https://dashboard.sarvam.ai/",
                "indices": [
                  0,
                  23
                ],
                "url": "https://t.co/qDVVHQ47js"
              }
            ]
          }
        }
      },
      "isAutomated": false,
      "automatedBy": null
    },
    "extendedEntities": {},
    "card": {
      "binding_values": [
        {
          "key": "photo_image_full_size_large",
          "value": {
            "image_value": {
              "height": 419,
              "url": "https://pbs.twimg.com/card_img/2029940858519441412/Umip-8Ab?format=jpg&name=800x419",
              "width": 800
            }
          }
        },
        {
          "key": "thumbnail_image",
          "value": {
            "image_value": {
              "height": 150,
              "url": "https://pbs.twimg.com/card_img/2029940858519441412/Umip-8Ab?format=jpg&name=280x150",
              "width": 240
            }
          }
        },
        {
          "key": "domain",
          "value": {
            "string_value": "www.sarvam.ai"
          }
        },
        {
          "key": "thumbnail_image_large",
          "value": {
            "image_value": {
              "height": 320,
              "url": "https://pbs.twimg.com/card_img/2029940858519441412/Umip-8Ab?format=jpg&name=800x320_1",
              "width": 512
            }
          }
        },
        {
          "key": "summary_photo_image_small",
          "value": {
            "image_value": {
              "height": 202,
              "url": "https://pbs.twimg.com/card_img/2029940858519441412/Umip-8Ab?format=jpg&name=386x202",
              "width": 386
            }
          }
        },
        {
          "key": "thumbnail_image_original",
          "value": {
            "image_value": {
              "height": 750,
              "url": "https://pbs.twimg.com/card_img/2029940858519441412/Umip-8Ab?format=jpg&name=orig",
              "width": 1200
            }
          }
        },
        {
          "key": "photo_image_full_size_small",
          "value": {
            "image_value": {
              "height": 202,
              "url": "https://pbs.twimg.com/card_img/2029940858519441412/Umip-8Ab?format=jpg&name=386x202",
              "width": 386
            }
          }
        },
        {
          "key": "summary_photo_image_large",
          "value": {
            "image_value": {
              "height": 419,
              "url": "https://pbs.twimg.com/card_img/2029940858519441412/Umip-8Ab?format=jpg&name=800x419",
              "width": 800
            }
          }
        },
        {
          "key": "thumbnail_image_small",
          "value": {
            "image_value": {
              "height": 63,
              "url": "https://pbs.twimg.com/card_img/2029940858519441412/Umip-8Ab?format=jpg&name=100x100",
              "width": 100
            }
          }
        },
        {
          "key": "thumbnail_image_x_large",
          "value": {
            "image_value": {
              "height": 750,
              "url": "https://pbs.twimg.com/card_img/2029940858519441412/Umip-8Ab?format=png&name=2048x2048_2_exp",
              "width": 1200
            }
          }
        },
        {
          "key": "photo_image_full_size_original",
          "value": {
            "image_value": {
              "height": 750,
              "url": "https://pbs.twimg.com/card_img/2029940858519441412/Umip-8Ab?format=jpg&name=orig",
              "width": 1200
            }
          }
        },
        {
          "key": "vanity_url",
          "value": {
            "scribe_key": "vanity_url",
            "string_value": "sarvam.ai"
          }
        },
        {
          "key": "photo_image_full_size",
          "value": {
            "image_value": {
              "height": 314,
              "url": "https://pbs.twimg.com/card_img/2029940858519441412/Umip-8Ab?format=jpg&name=600x314",
              "width": 600
            }
          }
        },
        {
          "key": "thumbnail_image_color",
          "value": {
            "image_color_value": {
              "palette": [
                {
                  "percentage": 73.1,
                  "rgb": {
                    "blue": 37,
                    "green": 110,
                    "red": 86
                  }
                },
                {
                  "percentage": 17.65,
                  "rgb": {
                    "blue": 44,
                    "green": 154,
                    "red": 168
                  }
                },
                {
                  "percentage": 3.8,
                  "rgb": {
                    "blue": 206,
                    "green": 221,
                    "red": 213
                  }
                },
                {
                  "percentage": 2.22,
                  "rgb": {
                    "blue": 29,
                    "green": 75,
                    "red": 53
                  }
                },
                {
                  "percentage": 1.36,
                  "rgb": {
                    "blue": 49,
                    "green": 156,
                    "red": 139
                  }
                }
              ]
            }
          }
        },
        {
          "key": "title",
          "value": {
            "string_value": "Open-Sourcing Sarvam 30B and 105B | Sarvam AI"
          }
        },
        {
          "key": "summary_photo_image_color",
          "value": {
            "image_color_value": {
              "palette": [
                {
                  "percentage": 73.1,
                  "rgb": {
                    "blue": 37,
                    "green": 110,
                    "red": 86
                  }
                },
                {
                  "percentage": 17.65,
                  "rgb": {
                    "blue": 44,
                    "green": 154,
                    "red": 168
                  }
                },
                {
                  "percentage": 3.8,
                  "rgb": {
                    "blue": 206,
                    "green": 221,
                    "red": 213
                  }
                },
                {
                  "percentage": 2.22,
                  "rgb": {
                    "blue": 29,
                    "green": 75,
                    "red": 53
                  }
                },
                {
                  "percentage": 1.36,
                  "rgb": {
                    "blue": 49,
                    "green": 156,
                    "red": 139
                  }
                }
              ]
            }
          }
        },
        {
          "key": "summary_photo_image_x_large",
          "value": {
            "image_value": {
              "height": 750,
              "url": "https://pbs.twimg.com/card_img/2029940858519441412/Umip-8Ab?format=png&name=2048x2048_2_exp",
              "width": 1200
            }
          }
        },
        {
          "key": "summary_photo_image",
          "value": {
            "image_value": {
              "height": 314,
              "url": "https://pbs.twimg.com/card_img/2029940858519441412/Umip-8Ab?format=jpg&name=600x314",
              "width": 600
            }
          }
        },
        {
          "key": "photo_image_full_size_color",
          "value": {
            "image_color_value": {
              "palette": [
                {
                  "percentage": 73.1,
                  "rgb": {
                    "blue": 37,
                    "green": 110,
                    "red": 86
                  }
                },
                {
                  "percentage": 17.65,
                  "rgb": {
                    "blue": 44,
                    "green": 154,
                    "red": 168
                  }
                },
                {
                  "percentage": 3.8,
                  "rgb": {
                    "blue": 206,
                    "green": 221,
                    "red": 213
                  }
                },
                {
                  "percentage": 2.22,
                  "rgb": {
                    "blue": 29,
                    "green": 75,
                    "red": 53
                  }
                },
                {
                  "percentage": 1.36,
                  "rgb": {
                    "blue": 49,
                    "green": 156,
                    "red": 139
                  }
                }
              ]
            }
          }
        },
        {
          "key": "photo_image_full_size_x_large",
          "value": {
            "image_value": {
              "height": 750,
              "url": "https://pbs.twimg.com/card_img/2029940858519441412/Umip-8Ab?format=png&name=2048x2048_2_exp",
              "width": 1200
            }
          }
        },
        {
          "key": "card_url",
          "value": {
            "scribe_key": "card_url",
            "string_value": "https://t.co/DcCG3zlN8p"
          }
        },
        {
          "key": "summary_photo_image_original",
          "value": {
            "image_value": {
              "height": 750,
              "url": "https://pbs.twimg.com/card_img/2029940858519441412/Umip-8Ab?format=jpg&name=orig",
              "width": 1200
            }
          }
        }
      ],
      "card_platform": {
        "platform": {
          "audience": {
            "name": "production"
          },
          "device": {
            "name": "iPhone",
            "version": "13"
          }
        }
      },
      "name": "summary_large_image",
      "url": "https://t.co/DcCG3zlN8p",
      "user_refs_results": []
    },
    "place": {},
    "entities": {
      "hashtags": [],
      "symbols": [],
      "urls": [
        {
          "display_url": "sarvam.ai/blogs/sarvam-3…",
          "expanded_url": "https://www.sarvam.ai/blogs/sarvam-30b-105b",
          "indices": [
            420,
            443
          ],
          "url": "https://t.co/DcCG3zlN8p"
        }
      ],
      "user_mentions": []
    },
    "quoted_tweet": null,
    "retweeted_tweet": null,
    "isLimitedReply": false,
    "article": null
  },
  "retweeted_tweet": null,
  "isLimitedReply": false,
  "article": null
}