🐦 Twitter Post Details

@ZeyuanAllenZhu

🚀 NVIDIA continues to lead on open-sourcing pretraining data — Nemotron-CC-v2 has dropped! 👏 Congrats to @KarimiRabeeh @issanjeev @PavloMolchanov @KezhiKong @SimonXinDong @ctnzr @YejinChoinka + many others! 🙏 A very loud thank you for citing our Physics of LMs, Part 3.1. You’re perhaps the first leading lab to publicly acknowledge its usefulness (knowledge augmentation: add QA at pretrain-level, add diversity + translation). When I ran this code 2 years ago, it was using V100s + 8 A100s so many didn’t believe in it --- I wasn’t approved to test on real-life data, couldn’t secure GPUs for larger experiments. That’s why this recognition really matters: it validates the value of foundational projects like ours, and helps me keep pushing to deliver more insights for the AI community. Truly grateful. https://t.co/c5g1VMUhCr

📊 Media Metadata

{
  "media": [
    {
      "url": "https://crmoxkoizveukayfjuyo.supabase.co/storage/v1/object/public/media/posts/1962119316427706828/media_0.jpg?",
      "media_url": "https://crmoxkoizveukayfjuyo.supabase.co/storage/v1/object/public/media/posts/1962119316427706828/media_0.jpg?",
      "type": "photo",
      "filename": "media_0.jpg"
    },
    {
      "url": "https://crmoxkoizveukayfjuyo.supabase.co/storage/v1/object/public/media/posts/1962119316427706828/media_1.jpg?",
      "media_url": "https://crmoxkoizveukayfjuyo.supabase.co/storage/v1/object/public/media/posts/1962119316427706828/media_1.jpg?",
      "type": "photo",
      "filename": "media_1.jpg"
    },
    {
      "url": "https://crmoxkoizveukayfjuyo.supabase.co/storage/v1/object/public/media/posts/1962119316427706828/media_2.jpg?",
      "media_url": "https://crmoxkoizveukayfjuyo.supabase.co/storage/v1/object/public/media/posts/1962119316427706828/media_2.jpg?",
      "type": "photo",
      "filename": "media_2.jpg"
    }
  ],
  "processed_at": "2025-09-01T20:03:36.447613",
  "pipeline_version": "2.0"
}

🔧 Raw API Response

{
  "type": "tweet",
  "id": "1962119316427706828",
  "url": "https://x.com/ZeyuanAllenZhu/status/1962119316427706828",
  "twitterUrl": "https://twitter.com/ZeyuanAllenZhu/status/1962119316427706828",
  "text": "🚀 NVIDIA continues to lead on open-sourcing pretraining data — Nemotron-CC-v2 has dropped! \n👏 Congrats to @KarimiRabeeh @issanjeev @PavloMolchanov @KezhiKong @SimonXinDong @ctnzr @YejinChoinka + many others!\n\n🙏 A very loud thank you for citing our Physics of LMs, Part 3.1. You’re perhaps the first leading lab to publicly acknowledge its usefulness (knowledge augmentation: add QA at pretrain-level, add diversity + translation).\n\nWhen I ran this code 2 years ago, it was using V100s + 8 A100s so many didn’t believe in it --- I wasn’t approved to test on real-life data, couldn’t secure GPUs for larger experiments. That’s why this recognition really matters: it validates the value of foundational projects like ours, and helps me keep pushing to deliver more insights for the AI community. Truly grateful.\n\nhttps://t.co/c5g1VMUhCr",
  "source": "Twitter for iPhone",
  "retweetCount": 74,
  "replyCount": 13,
  "likeCount": 572,
  "quoteCount": 9,
  "viewCount": 47678,
  "createdAt": "Sun Aug 31 11:44:34 +0000 2025",
  "lang": "en",
  "bookmarkCount": 252,
  "isReply": false,
  "inReplyToId": null,
  "conversationId": "1962119316427706828",
  "displayTextRange": [
    0,
    274
  ],
  "inReplyToUserId": null,
  "inReplyToUsername": null,
  "author": {
    "type": "user",
    "userName": "ZeyuanAllenZhu",
    "url": "https://x.com/ZeyuanAllenZhu",
    "twitterUrl": "https://twitter.com/ZeyuanAllenZhu",
    "id": "136335720",
    "name": "Zeyuan Allen-Zhu, Sc.D.",
    "isVerified": false,
    "isBlueVerified": false,
    "verifiedType": null,
    "profilePicture": "https://pbs.twimg.com/profile_images/1777537825493426176/OYHinr5A_normal.jpg",
    "coverPicture": "https://pbs.twimg.com/profile_banners/136335720/1722053273",
    "description": "",
    "location": "",
    "followers": 20017,
    "following": 449,
    "status": "",
    "canDm": false,
    "canMediaTag": false,
    "createdAt": "Fri Apr 23 16:59:01 +0000 2010",
    "entities": {
      "description": {
        "urls": []
      },
      "url": {}
    },
    "fastFollowersCount": 0,
    "favouritesCount": 583,
    "hasCustomTimelines": true,
    "isTranslator": false,
    "mediaCount": 94,
    "statusesCount": 479,
    "withheldInCountries": [],
    "affiliatesHighlightedLabel": {},
    "possiblySensitive": false,
    "pinnedTweetIds": [
      "1949230282860892259"
    ],
    "profile_bio": {
      "description": "physics of language models @ Meta (FAIR, not GenAI)\n🎓:Tsinghua Physics — MIT CSAIL — Princeton/IAS\n🏅:IOI x 2 — ACM-ICPC — USACO — Codejam — math MCM",
      "entities": {
        "description": {},
        "url": {
          "urls": [
            {
              "display_url": "zeyuan.allen-zhu.com",
              "expanded_url": "http://zeyuan.allen-zhu.com",
              "indices": [
                0,
                23
              ],
              "url": "https://t.co/pBdNTwt2he"
            }
          ]
        }
      }
    },
    "isAutomated": false,
    "automatedBy": null
  },
  "extendedEntities": {
    "media": [
      {
        "display_url": "pic.twitter.com/APR4n747DU",
        "expanded_url": "https://twitter.com/ZeyuanAllenZhu/status/1962119316427706828/photo/1",
        "ext_media_availability": {
          "status": "Available"
        },
        "features": {
          "large": {
            "faces": [
              {
                "h": 176,
                "w": 176,
                "x": 948,
                "y": 123
              }
            ]
          },
          "orig": {
            "faces": [
              {
                "h": 176,
                "w": 176,
                "x": 948,
                "y": 123
              }
            ]
          }
        },
        "id_str": "1962119271020122112",
        "indices": [
          275,
          298
        ],
        "media_key": "3_1962119271020122112",
        "media_results": {
          "id": "QXBpTWVkaWFSZXN1bHRzOgwAAQoAARs62RTMGlAACgACGzrZH16bEcwAAA==",
          "result": {
            "__typename": "ApiMedia",
            "id": "QXBpTWVkaWE6DAABCgABGzrZFMwaUAAKAAIbOtkfXpsRzAAA",
            "media_key": "3_1962119271020122112"
          }
        },
        "media_url_https": "https://pbs.twimg.com/media/GzrZFMwaUAAU1yJ.jpg",
        "original_info": {
          "focus_rects": [
            {
              "h": 774,
              "w": 1382,
              "x": 0,
              "y": 0
            },
            {
              "h": 774,
              "w": 774,
              "x": 0,
              "y": 0
            },
            {
              "h": 774,
              "w": 679,
              "x": 0,
              "y": 0
            },
            {
              "h": 774,
              "w": 387,
              "x": 0,
              "y": 0
            },
            {
              "h": 774,
              "w": 1867,
              "x": 0,
              "y": 0
            }
          ],
          "height": 774,
          "width": 1867
        },
        "sizes": {
          "large": {
            "h": 774,
            "w": 1867
          }
        },
        "type": "photo",
        "url": "https://t.co/APR4n747DU"
      },
      {
        "display_url": "pic.twitter.com/APR4n747DU",
        "expanded_url": "https://twitter.com/ZeyuanAllenZhu/status/1962119316427706828/photo/1",
        "ext_media_availability": {
          "status": "Available"
        },
        "features": {
          "large": {},
          "orig": {}
        },
        "id_str": "1962119293262577664",
        "indices": [
          275,
          298
        ],
        "media_key": "3_1962119293262577664",
        "media_results": {
          "id": "QXBpTWVkaWFSZXN1bHRzOgwAAQoAARs62Rn520AACgACGzrZH16bEcwAAA==",
          "result": {
            "__typename": "ApiMedia",
            "id": "QXBpTWVkaWE6DAABCgABGzrZGfnbQAAKAAIbOtkfXpsRzAAA",
            "media_key": "3_1962119293262577664"
          }
        },
        "media_url_https": "https://pbs.twimg.com/media/GzrZGfnbQAAReRz.jpg",
        "original_info": {
          "focus_rects": [
            {
              "h": 797,
              "w": 1423,
              "x": 0,
              "y": 0
            },
            {
              "h": 797,
              "w": 797,
              "x": 0,
              "y": 0
            },
            {
              "h": 797,
              "w": 699,
              "x": 0,
              "y": 0
            },
            {
              "h": 797,
              "w": 399,
              "x": 110,
              "y": 0
            },
            {
              "h": 797,
              "w": 1777,
              "x": 0,
              "y": 0
            }
          ],
          "height": 797,
          "width": 1777
        },
        "sizes": {
          "large": {
            "h": 797,
            "w": 1777
          }
        },
        "type": "photo",
        "url": "https://t.co/APR4n747DU"
      }
    ]
  },
  "card": null,
  "place": {},
  "entities": {
    "urls": [
      {
        "display_url": "huggingface.co/datasets/nvidi…",
        "expanded_url": "https://huggingface.co/datasets/nvidia/Nemotron-CC-v2",
        "indices": [
          811,
          834
        ],
        "url": "https://t.co/c5g1VMUhCr"
      }
    ],
    "user_mentions": [
      {
        "id_str": "1041794770174205955",
        "indices": [
          106,
          119
        ],
        "name": "Rabeeh Karimi",
        "screen_name": "KarimiRabeeh"
      },
      {
        "id_str": "31904918",
        "indices": [
          120,
          130
        ],
        "name": "Sanjeev Satheesh",
        "screen_name": "issanjeev"
      },
      {
        "id_str": "2368348172",
        "indices": [
          131,
          146
        ],
        "name": "Pavlo Molchanov",
        "screen_name": "PavloMolchanov"
      },
      {
        "id_str": "1295857882026647552",
        "indices": [
          147,
          157
        ],
        "name": "Kezhi Kong",
        "screen_name": "KezhiKong"
      },
      {
        "id_str": "851356654528376832",
        "indices": [
          158,
          171
        ],
        "name": "X. Dong",
        "screen_name": "SimonXinDong"
      },
      {
        "id_str": "257642411",
        "indices": [
          172,
          178
        ],
        "name": "Bryan Catanzaro",
        "screen_name": "ctnzr"
      },
      {
        "id_str": "893882282175471616",
        "indices": [
          179,
          192
        ],
        "name": "Yejin Choi",
        "screen_name": "YejinChoinka"
      }
    ]
  },
  "quoted_tweet": null,
  "retweeted_tweet": null,
  "isLimitedReply": false,
  "article": null
}