@ylecun
RT @askalphaxiv: Yann LeCun 🤝 Saining Xie insane crossover of the 2 biggest visual representation researchers in the AI field “Beyond Lan…
{
"score": 0.36,
"score_components": {
"author": 0.09,
"engagement": 0.0,
"quality": 0.06000000000000001,
"source": 0.135,
"nlp": 0.05,
"recency": 0.025
},
"scored_at": "2026-03-06T07:22:45.484712",
"import_source": "api_import",
"source_tagged_at": "2026-03-06T07:22:45.484732",
"enriched": true,
"enriched_at": "2026-03-06T07:22:45.484735"
}
{
"type": "tweet",
"id": "2029794560876929392",
"url": "https://x.com/ylecun/status/2029794560876929392",
"twitterUrl": "https://twitter.com/ylecun/status/2029794560876929392",
"text": "RT @askalphaxiv: Yann LeCun 🤝 Saining Xie\n\ninsane crossover of the 2 biggest visual representation researchers in the AI field\n\n“Beyond Lan…",
"source": "Twitter for iPhone",
"retweetCount": 99,
"replyCount": 11,
"likeCount": 654,
"quoteCount": 8,
"viewCount": 48509,
"createdAt": "Fri Mar 06 05:41:49 +0000 2026",
"lang": "en",
"bookmarkCount": 462,
"isReply": false,
"inReplyToId": null,
"conversationId": "2029794560876929392",
"displayTextRange": [
0,
140
],
"inReplyToUserId": null,
"inReplyToUsername": null,
"author": {
"type": "user",
"userName": "ylecun",
"url": "https://x.com/ylecun",
"twitterUrl": "https://twitter.com/ylecun",
"id": "48008938",
"name": "Yann LeCun",
"isVerified": false,
"isBlueVerified": true,
"verifiedType": null,
"profilePicture": "https://pbs.twimg.com/profile_images/1483577865056702469/rWA-3_T7_normal.jpg",
"coverPicture": "https://pbs.twimg.com/profile_banners/48008938/1642547502",
"description": "",
"location": "New York",
"followers": 1054999,
"following": 775,
"status": "",
"canDm": false,
"canMediaTag": true,
"createdAt": "Wed Jun 17 16:05:51 +0000 2009",
"entities": {
"description": {
"urls": []
},
"url": {}
},
"fastFollowersCount": 0,
"favouritesCount": 27248,
"hasCustomTimelines": true,
"isTranslator": false,
"mediaCount": 461,
"statusesCount": 25191,
"withheldInCountries": [],
"affiliatesHighlightedLabel": {},
"possiblySensitive": false,
"pinnedTweetIds": [
"1862598063275061484"
],
"profile_bio": {
"description": "Professor at NYU & Executive Chairman at AMI Labs. \nEx-Chief AI Scientist at Meta.\nResearcher in AI, Machine Learning, Robotics, etc.\nACM Turing Award Laureate.",
"entities": {
"description": {
"hashtags": [],
"symbols": [],
"urls": [],
"user_mentions": []
},
"url": {
"urls": [
{
"display_url": "yann.lecun.com",
"expanded_url": "http://yann.lecun.com",
"indices": [
0,
23
],
"url": "https://t.co/POp7IBHfXy"
}
]
}
}
},
"isAutomated": false,
"automatedBy": null
},
"extendedEntities": {},
"card": null,
"place": {},
"entities": {
"hashtags": [],
"symbols": [],
"timestamps": [],
"urls": [],
"user_mentions": [
{
"id_str": "1722422481942884352",
"indices": [
3,
15
],
"name": "alphaXiv",
"screen_name": "askalphaxiv"
}
]
},
"quoted_tweet": null,
"retweeted_tweet": {
"type": "tweet",
"id": "2029644559391535314",
"url": "https://x.com/askalphaxiv/status/2029644559391535314",
"twitterUrl": "https://twitter.com/askalphaxiv/status/2029644559391535314",
"text": "Yann LeCun 🤝 Saining Xie\n\ninsane crossover of the 2 biggest visual representation researchers in the AI field\n\n“Beyond Language Modeling: An Exploration of Multimodal Pretraining”\n\nRight now, most multimodal models are basically a language model with a vision adapter bolted on, so they can describe images, but they don’t really think in images or video.\n\nThis paper shows what happens when you do it the hard way: train one model from scratch on text, images, and video with a unified setup.\n\nThey key idea is if you give the model a good visual internal format and it can use vision for both understanding and generating.\n\nAdditionally, multimodal data can improve language instead of distracting it, and mixture-of-experts lets you scale vision’s huge data intake without bloating everything else.\n\nThis paves the way towards changing the vision paradigm from “captioning add-on” model to native multimodal foundation model.",
"source": "Twitter for iPhone",
"retweetCount": 99,
"replyCount": 11,
"likeCount": 654,
"quoteCount": 8,
"viewCount": 48509,
"createdAt": "Thu Mar 05 19:45:46 +0000 2026",
"lang": "en",
"bookmarkCount": 462,
"isReply": false,
"inReplyToId": null,
"conversationId": "2029644559391535314",
"displayTextRange": [
0,
278
],
"inReplyToUserId": null,
"inReplyToUsername": null,
"author": {
"type": "user",
"userName": "askalphaxiv",
"url": "https://x.com/askalphaxiv",
"twitterUrl": "https://twitter.com/askalphaxiv",
"id": "1722422481942884352",
"name": "alphaXiv",
"isVerified": false,
"isBlueVerified": true,
"verifiedType": null,
"profilePicture": "https://pbs.twimg.com/profile_images/1866663567417806848/-Vj32Dq-_normal.jpg",
"coverPicture": "https://pbs.twimg.com/profile_banners/1722422481942884352/1738960325",
"description": "",
"location": "",
"followers": 34303,
"following": 45,
"status": "",
"canDm": true,
"canMediaTag": true,
"createdAt": "Thu Nov 09 01:15:23 +0000 2023",
"entities": {
"description": {
"urls": []
},
"url": {}
},
"fastFollowersCount": 0,
"favouritesCount": 3137,
"hasCustomTimelines": true,
"isTranslator": false,
"mediaCount": 815,
"statusesCount": 1729,
"withheldInCountries": [],
"affiliatesHighlightedLabel": {},
"possiblySensitive": false,
"pinnedTweetIds": [
"2029226538445820054"
],
"profile_bio": {
"description": "High fidelity research",
"entities": {
"description": {
"hashtags": [],
"symbols": [],
"urls": [],
"user_mentions": []
},
"url": {
"urls": [
{
"display_url": "alphaxiv.org",
"expanded_url": "http://alphaxiv.org",
"indices": [
0,
23
],
"url": "https://t.co/7lQNcnzeZ7"
}
]
}
}
},
"isAutomated": false,
"automatedBy": null
},
"extendedEntities": {
"media": [
{
"allow_download_status": {
"allow_download": true
},
"display_url": "pic.twitter.com/itKoXYNyqf",
"expanded_url": "https://twitter.com/askalphaxiv/status/2029644559391535314/photo/1",
"ext_media_availability": {
"status": "Available"
},
"features": {
"large": {
"faces": []
},
"orig": {
"faces": []
}
},
"id_str": "2029643423783809025",
"indices": [
279,
302
],
"media_key": "3_2029643423783809025",
"media_results": {
"id": "QXBpTWVkaWFSZXN1bHRzOgwAAQoAARwqvfC/V3ABCgACHCq++SbWMNIAAA==",
"result": {
"__typename": "ApiMedia",
"id": "QXBpTWVkaWE6DAABCgABHCq98L9XcAEKAAIcKr75JtYw0gAA",
"media_key": "3_2029643423783809025"
}
},
"media_url_https": "https://pbs.twimg.com/media/HCq98L9XcAEwn7h.jpg",
"original_info": {
"focus_rects": [
{
"h": 809,
"w": 1444,
"x": 0,
"y": 0
},
{
"h": 1444,
"w": 1444,
"x": 0,
"y": 0
},
{
"h": 1626,
"w": 1426,
"x": 0,
"y": 0
},
{
"h": 1626,
"w": 813,
"x": 0,
"y": 0
},
{
"h": 1626,
"w": 1444,
"x": 0,
"y": 0
}
],
"height": 1626,
"width": 1444
},
"sizes": {
"large": {
"h": 1626,
"w": 1444
}
},
"type": "photo",
"url": "https://t.co/itKoXYNyqf"
}
]
},
"card": null,
"place": {},
"entities": {
"hashtags": [],
"symbols": [],
"urls": [],
"user_mentions": []
},
"quoted_tweet": null,
"retweeted_tweet": null,
"isLimitedReply": false,
"article": null
},
"isLimitedReply": false,
"article": null
}