@omarsar0
ROUGE misaligns with humans. In a human study, LLM‑as‑Judge matches human labels much better than ROUGE. Results show an LLM‑as‑Judge F1 of 0.832 vs 0.565 for ROUGE, with far higher agreement. https://t.co/KXcZDIS9s5
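For context on what an F1 against human labels measures, here is a minimal sketch. The post does not say how the scores were computed, so everything below is an assumption: labels are taken to be binary, human labels are treated as ground truth, and the function name and sample data are hypothetical and do not come from the study.

```python
from typing import Sequence


def f1_against_humans(human: Sequence[int], predicted: Sequence[int]) -> float:
    """F1 of binary predictions, treating human labels as the reference."""
    pairs = list(zip(human, predicted))
    tp = sum(1 for h, p in pairs if h == 1 and p == 1)  # both say positive
    fp = sum(1 for h, p in pairs if h == 0 and p == 1)  # predicted positive only
    fn = sum(1 for h, p in pairs if h == 1 and p == 0)  # human positive only
    if tp == 0:
        return 0.0  # no true positives: define F1 as 0 to avoid dividing by zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Made-up labels for illustration only (1 = positive class, 0 = negative class).
human_labels = [1, 0, 1, 1, 0, 1]
judge_labels = [1, 0, 1, 0, 0, 1]   # hypothetical LLM-as-Judge verdicts
metric_labels = [1, 1, 0, 1, 1, 0]  # hypothetical thresholded overlap-metric scores

judge_f1 = f1_against_humans(human_labels, judge_labels)
metric_f1 = f1_against_humans(human_labels, metric_labels)
print(f"judge F1 = {judge_f1:.3f}, metric F1 = {metric_f1:.3f}")
```

A continuous score such as ROUGE would first need a threshold to produce binary labels; how that was done in the study is not stated in the post.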
{
"media": [
{
"type": "photo",
"url": "https://crmoxkoizveukayfjuyo.supabase.co/storage/v1/object/public/media/posts/1955647073652785312/media_0.png?",
"filename": "media_0.png"
}
],
"processed_at": "2025-08-14T07:00:02.674704",
"pipeline_version": "2.0"
}
{
"type": "tweet",
"id": "1955647073652785312",
"url": "https://x.com/omarsar0/status/1955647073652785312",
"twitterUrl": "https://twitter.com/omarsar0/status/1955647073652785312",
"text": "ROUGE misaligns with humans.\n\nIn a human study, LLM‑as‑Judge matches human labels much better than ROUGE.\n\nResults show that LLM‑as‑Judge F1 0.832 vs ROUGE 0.565, with far higher agreement. https://t.co/KXcZDIS9s5",
"source": "Twitter for iPhone",
"retweetCount": 0,
"replyCount": 1,
"likeCount": 1,
"quoteCount": 0,
"viewCount": 411,
"createdAt": "Wed Aug 13 15:06:11 +0000 2025",
"lang": "en",
"bookmarkCount": 0,
"isReply": true,
"inReplyToId": "1955647057936765213",
"conversationId": "1955647039733481841",
"inReplyToUserId": "3448284313",
"inReplyToUsername": "omarsar0",
"author": {
"type": "user",
"userName": "omarsar0",
"url": "https://x.com/omarsar0",
"twitterUrl": "https://twitter.com/omarsar0",
"id": "3448284313",
"name": "elvis",
"isVerified": false,
"isBlueVerified": true,
"verifiedType": null,
"profilePicture": "https://pbs.twimg.com/profile_images/939313677647282181/vZjFWtAn_normal.jpg",
"coverPicture": "https://pbs.twimg.com/profile_banners/3448284313/1565974901",
"description": "",
"location": "",
"followers": 259855,
"following": 649,
"status": "",
"canDm": true,
"canMediaTag": true,
"createdAt": "Fri Sep 04 12:59:26 +0000 2015",
"entities": {
"description": {
"urls": []
},
"url": {}
},
"fastFollowersCount": 0,
"favouritesCount": 31822,
"hasCustomTimelines": true,
"isTranslator": true,
"mediaCount": 3835,
"statusesCount": 15444,
"withheldInCountries": [],
"affiliatesHighlightedLabel": {},
"possiblySensitive": false,
"pinnedTweetIds": [
"1955618455010562311"
],
"profile_bio": {
"description": "Building with AI agents @dair_ai • Prev: Meta AI, Galactica LLM, Elastic, PaperswithCode, PhD • I share insights on how to build with AI Agents ⬇️",
"entities": {
"description": {
"user_mentions": [
{
"id_str": "0",
"indices": [
24,
32
],
"name": "",
"screen_name": "dair_ai"
}
]
},
"url": {
"urls": [
{
"display_url": "dair-ai.thinkific.com",
"expanded_url": "https://dair-ai.thinkific.com/",
"indices": [
0,
23
],
"url": "https://t.co/JBU5beHQNs"
}
]
}
}
},
"isAutomated": false,
"automatedBy": null
},
"extendedEntities": {
"media": [
{
"display_url": "pic.twitter.com/KXcZDIS9s5",
"expanded_url": "https://twitter.com/omarsar0/status/1955647073652785312/photo/1",
"ext_media_availability": {
"status": "Available"
},
"features": {
"large": {},
"orig": {}
},
"id_str": "1955647070922305536",
"indices": [
190,
213
],
"media_key": "3_1955647070922305536",
"media_results": {
"id": "QXBpTWVkaWFSZXN1bHRzOgwAAQoAARsj2qZIW3AACgACGyPapusbQKAAAA==",
"result": {
"__typename": "ApiMedia",
"id": "QXBpTWVkaWE6DAABCgABGyPapkhbcAAKAAIbI9qm6xtAoAAA",
"media_key": "3_1955647070922305536"
}
},
"media_url_https": "https://pbs.twimg.com/media/GyPapkhbcAAeCIL.png",
"original_info": {
"focus_rects": [
{
"h": 470,
"w": 839,
"x": 0,
"y": 343
},
{
"h": 813,
"w": 813,
"x": 0,
"y": 0
},
{
"h": 813,
"w": 713,
"x": 0,
"y": 0
},
{
"h": 813,
"w": 407,
"x": 0,
"y": 0
},
{
"h": 813,
"w": 839,
"x": 0,
"y": 0
}
],
"height": 813,
"width": 839
},
"sizes": {
"large": {
"h": 813,
"w": 839
}
},
"type": "photo",
"url": "https://t.co/KXcZDIS9s5"
}
]
},
"card": null,
"place": {},
"entities": {},
"quoted_tweet": null,
"retweeted_tweet": null,
"article": null
}