🐦 Twitter Post Details

@jerryjliu0

A downside with using VLMs to parse PDFs is guaranteeing that the output text is *correct* and output in the correct reading order.

1️⃣ Text correctness: making sure that digits, words, sentences are not hallucinated or dropped.
2️⃣ Reading Order: making sure that complex multi-layout pages are linearized into the right 1-d text order.

We call this Content Faithfulness in ParseBench, our comprehensive document OCR benchmark for agents. We have 167k rules that measure digit/word/sentence-level correctness along with reading order correctness.

It seems relatively table-stakes, but no parser gets this 100% right, and this means that the agent’s downstream decision-making is compromised.

Come learn more about how this metric works in the video below, along with our full blog writeup, whitepaper, and website!

Blog: https://t.co/57OHkx0pQW
Paper: https://t.co/Ho2oH2xEAM
Website: https://t.co/g0b0jsCynW
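
The two checks named in the post (text correctness and reading order) can be illustrated with a minimal rule-based sketch. This is not ParseBench's implementation: the `PresenceRule` and `OrderRule` classes, the `faithfulness_score` function, the normalization step, and the sample strings are all hypothetical, chosen only to show how a rule might pass or fail on parser output.

```python
import re
from dataclasses import dataclass


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so layout differences are not counted as errors."""
    return re.sub(r"\s+", " ", text).strip().lower()


def position(snippet: str, parsed: str) -> int:
    """Return the index of `snippet` in `parsed`, or -1 if it is absent.

    Matching is boundary-aware, so "1,204" does not match inside "11,204".
    """
    pattern = r"(?<!\w)" + re.escape(normalize(snippet)) + r"(?!\w)"
    match = re.search(pattern, normalize(parsed))
    return match.start() if match else -1


@dataclass
class PresenceRule:
    """Hypothetical rule: a digit run, word, or sentence must survive parsing."""
    snippet: str

    def passes(self, parsed: str) -> bool:
        return position(self.snippet, parsed) != -1


@dataclass
class OrderRule:
    """Hypothetical rule: `first` must appear before `second` in the linearized text."""
    first: str
    second: str

    def passes(self, parsed: str) -> bool:
        i = position(self.first, parsed)
        j = position(self.second, parsed)
        return i != -1 and j != -1 and i < j


def faithfulness_score(parsed: str, rules: list) -> float:
    """Fraction of rules that the parsed output satisfies (assumed scoring scheme)."""
    if not rules:
        return 1.0
    return sum(rule.passes(parsed) for rule in rules) / len(rules)


# Illustrative rules for a made-up page with two text blocks.
rules = [
    PresenceRule("1,204"),
    PresenceRule("Costs fell in Q2."),
    OrderRule("Revenue was", "Costs fell"),
]

faithful = "Revenue was 1,204 in Q1.\nCosts fell in Q2."
unfaithful = "Costs fell in Q2. Revenue was 11,204 in Q1."

print(faithfulness_score(faithful, rules))
print(faithfulness_score(unfaithful, rules))
```

In this sketch the second parse fails two of the three rules: the digit run was altered (11,204 instead of 1,204) and the two blocks were emitted in the wrong order. A real benchmark would need far more careful normalization and rule construction than this.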

📊 Media Metadata

{
  "media": [
    {
      "url": "https://crmoxkoizveukayfjuyo.supabase.co/storage/v1/object/public/media/posts/2045623431220412755/media_0.mp4",
      "media_url": "https://crmoxkoizveukayfjuyo.supabase.co/storage/v1/object/public/media/posts/2045623431220412755/media_0.mp4",
      "type": "video",
      "filename": "media_0.mp4"
    },
    {
      "url": "https://crmoxkoizveukayfjuyo.supabase.co/storage/v1/object/public/media/posts/2045623431220412755/media_1.jpg",
      "media_url": "https://crmoxkoizveukayfjuyo.supabase.co/storage/v1/object/public/media/posts/2045623431220412755/media_1.jpg",
      "type": "photo",
      "filename": "media_1.jpg"
    }
  ],
  "processed_at": "2026-04-19T00:48:56.388417",
  "pipeline_version": "2.0"
}

🔧 Raw API Response

{
  "type": "tweet",
  "id": "2045623431220412755",
  "url": "https://x.com/jerryjliu0/status/2045623431220412755",
  "twitterUrl": "https://twitter.com/jerryjliu0/status/2045623431220412755",
  "text": "A downside with using VLMs to parse PDFs is guaranteeing that the output text is *correct* and output in the correct reading order.\n\n1️⃣ Text correctness: making sure that digits, words, sentences are not hallucinated or dropped.\n2️⃣ Reading Order: making sure that complex multi-layout pages are linearized into the right 1-d text order.\n\nWe call this Content Faithfulness in ParseBench, our comprehensive document OCR benchmark for agents. We have 167k rules that measure digit/word/sentence-level correctness along with reading order correctness.\n\nIt seems relatively table-stakes, but no parser gets this 100% right, and this means that the agent’s downstream decision-making is compromised.\n\nCome learn more about how this metric works in the video below, along with our full blog writeup, whitepaper, and website!\n\nBlog: https://t.co/57OHkx0pQW\nPaper: https://t.co/Ho2oH2xEAM\nWebsite: https://t.co/g0b0jsCynW",
  "source": "Twitter for iPhone",
  "retweetCount": 5,
  "replyCount": 2,
  "likeCount": 35,
  "quoteCount": 1,
  "viewCount": 4584,
  "createdAt": "Sat Apr 18 22:00:06 +0000 2026",
  "lang": "en",
  "bookmarkCount": 38,
  "isReply": false,
  "inReplyToId": null,
  "conversationId": "2045623431220412755",
  "displayTextRange": [
    0,
    273
  ],
  "inReplyToUserId": null,
  "inReplyToUsername": null,
  "author": {
    "type": "user",
    "userName": "jerryjliu0",
    "url": "https://x.com/jerryjliu0",
    "twitterUrl": "https://twitter.com/jerryjliu0",
    "id": "369777416",
    "name": "Jerry Liu",
    "isVerified": false,
    "isBlueVerified": true,
    "verifiedType": null,
    "profilePicture": "https://pbs.twimg.com/profile_images/1283610285031460864/1Q4zYhtb_normal.jpg",
    "coverPicture": "",
    "description": "",
    "location": "",
    "followers": 72999,
    "following": 1471,
    "status": "",
    "canDm": true,
    "canMediaTag": true,
    "createdAt": "Wed Sep 07 22:54:31 +0000 2011",
    "entities": {
      "description": {
        "urls": []
      },
      "url": {}
    },
    "fastFollowersCount": 0,
    "favouritesCount": 8658,
    "hasCustomTimelines": true,
    "isTranslator": false,
    "mediaCount": 1487,
    "statusesCount": 6861,
    "withheldInCountries": [],
    "affiliatesHighlightedLabel": {},
    "possiblySensitive": false,
    "pinnedTweetIds": [],
    "profile_bio": {
      "description": "Parsing the world's hardest PDFs @llama_index. cofounder/CEO\n\nCareers: https://t.co/EUnMNmbCtx\nEnterprise: https://t.co/Ht5jwxSrQB",
      "entities": {
        "description": {
          "urls": [
            {
              "display_url": "llamaindex.ai/careers",
              "expanded_url": "https://www.llamaindex.ai/careers",
              "indices": [
                71,
                94
              ],
              "url": "https://t.co/EUnMNmbCtx"
            },
            {
              "display_url": "llamaindex.ai/contact",
              "expanded_url": "https://www.llamaindex.ai/contact",
              "indices": [
                107,
                130
              ],
              "url": "https://t.co/Ht5jwxSrQB"
            }
          ],
          "user_mentions": [
            {
              "id_str": "",
              "indices": [
                33,
                45
              ],
              "name": "",
              "screen_name": "llama_index"
            }
          ]
        },
        "url": {
          "urls": [
            {
              "display_url": "llamaindex.ai",
              "expanded_url": "https://www.llamaindex.ai/",
              "indices": [
                0,
                23
              ],
              "url": "https://t.co/YiIfjVlzb6"
            }
          ]
        }
      }
    },
    "isAutomated": false,
    "automatedBy": null
  },
  "extendedEntities": {
    "media": [
      {
        "additional_media_info": {
          "monetizable": true
        },
        "display_url": "pic.twitter.com/IFZXPZ37sb",
        "expanded_url": "https://twitter.com/jerryjliu0/status/2045623431220412755/video/1",
        "ext_media_availability": {
          "status": "Available"
        },
        "id_str": "2045621916283846656",
        "indices": [
          274,
          297
        ],
        "media_key": "13_2045621916283846656",
        "media_results": {
          "id": "QXBpTWVkaWFSZXN1bHRzOgwABAoAARxjgktw2kAAAAA=",
          "result": {
            "__typename": "ApiMedia",
            "id": "QXBpTWVkaWE6DAAECgABHGOCS3DaQAAAAA==",
            "media_key": "13_2045621916283846656"
          }
        },
        "media_url_https": "https://pbs.twimg.com/amplify_video_thumb/2045621916283846656/img/HYUgUVr78uosvlUY.jpg",
        "original_info": {
          "focus_rects": [],
          "height": 2160,
          "width": 3840
        },
        "sizes": {
          "large": {
            "h": 1152,
            "w": 2048
          }
        },
        "type": "video",
        "url": "https://t.co/IFZXPZ37sb",
        "video_info": {
          "aspect_ratio": [
            16,
            9
          ],
          "duration_millis": 108736,
          "variants": [
            {
              "content_type": "application/x-mpegURL",
              "url": "https://video.twimg.com/amplify_video/2045621916283846656/pl/sAOKnomUOflvyNm6.m3u8?tag=21"
            },
            {
              "bitrate": 256000,
              "content_type": "video/mp4",
              "url": "https://video.twimg.com/amplify_video/2045621916283846656/vid/avc1/480x270/uX1qRfXPVlikzkGP.mp4?tag=21"
            },
            {
              "bitrate": 832000,
              "content_type": "video/mp4",
              "url": "https://video.twimg.com/amplify_video/2045621916283846656/vid/avc1/640x360/5SclR4BtLQ4zxQBh.mp4?tag=21"
            },
            {
              "bitrate": 2176000,
              "content_type": "video/mp4",
              "url": "https://video.twimg.com/amplify_video/2045621916283846656/vid/avc1/1280x720/bUV3i5htNOzsJH4e.mp4?tag=21"
            },
            {
              "bitrate": 10368000,
              "content_type": "video/mp4",
              "url": "https://video.twimg.com/amplify_video/2045621916283846656/vid/avc1/1920x1080/4dt3EWbP1ZhMTQG8.mp4?tag=21"
            },
            {
              "bitrate": 25128000,
              "content_type": "video/mp4",
              "url": "https://video.twimg.com/amplify_video/2045621916283846656/vid/avc1/3840x2160/vzY2MEdCiZ7pEqVj.mp4?tag=21"
            }
          ]
        }
      }
    ]
  },
  "card": null,
  "place": {},
  "entities": {
    "hashtags": [],
    "symbols": [],
    "timestamps": [],
    "urls": [
      {
        "display_url": "llamaindex.ai/blog/parsebenc…",
        "expanded_url": "https://www.llamaindex.ai/blog/parsebench?utm_medium=socials&utm_source=xjl&utm_campaign=2026-apr-",
        "indices": [
          827,
          850
        ],
        "url": "https://t.co/57OHkx0pQW"
      },
      {
        "display_url": "arxiv.org/abs/2604.08538…",
        "expanded_url": "https://arxiv.org/abs/2604.08538?utm_medium=socials&utm_source=twitter&utm_campaign=2026-apr-",
        "indices": [
          858,
          881
        ],
        "url": "https://t.co/Ho2oH2xEAM"
      },
      {
        "display_url": "parsebench.ai/?utm_medium=so…",
        "expanded_url": "https://parsebench.ai/?utm_medium=socials&utm_source=xjl&utm_campaign=2026-apr-",
        "indices": [
          891,
          914
        ],
        "url": "https://t.co/g0b0jsCynW"
      }
    ],
    "user_mentions": []
  },
  "quoted_tweet": {
    "type": "tweet",
    "id": "2045145054772183128",
    "url": "https://x.com/llama_index/status/2045145054772183128",
    "twitterUrl": "https://twitter.com/llama_index/status/2045145054772183128",
    "text": "Let's talk content faithfulness.\n\nFour days ago, we launched ParseBench, the first document OCR benchmark for AI agents.\n\nIts most fundamental metric asks: did the parser capture all the text, in order, without making things up?\nWe grade three failure modes with 167K+ rule-based tests:\n\n❌Omissions (word, sentence, digit)\n❌Hallucinations\n❌Reading order violations\n\nThe bar has shifted from \"good enough for a human to read\" to \"reliable enough for an agent to act on.\"\nDeep dive in the video. \n\nFull write-up: \nhttps://t.co/2sq5ncGiel",
    "source": "Twitter for iPhone",
    "retweetCount": 8,
    "replyCount": 5,
    "likeCount": 39,
    "quoteCount": 1,
    "viewCount": 10188,
    "createdAt": "Fri Apr 17 14:19:12 +0000 2026",
    "lang": "en",
    "bookmarkCount": 30,
    "isReply": false,
    "inReplyToId": null,
    "conversationId": "2045145054772183128",
    "displayTextRange": [
      0,
      279
    ],
    "inReplyToUserId": null,
    "inReplyToUsername": null,
    "author": {
      "type": "user",
      "userName": "llama_index",
      "url": "https://x.com/llama_index",
      "twitterUrl": "https://twitter.com/llama_index",
      "id": "1604278358296055808",
      "name": "LlamaIndex 🦙",
      "isVerified": false,
      "isBlueVerified": true,
      "verifiedType": "Business",
      "profilePicture": "https://pbs.twimg.com/profile_images/1967920417760251904/0ytfduMQ_normal.png",
      "coverPicture": "https://pbs.twimg.com/profile_banners/1604278358296055808/1770092126",
      "description": "",
      "location": "",
      "followers": 112294,
      "following": 33,
      "status": "",
      "canDm": false,
      "canMediaTag": true,
      "createdAt": "Sun Dec 18 00:52:44 +0000 2022",
      "entities": {
        "description": {
          "urls": []
        },
        "url": {}
      },
      "fastFollowersCount": 0,
      "favouritesCount": 1533,
      "hasCustomTimelines": true,
      "isTranslator": false,
      "mediaCount": 1866,
      "statusesCount": 3829,
      "withheldInCountries": [],
      "affiliatesHighlightedLabel": {},
      "possiblySensitive": false,
      "pinnedTweetIds": [],
      "profile_bio": {
        "description": "The world's best AI Document OCR\n\nLlamaParse: https://t.co/yQGTiRSNvj\nDocs: https://t.co/us6GCS1Clb",
        "entities": {
          "description": {
            "urls": [
              {
                "display_url": "cloud.llamaindex.ai",
                "expanded_url": "https://cloud.llamaindex.ai/",
                "indices": [
                  46,
                  69
                ],
                "url": "https://t.co/yQGTiRSNvj"
              },
              {
                "display_url": "developers.llamaindex.ai/python/cloud/",
                "expanded_url": "https://developers.llamaindex.ai/python/cloud/",
                "indices": [
                  76,
                  99
                ],
                "url": "https://t.co/us6GCS1Clb"
              }
            ]
          },
          "url": {
            "urls": [
              {
                "display_url": "llamaindex.ai",
                "expanded_url": "https://www.llamaindex.ai/",
                "indices": [
                  0,
                  23
                ],
                "url": "https://t.co/epzefqQqZx"
              }
            ]
          }
        }
      },
      "isAutomated": false,
      "automatedBy": null
    },
    "extendedEntities": {
      "media": [
        {
          "additional_media_info": {
            "monetizable": true
          },
          "allow_download_status": {
            "allow_download": true
          },
          "display_url": "pic.twitter.com/7vgxG4OFqS",
          "expanded_url": "https://twitter.com/llama_index/status/2045145054772183128/video/1",
          "ext_media_availability": {
            "status": "Available"
          },
          "id_str": "2045144646569906176",
          "indices": [
            280,
            303
          ],
          "media_key": "13_2045144646569906176",
          "media_results": {
            "id": "QXBpTWVkaWFSZXN1bHRzOgwABAoAARxh0DhtF1AAAAA=",
            "result": {
              "__typename": "ApiMedia",
              "id": "QXBpTWVkaWE6DAAECgABHGHQOG0XUAAAAA==",
              "media_key": "13_2045144646569906176"
            }
          },
          "media_url_https": "https://pbs.twimg.com/amplify_video_thumb/2045144646569906176/img/YXtFu-5Et8DDQQi_.jpg",
          "original_info": {
            "focus_rects": [],
            "height": 2160,
            "width": 3840
          },
          "sizes": {
            "large": {
              "h": 1152,
              "w": 2048
            }
          },
          "type": "video",
          "url": "https://t.co/7vgxG4OFqS",
          "video_info": {
            "aspect_ratio": [
              16,
              9
            ],
            "duration_millis": 108675,
            "variants": [
              {
                "content_type": "application/x-mpegURL",
                "url": "https://video.twimg.com/amplify_video/2045144646569906176/pl/JXXXbnqkkxJxUQDf.m3u8?tag=21&v=173"
              },
              {
                "bitrate": 256000,
                "content_type": "video/mp4",
                "url": "https://video.twimg.com/amplify_video/2045144646569906176/vid/avc1/480x270/m_gY5PsbClikqDqC.mp4?tag=21"
              },
              {
                "bitrate": 832000,
                "content_type": "video/mp4",
                "url": "https://video.twimg.com/amplify_video/2045144646569906176/vid/avc1/640x360/5H_X6N3S07jda_pv.mp4?tag=21"
              },
              {
                "bitrate": 2176000,
                "content_type": "video/mp4",
                "url": "https://video.twimg.com/amplify_video/2045144646569906176/vid/avc1/1280x720/MHVwTCBrNhKs1r63.mp4?tag=21"
              },
              {
                "bitrate": 10368000,
                "content_type": "video/mp4",
                "url": "https://video.twimg.com/amplify_video/2045144646569906176/vid/avc1/1920x1080/bpeMSBNxeSFM1FPA.mp4?tag=21"
              },
              {
                "bitrate": 25128000,
                "content_type": "video/mp4",
                "url": "https://video.twimg.com/amplify_video/2045144646569906176/vid/avc1/3840x2160/FnnvILs4MR5Wu7RH.mp4?tag=21"
              }
            ]
          }
        }
      ]
    },
    "card": null,
    "place": {},
    "entities": {
      "hashtags": [],
      "symbols": [],
      "timestamps": [],
      "urls": [
        {
          "display_url": "llamaindex.ai/blog/parsebenc…",
          "expanded_url": "https://www.llamaindex.ai/blog/parsebench?utm_medium=socials&utm_source=twitter&utm_campaign=2026--",
          "indices": [
            512,
            535
          ],
          "url": "https://t.co/2sq5ncGiel"
        }
      ],
      "user_mentions": []
    },
    "quoted_tweet": null,
    "retweeted_tweet": null,
    "isLimitedReply": false,
    "communityInfo": null,
    "article": null
  },
  "retweeted_tweet": null,
  "isLimitedReply": false,
  "communityInfo": null,
  "article": null
}