🐦 Twitter Post Details

@llama_index

Document OCR benchmarks are hitting a ceiling - and that's a problem for real-world AI applications.

Our latest analysis reveals why OmniDocBench, the go-to standard for document parsing evaluation, is becoming inadequate as models like GLM-OCR @Zai_org achieve 94.6% accuracy while still failing on complex real-world documents.

📊 Models are saturating OmniDocBench scores but still struggle with complex financial reports, legal filings, and domain-specific documents
🎯 Rigid exact-match evaluation penalizes semantically correct outputs that differ in formatting (HTML vs markdown, spacing, etc.)
⚡ AI agents need semantic correctness, not perfect formatting matches - current benchmarks miss this critical distinction
🔬 The benchmark's 1,355 pages can't capture the full complexity of production document processing needs

The document parsing challenge isn't solved just because benchmark scores look impressive. We need evaluation methods that reward semantic understanding over exact formatting, especially as AI agents become the primary consumers of parsed content.

We're building parsing models focused on semantic correctness for complex visual documents. If you're scaling OCR workloads in production, LlamaParse handles the edge cases that benchmarks miss.

Read our full analysis: https://t.co/tcZP1PM8kv
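
The point about formatting versus semantic correctness can be made concrete with a toy comparison. The sketch below is only an illustration under stated assumptions: the `normalize` helper and both sample strings are hypothetical, and this is not the metric used by OmniDocBench or the method described in the linked analysis. It shows the same table content rendered as HTML and as markdown failing a strict string comparison while matching once markup and spacing are stripped.

```python
import re


def normalize(text: str) -> str:
    """Hypothetical helper: reduce a parsed snippet to its lowercase word content.

    This is a crude stand-in for a semantic comparison. It also strips hyphens,
    so it would mangle negative numbers and hyphenated words in real documents.
    """
    text = re.sub(r"<[^>]+>", " ", text)    # drop HTML tags
    text = re.sub(r"[|*_#`-]+", " ", text)  # drop common markdown markup characters
    return " ".join(text.lower().split())   # collapse whitespace


# Two made-up parser outputs carrying the same table content.
html_output = (
    "<table><tr><th>Item</th><th>Total</th></tr>"
    "<tr><td>Revenue</td><td>1,200</td></tr></table>"
)
markdown_output = "| Item | Total |\n|------|-------|\n| Revenue | 1,200 |"

exact_match = html_output == markdown_output
normalized_match = normalize(html_output) == normalize(markdown_output)

print("exact match:", exact_match)
print("normalized match:", normalized_match)
```

A production-grade evaluation would need more than text normalization (for example, preserving table structure and reading order), so this is best read as a demonstration of the failure mode rather than a proposed fix.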

📊 Media Metadata

{
  "media": [
    {
      "url": "https://crmoxkoizveukayfjuyo.supabase.co/storage/v1/object/public/media/posts/2026342120236396844/media_0.jpg?",
      "media_url": "https://crmoxkoizveukayfjuyo.supabase.co/storage/v1/object/public/media/posts/2026342120236396844/media_0.jpg?",
      "type": "photo",
      "filename": "media_0.jpg"
    },
    {
      "url": "https://crmoxkoizveukayfjuyo.supabase.co/storage/v1/object/public/media/posts/2026342120236396844/media_1.jpg?",
      "media_url": "https://crmoxkoizveukayfjuyo.supabase.co/storage/v1/object/public/media/posts/2026342120236396844/media_1.jpg?",
      "type": "photo",
      "filename": "media_1.jpg"
    }
  ],
  "processed_at": "2026-03-01T19:11:11.826794",
  "pipeline_version": "2.0"
}

🔧 Raw API Response

{
  "type": "tweet",
  "id": "2026342120236396844",
  "url": "https://x.com/llama_index/status/2026342120236396844",
  "twitterUrl": "https://twitter.com/llama_index/status/2026342120236396844",
  "text": "Document OCR benchmarks are hitting a ceiling - and that's a problem for real-world AI applications.\n\nOur latest analysis reveals why OmniDocBench, the go-to standard for document parsing evaluation, is becoming inadequate as models like GLM-OCR @Zai_org achieve 94.6% accuracy while still failing on complex real-world documents.\n\n📊 Models are saturating OmniDocBench scores but still struggle with complex financial reports, legal filings, and domain-specific documents\n🎯 Rigid exact-match evaluation penalizes semantically correct outputs that differ in formatting (HTML vs markdown, spacing, etc.)\n⚡ AI agents need semantic correctness, not perfect formatting matches - current benchmarks miss this critical distinction\n🔬 The benchmark's 1,355 pages can't capture the full complexity of production document processing needs\n\nThe document parsing challenge isn't solved just because benchmark scores look impressive. We need evaluation methods that reward semantic understanding over exact formatting, especially as AI agents become the primary consumers of parsed content.\n\nWe're building parsing models focused on semantic correctness for complex visual documents. If you're scaling OCR workloads in production, LlamaParse handles the edge cases that benchmarks miss.\n\nRead our full analysis: https://t.co/tcZP1PM8kv",
  "source": "Twitter for iPhone",
  "retweetCount": 10,
  "replyCount": 3,
  "likeCount": 65,
  "quoteCount": 2,
  "viewCount": 13286,
  "createdAt": "Tue Feb 24 17:03:03 +0000 2026",
  "lang": "en",
  "bookmarkCount": 56,
  "isReply": false,
  "inReplyToId": null,
  "conversationId": "2026342120236396844",
  "displayTextRange": [
    0,
    277
  ],
  "inReplyToUserId": null,
  "inReplyToUsername": null,
  "author": {
    "type": "user",
    "userName": "llama_index",
    "url": "https://x.com/llama_index",
    "twitterUrl": "https://twitter.com/llama_index",
    "id": "1604278358296055808",
    "name": "LlamaIndex 🦙",
    "isVerified": false,
    "isBlueVerified": true,
    "verifiedType": "Business",
    "profilePicture": "https://pbs.twimg.com/profile_images/1967920417760251904/0ytfduMQ_normal.png",
    "coverPicture": "https://pbs.twimg.com/profile_banners/1604278358296055808/1770092126",
    "description": "",
    "location": "",
    "followers": 108842,
    "following": 29,
    "status": "",
    "canDm": false,
    "canMediaTag": true,
    "createdAt": "Sun Dec 18 00:52:44 +0000 2022",
    "entities": {
      "description": {
        "urls": []
      },
      "url": {}
    },
    "fastFollowersCount": 0,
    "favouritesCount": 1491,
    "hasCustomTimelines": true,
    "isTranslator": false,
    "mediaCount": 1826,
    "statusesCount": 3740,
    "withheldInCountries": [],
    "affiliatesHighlightedLabel": {},
    "possiblySensitive": false,
    "pinnedTweetIds": [],
    "profile_bio": {
      "description": "AI Agents for document OCR + workflows\n\nLlamaParse: https://t.co/yQGTiRSfFL\nDocs: https://t.co/us6GCS14vD",
      "entities": {
        "description": {
          "hashtags": [],
          "symbols": [],
          "urls": [
            {
              "display_url": "cloud.llamaindex.ai",
              "expanded_url": "https://cloud.llamaindex.ai/",
              "indices": [
                52,
                75
              ],
              "url": "https://t.co/yQGTiRSfFL"
            },
            {
              "display_url": "developers.llamaindex.ai/python/cloud/",
              "expanded_url": "https://developers.llamaindex.ai/python/cloud/",
              "indices": [
                82,
                105
              ],
              "url": "https://t.co/us6GCS14vD"
            }
          ],
          "user_mentions": []
        },
        "url": {
          "urls": [
            {
              "display_url": "llamaindex.ai",
              "expanded_url": "https://www.llamaindex.ai/",
              "indices": [
                0,
                23
              ],
              "url": "https://t.co/epzefqPT9Z"
            }
          ]
        }
      }
    },
    "isAutomated": false,
    "automatedBy": null
  },
  "extendedEntities": {
    "media": [
      {
        "display_url": "pic.twitter.com/4DgWqAKDQW",
        "expanded_url": "https://twitter.com/llama_index/status/2026342120236396844/photo/1",
        "ext_media_availability": {
          "status": "Available"
        },
        "features": {
          "large": {
            "faces": []
          },
          "orig": {
            "faces": []
          }
        },
        "id_str": "2026342117778571264",
        "indices": [
          278,
          301
        ],
        "media_key": "3_2026342117778571264",
        "media_results": {
          "id": "QXBpTWVkaWFSZXN1bHRzOgwAAQoAARwfA2uLmsAACgACHB8DbB4aMSwAAA==",
          "result": {
            "__typename": "ApiMedia",
            "id": "QXBpTWVkaWE6DAABCgABHB8Da4uawAAKAAIcHwNsHhoxLAAA",
            "media_key": "3_2026342117778571264"
          }
        },
        "media_url_https": "https://pbs.twimg.com/media/HB8Da4uawAA4tI7.jpg",
        "original_info": {
          "focus_rects": [
            {
              "h": 581,
              "w": 1038,
              "x": 0,
              "y": 219
            },
            {
              "h": 800,
              "w": 800,
              "x": 238,
              "y": 0
            },
            {
              "h": 800,
              "w": 702,
              "x": 296,
              "y": 0
            },
            {
              "h": 800,
              "w": 400,
              "x": 447,
              "y": 0
            },
            {
              "h": 800,
              "w": 1038,
              "x": 0,
              "y": 0
            }
          ],
          "height": 800,
          "width": 1038
        },
        "sizes": {
          "large": {
            "h": 800,
            "w": 1038
          }
        },
        "type": "photo",
        "url": "https://t.co/4DgWqAKDQW"
      }
    ]
  },
  "card": null,
  "place": {},
  "entities": {
    "hashtags": [],
    "symbols": [],
    "urls": [
      {
        "display_url": "llamaindex.ai/blog/omnidocbe…",
        "expanded_url": "https://www.llamaindex.ai/blog/omnidocbench-is-saturated-what-s-next-for-ocr-benchmarks?utm_source=socials&utm_medium=li_social",
        "indices": [
          1298,
          1321
        ],
        "url": "https://t.co/tcZP1PM8kv"
      }
    ],
    "user_mentions": [
      {
        "id_str": "1726486879456096256",
        "indices": [
          246,
          254
        ],
        "name": "Z.ai",
        "screen_name": "Zai_org"
      }
    ]
  },
  "quoted_tweet": null,
  "retweeted_tweet": null,
  "isLimitedReply": false,
  "article": null
}