@jerryjliu0
Parsing complex tables in PDFs is extremely challenging. Existing metrics for measuring table accuracy, like TEDS (tree edit distance similarity), overweight exact table structure and underweight semantic correctness.

🚫 Overweight: if the rows within a table are out of order, TEDS heavily penalizes the output, even though the semantic meaning is unchanged and a downstream AI agent would have no problem interpreting the values.

🚫 Overweight: if the HTML is semantically equivalent but uses different tags (th vs. td), TEDS still penalizes it.

🚫 Underweight: if the header is dropped or transposed, TEDS only mildly penalizes the output, even though the entire semantic meaning of the table is destroyed.

We recently released ParseBench, a comprehensive enterprise document benchmark with a heavy focus on *semantic correctness* for tables. We define a new metric, TableRecordMatch, which treats a table as a bag of records: each record is a dictionary of key-value pairs, with keys being the headers and values being the cell values (a sketch of the idea is below). We combine it with the GriTS metric (more robust than TEDS) to produce the final GTRM score.

It’s worth giving our full paper a read if you haven’t already. Also come check out our website hub!

Website: https://t.co/k564afEGJo
Blog: https://t.co/57OHkx0pQW
Paper: https://t.co/Ho2oH2xEAM
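
Here is a minimal Python sketch of the bag-of-records idea behind TableRecordMatch. This is not the exact ParseBench scoring: the function names, the greedy matching, and the normalization are illustrative assumptions (see the paper for the real definition).

```python
# Illustrative sketch only: treats a table as a bag of records and scores a
# predicted table against a gold table in a row-order-invariant way. The
# greedy matching and normalization here are assumptions, not ParseBench's
# actual TableRecordMatch definition.

def table_to_records(headers: list[str], rows: list[list[str]]) -> list[dict[str, str]]:
    """Turn a table into a bag of records: one dict per row, keyed by header."""
    return [dict(zip(headers, row)) for row in rows]

def record_similarity(pred: dict[str, str], gold: dict[str, str]) -> float:
    """Fraction of gold key-value pairs reproduced exactly in the prediction."""
    if not gold:
        return 0.0
    matched = sum(1 for k, v in gold.items() if pred.get(k) == v)
    return matched / len(gold)

def table_record_match(pred_records: list[dict], gold_records: list[dict]) -> float:
    """Greedily match each gold record to its best unused predicted record,
    then average the per-record similarities over the larger table size so
    that missing or extra records are penalized."""
    if not gold_records:
        return 1.0 if not pred_records else 0.0
    unused = list(pred_records)
    total = 0.0
    for gold in gold_records:
        if not unused:
            break
        best = max(unused, key=lambda p: record_similarity(p, gold))
        total += record_similarity(best, gold)
        unused.remove(best)
    return total / max(len(gold_records), len(pred_records))

# Example: same cells, rows in a different order -> perfect score,
# whereas a structure-based metric like TEDS would penalize the reordering.
headers = ["name", "price"]
gold = table_to_records(headers, [["apple", "1"], ["pear", "2"]])
pred = table_to_records(headers, [["pear", "2"], ["apple", "1"]])
print(table_record_match(pred, gold))  # 1.0
```

Because records are keyed by header, a dropped or transposed header wrecks every key-value pair and drives the score toward zero, which matches the semantic severity of that failure.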