🐦 Twitter Post Details

@jerryjliu0

Pretty excited about this new RAG technique I cooked up 🧑‍🍳

A top issue with RAG chunking is it splits the document into fragmented pieces, causing top-k retrieval to return partial context. Also most documents have multiple hierarchies of sections: top-level sections, sub-sections, etc.

This is also why lots of people are interested in exploring the idea of knowledge graphs - pulling in "links" to related pages to expand retrieved context.

This notebook lets you retrieve contiguous chunks without having to spend a lot of time tuning the chunking algorithm, thanks to GraphRAG-esque metadata tagging + retrieval. Tag chunks with sections, and use the section ID to expand the retrieved set.

Check it out https://t.co/mIolxuMT12
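
The "tag chunks with sections, then expand by section ID" idea in the post can be illustrated with a small, library-free sketch. This is not the code from the linked notebook: the `Chunk` class, the `section_id` field, and both helper functions are hypothetical names chosen for illustration, and a toy keyword-overlap score stands in for real embedding-based top-k retrieval.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    chunk_id: int    # position of the chunk within the document
    section_id: str  # hypothetical section tag attached in a post-processing step
    text: str


def score(query: str, chunk: Chunk) -> int:
    """Toy relevance score: number of chunk words that appear in the query.

    Assumption: a real pipeline would use vector similarity here instead.
    """
    query_words = set(query.lower().split())
    return sum(1 for word in chunk.text.lower().split() if word in query_words)


def retrieve_top_k(query: str, chunks: list[Chunk], k: int) -> list[Chunk]:
    """First pass: ordinary top-k retrieval over individual chunks."""
    return sorted(chunks, key=lambda c: score(query, c), reverse=True)[:k]


def expand_by_section(hits: list[Chunk], chunks: list[Chunk]) -> list[Chunk]:
    """Second pass: return every chunk sharing a section ID with any hit."""
    by_section: dict[str, list[Chunk]] = defaultdict(list)
    for chunk in chunks:
        by_section[chunk.section_id].append(chunk)

    expanded: list[Chunk] = []
    seen_sections: set[str] = set()
    for hit in hits:
        if hit.section_id in seen_sections:
            continue  # this section was already pulled in by an earlier hit
        seen_sections.add(hit.section_id)
        # Keep chunks in document order so the section reads contiguously.
        expanded.extend(sorted(by_section[hit.section_id], key=lambda c: c.chunk_id))
    return expanded


# Made-up example data: four chunks tagged with three section IDs.
chunks = [
    Chunk(0, "1-intro", "Overview of the product line"),
    Chunk(1, "2-pricing", "Pricing tiers for enterprise plans"),
    Chunk(2, "2-pricing", "Discounts apply to annual contracts"),
    Chunk(3, "3-support", "Support hours and contact details"),
]

hits = retrieve_top_k("enterprise pricing", chunks, k=1)
expanded = expand_by_section(hits, chunks)
print([c.chunk_id for c in expanded])
```

In this sketch the first pass finds only the chunk that mentions pricing tiers, and the second pass pulls in the rest of the same section, including a chunk that shares no words with the query. In a real system the second lookup would more likely be a metadata filter against the vector store than an in-memory dictionary.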

📊 Media Metadata

{
  "media": [
    {
      "url": "https://pbs.twimg.com/media/GcSMtkjaUAAXd13.jpg",
      "type": "photo",
      "original_url": "https://pbs.twimg.com/media/GcSMtkjaUAAXd13.jpg",
      "format_converted_from_list": true
    }
  ],
  "conversion_date": "2025-08-13T00:32:56.030289",
  "format_converted": true,
  "original_structure": "had_media_only"
}

🔧 Raw API Response

{
  "user": {
    "created_at": "2011-09-07T22:54:31.000Z",
    "default_profile_image": false,
    "description": "co-founder/CEO @llama_index\n\nCareers: https://t.co/EUnMNmbCtx\nEnterprise: https://t.co/Ht5jwxSrQB",
    "fast_followers_count": 0,
    "favourites_count": 7173,
    "followers_count": 54388,
    "friends_count": 1364,
    "has_custom_timelines": true,
    "is_translator": false,
    "listed_count": 1136,
    "location": "",
    "media_count": 1063,
    "name": "Jerry Liu",
    "normal_followers_count": 54388,
    "possibly_sensitive": false,
    "profile_image_url_https": "https://pbs.twimg.com/profile_images/1283610285031460864/1Q4zYhtb_normal.jpg",
    "screen_name": "jerryjliu0",
    "statuses_count": 5321,
    "translator_type": "none",
    "url": "https://t.co/YiIfjVlzb6",
    "verified": true,
    "withheld_in_countries": [],
    "id_str": "369777416"
  },
  "id": "1856768968973062620",
  "conversation_id": "1856768968973062620",
  "full_text": "Pretty excited about this new RAG technique I cooked up 🧑‍🍳\n\nA top issue with RAG chunking is it splits the document into fragmented pieces, causing top-k retrieval to return partial context. Also most documents have multiple hierarchies of sections: top-level sections, sub-sections, etc.\n\nThis is also why lots of people are interested in exploring the idea of knowledge graphs - pulling in \"links\" to related pages to expand retrieved context. \n\nThis notebook lets you retrieve contiguous chunks without having to spend a lot of time tuning the chunking algorithm, thanks to GraphRAG-esque metadata tagging + retrieval. Tag chunks with sections, and use the section ID to expand the retrieved set.\n\nCheck it out \n\nhttps://t.co/mIolxuMT12",
  "reply_count": 13,
  "retweet_count": 113,
  "favorite_count": 646,
  "hashtags": [],
  "symbols": [],
  "user_mentions": [],
  "urls": [],
  "media": [
    {
      "media_url": "https://pbs.twimg.com/media/GcSMtkjaUAAXd13.jpg",
      "type": "photo"
    }
  ],
  "url": "https://twitter.com/jerryjliu0/status/1856768968973062620",
  "created_at": "2024-11-13T18:39:53.000Z",
  "#sort_index": "1856768968973062620",
  "view_count": 116343,
  "quote_count": 10,
  "is_quote_tweet": true,
  "is_retweet": false,
  "is_pinned": false,
  "is_truncated": true,
  "quoted_tweet": {
    "user": {
      "created_at": "2022-12-18T00:52:44.000Z",
      "default_profile_image": false,
      "description": "Build LLM agents over your data\n\nGithub: https://t.co/HC19j7vMwc\nDocs: https://t.co/QInqg2zksh\nDiscord: https://t.co/3ktq3zzYII",
      "fast_followers_count": 0,
      "favourites_count": 1261,
      "followers_count": 82612,
      "friends_count": 26,
      "has_custom_timelines": false,
      "is_translator": false,
      "listed_count": 1366,
      "location": "",
      "media_count": 1375,
      "name": "LlamaIndex 🦙",
      "normal_followers_count": 82612,
      "possibly_sensitive": false,
      "profile_banner_url": "https://pbs.twimg.com/profile_banners/1604278358296055808/1696908553",
      "profile_image_url_https": "https://pbs.twimg.com/profile_images/1623505166996742144/n-PNQGgd_normal.jpg",
      "screen_name": "llama_index",
      "statuses_count": 2997,
      "translator_type": "none",
      "url": "https://t.co/epzefqQqZx",
      "verified": true,
      "withheld_in_countries": [],
      "id_str": "1604278358296055808"
    },
    "id": "1856743483941556640",
    "conversation_id": "1856743483941556640",
    "full_text": "We’re excited to feature a new RAG technique - dynamic section retrieval 💫 - which ensures that you can retrieve entire contiguous sections instead of naive fragmented chunks from a document.\n\nThis is a top pain point we’ve heard from our community on multi-document RAG challenges - naive RAG returns fragmented context without awareness of the surrounding document. Our approach allows you to start off with a “simple” chunking technique (e.g. per page), but do a post-processing workflow to attach section/sub-section metadata.\n\nYou can then do GraphRAG-like retrieval (two-pass retrieval): retrieve chunks, look up the attached section metadata, and then do a second call to return all chunks that match the section ID.\n\nhttps://t.co/mzZXN4QYtx",
    "reply_count": 7,
    "retweet_count": 55,
    "favorite_count": 309,
    "hashtags": [],
    "symbols": [],
    "user_mentions": [],
    "urls": [],
    "media": [
      {
        "media_url": "https://pbs.twimg.com/media/GcR6WwcbYAAr4dQ.jpg",
        "type": "photo"
      }
    ],
    "url": "https://twitter.com/llama_index/status/1856743483941556640",
    "created_at": "2024-11-13T16:58:37.000Z",
    "#sort_index": "1856768968973062700",
    "view_count": 110991,
    "quote_count": 7,
    "is_quote_tweet": false,
    "is_retweet": false,
    "is_pinned": false,
    "is_truncated": true
  },
  "startUrl": "https://x.com/jerryjliu0/status/1856768968973062620"
}