🐦 Twitter Post Details


@llama_index

Multimodal RAG with Contextual Retrieval 🖼️🤖

RAG over slide decks is hard. We first show you how to build a multimodal RAG pipeline over a slide deck to pre-extract and index the visual content on each slide, as both text and image chunks.

🌟 You can do this thanks to LlamaParse premium, which is now 4.5c per page! (Down from 7.5c per page 📉)

We also add in contextual summaries to each slide using @AnthropicAI prompt caching + metadata generation. This helps ground each slide in the section it’s in!

Check out our full cookbook combining both techniques: https://t.co/Mo0JUyxze3
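The contextual-summaries idea the tweet describes can be sketched in plain Python: each slide becomes a text chunk (plus an image chunk), and a short summary of its enclosing section is prepended to the text before indexing, so the embedded representation is grounded in its context. This is a minimal, library-free illustration; names like `SlideChunk` and `add_contextual_summary` are invented here and are not the cookbook's actual API (the cookbook uses LlamaParse premium plus an Anthropic model with prompt caching to generate the summaries).

```python
from dataclasses import dataclass, field

# Illustrative sketch only: each slide is indexed as both a text chunk
# and an image chunk, and the text chunk is "grounded" by prepending a
# summary of the section the slide belongs to. The real cookbook
# generates these summaries with an Anthropic model + prompt caching.

@dataclass
class SlideChunk:
    slide_no: int
    text: str                       # text pre-extracted from the slide
    image_path: str                 # the slide rendered as an image chunk
    metadata: dict = field(default_factory=dict)

def add_contextual_summary(chunk: SlideChunk,
                           section_title: str,
                           section_summary: str) -> str:
    """Return the chunk text prefixed with its section context,
    and record the section in the chunk's metadata."""
    chunk.metadata["section"] = section_title
    return (f"Section: {section_title}\n"
            f"Context: {section_summary}\n"
            f"---\n{chunk.text}")

chunk = SlideChunk(3, "Revenue grew 12% QoQ.", "slides/slide_3.png")
indexed_text = add_contextual_summary(
    chunk, "Q3 Results", "Financial highlights for the third quarter.")
```

At query time, `indexed_text` (not the bare slide text) is what gets embedded, so a question about "Q3 results" can retrieve a slide that never mentions the section name itself.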

Media 1

📊 Media Metadata

{
  "media": [
    {
      "type": "photo",
      "url": "https://pbs.twimg.com/media/GZTHBl_awAA8NXm.jpg",
      "media_url": "https://pbs.twimg.com/media/GZTHBl_awAA8NXm.jpg",
      "filename": "media_0.jpg"
    }
  ],
  "conversion_date": "2025-08-13T00:36:22.544164",
  "format_converted": true,
  "original_structure": "had_media_only",
  "enhanced_from_raw_response": true,
  "enhanced_at": "2025-08-13T17:20:00Z"
}

🔧 Raw API Response

{
  "user": {
    "created_at": "2022-12-18T00:52:44.000Z",
    "default_profile_image": false,
    "description": "Build LLM agents over your data\n\nGithub: https://t.co/HC19j7vMwc\nDocs: https://t.co/QInqg2zksh\nDiscord: https://t.co/3ktq3zzYII",
    "fast_followers_count": 0,
    "favourites_count": 1261,
    "followers_count": 82611,
    "friends_count": 26,
    "has_custom_timelines": false,
    "is_translator": false,
    "listed_count": 1366,
    "location": "",
    "media_count": 1375,
    "name": "LlamaIndex πŸ¦™",
    "normal_followers_count": 82611,
    "possibly_sensitive": false,
    "profile_banner_url": "https://pbs.twimg.com/profile_banners/1604278358296055808/1696908553",
    "profile_image_url_https": "https://pbs.twimg.com/profile_images/1623505166996742144/n-PNQGgd_normal.jpg",
    "screen_name": "llama_index",
    "statuses_count": 2997,
    "translator_type": "none",
    "url": "https://t.co/epzefqQqZx",
    "verified": true,
    "withheld_in_countries": [],
    "id_str": "1604278358296055808"
  },
  "id": "1843317002041274450",
  "conversation_id": "1843317002041274450",
  "full_text": "Multimodal RAG with Contextual Retrieval πŸ–ΌοΈπŸ€–\n\nRAG over slide decks is hard. We first show you how to build a multimodal RAG pipeline over a slide deck to pre-extract and index the visual content on each slide, as both text and image chunks.\n\n🌟 You can do this thanks to LlamaParse premium, which is now 4.5c per page! (Down from 7.5c per page πŸ“‰)\n\nWe also add in contextual summaries to each slide using @AnthropicAI prompt caching + metadata generation. This helps ground each slide in the section it’s in!\n\nCheck out our full cookbook combining both techniques: https://t.co/Mo0JUyxze3",
  "reply_count": 0,
  "retweet_count": 36,
  "favorite_count": 221,
  "hashtags": [],
  "symbols": [],
  "user_mentions": [],
  "urls": [],
  "media": [
    {
      "media_url": "https://pbs.twimg.com/media/GZTHBl_awAA8NXm.jpg",
      "type": "photo"
    }
  ],
  "url": "https://twitter.com/llama_index/status/1843317002041274450",
  "created_at": "2024-10-07T15:46:35.000Z",
  "#sort_index": "1843317002041274450",
  "view_count": 39954,
  "quote_count": 2,
  "is_quote_tweet": false,
  "is_retweet": false,
  "is_pinned": false,
  "is_truncated": true,
  "startUrl": "https://x.com/llama_index/status/1843317002041274450"
}