🐦 Twitter Post Details

@jerryjliu0

Adjusting your chunk size is one of the first things you should tackle in improving your RAG app - but it’s not always intuitive!

⚠️ More chunks ≠ better (lost in the middle problems / context overflows)

⚠️ Reranking retrieved chunks doesn’t necessarily improve results, in fact can worsen them.

To evaluate which chunk size works best, you need to define an eval benchmark and do a sweep over chunk sizes / top-k values.

@jason_lopatecki + @arizeai team came up with a comprehensive starter kit (Colab notebook + slides) showing how you can run chunk size sweeps and do retrieval + Q&A evals with Phoenix + @llama_index. If you're trying to iterate on your RAG pipeline make sure to check it out 👇

Notebook: https://t.co/pGZNGxeWJ7

Slides: https://t.co/edICh3lNaC

Check it out!
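The sweep described in the post can be sketched in plain Python. This is a toy illustration only, not the linked notebook and not the Phoenix or LlamaIndex API: the corpus, the benchmark questions, the word-overlap retriever, and every function name below are assumptions made up for the example. It shows the shape of the loop: fix a benchmark, then score every (chunk size, top-k) pair against it.

```python
from itertools import product

# Toy corpus and benchmark, invented for illustration (not from the post).
TOY_DOC = (
    "Chunk size sets how many words go into each passage. "
    "Top-k sets how many passages are passed to the model. "
    "Large chunks can bury the answer in the middle of the context. "
    "Small chunks can split an answer across passages."
)
# Each benchmark item is (question, substring a relevant chunk should contain).
TOY_BENCHMARK = [
    ("what sets the number of words per passage", "chunk size"),
    ("what sets how many passages the model sees", "top-k"),
    ("where can large chunks bury the answer", "middle"),
]


def chunk_words(text, chunk_size):
    """Split text into consecutive chunks of at most chunk_size words."""
    words = text.split()
    return [" ".join(words[i:i + chunk_size])
            for i in range(0, len(words), chunk_size)]


def retrieve(chunks, query, top_k):
    """Return the top_k chunks ranked by shared words with the query.

    Word overlap stands in for an embedding retriever (an assumption).
    """
    query_words = set(query.lower().split())
    ranked = sorted(
        chunks,
        key=lambda c: len(query_words & set(c.lower().split())),
        reverse=True,
    )
    return ranked[:top_k]


def hit_rate(chunks, benchmark, top_k):
    """Fraction of questions whose expected substring is in a retrieved chunk."""
    hits = sum(
        any(expected.lower() in c.lower() for c in retrieve(chunks, q, top_k))
        for q, expected in benchmark
    )
    return hits / len(benchmark)


def sweep(text, benchmark, chunk_sizes, top_ks):
    """Score every (chunk_size, top_k) pair on the same benchmark."""
    return {
        (size, k): hit_rate(chunk_words(text, size), benchmark, k)
        for size, k in product(chunk_sizes, top_ks)
    }


if __name__ == "__main__":
    results = sweep(TOY_DOC, TOY_BENCHMARK, chunk_sizes=[5, 10, 20], top_ks=[1, 2])
    for (size, k), score in sorted(results.items()):
        print(f"chunk_size={size:>3} top_k={k} hit_rate={score:.2f}")
```

In a real pipeline the retriever and the scoring function would be swapped for the actual index and for retrieval and Q&A evals; the point of the sketch is only that the benchmark stays fixed while chunk size and top-k vary.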

🔧 Raw API Response

{
  "user": {
    "created_at": "2011-09-07T22:54:31.000Z",
    "default_profile_image": false,
    "description": "co-founder/CEO @llama_index\n\nEx-ML @robusthq,  AI research @Uber_ATG, ML Eng @Quora, @princeton",
    "fast_followers_count": 0,
    "favourites_count": 3927,
    "followers_count": 23787,
    "friends_count": 1156,
    "has_custom_timelines": true,
    "is_translator": false,
    "listed_count": 610,
    "location": "",
    "media_count": 592,
    "name": "Jerry Liu",
    "normal_followers_count": 23787,
    "possibly_sensitive": false,
    "profile_image_url_https": "https://pbs.twimg.com/profile_images/1283610285031460864/1Q4zYhtb_normal.jpg",
    "screen_name": "jerryjliu0",
    "statuses_count": 2708,
    "translator_type": "none",
    "url": "https://t.co/S7FkTSefQ0",
    "verified": false,
    "withheld_in_countries": [],
    "id_str": "369777416"
  },
  "id": "1710300984654663962",
  "conversation_id": "1710300984654663962",
  "full_text": "Adjusting your chunk size is one of the first things you should tackle in improving your RAG app - but it’s not always intuitive!\n\n⚠️ More chunks ≠ better (lost in the middle problems / context overflows)\n\n⚠️ Reranking retrieved chunks doesn’t necessarily improve results, in fact can worsen them. \n\nTo evaluate which chunk size works best, you need to define an eval benchmark and do a sweep over chunk sizes / top-k values.\n\n@jason_lopatecki + @arizeai team came up with a comprehensive starter kit (Colab notebook + slides) showing how you can run chunk size sweeps and do retrieval + Q&A evals with Phoenix + @llama_index. If you're trying to iterate on your RAG pipeline make sure to check it out 👇\n\nNotebook: https://t.co/pGZNGxeWJ7\n\nSlides: https://t.co/edICh3lNaC\n\nCheck it out!",
  "reply_count": 5,
  "retweet_count": 46,
  "favorite_count": 273,
  "hashtags": [],
  "symbols": [],
  "user_mentions": [],
  "urls": [],
  "media": [
    {
      "media_url": "https://pbs.twimg.com/media/F7w1kxUXkAACO8D.jpg",
      "type": "photo"
    }
  ],
  "url": "https://twitter.com/jerryjliu0/status/1710300984654663962",
  "created_at": "2023-10-06T14:28:25.000Z",
  "#sort_index": "1710300984654663962",
  "view_count": 66984,
  "quote_count": 3,
  "is_quote_tweet": false,
  "is_retweet": false,
  "is_pinned": false,
  "is_truncated": true,
  "startUrl": "https://twitter.com/jerryjliu0/status/1710300984654663962"
}