🐦 Twitter Post Details


@femke_plantinga

How do professional RAG applications chunk their text?

Let’s cover some Advanced Chunking Techniques.

In our latest video, we cover simple chunking methods like splitting documents into sentences or sections. But these methods often miss out on ensuring each chunk has independent meaning.

Semantic chunking solved exactly this! By measuring the semantic similarity between sentences using vector embeddings, we can combine similar sentences into meaningful chunks. https://t.co/MLLBZdZCky

With LLM-based chunking, large language models help break down text effectively, although it can be slow and costly. https://t.co/B0IN09mgLw

And what about the newest Late Chunking? Which keeps context intact across chunks—more on that soon. 👀

In this video, we cover these advanced techniques in detail. Watch it to learn more.

A big shoutout to @drdannywilliams for helping create this video! 💚
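
The semantic chunking idea in the post (compare neighbouring sentences by embedding similarity and merge the similar ones) can be sketched roughly as below. This is a minimal illustration and not the method shown in the video: `toy_embed` is a hypothetical bag-of-words stand-in for a real embedding model, and the `semantic_chunks` name and the `threshold` value of 0.2 are assumptions chosen for the example.

```python
import math
import re
from collections import Counter


def toy_embed(sentence):
    """Hypothetical stand-in for an embedding model: a word-count vector."""
    return Counter(re.findall(r"[a-z0-9]+", sentence.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def semantic_chunks(sentences, embed=toy_embed, threshold=0.2):
    """Merge consecutive sentences while each one stays similar to the previous.

    A new chunk starts whenever the similarity between a sentence and the
    one before it falls below `threshold` (an assumed, untuned value).
    """
    chunks = []
    prev_vec = None
    for sentence in sentences:
        vec = embed(sentence)
        if prev_vec is not None and cosine(prev_vec, vec) >= threshold:
            chunks[-1].append(sentence)
        else:
            chunks.append([sentence])
        prev_vec = vec
    return [" ".join(chunk) for chunk in chunks]


sentences = [
    "Vector databases store embeddings.",
    "Embeddings let vector databases search by meaning.",
    "Barcelona is sunny in spring.",
    "Spring in Barcelona is warm and sunny.",
]
chunks = semantic_chunks(sentences)
for chunk in chunks:
    print(chunk)
```

In practice the `embed` argument would be swapped for a call to an actual embedding model, and the threshold would need tuning on real documents; word overlap is only a crude proxy for semantic similarity.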

🔧 Raw API Response

{
  "user": {
    "created_at": "2022-09-30T15:33:28.000Z",
    "default_profile_image": false,
    "description": "Developer Growth @weaviate_io Learn with me!",
    "fast_followers_count": 0,
    "favourites_count": 2077,
    "followers_count": 3713,
    "friends_count": 606,
    "has_custom_timelines": true,
    "is_translator": false,
    "listed_count": 37,
    "location": "Barcelona, Spain",
    "media_count": 101,
    "name": "Femke Plantinga",
    "normal_followers_count": 3713,
    "possibly_sensitive": false,
    "profile_banner_url": "https://pbs.twimg.com/profile_banners/1575871377957011459/1732049465",
    "profile_image_url_https": "https://pbs.twimg.com/profile_images/1858975405182459904/jbOWemea_normal.jpg",
    "screen_name": "femke_plantinga",
    "statuses_count": 1226,
    "translator_type": "none",
    "verified": true,
    "withheld_in_countries": [],
    "id_str": "1575871377957011459"
  },
  "id": "1851648729583145330",
  "conversation_id": "1851648729583145330",
  "full_text": "How do professional RAG applications chunk their text?\n\nLet’s cover some Advanced Chunking Techniques.\n\nIn our latest video, we cover simple chunking methods like splitting documents into sentences or sections. But these methods often miss out on ensuring each chunk has independent meaning.\n\nSemantic chunking solved exactly this! By measuring the semantic similarity between sentences using vector embeddings, we can combine similar sentences into meaningful chunks. https://t.co/MLLBZdZCky\n\nWith LLM-based chunking, large language models help break down text effectively, although it can be slow and costly. https://t.co/B0IN09mgLw\n\nAnd what about the newest Late Chunking? Which keeps context intact across chunks—more on that soon. 👀\n\nIn this video, we cover these advanced techniques in detail. Watch it to learn more.\n\nA big shoutout to @drdannywilliams for helping create this video! 💚",
  "reply_count": 3,
  "retweet_count": 69,
  "favorite_count": 471,
  "hashtags": [],
  "symbols": [],
  "user_mentions": [],
  "urls": [],
  "media": [
    {
      "media_url": "https://pbs.twimg.com/ext_tw_video_thumb/1851639253224099840/pu/img/dHxGJySIJm88Wh9X.jpg",
      "type": "video",
      "video_url": "https://video.twimg.com/ext_tw_video/1851639253224099840/pu/vid/avc1/1280x720/aoYgiqup7skI4GN1.mp4?tag=12"
    }
  ],
  "url": "https://twitter.com/femke_plantinga/status/1851648729583145330",
  "created_at": "2024-10-30T15:33:53.000Z",
  "#sort_index": "1851648729583145330",
  "view_count": 29020,
  "quote_count": 1,
  "is_quote_tweet": false,
  "is_retweet": false,
  "is_pinned": false,
  "is_truncated": true,
  "startUrl": "https://x.com/femke_plantinga/status/1851648729583145330"
}