๐Ÿฆ Twitter Post Details

Viewing enriched Twitter post

@victorialslocum

"Just fine-tune your embeddings" they said.

"It'll fix your RAG system" they said.

They were wrong. Here's what actually works:

After working with countless retrieval systems, I've noticed a pattern: teams often jump straight to fine-tuning when their vector search underperforms. But that's like replacing your car engine when you might just need better tires.

First, debug before you fine-tune:
Before spending time and compute on fine-tuning, ask yourself:
• Do many queries need exact keyword matches? → Try hybrid search first
• Are your chunks oddly split or lacking context? → Experiment with different chunking techniques like late chunking
• Is the model missing general semantic relationships? → Try a larger model or one with more dimensions
• Is it only failing on your specific domain terminology? → NOW we're talking fine-tuning territory

When fine-tuning makes sense:
Fine-tuning shines when off-the-shelf models can't grasp your domain-specific language. Pre-trained models learn from Wikipedia and web crawls - they don't know your company's product names or industry jargon.

The payoff can be substantial:
• Better retrieval = better RAG performance
• Smaller fine-tuned models can outperform larger general ones
• Lower costs and latency for domain-specific tasks

The technical deep-dive:
Fine-tuning embedding models isn't like fine-tuning LLMs. It's all about adjusting distances in vector space using contrastive learning.

Three main approaches:
1. Multiple Negatives Ranking Loss: Just needs query-context pairs. Treats other examples in the batch as negatives - elegant and popular
2. Triplet Loss: Requires (anchor, positive, negative) triplets. Great for precise control but finding good hard negatives is tricky
3. Cosine Embedding Loss: Uses similarity scores between sentence pairs. Perfect when you have gradients of similarity

Practical considerations:
• Start with 1,000-5,000 high-quality samples for narrow domains
• Plan for 10,000+ for complex specialized terminology
• Good news: fine-tuning can run on consumer GPUs or free Google Colab for smaller models
• Always evaluate against a baseline - use metrics like MRR, Recall@k, or NDCG

Pro tip: The MTEB leaderboard is your friend for finding base models, but remember - leaderboard performance doesn't always translate to your specific use case.

The bottom line? Fine-tuning is powerful but it's not a magic bullet. Sometimes your retrieval problems need a different solution entirely. Debug systematically, and when you do fine-tune, start small and iterate.

Check out the full technical blog - it includes code examples for both Hugging Face and AWS SageMaker integrations: https://t.co/PH1djlDFDt

Media 1

📊 Media Metadata

{
  "media": [
    {
      "url": "https://crmoxkoizveukayfjuyo.supabase.co/storage/v1/object/public/media/posts/1955595439241044274/media_0.jpg?",
      "media_url": "https://crmoxkoizveukayfjuyo.supabase.co/storage/v1/object/public/media/posts/1955595439241044274/media_0.jpg?",
      "type": "photo",
      "filename": "media_0.jpg"
    }
  ],
  "processed_at": "2025-08-15T08:12:25.872078",
  "pipeline_version": "2.0"
}

🔧 Raw API Response

{
  "type": "tweet",
  "id": "1955595439241044274",
  "url": "https://x.com/victorialslocum/status/1955595439241044274",
  "twitterUrl": "https://twitter.com/victorialslocum/status/1955595439241044274",
  "text": "\"Just fine-tune your embeddings\" they said.\n\n\"It'll fix your RAG system\" they said.\n\nThey were wrong. Here's what actually works:\n\nAfter working with countless retrieval systems, I've noticed a pattern: teams often jump straight to fine-tuning when their vector search underperforms. But that's like replacing your car engine when you might just need better tires.\n\n๐—™๐—ถ๐—ฟ๐˜€๐˜, ๐—ฑ๐—ฒ๐—ฏ๐˜‚๐—ด ๐—ฏ๐—ฒ๐—ณ๐—ผ๐—ฟ๐—ฒ ๐˜†๐—ผ๐˜‚ ๐—ณ๐—ถ๐—ป๐—ฒ-๐˜๐˜‚๐—ป๐—ฒ:\nBefore spending time and compute on fine-tuning, ask yourself:\nโ€ข Do many queries need exact keyword matches? โ†’ Try hybrid search first\nโ€ข Are your chunks oddly split or lacking context? โ†’ Experiment with different chunking techniques like late chunking\nโ€ข Is the model missing general semantic relationships? โ†’ Try a larger model or one with more dimensions\nโ€ข Is it only failing on your specific domain terminology? โ†’ NOW we're talking fine-tuning territory\n\n๐—ช๐—ต๐—ฒ๐—ป ๐—ณ๐—ถ๐—ป๐—ฒ-๐˜๐˜‚๐—ป๐—ถ๐—ป๐—ด ๐—บ๐—ฎ๐—ธ๐—ฒ๐˜€ ๐˜€๐—ฒ๐—ป๐˜€๐—ฒ:\nFine-tuning shines when off-the-shelf models can't grasp your domain-specific language. Pre-trained models learn from Wikipedia and web crawls - they don't know your company's product names or industry jargon.\n\nThe payoff can be substantial:\nโ€ข Better retrieval = better RAG performance\nโ€ข Smaller fine-tuned models can outperform larger general ones\nโ€ข Lower costs and latency for domain-specific tasks\n\n๐—ง๐—ต๐—ฒ ๐˜๐—ฒ๐—ฐ๐—ต๐—ป๐—ถ๐—ฐ๐—ฎ๐—น ๐—ฑ๐—ฒ๐—ฒ๐—ฝ-๐—ฑ๐—ถ๐˜ƒ๐—ฒ:\nFine-tuning embedding models isn't like fine-tuning LLMs. It's all about adjusting distances in vector space using contrastive learning.\n\nThree main approaches:\n1. ๐— ๐˜‚๐—น๐˜๐—ถ๐—ฝ๐—น๐—ฒ ๐—ก๐—ฒ๐—ด๐—ฎ๐˜๐—ถ๐˜ƒ๐—ฒ๐˜€ ๐—ฅ๐—ฎ๐—ป๐—ธ๐—ถ๐—ป๐—ด ๐—Ÿ๐—ผ๐˜€๐˜€: Just needs query-context pairs. Treats other examples in the batch as negatives - elegant and popular\n2. 
๐—ง๐—ฟ๐—ถ๐—ฝ๐—น๐—ฒ๐˜ ๐—Ÿ๐—ผ๐˜€๐˜€: Requires (anchor, positive, negative) triplets. Great for precise control but finding good hard negatives is tricky\n3. ๐—–๐—ผ๐˜€๐—ถ๐—ป๐—ฒ ๐—˜๐—บ๐—ฏ๐—ฒ๐—ฑ๐—ฑ๐—ถ๐—ป๐—ด ๐—Ÿ๐—ผ๐˜€๐˜€: Uses similarity scores between sentence pairs. Perfect when you have gradients of similarity\n\n๐—ฃ๐—ฟ๐—ฎ๐—ฐ๐˜๐—ถ๐—ฐ๐—ฎ๐—น ๐—ฐ๐—ผ๐—ป๐˜€๐—ถ๐—ฑ๐—ฒ๐—ฟ๐—ฎ๐˜๐—ถ๐—ผ๐—ป๐˜€:\nโ€ข Start with 1,000-5,000 high-quality samples for narrow domains\nโ€ข Plan for 10,000+ for complex specialized terminology\nโ€ข Good news: fine-tuning can run on consumer GPUs or free Google Colab for smaller models\nโ€ข Always evaluate against a baseline - use metrics like MRR, Recall@k, or NDCG\n\n๐—ฃ๐—ฟ๐—ผ ๐˜๐—ถ๐—ฝ: The MTEB leaderboard is your friend for finding base models, but remember - leaderboard performance doesn't always translate to your specific use case.\n\nThe bottom line? Fine-tuning is powerful but it's not a magic bullet. Sometimes your retrieval problems need a different solution entirely. Debug systematically, and when you do fine-tune, start small and iterate.\n\nCheck out the full technical blog - it includes code examples for both Hugging Face and AWS SageMaker integrations: https://t.co/PH1djlDFDt",
  "source": "Twitter for iPhone",
  "retweetCount": 154,
  "replyCount": 11,
  "likeCount": 985,
  "quoteCount": 2,
  "viewCount": 62511,
  "createdAt": "Wed Aug 13 11:41:00 +0000 2025",
  "lang": "en",
  "bookmarkCount": 1392,
  "isReply": false,
  "inReplyToId": null,
  "conversationId": "1955595439241044274",
  "inReplyToUserId": null,
  "inReplyToUsername": null,
  "author": {
    "type": "user",
    "userName": "victorialslocum",
    "url": "https://x.com/victorialslocum",
    "twitterUrl": "https://twitter.com/victorialslocum",
    "id": "1350861258539450371",
    "name": "Victoria Slocum",
    "isVerified": false,
    "isBlueVerified": true,
    "verifiedType": null,
    "profilePicture": "https://pbs.twimg.com/profile_images/1869063543678468096/AKcYQ5RR_normal.jpg",
    "coverPicture": "https://pbs.twimg.com/profile_banners/1350861258539450371/1751375217",
    "description": "",
    "location": "Berlin, Germany",
    "followers": 6466,
    "following": 546,
    "status": "",
    "canDm": true,
    "canMediaTag": true,
    "createdAt": "Sun Jan 17 17:43:37 +0000 2021",
    "entities": {
      "description": {
        "urls": []
      },
      "url": {}
    },
    "fastFollowersCount": 0,
    "favouritesCount": 3873,
    "hasCustomTimelines": true,
    "isTranslator": false,
    "mediaCount": 168,
    "statusesCount": 649,
    "withheldInCountries": [],
    "affiliatesHighlightedLabel": {},
    "possiblySensitive": false,
    "pinnedTweetIds": [
      "1897612417237983580"
    ],
    "profile_bio": {
      "description": "learning cool stuff, machine learning engineer at @weaviate_io ๐Ÿ’™",
      "entities": {
        "description": {
          "user_mentions": [
            {
              "id_str": "0",
              "indices": [
                50,
                62
              ],
              "name": "",
              "screen_name": "weaviate_io"
            }
          ]
        },
        "url": {
          "urls": [
            {
              "display_url": "victoriaslocum.com",
              "expanded_url": "https://victoriaslocum.com",
              "indices": [
                0,
                23
              ],
              "url": "https://t.co/jldayD93ku"
            }
          ]
        }
      }
    },
    "isAutomated": false,
    "automatedBy": null
  },
  "extendedEntities": {
    "media": [
      {
        "allow_download_status": {
          "allow_download": true
        },
        "display_url": "pic.twitter.com/yNlcX5GJeb",
        "expanded_url": "https://twitter.com/victorialslocum/status/1955595439241044274/photo/1",
        "ext_media_availability": {
          "status": "Available"
        },
        "id_str": "1955172816359153664",
        "indices": [
          270,
          293
        ],
        "media_key": "16_1955172816359153664",
        "media_results": {
          "id": "QXBpTWVkaWFSZXN1bHRzOgwAAgoAARsiK1FJlmAACgACGyOrsNhWITIAAA==",
          "result": {
            "__typename": "ApiMedia",
            "id": "QXBpTWVkaWE6DAACCgABGyIrUUmWYAAKAAIbI6uw2FYhMgAA",
            "media_key": "16_1955172816359153664"
          }
        },
        "media_url_https": "https://pbs.twimg.com/tweet_video_thumb/GyIrUUmWYAAAuHj.jpg",
        "original_info": {
          "focus_rects": [],
          "height": 1914,
          "width": 1384
        },
        "sizes": {
          "large": {
            "h": 1914,
            "w": 1384
          }
        },
        "type": "animated_gif",
        "url": "https://t.co/yNlcX5GJeb",
        "video_info": {
          "aspect_ratio": [
            692,
            957
          ],
          "variants": [
            {
              "bitrate": 0,
              "content_type": "video/mp4",
              "url": "https://video-s.twimg.com/tweet_video/GyIrUUmWYAAAuHj.mp4"
            }
          ]
        }
      }
    ]
  },
  "card": null,
  "place": {},
  "entities": {
    "urls": [
      {
        "display_url": "weaviate.io/blog/fine-tuneโ€ฆ",
        "expanded_url": "https://weaviate.io/blog/fine-tune-embedding-model?utm_source=channels&utm_medium=vs_social&utm_campaign=dev_education&utm_content=animated_diagram_post_680683561",
        "indices": [
          2676,
          2699
        ],
        "url": "https://t.co/PH1djlDFDt"
      }
    ]
  },
  "quoted_tweet": null,
  "retweeted_tweet": null,
  "article": null
}