🐦 Twitter Post Details

@cevianNY

Vector Databases should actually be Vector Indexes

Imagine if every time you inserted or updated a row, you had to reach out to an external system to update the associated B-trees. Each call risks failure and rate limits, and throws in queuing, tracking, staleness handling, and overall complexity. Sounds like some 1984-style dystopia, right? (Well, actually, in 1984 Ingres already managed indexes automatically....)

And yet, here in 2024, we’re all too willing to deal with this exact BS for vector indexes.

Take a simple example of embedding blog posts. Vector databases treat chunks and embeddings as isolated data atoms, detached from the source data itself. This means each time I publish a new post or edit an old one, I need to manually update embeddings in Pinecone, Qdrant, Weaviate, etc. Or I need to set up a complex web service with monitoring and retry logic to handle it all automatically. Either way, it’s a giant headache, and it shouldn’t have to be this way.

That’s why we built pgai Vectorizer, making embedding creation and synchronization as easy as using an index in PostgreSQL. With Vectorizer, you simply have a blog table in your database, and create a vectorizer with a single line of code as seen below.

From there, pgai Vectorizer automatically creates embeddings for your blog entries and keeps them in sync with every insert, delete, or update in your blog table. No custom data workflows, infrastructure, or constant monitoring required.

There are far more interesting (and fun) challenges in AI than babysitting data infrastructure. Let us take on that burden for you.

📊 Media Metadata

{
  "score": 0.81,
  "scored_at": "2025-08-09T13:46:07.550857",
  "import_source": "network_archive_import",
  "media": [
    {
      "type": "photo",
      "url": "https://crmoxkoizveukayfjuyo.supabase.co/storage/v1/object/public/media/posts/1851320996449726534/media_0.jpg?",
      "filename": "media_0.jpg"
    },
    {
      "id": "",
      "type": "photo",
      "url": null,
      "media_url": "https://pbs.twimg.com/media/GbE2j1VWYAA3vHG.jpg",
      "media_url_https": null,
      "display_url": null,
      "expanded_url": null
    }
  ],
  "reprocessed_at": "2025-08-12T15:25:40.006685",
  "reprocessed_reason": "missing_media_array",
  "original_structure": "had_both"
}

🔧 Raw API Response

{
  "user": {
    "created_at": "2016-07-07T15:28:02.000Z",
    "default_profile_image": false,
    "description": "Technical Leader @TimescaleDB heading up the AI and vector DB stuff. He/his.",
    "fast_followers_count": 0,
    "favourites_count": 7569,
    "followers_count": 443,
    "friends_count": 629,
    "has_custom_timelines": true,
    "is_translator": false,
    "listed_count": 4,
    "location": "",
    "media_count": 40,
    "name": "Matvey Arye 🇺🇦",
    "normal_followers_count": 443,
    "possibly_sensitive": false,
    "profile_image_url_https": "https://pbs.twimg.com/profile_images/1451750170840997892/BOdW9MvK_normal.jpg",
    "screen_name": "cevianNY",
    "statuses_count": 1189,
    "translator_type": "none",
    "verified": true,
    "withheld_in_countries": [],
    "id_str": "751075308862865408"
  },
  "id": "1851320996449726534",
  "conversation_id": "1851320996449726534",
  "full_text": "Vector Databases should actually be Vector Indexes\n\nImagine if every time you inserted or updated a row, you had to reach out to an external system to update the associated B-trees. Each call risks failure, rate-limits, and throws in queuing, tracking, staleness handling, and overall complexity. Sounds like some 1984-style dystopia, right? (Well, actually, in 1984 Ingres already managed indexes automatically....)\n\nAnd yet, here in 2024, we’re all too willing to deal with this exact BS for vector indexes.\n\nTake a simple example of embedding blog posts. Vector databases treat chunks and embeddings as isolated data atoms, detached from the source data itself. This means each time I publish a new post or edit an old one, I need to manually update embeddings in Pinecone, Qdrant, Weviate, etc. Or I need to set up a complex web-service with monitoring and retry logic to handle it all automatically. Either way, it’s a giant headache, and it shouldn’t have to be this way.\n\nThat’s why we built pgai Vectorizer — making embedding creation and synchronization as easy as using an index in PostgreSQL. With Vectorizer, you simply have a blog table in your database, and create a vectorizer with a single line of code as seen below. \n\nFrom there, pgai Vectorizer automatically creates embeddings for your blog entries and keeps them in sync with every insert, delete, or update in your blog table. No custom data workflows, infrastructure, or constant monitoring required.\n\nThere are far more interesting (and fun) challenges in AI than babysitting data infrastructure. Let us take on that burden for you.",
  "reply_count": 3,
  "retweet_count": 5,
  "favorite_count": 26,
  "hashtags": [],
  "symbols": [],
  "user_mentions": [],
  "urls": [],
  "media": [
    {
      "media_url": "https://pbs.twimg.com/media/GbE2j1VWYAA3vHG.jpg",
      "type": "photo"
    }
  ],
  "url": "https://twitter.com/cevianNY/status/1851320996449726534",
  "created_at": "2024-10-29T17:51:36.000Z",
  "#sort_index": "1851320996449726534",
  "view_count": 2623,
  "quote_count": 2,
  "is_quote_tweet": false,
  "is_retweet": false,
  "is_pinned": false,
  "is_truncated": true,
  "startUrl": "https://x.com/cevianny/status/1851320996449726534"
}