🐦 Twitter Post Details

@avthars

VECTOR DATABASES ARE THE WRONG ABSTRACTION. Here’s a better way: introducing pgai Vectorizer, a new open-source PostgreSQL tool that automatically creates and syncs embeddings with source data, just like a database index.

❌ Why vector databases fail
Vector databases treat embeddings as independent data, divorced from the source data they are created from, rather than what they truly are: derived data.

This pitfall means that many AI projects that start out as simple vector search implementations inevitably evolve into a complex orchestra of monitoring, synchronization, and firefighting.

😓 Keeping embeddings in sync is hard
In an attempt to avoid stale embeddings, engineering teams have to build and maintain a maze of ETL pipelines, juggle multiple databases (vector DB, metadata store, lexical search), and manage complex queuing systems for updates.

Add monitoring for data drift, alert systems for stale results, and validation checks across systems, and you have a brittle infrastructure that inevitably breaks down, leading to stale embeddings and wasted engineering hours.

What if you could just use Postgres instead?

✅ Pgai Vectorizer: Vector embeddings as database indexes
Pgai Vectorizer treats embeddings like database indexes. It automatically creates, updates, and maintains embeddings as your data changes. Just like an index, the database handles all the complexity: syncing, versioning, and cleanup happen automatically.

This means no manual tracking, zero maintenance burden, and the freedom to rapidly experiment with different embedding models and chunking strategies without building new pipelines.

🤔 Why did we build pgai Vectorizer?
Our team at @timescaledb built pgai Vectorizer because many developers regard PostgreSQL as the “Swiss army knife” of databases, as it can handle everything from vectors and text data to JSON documents.

We think an “everything database” like PostgreSQL eliminates the nightmare of managing multiple databases, making it the ideal home for vectorizers and the foundation for AI applications.

⚙️ How does pgai Vectorizer work?
Check out the code snippet below: it takes just 6 lines of SQL to put your embedding creation pipeline on autopilot with pgai Vectorizer!

Under the hood, pgai Vectorizer checks for modifications to the source table (inserts, updates, and deletes) and asynchronously creates and updates vector embeddings in an external worker.

🧑‍💻 Sounds exciting! How can I get started?
Pgai Vectorizer is open-source under the PostgreSQL license and free to use on any PostgreSQL database. You can find installation instructions on the pgai GitHub repository (see end of post). It’s also available as a managed service in Timescale’s PostgreSQL cloud platform.

📚 Learn more
[1] Pgai GitHub repo: https://t.co/hut1MxuwPZ
[2] Technical explainer post: https://t.co/A9hOz482Rg

Share this post with your followers to let them know about pgai Vectorizer, and comment with your reactions and questions.
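The "under the hood" description in the post (track inserts, updates, and deletes on the source table, then create or update embeddings asynchronously in an external worker) can be illustrated with a toy Python sketch. This is not pgai's implementation or API. Every name here (`SourceTable`, `EmbeddingStore`, `toy_embed`, `run_worker`) is hypothetical, the in-memory queue merely stands in for whatever change tracking the real tool uses, and the hash-based `toy_embed` is a placeholder for a real embedding model.

```python
import hashlib
from collections import deque
from dataclasses import dataclass, field
from typing import Deque, Dict, List


def toy_embed(text: str, dim: int = 4) -> List[float]:
    """Hypothetical stand-in for an embedding model: a deterministic hash-based vector."""
    digest = hashlib.sha256(text.encode("utf-8")).digest()
    return [byte / 255.0 for byte in digest[:dim]]


@dataclass
class SourceTable:
    """Toy source table that records which row ids changed (assumed design)."""
    rows: Dict[int, str] = field(default_factory=dict)
    pending: Deque[int] = field(default_factory=deque)

    def upsert(self, row_id: int, text: str) -> None:
        # Inserts and updates both enqueue the row id for re-embedding.
        self.rows[row_id] = text
        self.pending.append(row_id)

    def delete(self, row_id: int) -> None:
        # Deletes are enqueued too, so the worker can drop the stale embedding.
        self.rows.pop(row_id, None)
        self.pending.append(row_id)


@dataclass
class EmbeddingStore:
    """Embeddings kept as derived data, keyed by the source row id."""
    vectors: Dict[int, List[float]] = field(default_factory=dict)


def run_worker(table: SourceTable, store: EmbeddingStore) -> int:
    """Drain the change queue; returns the number of queue entries processed."""
    processed = 0
    while table.pending:
        row_id = table.pending.popleft()
        text = table.rows.get(row_id)
        if text is None:
            # Row no longer exists: remove its embedding.
            store.vectors.pop(row_id, None)
        else:
            # Row was inserted or updated: (re)compute its embedding.
            store.vectors[row_id] = toy_embed(text)
        processed += 1
    return processed


if __name__ == "__main__":
    table, store = SourceTable(), EmbeddingStore()
    table.upsert(1, "postgres is an everything database")
    table.upsert(2, "embeddings are derived data")
    run_worker(table, store)
    table.delete(2)
    run_worker(table, store)
    print(sorted(store.vectors))
```

Writes to the source never wait on the embedding step in this sketch. Embeddings may lag briefly, but they converge to the source rows each time the worker drains the queue, which is the index-like behavior the post describes.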

📊 Media Metadata

{
  "score": 0.91,
  "scored_at": "2025-08-09T13:46:07.550835",
  "import_source": "network_archive_import",
  "links_checked": true,
  "checked_at": "2025-08-10T10:32:45.671982",
  "media": [
    {
      "type": "photo",
      "url": "https://crmoxkoizveukayfjuyo.supabase.co/storage/v1/object/public/media/posts/1851252850619277358/media_0.jpg?",
      "filename": "media_0.jpg"
    },
    {
      "id": "",
      "type": "photo",
      "url": null,
      "media_url": "https://pbs.twimg.com/media/GbD4lj6XgAEhGGH.jpg",
      "media_url_https": null,
      "display_url": null,
      "expanded_url": null
    }
  ],
  "reprocessed_at": "2025-08-12T15:25:38.984927",
  "reprocessed_reason": "missing_media_array",
  "original_structure": "had_both"
}

🔧 Raw API Response

{
  "user": {
    "created_at": "2010-08-18T06:20:15.000Z",
    "default_profile_image": false,
    "description": "ai + developer product builder @TimescaleDB | prev: startup founder, @Princeton cs | 🇿🇦 to the 🌎",
    "fast_followers_count": 0,
    "favourites_count": 58375,
    "followers_count": 3362,
    "friends_count": 5123,
    "has_custom_timelines": true,
    "is_translator": false,
    "listed_count": 69,
    "location": "the internet",
    "media_count": 700,
    "name": "Avthar",
    "normal_followers_count": 3362,
    "possibly_sensitive": false,
    "profile_banner_url": "https://pbs.twimg.com/profile_banners/179837138/1724335242",
    "profile_image_url_https": "https://pbs.twimg.com/profile_images/1526252101143351296/tI-vIbQG_normal.jpg",
    "screen_name": "avthars",
    "statuses_count": 9596,
    "translator_type": "none",
    "url": "https://t.co/MyuL53vmio",
    "verified": true,
    "withheld_in_countries": [],
    "id_str": "179837138"
  },
  "id": "1851252850619277358",
  "conversation_id": "1851252850619277358",
  "full_text": "VECTOR DATABASES ARE THE WRONG ABSTRACTION. Here’s a better way: introducing pgai Vectorizer, a new open-source PostgreSQL tool that automatically creates and syncs embeddings with source data, just like a database index.\n\n❌ Why vector databases fail\nVector databases treat embeddings as independent data, divorced from the source data from which embeddings are created, rather than what they truly are: derived data.\n\nThis pitfall means that many AI projects that start out as simple vector search implementations inevitably evolve into a complex orchestra of monitoring, synchronization, and firefighting.\n\n😓 Keeping embeddings in-sync is hard\nIn an attempt to avoid stale embeddings, engineering teams have to build and maintain a maze of ETL pipelines, juggle multiple databases (vector DB, metadata store, lexical search), and manage complex queuing systems for updates.\n\nAdd monitoring for data drift, alert systems for stale results, and validation checks across systems - and you have a brittle infrastructure that inevitably breaks down, leading to stale embeddings and wasted engineering hours.\n\nWhat if you could just use Postgres instead?\n\n✅ Pgai Vectorizer: Vector embeddings as database indexes\nPgai Vectorizer treats embeddings like database indexes. It automatically creates, updates, and maintains embeddings as your data changes. Just like an index, the database handles all the complexity: syncing, versioning, and cleanup happen automatically.\n\nThis means no manual tracking, zero maintenance burden, and the freedom to rapidly experiment with different embedding models and chunking strategies without building new pipelines.\n\n🤔Why did we build pgai Vectorizer?\nOur team at @timescaledb built pgai Vectorizer because many developers regard PostgreSQL as the “Swiss army knife” of databases, as it can handle everything from vectors and text data to JSON documents. \n\nWe think an “everything database” like PostgreSQL is the solution to eliminate the nightmare of managing multiple databases, making it the ideal home for vectorizers and the foundation for AI applications.\n\n⚙️How does pgai Vectorizer work?\nCheck out the code snippet below –  it takes just 6 lines of SQL to put your embedding creation pipeline on autopilot with pgai Vectorizer!\n\nUnder the hood, pgai Vectorizer checks for modifications to the source table (inserts, updates, and deletes) and asynchronously creates and updates vector embeddings in an external worker.\n\n🧑‍💻 Sounds exciting! How can I get started?\nPgai Vectorizer is open-source under the PostgreSQL license and available for free to use on any PostgreSQL database. You can find installation instructions on the pgai GitHub repository (see end of post). It’s also available as a managed service in Timescale’s PostgreSQL cloud platform.\n\n📚Learn more\n[1] Pgai github repo: https://t.co/hut1MxuwPZ\n[1] Technical explainer post: https://t.co/A9hOz482Rg\n\nShare this post with your followers to let them know about pgai Vectorizer and comment your reactions and questions.",
  "reply_count": 43,
  "retweet_count": 166,
  "favorite_count": 1069,
  "hashtags": [],
  "symbols": [],
  "user_mentions": [],
  "urls": [],
  "media": [
    {
      "media_url": "https://pbs.twimg.com/media/GbD4lj6XgAEhGGH.jpg",
      "type": "photo"
    }
  ],
  "url": "https://twitter.com/avthars/status/1851252850619277358",
  "created_at": "2024-10-29T13:20:48.000Z",
  "#sort_index": "1851252850619277358",
  "view_count": 105842,
  "quote_count": 16,
  "is_quote_tweet": false,
  "is_retweet": false,
  "is_pinned": false,
  "is_truncated": true,
  "startUrl": "https://x.com/avthars/status/1851252850619277358"
}