🐦 Twitter Post Details

@iScienceLuvr

Language Model Beats Diffusion -- Tokenizer is Key to Visual Generation

abs: https://t.co/MXOldpo8uk
website: https://t.co/7LaOqBANZY

This paper from Google & CMU introduces MAGVIT-v2, a joint image-video tokenizer which can be used with a masked language model to perform image and video generation. Obtains SOTA on class-conditional ImageNet 512x512 generation and Kinetics600 frame prediction. Also demonstrates use of MAGVIT-v2 for video compression.

📊 Media Metadata

{
  "data": [
    {
      "id": "",
      "type": "photo",
      "url": null,
      "media_url": "https://pbs.twimg.com/media/F8DxI6PbYAAqzGI.jpg",
      "media_url_https": null,
      "display_url": null,
      "expanded_url": null
    },
    {
      "id": "",
      "type": "photo",
      "url": null,
      "media_url": "https://pbs.twimg.com/media/F8DxMHKbkAAxIqX.jpg",
      "media_url_https": null,
      "display_url": null,
      "expanded_url": null
    }
  ],
  "score": 0.89,
  "scored_at": "2025-08-09T13:46:07.549680",
  "import_source": "manual_curation_2023",
  "links_checked": true,
  "checked_at": "2025-08-10T10:31:50.608729",
  "media": [
    {
      "type": "photo",
      "url": "https://crmoxkoizveukayfjuyo.supabase.co/storage/v1/object/public/media/posts/1711633145332920785/media_0.jpg?",
      "filename": "media_0.jpg",
      "original_url": "https://pbs.twimg.com/media/F8DxI6PbYAAqzGI.jpg"
    },
    {
      "type": "photo",
      "url": "https://crmoxkoizveukayfjuyo.supabase.co/storage/v1/object/public/media/posts/1711633145332920785/media_1.jpg?",
      "filename": "media_1.jpg",
      "original_url": "https://pbs.twimg.com/media/F8DxMHKbkAAxIqX.jpg"
    }
  ],
  "storage_migrated": true
}

🔧 Raw API Response

{
  "user": {
    "created_at": "2011-12-20T03:45:50.000Z",
    "default_profile_image": false,
    "description": "PhD at 19 |\nFounder and CEO at @MedARC_AI |\nResearch Director at @StabilityAI | \n@kaggle Notebooks GM |\nBiomed. engineer @ 14 |\nTEDx talk➡https://t.co/DwMkst4bnG",
    "fast_followers_count": 0,
    "favourites_count": 60004,
    "followers_count": 45439,
    "friends_count": 995,
    "has_custom_timelines": true,
    "is_translator": false,
    "listed_count": 703,
    "location": "",
    "media_count": 1203,
    "name": "Tanishq Mathew Abraham, PhD",
    "normal_followers_count": 45439,
    "possibly_sensitive": false,
    "profile_banner_url": "https://pbs.twimg.com/profile_banners/441465751/1675968078",
    "profile_image_url_https": "https://pbs.twimg.com/profile_images/1553508977735962624/nnlSwBmu_normal.jpg",
    "screen_name": "iScienceLuvr",
    "statuses_count": 12087,
    "translator_type": "none",
    "url": "https://t.co/nNzCz2VVd1",
    "verified": false,
    "withheld_in_countries": [],
    "id_str": "441465751"
  },
  "id": "1711633145332920785",
  "conversation_id": "1711633145332920785",
  "full_text": "Language Model Beats Diffusion -- Tokenizer is Key to Visual Generation\n\nabs: https://t.co/MXOldpo8uk\nwebsite: https://t.co/7LaOqBANZY\n\nThis paper from Google & CMU introduces MAGVIT-v2, a joint image-video tokenizer which can be used with a masked language model to perform image and video generation. Obtains SOTA on class-conditional ImageNet 512x512 generation and Kinetics600 frame prediction. Also demonstrates use of MAGVIT-v2 for video compression.",
  "reply_count": 7,
  "retweet_count": 79,
  "favorite_count": 399,
  "hashtags": [],
  "symbols": [],
  "user_mentions": [],
  "urls": [
    {
      "url": "https://t.co/DKWS0MdqwS",
      "expanded_url": "https://arxiv.org/abs/2310.05737",
      "display_url": "arxiv.org/abs/2310.05737"
    },
    {
      "url": "https://t.co/TcWPD0pWMF",
      "expanded_url": "https://magvit.cs.cmu.edu/v2/",
      "display_url": "magvit.cs.cmu.edu/v2/"
    }
  ],
  "media": [
    {
      "media_url": "https://pbs.twimg.com/media/F8DxI6PbYAAqzGI.jpg",
      "type": "photo"
    },
    {
      "media_url": "https://pbs.twimg.com/media/F8DxMHKbkAAxIqX.jpg",
      "type": "photo"
    }
  ],
  "url": "https://twitter.com/iScienceLuvr/status/1711633145332920785",
  "created_at": "2023-10-10T06:41:57.000Z",
  "#sort_index": "1711633145332920785",
  "view_count": 77427,
  "quote_count": 6,
  "is_quote_tweet": false,
  "is_retweet": false,
  "is_pinned": false,
  "is_truncated": true,
  "startUrl": "https://twitter.com/iscienceluvr/status/1711633145332920785"
}