🐦 Twitter Post Details

@DrJimFan

In 2021, Meta Reality Labs published a method called Pixel Codec Avatars (PiCA). I didn't realize its significance until @lexfridman's one-of-a-kind podcast.

PiCA is actually the MP4 format for VR. A brand new protocol for 3D streaming.

Here's the intuition:
- The encoder first compresses the image captured by VR face cam into a latent code. The code captures the fine-grained facial expression and nuances, which give Lex's interview a hyper-realistic touch.
- Send the latent code over internet - wayyy more efficient than sending 3D mesh or images over.
- The decoder does two things:
(1) Reconstruct the global, 3D geometry of the face & expression in real-time.
(2) Re-render the color at each pixel, given a particular viewing angle.

PiCA does NOT render any pixels that are occluded, i.e. the back of Lex and Mark's heads actually don't exist. I find an intriguing connection to the Simulation Hypothesis: the world isn't there until you actively look at it.
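The encode → transmit → decode flow described in the tweet can be sketched in toy code. Everything below (image resolution, latent size, vertex count, and the linear "networks") is a hypothetical stand-in for PiCA's learned neural encoder and decoder; only the data flow mirrors the description above.

```python
import numpy as np

# Toy stand-ins for PiCA's learned components (all sizes are assumptions).
IMG = (64, 64, 3)   # face-cam frame resolution (toy value)
LATENT = 128        # latent expression-code size (toy value)
VERTS = 500         # face-mesh vertex count (toy value)

rng = np.random.default_rng(0)
W_enc = rng.standard_normal((np.prod(IMG), LATENT)) * 0.01  # fake "encoder"
W_geo = rng.standard_normal((LATENT, VERTS * 3)) * 0.01     # fake "decoder"

def encode(face_image):
    """Headset side: compress the face-cam frame into a compact latent code."""
    return face_image.reshape(-1) @ W_enc          # shape (LATENT,)

def decode_geometry(code):
    """Receiver side, step (1): reconstruct 3D face geometry from the code."""
    return (code @ W_geo).reshape(VERTS, 3)        # (x, y, z) per vertex

def decode_pixel(code, pixel_uv, view_dir, visible):
    """Receiver side, step (2): view-dependent color for ONE pixel.
    Occluded pixels are never shaded -- they simply don't exist."""
    if not visible:
        return None
    feat = np.concatenate([code[:8], pixel_uv, view_dir])  # tiny toy feature
    return np.tanh(feat.sum(keepdims=True).repeat(3))      # fake RGB triple

frame = rng.random(IMG)
code = encode(frame)        # only this small code crosses the network
mesh = decode_geometry(code)
rgb = decode_pixel(code, np.array([0.2, 0.7]), np.array([0.0, 0.0, 1.0]), True)

# The transmitted payload is far smaller than the raw frame:
print(code.nbytes, frame.nbytes)  # 1024 vs 98304 bytes at float64
```

The efficiency claim in the tweet falls out of the shapes alone: the latent code is a short vector, while a mesh or raw image is orders of magnitude larger, so only the code needs to travel over the wire.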

🔧 Raw API Response

{
  "user": {
    "created_at": "2012-12-12T22:11:27.000Z",
    "default_profile_image": false,
    "description": "@NVIDIA Senior AI Scientist. @Stanford PhD. Join me on the frontier of AI Agents, LLM & Robotics. MineDojo (NeurIPS Best Paper), Voyager. Ex: @OpenAI, @GoogleAI",
    "fast_followers_count": 0,
    "favourites_count": 6218,
    "followers_count": 144111,
    "friends_count": 2809,
    "has_custom_timelines": true,
    "is_translator": false,
    "listed_count": 2891,
    "location": "Views my own. Get in touch →",
    "media_count": 642,
    "name": "Jim Fan",
    "normal_followers_count": 144111,
    "possibly_sensitive": false,
    "profile_banner_url": "https://pbs.twimg.com/profile_banners/1007413134/1672408318",
    "profile_image_url_https": "https://pbs.twimg.com/profile_images/1554922493101559808/SYSZhbcd_normal.jpg",
    "screen_name": "DrJimFan",
    "statuses_count": 2967,
    "translator_type": "none",
    "url": "https://t.co/H4rXo4Ei8X",
    "verified": false,
    "withheld_in_countries": [],
    "id_str": "1007413134"
  },
  "id": "1712144040744136751",
  "conversation_id": "1712144040744136751",
  "full_text": "In 2021, Meta Reality Labs published a method called Pixel Codec Avatars (PiCA). I didn't realize its significance until @lexfridman's one-of-a-kind podcast.\n\nPiCA is actually the MP4 format for VR. A brand new protocol for 3D streaming.\n\nHere's the intuition:\n- The encoder first compresses the image captured by VR face cam into a latent code. The code captures the fine-grained facial expression and nuances, which give Lex's interview a hyper-realistic touch. \n- Send the latent code over internet - wayyy more efficient than sending 3D mesh or images over.\n- The decoder does two things:\n(1) Reconstruct the global, 3D geometry of the face & expression in real-time.\n(2) Re-render the color at each pixel, given a particular viewing angle.\n\nPiCA does NOT render any pixels that are occluded, i.e. the back of Lex and Mark's heads actually don't exist. I find an intriguing connection to the Simulation Hypothesis: the world isn't there until you actively look at it.",
  "reply_count": 38,
  "retweet_count": 219,
  "favorite_count": 1662,
  "hashtags": [],
  "symbols": [],
  "user_mentions": [
    {
      "id_str": "427089628",
      "name": "Lex Fridman",
      "screen_name": "lexfridman",
      "profile": "https://twitter.com/lexfridman"
    }
  ],
  "urls": [],
  "media": [
    {
      "media_url": "https://pbs.twimg.com/media/F8K9koVacAASMLY.jpg",
      "type": "photo"
    },
    {
      "media_url": "https://pbs.twimg.com/media/F8K9nZIasAAs2yH.jpg",
      "type": "photo"
    }
  ],
  "url": "https://twitter.com/DrJimFan/status/1712144040744136751",
  "created_at": "2023-10-11T16:32:04.000Z",
  "#sort_index": "1712144040744136751",
  "view_count": 400467,
  "quote_count": 21,
  "is_quote_tweet": false,
  "is_retweet": false,
  "is_pinned": false,
  "is_truncated": true,
  "startUrl": "https://twitter.com/drjimfan/status/1712144040744136751"
}