🐦 Twitter Post Details

@DrJimFan

As GPT4-V is rolling out, you'll see a new hype wave of "AutoGPTs" and "GPT-Engineers", this time promising to convert sketches to full-blown apps.

Cool demo is one thing. Truly useful for everyday work is another matter entirely.

Don't get me wrong, I'm a big believer & practitioner in multimodal models long before they are sexy. Nothing makes me happier to see more people trying out this new tech and sharing their findings.

But it's important to be grounded in reality. The demos you see are rarely useful. No one needs a barebone app or website built from scratch, with little control over the details and features. It’s the same thing as no one trusts GPT to write a full code repo for anything serious, but everyone uses GitHub Co-pilot to boost productivity.

The keyword here is contextual. Here’s where GPT-4V for coding will truly be useful: a visual co-pilot that is conditioned on your 10,000 lines of code context, and helps you refine your GUI, UX, and aesthetics step by step.

You as the engineer do not give up control, but rather have an extra pair of eyes to aid you in the pixel design space. This is a much more demanding task than regurgitating generic templates.

Is GPT-4V already there? Likely not. We may need to develop more robust, no-gradient algorithms on top of the raw model, or find better training recipes altogether. In any case, I believe Co-pilot 2.0 is the way to go beyond parlor tricks into real economic value for the near future.

🔧 Raw API Response

{
  "user": {
    "created_at": "2012-12-12T22:11:27.000Z",
    "default_profile_image": false,
    "description": "@NVIDIA Senior AI Scientist. @Stanford PhD. Join me on the frontier of AI Agents, LLM & Robotics. MineDojo (NeurIPS Best Paper), Voyager. Ex: @OpenAI, @GoogleAI",
    "fast_followers_count": 0,
    "favourites_count": 6202,
    "followers_count": 142746,
    "friends_count": 2804,
    "has_custom_timelines": true,
    "is_translator": false,
    "listed_count": 2879,
    "location": "Views my own. Get in touch →",
    "media_count": 638,
    "name": "Jim Fan",
    "normal_followers_count": 142746,
    "possibly_sensitive": false,
    "profile_banner_url": "https://pbs.twimg.com/profile_banners/1007413134/1672408318",
    "profile_image_url_https": "https://pbs.twimg.com/profile_images/1554922493101559808/SYSZhbcd_normal.jpg",
    "screen_name": "DrJimFan",
    "statuses_count": 2957,
    "translator_type": "none",
    "url": "https://t.co/H4rXo4Ei8X",
    "verified": false,
    "withheld_in_countries": [],
    "id_str": "1007413134"
  },
  "id": "1709584187748122787",
  "conversation_id": "1709584187748122787",
  "full_text": "As GPT4-V is rolling out, you'll see a new hype wave of \"AutoGPTs\" and \"GPT-Engineers\", this time promising to convert sketches to full-blown apps.\n\nCool demo is one thing. Truly useful for everyday work is another matter entirely.\n\nDon't get me wrong, I'm a big believer & practitioner in multimodal models long before they are sexy. Nothing makes me happier to see more people trying out this new tech and sharing their findings.\n\nBut it's important to be grounded in reality. The demos you see are rarely useful. No one needs a barebone app or website built from scratch, with little control over the details and features. It’s the same thing as no one trusts GPT to write a full code repo for anything serious, but everyone uses GitHub Co-pilot to boost productivity.\n\nThe keyword here is contextual. Here’s where GPT-4V for coding will truly be useful: a visual co-pilot that is conditioned on your 10,000 lines of code context, and helps you refine your GUI, UX, and aesthetics step by step. \n\nYou as the engineer do not give up control, but rather have an extra pair of eyes to aid you in the pixel design space. This is a much more demanding task than regurgitating generic templates.\n\nIs GPT-4V already there? Likely not. We may need to develop more robust, no-gradient algorithms on top of the raw model, or find better training recipes altogether. In any case, I believe Co-pilot 2.0 is the way to go beyond parlor tricks into real economic value for the near future.",
  "reply_count": 32,
  "retweet_count": 109,
  "favorite_count": 551,
  "hashtags": [],
  "symbols": [],
  "user_mentions": [],
  "urls": [],
  "media": [
    {
      "media_url": "https://pbs.twimg.com/media/F7mpCmDa8AA99SR.jpg",
      "type": "photo"
    }
  ],
  "url": "https://twitter.com/DrJimFan/status/1709584187748122787",
  "created_at": "2023-10-04T15:00:07.000Z",
  "#sort_index": "1709584187748122787",
  "view_count": 189588,
  "quote_count": 17,
  "is_quote_tweet": false,
  "is_retweet": false,
  "is_pinned": false,
  "is_truncated": true,
  "startUrl": "https://twitter.com/drjimfan/status/1709584187748122787"
}