🐦 Twitter Post Details

@karpathy

@BrownCoyoteStd The code to train a GPT is only ~1,000 lines of code. In the case of GPT training the success criteria is quite simple: reach the lowest possible loss (meaning that your GPT is predicting the next token well), but don't regress running time, keep memory in check, and keep a sense of simplicity/aesthetics (don't bloat the code too much to get a small gain). Because 1) the criteria is objective and 2) because AI agents can now write code quite well, instead of having a human think up experiment ideas and try them out one by one (e.g. my entire PhD basically), you just get the AI to do the whole thing. My prompt ("AI source code") in this example is just ~120 lines of markdown document explaining the thing to the AI. The AI of today is very good at implementing ideas, but a lot less good at coming up with creative ones. So honestly, it's a lot closer to hyperparameter tuning right now than coming up with new/novel research, but 1) i didn't super tune the prompts yet, maybe you can just try to ask and 2) it's clear what the trajectory of this is as the AI capability improves - it's AI improving the next version of itself autonomously, maybe with human researchers throwing some ideas into the mix once in a while.
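The success criteria the tweet describes (minimize loss, but don't regress runtime or memory) can be sketched as a simple acceptance check. This is a hypothetical illustration, not code from the referenced project; the function name, field names, and slack thresholds are all invented for the example.

```python
# Hypothetical sketch of the success criteria described in the tweet:
# accept a candidate change only if it lowers loss while keeping runtime
# and peak memory within a small tolerance of the baseline.
def accepts(candidate, baseline, time_slack=1.05, mem_slack=1.05):
    """True if candidate beats baseline loss without regressing
    runtime or memory beyond the allowed slack factors."""
    return (candidate["loss"] < baseline["loss"]
            and candidate["runtime_s"] <= baseline["runtime_s"] * time_slack
            and candidate["peak_mem_gb"] <= baseline["peak_mem_gb"] * mem_slack)

# Illustrative numbers only.
baseline = {"loss": 3.28, "runtime_s": 3600, "peak_mem_gb": 38.0}
candidate = {"loss": 3.21, "runtime_s": 3650, "peak_mem_gb": 38.5}
print(accepts(candidate, baseline))  # True: lower loss, regressions within slack
```

Because the criterion is objective and cheap to evaluate, an agent can propose, run, and score experiments in a loop without a human ranking them by hand.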

📊 Media Metadata

{
  "score": 0.42,
  "score_components": {
    "author": 0.09,
    "engagement": 0.0,
    "quality": 0.12,
    "source": 0.135,
    "nlp": 0.05,
    "recency": 0.025
  },
  "scored_at": "2026-03-07T07:24:40.214161",
  "import_source": "api_import",
  "source_tagged_at": "2026-03-07T07:24:40.214182",
  "enriched": true,
  "enriched_at": "2026-03-07T07:24:40.214188"
}
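The `score_components` above appear to sum exactly to the top-level `score` (0.09 + 0.0 + 0.12 + 0.135 + 0.05 + 0.025 = 0.42), suggesting the enrichment score is an additive combination of per-dimension components. That additive structure is an assumption read off these numbers, not a documented formula; a quick sketch to verify it:

```python
# Components copied from the media metadata above.
score_components = {
    "author": 0.09,
    "engagement": 0.0,
    "quality": 0.12,
    "source": 0.135,
    "nlp": 0.05,
    "recency": 0.025,
}

# Assumption: the top-level score is the plain sum of the components.
# Round to absorb floating-point noise before comparing.
total = round(sum(score_components.values()), 6)
print(total)  # 0.42, matching the reported "score" field
```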

🔧 Raw API Response

{
  "type": "tweet",
  "id": "2029957088022254014",
  "url": "https://x.com/karpathy/status/2029957088022254014",
  "twitterUrl": "https://twitter.com/karpathy/status/2029957088022254014",
  "text": "@BrownCoyoteStd The code to train a GPT is only ~1,000 lines of code. In the case of GPT training the success criteria is quite simple: reach the lowest possible loss (meaning that your GPT is predicting the next token well), but don't regress running time, keep memory in check, and keep a sense of simplicity/aesthetics (don't bloat the code too much to get a small gain). Because 1) the criteria is objective and 2) because AI agents can now write code quite well, instead of having a human think up experiment ideas and try them out one by one (e.g. my entire PhD basically), you just get the AI to do the whole thing. My prompt (\"AI source code\") in this example is just ~120 lines of markdown document explaining the thing to the AI. The AI of today is very good at implementing ideas, but a lot less good at coming up with creative ones. So honestly, it's a lot closer to hyperparameter tuning right now than coming up with new/novel research, but 1) i didn't super tune the prompts yet, maybe you can just try to ask and 2) it's clear what the trajectory of this is as the AI capability improves - it's AI improving the next version of itself autonomously, maybe with human researchers throwing some ideas into the mix once in a while.",
  "source": "Twitter for iPhone",
  "retweetCount": 6,
  "replyCount": 11,
  "likeCount": 79,
  "quoteCount": 1,
  "viewCount": 4660,
  "createdAt": "Fri Mar 06 16:27:39 +0000 2026",
  "lang": "en",
  "bookmarkCount": 25,
  "isReply": true,
  "inReplyToId": "2029953940721389597",
  "conversationId": "2029701092347630069",
  "displayTextRange": [
    16,
    296
  ],
  "inReplyToUserId": "2026520320023375872",
  "inReplyToUsername": "BrownCoyoteStd",
  "author": {
    "type": "user",
    "userName": "karpathy",
    "url": "https://x.com/karpathy",
    "twitterUrl": "https://twitter.com/karpathy",
    "id": "33836629",
    "name": "Andrej Karpathy",
    "isVerified": false,
    "isBlueVerified": true,
    "verifiedType": null,
    "profilePicture": "https://pbs.twimg.com/profile_images/1296667294148382721/9Pr6XrPB_normal.jpg",
    "coverPicture": "https://pbs.twimg.com/profile_banners/33836629/1407117611",
    "description": "",
    "location": "Stanford",
    "followers": 1890969,
    "following": 1055,
    "status": "",
    "canDm": true,
    "canMediaTag": true,
    "createdAt": "Tue Apr 21 06:49:15 +0000 2009",
    "entities": {
      "description": {
        "urls": []
      },
      "url": {}
    },
    "fastFollowersCount": 0,
    "favouritesCount": 22237,
    "hasCustomTimelines": true,
    "isTranslator": false,
    "mediaCount": 854,
    "statusesCount": 9991,
    "withheldInCountries": [],
    "affiliatesHighlightedLabel": {},
    "possiblySensitive": false,
    "pinnedTweetIds": [
      "1617979122625712128"
    ],
    "profile_bio": {
      "description": "I like to train large deep neural nets. Previously Director of AI @ Tesla, founding team @ OpenAI, PhD @ Stanford.",
      "entities": {
        "description": {
          "hashtags": [],
          "symbols": [],
          "urls": [],
          "user_mentions": []
        },
        "url": {
          "urls": [
            {
              "display_url": "karpathy.ai",
              "expanded_url": "https://karpathy.ai",
              "indices": [
                0,
                23
              ],
              "url": "https://t.co/0EcFthjJXM"
            }
          ]
        }
      }
    },
    "isAutomated": false,
    "automatedBy": null
  },
  "extendedEntities": {},
  "card": null,
  "place": {},
  "entities": {
    "hashtags": [],
    "symbols": [],
    "urls": [],
    "user_mentions": []
  },
  "quoted_tweet": null,
  "retweeted_tweet": null,
  "isLimitedReply": false,
  "article": null
}