@karpathy
@tobi Who knew early singularity could be this fun? :) I just confirmed that the improvements autoresearch found over the last 2 days of (~650) experiments on depth 12 model transfer well to depth 24 so nanochat is about to get a new leaderboard entry for βtime to GPT-2β too. Works π€·ββοΈ