@markchen90
If you give GPT-5.4 a raw dump of the GPT-2 weights and ask for a <5000-byte C program to run inference on it, GPT-5.4 succeeds in under 15 minutes! I remember working on a similar exercise to compare results against a proprietary model in a previous paper, and it took days!