@karpathy
@N8Programs a beauty for anyone interested in mechanistic interpretability or getting into LLMs. interesting to look at small algorithms and their "neural implementations" to get a sense of how neural nets implement various kinds of functionality. the caveat is if the minification creates "esoteric" solutions you wouldn't encounter in practice; solutions found in the wild might lean more on distributed representations, helices, etc. i tried briefly training the same arch from scratch and gradient descent didn't find the solution; it would probably work with more degrees of freedom and enough effort.