@himanshustwts
wait this is actually big. this deepseek research used LogitLens (lets you see what the model is 'thinking' at each layer) and CKA (compares what different layers are actually learning) to figure out why the new Engram architecture works. apparently this is the first time i have seen mech interpretability research being used in a capabilities paper. feels like a shift.