@lieberum_t
Mech interp has been very successful in tiny models, but does it scale? …Kinda! Our new @GoogleDeepMind paper studies how Chinchilla70B can do multiple-choice Qs, focusing on picking the correct letter. Small model techniques mostly work but it's messy!🧵https://t.co/SLFEOqltYR https://t.co/LT29ry8o9t