@omarsar0
LLMs Solve Math with a Bag of Heuristics

Uses causal analysis to identify the neurons that explain an LLM's behavior on basic arithmetic. Discovers, and then tests the hypothesis, that the combination of these heuristic neurons is the mechanism the model uses to produce correct arithmetic answers.

"To test this, we categorize each neuron into several heuristic types—such as neurons that activate when an operand falls within a certain range—and find that the unordered combination of these heuristic types is the mechanism that explains most of the model’s accuracy on arithmetic prompts."

Interpretability and steerability are two important research problems in LLMs. We are seeing deep investment in interpretability research at companies like Anthropic and OpenAI, which aim to better understand the inner workings of LLMs so they can be steered. Most of the recent papers I've seen focus on safety and bias, but there is also the capabilities side of things, so this is a huge research area, though it's developing slowly.
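
To make the "heuristic neuron" idea concrete, here is a minimal toy sketch of what categorizing a neuron as an operand-range heuristic could look like. This is not the paper's code: the activations are synthetic, and is_range_heuristic, its threshold, and the agreement score are hypothetical names I'm using purely for illustration.

```python
# Toy sketch of classifying a neuron as an operand-range heuristic.
# NOT the paper's implementation: activations are synthesized, and all
# function/parameter names here are hypothetical.
import numpy as np

rng = np.random.default_rng(0)

# Arithmetic prompts like "a + b =", represented only by their operands.
operands = rng.integers(0, 100, size=(1000, 2))

# Simulate a neuron that fires mostly when the first operand is in [20, 40).
in_range = (operands[:, 0] >= 20) & (operands[:, 0] < 40)
activations = np.where(in_range, 1.0, 0.05) + rng.normal(0, 0.02, size=1000)

def is_range_heuristic(acts, ops, lo, hi, operand_idx=0, threshold=0.5):
    """Score how well 'neuron fires' matches 'operand in [lo, hi)'.

    Returns the fraction of prompts where the neuron's firing pattern
    agrees with range membership; values near 1.0 suggest the neuron
    implements this range heuristic.
    """
    fires = acts > threshold
    member = (ops[:, operand_idx] >= lo) & (ops[:, operand_idx] < hi)
    return np.mean(fires == member)

print(is_range_heuristic(activations, operands, 20, 40))  # ~1.0: a match
print(is_range_heuristic(activations, operands, 50, 70))  # much lower
```

In the paper's setting the activations would come from a real model's MLP neurons over arithmetic prompts, and each neuron would be scored against a catalog of candidate heuristic types rather than a single range check.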