@dair_ai
// Think Harder or Know More //

Chain-of-thought prompting enables reasoning in LLMs but requires explicit verbalization of intermediate steps. Looped transformers offer an alternative, iteratively refining representations within hidden states, but they sacrifice storage capacity in the process.

This paper investigates combining both: adaptive per-layer looping with gated memory banks. Each transformer block learns when to iterate its hidden state and when to access stored knowledge.

The key finding: looping primarily benefits mathematical reasoning, while memory banks recover performance on commonsense tasks. Combining both yields a model that, on math benchmarks, outperforms an iso-FLOP baseline with three times as many layers.

Analysis of model internals reveals layer specialization: early layers loop minimally and access memory sparingly, while later layers do both more heavily. The model learns to choose between thinking harder and knowing more, and where to do each.

Paper: https://t.co/0Gl77zMwOY

Learn to build effective AI agents in our academy: https://t.co/LRnpZN7L4c
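To make the idea concrete, here is a minimal NumPy sketch of the mechanism described: a block that loops on its hidden state under a learned halting gate and mixes in gated reads from a memory bank. All names, shapes, and gating details here are illustrative assumptions, not the paper's actual architecture.

```python
import numpy as np

rng = np.random.default_rng(0)
D, M = 8, 4  # hidden size and number of memory slots (illustrative values)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class AdaptiveBlock:
    """Toy stand-in for one transformer block: iteratively refines its
    hidden state until a halting gate fires, and on each loop gates in a
    soft read from a small learned memory bank (hypothetical design)."""

    def __init__(self):
        self.W = rng.normal(0, 0.1, (D, D))       # refinement weights (stand-in for attention + MLP)
        self.memory = rng.normal(0, 1.0, (M, D))  # learned memory bank, one row per slot
        self.w_halt = rng.normal(0, 0.1, D)       # halting-gate weights ("stop looping")
        self.w_mem = rng.normal(0, 0.1, D)        # memory-gate weights ("how much to read")

    def __call__(self, h, max_loops=4, halt_thresh=0.5):
        loops = 0
        for _ in range(max_loops):
            # one loop iteration: refine the hidden state in place
            h = h + np.tanh(self.W @ h)
            # soft-attention read over memory slots
            scores = self.memory @ h
            weights = np.exp(scores - scores.max())
            read = (weights / weights.sum()) @ self.memory
            # gated memory access: the block decides how much knowledge to pull in
            g_mem = sigmoid(self.w_mem @ h)
            h = h + g_mem * read
            loops += 1
            # adaptive looping: stop early when the halting gate fires
            if sigmoid(self.w_halt @ h) > halt_thresh:
                break
        return h, loops

block = AdaptiveBlock()
h_out, n_loops = block(rng.normal(size=D))
print(h_out.shape, n_loops)  # refined state and how many loops this input used
```

In a trained model the gates would be learned end-to-end, which is what would let early layers loop rarely while later layers loop and read memory heavily, per the layer-specialization finding above.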