@omarsar0
Emergent Hierarchical Reasoning in LLMs

The paper argues that RL improves LLM reasoning via an emergent two-phase hierarchy. First, the model firms up low-level execution; after that, progress hinges on exploring high-level planning.

More on this interesting analysis: https://t.co/Tp95C5dfnA