@dair_ai
New research on improving self-reflection in language agents. A core problem with agent self-reflection is that models tend to generate repetitive reflections that add noise instead of signal, hurting overall reasoning performance. It introduces ParamMem, a parametric memory module that encodes cross-sample reflection patterns directly into model parameters, then uses temperature-controlled sampling to generate diverse reflections at inference time. ParamMem shows consistent improvements over SOTA baselines across code generation, mathematical reasoning, and multi-hop QA. It also enables weak-to-strong transfer and self-improvement without needing a stronger external model, making it a practical upgrade for agentic pipelines. Paper: https://t.co/16Yp56j8Jm Learn to build effective AI agents in our academy: https://t.co/LRnpZN7L4c