@AINativeF
1. Native Parallel Reasoner: Reasoning in Parallelism via Self-Distilled Reinforcement Learning

Keywords: NPR, Large Language Models, self-distilled training, Parallel-Aware Policy Optimization, genuine parallel reasoning

Category: Knowledge Representation and Reasoning

Research Objective:
- The paper introduces NPR, a teacher-free framework that equips Large Language Models (LLMs) with native parallel reasoning capabilities.

Research Methods:
- NPR combines a self-distilled progressive training paradigm, the Parallel-Aware Policy Optimization (PAPO) algorithm, and a robust NPR Engine to enable native parallel cognition without external supervision.

Research Conclusions:
- Trained on Qwen3-4B, NPR achieves performance gains of up to 24.5% and inference speedups of up to 4.6x with 100% genuine parallel execution, setting a new standard for efficient and scalable agentic reasoning.

Paper link: https://t.co/La4uXPrBrA