@HuggingPapers
Top AI papers this week on @huggingface (April 6-12)

- GrandCode: First AI to beat all humans in live competitive programming contests, surpassing Google's Gemini 3 Deep Think
- Adam's Law: New framework showing LLMs prefer frequent textual data for prompting and fine-tuning
- Video-MME-v2: Next-gen video benchmark exposing huge gaps between AI and human experts
- ClawBench: Testing AI agents on real-world tasks like booking flights (Claude 4.6 scores just 33%)
- SkillClaw: Collective skill evolution for agentic systems
- HY-Embodied-0.5: Embodied foundation models from Tencent for real-world robot agents
- InCoder-32B-Thinking: Industrial code world model for chip design and GPU kernels
- OpenWorldLib: Unified codebase and definition for advanced world models
- Plus: Self-Distilled RLVR and rethinking generalization in reasoning SFT