@HuggingPapers
Top papers on @huggingface this week (March 23-29):

- MinerU-Diffusion: Rethinking Document OCR as Inverse Rendering via Diffusion Decoding
- Omni-WorldBench: Towards a Comprehensive Interaction-Centric Evaluation for World Models
- Speed by Simplicity: A Single-Stream Architecture for Fast Audio-Video Generative Foundation Model
- Intern-S1-Pro: Scientific Multimodal Foundation Model at Trillion Scale by @opengvlab
- HopChain: Multi-Hop Data Synthesis for Generalizable Vision-Language Reasoning using @Alibaba Qwen3.5
- OpenResearcher: A Fully Open Pipeline for Long-Horizon Deep Research Trajectory Synthesis via @OpenAI GPT-OSS
- PixelSmile: Toward Fine-Grained Facial Expression Editing
- Astrolabe: Steering Forward-Process Reinforcement Learning for Distilled Autoregressive Video Models
- CUA-Suite: Massive Human-annotated Video Demonstrations for Computer-Use Agents
- WildWorld: A Large-Scale Dataset for Dynamic World Modeling with Actions and Explicit State

Find them below: