Your curated collection of saved posts and media

Showing 24 posts · last 7 days · quality filtered
πŸ”_akhaliq retweeted
W
Martian
@withmartian
πŸ“…
Feb 26, 2026
16d ago
πŸ†”73714984

Introducing Code Review Bench v0: https://t.co/iAZDURyqol The first independent code review benchmark. 200,000+ PRs. Unbiased. Fully OSS. Updated daily. Tool performance highlights πŸ§΅πŸ‘‡ Featuring: @augmentcode @baz_scm @claudeai @coderabbitai @cursor @GeminiApp @github @graphite @greptile @kilocode @OpenAIDevs @propelcode @QodoAI

(1 media · ❤️ 533 likes · 🔁 53 retweets)

pika_labs (@pika_labs) · Feb 28, 2026

☎️ Hello? AI Selves now have phone numbers! Put them in your iMessage or SMS to be there when you’re not, settle arguments in your group chats, and make talking to yourself more normal. More ideas 👇🧵 Plus, we’re letting more people in off our waitlist! QRT to get your own early access code.

πŸ–ΌοΈ Media
HuggingPapers (@HuggingPapers) · Feb 28, 2026

Imagination Helps Visual Reasoning, But Not Yet in Latent Space

Causal mediation analysis reveals latent visual reasoning in MLLMs fails: latent tokens ignore inputs and barely affect answers. CapImagine, a text-based alternative, teaches explicit imagination and significantly outperforms latent baselines.

(1 media)

HuggingPapers (@HuggingPapers) · Mar 01, 2026

Top AI Papers of The Week (Feb 24 - Mar 2):
- A Very Big Video Reasoning Suite: 200 tasks, 1M+ video clips for video reasoning research
- Does Your Reasoning Model Implicitly Know When to Stop Thinking? Introducing SAGE paradigm
- AgentFly: Fine-tuning LLM agents without fine-tuning LLMs
- Microsoft rStar2-Agent: 80.6% on AIME24 with just 14B parameters
- From Blind Spots to Gains: Diagnostic-driven iterative training for LMMs
- VibeVoice: Synthesizing 90-minute multi-speaker conversational speech
- Alibaba MobilityBench: Benchmarking real-world route-planning agents
- NVIDIA's data engineering strategies for scaling LLM terminal capabilities
- VESPO: Variational sequence-level soft policy optimization for stable RL training
- Beyond Pass@1: Self-play with variational problem synthesis sustains RLVR

Find them below:

(1 media)

DengHokin (@DengHokin) · Mar 01, 2026

Thanks AK for reposting our work! Here are all the links for anyone who wants to check out more:
Paper: https://t.co/6PajZXj6V0
Project Website: https://t.co/5VTiCqTDhN
EvalKit: https://t.co/lxhyzMaI8j
Cloud Infra: https://t.co/QNJRfOKQN3
Training Set: https://t.co/DlzLojQjsR
Eval Set: https://t.co/Tzs2jAN99C
Leaderboard: https://t.co/peZ1XkelYY
Model: https://t.co/gFFJofrlNR

(6 media)

_akhaliq (@_akhaliq) · Mar 01, 2026

JavisDiT++: Unified Modeling and Optimization for Joint Audio-Video Generation https://t.co/bd8BlNZNEr

(1 media)

HuggingPapers (@HuggingPapers) · Feb 22, 2026

Top AI Papers of The Week (Feb 16-22):
- Less is Enough: Synthesizing Diverse Data in Feature Space of LLMs
- SQuTR: A Robustness Benchmark for Spoken Query to Text Retrieval under Acoustic Noise
- GLM-5: from Vibe Coding to Agentic Engineering by @zhipuAI
- Experiential Reinforcement Learning
- MedXIAOHE: A Comprehensive Recipe for Building Medical MLLMs
- Zooming without Zooming: Region-to-Image Distillation by @InclusionAI
- Sanity Checks for Sparse Autoencoders: Do SAEs Beat Random Baselines?
- DeepImageSearch: Benchmarking Multimodal Agents for Context-Aware Image Retrieval
- SLA2: Sparse-Linear Attention with Learnable Routing and QAT
- SkillsBench: Benchmarking How Well Agent Skills Work Across Diverse Tasks

Find them below:

(1 media)

vanstriendaniel (@vanstriendaniel) · Feb 24, 2026

Just shipped! @huggingface storage add-ons. Starting at $12/month per TB - 3x cheaper than regular cloud storage, with very fast uploads and downloads powered by Xet's deduplication. You can now buy, upgrade, and cancel storage plans directly from your billing settings. https://t.co/RDylcDjkb4

(1 media)

JohannesTscharn (@JohannesTscharn) · Feb 24, 2026

I’m giving an agent control over Reachy Mini from @huggingface and letting it understand and share spatial data via @Spectacles AR is the human interface for robotics and physical AI imo. It feels like absolute magic to interact with this, both in voice/agent and β€œpuppeteering” mode. I’ll probably work on AR for either an arm (manipulation tasks) or some sort of drone (locomotion in 3D space) next… Project is fully open source btw: https://t.co/pmkXJR0U7f Thank you @SensAIHackademy for sending me the robot!

(2 media)

nic_o_martin (@nic_o_martin) · Feb 24, 2026

TranslateGemma 4B by @GoogleDeepMind now runs 100% in your browser on WebGPU with Transformers.js v4. 55 languages. No server. No data leaks. Works offline. A 4B parameter translation powerhouse, right in your browser. Try the demo πŸ‘‡ https://t.co/YgYskHqBRm

πŸ–ΌοΈ Media
lhoestq (@lhoestq) · Feb 25, 2026

datasets v4.6.0 is out 🤗 News for multimodal/streaming:
🎬 push_to_hub() for video datasets
📃 image/audio/video are now plain blobs in Parquet
⚔️ type inference for Lance
✂️ .reshard() for streaming Parquet datasets: shard per row group instead of per file
All optimized for Xet 🧵👇 https://t.co/kD9tewovNN
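
The .reshard() item above is about granularity: a Parquet file holds many row groups, and handing out (file, row group) pairs balances streaming workers far better than handing out whole files. A minimal sketch of that idea in plain Python (the file layout is made up, and this is not the datasets API):

```python
# Sketch of per-row-group vs. per-file sharding for streaming Parquet.
# The file layout below is hypothetical; this is not the datasets API.

def shard_by_file(files, num_shards):
    """Coarse: each shard gets whole files (round-robin)."""
    return [files[i::num_shards] for i in range(num_shards)]

def shard_by_row_group(files, num_shards):
    """Fine: each shard gets individual (file, row_group) units."""
    units = [(name, rg) for name, n_groups in files for rg in range(n_groups)]
    return [units[i::num_shards] for i in range(num_shards)]

# Two files with very uneven row-group counts.
files = [("a.parquet", 7), ("b.parquet", 1)]

by_file = shard_by_file(files, 2)
by_group = shard_by_row_group(files, 2)

print([sum(n for _, n in shard) for shard in by_file])  # [7, 1] -> unbalanced
print([len(shard) for shard in by_group])               # [4, 4] -> balanced
```

With whole-file sharding one worker ends up with 7 row groups and the other with 1; per-row-group sharding splits the same 8 units evenly.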

(1 media)

alvarobartt (@alvarobartt) · Feb 26, 2026

🌐 pplx-embed is @perplexity_ai's new collection of state-of-the-art multilingual embedding models optimized for real-world, web-scale retrieval tasks!
- Built on Qwen3 w/ diffusion-based pretraining and bidirectional attention
- Available at 0.6B and 4B parameters w/ native INT8 quantization
- pplx-embed-v1 for independent text embeddings
- pplx-embed-context-v1 for document chunks in RAG
- Validated on real-world search scenarios over tens of millions of documents
- Permissive MIT License
- Available on the @huggingface Hub, and supported in Text Embeddings Inference, Sentence Transformers, and Transformers.js

(1 media)

NVIDIARobotics (@NVIDIARobotics) · Feb 26, 2026

Want to bring open-source vision language models to the edge? πŸ’» Check out our @huggingface article on deploying NVIDIA Cosmos Reasoning 2B across the NVIDIA Jetson family with vLLM and a Live VLM WebUI. πŸ“– https://t.co/Tp0tZtjgRp https://t.co/tytkmCRJzx

(2 media)

NielsRogge (@NielsRogge) · Feb 25, 2026

I tried Codex 5.3 (web) for porting VidEoMT, a simple and elegant ViT-based video segmentation model, to @huggingface Transformers. Sadly, it missed the global picture, mistakenly assuming the model uses DINOv3 as its backbone, whereas it actually uses DINOv2. It got stuck. Opus 4.6 fixed it after I told it. The job of ML Engineer is still safe - humans stay in the driver's seat. PR: https://t.co/5ahL0GqtZN

(2 media)

Michael_J_Black (@Michael_J_Black) · Feb 27, 2026

BEDLAM2.0 image and depth data are now available via Hugging Face, providing high-speed worldwide download access to over 26TB of synthetic data for non-commercial research. Hugging Face: https://t.co/tl8S3DJNWw Project: https://t.co/NR5Np9UT46

(2 media)

SergioPaniego (@SergioPaniego) · Feb 26, 2026

What happens when you make an LLM drive a car where the physics is real and actions can't be undone? I ported CARLA, the autonomous driving simulator, to OpenEnv and added training via TRL + HF Spaces. In 50 steps, Qwen 0.6B learns to swerve and brake to avoid pedestrians. https://t.co/QR4FJS70h7

(1 media)

RisingSayak (@RisingSayak) · Feb 27, 2026

Editing images is a series of state transitions between the source image and the edited image that we want. Yet the existing paradigm doesn't explicitly include any transition priors in the editing process. This becomes particularly apparent for edits involving causal dynamics (e.g., refraction, deformation). To model this kind of physics-informed information, we leverage the rich priors present in videos and introduce PhysicEdit 🔥 TL;DR: We fine-tune QwenImage Edit on a curated dataset of videos with reasoning traces and fixed-length transition queries to do solid physics-aware image editing! In the process, we introduce a cool dataset, "PhysicTran38K", consisting of 38K transition trajectories across five physical domains, and devise a method to provide supervision from it to QwenImage Edit. Hop in to learn more ⬇️

(1 media)

vanstriendaniel (@vanstriendaniel) · Feb 27, 2026

Is it worth re-OCR'ing old library index cards? I re-OCR'd 453,000 cards from Boston Public Library's rare books catalogue for ~$50 of compute using @huggingface Jobs. BPL's own guide calls their search "extremely unreliable." Does better OCR and semantic search fix it? Demo link below. https://t.co/DC5nqtmQtC
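
As a sketch of what "semantic search over cards" means in its cheapest possible form, here is a toy ranker over made-up catalogue entries. Token overlap stands in for the neural embeddings a real pipeline would use, so only the index-then-query shape carries over:

```python
import re

# Toy search over re-OCR'd catalogue cards. The cards are made up, and
# token overlap (Jaccard) stands in for neural embeddings.

def tokens(text):
    """Lowercased alphanumeric tokens, punctuation stripped (OCR-friendly)."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def search(query, cards, top_k=2):
    """Rank cards by Jaccard overlap with the query."""
    q = tokens(query)
    scored = []
    for card in cards:
        c = tokens(card)
        union = q | c
        scored.append((len(q & c) / len(union) if union else 0.0, card))
    scored.sort(key=lambda pair: -pair[0])
    return [card for _, card in scored[:top_k]]

cards = [
    "Whitman, Walt. Leaves of Grass. Brooklyn, 1855.",
    "Dickinson, Emily. Poems. Boston, 1890.",
    "Map of Boston Harbor, engraved 1775.",
]
print(search("walt whitman leaves of grass", cards, top_k=1))
```

Swapping the Jaccard score for cosine similarity over model embeddings turns this sketch into actual semantic search.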

(2 media)

bartowski1182 (@bartowski1182) · Feb 26, 2026

Never thought this day would come, but we've hit 10k followers on @huggingface :') πŸ€— Huge thank you to them for their endless storage grants allowing me to upload over 2000 quants these past few years! https://t.co/Ueh6ty1Yed

(1 media)

Arm (@Arm) · Feb 27, 2026

Marco built Reachy Phone Home so Reachy Mini can detect when you’re on your phone, using @Ultralytics YOLO26 vision, and respond in real time with voice + motion. Built on Arm (Apple Mac / Raspberry Pi 5) with @huggingface πŸ€— + @pollenrobotics 🦾, it’s now an award-winning project, earning an @NVIDIAGTC Golden Ticket πŸ† It's great to see our developers build and win in the open AI ecosystem πŸ‘ https://t.co/C8atY3fwLv

πŸ–ΌοΈ Media
tomaarsen (@tomaarsen) · Feb 27, 2026

🤗 @perplexity_ai has released 4 open-weight, state-of-the-art multilingual embedding models designed for retrieval tasks: pplx-embed-v1 and pplx-embed-context-v1. Specifically trained for int8 and binary embeddings, they'll be viable for massive search problems. Details in 🧵 https://t.co/smqcPLKjU2
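
The int8 claim is what makes "massive search problems" plausible: int8 vectors take a quarter of the memory of float32 and, when a model is trained for it, still rank documents almost identically. A self-contained sketch with made-up vectors (not the pplx-embed models or their actual scale handling):

```python
# Quantize float embeddings to int8 and check that dot-product ranking
# survives. Vectors are made up; real models are trained so that int8
# retrieval stays close to full precision.

def quantize_int8(vec, scale):
    """Map floats in [-scale, scale] to ints in [-127, 127], clamping."""
    return [max(-127, min(127, round(x / scale * 127))) for x in vec]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

query = [0.9, -0.2, 0.4]
docs = [[0.8, -0.1, 0.5], [-0.3, 0.9, 0.1]]

scale = 1.0  # illustrative; real schemes calibrate this from data
q8 = quantize_int8(query, scale)
d8 = [quantize_int8(d, scale) for d in docs]

float_rank = max(range(len(docs)), key=lambda i: dot(query, docs[i]))
int8_rank = max(range(len(d8)), key=lambda i: dot(q8, d8[i]))
print(float_rank == int8_rank)  # rankings agree on this tiny example
```

Real int8 schemes usually calibrate the quantization range per dimension from a sample of embeddings rather than fixing a single scale of 1.0.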

(1 media)

Junyang Lin (@JustinLin610) · Mar 01, 2026

https://t.co/sOMaBpuaQJ

(1 media)

πŸ”huggingface retweeted
J
Junyang Lin
@JustinLin610
πŸ“…
Mar 01, 2026
14d ago
πŸ†”19380067

https://t.co/sOMaBpuaQJ

Media 1
❀️728
likes
πŸ”33
retweets
πŸ–ΌοΈ Media
AndrewYNg (@AndrewYNg) · Jan 22, 2026

New course: Gemini CLI: Code & Create with an Open-Source Agent, built with @googlecloudtech/@geminicli and taught by @JackWoth98.

Agentic coding assistants like Gemini CLI are transforming how developers work. This short course teaches you to use Google's open-source agent to coordinate local tools and cloud services for coding and non-coding workflows. Gemini CLI works from your terminal, so it works with your local files and development tools. You can also connect it to services through MCP. Then provide high-level instructions, and it autonomously plans and executes complex workflows.

Skills you'll gain:
- Build website features and automate code reviews with GitHub Actions
- Create data dashboards that combine local files with cloud data sources
- Use MCP servers and extensions to orchestrate workflows across GitHub, Canva, and Google Workspace
- Generate social media content from multimedia files like conference recordings

I particularly appreciate that Gemini CLI is open-source. You can see exactly how it works, read the prompts it uses, and understand its architecture. The community has contributed thousands of pull requests. Since Gemini 3’s release I've found Gemini CLI highly capable - this is a tool worth having in your toolbox!

Whether you're prototyping applications, automating workflows, or working with multimedia content, join to learn to delegate complex tasks and build faster: https://t.co/m3J7kwQpxC

(2 media)