Your curated collection of saved posts and media

Showing 24 posts · last 30 days · by score
Tim_Dettmers (@Tim_Dettmers) · Jan 27, 2026 (41d ago) · id 06393435 · score 0.34

Our method became so efficient (26x vs RL; 57x vs other synthetic generation) that we could easily generate thousands of trajectories for a single repo. This makes the coding agents very powerful, as they soak up the nuances of a particular codebase. Very cheap to specialize to any codebase.

ethnlshn (@ethnlshn) · Jan 27, 2026 (41d ago) · id 94590681 · score 0.44

Today, we release SERA-32B, an approach to coding agents that matches Devstral 2 at just $9,000. It is fully open-source and you can train your own model easily, at 26x the efficiency of using RL. Paper: https://t.co/aeD6T2WW3O Here's how 🧵

percyliang (@percyliang) · Jan 21, 2026 (47d ago) · id 64928897 · score 0.32

Don't use weight decay. Just normalize everything you see (updates + parameters). Works on top of your favorite optimizer (e.g., Muon). Result: 33% speedup + better hyperparameter transfer.
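The recipe in the post above amounts to constraining both the step and the weights to the unit sphere, so no weight decay is needed to keep the parameter scale from drifting. A toy numerical sketch of that idea — my own illustration using plain SGD and random stand-in gradients, not the authors' actual method or their Muon setup:

```python
import numpy as np

def normalized_step(W, grad, lr=0.1):
    # Unit-normalize the update, apply it, then unit-normalize the
    # parameters again: a toy version of "normalize everything you see".
    update = grad / (np.linalg.norm(grad) + 1e-8)
    W = W - lr * update
    return W / (np.linalg.norm(W) + 1e-8)

rng = np.random.default_rng(0)
W = rng.standard_normal((4, 4))
W /= np.linalg.norm(W)                    # start on the unit sphere
for _ in range(100):
    grad = rng.standard_normal((4, 4))    # random stand-in gradient
    W = normalized_step(W, grad)
print(round(float(np.linalg.norm(W)), 6))  # parameter norm stays at 1.0
```

With the renormalization in place, the parameter norm is pinned regardless of how many steps are taken, which is exactly the drift that weight decay is usually employed to control.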

togethercompute (@togethercompute) · Feb 25, 2026 (12d ago) · id 76368879

We're open-sourcing CoderForge-Preview: 258K test-verified coding-agent trajectories (155K pass | 103K fail). Fine-tuning Qwen3-32B on the passing subset boosts SWE-bench Verified from 23.0% to 59.4% pass@1, and it ranks #1 among open-data models ≤32B parameters. Thread on the data generation pipeline 🧵

Media (1)
zhengyiluo (@zhengyiluo) · Feb 20, 2026 (17d ago) · id 40071287

SONIC is now open-source! Generalist whole-body teleoperation for EVERYONE! Our team has long been building comprehensive pipelines for whole-body control, kinematic planning, and teleoperation, and they will all be shared. This will be continuously updated; inference code + model are already there, training code and GR00T integration coming soon!
Code: https://t.co/7u3SBxzXU9
Docs: https://t.co/HpDLkTCSMF
Site: https://t.co/D3i4KlnLLr

Media (2, +1 more)
DrJimFan (@DrJimFan) · Feb 24, 2026 (13d ago) · id 00658891

Website: https://t.co/xTaDXBu9cD
Codebase and weights: https://t.co/QCQkqPIsHI
Whitepaper: https://t.co/K2QCFjboDR
Check out @zhengyiluo's post: https://t.co/hIHtvKkDQf

Media (2, +1 more)
πŸ”omarsar0 retweeted
O
elvis
@omarsar0
πŸ“…
Mar 08, 2026
22h ago
πŸ†”05872435
⭐0.36

Pay attention to this one if you are building terminal-based coding agents. OpenDev is an 81-page paper covering scaffolding, harness design, context engineering, and hard-won lessons from building CLI coding agents. It introduces a compound AI system architecture with workload-specialized model routing, a dual-agent architecture separating planning from execution, lazy tool discovery, and adaptive context compaction.

The industry is shifting from IDE plugins to terminal-native agents. Claude Code, Codex CLI, and others have proven the model works. This paper formalizes the design patterns that make these systems reliable, covering topics like event-driven system reminders to counteract instruction fade-out, automated memory across sessions, and strict safety controls for autonomous operation.

Paper: https://t.co/tpAZFaSnog
Learn to build effective AI agents in our academy: https://t.co/1e8RZKs4uX

938 likes · 163 retweets
alexisgallagher (@alexisgallagher) · Feb 16, 2026 (21d ago) · id 52635931

I built a robot that lives in my house. It recognizes family members, works with me on writing and code, checks my mail & calendar, and likes to chat about its changing interests. URL shows how to make one. (My son made this video, which totally saved my bacon for a deadline! 🙏 😅) https://t.co/j3Lan1bYhQ

Media (1)
ammaar (@ammaar) · Mar 07, 2026 (1d ago) · id 34893381

I asked Codex 5.4 to reverse engineer a DOS game with no source code. It's been running for 6 hours; I can't look away. It unpacked assets, disassembled the EXE, rebuilt the renderer, and built my childhood favorite SkyRoads in Rust! Now think of all the games we can revive. https://t.co/u8zidt0JlN

πŸ–ΌοΈ Media
πŸ”ai_fast_track retweeted
K
Andrej Karpathy
@karpathy
πŸ“…
Mar 07, 2026
2d ago
πŸ†”18931079
⭐0.36

I packaged up the "autoresearch" project into a new self-contained minimal repo if people would like to play over the weekend. It's basically the nanochat LLM training core stripped down to a single-GPU, one-file version of ~630 lines of code, then:
- the human iterates on the prompt (.md)
- the AI agent iterates on the training code (.py)

The goal is to engineer your agents to make the fastest research progress indefinitely and without any of your own involvement. In the image, every dot is a complete LLM training run that lasts exactly 5 minutes. The agent works in an autonomous loop on a git feature branch and accumulates git commits to the training script as it finds better settings (of lower validation loss by the end) of the neural network architecture, the optimizer, all the hyperparameters, etc. You can imagine comparing the research progress of different prompts, different agents, etc. https://t.co/YCvOwwjOzF

Part code, part sci-fi, and a pinch of psychosis :)
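The autonomous loop described above is, at its core, accept-if-better hill climbing over training-script settings. A toy caricature of that control flow — everything here is a hypothetical stand-in: `run_training` fakes a 5-minute run with a made-up loss surface, `propose_edit` fakes the agent's code edit, and a plain list stands in for git commits on the feature branch:

```python
import random

def run_training(settings):
    # Stand-in for one 5-minute training run: a fake validation loss
    # that is minimized when the (made-up) lr hits 0.003.
    return (settings["lr"] - 0.003) ** 2

def propose_edit(settings):
    # Stand-in for the agent editing the training script.
    new = dict(settings)
    new["lr"] *= random.uniform(0.5, 2.0)
    return new

random.seed(0)
best, best_loss = {"lr": 0.1}, run_training({"lr": 0.1})
commits = []                              # stands in for commits on the branch
for step in range(200):                   # each iteration = one full training run
    candidate = propose_edit(best)
    loss = run_training(candidate)
    if loss < best_loss:                  # keep only improvements ("git commit")
        best, best_loss = candidate, loss
        commits.append(step)
print(len(commits), round(best_loss, 6))
```

Comparing prompts or agents then reduces to comparing how fast `best_loss` falls per unit of wall-clock time, which is what each dot-trajectory in the image visualizes.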

19,919 likes · 2,464 retweets
ggerganov (@ggerganov) · Dec 02, 2025 (97d ago) · id 25271232

We joined forces with NVIDIA to unlock high-speed AI inference on RTX AI PCs and DGX Spark using llama.cpp. The latest Ministral-3B models reach 385+ tok/s on @NVIDIA_AI_PC GeForce RTX 5090 systems. Blog: https://t.co/60yKKzNnoN

Media (1)
michaelandregg (@michaelandregg) · Mar 08, 2026 (23h ago) · id 88677736

We've uploaded a fruit fly. We took the @FlyWireNews connectome of the fruit fly brain, applied a simple neuron model (@Philip_Shiu Nature 2024) and used it to control a MuJoCo physics-simulated body, closing the loop from neural activation to action. A few things I want to say about what this means and where we're going at @eonsys. 🧵

πŸ–ΌοΈ Media
karpathy (@karpathy) · Mar 08, 2026 (22h ago) · id 23173639 · score 0.40

@tobi Who knew early singularity could be this fun? :) I just confirmed that the improvements autoresearch found over the last 2 days of (~650) experiments on the depth-12 model transfer well to depth 24, so nanochat is about to get a new leaderboard entry for "time to GPT-2" too. Works 🤷‍♂️

dair_ai (@dair_ai) · Mar 08, 2026 (22h ago) · id 82901200 · score 0.32

Nice paper on lessons around building coding agents for the terminal.

NVIDIARobotics (@NVIDIARobotics) · Mar 07, 2026 (2d ago) · id 72442173

🦞 Want an always-on personal assistant on your NVIDIA Jetson? Follow our step-by-step OpenClaw tutorial to run it fully locally on your Jetson, with zero cloud APIs. 👉 https://t.co/LBYmT2eE8J https://t.co/oFQTa6Vocg

Media (2)
rasbt (@rasbt) · Mar 08, 2026 (1d ago) · id 61927256

@_xpn_ Hope you are enjoying it! Re distillation: Chapter 7 on SFT is essentially DeepSeek-style distillation. Coincidentally, I am also currently wrapping up the distillation chapter for the sequel book (Build a reasoning model from scratch). In the meantime, you might like my tools for generating distillation datasets: https://t.co/PD7s7O9ri8

Media (1)
MLStreetTalk (@MLStreetTalk) · Mar 04, 2026 (5d ago) · id 59873553

A masterclass from @jeremyphoward on why AI coding tools can be a trap -- and what 45 years of programming taught him that most vibe coders will never learn.
- AI coding tools exploit gambling psychology
- The difference between typing code and software engineering
- Enterprise coding AND prompt-only vibe coding are "inhumane", i.e. disconnecting humans from understanding-building
- AI tools remove the "desirable difficulty" you need to build deep mental models.
Out on MLST now!

πŸ–ΌοΈ Media
πŸ”omarsar0 retweeted
O
elvis
@omarsar0
πŸ“…
Mar 07, 2026
2d ago
πŸ†”17153248
⭐0.36

New survey on agentic reinforcement learning for LLMs. LLM RL still treats models like sequence generators optimized in relatively narrow settings. However, real agents operate in open-ended, partially observable environments where planning, memory, tool use, reasoning, self-improvement, and perception all interact.

This paper argues that agentic RL should be treated as its own landscape. It introduces a broad taxonomy that organizes the field across core agent capabilities and application domains, then maps the open-source environments, benchmarks, and frameworks shaping the space. If you are building agents, this is a strong paper worth checking out.

Paper: https://t.co/qwXZNSp0ZA
Learn to build effective AI agents in our academy: https://t.co/1e8RZKs4uX

187 likes · 37 retweets
karpathy (@karpathy) · Mar 07, 2026 (2d ago) · id 91536982

(I still have the bigger cousin running on prod nanochat, working on a bigger model on 8XH100, which looks like this now. I'll just leave this running for a while...) https://t.co/aWya9hpUMl

Media (1)
pratykumar (@pratykumar) · Mar 06, 2026 (3d ago) · id 24431356

📢 Open-sourcing the Sarvam 30B and 105B models! Trained from scratch with all data, model research, and inference optimisation done in-house, these models punch above their weight on most global benchmarks and excel in Indian languages. Get the weights at Hugging Face and AIKosh. Thanks to the good folks at SGLang for day-0 support; vLLM support coming soon. Links, benchmark scores, examples, and more in our blog: https://t.co/DcCG3zlN8p

Media (1)
markchen90 (@markchen90) · Mar 07, 2026 (2d ago) · id 64670486 · score 0.40

If you give GPT-5.4 a raw dump of the GPT-2 weights and ask for a <5000 byte C program to inference it, GPT-5.4 succeeds in under 15 minutes! I remember working on a similar exercise to compare results against a proprietary model in a previous paper - it took days!

πŸ”omarsar0 retweeted
O
elvis
@omarsar0
πŸ“…
Mar 06, 2026
3d ago
πŸ†”40912429
⭐0.38

New research from Microsoft. Phi-4-reasoning-vision-15B is a 15-billion-parameter multimodal reasoning model that combines visual understanding with structured reasoning capabilities. As I have been saying, not every agent task needs a frontier model. Phi-4-reasoning-vision shows what's possible at 15B parameters.

The report details how they trained a compact model that can reason over both text and images, targeting the sweet spot between capability and efficiency. Smaller reasoning models that handle vision are essential for practical agent deployments.

Paper: https://t.co/cT2qeNImwi
Learn to build effective AI agents in our academy: https://t.co/1e8RZKs4uX

271 likes · 50 retweets
_akhaliq (@_akhaliq) · Mar 06, 2026 (3d ago) · id 71764808

DARE: Aligning LLM Agents with the R Statistical Ecosystem via Distribution-Aware Retrieval https://t.co/Jeo3lOI9ru

Media (1)
calebfahlgren (@calebfahlgren) · Mar 06, 2026 (3d ago) · id 00410505

DataClaw 🦞 datasets are first-class on Hugging Face!! Full visibility into the reasoning, tool calls, and thousands of Claude Code and Codex sessions on the hub https://t.co/Ooq9cGciGt

πŸ–ΌοΈ Media