Your curated collection of saved posts and media

Showing 24 posts · last 30 days · by score
karpathy
@karpathy
📅 Mar 09, 2026 · 15m ago · 🆔 49524125

Three days ago I left autoresearch tuning nanochat for ~2 days on a depth=12 model. It found ~20 changes that improved the validation loss. I tested these changes yesterday and all of them were additive and transferred to larger (depth=24) models. Stacking all of these changes, today I measured that the leaderboard's "Time to GPT-2" drops from 2.02 hours to 1.80 hours (~11% improvement); this will be the new leaderboard entry. So yes, these are real improvements and they make an actual difference. I am mildly surprised that my very first naive attempt already worked this well on top of what I thought was an already fairly well-tuned project. This is a first for me because I am very used to doing the iterative optimization of neural network training manually: you come up with ideas, you implement them, you check if they work (better validation loss), you come up with new ideas based on that, you read some papers for inspiration, and so on. This has been the bread and butter of my daily work for two decades. Seeing the agent do this entire workflow end-to-end, all by itself, as it worked through approx. 700 changes autonomously is wild. It really looked at the sequence of experimental results and used them to plan the next experiments. It's not novel, ground-breaking "research" (yet), but all the adjustments are "real": I hadn't found them manually before, and they stack up and actually improved nanochat. Among the bigger findings:

- It noticed an oversight that my parameterless QK-norm didn't have a scale multiplier attached, so my attention was too diffuse. The agent found multipliers to sharpen it, pointing to future work.
- It found that the value embeddings really like regularization and I wasn't applying any (oops).
- It found that my banded attention was too conservative (I forgot to tune it).
- It found that the AdamW betas were all messed up.
- It tuned the weight decay schedule.
- It tuned the network initialization.

This is on top of all the tuning I've already done over a good amount of time. The exact commit is here, from this "round 1" of autoresearch. I am going to kick off "round 2", and in parallel I am looking at how multiple agents can collaborate to unlock parallelism. https://t.co/WAz8aIztKT

All LLM frontier labs will do this. It's the final boss battle. It's a lot more complex at scale of course - you don't just have a single train.py file to tune. But doing it is "just engineering" and it's going to work. You spin up a swarm of agents, you have them collaborate to tune smaller models, you promote the most promising ideas to increasingly larger scales, and humans (optionally) contribute on the edges. And more generally, *any* metric you care about that is reasonably efficient to evaluate (or that has more efficient proxy metrics, such as training a smaller network) can be autoresearched by an agent swarm. It's worth thinking about whether your problem falls into this bucket too.
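The missing QK-norm scale multiplier can be sketched in a few lines of numpy. This is a toy illustration, not nanochat's actual code: the shapes and the scale value 8.0 are made up. The point is that after RMS-normalizing queries and keys, attention logits are bounded to O(1), so rows of the softmax come out near-uniform (diffuse) unless a learned multiplier restores their magnitude.

```python
import numpy as np

def rms_norm(x, eps=1e-6):
    # Parameterless RMS norm: every query/key vector ends up with unit RMS,
    # which caps the dot-product magnitude and can leave attention too diffuse.
    return x / np.sqrt((x * x).mean(axis=-1, keepdims=True) + eps)

def attn_weights(q, k, scale=1.0):
    # `scale` plays the role of the multiplier the post says was missing:
    # it restores the ability to produce large (sharp) logits after QK-norm.
    logits = (rms_norm(q) * scale) @ rms_norm(k).T / np.sqrt(q.shape[-1])
    e = np.exp(logits - logits.max(axis=-1, keepdims=True))  # stable softmax
    return e / e.sum(axis=-1, keepdims=True)

rng = np.random.default_rng(0)
q, k = rng.normal(size=(4, 64)), rng.normal(size=(8, 64))
diffuse = attn_weights(q, k, scale=1.0)  # no multiplier: near-uniform rows
sharp = attn_weights(q, k, scale=8.0)    # with multiplier: peakier rows
```

Because `scale` multiplies the logits linearly, it acts exactly like an inverse softmax temperature: every row of `sharp` concentrates more mass on its top entry than the corresponding row of `diffuse`.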

🖼️ Media ×2
fekdaoui
@fekdaoui
📅 Mar 09, 2026 · 8h ago · 🆔 42219404

I built an LLM pricing comparison tool:
🔍 Search 200+ models
💰 Input, output, blended cost
📊 40+ benchmark scores
⚔️ Side-by-side model compare
https://t.co/SswDpoDwXX
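For reference, "blended cost" is typically a usage-weighted average of the input and output prices. A minimal sketch; the 3:1 input:output token ratio below is an assumption for illustration, and pricing tools differ on the mix they assume:

```python
def blended_cost(price_in, price_out, output_share=0.25):
    """Blended $/Mtok: weighted average of input and output prices.

    output_share is the assumed fraction of output tokens in total usage
    (0.25 corresponds to a 3:1 input:output mix).
    """
    return price_in * (1 - output_share) + price_out * output_share

# e.g. $3/Mtok input, $15/Mtok output at a 3:1 mix
cost = blended_cost(3.0, 15.0)
```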

πŸ–ΌοΈ Media
katyaprigara
@katyaprigara
📅 Mar 09, 2026 · 8h ago · 🆔 84144281

We are launching something new at JetBrains – please meet Air. It's a new Agentic Dev Environment built for working with agents from different vendors. More cool stuff is coming, stay tuned: @getsome_air https://t.co/X3pNdmOzFW

πŸ–ΌοΈ Media
gilmxres
@gilmxres
📅 Mar 09, 2026 · 7h ago · 🆔 74668048

you know what hell yea https://t.co/mTYyoxakZy

🔁 retweeted by youwouldntpost
❤️ 9,787 likes · 🔁 407 retweets
🖼️ Media ×2
Weather_West
@Weather_West
📅 Mar 09, 2026 · 6h ago · 🆔 67228435

Lots of buzz online about an upcoming major March heatwave for the American SW & California. And in this case, it does indeed appear increasingly likely that an extremely anomalous and even record-breaking heatwave may envelop much of the SW about a week from now. https://t.co/GByhbmJEZb

🖼️ Media
github
@github
📅 Mar 09, 2026 · 53m ago · 🆔 57713511

The bottom line: Treat agents like code, not chat interfaces. Design for failure, validate every boundary, and use explicit structure. Get our full guide on building reliable multi-agent systems here. 👇 https://t.co/yjrEEXUgwQ
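A minimal sketch of "validate every boundary" in that spirit: treat the agent's reply as untrusted input and fail loudly instead of passing malformed output downstream. The `action` schema here is hypothetical, invented for illustration; the linked guide's actual recommendations may differ.

```python
import json

ALLOWED_ACTIONS = {"search", "write", "done"}  # hypothetical action schema

def parse_agent_reply(raw: str) -> dict:
    # Boundary check 1: the reply must be valid JSON at all.
    try:
        msg = json.loads(raw)
    except json.JSONDecodeError as e:
        raise ValueError(f"agent returned non-JSON output: {e}") from e
    # Boundary check 2: the reply must have the expected shape.
    if not isinstance(msg, dict) or "action" not in msg:
        raise ValueError("agent reply missing required 'action' field")
    # Boundary check 3: only explicitly whitelisted actions pass through.
    if msg["action"] not in ALLOWED_ACTIONS:
        raise ValueError(f"unknown action: {msg['action']!r}")
    return msg
```

The design choice is that every failure mode raises at the boundary, so a downstream tool-dispatch loop never sees free-form model text masquerading as a command.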

🖼️ Media
dee_bosa
@dee_bosa
📅 Mar 09, 2026 · 2h ago · 🆔 46870386

Oracle is building yesterday's data centers with tomorrow's debt. Frontier labs like OpenAI want the newest chips, but Nvidia is shipping a new generation annually while data centers still take years to get up and running. That's a mismatch for the whole AI trade. Oracle, funding it with $100B in debt, may be first to crack.

🖼️ Media
ZeffMax
@ZeffMax
📅 Mar 09, 2026 · 2h ago · 🆔 46713032

NEW: OpenAI and Google employees, including Google DeepMind Chief Scientist Jeff Dean, filed an amicus brief in support of Anthropic in its lawsuit against the US government. https://t.co/3lQrzlq8BE

🖼️ Media
GaryMarcus
@GaryMarcus
📅 Mar 09, 2026 · 1h ago · 🆔 74627925

Let it be noted that despite my contempt for LeCun’s recurrent pattern of intellectual dishonesty, I mostly stood up for him re Zuck and Wang: https://t.co/RgtbMYwqpq

🖼️ Media
StasBekman
@StasBekman
📅 Mar 09, 2026 · 3h ago · 🆔 63792574

Good news! Ulysses Sequence Parallelism from the Snowflake AI Research and DeepSpeed teams has been integrated into the @huggingface Trainer, Accelerate, and TRL. For extensive details, please see this writeup: https://t.co/2xDWUk8p3V Thanks a lot to @krasul for helping make it happen, and to the others on the HF team who helped with the integration.
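The core trick in Ulysses sequence parallelism is a single all-to-all that re-shards activations from sequence-split to head-split, so each rank can run full-sequence attention on its own subset of heads. A toy single-process numpy simulation of that exchange (the shapes are made up, and real implementations use distributed all-to-all collectives rather than list slicing):

```python
import numpy as np

P, S, H, d = 4, 8, 4, 2  # ranks, sequence length, heads, head dim
x = np.arange(S * H * d, dtype=np.float64).reshape(S, H, d)

# Before the all-to-all: each rank holds a contiguous sequence shard
# of shape (S/P, H, d) -- all heads, part of the sequence.
seq_shards = [x[r * (S // P):(r + 1) * (S // P)] for r in range(P)]

def all_to_all(shards):
    # Rank r sends head-group j of its shard to rank j; after the exchange
    # rank j holds shape (S, H/P, d) -- the full sequence, a subset of heads,
    # which is exactly what per-head attention needs.
    Hp = H // P
    return [np.concatenate([s[:, j * Hp:(j + 1) * Hp] for s in shards], axis=0)
            for j in range(P)]

head_shards = all_to_all(seq_shards)
```

After attention, a second all-to-all applies the inverse permutation to return to sequence sharding, which is why the scheme composes cleanly with the rest of a data/sequence-parallel training loop.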

🖼️ Media ×2
ValaAfshar
@ValaAfshar
📅 Mar 08, 2026 · 1d ago · 🆔 56968489

Tim Cook on how Steve Jobs believed that small teams could do amazing work. https://t.co/k7bMtFM6hs

🔁 retweeted by SpirosMargaris
❤️ 1,260 likes · 🔁 170 retweets
🖼️ Media
r0ck3t23
@r0ck3t23
📅 Mar 09, 2026 · 8h ago · 🆔 19962406

Mark Cuban just described the sharpest divide in the modern economy. And most people are already on the wrong side of it.

Cuban: “There’s two types of approaches to AI. Some people who use it so they don’t have to learn anything, and some people who use it so they have the opportunity to learn everything.”

Two sentences. The entire future of human capital compressed into a single binary.

The first group sees the most powerful knowledge infrastructure ever built and uses it to avoid thinking. They offload reasoning, skip the friction, and call it efficiency. What they’re actually doing is hollowing out the one thing that can’t be replicated: their own cognition.

Cuban: “AI is a tool, it’s a way to learn, it’s a democratization of knowledge.”

For centuries, elite knowledge was locked behind institutions, geography, and capital. The right university. The right city. The right network. Entire generations of potential buried because the information was never accessible. That wall just came down permanently.

The second group understands what that actually means. Same tool. Compressing decades of learning into months. Entire disciplines on demand. Mental models that once required years of expensive education now available to anyone willing to ask the right questions.

The knowledge is democratized. The ambition is not. That’s the divide Cuban is actually describing. Not technical literacy. Not access. Pure cognitive initiative.

The first group is outsourcing their mind. The second is expanding it. Atrophy doesn’t announce itself. It just arrives.

πŸ–ΌοΈ Media
theinformation
@theinformation
📅 Mar 09, 2026 · 2h ago · 🆔 87202946

OpenAI’s IPO hopes are facing skepticism from investors, @AnitaRamaswamy explains: “They were still concerned about the current valuation.” “OpenAI… doesn't project that it's going to be generating cash until at least 2030.” https://t.co/7moB74b9M1

πŸ–ΌοΈ Media
rohanvarma
@rohanvarma
📅 Mar 09, 2026 · 1h ago · 🆔 66693351

If you want AI code review but don't want to pay $25 per review (not a typo), check out Codex Review! It leverages frontier Codex models, finds complex issues, and is 100% usage-based. Most runs should cost ~$1 or less. https://t.co/43iF6rq8Xa

🖼️ Media
jerryjliu0
@jerryjliu0
📅 Mar 09, 2026 · 4h ago · 🆔 61342799

We built a neat tool that lets you convert a directory of PowerPoint files into clean, structured markdown that Claude Code / the agent SDK / any generalized agent wrapper can easily understand. The pptx skill in Claude Code is quite basic and doesn't have high-fidelity understanding of graphics/charts/tables. Our project Surreal Slides uses LlamaParse to convert presentations into clean structured data that you can put into a db (@SurrealDB) for simple retrieval, without having to take screenshots of the data on the fly. Thanks to @itsclelia for this project, check it out: https://t.co/Fj1PASv8IP
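For context on what the basic, low-fidelity level of pptx extraction looks like: a .pptx file is a zip of XML, and plain text runs live in `<a:t>` elements. The standard-library sketch below (an illustration of that baseline, not LlamaParse or the Surreal Slides pipeline) pulls only those text runs, which is exactly why charts, tables, and graphics get lost without a higher-fidelity parser.

```python
import re
import zipfile
import xml.etree.ElementTree as ET

# DrawingML namespace; slide text runs are <a:t> elements in this namespace.
A_TEXT = "{http://schemas.openxmlformats.org/drawingml/2006/main}t"

def pptx_to_markdown(pptx_file) -> str:
    # A .pptx is a zip archive; the text of slide N lives in
    # ppt/slides/slideN.xml. Text only: charts/tables/graphics are lost.
    lines = []
    with zipfile.ZipFile(pptx_file) as z:
        slides = sorted(
            (n for n in z.namelist()
             if re.fullmatch(r"ppt/slides/slide\d+\.xml", n)),
            key=lambda n: int(re.search(r"slide(\d+)", n).group(1)),
        )
        for i, name in enumerate(slides, 1):
            lines.append(f"## Slide {i}")
            for t in ET.fromstring(z.read(name)).iter(A_TEXT):
                if t.text and t.text.strip():
                    lines.append(f"- {t.text.strip()}")
    return "\n".join(lines)
```

Numeric sorting on the slide index matters because lexicographic order would put slide10.xml before slide2.xml.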

🖼️ Media
πŸ”SpirosMargaris retweeted
E
Elon Musk
@elonmusk
πŸ“…
Mar 09, 2026
18h ago
πŸ†”05094711

I use this analogy a lot. This is what a room full of computers looked like in old times: https://t.co/Y4a93z76lw

❤️ 147,083 likes · 🔁 12,298 retweets
🖼️ Media
reach_vb
@reach_vb
📅 Mar 09, 2026 · 2h ago · 🆔 80687077

ICYMI: As part of Codex for OSS, open-source maintainers can apply for API credits, six months of ChatGPT Pro with Codex, and Codex Security! Apply!! First batch rolling out soon! https://t.co/qcXXO4DLNt

🖼️ Media
PyTorch
@PyTorch
📅 Mar 09, 2026 · 2h ago · 🆔 50383131

PyTorch is heading to @NVIDIA #GTC26 in San Jose next week!
📍 Visit us at Booth #338 for:
✨ Helion kernel authoring demos
✨ ExecuTorch on-device inference
✨ Meet PyTorch core maintainers & experts
Plus talks, hands-on labs & a hackathon!
🔗 https://t.co/Hn2DEgXXa5 https://t.co/YUK6iWvX3V

🖼️ Media
AskPerplexity
@AskPerplexity
📅 Mar 09, 2026 · 2h ago · 🆔 36274180

Perplexity Computer replaced $225K/yr in marketing tools in a single weekend. We built an AI marketing agent that scans hourly, manages budgets, detects fatigue, and coordinates several campaigns end to end. In one test run, it made 224 micro-optimizations to our ad stack. https://t.co/B0ueikpQyp

πŸ–ΌοΈ Media
RpsAgainstTrump
@RpsAgainstTrump
📅 Mar 09, 2026 · 4h ago · 🆔 67414904

SHOCKING: Among Republican men under 50, 54% deny the Holocaust. We are so screwed. https://t.co/vt6ZmGsCoY

🔁 retweeted by ylecun
❤️ 2,329 likes · 🔁 506 retweets
🖼️ Media
PtrPomorski
@PtrPomorski
📅 Mar 09, 2026 · 5h ago · 🆔 55724832

No way, a product that sucks doesn’t work as expected! https://t.co/zSVKcLk7uT https://t.co/aPzF2tlNm1

🖼️ Media