Your curated collection of saved posts and media

Showing 32 posts · last 7 days · newest first
πŸ”Scobleizer retweeted
H
H
@hcompany_ai
πŸ“…
Mar 17, 2026
5h ago
πŸ†”14320083
⭐0.34

πŸš€ Live from @NVIDIAGTC, we're releasing Holotron-12B! Developed with @nvidia, it's a high-throughput, open-source, multimodal model engineered specifically for the age of computer-use agents. Get started today! πŸ€—Hugging Face: https://t.co/oaSviLi8IN πŸ“–Technical Deep Dive: https://t.co/pDItQB1frU πŸ’ΌWe are hiring: https://t.co/fcNoR9FIYQ #AI #ComputerUse #NVIDIA #OpenSource #ReinforcementLearning #HCompany #GTC @NVIDIAAI @nvidia

❀️44
likes
πŸ”11
retweets
πŸ”Scobleizer retweeted
S
Sara Hooker
@sarahookr
πŸ“…
Mar 17, 2026
2h ago
πŸ†”66750608
⭐0.34

Two weeks after sharing @adaption_ai adaptive data, we are excited to share our work on blueprint πŸ“˜ blueprint steers data towards your goals, and learns penalties if AI violates any of your rules. Very proud of the team.

❀️37
likes
πŸ”5
retweets
πŸ”HamelHusain retweeted
R
Randy Olson
@randal_olson
πŸ“…
Mar 17, 2026
35m ago
πŸ†”01984370
⭐0.36

Yes! The β€œare you sure?” problem (link below) is especially pervasive in any complex coding task. Ask Claude or GPT to review a PR then ask it to double check its findings when it finishes - it’ll flip on at least 1 of its findings. https://t.co/WUSuOxTuDy

πŸ”1
retweets
R
randal_olson
@randal_olson
📅
Mar 17, 2026
35m ago
🆔01984370

Yes! The “are you sure?” problem (link below) is especially pervasive in any complex coding task. Ask Claude or GPT to review a PR, then ask it to double-check its findings when it finishes; it’ll flip on at least 1 of its findings. https://t.co/WUSuOxTuDy

@HamelHusain • Tue Mar 17 14:55

One thing that makes me feel that code factory has not arrived yet is the following experiment: 1. Ask an LLM to do an in-depth rigorous review of your code 2. In a new thread, ask the same or a different LLM to consider those review comments independently and address issues it agrees with

Media 1
🖼️ Media
πŸ”ylecun retweeted
B
Belen Alastruey
@b_alastruey
πŸ“…
Mar 17, 2026
1h ago
πŸ†”03697001
⭐0.34

πŸ”ŽA closer look at Omnilingual No Language Left Behind, the encoder-decoder system presented as part of @AIatMeta new Omnilingual Machine Translation work!🌍 Many say encoder-decoder is dead in the age of decoder-only LLMs but we show it’s not! πŸ“„:https://t.co/isvEzRZbnw 🧡1/n https://t.co/RLs8ncUy0H

❀️16
likes
πŸ”7
retweets
πŸ”random_walker retweeted
S
Stephan Rabanser
@steverab
πŸ“…
Mar 17, 2026
2h ago
πŸ†”50398178
⭐0.36

In our paper "Towards a Science of AI Agent Reliability" we put numbers on the capability-reliability gap. Now we're showing what's behind them! We conducted an extensive analysis of failures on GAIA across Claude Opus 4.5, Gemini 2.5 Pro, and GPT 5.4. Here's what we found ⬇️ https://t.co/GkdAxk0wDO

❀️8
likes
πŸ”2
retweets
πŸ”dair_ai retweeted
O
elvis
@omarsar0
πŸ“…
Mar 17, 2026
1h ago
πŸ†”73572061
⭐0.36

Current vision-language models still struggle with simple diagrams. Feynman is a knowledge-infused diagramming agent that enumerates domain-specific concepts, plans visual representations, and translates them into declarative programs rendered by the Penrose diagramming system. Great insights for those building agents for diagrams and visualizations. One pipeline run produced 10,693 unique programs across math, CS, and science, each rendered into 10 layout variations, yielding over 106k well-aligned diagram-caption pairs. Paper: https://t.co/F4vNS0TII4 Learn to build effective AI agents in our academy: https://t.co/1e8RZKs4uX

❀️12
likes
πŸ”1
retweets
H
HamelHusain
@HamelHusain
📅
Mar 17, 2026
1h ago
🆔38886331
⭐0.42

One thing that makes me feel that code factory has not arrived yet is the following experiment: 1. Ask an LLM to do an in-depth, rigorous review of your code 2. In a new thread, ask the same or a different LLM to consider those review comments independently and address issues it agrees with 3. Keep repeating until no new concerns I find that this loop always goes on for a ridiculously long time, which means that there is a problem with the notion of claude-take-the-wheel. This seems to happen no matter the harness or the specificity of the specs. It works fine for simple applications, but in the limit, if the LLMs have this much cognitive dissonance, you cannot trust them. Either this, or LLMs are RLHF'd to always find some kind of issue.
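
The loop in the experiment above is easy to state precisely. A minimal sketch, assuming nothing about any particular LLM API: `request_review` is a hypothetical stand-in for "review the code in a fresh thread", and the canned reviewer exists only so the control flow runs.

```python
def review_until_stable(code, request_review, max_rounds=10):
    """Repeat the fresh-thread review loop until a round raises
    no new concerns, or we give up after max_rounds."""
    accepted = []  # findings agreed with so far
    for round_no in range(max_rounds):
        findings = request_review(code, accepted)  # fresh thread each round
        new = [f for f in findings if f not in accepted]
        if not new:  # converged: nothing new this round
            return accepted, round_no
        accepted.extend(new)
    return accepted, max_rounds  # never converged within the budget

# Canned reviewer so the sketch is runnable: one new concern per
# round, then silence. The post's claim is that a real LLM reviewer
# behaves like a never-silent version of this function.
script = [["unclear naming"], ["missing error handling"], []]
def canned_reviewer(code, prior):
    return script[min(len(prior), len(script) - 1)]

findings, rounds = review_until_stable("def f(): pass", canned_reviewer)
```

With a reviewer that always returns at least one finding, the loop exhausts `max_rounds`, which is exactly the non-convergence the post describes.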

D
dair_ai
@dair_ai
📅
Mar 17, 2026
1h ago
🆔53826696

Even the best reasoning models hit an accuracy collapse beyond a certain problem complexity. Giving an LRM the exact solution algorithm doesn't fix it either. This new work, BIGMAS, improves LLM agents by taking inspiration from the human brain. BIGMAS outperforms both ReAct and Tree of Thoughts across all three tasks. It organizes specialized LLM agents as nodes in a dynamically constructed directed graph, coordinated through a centralized shared workspace inspired by global workspace theory. A GraphDesigner builds task-specific agent topologies per problem, and a global Orchestrator routes decisions using the complete shared state, eliminating the local-view bottleneck of reactive approaches. Across Game24, Six Fives, and Tower of London on six frontier LLMs, including GPT-5 and Claude 4.5, BIGMAS consistently improves accuracy. The gains are largest where models struggle most: DeepSeek-V3.2 jumps from 12% to 30% on Six Fives. Paper: https://t.co/sMqUfvHAGp Learn to build effective AI agents in our academy: https://t.co/LRnpZN7L4c

Media 1Media 2
🖼️ Media
T
tkipf
@tkipf
📅
Mar 17, 2026
1h ago
🆔40516768
⭐0.36

This is one of the most impressive world model projects I have seen. Very elegant and highly effective combination of an image retrieval mechanism (using 3D locations/views) and otherwise just pure generative modeling. This is the way.

@jyseo_cv • Tue Mar 17 02:59

What if a world model could render not an imagined place, but the actual city? We introduce Seoul World Model, the first world simulation model grounded in a real-world metropolis. TL;DR: We made a world model RAG over millions of street-views. proj: https://t.co/Bx4KUAqrRs ht

πŸ”jxnlco retweeted
S
Simon Willison
@simonw
πŸ“…
Mar 17, 2026
2h ago
πŸ†”68704943
⭐0.34

... and a follow-up chapter about Subagents, now a feature of Codex and Claude Code and Gemini CLI and Mistral Vibe and OpenCode and VS Code and Cursor https://t.co/suGmK4g3Hp

❀️17
likes
πŸ”3
retweets
πŸ”ivanleomk retweeted
A
agrim singh
@agrimsingh
πŸ“…
Mar 17, 2026
2h ago
πŸ†”08839930
⭐0.38

i dj'd a set this weekend and planned half of it with an app i built on @GoogleDeepMind's new multimodal embeddings model. it understands what music actually sounds like - not bpm, not genre tags. raw audio in, vibes out. built with @cursor_ai + @openai gpt 5.4, gemini embeddings and @convex to keep everything running smoothly @ericzakariasson @DynamicWebPaige @waynesutton @gabrielchua here's what it does (demo below) 🧡

❀️6
likes
πŸ”3
retweets
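
The "raw audio in, vibes out" step described in the post reduces to nearest-neighbour search in embedding space. A minimal sketch: the hand-written vectors, track names, and dimensions below are invented stand-ins for what a multimodal audio-embedding model would produce.

```python
from math import sqrt

def cosine(a, b):
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b)))

def rank_by_vibe(seed, library):
    """Rank tracks by cosine similarity of their audio embeddings
    to a seed track: most similar vibe first."""
    return sorted(library, key=lambda track: cosine(seed, library[track]),
                  reverse=True)

# Invented embeddings for illustration; a real app would embed raw audio.
seed = [1.0, 1.0, 0.0, 0.0]
library = {
    "close_match": [1.0, 0.9, 0.1, 0.0],
    "different":   [-1.0, 0.2, 0.8, 0.0],
}
order = rank_by_vibe(seed, library)
```

Cosine similarity on normalized model embeddings is the standard choice here because it compares direction ("what it sounds like") rather than magnitude.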
πŸ”_akhaliq retweeted
N
Niels Rogge
@NielsRogge
πŸ“…
Mar 17, 2026
2h ago
πŸ†”69938152

Tried the viral poster-skill with Claude Code on the trending Moonshot paper :) Not too bad! https://t.co/I8lb0aUrbT

Media 1
❀️10
likes
πŸ”3
retweets
πŸ–ΌοΈ Media
N
NielsRogge
@NielsRogge
📅
Mar 17, 2026
2h ago
🆔69938152

Tried the viral poster-skill with Claude Code on the trending Moonshot paper :) Not too bad! https://t.co/I8lb0aUrbT

@Kimi_Moonshot • Mon Mar 16 03:03

Introducing Attention Residuals: Rethinking depth-wise aggregation. Residual connections have long relied on fixed, uniform accumulation. Inspired by the duality of time and depth, we introduce Attention Residuals, replacing standard depth-wise recurrence with learned, input-dep

Media 1
🖼️ Media
S
sarahookr
@sarahookr
📅
Mar 17, 2026
2h ago
🆔66750608
⭐0.36

Two weeks after sharing @adaption_ai adaptive data, we are excited to share our work on Blueprint 📘 Blueprint steers data towards your goals and learns penalties if AI violates any of your rules. Very proud of the team.

@adaption_ai • Tue Mar 17 12:00

Introducing Blueprint, a new capability within Adaptive Data. We firmly believe data that evolves with the world is only useful if it evolves the right way. Blueprint allows you to steer the data space towards any goal you want. https://t.co/8k0WEMYmdd

πŸ”Modular retweeted
F
Forward Future
@ForwardFuture
πŸ“…
Mar 13, 2026
3d ago
πŸ†”14582366
⭐0.32

β€œEveryone should be a GPU programmer.” @clattner_llvm's goal with @Modular: β€œWhat Modular is doing is opening up the box. We’re fixing the language problem and the platform problem. "The goal is to let more developers learn modern compute. And to give developers real choice in the hardware they use.” β€œThose two things unlock the ecosystem.”

❀️6
likes
πŸ”1
retweets
S
siyuanhuang95
@siyuanhuang95
📅
Mar 17, 2026
3h ago
🆔37897027

Excited to introduce OmniClone, a robust teleoperation system for humanoid mobile manipulation. While systems like TWIST2 and SONIC paved the way, we put effort into solving the critical stability and scaling gaps. 1/ 📊 Moving past "vibe-based" testing. We’ve built a comprehensive diagnostic benchmark to systematically evaluate whole-body teleoperation. No more trial and error: get the actionable insights needed for true policy optimization. 2/ 👀 Universal Human-to-Robot Mapping. Teleop often breaks when switching operators. OmniClone mitigates biases from hardware fluctuations and, crucially, diverse human body shapes, ensuring high-stability control regardless of the person in the suit. 3/ 🚀 System Optimizations for Whole-body Manipulation Policy. By optimizing for affordability and reproducibility, OmniClone provides the high-fidelity pipeline necessary to collect data and train humanoid whole-body policies at scale. The model checkpoints and deploy code are now fully released; welcome to play with it! 📦 📄 Paper: https://t.co/kDm60WeuMD 🌐 Project: https://t.co/WGcfYridEs 💻 Code: https://t.co/U1QLgaipcd

Media 2
🖼️ Media
R
rasbt
@rasbt
📅
Mar 17, 2026
3h ago
🆔71094774
⭐0.32

@SalajSonar1086 Already done :). The respective tutorial articles are linked via the “View in Article” links there

B
b_alastruey
@b_alastruey
📅
Mar 17, 2026
5h ago
🆔39491508

Happy to share 🌍Omnilingual Machine Translation🌍 In this work at @AIatMeta, we explore translation systems supporting 1,600+ languages. We show how our models (1B to 8B) can outperform baselines of up to 70B while having much larger language coverage. 📄:https://t.co/isvEzRZbnw https://t.co/8sdgkQuJ3B

Media 1
🖼️ Media
πŸ”huggingface retweeted
G
Gabriele Berton
@gabriberton
πŸ“…
Mar 16, 2026
16h ago
πŸ†”45334177
⭐0.34

VisMatch is on pypi! VisMatch is a wrapper for image matching models, like LightGlue, RoMa-v2, MASt3R, LoFTR, and 50+ more! It's literally as simple as: pip install vismatch vismatch-match --inputs img0 img1 --matcher choose_any To run image matching on any 2 images [1/4] https://t.co/dIr2YapWak

❀️183
likes
πŸ”28
retweets
D
drmapavone
@drmapavone
📅
Mar 17, 2026
11h ago
🆔58775875

Jensen today announced Alpamayo 1.5 at #NVIDIAGTC! #Alpamayo 1.5 is a major update to Alpamayo 1, @nvidia’s open 10B-parameter chain-of-thought reasoning VLA model first introduced at #CES. Built on the #Cosmos-Reason2 VLM backbone and post-trained with RL, it adds support for navigation guidance, flexible multi-camera setups, configurable camera parameters, and user question answering. The result is an interactive, steerable reasoning engine for the AV community. We’re also releasing post-training scripts to help researchers and developers adapt the model. Additionally, we’ve significantly expanded the Alpamayo open platform across data and simulation, including releasing highly requested reasoning labels for the PhysicalAI Autonomous Vehicles dataset (https://t.co/fD9eUcndya), as well as our chain-of-causation auto-labeling pipeline. 🔎 Learn more about Alpamayo 1.5 and the latest extensions to the Alpamayo open platform: https://t.co/P0nuqkwBab (please note that most of the links will become active in the next few days.) Happy building, and stay tuned for more in the coming months! @NVIDIADRIVE @NVIDIAAI

Media 2
+1 more
🖼️ Media
L
lucas_flatwhite
@lucas_flatwhite
📅
Mar 17, 2026
11h ago
🆔99053607

🛠️ Claude Code "opusplan" is literally a hybrid model.. and it's official! In Claude Code you can select the opusplan model: > /model opusplan It's a hybrid model alias that switches models automatically depending on the stage of work: it runs Opus in plan mode for complex reasoning, then switches to Sonnet automatically for the execution stage! You can of course plan and implement entirely with Opus, but if you already have a solid plan, Sonnet is sufficient for execution and cheaper. Using the right model for each task = efficiency 🚀 Planning and execution demand different cognitive loads. Opus's deep reasoning shines brightest while drawing up the plan, and once a solid plan is in place, the execution that follows can be fully covered by Sonnet. When should you use it? - Work where architecture decisions matter, like designing complex features - Cases that need impact analysis, like refactoring plans - When you want to cut costs versus running full-power Opus. I expect people will be using this a lot!!!

@dani_avila7 • Mon Mar 16 15:58

Did you know about the opusplan model in Claude Code? /model opusplan It's a hybrid alias that automatically uses Opus in plan mode for complex reasoning, then switches to Sonnet for execution. Best of both worlds: Opus thinks, Sonnet builds https://t.co/r7un0X5bVg

🖼️ Media
G
gdb
@gdb
📅
Mar 17, 2026
11h ago
🆔37895367
⭐0.32

Subagents are now supported in Codex. They're very fun and make it possible to get large amounts of work done *quickly*:

@OpenAIDevs • Mon Mar 16 20:09

Subagents are now available in Codex. You can accelerate your workflow by spinning up specialized agents to: • Keep your main context window clean • Tackle different parts of a task in parallel • Steer individual agents as work unfolds https://t.co/QJC2ZYtYcA

πŸ”omarsar0 retweeted
O
elvis
@omarsar0
πŸ“…
Mar 16, 2026
1d ago
πŸ†”09077648
⭐0.38

Banger report from the Kimi team: Attention Residuals Residual connections made deep Transformers trainable. But they also force uncontrolled hidden-state growth with depth. This work proposes a cleaner alternative. It introduces Attention Residuals, which replace fixed residual accumulation with softmax attention over previous layer outputs. Instead of blindly summing everything, each layer selectively retrieves the earlier representations it actually needs. To keep this practical at scale, they add a blockwise version that compresses layers into block summaries, recovering most of the gains with minimal systems overhead. Why does it matter? Residual paths have barely changed across modern LLMs, even though they govern how information moves through depth. This paper shows that making the mixing content-dependent improves scaling laws, matches a baseline trained with 1.25x more compute, boosts GPQA-Diamond by +7.5 and HumanEval by +3.1, while keeping inference overhead under 2%. Paper: https://t.co/04IG6FDiVr Learn to build effective AI agents in our academy: https://t.co/1e8RZKs4uX

❀️130
likes
πŸ”17
retweets
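
The core replacement the thread describes (softmax attention over previous layer outputs instead of a uniform residual sum) fits in a few lines. This is a simplified scalar-weight sketch using the current layer's output as the query; the paper's actual parameterization (learned projections, blockwise summaries) is richer than this.

```python
from math import exp

def attention_residual(prev_outputs, query):
    """Mix previous layer outputs with softmax attention instead of a
    uniform residual sum: each score is the dot product between the
    current layer's output (the query) and an earlier layer's output.
    A simplified sketch, not the paper's exact formulation."""
    scores = [sum(q * h for q, h in zip(query, out)) for out in prev_outputs]
    peak = max(scores)
    weights = [exp(s - peak) for s in scores]  # numerically stable softmax
    total = sum(weights)
    weights = [w / total for w in weights]
    dim = len(query)
    return [sum(w * out[i] for w, out in zip(weights, prev_outputs))
            for i in range(dim)]

# Three earlier layer outputs; the query is aligned with the first,
# so the mix leans toward it rather than summing all three uniformly.
layers = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
mixed = attention_residual(layers, query=[2.0, 0.0])
```

Setting all weights equal instead of using the softmax recovers (up to scale) the fixed accumulation of a standard residual stream, which is the baseline the paper argues against.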
πŸ”ivanleomk retweeted
Y
Yoeven
@yoeven
πŸ“…
Mar 17, 2026
14h ago
πŸ†”65291100

The moment he realised that https://t.co/vWmBsnR1nt isn't fully built on transformers and we can run on a single GPU with high accuracy and lower cost https://t.co/ZJYuL62UB8

Media 1
❀️4
likes
πŸ”1
retweets
πŸ–ΌοΈ Media