Your curated collection of saved posts and media


Showing 32 posts · last 7 days · newest first


🔁Scobleizer retweeted

H

H

@hcompany_ai

📅

Mar 17, 2026

5h ago

🆔14320083

⭐0.34

🚀 Live from @NVIDIAGTC, we're releasing Holotron-12B! Developed with @nvidia, it's a high-throughput, open-source, multimodal model engineered specifically for the age of computer-use agents. Get started today! 🤗Hugging Face: https://t.co/oaSviLi8IN 📖Technical Deep Dive: https://t.co/pDItQB1frU 💼We are hiring: https://t.co/fcNoR9FIYQ #AI #ComputerUse #NVIDIA #OpenSource #ReinforcementLearning #HCompany #GTC @NVIDIAAI @nvidia

❤️ 44 likes · 🔁 11 retweets

🔁Scobleizer retweeted

S

Sara Hooker

@sarahookr

📅

Mar 17, 2026

2h ago

🆔66750608

⭐0.34

Two weeks after sharing @adaption_ai adaptive data, we are excited to share our work on Blueprint 📘. Blueprint steers data towards your goals and learns penalties if the AI violates any of your rules. Very proud of the team.

❤️ 37 likes · 🔁 5 retweets

🔁HamelHusain retweeted

R

Randy Olson

@randal_olson

📅

Mar 17, 2026

35m ago

🆔01984370

⭐0.36

Yes! The “are you sure?” problem (link below) is especially pervasive in any complex coding task. Ask Claude or GPT to review a PR then ask it to double check its findings when it finishes - it’ll flip on at least 1 of its findings. https://t.co/WUSuOxTuDy

🔁 1 retweet


🔁ylecun retweeted

B

Belen Alastruey

@b_alastruey

📅

Mar 17, 2026

1h ago

🆔03697001

⭐0.34

🔎A closer look at Omnilingual No Language Left Behind, the encoder-decoder system presented as part of @AIatMeta new Omnilingual Machine Translation work!🌍 Many say encoder-decoder is dead in the age of decoder-only LLMs but we show it’s not! 📄:https://t.co/isvEzRZbnw 🧵1/n https://t.co/RLs8ncUy0H

❤️ 16 likes · 🔁 7 retweets

🔁random_walker retweeted

S

Stephan Rabanser

@steverab

📅

Mar 17, 2026

2h ago

🆔50398178

⭐0.36

In our paper "Towards a Science of AI Agent Reliability" we put numbers on the capability-reliability gap. Now we're showing what's behind them! We conducted an extensive analysis of failures on GAIA across Claude Opus 4.5, Gemini 2.5 Pro, and GPT 5.4. Here's what we found ⬇️ https://t.co/GkdAxk0wDO

❤️ 8 likes · 🔁 2 retweets

🔁dair_ai retweeted

O

elvis

@omarsar0

📅

Mar 17, 2026

1h ago

🆔73572061

⭐0.36

Current vision-language models still struggle with simple diagrams. Feynman is a knowledge-infused diagramming agent that enumerates domain-specific concepts, plans visual representations, and translates them into declarative programs rendered by the Penrose diagramming system. Great insights for those building agents for diagrams and visualizations. One pipeline run produced 10,693 unique programs across math, CS, and science, each rendered into 10 layout variations, yielding over 106k well-aligned diagram-caption pairs. Paper: https://t.co/F4vNS0TII4 Learn to build effective AI agents in our academy: https://t.co/1e8RZKs4uX
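The dataset size quoted in the post can be checked with a line of arithmetic. This is only a sanity check of the two figures given in the post (10,693 programs, 10 layout variations each); the variable names are illustrative, and it assumes every program renders successfully in all 10 layouts.

```python
# Figures taken from the post; names are illustrative only.
unique_programs = 10_693
layouts_per_program = 10

# Upper bound on diagram-caption pairs if every render succeeds.
pairs = unique_programs * layouts_per_program
print(pairs)  # 106930, consistent with "over 106k"
```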

❤️ 12 likes · 🔁 1 retweet

H

HamelHusain

@HamelHusain

📅

Mar 17, 2026

1h ago

🆔38886331

⭐0.42

One thing that makes me feel that code factory has not arrived yet is the following experiment:

1. Ask an LLM to do an in-depth, rigorous review of your code.
2. In a new thread, ask the same or a different LLM to consider those review comments independently and address the issues it agrees with.
3. Keep repeating until there are no new concerns.

I find that this loop always goes on for a ridiculously long time, which means that there is a problem with the notion of claude-take-the-wheel. This seems to happen no matter the harness or the specificity of the specs. It works fine for simple applications, but in the limit, if the LLMs have this much cognitive dissonance, you cannot trust them. Either this, or LLMs are RLHF'd to always find some kind of issue.
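The three-step experiment above can be sketched as a small loop. This is a minimal sketch, not the author's harness: `review` and `address` are hypothetical callables standing in for the two LLM threads, and the two toy reviewers below only simulate the "settles" and "always finds something" behaviours described in the post.

```python
import itertools
from typing import Callable, List, Set, Tuple


def review_until_stable(
    code: str,
    review: Callable[[str], List[str]],        # hypothetical: returns review concerns
    address: Callable[[str, List[str]], str],  # hypothetical: returns revised code
    max_rounds: int = 10,
) -> Tuple[int, bool]:
    """Repeat review -> address until a review raises no new concerns.

    Returns (rounds_run, converged).
    """
    seen: Set[str] = set()
    for round_no in range(1, max_rounds + 1):
        new = [c for c in review(code) if c not in seen]
        if not new:
            return round_no, True
        seen.update(new)
        code = address(code, new)
    return max_rounds, False


def finite_reviewer(code: str) -> List[str]:
    # Toy reviewer: flags every TODO marker still present.
    return [f"unresolved TODO #{i}" for i in range(code.count("TODO"))]


def fix_todos(code: str, concerns: List[str]) -> str:
    # Toy fixer: resolves one TODO per concern.
    return code.replace("TODO", "done", len(concerns))


def make_nitpicky_reviewer() -> Callable[[str], List[str]]:
    # Toy reviewer that always invents one fresh concern.
    counter = itertools.count(1)
    return lambda code: [f"nitpick {next(counter)}"]


print(review_until_stable("TODO TODO", finite_reviewer, fix_todos))
print(review_until_stable("x = 1", make_nitpicky_reviewer(), lambda c, n: c, max_rounds=5))
```

The first call converges on the second round; the second never converges and stops at the round limit, which is the pattern the post describes.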


D

dair_ai

@dair_ai

📅

Mar 17, 2026

1h ago

🆔53826696

Even the best reasoning models hit an accuracy collapse beyond a certain problem complexity. Giving an LRM the exact solution algorithm doesn't fix it either. This new work, BIGMAS, improves LLM agents by taking inspiration from the human brain. BIGMAS outperforms both ReAct and Tree of Thoughts across all three tasks. It organizes specialized LLM agents as nodes in a dynamically constructed directed graph, coordinated through a centralized shared workspace inspired by global workspace theory. A GraphDesigner builds task-specific agent topologies per problem, and a global Orchestrator routes decisions using the complete shared state, eliminating the local-view bottleneck of reactive approaches. Across Game24, Six Fives, and Tower of London on six frontier LLMs, including GPT-5 and Claude 4.5, BIGMAS consistently improves accuracy. The gains are largest where models struggle most: DeepSeek-V3.2 jumps from 12% to 30% on Six Fives. Paper: https://t.co/sMqUfvHAGp Learn to build effective AI agents in our academy: https://t.co/LRnpZN7L4c
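To make the architecture description concrete, here is a toy sketch of agents as nodes in a directed graph that communicate only through a shared workspace, with an orchestrator routing on the full shared state. It is not the BIGMAS implementation: the graph is hand-written rather than produced by a GraphDesigner, the agents are plain functions instead of LLM calls, and every name here is a hypothetical chosen for illustration.

```python
from typing import Any, Callable, Dict, List

Workspace = Dict[str, Any]  # the shared state every agent reads and writes


def propose(ws: Workspace) -> None:
    a, b = ws["numbers"]
    ws["candidate"] = a + b


def verify(ws: Workspace) -> None:
    ws["ok"] = ws["candidate"] == ws["target"]


def repair(ws: Workspace) -> None:
    a, b = ws["numbers"]
    ws["candidate"] = a * b


AGENTS: Dict[str, Callable[[Workspace], None]] = {
    "propose": propose, "verify": verify, "repair": repair,
}
# Directed edges: which nodes may follow each node.
EDGES: Dict[str, List[str]] = {
    "propose": ["verify"], "verify": ["repair", "done"], "repair": ["verify"],
}


def route(node: str, ws: Workspace) -> str:
    """Orchestrator: pick the next node from the complete shared state."""
    if node == "verify":
        nxt = "done" if ws["ok"] else "repair"
    else:
        nxt = EDGES[node][0]
    assert nxt in EDGES[node]
    return nxt


def orchestrate(ws: Workspace, start: str = "propose", max_steps: int = 10) -> List[str]:
    trace: List[str] = []
    node = start
    for _ in range(max_steps):
        if node == "done":
            break
        AGENTS[node](ws)
        trace.append(node)
        node = route(node, ws)
    return trace


ws: Workspace = {"numbers": (4, 6), "target": 24}
print(orchestrate(ws))   # ['propose', 'verify', 'repair', 'verify']
print(ws["candidate"])   # 24
```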


T

tkipf

@tkipf

📅

Mar 17, 2026

1h ago

🆔40516768

⭐0.36

This is one of the most impressive world model projects I have seen. Very elegant and highly effective combination of an image retrieval mechanism (using 3D locations/views) and otherwise just pure generative modeling. This is the way.

@jyseo_cv • Tue Mar 17 02:59

What if a world model could render not an imagined place, but the actual city? We introduce Seoul World Model, the first world simulation model grounded in a real-world metropolis. TL;DR: We made a world model RAG over millions of street-views. proj: https://t.co/Bx4KUAqrRs ht


🔁jxnlco retweeted

S

Simon Willison

@simonw

📅

Mar 17, 2026

2h ago

🆔68704943

⭐0.34

... and a follow-up chapter about Subagents, now a feature of Codex and Claude Code and Gemini CLI and Mistral Vibe and OpenCode and VS Code and Cursor https://t.co/suGmK4g3Hp

❤️ 17 likes · 🔁 3 retweets

🔁ivanleomk retweeted

A

agrim singh

@agrimsingh

📅

Mar 17, 2026

2h ago

🆔08839930

⭐0.38

i dj'd a set this weekend and planned half of it with an app i built on @GoogleDeepMind's new multimodal embeddings model. it understands what music actually sounds like - not bpm, not genre tags. raw audio in, vibes out. built with @cursor_ai + @openai gpt 5.4, gemini embeddings and @convex to keep everything running smoothly @ericzakariasson @DynamicWebPaige @waynesutton @gabrielchua here's what it does (demo below) 🧵

❤️ 6 likes · 🔁 3 retweets


🔁_akhaliq retweeted

N

Niels Rogge

@NielsRogge

📅

Mar 17, 2026

2h ago

🆔69938152

Tried the viral poster-skill with Claude Code on the trending Moonshot paper :) Not too bad! https://t.co/I8lb0aUrbT

❤️ 10 likes · 🔁 3 retweets

N

NielsRogge

@NielsRogge

📅

Mar 17, 2026

2h ago

🆔69938152

Tried the viral poster-skill with Claude Code on the trending Moonshot paper :) Not too bad! https://t.co/I8lb0aUrbT

@Kimi_Moonshot • Mon Mar 16 03:03

Introducing 𝑨𝒕𝒕𝒆𝒏𝒕𝒊𝒐𝒏 𝑹𝒆𝒔𝒊𝒅𝒖𝒂𝒍𝒔: Rethinking depth-wise aggregation. Residual connections have long relied on fixed, uniform accumulation. Inspired by the duality of time and depth, we introduce Attention Residuals, replacing standard depth-wise recurrence with learned, input-dep


S

sarahookr

@sarahookr

📅

Mar 17, 2026

2h ago

🆔66750608

⭐0.36

Two weeks after sharing @adaption_ai adaptive data, we are excited to share our work on Blueprint 📘. Blueprint steers data towards your goals and learns penalties if the AI violates any of your rules. Very proud of the team.

@adaption_ai • Tue Mar 17 12:00

Introducing Blueprint, a new capability within Adaptive Data. We firmly believe data that evolves with the world is only useful if it evolves the right way. Blueprint allows you to steer the data space towards any goal you want. https://t.co/8k0WEMYmdd


🔁Modular retweeted

F

Forward Future

@ForwardFuture

📅

Mar 13, 2026

3d ago

🆔14582366

⭐0.32

“Everyone should be a GPU programmer.” @clattner_llvm's goal with @Modular: “What Modular is doing is opening up the box. We’re fixing the language problem and the platform problem.” “The goal is to let more developers learn modern compute. And to give developers real choice in the hardware they use.” “Those two things unlock the ecosystem.”

❤️ 6 likes · 🔁 1 retweet


S

siyuanhuang95

@siyuanhuang95

📅

Mar 17, 2026

3h ago

🆔37897027

Excited to introduce OmniClone, a robust teleoperation system for humanoid mobile manipulation. While systems like TWIST2 and SONIC paved the way, we focused on closing the critical stability and scaling gaps.

1/ 📊 Moving past "vibe-based" testing. We’ve built a comprehensive diagnostic benchmark to systematically evaluate whole-body teleoperation. No more trial and error: get the actionable insights needed for true policy optimization.

2/ 👤 Universal Human-to-Robot Mapping. Teleop often breaks when switching operators. OmniClone mitigates biases from hardware fluctuations and, crucially, diverse human body shapes, ensuring high-stability control regardless of the person in the suit.

3/ 🚀 System Optimizations for Whole-body Manipulation Policy. By optimizing for affordability and reproducibility, OmniClone provides the high-fidelity pipeline necessary to collect data and train humanoid whole-body policies at scale.

The model checkpoints and deployment code are now fully released; feel free to try them out! 📦

📄 Paper: https://t.co/kDm60WeuMD
🌐 Project: https://t.co/WGcfYridEs
💻 Code: https://t.co/U1QLgaipcd


R

rasbt

@rasbt

📅

Mar 17, 2026

3h ago

🆔71094774

⭐0.32

@SalajSonar1086 Already done :). The respective tutorial articles are linked via the “View in Article” links there


B

b_alastruey

@b_alastruey

📅

Mar 17, 2026

5h ago

🆔39491508

Happy to share 🌍Omnilingual Machine Translation🌍 In this work @AIatMeta we explore translation systems supporting 1,600+ languages. We show how our models (1B to 8B) can outperform baselines of up to 70B while having much larger language coverage. 📄:https://t.co/isvEzRZbnw https://t.co/8sdgkQuJ3B


🔁huggingface retweeted

G

Gabriele Berton

@gabriberton

📅

Mar 16, 2026

16h ago

🆔45334177

⭐0.34

VisMatch is on pypi! VisMatch is a wrapper for image matching models, like LightGlue, RoMa-v2, MASt3R, LoFTR, and 50+ more! It's literally as simple as:

    pip install vismatch
    vismatch-match --inputs img0 img1 --matcher choose_any

To run image matching on any 2 images [1/4] https://t.co/dIr2YapWak

❤️ 183 likes · 🔁 28 retweets

D

drmapavone

@drmapavone

📅

Mar 17, 2026

11h ago

🆔58775875

Jensen today announced Alpamayo 1.5 at #NVIDIAGTC! #Alpamayo 1.5 is a major update to Alpamayo 1—@nvidia’s open 10B-parameter chain-of-thought reasoning VLA model, first introduced at #CES. Built on the #Cosmos-Reason2 VLM backbone and post-trained with RL, it adds support for navigation guidance, flexible multi-camera setups, configurable camera parameters, and user question answering. The result is an interactive, steerable reasoning engine for the AV community. We’re also releasing post-training scripts to help researchers and developers adapt the model. Additionally, we’ve significantly expanded the Alpamayo open platform across data and simulation, including releasing highly requested reasoning labels for the PhysicalAI Autonomous Vehicles dataset (https://t.co/fD9eUcndya), as well as our chain-of-causation auto-labeling pipeline. 🔎 Learn more about Alpamayo 1.5 and the latest extensions to the Alpamayo open platform: https://t.co/P0nuqkwBab (please note that most of the links will become active in the next few days.) Happy building—and stay tuned for more in the coming months! @NVIDIADRIVE @NVIDIAAI


L

lucas_flatwhite

@lucas_flatwhite

📅

Mar 17, 2026

11h ago

🆔99053607

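🛠️ Claude Code "opusplan": literally a hybrid model, and it's official!

In Claude Code you can select the opusplan model:

> /model opusplan

It's a hybrid model alias that switches models automatically depending on the stage of the work. It uses Opus in plan mode for complex reasoning, then switches automatically to Sonnet for the execution stage!

Of course, you can also plan and implement entirely with Opus. But if you already have a solid plan, Sonnet can be enough for execution, and cheaper. Using the right model for each task = efficiency 🚀

Planning and execution demand different cognitive loads. Opus's deep reasoning shines most during planning, and once a solid plan is in place, Sonnet can cover the execution.

When should you use it?

- Tasks where architectural decisions matter, such as designing a complex feature
- Cases that need impact analysis, such as planning a refactor
- When you need to cut costs compared with running Opus at full power

I expect a lot of people will be using this from now on!!!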

@dani_avila7 • Mon Mar 16 15:58

Did you know about the opusplan model in Claude Code? /model opusplan It's a hybrid alias that automatically uses Opus in plan mode for complex reasoning, then switches to Sonnet for execution. Best of both worlds: Opus thinks, Sonnet builds https://t.co/r7un0X5bVg


G

gdb

@gdb

📅

Mar 17, 2026

11h ago

🆔37895367

⭐0.32

Subagents are now supported in Codex. They're very fun and make it possible to get large amounts of work done *quickly*:

@OpenAIDevs • Mon Mar 16 20:09

Subagents are now available in Codex. You can accelerate your workflow by spinning up specialized agents to: • Keep your main context window clean • Tackle different parts of a task in parallel • Steer individual agents as work unfolds https://t.co/QJC2ZYtYcA
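The two ideas in the announcement, working on parts of a task in parallel and keeping the main context clean, can be illustrated with a generic fan-out pattern. This is not the Codex API: `run_subagent` and `delegate` are hypothetical stand-ins, and the "subagent" is a plain function that keeps its working notes private and hands back only a short summary.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List


def run_subagent(task: str) -> str:
    """Hypothetical subagent: works in its own scratch context, returns a summary."""
    scratch = [f"step {i} of {task}" for i in range(100)]  # private working notes
    return f"{task}: done in {len(scratch)} steps"


def delegate(tasks: List[str]) -> List[str]:
    """Run one subagent per task in parallel; only summaries reach the caller."""
    with ThreadPoolExecutor(max_workers=max(1, len(tasks))) as pool:
        return list(pool.map(run_subagent, tasks))  # map preserves task order


main_context = delegate(["write tests", "update docs"])
print(main_context)
```

The caller ends up holding two short strings rather than two hundred scratch lines, which is the "clean main context" effect in miniature.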


🔁omarsar0 retweeted

O

elvis

@omarsar0

📅

Mar 16, 2026

1d ago

🆔09077648

⭐0.38

Banger report from the Kimi team: Attention Residuals

Residual connections made deep Transformers trainable. But they also force uncontrolled hidden-state growth with depth. This work proposes a cleaner alternative.

It introduces Attention Residuals, which replace fixed residual accumulation with softmax attention over previous layer outputs. Instead of blindly summing everything, each layer selectively retrieves the earlier representations it actually needs.

To keep this practical at scale, they add a blockwise version that compresses layers into block summaries, recovering most of the gains with minimal systems overhead.

Why does it matter? Residual paths have barely changed across modern LLMs, even though they govern how information moves through depth. This paper shows that making the mixing content-dependent improves scaling laws, matches a baseline trained with 1.25x more compute, boosts GPQA-Diamond by +7.5 and HumanEval by +3.1, while keeping inference overhead under 2%.

Paper: https://t.co/04IG6FDiVr

Learn to build effective AI agents in our academy: https://t.co/1e8RZKs4uX
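The contrast between fixed accumulation and attention over earlier layer outputs can be shown with a toy example. This is a sketch of the general idea only, not the paper's formulation: the post does not say how queries and keys are formed, so here the query is simply a given vector and each earlier output serves as both key and value. All function names are illustrative.

```python
import math
from typing import List

Vector = List[float]


def softmax(scores: List[float]) -> List[float]:
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a: Vector, b: Vector) -> float:
    return sum(x * y for x, y in zip(a, b))


def residual_sum(outputs: List[Vector]) -> Vector:
    """Standard residual stream: add every earlier output with weight 1."""
    return [sum(column) for column in zip(*outputs)]


def attention_residual(query: Vector, outputs: List[Vector]) -> Vector:
    """Mix earlier outputs with softmax weights that depend on the query.

    Assumption: outputs act as both keys and values.
    """
    weights = softmax([dot(query, o) for o in outputs])
    dim = len(outputs[0])
    return [sum(w * o[i] for w, o in zip(weights, outputs)) for i in range(dim)]


layer_outputs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
print(residual_sum(layer_outputs))                     # [2.0, 2.0]
print(attention_residual([0.0, 0.0], layer_outputs))   # uniform weights: the mean
print(attention_residual([10.0, 0.0], layer_outputs))  # favours outputs aligned with the query
```

Because the softmax weights sum to 1, the mixed vector is a convex combination of earlier outputs, so its magnitude stays bounded as layers are added, whereas the plain sum keeps growing.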

❤️ 130 likes · 🔁 17 retweets

🔁ivanleomk retweeted

Y

Yoeven

@yoeven

📅

Mar 17, 2026

14h ago

🆔65291100

The moment he realised that https://t.co/vWmBsnR1nt isn't fully built on transformers and we can run on a single GPU with high accuracy and lower cost https://t.co/ZJYuL62UB8

❤️ 4 likes · 🔁 1 retweet
