Your curated collection of saved posts and media


Showing 32 posts · last 7 days · newest first


emollick

@emollick

📅

Mar 17, 2026

2m ago

🆔96966360

⭐0.40

A knowledge-work platform built around GPT-5.4 Pro level intelligence would be really useful. The gap between other models and what Pro can do on complex intellectual work remains stark. I would love to have access in a Codex-like platform with shared file spaces, subagents, etc

🔁dair_ai retweeted

elvis

@omarsar0

📅

Mar 17, 2026

7m ago

🆔99826317

⭐0.32

Another great post on how to leverage agent skills. I already use 100+ skills. The hard part is understanding how to keep them relevant and optimized.

❤️5

likes

🔁1

retweets


omarsar0

@omarsar0

📅

Mar 17, 2026

7m ago

🆔99826317

⭐0.32

Another great post on how to leverage agent skills. I already use 100+ skills. The hard part is understanding how to keep them relevant and optimized.

@trq212 • Tue Mar 17 16:53

https://t.co/45C3gKydTK


_akhaliq

@_akhaliq

📅

Mar 17, 2026

18m ago

🆔77849425

Grounding World Simulation Models in a Real-World Metropolis paper: https://t.co/yGrI2F67ej https://t.co/S56BAql8Ka

🖼️ Media


_akhaliq

@_akhaliq

📅

Mar 17, 2026

20m ago

🆔28315608

The PokeAgent Challenge Competitive and Long-Context Learning at Scale paper: https://t.co/TrTvHiI3tC https://t.co/jhzZSPVj5Y

🖼️ Media

🔁jxnlco retweeted

Matteo Collina

@matteocollina

📅

Mar 17, 2026

29m ago

🆔24257441

⭐0.36

Now that I’m using both Claude and Codex daily, I’m seeing that Claude tries to “cheat” more often, e.g., by deleting a failing test. What’s your experience?

❤️15

likes

🔁1

retweets


OpenAI

@OpenAI

📅

Mar 17, 2026

25m ago

🆔24731072

GPT-5.4 mini is available today in ChatGPT, Codex, and the API. Optimized for coding, computer use, multimodal understanding, and subagents. And it’s 2x faster than GPT-5 mini. https://t.co/DKh2cC5S3F https://t.co/sirArgn37L

🖼️ Media


tri_dao

@tri_dao

📅

Mar 17, 2026

29m ago

🆔99349921

⭐0.44

There are a bunch of areas where inference-efficient architectures make a huge difference (e.g. RL training where 80% of the time is spent on large batch, long sequence rollout). Lots to do on both the algorithms and systems side to realize the potential benefits of these new architectures! Check out the threads from the students who led this project: https://t.co/7jG3beI9Sj https://t.co/i685cnZrZ7 https://t.co/X7pd2VQ6e3 https://t.co/rQ9DwKT6KS 10/10
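
One rough way to read the 80% figure, as a sketch under Amdahl's law rather than anything stated in the thread: suppose rollout takes a fraction p = 0.8 of RL training time and a more inference-efficient architecture speeds up only that part by a factor s (s is a hypothetical value, not a reported result). The end-to-end speedup S would then be:

```latex
\[
S(s) = \frac{1}{(1 - p) + \dfrac{p}{s}}, \qquad p = 0.8
\]
\[
S(2) = \frac{1}{0.2 + 0.4} \approx 1.67, \qquad
S(4) = \frac{1}{0.2 + 0.2} = 2.5, \qquad
\lim_{s \to \infty} S(s) = \frac{1}{0.2} = 5
\]
```

Under this assumption, even an unbounded rollout speedup caps the overall gain at 5x, which is why the remaining 20% (the training step itself) still matters.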


tri_dao

@tri_dao

📅

Mar 17, 2026

45m ago

🆔51394051

Now that inference throughput is what’s driving agents’ progress (hopefully you caught Jensen’s keynote 😀), we’ll continue to make Mamba stronger and faster. Some fun stuff in the pipeline: new algorithms and kernels to make Mamba forward 3-4x faster and backward 2x faster. Hopefully will be out in 1-2 months, as soon as I can convince Claude to finish all the implementations. My new strat is just whispering to Claude “make it faster…” over and over 9/10

🖼️ Media


_albertgu

@_albertgu

📅

Mar 17, 2026

45m ago

🆔39451045

The newest model in the Mamba series is finally here 🐍 Hybrid models have become increasingly popular, raising the importance of designing the next generation of linear models. We've introduced several SSM-centric ideas to significantly increase Mamba-2's modeling capabilities without compromising on speed. The resulting Mamba-3 model has noticeable performance gains over the most popular previous linear models (such as Mamba-2 and Gated DeltaNet) at all sizes. This is the first Mamba that was student led: all credit to @aakash_lahoti @kevinyli_ @_berlinchen @caitWW9, and of course @tri_dao!

🖼️ Media


ggerganov

@ggerganov

📅

Mar 17, 2026

48m ago

🆔25337477

With Nemotron 3 Nano 4B in the NVIDIA Nemotron 3 family, llama.cpp users get a compact model for action-taking conversational personas, available across NVIDIA GPU-enabled systems and @NVIDIA_AI_PC https://t.co/WS2BRzS5Aa

🖼️ Media


llama_index

@llama_index

📅

Mar 17, 2026

1h ago

🆔77023103

One of the hardest problems with document parsing is trust. How do you know the output actually corresponds to what's in the source? LlamaParse has visual grounding with bounding box citations for outputs, and it addresses exactly this. Two ways to use it:

1️⃣ In the UI: hover over any element in the markdown output and it highlights the exact region it came from in the original document. Great for spot-checking complex tables, multi-column layouts, or figures where parsing can be tricky.

2️⃣ In the JSON output: every parsed element carries bounding box coordinates, i.e. the precise location of that element within the source file.

That means you can build applications that don't just surface an answer, but can point back to exactly where in a document it came from. For due diligence, where auditability matters, this is a step up from "trust the output." You can verify it, cite it, and build on it. Sign up to LlamaParse to get started: https://t.co/yPVJzqoKal

🖼️ Media
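
To make the "point back to the source" idea in the post above concrete, here is a minimal sketch of looking up citations from parsed elements. The `ParsedElement` fields, the `(x, y, width, height)` box layout, and the sample values are all assumptions made for illustration; they are not LlamaParse's actual JSON schema, which should be checked against its documentation.

```python
from dataclasses import dataclass


@dataclass
class ParsedElement:
    """One parsed element. Field names are hypothetical, not LlamaParse's schema."""
    text: str
    page: int
    bbox: tuple[float, float, float, float]  # assumed layout: (x, y, width, height)


def cite(elements: list[ParsedElement], answer_snippet: str) -> list[dict]:
    """Return page and bounding box for every element whose text contains the snippet."""
    needle = answer_snippet.lower()
    return [
        {"page": el.page, "bbox": el.bbox, "text": el.text}
        for el in elements
        if needle in el.text.lower()
    ]


# Made-up sample data standing in for a parsed document.
elements = [
    ParsedElement("Total revenue: 4.2M USD", page=3, bbox=(72.0, 540.0, 210.0, 14.0)),
    ParsedElement("Operating costs: 1.1M USD", page=3, bbox=(72.0, 560.0, 220.0, 14.0)),
]

citations = cite(elements, "total revenue")
print(citations)
```

A real application would likely match on something sturdier than a substring (for example element IDs returned alongside the answer), but the shape is the same: every answer carries the page and region it came from.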

🔁Scobleizer retweeted

H

@hcompany_ai

📅

Mar 17, 2026

7h ago

🆔14320083

⭐0.34

🚀 Live from @NVIDIAGTC, we're releasing Holotron-12B! Developed with @nvidia, it's a high-throughput, open-source, multimodal model engineered specifically for the age of computer-use agents. Get started today! 🤗Hugging Face: https://t.co/oaSviLi8IN 📖Technical Deep Dive: https://t.co/pDItQB1frU 💼We are hiring: https://t.co/fcNoR9FIYQ #AI #ComputerUse #NVIDIA #OpenSource #ReinforcementLearning #HCompany #GTC @NVIDIAAI @nvidia

❤️44

likes

🔁11

retweets

🔁Scobleizer retweeted

Sara Hooker

@sarahookr

📅

Mar 17, 2026

3h ago

🆔66750608

⭐0.34

Two weeks after sharing @adaption_ai adaptive data, we are excited to share our work on blueprint 📘 blueprint steers data towards your goals, and learns penalties if AI violates any of your rules. Very proud of the team.

❤️37

likes

🔁5

retweets

🔁HamelHusain retweeted

Randy Olson

@randal_olson

📅

Mar 17, 2026

2h ago

🆔01984370

⭐0.36

Yes! The “are you sure?” problem (link below) is especially pervasive in any complex coding task. Ask Claude or GPT to review a PR then ask it to double check its findings when it finishes - it’ll flip on at least 1 of its findings. https://t.co/WUSuOxTuDy

🔁1

retweets


randal_olson

@randal_olson

📅

Mar 17, 2026

2h ago

🆔01984370

Yes! The “are you sure?” problem (link below) is especially pervasive in any complex coding task. Ask Claude or GPT to review a PR then ask it to double check its findings when it finishes - it’ll flip on at least 1 of its findings. https://t.co/WUSuOxTuDy

@HamelHusain • Tue Mar 17 14:55

One thing that makes me feel that code factory has not arrived yet is the following experiment: 1. Ask an LLM to do an in-depth, rigorous review of your code 2. In a new thread, ask the same/different LLM to consider those review comments independently and address issues it agrees with

🖼️ Media

🔁ylecun retweeted

Belen Alastruey

@b_alastruey

📅

Mar 17, 2026

2h ago

🆔03697001

⭐0.34

🔎A closer look at Omnilingual No Language Left Behind, the encoder-decoder system presented as part of @AIatMeta new Omnilingual Machine Translation work!🌍 Many say encoder-decoder is dead in the age of decoder-only LLMs but we show it’s not! 📄:https://t.co/isvEzRZbnw 🧵1/n https://t.co/RLs8ncUy0H

❤️16

likes

🔁7

retweets

🔁random_walker retweeted

Stephan Rabanser

@steverab

📅

Mar 17, 2026

3h ago

🆔50398178

⭐0.36

In our paper "Towards a Science of AI Agent Reliability" we put numbers on the capability-reliability gap. Now we're showing what's behind them! We conducted an extensive analysis of failures on GAIA across Claude Opus 4.5, Gemini 2.5 Pro, and GPT 5.4. Here's what we found ⬇️ https://t.co/GkdAxk0wDO

❤️8

likes

🔁2

retweets

🔁dair_ai retweeted

elvis

@omarsar0

📅

Mar 17, 2026

2h ago

🆔73572061

⭐0.36

Current vision-language models still struggle with simple diagrams. Feynman is a knowledge-infused diagramming agent that enumerates domain-specific concepts, plans visual representations, and translates them into declarative programs rendered by the Penrose diagramming system. Great insights for those building agents for diagrams and visualizations. One pipeline run produced 10,693 unique programs across math, CS, and science, each rendered into 10 layout variations, yielding over 106k well-aligned diagram-caption pairs. Paper: https://t.co/F4vNS0TII4 Learn to build effective AI agents in our academy: https://t.co/1e8RZKs4uX

❤️12

likes

🔁1

retweets
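
The pair count in the post above follows from its two figures, assuming each rendered layout contributes exactly one diagram-caption pair (the post does not spell this out):

```latex
\[
10{,}693 \ \text{programs} \times 10 \ \text{layouts per program}
  = 106{,}930 \ \text{diagram-caption pairs} \approx 106\text{k}
\]
```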


HamelHusain

@HamelHusain

📅

Mar 17, 2026

2h ago

🆔38886331

⭐0.42

One thing that makes me feel that code factory has not arrived yet is the following experiment:

1. Ask an LLM to do an in-depth, rigorous review of your code.
2. In a new thread, ask the same or a different LLM to consider those review comments independently and address the issues it agrees with.
3. Keep repeating until there are no new concerns.

I find that this loop always goes on for a ridiculously long time, which means that there is a problem with the notion of claude-take-the-wheel. This seems to happen no matter the harness or the specificity of the specs. It works fine for simple applications, but in the limit, if the LLMs have this much cognitive dissonance, you cannot trust them. Either this, or LLMs are RLHF'd to always find some kind of issue.
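
The experiment described above can be sketched as a loop with pluggable reviewer and fixer functions. This is an illustration under stated assumptions, not the author's harness: `review_until_quiet`, `stub_review`, and `stub_fix` are hypothetical names, the stubs stand in for LLM calls made in separate threads, and the `max_rounds` cap is added here so the loop always terminates.

```python
from typing import Callable

Reviewer = Callable[[str], list[str]]     # code -> list of review concerns
Fixer = Callable[[str, list[str]], str]   # code + concerns -> revised code


def review_until_quiet(
    code: str, review: Reviewer, fix: Fixer, max_rounds: int = 10
) -> tuple[str, int, bool]:
    """Alternate review and fix until a review raises no new concerns.

    Returns (final code, number of fix rounds applied, whether it converged).
    """
    seen: set[str] = set()
    for round_no in range(1, max_rounds + 1):
        new_concerns = [c for c in review(code) if c not in seen]
        if not new_concerns:
            return code, round_no - 1, True
        seen.update(new_concerns)
        code = fix(code, new_concerns)
    return code, max_rounds, False


def stub_review(code: str) -> list[str]:
    """Placeholder reviewer; a real one would call an LLM in a fresh thread."""
    concerns = []
    if "TODO" in code:
        concerns.append("resolve TODO")
    if "print(" in code:
        concerns.append("use logging instead of print")
    return concerns


def stub_fix(code: str, concerns: list[str]) -> str:
    """Placeholder fixer; a real one would address only the concerns it agrees with."""
    if "resolve TODO" in concerns:
        code = code.replace("# TODO", "# done")
    if "use logging instead of print" in concerns:
        code = code.replace("print(", "logging.info(")
    return code


final_code, rounds, converged = review_until_quiet(
    "print('hi')  # TODO", stub_review, stub_fix
)
print(final_code, rounds, converged)
```

With deterministic stubs the loop settles after one fix round. The observation in the post is that with real LLM reviewers the "no new concerns" condition is rarely reached, so in practice the `max_rounds` cap, not convergence, tends to end the loop.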


tkipf

@tkipf

📅

Mar 17, 2026

3h ago

🆔40516768

⭐0.36

This is one of the most impressive world model projects I have seen. Very elegant and highly effective combination of an image retrieval mechanism (using 3D locations/views) and otherwise just pure generative modeling. This is the way.

@jyseo_cv • Tue Mar 17 02:59

What if a world model could render not an imagined place, but the actual city? We introduce Seoul World Model, the first world simulation model grounded in a real-world metropolis. TL;DR: We made a world model RAG over millions of street-views. proj: https://t.co/Bx4KUAqrRs ht

🔁jxnlco retweeted

Simon Willison

@simonw

📅

Mar 17, 2026

4h ago

🆔68704943

⭐0.34

... and a follow-up chapter about Subagents, now a feature of Codex and Claude Code and Gemini CLI and Mistral Vibe and OpenCode and VS Code and Cursor https://t.co/suGmK4g3Hp

❤️17

likes

🔁3

retweets

🔁ivanleomk retweeted

agrim singh

@agrimsingh

📅

Mar 17, 2026

3h ago

🆔08839930

⭐0.38

i dj'd a set this weekend and planned half of it with an app i built on @GoogleDeepMind's new multimodal embeddings model. it understands what music actually sounds like - not bpm, not genre tags. raw audio in, vibes out. built with @cursor_ai + @openai gpt 5.4, gemini embeddings and @convex to keep everything running smoothly @ericzakariasson @DynamicWebPaige @waynesutton @gabrielchua here's what it does (demo below) 🧵

❤️6

likes

🔁3

retweets


NielsRogge

@NielsRogge

📅

Mar 17, 2026

3h ago

🆔69938152

Tried the viral poster-skill with Claude Code on the trending Moonshot paper :) Not too bad! https://t.co/I8lb0aUrbT

@Kimi_Moonshot • Mon Mar 16 03:03

Introducing 𝑨𝒕𝒕𝒆𝒏𝒕𝒊𝒐𝒏 𝑹𝒆𝒔𝒊𝒅𝒖𝒂𝒍𝒔: Rethinking depth-wise aggregation. Residual connections have long relied on fixed, uniform accumulation. Inspired by the duality of time and depth, we introduce Attention Residuals, replacing standard depth-wise recurrence with learned, input-dep

🖼️ Media


sarahookr

@sarahookr

📅

Mar 17, 2026

3h ago

🆔66750608

⭐0.36

Two weeks after sharing @adaption_ai adaptive data, we are excited to share our work on blueprint 📘 blueprint steers data towards your goals, and learns penalties if AI violates any of your rules. Very proud of the team.

@adaption_ai • Tue Mar 17 12:00

Introducing Blueprint, a new capability within Adaptive Data. We firmly believe data that evolves with the world is only useful if it evolves the right way. Blueprint allows you to steer the data space towards any goal you want. https://t.co/8k0WEMYmdd

🔁Modular retweeted

Forward Future

@ForwardFuture

📅

Mar 13, 2026

3d ago

🆔14582366

⭐0.32

“Everyone should be a GPU programmer.” @clattner_llvm's goal with @Modular: “What Modular is doing is opening up the box. We’re fixing the language problem and the platform problem.” “The goal is to let more developers learn modern compute. And to give developers real choice in the hardware they use.” “Those two things unlock the ecosystem.”

❤️6

likes

🔁1

retweets

