This is a nice essay, and I agree with its characterization of AI as Normal Technology. In fact, on the second line of AINT, we compare AI's potential impact to that of the internet or electricity. But there's a large, qualitative difference between even the most powerful general-purpose technologies which humans can and should influence/control, and creating an omnipotent entity that we have no control over. This was precisely the gap we wanted to highlight.
I spent a weekend at Stanford recently, which is where, in 2023, I did much of my formative thinking on AI. The Anthropic-DoW affair tested that early intellectual foundation more than anything, so I found myself walking around Stanford, reflecting on what I learned in 2023. https:
Introducing APEX-SWE, in collaboration with @Cognition. They see firsthand that real software engineering is not just writing code anymore. It's deploying systems, integrating with tools and debugging when things break. On APEX-SWE, every model fails to reliably solve the real production software engineering tasks. @OpenAI GPT-5.3 Codex (High) tops the leaderboard at 41.5% on Pass@1, followed by @AnthropicAI Opus 4.6 (High) at 40.5%. Every frontier model fails on nearly 60% of real production tasks.
Can agents replace software engineers? Not according to this new benchmark. Mercor and Cognition released APEX-SWE. It tests AI coding agents on real engineering work.
> GPT-5.3 Codex leads at 41.5%.
> Claude Opus 4.6 follows at 40.5%.
Nothing crosses the 50% mark.
Why? Old benchmarks are basically solved: HumanEval scores jumped from 67% to 90% in two years. OpenAI flagged SWE-bench as contaminated; models were memorizing the answers. And those benchmarks never reflected the job in the first place: they only measured code writing. Developers spend 16% of their time on that. The other 84% is debugging, infrastructure, and integration. This benchmark tests the 84%.
200 tasks split into two types:
1. Integration: build systems across live databases, APIs, and cloud services in Docker containers
2. Observability: find and fix real bugs using logs, dashboards, and chat history
Each task drops an agent into a live environment. Real services, real credentials, and project boards with filler issues mixed in.
50 tasks are open-source on Hugging Face. The eval harness is on GitHub. You can run it yourself.
AI writes half the code at big companies. 90% of developers use AI assistants. All of that covers 16% of the job.
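Since 50 of the tasks are open-sourced on Hugging Face, a quick way to eyeball the split is to load them with the `datasets` library. A minimal sketch follows; the dataset id, split name, and `category` field are guesses for illustration, not the published schema, so check the actual dataset card and the GitHub harness for the real names.

```python
# Hedged sketch: inspecting the open-source APEX-SWE tasks from Hugging Face.
# ASSUMPTIONS: the dataset id "mercor/apex-swe", the "test" split, and the
# "category" field are all illustrative guesses, not the published schema.
from collections import Counter

from datasets import load_dataset

tasks = load_dataset("mercor/apex-swe", split="test")  # assumed id and split

# Tally the two task families the announcement describes:
# integration builds vs. observability debugging.
print(Counter(task["category"] for task in tasks))  # assumed field name
```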
Lisa Kudrow doing a perfect Parker Posey impression lol https://t.co/LXtYdFNxj5
most AI apps still don't use the full multimodal stack. vision, audio, real-time processing. all largely untapped. we got you access to the latest @GoogleDeepMind models. if you've been wanting to build multimodal agents, this one's for you.
$45K+ in prizes (up to $25K in credits alone)
saturday march 28 · san francisco
apply below
New on the Engineering Blog: How we designed Claude Code auto mode. Many Claude Code users let Claude work without permission prompts. Auto mode is a safer middle ground: we built and tested classifiers that make approval decisions instead. Read more: https://t.co/dpcMcWMf5k
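The post doesn't reveal how Anthropic's classifiers actually work. As a rough illustration of the general idea (an automatic gate decides per tool call whether to run it or escalate to the user), here is a toy stand-in that swaps the learned classifier for hand-written regex rules; every pattern here is invented for the example.

```python
# Toy illustration ONLY: a rule-based stand-in for an "auto mode" approval
# gate. The blog post describes learned classifiers; this swaps them for
# hand-written patterns, all invented for the example.
import re

RISKY_PATTERNS = [
    r"\brm\s+-rf\b",            # destructive recursive deletion
    r"\bgit\s+push\s+--force",  # history rewrites on shared branches
    r"curl\b.*\|\s*(ba)?sh\b",  # piping remote scripts into a shell
]

def auto_approve(command: str) -> bool:
    """True: run without prompting. False: fall back to a permission prompt."""
    return not any(re.search(p, command) for p in RISKY_PATTERNS)

assert auto_approve("ls -la src/")
assert not auto_approve("rm -rf /tmp/build")
```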
LeWorldModel: Yann LeCun's Radical Simplification of World Models Just Made Physics-Aware AI Practical

In the race for artificial general intelligence, two paths have emerged. One is the familiar scale-everything route: bigger LLMs trained on ever-larger text corpora. The other, championed for years by Yann LeCun, is building world models: compact systems that learn the underlying physics of reality directly from raw sensory data (pixels) so AI can plan, predict, and act in the physical world the way a robot or self-driving car actually would.

Until now, the second path has been frustratingly difficult. Joint-Embedding Predictive Architectures (JEPAs), LeCun's elegant framework for learning predictive representations without reconstructing every pixel, kept collapsing during training. Researchers had to resort to a laundry list of hacks: multi-term loss functions (up to six hyperparameters), frozen pre-trained encoders, stop-gradients, exponential moving averages, and other duct-tape tricks just to keep the model from mapping every input to the same useless output.

LeCun's team (Mila, NYU, Samsung SAIL, and Brown University) dropped a bombshell: LeWorldModel (LeWM), the first JEPA that trains stably end-to-end from raw pixels using only two loss terms. No more house-of-cards engineering. Just a clean, simple recipe that works on a single GPU in a few hours with only 15 million parameters.

The Core Breakthrough: SIGReg Saves the Day
LeWorldModel's secret weapon is a new regularizer called SIGReg (for spherical isotropic Gaussian regularizer). It enforces a simple Gaussian distribution on the latent embeddings. This single term prevents representation collapse without any of the previous heuristics. The training objective now has just two parts:
1. Next-embedding prediction loss: the model predicts what the next latent state should be.
2. SIGReg: keeps the latent space well-behaved and diverse.
That's it. Hyperparameters drop from six to one. Training becomes stable, reproducible, and dramatically cheaper. The model learns directly from raw video frames (no pre-trained vision encoders needed) and produces a compact latent world model that can be used for fast planning.

Impressive Results on Real Benchmarks
Despite its tiny size, LeWorldModel punches way above its weight:
- Trains on a single GPU in a few hours.
- Plans actions up to 48 times faster than foundation-model-based world models.
- Uses roughly 200 times fewer tokens than alternatives.
- Matches or beats far larger models on diverse 2D and 3D control tasks (e.g., manipulation, navigation).
- Its latent space encodes meaningful physical quantities (position, velocity, etc.), proven by direct probing.
- It reliably detects physically implausible surprise events, showing genuine causal understanding.
Crucially, adding a decoder and reconstruction loss hurts performance on downstream control tasks. The pure JEPA objective already captures everything needed for planning; extra visual details just get in the way.

Project website: https://t.co/KhGR9LiIQZ
Official code: https://t.co/s1lI9kevJS

Why This Matters for the Future of AI
LeCun has been saying since 2022 that world models (not next-token predictors) are the key to real intelligence. Critics always pointed to the training instability. LeWorldModel removes that objection with elegant simplicity. This is a philosophical reset: AI can learn physics the way babies do, by watching the world unfold, without needing supercomputers or endless text.
The implications for robotics, autonomous vehicles, and embodied agents are enormous. Suddenly, building a physically grounded planner is something a researcher (or even a hobbyist) can do on consumer hardware. 1 of 2
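The thread's headline claim, two loss terms and a single hyperparameter, is concrete enough to sketch. Below is a minimal, hedged PyTorch illustration of that shape: the encoder/predictor sizes and lambda value are arbitrary, and the SIGReg term is approximated by crude moment matching (zero mean, identity covariance), which is only a stand-in for whatever estimator the paper actually uses.

```python
# Minimal sketch of a two-term JEPA-style objective as the thread describes it:
# (1) next-embedding prediction, (2) a Gaussian-enforcing regularizer.
# The sigreg_stand_in below is crude moment matching and may differ from the
# paper's actual SIGReg estimator; all sizes are arbitrary.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyJEPA(nn.Module):
    def __init__(self, obs_dim: int, latent_dim: int = 32):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(obs_dim, 64), nn.ReLU(), nn.Linear(64, latent_dim))
        self.predictor = nn.Sequential(
            nn.Linear(latent_dim, 64), nn.ReLU(), nn.Linear(64, latent_dim))

def sigreg_stand_in(z: torch.Tensor) -> torch.Tensor:
    """Push batch embeddings toward an isotropic Gaussian (stand-in only)."""
    centered = z - z.mean(dim=0)
    cov = centered.T @ centered / (z.shape[0] - 1)
    return z.mean(dim=0).pow(2).sum() + (cov - torch.eye(z.shape[1])).pow(2).sum()

model = TinyJEPA(obs_dim=128)
obs_t, obs_next = torch.randn(256, 128), torch.randn(256, 128)
z_t, z_next = model.encoder(obs_t), model.encoder(obs_next)

# Two terms, one hyperparameter. Note: no stop-gradient or EMA target here,
# matching the thread's claim that the regularizer alone prevents collapse.
lam = 0.1  # the single remaining hyperparameter (illustrative value)
loss = F.mse_loss(model.predictor(z_t), z_next) + lam * sigreg_stand_in(z_next)
loss.backward()
```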
BREAKING: @ivanleomk is joining Google DeepMind https://t.co/Fu365QjYDk
Chicago Mayor Brandon Johnson: "We cannot put people in jail anymore. It's racist." I can't believe this is real https://t.co/0C6hTaJA8g
We have been heads down but wanted to share a bit about what we are doing 🧵 https://t.co/ICZlLwZ3Fq
We are excited to welcome @OpenAI to the AIE Expo for the first time as Platinum sponsors for AIE EU! OAI has shipped SO much for AI Engineers this year alone, and this is the best place to catch up:
- Meet the team at the Ask OpenAI lounge (bring your hardest tasks and best questions!)
- Hear keynotes from @steipete and @lopopolo
- Get hands-on with in-depth Codex workshops from @kagigz and @reach_vb!
See you April 8-10 in London, AI Engineers! @OpenAIDevs
WebGPU is INSANE! 🤯 Here's a 24B parameter model running locally in a web browser, at a blazing ~50 tokens/second on my M4 Max. ⚡️ It's the largest model we've ever run with Transformers.js... and we're not stopping here. Big announcement soon. https://t.co/4emPjY89ba
We've published a blog post on the recent Resolv USR incident. It covers what happened, why offchain trust assumptions matter, and the design principles Ethena adopted from inception to prevent similar outcomes. Read it below. https://t.co/F1Hto8n1E9
What is Hypernative? https://t.co/Dy93zLHpYX
A tragedy is unfolding at AllenAI. Fs in the chat https://t.co/yPwS7uiE1t
this is a huge deal and a sign of the changing legal tides for big tech. the plaintiffs' attorneys here were early adopters of a novel legal strategy that uses product liability law to sidestep tech firms' go-to defense (section 230) & hold them accountable for negligent design https://t.co/FpmRddkSIt
Nice cheat sheet for Claude Code. https://t.co/ikGzbSqjRK
https://t.co/BIKKach0bj
yeah, @sainingxie is one of the few who really get it. he sees through the loud hype. he knows we're not there yet, and he agrees with sutton's recent and highly controversial take on LLMs being anti-bitter-pilled. the way he explains it is delightfully simple as well. essentially, just because you smuggled in human knowledge via decades of crowdsourced human annotations pertaining to all interesting aspects of reality doesn't mean that you can naively advertise the product as scalable in the sense that sutton originally envisioned. LLMs rely very heavily on human knowledge to bootstrap their impressive competence, and once you take that away, the intelligence that remains is severely lacking. tl;dr: just because it wasn't you who annotated reality by hand for a few decades doesn't mean that your model requires no human annotations.
https://t.co/IECg0FK5YI
Today, we're announcing general availability for Street View Insights, our first Imagery Insights dataset.
- Access Street View's vast image repository
- Prompt Gemini from Vertex AI Studio
- Connect directly to BigQuery
Get started: https://t.co/P8776NY96m https://t.co/bNTGY2u50J
Meet DiT4DiT, the FIRST video generation architecture for humanoid robot control. By treating video generation as a world model, we give robots real "physical intuition."
The Results:
- >10x better sample efficiency & up to 7x faster convergence!
- SOTA on LIBERO (98.6%) & RoboCasa-GR1 (50.8%).
- Zero-shot generalization on the Unitree G1 humanoid using just monocular vision (1x speed, fully autonomous).
How it works: We couple a Video DiT with an Action DiT via a dual flow-matching objective. Instead of relying on fully reconstructed future frames, we extract "intermediate denoising features" to guide action prediction: simple but highly effective!
Check out the paper, real-world videos, and project page here: https://t.co/Ml0AA8PKqA #EmbodiedAI #Robotics #MachineLearning #WorldModels
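As a rough sketch of the coupling idea in the post, the toy code below taps features from an intermediate block of a stand-in video backbone and conditions a stand-in action head on them. All module names, shapes, and the tap point are placeholders, and the dual flow-matching objectives are omitted entirely; this is not the DiT4DiT architecture.

```python
# Rough sketch only: condition an action head on intermediate denoising
# features from a video backbone, rather than on fully reconstructed frames.
# Every module, shape, and the tap point here is a placeholder.
import torch
import torch.nn as nn

class VideoBackboneStub(nn.Module):
    """Stand-in for a Video DiT; returns intermediate features, not frames."""
    def __init__(self, dim: int = 256):
        super().__init__()
        self.blocks = nn.ModuleList(
            [nn.TransformerEncoderLayer(dim, nhead=4, batch_first=True)
             for _ in range(4)])

    def forward(self, noisy_latents: torch.Tensor, tap_block: int = 2) -> torch.Tensor:
        h = noisy_latents
        for i, block in enumerate(self.blocks):
            h = block(h)
            if i == tap_block:
                return h  # "intermediate denoising features," per the post
        return h

class ActionHeadStub(nn.Module):
    """Stand-in for the Action DiT: maps tapped features to an action vector."""
    def __init__(self, dim: int = 256, action_dim: int = 7):
        super().__init__()
        self.proj = nn.Sequential(nn.Linear(dim, 128), nn.ReLU(), nn.Linear(128, action_dim))

    def forward(self, features: torch.Tensor) -> torch.Tensor:
        return self.proj(features.mean(dim=1))  # pool over tokens

video_model, action_head = VideoBackboneStub(), ActionHeadStub()
noisy_latents = torch.randn(2, 16, 256)  # (batch, tokens, dim) placeholder
features = video_model(noisy_latents)    # no full frame reconstruction needed
actions = action_head(features)
print(actions.shape)  # torch.Size([2, 7])
```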
What happens when you combine Gemini Live with Lyria 3? You get a super cool AI DJ that fulfills all your song wishes. Lyria 3 is now available on the Gemini API! https://t.co/gczY3o2DBi
Great to see @xyflowdev joining the llms.txt party - and sharing some really helpful examples of how llms.txt helps agents get better results! https://t.co/gtmo0iiEJd https://t.co/6R1vBQO8OQ
@yoavgo @deliprao @NeurIPSConf Well, for one thing, OFAC disagrees: https://t.co/QGsx1IPVni
Ego2Web: A Web Agent Benchmark Grounded in Egocentric Videos
Paper: https://t.co/4kYE2HEyPA https://t.co/XJh97rmnOJ
Life update
I've moved to San Francisco and joined @GoogleDeepMind. Excited to work alongside @OfficialLoganK @thorwebdev @DynamicWebPaige @_philschmid @patloeber @ammaar @vadiamit @osanseviero @harrisonfjobe @goodside @alihcevik @matthewridenour https://t.co/HBiaH3yKii
https://t.co/96CWVKcudA
Question for the group: has there been any great art about Covid? Any incredible literary novels or films? I can't think of anything off the top of my head but my cultural knowledge is not limitless.
Napoleon on procrastination. Written in 1793. Still punching you in the face in 2026. https://t.co/YA6DVWXw4C