O

elvis

@omarsar0

📅

Sun

🆔46421333

265 pages of everything you need to know about building AI agents. 5 things that stood out to me about this report: https://t.co/7LKSGvCHlj

❤️2,429

likes

🔁428

retweets

🖼️ Media

View Details View on X ↗

L

Lior⚡

@LiorOnAI

📅

Sun

🆔86781567

Yann LeCun: "We're never going to get to human level AI by just training on text" https://t.co/ucDtggCwLO

❤️697

likes

🔁89

retweets

🖼️ Media

View Details View on X ↗

C

Chris Albon

@chrisalbon

📅

Sun

🆔27102814

Wow. Researchers secretly used an AI on Reddit’s r/changemyview/ as part of an unauthorized experiment https://t.co/QKpZNI9Q3Z

❤️224

likes

🔁23

retweets

🖼️ Media

View Details View on X ↗

E

Eugene Yan

@eugeneyan

📅

Sun

🆔28830081

The Art of Doing Science and Engineering: Learning to Learn by Richard Hamming only $1.99 for the Kindle version today: https://t.co/VMmyoj1Z61 https://t.co/JOzhXdO4J5

❤️52

likes

🔁5

retweets

🖼️ Media

View Details View on X ↗

G

Lucas Beyer (bl16)

@giffmana

📅

Sat

🆔62057867

With first Claude and now Gemini playing Pokemon, I was thinking of doing my own game-playing experiment over the weekend. However, I quickly learned that it's very far from the VLA-style "pixels->plan" that I naively thought it was, and wanted to do myself. It's like 90%… https://t.co/5cVrbmYArc

❤️1,062

likes

🔁90

retweets

🖼️ Media

View Details View on X ↗

R

j⧉nus

@repligate

📅

Sat

🆔53349006

It gets worse. Following the rotating 'spiral' incident, a concerning reply appeared from CLAUDE itself to the simulated post: Translating the binary yielded: When I attempted to view more replies, the entire simulated Twitter staging site transformed into an very different… https://t.co/DpE5JcSBOZ

❤️210

likes

🔁25

retweets

🖼️ Media

View Details View on X ↗

I

Tanishq Mathew Abraham, Ph.D.

@iScienceLuvr

📅

Sun

🆔99388394

GPT-4o these days https://t.co/5p4FJXpJQ1

❤️76

likes

🔁6

retweets

🖼️ Media

View Details View on X ↗

_

frye

@___frye

📅

Sun

🆔95417159

this seems pretty bad actually https://t.co/JGbmmyblqh

❤️29,044

likes

🔁863

retweets

🖼️ Media

View Details View on X ↗

Z

Zhao Tianyu

@ZhaoTing1024

📅

Mon

🆔80497416

Official announcement: Qwen 3 this week. Reasoning and non-reasoning in one. https://t.co/4xPTnfRvON

❤️793

likes

🔁88

retweets

🖼️ Media

View Details View on X ↗

S

Sebastian Ruder @ ACL

@seb_ruder

📅

Mon

🆔41239173

The Sparse Frontier Efficient sparse attention methods are key to scale LLMs to long contexts. We conduct the largest-scale empirical analysis that answers: 1. 🤏🔍 Are small dense models or large sparse models better? 2. ♾️What is the maximum permissible sparsity per task? 3.… https://t.co/prWfrljmzQ

❤️187

likes

🔁30

retweets

🖼️ Media

View Details View on X ↗

T

Teknium (e/λ)

@Teknium1

📅

Mon

🆔48312386

The new RLHF gave chatgpt a new job https://t.co/HJDsKKXuDg

❤️186

likes

🔁2

retweets

🖼️ Media

View Details View on X ↗

E

Ethan Mollick

@emollick

📅

Mon

🆔24411544

For most people in most circumstances, developing the habit of asking o3 or Gemini 2.5 about anything that confuses or intrigues you is often good Specific curiosity (seeking out answers for things we don't know and following up on learnings) is a good trait & the models do well https://t.co/yqSJ8FimZ8

❤️694

likes

🔁54

retweets

🖼️ Media

View Details View on X ↗

Z

Zack Witten

@zswitten

📅

Mon

🆔84843426

Here's a (memory-free) convo with GPT 4o to make this more concrete https://t.co/7Vmq4JI3rp

❤️3,837

likes

🔁185

retweets

🖼️ Media

View Details View on X ↗

I

Tanishq Mathew Abraham, Ph.D.

@iScienceLuvr

📅

Mon

🆔12645186

HRScene: How Far Are VLMs from Effective High-Resolution Image Understanding? "we introduce HRScene, a novel unified benchmark for HRI understanding with rich scenes. HRScene incorporates 25 real-world datasets and 2 synthetic diagnostic datasets with resolutions ranging from… https://t.co/0o3bS1hceR

❤️35

likes

🔁11

retweets

🖼️ Media

View Details View on X ↗

E

Ethan Mollick

@emollick

📅

Mon

🆔61193530

"o3, You are a consultant hired by the Dark Lord, analyze the org chart of Mordor. How would you improve it for today's changing Middle Earth" o3 does some actual humor: “One Org to rule them all, One Org to find them, One Org to bring them all, And in the darkness, align them.” https://t.co/JxyaPGySeA

+1 more

❤️896

likes

🔁77

retweets

🖼️ Media

View Details View on X ↗

C

Chip Huyen

@chipro

📅

Sat

🆔29051557

I think I might've accidentally jailbreak-ed Claude. I asked Claude to craft a conversation between two characters for a story I'm writing and they suddenly started reciting Claude's system instructions. https://t.co/fmbilFhR7n

+2 more

❤️768

likes

🔁64

retweets

🖼️ Media

View Details View on X ↗

E

Ethan Mollick

@emollick

📅

Sat

🆔10111468

Controlled trials keep finding that there are real benefits for doctors using LLMs to explain upcoming procedures & get informed consent. In this study, patients asking questions of ChatGPT-4 had lower levels of anxiety. (Doctor's vetted the answers, which were all "excellent") https://t.co/Vo5ROQJNt4

+1 more

❤️424

likes

🔁54

retweets

🖼️ Media

View Details View on X ↗

S

Shreya Shankar

@sh_reya

📅

Sat

🆔71776572

experimenting with meme driven marketing 🤣 join our course so you won’t be the person on the left! https://t.co/IafuxymqvR

❤️47

likes

🔁5

retweets

🖼️ Media

View Details View on X ↗

E

Ethan Mollick

@emollick

📅

Sat

🆔86196440

I don’t mean to be a broken record but AI development could stop at the o3/Gemini 2.5 level and we would have a decade of major changes across entire professions & industries (medicine, law, education, coding…) as we figure out how to actually use it. AI disruption is baked in. https://t.co/sLab2kczZx

+2 more

❤️3,178

likes

🔁384

retweets

🖼️ Media

View Details View on X ↗

I

Tanishq Mathew Abraham, Ph.D.

@iScienceLuvr

📅

Sat

🆔68243630

o3's combined reasoning and tool use abilities enable it to be excellent at geoguessing be careful what pics you share online! (example from @simonw's recent blog post) https://t.co/RysJy5ilWI

❤️25

likes

🖼️ Media

View Details View on X ↗

E

Ethan Mollick

@emollick

📅

Sun

🆔93508264

On one hand the new GPT-4o isn’t doing as many emojis. On the other, it is slowly driving me insane by responding to everything like an overly enthusiastic 1990s teenager. https://t.co/OEK0HhZmH2

+1 more

❤️709

likes

🔁18

retweets

🖼️ Media

View Details View on X ↗

D

🇺🇦 Dzmitry Bahdanau

@DBahdanau

📅

Sat

🆔92652746

I am excited to open-source PipelineRL - a scalable async RL implementation with in-flight weight updates. Why wait until your bored GPUs finish all sequences? Just update the weights and continue inference! Code: https://t.co/AgEyxXb7Xi Blog: https://t.co/n4FRxiEcrr https://t.co/DkjI9Snz9g

❤️509

likes

🔁114

retweets

🖼️ Media

View Details View on X ↗

I

Ivan Leo

@ivanleomk

📅

Sun

🆔31874589

haha 20k words in prob the last month https://t.co/8ax4Y4uYsN

❤️4

likes

🖼️ Media

View Details View on X ↗

T

Teknium (e/λ)

@Teknium1

📅

Sun

🆔14950687

Okay come on.. lmao https://t.co/L4d3sfiSi0

❤️382

likes

🔁8

retweets

🖼️ Media

View Details View on X ↗

L

Lior⚡

@LiorOnAI

📅

Sat

🆔28643541

No more needle in a haystack. LLMs now retrieve over 1M tokens—no RAG, no retraining. Infinite Retrieval introduces InfiniRetri, a method that uses transformer attention to handle infinite-length contexts natively. No toolkits. No memory modules. https://t.co/JTxlJweASn

❤️1,070

likes

🔁133

retweets

🖼️ Media

View Details View on X ↗

E

Ethan Mollick

@emollick

📅

Sun

🆔53135140

https://t.co/uktfew0HhF

❤️1,015

likes

🔁57

retweets

🖼️ Media

View Details View on X ↗

P

Paweł Huryn

@PawelHuryn

📅

Sun

🆔05401291

AI Evals are the most important skill for AI PMs. But there are many misconceptions. A complete guide on AI Evals: 🧵 https://t.co/H7xOGPzzUH

❤️396

likes

🔁42

retweets

🖼️ Media

View Details View on X ↗

L

LlamaIndex 🦙

@llama_index

📅

Fri

🆔96403125

There has been a lot of productive dialogue recently about what agents are, and the best way to build them: @AnthropicAI published Building Effective Agents, @dexhorthy went viral with his 12 Factor Agents, and @OpenAI released A Practical Guide To Building Agents. We've been… https://t.co/AfgH5KeRPJ

❤️33

likes

🔁4

retweets

🖼️ Media

View Details View on X ↗

J

Jerry Liu

@jerryjliu0

📅

Fri

🆔66039581

There’s been a lot of discussions on the best way to build agents: - @OpenAI’s general take is that increased model capabilities simplify the SDK - Others (incl. @Anthropic’s agent pattern guide + @dexhorthy) generally outline a more constrained approach. We’ve been asked by a… https://t.co/SWyMBDIFKb

❤️151

likes

🔁21

retweets

🖼️ Media

View Details View on X ↗

G

Gary Marcus

@GaryMarcus

📅

Fri

🆔36105436

God I wish @elonmusk would have taken this bet. https://t.co/WgzsDG5HwK

❤️1,584

likes

🔁84

retweets

🖼️ Media

View Details View on X ↗

P

Joscha Bach

@Plinz

📅

Fri

🆔71247927

As a Gary Marcus, I would be very careful about what I say about LLMs and AGI in 2025 https://t.co/EMmbkoLaC3

❤️112

likes

🔁2

retweets

🖼️ Media

View Details View on X ↗

R

Marius

@rasmus1610

📅

Tue Apr 22

🆔01301739

Building LLM apps feels different, right? Their opaque nature makes predicting outputs and behavior really tough. 🤔 So how do you move beyond guesswork and actually improve your LLM-based products reliably? https://t.co/gx2oL40GLM

❤️27

likes

🔁9

retweets

🖼️ Media

View Details View on X ↗