Your curated collection of saved posts and media
Memory-R1: Another really cool paper showing how RL can enhance an LLM's agentic and memory capabilities. Great read for AI devs. Here are my notes: https://t.co/aC1hnQerad
In an essay a year ago by me and @sayashk, we observed that more capable models don't necessarily mean more useful products, and that real-world usefulness will require training on user data. In retrospect, I'm surprised Anthropic held out for this long. https://t.co/Bd8ySxgHvF https://t.co/8ThNMyVlKQ
We're updating our consumer terms and privacy policy to help us deliver even more capable, useful AI models. Our users will now be given the choice to allow their data to be used to improve Claude.
Trevis Williams's wrongful arrest is front page today. NYPD messed up mightily, but this isn't a tale of human error or tech that doesn't work. It's a tale about a bad process, a FIXABLE process: Police need more evidence than facial recognition to put a suspect in a lineup. https://t.co/aHpWRuaoZN
Honored to be included in the TIME AI 2025 list. What I'm most glad about is that they highlighted the part of Surge's work that actually matters. ... https://t.co/x1imZR8p3q
Microsoft now has their own foundation model, MAI-1, trained on a relatively small amount of compute and with a pretty modest LM Arena score. I'll be curious to see if they can catch up to the leaders, which has been getting hard to do, but we will see! https://t.co/iblFE3H4qJ
Asking Claude to read over a post I am working on and find errors... (Asking it to use web search solves the problem, of course) https://t.co/ZUpOnoBsL5
I wrote about the era of Mass Intelligence. GPT-5 and Google's Nano Banana are examples of how advanced AI is now making its way to far more users, at scale, as both performance and efficiency keep improving. We are going to see a lot of weird things happening, all at once. https://t.co/VW1syLGI2w
A really useful prompt for writing: "review this for accuracy, look up any facts you may want to challenge or explore." Even if not perfect, it is a good sanity check. Works well with Claude 4.1, GPT-5 Thinking, and Grok 4. Weirdly, Gemini 2.5 Pro often won't do web searches. https://t.co/oQ5a2NRr7R
Ha. The fact that this works is just great. https://t.co/JZc6HwoVqE

Microsoft launches its first in-house AI models https://t.co/gyc75m5pEn
Incredible things you can do with Nano Banana! One simple image offers infinite possibilities. Here's a sports app I'd like to have. Animated with Kling 2.1 start and end frame https://t.co/E9MjORfZ8H
"A computer at midnight, should not go unused." - Steve Wozniak, cofounder of Apple Computer. My most important historical video: https://t.co/vsuFlRSv3M
Our humanoid robot can now rally over 100 consecutive shots against a human in real table tennis: fully autonomous, sub-second reaction, human-like strikes. https://t.co/6DvqhilAjk
Meet Google Vids: the AI-powered video creation app for work. Our new instructional series, Vids on Vids, starts with how to access Vids, and takes you on a tour of the editor. Watch now! https://t.co/RooR8YOUZo https://t.co/OJ1khghlw1
Tesla launches the new Model Y Performance in Europe! Specs: 580 km / 360 mi WLTP range [+66 km / +41 mi]; 0-100 km/h in 3.5s / 0-60 mph in 3.3s; 250 km/h / 155 mph max speed; New Performance Powertrain; Adaptive Suspension and Unique Drive Modes. Exterior: Sporty front bumper; Sporty rear bumper; New 21" Arachnid 2.0 wheels; Pirelli P-Zero tires; Carbon fiber rear spoiler; Performance badge; Red brake calipers. Interior: First-Row Performance Sport Seats With Thigh Extension, Heating and Ventilation (and insert); Carbon fiber interior trim; 16" QHD touchscreen; Grey headliner. Starting at €61,990 / £61,990. Deliveries begin in September.

Understanding Tool-Integrated Reasoning https://t.co/ARJDnv5Lce
discuss with author: https://t.co/y6AdFdM3jY
Thanks AK for posting our work! Check out our Notion blog https://t.co/Mj5yHB9Gjo and HF collection https://t.co/MaWp4cCdEG
Mixture of Contexts for Long Video Generation https://t.co/vRVRhos8Ei
discuss with author: https://t.co/0DhSzZ9r3W
Microsoft presents rStar2-Agent: Agentic Reasoning Technical Report. rStar2-Agent boosts a pre-trained 14B model to state of the art in only 510 RL steps within one week, achieving average pass@1 scores of 80.6% on AIME24 and 69.8% on AIME25, surpassing DeepSeek-R1 (671B) with significantly shorter responses
discuss with author: https://t.co/vJXW6hBToE
The Realtime API is officially out of beta and ready for your production voice agents! We're also introducing gpt-realtime, our most advanced speech-to-speech model yet, plus new voices and API capabilities: Remote MCPs, Image input, SIP phone calling, Reusable prompts https://t.co/fX5yvt0CDD
<cot>I wonder if the timeline over at Substack is better, maybe there is less slop and more interesting longform or so on. Opens Substack. https://t.co/Bbnlfe1XBX
Just dropped on HF! HunyuanVideo-Foley from Tencent AI Lab: an end-to-end Text-Video-to-Audio (TV2A) model that turns silent videos into lifelike soundscapes > 100k-hour curated TV2A dataset via automated pipeline > Modality-balanced MMDiT: dual-stream audio-video fusion + text cross-attention > REPA loss: aligns internal states with self-supervised audio features, giving higher fidelity & stability > DAC-VAE audio codec: 48kHz, continuous latents, strong reconstruction across speech/music/sfx > SOTA on Kling-Audio-Eval, VGGSound, and MovieGen-Audio-Bench (audio quality, semantic + temporal alignment)
Two Reachy 2 robots setting and clearing the table, all in real time teleoperation! Shot in a single take with all the successes... and a small fail. One example of what Reachy 2 can do: efficient, versatile object manipulation, with the precision needed for delicate or fragile tasks https://t.co/48fixe6dPN
Over 1 million Liquid foundation models downloaded through @huggingface! The community realized how far we can push with tiny models when they are designed from first principles. Proud of my team at @LiquidAI_! Liquid Discord community: https://t.co/BzRXDhGVSS Play with our models in Apollo: https://t.co/gTn5gzquaI Build with Liquid models in LEAP: https://t.co/eBbsG67kWk

2,000,000+ public models on HF https://t.co/i5TWciDSJV
I've been working on something new: Build a Reasoning Model (From Scratch). The first chapters just went live! (The book will cover topics from inference-time scaling to reinforcement learning) https://t.co/m0gXnhATKC
I wish I could have the CoPilot he is using. Because mine has never worked on the most basic tasks. I just tried it again now. https://t.co/LHYZSCzeoD
It's been a few weeks since we brought GPT-5 to Microsoft 365 Copilot, and it's quickly become part of my everyday workflow, adding a new layer of intelligence spanning all my apps. Here are 5 prompts that show what's now possible: