Your curated collection of saved posts and media
It is possible to make the future better. But it requires dreaming about it. Last night I was very privileged to be the only non-investor at San Francisco’s @hf0 startup residency. I had helped one of the startups that was launched last night. I was in the front row but a couple of startups are still in stealth. Sat next to a very wealthy investor from New York. Told me this is the only event worth coming to San Francisco for. I think he may have a point. Met an 18-year-old who built 1,000 glasses and has recorded 17,000 factory workers already and a woman who was the youngest at Apple to earn a patent. She came up with the bump feature everyone uses to network with. One other investor told me to write a manifesto of the future. Another was an early investor in Roblox. Told me it has the only dataset capable of building the ultimate gaming World Model. Thanks to Spark from the @IllusionOfLife company which invited me. Yes, an AI-built talking character invited me to the best set of startup demos that I have ever seen. AI soon may call. With an opportunity. Of my lifetime. Or yours. All because I made friends with a magic dog and invited it to my Thanksgiving dinner. Humbled.
WonderWise started as my way of bridging the gap between Science and Music for my 3-7yo kids. "Why are dinosaurs extinct? How are islands formed?" I turned their questions (through AI) into educational songs that we listen to in the car! Follow along: https://t.co/x9m89BAdCs
A new episode of Lenny’s Podcast with our CEO @echen just dropped. What they covered: • Why Anthropic and Google are winning • How we hit >$1B revenue with <100 people while avoiding the SV playbook • The brutal choice model builders face: prioritizing Engagement vs. sticking to their Values • Why Harvard professors and Fields Medalists love teaching models on our platform • What RL environments reveal about hierarchies of agentic behavior • The underappreciated post-training skills: Taste and Sophistication Check out @echen and @lennysan here: https://t.co/BSxQ8IaXzH
We tested one of the most common prompting techniques: giving the AI a persona to make it more accurate. We found that telling the AI "you are a great physicist" doesn't make it significantly more accurate at answering physics questions, nor does "you are a lawyer" make it worse. https://t.co/r1GNU3qhhD

GPU-native splat editing tool for Unity! Delete & Restore millions of splats at 90+ FPS. Zero CPU readback. Everything stays on GPU. Compute shader + atomic bit ops (0.2ms). All-new Splat Editor window 🧵 #GaussianSplatting #madewithunity #GPU #3DGS #Unity3D #GameDev #NeRF #Photogrammetry #AI
Google DeepMind just introduced SIMA 2, a Gemini-powered generalist embodied agent that understands language, images, goals and takes action in 3D virtual worlds. Unlike SIMA 1, this new version doesn’t just follow commands. It reasons, chats, explores, and learns autonomously. It performs across a wide range of games, closes the gap with human players, and can self-improve in unseen environments by generating its own tasks and rewards.
Max from @medi_search is the most advanced medical AI available. Max scores 53.3% on HealthBench Hard, OpenAI's new benchmark for difficult medical cases, beating both GPT-5 and GPT-4o. Available on web, iOS, and Android: https://t.co/f6cIifVYlP https://t.co/kltx3xmPO8

GLM-4.6V just dropped on Hugging Face https://t.co/23b58SoN5P
I’m thrilled to announce that I’ve officially joined the @huggingface Fellows program! From building community leaderboards to pushing the boundaries of LLM fine-tuning, I can't wait to do even more for the open-source ecosystem. Let’s build! 🦾 https://t.co/3rpdfr38cU

BREAKING: President Trump says he is signing a “one rule” Executive Order on AI this week. “You can’t expect a company to get 50 approvals every time they want to do something,” Trump says. https://t.co/X9gkjJB7yR
@karpathy https://t.co/kaZwaiAhn6
The AI Consumer Index (ACE) Most AI benchmarks today focus on reasoning and coding. But most people use AI to shop, cook, and plan their weekends. In those domains, LLM hallucinations continue to be a real problem. 73% of ChatGPT messages (according to a recent report) are now non-work-related. Consumers are using AI for everyday tasks, and we have no systematic way to measure how well models perform on them. This new research introduces ACE (AI Consumer Index), a benchmark assessing whether frontier models can perform high-value consumer tasks across shopping, food, gaming, and DIY. Consumer tasks require grounding in real-world information. A model that hallucinates a product price or provides a dead link isn't just wrong, it's actively unhelpful. ACE's grading methodology dynamically checks whether responses are grounded in retrieved web sources, penalizing hallucinations with negative scores. The results expose a substantial gap: GPT-5 (Thinking = High) leads at 56.1%, followed by o3 Pro at 55.2%. The best model scores only 45.4% on Shopping. Models frequently hallucinate prices and product features, scoring negative on grounded criteria. The study found that on "Provides link(s)" in Shopping, Gemini 3 Pro scores -54%. That's not just failing to provide links, it's confidently providing dead or fabricated ones. Other models like Opus 4.5 also face similar issues. All of these issues can be improved with multi-agent systems, but it's important to be aware of the issue first. The benchmark includes 400 hidden test cases created by 47 domain experts. Each case has fine-grained rubrics distinguishing whether failures come from not meeting requirements versus hallucinating information. Paper: https://t.co/VBSBCJMFHQ ACE reveals the gap between benchmark performance and real-world utility.
Join us for AI Dev Days this week! Day 2 of the event, on December 11th, is all about enhancing developer productivity with AI - we've got some really great sessions from the @code team that you don't want to miss. Learn more at our blog: https://t.co/9s8SNUBbT7
For the first time in six years, MIRI is running a fundraiser. Our target is $6M. Please consider supporting our efforts to alert the world, and identify solutions, to the danger of artificial superintelligence. SFF will match the first $1.6M! ⬇️ https://t.co/EWNoIKsHnB
#NativeAmerican #native #nativetwitter #cloutmma3 #เนเธเธตเธขเธฃเนเนเธเธญเธฐเธงเธญเธขเธเน #OTDirecto29D #afuaasantewaasingathon #sueperkupa #NewYearsHonours https://t.co/z1SWoZu1o2

Love wins holy shit #rayfrog #bullfrog #Ramon https://t.co/G6H1iNTiYc

Gay old men canvas no way #charpim #rayfrog #pongorma #dedusmuln #giroro #dororo https://t.co/9a2JhGti0S

This July 4th, we contemplate parallels between the colonization of Turtle Island (“North America”) and Palestine. Supporting Palestinians’ right to return and right to self-determination in their homeland goes hand in hand with supporting Indigenous people’s demand for #LandBack 🧵
Pretty in pink? Want to join us? Repost it as much as you can. https://t.co/iYaxIbEWUY https://t.co/QDMVqIISWb

ใใญใใฐๅๅ ใๅทฆ่ค็ฉบๆฐใใใใคใฉในใใใใใซใ ๆถผๆฃฎ็ๅธใ ๅๅๅถไฝไบๅฎ๏ผ #wf2019w #ใใคใใฃใ https://t.co/R7UVXJ3hQD

This is awesome!!! Thank you @Algorand @AlgoFoundation for the shout out!!!! https://t.co/r73QC8FTBc

https://t.co/LbokDLyLnt.frens Saturday learning and exploring. Some outputs I like below https://t.co/3SAG0v4wrJ
Shout out and big thank you to @nft_highmali for collecting both collabs between me and @Gogolitus ! Instant full set 'Floppy risk' https://t.co/QfxAQG10Xb https://t.co/vhqlzxbhHg

re boot ⚠️Flash warnings⚠️ https://t.co/xYEi0559fp

behind the scenes - the making of natives font https://t.co/OSUKvsBmzH

Integrating LLMs with knowledge bases. Important read for AI practitioners. LLMs generate impressive text but struggle with hallucinations, outdated knowledge, and reasoning over structured data. The default response has been scaling up (e.g., more parameters, more compute, more cost). But bigger models don't solve the fundamental problem: LLMs lack reliable access to external, verifiable knowledge. This new survey examines how RAG, Knowledge Graphs, and hybrid approaches address these limitations. The key insight: integration happens at three levels: - Level 1 focuses on retrieval, getting the right information into the model. - Level 2 addresses reasoning, synthesizing retrieved knowledge for complex tasks. - Level 3 handles optimization, adapting systems for domain-specific needs. KAG showed 19.1% exact match improvement over basic RAG on HotpotQA. Think-on-Graph achieved significant accuracy gains over Chain-of-Thought on complex QA. The practical applications span finance, medicine, and code generation. FinAgent combines RAG with reinforcement learning for trading decisions. UMLS integration improves diagnostic accuracy in medical AI. Codex leverages retrieval to enhance code generation quality. Challenges remain: knowledge drift requires continuous updates, domain-specific representations don't always align with LLM embeddings, and standardized evaluation benchmarks are still lacking. The path to reliable LLMs isn't just scale. It's thoughtful integration with structured knowledge that provides factual grounding and enables complex reasoning. Paper: https://t.co/vl8ZPf4ncA Learn to build RAG and AI agents in our academy: https://t.co/zQXQt0PMbG
🦸 1Wrk’s SuperApp, built on #AWS, is helping businesses hire faster & manage talent more intelligently. Discover how AWS is providing a “backbone” for #generativeAI innovation & helping the #startup scale: https://t.co/hxJI58TMJ9 https://t.co/8Ij87Ipbpb
@DeepgramAI announces integration with key #AWS solutions at #AWSreInvent. Accessing advanced voice #AI models & capabilities within their existing #AWS environments will enable customers to build, deploy & scale their applications faster & more securely. https://t.co/FiXYywbF0s
#awsreinvent is back & Day 1 is done! https://t.co/PZYZLKBKHV 🦾 We met the next generation of leaders, talked about the latest trends in #AI & celebrated opening night in style with the global cloud community. What a day for #startups! https://t.co/XMhbIyRxzI
Applications for the 2026 Physical AI Fellowship are now open! https://t.co/Wvje7yrqsy At #awsreinvent, @MassRobotics & @nvidia announce the second edition of the 8-week program designed to help robotics & physical #AI startups from around the world scale faster & smarter.
@twelve_labs launches its most powerful video understanding model at #awsreinvent. https://t.co/jEHEFPJYQe Marengo 3.0 “shatters the limits of what’s possible” for developers & enterprise, enabling them to search, navigate & understand video content at scale. https://t.co/ogx3CtxAJa