Your curated collection of saved posts and media
Saw a presentation from @barrowjoseph on VLM OCR today at https://t.co/5EJfcPMVJC. Great presenter who shared a lot of tactical insight. If you have similar interests, you might want to check out https://t.co/VNVvGrDaqw. Thx @barrowjoseph!

holy shit @ivanleomk i used @GoogleDeepMind's gemma4 (with codegraff) on the flight to Japan to read through a few papers i was interested in and it cooked!! (i think it just needs a really good harness) https://t.co/IksPGvhY5p (pre-release) https://t.co/yWyH0VKlCr
Before they pulled it, I fed Anthropic's Fable model the instructions on how to make a Creation for r1, and gave it one prompt - "make an awesome game for r1" - this is what it did 🔥🔥🔥 Now available in the Creations Gallery on r1! (rabbithole - the game!) 🥕🥕🥕 #rabbitr1 https://t.co/mWhnWgDbbF
I’m surprised the gaming community hasn’t pushed harder to work on Neural Texture Compression considering the RAM squeeze we’re seeing. Unity, Unreal, Valve, Microsoft, Sony, Nintendo, Intel, AMD, Nvidia should help push this as a standard where possible. https://t.co/kwbLd7UAgg https://t.co/yIjTazeqv8

we started a company!! so, we’re tackling continual learning: what’s the learning algorithm to take arbitrary data (documents, conversations, the models’ own experience) and make better models? how do we scale compute in the same way we’ve already seen with pre-training and inference time, but scaling on the same data we see as humans, day after day with no labels, no rewards? a lot of the ingredients are out there already (rl, distillation, long-context, sparse / param-efficient architectures, etc.). our team is at the frontier of these topics, and we’re singularly focused on this. we want to understand this problem better than anyone else in the world. nobody’s solved this problem yet, but even today it’s an extremely greenfield opportunity to co-develop research & useful products. in our space, how people interact with the models defines what the data distribution is - and working on this problem end-to-end, from core science to end user, gives us incredible freedom to define the problem and imagine new kinds of experiences. i expect we’ll use models that continually learn much differently than we’re using them today. it’ll feel different when the models _just know_, and build on our thinking and direction in ways we can’t even imagine. we don’t even know the queries we’re not asking, the things we would do but aren’t able to today. i’m so excited to share what we’re doing with the world in the coming months!! and the team is extremely cracked :) tackling this grand challenge and working alongside @jxmnop @EyubogluSabri @dan_biderman @MayeeChen @__howardchen @shizhehe and many others has made every day so fun. come work with us!
https://t.co/CGIef5lIBI
Google was very clever; they use the accelerometers of thousands of Android phones as a global earthquake network. All that data gets sent in, and Google found a way to detect those waves in time and send out the alerts. https://t.co/U7VFGxTCQ5
Finally finished building my AI datacenter! 🚀 32x3090s across 4 servers (8 GPUs each), all connected over InfiniBand. The whole setup is solar-powered with a massive battery bank and generator backup. More technical details and benchmarks coming soon. https://t.co/8GfedrSzNp