Your curated collection of saved posts and media
Video world models today have a very limited context length. Mode Seeking meets Mean Seeking (MMM) unlocks long-context, persistent video world models through a unified representation. 1/8 🧵 https://t.co/XXMic82qoc
New model updates from iquestlab. If you're trying to find an inference model that you can run offline, this is probably the one you're looking for.
- 7B and 14B coding models
- Optimized for tool use, CLI agents, and HTML generation
- 128k context length
- Explicit and detailed prompting works best
- MIT license with a requirement to display the logo
- Available on @huggingface
Diffusers 0.37.0 is out 🔥 New models, including LTX-2, Helios, GLM-Image, and more. We're proud to be shipping the wild hot RAEs in this release, too! New CP backends, caching methods, etc., are in too! Check out the release notes for more details 🧨 https://t.co/fzwmRDgk80
@DnuLkjkjh This one doesn't have MoE; but I have the larger Qwen3's with MoE if you are interested: https://t.co/IcyLHmP4dz
GPT-5.4 Thinking and GPT-5.4 Pro are rolling out now in ChatGPT. GPT-5.4 is also now available in the API and Codex. GPT-5.4 brings our advances in reasoning, coding, and agentic workflows into one frontier model. https://t.co/1hy6xXLAmJ
Wrote a blog post about my journey here. Has some scalability limitations & will fix them soon. Appreciate any pointers/feedback! https://t.co/javKm9ebYa
GPT-5.4 Thinking and Pro are rolling out gradually starting today across ChatGPT, the API, and Codex. https://t.co/LukYM9v1vk
Agents, for real work. The latest @code release gives you better agent orchestration, extensibility, and continuity. Here's what's new:
🪝 Hooks support
🎯 Message steering and queueing
🌐 Agentic integrated browser
🧠 Shared memory
And more... https://t.co/F5NTXXYjsZ
I have a new workflow for automated bug fixing and minor enhancements. 1- push bugs & enhancements to GitHub issues 2- ask your agent to comment and @ @coderabbitai to plan the fix 3- schedule agents to work on those 4- run agents to review & fix feedback from PR 5- approve
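The steps in this workflow can be sketched as a small wrapper around the GitHub CLI. This is a hypothetical sketch, assuming `gh` is installed and authenticated; the comment text and the `agent-queue` label are illustrative placeholders, not part of the original workflow.

```python
import subprocess

def plan_commands(issue_number: int, reviewer: str = "coderabbitai"):
    """Build the gh CLI calls for steps 2-3 of the workflow.

    Step 1 (filing the issue) happens elsewhere; the reviewer handle,
    comment body, and label here are placeholders.
    """
    return [
        # Step 2: ask the review bot to plan the fix.
        ["gh", "issue", "comment", str(issue_number),
         "--body", f"@{reviewer} please plan a fix for this issue"],
        # Step 3: hand the issue to a coding agent (label as trigger).
        ["gh", "issue", "edit", str(issue_number),
         "--add-label", "agent-queue"],
    ]

def run_plan(issue_number: int, dry_run: bool = True):
    """Execute the planned commands; dry_run just returns them."""
    cmds = plan_commands(issue_number)
    if not dry_run:
        for cmd in cmds:
            subprocess.run(cmd, check=True)  # needs gh installed + authed
    return cmds
```

Steps 4-5 (review, fix feedback, approve) would run the same way against the resulting PR.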
✨ Can you imagine your personal assistant running in a bottle cap? The tiny #PicoClaw has done it! It's not an RPi0 connected to a remote #openclaw server, but #RISCV open-source hardware truly running PicoClaw locally. It also supports voice interaction, all for $20! Want to adopt one? https://t.co/PqeIELNyQi
Terminal session manager for AI coding agents https://t.co/oWlUMAxM5q https://t.co/bBG9bPb2XH

I've decided to leave OpenAI. I'm incredibly proud of all the work I've been part of here, from helping create the reasoning paradigm with @MillionInt, scaling up test-time compute with @polynoamial, working on RL algorithms with my fellow strawberries, and shipping o1-preview (which started life as one of my derisking runs), to post-training o1 and o3 with @ericmitchellai, @yanndubs, and many others. I'm most proud of having led the post-training team here for the last year -- the team has done incredible work and shipped some really smart models, including GPT-5, 5.1, 5.2, and 5.3-Codex.

OpenAI genuinely has some of the most talented researchers I have ever met, and I have learned more than I could have imagined since joining as a new grad. I want to thank @markchen90, @FidjiSimo, @sama, and @merettm for all their support over my time here, and too many collaborators to name for the insights, ideas, and just plain fun we have had working together.

After leading post-training for a year, though, I'm longing to start fresh and return to IC research work. I've been thinking about going back to technical research for quite some time, and I genuinely believe my colleagues and team here are set up to succeed without me.

I'm personally very excited for my next chapter -- I'm proud to be joining @AnthropicAI to get back into the weeds of RL research, and I'm looking forward to supporting my friends there at this important time. Many of the people I most trust and respect have joined Anthropic over the last couple of years, and I'm excited to work with them again. I have also been very impressed with Anthropic's talent, research taste, and values, and I'm excited to be part of what the company does next!
Introducing the Google Workspace CLI: https://t.co/8yWtbxiVPp - built for humans and agents. Google Drive, Gmail, Calendar, and every Workspace API. 40+ agent skills included.
Introducing the Qwen 3.5 Small Model Series
Qwen3.5-0.8B · Qwen3.5-2B · Qwen3.5-4B · Qwen3.5-9B
✨ More intelligence, less compute. These small models are built on the same Qwen3.5 foundation: native multimodal, improved architecture, scaled RL.
• 0.8B / 2B: tiny, fast, great for edge devices
• 4B: a surprisingly strong multimodal base for lightweight agents
• 9B: compact, but already closing the gap with much larger models
And yes, we're releasing the Base models as well. We hope this better supports research, experimentation, and real-world industrial innovation.
Hugging Face: https://t.co/wFMdX5pDjU
ModelScope: https://t.co/9NGXcIdCWI
Back in January, the MLflow team sat down with @mlopscommunity to discuss why MLflow is being rebuilt for the "AI Engineer" era. As more teams move toward autonomous agents, this conversation is more relevant than ever. The highlights:
🔹 The GenAI Pivot: Why MLflow is being rebuilt for agents and real production systems.
🔹 The Messy Reality: Tackling evals, risky memory management, and governance that actually works.
🔹 The Future: Why MLflow remains the leading open-source standard for the next generation of AI.
Don't build the next generation of AI on a legacy stack.
📺 Watch: https://t.co/TpLzUGNei0
🎧 Listen: https://t.co/VABLK7jqcC
#MLflow #GenAI #LLMOps #AgenticAI

zoe was burning 24M+ opus tokens/day monitoring agents that weren't running. replaced her cron with a 2-layer system:
- bash pre-check, zero tokens when idle
- webhook fires opus only when needed
~95% token reduction and more reliable output. details below. (set up a cron to watch this performance; if it works well I'll double down on this event-driven stack, seems like the future)
TIL: There's a whole bunch of interesting skills in the oss codex repo: https://t.co/gNFHV3MD2j $skill-installer playwright-interactive (also /fast is sweeeeet, 1.5x codex makes a huge diff!) https://t.co/XTENPuZ9Ie

Someone just bypassed Apple's restrictions on the Neural Engine to train models.

The Neural Engine inside every M-series Mac was designed for inference: run models, don't train them. No public API, no documentation, and certainly no backpropagation. A researcher reverse-engineered the private APIs anyway and built a transformer training loop that runs forward and backward passes directly on the ANE hardware.

The method bypasses CoreML entirely. Instead of using Apple's official tools, the project constructs programs in MIL (Model Intermediate Language), compiles them in-memory using undocumented `_ANEClient` APIs, and feeds data through IOSurface shared memory buffers. Weights get baked into the compiled programs as constants.

Each training step dispatches six custom kernels: attention forward, feedforward forward, then four backward passes that compute gradients with respect to inputs. Weight gradients still run on the CPU using Accelerate's matrix libraries, but the heavy lifting (matrix multiplies, softmax, activation functions) happens on the ANE.

This makes three things possible that weren't before:
1. Training small models locally without burning through your battery
2. Fine-tuning on-device without sending data to a server or spinning up the GPU
3. Research into what the ANE hardware can actually do when you ignore Apple's guardrails

If this approach scales, the next wave of on-device AI stops being about running someone else's frozen model.
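The compute split described above (input gradients on the accelerator, weight gradients on the CPU) can be illustrated with a plain-Python linear layer. This is a conceptual sketch, not the project's MIL/ANE code; the "ANE kernel" and "CPU path" labels only mirror the post's description.

```python
def matmul(A, B):
    """Naive matrix multiply on nested lists."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def linear_forward(X, W):
    # "ANE kernel": forward pass; in the post's scheme the weights W
    # are baked into the compiled program as constants.
    return matmul(X, W)

def linear_backward(X, W, dY):
    # "ANE kernel": gradient w.r.t. inputs, dX = dY @ W^T
    # (this is what the backward kernels on the accelerator compute).
    dX = matmul(dY, transpose(W))
    # "CPU path": gradient w.r.t. weights, dW = X^T @ dY
    # (Accelerate-style matmul on the host in the post's scheme).
    dW = matmul(transpose(X), dY)
    return dX, dW
```

For `X = [[1, 2]]`, `W = [[3], [4]]`, and upstream gradient `dY = [[1]]`, this gives `dX = [[3, 4]]` and `dW = [[1], [2]]`, matching the standard linear-layer backward pass.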
Can AI agents agree? Communication is one of the biggest challenges in multi-agent systems. New research tests LLM-based agents on Byzantine consensus games, scenarios where agents must agree on a value even when some participants behave adversarially. The main finding: valid agreement is unreliable even in fully benign settings, and degrades further as group size grows. Most failures come from convergence stalls and timeouts, not subtle value corruption. Why does it matter? Multi-agent systems are being deployed in high-stakes coordination tasks. This paper is an early signal that reliable consensus is not an emergent property you can assume. It needs to be designed explicitly. Paper: https://t.co/3fllhchiKX Learn to build effective AI agents in our academy: https://t.co/1e8RZKs4uX
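A toy illustration of why agreement degrades with adversarial participants: a single-round majority vote where a Byzantine agent equivocates, sending different values to different receivers. This is a classic textbook failure mode, not the paper's protocol or its LLM-based agents.

```python
from collections import Counter

def majority_round(honest_values, byzantine_msgs):
    """One synchronous vote round.

    honest_values: the value each honest agent broadcasts to everyone.
    byzantine_msgs: per-Byzantine-agent dict mapping receiver index ->
        the (possibly contradictory) value sent to that receiver.
    Returns the value each honest agent decides (its local majority).
    """
    n_honest = len(honest_values)
    decisions = []
    for me in range(n_honest):
        inbox = list(honest_values)  # honest agents send consistently
        for msgs in byzantine_msgs:
            inbox.append(msgs[me])   # Byzantine agent may equivocate
        decisions.append(Counter(inbox).most_common(1)[0][0])
    return decisions
```

With honest votes `[0, 0, 1, 1]` and one equivocator telling the first two agents `0` and the last two `1`, the honest agents decide `[0, 0, 1, 1]`: no agreement, without any value ever being corrupted, much like the convergence failures the paper reports.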
MCP is dead? What are your thoughts? I mostly use Skills and CLI lately. I still use a few MCP tools for orchestrating agents more efficiently. https://t.co/o6saSxNQ9s
@alex_prompter Without opening the paper: how did they gather the ground truth? My naive assumption is that if they were able to gather the ground truth, it must already exist somewhere out there.