Your curated collection of saved posts and media
vLLM v0.17.0 is here! 699 commits from 272 contributors (48 new!) This is a big one. Highlights:
- FlashAttention 4 integration
- Qwen3.5 model family with GDN (Gated Delta Networks)
- Model Runner V2 maturation: Pipeline Parallel, Decode Context Parallel, Eagle3 + CUDA graphs
- New --performance-mode flag: balanced / interactivity / throughput
- Weight Offloading V2 with prefetching
- Elastic Expert Parallelism Milestone 2
- Quantized LoRA adapters (QLoRA) now loadable directly
@elonmusk @Cmin914725641 More like: https://t.co/hXbPIvSmnw
I packaged up the "autoresearch" project into a new self-contained minimal repo if people would like to play over the weekend. It's basically the nanochat LLM training core stripped down to a single-GPU, one-file version of ~630 lines of code, then:
- the human iterates on the prompt (.md)
- the AI agent iterates on the training code (.py)
The goal is to engineer your agents to make the fastest research progress indefinitely and without any of your own involvement. In the image, every dot is a complete LLM training run that lasts exactly 5 minutes. The agent works in an autonomous loop on a git feature branch and accumulates git commits to the training script as it finds better settings (of lower validation loss by the end) of the neural network architecture, the optimizer, all the hyperparameters, etc. You can imagine comparing the research progress of different prompts, different agents, etc. https://t.co/YCvOwwjOzF Part code, part sci-fi, and a pinch of psychosis :)
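The accept-if-better loop described above can be sketched in a few lines of Python. This is a hypothetical simplification, not the repo's code: `propose` stands in for one agent code edit plus one full 5-minute training run, and a git commit is reduced to keeping the new best loss.

```python
def autoresearch_loop(initial_loss, n_runs, propose):
    """Greedy accept: keep a proposed change only if validation loss improves.

    propose(best_loss) stands in for one agent edit + one full training run;
    in the real setup each accepted improvement becomes a git commit on the
    feature branch.
    """
    best = initial_loss
    history = []
    for _ in range(n_runs):
        loss = propose(best)
        if loss < best:
            best = loss  # "commit" the improvement
        history.append(best)  # one dot per completed run
    return best, history
```

Comparing prompts or agents then amounts to comparing the `history` curves their loops produce.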
AI long-form video generation is rapidly improving. I got early access to @UtopaiStudios' new PAI model, and it has blown my mind about what's possible. I really like their editing tools. Will continue to test this out and share more fun examples. https://t.co/Vw3wdXMwPh
New research: FlashAttention-4 FlashAttention-4 co-designs algorithms and kernel pipelines for Blackwell GPUs, where tensor core throughput doubles but memory bandwidth and exponential units scale more slowly. The techniques include fully asynchronous MMA operations, software-emulated exponential rescaling, and leveraging tensor memory to reduce shared memory traffic. It achieves up to 1.3x speedup over cuDNN 9.13 and 2.7x over Triton on B200 GPUs with BF16, reaching 1613 TFLOPS at 71% utilization. Implemented entirely in Python via CuTe-DSL, with 20-30x faster compile times compared to C++ templates. Paper: https://t.co/wBiS51m8Bm Learn to build effective AI agents in our academy: https://t.co/LRnpZN7deE
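The "exponential rescaling" mentioned above is the online-softmax trick at the heart of FlashAttention-style kernels: keep a running max and rescale already-accumulated exponentials whenever the max grows, so scores can be processed in a single pass, tile by tile. A minimal scalar Python sketch of just the rescaling math (no tiling, no tensor cores, and not the actual FA4 kernel):

```python
import math

def online_softmax(scores):
    """One-pass numerically stable softmax via a running max.

    When a larger score arrives, previously accumulated exponentials are
    rescaled by exp(old_max - new_max); FlashAttention applies the same
    correction factor to partial attention outputs as it sweeps over tiles.
    """
    run_max = float("-inf")
    denom = 0.0
    exps = []
    for x in scores:
        new_max = max(run_max, x)
        scale = math.exp(run_max - new_max)  # 0.0 on the first step: exp(-inf)
        denom = denom * scale + math.exp(x - new_max)
        exps = [e * scale for e in exps]
        exps.append(math.exp(x - new_max))
        run_max = new_max
    return [e / denom for e in exps]
```

Because everything is expressed relative to the running max, no intermediate `exp` ever overflows, which is what makes the single-pass tiled formulation safe in low precision.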

New research on automatic harness synthesis for LLM agents. Great read if you are engineering your own agent harness. The agent harness is the scaffolding that lets an agent interact with its environment: tools, code execution, file systems, APIs. Building a good harness is hard and often done manually. AutoHarness proposes letting agents automatically synthesize their own code harness. Instead of hand-crafting the execution environment, the agent generates the scaffolding it needs to complete a task. Agent harness engineering is becoming one of the most important skills in AI development. Automating harness creation could dramatically lower the barrier to building effective agents. Paper: https://t.co/N85XPr1vMp Learn to build effective AI agents in our academy: https://t.co/LRnpZN7L4c
they don't want you to know this is way healthier than a salad https://t.co/BNHU7tb5Pp
Last night, I built a dashboard with Perplexity Computer to track the "Prepper Index": essentially, stocks tied to people prepping for the worst right now. Now, personally I don't think the worst is going to happen, I'm a positive guy, and think/hope everything works out. Also, I'm not a prepper. But there are a lot of preppers out there going nuts right now; it's all over YouTube and TikTok. And looking at the Prepper Index, it has gone up over the last week. This is a live dashboard, anyone can use it, link in first comment below.
Because of Excel, a THIRD of all genetics papers published in top journals have errors, as many genes have names like SEPT2 (the official name of Septin 2), which Excel automatically converts to dates. The issue was found in 2016, but still hasn't improved! https://t.co/9E46yk82cd https://t.co/ixgppwcCoM

The issue was found in 2016, but didn't improve until 2023 (maybe AI actually helped?) https://t.co/MN9C2YY1Jm
My favorite genre of American folk photography is, "fast food joint in a stunning natural setting." https://t.co/IhBs2fgkxt
May I add the Chipotle in Sedona to this conversation? https://t.co/DxSjeR7Gfb

Every damn day, another post with a thousand-plus likes for a year-old "breaking" paper that should "scare everyone using AI" because of issues with "latest top models" like Llama 4 and o3. (The paper was good & multi-turn is hard, but, again, big progress since it was written.) https://t.co/9d3rYisbJ5
I gave ChatGPT for Excel and Claude for Excel a try on a very hard Excel file: macro-economic data from 1,000 years of English history across over a hundred tabs. I think both did a good job, and I did not spot errors (though I only did spot checks). However, Claude was harder to check: ChatGPT tended to stick within the Excel app, building formulas and manipulating the data the way a person would, while Claude used Python and often pasted material into Excel for display purposes only, making it harder to trace or edit. If that holds, ChatGPT will generally be more useful for serious users who want to audit the results. Prompt: "help me understand the relationship between the mix of agricultural products in the UK, GDP, and population, along with hours worked. I want this over the total period, and you should illustrate interesting trends with graphs and statistical analysis"

@saintgeorge Nope https://t.co/XV1eS1FEi2
Like the AI generates absolute bangers of metaphor that make no sense, but, because the writing is meaning-like, you figure out ways for it to make sense, and through that interpretation, find something deeply meaningful. Very indicative of the general problem of AI personality https://t.co/iw9aGeOEkA

We present a research preview of Self-Flow: a scalable approach for training multi-modal generative models. Multi-modal generation requires end-to-end learning across modalities (image, video, audio, text) without being limited by external models for representation learning. Self-Flow addresses this with self-supervised flow matching that scales efficiently across modalities. Results:
- Up to 2.8x faster convergence across modalities
- Improved temporal consistency in video
- Sharper text rendering and typography
This is foundational research for our path towards multimodal visual intelligence.
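For context, the flow-matching objective the post builds on can be sketched generically. This is a standard linear-path conditional flow-matching loss, not Utopai's code; `model` is a hypothetical velocity-field predictor.

```python
import random

def flow_matching_loss(model, x0, x1):
    """Linear-path conditional flow matching for one (noise, data) pair.

    The point x_t = (1 - t) * x0 + t * x1 moves along a straight path, so
    the regression target for the velocity field is the constant x1 - x0.
    """
    t = random.random()  # sample a time uniformly in [0, 1)
    x_t = [(1 - t) * a + t * b for a, b in zip(x0, x1)]
    target = [b - a for a, b in zip(x0, x1)]
    pred = model(x_t, t)
    return sum((p - y) ** 2 for p, y in zip(pred, target)) / len(target)
```

A model that always predicts the true direction `x1 - x0` drives this loss to zero regardless of the sampled `t`.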
BREAKING: Yann LeCun just dropped a paper that should make every AI lab rethink its roadmap. One brutal conclusion: chasing AGI is the wrong goal. Here's why:
- Humans aren't general; we're survival specialists.
- Walking and seeing feel "general" only because they keep us alive.
- Outside that zone, we're terrible. Chess computers proved it decades ago.
- Most AGI definitions today either can't be measured or assume human = general.
We built the benchmark around the wrong species. The team proposes a new target: Superhuman Adaptable Intelligence (SAI). Not "can it do what humans do," but: how fast can it learn something new? The approach: specialized expert systems with internal world models + self-supervised learning built to master the massive task space that humans biologically can't reach. One giant model mimicking human limits isn't the ceiling. It's the trap.

We asked people around the world to rate the morality and ethics of others in their country. The U.S. is the only place we surveyed where more adults describe the morality and ethics of others living in the country as bad than good. See our full morality report here: https://t.co/qBtj1ycDkP

Jobs report uniformly weak: 92K jobs lost (with job losses in almost every industry), household survey employment down too, unemployment rate up to 4.4%, participation down, avg weekly hours flat. Main sign in the other direction was strong wage growth. https://t.co/tX3LF6WfVN
Trumpβs second-term pardons are historic in their enormityβbillions in fines erased, allies protected, donors rewarded, DOJ undermined, and election norms threatened. Corruption looks less like an exception and more like the rule, says Catoβs Dan Greenberg. https://t.co/rR2YH0O7py
Israeli Finance Minister Bezalel Smotrich says that Beirut's Dahiya district will soon "look like Khan Younis." https://t.co/KjQDryRAeK
Israeli strikes displace hundreds of thousands across Lebanon https://t.co/LRMn3B31hk