The "Visual Explainer" agent skill just crossed 3.5K stars on GitHub π Just updated with: /generate-visual-plan slash command for more structured plan specs, code block patterns, typography polish, mermaid fixes, anti slop guardrails https://t.co/qzde42tVEV
The fundamental issue with PDF parsing is that PDFs are designed for display. The internal representation is a stream of drawing commands at specific coordinates on the page (e.g. "render this string at coordinate (84, 720) with this font"). Characters that appear adjacent on screen may not be contiguous in the file, and there may be no font mapping back to Unicode, so you have no idea what a character actually is. Any PDF parser has to magically reconstruct this sequence of display-coordinate data into semantically meaningful text, tables, and more. VLMs do help (screenshot the page and read it), but besides collapsing the metadata they still struggle on accuracy and cost. Note: Word/PPTX store text representations, so they're typically a bit easier to read. Our entire company at @llama_index is laser-focused on PDF parsing, so we've been trying to understand all the nuances of doc formats, especially PDFs. More notes on this coming soon.
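You can see this display-oriented representation directly by dumping a page's raw spans with PyMuPDF: each span is just a run of text pinned to a bounding box, and reading order is left for the parser to reconstruct. A minimal sketch (the PDF path is a placeholder):

```python
import fitz  # PyMuPDF: pip install pymupdf

doc = fitz.open("example.pdf")  # placeholder path
page = doc[0]

# "dict" mode exposes the raw layout: blocks -> lines -> spans,
# where each span is a run of text at absolute page coordinates.
for block in page.get_text("dict")["blocks"]:
    for line in block.get("lines", []):  # image blocks have no "lines"
        for span in line["spans"]:
            x0, y0, _, _ = span["bbox"]
            print(f"({x0:6.1f}, {y0:6.1f}) {span['font']}: {span['text']!r}")
```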
"Build me Perplexity Finance but for Pokemon cards. Make no mistakes." Computer: β researched Pokemon card APIs on its own β wrote 5,000 lines of React + Python β debugged itself using browser devtools β deployed and pushed to GitHub (built by u/NoSquirrel4840 on Reddit) https://t.co/kLBQnyA2Vk
He's not kidding. Took me HALF AN HOUR to vibe code Notion with Perplexity Computer. Software is legit a zero. https://t.co/eBbIDQsNRI
Ok, this is insane... I've just built the most comprehensive RAG system (UX knowledge base) for me to use in my projects with @perplexity_ai. > Instant, research-backed best practices (548 items) for design > 10X the output quality for Project Aristotle with a grounded knowledge layer: https://t.co/ko1oELOvaA > Ability to present design decisions to stakeholders with cited rationale and data. Data is the new oil. Already shared it with those who pre-purchased @AgenticUi in January as a token of appreciation for support.
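The grounded-knowledge-layer idea is straightforward to prototype: embed the best-practice items once, retrieve the nearest ones per query, and cite them by id. A minimal sketch with placeholder corpus items (not the actual 548-item knowledge base):

```python
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")
items = [
    "Touch targets should be at least 44x44pt (Apple HIG).",
    "Error messages should say what happened and how to recover.",
    # ... a real knowledge base would hold hundreds of such items
]
emb = model.encode(items, normalize_embeddings=True)

def retrieve(query, k=2):
    q = model.encode([query], normalize_embeddings=True)[0]
    top = np.argsort(emb @ q)[::-1][:k]       # cosine-similarity ranking
    return [(int(i), items[i]) for i in top]  # ids double as citations

for idx, text in retrieve("how large should buttons be?"):
    print(f"[{idx}] {text}")
```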
Video generation models are improving fast: real-time autoregressive models now deliver high quality at low latency, and they're quickly being adopted for world models and robotics applications. So what's the problem? They're still too slow on consumer hardware. What if we told you that we can get true real-time 16 FPS video generation on a single RTX 5090? (1.5-12x over FA 2/3/4 on 5090, H100, B200) Today we release MonarchRT, an efficient video attention that parameterizes attention maps as (tiled) Monarch matrices and delivers real E2E gains. Paper: https://t.co/d1AAMIseow Website: https://t.co/41mqriKekx GitHub: https://t.co/hp5iJttviA 1/n
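For background: a Monarch matrix (Dao et al.) factors a dense n x n matrix into two block-diagonal matrices interleaved with a fixed stride permutation, dropping the multiply cost from O(n^2) to O(n^1.5). A minimal PyTorch sketch of the structure is below; it is illustrative only, not MonarchRT's tiled attention implementation.

```python
import torch

def monarch_matvec(x, B1, B2):
    """y = (P^T B2 P B1) x for a Monarch matrix with n = m*m.

    B1, B2: (m, m, m) tensors holding m dense m x m diagonal blocks each.
    P is the fixed stride-m permutation, realized here as a transpose.
    """
    m = B1.shape[0]
    x = x.view(m, m)                       # split x into m chunks of size m
    x = torch.einsum("bij,bj->bi", B1, x)  # first block-diagonal multiply
    x = x.t().contiguous()                 # stride-m permutation
    x = torch.einsum("bij,bj->bi", B2, x)  # second block-diagonal multiply
    return x.t().reshape(-1)               # undo permutation, flatten

m = 4
x = torch.randn(m * m)
y = monarch_matvec(x, torch.randn(m, m, m), torch.randn(m, m, m))
print(y.shape)  # torch.Size([16]); each multiply cost m*m^2 = n^1.5
```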
Introducing LlamaBarn β a tiny macOS menu bar app for running local LLMs Open source, built on llama.cpp https://t.co/F1Z3DVl9Kg
This is very impactful: you can now distill frontier performance into small models that are specialized to private repositories. Companies can quickly and cheaply train on their data and have super-efficient deployments of 32B agents. https://t.co/03jsS6cWJ3
What if you could extend your @aisdk custom agents with Skills? You can now! We added native Skills support to `bash-tool`, so you can add capabilities in a context-preserving manner. The intuition for why skills make sense in a custom agent is the same as why bash is so helpful: you benefit from the pre- and post-training that has made the models good at coding, and let that apply to any other domain https://t.co/eKmAghwZC1
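A rough Python sketch of that context-preserving pattern (illustrative only; the actual `bash-tool` integration is TypeScript and its helper names differ): the prompt carries only a one-line index per skill, and the model uses bash to pull full instructions on demand.

```python
from pathlib import Path

def skill_index(skills_dir="./skills"):
    """Build the lightweight index injected into the system prompt:
    one line per skill, using the first line of its SKILL.md as the
    description. Full instructions stay on disk."""
    lines = []
    for skill in sorted(Path(skills_dir).iterdir()):
        first_line = (skill / "SKILL.md").read_text().splitlines()[0]
        lines.append(f"- {skill.name}: {first_line}")
    return "\n".join(lines)

# At runtime the agent expands a skill on demand through its bash tool,
# e.g. by running `cat skills/pdf-reports/SKILL.md`, so unused skills
# never consume context.
```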

Anthropic has been coding with fast mode for Opus 4.6 and this has substantially increased our feature velocity. Now, it's available in research preview on Claude Code for all developers on our Claude subscriptions via extra usage and on Claude Console.
I built https://t.co/4rAUTXjAhc to help me stay up to date with new AI stuff. It's tracking 14K open source repos so far, with contributions from over 145K developers. Every day, it: - searches for new AI repos (based on 123 keywords and topics) - surfaces repos that are gaining traction, and - categorizes each repo The annotations are done by AI so they are not super accurate, but they've helped me find some useful stuff. It also lets me see where the contributors are, so when I travel, I can find folks doing cool stuff in a new city or country.
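The discovery step can be as simple as hitting the GitHub search API per keyword. A minimal sketch, with made-up keywords and thresholds rather than the site's actual pipeline:

```python
from datetime import date, timedelta
import requests

KEYWORDS = ["llm agent", "rag", "inference engine"]  # placeholder keywords

def new_repos(keyword, days=1, min_stars=10):
    """Repos matching a keyword, created in the last `days` days."""
    since = (date.today() - timedelta(days=days)).isoformat()
    resp = requests.get(
        "https://api.github.com/search/repositories",
        params={
            "q": f"{keyword} created:>{since} stars:>={min_stars}",
            "sort": "stars",
            "order": "desc",
        },
        timeout=10,
    )
    resp.raise_for_status()
    return [r["full_name"] for r in resp.json()["items"]]

for kw in KEYWORDS:
    print(kw, "->", new_repos(kw)[:5])
```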

Bash is ubiquitous, so giving applications a bash interface lets coding agents become general-purpose agents. I wonder if the same will happen to web frameworks. HTML/JS/CSS is also ubiquitous. If an agent generates an html file, your browser opens it with no deps to install.
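A toy illustration of that zero-dependency loop (Python only writes the file; rendering the page needs nothing installed):

```python
import pathlib
import tempfile
import webbrowser

# An agent can emit a single self-contained HTML file and the OS
# browser renders it directly: no build step, no package install.
html = "<!doctype html><h1>Agent output</h1><p>No build step, no install.</p>"
path = pathlib.Path(tempfile.mkdtemp()) / "report.html"
path.write_text(html)
webbrowser.open(path.as_uri())
```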
Agentic RL and Environments with @matthew_d_white, @bhutanisanyam1 (@Meta), @danielhanchen (@UnslothAI), @ben_burtenshaw (@huggingface), and @joespeez (@Meta). Notebook-first RL environment workflows using OpenEnv, @vllm_project /Llama flows, and Hugging Face Hub submission, plus a walkthrough of torchtitan and torchforge. Virtual workshop with live Q&A. Tomorrow, 10am PT: https://t.co/61JYZthuNQ #PyTorch #OpenSourceAI #AgenticAI
Helion's autotuner has been a powerful tool for optimizing ML kernels, but it came with a challenge: long autotuning sessions that could take 10+ minutes, sometimes even hours. The PyTorch team at Meta set out to solve this bottleneck using machine learning itself. Read here how they did it: https://t.co/N8lYGeXuv9 Spoiler alert: Using Likelihood-Free Bayesian Optimization Pattern Search, they achieved a 36.5% reduction in autotuning time for NVIDIA B200 kernels while improving kernel latency by 2.6%. For AMD MI350 kernels, they saw a 25.9% time reduction with 1.7% better latency. Some kernels showed even more dramatic improvements: up to 50% faster autotuning and >15% latency gains. Ethan Che, Oguz Ulgen, Maximilian Balandat, Jongsok Choi, Jason Ansel (Meta) #PyTorch #Helion #MachineLearning #BayesianOptimization #OpenSourceAI #Performance
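For intuition, here is the general shape of surrogate-guided autotuning as a generic sketch; this is not Helion's Likelihood-Free Bayesian Optimization Pattern Search, just the broad idea of spending expensive benchmark budget only where a cheap model predicts it will pay off.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

def autotune(candidates, benchmark, n_seed=8, n_refine=4, seed=0):
    """candidates: (N, d) array of config feature vectors.
    benchmark(cfg) -> measured latency (the expensive call)."""
    rng = np.random.default_rng(seed)
    idx = list(rng.choice(len(candidates), size=n_seed, replace=False))
    lat = [benchmark(candidates[i]) for i in idx]       # costly measurements
    surrogate = RandomForestRegressor().fit(candidates[idx], lat)
    ranked = np.argsort(surrogate.predict(candidates))  # cheap estimates
    for i in ranked[:n_refine]:                         # measure only top picks
        if i not in idx:
            idx.append(i)
            lat.append(benchmark(candidates[i]))
    return candidates[idx[int(np.argmin(lat))]]         # best measured config

# Toy usage: "latency" is quadratic over a 2-D config space.
cands = np.random.default_rng(1).uniform(-1, 1, size=(256, 2))
best = autotune(cands, lambda c: float((c ** 2).sum()))
print(best)
```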
Announcing the @PyTorch OpenEnv Hackathon with CV and @SHACK15sf Build RL environments, post-train models, and tackle 5 major RL + agentic orchestration challenges - $100K+ cash in prizes - Teams up to 4 - In-person in San Francisco Top judges, mentors, and speakers from: @Meta @huggingface @UCBerkeley @UnslothAI @fleet_ai @SnorkelAI @PatronusAI @mercor_ai @HalluminateAI @scale_AI @CoreWeave @OpenPipeAI @northflank @cursor_ai and Scaler AI Labs Register below
Are you tired of hand-tuning each model you develop? What if you could describe the architecture once and let a system apply graph transformations and optimized kernels? NVIDIA TensorRT-LLM AutoDeploy marks a shift toward treating inference optimization as a compiler and runtime responsibility rather than a burden on the model author. This approach enables faster experimentation, broader model coverage, and a cleaner separation between model design and deployment. Learn more from our documentation, example scripts, and the blog, "Automating Inference Optimizations with NVIDIA TensorRT LLM AutoDeploy" Read the full post: https://t.co/qlIO7q35oY #PyTorch #OpenSourceAI #AI #Inference #Innovation
Doc-to-LoRA: Learning to Instantly Internalize Contexts https://t.co/bDqLdqhmB9 https://t.co/UOHnPZ8sfO

If you use Pythia and like it, we're making an updated version. Tell us what you want. Here's a question for y'all: would you rather have a scaling suite trained on Nemotron-CC (very high quality, some distilled) or CommonPile (public domain, permissively licensed, more crunchy)?
It's a triple threat - It produces actions! It produces future states! It produces values! You just need a good video model to get you started, and you get SOTA.
These days, I'm much more excited about dataset releases than model releases. Models come and go and don't compose, whereas good datasets are more enduring and can be studied, used, and revised to create better models more broadly. Excited about these 155K coding agent trajectories...just SFT'ing on this data improves SWE-bench Verified massively (23% -> 59.4%).
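For scale: "just SFT'ing" can literally be a few lines with TRL. A hedged sketch below; the dataset and model names are placeholders, not the released trajectories.

```python
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

# Placeholder names: swap in the released 155K-trajectory dataset and
# whatever base model you're post-training.
dataset = load_dataset("org/coding-agent-trajectories", split="train")

trainer = SFTTrainer(
    model="Qwen/Qwen2.5-Coder-7B-Instruct",  # base model (placeholder)
    train_dataset=dataset,                    # chat-formatted trajectories
    args=SFTConfig(output_dir="sft-coding-agent"),
)
trainer.train()
```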
The founder of Cursor wrote a banger. This is a must read.
Xcode 26.3 with Claude Agent & Codex hits the Mac App Store today! With advanced reasoning capabilities in Xcode, you can streamline workflows and build faster. And MCP support lets you easily connect other compatible agents. https://t.co/88NjaznE6E
My new favorite tmux dev layout features @opencode (with Kimi K2.5 running on @FireworksAI_HQ) on top and Claude Code on the bottom. I start almost all agent tasks with Kimi (so fast!), then ask Claude if I need a second opinion/more advanced stuff. Great combo! https://t.co/cUxfPgHFlW