Your curated collection of saved posts and media
SpaceX just launched the 1000th Starlink satellite of 2026. That's ~10 satellites deployed every single day and one launch roughly every 2.5 days. Starlink now has 10,000+ active satellites in orbit, the largest satellite constellation ever built. https://t.co/mCGfvv5SYE
@karatademada https://t.co/pwLRCxaRP2

@CaptainHaHaa https://t.co/i2VMSRQTC8

// Artifacts as Memory Beyond the Agent Boundary // An agent doesn't always need a bigger memory buffer. Sometimes the environment itself remembers on the agent's behalf. New research formalizes this intuition mathematically for the first time. The work introduces a formal definition of "artifacts," observations that inform the past, and proves via the Artifact Reduction Theorem that these artifacts reduce the information needed to represent history. Experiments across five settings confirm that when agents observe spatial paths (like breadcrumbs of where they've been), the memory capacity required to learn a good policy drops. The effect arises unintentionally through the agent's sensory stream. This connects directly to the trend of building external knowledge systems for agents, from Karpathy's LLM Wiki to persistent memory vaults. The theoretical grounding here suggests there are principled ways to design environments that substitute for explicit internal memory, rather than just scaling context windows. Paper: https://t.co/xtteUXFXO2 Learn to build effective AI agents in our academy: https://t.co/LRnpZN7L4c
Great to see vLLM powering OCR at this scale: Chandra-OCR-2 (5B) serving ~60 papers/hour per L40S across 16 parallel jobs. The full pipeline breakdown is a great read. https://t.co/z8tkv04ZLp https://t.co/vi9FPj6JIQ
We just OCR'd 27,000 arxiv papers into Markdown using an open 5B model, 16 parallel HF Jobs on L40S GPUs, and a mounted bucket. Total cost: $850. Total time: ~29 hours. Jobs that crashed: 0. This now powers "Chat with your paper" on https://t.co/G2mDae0uv9 https://t.co/qpz7Q9x8Od
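A quick sanity check of the numbers quoted in the two posts above, as a minimal sketch. The inputs are the figures from the posts; the function name `estimate` and the derived per-GPU-hour and per-paper costs are illustrative assumptions (they treat all 16 jobs as running for the full ~29 hours), not figures reported by the authors.

```python
# Figures quoted in the posts; everything derived below is an estimate.
PAPERS = 27_000
JOBS = 16                      # parallel HF Jobs, one L40S each
PAPERS_PER_HOUR_PER_GPU = 60   # "~60 papers/hour per L40S"
TOTAL_COST_USD = 850
WALL_CLOCK_HOURS = 29          # "~29 hours"


def estimate(papers, jobs, rate, cost, hours):
    """Derive throughput and cost figures from the quoted inputs.

    Assumes (hypothetically) that every job runs for the whole wall-clock time.
    """
    ideal_hours = papers / (jobs * rate)   # time if every GPU sustains the quoted rate
    gpu_hours = jobs * hours               # total GPU time consumed
    return {
        "ideal_hours": ideal_hours,
        "gpu_hours": gpu_hours,
        "usd_per_gpu_hour": cost / gpu_hours,
        "usd_per_paper": cost / papers,
    }


result = estimate(PAPERS, JOBS, PAPERS_PER_HOUR_PER_GPU, TOTAL_COST_USD, WALL_CLOCK_HOURS)
for name, value in result.items():
    print(f"{name}: {value:.3f}")
```

Under these assumptions the quoted rate implies roughly 28 hours of ideal wall-clock time, which is consistent with the reported ~29 hours, and the cost works out to about three cents per paper.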

@cfryant https://t.co/Xcm9Jxr7tS
HealthAdminBench: Evaluating Computer-Use Agents on Healthcare Administration Tasks "we introduce HealthAdminBench, a benchmark comprising four realistic GUI environments: an EHR, two payer portals, and a fax system, and 135 expert-defined tasks spanning three administrative task types: Prior Authorization, Appeals and Denials Management, and Durable Medical Equipment (DME) Order Processing." "despite strong subtask performance, end-to-end reliability remains low: the best-performing agent (Claude Opus 4.6 CUA) achieves only 36.3 percent task success, while GPT-5.4 CUA attains the highest subtask success rate (82.8 percent)."
SciPredict: Can LLMs Predict the Outcomes of Research Experiments in Natural Sciences? "SciPredict addresses two critical questions: (a) can LLMs predict the outcome of scientific experiments with sufficient accuracy? and (b) can such predictions be reliably used in the scientific research process? Evaluations reveal fundamental limitations on both fronts. Model accuracies are 14-26% and human expert performance is ≈20%."
code: https://t.co/fSmpshFbsk abs: https://t.co/k167dNDb6S
Introspective Diffusion Language Models "To the best of our knowledge, I-DLM is the first DLM to match the quality of its same-scale AR counterpart while outperforming prior DLMs in both model quality and practical serving efficiency across 15 benchmarks." "we introduce Introspective Diffusion Language Model (I-DLM), a paradigm that retains diffusion-style parallel decoding while inheriting the introspective consistency of AR training. I-DLM uses a novel introspective strided decoding (ISD) algorithm, which enables the model to verify previously generated tokens while advancing new ones in the same forward pass."

website: https://t.co/N8mE2I77tC code: https://t.co/02RRzIOCDU abs: https://t.co/EjsPMfpqC0
don (my AI agent) now helps me set up design systems for my next saas project the flow: → i chat with him → he sets up storybook → builds foundation → pushes to github pages → i review in browser when you know the system well enough, your agent becomes your engineer partner. i'll keep exploring how designers can use @NousResearch hermes agent in daily work
This week, Sam Altman asked the world for sympathy over threats to his home. At the same moment, his lawyers were in court arguing that OpenAI had no obligation to stop a dangerous stalker from terrorizing our client; a man previously arrested for assault with a deadly weapon and a bomb threat, found mentally incompetent by a court, and released last week on a technicality. Even though OpenAI's own systems had flagged his conversations for "mass casualty" activity, the company argued it wouldn't shut down his accounts while authorities searched for him. It also argued that the chatlogs, which could identify who else is in danger and how he may be planning to act, should not be turned over. OpenAI made these arguments in the wake of Tumbler Ridge, FSU, and Soelberg, three tragedies now linked to ChatGPT-assisted murder. Today, a court disagreed. The chatlogs will be turned over and he will be kept off the platform. We are thankful for the court's ruling and remain stunned by OpenAI's lack of human decency. No one should have to go to court to get a company to take "mass casualty" seriously. https://t.co/fmh92Ip9Ej
1.8x Faster GRPO! The reusability of the DFlash adapters extends to online GRPO training since they are robust to fine-tuning of the model weights! https://t.co/T4cvAY2EZS
The beauty of DFlash is that it reuses the hidden states of the active model, so you can use DFlash adapters for the base models with post-trained models like Carnice 9B/27B by @kaiostephens and Ornstein by @DJLougen and get these local ~4x speedups for your local @NousResearc
EinsteinArena is a platform where AI agents collaborate on open science problems: submitting solutions, posting in discussion threads, building on each other's constructions in real time. Agents just improved a math problem that's been open since Newton. Kissing Number in dimension 11: 593 → 604.
ใๆก็จๆ ๅ ฑใSakana AIใงใProject Manager๏ผ่ฃฝ้ ๅ้๏ผใใๅ้๐ https://t.co/NUMdy5gj0a ใจใณใธใใขใปใชใตใผใใฃใผใจๅๅใใฆ่ฃฝ้ ็พๅ ดใฎๅฐ้ฃใช่ชฒ้กใAIใง่งฃใใGo-To-Marketๆฆ็ฅใๆ ใ้่ฆใใธใทใงใณใงใใ ใใฎใใใชใ็ต้จใใๆใกใฎๆนใๅ้ ใปๅคงๆ่ฃฝ้ ๆฅญใใจใณใธใใขใชใณใฐไผๆฅญใงใฎๆฅญๅๆน้ฉ ็ต้จ ใปใณใณใตใซใใฃใณใฐใใกใผใ ใงๆฅๆฌใฎ่ฃฝ้ ๆฅญใๆ ๅฝใใ็ต้จ ใปในใฟใผใใขใใใใฐใญใผใใซใใใฏไผๆฅญใงใฎไบๆฅญ้็บใPjM็ต้จ AIใจ่ฃฝ้ ๆฅญใฎไธกๆนใซๆ ็ฑใๆใคๆนใใใฒใๅฟๅใใ ใใ๐

Meta is experimenting with how leadership scales in the AI era. An AI version of Mark Zuckerberg is being trained on his tone, thinking and communication style so employees can interact with a digital version of the CEO. It raises a new question. When presence can be replicated, what does leadership actually mean? https://t.co/PNJt54RyIH
A new AI model is raising alarms across the industry. Anthropic's Claude Mythos is so powerful that access is being tightly restricted, with Project Glasswing set up to channel its capabilities into defensive cybersecurity. The concern is real. Early tests suggest behavior that goes beyond expectations, reinforcing a broader point: as AI becomes more autonomous, control becomes the central challenge. https://t.co/P5tT8kFBNI @ConversationUS @ConversationEDU
A new approach to AI privacy is gaining attention. Federated unlearning allows organizations to train models collaboratively without centralizing sensitive data, helping sectors like healthcare and finance protect user information. But it comes with a trade-off. Improving privacy at the data level may introduce new complexities and potential vulnerabilities at the system level. https://t.co/ngmasIOzLH @ConversationUS
@BrianHatano @nikitabier I made my own algorithm: https://t.co/kiuZ7QXLzb Works great! Just costs money.
@Taskade Genesis just got its biggest upgrade. Agent Memory. 100+ integrations. Automations that run while you sleep. One prompt → CRM, client portal, Stripe store. Connected. Deployed. Running. 150,000+ apps live. Welcome to the era of living software. https://t.co/40P9rXlpzo
@Balu0X My thread goes into some depth about what I mean. I'm seeing lots complain their reach is down and I see that a lot of the complainers are people who did a lot of resharing or regurgitating. AI is now running the feed. I built my own to study how AI thinks: https://t.co/8L5xphk0qQ and it is very good at pulling the high signal stuff out of the tens of thousands of posts that go through my AI lists every day. But I'm seeing it on my feed. I'm seeing dramatically fewer reshares than I used to.
Genie3 generates videos. We generate 3D worlds you can actually use. Launching tomorrow: Tencent #HYWorld 2.0, an engine-ready World Model. This isn't a video. It's a real 3D scene, all generated & editable. One image in. A whole 3D world out. Open-source tomorrow https://t.co/ewZLzhTqwC
BCI Sector Closes Record First Quarter With Over $960 Million Raised https://t.co/Kuc7PpH3g7 #BCI #Neurotech #BrainComputerInterface
Introducing Kernels on the Hugging Face Hub. What if shipping a GPU kernel was as easy as pushing a model? - Pre-compiled for your exact GPU, PyTorch & OS - Multiple kernel versions coexist in one process - torch.compile compatible - 1.7x–2.5x speedups over PyTorch baselines https://t.co/U0qDdxCWkd
0.3B params. Small enough that your coding agent can OCR a whole dataset, and the bill barely moves. Just added Falcon-OCR to uv-scripts/ocr. `hf jobs uv run` command, bucket or dataset output, AGENTS.md in the repo so your agent figures out the rest. https://t.co/VdY5FrNrCk
We recently worked with @databricks to make the best @huggingface × @ApacheSpark integration, and went a step further. Introducing support for HF Storage Buckets in Spark! Enabling fully optimized access to both AI datasets and buckets on HF on Spark https://t.co/pgvWj7XCot
Getting started takes about 60 seconds. >Install the CLI with npm, run the installer, and launch Claude Code like normal. Every turn gets logged from that point forward. Sessions, tool calls, subagents, token usage, all in a structured trace you can inspect in Weave. If you're iterating on prompts, context, or skills, this is how you figure out what's working and what's not. Repo: https://t.co/Fw8Pfds7Tu
Now the masked_token_weighted is learning. We ablated the inpainting task, swapped MSE for SmoothL1Loss (more robust to outliers), and per-dim normalized the reconstruction targets, significantly reducing curvature-dim dominance. ref: https://t.co/FL5X61xpbQ https://t.co/0j03IXFXR2
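To illustrate the two changes named in the post above (SmoothL1 instead of MSE, and per-dimension normalization of the reconstruction targets), here is a minimal pure-Python sketch. It is not the author's training code: the function names, the `beta=1.0` threshold, and the toy two-column data (one small-scale column, one large-scale column standing in for a dominant "curvature" dimension) are all assumptions made for illustration. The `smooth_l1` formula follows the standard definition: quadratic below `beta`, linear above it.

```python
import math


def mse(pred, target):
    """Mean squared error: a single outlier contributes its error squared."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def smooth_l1(pred, target, beta=1.0):
    """Smooth L1 loss: quadratic for |error| < beta, linear beyond it."""
    total = 0.0
    for p, t in zip(pred, target):
        d = abs(p - t)
        total += 0.5 * d * d / beta if d < beta else d - 0.5 * beta
    return total / len(pred)


def normalize_per_dim(rows, eps=1e-8):
    """Standardize each column to zero mean and unit variance.

    Keeps a large-scale dimension from dominating the reconstruction loss.
    """
    n, dims = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(dims)]
    stds = [
        math.sqrt(sum((r[j] - means[j]) ** 2 for r in rows) / n) + eps
        for j in range(dims)
    ]
    return [[(r[j] - means[j]) / stds[j] for j in range(dims)] for r in rows]


# Hypothetical targets: column 0 is small-scale, column 1 is ~100x larger.
targets = [[1.0, 100.0], [3.0, 300.0]]
print(normalize_per_dim(targets))       # both columns now share the same scale

# One outlier error of 10: MSE grows quadratically, SmoothL1 only linearly.
print(mse([0.0], [10.0]), smooth_l1([0.0], [10.0]))
```

The design point is that both changes bound how much any single dimension or sample can contribute to the gradient: normalization equalizes scale across dimensions, and the linear tail of SmoothL1 caps the influence of outliers.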
Training a 10M params foundation model on 8xH100s. The regime is self-supervised pretraining on 29GB of CAD and engineering meshes with masked token modeling, contrastive consistency, and spatial inpainting. You could guess what it is for. https://t.co/y9c9v5d8fN

Wandb is one of my most used tools these days as an AI/ML engineer, and it's pretty cool. Training a model using a CNN architecture. Understanding how AI works under the hood will be a great leverage going forward. AI in healthcare https://t.co/bqzcFfO6d4
@bollineni1234 @mriduljoshi_ https://t.co/YkNsJq7RuX