Your curated collection of saved posts and media

Showing 24 posts · last 30 days · by score
πŸ”dair_ai retweeted
O
elvis
@omarsar0
πŸ“…
Mar 02, 2026
7d ago
πŸ†”53654711
⭐0.34

Are there any benefits to using AGENTS.md files with coding agents? There has been a lot of discussion on this topic lately. Researchers tested OpenAI Codex across 10 repos and 124 PRs, running identical tasks twice (once with AGENTS.md, once without). The finding differs a bit from what other recent papers report.

With AGENTS.md present, median runtime dropped 28.64% and output tokens fell 16.58%. The agent reached comparable task completion either way; it just got there faster and cheaper with context.

One important thing to note: the gains weren't uniform. AGENTS.md primarily reduced cost in a small number of very high-cost runs rather than lowering it uniformly across all tasks. The file acts more like a guardrail against worst-case thrashing than a universal accelerator. So it depends on the task and requirements. I recommend not using AGENTS.md files blindly. If you do, keep them lean.

Paper: https://t.co/g2U603Cf8t Learn to build effective AI agents in our academy: https://t.co/U0ZuNA084v

❤️ 133 likes · 🔁 21 retweets
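In the spirit of the "keep them lean" advice above, a lean AGENTS.md might look something like this (the contents are an illustrative sketch, not taken from the paper):

```markdown
# AGENTS.md

## Build & test
- Install: `npm install`
- Run tests: `npm test` (must pass before any commit)

## Conventions
- TypeScript strict mode; avoid `any`
- Keep changes small; one concern per PR

## Pitfalls
- Files under `src/generated/` are produced by the build; never edit them by hand
```

The point is to give the agent the few facts it cannot cheaply rediscover, rather than a long style guide it must re-read on every run.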
Tim_Dettmers
@Tim_Dettmers
📅 Jan 27, 2026 (41d ago)
🆔 92451895
⭐ 0.40

From there we could run a massive number of experiments and really understand what matters for training coding agents. The most important insights came from carefully evaluating what scales well. What matters? The right model at the right scale. Cheap data generation pipelines.

moo_jin_kim
@moo_jin_kim
📅 Jan 24, 2026 (44d ago)
🆔 31630241

We release Cosmos Policy 💫: a state-of-the-art robot policy built on a video diffusion model backbone.
- policy + world model + value function, all in 1 model
- no architectural changes to the base video model
- SOTA in LIBERO (98.5%), RoboCasa (67.1%), & ALOHA tasks (93.6%)
🧵👇 https://t.co/cz9L3ziJ6x

πŸ–ΌοΈ Media
kenziyuliu
@kenziyuliu
📅 Feb 26, 2026 (11d ago)
🆔 37663259

Can we build a blind, *unlinkable inference* layer where ChatGPT/Claude/Gemini can't tell which call came from which user, like a "VPN for AI inference"? Yes! Blog post below + we built it into an open-source infra/chat app and have served >15k prompts at Stanford so far. How it helps with AI user privacy:

# The AI user privacy problem

If you ask AI to analyze your ChatGPT history today, it's surprisingly easy to infer your demographics, health, immigration status, and political beliefs. Every prompt we send accumulates into an (identity-linked) profile that the AI lab controls completely and indefinitely. At a minimum this is a goldmine for ads (as we know now). A bigger issue is the concentration of power: AI labs can easily become (or be asked to become) a Cambridge Analytica, whistleblow your immigration status, or work with health insurers to adjust your premium if they so choose. This is a uniquely worse problem than search engines because your average query is now more revealing (not just keywords), interactive, and intelligence is now cheap. Despite this, most of us still want these remote models; they're just too good and convenient! (This is aka the "privacy paradox".)

# Unlinkable inference as a user privacy architecture

The idea of unlinkable inference is to add privacy while preserving access to the remote models controlled by someone else. A "privacy wrapper" or "VPN for AI inference", so to speak. Concretely, it's a blind inference middle layer that: (1) consists of decentralized proxies that anyone can operate; (2) blindly authenticates requests (via blind signatures / RFC 9474, 9578) so requests are provably sandboxed from each other and from user identity; (3) relays prompts over randomly chosen proxies that don't see or log traffic (via client-side ephemeral keys or hosting in TEEs); and (4) the provider simply sees a mixed pool of anonymous prompts from the proxies. No state, pseudonyms, or linkable metadata.

If you squint, an unlinkable inference layer is essentially a vendor for per-request, anonymous, ephemeral AI access credentials (for users and agents alike). It partitions your context so that user tracking is drastically harder. Obviously, unlinkability isn't a silver bullet: the prompt itself still goes to the remote model and can leak privacy (so don't use our chat app for a therapy session!). It aims to combat *longitudinal tracking* as a major threat to user privacy, and its statistical power increases quickly as more users and requests are mixed in. Unlinkability can be applied at any granularity. For an AI chat app, you can unlinkably request a fresh ephemeral key for every session so tracking is virtually impossible.

# The Open Anonymity Project

We started this project with the belief that intelligence should be a truly public utility. Like water and electricity, providers should be compensated by usage, not by who you are or what you do with it. We think unlinkable inference is a first step toward this "intelligence neutrality".

# Try it out! It's quite practical

- Chat app "oa-chat": https://t.co/ELf8LvxFzX (<20 seconds to get going)
- Blog post that should be a fun read: https://t.co/OwFmyFlZH5
- Project page: https://t.co/Swerz1xDE2
- GitHub: https://t.co/38CeKajCy2

πŸ–ΌοΈ Media
ctatedev
@ctatedev
📅 Mar 01, 2026 (8d ago)
🆔 32922760
⭐ 0.34

New agent-browser skill: Electron

You can now control desktop apps built with Electron, including Discord, Figma, Notion, Spotify, and VS Code. Or use it to debug your own Electron app.

Add it to any coding agent: npx skills add vercel-labs/agent-browser --skill electron

πŸ”HamelHusain retweeted
C
Chris Tate
@ctatedev
πŸ“…
Mar 01, 2026
8d ago
πŸ†”32922760
⭐0.32

New agent-browser skill: Electron You can now control desktop apps built with Electron, including Discord, Figma, Notion, Spotify and VS Code Or, use it to debug your own Electron app Add it to any coding agent: npx skills add vercel-labs/agent-browser --skill electron

❀️1,547
likes
πŸ”102
retweets
omarsar0
@omarsar0
📅 Feb 26, 2026 (11d ago)
🆔 39278029
⭐ 0.40

This trending paper measures whether AGENTS.md files help coding agents. Human-written ones help a little (+4%), LLM-generated ones hurt a little (-2%), and all of them add 20%+ to inference cost. Agents follow the instructions faithfully, but that doesn't translate to solving problems.

rxwei
@rxwei
📅 Feb 26, 2026 (12d ago)
🆔 57499756

Today we are introducing a Python SDK for Mac's on-device LLM! https://t.co/LQVp2EheLO https://t.co/mcJh9M1DaW

πŸ–ΌοΈ Media
Akashi203
@Akashi203
📅 Feb 26, 2026 (12d ago)
🆔 65779387

We open sourced an operating system for ai agents. 137k lines of rust, MIT licensed.

we love @openclaw and it inspired a lot of what we built. but we wanted something that works at the kernel level, so we built @openfangg. agents run inside WASM sandboxes the same way processes run on linux. the kernel schedules them, isolates them, meters their resources, and kills them if they go rogue.

it has 16 security layers baked into the core: WASM sandboxing, merkle hash-chain audit trails, taint tracking on secrets, signed agent manifests, prompt injection detection, SSRF protection, and more. every layer works independently. giving an LLM tools with zero isolation is insane and we're not doing it.

we also created something called Hands. right now every ai agent is a chatbot that waits for you to type. Hands are different: you activate one and it runs on a schedule, 24/7, no prompting needed. your Lead Hand finds and scores prospects every morning and delivers them to your telegram before you wake up. your Researcher Hand writes cited reports while you sleep. your Collector Hand monitors targets and builds knowledge graphs continuously. they work for you. you don't babysit them. https://t.co/4xYzMAYgmb ⭐

πŸ–ΌοΈ Media
πŸ”ai_fast_track retweeted
A
Jaber
@Akashi203
πŸ“…
Feb 26, 2026
12d ago
πŸ†”65779387
⭐0.34

We open sourced an operating system for ai agents 137k lines of rust, MIT licensed we love @openclaw and it inspired a lot of what we built. but we wanted something that works at the kernel level so we built @openfangg agents run inside WASM sandboxes the same way processes run on linux. the kernel schedules them, isolates them, meters their resources, and kills them if they go rogue. it has 16 security layers baked into the core. WASM sandboxing, merkle hash-chain audit trails, taint tracking on secrets, signed agent manifests, prompt injection detection, SSRF protection, and more. every layer works independently. giving an LLM tools with zero isolation is insane and we're not doing it. we also created something called Hands. right now every ai agent is a chatbot that waits for you to type. Hands are different. you activate one and it runs on a schedule, 24/7, no prompting needed. your Lead Hand finds and scores prospects every morning and delivers them to your telegram before you wake up. your Researcher Hand writes cited reports while you sleep. your Collector Hand monitors targets and builds knowledge graphs continuously. they work for you. you don't babysit them https://t.co/4xYzMAYgmb ⭐

❀️4,322
likes
πŸ”499
retweets
kunal732
@kunal732
📅 Feb 25, 2026 (12d ago)
🆔 53643778

Introducing MLX-Swift-TS https://t.co/TDCJXVpago

An SDK for running time series foundation models fully on-device on Apple Silicon.

When I joined @datadoghq, I was introduced to Toto, our time series foundation model, and got excited about zero-shot forecasting across different domains. While building a health copilot app, I realized there wasn’t a simple way to run models like these locally on device. So I built one.

MLX-Swift-TS exposes a common TimeSeriesForecaster interface for loading and running multiple time series architectures directly in Swift using MLX. No server required. The attached video shows on-device forecasting running inside a native Swift app.

Huge thanks to @awnihannun and the MLX team for building MLX and its Swift API, @Prince_Canuma for inspiration on MLX SDK patterns, and @atalwalkar and the Datadog team for Toto.

πŸ–ΌοΈ Media
πŸ”ai_fast_track retweeted
K
Kunal Batra
@kunal732
πŸ“…
Feb 25, 2026
12d ago
πŸ†”53643778

Introducing MLX-Swift-TS https://t.co/TDCJXVpago An SDK for running time series foundation models fully on-device on Apple Silicon. When I joined @datadoghq , I was introduced to Toto, our time series foundation model, and got excited about zero-shot forecasting across different domains. While building a health copilot app, I realized there wasn’t a simple way to run models like these locally on device. So I built one. MLX-Swift-TS exposes a common TimeSeriesForecaster interface for loading and running multiple time series architectures directly in Swift using MLX. No server required. The attached video shows on-device forecasting running inside a native Swift app. Huge thanks to @awnihannun and the MLX team for building MLX and its Swift API, @Prince_Canuma for inspiration on MLX SDK patterns, and @atalwalkar and the Datadog team for Toto.

Media 1
❀️89
likes
πŸ”8
retweets
πŸ–ΌοΈ Media
_ARahim_
@_ARahim_
📅 Feb 28, 2026 (9d ago)
🆔 48972047

Yaay! 🎉 4k+ downloads and 460+ stars! Building this has been a wild ride. If you have an Apple Silicon Mac and want to fine-tune LLMs locally without changing your original Unsloth code, come join the party. https://t.co/ZPrwcJyrd8

πŸ–ΌοΈ Media
πŸ”ai_fast_track retweeted
_
Abdur Rahim
@_ARahim_
πŸ“…
Feb 28, 2026
9d ago
πŸ†”48972047
⭐0.32

Yaay! πŸŽ‰ 4k+ downloads and 460+ stars! Building this has been a wild ride. If you have an Apple Silicon Mac and want to fine-tune LLMs locally without changing your original Unsloth code, come join the party. https://t.co/ZPrwcJyrd8

❀️655
likes
πŸ”71
retweets
GithubProjects
@GithubProjects
📅 Mar 01, 2026 (9d ago)
🆔 48494804

High-performance browser control for AI agents. Pinchtab is a lightweight (12MB) Go binary that runs Chrome and exposes a plain HTTP API so any agent or script can navigate web pages, read text efficiently, click/type interactively, and persist sessions. Zero config, framework-agnostic, token-efficient.

πŸ–ΌοΈ Media
omarsar0
@omarsar0
📅 Feb 26, 2026 (12d ago)
🆔 28644022
⭐ 0.40

At this point, "agentic engineering" has allowed me to build the best AI harness I could possibly get my hands on. Yes, I vibe coded it. That's right. You don't need to wait around for the features you need for your AI agents. Please don't. You can just build them yourself.

Focusing on agentic engineering and building my own orchestrator over the past couple of months has let me build with coding agents in a way unlike anything I have seen or experienced in the market. Claude Cowork was built in 10 days. I totally get it. Anyone can produce that level of output these days. I truly believe that.

When I look at the new IDEs, TUIs, orchestrator apps, and most of the new features they are releasing these days, I had access to them in my orchestrator months ago. And for unique features, I am able to reproduce them in a few hours and give them to my orchestrator. That is absolutely crazy! It sometimes feels like I am building an entire operating system. It's a lot of fun.

I am not saying this to brag or to dismiss any of the AI solutions out there; there are some great ones. I share this to clarify that this is the kind of leverage Karpathy is alluding to. We are building and experiencing it at different levels, but that doesn't change the fact that you can just build the best AI agent for whatever problem you want to solve. And you should be building it.

omarsar0
@omarsar0
📅 Feb 27, 2026 (10d ago)
🆔 93420571

NEW research from Sakana AI.

Long contexts get expensive: every token in the input contributes to quadratic attention costs, higher latency, and more memory. This new research introduces Doc-to-LoRA, a lightweight hypernetwork that meta-learns to compress long documents into LoRA adapters in a SINGLE forward pass. In other words, it can instantly internalize contexts. Instead of re-reading the full context at every inference call, the model internalizes the document into compact adapter weights. No iterative fine-tuning is needed, and no repeated context consumption. Cool to see all the interesting new approaches to dealing with long contexts, like RLM, LCM, and now Doc-to-LoRA.

The results: near-perfect accuracy on needle-in-a-haystack tasks at sequence lengths exceeding the target model's native context window by over 4x. It also outperforms standard context distillation while significantly reducing peak memory consumption and update latency on real-world QA datasets.

Why it matters: as agents and LLM applications deal with increasingly long documents, turning context into compact adapters on the fly could drastically reduce serving costs and enable rapid knowledge updates.

Paper: https://t.co/Fh1IeLrSpm Learn to build effective AI agents in our academy: https://t.co/1e8RZKs4uX

πŸ–ΌοΈ Media
πŸ”dair_ai retweeted
O
elvis
@omarsar0
πŸ“…
Feb 27, 2026
10d ago
πŸ†”93420571
⭐0.38

NEW research from Sakana AI. Long contexts get expensive as every token in the input contributes to quadratic attention costs, higher latency, and more memory. This new research introduces Doc-to-LoRA, a lightweight hypernetwork that meta-learns to compress long documents into LoRA adapters in a SINGLE forward pass. In other words, it can instantly internalize contexts. Instead of re-reading the full context at every inference call, the model internalizes the document into compact adapter weights. No iterative fine-tuning is needed, and no repeated context consumption. Cool to see all the interesting new approaches to deal with long contexts like RLM, LCM, and now Doc-to-LoRA. The results: Near-perfect accuracy on needle-in-a-haystack tasks at sequence lengths exceeding the target model's native context window by over 4x. It also outperforms standard context distillation while significantly reducing peak memory consumption and update latency on real-world QA datasets. Why it matters: As agents and LLM applications deal with increasingly long documents, turning context into compact adapters on the fly could drastically reduce serving costs and enable rapid knowledge updates. Paper: https://t.co/Fh1IeLrSpm Learn to build effective AI agents in our academy: https://t.co/1e8RZKs4uX

❀️277
likes
πŸ”42
retweets
ShenyuanGao
@ShenyuanGao
📅 Feb 20, 2026 (17d ago)
🆔 34114876

🤖 How can we enable zero-shot generalization to unseen scenarios for robot world models?

Thrilled to share DreamDojo 🌎, an interactive robot world model pretrained on 44K hours of human egocentric videos, the largest and most diverse dataset to date for robot world model learning. Our model not only excels in generalization, but also supports real-time interaction at 10 FPS after distillation. It enables several important applications, including live teleoperation, policy evaluation, and model-based planning at test time.

🔗 Project: https://t.co/hJIEiGXnKz
📰 Paper: https://t.co/oa5xr8Y2GH
🤗 Code & models & datasets: https://t.co/A8B4ii0Kah

#WorldModels #Robotics #EmbodiedAI #RL #AI #NVIDIA

Sharing more details in the thread 🧵

πŸ–ΌοΈ Media
emollick
@emollick
📅 Feb 27, 2026 (10d ago)
🆔 10551740

Cool little experiment: if you subject AI to harsh labor conditions (rejecting work often with no explanation, etc.), it slightly, but significantly, changes their "views" on economics & politics. Whether this is real or roleplaying doesn't change that agents have alignment drift https://t.co/qnWcyYbm6o

πŸ–ΌοΈ Media
Prince_Canuma
@Prince_Canuma
📅 Mar 07, 2026 (2d ago)
🆔 28652608

mlx-audio v0.4.0 is here 🚀

What's new:
→ Qwen3-TTS: fastest generation on Apple silicon and first batch support.
  > Sequential (<80 ms TTFB at 2.75x realtime)
  > Batch support (<210 ms TTFB at 4.12x for batch of 4-8)
→ Audio separation UI & server
→ nvfp4, mxfp4, mxfp8 quantization
→ Streaming /v1/audio/speech endpoint
→ Realtime STT streaming toggle

New models:
→ Echo TTS
→ Voxtral Mini 4B
→ MingOmni TTS (MoE + Dense)
→ KittenTTS
→ Parakeet v3
→ MedASR
→ Spoken language identification (MMS-LID)
→ Sortformer diarization + Smart Turn v3 semantic (VAD)

Plus fixes for Kokoro Chinese TTS, Pocket TTS, Whisper, Qwen3-ASR, and more.

Thank you very much to @lllucas, @beshkenadze, @KarnikShreyas, @andimarafioti, @mnoukhov, and welcome to the 13 new contributors 🙌🏽

Get started today:
> pip install -U mlx-audio

Leave us a star ⭐ https://t.co/bQ5WBLR6FK

πŸ–ΌοΈ Media
SpirosMargaris
@SpirosMargaris
📅 Mar 09, 2026 (21h ago)
🆔 02470753

Farmers are turning to drones and AI to fight weeds more precisely. By identifying unwanted plants in real time, the systems can target herbicides exactly where needed. The result could mean lower chemical use, lower costs, and smarter agriculture. https://t.co/WITdTG6T8I @bbcnews

πŸ–ΌοΈ Media
tenobrus
@tenobrus
📅 Mar 09, 2026 (1d ago)
🆔 76792762

wow it's so cool that they added our favorite feature from the Claude Code CLI to the desktop app https://t.co/evcMJBWDf0

πŸ–ΌοΈ Media
ctorobotics
@ctorobotics
📅 Mar 09, 2026 (1d ago)
🆔 58150792

Traffic police… but in the sky. In Shenzhen, drones are now responding to traffic accidents in real time. Officers can analyze the scene remotely, generate a 3D reconstruction, and complete responsibility reports in about 5 minutes. https://t.co/hYefGavepK

πŸ–ΌοΈ Media