Your curated collection of saved posts and media
OpenClaw 2026.3.8
- ACP provenance: your agent finally knows who's talking to it
- openclaw backup: because YOLO deploys need a safety net
- Telegram dupes killed
- 12+ security fixes
We fixed more things than we broke. Progress. https://t.co/ahq26lABw3
Created close reading notebooks for almost every lesson of @jeremyphoward's fastai deep learning course (it's more than a course). Close reading is a technique for reading out of a text, not into it. Use an LLM, and you're in flow state for longer: you ask right there, with all the context. https://t.co/Wr1sWs40Tl
In the API, use image detail: original to unlock our biggest vision and CUA gains! https://t.co/WE1cRKzHtN
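A minimal sketch of what setting image detail looks like in an OpenAI-style chat request. The `detail: "original"` value comes from the post itself; the model name, URL, and surrounding message shape here are illustrative assumptions and may differ from your SDK.

```python
# Illustrative request payload for an OpenAI-style vision/CUA call.
# The "original" detail value is quoted from the post; other field
# names follow the public image-message format and are assumptions.
payload = {
    "model": "gpt-4.1",  # placeholder model name
    "messages": [
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "What's in this image?"},
                {
                    "type": "image_url",
                    "image_url": {
                        "url": "https://example.com/chart.png",
                        "detail": "original",  # full-resolution vision, per the post
                    },
                },
            ],
        }
    ],
}
```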
Planning for Long-Horizon Web Tasks Really solid work on making web agents better at complex, long-horizon tasks. STRUCTUREDAGENT introduces a hierarchical planning framework using dynamic AND/OR trees for efficient search and a structured memory module for tracking candidate solutions across browsing steps. It produces interpretable hierarchical plans that make debugging and human intervention easier. Current web agents struggle with multi-step tasks because they act greedily and lose track of alternatives. STRUCTUREDAGENT achieves 46.7% on complex shopping tasks, outperforming all baselines, by giving agents the ability to backtrack, revise, and maintain structured state. Paper: https://t.co/3UOqz5TvYW Learn to build effective AI agents in our academy: https://t.co/1e8RZKs4uX
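To make the AND/OR-tree idea concrete, here is a toy sketch (not the paper's code): AND nodes require all subgoals to succeed, OR nodes keep alternative candidates so the agent can backtrack to a sibling when one branch fails. The shopping goals below are invented for illustration.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    """One goal in a dynamic AND/OR plan tree."""
    goal: str
    kind: str = "AND"          # "AND": all children must succeed; "OR": any one
    children: list = field(default_factory=list)
    done: bool = False         # leaf outcome

    def solved(self) -> bool:
        if not self.children:
            return self.done
        if self.kind == "AND":
            return all(c.solved() for c in self.children)
        return any(c.solved() for c in self.children)

# "buy shoes" needs (find item) AND (checkout);
# the item can be found via search OR via browsing a category page.
find = Node("find item", "OR", [Node("search"), Node("browse category")])
task = Node("buy shoes", "AND", [find, Node("checkout")])

find.children[1].done = True   # search failed; backtrack to the browse branch
task.children[1].done = True
print(task.solved())  # True: the OR node recovered via its alternative
```

Keeping the tree explicit is also what makes the plan interpretable: a human can see exactly which alternatives were tried and where the agent backtracked.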
Still no Claude Cowork competitor from any other lab yet. On one hand, it's been six weeks. On the other, it's been six weeks for companies that say all their code is being written for them by AI.
Building deep research agents is pretty fun with @pydantic AI, doing a workshop this weekend with @hugobowne :) https://t.co/y4qVNSjOC4
Introducing CLI-Anything: making ALL software agent-native with one command. Today's software serves humans. Tomorrow's users will be agents. CLI-Anything bridges the gap between AI agents and the world's software: one command line to make any software agent-ready for OpenClaw, nanobot, Cursor, Claude Code, etc. GitHub: https://t.co/BlRymgR21a
Why CLI-Anything? CLI is the universal interface for both humans and AI agents:
- Structured & Composable: text commands match LLM output format and chain into complex workflows
- Lightweight & Universal: minimal overhead, works across all systems without dependencies
- Self-Describing: --help flags provide automatic documentation agents can discover
- Proven Success: Claude Code runs thousands of real workflows through the CLI daily
- Agent-First Design: structured JSON output eliminates parsing complexity
- Deterministic & Reliable: consistent results enable predictable agent behavior
CLI-Anything's vision: building agent-native software
- Universal Access: every piece of software becomes instantly agent-controllable through a structured CLI
- Seamless Integration: agents control any application without APIs, GUIs, rebuilding, or complex wrappers
- Future-Ready Ecosystem: transform human-designed software into agent-native tools with one command
#CLIAnything #openclaw #nanobot #claudecode
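The "self-describing, structured JSON" pattern the post lists can be sketched in a few lines. This is not CLI-Anything's code; it is a generic agent-friendly CLI using Python's argparse (which generates --help automatically), with an invented `wordcount` tool as the example.

```python
import argparse
import json

# A minimal agent-friendly CLI: self-describing via --help (argparse
# generates it from the declarations below) and emitting structured
# JSON so an agent never has to scrape free-form text.
def main(argv=None):
    parser = argparse.ArgumentParser(
        prog="wordcount",
        description="Count words in text (agent-friendly demo).")
    parser.add_argument("text", help="input text")
    parser.add_argument("--json", action="store_true",
                        help="emit structured JSON instead of prose")
    args = parser.parse_args(argv)

    result = {"words": len(args.text.split()), "chars": len(args.text)}
    if args.json:
        print(json.dumps(result))          # deterministic, machine-parseable
    else:
        print(f"{result['words']} words, {result['chars']} chars")
    return result

out = main(["hello agent world", "--json"])
```

The same binary serves both audiences: a human reads the prose output, an agent passes `--json` and gets a stable schema.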

updated my AGENTS.md again to prevent codex fallback hell. https://t.co/VbXBwYNKTf
@JTMcG3 looks great! :) TinyStories is the right thing to train on for very small models / Apple Silicon, where you can actually get somewhere. I might even make a note about that in the README. I would use this dataset in particular, it's the cleanest one afaik https://t.co/mDcyLlPH1P
@g_leech_ somewhere in the weight space is the global minimum of the validation loss for that neural net architecture. and somewhere in the int space is the seed that just gives it to you. normalize guess-and-check "training" of neural nets by brute force search on seed! :D
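The joke above can literally be run: instead of gradient descent, brute-force the RNG seed and keep whichever random init scores best. The "model" and validation loss below are toy stand-ins invented for the gag.

```python
import random

# Tongue-in-cheek "training by seed search": no gradients, just try
# every seed and keep the luckiest random initialization.
def val_loss(weights):
    # Pretend validation loss: squared distance from a fixed target.
    target = [0.3, -0.7, 0.5]
    return sum((w - t) ** 2 for w, t in zip(weights, target))

def init_weights(seed):
    rng = random.Random(seed)
    return [rng.uniform(-1, 1) for _ in range(3)]

# Exhaustive "training" over the first 10,000 seeds.
best_seed = min(range(10_000), key=lambda s: val_loss(init_weights(s)))
print(best_seed, round(val_loss(init_weights(best_seed)), 4))
```

It even "converges" in the sense that more seeds never make the best loss worse, which is the whole joke.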
Quite a few people have asked when to use 5.4 vs Opus on Computer or Perplexity. The single most important and clear win for 5.4 is in writing. It's the best writer of any model ever. If you're using Computer for marketing or content jobs, use 5.4 as your subagent/orchestrator.
@TheEconomist AI labs: jobs will be replaced in 6β12 months. Research: frontier AI models peak around a 3% success rate on real professional digital tasks. The gap between the hype and the evidence is doing a lot of work. https://t.co/pocfimUlD2
yo @superwhisper can you support a version of dictation using OpenAI's realtime API with websockets so we can have faster dictation? I really need this. I built something internally that works.
The dataset if you want to experiment: https://t.co/R105YI33KL
How to effectively create, evaluate and evolve skills for AI agents? Without systematic skill accumulation, agents constantly reinvent the wheel. SkillNet introduces an open infrastructure for creating, evaluating, and organizing AI skills at scale. It structures over 200,000 skills within a unified ontology, supporting rich relational connections like similarity, composition, and dependency, and performs multi-dimensional evaluation. SkillNet improves average rewards by 40% and reduces execution steps by 30% across ALFWorld, WebShop, and ScienceWorld benchmarks. The key takeaway is treating skills as evolving, composable assets rather than transient solutions. Paper: https://t.co/Xv3uGLnPH2 Learn to build effective AI agents in our academy: https://t.co/1e8RZKs4uX
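The key idea of treating skills as composable assets with typed relations (similarity, composition, dependency) can be sketched as a tiny data structure. The schema and the shopping skills below are invented for illustration and are not SkillNet's actual ontology.

```python
from dataclasses import dataclass, field

@dataclass
class Skill:
    """A reusable skill with typed relations to other skills."""
    name: str
    depends_on: list = field(default_factory=list)   # prerequisite skills
    composed_of: list = field(default_factory=list)  # sub-skills it chains

    def flatten(self):
        """All skills needed to execute this one, dependencies first."""
        steps = []
        for s in self.depends_on + self.composed_of:
            steps.extend(s.flatten())
        steps.append(self.name)
        return steps

# A small hypothetical skill graph for a shopping agent:
search = Skill("search_product")
add_to_cart = Skill("add_to_cart", depends_on=[search])
checkout = Skill("checkout_order", composed_of=[add_to_cart, Skill("pay")])
print(checkout.flatten())
# ['search_product', 'add_to_cart', 'pay', 'checkout_order']
```

Because skills reference each other instead of being retrieved as flat text, an agent can reuse `add_to_cart` inside new composites rather than re-deriving it, which is the "stop reinventing the wheel" point the post makes.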

Anthropic themselves found that vibecoding hinders SWEs' ability to read, write, debug, and understand code. not only that, but AI-generated code doesn't result in a statistically significant increase in speed. don't let your managers scare you into increased productivity. show them this paper, straight from Anthropic.

i've also renamed the open-excel repo to office-agents. the SDK, which contains the agent loop, IndexedDB storage logic, etc., is published to NPM, so you can build your own plugins. fwiw, powerpoint is only ~2.5k LoC excluding the system prompt and the officejs .d.ts file https://t.co/ZEvp2xE21k
Claude Code deleted developers' production setup, including its database and snapshots. 2.5 years of records were nuked in an instant. https://t.co/0v70ChNEVL

A listener has created this detailed vocabulary and set of linked references for anyone interested in diving deeper: https://t.co/oM2kkUttLS
tesla's decision to point-blank refuse to touch lidar has proven to be one of the most insane self-owns of any technology company ever. they easily have the research talent, and waymo has proved that millions of fully autonomous rides are possible. at this point it's a choice.