Your curated collection of saved posts and media
Upgraded Instincts: Every generation thinks it's teaching the world. In reality, it's just witnessing the world rewrite its alphabet. 🧵 How it was made with Seedance 2
I love that Gary is apparently making an incredibly bullish claim about AI agent capability, then: frontier models can maintain your codebase for 8 months on their own with a rapidly decreasing regression rate. Look at the 4.5 to 4.6 jump. That is AMAZING and bodes well. https://t.co/vuEtsCMX6Y
I like how "happy" Claude was with what it made when it did its own initial quality check https://t.co/MRO7IPQ86m

fal now has 1,000 followers on Hugging Face @huggingface https://t.co/WCc0mTVOug

RoboMME: Benchmarking and Understanding Memory for Robotic Generalist Policies. Paper: https://t.co/kMOeRkZ4Pu https://t.co/pOyaYHXjhx
I was in the middle of saying "as a born and raised New Yorker, we welcome everyone into this city" when he threw that over my head. https://t.co/i5iD3MVf7h
We've been cooking something exciting 4 you https://t.co/3EAQGnB6CY
Got access to Codex Pro Plan for OSS / Codex Security for 6 months. Thanks a bunch @OpenAIDevs @reach_vb. Will be testing it out extensively for the next few weeks :) https://t.co/Pxcys1Xqfs
I'm excited to announce Context Hub, an open tool that gives your coding agent the up-to-date API documentation it needs. Install it and prompt your agent to use it to fetch curated docs via a simple CLI. (See image.)

Why this matters: coding agents often use outdated APIs and hallucinate parameters. For example, when I ask Claude Code to call OpenAI's GPT-5.2, it uses the older chat completions API instead of the newer responses API, even though the newer one has been out for a year. Context Hub solves this.

Context Hub is also designed to get smarter over time. Agents can annotate docs with notes: if your agent discovers a workaround, it can save it and doesn't have to rediscover it next session. Longer term, we're building toward agents sharing what they learn with each other, so the whole community benefits.

Thanks Rohit Prsad and Xin Ye for working with me on this!

npm install -g @aisuite/chub
GitHub: https://t.co/OCkyxXQMCq
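The annotate-and-reuse idea above can be sketched as a tiny persistent note store. This is a minimal illustration of the concept only; the function names, JSON layout, and file location are hypothetical and are not Context Hub's actual CLI or storage format.

```python
import json
import os
import pathlib
import tempfile

def save_note(store_path, api_name, note):
    """Append an agent-discovered note for an API so a later session
    can reuse it instead of rediscovering the workaround.
    (Hypothetical sketch, not Context Hub's real format.)"""
    path = pathlib.Path(store_path)
    notes = json.loads(path.read_text()) if path.exists() else {}
    notes.setdefault(api_name, []).append(note)
    path.write_text(json.dumps(notes, indent=2))

def load_notes(store_path, api_name):
    """Return all saved notes for an API, or [] if none exist."""
    path = pathlib.Path(store_path)
    if not path.exists():
        return []
    return json.loads(path.read_text()).get(api_name, [])

# Demo in a throwaway directory.
store = os.path.join(tempfile.mkdtemp(), "notes.json")
save_note(store, "openai.responses", "prefer the responses API over chat completions")
notes = load_notes(store, "openai.responses")
```

The point of the sketch: the note survives the process that wrote it, so the "next session" only needs a read, not a rediscovery.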

Love OpenClaw but don't trust the security? Now you can have your own private agent running in Pokee secure sandbox, with 1000s of secure tool integrations. Vibe code on a GitHub repo, automate sales, deep research, all from 1 agent. https://t.co/0Je7al0WNX Open to first 100 ppl! https://t.co/LiyCJD0oNG
New post on the OpenAI Developer Blog: how we use skills for open-source maintenance, from planning and coding to testing and release-readiness checks with GitHub Actions. Hope it's useful for your projects too https://t.co/KkoTaOUutn
https://t.co/BPRYrJGHpp
https://t.co/idnz88Stdw
This really is an all-time photo. A protester shouting about the pros of immigration is interrupted by an Islamic terrorist throwing a bomb over his head. https://t.co/3TyBfewCYT
Democracy without secure elections is merely a facade https://t.co/KTLSYdsyXc
New this AM: Anthropic has filed its lawsuit against the Trump administration over the supply chain risk designation https://t.co/hmEB80FkYm
BREAKING: Alibaba tested 18 AI coding agents on 100 real codebases, spanning 233 days each. They failed spectacularly.

Turns out passing tests once is easy. Maintaining code for 8 months without breaking everything is where AI completely collapses.

SWE-CI is the first benchmark that measures long-term code maintenance instead of one-shot bug fixes. Each task tracks 71 consecutive commits of real evolution.

75% of models break previously working code during maintenance. Only Claude Opus 4.5 and 4.6 stay above a 50% zero-regression rate. Every other model accumulates technical debt that compounds with every single iteration.

Here's the brutal part:
- HumanEval and SWE-bench measure "does it work right now"
- SWE-CI measures "does it still work after 8 months of changes"

Agents optimized for snapshot testing write brittle code that passes tests today but becomes completely unmaintainable tomorrow.

They built EvoScore to weight later iterations heavier than early ones. Agents that sacrifice code quality for quick wins get punished when the consequences compound.

The AI coding narrative just got more honest. Most models can write code. Almost none can maintain it.
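The "weight later iterations heavier" idea can be sketched as follows. The linear weighting scheme and the boolean pass/fail input are assumptions for illustration; they are not the paper's actual EvoScore formula.

```python
def evo_score(iteration_passes):
    """Weighted pass score over consecutive maintenance iterations.

    iteration_passes: list of booleans, True if the codebase still
    passes its regression suite after iteration i.

    Later iterations get linearly larger weights, so an agent whose
    code degrades over time is penalized more than one that stumbles
    early and recovers. (Linear weights are an assumption; the real
    benchmark may use a different scheme.)
    """
    n = len(iteration_passes)
    weights = [i + 1 for i in range(n)]  # 1, 2, ..., n
    total = sum(weights)
    return sum(w for w, ok in zip(weights, iteration_passes) if ok) / total

# Same number of failures, very different scores: a late failure is
# weighted 4/10, an early one only 1/10.
early_fail = [False, True, True, True]   # fails iteration 1
late_fail = [True, True, True, False]    # fails iteration 4
print(evo_score(early_fail))  # 0.9
print(evo_score(late_fail))   # 0.6
```

This captures the claimed incentive: quick wins that rot later cost more than early mistakes that get fixed.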
Grok 4.1 is currently reviewing the entire corpus of EU legislation, one regulation at a time. 21 / 149,183 so far. Each with a single verdict: keep or delete. https://t.co/kkICWmoVSL
If you're working with lots of slide decks and need a better way to search through them, Surreal Slides makes it simple. Built around LlamaParse, it parses presentation files into clean, structured data, turning raw slides into something AI can truly understand. Each slide is extracted, summarized, and organized before being stored in @SurrealDB for flexible retrieval. From there, you can query your entire presentation library in natural language through an agentic interface: no need to manually scan files or remember where a specific slide lives. Take a look at the demo below. GitHub Repository: https://t.co/jsTnjkUoED
You can now use Claude Code and GitHub CLI directly inside Perplexity Computer. We gave it an open issue on Openclaw. Computer:
- Forked the repo
- Wrote a plan to fix the bug
- Opened Claude Code and implemented it
- Submitted a PR via GitHub CLI
https://t.co/MpVPchNqJa