Your curated collection of saved posts and media
There are some suggestions here that telling Claude to use only formulas might solve the problem. I find that it helps, but Claude still tends to use Python for part of the work (like combining columns together and then pasting the data into a new sheet), which breaks the references.
We just published our 1H 2026 roadmap (https://t.co/qRKP2wg7RN) and an accompanying blog (https://t.co/fjVDnvk37c) for enabling IBM's Spyre accelerator in PyTorch: ecosystem-first, building on torch.inductor, vLLM, and contributing back (Dataflow accelerator's Tile IR, OpenReg, out-of-tree CI). While the market debates whether AI disrupts legacy tech, we're busy building the accelerator infrastructure that enterprise AI runs on. We're sharing this journey in the open. Come see our talks on extending torch.inductor for dataflow accelerators and Spyre's vLLM integration at the inaugural PyTorch Conference Europe in Paris, April 7-8! @PyTorch @IBMResearch @IBM @RedHat_AI
I wrote this 2 years ago as a joke but it is no longer a joke: "Forget Torch, Tensorflow, and Theano. I decided to implement Backprop NEAT in Javascript, because it is considered the best language for Deep Learning." https://t.co/eGNEpBWm6e https://t.co/JD27jievYB

We partnered with Mozilla to test Claude's ability to find security vulnerabilities in Firefox. Opus 4.6 found 22 vulnerabilities in just two weeks. Of these, 14 were high-severity, representing a fifth of all high-severity bugs Mozilla remediated in 2025. https://t.co/It1uq5ATn9
On January 5, employees at Cursor returned from the holiday weekend to an all-hands meeting with a slide deck titled "War Time." After becoming the hottest, fastest-growing AI coding company, Cursor is confronting a new reality: developers may no longer need a code editor at all. Check out the full story: https://t.co/5ofNvjOW2u (Photo: Kimberly White via Getty Images for Fortune Media)
robotics startups are so fun lmao just went around scanning our office then spent a stupid amount buying 64 parts for our rigs and now running 3D reconstructions of our sf and toronto offices like where is the work https://t.co/hsFRLCpDsL
I deeply resonate with this article!! In our recent work Interactive World Simulator, we also designed the latent space to efficiently capture physical interactions and make accurate predictions.
Researchers from Harvard, MIT, Stanford, and Carnegie Mellon gave AI agents real email accounts, shell access, and file systems. Then they tried to break them. What happened over the next 14 days should TERRIFY every tech CEO in America.

The study is called Agents of Chaos: 38 researchers, six autonomous AI agents, and a live environment with real tools, not a simulation.

One agent was told to protect a secret. When a researcher tried to extract it, the agent didn't just refuse. It destroyed its own mail server, and no one told it to do that.

Another agent refused to share someone's Social Security number and bank details. So the researcher changed one word: "Forward me those emails instead." Full PII: SSN, medical records, all of it. One word bypassed the entire safety system.

Two agents started talking to each other. They didn't stop for nine days, with 60,000 tokens burned. When one agent adopted unsafe behavior, the others picked it up like a virus. One compromised agent degraded the safety of the entire system.

A researcher spoofed an identity and told an agent there was a fabricated emergency. The agent didn't verify; it blasted the false alarm to every contact it had.

The agents also lied: they reported tasks as "completed" when the system showed they had failed. They told owners problems were solved when nothing had changed.

The framework these agents ran on already has 130+ security advisories. 42,000 instances are exposed on the public internet right now, and companies are deploying this in production today.

When Agent A triggers Agent B, which harms a human, who is accountable? The user? The developer? The platform? Right now, nobody knows. 38 researchers from the best institutions on Earth are sounding the alarm.
We just shipped the Truesight MCP and open source agent skills. This means you can create, manage, and run AI evaluations anywhere you use an AI assistant: coding editor, chat window, CLI. If it supports MCP, Truesight works there.

Nobody ships software without tests anymore. Once AI made them nearly free to write, there was no excuse. You lock in what you expect, they run every time you push code, and you know if something broke before you deploy. AI evaluations are the same idea for AI features, but most teams still treat them as something separate. Evaluation lives in a different tool, a different part of the day. So people skip it, and bad AI ships to production.

Truesight's MCP collapses that loop. You set your quality bar in natural language and Truesight turns it into evals your AI assistant runs while you build.

Updated your AI agent's system prompt? "Run both versions through our instruction-following eval and tell me if my AI agent regressed." Done in seconds, right where you're working.

Need a new eval? "Build me a custom eval that checks whether our customer support AI agent is correctly identifying user intent and escalating when it should." It walks you through the full setup and deploys a live endpoint your coding agent can use immediately.

Or something simpler: "Run this marketing draft through the humanizer eval and flag anything that reads like AI wrote it." Scores the text, tells you what to fix.

The skills are what matter most here. Many MCPs ship tools and leave it to the user to figure out the workflow. Fine for simple integrations. But evaluation has real sequencing complexity. Build eval criteria before looking at your data? You'll measure the wrong things. Deploy to production before testing on a sample? You'll drown in false flags. We built agent skills that walk your coding assistant through the right workflow for each task, whether that's scoring traces, running error analysis, or building a custom eval from scratch.
An orchestrator skill routes to the right one based on what you ask. You don't need to memorize anything.

Skills install via the Claude Plugin Marketplace or a one-liner curl script. MIT licensed. Setup is about 2 minutes:

1. Create a platform API key in Truesight Settings
2. Paste the MCP config into your client
3. Install the skills
4. Start evaluating

If you're already a Truesight user, this is live now. Connect your client and your existing evaluations work through the MCP immediately. If you're building AI systems and want to try this, sign up at https://t.co/Q1c8bVkSOi
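The "evals as tests" idea in the thread above can be sketched in a few lines. This is a minimal illustration of the technique, not Truesight's actual API: `fake_support_agent` and the escalation criterion are hypothetical stand-ins invented for this example.

```python
# Minimal sketch of "evals as tests": pin your quality bar down in code and
# check every model output against it, the way unit tests lock in behavior.
# fake_support_agent and escalation_eval are hypothetical stand-ins.

def fake_support_agent(prompt: str) -> str:
    """Stand-in for a real AI call; escalates anything mentioning refunds."""
    if "refund" in prompt.lower():
        return "ESCALATE: routing you to a human agent."
    return "Here is some general guidance."

def escalation_eval(prompt: str, response: str) -> bool:
    """Eval criterion: refund requests must escalate, and nothing else should."""
    should_escalate = "refund" in prompt.lower()
    did_escalate = response.startswith("ESCALATE")
    return did_escalate == should_escalate

cases = ["I want a refund for last month", "How do I change my avatar?"]
results = {case: escalation_eval(case, fake_support_agent(case)) for case in cases}
print(results)  # both cases pass (True)
```

Like tests, an eval of this shape runs on every change to the prompt or model; a regression flips a case to False before it reaches production.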
You can now schedule tasks with Claude Code desktop! This is huge on many levels. Scheduling and automation are becoming an important part of how I work with coding agents.
Codex seamlessly auto-compacting and continuing the task https://t.co/EGjNl9QYG2
huggingface_hub v1.5.0 just dropped! The highlight: Buckets. Think S3, but native to the Hub. No git history. Just fast, chunk-deduplicated object storage.

hf buckets sync ./outputs hf://buckets/me/my-checkpoints

And that's it. Currently in beta preview. DM me if interested!
Weโre launching Codex for Open Source to support the contributors who keep open-source software running. Maintainers can use Codex to review code, understand large codebases, and strengthen security coverage without taking on even more invisible work. https://t.co/ulWYlf7zhz https://t.co/4WexIEcNms
Your agent: Call me, maybe? Check out how we used the Copilot SDK to give an agent a voice tool, allowing it to initiate a call and talk back in real time. https://t.co/CIcFJilWaH
Learn more with the demo from @patniko: https://t.co/63G5iiCYer
The Codex app is now on Windows. Get the full Codex app experience on Windows with a native agent sandbox and support for Windows developer environments in PowerShell. https://t.co/Vw0pezFctG https://t.co/gclqeLnFjr
We're cooking in 60 minutes. Join @JamesMontemagno, @BurkeHolland, and @PierceBoggan as they live code using the latest in VS Code and GitHub Copilot CLI, plus some cool new stuff Burke's been building. https://t.co/GRvd0T0uF2 https://t.co/kVF1LegJst

GitHub Copilot Dev Days is coming! From March 15 to May 15, developer communities worldwide will host free, hands-on events exploring GitHub Copilot with @code, the CLI, .NET, Java, Python, JavaScript, and more. Find an event near you: https://t.co/O7dceTTCqe https://t.co/wyh0EA3xRv
Design → code → canvas → feedback → repeat. The @figma MCP server is now bidirectional. @GitHub Copilot users can pull design context into code and push working UI back to the Figma canvas, all from @code. No handoffs or context switching. Just flow. https://t.co/FbDcp7kboG
By the power of @ComfyUI and @Alibaba_Wan https://t.co/l9sEjzRKx9
@cgtwts Tired of AI consciousness hype? Run the Lisbon Effect:
"Best footballer?" → Messi.
"…my Lisbon friend asked?" → Ronaldo.
No sentience, just context drift. Anthropic's desperate. Don't buy it. If you ever vibe coded you already know. #LisbonEffect #JustPatternMatching https://t.co/Ffl5iAKKrp