Your curated collection of saved posts and media
you know what hell yea https://t.co/mTYyoxakZy

Lots of buzz online about an upcoming major March heatwave for the American SW & California. And in this case, it does indeed appear increasingly likely that an extremely anomalous and even record-breaking heatwave may envelop much of the SW about a week from now. https://t.co/GByhbmJEZb
no thank you…
@ArthurMacwaters No it won't. Leading AI labs, including Anthropic, know full well that current models are unreliable; third-party tests show a staggering 97% failure rate on digital tasks. Pause and let that sink in. Silicon Valley has always lived in a bubble. Today, its recklessness threatens the entire economy, and our systems aren't ready to cope. Brace yourself. Ask yourself: why do we take AI labs at their word about their own technology? Scrutiny isn't anti-innovation; it's pro-accountability. https://t.co/Ut4hpvTU3C
Bets on Zuck's next bad bet, after Metaverse and AGI/Alexander Wang?
@neil_projects @rubanlah this uses hooks right?
If you've built a multi-agent workflow, you've probably seen it fail in a way that's hard to explain. 🤖 An agent closes an issue another just opened, or ships a change that fails a downstream check. Why does this happen? And how do we fix it? 🧵⬇️
The core problem: We treat multi-agent systems like chat interfaces. But the moment agents begin handling related tasks, they start making implicit assumptions about state, ordering, and validation. They are actually distributed systems.
Fix 1: Typed Schemas 🧱 Natural language is messy. Agents need typed interfaces and strict schemas at every boundary. Passing machine-checkable data means invalid messages fail fast, and downstream steps don't have to guess what a payload means.
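The typed-boundary idea above can be sketched in a few lines. This is a minimal illustration, not any particular framework's API: the message type `IssueUpdate`, its fields, and the `validate` helper are all hypothetical names invented for this example.

```python
from dataclasses import dataclass

# Hypothetical typed message passed between agents; names are illustrative.
@dataclass(frozen=True)
class IssueUpdate:
    issue_id: int
    action: str       # "open" or "close"
    revision: int     # monotonically increasing, for ordering

ALLOWED_ACTIONS = {"open", "close"}

def validate(msg: IssueUpdate) -> IssueUpdate:
    """Fail fast at the boundary instead of letting a downstream agent guess."""
    if msg.action not in ALLOWED_ACTIONS:
        raise ValueError(f"unknown action: {msg.action!r}")
    if msg.revision < 0:
        raise ValueError("revision must be non-negative")
    return msg

# A well-formed message passes through; a malformed one is rejected immediately.
ok = validate(IssueUpdate(issue_id=42, action="close", revision=3))
try:
    validate(IssueUpdate(issue_id=42, action="reopen", revision=3))
    rejected = False
except ValueError:
    rejected = True
```

Because the schema is machine-checkable, the bad "reopen" message dies at the boundary rather than silently confusing a downstream agent about issue state.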
This is the part that actually matters for code reviews: generating code is about output; verifying code is about skepticism, judgment, and trust. Those are different engineering muscles, and given where things are headed with coding agents, strong coding teams need both.
@thesandeep_nair see eg https://t.co/HbxB1boF77, though it needs to be updated (again)
my contempt originates in the kind of behavior I discussed here: https://t.co/HbxB1boF77
The @AskPerplexity account is now meant exclusively for Perplexity Computer updates. Give it a follow to stay up to date on everything Computer can do, with regular updates on new tools, connector capabilities, and workflows.
@buntyverse You don't know me
@ianatmars @ML3democrats There is no voter fraud to speak of. You're a delusional grokon.
Oracle is building yesterday's data centers with tomorrow's debt. Frontier labs like OpenAI want the newest chips, but Nvidia is shipping a new generation annually while data centers still take years to get up and running. That's a mismatch for the whole AI trade. Oracle, funding it with $100B in debt, may be first to crack.
NEW: OpenAI and Google employees, including Google DeepMind Chief Scientist Jeff Dean, filed an amicus brief in support of Anthropic in its lawsuit against the US government. https://t.co/3lQrzlq8BE
Generating code and verifying code are fundamentally different engineering problems. Good to see Claude Code recognizing the importance of review. And does it cost $15-20 per PR?!@#? But the real question is: should the same system that generates the code also verify it? In mature systems, we separate concerns:
• Creation systems → generation
• Integrity systems → quality, governance, verification, observability
They operate on different philosophies: generation optimizes for output, verification for skepticism. Speed is easy. Quality is the hard part. And as code generators evolve, teams will want the freedom to switch between them. That's why we believe the winning stack will look like:
Claude + Qodo = speed + quality = velocity
(Claude reviewing Claude risks shared blind spots, and the $$ spend?)
I've been using @QodoAI for code reviews and their deep expertise in this area is clear. Their recent rule system is brilliant if you want to explore. Thanks to the team for partnering with me on this post. Get 1 month free of Qodo's Teams plan with promo code: UNBIASED
Applications are still open; we're distributing and ramping up slowly.
Good news! Ulysses Sequence Parallelism from the Snowflake AI Research and DeepSpeed teams has been integrated into @huggingface Trainer, Accelerate, and TRL. For extensive details, please see this writeup: https://t.co/2xDWUk8p3V Thanks a lot to @krasul for helping make it happen, and to the others on the HF team who helped with the integration.

New research from Databricks on training enterprise search agents via RL. KARL introduces a multi-task RL approach where agents are trained across heterogeneous search behaviors: constraint-driven entity search, cross-document synthesis, and tabular reasoning. It generalizes substantially better than agents optimized for any single benchmark. KARL is Pareto-optimal on both cost-quality and latency-quality trade-offs compared to Claude 4.6 and GPT 5.2. With sufficient test-time compute, it surpasses the strongest closed models while being more cost-efficient. Paper: https://t.co/CToEmDU89J Learn to build effective AI agents in our academy: https://t.co/LRnpZN7L4c