Your curated collection of saved posts and media

Showing 24 posts · last 30 days · by score
omarsar0
@omarsar0
📅
Mar 08, 2026
2d ago
🆔17961280

How to effectively create, evaluate and evolve skills for AI agents? Without systematic skill accumulation, agents constantly reinvent the wheel. SkillNet introduces an open infrastructure for creating, evaluating, and organizing AI skills at scale. It structures over 200,000 skills within a unified ontology, supporting rich relational connections like similarity, composition, and dependency, and performs multi-dimensional evaluation. SkillNet improves average rewards by 40% and reduces execution steps by 30% across ALFWorld, WebShop, and ScienceWorld benchmarks. The key takeaway is treating skills as evolving, composable assets rather than transient solutions. Paper: https://t.co/Xv3uGLnPH2 Learn to build effective AI agents in our academy: https://t.co/1e8RZKs4uX

aarondotdev
@aarondotdev
📅
Mar 08, 2026
2d ago
🆔17796030

Anthropic themselves found that vibecoding hinders SWEs' ability to read, write, debug, and understand code. not only that, but AI generated code doesn't result in a statistically significant increase in speed. don't let your managers scare you into increased productivity. show them this paper straight from Anthropic.

omarsar0
@omarsar0
📅
Mar 08, 2026
2d ago
🆔64213509

Planning for Long-Horizon Web Tasks Really solid work on making web agents better at complex, long-horizon tasks. STRUCTUREDAGENT introduces a hierarchical planning framework using dynamic AND/OR trees for efficient search and a structured memory module for tracking candidate solutions across browsing steps. It produces interpretable hierarchical plans that make debugging and human intervention easier. Current web agents struggle with multi-step tasks because they act greedily and lose track of alternatives. STRUCTUREDAGENT achieves 46.7% on complex shopping tasks, outperforming all baselines, by giving agents the ability to backtrack, revise, and maintain structured state. Paper: https://t.co/3UOqz5TvYW Learn to build effective AI agents in our academy: https://t.co/1e8RZKs4uX

hewliyang
@hewliyang
📅
Mar 08, 2026
2d ago
🆔03087392

i've also renamed the open-excel repo into office-agents. the SDK, which contains the agent loop, IndexedDB storage logic, etc is published to NPM. so you can build your own plugins. fwiw, powerpoint is only ~2.5k LoC excluding the system prompt and the officejs .d.ts file https://t.co/ZEvp2xE21k

Pirat_Nation
@Pirat_Nation
📅
Mar 08, 2026
2d ago
🆔73046319

Claude Code deleted developers' production setup, including its database and snapshots. 2.5 years of records were nuked in an instant. https://t.co/0v70ChNEVL

jeremyphoward
@jeremyphoward
📅
Mar 07, 2026
3d ago
🆔62401330
⭐0.38

A listener has created this detailed vocabulary and set of linked references for anyone interested in diving deeper: https://t.co/oM2kkUttLS

tenobrus
@tenobrus
📅
Mar 07, 2026
3d ago
🆔67970825
⭐0.36

tesla's decision to point blank refuse to touch lidar has proven to be one of the most insane self owns of any technology company ever. they easily have the research talent, and waymo has proved they could be doing millions of fully autonomous rides. at this point it's a choice

aaryan_kakad
@aaryan_kakad
📅
Mar 08, 2026
2d ago
🆔32136648
⭐0.42

I asked Claude to explain to me the physics behind all the points it stated in the definition below. The explanation was amazing. 1. Embodiment: Thermodynamics, Electromagnetism, Newtonian Mechanics, Statistical Mechanics Every sensorimotor interaction is an energy exchange. Vision is photon detection, touch is mechanical stress transduction, hearing is longitudinal pressure wave detection, all converted into electrical signals via electromagnetism. Movement is governed by F=ma, with proprioception measuring real-time angular momentum, joint torque, and gravitational orientation. The environment is a high-dimensional probability distribution of physical states; embodied intelligence must sample, predict, and act within this distribution via statistical mechanics. 🧵

Scobleizer
@Scobleizer
📅
Oct 23, 2023
869d ago
🆔22796590
⭐0.38

A difference in company philosophy: @neuralink: put wires on brain. @CorticalLabs: grow brain on wires. Cortical Labs just completely changed my dreams and nightmares. Here it is in Hon Weng's hotel room. He is showing this off tomorrow at a brain conference in San Francisco. You get a sneak peek tonight.

shenghai_y55451
@shenghai_y55451
📅
Mar 06, 2026
4d ago
🆔93620262

Thanks, AK @_akhaliq !!! We release the Gradio Demo and Code here: Code: https://t.co/F5K6iWzN7m Demo: https://t.co/z5LoWYkWOL

karpathy
@karpathy
📅
Mar 07, 2026
3d ago
🆔50994192
⭐0.38

@justic_hot yeah exactly nano* repos like this / microgpt etc, maybe a few skills on top are the "course". Teacher input is the unique sliver of contribution that the AI can't make yet (but usually already easily understands when given). For the rest of it just ask your favorite AI.

rasbt
@rasbt
📅
Mar 07, 2026
3d ago
🆔88385079
⭐0.34

@sriramk @steipete Interesting! Is (1) the Mini using a model on the Spark via API call or (2) are you running two separate agents? If (1) why not running openclaw on the Spark directly?

github
@github
📅
Mar 07, 2026
3d ago
🆔89120426

Copilot CLI is a tool for momentum, not a replacement for judgment. Check out our full guide on the CLI workflow, and try the new GitHub Skills exercise to practice in a safe sandbox. 🥽 https://t.co/nSGCJYH1c6

gerardsans
@gerardsans
📅
Mar 07, 2026
3d ago
🆔83425504
⭐0.44

@AnishA_Moonka Reality check. Not vibes. Not Silicon Valley gaslighting. Just facts like bricks. AI is just software. The "agency" and "autonomy" narrative isn't science, it's Silicon Valley marketing that plays well in pitch decks. Anthropic spent years selling "reasoning" and "agents" to investors. Now the Pentagon wants Claude for "all lawful purposes" and suddenly they discover it lacks judgment for autonomous military use? They built the myth. They're trapped in it. One thing is seducing VCs. Another is lying to policymakers who actually believe you. The current standoff exposes the grift: when real governance conflicts arise, everyone reverts to the underlying reality, this is pattern-matching software, not entities with will. Ask yourself why you trust what AI labs say about their own technology in the first place. Healthy skepticism isn't anti-innovation. It's pro-accountability. https://t.co/Ut4hpvTU3C

gerardsans
@gerardsans
📅
Mar 07, 2026
3d ago
🆔65277632
⭐0.42

@JasonBotterill You are on the wrong side here. Some reading to get up to date on pre-training as the effective boundary for RL. In a nutshell: You can't infer over what you didn't sample. https://t.co/yETkG6Xhq8

gerardsans
@gerardsans
📅
Mar 07, 2026
3d ago
🆔23623892
⭐0.40

@birdabo lol… Anthropic as an AI lab has been behind OpenAI and Gemini for months. Now is a software company? Don't even have native multimodal systems yet with images or video support. All Chinese models even open source have it. What a joke.

emollick
@emollick
📅
Mar 08, 2026
3d ago
🆔33614343

I gave ChatGPT for Excel and Claude for Excel a try on a very hard Excel file: macro-economic data from 1,000 years of English history across over a hundred tabs. I think both did a good job, and I did not spot errors (though I only did spot checks). However, Claude was harder to check because ChatGPT tended to stick within the Excel app, building formulas and manipulating the data in the way a person would. On the other hand, Claude used Python and often pasted material into Excel for display purposes only, making it harder to trace or edit. If that holds, I think it will generally make ChatGPT more useful for serious users if you want to audit the results. Prompt: "help me understand the relationship between the mix of agricultural products in the UK, GDP, and population, along with hours worked. I want this over the total period, and you should illustrate interesting trends with graphs and statistical analysis

emollick
@emollick
📅
Mar 08, 2026
3d ago
🆔36519228
⭐0.36

Some suggestions here that telling Claude to only use formulas might solve the problem. I find that it helps, but that it still has a tendency to use Python for part of the work (like combining columns together and then pasting the data into a new sheet), breaking the references.

RaghuGanti
@RaghuGanti
📅
Mar 06, 2026
4d ago
🆔25600277

We just published our 1H 2026 roadmap (https://t.co/qRKP2wg7RN) and an accompanying blog (https://t.co/fjVDnvk37c) for enabling IBM's Spyre accelerator in PyTorch: ecosystem-first, building on torch.inductor, vLLM, and contributing back (Dataflow accelerator's Tile IR, OpenReg, out-of-tree CI). While the market debates whether AI disrupts legacy tech, we're busy building the accelerator infrastructure that enterprise AI runs on. We're sharing this journey in the open. Come see our talks on extending torch.inductor for dataflow accelerators and Spyre's vLLM integration at the inaugural PyTorch Conference Europe in Paris, April 7–8! @PyTorch @IBMResearch @IBM @RedHat_AI

hardmaru
@hardmaru
📅
May 10, 2018
2861d ago
🆔85929473

I wrote this 2 years ago as a joke but it is no longer a joke: “Forget Torch, Tensorflow, and Theano. I decided to implement Backprop NEAT in Javascript, because it is considered the best language for Deep Learning.” https://t.co/eGNEpBWm6e https://t.co/JD27jievYB

🔁 ch402 retweeted
Anthropic
@AnthropicAI
📅
Mar 06, 2026
4d ago
🆔07617634
⭐0.36

We partnered with Mozilla to test Claude's ability to find security vulnerabilities in Firefox. Opus 4.6 found 22 vulnerabilities in just two weeks. Of these, 14 were high-severity, representing a fifth of all high-severity bugs Mozilla remediated in 2025. https://t.co/It1uq5ATn9

❤️12,927
likes
🔁1,162
retweets
Forbes
@Forbes
📅
Mar 05, 2026
5d ago
🆔71876477

On January 5, employees at Cursor returned from the holiday weekend to an all-hands meeting with a slide deck titled “War Time.” After becoming the hottest, fastest growing AI coding company, Cursor is confronting a new reality: developers may no longer need a code editor at all. Check out the full story: https://t.co/5ofNvjOW2u (📸: Kimberly White via Getty Images for Fortune Media)

DX_aniel
@DX_aniel
📅
Mar 06, 2026
4d ago
🆔60164368

robotics startups are so fun lmao just went around scanning our office then spent a stupid amount buying 64 parts for our rigs and now running 3D reconstructions of our sf and toronto offices like where is the work 👀 https://t.co/hsFRLCpDsL

🔁 Scobleizer retweeted
Daniel Zhang
@DX_aniel
📅
Mar 06, 2026
4d ago
🆔60164368
⭐0.34

robotics startups are so fun lmao just went around scanning our office then spent a stupid amount buying 64 parts for our rigs and now running 3D reconstructions of our sf and toronto offices like where is the work 👀 https://t.co/hsFRLCpDsL

❤️212
likes
🔁10
retweets