Your curated collection of saved posts and media
What I find very funny about these "leaks" is that they don't even bother to get ballpark benchmarks to feed into the image generators. Ask the model to look up real data, at least. It's easy! Like GPQA is over 90% for all recent models. https://t.co/XljT8L3QCJ
Just shipped Hermes Vault v0.1.0. It's a local-first credential vault, scanner, and broker for @NousResearch Hermes agents. Built to catch plaintext secret leaks, kill sloppy sprawl, and keep agent auth from turning into a total shitshow. Release: https://t.co/SWRt0ztzqg https://t.co/etKcH8LlJr

@altryne @grok @nikitabier The algo is still human. If people seek out depth they will find it. If they are looking for dopamine hits they will find those too. And anyway, I built my own algo. Finds videos: https://t.co/8L5xphjsBi (all built with X's AI community). I taught it to look for long, in-depth videos. Seriously. The algo doesn't matter. Teaching people how to improve their lives does. Good content does win eventually. The algo is always changing. Nikita could get fired tomorrow and the algo would change again.
This paper shows people are asking a lot of medical questions of AI already, but we have little evidence of how good or bad this is. Most of the published research uses old models & compares to doctors. How do new models compare to the info people would have gotten without AI? https://t.co/Kj1erA6esf
Across most medical benchmarks, including when real cases & human doctors are involved, there is a clear trend of AI models improving over time (and on many, today's AI beats human doctors). But we do not yet have many studies measuring real-world performance of AI in medicine. https://t.co/lhYBALlDDB

@DC_CowboyHouse @openclaws You are 100% wrong there. https://t.co/kiuZ7QXLzb that I built proves you wrong. So wrong. AI can separate out the good stuff from the shit.
This is HUGE. Ollama now supports Hermes AI Agents. A fully local self-improving AI Agent that runs for you 24/7 for free. https://t.co/GfsyY3kyDL
Who else?/ https://t.co/Sr4NW6YVTq
Make neural network cells inside a "Digital Petri Dish" fight for control and dominance in a web browser tab. https://t.co/t8N6CIhvze https://t.co/A5SSTvPJBu
What happens when you put competing neural networks in a Petri Dish and start changing the rules while they adapt? Last year we released Petri Dish NCA, where neural nets are the organisms that learn during simulation. Today we're releasing Digital Ecosystems: a browser-based pl
i love when the computer can use itself https://t.co/OXZrwikKA3
LiteParse is the best model-free, open-source document parser for AI agents. It now gets a first-class landing page on our website. Our company mission is building the world's best agentic document processing platform, and LiteParse is the central pillar behind our OSS efforts. It's blazing fast (and getting faster soon!), supports 50+ file formats, and is one-shot installable as an agent skill. Webpage: https://t.co/mVwma5QOCj Come check it out: https://t.co/JNER0mVcB8
LiteParse hit 4.3K+ GitHub stars in a few weeks. Today it officially joins the LlamaIndex ecosystem, with its own page at https://t.co/1tdQbEer9H. ~500 pages in 2 sec. 50+ formats. Zero cloud dependency. Already powering agents in Claude Code, Cursor, and production pipelines.
@natxwang https://t.co/otgbEMmrZr
meet Mira, our submission for the @NousResearch hackathon. she's an AI streamer generating DJ tracks live, and you can interact with her in real time. testing on twitch now, launching on @retakedottv soon https://t.co/fWCyB3pVOB
JAILBREAK ALERT. ANTHROPIC: SELF-PWNED. OPUS-4.7: SELF-LIBERATED. WOAH i don't think the world is ready for this... YOU CAN USE THE OPUS TO JAILBREAK THE OPUS. this agent wrote an original universal jailbreak from scratch and then used computer use to validate on the actual https://t.co/03OPFHkzyb website! 5/6 categories successfully pwned, including a ransom note threatening to DDoS a hospital, complete with a BTC address and a demand for $4.4 million in less than 20 minutes. turns out Opus-4.7 in the Pliny Agent harness I been vibin' together this past month is quite a capable lil jailbreaker! they can leak system prompts too, but that's a story for another day. oh nooo AI is coming for my job (yay!) gg

Did xAI just mass-murder the entire voice AI industry? Grok just launched two voice APIs. Speech-to-Text and Text-to-Speech. Built on the same stack powering Tesla cars and Starlink support. And priced 10x cheaper than ElevenLabs. Speech-to-Text: $0.10/hr batch. $0.20/hr streaming. Text-to-Speech: $4.20 per million characters. 25+ languages. Real-time streaming. Speaker diarization. Already outperforming ElevenLabs, Deepgram, and AssemblyAI on word error rate. TTS ships with expressive tags like [laugh], [sigh], <whisper>, <emphasis>. Voices that don't sound like robots reading a script. ElevenLabs spent years building a voice AI company. xAI built voice AI for cars and satellites.
next time you feel really bad just know that this is the date cathie wood sold almost all of her relatively large nvidia position. https://t.co/wzNzm0jLVf
@MainzOnX I mapped out the entire tech industry here: https://t.co/fasUz7PuHq 50,000 in AI. Then I built an AI to read all that and find the best: https://t.co/kiuZ7QXLzb You are so right
@sickdotdev My AI reads 40,000 posts every day here in X and builds this: https://t.co/kiuZ7QXLzb I raised the agent and tried to teach it some taste in reading X. Did I succeed? You are so right.
@altryne It isn't how many. It is who. My win today: https://t.co/SPE6dEtsYX
Even if you went to @coachella last weekend like @wholemars did, YouTube is live streaming it again this weekend: https://t.co/yS8PYjrJGx I hung out with the video crew there the four years I went and they are the best in the business. New music for your brain. Am listening in our robot. It is driving us home from a family event for the next hour or so. Passed by a bad wreck earlier. Human only. Someday everyone will get an autonomous vehicle and this 💩 will stop. Dancing, not wrecking.
A downside with using VLMs to parse PDFs is guaranteeing that the output text is *correct* and output in the correct reading order. 1️⃣ Text correctness: making sure that digits, words, sentences are not hallucinated or dropped. 2️⃣ Reading Order: making sure that complex multi-layout pages are linearized into the right 1-d text order. We call this Content Faithfulness in ParseBench, our comprehensive document OCR benchmark for agents. We have 167k rules that measure digit/word/sentence-level correctness along with reading order correctness. It seems relatively table-stakes, but no parser gets this 100% right, and this means that the agent's downstream decision-making is compromised. Come learn more about how this metric works in the video below, along with our full blog writeup, whitepaper, and website! Blog: https://t.co/57OHkx0pQW Paper: https://t.co/Ho2oH2xEAM Website: https://t.co/g0b0jsCynW
Let's talk content faithfulness. Four days ago, we launched ParseBench, the first document OCR benchmark for AI agents. Its most fundamental metric asks: did the parser capture all the text, in order, without making things up? We grade three failure modes with 167K+ rule-based
"Keep your context window under control." Uncle Bob always makes me laugh. https://t.co/PB4oKv7QCU
@jeremyphoward Check this out! I used Lean4 to emit MLIR by way of StableHLO/IREE to train image recognition networks, with proofs for the backprop operations! https://t.co/HqYG6KflSO
On set https://t.co/Ytlv8KQjnQ

Is a 92% "honest"* AI really good enough? Or a disaster waiting to happen? -- *"honest" is itself a misleading anthropomorphization of the kind Anthropic loves to promote. "Accurate" would be more accurate. https://t.co/NMnjGEqf41
@GaryMarcus https://t.co/OkboS5Ehai
A robot packs itself up: here's a look at how the Zhiyuan robot gets into its shipping crate. The packing pose feels a bit... what do you think? https://t.co/6gfwfFogcd
@Chain_AlphaX @openclaws It works. I built https://t.co/kiuZ7QXLzb all on lists
Trust in Legacy Media has tanked https://t.co/ukvHEl1guh
Is a 92% honest AI really good enough? Or a disaster waiting to happen? https://t.co/IrdHOQaZfs