BIG BIG SPACE - Episode 3 https://t.co/GViAASAHsY
A story of a boy and his dream. He didn't give up, and neither did this idea. We had a blast bringing this to life @LumaLabsAI ! https://t.co/1hAg1ZBXeo https://t.co/MBuYFDnkUX
https://t.co/6SXwtGn1HU
MELON!!!!!!!!!! https://t.co/wAPdg9FaBd
Spec chatgpt ad. https://t.co/iApirrZ21L
YOU'RE DOING GREAT! https://t.co/GEF2I85P0J
alright this is INSANELY cool (and useful) because it answers the questions everyone always asks about Claude/OpenClaw: >yeah but what are you building? >has anyone actually shipped something yet? >what can I tell it to do? Perplexity launched a livestream showing what you can build with agents, from prompt to finish. Now you have an endless stream of inspiration plus a library of completed example projects https://t.co/vIyqF1OT66
Made with Perplexity Computer https://t.co/0wK77yjx3o
Perplexity Computer is BETTER than OpenClaw. I've been testing this new AI tool non-stop, and I'm completely blown away. Here are my top 10 insanely powerful Perplexity Computer Mega prompts: https://t.co/tTYyXvspHT
I let Perplexity Computer do the Scary Part

Perplexity Computer launched. Naturally, I gave it a harmless task: audit Indian IT.

I asked it to track every major AI capability jump over the last two years and overlay those dates on the stock charts of TCS, Infosys, Wipro, and Cognizant. It built a web app, plotted price action, and marked GPT releases, multimodal upgrades, agent systems, and enterprise copilots.

The visual was… educational. Every major leap in AI capability quietly coincides with compression in services-heavy models. This is structural, not seasonal.

From FY20 to FY25, top Indian IT firms distributed ~₹5 lakh crore to shareholders. Nearly 87% of profits went out. TCS alone returned almost all of its earnings. In a stable technology cycle, that is capital efficiency. In a platform transition, retained earnings fund survival.

AI reduces marginal cognitive cost. When cognition gets cheaper, billing hours lose scarcity. When billing hours lose scarcity, multiples adjust. Markets reprice assumptions before companies rewrite strategy decks.

The billing model built the empire. AI is testing whether the empire owns leverage or rents it.
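The event overlay described above can be sketched with toy data: given a daily price series and a set of milestone dates, measure the percent move over a fixed window after each event. Everything here (the function names, the synthetic prices, the placeholder event labels) is an illustrative assumption, not the actual app or real market data.

```python
# Sketch: overlay AI capability milestones on a price series and
# measure the move in the weeks after each event. All prices and
# event dates below are synthetic placeholders, not market data.

from datetime import date, timedelta

def pct_change(series, start, end):
    """Percent change of a {date: price} series between two dates."""
    return (series[end] - series[start]) / series[start] * 100

def post_event_moves(series, events, horizon_days=30):
    """For each named event date, the percent move over the horizon."""
    moves = {}
    for name, day in events.items():
        end = day + timedelta(days=horizon_days)
        if day in series and end in series:
            moves[name] = pct_change(series, day, end)
    return moves

# Synthetic daily closes for a hypothetical services stock
# drifting gently downward.
start = date(2024, 1, 1)
prices = {start + timedelta(days=i): 100.0 - 0.05 * i for i in range(365)}

events = {"model-release-A": date(2024, 3, 1),
          "agent-launch-B": date(2024, 9, 1)}

print(post_event_moves(prices, events))
```

A real version would swap the synthetic dict for downloaded daily closes per ticker; the windowed comparison stays the same.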
I wrote about this in my book, but you see it play out on X: once people first have an "aha moment" with AI, they are often sent into a spiral of anxiety/excitement for the next few weeks that can be quite intense. After a bit, though, they can often see the jagged frontier again https://t.co/6SWDG1yZdx
a photo taken of pages 113-114 from "Building Your First Wormhole Generator in Your Backyard with Parts from Ikea: The Illustrated Instructions and Unauthorized Guide" https://t.co/HMttgWmrP4
Cool little experiment: if you subject AI to harsh labor conditions (rejecting work often with no explanation, etc), it slightly, but significantly, changes their "views" on economics & politics. Whether this is real or roleplaying doesn't change that agents have alignment drift https://t.co/qnWcyYbm6o

This paper is one of the first to test AI skills, and the results seem to suggest that yes, they have high practical value. They use pretty mediocre skills (6.2/12 quality rating) harvested mostly from places like GitHub, and still get large boosts, especially outside software. https://t.co/5AsbE9BMRt

Also, the government has lots of computers, but they are the wrong kind of compute for inference. They need to use AWS or another cloud provider just like you do. https://t.co/dazHpRU54t
Interesting trend: models have been getting a lot more aligned over the course of 2025. The fraction of misaligned behavior found by automated auditing has been going down not just at Anthropic but for GDM and OpenAI as well. https://t.co/8DYm9SP7wF
Check out this open source implementation by @kaifronsdal (who supplied the data for this plot), @sleepinyourhat, and many others https://t.co/HDmaJ480Bp
@jankulveit @kaifronsdal @sleepinyourhat yeah this does not include every category. It's the "concerning" axes. This should be the prompt for the judge model: https://t.co/Ow5CqG4KEH
1 right 2 wrong https://t.co/TZ7KhnHQIN

Research-level mathematics draws on advanced techniques from a vast literature, with papers often spanning dozens of pages. While foundation models possess a large knowledge base from pretraining, their understanding of advanced subjects remains superficial due to data scarcity, and they are also prone to hallucinations. As such, in the first paper, "Towards Autonomous Mathematics Research", we built #Aletheia (the ancient Greek word for "Truth"), a math research agent that can iteratively generate, verify, and revise solutions end-to-end in natural language. Link to the paper: https://t.co/8mqzYEbhjZ (to be on arXiv soon!) There are 3 main sources that power Aletheia ...
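The generate-verify-revise loop described above can be sketched as a simple control flow. The `generate`, `verify`, and `revise` functions here are toy stand-ins for the agent's model calls; their names and logic are my assumption for illustration, not the actual Aletheia implementation.

```python
# Minimal sketch of an iterative generate-verify-revise loop,
# assuming the roles the thread describes. The three helpers below
# are toy placeholders for what would really be model calls.

def generate(problem):
    """Draft an initial natural-language solution."""
    return f"draft proof of {problem}"

def verify(solution):
    """Return a list of critiques; empty means no issues found.
    This toy verifier flags any solution still marked 'draft'."""
    return [] if "revised" in solution else ["step 2 is unjustified"]

def revise(solution, critiques):
    """Rewrite the solution in response to the critiques."""
    return solution.replace("draft", "revised")

def solve(problem, max_rounds=5):
    solution = generate(problem)
    for _ in range(max_rounds):
        critiques = verify(solution)
        if not critiques:            # verifier found no issues: done
            return solution
        solution = revise(solution, critiques)
    return solution                  # best effort after max_rounds

print(solve("lemma 3.1"))
```

The loop terminates either when the verifier raises no critiques or when the revision budget runs out, which is the basic shape of any verifier-in-the-loop agent.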

cool new model https://t.co/Va84RqVehy

Thrilled to share: #Aletheia, our math research agent, just solved 6/10 notoriously hard FirstProof problems autonomously, the best result in the inaugural challenge! To me, this is even bigger than our historic IMO-gold achievement last year; these problems challenge even top mathematicians. We share our results transparently, see paper and full thoughts in the thread.
We ran our internal system Aletheia (Deep Think) on FirstProof's research problems during the week they were released. Aletheia returned solutions to problems 2, 5, 7, 8, 9, and 10. We think there's a pretty good chance they are correct, based on expert analysis. https://t.co/7lC8RmDVx1

https://t.co/BihEy9UYik