Your curated collection of saved posts and media
Lol codex literally re-implemented markdown to HTML parsing rather than download react markdown for nextjs. lmfao https://t.co/AggF2zrkMr
Damn @AnthropicAI sonnet 4 isn't messing around. This moves with the mouse btw and rotates Artefact here : https://t.co/YRX4GuKW4I https://t.co/jh6gpGrIxq
Learn how to build a custom multimodal embedder for LlamaIndex! This guide shows you how to: β‘οΈ Override LlamaIndex's default embedder for AWS Titan Multimodal support β‘οΈ Create a custom embedding class handling both text and images β‘οΈ Integrate it with @pinecone for efficientβ¦ https://t.co/jBqn7jrMak
NEW: Mistral AI announces Agents API - code execution - web search - MCP tools - persistent memory - agentic orchestration capabilities Cool to see that Mistral AI has joined the growing number of agent frameworks. More below: https://t.co/tjruOmcDP5
wow wow finally playing around with operator curious to see where this is https://t.co/7W0FLwwPGp
Agentic Document Extraction just got much faster! From previous 135sec median processing time down to 8sec. Extracts not just text but diagrams, charts, and form fields from PDFs to give LLM-ready output. Please see the video for details and some application ideas. https://t.co/29lOKf6UGO
Just so everyone knows, we have passed the point where you can tell what is AI at a glance (or even, in many cases, a close look) These were all made by me with text prompts alone using Veo 3. https://t.co/RoZ67BYABr
Long-Context State-Space Video World Models "we propose a novel architecture leveraging state-space models (SSMs) to extend temporal memory without compromising computational efficiency. " https://t.co/7KTQwg4veE
Bunch of good conversations in today's RAG office hours. Most teams get RAG wrong because they're obsessed with the AI instead of the data. The real money is in finding what users actually need through data analysis. I've seen $100k/month value unlocked just by identifying⦠https://t.co/SD56uw6ZZU
this company is still such a mystery to me. https://t.co/0jepqHsrL0
Just came across this old GigaOM article. I guessed in 2013 it might take ~10 years until "packages like word2vec can make deep learning even for relatively unsophisticated users". Not bad, I think! :D https://t.co/XxtbbeIlLY https://t.co/6PxNQR82t7
ok google https://t.co/vyp1GcHzP5
Claude 4 Opus scored 72% on the aider polyglot coding benchmark. Claude 4 Sonnet scored 61%. Both of those are with 32k think tokens. Sonnet 4 seems to have underperformed 3.7. Full leaderboard: https://t.co/mBVaUPG9ZN https://t.co/tj4p5Pn6Tk
1/6 We introduce RPG, a principled framework for deriving and analyzing KL-regularized policy gradient methods, unifying GRPO/k3-estimator and REINFORCE++ under this framework and discovering better RL objectives than GRPO: Paper: https://t.co/7xSUj01GIx Code:β¦ https://t.co/0pn5sqhhC7
xLSTM for the classification of assembly tasks: https://t.co/l3hQTtQ31e "xLSTM model demonstrated better generalization capabilities to new operators. The results clearly show that for this type of classification, the xLSTM model offers a slight edge over Transformers." https://t.co/g5RWU9Bonf
I put together an annotated version of the new Claude 4 system prompt, covering both the prompt Anthropic published and the missing, leaked sections (thanks, @elder_plinius) that describe its various tools It's basically the secret missing manual for Claude 4, it's fascinating! https://t.co/qDKnViikkS
The best fine-tuning guide you'll find on arXiv this year. Covers: > NLP basics > PEFT/LoRA/QLoRA techniques > Mixture of Experts > Seven-stage fine-tuning pipeline https://t.co/Z7NSBBFvSS
I released a dataset of 10k prompts which are refused by Qwen3, but answered by Llama3.3. This highly diverse data can be used to train a model to comply with Chinese law (or not), testing, evaluation, and activation steering. https://t.co/aKhFC1Bunb
I've been reading Bethany Hughes book on the 7 Wonders & the myths of the Colossus of Rhodes (eg it did not span the harbor) As an experiment, I asked Google Deep Research to come make a prompt to describe the historical Colossus for veo 3. Nice, down to the location & riveting https://t.co/VLGzkQnbC3
How do I debug multi-turn conversation traces? This is our advice - what has worked for you? https://t.co/XYEJj5vZDg
Which academic fields transitioned to Blue Sky explains much of the transformation in conversations occurring here, far less humanities & economics discussion happens on X compared to before. (And no, I donβt think this was a net good thing for X, at least if you want new ideas) https://t.co/DttfOKQFfS
Practical Applications of AI Agents in Finance π€π¦ I'm excited to release a set of slides that gives an overview of assistant and automation-based agent architectures and their applications across a broad range of financial verticals - from investment research to back-office toβ¦ https://t.co/VsHGA754bU
Reasoning Models Thinking Slow and Fast at Test Time Another super cool work on improving reasoning efficiency in LLMs. They show that slow-then-fast reasoning outperforms other strategies. Here are my notes: https://t.co/79XsaYcR8N
How @seb_ruder made the ULMFiT paper happen π (from 2018 https://t.co/GEOZunWoXj student notes: https://t.co/1dX4HpAF4V) https://t.co/sl1r6heZ3B
An inexplicable failure of Microsoft & Google's AI tools is that they have access to my email but won't actually use their smarts to help me When I ask for "urgent messages," Google just gives me unread ones and Microsoft literally searches for "urgent" Yet Claude does better. https://t.co/yEQhgkAZxH

What a nice surprise - the MonsterUI component library for FastHTML on the HN front page :) https://t.co/eUcV2ysrrx
Claude 4 Opus gets real weird real fast when you ask it to create something "numinous" (and especially when you ask it to make it "more numinous"). This was just the start. The system card discusses the model's tendency towards producing spiritual themes, definitely noticeable. https://t.co/xKELacUK2C
RAG is dead, long live agentic retrieval! At LlamaIndex we've been saying for a long time that naive RAG is not enough for a modern application. Following from that conviction, we've built agentic strategies directly into LlamaCloud that you can adopt with just a few lines of⦠https://t.co/7Rh7ohDw3x
The agentic AI system I was testing is from @perplexity_ai Labs, which has just launched. My full prompt and the agentβs outputs are now featured on their website (link below). Overall, itβs an excellent and innovative agentic system that takes about 5β10 minutes to run. Iβllβ¦ https://t.co/meumSz7NlN
Neat: "Claude 4 create a game with a completely novel mechanic. start with 20 different ideas and narrow them down" Its idea was for players to "steal, store & redistribute physical properties between objects and themselves." It built this demo & fixed a couple of bugs I found. https://t.co/y2G1QsNAIi
Confused about recent LLM RL results where models improve without any ground-truth signal? We were too. Until we looked at the reported numbers of the Pre-RL models and realized they were serverely underreported across papers. We compiled discrepancies in a blog belowπ§΅π https://t.co/Hmn41grrrh
I don't think people appreciate how much Anthropic has been running the table in the market for coding tools over the last year. https://t.co/wVKF6MdsdF