Your curated collection of saved posts and media
Build a multi-agent financial report generating chatbot from scratch, using LlamaIndex agent workflows. The full example from @jerryjliu0's workshop last week is below. In this hands-on Colab, you'll: - Parse & index 10-K filings from Adobe - Use agentic RAG to answer… https://t.co/cisveE9Ry0
Here is my 2-hour workshop I just finished at the @aiDotEngineer World's Fair. This is all you need to know to learn how to use Gemini 2.5! It is beginner friendly, from getting your first API key to multimodality, function calling and MCP. Completely free - runs… https://t.co/aViZBxvx2M
I'm thrilled to see @math_rachel's interesting new article, on a recent deep learning microbiology paper, getting the attention it deserves -- check it out if you haven't seen it yet! https://t.co/K7vhqOaecB https://t.co/4bSn39KCo6
"Is RAG dead?" This question pops up every 2 weeks! I just got the scoop from @HamelHusain and @sh_reya in their amazing AI Evals course: https://t.co/hSpuJqh6eM Let's clear the confusion, and talk about what actually matters when evaluating retrieval-augmented generation… https://t.co/o2P17tZYX3
Introducing Firecrawl /search. @firecrawl_dev just launched an insane feature to search and crawl in one shot. You heard that right! One API call to search the web and scrape any data you need for your AI agents. I took it for a spin in n8n: https://t.co/HViclvq6I1
standard completions dot org https://t.co/DhcwmbEuJH
Teaching robots to learn only from RGB human videos is hard! In Feel The Force (FTF), we teach robots to mimic the tactile feedback humans experience when handling objects. This allows for delicate, touch-sensitive tasks, like picking up a raw egg without breaking it. 🧵 https://t.co/hshodP8elW
Another RL environment added to Atropos! @MatternJustus released a pydantic schemas dataset that can be used to ask the model to create valid structured outputs of those objects - so I made an environment that asks the model to create JSON, YAML, TOML, etc and validate against… https://t.co/kqiSUXBJYf
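Editor's sketch of the idea in the post above: score a model's structured output with a binary reward that is 1.0 only when the output parses and matches a schema. This is not the Atropos environment's code. The `Invoice` schema and the `validate` and `reward` helpers are hypothetical, a stdlib dataclass stands in for a Pydantic model to avoid dependencies, and only JSON is covered (not YAML or TOML).

```python
import json
from dataclasses import dataclass, fields


@dataclass
class Invoice:
    """Hypothetical target schema; stands in for a Pydantic model."""
    id: int
    customer: str
    total: float


def _matches(value, typ) -> bool:
    """Check one JSON value against one annotated field type."""
    if isinstance(value, bool):
        # bool is a subclass of int in Python, so reject it for int/float fields.
        return typ is bool
    if typ is float:
        # JSON has a single number type, so accept ints for float fields.
        return isinstance(value, (int, float))
    return isinstance(value, typ)


def validate(raw: str, schema=Invoice) -> bool:
    """Return True if raw is a JSON object with exactly the schema's fields and types."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return False
    if not isinstance(data, dict):
        return False
    expected = {f.name: f.type for f in fields(schema)}
    if set(data) != set(expected):
        return False
    return all(_matches(data[name], typ) for name, typ in expected.items())


def reward(raw: str) -> float:
    """Binary reward for an RL rollout: 1.0 for valid output, 0.0 otherwise."""
    return 1.0 if validate(raw) else 0.0
```

A rollout whose completion is `{"id": 1, "customer": "Ada", "total": 9.5}` would earn the full reward, while malformed JSON or a wrong field type would earn none.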
Everyone should learn to code with AI! At AI Fund, everyone - not just engineers - can vibe code or use AI assistance to code. This has been great for our creativity and productivity. I hope more teams will empower everyone to build with AI. Please watch the video for details. https://t.co/rsGC1QSKHL
Claude-4 Sonnet scores quite well on SPOT, our recent benchmark for identifying errors in academic papers. Its precision of 11.3% is far ahead of its competition, but probably not something you'd want to rely on to report you for fraud... https://t.co/Bwg3beDscd
You know all those arguments that LLMs think like humans? Turns out it's not true. 🧠 In our paper "From Tokens to Thoughts: How LLMs and Humans Trade Compression for Meaning" we test it by checking if LLMs form concepts the same way humans do @ylecun @ChenShani2 @jurafsky https://t.co/ctVszZXoDW
How does Sonnet 4.0 compare vs. Gemini 2.5 Pro on document understanding? I've found Sonnet 4.0 to be much better at table parsing. Check out the screenshot below - I compared both models' visual reasoning capabilities over a screenshot of a dense Caltrain schedule packed… https://t.co/rcnv1H3mq0
we are now at a point where we can ditch build systems for many projects, and many people underestimate the amount of weight doing so would lift off their burdened shoulders https://t.co/41vg3fcnXr
Open-Ended Evolution of Self-Improving Agents Can AI systems endlessly improve themselves? This work shows the potential of self-improving AI, inspired by biological evolution and open-ended exploration. This is a must-read! Here are my notes: https://t.co/KRmNve8pl5
FYI @max_paperclips integrated **1,069** new environments to Atropos by porting in @intern_lm's new environment bootcamp - hundreds and hundreds of new tasks - including various task types such as games, logic problems, puzzles, algorithms, and more. We're working on a reasoning… https://t.co/RlB5jzIzps
If you're attending @aiDotEngineer on Wed, June 4th, check out the recsys track. I'll be hosting talks from Pinterest, LinkedIn, Netflix, Instacart, YouTube. I'll also share 3 ideas that'll likely drive the next few years in recsys: semantic IDs, LLM-augmentation, unified models https://t.co/WbL8yhRw0k
Should I build a custom annotation tool or use something off-the-shelf? https://t.co/TlmrtRrnDk https://t.co/7EsMIEHkxV
Just learned about Claude 4's God mode! "Don't hold back. Give it your all!" On a serious note, you'll be surprised how much better the results are when you use clear modifiers, are specific, and demand more from the models. This also works well for other models like o3. https://t.co/7yREQjF33R
Why do you recommend binary (pass/fail) evaluations instead of 1-5 ratings (Likert scales) for applied evals? Links in reply https://t.co/wCi78dH8J2
I wrote a history of AI in 32 images of otters using wifi on airplanes, from images to video to code. It shows two big trends: rapid improvements in AI models of all types and the growth of open weights AI models. Link in the comments. https://t.co/PrZDmKaP7D
A huge (and probably underrated) promise of LLMs is inhaling a million PDFs and making sense of them through automated extraction. The baseline: Stuff tokens into a function-calling LLM with a Pydantic schema, get back structured JSON. You can do this with most frameworks in… https://t.co/qPCa3NchJp
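Editor's sketch of the baseline described in the post above: turn a schema into a function-calling tool definition, send it with the document text, and parse the returned JSON arguments back into a typed object. Everything here is an assumption for illustration: the `FilingSummary` schema, the `tool_spec` and `extract` helpers, and `fake_llm`, which is a stub that returns canned placeholder values instead of calling a real model. A stdlib dataclass stands in for a Pydantic model so the example has no dependencies.

```python
import json
from dataclasses import dataclass, fields


@dataclass
class FilingSummary:
    """Hypothetical extraction schema; stands in for a Pydantic model."""
    company: str
    fiscal_year: int
    revenue_usd: float


# Map Python annotations to JSON Schema type names.
_JSON_TYPES = {str: "string", int: "integer", float: "number", bool: "boolean"}


def tool_spec(schema) -> dict:
    """Build a generic function-calling tool definition from a dataclass."""
    props = {f.name: {"type": _JSON_TYPES[f.type]} for f in fields(schema)}
    return {
        "name": f"extract_{schema.__name__}",
        "parameters": {
            "type": "object",
            "properties": props,
            "required": list(props),
        },
    }


def fake_llm(text: str, tool: dict) -> str:
    """Stub for a real function-calling model; returns placeholder arguments."""
    return json.dumps(
        {"company": "ExampleCo", "fiscal_year": 2024, "revenue_usd": 1000.0}
    )


def extract(text: str, schema=FilingSummary, llm=fake_llm):
    """Ask the model to fill the schema, then parse its JSON into the dataclass."""
    args = json.loads(llm(text, tool_spec(schema)))
    # Raises TypeError if the model returned missing or unexpected keys.
    return schema(**args)
```

In a real pipeline, `fake_llm` would be replaced by a call to whichever model client is in use, and the parsed object would be validated more strictly than a plain constructor call does.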
On my way to SF from Tokyo. Can't wait to talk about Real-time AI Scientists @aiDotEngineer this year! https://t.co/I80upeWgrW

Decentralized Training Progress - 6/1/2025 https://t.co/y4pjVejNks
Veo 3 is really fun to use for historical what-ifs. I put together a 1940s video newsreel as if Project Habakkuk, the World War Two British plan to build a giant aircraft carrier out of pykrete, a mix of ice and wood pulp, had actually happened. https://t.co/kh7wqm3yCB
Knuth shows us the way. Again: https://t.co/kndEGZGHFr
VisualSphinx: Large-Scale Synthetic Vision Logic Puzzles for RL "we propose VisualSphinx, a first-of-its-kind large-scale synthetic visual logical reasoning training data." "we propose a rule-to-image synthesis pipeline, which extracts and expands puzzle rules from seed… https://t.co/H3b1PuuqPE
Facebook AI Research is the OG "Open" AI https://t.co/awEt2L4ZvD
Some thoughts on leadership: https://t.co/FvDlqCGGu3 • What makes an exceptional leader? • What do exceptional leaders do? • Leadership styles: Commando, soldier, police https://t.co/S0eYpGBjxo

Agent Zero A personal agentic framework that dynamically grows and learns with you. - It uses the OS as a tool. - Has search and terminal execution too. - It has persistent memory to memorize key information to solve future tasks more reliably. - Multi-agent support. https://t.co/b0PMAvzcrq
Should I build a custom annotation tool or use something off the shelf? P.S. This is a correction - I wasn't opinionated enough before. Links in reply https://t.co/PtIW1AjtYF
Asking in meme format instead https://t.co/MUn6tqbyBS
CVPR 2025 starts in less than a few weeks; I'm working on a list of must-see CVPR papers / projects. Any important papers I should add? link: https://t.co/1VlLn2BWxl https://t.co/XX8uLwmWLc