The @posthog team has just rolled out LlamaIndex support for their LLM Analytics, and we built a demo to showcase what's possible. Using LlamaIndex, LlamaParse, and OpenAI, our Agent Workflow compares product specifications and matches users with the most suitable option for their use case.

Thanks to PostHog's observability integration, the demo automatically tracks OpenAI usage, including:
• Token consumption
• Cost breakdown
• Latency metrics

Check out the video below to see it in action.
GitHub: https://t.co/elk5VKi8IF
Docs: https://t.co/IZI3w6BYKy
LlamaCloud: https://t.co/wZjhFV29gN
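The three metrics above are what an observability hook records around each model call. A minimal sketch of that bookkeeping, with everything stubbed so it runs offline (the `track_usage` helper, the fake model call, and the per-token prices are all invented for illustration; this is not PostHog's or OpenAI's actual SDK, and the rates are made up):

```python
import time

# Illustrative per-token prices (USD) -- invented numbers, not real rates.
PRICES = {"gpt-4o-mini": {"input": 0.15 / 1e6, "output": 0.60 / 1e6}}

def track_usage(events, model, call_llm, prompt):
    """Wrap one LLM call and record token count, cost, and latency --
    the same three things the observability integration captures."""
    start = time.perf_counter()
    response = call_llm(prompt)        # any callable returning a dict with usage counts
    latency = time.perf_counter() - start
    usage = response["usage"]
    price = PRICES[model]
    events.append({
        "model": model,
        "input_tokens": usage["input_tokens"],
        "output_tokens": usage["output_tokens"],
        "cost_usd": usage["input_tokens"] * price["input"]
                    + usage["output_tokens"] * price["output"],
        "latency_s": latency,
    })
    return response

# Stubbed model call so the sketch runs without an API key.
def fake_llm(prompt):
    return {"text": "spec comparison...",
            "usage": {"input_tokens": 1200, "output_tokens": 300}}

events = []
track_usage(events, "gpt-4o-mini", fake_llm, "Compare product A and B")
print(round(events[0]["cost_usd"], 6))  # 0.00036
```

In the real demo this wrapping is done for you by the integration; the point is only that cost and latency fall out of data already present on every response.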
"It's somewhere in the PDF" is not a citation. Page-level extraction in LlamaExtract gives you:
• Data mapped to specific pages
• Bounding boxes showing exact locations
• Audit-ready citations

Turn 200-page docs into skimmable, structured insights: https://t.co/BTkwspmefz
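The value of page-level extraction is that every extracted field carries its provenance. A toy sketch of what such output might look like (the dataclass and field names are illustrative, not LlamaExtract's actual schema):

```python
from dataclasses import dataclass

@dataclass
class ExtractedField:
    """One extracted value tied to its source location in the document."""
    name: str
    value: str
    page: int     # 1-based page number in the source PDF
    bbox: tuple   # (x0, y0, x1, y1) in page coordinates

    def citation(self) -> str:
        """Render an audit-ready citation instead of 'it's somewhere in the PDF'."""
        x0, y0, x1, y1 = self.bbox
        return f"{self.name}={self.value!r} (p. {self.page}, bbox=({x0}, {y0}, {x1}, {y1}))"

field = ExtractedField("total_due", "$1,420.00", page=187, bbox=(72, 310, 210, 328))
print(field.citation())
# total_due='$1,420.00' (p. 187, bbox=(72, 310, 210, 328))
```

With a page number and a bounding box attached, a reviewer can jump straight to the highlighted region rather than re-reading 200 pages.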
Big drop from @GoogleDeepMind: Gemini 3.1 Pro is here, and we built a hands-on demo powered by LlamaCloud to put it to work and turn your receipt photos into real financial insights!

Using our Agent Workflows, the app:
• Parses receipt images with LlamaParse (Agentic tier)
• Stores everything locally in a SQLite database
• Aggregates your spending monthly
• Uses Gemini 3.1 Pro to analyze trends and generate actionable tips to improve your finances

Check out the demo below!
GitHub repo: https://t.co/Ny22F4I3n1
Get started with LlamaCloud: https://t.co/zyE5lXTPFV
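The store-and-aggregate steps are plain SQL. A minimal sketch assuming each parsed receipt yields a (date, merchant, total) record; the schema is illustrative, not necessarily the demo repo's:

```python
import sqlite3

# In-memory stand-in for the local SQLite store the app writes to.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE receipts (date TEXT, merchant TEXT, total REAL)")
conn.executemany(
    "INSERT INTO receipts VALUES (?, ?, ?)",
    [("2025-01-03", "Grocer", 42.50),
     ("2025-01-19", "Cafe", 11.00),
     ("2025-02-02", "Grocer", 38.25)],
)

# Group spending by calendar month; this summary is what the LLM
# trend-analysis step would consume.
monthly = conn.execute(
    "SELECT strftime('%Y-%m', date) AS month, ROUND(SUM(total), 2) "
    "FROM receipts GROUP BY month ORDER BY month"
).fetchall()
print(monthly)  # [('2025-01', 53.5), ('2025-02', 38.25)]
```

Keeping the aggregation in SQL means the model only ever sees compact monthly totals, not every raw receipt.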
The second highest category is backoffice automation, but imo it's underrated by the AI community. RPA is truly dead, and agentic workflows are taking its place. A lot of backoffice work depends on routine operations over unstructured documents (invoices, claims packets, loan files). The best interface to automate these operations is enabling users to create deterministic workflows at scale, instead of solving ad-hoc tasks through chat. We are starting to build an agentic layer within our own document processing product, LlamaCloud, that lets users "vibe-code" these workflows through natural language. Come check it out: https://t.co/XYZmx5TFz8
This paper is one of the first to test AI skills, and the results suggest that yes, they have high practical value. The authors use pretty mediocre skills (6.2/12 quality rating), harvested mostly from places like GitHub, and still get large boosts, especially outside software. https://t.co/5AsbE9BMRt

Data center discussions, military use, privacy, mental health, job retraining, ethical standards, kids and AI, deepfakes, moral concerns, etc. Policymakers at every level in every jurisdiction are going to have their hands full, and the labs will be unable to respond to it all.
Also, the government has lots of computers, but they are the wrong kind of compute for inference. They need to use AWS or another cloud provider just like you do. https://t.co/dazHpRU54t
Here are a couple of examples of how Gemini 3.1 Flash-Lite can solve real-world problems. First, this high-volume image sorter showcases the model's ability to quickly analyze and sort large amounts of content, like pictures (something that could have been too expensive or slow in the past). This demo is just a snapshot of what can be built using 3.1 Flash-Lite's multimodal analysis capabilities. Think: real-time data visualization agents, CRM management tools, automated content moderation software, and beyond.
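The sorter pattern itself is simple: classify each image with the model, then bucket by label. A sketch with the model call stubbed out so it runs offline (the `fake_classify` heuristic stands in for sending image bytes to something like Gemini 3.1 Flash-Lite):

```python
from collections import defaultdict

def sort_images(paths, classify):
    """Bucket a large batch of images by the label a vision model assigns.
    `classify` is any callable mapping an image path to a label string."""
    buckets = defaultdict(list)
    for path in paths:
        buckets[classify(path)].append(path)
    return dict(buckets)

# Stub classifier keyed on filename; a real one would call a multimodal model.
def fake_classify(path):
    return "receipt" if "rcpt" in path else "photo"

print(sort_images(["rcpt_01.jpg", "beach.jpg", "rcpt_02.jpg"], fake_classify))
# {'receipt': ['rcpt_01.jpg', 'rcpt_02.jpg'], 'photo': ['beach.jpg']}
```

A cheap, fast model is what makes this viable at high volume: the loop body is one inference per image, so per-call cost dominates everything.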
New research on scaling agent memory for long-horizon tasks.
Knowledge agents via RL
@jdegoes If you integrate this into the initial point: LLMs handle extensions in dense ecosystems much better than in sparse ones. Extending Python? AI will usually do fine. The training distribution is huge: libraries, examples, Stack Overflow, docs, tutorials. Extending Rust? Much harder. The base coverage in training data is already thinner, so you fall outside the model's support much faster. So the real constraint isn't model capacity or new language extensions, it's the density of the existing training distribution.
The Applications & Case Studies track at #PyTorchCon Europe, 7-8 April in Paris, showcases production deployments, lessons learned & innovative use cases across industries.
Learn more: https://t.co/YkKx8kYir6
Register: https://t.co/kN0M3grGlE https://t.co/nt6npAnpuG
How to effectively create, evaluate and evolve skills for AI agents? Without systematic skill accumulation, agents constantly reinvent the wheel. SkillNet introduces an open infrastructure for creating, evaluating, and organizing AI skills at scale. It structures over 200,000 skills within a unified ontology, supporting rich relational connections like similarity, composition, and dependency, and performs multi-dimensional evaluation. SkillNet improves average rewards by 40% and reduces execution steps by 30% across ALFWorld, WebShop, and ScienceWorld benchmarks. The key takeaway is treating skills as evolving, composable assets rather than transient solutions. Paper: https://t.co/Xv3uGLnPH2 Learn to build effective AI agents in our academy: https://t.co/1e8RZKs4uX
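The core idea, skills as composable assets with explicit relations, can be illustrated with a tiny dependency graph. This toy example is invented for illustration; SkillNet's actual ontology and 200,000-skill graph are far richer:

```python
# Each skill may depend on (or be composed from) other skills; an agent
# resolves the execution order from the graph instead of re-deriving it.
from graphlib import TopologicalSorter

skills = {
    "checkout_item": {"search_catalog", "add_to_cart"},  # composed from sub-skills
    "add_to_cart": {"search_catalog"},                   # dependency edge
    "search_catalog": set(),                             # leaf skill
}

# Dependencies come out first, so the agent executes prerequisites
# before the composite skill that needs them.
order = list(TopologicalSorter(skills).static_order())
print(order)  # ['search_catalog', 'add_to_cart', 'checkout_item']
```

This is the "stop reinventing the wheel" claim in miniature: once `search_catalog` exists as a named node, both higher-level skills reuse it rather than re-solving it ad hoc.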
In the API, use image detail: "original" to unlock our biggest vision and CUA gains! https://t.co/WE1cRKzHtN
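A sketch of where that setting lives in a request, assuming the Responses API image-input shape; the model name is a placeholder, and "original" is the new detail value the post refers to (alongside the existing "low" / "high" / "auto"):

```python
# Build the request payload only -- no network call, no API key needed.
payload = {
    "model": "gpt-5.1",  # placeholder model name
    "input": [{
        "role": "user",
        "content": [
            {"type": "input_text", "text": "What does this screenshot show?"},
            {"type": "input_image",
             "image_url": "https://example.com/screenshot.png",
             "detail": "original"},  # full-resolution image, per the announcement
        ],
    }],
}
print(payload["input"][0]["content"][1]["detail"])  # original
```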
for now it has mostly been improvements on "agentic coding". prepare yourselves for "agentic research". expect something from me in the coming weeks on the same
I just spent the last 24 hours building... a free tool for you to create your very own handwriting font quickly in the browser (no logins, all local processing). This *entire* "SaaS" is a single-page HTML app - ready for you to try: https://t.co/p6t4n0kSbH Having tested it extensively with my own handwriting - including ligatures - I can definitely say that it works. ;) Download your own OTF, TTF, etc. when done. 100% created with @AnthropicAI Claude via web chat. Out of the 250+ free apps I've developed, this may have the most mass appeal.

@tobi understanding how to measure progress in what agents produce initially, even if at a basic level, goes a long way. strong seeds are very important. after that, you can encode improvement in natural language for the agent and watch it self-improve rapidly and in unexpected ways.
You can now run three frontier models at once and select your orchestrator model directly inside Perplexity Computer. Model Council automatically runs GPT-5.4, Claude Opus 4.6 and Gemini 3.1 Pro simultaneously. Three frontier models. One workflow. Best answer wins. https://t.co/40rPcXpr6s
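The fan-out-and-pick pattern behind a "model council" is easy to sketch. Everything here is a stub: the model calls, the names, and the scoring heuristic are placeholders, since Perplexity's actual orchestration is not public:

```python
import asyncio

async def ask(model, prompt):
    """Stand-in for one model API call; returns a candidate answer."""
    await asyncio.sleep(0)  # placeholder for network latency
    return {"model": model, "answer": f"{model} answer to: {prompt}"}

def score(candidate):
    # Toy judge: longest answer wins. A real council might have the
    # orchestrator model grade the candidates instead.
    return len(candidate["answer"])

async def council(prompt, models):
    """Run all models on the same prompt concurrently, keep the best answer."""
    answers = await asyncio.gather(*(ask(m, prompt) for m in models))
    return max(answers, key=score)

best = asyncio.run(
    council("plan a launch", ["GPT-5.4", "Claude Opus 4.6", "Gemini 3.1 Pro"])
)
print(best["model"])  # Claude Opus 4.6 (longest name wins under the toy scorer)
```

The interesting design question is entirely in `score`: concurrency gives you three candidates for one prompt's latency, but the orchestrator's judging step is what decides whether the council beats its best member.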
This might look like a game. But it's actually the beginning of autonomous coordination between AI agents.
You can actually interact with the world simulator directly in the browser. π€ Here is a quick screen recording (8x speed) of me playing with it: real-time action-conditioned video prediction across rigid objects, deformable objects, rope, and object piles. Try it yourself (no install required): https://t.co/rqAmgzcB7F Huge kudos to my student @YXWangBot for making the interactive demo happen!
@honnibal TBH I think a big difference is my pre-LLM workflow was vastly different to nearly anyone else's. I hardly ever have to debug my code, since it's extremely unlikely for me to write bugs using the notebook-based approach. So much-faster debugging is only useful for 3rd-party code.
@quasicoh What kinds of improvements would you want to see for codex here?
Perplexity is the most underrated AI tool for web search. I'm gonna integrate it into OpenClaw for SEO stuff. Also, EXTREMELY cheap. https://t.co/WfC29oJ4Zn