@fujikanaeda Evidence in favour of that: https://t.co/r3CnLS36N7
New research from Microsoft. Phi-4-reasoning-vision-15B is a 15-billion parameter multimodal reasoning model that combines visual understanding with structured reasoning capabilities. As I have been saying, not every agent task needs a frontier model. Phi-4-reasoning-vision shows what's possible at 15B parameters. The report details how they trained a compact model that can reason over both text and images, targeting the sweet spot between capability and efficiency. Smaller reasoning models that handle vision are essential for practical agent deployments. Paper: https://t.co/cT2qeNImwi Learn to build effective AI agents in our academy: https://t.co/1e8RZKs4uX
New research on evaluating coding agents via continuous integration. Coding agents are moving beyond isolated bug fixes. If they're going to own CI pipelines, we need benchmarks that reflect the actual complexity of codebase maintenance. Most coding agent benchmarks today test whether an agent can fix a single issue. But real software engineering involves maintaining entire codebases over time. SWE-CI evaluates agent capabilities through continuous integration workflows: running test suites, catching regressions, and maintaining code quality across multiple changes. Paper: https://t.co/p8bOTJ9QPX Learn to build effective AI agents in our academy: https://t.co/LRnpZN7L4c

Cursor with Kimi K2.5. Don't sleep on this combo. From a prompt to a personal HN feed in about 60 seconds. The future of building is going to be so wild. With faster models, you can quickly iterate on more ideas while improving quality. https://t.co/WOYFcCBqM7
PDFs are the bane of every AI agent's existence: here's why parsing them is so much harder than you think. Every developer building document agents eventually hits the same wall: PDFs weren't designed to be machine-readable. They're drawing instructions from 1982, not structured data.
- PDF text isn't stored as characters: it's glyph shapes positioned at coordinates with no semantic meaning
- Tables don't exist as objects: they're just lines and text that happen to look tabular when rendered
- Reading order is pure guesswork: content streams have zero relationship to visual flow
- Seventy years of OCR evolution led us to combine text extraction with vision models for optimal results
We built LlamaParse using this hybrid approach: fast text extraction for standard content, vision models for complex layouts. It's how we're solving document processing at scale. Read the full breakdown of why PDFs are so challenging and how we're tackling it: https://t.co/K8bQmgq7xN

Parsing PDFs is insanely hard. This is completely unintuitive at first glance, considering PDFs are the most commonly used container of unstructured data in the world. I wrote a blog post digging into the PDF representation itself, why it's impossible to "simply" read the page into plaintext, and what the modern parsing techniques are. The crux of the issue is that PDFs are designed to display text on a screen, not to represent what a word means.
1. PDF text is represented as glyph shapes positioned at absolute x,y coordinates. Sometimes there's no mapping from character codes back to a Unicode representation.
2. Most PDFs have no concept of a table. Tables are described as grid lines drawn with coordinates. A traditional parser would have to find intersections between lines to infer cell boundaries, then associate the text within cells algorithmically.
3. The order of operators has no relationship to reading order. You need clustering techniques to piece text back together into a coherent logical format.
That's why everyone today is excited about using VLMs to parse text, which, to be clear, has a ton of benefits but still has limitations in accuracy and cost. At @llama_index we're building hybrid pipelines that interleave both text and VLMs to deliver extremely accurate parsing at the cheapest price points. Blog: https://t.co/iLJpIr7cbH LlamaParse: https://t.co/TqP6OT5U5O
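Point 3 above is concrete enough to sketch. This is an illustrative toy (not LlamaParse internals): a PDF content stream hands you text fragments at absolute (x, y) coordinates in arbitrary order, and recovering reading order means clustering fragments into visual lines by y, then sorting each line left to right by x.

```python
def reconstruct_lines(fragments, y_tolerance=2.0):
    """fragments: list of (x, y, text) tuples in page coordinates
    (y increasing downward). Returns lines in inferred reading order."""
    lines = []  # each entry: [y_anchor, [(x, text), ...]]
    for x, y, text in fragments:
        for line in lines:
            if abs(line[0] - y) <= y_tolerance:  # close enough in y: same visual line
                line[1].append((x, text))
                break
        else:
            lines.append([y, [(x, text)]])
    lines.sort(key=lambda line: line[0])            # top to bottom
    return [" ".join(t for _, t in sorted(frags))   # left to right within a line
            for _, frags in lines]

# Fragments arrive in arbitrary stream order:
frags = [(120, 50.5, "world"), (10, 100, "second"), (10, 50, "hello")]
print(reconstruct_lines(frags))  # ['hello world', 'second']
```

Real parsers need much more than this (multi-column detection, rotated text, font-size-aware tolerances), which is exactly why the post argues the naive "read the stream" approach falls apart.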

Remember that you cannot take LLM statements seriously as evidence of their consciousness; if you do, then no falsifiable, non-trivial theory of consciousness could apply to them https://t.co/JciOf2ajKP
Here is a study that I won't bother reading, not even the abstract. It is clear it is flawed. 6 eggs and a 20oz ribeye steak a day is a good basis for a healthy diet. https://t.co/pORtrQepgH
This headline. THIS HEADLINE. "Eating just ONE egg a day increases your risk of diabetes by 60%" I need everyone to sit with that for a second. I eat 6 to 12 eggs a day. Every single day. For the past year. And in that same year I:
- Reversed type 2 diabetes
- Reversed insulin resistance
- Reversed sleep apnea
- Stabilized my blood sugar
- Balanced my hormones
- Eliminated perimenopause symptoms
- Lost 140 pounds
- Got my brain back
On EGGS. Beef. Bacon. Butter. This is exactly the kind of fear propaganda that kept me sick for 47 years. Headlines designed to keep you terrified of real food so you keep buying processed garbage and prescription pills instead. Eggs are one of the most nutrient dense foods on the planet. Complete protein. Choline for your brain. B12 for your nervous system. Healthy fat for your hormones. The yolk alone is a multivitamin. They don't want you to know that. A healthy population is a terrible business model. I was headed for an early grave. Eggs didn't almost kill me. Believing this kind of nonsense almost killed me. Never again. Ready to stop being lied to and start actually healing? Free tools in my bio! Your second chance is waiting.
GPU Puzzle #6: implement a kernel that adds 10 to each position of a vector. The solution is just 3 lines, and getting there requires understanding global thread indexing and what breaks when you skip the bounds check. Full walkthrough in our new video: https://t.co/BPmZugk3q6
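The two ideas the puzzle exercises can be sketched on the CPU. This is a plain-Python model of a 1-D GPU launch, not the actual kernel or the puzzle's solution: each (block, thread) pair computes one global index, and the bounds check keeps the threads of the final partial block from writing past the end of the vector.

```python
def add_ten(vec, threads_per_block=4):
    """CPU model of a 1-D GPU launch over vec: every (block, thread)
    pair computes one global index and handles one element."""
    n = len(vec)
    out = [0] * n
    # Enough blocks to cover n elements, rounding up.
    num_blocks = (n + threads_per_block - 1) // threads_per_block
    for block_idx in range(num_blocks):
        for thread_idx in range(threads_per_block):
            i = block_idx * threads_per_block + thread_idx  # global thread index
            if i < n:  # bounds check: idle threads past the end do nothing
                out[i] = vec[i] + 10
    return out

print(add_ten([0, 1, 2, 3, 4, 5]))  # [10, 11, 12, 13, 14, 15]
```

With a six-element vector and four threads per block, the launch spawns eight logical threads; without the `i < n` guard, threads 6 and 7 would write out of bounds, which on a real GPU means corrupted memory or a crash.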
A look at our booth from @NVIDIAGTC last year. We're back on March 16-19 at Booth #3004 in San Jose, pushing the frontier on NVIDIA Blackwell. Stop by to see the latest in Mojo π₯ and MAX, state-of-the-art inference on NVIDIA Blackwell, and AI-assisted kernel development in action.
Something we've been thinking about: planning in the age of capable coding agents. Agents can now implement entire requirements end-to-end. They code longer, handle more complexity, and break work down on their own. Granular task breakdown? That's the agent's job now. Requirements are what matter. We shipped a new Build experience in @BrainGridAI that reflects this. No more breaking down tasks upfront. Specify your requirement, pick your agent or paste one command. The agent creates tasks as it works, so you have a record and can resume any session without losing progress.
Most builders go from requirements straight to code. Then they spend days adjusting layouts, fixing flows, and rebuilding things that should have been caught earlier. Today we are shipping Designs in BrainGrid, a new way to visualize your app before you build it. Start with a prompt. Get a design tied to your requirement. Iterate by chatting with the agent, annotating what needs to change, or selecting individual elements for precision edits. Desktop and mobile views are there from the start. No surprises when you go to build. The gap between "what I described" and "what got built" is where time disappears. Designs closes that gap.
https://t.co/NOZsB74aAu
TGIF https://t.co/Mc9Ge2Zn0e
Introducing Skills for Perplexity Computer. Reusable capabilities and actions that Computer applies automatically when needed. Teach it once, and Computer remembers forever. Create your own skills for any tasks you perform repeatedly. https://t.co/zcc0QK4bQs
No more manually pulling data. We gave Perplexity Computer a simple prompt and a free Federal Reserve API key. Minutes later: a fully formatted Excel spreadsheet with live macro indicators and charts. https://t.co/HXLI3LptUy
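The demo's data-pulling step can be approximated by hand. A minimal sketch, assuming FRED's public `series/observations` endpoint and its usual JSON shape (the series id `FEDFUNDS` and the field names here are illustrative assumptions, not taken from the post): build the request URL, then flatten the observations into spreadsheet-ready rows.

```python
from urllib.parse import urlencode

FRED_BASE = "https://api.stlouisfed.org/fred/series/observations"

def observations_url(series_id, api_key):
    """Build a request URL for FRED's observations endpoint
    (endpoint and parameter names assumed from FRED's public API)."""
    return FRED_BASE + "?" + urlencode(
        {"series_id": series_id, "api_key": api_key, "file_type": "json"})

def to_rows(payload):
    """Flatten a FRED-style observations payload into (date, value) rows,
    skipping the '.' placeholders FRED uses for missing values."""
    return [(o["date"], float(o["value"]))
            for o in payload.get("observations", [])
            if o["value"] != "."]

# Canned payload standing in for a live fetch of observations_url(...):
sample = {"observations": [{"date": "2024-01-01", "value": "5.33"},
                           {"date": "2024-02-01", "value": "."}]}
print(to_rows(sample))  # [('2024-01-01', 5.33)]
```

Rows in this shape drop straight into a spreadsheet writer; the agent's value-add in the demo is doing this, the formatting, and the charts from a single prompt.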
This @perplexity_ai use case blew my mind. I've always wanted a tool that tracks all S&P 500 earnings and key things said by executives in their earnings calls. I simply do not have the time and bandwidth to read all 500. Prompt: I want an interactive dashboard that tracks every single earnings report in the transcripts of the S&P 500 companies every quarter. Note common themes that executives are talking about. Keywords and trends that could help me potentially make money and identify larger trends. As a momentum trader. S&P 500 Earnings Intelligence Dashboard is live and fully updated with 484 company transcripts covering the latest earnings season. What's inside:
- 5 KPIs at a glance: total companies, themes tracked, momentum signals, average sentiment, and sector coverage
- Theme frequency chart: aggregated by GICS sector, so you can see which sectors are driving each narrative (AI CapEx, margin expansion, regulatory risk, etc.)
- Sector sentiment ranked as horizontal bars: Utilities leading at 0.79, Consumer Staples trailing at 0.66
- 12 momentum signals (8 bullish, 3 caution, 1 energy transition) with the tickers behind each signal
- Searchable executive quotes with sector filters and pagination across all 484 companies
- Full sector breakdown showing every company's sentiment, themes, and key quotes
Recurring refresh: a quarterly refresh is scheduled for May 15, Aug 15, Nov 15, and Feb 15 at 9am EST (cron a07f9c7c). Each run will pull fresh transcripts for all S&P 500 constituents, reprocess through NLP, and redeploy the dashboard automatically. @jeffgrimes9 @dnlkwk @alexhong @AravSrinivas Computer IS INSANE. Can't wait to see what else I can build for myself.
Monitoring the Situation: World Radio. Built with Perplexity Computer. https://t.co/bgPVnFKNUi
Prad: "Oh that's a cool view, let me task @perplexity_ai Computer to create this with live data and an interactive dashboard" Result: https://t.co/yqpXdWE7qe
https://t.co/DZkAx8pmul https://t.co/Azj1Ct8Ff2
Skills are among the most consequential new tools for AI, and Anthropic just released a very impressive nontechnical Cowork Skill that builds Skills, including doing interviews & providing benchmarks. I think you still need to add the human touch, but this is a big leap forward https://t.co/r4fCV9roWp

On most games, performance is flat or even decreasing. What went wrong? Using classic NLP, we find AI models suffer from low discourse coherence, leading to weak performance despite relatively high information density - even when using twice as many tokens as humans. https://t.co/piUFPWyLnO

My Excel toolbar right now. They are all different from each other in ways that are only clear when you use them a lot, and which also differ from the results if you ask Claude or ChatGPT on their websites to create an Excel sheet, or if you use Cowork or Codex. Complicated! https://t.co/iAo1cxZPXg