Your curated collection of saved posts and media

Showing 32 posts · last 14 days · by score
llama_index
@llama_index
📅
Sep 05, 2025
233d ago
🆔73783135

Command-line agents can get you really far in document search and analysis! We tested SemTools, our CLI toolkit for parsing and semantic search, with coding agents like @claude_code on 1000 @arxiv papers. The results show that combining Unix tools with semantic search capabilities creates surprisingly capable knowledge workers.
🔍 SemTools adds parse and search commands that let agents handle complex documents with fuzzy semantic keyword search
📊 Agents with semantic search provided more detailed, accurate answers across search, cross-reference, and temporal analysis tasks
⚡ CLI access proves incredibly powerful relative to effort - leveraging existing Unix tooling instead of building custom RAG infrastructure
🛠️ The combination of grep, find, and semantic search handles a wide variety of document tasks at high fidelity
Learn about our SemTools experiment and see the full benchmark results: https://t.co/3LeaejfRWc

llama_index
@llama_index
📅
Sep 05, 2025
233d ago
🆔75482506

Global enterprises like Cemex use LlamaCloud to radically accelerate their data ingestion processes for maintenance, supply chain operations, health and safety, and more! Check out the full video here: https://t.co/ZBXBRh3Xkx https://t.co/SfQmbJ7wwv

jerryjliu0
@jerryjliu0
📅
Sep 05, 2025
233d ago
🆔81168732

grep (and lightweight semantic search) are all you need 🤔 When you have a “medium”-sized dataset, e.g. 1000 ArXiv PDFs, we found that an extremely strong Q&A baseline is just giving agents access to the CLI, along with some tools for fast semantic search using static embeddings. These agents can answer complex questions, from simple search/filter with keywords, to those that require cross-referencing across docs, to those that require analysis across time. In these cases standard RAG with fixed top-k retrieval is strictly worse. We made file understanding + semantic search very CLI accessible through semtools, come check it out! Blog by @LoganMarkewich : https://t.co/kYr8KkWLYR SemTools: https://t.co/xg1iqbghIr

@llama_index • Fri Sep 05 16:54

Command-line agents can get you really far in document search and analysis! We tested SemTools, our CLI toolkit for parsing and semantic search, with coding agents like @claude_code on 1000 @arxiv papers. The results show that combining Unix tools with semantic search capabiliti

ai_for_success
@ai_for_success
📅
Sep 03, 2025
235d ago
🆔09345951

Story of every CEO right now 😂 https://t.co/ukZlFa46ZW

cihangxie
@cihangxie
📅
Sep 03, 2025
235d ago
🆔53494832

🚀 ~4 months ago, we introduced OpenVision, a fully open, cost-effective family of vision encoders that rival OpenAI’s CLIP and Google’s SigLIP. Today, we’re back with a major update: OpenVision 2 🎉 A thread 🧵 (1/n) https://t.co/FkLG2a6hnf

@cihangxie • Thu May 08 20:22

Still relying on OpenAI’s CLIP (a model released 4 years ago with limited architecture configurations) for your Multimodal LLMs? 🚧 We’re excited to announce OpenVision: a fully open, cost-effective family of advanced vision encoders that match or surpass OpenAI’s CLIP and Goog

TsingYoga
@TsingYoga
📅
Sep 04, 2025
234d ago
🆔26614940

We can finally share UI-TARS-2 🥳🥳, a native GUI agent trained with multi-turn agent RL
⚡️⚡️ Key highlights (all-in-one model!):
💻 Computer Use: 47.5 OSWorld · 50.6 WindowsAgentArena
📱 Phone Use: 73.3 AndroidWorld
🛜 Browser Use: 88.2% Online-Mind2Web
🎮 Gameplay: ~60% human on 15 titles · strong on LMGame-Bench
🧑‍💻 Terminal Use: 68.7 SWE-Bench · 45.3 TerminalBench
🔨 Tool Use: 29.6 BrowseComp
Hybrid flows: GUI clicks + terminal cmds + API calls in one trace
Paper https://t.co/gWUAYgHGdL Demo https://t.co/j8ucLo4Oeo

sopharicks
@sopharicks
📅
Sep 04, 2025
234d ago
🆔92912844

I'm a big believer in open source technology. And this conversation with @arankomatsuzaki was very special to me. Aran back in 2021 contributed to GPT-J, the first open-source LLM that matched the capabilities of GPT-3. It was a glimpse of hope that open source models can be comparable to closed models. In this conversation we covered GPT-5, do scaling laws still work, is there a future for open-source models, how founders should think about building a company in the era of AI, and much more. The full discussion below 👇

๐Ÿ–ผ๏ธ Media
D
dmsobol
@dmsobol
📅
Sep 04, 2025
234d ago
🆔23890376

After part 3 of the MoE 101 series we got two main questions: 1. Why is the MoE forward pass slower than a dense network? 2. Why can't I train 64 experts on a single GPU without hitting OOM? We discuss both problems and solutions in part 4: https://t.co/uW6H78ZE56 1/n 🧵

@CerebrasSystems • Thu Sep 04 19:14

MoE 101 - Episode 4: Theoretical: 60% fewer FLOPs. Reality: 7x slow down You followed all the tips from our last video. Your MoE model finally trains… Then you try to scale it on GPUs...and... memory issues ❌, underused experts ❌, unpredictable compute bottlenecks ❌. From expe

arankomatsuzaki
@arankomatsuzaki
📅
Sep 05, 2025
234d ago
🆔06643350

Meta introduces Set Block Decoding (SBD), a new inference accelerator for LLMs. SBD samples multiple future tokens in parallel, cuts forward passes by 3–5x, needs no arch changes, stays KV-cache compatible, and matches NTP training performance. https://t.co/Ov1ZO22Rce

arankomatsuzaki
@arankomatsuzaki
📅
Sep 05, 2025
234d ago
🆔68626156

Learning When to Plan: LLM agents trained with dynamic planning learn when to spend test-time compute, balancing cost & performance. This is the first work to explore training LLM agents for dynamic test-time compute allocation in sequential decision-making tasks. https://t.co/SoJWoJju2Y

arankomatsuzaki
@arankomatsuzaki
📅
Sep 05, 2025
234d ago
🆔50208101

Inverse IFEval: a new bench testing whether LLMs can unlearn stubborn training habits and follow counter-intuitive instructions. - 8 challenge types (e.g. counterfactuals, flawed text) - 1k Qs + 23 domains - Reveals LLMs’ cognitive inertia and need for adaptability https://t.co/CewuwI4h2W

arankomatsuzaki
@arankomatsuzaki
📅
Sep 05, 2025
234d ago
🆔28203273

abs: https://t.co/lyqJy3nBwC data: https://t.co/COMzkM5m5a

arankomatsuzaki
@arankomatsuzaki
📅
Sep 05, 2025
234d ago
🆔69730114

RL’s Razor: On-policy RL forgets less than SFT. Even at matched accuracy, RL shows less catastrophic forgetting. Key factor: RL’s on-policy updates bias toward KL-minimal solutions. Theory + LLM & toy experiments confirm RL stays closer to base model. https://t.co/NGXSmcgnVA

Modular
@Modular
📅
Sep 03, 2025
235d ago
🆔70114759

Did you miss the latest in-person Modular Meetup? We've got you covered! To kick the event off, @clattner_llvm walked the audience through the newly released Mojo vision and roadmap documents. The full video is available now! https://t.co/myvGefDFZ3

Modular
@Modular
📅
Sep 04, 2025
234d ago
🆔06570196

On September 8, join us for the next @Modular community meeting! Topics include "Porting GSplat Kernels to Mojo" and "HyperLogLog in Mojo." We'll also be leading a special overview and Q&A session on the newly released Mojo Vision and Roadmap documents! https://t.co/zKQb9giU2u

yminsky
@yminsky
📅
Sep 04, 2025
235d ago
🆔58616147

A new episode of Signals and Threads just dropped! This one is an interview with @ChrisLattner, talking about Mojo, a new-ish language for GPU programming that's aiming to be an alternative to the CUDA stack. https://t.co/jcWQ5O82Sl

Modular
@Modular
📅
Sep 05, 2025
233d ago
🆔88590200

At last week's @Modular community meetup, @FeifanF showcased how @inworld_ai partnered with Modular to build production-ready voice AI, including incredible TTS demos. Check out the video now for a taste of real-world AI applications in action! https://t.co/1J2rGnYLtU

YiTayML
@YiTayML
📅
Sep 02, 2025
236d ago
🆔46512979

Super trees or super intelligence? 😃 Enjoying Gardens by the Bay with @quocleix @denny_zhou and @benoitschilling 😃 https://t.co/dtPbDf4yHx

AravSrinivas
@AravSrinivas
📅
Sep 03, 2025
235d ago
🆔48269939

Moving up the American App Store rankings https://t.co/SdFpS1b4AW

AravSrinivas
@AravSrinivas
📅
Sep 04, 2025
234d ago
🆔19001306

Continuing to move up in America https://t.co/a41DAmptnt

AravSrinivas
@AravSrinivas
📅
Sep 04, 2025
234d ago
🆔44276366

Comet is coming soon to mobile and is now available for pre-orders on Android Play Store https://t.co/vcM0n8LGZw

AravSrinivas
@AravSrinivas
📅
Sep 05, 2025
234d ago
🆔81882029

Another major Perplexity iOS app update. Team cooked. Answers are now streamed smooth as butter. Tables, markdown, intermediate steps. Update and enjoy! https://t.co/vz9CknOqvh

@jonathonstaff • Thu Sep 04 23:36

Last night, we quietly rolled out another major update to the Perplexity iOS app, this time focusing on answer rendering. The team did an incredible job on this. Best-in-class performance while streaming and delightful animations make the experience feel super polished.

AravSrinivas
@AravSrinivas
📅
Sep 05, 2025
234d ago
🆔40652828

Perplexity Finance pages now support future estimated revenues for individual American stocks. Estimates for Indian stocks coming next week. https://t.co/ruLXecM40l

dnlkwk
@dnlkwk
📅
Sep 05, 2025
233d ago
🆔93170158

At #4 within 2 weeks of the iOS redesign and update. https://t.co/syJXW9guB9

jeffgrimes9
@jeffgrimes9
📅
Sep 06, 2025
232d ago
🆔66673011

US equity pages on Perplexity Finance now include institutional holders info. Tap the Holders tab to view. We'll be expanding this to include insider activity and politician holdings soon. https://t.co/3xcKIq3qSh

emollick
@emollick
📅
Sep 04, 2025
234d ago
🆔00211772

This paper finds LLMs' ability to understand that others have different beliefs (Theory of Mind) comes from 0.001% of their parameters. Break those specific weights & the model loses both its ability to track what others know AND language comprehension. Interesting implications. https://t.co/sBjG7L4eGZ

emollick
@emollick
📅
Sep 04, 2025
234d ago
🆔63153640

Paper: https://t.co/LJh6vRjXMa

emollick
@emollick
📅
Sep 05, 2025
234d ago
🆔26861875

Never-built architecture and AI. Gemini image generator (nano banana) does a pretty good job imagining what Boullée’s Cenotaph, his fantastical (and never-built) tomb for Isaac Newton, would have looked like. I gave it the original 1784 black and white drawings to work with. https://t.co/y3BH5zzEzH

emollick
@emollick
📅
Sep 05, 2025
233d ago
🆔51617322

Gemini, just like everybody else. From a fascinating blog post about AI agents assigned to play web games, and failing, in large part because vision and computer use tools aren’t good enough: https://t.co/39Plwlf0lH https://t.co/XOmrszY6L4

emollick
@emollick
📅
Sep 06, 2025
232d ago
🆔99638063

Paper from OpenAI says hallucinations are less a problem with LLMs themselves & more an issue with training on tests that only reward right answers. That encourages guessing rather than saying “I don’t know.” If this is true, there is a straightforward path for more reliable AI. https://t.co/0gxLoIt6ft

polynoamial
@polynoamial
📅
Sep 12, 2024
591d ago
🆔57426689

@OpenAI o1 is trained with RL to “think” before responding via a private chain of thought. The longer it thinks, the better it does on reasoning tasks. This opens up a new dimension for scaling. We’re no longer bottlenecked by pretraining. We can now scale inference compute too. https://t.co/niqRO9hhg1

iScienceLuvr
@iScienceLuvr
📅
Sep 04, 2025
234d ago
🆔87901821

@cloneofsimo wow yours is worse than mine AI is a very male-leaning field i guess https://t.co/vZ3fpIDKJg
