Your curated collection of saved posts and media

Showing 32 posts · last 14 days · by score
Armin Ronacher ⇌
@mitsuhiko
📅 Sat
🆔 55170039

Not so hot take: if you use Claude Code, most of y’all’s MCP servers could be a shell script. Easier to maintain and faster and Claude uses just as well if not better. https://t.co/CWYMYjCo1S

❤️ 847 likes
🔁 51 retweets
🖼️ Media

Aravind Srinivas
@AravSrinivas
📅 Sat
🆔 43573120

Some incredible changes are happening in the world and it’s just beginning. It’s not just informational categories where search from blue links is declining. Commercial categories too like Travel, Food/Drink, Fashion and E-commerce. https://t.co/RwguPPLq92

❤️ 733 likes
🔁 104 retweets
🖼️ Media

Yuchen Jin
@Yuchenj_UW
📅 Sat
🆔 05973910

Hot take: You should still learn to code. https://t.co/XA3UKS2m1f

❤️ 7,490 likes
🔁 455 retweets
🖼️ Media

Lech Mazur
@LechMazur
📅 Sat
🆔 32554354

Two of the worst characteristics of LLMs are simply what users prefer. LMArena was a fun idea, but AI companies optimizing for it has become harmful, similar to how most people prefer ultra-processed, high-sugar foods when given a choice. https://t.co/kX32VFpJwB

❤️ 176 likes
🔁 18 retweets
🖼️ Media

Hamel Husain
@HamelHusain
📅 Sat
🆔 35067093

Error Analysis is all you need. Sebastian at Redfin about to dominate https://t.co/dR23WB2cAl https://t.co/ZXVctoW6Mh

❤️ 30 likes
🔁 4 retweets
🖼️ Media

Hamel Husain
@HamelHusain
📅 Sun
🆔 95245971

Wow on AI News again. Same feels as being front page of HN 😊 https://t.co/tiEHpryfCe https://t.co/GmodshAjQM

❤️ 53 likes
🔁 3 retweets
🖼️ Media

elvis
@omarsar0
📅 Sat
🆔 75206936

Anthropic is killing it with these technical posts. If you're an AI dev, stop what you are doing and go read this. It shows, in great detail, how to implement an effective multi-agent research system. Pay attention to these key parts: https://t.co/NRi6Xgah63

❤️ 4,639 likes
🔁 458 retweets
🖼️ Media

Ethan Mollick
@emollick
📅 Sun
🆔 72255499

Six weeks after ChatGPT I argued that we were already in a Long Singularity For 20,000 centuries of human history, nothing much happened. We spent 19,960 centuries on variations of one tool. Things only accelerated two centuries ago. Surprisingly, we have (mostly) kept adjusting https://t.co/Boo62v0ITA

❤️ 1,969 likes
🔁 209 retweets
🖼️ Media

elvis
@omarsar0
📅 Fri
🆔 47736634

TableRAG A new RAG framework for heterogeneous document reasoning. My notes below: https://t.co/MVJvdmmL7B

❤️ 707 likes
🔁 126 retweets
🖼️ Media

Ethan Mollick
@emollick
📅 Fri
🆔 14660803

By surveying workers and AI experts, this paper gets at a key issue: there is both overlap and substantial mismatches between what workers want AI to do & what AI is likely to do. AI is going to change work. It is critical that we take an active role in shaping how it plays out. https://t.co/q0uktMJhHW

❤️ 399 likes
🔁 74 retweets
🖼️ Media

LlamaIndex 🦙
@llama_index
📅 Wed
🆔 38290941

New integration: @CleanlabAI + LlamaIndex LlamaIndex lets you build AI knowledge assistants and production agents that generate insights from enterprise data. Cleanlab makes their responses trustworthy. Add Cleanlab to: • Score trust for every LLM response • Catch… https://t.co/pTjn642OUO

❤️ 69 likes
🔁 13 retweets
🖼️ Media

Jerry Liu
@jerryjliu0
📅 Wed
🆔 92985359

Using LLMs to generate structured output is easy. But building high-quality document extraction with precise citations and reasoning for every key in the extracted output is harder. LlamaExtract is our agentic document extraction service, over even the most complex documents and… https://t.co/J4SNBPz5BM

❤️ 203 likes
🔁 29 retweets
🖼️ Media

elvis
@omarsar0
📅 Wed
🆔 41630454

On building your personalized deep research agents. I recently built this deep research agentic workflow with n8n and was very impressed by the results. Combining reasoning models + multi-agent workflows is like magic! A few things I learned along the way: https://t.co/myxaKD5udF

❤️ 560 likes
🔁 69 retweets
🖼️ Media

Hamel Husain
@HamelHusain
📅 Wed
🆔 01073966

Truth from @charles_irl : "evals" is an isomorphic concept across many disciplines It means thinking scientifically with an experimentation + data driven mindset https://t.co/ARpmGAcqju

❤️ 23 likes
🔁 4 retweets
🖼️ Media

Yu Su
@ysu_nlp
📅 Wed
🆔 35802868

📈 Scaling may be hitting a wall in the digital world, but it's only beginning in the biological world! We trained a foundation model on 214M images of ~1M species (50% of named species on Earth 🐨🐠🌻🦠) and found emergent properties capturing hidden regularities in nature. 🧡 https://t.co/wIw2JVNGFG

❤️ 268 likes
🔁 57 retweets
🖼️ Media

Teknium (e/λ)
@Teknium1
📅 Wed
🆔 45380319

Claude just casually deleting a full days work on an environment for no fucking reason - fuck you claude https://t.co/j7BHXZcdKw

❤️ 925 likes
🔁 23 retweets
🖼️ Media

Jiaxin Wen
@jiaxinwen22
📅 Wed
🆔 58418441

New Anthropic research: We elicit capabilities from pretrained models using no external supervision, often competitive or better than using human supervision. Using this approach, we are able to train a Claude 3.5-based assistant that beats its human-supervised counterpart. https://t.co/p0wKBtRo7q

❤️ 1,440 likes
🔁 155 retweets
🖼️ Media

elvis
@omarsar0
📅 Wed
🆔 13700720

NEW: Meta releases V-JEPA 2, their new world model! Foundation world models aim to accelerate physical AI, the next frontier. Why is this a big deal? Let's break it down: https://t.co/QYzeaK6GPI

❤️ 601 likes
🔁 106 retweets
🖼️ Media

Erik Meijer
@headinthebox
📅 Thu Jun 12
🆔 69067273

Ever since Anthropic came out with "computer use" in October 2024, I have been trying to make it use the calculator to perform some simple calculations, like "1+2". Alas, I never got it to work reliably. Now OpenAI also has come out with computer use, so I tried again. Same… https://t.co/qxvVjXtLHa

❤️ 1,296 likes
🔁 96 retweets
🖼️ Media

Teknium (e/λ)
@Teknium1
📅 Thu Jun 12
🆔 70409357

I never thought a simple "How many x's are in y" letter counting RL environment could get so complex. Just PR'ed the letter counting environment, with more features than I thought I'd put into this lol - My first difficulty threshold built-in, so that if the model is already… https://t.co/Qzz9Gcp4W4

❤️ 32 likes
🔁 3 retweets
🖼️ Media

Eleanor Berger
@intellectronica
📅 Wed
🆔 85870351

Reminder that if you want to give o3-pro a try and don't want or can't afford the $200/month pro sub, you can access it from the Open AI playground. https://t.co/VFQoVPOpce

❤️ 100 likes
🔁 8 retweets
🖼️ Media

Ethan Mollick
@emollick
📅 Thu Jun 12
🆔 33655241

Interesting attempt by Salesforce to create a benchmark for realistic business tasks - we need more of these! Worth tracking over time (though I would love to see an contest, ARC-AGI style, to ask people to try to beat these benchmarks and see if they can with prompts & tools) https://t.co/eWokRVlFHk

❤️ 422 likes
🔁 50 retweets
🖼️ Media

Philipp Spiess
@PhilippSpiess
📅 Wed
🆔 95432242

Wrote about my learnings from using Claude Code (and coding agents in general) quite extensively for a month. I'm curious if some of you have made similar experiences and know some additional tricks? https://t.co/ElyfBeAm8x

❤️ 1,304 likes
🔁 100 retweets
🖼️ Media

jason liu
@jxnlco
📅 Wed
🆔 73280441

yeah like 8 years ago? https://t.co/PlMMhKpL12

❤️ 43 likes
🔁 2 retweets
🖼️ Media

Bryce York
@meetbryce
📅 Thu Jun 12
🆔 91406962

If you haven't already heard about @sh_reya & @HamelHusain's Maven course on evals and you have any plan to build in the LLM-space, you're missing out. I'm about to finish the course and I couldn't recommend it more highly. I can't think of a better way for an engineer or… https://t.co/qD7enaG2H1

❤️ 12 likes
🔁 3 retweets
🖼️ Media

Hamel Husain
@HamelHusain
📅 Thu Jun 12
🆔 04551264

How should I approach evaluating my RAG system? Retrieval: evaluate it like search (and optionally use a LLM to build synthetic data) Generation: follow the standard AI evaluation approach Part 1 https://t.co/qR1AB0pd9b

❤️ 29 likes
🔁 2 retweets
🖼️ Media

Aravind Srinivas
@AravSrinivas
📅 Thu Jun 12
🆔 26976542

Perplexity can now be on your video calls now thanks to Fireflies https://t.co/Tj3kr5z9Cb

❤️ 693 likes
🔁 51 retweets
🖼️ Media

Ivan Leo
@ivanleomk
📅 Thu Jun 12
🆔 20789508

gg @perplexity_ai https://t.co/Eidrh2aZiq

❤️ 1 like
🖼️ Media

Ivan Leo
@ivanleomk
📅 Fri
🆔 33861927

CI coding agents are beautiful. I think CI agents should just be a one off job with a cached sandbox perhaps to speed up changes. https://t.co/1mYoWZvTHB

❤️ 4 likes
🖼️ Media

Jeremy Howard
@jeremyphoward
📅 Thu Jun 12
🆔 88963156

Claude not able to continue my research chat about context compression papers because it ran out of context because it doesn't use context compression. https://t.co/48lGi59gI7

❤️ 626 likes
🔁 27 retweets
🖼️ Media

elvis
@omarsar0
📅 Thu Jun 12
🆔 16224876

Reasoning Models for Workflow Generation You can just generate workflows with LLMs now?! Don't sleep on RL! Something I am also working on, so glad to see research on it. My key takeaways: https://t.co/GW6RGdV4nQ

❤️ 280 likes
🔁 37 retweets
🖼️ Media

elvis
@omarsar0
📅 Thu Jun 12
🆔 59221943

Text-to-LoRA Fine-tuning effective models is hard and damn expensive! What if an AI model could help you adapt LLMs on the fly? Meet Text-to-LoRA, a hypernetwork trained to construct LoRAs in one forward pass through natural language. Here are my notes: https://t.co/zPrTlLQVwz

❤️ 377 likes
🔁 63 retweets
🖼️ Media