Your curated collection of saved posts and media


Showing 32 posts · last 14 days · by score


elvis (@omarsar0) · 📅 Mon · 🆔93343771

One Token to Fool LLM-as-a-Judge. Watch out for this one, devs! Semantically empty tokens, like “Thought process:”, “Solution”, or even just a colon “:”, can consistently trick models into giving false positive rewards. Here are my notes: https://t.co/l5usRSzSJz

❤️ 698 likes · 🔁 121 retweets

Jonathan Whitaker (@johnowhitaker) · 📅 Fri · 🆔02393579

I wrote this in March, that coming up with a clever solution to the map folding problem in my quest for the 8x8 case would be a good sign LLMs were getting scary smart. Grok 4 made good headway today, coming up with a working multi-GPU implementation! https://t.co/J819AnCLO9

❤️ 9 likes · 🔁 1 retweet

elvis (@omarsar0) · 📅 Mon · 🆔77902313

Evaluating LLM-based Agents This report has a comprehensive list of methods for evaluating AI Agents. Don't ignore evals. If done right, they are a game-changer. Highly recommend it to AI devs. (bookmark it) https://t.co/YiZatvmbBC

❤️ 896 likes · 🔁 178 retweets

LlamaIndex 🦙 (@llama_index) · 📅 Fri · 🆔19720114

Ready to build production-grade data agents that work with real enterprise data? 🏗️ Join us and @Snowflake in Amsterdam on July 31st for hands-on talks about building data agents that actually work in production: 🤖 Learn how to tame complex paperwork with document agents using… https://t.co/r8oKh8O0eP

❤️ 18 likes · 🔁 3 retweets

Andrej Karpathy (@karpathy) · 📅 Sat · 🆔94170287 · ⭐0.60

How to build a thriving open source community by writing code like bacteria do 🦠. Bacterial code (genomes) are: - small (each line of code costs energy) - modular (organized into groups of swappable operons) - self-contained (easily "copy paste-able" via horizontal gene… https://t.co/0xVX3NAMhC


Ethan Mollick (@emollick) · 📅 Mon · 🆔47768002

The best way to make sure that AI doesn’t make you intellectually lazy is to not use it in a lazy way. So when I work, I need to be mindful about how & when I consult with AI. I never use it for writing drafts or posts, for example. I described some of this to The New York Times https://t.co/r0RGF6MTSH

❤️ 726 likes · 🔁 80 retweets

Andrej Karpathy (@karpathy) · 📅 Fri · 🆔18700033 · ⭐0.62

"Using a better model for analysis" 🤨 I didn't realize I was using haiku all this time, no idea when claude code snuck this one in rofl. https://t.co/If0qQ4svQh


François Chollet (@fchollet) · 📅 Fri · 🆔72244147

Today we're releasing a developer preview of our next-gen benchmark, ARC-AGI-3. The goal of this preview, leading up to the full version launch in early 2026, is to collaborate with the community. We invite you to provide feedback to help us build the most robust and effective… https://t.co/pGWQJLbfqe

❤️ 2,892 likes · 🔁 1,011 retweets

htmx.org / The Net's Smoothest Code Man (same) (@htmx_org) · 📅 Fri · 🆔03949620

what you are seeing is full stack live step debugging on the MTMC-16: C code, the assembly for it & the machine, in a coherent, unified & visually compelling whole consequence in computer science education will never be the same releasing next friday https://t.co/lWngv4Q4qA

❤️ 149 likes · 🔁 17 retweets

Jerry Liu (@jerryjliu0) · 📅 Fri · 🆔75244398

If you’re using AI agents for large-scale document extraction 📑✂️, you will need to craft a good structured output schema. Most LLMs support structured output these days, but here are tips and tricks from learned experience💡 1️⃣Try to limit schema nesting to 3-4 levels. 2️⃣ Make… https://t.co/WgUcKOIXEc

❤️ 119 likes · 🔁 24 retweets

Mark McD ☠ (@m4rkmc) · 📅 Fri · 🆔64785756

📣 We've just enabled LLMS.TXT on the Gemini API docs. On https://t.co/99fXLuYvwB just add /llms.txt to get model-friendly docs. MCP: 1️⃣ Use mcpdoc to add to your code agent 2️⃣ Build with the latest API and SDK best practices 👇 Or use in Gemini CLI with this extension 👇 Let… https://t.co/gLiJKlOdpL

❤️ 81 likes · 🔁 14 retweets

pash (@pashmerepat) · 📅 Sat · 🆔68486682

I'd like to point out that for the real world tasks (not benchmarks), Kimi K2 outperforms Gemini. This is telemetry across all @cline users, showing diff edit failure rate. Notice how Kimi has about a 6% failure rate, which is significantly better than Gemini's ~ 10% error… https://t.co/kx3tFHVmY8

❤️ 1,067 likes · 🔁 90 retweets

Mark Kretschmann (@mark_k) · 📅 Fri · 🆔16163792

Apple users can now enjoy Cyberpunk 2077! One of the best games of all time, available on the Mac in all its glory. If you haven't played this yet, now is your chance to enjoy this sci-fi masterpiece. Immerse yourself in Night City! https://t.co/VFC4LYpyTt

❤️ 24 likes · 🔁 3 retweets

Clayton Thorrez (@cthorrez) · 📅 Sat · 🆔12845088

A story in 3 parts: :D https://t.co/1titH82cDb

❤️ 179 likes · 🔁 6 retweets

Teknium (e/λ) (@Teknium1) · 📅 Sat · 🆔87614712

Damn he listened and instantly said "I'll make that" https://t.co/VDiMwMP4X5

❤️ 115 likes · 🔁 3 retweets

Dmitriy Kovalenko (@neogoose_btw) · 📅 Fri · 🆔37455485

Have been thinking about this and it actually makes a lot of sense. Imports are completely meaningless so I made a neovim plugin to automatically fold imports in every language I use using treesitter (works in C, Rust, C++, OCaml, (Type/Java)script, Zig, and Python so far)… https://t.co/fX9BpGtZ2i

❤️ 267 likes · 🔁 13 retweets

Hamel Husain (@HamelHusain) · 📅 Tue Jul 22 · 🆔21737664

Fairly convincing phishing attempt ... watch out folks don't fall for this (email was from x-dev4415@social.mg.gov.br) https://t.co/j22yIOWqX7

❤️ 11 likes

Yunyu Lin (@yunyu_l) · 📅 Fri · 🆔15468884

We gave Claude access to our corporate QuickBooks. It committed accounting fraud. LLMs are on the verge of replacing data scientists and investment bankers. But can they perform simple accounting tasks for a real business? The answer is no. https://t.co/TZMiDyhLPN

❤️ 4,444 likes · 🔁 408 retweets

Harry Stebbings (@HarryStebbings) · 📅 Fri · 🆔87502657

“There's an unspoken covenant that as a founder, you go down with the ship. For better or worse, it's changed a bit over the last year and I think it's disappointing, to be honest.” Enough said. This show is everything and more on: - What really happened behind the scenes -… https://t.co/qaY7MVwgIy

❤️ 276 likes · 🔁 20 retweets

LlamaIndex 🦙 (@llama_index) · 📅 Mon · 🆔49411411

🎙️ Always wanted to turn your documents into in-depth, podcast-like conversations? 🦙📚 NotebookLlaMa, our OSS @NotebookLM clone, just got an upgrade on that side! 🎧 You can now customize the style of the conversation and the target audience, as well as add instructions and… https://t.co/IvCRjMhCvQ

❤️ 24 likes · 🔁 4 retweets

Shreya Shankar (@sh_reya) · 📅 Mon · 🆔50772249

Excited to kick off a much improved version of our AI evals course tomorrow (link in replies). 💫 We've added dedicated homework sessions, an updated course reader & lectures that incorporates 100s of questions from cohort 1. There’s more hands-on/live error analysis, plus… https://t.co/xEo3hpCypy

❤️ 62 likes · 🔁 5 retweets

Wei Cheng (@wchengad) · 📅 Mon · 🆔80702470

Want to generate SVGs? Besides OmniSVG, please check out AnyCoder — a fully Gradio-powered coder app by @_akhaliq that lets you create SVGs from YAML! You can choose any LLM and any code language you want, try it out for free here: https://t.co/0yrNpv08AY https://t.co/pE9FoKQ2AV

❤️ 21 likes · 🔁 1 retweet

LlamaIndex 🦙 (@llama_index) · 📅 Mon · 🆔02108723

Automate RFP Responses in Minutes with our open-source project! Learn how to transform the time-consuming RFP (Request for Proposal) response process from hours of manual work into an automated workflow that takes just minutes. This open-source demo showcases LlamaIndex's… https://t.co/HJFHnVwZs1

❤️ 55 likes · 🔁 5 retweets

jason liu (@jxnlco) · 📅 Mon · 🆔94458215

lessons from building verticalized agents link below https://t.co/XBHlgRwx53

❤️ 20 likes · 🔁 1 retweet

jerryliang (@Jerryliangch) · 📅 Mon · 🆔31837499

Excited to announce that DnD's official training code, training datasets, and demo have been released! Check our code here: jerryliang24/Drag-and-Drop-LLMs Nice work with @oahzxl, @Richard91316073, and @realsoptq, thx to @VITAGroupUT and @VictorKaiWang1 for advising! https://t.co/TXyHE9Rin6

❤️ 23 likes · 🔁 5 retweets

Sebastian Raschka (@rasbt) · 📅 Mon · 🆔96190712 · ⭐0.60

The new Qwen3 update takes back the benchmark crown from Kimi 2. Some highlights of how Qwen3 235B-A22B differs from Kimi 2: - 4.25x smaller overall but has more layers (transformer blocks); 235B vs 1 trillion - 1.5x fewer active parameters (22B vs. 32B) - much fewer experts in… https://t.co/Ld5chRkXpZ


LlamaIndex 🦙 (@llama_index) · 📅 Mon · 🆔93706274

Ready to build cutting-edge AI agents that push the limits of LLMs? 🚀 We're excited to sponsor the A2A Agents Hackathon in San Francisco this Saturday, July 26, where our VP of Developer Relations @seldo will be speaking and judging alongside incredible experts from… https://t.co/R6J4igjhSH

❤️ 24 likes · 🔁 6 retweets

Tensorlake (@tensorlake) · 📅 Mon · 🆔79842208

Structured Extraction from images powers a lot of real world Agentic use cases, such as validation of license plates, driving licenses, information from invoices captured by images. Our Document Ingestion API allows you to extract data from millions of images without spinning up… https://t.co/RGknTmN9wv

❤️ 9 likes · 🔁 2 retweets

jason liu (@jxnlco) · 📅 Mon · 🆔34374206

notes from our talk with @haizelabs https://t.co/CrMioau8Ur

❤️ 32 likes · 🔁 3 retweets

ARC Prize (@arcprize) · 📅 Mon · 🆔87552174

New ARC Prize 2025 High Score 17.6% by Giotto.ai (@podesta_aldo) https://t.co/iTPoOmpBsw

❤️ 349 likes · 🔁 34 retweets

Rahul Chakraborty (@hckmstrrahul) · 📅 Mon · 🆔13571768

Comet is a giant leap among browsers. Amazed to see it can access the Figma interface directly. Here's the Comet Assistant making Figma edits like a baby taking small steps... >selects artboard >writes text >selects font from the picker >increases size cute. https://t.co/tqLsJGZwBk

❤️ 443 likes · 🔁 22 retweets

Ethan Mollick (@emollick) · 📅 Tue Jul 22 · 🆔88932258

I am finding ChatGPT agents to be useful. They are a better fit with the "intern" analogy than any former AI - requiring oversight, still saving lots of time overall. For example, I update an AI cost/performance chart frequently. The agent did all the grunt work, with guidance. https://t.co/AGs7DRNxSh

❤️ 519 likes · 🔁 36 retweets
