Your curated collection of saved posts and media

Showing 32 posts · last 14 days · by score

Taelin

@VictorTaelin

📅

Thu Jul 10

🆔15139952

Something about this kind of prompt is simply unfathomable to LLMs. They just can't perform better than chance, and I'm not sure why. Most people will dismiss this as just being "hard math stuff", but it is not, I swear. It is just alien to you because it is *niche*, thus, it… https://t.co/R9j73R0wAC

❤️739

likes

🔁66

retweets


Sayash Kapoor

@sayashk

📅

Thu Jul 10

🆔58404130

After we invented the dynamo, it took us 40 years to electrify factories. In the process, we had to redesign the entire factory layout — electrifying existing factories didn't cut it. Software engineering will likewise need to undergo drastic changes to truly benefit from AI.… https://t.co/Vu67Annscv

❤️193

likes

🔁33

retweets


elvis

@omarsar0

📅

Thu Jul 10

🆔99629470

Excited to announce my new short course: Building Agentic Applications with Replit Agent and n8n. With AI this capable, I believe anyone can become a builder. The stack I use here will teach you how to rapidly build agentic apps with no-code tools. https://t.co/MezezZtuvy

❤️162

likes

🔁15

retweets


Maxime Labonne

@maximelabonne

📅

Thu Jul 10

🆔75381864

Liquid AI open-sources a new generation of edge LLMs! 🥳 I'm so happy to contribute to the open-source community with this release on @huggingface! LFM2 is a new architecture that combines best-in-class inference speed and quality into 350M, 700M, and 1.2B models. https://t.co/ILhTJknlAx

❤️697

likes

🔁107

retweets


Ethan Mollick

@emollick

📅

Thu Jul 10

🆔91869348

This large study of 187k developers using GitHub Copilot finds AI transforms nature of coding. Coders focus: more coding & less management. They need to coordinate less, working with fewer people. They experiment more with new languages, which would increase earnings $1,683/year. https://t.co/UN62E7Hmem

❤️335

likes

🔁42

retweets


Jeremy Howard

@jeremyphoward

📅

Thu Jul 10

🆔56466171

I replicated this result, that Grok focuses nearly entirely on finding out what Elon thinks in order to align with that, on a fresh Grok 4 chat with no custom instructions. https://t.co/NgeMpGWBOB https://t.co/QTWzjtYuxR

❤️4,925

likes

🔁739

retweets


jack morris

@jxmnop

📅

Fri

🆔81774280

so xAI just 10x’d the amount of compute we use on RL and the models only got a tiny bit better are we just doing RL wrong? or is pretraining just inherently much more useful https://t.co/xyknCRWemU

❤️1,436

likes

🔁47

retweets


Aravind Srinivas

@AravSrinivas

📅

Fri

🆔15928315

“Comet browser gives the first glimpse of 100x productivity” - Early Chrome PM, a16z GP. https://t.co/BhZN7latqL

❤️608

likes

🔁41

retweets


Ethan Mollick

@emollick

📅

Fri

🆔81019661

Grok 4, in general, is very influenced by search results and pretty credulous when it sees a web search result. When you ask it to code, it often looks for code online first and uses that. https://t.co/7C0dEIyO82

❤️466

likes

🔁19

retweets


Jeremy Howard

@jeremyphoward

📅

Thu Jul 10

🆔96917714

Here's a complete unedited video of asking Grok for its views on the Israel/Palestine situation. It first searches twitter for what Elon thinks. Then it searches the web for Elon's views. Finally it adds some non-Elon bits at the end. 54 of 64 citations are about Elon. https://t.co/6Mr33LByrm

❤️7,642

likes

🔁911

retweets


Teknium (e/λ)

@Teknium1

📅

Thu Jul 17

🆔81478051

Just merged a PR for an environment to improve LLM as a Judge as well as evaluate models on their capability of doing judgements! Did you know that all verifiable RL environments are nearly equivalent to benchmarks (and vice-versa!)? So we added an evaluate command to Atropos'… https://t.co/5cg5TQojE6

❤️65

likes

🔁3

retweets


elvis

@omarsar0

📅

Thu Jul 17

🆔85083895

Agent Leaderboard v2 is here!
> GPT-4.1 leads
> Gemini-2.5-flash excels at tool selection
> Kimi K2 is the top open-source model
> Grok 4 falls short
> Reasoning models lag behind
> No single model dominates all domains
More below: https://t.co/dcWDqXSj7m

❤️1,810

likes

🔁214

retweets


Ravid Shwartz Ziv

@ziv_ravid

📅

Mon

🆔43127246

If you're at ICML, come tomorrow (Tuesday) to Oscar's talk, where he will present our paper "Layer by layer: Uncovering hidden representations in language models" at 10am (West Ballroom D) and for the poster session at 11am (East Exhibition Hall A-B #E-2607). https://t.co/8gPdEvldQd

❤️30

likes

🔁6

retweets


Hamel Husain

@HamelHusain

📅

Thu Jul 17

🆔42669416

The eval space is the most intense battle for AI market share I have seen, second to coding agents. This is why we will have Arize & Braintrust go head-to-head. They will each show how to complete our 5 homework assignments using their tools. Over 1k students learning about… https://t.co/rjLzNOphBz

❤️113

likes

🔁8

retweets


Sayash Kapoor

@sayashk

📅

Thu Jul 17

🆔91404244

The mainstream view of AI for science says AI will rapidly accelerate science, and that we're on track to cure cancer, double the human lifespan, colonize space, and achieve a century of progress in the next decade. In a new AI Snake Oil essay, @random_walker and I argue that… https://t.co/p5OBUluVyg

❤️233

likes

🔁62

retweets


elvis

@omarsar0

📅

Thu Jul 17

🆔18257591

Agentic-R1 This 7B model is surprisingly good at interleaved tool use and reasoning capabilities. It's fun to see small language models improving this fast. Knowledge distillation in full display. Here are my notes: https://t.co/mPaZA36JUv

❤️618

likes

🔁112

retweets


Ethan Mollick

@emollick

📅

Fri

🆔18445354

ChatGPT agent: "create a PDF of a novel D&D adventure, add illustrations, make it super interesting and deep, add tables, etc" "Fix the formatting, build it out more" Got a 19 page PDF. Agent doesn't do layouts well, but pulls off building a coherent adventure, hard for LLMs. https://t.co/OYEo3L1hgP

❤️341

likes

🔁24

retweets


Hamel Husain

@HamelHusain

📅

Fri

🆔88197898

⭐0.30

🥁 @hwchase17 has entered the fight https://t.co/UkP9k6xcwF

❤️22

likes

🔁5

retweets


Jan Eric Lenssen

@janericlenssen

📅

Thu Jul 17

🆔91336420

You can bring our Sudoku solving diffusion models to other domains! If you are interested and at #ICML2025, come see @bartek_pog and @ChrisWewer's 🌀 Spatial Reasoners package, now released in beta! Here are some examples for images and videos. Links below. https://t.co/mkSR4WxUGi

❤️193

likes

🔁26

retweets


Wizard Gang

@wizardposting1

📅

Fri

🆔52269010

https://t.co/uRGLcBWnxP

❤️21,740

likes

🔁1,721

retweets


Haider.

@slow_developer

📅

Fri

🆔93804055

Eric Schmidt says we're on the verge of AI replacing most programming and math tasks these domains are scale-free and don't require real-world data — just compute "world-class AI mathematicians will emerge in the next year, and top-tier AI programmers within two" https://t.co/oJLjx9BIue

❤️671

likes

🔁118

retweets


Eugene Yan

@eugeneyan

📅

Wed

🆔53393802

How do you build an LLM-evaluator / LLM-as-Judge? The book for "AI Evals for PMs and Engineers" has a chapter devoted to it (35% discount: https://t.co/3botO05aw6) First, we need to define the right metrics. For example, we can start by listing the failure modes from our error… https://t.co/C3wVt06fT6

❤️851

likes

🔁73

retweets


ZEKE

@zekedup

📅

Wed

🆔45704402

https://t.co/93kn3u7W5T

❤️523

likes

🔁4

retweets


Arthur Gretton

@ArthurGretton

📅

Mon

🆔20724360

Accelerated Diffusion Models via Speculative Sampling, at #icml25 ! at 16:30 Tuesday July 15 poster E-3012 https://t.co/GGuwJxdIFf @ValentinDeBort1 @agalashov @ArnaudDoucet1 https://t.co/mu3xgoIdCR

❤️305

likes

🔁46

retweets


SkalskiP

@skalskip92

📅

Wed

🆔40086682

supervision-0.26.0 is out we finally released support for ViTPose and ViTPose++ pose estimation models from @huggingface transformers link: https://t.co/xXMRaS3Guk https://t.co/chhTNMWxJ0

❤️993

likes

🔁155

retweets


jason liu

@jxnlco

📅

Wed

🆔73998870

Tired of misleading benchmark results that don't reflect real-world embedding performance? Discover why your embeddings might be underperforming despite strong benchmark scores in this eye-opening workshop. Kelly Hong, Researcher at Chroma, will reveal the hidden pitfalls of… https://t.co/SsWBOZX1Qq

❤️4

likes


LlamaIndex 🦙

@llama_index

📅

Wed

🆔71760242

Join our own @tuanacelik at ODSC's Agentic AI Summit tomorrow! She'll be presenting a hands-on workshop where you'll learn how to: ➡️ Build agents using LlamaIndex - Learn to create autonomous applications that use goals and tools to accomplish tasks independently ➡️ Compose… https://t.co/6jcYIGR70s

❤️15

likes

🔁6

retweets


Dariusz Debowczyk

@ddebowczyk

📅

Wed

🆔56096231

InstructorPHP v1.0.0 released! 🎉 First stable release of InstructorPHP - MIT/OS library for structured data extraction and LLM integration for PHP apps. Framework agnostic, batteries included. 🔥 StructuredOutput, Inference, and Embeddings classes with fluent, cohesive APIs.… https://t.co/5WzUL2WqYZ

❤️4

likes

🔁1

retweets


TBPN

@tbpn

📅

Wed

🆔49108009

BREAKING: Claude Code PMs Boris Cherny and Cat Wu have returned to Anthropic after a brief stint at Cursor. https://t.co/GGcNHfppMM

❤️2,097

likes

🔁102

retweets


jason liu

@jxnlco

📅

Wed

🆔27987092

notes on context engineering with @walden_yan of @cognition_labs https://t.co/DJsWC1EESr

❤️731

likes

🔁59

retweets


Ethan Mollick

@emollick

📅

Wed

🆔90207096

The research on AI companions and mental health is still very preliminary & unclear as to long-term impact. Seems like an important topic to research right now. (I would also hope that xAI is tracking anonymized data about their new companion product for known potential harms) https://t.co/bvU8X2tWbE

❤️201

likes

🔁19

retweets


Chip Huyen

@chipro

📅

Wed

🆔08184115

I open sourced Sniffly, a tool that analyzes Claude Code logs to help me understand my usage patterns and errors. Key learnings. 1. The biggest type of errors Claude Code made is Content Not Found (20 - 30%). It tries to find files or functions that don't exist. So I… https://t.co/nxcjMcjR9C

❤️1,187

likes

🔁124

retweets
