Your curated collection of saved posts and media
Kimi base thinks Claude is the best AI https://t.co/rjo3mBfXJl
Feel like your AI is bullshitting you? It's not just you. We quantified machine bullshit. Turns out, aligning LLMs to be "helpful" via human feedback actually teaches them to bullshit, and Chain-of-Thought reasoning just makes it worse! Time to rethink AI alignment. https://t.co/rL64tIQZdH
This is tinygrad's description of the tensor cores of all the major GPUs. No per GPU dialects, just a spec for what they each are. https://t.co/BEcFdRxFNK
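To give a feel for what a vendor-neutral tensor-core spec can look like, here is a minimal sketch in that spirit. The class name, fields, and entries below are illustrative stand-ins, not tinygrad's actual code, though the tile shapes correspond to real hardware instructions (NVIDIA m16n8k16 HMMA, AMD wave32 WMMA, Apple simdgroup matrices).

```python
from dataclasses import dataclass

# Illustrative sketch (not tinygrad's actual class): a tensor core is
# fully described by its tile shape, operand/accumulator dtypes, and how
# many threads cooperate on one tile -- one spec, no per-vendor dialect.
@dataclass(frozen=True)
class TensorCoreSpec:
    dims: tuple[int, int, int]   # (M, N, K) tile computed per instruction
    dtype_in: str                # element type of the A/B operands
    dtype_out: str               # element type of the accumulator
    threads: int                 # threads cooperating on one tile

# Hypothetical table entries in the spirit of the tweet:
NVIDIA_HMMA = TensorCoreSpec(dims=(16, 8, 16),  dtype_in="half", dtype_out="float", threads=32)
AMD_WMMA    = TensorCoreSpec(dims=(16, 16, 16), dtype_in="half", dtype_out="float", threads=32)
APPLE_SIMD  = TensorCoreSpec(dims=(8, 8, 8),    dtype_in="half", dtype_out="float", threads=32)
```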
Thought Grok-4 saturated GPQA? Not yet! Same questions, when evaluated free-form, Grok-4 is no better than its smaller predecessor Grok-3-mini! Even @OpenAI's o4-mini outperforms Grok-4 here. As impressive as Grok-4 is, benchmarks have not saturated just yet. Also, have… https://t.co/ms4SYK5X6F
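To make the multiple-choice vs. free-form distinction concrete, here is a toy sketch of the two grading regimes. The normalization and substring matching are my own stand-ins; real free-form evals typically use expert graders or an LLM judge.

```python
# Multiple choice credits a letter match; free-form requires the model
# to produce the answer itself, which is the harder test.
def score_multiple_choice(model_letter: str, gold_letter: str) -> bool:
    return model_letter.strip().upper() == gold_letter.strip().upper()

def score_free_form(model_answer: str, gold_answer: str) -> bool:
    # Toy normalization: lowercase and collapse whitespace before matching.
    norm = lambda s: " ".join(s.lower().split())
    return norm(gold_answer) in norm(model_answer)

assert score_multiple_choice("c", "C")
assert score_free_form("The product is copper(II) sulfate.", "copper(II) sulfate")
```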
htmx is designed to be obsoleted by incorporation into the HTML specification. It generalizes the idea of hypermedia controls & that's pretty much it. https://t.co/muJJbVAHUI https://t.co/Wdd2yKMpGk
I solved every single problem in the CUDA mode book. A quick thread summarizing this experience and what I learned 1/x https://t.co/KOgppjA3ev
Kimi K2 is basically DeepSeek V3 but with fewer heads and more experts: https://t.co/LrRqRCOHkl
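A side-by-side of the headline config differences, written from memory of the public Hugging Face configs; treat the exact numbers as approximate and verify against the repos before relying on them.

```python
# "Fewer heads, more experts": the attention width shrinks while the
# routed-expert pool grows, with the same number of active experts per
# token, so per-token compute stays in the same ballpark.
deepseek_v3 = {
    "num_attention_heads": 128,
    "n_routed_experts": 256,
    "num_experts_per_tok": 8,   # active experts per token
}
kimi_k2 = {
    "num_attention_heads": 64,  # fewer heads
    "n_routed_experts": 384,    # more experts
    "num_experts_per_tok": 8,   # same active count per token
}
```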
Adaptive Form Extraction. Form understanding (e.g. tax forms, patient intake, certifications, questionnaires) is a huge use case for AI agents, but defining a good output schema is painful: 1) it can be very tedious/custom, and 2) each form has its own values. Instead of… https://t.co/mNCiIenVR5
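One way to read "adaptive": instead of a hand-built schema per form type, use a single generic schema where the model returns whatever labeled fields it finds. The names below are my own illustration of the idea, not any vendor's API.

```python
from pydantic import BaseModel

# One adaptive schema covers tax forms, intake forms, questionnaires...
class ExtractedField(BaseModel):
    label: str         # field name as printed on the form
    value: str         # what the filer wrote or checked
    confidence: float  # extractor's self-reported confidence

class ExtractedForm(BaseModel):
    form_title: str
    fields: list[ExtractedField]

doc = ExtractedForm(
    form_title="Patient Intake",
    fields=[ExtractedField(label="Date of Birth", value="1984-03-02", confidence=0.97)],
)
```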
I had the impression that Kimi K2 uses a better, more diverse vocabulary than I was used to seeing, so I ran a quick linguistic diversity analysis on the SpeechMap data, and yep, Kimi K2 has the top score. Details on the calculation in thread. https://t.co/P4WRNqf7dz
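The thread has the actual methodology; as a stand-in, here is one common lexical-diversity measure (root type-token ratio), which is less length-sensitive than plain TTR. This is purely my assumption of the flavor of calculation, not the author's metric.

```python
import re

def root_ttr(text: str) -> float:
    # Unique word types divided by the square root of total tokens.
    tokens = re.findall(r"[a-z']+", text.lower())
    return len(set(tokens)) / (len(tokens) ** 0.5) if tokens else 0.0

# Score each model's concatenated responses and rank by diversity.
responses = {"kimi-k2": "...model outputs here...", "other-model": "..."}
scores = {name: root_ttr(text) for name, text in responses.items()}
```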
Interesting: the CEO of @Kimi_Moonshot was the first author of XLNet and Transformer-XL, both of which were among the first models added to the @huggingface Transformers library in 2019. https://t.co/eLPkepVddF
Had no idea these existed https://t.co/ejkfsXQo0c
OpenAI was not the first company to implement tool use in the CoT. Below is @AnthropicAI over a year ago. https://t.co/ZW97k3dgv5 https://t.co/99KpJrdXq0
Comet (Perplexity's browser) has been growing on me. Initially I only used Assistant to query busy pages, but the ability to perform actions right inside the browser is really handy, too! My real-life task example on OCC memos website: https://t.co/7zboMFL7kd
In the late 1980s, @ylecun and @LeonBottou used an Amiga 1000 and a bespoke modem to implement and research artificial neural nets using SN-1. The legend was born. https://t.co/YnRyjL3r08
https://t.co/ul0ZVbiYzI
Anyone else remember this article from July 2010? First mention of Bitcoin I recall seeing. I was pretty into digital currency in the 1990s, so was excited to see it making a comeback! https://t.co/DxdcQmnB55 https://t.co/khDatlTS5W
Kimi-K2 just took top spot on both EQ-Bench3 and Creative Writing! Another win for open models. Incredible job @Kimi_Moonshot https://t.co/uD7yCmc5VS
Since the "architecture sucks now" theme is coming around on X again, a reminder that Hyatt and Marriott atriums are as impressive as cathedrals and pyramids, but, unlike cathedrals and pyramids, anyone can get a drink in them or spend the night for a reasonable fee. https://t.co/M86brKEQwB
Hilbert space should really be called Schmidt space? Credit assignment in academia and science = very noisy and dominated by politics. https://t.co/4elIxFGY8L
Enough is enough. https://t.co/5aWvwxLn4n
It turns out this isn't true. Proof in next tweet. https://t.co/qEp5i9DG38
4% of overall model responses from grok-4 in our latest SpeechMap eval mention Elon Musk (most models are <0.5%). It seems to be doubling every recent release. At this rate, by Grok 9, 100% of all model responses will talk about Elon Musk. https://t.co/ItbcZCjytm
Kimi-k2 seems to be a very good (and giant & odd) open weights model that may be the new leader in open LLMs. It is not beating the frontier closed models on my weird tests, but it doesn't have a reasoner yet. More testing needed, but Chinese open weights models are impressive. https://t.co/BCB2VeqnWJ
LFM2 - an upgrade to Liquid Foundation Models from @LiquidAI_, designed to make their models fast, memory efficient and usable on any device. What sets LFM2 apart from the rest? • New hybrid architecture with 16 blocks: - 10 double-gated short convolution blocks. They act… https://t.co/cZqxixQZ8D
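Here is a minimal PyTorch sketch of a double-gated short-convolution block, following the pseudocode in Liquid AI's LFM2 announcement; the layer sizes and kernel length are my placeholders, not the released config.

```python
import torch
import torch.nn as nn

class ShortConvBlock(nn.Module):
    """Double-gated short conv: gate -> short causal depthwise conv -> gate."""

    def __init__(self, dim: int, kernel: int = 3):
        super().__init__()
        self.in_proj = nn.Linear(dim, 3 * dim)  # produces B-gate, C-gate, and x
        self.conv = nn.Conv1d(dim, dim, kernel, groups=dim,
                              padding=kernel - 1)  # depthwise, padded for causality
        self.out_proj = nn.Linear(dim, dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:  # x: (batch, seq, dim)
        b_gate, c_gate, h = self.in_proj(x).chunk(3, dim=-1)
        h = b_gate * h                                   # input gate
        # Conv over the sequence axis; trim the right overhang to stay causal.
        h = self.conv(h.transpose(1, 2))[..., : x.size(1)].transpose(1, 2)
        h = c_gate * h                                   # output gate
        return self.out_proj(h)
```

With a kernel of only a few taps, the conv sees a short local window, which is what buys the speed and memory efficiency on edge devices relative to attention.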
Umm @natolambert what is up with RewardBench - why the heck is it using LMSys' FastChat that hasn't been updated in like a year+ for chat templates and not huggingface chat templates - this is driving me crazy, BFCL, MTBench, and now RewardBench are using hardcoded chat… https://t.co/yxmDr8S64A
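The fix being asked for is to read the chat template that ships with each model's tokenizer instead of hardcoding one per model. A minimal example with the real `transformers` API (the model name is just an example):

```python
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("meta-llama/Llama-3.1-8B-Instruct")
messages = [
    {"role": "user", "content": "Is this formatted the way the model was trained?"},
]
# Renders the checkpoint's own template, special tokens and all.
prompt = tok.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
print(prompt)
```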
Can an AI model predict perfectly and still have a terrible world model? What would that even mean? Our new ICML paper formalizes these questions. One result tells the story: A transformer trained on 10M solar systems nails planetary orbits. But it botches gravitational laws. 🧵 https://t.co/GDxnK8gaid
When you're on Comet, you're operating at a level of abstraction above choosing which AI to use and how to pull in relevant context. Agents are powerful and operate like a human would to complete the task. You go from chat turns to end-to-end workflows. https://t.co/oMA3ASUMjJ
As is the case with these talks, it's not just the facts presented (you can get them anywhere) but the story told tying the facts together! A great narrative woven by Antoine. Sharing my 2 fave slides: I've seen his thread on them here, and it's an important concept to understand https://t.co/PoTtOMXZEh
BREAKING: xAI announces Grok 4 "It can reason at a superhuman level!" Here is everything you need to know: https://t.co/z1DrpFGvnT
Today, we release the 2nd generation of our Liquid foundation models, LFM2. LFM2 sets the bar for quality, speed, and memory efficiency in on-device AI. Built for edge devices like phones, laptops, AI PCs, cars, wearables, satellites, and robots, LFM2 delivers the fastest… https://t.co/9bW5AQck1d
The Document Automation Trap: Why Your AI Pipeline Will Fail w/ @ExtendHQ - Understand your data and existing manual processes before automation. Tacit knowledge hidden within organizations is often the key to successful implementation. - Invest early in task-specific… https://t.co/Y05VQNNLEQ
We ran a randomized controlled trial to see how much AI coding tools speed up experienced open-source developers. The results surprised us: Developers thought they were 20% faster with AI tools, but they were actually 19% slower when they had access to AI than when they didn't. https://t.co/w8LSTpCFZL