Your curated collection of saved posts and media

Showing 32 posts Β· last 14 days Β· by score
T
Teknium (e/Ξ»)
@Teknium1
πŸ“…
Mon
πŸ†”49290521

Kimi base thinks claude is the best AI https://t.co/rjo3mBfXJl

Media 1
❀️19
likes
πŸ–ΌοΈ Media
K
Kaiqu Liang
@kaiqu_liang
πŸ“…
Thu Jul 10
πŸ†”88937980

πŸ€” Feel like your AI is bullshitting you? It’s not just you. 🚨 We quantified machine bullshit πŸ’© Turns out, aligning LLMs to be "helpful" via human feedback actually teaches them to bullshitβ€”and Chain-of-Thought reasoning just makes it worse! πŸ”₯ Time to rethink AI alignment. https://t.co/rL64tIQZdH

Media 1
❀️627
likes
πŸ”115
retweets
πŸ–ΌοΈ Media
_
the tiny corp
@__tinygrad__
πŸ“…
Sat
πŸ†”19995157

This is tinygrad's description of the tensor cores of all the major GPUs. No per GPU dialects, just a spec for what they each are. https://t.co/BEcFdRxFNK

Media 1
❀️641
likes
πŸ”44
retweets
πŸ–ΌοΈ Media
N
Nikhil Chandak
@nikhilchandak29
πŸ“…
Fri
πŸ†”99085405

🚨Thought Grok-4 saturated GPQA? Not yet! βš–οΈSame questions, when evaluated free-form, Grok-4 is no better than its smaller predecessor Grok-3-mini! Even @OpenAI's o4-mini outperforms Grok-4 here. As impressive as Grok-4 is, benchmarks have not saturated just yet. Also, have… https://t.co/ms4SYK5X6F

Media 1
❀️269
likes
πŸ”31
retweets
πŸ–ΌοΈ Media
H
htmx.org / The Net's Smoothest Code Man (same)
@htmx_org
πŸ“…
Sat
πŸ†”73496467

htmx is designed to be obsoleted by incorporation into the HTML specification it generalizes the idea of hypermedia controls & that's pretty much it https://t.co/muJJbVAHUI https://t.co/Wdd2yKMpGk

Media 1
❀️335
likes
πŸ”14
retweets
πŸ–ΌοΈ Media
T
Piotr Mazurek
@tugot17
πŸ“…
Sat
πŸ†”60350991

I solved every single problem in the CUDA mode book. A quick thread summarizing this experience and what I learned 1/x https://t.co/KOgppjA3ev

Media 1
❀️2,444
likes
πŸ”243
retweets
πŸ–ΌοΈ Media
R
Sebastian Raschka
@rasbt
πŸ“…
Sat
πŸ†”24577525
⭐0.52

Kimi K2 is basically DeepSeek V3 but with fewer heads and more experts: https://t.co/LrRqRCOHkl

Media 1
πŸ–ΌοΈ Media
J
Jerry Liu
@jerryjliu0
πŸ“…
Sat
πŸ†”42105182

Adaptive Form Extraction πŸ€–πŸ“‘ Form understanding (e.g. tax forms, patient intake, certifications, questionnaires) is a huge use case for AI agents, but defining a good output schema is painful: 1) it can be very tedious/custom, and 2) each form has its own values. πŸ’‘ Instead of… https://t.co/mNCiIenVR5

❀️141
likes
πŸ”15
retweets
πŸ–ΌοΈ Media
X
xlr8harder
@xlr8harder
πŸ“…
Sat
πŸ†”50442650

I had the impression that Kimi K2 uses a better, more diverse vocabulary than I was used to seeing, so I ran a quick linguistic diversity analysis on the SpeechMap data, and yep, Kimi K2 has the top score. Details on the calculation in thread. https://t.co/P4WRNqf7dz

Media 1
❀️935
likes
πŸ”56
retweets
πŸ–ΌοΈ Media
N
Niels Rogge
@NielsRogge
πŸ“…
Sat
πŸ†”31528112

Interesting, the CEO of @Kimi_Moonshot was the first author of XLNet and TransformerXL Both of which were among the first models added to the @huggingface Transformers library in 2019 https://t.co/eLPkepVddF

Media 1Media 2
❀️516
likes
πŸ”38
retweets
πŸ–ΌοΈ Media
T
Teknium (e/Ξ»)
@Teknium1
πŸ“…
Sun
πŸ†”08800654

Had no idea these existed https://t.co/ejkfsXQo0c

Media 1
❀️958
likes
πŸ”24
retweets
πŸ–ΌοΈ Media
J
Jeremy Howard
@jeremyphoward
πŸ“…
Sun
πŸ†”13941776

OpenAI was not the first company to implement tool use in the CoT. Below is @AnthropicAI over a year ago. https://t.co/ZW97k3dgv5 https://t.co/99KpJrdXq0

Media 1
❀️306
likes
πŸ”8
retweets
πŸ–ΌοΈ Media
M
Mikhail Parakhin
@MParakhin
πŸ“…
Sun
πŸ†”71135499

Comet (Perplexity's browser) has been growing on me. Initially I only used Assistant to query busy pages, but the ability to perform actions right inside the browser is really handy, too! My real-life task example on OCC memos website: https://t.co/7zboMFL7kd

Media 1
❀️125
likes
πŸ”9
retweets
πŸ–ΌοΈ Media
K
Kyunghyun Cho
@kchonyc
πŸ“…
Sat
πŸ†”43834168

late 1980s, @ylecun and @LeonBottou used amiga 1000 and a bespoke modem to implement and research artificial neural nets using SN-1. the legend was born. https://t.co/YnRyjL3r08

Media 1Media 2
❀️775
likes
πŸ”77
retweets
πŸ–ΌοΈ Media
M
Mathieu
@miniapeur
πŸ“…
Sat
πŸ†”90706513

https://t.co/ul0ZVbiYzI

Media 1
❀️1,829
likes
πŸ”131
retweets
πŸ–ΌοΈ Media
J
Jeremy Howard
@jeremyphoward
πŸ“…
Sun
πŸ†”33937983

Anyone else remember this article from July 2010? First mention of Bitcoin I recall seeing. I was pretty into digital currency in the 1990's, so was excited to see it making a comeback! https://t.co/DxdcQmnB55 https://t.co/khDatlTS5W

Media 1
❀️68
likes
πŸ”4
retweets
πŸ–ΌοΈ Media
S
Sam Paech
@sam_paech
πŸ“…
Sun
πŸ†”98553853

Kimi-K2 just took top spot on both EQ-Bench3 and Creative Writing! Another win for open models. Incredible job @Kimi_Moonshot https://t.co/uD7yCmc5VS

Media 1Media 2
+2 more
❀️866
likes
πŸ”110
retweets
πŸ–ΌοΈ Media
E
Ethan Mollick
@emollick
πŸ“…
Sun
πŸ†”13769883

Since the β€œarchitecture sucks now” theme is coming around on X again, a reminder that Hyatt and Marriott atriums are as impressive as cathedrals and pyramids, but, unlike cathedrals and pyramids, anyone can get a drink in them or spend the night for a reasonable fee. https://t.co/M86brKEQwB

Media 1Media 2
+2 more
❀️369
likes
πŸ”17
retweets
πŸ–ΌοΈ Media
H
steve hsu
@hsu_steve
πŸ“…
Sun
πŸ†”99328362

Hilbert space should really be called Schmidt space? Credit assignment in academia and science = very noisy and dominated by politics. https://t.co/4elIxFGY8L

Media 1
❀️48
likes
πŸ”5
retweets
πŸ–ΌοΈ Media
M
Armin Ronacher β‡Œ
@mitsuhiko
πŸ“…
Fri
πŸ†”97791695

Enough is enough. https://t.co/5aWvwxLn4n

Media 1
❀️641
likes
πŸ”14
retweets
πŸ–ΌοΈ Media
J
Jeremy Howard
@jeremyphoward
πŸ“…
Sat
πŸ†”34739866

It turns out this isn't true. Proof in next tweet. https://t.co/qEp5i9DG38

Media 1
❀️290
likes
πŸ”12
retweets
πŸ–ΌοΈ Media
X
xlr8harder
@xlr8harder
πŸ“…
Sat
πŸ†”33051918

4% of overall model responses from grok-4 in our latest SpeechMap eval mention Elon Musk (most models are <0.5%). It seems to be doubling every recent release. At this rate, by Grok 9, 100% of all model responses will talk about Elon Musk. https://t.co/ItbcZCjytm

Media 1
❀️787
likes
πŸ”67
retweets
πŸ–ΌοΈ Media
E
Ethan Mollick
@emollick
πŸ“…
Sat
πŸ†”53259374

Kimi-k2 seems to be a very good (and giant & odd) open weights model that may be the new leader in open LLMs. It is not beating the frontier closed models on my weird tests, but it doesn’t have a reasoner yet. More testing needed but Chinese open weights models are impressive. https://t.co/BCB2VeqnWJ

Media 1Media 2
+1 more
❀️436
likes
πŸ”27
retweets
πŸ–ΌοΈ Media
T
TuringPost
@TheTuringPost
πŸ“…
Thu Jul 10
πŸ†”59066073

LFM2 - an upgrade to Liquid Foundation Models from @LiquidAI_, designed to make their models fast, memory efficient and usable on any device. What sets LFM2 apart from the rest? β€’ New hybrid architecture with 16 blocks: - 10 double-gated short convolution blocks. They act… https://t.co/cZqxixQZ8D

Media 1
❀️31
likes
πŸ”5
retweets
πŸ–ΌοΈ Media
T
Teknium (e/Ξ»)
@Teknium1
πŸ“…
Sat
πŸ†”20790429

Umm @natolambert what is up with RewardBench - why the heck is it using LMSys' FastChat that hasn't been updated in like a year+ for chat templates and not huggingface chat templates - this is driving me crazy, BFCL, MTBench, and now RewardBench are using hardcoded chat… https://t.co/yxmDr8S64A

Media 1
❀️18
likes
πŸ–ΌοΈ Media
K
Keyon Vafa
@keyonV
πŸ“…
Fri
πŸ†”80331460

Can an AI model predict perfectly and still have a terrible world model? What would that even mean? Our new ICML paper formalizes these questions One result tells the story: A transformer trained on 10M solar systems nails planetary orbits. But it botches gravitational laws 🧡 https://t.co/GDxnK8gaid

❀️6,814
likes
πŸ”1,021
retweets
πŸ–ΌοΈ Media
A
Aravind Srinivas
@AravSrinivas
πŸ“…
Sat
πŸ†”38758367

When you’re on Comet, you’re operating at an abstraction above which AI to use and how to pull in relevant context. Agents are powerful and operate like a human would to complete the task. You go from chat turns to end-to-end workflows. https://t.co/oMA3ASUMjJ

Media 1
❀️758
likes
πŸ”47
retweets
πŸ–ΌοΈ Media
V
vishal
@vishal_learner
πŸ“…
Fri
πŸ†”26164854

As is the case with these talks, it's not just the facts presented (you can get them anywhere) but the story told tying the facts together! A great narrative woven by Antoine. Sharing my 2 fave slides: I've seen his thread on them here, and it's an important concept to understand https://t.co/PoTtOMXZEh

Media 1Media 2
❀️4
likes
πŸ”2
retweets
πŸ–ΌοΈ Media
O
elvis
@omarsar0
πŸ“…
Thu Jul 10
πŸ†”30828397

BREAKING: xAI announces Grok 4 "It can reason at a superhuman level!" Here is everything you need to know: https://t.co/z1DrpFGvnT

Media 1
❀️5,837
likes
πŸ”399
retweets
πŸ–ΌοΈ Media
L
Liquid AI
@LiquidAI_
πŸ“…
Thu Jul 10
πŸ†”62064990

Today, we release the 2nd generation of our Liquid foundation models, LFM2. LFM2 set the bar for quality, speed, and memory efficiency in on-device AI. Built for edge devices like phones, laptops, AI PCs, cars, wearables, satellites, and robots, LFM2 delivers the fastest… https://t.co/9bW5AQck1d

Media 1
❀️266
likes
πŸ”60
retweets
πŸ–ΌοΈ Media
J
jason liu
@jxnlco
πŸ“…
Thu Jul 10
πŸ†”18656977

The Document Automation Trap: Why Your AI Pipeline Will Fail w/ @ExtendHQ - Understand your data and existing manual processes before automation. Tacit knowledge hidden within organizations is often the key to successful implementation. - Invest early in task-specific… https://t.co/Y05VQNNLEQ

Media 1
❀️35
likes
πŸ”2
retweets
πŸ–ΌοΈ Media
M
METR
@METR_Evals
πŸ“…
Thu Jul 10
πŸ†”20388093

We ran a randomized controlled trial to see how much AI coding tools speed up experienced open-source developers. The results surprised us: Developers thought they were 20% faster with AI tools, but they were actually 19% slower when they had access to AI than when they didn't. https://t.co/w8LSTpCFZL

Media 1
❀️6,839
likes
πŸ”1,363
retweets
πŸ–ΌοΈ Media