Your curated collection of saved posts and media

Showing 24 posts · last 7 days · quality filtered
T
tanishqkumar07
@tanishqkumar07
📅
Mar 04, 2026
8d ago
🆔 96631872

I've been working on a new LLM inference algorithm. It's called Speculative Speculative Decoding (SSD) and it's up to 2x faster than the strongest inference engines in the world. Collab w/ @tri_dao @avnermay. Details in thread.

🖼️ Media
W
wonmin_byeon
@wonmin_byeon
📅
Mar 04, 2026
8d ago
🆔 46418709

🚀 New paper: Mamba–Transformer hybrid VLMs can go fast without forgetting. We introduce stateful token reduction for long-video VLMs. ✅ Only 25% of visual tokens 🚀 3.8–4.2× faster prefilling (TTFT) 🎯 Near-baseline accuracy (can exceed baseline with light finetuning) https://t.co/CJaCktyWCt

Media 1
🖼️ Media
T
tedzadouri
@tedzadouri
📅
Mar 05, 2026
7d ago
🆔 06841236

Asymmetric hardware scaling is here. Blackwell tensor cores are now so fast, exp2 and shared memory are the wall. FlashAttention-4 changes the algorithm & pipeline so that softmax & SMEM bandwidth no longer dictate speed. Attn reaches ~1600 TFLOPs, pretty much at matmul speed! Joint work w/ Markus Hoehnerbach, Jay Shah (@ultraproduct), Timmy Liu, Vijay Thakkar (@__tensorcore__), Tri Dao (@tri_dao) 1/

Media 1
🖼️ Media
T
tri_dao
@tri_dao
📅
Mar 05, 2026
7d ago
🆔 58646344

Claude / Codex also have an easier time writing some components of FA4 thanks to the fast compile time. I got Claude to debug a deadlock when we first implemented 2CTA fwd. It ran autonomously overnight for 6 hours, figured out part of the fix, but then went down a rabbit hole convincing itself that the compiler is broken (so very human 😂). After 6 hours, from Claude’s partial fix, I was able to fix the hang in 10 mins. More details here: https://t.co/ipGhC9FzET I’m hoping FA5 will be written completely by AI

Media 1
🖼️ Media
🔁 tri_dao retweeted
M
Mayank Mishra
@MayankMish98
📅
Mar 05, 2026
7d ago
🆔 79317378

FA4 now available in lm-engine: https://t.co/n47TEinAfG 13.4% end-to-end speedup for Llama 8B training on 4x GB200s (1 node) 🚀🚀🚀 1005.55 TFLOPs for SDPA vs 1140.73 for FA4 (BF16 precision) @tedzadouri @ultraproduct @__tensorcore__ @tri_dao cooked Thanks to @bharatrunwal2 for running the experiment!

Media 1
❤️ 51 likes
🔁 9 retweets
🖼️ Media
🔁 tri_dao retweeted
S
Stas Bekman
@StasBekman
📅
Mar 05, 2026
7d ago
🆔 75487320

the FA4 integration into @huggingface Transformers is here https://t.co/48XPxmKbMv you will need to apply my proposed changes at the end for it to work if the owner hasn't done it already by the time you try it out

Media 1
❤️ 22 likes
🔁 3 retweets
🖼️ Media
T
togethercompute
@togethercompute
📅
Mar 05, 2026
7d ago
🆔 35702061

Together Research has produced FlashAttention, ATLAS, ThunderKittens and more. This week at AI Native Conf: seven more releases, all coming to production soon. Thread → #ainativeconf #ainativecloud https://t.co/XXIXMRRiLe

Media 1
🖼️ Media
K
Kelaivy
@Kelaivy
📅
Mar 05, 2026
8d ago
🆔 23021787

@FPLGOAT7 I got lucky, sold Dango, sold Haaland. Tarkowski did it for me. https://t.co/BUmAWBP0W7

Media 1
🖼️ Media
K
Kelaivy
@Kelaivy
📅
Mar 05, 2026
8d ago
🆔 19474729

@yehiael22 @FPL_Harry Same here https://t.co/2gZdylxMf1

Media 1
🖼️ Media
🔁 NaderLikeLadder retweeted
A
Addy Osmani
@addyosmani
📅
Mar 05, 2026
8d ago
🆔 67805081

Introducing the Google Workspace CLI: https://t.co/8yWtbxiVPp - built for humans and agents. Google Drive, Gmail, Calendar, and every Workspace API. 40+ agent skills included.

Media 1
❤️ 14,229 likes
🔁 1,490 retweets
🖼️ Media
N
NVIDIAAIDev
@NVIDIAAIDev
📅
Mar 04, 2026
8d ago
🆔 50842580

⚠️ WARNING: THIS PRODUCT MAY CONTAIN SHELLFISH 🦞 https://t.co/zJ6n2auo6B

🖼️ Media
C
cerebral_valley
@cerebral_valley
📅
Mar 02, 2026
10d ago
🆔 07880476

Do you want to demo your project at the Meta booth during GTC? 😎 Join @Meta and @nvidia, in partnership with CV, for a full-day hackathon at @SHACK15sf, writing high-performance GPU kernels with Helion, PyTorch's new kernel authoring DSL that delivers higher performance in fewer lines of code with autotuning.
📅 March 14th, right before NVIDIA GTC. The perfect warm-up.
🏆 Prizes & perks:
> Nvidia GPUs and Nvidia DGX Spark
> Demo your project at the Meta booth during GTC
> GTC conference passes
> Ray-Ban Meta glasses
> Mentoring from Meta AI researchers & NVIDIA engineers
📍 Fully in-person | Teams of up to 4 | Rolling review, limited spots
Register below 👇

🖼️ Media
O
openinfradev
@openinfradev
📅
Mar 03, 2026
10d ago
🆔 55165932

How can we securely contain #AI? In this live discussion, experts will explore why traditional container isolation falls short for agent-based systems & what changes when agents have persistent memory, filesystem access, GPUs, or external execution authority https://t.co/qi4Mw97DPo

Media 1
🖼️ Media
S
SakanaAILabs
@SakanaAILabs
📅
Feb 20, 2026
21d ago
🆔 94132241

Want to accelerate the real-world deployment of cutting-edge AI at Sakana AI? 🚀 We are actively hiring an Applied Research Engineer to bridge basic research and product! It's an exciting position where you build next-generation solutions with your own hands while working with world-class technology 🐟💨 ▼ Details here: https://t.co/eQ7e0rIOmg https://t.co/7Mx8h3JScP

Media 1, Media 2
🖼️ Media
H
hardmaru
@hardmaru
📅
Feb 20, 2026
21d ago
🆔 40776820

We are actively hiring an Applied Research Engineer to work with us at Sakana AI! 🐟💨 https://t.co/FuEoI2xrzS

Media 1
🖼️ Media
S
SakanaAILabs
@SakanaAILabs
📅
Feb 22, 2026
19d ago
🆔 95381789

To further accelerate Sakana AI's real-world deployment of AI, we are now hiring a Recruiter! 🐟 https://t.co/qiY3upbBAV As new product development moves forward, we are actively hiring Engineers and PMs who will build unprecedented solutions. This is a key role: engaging candidates directly, with direct sourcing at its core, and assembling our core team. We look forward to hearing from people who can draw on hiring experience in the technology industry and take a central role in our growth engine! 🚀

Media 1, Media 2
🖼️ Media
P
PyTorch
@PyTorch
📅
Mar 04, 2026
9d ago
🆔 32329134

⏰ Clock’s ticking! Registration for #PyTorchCon Europe goes up €100 after 20 March. Also less than a week left to RSVP for onsite child care 👶 7–8 April | Paris 🎟 Register: https://t.co/53JVfAmOap 👶 Child care info: https://t.co/OnRpL1AQKa https://t.co/okpcmT7qu3

Media 2
+1 more
🖼️ Media
H
hardmaru
@hardmaru
📅
Feb 23, 2026
18d ago
🆔 04238572

Applied Research Engineer 🐟 https://t.co/FuEoI2xrzS

Media 1
🖼️ Media
S
SakanaAILabs
@SakanaAILabs
📅
Feb 24, 2026
16d ago
🆔 18439726

We are pleased to announce a strategic investment from Citi! https://t.co/SQp1HEGzEp This milestone marks Citi’s first such investment in a Japanese company. The investment reflects their high regard for our advanced technical capabilities and our proven track record of implementing AI within the financial sector. We are focused on developing new enterprise-grade AI solutions using nature-inspired intelligence. Our goal has consistently been to bridge the gap between cutting-edge research and practical business applications. Building on our work developing highly specialized AI agents for financial domains, we are ready to take the next step. Through this partnership, we aim to accelerate our international expansion and drive innovation in global financial services, originating from Japan.

Media 1, Media 2
🖼️ Media