Your curated collection of saved posts and media

Showing 24 posts · last 30 days · by score
πŸ”tri_dao retweeted
T
Together AI
@togethercompute
πŸ“…
Mar 05, 2026
4d ago
πŸ†”35702061
⭐0.38

Together Research has produced FlashAttention, ATLAS, ThunderKittens and more. This week at AI Native Conf: seven more releases, all coming to production soon. Thread → #ainativeconf #ainativecloud https://t.co/XXIXMRRiLe

❀️76 likes · πŸ”12 retweets
PyTorch @PyTorch · Mar 04, 2026 (5d ago) · πŸ†”13580671

Recover more than 70% of the accuracy degradation from 4-bit quantization using TorchAO's (https://t.co/Jr0qtnIAgZ) Quantization-Aware Training (QAT), now available through fine-tuning in Unsloth and Axolotl! Following the previous TorchAO QAT blog (https://t.co/kXAGBfOSMZ), the PyTorch team at @Meta extended the TorchAO QAT flow to support an end-to-end GPU server flow, targeting fast CUDA kernels for inference in @vllm_project, and integrated this flow into popular fine-tuning frameworks like Unsloth and Axolotl. Read our latest blog: https://t.co/nFx4MYHoRj #PyTorch #vLLM #OpenSourceAI #TorchAO

πŸ–ΌοΈ Media ×2
ah20im @ah20im · Mar 05, 2026 (4d ago) · πŸ†”48712061

Today we are introducing GPT-5.4 in Codex. It's more token-efficient and better at tool calling, computer use, and frontend development. We are also introducing /fast to get a faster version of Codex. Enjoy ❀️ https://t.co/uTOlQsK7hE

πŸ–ΌοΈ Media
ltx_model @ltx_model · Mar 05, 2026 (4d ago) · πŸ†”29586860

If the engine is strong enough, you should be able to build real products on top of it. That's the whole point of LTX-2.3. Introducing LTX Desktop. A fully local, open-source video editor running directly on the LTX engine, optimized for NVIDIA GPUs and compatible hardware. https://t.co/aApm06E6RZ

πŸ–ΌοΈ Media
HamelHusain @HamelHusain · Mar 06, 2026 (4d ago) · πŸ†”25024284 · ⭐0.30

@pamelafox I mean I am just gonna say do evals ℒ️

omarsar0 @omarsar0 · Mar 03, 2026 (6d ago) · πŸ†”70973399 · ⭐0.34

Impressive if true. The agent harness is powered by recursive and parallel planning. Clever planning is a big deal. Everyone should be trying to build their own harness. Trust me, you really want to be exploring higher levels of orchestration for your agents right now.

omarsar0 @omarsar0 · Mar 04, 2026 (5d ago) · πŸ†”25659668

When you build AI agents, don't treat prompts like config strings. Treat them like executable business logic. Because that's what they really are.

@arshdilbagi's blog and this Stanford CS 224G lecture lay out one of the clearest mental models I have seen for LLM evaluation. Stop treating evals like unit tests. That works for deterministic software. For LLM products, it creates false confidence because real-world usage changes over time.

Example: an insurance prompt passed 20 eval cases. The team shipped. In production, a new class of requests showed up and failed quietly. No crash, no alert, just wrong answers at scale.

The fix is not "write more eval cases," which is what many teams do. It is building evals as a living feedback loop. Start with a small set, ship, watch what breaks in production, add those failures back, and re-run on every prompt or model change.

What eval failure caught your team off guard?

Blog: https://t.co/HCVhcow5rA
Stanford CS 224G lecture: https://t.co/q667gGwckt

πŸ–ΌοΈ Media ×2
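The "living feedback loop" described in the post is small enough to sketch: run the suite, fold production failures back into it, and re-run on every prompt or model change. All names here (`EvalCase`, `run_case`, `eval_loop`) are hypothetical, not code from the blog or lecture:

```python
from dataclasses import dataclass

@dataclass
class EvalCase:
    prompt: str
    expected: str

def run_case(model, case):
    # Exact-match grading; real evals would use judges or rubrics.
    return model(case.prompt) == case.expected

def eval_loop(model, suite, production_failures):
    """Grow the suite with observed production failures, then re-run it."""
    suite = suite + [EvalCase(p, e) for p, e in production_failures]
    failing = [c for c in suite if not run_case(model, c)]
    return suite, failing

model = lambda prompt: prompt.upper()    # stand-in for an LLM call
suite = [EvalCase("hi", "HI")]
# A request class that "failed quietly" in production gets folded back in.
suite, failing = eval_loop(model, suite, [("ship it", "SHIP IT")])
```

The point is the loop, not the grader: each production surprise becomes a permanent regression case that every future prompt or model change is checked against.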
Wauplin @Wauplin · Mar 05, 2026 (4d ago) · πŸ†”37015074 · ⭐0.34

huggingface_hub v1.5.0 just dropped! The highlight: Buckets. Think S3, but native to the Hub. No git history. Just fast, chunk-deduplicated object storage.

hf buckets sync ./outputs hf://buckets/me/my-checkpoints

And that's it. Currently in beta preview. DM me if interested!

πŸ”_akhaliq retweeted
W
Wauplin
@Wauplin
πŸ“…
Mar 05, 2026
4d ago
πŸ†”37015074
⭐0.32

huggingface_hub v1.5.0 just dropped! The highlight: Buckets. Think S3, but native to the Hub. No git history. Just fast, chunk-deduplicated object storage. hf buckets sync ./outputs hf://buckets/me/my-checkpoints And that's it. Currently in beta preview. DM me if interested!

❀️18
likes
πŸ”5
retweets
alvarobartt @alvarobartt · Mar 02, 2026 (8d ago) · πŸ†”97875845

πŸ’₯ Learn how to build your own tool-calling agent with @huggingface TRL + @Alibaba_Qwen Qwen3.5 on @Azure Machine Learning!
- @NousResearch hermes-function-calling-v1, 500 single-turn samples
- SFT with TRL on Qwen3.5 2B (released today!) on a single NVIDIA H100
- Everything on Azure, from Container Registry to Machine Learning!
Step-by-step in the thread 🧡

πŸ–ΌοΈ Media
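One step a pipeline like this needs is shaping each single-turn function-calling sample into the chat-message list an SFT trainer consumes. A sketch of that preprocessing; the field names (`query`, `tool`, `arguments`) are illustrative, not the actual hermes-function-calling-v1 schema:

```python
import json

def to_messages(sample, tools):
    """Turn one tool-call sample into system/user/assistant messages."""
    return [
        {"role": "system",
         "content": "You may call these tools: " + json.dumps(tools)},
        {"role": "user", "content": sample["query"]},
        # The assistant turn the model is trained to reproduce: a tool call.
        {"role": "assistant",
         "content": json.dumps({"tool": sample["tool"],
                                "arguments": sample["arguments"]})},
    ]

sample = {"query": "Weather in Paris?",
          "tool": "get_weather", "arguments": {"city": "Paris"}}
msgs = to_messages(sample, [{"name": "get_weather"}])
```

In a TRL pipeline, message lists like these would then be rendered with the model's chat template before tokenization.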
vanstriendaniel @vanstriendaniel · Mar 05, 2026 (5d ago) · πŸ†”22917718

There is no best VLM OCR model - rankings can flip completely by document type. I built ocr-bench: run open OCR models on YOUR documents, get a per-collection leaderboard. VLM-as-judge with Bradley-Terry Elo, all running on @huggingface. No local GPU needed. https://t.co/qZOwI0Wbes

πŸ–ΌοΈ Media
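The Bradley-Terry/Elo scoring mentioned above reduces to pairwise updates: the VLM judge picks a winner on a document, and each model's rating moves by the gap between outcome and expectation. A minimal sketch, not ocr-bench's actual code:

```python
def elo_update(r_a, r_b, a_wins, k=32.0):
    """Update two ratings after one pairwise judgment."""
    # Bradley-Terry win probability for A, in Elo's base-10/400 form.
    expected_a = 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400.0))
    delta = k * ((1.0 if a_wins else 0.0) - expected_a)
    return r_a + delta, r_b - delta      # zero-sum update

# Model A beats model B on one document of a collection.
ra, rb = elo_update(1000.0, 1000.0, a_wins=True)
```

Keeping a separate rating pool per document collection, rather than one global pool, is exactly what lets the leaderboard flip by document type.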
rasbt @rasbt · Mar 03, 2026 (6d ago) · πŸ†”09759477 · ⭐0.34

@BarathAnandan7 hard to say, but I think it's data as usual. I think the architecture change with linear attention is more for efficiency purposes (see Qwen3-Next)

rasbt @rasbt · Mar 04, 2026 (5d ago) · πŸ†”08130265

What's actually nice about Gated DeltaNet modules is that they don't grow the KV cache. So with that 3:1 ratio, Qwen3.5 is much more memory-friendly than the previous Qwen3 models. https://t.co/AhegasegIB

πŸ–ΌοΈ Media
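The memory claim is simple arithmetic: in a hybrid stack where only every fourth layer is full attention, only those layers hold a per-token KV cache, while the Gated DeltaNet layers keep a fixed-size state instead. A sketch with illustrative dimensions, not Qwen3.5's real config:

```python
def kv_cache_bytes(n_layers, n_kv_heads, head_dim, seq_len,
                   full_attn_every=1, bytes_per_elt=2):
    """KV cache size when only every Nth layer is full attention."""
    full_layers = n_layers // full_attn_every
    # 2x for keys and values; bytes_per_elt=2 assumes fp16/bf16.
    return full_layers * 2 * n_kv_heads * head_dim * seq_len * bytes_per_elt

dense = kv_cache_bytes(48, 8, 128, 32768)                      # every layer attends
hybrid = kv_cache_bytes(48, 8, 128, 32768, full_attn_every=4)  # 3:1 DeltaNet ratio
```

With the 3:1 ratio the cache is a quarter of the dense case, and the savings grow linearly with context length.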
jyt4n @jyt4n · Mar 06, 2026 (4d ago) · πŸ†”06175859 · ⭐0.36

Wrote a blog post about my journey here. Has some scalability limitations & will fix them soon. Appreciate any pointers/feedback! https://t.co/javKm9ebYa

OpenAI @OpenAI · Mar 05, 2026 (4d ago) · πŸ†”02811280 · ⭐0.44

We're publishing a new evaluation suite and research paper on Chain-of-Thought (CoT) Controllability. We find that GPT-5.4 Thinking shows low ability to obscure its reasoning, suggesting CoT monitoring remains a useful safety tool. https://t.co/isZkNkPXZm

Rainmaker1973 @Rainmaker1973 · Mar 04, 2026 (6d ago) · πŸ†”76060468

An NVIDIA-powered farming machine uses AI vision and precision lasers to eliminate weeds in milliseconds without herbicides and without harming crops, a potential shift toward chemical-free agriculture https://t.co/aIbDWseMjD

πŸ–ΌοΈ Media
BenMildenhall @BenMildenhall · Mar 03, 2026 (6d ago) · πŸ†”55852964

We don't expect LLMs to multiply numbers or sort lists directly within their output token stream. Instead, we ask them to emit code and execute it in a separate runtime. Why expect the opposite for simulating interactive worlds? https://t.co/b2QNOBTWjN

πŸ–ΌοΈ Media
πŸ”ai_fast_track retweeted
L
Liquid AI
@liquidai
πŸ“…
Mar 05, 2026
4d ago
πŸ†”89086198
⭐0.34

> 385ms average tool selection.
> 67 tools across 13 MCP servers.
> 14.5GB memory footprint.
> Zero network calls.

LocalCowork is an AI agent that runs on a MacBook. Open source. 🧡 https://t.co/bnXupspSXc

❀️1,241 likes · πŸ”123 retweets
Prince_Canuma @Prince_Canuma · Mar 02, 2026 (7d ago) · πŸ†”69466787

already on mlx :) https://t.co/NXxd7hAWMh

πŸ–ΌοΈ Media
HuggingModels @HuggingModels · Mar 05, 2026 (5d ago) · πŸ†”31773642

Meet GLM-4.7-Flash-Claude-Opus-4.5-High-Reasoning-Distill: a distilled powerhouse that brings elite reasoning to local machines. This GGUF model delivers Claude-level intelligence in a compact package, perfect for developers wanting high-performance AI without cloud costs. https://t.co/Q0HCPTI2oe

πŸ–ΌοΈ Media
0xCVYH @0xCVYH · Mar 03, 2026 (7d ago) · πŸ†”47784783

Claude Code just launched Voice Mode. You speak. The AI agent codes. "/voice" to activate. Rolling out to 5% of users now, expanding over the coming weeks. Today: KREA AI Voice on iPad. Claude Code Voice in the terminal. The era of voice-driven programming has arrived. https://t.co/9adiksDX0r

πŸ–ΌοΈ Media
johnrobinsn @johnrobinsn · Mar 03, 2026 (7d ago) · πŸ†”66316497

Comprehensive Python API for Google NotebookLM. Full programmatic access to NotebookLM's features, including capabilities the web UI doesn't expose, from Python or the command line. https://t.co/5YQhAKiGuD

πŸ–ΌοΈ Media
Alibaba_Qwen @Alibaba_Qwen · Mar 02, 2026 (8d ago) · πŸ†”10965160

πŸš€ Introducing the Qwen3.5 Small Model Series: Qwen3.5-0.8B · Qwen3.5-2B · Qwen3.5-4B · Qwen3.5-9B

✨ More intelligence, less compute. These small models are built on the same Qwen3.5 foundation: native multimodal, improved architecture, scaled RL:
• 0.8B / 2B → tiny, fast, great for edge devices
• 4B → a surprisingly strong multimodal base for lightweight agents
• 9B → compact, but already closing the gap with much larger models

And yes, we're also releasing the Base models, to better support research, experimentation, and real-world industrial innovation.

Hugging Face: https://t.co/wFMdX5pDjU
ModelScope: https://t.co/9NGXcIdCWI

πŸ–ΌοΈ Media ×2
AlphaSignalAI @AlphaSignalAI · Mar 04, 2026 (5d ago) · πŸ†”21405344

A trillion-parameter model just made half its brain disappear. It got smarter.

Yuan3.0 Ultra is a new open-source multimodal MoE model from Yuan Lab. 1010B total parameters, only 68.8B active at inference. It beat GPT-5.2, Gemini 3.1 Pro, and Claude Opus 4.6 on RAG benchmarks by wide margins. 67.4% on Docmatix vs GPT-4o's 56.8%.

Here's what it unlocks:
> Enterprise RAG with 68.2% avg accuracy across 10 retrieval tasks
> Complex table understanding at 62.3% on MMTab
> Text-to-SQL generation scoring 83.9% on Spider 1.0
> Multimodal doc analysis with a 64K context window

The key innovation: Layer-Adaptive Expert Pruning (LAEP). During pretraining, expert token loads become wildly imbalanced. Some experts get 500x more tokens than others. LAEP prunes the underused ones layer by layer, cutting 33% of parameters while boosting training efficiency by 49%.

They also refined "fast-thinking" RL. Correct answers with fewer reasoning steps get rewarded more. This cut output tokens by 14.38% while improving accuracy by 16.33%.

The bigger signal here: MoE models are learning to self-compress during training, not after. If pruning becomes part of pretraining, the cost curve for trillion-scale models shifts dramatically.

πŸ–ΌοΈ Media
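The pruning step described above can be caricatured as a per-layer top-k on routed token counts: each layer drops its least-used experts independently. A toy sketch of that idea only; the paper's actual LAEP criterion is more involved than raw load:

```python
def prune_experts(token_loads, keep_frac=0.67):
    """Keep the most-used experts in each layer, drop the rest.

    token_loads: one list per layer of token counts routed to each expert.
    Returns the surviving expert indices per layer.
    """
    kept = []
    for layer_loads in token_loads:
        k = max(1, round(len(layer_loads) * keep_frac))
        # Rank experts by how many tokens the router sent them.
        ranked = sorted(range(len(layer_loads)),
                        key=lambda i: layer_loads[i], reverse=True)
        kept.append(sorted(ranked[:k]))
    return kept

# One overloaded expert per layer, echoing the 500x load skew in the post.
loads = [[500, 3, 120, 1], [40, 40, 2, 39]]
kept = prune_experts(loads, keep_frac=0.5)
```

A fixed keep fraction is used here for simplicity; the "layer-adaptive" part of LAEP would vary how aggressively each layer prunes based on its own load imbalance.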