Gradio

@Gradio

📅

Aug 26, 2025

245d ago

🆔63016486

🎙️ VibeVoice Podcasting 🔥 🙌 Thanks to @broadfield_dev, you can now generate long-form, multi-speaker AI podcasts with ZeroGPU on @huggingface https://t.co/UKr5JlGPcg


_akhaliq

@_akhaliq

📅

Aug 27, 2025

245d ago

🆔29804447

app: https://t.co/esPDyHE1YC check image to image to use in your apps https://t.co/ieYOnUI8Nv


_akhaliq

@_akhaliq

📅

Aug 27, 2025

245d ago

🆔20389432

MV-RAG Retrieval Augmented Multiview Diffusion https://t.co/7Z3RIPGy8M


_akhaliq

@_akhaliq

📅

Aug 27, 2025

245d ago

🆔11837716

discuss with author: https://t.co/4hZyFkb4kn


_akhaliq

@_akhaliq

📅

Aug 27, 2025

245d ago

🆔34497030

Hermes 4 Technical Report https://t.co/01n2jfk3D5


_akhaliq

@_akhaliq

📅

Aug 27, 2025

245d ago

🆔05459212

discuss with author: https://t.co/yXCD4uwFcT


_akhaliq

@_akhaliq

📅

Aug 27, 2025

245d ago

🆔39731295

nano banana is now available in anycoder for vibe coding use cases https://t.co/U9AQF3kfcC


_akhaliq

@_akhaliq

📅

Aug 27, 2025

245d ago

🆔34771307

nano banana text-to-image generation in your vibe-coded apps is now supported as well https://t.co/o8rkvWAfmz

@_akhaliq • Wed Aug 27 00:12

nano banana is now available in anycoder for vibe coding use cases https://t.co/U9AQF3kfcC


_akhaliq

@_akhaliq

📅

Aug 27, 2025

245d ago

🆔62569299

app: https://t.co/esPDyHDu94


🔁_akhaliq retweeted

Tongyi Lab

@Ali_TongyiLab

📅

Aug 27, 2025

244d ago

🆔73465095

Hugging Face Paper: https://t.co/d0ZhJe2EE8

❤️15

likes

🔁2

retweets


R_Dimm

@R_Dimm

📅

Aug 25, 2025

246d ago

🆔87638430

I took the Solveit course by @jeremyphoward and @johnowhitaker. Main insight: we can't expect one-shot AI solutions because we can't even ask the right question on the first try. It's no wonder most AI tools feel like a self-driving car hell-bent on driving off a cliff. 🧵 https://t.co/0Pdv6U8JdV


R_Dimm

@R_Dimm

📅

Aug 25, 2025

246d ago

🆔60399560

The fix? Work WITH LLM properties, not against them:
• RLHF makes them over-eager → Work in small steps, ask clarifying questions
• Autoregression causes drift → Edit AI responses, use examples to guide direction
• Flawed training data → Curate relevant context manually
https://t.co/eOGN1PLvnZ


jeremyphoward

@jeremyphoward

📅

Aug 25, 2025

246d ago

🆔57074266

@steve2Seattle @SmileyGnome @iwasnevrhere_ @CommunityNotes Seems they actually worked pretty well though? https://t.co/kacXyzLfJv


jxmnop

@jxmnop

📅

Aug 26, 2025

245d ago

🆔15528627

first i thought scaling laws originated in OpenAI (2020)
then i thought they came from Baidu (2017)
now i am enlightened: Scaling Laws were first explored at Bell Labs (1993)
https://t.co/CAZPgrxGCX


crystalsssup

@crystalsssup

📅

Aug 27, 2025

244d ago

🆔93762018

The interview with Kimi's founder, Zhilin Yang, is out. Again, you can let Kimi translate for you :) Lots of insights there. https://t.co/nCEb1Cyq5b
Several takes:
1/ Base Model Focus: K2 aims to be a solid base model. We've found that high-quality data growth is slow, and multi-modal data doesn't significantly boost textual "IQ." So we focus on maximizing the value of every data token, i.e. token efficiency.
2/ Data Rephrasing: Of 30T tokens, only a small portion (billions of tokens) is high-quality data. We rephrase these to make them more efficient for the model, improving generalization.
3/ Agentic Ability: We aim to enhance generalization. The biggest challenge is making the model generalize well beyond specific tasks. RL improves this over supervised fine-tuning (SFT).
4/ AI-Native Training: We're exploring more AI-native ways to train models. If AI can do good alignment research, it'll generalize better, beyond single-task optimization.
5/ RL vs SFT: RL generalizes better because it learns from on-policy samples, but it has its limits. RL helps improve specific tasks, but it's hard to generalize to all scenarios without tailored tasks.
6/ Long Contexts: Context length is crucial; we need millions. The challenge is balancing model size and context length for optimal performance, as some architectures improve with long context but worsen with short ones.


JessicaSacher

@JessicaSacher

📅

Aug 27, 2025

244d ago

🆔80686017

maybe antibiotic resistance would have funding if we didn't prohibit investors from coming to our conferences https://t.co/x0RRAs8nOa


🔁huggingface retweeted

🍉 Abubakar Abid

@abidlabs

📅

Aug 24, 2025

247d ago

🆔82474556

Follow https://t.co/fl3Mvguo1y to stay up to date. https://t.co/wcYqjR1mqB

@elonmusk • Sat Aug 23 22:16

The @xAI Grok 2.5 model, which was our best model last year, is now open source. Grok 3 will be made open source in about 6 months. https://t.co/TXM0wyJKOh

❤️32

likes

🔁5

retweets


🔁huggingface retweeted

Liran Tal | 🤖 Hacking MCP Servers

@liran_tal

📅

Aug 24, 2025

247d ago

🆔56124378

hugging face is the new github https://t.co/yBo3I7ztEK

❤️110

likes

🔁11

retweets


🔁huggingface retweeted

Quanquan Gu

@QuanquanGu

📅

Aug 23, 2025

248d ago

🆔43080770

So many multipliers! Great to see that Grok2 was trained using μP. https://t.co/mURbaZFkCw https://t.co/li7P9OJCr4

❤️183

likes

🔁23

retweets


eliebakouch

@eliebakouch

📅

Aug 24, 2025

247d ago

🆔22536611

Wow, pretty cool that they also open-sourced an FSDP2-compatible Muon and PolyNorm working with @huggingface kernels! https://t.co/Gqw7Hpj1v3

@eliebakouch • Sun Aug 24 12:47

Motif 2.6B tech report is pretty insane, first time i see a model with differential attention and polynorm trained at scale!
> It's trained on 2.5T tokens, with a "data mixture schedule" to continuously adjust the mixture over training.
> They use WSD with a "Simple moving ave


heyshrutimishra

@heyshrutimishra

📅

Aug 24, 2025

247d ago

🆔45489322

Hugging Face quietly dropped FREE courses with certification. They cover everything from LLMs to diffusion models. Here are the best ones you should bookmark today 🧵👇 https://t.co/QvLywX0lZ5


🔁huggingface retweeted

Lisan al Gaib

@scaling01

📅

Aug 23, 2025

248d ago

🆔72407338

Grok-2 got open-sourced, same arch as grok-1 https://t.co/eOdmj6zKaK https://t.co/KHb59ymyQ2

+1 more

❤️417

likes

🔁32

retweets


🔁huggingface retweeted

Haihao Shen

@HaihaoShen

📅

Aug 25, 2025

246d ago

🆔96211547

🤔A more aggressive INT4 model for DeepSeek-V3.1: https://t.co/mELIFdbpNP #intel #autoround #huggingface @deepseek_ai

❤️279

likes

🔁34

retweets


reach_vb

@reach_vb

📅

Aug 25, 2025

246d ago

🆔78417826

Microsoft just released VibeVoice - 1.5B SoTA Text to Speech model - MIT Licensed 🔥
> It can generate up to 90 minutes of audio
> Supports simultaneous generation of > 4 speakers
> Streaming and larger 7B model incoming
> Capable of cross-lingual and singing synthesis
Love the expressiveness and the emotion control on the model! Kudos to Microsoft 🤗