@DeepInfra
New voice model on DeepInfra 🚀

Qwen3-TTS from @Alibaba_Qwen brings:
• voice cloning from a ~3s sample
• expressive tone + emotion control
• ~97ms first-byte latency
• 9 voices across 10 languages

Built for real-time voice agents and assistants.

$20 / 1M characters