@Tu7uruu
Just dropped on HF: Marvis-TTS, an efficient real-time voice cloning & streaming TTS model
> Clone any voice from ~10s of audio
> 250M parameters
> Streaming TTS with human-like prosody
> Compact (~500 MB, edge-ready)
> Runs on iPhone, iPad, Mac, consumer GPUs
> CSM-based backbone with Kyutai's Mimi codec