@wildmindai
Soprano: An instant, ultra-lightweight TTS model for realistic speech; generates 10 hours of 32kHz audio in <20s; streams with <15ms latency using just 80M params & <1GB VRAM. It does have some limitations. https://t.co/BZmckav7mW https://t.co/gWi1qpevWi
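
For a rough sense of scale, the headline numbers can be turned into a real-time factor and a weights-only memory estimate. This is a back-of-the-envelope sketch: the function names are made up for illustration, and the 2-byte (fp16) and 4-byte (fp32) weight sizes are assumptions, since the post does not say which precision the model uses.

```python
def real_time_factor(audio_seconds: float, wall_seconds: float) -> float:
    """Seconds of audio produced per second of compute."""
    return audio_seconds / wall_seconds


def weight_megabytes(params: int, bytes_per_param: int) -> float:
    """Memory taken by the weights alone, in MB (activations and caches excluded)."""
    return params * bytes_per_param / 1e6


# 10 hours of audio in under 20 s: 20 s is the worst case, so this is a lower bound.
rtf = real_time_factor(10 * 3600, 20)
print(rtf)  # 1800.0

# 80M parameters; the precision is an assumption, not stated in the post.
fp16_mb = weight_megabytes(80_000_000, 2)
fp32_mb = weight_megabytes(80_000_000, 4)
print(fp16_mb, fp32_mb)  # 160.0 320.0
```

So the claim works out to at least 1800x real time, and the weights alone would fit within the quoted <1GB VRAM at either assumed precision, with the remainder available for activations and buffers.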