@reach_vb
Microsoft just released VibeVoice - 1.5B SoTA Text to Speech model - MIT Licensed 🔥
> It can generate up to 90 minutes of audio
> Supports simultaneous generation of > 4 speakers
> Streaming and larger 7B model incoming
> Capable of cross-lingual and singing synthesis
Love the expressiveness and the emotion control on the model! Kudos to Microsoft 🤗