@AlphaSignalAI
Big news. Meta just announced SeamlessM4T, a competitor to Google Translate.

*turn sound on*

SeamlessM4T is a multimodal foundational speech/text translation and transcription model capable of handling:
📥 101 languages for speech input
⌨️ 96 languages for text input/output
🗣️ 35 languages for speech output

It achieves SOTA results using fairseq2, a new modeling toolkit, and the largest open dataset for multimodal translation, totaling 470k hours.

This unified model handles multiple tasks without relying on separate models:
▸ Speech-to-speech translation (S2ST)
▸ Speech-to-text translation (S2TT)
▸ Text-to-speech translation (T2ST)
▸ Text-to-text translation (T2TT)
▸ Automatic speech recognition (ASR)