The dream of a universal AI interpreter just got a bit closer. This week, tech giant Meta released a new AI that can almost instantaneously translate speech in 101 languages as soon as the words tumble out of your mouth.
AI translators are nothing new. But they generally work best with text and struggle to transform spoken words from one language to another. The process is usually multistep. The AI first turns speech into text, translates the text, and then converts it back to speech. Though already useful in everyday life, these systems are inefficient and laggy. Errors can also sneak in at each step.
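To see how a mistake in one step carries through the rest of such a pipeline, here is a toy Python sketch. Everything in it is invented for illustration: the function names (`recognize`, `translate`, `synthesize`, `cascade`), the tiny dictionary, and the use of plain strings as a stand-in for audio. It is not Meta's code or any real translation system.

```python
from typing import Callable

# Hypothetical stage type: each stage turns one representation into the next.
Stage = Callable[[str], str]


def cascade(stages: list[Stage], utterance: str) -> str:
    """Run the stages in order, feeding each output into the next stage."""
    result = utterance
    for stage in stages:
        result = stage(result)
    return result


def recognize(audio: str) -> str:
    # Toy speech recognizer that mishears one word.
    return audio.replace("ship", "sheep")


# Toy English-to-French word list, made up for this example.
TOY_DICTIONARY = {"the": "le", "ship": "navire", "sheep": "mouton", "sails": "navigue"}


def translate(text: str) -> str:
    # Word-by-word lookup; unknown words pass through unchanged.
    return " ".join(TOY_DICTIONARY.get(word, word) for word in text.split())


def synthesize(text: str) -> str:
    # Stand-in for a voice synthesizer: just wraps the text.
    return f"<audio: {text}>"


output = cascade([recognize, translate, synthesize], "the ship sails")
print(output)  # <audio: le mouton navigue>
```

The recognizer's single slip ("sheep" for "ship") is translated faithfully and then spoken aloud, so the listener hears a fluent sentence about the wrong thing. A system that maps speech straight to speech has no intermediate handoff of this kind.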
Meta’s new AI, dubbed SEAMLESSM4T, can directly convert speech into speech. Using a voice synthesizer, the system translates words spoken in 101 languages into any of 36 output languages, not just into English, which tends to dominate current AI interpreters. In a head-to-head evaluation, the algorithm was 23 percent more accurate than today’s top models and nearly as fast as expert human interpreters. It can also translate text into text, text into speech, and speech into text.
Meta is releasing all the data and code used to develop the AI to the public for non-commercial use, so others can optimize and build on it. In a sense, the algorithm is “foundational,” in that “it can be fine-tuned on carefully curated datasets for specific purposes—such as improving translation quality for certain language pairs or for technical jargon,” wrote Tanel Alumäe at Tallinn University of Technology, who was not involved in the project. “This level of openness is a huge advantage for researchers who lack the massive computational resources needed to build these models from scratch.”