Gladia believes real-time processing is the next frontier of audio transcription APIs

Submitted by Style Pass, 2024-10-15 13:00:07

French startup Gladia, which offers a speech-recognition application programming interface (API), has raised $16 million in a Series A funding round. Essentially, Gladia’s API lets you turn any audio file into text with a high level of accuracy and low turnaround time.

While Amazon, Microsoft and Google all offer speech-to-text APIs as part of their cloud-hosting product suites, they don’t perform as well as newer models offered by specialized startups.

There has been tremendous progress in this field over the past couple of years, especially after the release of Whisper by OpenAI. Gladia competes with other well-funded companies in the space, such as AssemblyAI, Deepgram and Speechmatics.

Gladia originally offered a fine-tuned version of Whisper’s speech-to-text model with some much-needed improvements. For instance, the startup supports diarization out of the box: it can detect when there are multiple speakers in a conversation and split both the recording and the transcribed text according to who is talking.
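To make the idea of diarization concrete, here is a minimal Python sketch of what a developer might do with diarized output: merge consecutive utterances from the same speaker into readable turns. The `Segment` shape, its field names and the sample data are assumptions made for illustration; they are not Gladia's actual response format.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    speaker: int   # hypothetical speaker label assigned by a diarization step
    start: float   # start time in seconds
    end: float     # end time in seconds
    text: str      # transcribed text for this stretch of audio


def group_by_speaker_turn(segments):
    """Merge consecutive segments from the same speaker into one turn."""
    turns = []
    for seg in sorted(segments, key=lambda s: s.start):
        if turns and turns[-1].speaker == seg.speaker:
            # Same speaker is still talking: extend the current turn.
            last = turns[-1]
            turns[-1] = Segment(last.speaker, last.start, seg.end,
                                f"{last.text} {seg.text}")
        else:
            # Speaker changed (or first segment): start a new turn.
            turns.append(seg)
    return turns


# Made-up example data, not real API output.
segments = [
    Segment(0, 0.0, 1.8, "Thanks for joining."),
    Segment(0, 1.9, 3.2, "Can you hear me?"),
    Segment(1, 3.5, 4.4, "Yes, loud and clear."),
    Segment(0, 4.6, 6.0, "Great, let's start."),
]

for turn in group_by_speaker_turn(segments):
    print(f"[{turn.start:.1f}-{turn.end:.1f}] Speaker {turn.speaker}: {turn.text}")
```

With per-segment speaker labels, an interview transcript can be laid out as a back-and-forth between speakers rather than one undifferentiated block of text.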

Gladia supports 100 languages and a wide variety of accents. This reporter can confirm that it works, as we’ve been using Gladia to transcribe some interviews, and accents weren’t an issue.
