2025-07-28 20:30:04

We are open-sourcing Higgs Audio v2, a powerful audio foundation model pretrained on over 10 million hours of audio data and a diverse set of text data. Despite having no post-training or fine-tuning, Higgs Audio v2 excels in expressive audio generation, thanks to its deep language and acoustic understanding.

On EmergentTTS-Eval, it achieves win rates of 75.7% and 55.7% over "gpt-4o-mini-tts" on the "Emotions" and "Questions" categories, respectively. It also obtains state-of-the-art performance on traditional TTS benchmarks such as Seed-TTS Eval and the Emotional Speech Dataset (ESD). Moreover, the model demonstrates capabilities rarely seen in previous systems, including generating natural multi-speaker dialogues in multiple languages, automatic prosody adaptation during narration, melodic humming with the cloned voice, and simultaneous generation of speech and background music.

Here's another demo video that showcases the model's multilingual capability and how it enables live translation (remember to unmute):
