VoiceCraft: Zero-Shot Speech Editing and Text-to-Speech in the Wild

submitted by

Style Pass

2024-03-29 16:00:11

VoiceCraft is a token-infilling neural codec language model that achieves state-of-the-art performance on both speech editing and zero-shot text-to-speech (TTS) on in-the-wild data, including audiobooks, internet videos, and podcasts.

All speakers are unseen during training. Utterances are from our RealEdit evaluation set, which comprises audiobooks, YouTube videos, and Spotify podcasts.

This website is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
