Google Research just released MusicLM: Generating Music From Text. It’s a new generative AI model that takes a descriptive prompt and produces a “

Simon Willison’s Weblog

submited by
Style Pass
2023-01-27 22:00:18

Google Research just released MusicLM: Generating Music From Text. It’s a new generative AI model that takes a descriptive prompt and produces a “high-fidelity” music track. Here’s the paper (and a more readable version using arXiv Vanity).

A fusion of reggaeton and electronic dance music, with a spacey, otherworldly sound. Induces the experience of being lost in space, and the music would be designed to evoke a sense of wonder and awe, while being danceable.

To support future research, we publicly release MusicCaps, a dataset composed of 5.5k music-text pairs, with rich text descriptions provided by human experts.

I love digging into these training datasets—and this one is pretty tiny. I decided to take a look and see what I could learn.

The dataset itself turns out to not have any audio clips in it at all—instead, each row of the data includes a YouTube video ID and a start and end time for a clip within it.

Leave a Comment