Nvidia claims a new AI audio generator can make sounds never heard before

Cath Virginia / The Verge | Photo from Getty Images

Nvidia says its new AI music editor can create “sounds never heard before,” like a trumpet that meows. The tool, called Fugatto, is capable of generating music, sounds, and speech using text and audio inputs it’s never been trained on.

As shown in a demo video, this allows Fugatto to put together songs based on wild prompts, like “Create a saxophone howling, barking then electronic music with dogs barking.”

Some other examples shared by the company include the ability to produce unique sound effects based on a description, like “Deep, rumbling bass pulses paired with intermittent, high-pitched digital chirps, like the sound of a massive sentient machine waking up.”

It can even transform the sound of someone’s voice, changing their accent or giving them a different tone, like angry or calm. There are ways to edit music, too, as Fugatto can isolate the vocals in a song, add instruments, and even change up a melody by swapping out a piano for an opera singer.

A paper released with the announcement lists the datasets Nvidia says Fugatto was trained on, including a library of sound effects from the BBC.

There are already several other AI audio tools out there, including those from Stability AI, OpenAI, Google DeepMind, ElevenLabs, and Adobe, but not ones claiming to create completely new and unheard-of sounds. Some AI startups are even facing copyright lawsuits over their music creation tools, while a recent report found that Nvidia and other companies trained AI models on subtitles from thousands of YouTube videos.

To build Fugatto, Nvidia says researchers had to put together a dataset with millions of audio samples. They then created instructions “that considerably expanded the range of tasks the model could perform, while achieving more accurate performance and enabling new tasks without requiring additional data.” Nvidia doesn’t say when, or if, the tool will be widely available.
