AI can simulate anyone’s voice with 3 seconds of audio

Golem
Posts: 352
Joined: Fri Feb 26, 2021 5:25 pm
Has thanked: 22 times
Been thanked: 64 times

Re: AI can simulate anyone’s voice with 3 seconds of audio

Post by Golem »

zaval80 wrote: Tue Jan 31, 2023 7:01 am
tdgrnwld wrote: Mon Jan 30, 2023 4:07 pm It's a reasonable application of neural networks, but to do it right, it requires a large dataset of paired hi-fi and lo-fi (recorded on the same rig you aim to upsample from) recordings. I suspect that the lack of such a dataset - due to the cost of producing one - is a major reason we haven't seen these systems already.
It would be kept under lock and key by governments.
how so? governments aren't the only ones making AI
theboxinargentina
Posts: 301
Joined: Fri Oct 08, 2021 4:12 pm
Has thanked: 184 times
Been thanked: 97 times

Post by theboxinargentina »

Sure, it's not much now. But you know it's only a matter of time until you can't tell the difference:

AI has been generating an endless Seinfeld episode for more than a month now
Nothing, Forever was launched on December 14th, 2022 and has been broadcasting since. In it, blocky, lo-poly versions of Jerry, Elaine, George, and Kramer hang out in a brightly colored, slightly reconfigured version of Jerry’s sitcom apartment, talking in clipped, robotic sentences. Sometimes the scene changes to show an exterior shot of the apartment building, Jerry performing stand-up, or a mock TV channel guide. There’s a laugh track punctuating lots of not-funny lines of GPT-3-generated dialogue.
https://www.avclub.com/seinfeld-nothing ... 1850053210
zaval80
Posts: 390
Joined: Sun Mar 07, 2021 9:19 pm
Has thanked: 23 times
Been thanked: 18 times

Post by zaval80 »

Golem wrote: Tue Jan 31, 2023 3:56 pm how so? governments aren't the only ones making AI
Governments have many ways. They can prohibit the use, they can refuse certification. Or even something worse.
Golem
Posts: 352
Joined: Fri Feb 26, 2021 5:25 pm
Has thanked: 22 times
Been thanked: 64 times

Post by Golem »

zaval80 wrote: Wed Feb 01, 2023 7:25 am
Golem wrote: Tue Jan 31, 2023 3:56 pm how so? governments aren't the only ones making AI
Governments have many ways. They can prohibit the use, they can refuse certification. Or even something worse.
I kinda doubt governments would prohibit use of what's basically "audio upscaling". Not sure I see a reason to; there are plenty of way more dangerous AIs that aren't being prohibited
Davenicks
Posts: 80
Joined: Wed Feb 17, 2021 9:31 pm
Has thanked: 502 times
Been thanked: 9 times

Post by Davenicks »

Lord Reith wrote: Fri Jan 13, 2023 9:08 am It's a terrible idea. I can't imagine what use this could possibly serve other than to open the doors for yet more fraud and fake news. Someone's voice is as unique as their face. Allowing it to be used by others is a blatant form of identity theft.
Never underestimate the public's appetite for junk.
zaval80
Posts: 390
Joined: Sun Mar 07, 2021 9:19 pm
Has thanked: 23 times
Been thanked: 18 times

Post by zaval80 »

Golem wrote: Wed Feb 01, 2023 12:31 pm I kinda doubt governments would prohibit use of what's basically "audio upscaling". Not sure I see a reason to; there are plenty of way more dangerous AIs that aren't being prohibited
Not up to a certain level, the one allowed for Joe Schmoe, for sure. I meant rather the level where high teraflops are required.
Lord Reith
Posts: 4598
Joined: Thu Feb 18, 2021 8:22 am
Location: BBC House
Has thanked: 139 times
Been thanked: 3939 times

Post by Lord Reith »

I don't think turning, say, an AM radio quality recording into an FM quality recording should be that difficult. It's only a tiny piece of the audio spectrum that needs to be computed, and most of the important overtones that determine the timbre of a voice or instrument are not in that range. After all, the sound quality of "Dream Baby" from 1962 is fairly poor and limited to about 4 kHz, but we can still easily tell that it's Paul singing. Pulling Pete's cymbals into existence would be more difficult, because we don't know whether he played, say, a hi-hat or a ride cymbal on that song (or any cymbal at all). So no "dumb" harmonic synthesizer like the ones currently in use can hope to do that. Only a clever algorithm that is capable of analysing a similar track like "Searchin'" (amongst others) and figuring out what should be "put back" has any hope of doing that. But it's still perfectly within the bounds of reason. It isn't Warp Drive or anything hopelessly futuristic like that.
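
For anyone who wants to see what a roughly 4 kHz bandwidth limit does to a signal, here is a minimal Python sketch (standard library only). It is an illustration under stated assumptions, not anyone's actual restoration method: the 32 kHz sample rate, the two test tones, the filter length and all function names are made up for the example, and a simple low-pass filter is only a crude stand-in for a real 1962 recording chain.

```python
import math

def lowpass_taps(cutoff_hz, sample_rate, num_taps=101):
    """Hamming-windowed sinc low-pass FIR taps, normalised to unity gain at DC."""
    fc = cutoff_hz / sample_rate  # cutoff in cycles per sample
    mid = (num_taps - 1) / 2
    taps = []
    for n in range(num_taps):
        x = n - mid
        ideal = 2 * fc if x == 0 else math.sin(2 * math.pi * fc * x) / (math.pi * x)
        window = 0.54 - 0.46 * math.cos(2 * math.pi * n / (num_taps - 1))
        taps.append(ideal * window)
    total = sum(taps)
    return [t / total for t in taps]

def filter_same(signal, taps):
    """Convolve signal with taps, keeping the output aligned with the input."""
    half = len(taps) // 2
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for k, t in enumerate(taps):
            j = i + half - k
            if 0 <= j < len(signal):
                acc += t * signal[j]
        out.append(acc)
    return out

def tone_amplitude(signal, freq_hz, sample_rate):
    """Estimate the amplitude of one sine component by correlating against it."""
    re = sum(s * math.cos(2 * math.pi * freq_hz * i / sample_rate)
             for i, s in enumerate(signal))
    im = sum(s * math.sin(2 * math.pi * freq_hz * i / sample_rate)
             for i, s in enumerate(signal))
    return 2 * math.hypot(re, im) / len(signal)

SAMPLE_RATE = 32000  # assumed sample rate for the toy signal

# Hypothetical "hi-fi" signal: a 1 kHz tone (below the limit) plus an 8 kHz tone (above it).
hifi = [math.sin(2 * math.pi * 1000 * i / SAMPLE_RATE)
        + math.sin(2 * math.pi * 8000 * i / SAMPLE_RATE)
        for i in range(3200)]

# "Lo-fi" version: the same signal band-limited to about 4 kHz.
lofi = filter_same(hifi, lowpass_taps(4000, SAMPLE_RATE))

# Skip the filter's edge transients; 3072 samples is a whole number of periods of both tones.
inner_hifi, inner_lofi = hifi[64:-64], lofi[64:-64]
for freq in (1000, 8000):
    print(freq,
          round(tone_amplitude(inner_hifi, freq, SAMPLE_RATE), 3),
          round(tone_amplitude(inner_lofi, freq, SAMPLE_RATE), 3))
```

The 1 kHz tone comes through the filter essentially untouched, while the 8 kHz tone is almost entirely removed. Everything in that upper band is what an "upscaler" would have to guess rather than recover.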

I think we will see stuff like this appearing within the next few years, primitive at first but with the potential to evolve rapidly. Just look at how Google's image recognition technology rapidly progressed to the point where we now think nothing of photographing a random object, feeding it into Google and having it tell us it is an "oyster spoon". Then equally quickly people had the bright idea of using that technology to analyse spectrograms, and suddenly we had AI demixing, which has in barely 5 years progressed to the point where instruments can be almost perfectly isolated in less than a minute.

So I think all the Good Stuff is imminent, but equally imminent is the other side of the coin... the very, very Bad Stuff.
Women there don't treat you mean, in Abilene
Mixerrog
Posts: 279
Joined: Sat May 22, 2021 5:01 pm
Has thanked: 41 times
Been thanked: 100 times

Post by Mixerrog »

LR,

Yes, I agree with your statements. The iZotope RX series has had a "Spectral Repair" tool for a few years now, which I have used for a long time and which can repair dropouts perfectly with careful manual manipulation. This will become automatic soon & may already be happening with minor dropouts.

As the training tables get much larger, old records should be able to be updated to new quality, as you stated. I have even had some luck with brickwalled songs: separating them into stems with AI, fixing some of the problem & then putting the song back together. Most of the issue is the bass being pushed way too hard on the newer so-called remastered mixes, so reducing the bass level can help with that heavy compression issue.
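
The "pull the bass down and put the song back together" step can be sketched in a few lines of Python. This is only a toy illustration: the stem names, the three-sample "stems" and the -6 dB figure are assumptions made up for the example, and it covers only the remix arithmetic, not the AI stem separation itself. Lowering a stem's level also does not undo compression that is already baked into it.

```python
def db_to_gain(db):
    """Convert a level change in decibels to a linear amplitude factor."""
    return 10.0 ** (db / 20.0)

def remix(stems, gains_db):
    """Sum equal-length stems sample by sample after applying a per-stem gain in dB."""
    length = len(next(iter(stems.values())))
    mix = [0.0] * length
    for name, samples in stems.items():
        if len(samples) != length:
            raise ValueError("stem %r has a different length" % name)
        gain = db_to_gain(gains_db.get(name, 0.0))  # stems not listed stay at 0 dB
        for i, sample in enumerate(samples):
            mix[i] += gain * sample
    return mix

# Hypothetical separated stems (a real stem would hold many thousands of samples).
stems = {
    "vocals": [0.10, -0.20, 0.05],
    "bass":   [0.80, -0.70, 0.90],
    "drums":  [0.05, 0.10, -0.05],
}

original = remix(stems, {})                # straight sum of the stems
rebalanced = remix(stems, {"bass": -6.0})  # bass pulled down by an assumed 6 dB

peak_before = max(abs(x) for x in original)
peak_after = max(abs(x) for x in rebalanced)
print(peak_before, peak_after)
```

With the bass stem dominating the mix, lowering it reduces the peak level of the recombined song, which leaves headroom to work with on the other stems.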

The CBS Sunday Morning show has been running a series on AI, with last Sunday's episode on finding fake videos. Hopefully they will do one on AI & Music soon.

Rog
zaval80
Posts: 390
Joined: Sun Mar 07, 2021 9:19 pm
Has thanked: 23 times
Been thanked: 18 times

Post by zaval80 »

One can expect people will start hunting for downloadable recordings with some kind of pedigree :lol:
theboxinargentina
Posts: 301
Joined: Fri Oct 08, 2021 4:12 pm
Has thanked: 184 times
Been thanked: 97 times

Post by theboxinargentina »

Well this is godawful

Voice actors are increasingly being asked to sign rights to their voices away so clients can use artificial intelligence to generate synthetic versions that could eventually replace them
Those contractual obligations are just one of the many concerns actors have about the rise of voice-generating artificial intelligence, which they say threatens to push entire segments of the industry out of work.
https://www.vice.com/en/article/5d37za/ ... telligence