Voice Cloning: The 15-Second Reality

With a voice cloning tool like „e2-f5-tts“ by „mrfakename“ running locally on your own machine, you can create a convincing voice clone from only 15 seconds of original voice material.

Let me put this into perspective:

You can take the „out-of-office“ greeting someone recorded for their voicemail and use it to create a stunningly realistic digital copy of how they sound.

15 seconds.

Let that sink in…

Speaking of letting things „sink in“: I rarely trust stuff I have only read, and I need to test things out myself to know what the heck I am talking about. So I created a fake video of Elon Musk telling us a little secret about himself. Again:

THE VIDEO BELOW IS FAKE.

It is here for educational purposes only. It is there to show you how easy it is to produce this kind of content these days (ok, and I think it’s a tiny bit funny as well). So don’t run around telling everyone and their grandmother that it’s real.

Workflow:

  • Write a script with ChatGPT
  • Install and run „e2-f5-tts“ with Pinokio
  • Record 15 seconds of clean speech (no background sounds) from some random Musk video you find on YouTube
  • Feed the original 15 seconds of audio and the ChatGPT-written script into „e2-f5-tts“
  • Download the generated audio file with Elon’s voice from „e2-f5-tts“
  • Throw this into HeyGen with an avatar that looks similar to Elon
  • Throw the result into „Face Fusion“, activate the „age_modifier“, use a randomly downloaded real image of Musk as „Source“ and the HeyGen video as „Target“, and fine-tune the slider for age. Hit „Start“.
  • Download the final video from „Face Fusion“.
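
If you would rather script the 15-second cut than do it by hand in an audio editor, here is a minimal sketch using only Python’s standard-library wave module. It assumes you already have the recording as an uncompressed WAV file. The function name trim_wav and its parameters are my own placeholders for illustration and are not part of „e2-f5-tts“.

```python
import wave


def trim_wav(src_path: str, dst_path: str, seconds: float = 15.0, start: float = 0.0) -> float:
    """Copy at most `seconds` of audio, beginning at `start`, into a new WAV file.

    Returns the length of the written clip in seconds.
    Assumption: the input is an uncompressed PCM WAV file.
    """
    with wave.open(src_path, "rb") as src:
        params = src.getparams()
        rate = src.getframerate()
        # Clamp the start position so it never points past the end of the file.
        start_frame = min(int(start * rate), src.getnframes())
        src.setpos(start_frame)
        # readframes() returns fewer frames if the file ends early.
        frames = src.readframes(int(seconds * rate))

    with wave.open(dst_path, "wb") as dst:
        # Reuse channel count, sample width and sample rate from the source;
        # the frame count in the header is corrected when the file is closed.
        dst.setparams(params)
        dst.writeframes(frames)

    bytes_per_frame = params.sampwidth * params.nchannels
    return len(frames) / bytes_per_frame / rate
```

Calling trim_wav with a source path, a destination path and, say, start=30.0 would write the 15 seconds beginning at the half-minute mark; both file names are up to you.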

Yes, it’s still some work, but each step is rather simple.

Ok, so much for the tech enthusiasts in most of us…

Now just imagine what people might be capable of who do not have „educational“ purposes in mind.

  • People who have better tech know-how than I do.
  • People who have the financial and hardware means to push their „agenda“.
  • People who have rather sinister goals in mind, society-breaking aspirations even.
  • People who just want to see the world burn.

Me not shutting up about AI has less to do with tech enthusiasm and everything to do with my relentless drive to make sure we get this right. This time, it’s a once-in-a-lifetime chance. Now is the time to get up and start a serious discourse – as „a society“, as fathers, mothers, children, as teachers, workers, CEOs and educators – as the empathic humans that we were originally born to be.

Here’s to hoping that, eventually, we won’t have to land on Mars far, far away to rescue humanity. My hope is that we land in each other’s hearts and minds as one humankind and realize what has to be done.

It’s right there in front of us, in the word:

Human and kind.
