Hey everyone, I want to build a personal voice assistant that sounds exactly like a real person. I tried a few TTS systems, such as Tortoise TTS and Coqui TTS. They did a good job, but synthesis takes too long. Is there any other realistic-sounding TTS that I can train on my own voice-cloning dataset? I'm also impressed by the TTS that Eleven Labs uses, so can someone explain how I can achieve that level of real-time efficiency in a voice assistant?

Comments

marcus_hk t1_j7lqpav wrote on February 7, 2023 at 6:42 PM

#1,746,103

I haven't been keeping up with TTS since Tacotron 2, but it seems Eleven Labs works fundamentally the same way.

As for real-time performance, you may need to port your Python code to C++.
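Before porting anything, it may help to measure how far from real time the current setup is. The usual number is the real-time factor (RTF): synthesis time divided by the duration of the audio produced, where a value below 1 means faster than real time. Here is a minimal sketch, assuming a hypothetical `synthesize(text)` callable that returns a sequence of audio samples; `fake_synthesize` is only a stand-in and not part of any real library.

```python
import time

def real_time_factor(synthesize, text, sample_rate):
    """Return (rtf, audio_seconds) for one call to synthesize(text).

    synthesize is a hypothetical callable that returns audio samples.
    """
    start = time.perf_counter()
    samples = synthesize(text)
    elapsed = time.perf_counter() - start
    audio_seconds = len(samples) / sample_rate
    return elapsed / audio_seconds, audio_seconds

def fake_synthesize(text):
    # Stand-in for a real TTS call: 0.1 s of silence per character at 22,050 Hz.
    return [0.0] * (len(text) * 2205)

rtf, seconds = real_time_factor(fake_synthesize, "hello world", 22050)
print(f"audio: {seconds:.2f} s, RTF: {rtf:.6f}")
```

Swap `fake_synthesize` for a wrapper around whatever model you are testing and compare the RTF across candidates on the same hardware.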

Kthulu120 t1_j7lui8k wrote on February 7, 2023 at 7:06 PM

#1,746,357

Do you need it as an API?

johnwireds t1_j7mxns2 wrote on February 7, 2023 at 11:22 PM

#1,748,622

I'd also be interested in cloning my voice and having someone speak with it in real time.

gunshoes t1_j7nj5co wrote on February 8, 2023 at 2:00 AM

#1,749,926

FastSpeech 2 would be your best bet.

nmfisher t1_j7osgdc wrote on February 8, 2023 at 9:41 AM

#1,751,976

Replying to gunshoes (#1,749,926)

FS2 is fine for training a TTS model from scratch, but I haven't come across a good FS2 model for cloning (which is basically zero-shot TTS).

gunshoes t1_j7p91py wrote on February 8, 2023 at 1:03 PM

#1,752,682

Replying to nmfisher (#1,751,976)

You can throw in GSTs (global style tokens) or use a speaker embedding to influence the energy/pitch outputs. The sound is meh, but it works.
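To make the speaker-embedding idea concrete, here is a rough sketch of one common way to do it: project the speaker embedding to the hidden size and add it to every encoder frame before the pitch/energy predictors see it. This is a toy in plain Python, not FastSpeech 2 code; all names, shapes, and values (`condition_on_speaker`, `predict_per_frame`, `proj`, the numbers) are made up for illustration, and a real predictor would be a small trained network rather than a dot product.

```python
def matvec(vec, mat):
    """Row vector times matrix: (S,) x (S, H) -> (H,)."""
    return [sum(v * row[j] for v, row in zip(vec, mat)) for j in range(len(mat[0]))]

def condition_on_speaker(encoder_out, speaker_emb, proj):
    """Add the projected speaker embedding to every encoder frame."""
    bias = matvec(speaker_emb, proj)
    return [[h + b for h, b in zip(frame, bias)] for frame in encoder_out]

def predict_per_frame(hidden, weights):
    """Toy stand-in for a pitch or energy predictor: one scalar per frame."""
    return [sum(h * w for h, w in zip(frame, weights)) for frame in hidden]

encoder_out = [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]]  # 2 frames, hidden size 3
proj = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]         # speaker size 2 -> hidden size 3
pitch_weights = [1.0, 1.0, 1.0]

speaker_a = [0.5, -0.5]
speaker_b = [1.0, 1.0]

# Same text encoding, different speakers, different predicted pitch.
pitch_a = predict_per_frame(condition_on_speaker(encoder_out, speaker_a, proj), pitch_weights)
pitch_b = predict_per_frame(condition_on_speaker(encoder_out, speaker_b, proj), pitch_weights)
print(pitch_a, pitch_b)
```

The same encoder output gives different per-frame predictions depending on the speaker vector, which is the whole trick; how good it sounds depends on how well the embedding captures an unseen voice.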

nmfisher t1_j7pawou wrote on February 8, 2023 at 1:20 PM

#1,752,782

Replying to gunshoes (#1,752,682)

That's why I added the qualifier "good" :)

theLanguageSprite t1_j7relsn wrote on February 8, 2023 at 9:41 PM

#1,756,754

You have to pay to use the API and it’s completely closed source, but resemble.ai works pretty well.

akshaysri0001 OP t1_j7ts2wg wrote on February 9, 2023 at 10:11 AM

#1,760,473

Replying to Kthulu120 (#1,746,357)

After some training, yes!

-Alexandros t1_j887z9p wrote on February 12, 2023 at 11:00 AM

#1,787,250

Replying to johnwireds (#1,748,622)

Yeah that would be pretty cool and trippy.