#How to train a voice model with non-English data but get native-sounding English voice conversion?


autumn owl
#

I'm working on training a voice model using only non-English audio clips from native speakers of other languages. My goal is to use voice conversion to have them speak English fluently, but I’m concerned the output will have a foreign accent since no English phonemes were present in the training data.

Is there a way to get a native-sounding English output from a model trained only on non-English data?

Thanks in advance.

sterile ice
#

having non-English phonemes in the training data is not much of a problem; having no English phonemes is the problem

#

setting the index to 0, or borrowing an index from an English model, may fix the accent problem

autumn owl
#

From an English model of a different voice?

#

That doesn't seem like a good idea

sterile ice
#

Russian voice model with English audio

autumn owl
#

Hmm

#

But how does that work? Would I have to find some English model of a voice that sounds similar to the voice I'm training?

sterile ice
#

model.pth is the voice, model.index is phonemes

#

during inference the process extracts phonemes from the input audio and tries to find the closest match in the index

#

the accent comes from the lookup landing on something that is way, way off the target, because your training data does not contain the required phonemes
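
To make the index lookup described above concrete, here is a toy sketch in plain Python. It is not the actual inference code of any voice conversion tool. The 2-D vectors stand in for high-dimensional phoneme features, and the names `nearest`, `blend`, and `index_rate` are hypothetical, as is the assumption that "setting the index to 0" means giving the retrieved entry zero weight in a linear mix.

```python
import math

def nearest(index, query):
    """Return the index vector closest to `query` (Euclidean distance)."""
    return min(index, key=lambda v: math.dist(v, query))

def blend(query, index, index_rate):
    """Mix the extracted feature with its nearest index entry.

    Assumed behaviour, for illustration only:
    index_rate = 0 keeps the input feature untouched,
    index_rate = 1 replaces it with the retrieved entry.
    """
    match = nearest(index, query)
    return [index_rate * m + (1 - index_rate) * q for m, q in zip(match, query)]

# Toy 2-D "phoneme features"; real features are high-dimensional.
non_english_index = [[0.0, 0.0], [1.0, 0.0]]  # sounds covered by the training data
english_index = [[4.0, 3.5]]                  # index borrowed from an English model
english_feature = [4.0, 3.0]                  # a sound missing from the training data

# Full index weight: the feature is dragged to the nearest non-English
# entry, which is far from the target. This is the "accent".
pulled = blend(english_feature, non_english_index, index_rate=1.0)

# Index weight 0: the lookup result is ignored and the feature passes through.
kept = blend(english_feature, non_english_index, index_rate=0.0)

# Borrowed English index: the nearest entry is now close to the target.
borrowed = blend(english_feature, english_index, index_rate=1.0)

print(pulled, math.dist(pulled, english_feature))
print(kept, math.dist(kept, english_feature))
print(borrowed, math.dist(borrowed, english_feature))
```

In this toy setup the non-English index pulls the feature much further from the target than either workaround does, which matches the two fixes suggested earlier in the thread.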

verbal olive