#Question about Voice V3 and training samples


void turtle
#

I don't know very much about TTS voice creation, but it intrigued me when Vedal said Cerber's samples weren't very helpful because of the distortion caused by pitch shifting.

We all know both Neuro and Evil have a naturally lower-pitched base voice, which is pitched up in the "output" (if I can call it that) to sound the way they do on stream. The original voice samples used to create their TTS voices are lower by default.

Considering that, I noticed Cerber's natural voice sounds incredibly similar to Azure Ashley's (Neuro's TTS) unpitched voice (see video below). I honestly don't think Vedal can find anyone better than her if he wants it to sound like V1 (aside from hiring the actual VA from Ashley's TTS).

Is it not possible to take samples from her natural voice and then do the same thing Vedal does to Neuro and Evil nowadays? Train a lower pitched base TTS voice, using Cerber's natural voice, and THEN distort it 4 semitones upwards? Evil's TTS sounds so clean it barely sounds like she's +4 semitones up from her original voice.

In the Video:
-Ashley: +0 Semitones, original (or -4 semitones from Neuro-Sama)
-Cerber: -4 semitones (to counterbalance the +4 she applied to sound like Neuro in the clip; there's a little loss of quality because of that, but I couldn't find a clip of her saying that without distortion)
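
For anyone wondering what "+4 / -4 semitones" means in numbers, here's a rough sketch, assuming standard 12-tone equal temperament (each semitone multiplies frequency by 2^(1/12)). The function names and the 200 Hz base pitch are made up for illustration; this is not how Vedal's pipeline or any TTS tool actually does it, and it only covers the pitch arithmetic, not the audio quality lost when real audio is shifted.

```python
def semitone_ratio(semitones: float) -> float:
    """Frequency ratio for a pitch shift, assuming 12-tone equal temperament."""
    return 2.0 ** (semitones / 12.0)


def shift_hz(freq_hz: float, semitones: float) -> float:
    """Return the frequency after shifting freq_hz by the given semitones."""
    return freq_hz * semitone_ratio(semitones)


# Hypothetical fundamental frequency of an unpitched base voice (assumption).
base_hz = 200.0

up_hz = shift_hz(base_hz, 4)    # pitched up, like the on-stream voice
back_hz = shift_hz(up_hz, -4)   # shifted back down to compare with the base

print(f"+4 semitones ratio: {semitone_ratio(4):.4f}")   # about 1.26
print(f"-4 semitones ratio: {semitone_ratio(-4):.4f}")  # about 0.79
print(f"base {base_hz:.1f} Hz -> up {up_hz:.1f} Hz -> back {back_hz:.1f} Hz")
```

On paper, +4 followed by -4 cancels out exactly. With real audio, each shift is a resampling or time-stretching step, and that is where the small quality loss mentioned above comes from.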

fickle ore
warm wasp
#

I mean, just doing that wouldn't sound close enough to Neuro

#

BUT

#

it might help me with my current plan, i am considering it

fickle ore
dim narwhal
#

for the neuros themselves

fickle ore
#

Her real voice might not work. Imitating Neuro, shifting that down 4 semitones, training on it, then shifting the output back up 4 semitones might work.

void turtle