#Volume progressively dropping during inference.
That is very odd. Usually, a drop in volume happens over longer generations, but this was fairly short. What voice are you using? A cloned one, I suspect? If so, could you share the samples you used and the settings?
Using my own PVC. I know the recommended model is Multilingual, but that sounds too posh. I've split my audiobook text into blocks of 4 sentences to avoid long generations. I know stability is on the lower end, but again... identity is lost if that's higher, and the vast majority of generations are great.
https://api.elevenlabs.io/v1/text-to-speech/S1GxattMxHrXozy2QM7J?output_format=mp3_44100_192
payload = {
    "model_id": "eleven_english_v2",
    "text": text,
    "voice_settings": {
        "similarity_boost": 0.98,
        "stability": 0.35,
        "style": 0.48,
        "use_speaker_boost": True
    }
}
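For anyone following along, the four-sentence splitting mentioned above could look roughly like this. It is only a minimal sketch, not the poster's actual script: `split_into_blocks` is a hypothetical helper name, and the regex is a naive sentence splitter that will mis-split abbreviations such as "Dr.".

```python
import re


def split_into_blocks(text, sentences_per_block=4):
    """Split text into blocks of at most `sentences_per_block` sentences.

    Hypothetical helper for illustration; the sentence detection is naive.
    """
    # Break after '.', '!' or '?' when followed by whitespace.
    sentences = [
        s.strip()
        for s in re.split(r"(?<=[.!?])\s+", text.strip())
        if s.strip()
    ]
    # Group consecutive sentences into fixed-size blocks.
    return [
        " ".join(sentences[i:i + sentences_per_block])
        for i in range(0, len(sentences), sentences_per_block)
    ]
```

Each returned block would then be passed as the `text` value of one request, so no single generation runs longer than four sentences.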