#Volume progressively dropping during inference.


timid python
#

It doesn't always happen. I've batched a significant number of paragraphs for my audiobook and some generations have strange artifacts. Others drop in volume as the audio progresses and are unusable. I'm obviously expending quite a high number of credits on this, especially when needing to regenerate.
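For anyone batching like this, a rough way to flag the fading clips automatically before listening to them might look like the sketch below. It assumes the audio has already been decoded to a list of float PCM samples (the decoding step is not shown), and the function names, the 25% window, and the -6 dB threshold are all illustrative assumptions, not anything from the API.

```python
import math


def rms(samples):
    """Root-mean-square level of a sequence of float samples."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def volume_drop_db(samples, fraction=0.25):
    """Level of the last `fraction` of the clip relative to the first, in dB.

    Negative values mean the clip is quieter at the end than at the start.
    """
    n = max(1, int(len(samples) * fraction))
    head = rms(samples[:n])
    tail = rms(samples[-n:])
    if head == 0.0 or tail == 0.0:
        return 0.0  # silence at either end: no meaningful ratio
    return 20.0 * math.log10(tail / head)


def looks_faded(samples, threshold_db=-6.0):
    """Hypothetical check: True if the end is at least 6 dB below the start."""
    return volume_drop_db(samples) <= threshold_db
```

Clips where `looks_faded` returns True could be queued for regeneration. The threshold would need tuning against real generations, since a paragraph that naturally ends quietly will also score lower.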

icy frost
#

That is very odd. Usually, a drop in volume happens over longer generations, but this was fairly short. What voice are you using? A cloned one, I suspect? If so, could you share the samples you used and the settings?

timid python
#

Using my own PVC. I know the recommended model is Multilingual, but that sounds too posh. I've split my audiobook text into blocks of 4 sentences to avoid long generations. I know stability is on the lower end, but again... identity is lost if it's higher, and the vast majority of generations are great.
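The 4-sentence splitting is roughly the following. This is a minimal sketch, not the exact script: the helper names are made up for illustration, and the sentence splitter is a naive regex that will trip on abbreviations like "Dr." or "e.g.".

```python
import re


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def chunk_sentences(text, size=4):
    """Group sentences into blocks of `size`; the last block may be shorter."""
    sentences = split_sentences(text)
    return [
        " ".join(sentences[i:i + size])
        for i in range(0, len(sentences), size)
    ]
```

Each block returned by `chunk_sentences` is then sent as the `text` of one request.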

https://api.elevenlabs.io/v1/text-to-speech/S1GxattMxHrXozy2QM7J?output_format=mp3_44100_192

payload = {
    "model_id": "eleven_english_v2",
    "text": text,
    "voice_settings": {
        "similarity_boost": 0.98,
        "stability": 0.35,
        "style": 0.48,
        "use_speaker_boost": True
    }