#how to change volume gain, pitch and speaking rate on the ElevenLabs TTS websocket


tawdry vigil BOT
AI-generated Answer for suraj_63976

Currently, ElevenLabs does not offer built-in features to adjust volume gain, pitch, or speaking rate directly within its platform, including the TTS WebSocket. These adjustments need to be made using external audio editing software after generating the audio.

For pitch adjustments, you can use third-party tools or audio editors. Similarly, speaking rate modifications can be achieved by using external programs like Audacity or other audio processing tools.
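As a rough illustration of the post-processing route for volume gain, here is a minimal Python sketch. It assumes the generated audio is already available as raw 16-bit little-endian mono PCM bytes; the function name `apply_gain_db` is hypothetical and not part of any ElevenLabs SDK.

```python
import struct


def apply_gain_db(pcm: bytes, gain_db: float) -> bytes:
    """Scale 16-bit little-endian PCM samples by gain_db decibels.

    Assumption: `pcm` holds raw signed 16-bit samples (no WAV/MP3 header).
    Results are clipped to the int16 range to avoid wrap-around.
    """
    factor = 10 ** (gain_db / 20)  # convert decibels to a linear amplitude factor
    count = len(pcm) // 2          # two bytes per sample
    samples = struct.unpack(f"<{count}h", pcm[: count * 2])
    scaled = [max(-32768, min(32767, round(s * factor))) for s in samples]
    return struct.pack(f"<{count}h", *scaled)
```

For example, `apply_gain_db(chunk, 6.0)` roughly doubles the amplitude of `chunk`, and a negative value attenuates it. Compressed formats such as MP3 would need decoding first.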

The team is working on features to provide more control over audio elements, but no release date is available. If you need further assistance, feel free to ask!

ocean vigil

Does it honour the prosody attribute of SSML?

tawdry vigil BOT
AI-generated Answer for suraj_63976

ElevenLabs currently offers limited support for SSML (Speech Synthesis Markup Language) in their text-to-speech models. The Eleven English V1 and Eleven Turbo V2 models support some SSML tags, primarily for controlling pronunciation using the <phoneme> tag with IPA or CMU Arpabet.
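To make the pronunciation tag concrete, here is a small Python sketch that builds a `<phoneme>` element for inclusion in the text sent to the model. The `alphabet` and `ph` attributes come from the SSML specification; the helper name `phoneme` and the exact alphabet string for CMU Arpabet (`cmu-arpabet` is assumed here) should be checked against the current ElevenLabs documentation.

```python
from xml.sax.saxutils import escape, quoteattr


def phoneme(word: str, ph: str, alphabet: str = "ipa") -> str:
    """Wrap `word` in an SSML <phoneme> tag with the given pronunciation.

    Hypothetical helper: it only builds the markup string and escapes
    XML-special characters; it does not call any API.
    """
    return (
        f"<phoneme alphabet={quoteattr(alphabet)} ph={quoteattr(ph)}>"
        f"{escape(word)}</phoneme>"
    )


# Illustrative usage; the IPA transcription is an example value.
text = f"I would like a {phoneme('tomato', 'təˈmɑːtoʊ')} salad."
```

The resulting `text` can then be sent as the input text to a model that supports the tag.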

However, the current implementation does not include direct control over volume gain, pitch, or speaking rate through SSML prosody attributes. The ability to change the mood or tone of the voice is also not currently supported through SSML.

For adjusting the speaking pace, ElevenLabs has acknowledged user requests for speed control features, but these are not yet implemented in the API. The company has stated that expanding SSML support is on their roadmap, but there's no specific timeline for when additional controls like prosody attributes might be added.

In terms of WebSocket support, ElevenLabs offers WebSocket functionality for text-to-speech conversion, but not for speech-to-speech. If you need real-time streaming, you can use the text-to-speech WebSocket for this purpose.

For the most up-to-date information on available features and SSML support, I recommend checking the ElevenLabs API documentation or reaching out to their support team directly. They may be able to provide more specific details on upcoming features or workarounds for your particular use case.

ocean vigil

For an already created websocket connection, can I update the speed and other attributes?

tawdry vigil BOT
AI-generated Answer for suraj_63976

Currently, ElevenLabs does not support adjusting the speed, pitch, or volume gain directly within the TTS API or WebSocket connection. Expanding SSML support is on the roadmap, but additional SSML tags for controlling attributes like speed or prosody are not yet available.

For now, the best approach is to modify the generated audio using external audio editing tools, such as Audacity, after the audio has been created.
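If a scripted alternative to a GUI editor is preferred, speaking rate can be approximated by resampling the decoded audio. The Python sketch below is a naive linear-interpolation resampler and assumes raw 16-bit little-endian mono PCM input; `change_speed` is a hypothetical name. Note that this simple method shifts pitch along with speed, unlike the tempo-preserving effects found in dedicated audio tools.

```python
import struct


def change_speed(pcm: bytes, rate: float) -> bytes:
    """Resample 16-bit mono PCM so it plays `rate` times faster.

    Assumption: `pcm` is raw signed 16-bit little-endian mono audio.
    rate > 1 shortens the audio, rate < 1 lengthens it; pitch changes too.
    """
    if rate <= 0:
        raise ValueError("rate must be positive")
    count = len(pcm) // 2
    if count == 0:
        return b""
    samples = struct.unpack(f"<{count}h", pcm[: count * 2])
    out_len = max(1, int(count / rate))
    out = []
    for i in range(out_len):
        pos = i * rate                 # fractional position in the source
        lo = int(pos)
        hi = min(lo + 1, count - 1)    # clamp at the final sample
        frac = pos - lo
        out.append(round(samples[lo] * (1 - frac) + samples[hi] * frac))
    return struct.pack(f"<{out_len}h", *out)
```

With a streamed response, each decoded chunk could be passed through such a function before playback, although chunk boundaries may then introduce small discontinuities.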

ocean vigil

Can it be done during websocket initiation?