#AI text to speech?


wicked dove
#

Does OpenAI offer a text-to-speech service? For instance, if I have a book that I'd rather not read myself, can I use AI to convert the text into spoken audio so that I can listen to it instead?

keen vapor BOT
#

Hi @wicked dove. I'm kapa.ai, your friendly support bot. I will try to answer your question by searching through the knowledge base. :wave:

#

Yes, OpenAI offers a text-to-speech service through its API. This service can convert text into high-quality spoken audio. It's a part of the Audio API and uses the TTS (text-to-speech) model. You can certainly use this service to convert a book into spoken audio.

The TTS model comes in two variants: tts-1 and tts-1-hd. The tts-1 model is optimized for real-time use cases, while tts-1-hd is optimized for quality. You can choose between these based on your requirements.

To use this service, you need to send a request to the speech endpoint of the Audio API. The request should include the model, the text that should be turned into audio, and the voice to be used for the audio generation. Here's an example of how to do this in Python:

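from pathlib import Path

from openai import OpenAI

# The client reads the API key from the OPENAI_API_KEY environment variable.
client = OpenAI()

# Save the audio next to this script.
speech_file_path = Path(__file__).parent / "speech.mp3"

response = client.audio.speech.create(
    model="tts-1",  # or "tts-1-hd" for higher quality
    voice="alloy",
    input="Today is a wonderful day to build something people love!",
)

# Write the returned audio to disk.
response.stream_to_file(speech_file_path)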

In this example, the text "Today is a wonderful day to build something people love!" is converted into spoken audio using the alloy voice and the tts-1 model. The audio is then saved as an MP3 file.

The Speech API also supports real-time streaming using chunked transfer encoding, so the audio can start playing before the full file has been generated.
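
As a rough sketch of what streaming can look like in the Python SDK: this assumes openai-python 1.x, where the streaming form of the call is exposed as with_streaming_response (please check your installed version). The helper name stream_speech_to_file and the output file name are made up for illustration.

```python
from pathlib import Path


def stream_speech_to_file(client, text, out_path, model="tts-1", voice="alloy"):
    """Write synthesized speech to out_path while the response is still arriving.

    `client` is assumed to be an openai.OpenAI instance (openai-python 1.x).
    This helper is a hypothetical wrapper, not part of the SDK.
    """
    out_path = Path(out_path)
    # The context manager keeps the HTTP response open while it is consumed,
    # so the audio is written to disk chunk by chunk.
    with client.audio.speech.with_streaming_response.create(
        model=model,
        voice=voice,
        input=text,
    ) as response:
        response.stream_to_file(out_path)
    return out_path


# Example call (needs the openai package and OPENAI_API_KEY set):
#   from openai import OpenAI
#   stream_speech_to_file(OpenAI(), "Hello from a streamed response.", "streamed.mp3")
```

Passing the client in as an argument keeps the helper easy to try out with a stand-in object before spending API credits.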

Please note that the maximum input size you can submit per request is 4096 characters, which is equivalent to about 5 minutes of audio at default speed. If your book is longer than this, you'll need to break it up into smaller chunks.
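
Here is a minimal sketch of one way to break a long text into request-sized pieces. It is plain Python with no API calls; the function name split_into_chunks and the sentence-splitting rule are assumptions for illustration, and a real book may need smarter handling of abbreviations, dialogue, and chapter breaks.

```python
import re

MAX_TTS_CHARS = 4096  # per-request input limit mentioned above


def split_into_chunks(text, max_chars=MAX_TTS_CHARS):
    """Split text into pieces of at most max_chars, preferring sentence breaks."""
    # Naive sentence split: break after ., ! or ? followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current = [], ""
    for sentence in sentences:
        # A single sentence longer than the limit is cut at the limit.
        while len(sentence) > max_chars:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:max_chars])
            sentence = sentence[max_chars:]
        if not sentence:
            continue
        candidate = f"{current} {sentence}" if current else sentence
        if len(candidate) <= max_chars:
            current = candidate
        else:
            # Adding this sentence would exceed the limit: start a new chunk.
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks
```

Each chunk can then be sent as its own speech request and saved to a numbered file (for example part_001.mp3, part_002.mp3), to be played in order or joined afterwards with an audio tool.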

For more information, you can refer to the OpenAI API documentation (https://platform.openai.com/docs/api-reference) and the Text-to-Speech guide (https://platform.openai.com/docs/guides/text-to-speech).

#

Please also note that OpenAI's usage policies require you to provide a clear disclosure to end users that the TTS voice they are hearing is AI-generated and not a human voice.
