#How exactly can I generate AI voices with emotion?


slim emberBOT
#
AI-generated Answer for 420_skill

To generate AI voices with emotion, you can follow these tips:

  1. Contextual Phrases: Use phrases like "he said angrily" or "she whispered softly" to guide the AI in generating the desired emotion. These phrases help the AI understand the emotional tone you want.

  2. Punctuation: Proper punctuation can affect delivery. Use exclamation marks, question marks, and commas to convey emotion.

  3. Quotation Marks: Emphasize specific words or phrases by using quotation marks. This can enhance the emotional expression in the output.

  4. Voice Samples: When using a cloned voice, ensure that the uploaded samples reflect the desired emotional tone. The AI replicates the speaking style contained in the samples.

  5. Context: The AI is sensitive to the broader context surrounding each utterance. Setting the context appropriately with relevant text can help evoke specific emotions like anger, sadness, or happiness.

  6. Punctuation for Pauses: Utilize dashes, em-dashes, or multiple dashes for longer pauses to manage pacing and add emotional depth.

Here are some example prompts:

"Are you sure about that?" he said, confused.
"Don’t test me!" he shouted angrily.

Remember to cut these descriptive phrases out of the generated audio afterward, because the AI reads aloud exactly the text you give it. These methods can improve emotional delivery, but they do not guarantee a specific outcome. Future features will provide more control over emotion in the text.
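
As a rough illustration of tips 1 and 6, the text sent to the voice model could be assembled like this. The helper names `tag_line` and `add_pause` are hypothetical and not part of any product API; the sketch only builds strings and does not call a speech service.

```python
def tag_line(line: str, tag: str) -> str:
    """Wrap a spoken line in quotes and append a narrative tag such as 'he shouted angrily'."""
    line = line.strip()
    tag = tag.strip().rstrip(".")
    # If the line has no closing punctuation, dialogue convention puts a comma inside the quotes.
    if line and line[-1] not in ".!?,":
        line += ","
    return f'"{line}" {tag}.'


def add_pause(before: str, after: str, dashes: int = 2) -> str:
    """Join two fragments with a run of dashes, which tip 6 suggests for a longer pause."""
    return f"{before.strip()} {'-' * dashes} {after.strip()}"


print(tag_line("Don't test me!", "he shouted angrily"))
print(tag_line("Are you sure about that?", "he said, confused"))
print(add_pause("I thought you'd left", "but here you are."))
```

The tag is spoken along with the line, so it still has to be trimmed from the resulting audio.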

dreamy musk
#

How come I can’t upload a video to Speech-to-Speech?

slim emberBOT
#
AI-generated Answer for 420_skill

The Speech-to-Speech feature only accepts audio files, not video files. Please try uploading a short MP3 audio file instead. If you continue to experience issues, feel free to reach out for further assistance.
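
If the source material is a video, one possible workaround is to extract its audio track first. The sketch below assumes the ffmpeg command-line tool is installed and on your PATH; ffmpeg is not mentioned in this thread, and the 192k bitrate and the function names are illustrative choices.

```python
import subprocess
from pathlib import Path
from typing import List, Optional


def build_extract_command(video: str, audio: Optional[str] = None) -> List[str]:
    """Build an ffmpeg command that drops the video stream and writes an audio file."""
    # Default output: same name as the video, with an .mp3 extension.
    out = audio or str(Path(video).with_suffix(".mp3"))
    return [
        "ffmpeg",
        "-i", video,     # input video file
        "-vn",           # discard the video stream
        "-b:a", "192k",  # audio bitrate (assumed value, adjust as needed)
        out,
    ]


def extract_audio(video: str, audio: Optional[str] = None) -> None:
    """Run the command; raises CalledProcessError if ffmpeg reports a failure."""
    subprocess.run(build_extract_command(video, audio), check=True)


print(build_extract_command("clip.mp4"))
```

Calling `extract_audio("clip.mp4")` would write `clip.mp3` next to the video, which can then be uploaded as an audio file.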

dreamy musk
#

How many files can I upload for a custom voice?

slim emberBOT
#
AI-generated Answer for 420_skill

To create a custom voice, you need to provide at least 2 hours of high-quality audio featuring a single speaker. The audio should be free of background noise, excessive room reverb, music, or long pauses. Any external sounds could be misinterpreted by the AI.

If you have any further questions or need assistance, feel free to reach out!

dreamy musk
#

2 hours of an MP3 file?

slim emberBOT
#
AI-generated Answer for 420_skill

For creating a Professional Voice Clone, you should provide at least 2 hours of high-quality audio. The audio must feature a single speaker, without background noise, excessive room reverb, music, or long pauses. Using MP3 files in mono format is recommended.

dreamy musk
#

Does it have to be an MP3 file?

slim emberBOT
#
AI-generated Answer for 420_skill

For creating a Professional Voice Clone, we recommend using MP3 files with a bitrate of 192kbps or above. While we accept various file types, focusing on the quality of the recording is crucial. Ensure the audio is recorded professionally without any background noise, room reverb, multiple speakers, or long gaps of silence. Using an uncompressed format like WAV will yield little to no improvement over a high-quality MP3.
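
As a minimal sketch, the recommendations in these answers (at least 2 hours in total, 192kbps or above, mono) can be turned into a pre-upload checklist. The `Sample` record and `check_samples` function are hypothetical; the sketch assumes you already know each file's duration, bitrate, and channel count from your audio editor or another tool, and it cannot detect noise, reverb, or multiple speakers.

```python
from dataclasses import dataclass
from typing import List

MIN_TOTAL_SECONDS = 2 * 60 * 60  # "at least 2 hours" of audio in total
MIN_BITRATE_KBPS = 192           # "192kbps or above"


@dataclass
class Sample:
    name: str
    seconds: float
    bitrate_kbps: int
    channels: int


def check_samples(samples: List[Sample]) -> List[str]:
    """Return human-readable problems; an empty list means the basic checks pass."""
    problems = []
    total = sum(s.seconds for s in samples)
    if total < MIN_TOTAL_SECONDS:
        problems.append(f"total audio is {total / 3600:.2f} h, below the 2 h minimum")
    for s in samples:
        if s.bitrate_kbps < MIN_BITRATE_KBPS:
            problems.append(f"{s.name}: {s.bitrate_kbps} kbps is below 192 kbps")
        if s.channels != 1:
            problems.append(f"{s.name}: {s.channels} channels, mono is recommended")
    return problems


samples = [Sample("part1.mp3", 3600, 192, 1), Sample("part2.mp3", 1800, 128, 2)]
for problem in check_samples(samples):
    print(problem)
```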

dreamy musk
#

What about custom voice cloning?