#Expressive voice model that breathes, laughs, sighs, giggles, sings, etc.


plush shuttle
#

On the topic of a new voice for Neuro, I would like to endorse the use of the Bark transformer: https://github.com/suno-ai/bark

Bark is a transformer-based text-to-audio model created by Suno. Bark can generate highly realistic, multilingual speech as well as other audio - including music, background noise and simple sound effects. The model can also produce nonverbal communications like laughing, sighing and crying. To support the research community, we are providing access to pretrained model checkpoints ready for inference.

Note: This is for use with utterances (run on a separate thread); it is not fast enough for real-time TTS on responses.
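
A minimal sketch of the separate-thread idea, in case it helps: a worker thread pulls utterance texts off a queue and synthesizes them in the background, so the main response loop never waits on the slow model. The names `start_utterance_worker`, `synthesize`, and `on_audio` are made up for illustration. With Bark, `synthesize` could presumably be the `generate_audio` function shown in its README, but that wiring is an assumption and is not tested here.

```python
import queue
import threading
from typing import Any, Callable, Optional, Tuple


def start_utterance_worker(
    synthesize: Callable[[str], Any],
    on_audio: Callable[[str, Any], None],
) -> Tuple["queue.Queue[Optional[str]]", threading.Thread]:
    """Start a background thread that turns queued text into audio.

    synthesize: hypothetical slow text-to-audio function (e.g. a Bark call).
    on_audio:   hypothetical callback that receives (text, audio) when ready.
    Put None on the returned queue to stop the worker.
    """
    jobs: "queue.Queue[Optional[str]]" = queue.Queue()

    def worker() -> None:
        while True:
            text = jobs.get()
            if text is None:  # sentinel: shut the worker down
                jobs.task_done()
                break
            try:
                audio = synthesize(text)
            except Exception as exc:
                # Keep the worker alive if one utterance fails.
                print(f"synthesis failed for {text!r}: {exc}")
            else:
                on_audio(text, audio)
            finally:
                jobs.task_done()

    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    return jobs, thread
```

The main loop would only call `jobs.put("some utterance [laughs]")` and carry on; the audio arrives later through `on_audio`, which suits idle utterances better than direct replies.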

Google Colab with sample audio: https://colab.research.google.com/drive/1eJfA2XUa-mXwdMy7DoYKVYHI1iTd9Vkt?usp=sharing#scrollTo=NyYQ--3YksJY

plush shuttle
#

prompt:

Hello, my name is Suno. And, uh — and I like pizza. [laughs]
But I also have other interests such as playing tic tac toe.

fiery fox
#

Oh god don't tell Aletta

calm cedar
#

Could this be implemented alongside the usual system? I.e. using Bark only for non-verbal communication, which is then added to Neuro's usual ramble?
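
A rough sketch of how that hybrid could look, assuming the non-verbal parts are written as lowercase bracketed tags like the `[laughs]` in the sample prompt: split the text into speech and cue segments, then send each kind to a different engine. `split_cues` and the tag pattern are assumptions for illustration, not anything Bark or the existing system provides.

```python
import re
from typing import List, Tuple

# Assumption: cues are lowercase words in square brackets, e.g. [laughs].
CUE_PATTERN = re.compile(r"(\[[a-z ]+\])")


def split_cues(text: str) -> List[Tuple[str, str]]:
    """Split text into ("speech", ...) and ("cue", ...) segments, in order."""
    segments: List[Tuple[str, str]] = []
    # The capturing group makes re.split keep the matched cues in the result.
    for part in CUE_PATTERN.split(text):
        part = part.strip()
        if not part:
            continue
        kind = "cue" if CUE_PATTERN.fullmatch(part) else "speech"
        segments.append((kind, part))
    return segments
```

Each `"speech"` segment would go to the usual TTS and each `"cue"` segment to Bark, with the clips played back in order. Whether the two voices would blend well enough is a separate question.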

fiery fox
#

Kind of like how the subtitles only use Arial for characters that don't appear in the normal font? 😉

fallow thistle
#

Isn't it going to be hard to clone Neuro's voice to suit this model? And the fact that it's not fast enough to be used in real time really doesn't leave many ways to use it on stream.

tiny pewter
#

Bark is licensed under a non-commercial license: CC-BY 4.0 NC. The Suno models themselves may be used commercially. However, this version of Bark uses EnCodec as a neural codec backend, which is licensed under a non-commercial license.

#

Would Vedal even be allowed to use it?

calm cedar
#

Could ask for a commercial licence; typically it's only a question of licensing fees.

crude cloak
#

Don't you guys find the output to be noisy af? Are there examples of cleaner outputs?

plush shuttle
#

No, it's just the sample. See more tests here: https://www.youtube.com/watch?v=rU5Do9yHbwM

Bark AI by Suno AI is a transformer-based text-to-audio model that can generate realistic and multilingual speech as well as other audio such as music, sound effects and nonverbal expressions. It can also clone voices and audio from a limited set of synthetic options. Bark AI is an open source project that provides access to pretrained model che...

crude cloak
#

I'll see if I can make a demo.