#🐣┃suno-showcase

lyric steeple
#

Doing some code-switching experiments with the new language-specific prompts

#

Mi hermano is very inteligente, pero sometimes he's un poco terco.

#

The word spacing around he's is a bit choppy in terco_c

lyric steeple
tardy topaz
#

WE HAVE ACHIEVED KNOCK KNOCK GENERATION. Last text prompt is "Suno who?" The research continues.

#

(It looks like an accident, but I laughed, so I'm counting it.)

#

I know this isn't what the model is for but I LOVE IT SO MUCH

#

There are audio foundation models, sure. But why not generate the content too, even MORE foundational!

#

I won't rest until SUNO is generating SVG code

#

I've been randomly throwing wrenches into the sampling to encourage more unprompted content, sort of vaguely beam search-y

#

but with no theoretical grounding and my poor technical skills, mostly trial and error lol

lyric steeple
#

@tardy topaz, you inspired me to try "Why was six afraid of seven?"

tardy topaz
#

It absolutely kills me, I get the same thing. It's like you called on a student in class who wasn't paying attention, or the teleprompter died, and the speaker is stalling.

blissful pulsar
#

Has someone tried [cackling]? 😆

dire zealot
#

You got any samples?

blissful pulsar
#

Unfortunately not

dire zealot
#

I meant of generations resulting from using [cackling] lol

fast girder
brave pawn
#

cursed

blissful pulsar
#

Maybe it will be able to do all of these some day 🙂

deft grotto
azure tangle
#

hello peeps

#

anyone using this locally

deft grotto
azure tangle
#

dang pretty good @deft grotto you running locally?

deft grotto
azure tangle
#

ah thank you. im diggin into it now, ill let you know if i get it working

tardy topaz
#

If you only git clone the bark repo, make sure you run 'pip install .' from the main /bark directory. That's the only install step, though you also need to set up CUDA and all that stuff if you haven't already

zinc pecan
#

Oh wait suno can't clone?

#

its just a tts with supplied voices?

#

hope it gets updated to support cloning, i'll just stick with so-vits-svc

azure tangle
#

i been fucking with so-vits-svc, works pretty good. it doesnt have text 2 voice, right? its just cloning?

zinc pecan
#

yeah voice2voice

proud yacht
#

Someone know how to clone my voice ?

zinc pecan
#

I dont see why they'd gimp this so hard- all it'll take is another company to release it without synthetic restrictions

zinc pecan
#

I suppose we could just pipe the output of bark to so-vits-svc with a trained voice model so you get the unique intonations

tardy topaz
#

few nonsense kpop experiments

feral oriole
gray latch
#

what's Koe AI?

high river
# deft grotto

How did you clone the Jay-Z voice, is there a working tutorial anywhere?

olive wave
#

I’ll have to see if this is better than Voice.ai

jolly kindle
#

hi

zinc pecan
#

you have to install it and its subscription, no thanks

obtuse sparrow
#

Got something completely random as my first attempt. Not related to the prompt at all. Still very much clear though and not nonsense for the most part:

What I hear:

Man1: "Hmm... Heeheehee, heehee yeah, he may be uh, I guess something different"
Woman2: "*soft gasp* That explai-"
night quiver
#

😆

#

What was your prompt text?

#

I got something similar too which really surprised me

obtuse sparrow
#

Hmm, not sure if I can share, as it had at least 1 curse word, which I'm not sure is permitted here and/or is auto-blocked by the bot. It was also quite long.

#

But basically it was a back and forth between a female and male speaker about recording voice lines and about the oddness of the location in question.

#

I think the long prompt may have been a cause of the weirdness, and maybe too many hesitation commands, and a [throat clear] early on.

ebon widget
#

yeah long prompts sometimes lead to the model just going completely off the rails. and i also noticed that tags early on sometimes create issues (i think it has to do with our data prep)

obtuse sparrow
#

It's too bad it's only (currently) capable of generating around 13 seconds of audio. I would've loved to have heard more of what these mysterious AI ghosts were talking about. It sounded quite interesting :derpylaugh:

ebon widget
#

hehe, yeah in our internal generations it's super fun to listen to minutes of fully generated stuff without text prompts. goes anywhere from sermons to music to arguments 😂

#

we'll get to release stuff like that at least in the studio soon hopefully. just have to iron out some scaling kinks so it doesn't topple over when people try it

obtuse sparrow
#

I used to spend a lot of time generating Stable Diffusion images using blank prompts, so it'll be fun to do something similar with sound rather than images. In fact, might even be fun to listen to in the background when doing visual tasks. Can't wait to hear those kinda things.

#

Is there currently a way I'd be able to rig the colab to continually run a prompt by outputting audio files one after the other upon completion? It'd be tremendously slow, but it'd at least achieve that kinda thing and may help with long prompts.

ebon widget
#

Hm, not too familiar with Colab and when it kicks you off, but you can probably just write a loop and save the output some place, maybe even upload it to Drive or something. They also allow you to upgrade the GPU for faster inference etc
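
A rough sketch of that kind of loop, assuming nothing about the notebook itself: `generate_fn` and `save_fn` are hypothetical callables you would pass in (for example a Bark generation call and a WAV writer), and `out_dir` could point at a mounted Drive folder. Each clip is written as soon as it finishes, so a disconnect should only cost the clip in progress.

```python
from pathlib import Path


def generate_batch(prompt, count, generate_fn, save_fn, out_dir="bark_samples"):
    """Call generate_fn(prompt) `count` times and save each result to its own file."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    saved = []
    for i in range(count):
        audio = generate_fn(prompt)          # hypothetical: any callable returning audio
        path = out / f"sample_{i:03d}.wav"
        save_fn(str(path), audio)            # hypothetical: any callable that writes a file
        saved.append(path)                   # written immediately, before the next clip starts
    return saved
```

Reviewing afterwards is then just a matter of listening through the numbered files and keeping the good ones.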

obtuse sparrow
ebon widget
#

Awesome!

obtuse sparrow
#

AAAAAAA, my ears :lunascream:

fast girder
#

Hmm 😦 this is one of the failure modes we've been hearing for "harder" prompts.. we are thinking about how to prevent this more generally

autumn cloud
obtuse sparrow
#

I don't remember specifically. Generated a ton since then.

olive tide
dull hemlock
#

no laughs, and a shhhh at the end that's not in the text prompt

#
text_prompt = """
     Hello, my name is Salah. And, uh — and I like pizza. [laughs]
     But I also have other interests such as playing tic tac toe.
"""
slate summit
#

i got similar artefacts

#

'hallucinations' i assume

#

started out good, but decayed

fossil stratus
#

WOMAN: Yabadaba doo! I like Tick Tock Clocks.
Result: ??? Wat

slate summit
#

lol

tardy topaz
#

I have now listened to more than 200 "Why was 6 afraid of 7" completions, and not a single actual joke yet. KNOCK KNOCK and "Why did the chicken cross the road" had a lot higher hit rate. They are hilarious though.

zinc pecan
#

lol

#

definitely trained on podcasts and television interviews

proud yacht
native sorrel
#

lol

viscid sable
#

It is interesting~
we could use it to generate rap style

proud yacht
#

Nice, are you using a custom voice ?

viscid sable
# proud yacht Nice, are you using a custom voice ?

It's my prompt, I use [rap] to include the lyrics
text_prompt = """
[rap]
You pray for my demons, girl, I got you [music]
Every time I sip on codeine, I get vulnerable
I'm knowin' the sounds of the storm when it come [music]
She understand I can't take her everywhere a nigga going
I been in the field like the children of the corn[rap]
"""

I think we could use [rap] or other song-style tags to make the generation more singing-like

tight pawn
#

✅

chrome tapir
tardy topaz
wild crater
#

oh i think i should put my audio here

#

christ that took a while

tardy topaz
#

What's the prompt strategy you're testing? No brackets?

wild crater
#

yeah

#

brackets dont work

#

for me at least

#

i do this

#
* 1960's breakbeat solo *
#

with asterisks

#

and it seems to work better

tardy topaz
#

Where are the asterisks?

wild crater
tardy topaz
#

* 1960's breakbeat solo *

wild crater
tardy topaz
#

You hit three backticks

#

before and after

wild crater
#

???

#

what are those

chrome tapir
#

haha

wild crater
#

/// the h ///

#

nope

blissful pulsar
tardy topaz
#

Left of the one key

wild crater
#

hi

#

oh ok

wild crater
chrome tapir
#

just hit 3 underscoredots bro

tardy topaz
#
*1960's drum solo* 7 seconds
wild crater
#

i guess i could use the bass at the beginning

tardy topaz
#

Yeah, it works well to build a sample library for sure, where you can just say 'give me 100 tries' and find good ones

wild crater
#

yeah

#

i want like an ai generation vst that will give you good sounding instrument samples that sound the way you describe them. probably wont be a thing for at least another year or so though

jovial ferry
#

So all of these gens sound very robotic and tinny, why is that?

#

Much more so than tortoise for instance.

wild crater
#

idk

wild crater
#

it screams when i dont want it to

slate summit
#

let's see

wild crater
#

better

#

its really having trouble

#

i actually could use this

#

for like a choir

wild crater
#

electric piano

#

cool drums

blissful pulsar
wild crater
#

?

wild crater
tardy topaz
wild crater
hidden talon
#

[VOLUME WARNING - Screams at the start]
using [rap] in the prompt gave me an insane intro lol, the boom after the scream

quasi pier
#

Anyone here figure out how to do voice to voice with this?

zinc pecan
#

I dont think you can

quasi pier
#

It's possible but have to do a lot of things

blissful pulsar
ebon widget
#

hah, amazing!! 🙂

tranquil ravine
autumn cloud
wild crater
#

good drums

hardy warren
hardy warren
wild crater
blissful pulsar
wild crater
echo void
wild crater
blissful pulsar
wild crater
#

wait fuck

#

it doesnt give you the infinity notebook

blissful pulsar
#

if u want audio longer than 13 secs, then:
python bark_perform.py --use_smaller_models --text_prompt "abcd" --split_by_words 32

blissful pulsar
wild crater
#

nah

#

afs

#

h

hidden talon
drifting quiver
#

2 jokes 🙂
This is just an idea, but is it possible to make this model follow instructions, such as asking for a song and the model sings it? Given that the model adds sounds and weird stuff on its own, it should be possible for it to learn to respond, right?

marsh apex
warm garnet
warm garnet
drifting quiver
past summit
hardy warren
#

Bark would be perfect for an AI generated soap opera with ridiculously melodramatic actors

chrome tapir
hardy warren
#

i dont have the prompt saved, but from memory i just put [rap] before the text. that was probably the only good result out of ten though

fading wasp
vapid coyote
#

I need me a Minerva

hardy warren
hardy warren
chrome tapir
strong sentinel
wild crater
wild crater
pale olive
#

Just figured out how to make it accept insanely long files as text input in a Google Colab.

#

used a quick excerpt from dune as a test case

#

to make it more seamless I tried separating it into sentences and not word count
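
One way sentence-based splitting could look, as a sketch and not the code from the Colab: `split_into_sentence_chunks` and its `max_words` budget are made-up names for illustration (32 just echoes the `--split_by_words 32` value posted earlier). It groups whole sentences together until the word budget would be exceeded, so no chunk ends mid-sentence.

```python
import re


def split_into_sentence_chunks(text, max_words=32):
    """Group whole sentences into chunks of roughly max_words words each."""
    # Split after ., ! or ? followed by whitespace; the punctuation stays attached.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    chunks, current, count = [], [], 0
    for sentence in sentences:
        n = len(sentence.split())
        # Start a new chunk if adding this sentence would exceed the budget.
        if current and count + n > max_words:
            chunks.append(" ".join(current))
            current, count = [], 0
        current.append(sentence)
        count += n
    if current:
        chunks.append(" ".join(current))
    # A single sentence longer than max_words is kept whole as its own chunk.
    return chunks
```

Each chunk would then be generated separately and the audio pieces joined in order.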

chrome tapir
#

did you use a history prompt for that @pale olive

pale olive
chrome tapir
#

haha

autumn cloud
pale olive
#

1 i think

#

oh XD i mean english 1 speaker

idle iron
#

Suggestion for female voices anyone?

pale olive
#

so i created a fork with this modification.....not sure what im supposed to do after tbh

pale olive
idle iron
#

no i from the speaker list

#

i mean

pale olive
#

oh idk never tried them,

jagged dragon
idle iron
#

thanks man @jagged dragon

cedar saddle
#

I tried out a few things and got it to generate this amazing audio XD

chrome tapir
#

gotta love when it starts out nice and then suddenly transitions into an ear piercing screech

cedar saddle
#

XD

#

[very shocked gasp] [clears throat] [screams] [dies] [bangs hands] [clapping sounds]

keen fable
#

anybody willing to give a python noob a hand getting this up and running on vscode on m1

cedar saddle
#

im just running it on google colab

keen fable
#

id like to get it working locally

cedar saddle
#

im currently attempting to do that as well

keen fable
#

yay troubleshoot party 🎉

jolly veldt
#

Guys, I'm a layman, how do I run the repository?

keen fable
#

aight, 3 strong

jolly veldt
#

???

#

heeelp

keen fable
#

open one of the .ipynb files in vscode

chrome tapir
#

start feeding your errors to chatgpt until it works thats what i do

keen fable
#

yeah thats what im doing, its not being of much help

cedar saddle
#

bur

keen fable
#

at this point for all i can tell yall are chatgpt to me

cedar saddle
#

bur

#

we are the large language models

keen fable
#

and . well .. to some extent.. so am i 😄

#

i mean technically we've all been trained on a load of data and just spit inferences of that out

jolly veldt
#

Is there any tutorial teaching how to use it?

cedar saddle
#

XD

keen fable
chrome tapir
#

Scaling Transformer to 1M tokens and beyond with RMT

Recurrent Memory Transformer retains information across up to 2 million tokens.

During inference, the model effectively utilized memory for up to 4,096 segments with a total length of 2,048,000 tokens—significantly exceeding…

Likes: 3047 · Retweets: 766

keen fable
#

isn't memory in this context just RAM for training data

chrome tapir
#

i thought it was input tokens

keen fable
#

yeah yeah

#

im speaking in abstract

#

so it would go : training data > fine tuning > memory tokens

#

which is basically all the same thing iiuc

#

anyways back to bark

#

can someone help me bark

cedar saddle
#

attempting

chrome tapir
cedar saddle
#

what

jolly veldt
#

Guys, help me here! How do I run the code?

cedar saddle
#

its downloading something

chrome tapir
#

progress

cedar saddle
#

rip not using my gpu tho

chrome tapir
cedar saddle
#

hu

jolly veldt
#

Guys, help me here! How do I run the code?

keen fable
#

stop spamming

jolly veldt
#

help brother

cedar saddle
#

I'm just running it through terminal

keen fable
#

can u not read that we're all trying to get this to work

#

lol

#

ok yeah the import works on terminal

#

so my vscode setup is bonked

cedar saddle
#

i dont even have vscode 💀

chrome tapir
#

ill just keep making you guys jealous by posting cool stuff

latent condor
#

Pycharm or codium or even just terminal is fine. Gonna try in Colab though. Let's mess around

chrome tapir
#

it will be good motivation

keen fable
chrome tapir
#

but i struggled for a few hours trying to get it to work right too

cedar saddle
keen fable
#

pod3000 u have it working locally my man ?

chrome tapir
#

havent tried cloning yet no

latent condor
chrome tapir
#

yeah i have bark infinite workin on miniconda3

keen fable
#

alright

chrome tapir
#

but im still running the unoptimized version

latent condor
cedar saddle
chrome tapir
#

yeah apparently there was a speed update

latent condor
#

Does the time to generate a song or audio get multiplied if it's longer? For example, is 30 seconds normal for 10 seconds of audio, but a minute might take an hour, just because of the increased complexity?

chrome tapir
#

you can only do 15 seconds at a time

#

afaik

cedar saddle
#

how much stuff is it going to download?

keen fable
#

it has decided to download other stuff on its own

#

it's gone rogue

cedar saddle
#

😨

#

i think its generating

chrome tapir
#

it will say it/s when its generating

cedar saddle
#

o okk

chrome tapir
#

kb/s and mb/s for downloading

cedar saddle
#

a

latent condor
#

Pro life tip: don't run code you don't understand on your own machine lol. Use colab or a VM, especially for AI models, since they are huge

chrome tapir
cedar saddle
#

bro you're making music XD

chrome tapir
#

i know its crazy

cedar saddle
#

what prompts are you using?

latent condor
#

Haha what are the prompts

#

Haha

#

Exactly my question

chrome tapir
#

that was

#

beat Somewhere over the rainbow, way up high, there's a land that I heard of once in a lullaby, somewhere over the rainbow, skies are blue, and the dreams that you dare to dream really do come true

#

beat surrounded by asterisks

jolly veldt
latent condor
#

So the beat itself is decided by the lyrics then?

chrome tapir
#

id say to a very slight degree

latent condor
#

This is fckn awesome

cedar saddle
#

bruh

chrome tapir
#

seems like if you start off the prompt with dark lyrics it is a darker tone to the whole thing

#

and vice versa

#

if you start with yo yo yo check it you get a rapper usually

latent condor
#

Lmao

#

I wonder if you say "man run a man down " do you get drill lol

chrome tapir
cedar saddle
#

bur

blissful pulsar
cedar saddle
#

music

#

oh god tic tok

chrome tapir
#

maybe song works

#

lets see

cedar saddle
#

so @chrome tapir i reran it and i think this is it generating but i dont know what its doing with it

chrome tapir
#

probably saving a wav file in the root bark dir

#

i gotta get the faster version my bottom part is so slow compared to yours

cedar saddle
#

its not saving anything 💀

chrome tapir
#

oh check the samples dir

#

bark_samples

cedar saddle
#

im just running it off a python file i created

chrome tapir
#

you probably need some save audio to file function
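
A minimal sketch of such a save function using only the standard library, assuming the generation call returns mono float samples in the range -1 to 1. `save_wav` is a hypothetical helper, and the 24000 Hz default is an assumption about Bark's output rate; pass the library's own sample-rate constant if it differs.

```python
import struct
import wave


def save_wav(path, samples, sample_rate=24000):
    """Write mono float samples in [-1.0, 1.0] as a 16-bit PCM WAV file."""
    frames = bytearray()
    for s in samples:
        s = max(-1.0, min(1.0, float(s)))        # clip so the int conversion cannot overflow
        frames += struct.pack("<h", int(s * 32767))
    with wave.open(path, "wb") as wav_file:
        wav_file.setnchannels(1)                 # mono
        wav_file.setsampwidth(2)                 # 2 bytes per sample = 16-bit
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(bytes(frames))
```

Calling something like `save_wav("out.wav", audio_array)` right after generation should leave a playable file in the working directory.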

cedar saddle
#

maybe

chrome tapir
#

i feel like im judging a talent contest and half the people slowly walk on stage and then start screaming at the top of their lungs

cedar saddle
#

XD

chrome tapir
#

i think beat gives the best results so far

cedar saddle
#

nice

chrome tapir
cedar saddle
#

do do dod ododo

chrome tapir
#

guess someone cut the beat on him

#

not really fair

cedar saddle
#

hmmmm still showing the "No GPU being used. Careful, inference might be extremely slow!" thing

chrome tapir
#

u probably got the pytorch/cuda incompatibility problem i had

cedar saddle
#

a

chrome tapir
#

see if that returns true or false
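
The snippet being referred to isn't preserved in this log. A common check, assuming that's roughly what was meant, is to ask PyTorch which device it can see; `describe_torch_device` is a hypothetical helper name, and the MPS branch is only there for Apple Silicon machines.

```python
def describe_torch_device():
    """Report which accelerator PyTorch can see, or that torch is missing."""
    try:
        import torch
    except ImportError:
        return "torch not installed"
    if torch.cuda.is_available():                # True only with a CUDA build and a working driver
        return "cuda"
    # Apple Silicon (M1/M2) uses the MPS backend instead of CUDA.
    mps = getattr(torch.backends, "mps", None)
    if mps is not None and mps.is_available():
        return "mps"
    return "cpu"


print(describe_torch_device())
```

If this prints "cpu" on a machine with an NVIDIA card, the installed torch build is likely CPU-only or mismatched with the installed CUDA version.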

keen fable
#
AssertionError: Torch not compiled with CUDA enabled
cedar saddle
#

ima try reinstalling

chrome tapir
#

i wonder how close these beats are to the ones the model trained on

#

if they are different that would be pretty crazy

cedar saddle
#

getting a tts ai to make music for me

keen fable
#

now getting

"The operator 'aten::_weight_norm_interface' is not currently implemented for the MPS device."

chrome tapir
#

i really need more than 14 seconds

#

14 seconds is right where the lyrics usually kick in after the intro

cedar saddle
#

it finally made an audio file

#

but it just sounds like static with slight vocals

chrome tapir
#

was it screeching

#

now make a script to batch create

#

then review them after

keen fable
#

sigh running this locally is rocket science

#

need to b a pythonista

chrome tapir
#

just keep pasting the tracebacks into chatgpt

#

4 preferably

keen fable
#

doesnt work

#

unfortunately

cedar saddle
#

i only have normal free gpt

keen fable
#

i got 4

chrome tapir
#

4 is quite a bit smarter

#

and a lot slower

cedar saddle
#

"torch version does not support flash attention. You will get significantly faster inference speed by upgrade torch to newest version / nightly."

#

not even using my gpu 💀

#

its too loud im deleting it

wild crater
#

can someone help me with the voice cloning thing?

#

i was able to use a hugging face space to create a clone of my voice but i dont know how to use it in the colab notebook

chrome tapir
hazy whale
chrome tapir
#

nice laff

cedar saddle
#

creepy laugh

blissful pulsar
cedar saddle
#

the ending i was not expecting

hazy whale
cedar saddle
#

i can keep go-

chrome tapir
cedar saddle
#

do do do do

chrome tapir
cedar saddle
#

what da heck

chrome tapir
#

i hope his boyfriend dont mind it

cedar saddle
#

my boyfriend XD

chrome tapir
#

ill try aggressive

#

started out like a WWE entrance song

cedar saddle
#

im just going to try running it in notebook

stiff sinew
cedar saddle
#

i finally got it to work

stiff sinew
#

good job, BTM, i pm you

chrome tapir
#

oh not your 2nd bar it/s is down to 1 just like mine

#

now*

hollow citrus
#

so how did you guys do the beat thing? I guess in colab you put the beat thing in brackets?

pale olive
#

managed to get 14 minutes of audio from a passage from dune in like 30 minutes in colab

#

nope file is too large

chrome tapir
#

works sometimes

warm pond
#

are you guys figuring out how to speed it up? On colab? Because the problem I had with this and tortoisetts is that it's just too slow to use for anything

chrome tapir
#

song, music seems to work ok too

sour junco
pale olive
#

14 min audio idk

versed atlas
#

Where can I see a tutorial on how to clone a speaker's voice?

chrome tapir
pale olive
#

yuppp

#

i actually got a thing for that which has different speakers for each character automatically and guesses the character's gender by their name

#

sadly it uses tortoise rn cause it was from a few weeks ago but im gonna be updating it to use other things

#

the readme has a demo you can use in colab

#

should have more time to work on it over the summer

chrome tapir
cedar saddle
chrome tapir
#

beat Singin' in the rain, just singin' in the rain, what a glorious feeling, I'm happy again, I'm laughing at clouds, so dark up above, the sun's in my heart and I'm ready for love %

#

the % is just what i use to break lines u can ignore it

cedar saddle
#

bur

chrome tapir
#

haha yeah i like how random the results are. total crapshoot

cedar saddle
#

im making weird stuff

hollow citrus
#

how many of you are using colab and how many people are using local? I am using local with an rtx-2070 and the small models and the github repo that was posted lately

violet narwhal
cedar saddle
hollow citrus
#

what's the difference between the regular model and the small one?

violet narwhal
hollow citrus
#

i'm not too sure what that means

cedar saddle
chrome tapir
#

beat I came in like a wrecking ball, I never hit so hard in love, all I wanted was to break your walls, all you ever did was wreck me, yeah, you wreck me

cedar saddle
#

bruh

#

its so good XD

chrome tapir
#

haha if alanis gave a ted talk instead of made a song

#

i had one that had a studio audience clapping in the background

#

pretty cool

tardy topaz
chrome tapir
#

have you guys played with elevenlabs TTS too?

#

i did a bunch of famous movie speeches one night

chrome tapir
hidden oar
#

how use the portuguese speaker??

chrome tapir
umbral solar
#

is it possible to give it the melody? or is it just random?

chrome tapir
chrome tapir
#

if sing is the first word in the prompt it sings more

umbral solar
#

did it generate the instruments?

chrome tapir
#

yeah

umbral solar
#

cool

chrome tapir
#

yeah it has amazed me a lot today

#

ive been itching for AI music for a while

umbral solar
#

there is also one that generates music with stable diffusion

#

its called riffusion but it cant generate speech, only music, and it works completely differently

chrome tapir
#

i didnt make it as far with riffusion

#

i think its responding to piano

#

im gonna try cutting it off into smaller chunks

#

it seems to present good content in the first half more often

#

it comes out the gate hard and then fizzles around 6 seconds

umbral solar
#

yes and some generations are soo good and some are so bad xD but this is better than most stuff xD

chrome tapir
#

someone try bpms too

#

piano makes it hit one piano chord and thats it. seems too powerful

#

the 14 second song its a new thing

#

i should try some david goggins

#

omg this bass is nuts

#

woofers*

umbral solar
#

are u running it locally?

chrome tapir
#

AI is so aggressively loud lol sometimes

chrome tapir
umbral solar
#

how much vram does it use?

chrome tapir
#

10.6

chrome tapir
#

definitely prefers east coast rap (nsfw and loud)

umbral solar
chrome tapir
chrome tapir
#

so i pick the best 10%

#

and yeah i am just prompting beat lyrics go here

#

with beat in asterisks

#

everything else is default .7 temp

#

at least from bark infinite default

umbral solar
#

and do u choose a speaker?

chrome tapir
#

no

#

for more than 14 seconds you'd need voice files im sure

umbral solar
#

which ones do u use?

#

because i just used the 1 click installer for web ui

chrome tapir
#

which whats

#

i havent really messed with the voice files but i have saved every one so i might pick some good ones to try

umbral solar
#

ok

chrome tapir
#

i feel like its about to go into this incredible guitar song and then it just runs out of tokens at 3 seconds

#

we are so close

#

sounds like the strings on the guitar broke haha

#

ok i am gonna try 1 line songs

rigid sluice
#

Here's an audio film created by using Bark through a free add-on I've made for Blender(screenplay is written by chatGPT and images are made by Stable Diffusion): https://www.youtube.com/watch?v=AAdQfQjENJU

This film was created with Blender and these add-ons:
Generative AI for the VSE: https://github.com/tin2tin/generative_ai
Using Bark: https://github.com/suno-ai/bark and Stable Diffusion through the Diffusers module: https://github.com/huggingface/diffusers
Blender Screenwriter: https://github.com/tin2tin/Blender_Screenwriter
Screenwriter chatGP...

chrome tapir
#

n1 john

#

broke out of the friend zone thats a miracle

#

the voices are great

umbral solar
chrome tapir
#

nice you managed to keep the same voice and beat with no history prompt?

#

for multiple chunks i mean

odd wasp
fallen rapids
#

I've only had Bark for a few hours, but I'm going to have a great time with it already, I can just feel it 😂
Note: I just used bark for the voices, I did the music myself.

last vault
#

Could you share prompts and settings? 🙏

wild crater
tardy topaz
fallen rapids
#

@wintry yew Thanks, I had to stop messing around at a bit over the 2 minute mark as I had more pressing matters.

I might try and finish it, and see if I can do some text-to-video for AI video as well.

autumn cloud
umbral solar
umbral solar
cedar saddle
autumn cloud
wild crater
#

i still dont know how to clone voices

short scaffold
blissful pulsar
jovial ferry
blissful pulsar
#

SMFEN EJIEFEE FFI .JSJA.S..SDV.EMFV.MV.NVM.HLHVfj<l

edgy mango
jovial ferry
#

What was your prompt?

chilly flax
#

The input text was "[sad][weeping][Crying] Hello, my name is Suno. And, uh — and I like pizza. [laughs]
But I also have other interests such as playing tic tac toe.". ☠️

blissful pulsar
blissful pulsar
hollow citrus
#

i wonder if it can do stuff like a meowing cat sound

rough berry
#

"""
Hello, my name is Suno. And, uh — and I like pizza. [laughs]
But I also have other interests such as playing tic tac toe.
"""

pale olive
#

so it keeps specific voices for characters as i selected so thats good but.....idk not very coherent, cause theres some long pauses. idk but first test i guess

hazy whale
chrome tapir
#

the voice files for songs are working pretty good

#

the beat is staying the same

edgy mango
#

made sure there were no quotes in it and cli'd it as text_prompt in entirety.

hazy whale
#

came out pretty good i think

chrome tapir
#

i have that npz if anyone wants it

#

gonna try another

autumn cloud
hazy whale
autumn cloud
hazy whale
#

next podcast, why I'm taking your job and ruining your life

autumn cloud
#

I’m AI myself; quite safe

chrome tapir
wild crater
#

plus can i use that sample

chrome tapir
wild crater
chrome tapir
#

risky click

honest atlas
hollow citrus
#

i'm going to make a batch file for the command line tool for those who want it

chrome tapir
#

im hoping that JonathanFly releases a new bark infinity soon

hollow citrus
#

yeah, bark is so amazing. BTW i use a screen reader as i am fully blind

chrome pecan
hollow citrus
#

but with long text, i don't know how many lines/words

obtuse sparrow
hollow citrus
#

i am messing with the confused travolta model and going to put the results here soon

chrome tapir
hollow citrus
#

what was the prompt?

chrome tapir
#

(dance beat) Pump it up
You got to pump it up
Don't you know, pump it up

#

been using parentheses and 2 words in the beginning with good results

hollow citrus
#

oo just picturing the dance beats with kraftwerk lyrics

chrome tapir
#

havent had much luck making edm

#

ill try

oak schooner
hollow citrus
#

has anybody messed with the confused mode whatever it's called? I call it text to speech completion

honest atlas
chrome tapir
#

lol

honest atlas
hollow citrus
#

hey try that with the confused travolta mode whatever it's called, it may do some weird results

honest atlas
#

I'm using the Colab. I don't know how to call for different voices. How do you do it?

hollow citrus
#

oh okay sorry

#

i have a gpu so i am using the command line tool

honest atlas
#

I tried installing it locally but on 8GB VRAM GTX1660ti all I get is CUDA out of memory.

#

I love these dramatic readings. They're so random!

#

I have Colab Pro set to Premium GPU High Ram and it spits out these 15 sec clips in about 20 secs. Not bad. Do you think having an A100 makes a difference?

hollow citrus
#

i used the smaller model

#

or whatever you call it

honest atlas
#

I'll try that if I can figure out how to set it.

hollow citrus
#

just type python bark_perform.py -h for help, i know it's the wrong channel to answer but yeah

honest atlas
#

That works in the webui? Ok I'll try that. Thanks.

hollow citrus
#

not sure but i just used the command line

honest atlas
#

Ok, I'll try it in the terminal window.

#

The Colab is pretty easy and quick but I think running it locally has more features?

hollow citrus
#

the command line version at least, as i don't know of any other repos that have the smaller models supported

fresh mulch
chrome tapir
#

sounds normal to me

patent bramble
#

open na noor

fallen rapids
echo void
fallen rapids
#

This is a slightly altered version of en_male_professional_reader from JonathanFly's fork.

proud patio
#

made a spoken piece about ants with bark and some prompt engineering and cutting:

  1. "[Clears throat] Ants, oh ants, they never cease to amaze, [Sighs] With their resilience! they Can survive even when the water stays.. [Laugh]"
  2. "[Laugh] Who would've thought.. these tiny creatures would come together, Forming rafts.. and floating in Stormy weather!? [Gasp]"
  3. "Ha [Gasp] It's incredible What Nature can do..[Sigh] and these ants are proof - that even the Smallest!- can be mighty too! [proud]"
    en_speaker_1
blissful pulsar
hollow citrus
#

besides [laughs] can i also do [screams]?

wild crater
#

[1996]?! NO! THAT WAS [No success]!! [1997] MY BELOVED

hollow citrus
#

so here is my first ever i am going to share! The prompt is: and uh ... it's like it never happened [laughs] so i think ... i think ... i think if we were going to do it, we need to do it right. [sighs] it's a tough life.

blissful pulsar
#

That prompt seems to always give me female voices, even for my male voice files

blissful pulsar
#

bombastic side eye... criminally offensive side eye

hearty kernel
formal ice
#

I can't really make out the words HUHH

plain skiff
#

chatgpt4: Once upon a time in the town of Gigglesburg, there was a clumsy mime named Benny [laughs]. Benny was notorious for always causing accidental chaos during his performances.

One day, Benny was invited to perform at the Gigglesburg Comedy Festival. Excited, he prepared a new act featuring an imaginary "banana peel" [laughs].

During the performance, Benny mimed slipping on the imaginary banana peel, and as fate would have it, he accidentally stumbled upon a real banana peel! Benny slipped, crashed into a drum set, and sent cymbals flying [laughs].

The audience erupted into laughter, thinking it was all part of the act. Benny, though embarrassed, decided to embrace the moment and kept slipping and falling throughout the show [laughter]. It became Benny's most famous act, turning his clumsiness into comedy gold [laughs].

ebon widget
#

towards the end it loses track of what it was playing 😂

obtuse sparrow
honest atlas
#

Where can I post NSFW Bark clips? lol?

formal ice
#

input: [upbeat music loop]
not a music loop but a pretty cool ambience bass hit type of sound.

lyric vine
#

@ebon widget omg sick

chrome tapir
#

im picturing cars full of robots blasting music with smoke billowing out

#

lets see if i can make a 'what does the fox say' remix

#

the world needs it

chrome tapir
small sluice
violet narwhal
#

Warning, it is very LOUD! ⚠️

#

prompt

chrome tapir
#

suno responds well to requests for screaming

honest atlas
#

So using the Colab, how do you get it to make beats like that?

#

And I've read about a voice called Confused Travolta? How do I call that up on Colab?

chrome tapir
#

can you select no voice in colab?

#

if so do that and then put "(dance beat) something something lyrics go here for around 10 words" for the text prompt

#

not sure how you would trigger confusedmode

chrome tapir
torn ferry
#

🔊 Gets loud, but literally did a prompt like that last night! This sounds like a viral Tiktok sound TBH

chrome tapir
#

yeah i could see it being big on tiktok. i was in the hospital and someone in the bed next to me was scrolling tiktok with the volume on max. sounded pretty close

#

gonna have to save that history file for sure

torn ferry
#

[obsequious] Good evening, sir! Let me know if there's anything else I can assist you with today or if you have any updates on your to-do list. Wishing you a relaxing evening!

chrome tapir
#

using my gpu to make music thats one thing i didnt plan on

#

seems to be a winner

violet narwhal
chrome tapir
#

there we go

#

wheres all the suno producers at?!

blissful pulsar
ebon widget
chrome tapir
#

lol

#

what prompts are you using for good nsfw?

#

provided its not too graphic

blissful pulsar
ebon widget
violet narwhal
undone glade
#

I think I may have cracked the consistency and cloning issue

First I lock the model by seeding everything like this:
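`import os
import random

import numpy as np
import torch


def set_seed(seed):
    # Seed every RNG the pipeline touches so repeated runs match.
    seed = int(seed)
    torch.manual_seed(seed)
    random.seed(seed)
    np.random.seed(seed)
    torch.cuda.manual_seed(seed)

    # Make cuDNN pick deterministic kernels instead of benchmarking for the fastest ones.
    torch.backends.cudnn.deterministic = True
    torch.backends.cudnn.benchmark = False

    # Only affects subprocesses started later; the running interpreter's hash seed is already fixed.
    os.environ["PYTHONHASHSEED"] = str(seed)
`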

I then use short prompts to find the voice I want, and after that I use the same seed on my longer prompt. I've also exposed the fine_temp setting from api.py, which seems to control how consistent the tone and pitch of the voice are. Default is 0.5, I'm using 0.2.

Example:
Far above the Ephel Duath in the West the night-sky was still dim and pale. There, peering among the cloud-wrack above a dark tor high up in the mountains, Sam saw a white star twinkle for a while. The beauty of it smote his heart, as he looked up out of the forsaken land, and hope returned to him. For like a shaft, clear and cold, the thought pierced him that in the end the Shadow was only a small and passing thing: there was light and high beauty for ever beyond its reach.

It still has problems between the stitched clips

umbral solar
#

are you using the infinity version?

#

because i didn't have stitching problems there

undone glade
#

I'm using my own version of infinity I made

umbral solar
#

oh

umbral solar
#

i also wanted to play around with the seed. when you have the same seed plus text, is the output exactly the same?
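A toy way to see what "same seed plus same text" buys you, using only Python's `random` module. This is not Bark: `sample_tokens` and its `vocab_size` are made up for illustration. On a GPU the same seed may still not give bit-identical audio unless deterministic kernels are also enabled, so treat it as a sketch of the idea only.

```python
import random


def sample_tokens(seed, n_tokens=8, vocab_size=10_000):
    """Draw a toy token sequence from a private, seeded RNG (hypothetical helper)."""
    rng = random.Random(seed)
    return [rng.randrange(vocab_size) for _ in range(n_tokens)]


same_a = sample_tokens(seed=1234)
same_b = sample_tokens(seed=1234)
other = sample_tokens(seed=1235)

print(same_a == same_b)  # same seed, same draws
print(same_a == other)   # a different seed gives a different sequence
```

So if every source of randomness is seeded and the kernels are deterministic, the same seed and text should reproduce the same tokens; change either one and the output changes.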

undone glade
#

Thank you, I had lost the page I had copied from

pale jasper
#

"the thought pierced him that in the end the Shadow was only a small and passing thing" is exquisite 🤌
It correctly guessed the implied pauses that often trip up human readers on first-time reads

undone glade
# pale jasper "the thought pierced him that in the end the Shadow was only a small and passing...
umbral solar
#

this model should be good at cloning because it's based on VALL-E and they can clone a voice from 3 seconds of input audio

#

maybe if we look into the VALL-E paper we can find out how they do it

wicked vapor
undone glade
undone glade
# violet narwhal Yeah, something like that, https://discord.com/channels/1069381916492562582/1100...

Oh wait, I didn't actually click the link, I got it from here: https://wandb.ai/sauravmaheshkar/RSNA-MICCAI/reports/How-to-Set-Random-Seeds-in-PyTorch-and-Tensorflow--VmlldzoxMDA2MDQy. I didn't realize it was already mentioned

umbral solar
undone glade
#

Yeah, it certainly makes the model easier to control

violet narwhal
#

torch.use_deterministic_algorithms(True)

undone glade
#

Yeah, I had pulled that cause I kept hitting up against the CUBLAS and didn't feel like finding the solution, lol
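For reference, the cuBLAS complaint is probably the one PyTorch raises when `torch.use_deterministic_algorithms(True)` is on and `CUBLAS_WORKSPACE_CONFIG` is not set (CUDA 10.2 or newer). A minimal sketch, assuming that is the error meant here; `enable_strict_determinism` is a made-up helper name, and the optional import is only there so the snippet also loads on a machine without PyTorch.

```python
import os

# PyTorch's reproducibility notes list ":4096:8" and ":16:8" as valid values.
# Setting it before importing torch is the safe option.
os.environ.setdefault("CUBLAS_WORKSPACE_CONFIG", ":4096:8")

try:
    import torch
except ImportError:  # keeps the sketch importable without PyTorch
    torch = None


def enable_strict_determinism():
    """Ask PyTorch to raise on nondeterministic ops. Returns False if torch is missing."""
    if torch is None:
        return False
    torch.use_deterministic_algorithms(True)
    return True


print(enable_strict_determinism())
```

Some ops have no deterministic implementation at all, so with this switch on they raise an error instead of silently drifting.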

violet narwhal
undone glade
#

Nice, thank you. I'll have to implement it

violet narwhal
umbral solar
#

hehehe

#

549

#

one long chunk and not multiple stitched ones

#

lets hope that it does not crash

undone glade
#

That's awesome

umbral solar
#

is it me or is it getting more quiet

#

and the voice also changes a bit

#

and the pronunciation changes a bit

#

i saved the semantic array and copied it into a variable before the generation, that's why it's repeating

undone glade
#

It does change a bit, but it never felt like a different person, more like someone practicing lines and trying different deliveries

umbral solar
#

yes

#

none of the changes here come from the semantic model

#

and the strange sound you hear at the beginning of each repetition was me changing some numbers in the array xD

#

i wonder whether, if you generated a longer semantic array, it would drift into different voices because it can't remember the start, but idk

#

and the change i made is quite smol

undone glade
#

What if we treat it like a chatbot with a very small context window? We could wait till it gets to the end and then feed it the last half of the semantic output along with the correct chunk of text. It might be possible for it to maintain coherency in longer texts.
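A rough sketch of that carry-over idea, with the model swapped out for a stub. Everything here is hypothetical: `generate(text, context_tokens)` stands in for whatever actually produces semantic tokens, and `carry_fraction=0.5` is just "the last half" from the message above. In Bark terms the carried tokens would play the role of the semantic part of a history prompt, but that mapping is an assumption.

```python
def generate_with_carryover(chunks, generate, carry_fraction=0.5):
    """Generate chunk by chunk, priming each call with the tail of the previous output."""
    all_tokens = []
    context = []
    for text in chunks:
        new_tokens = generate(text, context)
        all_tokens.extend(new_tokens)
        # Keep only the last part of what was just produced as the next context.
        keep = int(len(new_tokens) * carry_fraction)
        context = new_tokens[-keep:] if keep > 0 else []
    return all_tokens


def fake_generate(text, context_tokens):
    # Stand-in model: one token per word, shifted by how much context it received.
    return [len(word) + len(context_tokens) for word in text.split()]


tokens = generate_with_carryover(["one two three four", "five six"], fake_generate)
print(tokens)
```

The second chunk sees two carried tokens, so its outputs are shifted by two. That shift is only there to show that the context really reaches the next call.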

umbral solar
#

oh that is very smart

#

i am just testing splitting the text into 35-word lists and letting the semantics generate for each part, but then i put all the semantic strings together and let it run in one go. but with your idea it will probably be more consistent

undone glade
#

Yeah, I'm working on how I want to split things up right now

hollow citrus
#

how do you make it rap? Just the beat without the eighth note thingies?

umbral solar
# undone glade Yeah, I'm working on how I want to split things up right now

# split the long text into chunks of 35 words
# (fragment from inside a function, hence the bare return; assumes numpy is imported as np)

words = long_text.split()
chunks = [words[i:i + 35] for i in range(0, len(words), 35)]

# apply the generate_text_semantic function to each chunk
outputs = []
for chunk in chunks:
    text = " ".join(chunk)
    x_semantic = generate_text_semantic(
        text,
        history_prompt=history_prompt,
        temp=temp,
        base=base,
        allow_early_stop=allow_early_stop,
    )
    outputs.append(x_semantic)

# concatenate all the outputs together
x_semantic = np.concatenate(outputs)

# (the original single call over the whole text, kept for reference)
#x_semantic = generate_text_semantic(
#    text,
#    history_prompt=history_prompt,
#    temp=temp,
#    base=base,
#    allow_early_stop=allow_early_stop,
#)
print(x_semantic)
return x_semantic
#

and have the text in the long_text variable

umbral solar
#

i just tested it and it sounds very consistent already

#

but maybe your idea will improve it even more

undone glade
#

Nice, I'm gonna splice your code into mine and see

umbral solar
hollow citrus
#

here's one! the prompt is:
beat ♪Mama, just killed a man. Put a gun against his head, pulled my trigger now he's dead. Mama, life had just begun, but now i've gone and thrown it all away.♪

umbral solar
undone glade
#

Lol, yeah I figured

umbral solar
#

xD

#

good does it also work for u?

undone glade
#

Still chasing down bugs

umbral solar
#

f

hollow citrus
#

i took the first verse from bohemian rhapsody

umbral solar
umbral solar
hollow citrus
#

oh, with a bracket. Not yet, let's see

undone glade
#

My model isn't unloading

umbral solar
#

i mean after its done creating the audio

#

it changes a bit

#

and changes from british to american

undone glade
#

Yeah, it loses some consistency

chrome tapir
#

@hollow citrus i have some good npz files for rap if you want

#

this one in particular

umbral solar
undone glade
#

the history_prompt?

umbral solar
#

yes

undone glade
#

It's like using a reference image in SD, it guides the model towards a certain voice.

umbral solar
#

maybe this will help a bit

undone glade
#

It does, it constrains the voice to a range, and I've found that lowering the fine_temp control constrains it even more

hollow citrus
chrome tapir
#

try beat in asterisks and parentheses too

#

both seem to work better than brackets

hollow citrus
#

i did asterisks

chrome tapir
#

make 50 and pick the best 1

#

heh

hollow citrus
#

oo okay. I can do that locally actually. I did these with the hf space

chrome tapir
umbral solar
chrome tapir
#

the bass is wild

#

i am playing on 8" woofers

#

i have to keep the volume at like 5%

undone glade
#

Damn

umbral solar
#

maybe we should save the semantic data in the metadata

#

so it can get recreated

#

and the prompt

undone glade
#

Yeah, do you have seeds? Because with seeds we would just need the prompt, history, temps and seed
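One simple way to keep those four things together is a JSON sidecar next to each clip. A minimal sketch under stated assumptions: the file layout, the field names and `save_generation_metadata` itself are invented for illustration, and nothing here is part of Bark.

```python
import json
import tempfile
from pathlib import Path


def save_generation_metadata(audio_path, prompt, history_prompt, text_temp, waveform_temp, seed):
    """Write the settings needed to rerun a generation next to the audio file."""
    meta = {
        "prompt": prompt,
        "history_prompt": history_prompt,
        "text_temp": text_temp,
        "waveform_temp": waveform_temp,
        "seed": seed,
    }
    meta_path = Path(audio_path).with_suffix(".json")
    meta_path.write_text(json.dumps(meta, indent=2), encoding="utf-8")
    return meta_path


out_dir = Path(tempfile.mkdtemp())
meta_path = save_generation_metadata(
    out_dir / "clip_001.wav",
    prompt="(dance beat) something something lyrics go here",
    history_prompt="en_speaker_1",
    text_temp=0.7,
    waveform_temp=0.6,
    seed=1234,
)
restored = json.loads(meta_path.read_text(encoding="utf-8"))
print(restored["seed"])
```

Saving the raw semantic array as well (for example with `numpy.save`) would cover the case where the seed alone does not reproduce the clip.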

umbral solar
#

no but seed would also be nice

umbral solar
#

i didnt say that it should make music

#

and it dies at the end

chrome tapir
#

hahah

undone glade
#

Yeah, mine ran into the same thing, it got super robotic at the end

chrome tapir
#

started out so enthusiastic

#

thats why i just rock with 14 seconds at a time

#

even that is a push

#

for music anyway. i think with text you can just do chunks and stitch them together pretty successfully

hollow citrus
#

i just have 8gb vram so i can't use the regular bark model

chrome tapir
#

damn

#

so close

#

once those 24gb models come out i am gonna be forced to sell an organ

#

probably gonna need more ram soon too for 65B parameter LLMs

hollow citrus
#

i'll have to try the sound effect generation

umbral solar
chrome tapir
#

someone will eventually

umbral solar
#

i mean for this project

chrome tapir
#

i know everyone says fine tune = better but i think bigger = better as well

#

just a hunch

umbral solar
#

yes idk about finetune

chrome tapir
#

big model + lora seems to be a really good combo in image creation

#

so maybe it will be similar for audio

umbral solar
#

but not consistent

chrome tapir
#

if you listen to enough suno you will start to hear AI when normal people speak

chrome tapir
#

i was watching some guy give a speech on stage and it was tripping me out

#

the way he paused and said uhh and stuff sounded very suno like

#

its almost like people have the same mannerisms programmed into their speech

#

just gives you a different perspective i suppose

undone glade
#

I think it's so jarring because it's a stark reminder that we as humans are not unique

chrome tapir
#

im gonna try lyrics that have sound device type words in the lyrics

#

that seems to work well

umbral solar
#

maybe it's because of bad compression in audio

#

btw DeepFloyd is open

chrome tapir
#

i was just looking at that

undone glade
#

Do not use en_speaker_5 for long texts, it does not work well

chrome tapir
#

ill wait for someone to make it into a gradio

umbral solar
chrome tapir
#

(dance beat) Shake it off, I shake it off, I, I, I shake it off, I shake it off, heartbreakers gonna break, break, break, and the fakers gonna fake, fake, fake, baby, I'm just gonna shake, shake, shake

#

wasnt it obvious?

#

might mess with that one later

umbral solar
#

lets see what happens

undone glade
#

I'm very curious

umbral solar
#

noo my headphones are empty

chrome tapir
#

gotta love when you get hit with the ear piercing scream straight from second 0.0

undone glade
chrome tapir
#

wtf just got back to back screams

#

sheesh what are they doin to these poor AI agents

umbral solar
chrome tapir
#

one day we will have audio books where every character has a unique voice

undone glade
#

I think the main components are the seeding and lowering the temps

undone glade
#

What am I listening to? is this that giant filled array?

umbral solar
#

yes from 1 to 10.000

undone glade
#

So if someone really wanted to they could map the semantic scape

umbral solar
#

yes

#

if you want to learn a new instrument with 10.000 notes

undone glade
#

I'm thinking more if you could filter which ones were the most similar to speech you could limit what tokens are allowed through
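A sketch of what "limit what tokens are allowed through" could look like at sampling time: mask the logits of every token that is not on the allow-list, then sample from what is left. The logits, the allow-list and `sample_allowed` are all made up here. How you would decide which of the 10.000 tokens count as speech-like is the open part.

```python
import math
import random


def sample_allowed(logits, allowed_ids, rng):
    """Sample a token id after masking every id that is not in `allowed_ids`."""
    masked = [logit if i in allowed_ids else -math.inf for i, logit in enumerate(logits)]
    peak = max(masked)
    if peak == -math.inf:
        raise ValueError("no allowed token ids in range")
    # exp(-inf) is 0.0, so masked tokens get zero weight.
    weights = [math.exp(logit - peak) for logit in masked]
    return rng.choices(range(len(logits)), weights=weights, k=1)[0]


rng = random.Random(0)
logits = [2.0, 0.5, 1.0, 3.0, -1.0]
allowed = {0, 2, 4}  # hypothetical "speech-like" token ids
draws = [sample_allowed(logits, allowed, rng) for _ in range(200)]
print(sorted(set(draws)))
```

Token 3 has the highest logit but never shows up in the draws, because it is not on the allow-list.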

umbral solar
#

i think it would also be interesting to try to convert audio to tokens

#

you could also try to find words that sound good and try to stick them together by hand

undone glade
#

I think there is an instrument like that

hollow citrus
#

those who use the cli tool, what temperature thingies for text temp and waveform temp do you guys use?

undone glade
#

I use .7 for text and .6 for waveform
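For anyone wondering what those temperature numbers do in general: the logits are divided by the temperature before the softmax, so lower values concentrate probability on the top choice. A toy illustration with made-up logits; this shows the standard technique and is not taken from Bark's code.

```python
import math


def softmax_with_temperature(logits, temp):
    """Turn logits into probabilities after dividing them by `temp`."""
    scaled = [logit / temp for logit in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


logits = [2.0, 1.0, 0.0]
warm = softmax_with_temperature(logits, 0.7)
cool = softmax_with_temperature(logits, 0.2)
print([round(p, 3) for p in warm])
print([round(p, 3) for p in cool])
```

At 0.2 almost all of the probability sits on the first token, which fits the observation that lower temps make the voice more consistent and less varied.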

#

I ran into an issue with pre stitching the semantics, if the text is too long it eats all the vram generating the audio

#

Need to find the limit and split the semantic array before concatenating
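Something along these lines would do the split: cut the long token array into bounded pieces, render each piece on its own, and join the audio afterwards. `render(piece)` is a hypothetical stand-in for the semantic-to-waveform step, and the right `max_len` is exactly the limit that still has to be found by experiment.

```python
def split_tokens(tokens, max_len):
    """Split a long token sequence into consecutive pieces of at most `max_len` items."""
    if max_len <= 0:
        raise ValueError("max_len must be positive")
    return [tokens[i:i + max_len] for i in range(0, len(tokens), max_len)]


def render_in_pieces(tokens, max_len, render):
    """Render each piece separately so no single call sees the full array, then join."""
    audio = []
    for piece in split_tokens(tokens, max_len):
        audio.extend(render(piece))
    return audio


pieces = split_tokens(list(range(10)), 4)
audio = render_in_pieces(list(range(10)), 4, lambda piece: [t * 2 for t in piece])
print(pieces)
```

Hard cuts at fixed positions can land in the middle of a word, so splitting at quiet spots would probably stitch more cleanly.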

undone glade
#

Very nice

umbral solar
#

this 𓀀 𓀁 𓀂 𓀃 𓀄 𓀅 𓀆 𓀇 𓀈 𓀉 𓀊 𓀋 𓀌 𓀍 𓀎 𓀏 𓀐 𓀑 𓀒 𓀓 𓀔 𓀕 𓀖 𓀗 𓀘 𓀙 𓀚 𓀛 𓀜 𓀝 𓀞 𓀟 𓀠 𓀡 𓀢 𓀣 𓀤 𓀥 is this

#

how

undone glade
#

weird

umbral solar
#

i noticed that when it's quiet for longer, the chance of the voice changing is higher

undone glade
#

I'm running a really long test right now, hopefully it goes well

#

I forgot to concatenate it, I got 14 seconds of audio for a 30 minute generation. Lol

inland coral
#

Just learned about this tool, going to use it for a new vegas mod

umbral solar
#

nice

inland coral
#

from bark import SAMPLE_RATE, generate_audio
from IPython.display import Audio

text_prompt = """
[raspy]NCR taxess? Man! [clears throat] I say screw the NCR!
Westside Radio baby, let freedom Ring!
"""
audio_array = generate_audio(text_prompt)
Audio(audio_array, rate=SAMPLE_RATE)

#

this one is hilarious

undone glade
inland coral
undone glade
#

That's really good

inland coral
chrome tapir
inland coral
chrome tapir
#

as opposed to the usual LOUD AF

chrome tapir
# inland coral

new vegas pretty good game? i always hear people talk it up

#

what was i playin those days i wonder. maybe just cause 2, l4d2 prob

inland coral
#

its my favorite game of all time

chrome tapir
#

any game that you can mod instantly makes it twice as good

inland coral
#

yeah im gonna make a radio mod using these voice samples to massively increase immersion

keen fable
#

have the macOS issues been fixed

#

specifically running it with mps rather than cuda

fallen rapids
#

Stable Diffusion + Suno for storytelling. Can't wait for text-to-video to mature as well.

gray portal
chrome tapir
#

put it with your other voice files

wild crater
#

colab*

#

lmao google clown

mellow dagger
blissful pulsar
#

How do you force a female voice? It doesn't work all the time with "WOMAN :"

inland coral
chrome tapir
silver valve
south seal
#

I mean, the emotions are not all there, but trying quotes from Her (2013) is funny

worldly tree
#

see its working fine or not

#

and can we clone voice in turkish language?

south seal
worldly tree
south seal
#

You can play around with the Google Colaboratory page

#

You could try doing the turkish voice there

worldly tree
#

it's not the right place to share because it's the suno channel

#

sorry for that, im looking for turkish support program

worldly tree
south seal
#

I think if you paste in Turkish text it should work alright? But you might have to use the history prompt for a turkish speaker i.e. here

#

Again I'm not really sure I only just started using this model today

undone glade
#

This was my attempt with turkish using the history_prompt "tr_speaker_1", I have no idea how accurate it is since I don't speak turkish

violet narwhal
violet narwhal
prime night
rugged nymph
# worldly tree good luck, and please try custom turkish voice for me if its working good, I'll ...

hi, using this you can generate from the cli by adding --history_prompt "en_speaker_1" (there are 9 of them). it also saves every voice that gets generated without a history prompt, so you can call those up too
https://github.com/JonathanFly/bark

GitHub - JonathanFly/bark: 🚀 BARK INFINITY 🎶 Power Up The Bark Text-prompted Generative Audio Model

violet narwhal
chrome tapir
#

not giving up on that friday banger haha

#

ill try too since its almost friday

chrome tapir
#

FRIDAY's child is full of woe, but I know how the story goes, break the chain, I'll break the mold, FRIDAY's child has a heart of gold, yeah, a heart of gold

#

it made all 3 of those in a row

#

very cool

#

oh just got a nice beat too

#

(dance beat) Thank God it's FRIDAY night, and I just-just-just-just-juuuuuuust got paid, money, money, money, money, yeah, just got paid, FRIDAY night, party hoppin', feelin' right, booties shakin', all around

simple bison
chrome tapir
#

wow theres like 3 or 4 beats in there

#

thats a good prompt im gonna steal it

simple bison
#

oops i had an extra music note in there

chrome tapir
#

the AI knows what to do with it

simple bison
chrome tapir
#

yeah i give them credit for trying

simple bison
#

[singing fast] ♪ [dance beat] we going to walmart. we going to walmart. we going to wally wally wally wally wally wally world wally wally wally wally wally wally world. basket basket basket basket [singing fast] ♪ [dance beat]

#

it improvised some

chrome tapir
#

ok switching to the musical note npz lets see if that works

simple bison
chrome tapir
#

ai remix

#

gonna try the devils advocate

#

and dark knight

simple bison
chrome tapir
#

this is quality content

#

that laff is wild

#

i had a drunk karaoke guy yesterday

simple bison
#

i did something along the lines of
" [farts] farts [farts] farts [farts] farts [farts] farts [farts] farts [farts] farts [farts] farts" and it made something that sounds like a phone ringtone

chrome tapir
simple bison
#

"[laughter] [laughs] [sighs] [music] ♪ [gasps] [clears throat]— or ... for hesitations, capitalization for emphasis of a word"
unrelated output

chrome tapir
simple bison
#

[laughter] [laughs] [sighs] [music] ♪ [gasps] [clears throat]— or ... for hesitations, capitalization for emphasis of a word
more chaos

chrome tapir
#

chatgpt has some ideas

simple bison
#

outputs like a misconfigured Vicuna model

chrome tapir
#

i havent messed with vicuna only llama

#

alpaca

simple bison
#

try the prompt i'm using with Suno Bark TTS
[laughter] [laughs] [sighs] [music] ♪ [gasps] [clears throat]— or ... for hesitations, capitalization for emphasis of a word

#

that prompt,
this output:
"the the building of the book... the building of the book... the building of the book... and the building of the book and all of the greece sitting. the victim of the mosely."