#🎵|stable-audio


south dawn
#

Introducing Stable Audio: Stability.AI’s first AI product for music and sound generation! 🎧

“Ready for something new? Welcome in Stable Audio!”

Stable Audio is a first-of-its-kind product that uses the latest generative AI techniques to deliver faster, higher-quality music via an easy-to-use web interface. Stability AI offers both a basic free version of Stable Audio, which can be used to generate and download tracks of up to 20 seconds, as well as a ‘Pro’ subscription, which delivers 90-second tracks that are downloadable for commercial projects.

Stable Audio is the first music generation product enabling the creation of high-quality, 44.1 kHz music for commercial use via latent diffusion. The latent diffusion model is conditioned on text metadata as well as audio file duration and start time, allowing for control over the content and length of the generated audio.
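To make the timing conditioning and the 44.1 kHz figure concrete, here is a minimal sketch. The key names `seconds_start` and `seconds_total` and both helper functions are illustrative assumptions, not the product's actual API; the sketch only shows the kind of values a model conditioned on start time and file duration would receive, and how many samples a clip of a given length needs.

```python
SAMPLE_RATE_HZ = 44_100  # output sample rate stated in the announcement


def timing_conditioning(file_seconds: float, start_seconds: float) -> dict:
    """Timing values for a chunk cut from a longer file (hypothetical key names)."""
    if not 0 <= start_seconds < file_seconds:
        raise ValueError("start must fall inside the file")
    return {"seconds_start": start_seconds, "seconds_total": file_seconds}


def samples_for(duration_seconds: float, sample_rate_hz: int = SAMPLE_RATE_HZ) -> int:
    """Number of samples per channel needed for a clip of the given length."""
    return round(duration_seconds * sample_rate_hz)


print(timing_conditioning(file_seconds=80.0, start_seconds=14.0))
print(samples_for(20))  # free-tier clip length: 882000 samples per channel
print(samples_for(90))  # Pro-tier clip length: 3969000 samples per channel
```

At generation time, asking for a shorter total duration is what would let the model produce a shorter piece rather than a truncated one.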

We welcome you to give Stable Audio a whirl for yourself, and provide us your feedback! We want to hear from YOU!

GET STARTED:

http://stableaudio.com/
#🎵|stable-audio: For discussion, feedback, prompt-sharing, and outputs surrounding this release!

Make original music and sound effects using artificial intelligence, whether you’re a beginner or a pro.

prisma reef
#

in addition to music it can also generate sound effects! The user guide has examples of both music and sound effects generated if you scroll down https://stableaudio.com/user-guide

wind hedge
#

Best audio generation so far. While I appreciate jukebox's willingness to get a bit chaotic and unpredictable, just spent the last hour generating useable stems with stable audio

#

My god this is amazing

daring hamlet
#

We're thrilled you like it 🙂

#

Please feel free to share your outputs and any great prompts you discover - we’re really curious to see what works well. We've only just started to scratch the surface of how best to prompt the model.

cloud geode
#

I just gave it a whirl! while I'm happy with the improvements in Audio quality the composition of the pieces is... well pretty bland usually.

I've written music for decades though so maybe my standards are higher than most !

As a musician the holy Grail for me would be being able to influence the generation by providing a .midi file or a reference audio file to guide the structure of the piece.. sort of like how controlnet works in stable diffusion? I'm sure that's not applicable in terms of coding but you get the idea.

It's really difficult to thoroughly describe the structure of a piece of music (impossible?) Through natural language so I imagine that will be a challenge as well. All In all pretty cool !

shut saddle
#

Fauno is going to be hosting an office hours this week discussing the project so if that sounds interesting, join the Harmonai discord server so you can listen in.

cloud geode
#

I'd love to if I have the time I'll keep an eye out

compact condor
#

I have a question regarding copyright and selling AI-generated music. I read https://www.stableaudio.com/terms and, just to be clear, is it legal to sell generated music to services like Spotify, Apple Music, etc.?
Of course Professional account is the first step.

prisma reef
#

that gets a checkmark for a "Professional" tier account

#

(I'm not a lawyer though, if you're concerned about the precise word of the law, especially if money's involved, contact a professional lawyer to be safe!)

#

(also bear in mind you are bound to the terms of the relevant platforms you're publishing to, not just stableaudio's terms)

#

(also you probably don't want to just generate on stableaudio and then go sell it. It's a tool for musicians not a one-stop-shop!)

#

(also copyright law around AI in general is a slightly weird topic so doubly contact a lawyer to be safe before going through with plans to sell stuff)

compact condor
#

Ok, thank you for your answer 🙂

warped vine
#

That might be enough to dissuade 90% of us from trying to make money, haha. Lawyers aren't cheap!

compact condor
prisma reef
#

I mean the answer is "yes" and literally says so on the homepage in the screenshot there but i'm not a lawyer and am too scared to just say "yes" if the topic is anywhere close to legal issues

small harbor
small harbor
#

As the previous prompt had high frequencies, but it was just soloed drums, let's add some bass: "Neurofunk drums, sub bass, growl bass, 174bpm, crisp sound, professionally mastered, high quality". OK, that is just silly, I'll give up. Conclusion: sound quality is not much better than in Meta's Audiocraft that I run locally. Note that I have used Harmonai's Dance Diffusion a ton to train my own audio sample models, and I have used all the other AI audio generators a lot.

sly fjord
#

just gone for pro but my profile isn't updating and generations stopped working... also it won't play my last generation... any help?

small harbor
warped vine
unborn gate
#

Is site up and running?? I am getting loader on generate page.. I can Input prompt.

sly fjord
#

not running for me

#

must be getting a lot of requests

oak umbra
#

Musk

unborn gate
#

Yeah its loading on generate page 😢

sly fjord
unborn gate
#

Well that's sad.. Let's hope it up soon..

small harbor
#

It took like 5 min to re-load the page. Just wait when the AI youtubers make a video about this, Then the site is gonna really crash.

unborn gate
small harbor
#

Yeah, the AI youtubers have problems if the site does not work.

bleak locust
#

Hey. There is a lot of traffic on the site currently. We are looking into fixing this asap.

unborn gate
sly fjord
#

so far, i like it for sound effects, but music has been disappointing in first few tries (admittedly i did not go easy but when i tried something "techno".... it was a hard pass)

#

what I don't understand though... is Stable Audio different from what Harmonai is doing with Dance Diffusion?

small harbor
#

Sound FX also have bad artifacts on the sounds. Harmonai is also Stability AI company like Stable Audio. Probably same people.

sly fjord
#

the SFX were cleaner than the "music" i was getting 🙂

#

I have been using Suno for a couple days now... it's better, altho also imperfect of course

#

(for music)

small harbor
#

It depends on the prompt what it generates. Music is from crappy sound library that is not very professional sounding.

sly fjord
#

I asked what they use under the hood but got no reply

small harbor
#

They use Dance Diffusion code

sly fjord
#

it sounds a lot different from Stable Audio

#

different artifacts

#

and way more coherent compositions

#

but again, i only tried like 10 prompts so far

#

and 200 in Suno 🙂

small harbor
#

They used DD to train the model. I'm waiting for them to release the updated DD code, as it's really old. DD can generate high-quality sounds without any artifacts, but it's for single shots. As soon as it's a loop or a longer sound, the quality goes down.

sly fjord
#

I played around with the Colab before, been waiting for a nice interface

small harbor
#

I run DD locally with 2 different GUIs.

sly fjord
#

interesting

#

don't have the gpu for it personally

#

cloud computing's so convenient

small harbor
#

I have now tested Stable Audio against Meta's Audiocraft and so far Audiocraft is winning. SA has its moments, but it's hard to get usable results. With Audiocraft Plus I can daisy-chain up to ten prompts and generate 5 minutes of audio.

sly fjord
#

oo I haven't checked the Meta one

#

is it public?

small harbor
#

Yes, there are public demos and Colabs for it. I run it locally

sly fjord
#

shall investigate thanks 🙂

small harbor
#

This explains a lot: "Because part of the dataset contains MIDI instrumentation, some generations can sound like MIDI instruments."

sly fjord
#

seen yeah thx 🙂 👍

hollow dune
#

how to use stableaudio?

sly fjord
hollow dune
sly fjord
hollow dune
cloud coral
#

someone else can't generate?
site is not loading, cleared cache already

modern charm
#

give it a time, probably getting swarmed right now and they are working on scaling it further

charred tendon
#

I thought maybe if I subbed to Pro it would maybe allow access to Generate as like a fast track to use the service....nope. lol

shut saddle
#

This was quite a surprise to wake up to!

scenic marten
#

Anyone tried to generate melbourne bounce music? Any good hits?

celest magnet
#

Is it gonna be opensource?

modern charm
#
We are building the open dataset version now that we will release along with the trainer etc per the research blog.

Also continuous improvements to the platform and features so you can fine tune your own models, create music videos and much more (done when they are done).
charred tendon
celest magnet
#

Is the site broken it won't load the generate page

charred tendon
celest magnet
#

It sounded pretty great from the demo

hybrid star
#

AI Music hmm

charred tendon
iron owl
scenic marten
charred tendon
#

Ok update. After maybe 20 mins I think of loading the page, I saw what the Generate page looks like. Just wanted to test it to see what would happen if I reloaded the page...well, its loading again. lol So yeah, its probably just bogged down or being fixed at the moment.

eternal remnant
#

Guess I'll wait for a bit

#

🙂

charred tendon
# unborn gate Happening for last 2-3 hours.

Oh. Well hopefully they get it ironed out. I can't even get the site to view me as a Pro member, even when I cleared the cache and logged back in with my email account that I used to pay for the sub. So yeah, they clearly have a few things to iron out till the site is working as it should. It shouldn't be much longer I feel

river latch
shut saddle
#

Pretty safe to say that you'll be lucky to get a gen or two today.

odd geyser
#

got kicked while it was generating 😭

unborn gate
odd geyser
#

D1 Error

shut saddle
#

Yeah I'm guessing they are overloaded.

unborn gate
unborn gate
shut saddle
#

Yeah I was hoping to try it out because I'm gonna get a lot of questions in the main Harmonai discord.

harsh hatch
#

How long does it take to generate one audio?

odd geyser
#

I just want Haydn trumpet concerto 4th movement

shut saddle
odd geyser
#

guys can you all stop generating for 5 mins so I can get a 45sec audio tysm

#

wtf it uses up the attempts even when it fails?

shut saddle
#

I got a "cant generate more than 1 track at a time" so I'm guessing yours will pop up once the server finally gets to yours.

odd geyser
#

I got another error

#

when I press download this happens

#

I'm on mobile, is that the issue?

shut saddle
#

🤷‍♂️

torn zinc
#

Is the model downloadable?

charred tendon
torn zinc
shut saddle
#

Yeah they haven't released a model yet.

merry raft
#

the duration limit for pro is way too short to be viable imho

#

instead of 90 seconds, it should be 240

#

also choosing mood and instrument is nice, but what about bar and chords structure ?

crimson turtle
#

Looking forward to it working:
Never got my first generation but a credit was deducted.
Waited an hour, submitted a new prompt but it said I can only have one prompt at a time -- i.e. the prompt that hung up is blocking further access.
Just got this message now, so contacting support:

foggy elbow
#

they should throw in an extra 100 credits for us who bought it and cant use it

fresh falcon
#

hi

crimson turtle
#

@foggy elbow based on how DreamStudio was operated, I suspect they will be happy to be generous in resolving issues of lost credits.

wind hedge
#

Would upgrading to Pro circumvent the traffic issues?

scenic marten
#

no

wind hedge
#

😭

scenic marten
#

I recently upgraded too and it didn't help at all

wind hedge
#

That's a deal breaker for me then

#

I literally saw the ping at like 4 am and then used it immediately, so that's why I was able to generate most likely

scenic marten
#

try again tomorrow 😭

prisma reef
#

I don't think the team anticipated quite how much attention this launch would get lol

fresh falcon
#

should’ve released it quietly

shut saddle
#

It's a Stability-based audio project. I figured it was gonna get some attention.

#

I think there's going to be a few people thinking this means there's an open source model available and that's not the case.

iron owl
shut saddle
#

Soon is the best answer I can give. I don't really speak for Harmonai or the team. I'm just a discord janny.

tame canyon
#

I think SAI is underestimating just how popular music generation can be, I wager it will surpass txt2img in the future

iron owl
#

Huh. I remember setting up their dance diffusion stuff, was neat but far too resource intensive for anything local, and the architecture didn’t really allow for full songs to be generated, only snippets, and it was more like random loras as opposed to a general text to audio model

shut saddle
#

Fauno has made a lot of progress since then but hasn't released the code to the public.

#

The officially released stuff is close to a year old at this point.

iron owl
#

Quite unfortunate, I honestly believe that the more locked down some ml project is, the less innovation will be in the field it is in

tight anchor
#

We're working hard on the open source code and models 🙂

#

The library we'll be releasing is the same library we used to train Stable Audio, and will also support our newer versions of Dance Diffusion

odd geyser
#

site is completely down for me 😦

tight anchor
#

Yeah, it got hugged to death with all the traffic, we're working on getting it back up and running smoothly

odd geyser
shut saddle
#

It means we love it already 🙂

tight anchor
#

I can't wait for y'all to get in there and try stuff out!

merry raft
#

you can always try the Meta alternative, it's pretty much the same

tight anchor
#

We've also got thumbs up/thumbs down buttons on the site, which we'll be using to improve the model, so if something sounds good, smash that thumbs up button!

#

And if it's bad, smash that thumbs down button

sick cipher
#

@pliant chasm @elfin jackal @neon silo
Someone from stability DM me. Want to share a security issue.

charred tendon
shut saddle
#

If we have queued an audio clip will it eventually generate or will it timeout?

#

The one I queued seems to be persisting between refreshes so I'm assuming it will eventually generate.

charred tendon
#

And its been at least an hour

shut saddle
#

Same.

odd geyser
shut saddle
#

Personally, I'm banking on the assumption that an open source model will be so much better once it releases.

trail hazel
#

Gonna have to learn to prompt now lol

shut saddle
#

The model is probably not going to recognize copyrighted music. Just saying.

wind hedge
#

So I can't generate things in the style of an artist?

shut saddle
#

Not likely. There's pretty strict restrictions on music and how it can be used.

trail hazel
trail hazel
#

Also, seems like it only knows 4/4.

spring girder
#

it's actually running on some poor 3070 in someone's basement so the outages make it look more popular

scenic marten
#

I have a 3090 for lease if anyone wants it habby

charred tendon
prisma reef
#

that's not real

proud dragon
#

Hello class

shut saddle
#

My audio track is stuck and isn't generating. Has anyone had success or is my account bugged now?

river latch
#

Though I'm sure that's just what happens when someone like me ignores the warning label telling us to come back tomorrow, lol.

shut saddle
#

I didn't have the warning label at the time heh.

inland wasp
#

How to report a bug? My generation is stuck since half an hour ago and I can't submit new prompts ☹️

shut saddle
#

You aren't the only one. They haven't mentioned a fix yet.

tacit jackal
#

i hope there's a way to regain the credits we lost on requests that timed out. congrax everyone at Stable, this seems amazing 🙂

shut saddle
#

Those credits are 20 per day so even if you miss out don't worry.

tacit jackal
shut saddle
#

Oh my mistake. Month.

tacit jackal
#

still pretty generous for free plan

shut saddle
#

Yeah hopefully you'll get reimbursed.

tropic cape
#

Is there a Stable Audio API?

daring hamlet
tropic cape
river latch
reef mica
#

What happened to "open-source"????

shut saddle
#

They haven't released the audio model yet. Relax. Let them test it out for now.

modern imp
#

I gave a prompt for a 7 beat per measure bass, but it gave me 4/4. It sounded pretty cool, but not what I had wanted.

stark pawn
#

For something called "Stable Audio" the website certainly is not stable at all lmao

#

Not bashing it though, I'm really looking forward to using it 👀

copper glen
#

Subbed, but doesn't seem to be reflected on my account. How long should it take?

copper glen
#

Already relogged and refreshed and whatever, still says "Upgrade to Pro" and clicking that says I already have a sub

plain knoll
#

"hardcore" which is a genre of music gets removed from prompts :/

winter halo
#

I'm documenting the limits of Stable Audio.
I downloaded one so far, I'm waiting for others as it is overloaded.
Already it seems it is far, far, very far behind MusicLM by Google Research. Note MusicLM fell apart finally as 3 months hit and now repeats itself, thankfully I got in all but my sadly hardest test early enough.

You can see those early-only tests, and how far ahead MusicLM is by listening to as many of my top picks below as you can, or at least the rin kagamine ones, lost lava world ones, and large diamonds in advanced starship ones.
https://www.reddit.com/r/singularity/comments/13h0zyy/i_really_crank_out_music_tracks_with_musiclm_this/

Also, just below is my first test with Stable Audio so far:

#

@south dawn

#

MusicLM released in 2023 January also. I'm just showing what exactly yous are up against.

wary sky
#

In my opinion Suno Chirp is still better in general, but Stable Audio has better structure because it's trained on complete songs, and it can make longer songs. This is an example from Chirp. And yes, Chirp can also do lyrics. But the fundamental structures of the two AIs are different: the Chirp model works similarly to a language (GPT) model, while Stable Audio works more like Stable Diffusion (diffusion).

delicate stirrup
#

hello there! is there a way to install stable audio locally?

tidal lava
tidal lava
charred tendon
shut saddle
copper glen
heavy arch
wind hedge
#

damn, even 2 am and can't access the site

shut saddle
winter halo
#

Outputs are choppy (I think, I might be wrong).
Does Pro Plan make audio better????

#

HOW BETTER 🙂

plain cloak
#

Trance, Ibiza, Beach, Sun, 4 AM, Progressive, Synthesizer, 909, Dramatic Chords


winter halo
winter halo
copper glen
#

Site seems to be working right now NepCurious

#

banger

sly fjord
#

not working for me still but at least my credits have been added

tidal lava
scenic marten
copper glen
#

It's generating immediately for me now

sly fjord
#

ok now it generates, but the sound player doesn't work... at least the download works so yeah, getting there

copper glen
#

Ok now it's broken for me

#

We takin turns?

copper glen
#

spooky

sly fjord
#

the Monolith is near, i can sense it

tidal lava
knotty quiver
#

It's a real mess. It's like 10 tracks playing at the same time.

deep thistle
#

after 5 attempts and a long wait... 2 were successful, the sound and the song in general are very bad, clearly this model will need more work!

urban venture
#

Can the audios be rebroadcast with editing?

elfin zenith
#

how long does it usually take to generate?

elfin zenith
cedar narwhal
#

As a 29-year jazz arranger and music teacher, I cannot WAIT to make tons of new music. Looks overloaded right now, testing with Bach inventions to see how it gets counterpoint, and will then test Mozart and Debussy to see how close to their music it can get. My whole education was studying advanced music theory and getting kids scholarships to college, this looks beyond awesome and right up my alley for being able to use at top level.

placid pewter
placid pewter
cedar narwhal
deep thistle
unkempt steeple
little bone
#

When will it be open source

shut saddle
#

There's going to be an office hours today you could ask that there.

sharp viper
compact condor
#

Audio prompt was: drum and bass, fast, energetic, melodic, 160 bpm

exotic raven
tight anchor
next gyro
tight anchor
#

Occasionally the model moves away from its 4/4 bias

tidal lava
tight anchor
next gyro
tidal lava
#

How long does it take for you to generate?

copper glen
#

A minute or so

#

Sometimes needs a page refresh

sharp viper
tidal lava
tidal lava
shut saddle
#

Harmonai office hours is starting if you guys wanna join us.

copper glen
#

Where's that?

shut saddle
#

It's the Harmonai discord server.

mighty dome
#

I read the rules before posting this to see if there was anything about putting this link here, and I didn't see anything, but please delete it if it's against the rules.

Stable Audio seems to be working a little better. I'm streaming stableaudio generations. I'm not monitoring the chat. The prompts are in the file names, and were themselves generated by Claude 2 --

https://www.twitch.tv/endalarius

plain cloak
#

Guitar, Russian romance

shut saddle
exotic raven
crimson turtle
covert patrol
copper glen
#

pro sub still not processed on the site 😦

tender otter
#

any news on when the audio model is released?

wind hedge
#

I wonder how much vram it would need

tight anchor
#

We're still working on the open-source models, hoping to get something out in the next month or so

#

Basically "when it sounds good"

tender otter
#

will the dataset be released?

wind hedge
#

Just checked and I got 20 free generations back I guess lol

tender otter
shut saddle
#

AudioSparx owns the initial audio dataset for the stable audio model, so that's not what the open source one is going to be trained on as far as I remember.

tight anchor
#

Correct, the AudioSparx model will not be made public, we’re training open-source models on different datasets.

neat python
#

soooo I've been waiting 4-5 hours for mine to load

#

free version

#

abnormal, yeah?

tight anchor
#

Have you refreshed?

neat python
#

should i?

tight anchor
#

Yes

neat python
#

looks like just a loading screen now

#

oh!

#

i see there is a play button n such now. but taking a bit to start playing? on the site. maybe il try downloading

#

ohp. looks like takes a while too

#

this sounds horrible

#

yikes!

tight anchor
#

It's certainly hit and miss, make sure to hit the thumbs down on the ones that are just awful

neat python
#

will do

keen hare
#

I got a "Sorry, you have been blocked
You are unable to access stableaudio.com" message when I first tried to enter the website

wind hedge
#

Lol generated clips started disappearing, some doubling, and now I have another 20 credits again lol

#

Anyways

#

This put a massive smile on my face

#

That face when the AI drummer has a cymbal stack

civic rampart
river latch
#

Ok, call me impressed... And I'm hard to impress when it comes to AI audio generation. SF gets the most important thing right when it comes to generating audio, and that's following your prompt--quality and cohesion comes after that. I look forward to what y'all can do going forward. Great stuff.

Prompt: dark, cinematic, synth drums, slow build, ethereal pad, epic chord progression

charred tendon
acoustic ridge
#

has anyone tried if it can do black metal yet? 😄

charred tendon
neat python
north dagger
#

Hello

#

When can we download the open source version????

unborn acorn
queen gust
#

I can't access the Pro tier features. The site knows that I have paid (in the billing section) but doesn't allow me to generate 90 sec tracks. If this was a zombie horde like in Days Gone, the Stable Audio team would be devoured to the point of no revival or restart. GAME OVER!

novel frigate
#

Are there any plans to make the creation of sound effects and short stuff <10 secs cheaper than maybe the full 90 seconds?

tired halo
#

any recommendation for TTS text to speech webui ?

#

I want to try out some of the facebook tts releases.

harsh hatch
#

Something happened. Anyone who wants can rate it

acoustic ridge
#

it can't do metal

rough crest
#

so, when does stableaudio come with their own infinite radio channel?

queen gust
#

I don't know about you but I'm getting sick of renaming the default downloaded filename. Your prompt = your downloaded file's name. My " idea ": Your prompt = downloaded filename* . * represents numbers 0 to 499. The site must never give you duplicate files.
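The naming idea above could look something like the sketch below. It is not how the site names downloads; `slugify`, `next_filename`, and the `.mp3` default are hypothetical, and the 0 to 499 range is taken from the message. The point is just that a prompt-derived base plus the first free number never yields a duplicate filename.

```python
import re

MAX_INDEX = 499  # upper bound suggested in the message above


def slugify(prompt: str, max_len: int = 60) -> str:
    """Turn a prompt into a filesystem-friendly base name."""
    slug = re.sub(r"[^a-z0-9]+", "-", prompt.lower()).strip("-")
    return slug[:max_len].rstrip("-") or "untitled"


def next_filename(prompt: str, taken: set[str], ext: str = "mp3") -> str:
    """Return the first unused '<prompt>-NNN.<ext>' name and record it as taken."""
    base = slugify(prompt)
    for i in range(MAX_INDEX + 1):
        name = f"{base}-{i:03d}.{ext}"
        if name not in taken:
            taken.add(name)
            return name
    raise RuntimeError("all 500 numbered names are used for this prompt")


downloads: set[str] = set()
print(next_filename("Drum and Bass, 160 bpm", downloads))
print(next_filename("Drum and Bass, 160 bpm", downloads))
```

Generating the same prompt twice gives two different names, so nothing has to be renamed by hand.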

stoic jolt
#

How do you run Stable Audio locally or in the cloud? Is there a guide?

shut saddle
# acoustic ridge

No one is more disappointed in this than dadabots. Believe me. 🙂

untold oasis
#

output is clipping hard for some reason in tracks it generates for me

charred tendon
river latch
charred tendon
# civic rampart generation is fast today

I wish I can generate stuff. lol My account is still not connecting to the Stripe site where it shows I paid. Hopefully it gets resolved. But yeah, the Generate section of the site def loads quicker today than yesterday, so thats good at least

true magnet
#

Hey, is there anyone from the Stability team available in here? I have some technical questions about the new Audio generating tool.

#

@tight anchor you're on the team?

copper glen
tight anchor
charred tendon
# copper glen Contact support, they manually adjusted limits on my account. By any chance, use...

Contacted them already, and told them the issue. Still no resolution. I originally went through Google login, but then I manually created an account, and still there are issues. Both accounts, once I try to "Upgrade to Pro", once it gets to the Stripe site, it says I already have an account So yeah, the account part, is not "seeing" the Stripe payment section for me. And I provided proof via screenshots to the person who did reply

true magnet
charred tendon
copper glen
#

It's still showing upgrade, they just manually adjusted the limit

#

Yeah

tight anchor
charred tendon
copper glen
#

90 sec but mp3 only lol

charred tendon
#

Ah ok

copper glen
#

and gen is stuck again 🥲

charred tendon
copper glen
#

yeah, seems to happen randomly and then it's just sitting there lost for half an hour

#

generating on gpu /dev/null

shut saddle
copper glen
#

bit janky at the second half but great start

#

prompted gpt to give me an example of an average track description on audiosparx NepSmug

#

surprised it actually made something clean out of that gigantic copy paste

lunar drum
#

The theramin just isn't. Prompt: theramin solo, slow, mournful

#

Not sure what that is.

river latch
#

That's a *very * interesting sounding piano.

lunar drum
frosty wharf
copper glen
#

dropping an audio prompt into img gen gets you instant cd cover

frosty wharf
wind hedge
#

My generation quality has noticeably gotten worse since day one

#

As in audio quality

copper glen
#

Do shorter tracks sound better?

shut saddle
wind hedge
#

Used same prompt and was different from yesterday

copper glen
#

pray for blessed rng

wind hedge
#

Yep

shut saddle
#

Believe me, this is better text conditioning than it used to be.

#

It's still finicky but it's nowhere how it used to be.

wind hedge
#

Instead of "masterpiece" and "high quality" we are now gonna have "mastered" and "EQed"

copper glen
#

and microphone brands instead of camera models NepWink

#

recorded on fisher price casette deck

frosty wharf
frosty wharf
#

F key, noice

copper glen
#

whimsical in any prompt really adds a couple bottles of wine

next gyro
tight anchor
#

not the most melodic, but definitely more theremin

tight anchor
radiant stump
#

Is there a way to make sounds SFX-specific? I keep getting "music"; typewriter came out like a weird guitar

fallow sage
small harbor
# fallow sage that slaps

If this were normal-quality audio, it would actually be good. Stable Audio's audio quality is quite low for a paid service. Maybe at some point there will be a Pro-PRO version of it.

timber garden
#

Can’t download the file though I subscribe

iron yew
#

/fantastic ohm chanting

#

fantastic ohm

#

ohm chant

charred tendon
#

Just adding this here for anyone looking to have a base to build from when using Stable Audio. I asked this of Bing Chat and plugged some of them in myself, and they sound great:

#
  1. E Major chord with a Standard 8th note groove layered with a Bass Guitar: This combination is common in rock and pop music.
  2. A Major chord with a Four to the floor beat layered with a Synthesizer: This combination is often used in dance and electronic music.
  3. C Major chord with a Shuffle groove layered with a Harmonica: This combination is great for blues and jazz.
  4. G Major chord with a 16th note groove layered with an Electric Guitar: This combination is often used in funk and R&B.
  5. D Major chord with a 12/8 groove layered with an Organ: This combination is common in blues and soul music.
  6. A minor chord with a Motown groove layered with a Horn Section: This combination is characteristic of many classic Motown songs.
  7. E minor chord with a Reggae groove layered with a Melodica: This combination is perfect for reggae music.
  8. D minor chord with a Disco groove layered with a String Section: This combination is great for disco and dance music.
#

And like I said, just add things to these well known beats if you will, for those respective genres

#

For anyone looking for Metal suggestions, here are some starters:

  1. E minor chord with a Double Bass Drumming layered with a Distorted Electric Guitar: This combination is common in traditional heavy metal.
  2. D Major chord with a Blast Beat layered with a Growling Vocals: This combination is often used in death metal.
  3. A minor chord with a D-beat layered with a Bass Guitar: This combination is great for punk-influenced metal genres like thrash metal.
  4. C Major chord with a Breakdown layered with a Lead Guitar Solo: This combination is often used in metalcore.
  5. G Major chord with a Gallop Rhythm layered with a Harmonized Guitar Riff: This combination is common in genres like power metal and NWOBHM (New Wave of British Heavy Metal).
tight anchor
fluid flume
#

yes, when will the image to audio models be released?

shut saddle
#

Soon tm

copper glen
fluid flume
#

this is insanely impressive. I've been into stable diffusion image rendering for a while now, but I did not check out the audio side of things until today. another insanely addictive time sink for me, lol

real skiff
#

How do I use the music generator?

fluid flume
#

link at the top of the screen

copper glen
#

type out your prompt on your typewriter and fax it to stable audio

#

then your fax machine will play the song

fallow solar
#

strange - I cant hear music via webbrowser (safari, ipad ) but if I download the file - its fine.

nocturne matrix
#

Is the model out or nah?

fluid flume
#

when will stable audio controlnet be available?

small harbor
storm thicket
#

Any news on when an official API will be released?

tropic obsidian
#

If it 's a diffusion model we should be able to denoise an audio file and generate variations, right? Like img2img?
Does anyone know when / if it's being released?
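For context on the img2img analogy: the usual recipe is to partially noise the input and let the model denoise from there, so a low strength stays close to the original and a high strength drifts further. A minimal stdlib sketch of only the noising step is below. It works on raw samples for illustration, whereas a latent diffusion model would do this on latents; `noise_for_variation` is a hypothetical helper and nothing here is confirmed as a Stable Audio feature.

```python
import math
import random


def noise_for_variation(samples: list[float], strength: float, seed: int = 0) -> list[float]:
    """Blend a waveform with Gaussian noise.

    strength 0.0 keeps the input unchanged, 1.0 replaces it with pure noise.
    The square-root weights keep the overall variance roughly constant.
    """
    if not 0.0 <= strength <= 1.0:
        raise ValueError("strength must be between 0 and 1")
    rng = random.Random(seed)  # fixed seed makes a variation reproducible
    keep = math.sqrt(1.0 - strength)
    add = math.sqrt(strength)
    return [keep * x + add * rng.gauss(0.0, 1.0) for x in samples]


clip = [0.0, 0.5, -0.5, 0.25]
print(noise_for_variation(clip, strength=0.3, seed=7))
```

A denoiser would then start from this partially noised signal instead of from pure noise, which is what ties the result back to the input.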

shut saddle
#

That stable audio model is created using the audio library provided by AudioSparx. There is a revenue split that goes towards supporting AudioSparx and the artists that contributed to the training library. So that model isn't going to be released to the public.

The team is training an open source model that is not trained on that data, but on copyright free music. That model is still being developed and will be released later with the Harmonai tool suite that Fauno is developing. He doesn't have a set date when it will be released, just that it will be released when it's ready.

cyan quest
#

guys how do you do that

#

what is the command

west veldt
hard steeple
#

Trip Hop, Cinematic, soundtrack, Massive Attack, Bristol, Drum Kit, Shaker, Double Bass, Dramatic Reverbed Piano, Powerful Choir, Electric Guitars, Cool, Moody, Melancholic, Atmospheric, Dreamy, groovy, introspective, thoughtful, beautiful, Spacious, 90 BPM

I was QA and Audio Engineer during the stable audio project. If anyone needs help with prompting, the guide was based on my notes, and quite a few of the examples and most of the default prompts are my creations. Would be more than happy to help anyone wanting to learn best prompt practices with the model!

north plaza
#

How to prompt for audio/music?

autumn elm
#

Is spectrogram used to represent the output of the model?

small harbor
small harbor
#

Stable Audio Feature request list:
SEED - Very important for tweaking prompts.
Multigeneration 1-4 from one prompt.
Prompt sequence: 0-30s Prompt1, 30-60s Prompt2, 60-90s Prompt3.
audio2audio: Similar to img2img in SD. Drop in audio and get variations from it.
Remove the fade from the end that happens way too often.

copper glen
wide blaze
#

Hi all, can I use the AI audio I generate with a Pro account on my web radio? Are there any claim problems?

steady remnant
#

Can I find a file or a list with used prompts for training the stable audio model?

plush glen
#

hey guys

quasi drum
#

This is like having session musicians do what you tell them without any complaints

wide blaze
# quasi drum

So it seems to be good for webradios. Thanks a lot 🙂

hard steeple
hard steeple
#

90s Rave, Rave Stab, Chord, Sample, Synthesizer, Major Key, Short Sustain, Sharp Attack, Short Release, Nostalgic, Retro

There we go: if you just replace the synthesizer with whatever instrument you want, or adjust this prompt to what you want, this should work. Sounds pretty rave stab to me! The key to prompting this model is to supply it with as much context as you can so it can find what you're trying to do. It often helps if you already know the right musical terminology to describe what you are looking for!

hard steeple
#

There is definitely, for lack of a better term, a prompt-engineering skill gap for this model. In my personal opinion it might be a little too expectant of people knowing and visualising things in the right cohesive musical language. I'm sure this will get better and easier to prompt as we continue improving it.

compact sand
#

hi all. anyone has a promo code to Pro Subscription? thanks

fair delta
#

bro i literally can not wait for the open source model

sly jetty
hard steeple
tight anchor
sly jetty
fair delta
#

whats the difference between stable audio and dance diffusion. is it just different models

sly jetty
tight anchor
tight anchor
shut saddle
#

Because dance diffusion was unconditional, most of its use cases were in audio2audio "style transfer".

#

There were plenty of people who used it for sample generating though.

hard steeple
shut saddle
#

Daft Punk B sides over here.

hard steeple
#

Yeah mayne this sounds sick!

jagged arch
#

hi,my friends

#

Do you all use stable diffusion?

rugged lagoon
rugged lagoon
#

this is so cool. I wanna make something sparse. Can I reference specific songs? I really like music with a fat beat, textured sounds, a little... funky

#

Can you reference specific songs?

rugged lagoon
#

How do you prompt for dynamics, like what if I want the instruments to kinda drop out and there just to be a little beat but then they fade back in

#

I'm thinking like Radiohead's "You and Whose Army", or like a build-up

tight anchor
#

maybe try something like "Breakdown" in your prompt?

rugged lagoon
#

ok. can I reference songs explicitly? apparently so. Can I be like, "Unchained Melody" and "Mr. Postman" had a stillborn baby in C minor?

tight anchor
#

You can try, but it's not trained on popular music like that so it might not know the songs.

rugged lagoon
#

ah. are you sure? the best result I got I explicitly referenced a song. But you know how it is, noise is usually good within a limit

#

not two songs though

tight anchor
#

Some people in the dataset may say their song sounds like another song, so it could get picked up from there, or be a coincidence

rugged lagoon
#

this stuff is too cool

tight anchor
#

Glad you like it 😄

rugged lagoon
#

haha I'm the king of finding patterns where there are none, it's my crummy superpower

#

I also recommend doing things like "revolver-era beatles"

#

like you know, bands change, etc.

tight anchor
#

Subgenre: Irish Drinking Polka|Polka drinking song with acoustic Irish instrumentation, groovy, Dancy, Full Mix, Grade: Featured, Grade: Featured, oompah, steady beat, 120 BPM, clapping

rugged lagoon
#

@tight anchor are you the architect?

tight anchor
#

Yeah, I trained the model

rugged lagoon
#

Would it understand something like waltz-tempo? I like my stuff pretty sparse and scaled down, repetition is fine, like minimalist electronic music

tight anchor
#

sometimes, if you prompt it in the right way (still not sure what the right way is for most things)

#

I've seen it do 3/4, but it certainly has a 4/4 bias

rugged lagoon
#

I see. Yeah like you know how sometimes you'll prompt an AI art generator and it will generate the subject you're prompting for but there'll be like an echo subject because of the way you phrased it?

tight anchor
rugged lagoon
#

also if I want something that has like an... untuned piano or something, discordant, do you know if that works? I think I'm running out of credits. Or do I get to do unlimited so long as I don't make them longer than a 1:30

#

I love that. That stilted playing, very nice

tight anchor
#

the credit count in the top-right corner should be accurate, you can subscribe to Pro to have up to 500 per month

#

which takes a long time to get through

rugged lagoon
#

I see. ok very cool

tight anchor
rugged lagoon
#

is there a way to tell if a token isn't really powerful? like if I use kalimba, would it know it?

tight anchor
#

best way to find out is to try it

#

it tends to know instruments pretty well

#

Genre: Piano|Subgenre: Pop|Modern pop piano music, catchy, simple melodies, pop drums, instrumental cover of that one popular song, emotional, dynamic
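
The prompts shared above follow a pipe-delimited pattern (`Genre: ...|Subgenre: ...|free-text description`). A minimal sketch of assembling one, assuming the pattern is just a string convention copied from these messages; the helper name `build_metadata_prompt` is hypothetical and not part of any Stable Audio API:

```python
def build_metadata_prompt(description, **fields):
    """Join 'Key: value' fields and a free-text description with '|'.

    Hypothetical helper: the pipe-delimited layout is copied from the
    prompts shared in this channel, not from an official specification.
    """
    parts = [f"{key.replace('_', ' ').title()}: {value}" for key, value in fields.items()]
    parts.append(description)
    return "|".join(parts)


prompt = build_metadata_prompt(
    "Modern pop piano music, catchy, simple melodies, 120 BPM",
    genre="Piano",
    subgenre="Pop",
)
print(prompt)
```

Whether the model treats the `Key: value` segments differently from plain tags is not stated here, so the best way to find out is still to try both.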

fierce night
#

Will the open source model be able to create more than 90 seconds for one song?

snow pier
#

This is the best AI Audio Generation I have heard to date... Who ever trained this model should be congratulated! Amazing!

snow pier
# next gyro 🌃

Come On! How in God's Name is that possible. It sounds way too good to be true. I'm absolutely floored by this. I don't know about the rest of you however, I've been keeping a close eye and a close ear on the other attempts of Audio/Music Generation. Nothing even comes close.

shut saddle
manic patrol
#

i get this on the website: "Oops... Expiration Time (exp) claim error in the ID token; ..." can someone help me out with what's wrong?

shut saddle
#

That just sounds like it timed out.
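
For anyone curious what an `exp` claim error refers to: an ID token of this kind is typically a JWT whose payload carries an `exp` timestamp, and a check fails once that time has passed (a device clock that is badly off could plausibly trigger the same error, though that is a guess for this case). A minimal sketch, assuming a standard three-part JWT; the demo token is built locally and unsigned, and nothing here reflects how the Stable Audio site actually validates logins:

```python
import base64
import json
import time


def jwt_claims(token):
    """Decode the payload of a JWT WITHOUT verifying it. For inspection only."""
    payload_b64 = token.split(".")[1]
    payload_b64 += "=" * (-len(payload_b64) % 4)  # restore stripped base64 padding
    return json.loads(base64.urlsafe_b64decode(payload_b64))


def is_expired(token, now=None):
    """True if the token's exp claim is at or before `now` (seconds since epoch)."""
    now = time.time() if now is None else now
    return jwt_claims(token)["exp"] <= now


def _b64(obj):
    # Helper used only to build the throwaway demo token below.
    return base64.urlsafe_b64encode(json.dumps(obj).encode()).rstrip(b"=").decode()


demo_token = ".".join([_b64({"alg": "none"}), _b64({"exp": 1_700_000_000}), ""])
print(is_expired(demo_token, now=1_700_000_001))
```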

manic patrol
#

but i want to get timed in

#

🙂

shut saddle
#

I'd just try refreshing your browser and trying again.

manic patrol
#

tried that since this morning

#

even with incognito and google login it doesn't work

hard steeple
warped mist
#

genre

hard steeple
wide blaze
#

Hi, is there a way to ask for features, like Folders for generated music ?

manic patrol
#

where to post bugreports?

#

contact link is dead

shut saddle
#

Try posting this in the troubleshooting section of Harmonai.

Disregard this.

manic patrol
#

ty

tight anchor
shut saddle
#

Maybe that needs a pin?

tiny frigate
#

Weeeeeeeee

slender matrix
#

is it open source yet or still paywalled royalty free generative stuff

quiet delta
fossil adder
#

im getting very discordant stuff too

#

its also so LOUD

slender matrix
#

it feels

#

cashgrabby almost

fossil adder
slender matrix
#

like pushing experimental tech to market before people even made it worth using

#

i believe in local AI more than saas

tight anchor
#

Working on the open-source versions 🙂

silent lintel
#

do yall think that this model will run in any low end pcs? seems a lot cheaper than image generation

slender matrix
#

maybe im too critical

fossil adder
#

need variation and 'inpainting/inmusicing?' features 😛

tight anchor
#

The model can be hit and miss, learning how to prompt the model properly can certainly improve outputs

slender matrix
#

but you just see it everywhere in the ai space

tight anchor
#

also make sure to hit the thumbs down on ones you don't like, and thumbs up the ones you do

hushed horizon
#

It was 1B parameters roughly iirc?

fossil adder
#

this is very cool though ive been looking for something like this

hushed horizon
#

so that should run easily even on low-end cards
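
A back-of-the-envelope check of that claim, as a sketch: the memory needed just to hold the weights is roughly parameter count times bytes per parameter. The 1B figure is the recollection from the message above, not an official number, and activations, the autoencoder, and the text encoder would add overhead on top.

```python
def weights_size_gib(num_params, bytes_per_param):
    """Approximate memory needed just to hold the model weights, in GiB."""
    return num_params * bytes_per_param / 1024**3


params = 1_000_000_000  # "roughly 1B" as recalled in the chat; not an official figure
for name, nbytes in [("fp32", 4), ("fp16", 2)]:
    print(f"{name}: {weights_size_gib(params, nbytes):.2f} GiB")
```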

silent lintel
#

nice

slender matrix
fossil adder
hushed horizon
#

Yeah for sure I'm waiting for the local version

slender matrix
slender matrix
fossil adder
#

hey im all for releasing it, but like, ive seen the amount of people that fail to just set up sd locally <_>

abstract lantern
slender matrix
shy mango
#

any good prompts for lofi hip hop? so far most of the outputs i get are off pitch

abstract lantern
#

hows it going guy rvc

hushed horizon
#

No offense but Beethoven would never write that cacophony :P

silent lintel
#

i bet these audio models are gonna get insane in 1-2 years

abstract lantern
hushed horizon
#

It really falls apart towards the end

#

like 0:30 and onwards

#

Just slamming the keyboard randomly :P

silent lintel
hushed horizon
#

I mean the classical piece to be clear

#

well, "classical" :P

barren parrot
#

this is nuts, but i need a negative prompt 😭

hushed horizon
#

One of the things I'm very curious about is if it can do classical-like compositions as opposed to just nice beats and repetitive/textural stuff

barren parrot
#

is this gonna be released open source?

abstract lantern
#

negative prompting and stable audio v2

#

thats what i want

edgy hound
#

are they offering any trial other than the 20 tracks?

shut saddle
#

There will be open source tools released but they aren't ready yet.

abstract lantern
#

hol on

#

trials?

#

srs?

shut saddle
#

You get 20 free gens per month.

abstract lantern
#

so its basically like novelai

#

can gens overfill?

edgy hound
abstract lantern
tardy patrol
hushed horizon
#

How much of the classical corpus is public domain

#

Does it all like belong to Disney or something

#

Although I guess for training this takes recordings of performances, right? And not just notes.

bleak wigeon
#

i've only generated one thing that is actually listenable

hushed horizon
#

There might not be that much in the public domain for classical pieces

silent lintel
barren parrot
#

I need to add something like (dissonance:1.7) to my negative prompt 😂

abstract lantern
#

also is deepfloyd emerging as a lora soon?

fiery robin
#

i just put the prompt "screaming" in stable audio and its nightmare fuel

tardy patrol
#

that last bit at the 3rd second is actually a very nice crunchy snare

barren parrot
#

we're gonna bring back michael jackson without his love for boys this time

abstract lantern
#

this audio gives me hooded kars vibes

silent lintel
tardy patrol
#

i know im just pointing out the crunchiness

shut saddle
#

Audio prompting currently is vibes based instead of precise control. Getting your audio in the right direction is tricky enough.

tardy patrol
#

very sample-able

shut saddle
#

Just from my personal experience prompting for audio.

hushed horizon
#

that one kind of works

tardy patrol
#

its pretty good at texture and vibe it seems, like Geck said

shut saddle
#

I just think that we don't often have the right vocabulary to describe audio.

rare oxide
tardy patrol
silent lintel
rare oxide
#

tbh sounds like a fairly nice song except i am locked in a rubber room

abstract lantern
#

crazy

#

with rats, right?

#

hopefully you recover

#

this takes the cake

rare oxide
#

payday 2 soundtrack

tardy patrol
#

obviously if you do that kind of stuff, you share the vocab, but for images and art the vocab is more concrete i think

rare oxide
#

holy heavens

abstract lantern
#

more breakcore than synthwave tbh

rare oxide
#

0:24 slaps

bleak wigeon
#

yeah i like that part

silent lintel
#

just needs a little bit less weirdness

random galleon
#

is there going to be a local version of this too?

abstract lantern
rare oxide
#

truly the 8 bit chiptune of all time

bleak wigeon
#

splatoon character goes through the printer

silent lintel
abstract lantern
tardy patrol
#

the printer part at least

silent lintel
abstract lantern
#

ye

#

the beginning is gud

tardy patrol
#

not a single prompt worked for me

#

see you next month

silent lintel
#

the problem with using authors as reference, is that the model probably doesnt know any

tardy patrol
#

no lucky

#

🥶

abstract lantern
#

audio to audio when

tardy patrol
#

i havent heard a single vocal in any of the things i generated or the ones i listened to here

#

is that dataset all instrumental?

tardy patrol
#

mayhaps

inland bear
tardy patrol
silent lintel
inland bear
silent lintel
#

thats pretty good

tardy patrol
#

Ambient Diffusion

silent lintel
#

oh i havent heard jazz yet

tidal lava
#

are there stable audio models I can download?

shut saddle
#

They haven't released the stable audio model and probably won't since there's a revenue share split with AudioSparx and the artists that contributed to that model.

silent lintel
shut saddle
#

The open source model isn't ready.

#

Fauno also wants to release the tool suite at the same time, so we likely won't have anything until both are ready.

#

I look at stable audio as a proof of concept.

icy phoenix
lyric finch
#

I tried it out until I ran out of free tracks. I found a lot of distortion in my early efforts, and the tracks I tried to be thematic didn't turn out to be very melodic.

tight anchor
lyric finch
#

One interesting track I made was simply a tuba riff. Interestingly, there was a point I heard the tubist inhale, which I quite liked.

tardy patrol
shut saddle
#

There are several non-Stable Audio programs that have made realistic audio. This model didn't specialize in it.

rare wind
#

YES! FINALLY! I have been waiting for something like this!

#

Funny... just a few hours ago (after months of not checking) I was looking around and found something similar... guess I'll have to test both.

#

AI music is finally here... let's see if it becomes a trend.

nimble torrent
icy phoenix
nimble torrent
#

first shot out the gate too?

#

it's either pretty darn good or has a very constant level of decency regardless of your input

#

here's to hopin it's the former blaze

icy phoenix
#

I used up literally all my free generations to make what I wanted to make sound good

coarse lantern
#

Now i really can't wait until stableaudio gets a git and custom models as well! :D

hard steeple
nimble torrent
#

trip hop, i like the sound of that

hard steeple
#

🙌

nimble torrent
#

for my second gen i tried a trap beat

#

lets just say i had a bad time

hard steeple
#

lmao there is loads of trap in the dataset and I have had mixed results, it's really weird. I swear it's a prompt away though!

nimble torrent
#

it's like the model kept shtting itself or something

hard steeple
#

it sounds like taking a hammer to the piano and tearing it apart hehe

#

what was your prompt

nimble torrent
#

Not really pleasing to the ear

#

Trap beat, hip hop, 808 drum machine, 808 kick, claps, shaker, piano melody, phonk samples, aggressive high hat patterns, underground rap

hard steeple
#

maybe try adding moods to your prompt, like how you want it to sound emotionally

#

that might add the context for the instrumentation to work off

nimble torrent
#

...... urban sus

hard steeple
#

oh dear

#

nooooooooooooooooooooo

nimble torrent
hard steeple
#

stuff like, gritty, raw, moody etc

nimble torrent
#

ok ty ill give her a go

hard steeple
#

like I said i've had mixed results with trap but there must be a way i refuse to believe there isn't

shut saddle
#

Any chance we can get a token list of frequently used words that were used in the training dataset? 🙂

nimble torrent
#

those are really my only 2 lanes unfortunately

#

disgusting heavy death metal, hip hop trap

shut saddle
#

If not, that's ok. The discovery process is fun too.

hard steeple
#

you can literally pull up the audiosparx website and take a look yourself

shut saddle
#

Good to know! Thanks!

hard steeple
#

I did that to figure it out

mellow cosmos
#

where is the model? is it not open source?

nimble torrent
# mellow cosmos where is the model? is it not open source?
hard steeple
#

this model is trained on licenced data! we are currently working on open source models utilizing what's available for us to use

nimble torrent
#

"Our first audio AI model is exclusively trained on music provided by AudioSparx, a leading music library and stock audio website."

mellow cosmos
#

or is this not meant to be run locally like SDXL

nimble torrent
#

that's all that's public rn

hard steeple
mellow cosmos
nimble torrent
#

can only use free stuff in your own production, paid accounts can use commercially, nobody can use output to train a model

mellow cosmos
hard steeple
nimble torrent
#

Right, but I gave you all that's available on it rn

mellow cosmos
nimble torrent
#

yep haha no issue at all right guys

mellow cosmos
nimble torrent
#

define normal

mellow cosmos
#

4000 series, 3000 series

nimble torrent
#

I render animations at 1it/s on my 2060 6GB

tight anchor
#

should be fine-tunable on your own computer

#

particularly as people find out things like LoRAs

shut saddle
nimble torrent
#

they were comparing the requirements not the models

mellow cosmos
#

though when blending images with SDXL the speeds can shift sometimes

nimble torrent
#

oh it does vocals?

#

I didn't even realize

#

well

#

"vocals"

spice crow
#

hahahah

shut saddle
#

It can, it's just not great at it.

nimble torrent
#

GtawaAAAAAAAAAAAEYYYEEEEaataweTWeaaayEEEEEEEYEAW

spice crow
#

are u related to 100 geccs

nimble torrent
#

when she said that, I felt it

mellow cosmos
#

will it be able to blend audios like SDXL can blend images? or is that aiming too close to the sun

tight anchor
#

We'd probably need to add some audio conditioning to the model for it to do that properly

mellow cosmos
#

and stuff like AIT and OneFlow will be compatible, right?

hushed horizon
#

it's surprisingly musical

hard steeple
#

hahahaha that's sick

spice crow
#

i'm finding it tough to make anything listenable 😅 I am rlly trying to make it do consistent vocals tho, vocals that dont suddenly cut in and out

#

is anyone else getting a lot of microphone-feedback like quality?

hushed horizon
#

I think it managed to spit out a reasonable piano piece at least?

#

Well, it cuts off early, but what's here makes some sense

#

want my money back for those last 5 seconds tho

abstract lantern
#

audio2audio is gonna be sick

#

perhaps like img2img

#

with a text prompt

lyric osprey
vague cliff
#

Tried a few things. Overall the sound quality is much better than previous models I've played with, so it's a step in the right direction, but I'm not sure I could get much that's really usable out of it given the current state of it. I'd say the licensing terms are ambitious in relation to the current quality. Maybe you have different markets in mind than the typical home studio enthusiast, but anything that steps too far beyond the familiar terms for royalty-free samples is going to be a hard sell. That being said, the jump in quality is exciting, and I look forward to checking out how things develop (and reading the papers!).

vague cliff
# spice crow how do u get better sound quality

I'm not saying the sound quality is good, I'm just saying it's better than other generative models I've tried to date. The first generative model I tried was google's (years ago), and they did everything by squashing the audio with mu-law compression (IIRC), so everything it generated sounded like it had been smashed into oblivion. Most generative audio models seem to have followed that pattern. This is one of the first things I've heard that doesn't sound like everything is compressed to death.
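
For reference, the mu-law scheme mentioned above squeezes each sample through a logarithmic curve and then stores it in 8 bits, which is where the "smashed" character comes from. A minimal sketch of the textbook formula with mu = 255; it illustrates the general technique and is not the code of any particular model:

```python
import math

MU = 255  # standard value for 8-bit mu-law


def mu_law_encode(x, mu=MU):
    """Compress a sample in [-1, 1] and quantize it to an integer in [0, mu]."""
    x = max(-1.0, min(1.0, x))
    compressed = math.copysign(math.log1p(mu * abs(x)) / math.log1p(mu), x)
    return int(round((compressed + 1) / 2 * mu))


def mu_law_decode(code, mu=MU):
    """Invert mu_law_encode, up to quantization error."""
    compressed = 2 * code / mu - 1
    return math.copysign(math.expm1(abs(compressed) * math.log1p(mu)) / mu, compressed)


for sample in (0.0, 0.01, 0.5, -0.9):
    code = mu_law_encode(sample)
    print(sample, code, round(mu_law_decode(code), 4))
```

Quiet samples survive the round trip almost exactly, while loud ones land on much coarser steps.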

tight anchor
#

Yeah we use a learned autoencoder to compress the audio to death, sometimes the decoder properly brings it back to life haha

viral fractal
#

It needs to be as easy as the rest of them (google and other opensource ones) if they want to charge us (I know you don't have to).

For right now it sounds like shit.

tight anchor
#

It's a tradeoff between high compression allowing longer sequence lengths, but also making the artifacts worse
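
To make that tradeoff concrete, here is a rough sketch of how the autoencoder's downsampling ratio sets the sequence length the diffusion model has to handle. The 44.1 kHz rate and 90-second length come from the announcement; the ratios are purely illustrative and not the model's real values.

```python
def latent_length(duration_s, sample_rate, downsampling_ratio):
    """Number of latent frames the diffusion model must process."""
    return duration_s * sample_rate // downsampling_ratio


SAMPLE_RATE = 44_100  # 44.1 kHz, as stated in the announcement
for ratio in (256, 1024, 4096):  # illustrative ratios only
    print(ratio, latent_length(90, SAMPLE_RATE, ratio))
```

Higher ratios give shorter sequences for the same clip, at the cost of asking the decoder to reconstruct more detail from each latent frame.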

#

I'm certainly looking for ways to make it sound better

viral fractal
#

I can wait though. It does feel unusable and hyped (by accident)

tight anchor
#

It sounds great for some outputs and styles, and abysmal for others

sharp crow
#

any easy to install local webui yet? for 6Gb vram?

spice crow
viral fractal
#

is the database just poorly sorted or too large?

spice crow
#

soon my vocaloid garage punk band will be ready to world tour

tight anchor
#

The dataset is great, there's just a lot of technical stuff to get it to sound good, be quick, and work with long sequence lengths

#

And right now in particular the quality can vary greatly depending on how you prompt

#

a lot of the dataset is sound effects and field recordings, so there's lots of non-musical stuff that gets picked up as well, which can probably make the music sound worse if it's not fully aware it should be making music

sharp crow
tight anchor
#

Hey! No local web UI yet, but we're working on that for the open-source models

viral fractal
#

why is it fucking hard to use? why do I want to support the companies that I don't like? Why can't I just write something simple for it to follow?

Are the other ones just scamming the user? Are they pretending to understand the full prompt? Is this trying to understand the full prompt? Does it actually truly understand a large amount of what I type?

sharp crow
vague cliff
# tight anchor I'm certainly looking for ways to make it sound better

One of the things that I tried to do a long time ago (when I was trying to use RNNs to generate audio) was to use vector quantization instead (instead of what was essentially mu-law + 8-bit scalar compression). It still introduces artifacts, but it's got an interesting character to it that can sound good in a synthesis context (e.g. I suspect that some of Roland's samplers use VQ internally--that's just speculation on my part though...).
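
A toy illustration of the vector quantization idea described above: instead of quantizing one sample at a time, short frames of samples are replaced by the index of the nearest entry in a codebook. The codebook here is hand-picked for the demo; in practice it would be learned (for example with k-means), and none of this reflects how any specific sampler or model does it.

```python
def nearest_code(frame, codebook):
    """Index of the codebook vector closest to `frame` (squared Euclidean distance)."""
    def dist(code):
        return sum((a - b) ** 2 for a, b in zip(frame, code))
    return min(range(len(codebook)), key=lambda i: dist(codebook[i]))


def vq_encode(samples, codebook):
    """Split samples into frames of the codebook's dimension and map each to an index."""
    dim = len(codebook[0])
    frames = [samples[i:i + dim] for i in range(0, len(samples) - dim + 1, dim)]
    return [nearest_code(frame, codebook) for frame in frames]


def vq_decode(indices, codebook):
    """Rebuild an approximate signal by concatenating the chosen codebook vectors."""
    return [value for i in indices for value in codebook[i]]


codebook = [(0.0, 0.0), (0.5, 0.5), (-0.5, -0.5), (0.5, -0.5)]  # toy, hand-picked
signal = [0.1, -0.1, 0.4, 0.6, -0.45, -0.6, 0.6, -0.4]
codes = vq_encode(signal, codebook)
print(codes)
print(vq_decode(codes, codebook))
```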

tight anchor
#

Yes, though the Harmonai community is the main focus for our open source stuff

sharp crow
#

well I am member there as well

tight anchor
#

I have yet to train a good one that has the properties I want for diffusion, but it's certainly on my list of things to keep trying

spice crow
tight anchor
vague cliff
sharp crow
tight anchor
#

not sure why you can't share it, weird

spice crow
tight anchor
#

That has its pros and cons. Because you're generating the sequence one token at a time, the full power of the model is going in to "what is the next token?" That lends itself much better to generating melodies compared to diffusion where the whole sequence is being denoised at once

#

The tradeoff is that the inference time scales linearly with output length since you're going one token at a time

#

But also you can go as long as you want

#

Of course transformers will only have so much of a context window, so the model "forgets" the past at a certain point as it goes along

#

With diffusion it's the opposite. You've got the whole window all at once, so everything can see everything else. But you're also stuck generating the full window size for every output
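
The scaling difference described in these messages can be sketched by counting model calls. Every number below (tokens per second, window size, step count) is a made-up assumption for illustration, and a single call does not cost the same in the two approaches, so only the shape of the growth is meaningful.

```python
import math


def autoregressive_calls(duration_s, tokens_per_second):
    """One forward pass per generated token, so cost grows with output length."""
    return math.ceil(duration_s * tokens_per_second)


def diffusion_calls(duration_s, window_s, denoising_steps):
    """Each full window costs a fixed number of steps, even if only part is wanted."""
    windows = max(1, math.ceil(duration_s / window_s))
    return windows * denoising_steps


for seconds in (5, 45, 90):
    print(seconds, autoregressive_calls(seconds, 50), diffusion_calls(seconds, 90, 100))
```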

sly jetty
sharp crow
tight anchor
#

yeah, I also generated a new link so I see how it could be tricky, not sure how the whitelist feature works

sly jetty
quiet delta
tight anchor
vague cliff
# tight anchor But also you can go as long as you want

Here's a thought: The primary use case I have for audio generation is sample/sound effect generation, so a fixed-length generator that sounded better would be vastly preferable to an arbitrary-length one. So even if it had the same artifacts as convolutional systems like SD (repetition/lack of coherence when the image size exceeds the primary training size), it would still be very usable for post-production music.

hushed horizon
#

What length are you targeting for the open source one?

tight anchor
#

Yup, for the open-source models, I'm more focused on short lengths (12s right now) with more of a focus on quality (...as much as FreeSound will let me)

#

that's mostly because it's more practical to fine-tune on smaller GPUs, and also because we just have way more sound effects to train on than songs for an open model

vague cliff
tight anchor
#

I haven't pulled that in yet, but I may at some point.

#

The vast majority of the data for our open models right now is CC0/CC-BY data from FreeSound

quiet delta
#

I feel like trying to make a long coherent piece will end up rambling unless we prod it towards a little bit of structure

tight anchor
vague cliff
#

It's really fantastic. Meticulous metadata. But yeah, any CC0 sources are ideal.

shut saddle
#

I'd imagine that once an open source model is out, CLAP can be better trained to tag existing audio with the appropriate details which will help improve future open source models.

untold saddle
#

I'm super excited to see this evolve

terse narwhal
#

is there going to be a local install version of stableaudio or on the website only?

pliant chasm
distant ember
civic heron
#

anyone know if this works with phonk/drift phonk style beats? can it do lyrics (memphis tapes, phonk tapes etc.)

unique mason
#

Wen API for this

distant ember
#

In general it doesn't seem to do lyrics

civic rampart
distant ember
fluid tulip
#

Those are my 3 best shots, i think that until version 4.0, i personally will not enjoy it so much...

next forge
tidal lava
#

i thought this came out nice (i do not know anything about music production and prompt it like it's sd, no claps? clapless no snares? snareless, idk)

untold meadow
#

what's the expected VRAM requirement and is it possible to create music that loops seamlessly? Excited to see what comes out of this, I have been waiting to have an AI generator for music to use as background music in games for a while now.

grand matrix
#

Looping seamlessly: it loops, but you can't tell when

distant ember
brazen flicker
#

Is there a support room or ticket system?

pliant chasm
brazen flicker
#

Thx! Great stuff so far, ai generated horror audio drama made from my phone while I charged my car. The future is now

tidal lava
#

generate a silver-colored robot hand with a dark blue background

junior mauve
#

any idea when the open source release is coming?

shut saddle
#

If you have that answer, let me know next week's winning lotto numbers. mochicat

tidal lava
#

When you enter "hardcore music" into the prompt, it gets censored for some reason

hard steeple
tidal lava
#

Drum and Bass, 90s Style, Upbeat, Energetic, Rhythmic, Angelic Female Vocals, Mastered, Synthesized Bassline, Jungle Breakbeats, Atmospheric Pads, Melodic Hooks, Dancefloor Vibes, Nostalgic, 170 BPM

hard steeple
#

that fill where it just weirdly devolves into spectral garbling kind of hard ngl

nocturne zinc
nocturne zinc
rugged lagoon
#

Hey y'all, I hope you don't mind but I wanted to give some feedback.

This platform is very cool. If I'm honest, I think it sounds better when it's shorter

my favorite outputs have been like a cool intro or buildup to a song, like the beginning of "pictures of you" by the cure. Or an outro

Is there a way I can prompt for that? I guess I can try. But I'd be curious to see what it's like training the data set on song intros and outros only

rugged lagoon
rugged lagoon
rugged lagoon
#

Does anyone know if there are any platforms that offer sound genre prompts for short clips of sound, like if it would give you suggestions of what that style is

sly jetty
#

A cool experiment I saw earlier in a tweet and expanded upon is describing what you want to an LLM (chatGPT, Claude, Beluga) for them to craft the prompt for stable audio.

Example1: I am in a zen state, on a tropical paradise, completely relaxed and in tune with nature. I like music. Describe what I should listen to in a short sentence as a list of descriptive words separated by commas.

Example2 (from Nao Tokui): You are a musicology professor. Describe the most obscure music you know in a short sentence as a list of descriptive words separated by commas.
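
The workflow above can be sketched in two small steps: wrap your scene in the instruction, then tidy the LLM's reply into a comma-separated prompt. No LLM API is called here; `fake_reply` is a stand-in for whatever the model returns, and both helper names are hypothetical.

```python
def build_llm_request(scene):
    """Wrap a scene description in the instruction used in the examples above."""
    return (
        f"{scene.strip()} Describe what I should listen to in a short sentence "
        "as a list of descriptive words separated by commas."
    )


def reply_to_prompt(reply, bpm=None):
    """Normalize an LLM reply into a comma-separated tag prompt."""
    tags = [tag.strip(" .\n") for tag in reply.split(",")]
    tags = [tag for tag in tags if tag]
    if bpm is not None:
        tags.append(f"{bpm} BPM")
    return ", ".join(tags)


print(build_llm_request("I am in a zen state, on a tropical paradise."))
# Stand-in for a real LLM response; paste the actual reply here.
fake_reply = "Ambient, calm, ocean waves, soft pads, bamboo flute."
print(reply_to_prompt(fake_reply, bpm=70))
```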

fluid tulip
# pure aurora

I'm not sure if the right word can be used without being banned, so, I'd like a "Deutsche" Macarena.

fluid tulip
pure aurora
sly jetty
#

Great insights on that stage @errant cosmos and @hard steeple ! 👏

shut saddle
#

Good office hours session.

#

Also some awesome news about the progress on the open source model, but you had to be there to hear it. 👀

nocturne zinc
#

Howdy, I created two songs using: downtempo, minimal, 90 bpm, epic intro, synth, chopped, melodic, then I arranged them in Ableton to be more structured, added a lil bit of reverb and delay to smooth some things, and a bit of mastering to make them sound more similar to each other for this:

#

This is the album art in the MP3, obviously also AI generative:

nocturne zinc
nocturne zinc
sly jetty
#

If you guys want to try more moods/cinematic stuff, since the training was done with Audiosparx data, there is some metadata that produces good results. You can say stuff like "horror macabre" or "scifi robots" or "psychological thriller". Audiosparx is used in movies/tv so it has a lot of that inside, those genres and moods

nocturne zinc
#

oh interesting I have several tracks up on Audiosparx, maybe I'll get my own vibe in there like an ouroboros.

sly jetty
distant ember
nocturne zinc
#

I really like the way some phrases are very distorted like tape saturation, very cool vibe.

sly jetty
#

if I remember correctly they are pulling BPM, moods, instruments and topics

#

and style of course

nocturne zinc
#

BPM was spot on for me, it was super easy to bring into Ableton.

#

very nice, i've used Riffusion before and it's also fun, but I have to manually set the tempo and length.
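
When the BPM in the prompt is honored, lining a clip up on a DAW grid is simple arithmetic. A small sketch, assuming 4/4 time (the bias mentioned earlier in the channel) and a tempo that stays constant for the whole clip:

```python
def bar_seconds(bpm, beats_per_bar=4):
    """Length of one bar in seconds at a constant tempo."""
    return 60.0 / bpm * beats_per_bar


def whole_bars(clip_seconds, bpm, beats_per_bar=4):
    """How many complete bars fit into a clip of the given length."""
    return int(clip_seconds // bar_seconds(bpm, beats_per_bar))


print(round(bar_seconds(90), 3), whole_bars(90, 90))
```

At 90 BPM a 90-second clip does not end on a bar line, so trimming to the last whole bar is one way to get a clean loop point.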

sly jetty
#

keys don't work, i specifically asked them about it, it does not understand keys

nocturne zinc
#

aha good to know!

sly jetty
#

you can prompt "minor" or "major"

#

but not specific keys

distant ember
nocturne zinc
sly jetty
#

nice

#

someone did "cursed" 10 times with commas, amazing result

shut saddle
#

We had a few fails too but that's how it goes sometimes. Turns out "Medical Drama" just produces dialogue lol.

distant ember
#

To confirm, no time signature specificity? Asked for a few in 7/8 that seem to be in 4/4. Not that I need it, just playing and testing.

sly jetty
#

time signatures don't work

#

broad tags: style, mood, instrument, topic, regions

#

topics are things like: birds, addiction, birthday...

#

regions: latin, asian, specific countries and specific regions

#

you can also do genres, subgenres and BPM is a must

#

all movie/cinematic genres: blaxploitation, action, horror, scifi, western (and subgenres)
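
The tag categories listed above can be kept in a small structure so that nothing (especially BPM) gets forgotten when writing a prompt. A sketch only: the field names and ordering are assumptions for illustration, and the model has no documented required order for tags.

```python
from dataclasses import dataclass, field


@dataclass
class PromptTags:
    """Hypothetical container for the broad tag categories mentioned in the chat."""
    genre: str
    bpm: int
    moods: list = field(default_factory=list)
    instruments: list = field(default_factory=list)
    topics: list = field(default_factory=list)
    region: str = ""

    def to_prompt(self):
        tags = [self.genre, self.region, *self.instruments, *self.moods, *self.topics]
        tags = [tag for tag in tags if tag]  # drop empty categories
        tags.append(f"{self.bpm} BPM")
        return ", ".join(tags)


tags = PromptTags(
    genre="Cinematic horror",
    bpm=80,
    moods=["macabre", "tense"],
    instruments=["strings"],
    topics=["addiction"],
)
print(tags.to_prompt())
```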

shut saddle
#

You could coax time signatures indirectly through genres.

sly jetty
#

jazz has odd timings

#

but you can't specify, I burnt like 10 generations trying to no avail. 😂

nocturne zinc
tight anchor
#

but it might still work because the info is often in the description as well

#

and lots of regions/countries are represented as their own genres

#

a hell-bound ship's crew singing a sea shanty on their way to the depths, stomping their feet to the beat|Instruments: Male Voice, Ship Horn, The Wails of the Damned, Stomping Feet|44kHz stereo, high quality recording, steady stomping beat

shut saddle
#

Were there any token alterations made to the initial dataset or was the model trained off of the AudioSparx set as is? If it's the latter, the website does sort music by country.

crimson turtle
tight anchor
crimson turtle
#

Hey how do we get a "sharing url", to post to social media, etc?

tight anchor
#

also eating some Nashville hot chicken rn haha

tight anchor
#

I've seen people posting videos, honestly I'm not sure how they're doing it, probably some screen capture thing