#🎵|stable-audio
I got a perfect hammer for all of that 🙂
Why the hell is this posted here?
Still got that hammer handy?
yessssssssssssssssssssssssssssssssssssssss
Thankies
So do i for the server i mod on xD Though, need to reprint hammer head as the damn thing got a few bad layers from 3d print lol
In the heart of the American Southwest, nestled between towering mesas and endless expanses of desert, there existed a town that time seemed to have forgotten. Eldridge, as it was called, was a place where the past whispered through the creaking timbers of weather-beaten buildings, and shadows clung to the edges of every sunset.
I didn't get it done in time 
Did you get it started though?
I got through a chorus and a bit of a verse 
would be amazing if you could post up a few "best practice" prompts to use in stableaudio, just got pro and would love to get a better idea of the best prompting template/style to use, cheers
a future thought, some type of clipvision detecting video motion changing music intensity or something
" Genre: Cinematic, Suspense, Instruments: Tense Strings, Ominous Brass, Pulsating Percussion, Piano Accents, Style: Classic Thriller, Mood: Intense, Foreboding, BPM: Variable, often Slow to Moderate, Features: Evoking Alfred Hitchcock's Iconic Thriller Atmosphere, Sharp Melodic Turns, Crescendos and Diminuendos, Unsettling Harmonies, Designed to Build Tension and Suspense, Perfect for a Scene of Psychological Thriller or Mystery."
/audio
Hi, how do you create videos like this?
hi all, this is my first post here and i've never used discord before. i wanted to share my favourite audio tracks that i discovered using stable audio. i'm going for a tropical house/chill/uplifting vibe similar to kygo's early remixes/avicii vibe. if anyone is interested, please check out these quick 45 second tracks made by stable audio--all credit was given to stable audio in the video description
https://www.youtube.com/watch?v=6j6wG2zDVC4
https://www.youtube.com/watch?v=qLkZZyrEzY8
Let me know what you think or if there's a better place to share these audio tracks. Thanks all<3
hi! would appreciate any pointers for generating audio that is not full tracks, but various isolated single note tones and timbres that can be sampled and turned into playable sampler instruments. i've had some great results, hoping to optimize my prompts and take it further. thank you!
https://www.youtube.com/watch?v=X42bnPHLr5s
javolk - luminara
generated by stable audio
music - stable audio
pictures - dall-e
Is there an "AI upscale" for audio out there? Like, something that uses a model to take a low-quality audio clip and interpolate a higher res? e.g. take an 8-bit sample and create a 16-bit version that sounds similar to the original?
Generate a 3:45 min song with prompt Pop, Pop-Electronic, Ballad, Billboard, Drum Machine, Bass, Lush Synthesizer Pads, Synthesizer Arp, Synth Bass, Vocal Sample Chops, Percussion, Honest, Heart-Felt, Melancholic, Vibe, Cool, Modern, Atmospheric, 115 BPM
Hi all,
I wanted to share a track created 100% by stable audio using the stable audio model: stable-audio-audiosparx-v1-1
please let me know what you think...sound quality appears to be better than what suno ai can do.
100% AI generated music using stable audio model: stable-audio-audiosparx-v1-1
photos generated using dall-e within ChatGPT4
soundcloud - https://soundcloud.com/javolk
jakevolk@gmail.com
please leave a comment and let me know what you think. i wish I could continue this song but the stable audio only generates 45 second clips at this time wit...
That's awesome
thank you
Hello! I'm wondering if there is a plan to add a feature to mass/bulk download generations from the web app. I usually generate multiple results per prompt and it can become a workflow snag having to click and download one file at a time. Thanks!
Hi all,
here is a song generated by stable audio which i slowed and reverbed. I hope you like it:
music - stable audio
pictures - dall-e
https://soundcloud.com/javolk
jakevolk@gmail.com
A day at the Market in a Medieval town.
#stablediffusion #openai #openaijukebox #aimusic #midjourney #aiart #aliens #nightlife #tokyo #tokyonightlife
Official Web Page:
https://one-manindie.com
Transmissions from the OMI-verse!
https://anchor.fm/onemanindie
SOCIALS
Facebook...
Can "Stable Audio" generate an embedding with audio to find similarity within a vector db?
@shut saddle did amli leave SAI?
That's not something I know the answer to. 🤷♂️
she's no longer at Stability. I believe she's working now with SplashMusic, she's very active at twitter.
how to use stable-audio to make music with discord? could anyone give me example prompt, thanks
Unless SAI changed the discord bot recently in a way I'm not aware of, you can't use discord to create music through stable audio.
hi, as @shut saddle said, you can't use stable audio at the Discord server, you can use it at https://stableaudio.com/ Also there you have examples on how to prompt and a complete user guide that explains prompting, the interface and the model.
Make original music and sound effects using artificial intelligence, whether you’re a beginner or a pro.
The scene depicts a family of three – mother, son, and daughter-in-law – devoutly practicing their Buddhist beliefs, engaged in a prayer for blessings.
idk man I think it's just a fella dancing
https://www.youtube.com/watch?v=BswTEqxF3u8
generated with stable audio - all credit to stable audio love the work you guys are doing
soundcloud - https://soundcloud.com/javolk
jakevolk@gmail.com
please leave a comment and let me know what you think or if you'd like to hear more
anyone know what the purpose for .index files in RVC is?
Hey guys - which is the currently best Prompt2Sound Model with a GUI?
its the accent control of the cloned voice
Is there a particular repo you recommend for training voices that generates index files?
Im using audio-webui but it doesnt generate index files
i always use rvc disconnected to make rvc models, works on free colab too: https://colab.research.google.com/drive/1XIPCP9ken63S7M6b5ui1b36Cs17sP-NS
but if you train local try applio
Is stable audio available through API?
If you are referring to the audiosparx models that you can find at https://www.stableaudio.com/, then no. Those models are not getting released. Fauno made a stable audio GitHub repo that you can use to run stable audio models though.
https://github.com/Stability-AI/stable-audio-tools
The main setback for that program is you need a model to run stable audio tools. SAI hasn't released a universal audio model just yet
Is it planned anytime? Even if only for memberships? 😉
If I remember correctly, there were some announcements about a possible open source version, I've read it somewhere (possibly reddit).
Thanks!
The Harmonai team has weekly office hours where they discuss the current progress of the open source project (and sometimes offer sneak peeks on how the models sound). Office hours are scheduled for most Thursdays and the next one is scheduled for Thursday, February 1, 2024 at 19:00 UTC.
Where/how can one tune in to that?
You can join the Harmonai discord and be on the lookout for when they start office hours. I can't link the discord (this discord prevents posting direct discord join links), but go to the Twitter/x post for the discord link.
Their current posted discord link is expired so you'll have to use the Twitter one instead.
@still crypt I dont know how to get to you, but dont answer my old account, it was hacked
Is there any ways to disable NSFW Filters on SD?
found it! thank you!
Awesome! Sorry I just couldn't link it here.
@shut saddle I'm doing some research on generative AI for music and AI tools for musicians in general... What are some other tools you use/find interesting?
There's a few projects I find interesting. Vocal conversion programs like so-vits are pretty neat.
https://github.com/svc-develop-team/so-vits-svc?tab=readme-ov-file
Suno and MusicGen are probably the more prominent audio generation tools at the moment with Suno even having text to song capabilities.
I'll check it out!
VStudio
This is kinda crazy! I didn't know we were so close to realtime voice conversion!!!
can i feed in a video and it creates audio synced with the video? like sounds for karate kicks
This doesn't have anything to do with audio diffusion.
Have nice continuation on this one ♥
Hi everyone, how can I force stable-audio NOT to do something through the prompt?
Can anyone recommend a best practice/pro guide for stable audio?, all the youtube videos I see are a bit old and meh
meh
xd
So far the best guide is on stableaudio.com
Truly, just use the guidelines on the website, but my recommendation is to experiment a lot and learn that way, sometimes you can get surprisingly good results with pretty strange prompting with this model
Can this software still be used?
this is for stable audio, but I bet y'all never heard what stable diffusion sounds like under Coil whine >:3
kinda sounds like an old printer, makes sense given it's making images xD
Thats the sound of the future happening
XD yes
any status on an api being available for stable-audio?
There is an API available for stable audio, what is missing is the weights.
on the docs I only see text -> video, text -> text and image -> image endpoints
good afternoon, I'm trying to figure out which model is better for generating sound effects (for a game to be precise)
met audiocraft, stable audio and dance audio
In your experience, which one is better?
At the time of posting, there was only one model available. So absolutely no clue if there's more today
I can't brush up well on the sample, but I wonder if the chemistry of the prompt combination is the deciding factor.
I'd say the main factor comes down to our lack of ability to convey what we want to get out of our audio outputs. It just seems like it's much easier to describe a picture with words than it is to describe music using words.
Hey everyone. I have a question maybe the community can answer! I've been using the stable audio website for a while and it's come out with some pretty good sounds! But the audio quality isn't there. Is this because I'm on a free tier or is it a limitation of AI generated audio? Sounds like a bad mp3, or worse!
Afaik paid tier doesn't sound any better. Better prompting can often provide cleaner results. In my time tinkering, I found that creating short, repetitive solo-instrument loops yielded the highest quality outputs. I lean towards obtaining clean stems, then combining, editing and processing them later. If you're generating full-fat compositions, it will be hard to get uber-clean results. That's the current state. No difference between paid and unpaid output except for max duration.
Thank you so much for the detailed reply. great idea to try to get clean stems and then work on them. I think this is the limit of generative ai audio as it stands. I remember reading an article which said it takes A LOT of processing power to make it. I wonder how long until it starts making better quality audio?
My guess, not long. Video is already at the point we thought it'd take 5 years to get to. So ultra-clean high-quality audio might not be far off. In my uneducated speculation, the biggest issue is the availability of good training data.
Haven't logged into Stable Audio in a while and got this, anyone know whats up?
I'm seeing the same thing. I've never used it before and was stoked to check it out. Looks like something is bunged up with the DB in the backend so it will likely have to be resolved and redeployed before it will work for anyone.
Unfortunate. It's a lot of fun to play with when you do get a chance.
@tight anchor who on staff takes care of the stable audio website? Looks like there's a System Error.
Figures, the first time I try to use it in like a couple months lol. Sorry fellas, blame my luck
You could try refreshing and / or trying a different web browser?
Might fix the problem, might get the same error.
Yeah so Firefox is giving me that error too.
Looks like the website is back up guys. I got a generation that went through.
yeah sorry, it was down for a bit. Should be all good now.
Thanks Fauno.
its time
I thought adding reference sound file was a thing last time I used stable audio, perhaps only a paid feature?
I was able to use one of the clips I generated as an input but didn't immediately see the option to add your own
I was pleased with the first generation: https://stableaudio.com/1/share/445b0c36-7725-4bc6-aabe-221a988beb15
I was going for a Fantastic Planet soundtrack vibe (https://www.youtube.com/watch?v=RHyP3tUt3V4&t=23s) and it was pretty spot on with the vibe of the instrumentation. Nothing mindblowing but I'd throw down some bars if I was a bar thrower downer.
I fed it back in and liked the second one too. Was looking for more of a lead sound but hey, first two tries and I wasn't completely disappointed. https://stableaudio.com/1/share/1dbfce3c-b2da-416b-9f7e-3a7c7e62f213
I'd throw down some bars if I was a bar thrower downer.
I'm saving quote lol
I forgot how dangerous mentioning "synths" was. They just tend to overrule everything. No matter the wording.
Anyone have any input on how much the 1.1 beta improves? Considering buying back in, if it helps a lot - and because I wanna queue multiple tracks.
Sounds better in general, less artifacts, but the max length is 45s. I still use 1.0 for some specific things but 1.1 is overall much better
https://soundcloud.com/at-odds-69/ruby-1?si=9300f9d76f1f4a8d9828e6bb0aa7bf13&utm_source=clipboard&utm_medium=text&utm_campaign=social_sharing Loads of stable audio based sound design in this hope you enjoy!
Made during my process of QA on the stable audio models =]
I really forgot to reply to these, eeeeeeek! There is a user guide based on my notes on the stable audio website; I think the button to find it is in the prompt box =] If you have any specific questions on how my opinion has deviated from that since, I'd love to answer. Hopefully a bit more timely hahahahahahahahahaha
beeen very busy
Poor guy waited 2 months for a reply 💀
This is a bit of a hacky way to do it, based on how the CLAP model was trained.... but you can use Format: Solo...|Instruments: Violin...|.... Just change the parts for what you want it to do, i.e. Duo, Trio etc
Still trying to figure out/get round to a sensible way to incorporate that UI wise
I know i'm a bloody idiot
Maybe give them boxes to fill in? Idk
I so forgot I posted that in there
well yeah of course
that's not for me to do hahahaha they have been told though hehehehehehe
I mean it's up to sai how they want to do their stable audio website ultimately. You don't want to fully rely on tokens, but they sure do make things easier to hammer out prompts with.
We're working on deeper level stuff to help with prompt fidelity and accuracy. =]
Except if you do that it's trickier to make up gibberish and call them instruments.
but it be nice to streamline this with what's already there toooooo
lmao
I'll have you know I'm an expert Blamash player.
Oh yes the blamash beautiful bellows that makes hahahahahaha
thanks for the reply. so “Format: Solo” and “Instrument: [description]” become part of the prompt? separated by commas along with other stuff as usual? (“breathy, irregular, B major”)?
a couple of examples of prompts that resulted in usable audio:
"bell-like instrument, solo, no accompaniment, single notes, unison, C note, high quality, 44.1kHz"
"unison, wind, woodwinds, forest fire, soaring, bird-like, high quality"
would love examples that incorporate your tip. thanks again!
you see how i've formatted it with | that's the important part when you separate what is essentially pointing towards metadata fields
no spaces when you put one in as well
I know it's weird
it's a bit of a backdoor
so "bell-like instrument, solo, no accompaniment, single notes, unison, C note, high quality, 44.1kHz"
this would look like
Format: Solo|Instrument: Bells
you can add a | after bells and add more if you want
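For anyone who wants to build these pipe-separated prompts without fiddling with the spacing by hand, here is a minimal sketch. `build_prompt` is a made-up helper, not anything from Stable Audio, and the field names ("Format", "Instrument") are only the ones mentioned in this thread; which field names the model actually responds to is not confirmed here.

```python
def build_prompt(fields):
    """Join (name, value) pairs into 'Name: value' segments separated by '|'.

    Hypothetical helper: it only follows the tip from this thread
    (a pipe between fields, no spaces around the pipe).
    """
    parts = []
    for name, value in fields:
        name, value = name.strip(), value.strip()
        # A pipe inside a field would be read as a field separator.
        if "|" in name or "|" in value:
            raise ValueError("field text must not contain '|'")
        parts.append(f"{name}: {value}")
    return "|".join(parts)


prompt = build_prompt([("Format", "Solo"), ("Instrument", "Bells")])
print(prompt)  # Format: Solo|Instrument: Bells
```

Adding more fields is just a matter of appending more pairs to the list.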
Is there a good Sound AI with a GUI yet?
Just noticed this typo in the Interface guide
I think that might be one of your problems.
Hello, can we use music generated with the AI to make money on Spotify, for example? With the appropriate subscription of course, but once the subscription has ended, does that still apply?
What does Stable Audio do?
What does it take as input, and what does it produce as output?
what?
To use Stable Audio, what do you supply as input? And what output does it produce? What does it generate?
I would like to generate electro music for music platforms

Probably, but just not from here
I saw some YouTube and Spotify videos on the media page of this thread
just wrote this song, lemme know what you think!
https://youtu.be/YZYCehncT0c?si=7hiSEBCqW7592HH6
thank you, but if my music stays on my Spotify page for several years while the Stable Audio subscription only lasts 1 month, how does that work?
just wrote this song any producer up to make a track for it:)?
https://youtu.be/uSG_wPaQbOk?si=b2xAD0pAVwZsleGL
what do you have in mind?
NEW SONG ABOUT BEING LED ON!
https://youtu.be/g5MN14GZrHE?si=SZ8bbguJCrc5hya1
base it on my guitar
any tips on making a 5 second catchy jingle as a youtube opener? if i set the duration to 5 seconds it just sounds like the start of a song that's not complete.
Make it 60 seconds, use the keyword 'loop'. Trim to the duration you want. My best guess.
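The trimming step can be done in any audio editor, but as a rough sketch it can also be scripted with Python's standard `wave` module. This assumes the generation was downloaded as an uncompressed WAV file; `trim_wav` and the file names in the usage comment are made up for illustration.

```python
import wave


def trim_wav(src_path, dst_path, seconds):
    """Copy the first `seconds` of a WAV file to a new file.

    Returns the number of audio frames written.
    """
    with wave.open(src_path, "rb") as src:
        params = src.getparams()
        # Never ask for more frames than the file actually has.
        n_frames = min(src.getnframes(), int(seconds * src.getframerate()))
        frames = src.readframes(n_frames)
    with wave.open(dst_path, "wb") as dst:
        dst.setparams(params)    # same channels, sample width and rate
        dst.writeframes(frames)  # header frame count is corrected on close
    return n_frames


# Example usage (hypothetical file names):
# trim_wav("loop_60s.wav", "jingle_5s.wav", 5)
```

A hard cut like this can click at the end, so a short fade-out in an editor afterwards is probably still worth doing.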
is the model available to run locally?
Not yet. Coming soon ™️
Soon™
any word on API access?
https://www.stableaudio.com/ exists. There's an unofficial stable audio tools gradio created by DionTimmer. The problem has been that there is no officially released model to run it right now.
Hello everyone, my name is Angelo, and I'm the co-founder of BitSong. I would love to discuss a potential collaboration with one of the developers. Who can I contact regarding this?
Yesterday, the Harmonai team were showing off the early version of a 47 second freesound open source model that's currently in training. It sounded pretty good, so some good news there!
nice
so , on the licensing for Stable Audio, says the generations cant be used to train AI models. im really curious how well that policy would hold up legally speaking if the generated audio were edited /remixed /whatever and then used for training
fair use laws are kind of fucky... of course, i guess stable could ban from the platform and leave it at that
HI
Where do I find the Volume Slider?
🧐
good question 🙂
I guess I get to post this first then...
https://twitter.com/StabilityAI/status/1775501906321793266
Guys, the newest stable audio 2.0 model is available right meow. Go check it out! Each prompt is 2 credits, but you can generate the full 3 minute tracks now!
Introducing Stable Audio 2.0 – a new model capable of producing high-quality, full tracks with coherent musical structure up to three minutes long at 44.1 kHz stereo from a single prompt.
Explore the model and start creating for free at: https://t.co/E9ZIGagmPf
Read the…
Not only that...the radio is back!
Stable Radio, a 24/7 live stream that features tracks exclusively generated by Stable Audio.
Explore the model and start creating for free on stableaudio.com
wow, i love closed source model cash grabs
I will have to wait until we get the volume slider update though
lets try this
Very curious how the model does on advanced beatboxing techniques
Such as humming and beatboxing at the same time, or stuff like https://www.youtube.com/watch?v=5EADJGrNK3o
This is @denbeatbox second final round against @HissMusic in the Beatbox United Online Battle 2022 organised by Chezame and Sxin. #bbu22
Get your Merch at: https://clop-shop.com
Audio and thumbnail: Pono @ponobeatbox
Lyrics: Jacob Nicolas @mashendee
Filming and editing: Colin Rambaran @kauli11
CHEZAME & SXIN DISCORD:
https://bit.ly/3eG5p...
the like button on the generator is occasionally not doing anything
it seems like the autoencoder has a "snake" block in it.
Anyone want to confirm-not-so-confirm that it is a mamba SSM?
If I want to create a 24/7 channel on youtube and monetize it is it possible?
Can I monetize my music on Spotify?

Whether you can monetize AI-generated music on Spotify at all is still up in the air. https://www.bbc.com/news/technology-66882414
on the reddit post about stable audio 2, Emad mentioned among other things, the word "comfy", so does it mean it will eventually be available offline?
it's not a Mamba SSM, Snake activations are their own thing
Is there any plans to release the weights of Stable Audio 2?
We need a pinned message since this gets asked a lot.
According to that particular Reddit thread, I think Emad did mention that they will release another model trained on another dataset
Stable Audio 2 is pretty good though
Yep. The freesound open source model is still cooking. They used audiosparx to train a really good autoencoder though.
It is crazy how many of them wanted the model to be released, but I think that make sense.
Several people said that it is not as good as Suno, which I agree with. One of them said that AudioSparx stock music is not usually good, so the model turned out not very good either.
does stable audio have an api?
I just wish the license wasn't so restrictive.
the autoencoder is also trained on freesound
Oh, I didn't know. Interesting.
Hi all! Is it possible to generate voice for the songs?
Got this message while generating:
Error - ClientError: Received client error (400) from model. See the SageMaker Endpoint logs in your account for more information.
Is it possible to upload public domain songs? i.e. classical music that is over 200 years old and definitely in the public domain
Depends.. are you generating the song? If someone records a version of the moonlight sonata, they don't own the composition, but they do own the audio/recording. Are you reuploading someone elses recording of a classical piece, or uploading your own version of it 
Thanks for the reply! Beethoven's recording of Fur Elise is in the public domain for instance. Can that not be uploaded? Composed in 1810
Metallica
It wasn't recorded in 1810. You need to find out whether the actual audio recording is in the public domain or not 
The only recording you can legally use or make money from is one where the copyright on the recording itself is in the public domain
🕺
love it!!!!
is there an api for stable audio 1.0 and if not when is the api for 2.0 coming? Thank you!
Attempted some t e l e p a t h style slushwave manipulating a sample I generated 😛
Some of the music stable audio can create is actually kind of interesting but I honestly much more enjoy just seeing what kind of hellspawn I can create
woow this is huge. produced using v2
It may not sound interesting or even boring to you, but from a musical point of view, this thing is terrifying in its quality of playing, sustain, creativity and instrumentation, even better than Suno in that respect, and it just needs a little humanisation.
Stability AI is sweeping the field.
We'll see in a year's time where it will go, and this is a company that is bigger than Suno, rivalling Open AI and has a lot of experience in AI!
Oh my 😳😂
Final boss music
https://www.stableaudio.com/1/share/cdfe367b-b2c2-4b86-9c67-2c2cf3fc84f2
For such a silly prompt there is so much sound design potential here
Seems like a haunting vibe, like music from a scary movie, to me, but it also fits as a boss theme.
2nd try for the terrifying-core haunting piece of music
Seems enhanced
Imagine testing this on v3 lol
waiting for it
/a cinematic lyric opera vocal melody, powerful woman's voice
Here is the image you requested.
Hello community, is anyone having problems using stable audio? because from the first stable audio model I want to start using it and it gives me a sign saying that I was blocked and it won't let me enter
Can you add lyrics here like riffusion?
Stable Audio with ElevenLabs SFX!
A perfect match, took a few hours to concoct this one! Pretty happy with the results.
https://www.youtube.com/watch?v=gpRMX07VNIU
6 min Micro EP Created using Stable Audio and Elevenlabs.
I wanted to create a Mariachi Dub mix and Stable Audio is excellent at picking up on the nuances. I was invited to try out Elevenlabs SFX module and I use it liberally in this mix attempting to catch a Mexican radio vibe.
Here is my demo via Stable Audio.
Hey all! having so much fun with the new model. can anyone tell me what the licensing implications are for downloading the music, reworking it into a derivative version (like a remix you would make with Ableton etc) and then releasing that? I know this is a question for the team so point me in the right direction if you can. Thanks!
According to the website: "Use the music you create with Stable Audio in your commercial projects."
lmao that reminds me of some of those overdramatic bollywood movie scenes where everything is in slowmo and shown from 10 different perspectives
Hi, I've received the following error on a prompt: ClientError: Received client error (400) from model. See the SageMaker Endpoint logs in your account for more information. Any idea what went wrong and where I can find these logs?
"Sorry, you have been blocked
You are unable to access stableaudio.com" 
made a little track with only stable audio samples and my new favorite plugin (amigo sampler) 🙂 https://www.youtube.com/watch?v=c3NivDcgAA8
all samples are created with Stable Audio 2.0, a text-to-audio AI based on fully licensed music, which was released just a few days ago.
I think the results are the most useful and musical available so far!
You can try it at https://www.stableaudio.com/generate
Sampler used: https://www.potenzadsp.com/amigo/ (its 10$ and amazing!)
how can I extend a song I made using the stable audio website? is there any way to set the model up for audio completion
can stable audio be installed locally too?

hi guys! any tips for consistently generating new sounds in the same chord/scale as the input i provide? i can't seem to find the sweetspot... i find it easier to generate clean percussive material from an input audio, but harmonic stuff seems a lot harder to control
Yeah this is totally allowed, and one of the main intended use cases for the model
is there any way i can run stable audio locally?
The stuff my momma watches

Animation for the uploaded music, animation is an ancient Chinese style of animation
Here's the animation you requested 
A handsome man in ancient costume drinks in a pavilion by the lake, and a beautiful woman in ancient costume lies on a small boat on the lake and bows her head to play in the water
Here is an image you requested.
nuh uh
doubt it since they offer subscription based stuff? 🙂
tried out stable audio and played a small piece on my piano, close to 3 minutes long. unfortunately the output gets cut off. but the monthly fee is too much as I just want to try out a single song (in general all those AI subscriptions are unfortunately too expensive since i would not use them often enough)... Is there a different pricing model planned in which i can buy individual tokens to produce output? or another option i might have overlooked?
i have something in mind, wanna build an AI music producing tool where people can just sing to create the tracks, that would be awesome
when will stable audio release its api?
still no volume slider?
Great question 
I could use some help. I'm trying out SA. Probably using it wrong. I thought I'd have fun screwing around with a minimalist time-signature phase piece... like the 60s and 70s Steve Reich and friends. But apparently, you can't do accelerandos & decelerandos? I tried using "slow down" and "speed up" as well. I think I can cut and paste in a sound editor to do what I want. But it seems a doable thing? Here's part of the percussion stem I tried to execute.... "Solo symphonic percussion anvil only, no change in pitch, just a simple beat with gradual accelerandos, decelerandos as indicated. Anvil sets heavy 4/4 beat strong downbeat, diminishing echoes on beats 2, 3, 4. Start at 106 BPM, anvil only, 8 measures. Then 8 more measures Accelerando to 112 BPM, anvil only. 14 measures 112 BPM, anvil only. Then decelerando to 100 BPM over 8 measures, finally 100 BPM last 4 measures. 1. 8 measures, - solo Anvil sets a heavy 4/4 beat 106 BPM, strong downbeat, diminishing echoes on beats 2, 3, 4. 2. 8 measures 106 BPM gradual accelerando to 112 BPM, solo anvil. 3. 14 measures 112 BPM, solo anvil. 4. 8 measures 112 BPM decelerando to 100 BPM, solo anvil. 5. 4 measures 100 BPM, Solo anvil continues then ends. "
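If it helps with planning the cut-and-paste route, here is a rough sketch that estimates how long each numbered section of that prompt would run, so the pieces can be generated at fixed tempos and lined up in an editor. It uses the numbered list at the end of the prompt (8, 8, 14, 8 and 4 measures in 4/4) and assumes the tempo changes in equal steps on every beat during a ramp, which is only one of several ways to model an accelerando. `section_seconds` is a made-up helper.

```python
def section_seconds(measures, start_bpm, end_bpm=None, beats_per_measure=4):
    """Estimate the length of a section in seconds.

    Assumption: during a ramp the tempo moves linearly from start_bpm
    towards end_bpm, one step per beat. With end_bpm omitted the tempo
    stays constant.
    """
    if end_bpm is None:
        end_bpm = start_bpm
    beats = measures * beats_per_measure
    total = 0.0
    for i in range(beats):
        bpm = start_bpm + (end_bpm - start_bpm) * i / beats
        total += 60.0 / bpm  # duration of one beat at this tempo
    return total


# Sections taken from the numbered list in the prompt above.
sections = [
    (8, 106, None),   # 1. steady 106 BPM
    (8, 106, 112),    # 2. accelerando to 112 BPM
    (14, 112, None),  # 3. steady 112 BPM
    (8, 112, 100),    # 4. decelerando to 100 BPM
    (4, 100, None),   # 5. steady 100 BPM
]
lengths = [section_seconds(m, start, end) for m, start, end in sections]
for number, seconds in enumerate(lengths, start=1):
    print(f"section {number}: {seconds:.1f} s")
print(f"total: {sum(lengths):.1f} s")
```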
hi, I'm just wondering if it's possible to install stable audio locally?
This is a GUI developed by DionTimmer, but until a model drops this isn't that useful.
Hoi, is there a A.I generator similar to UDIO , but locally ran?
AI Music - Wow..... I Didn't Know That (Udio)
Song generated with www.udio.com (beta version) by: BobbyB
Prompt used: Americana, Country, Bluegrass, Melodic, Passionate, Lush, Rhythmic, male vocalist, anthemic, uplifting, pop, playful,
Male vocalist, Northern american music, Country, Regional music, Contemporary country, Contemporary country, ...

Country, catchy, song. Listen and make your own with Suno.
embed fail
Will stable audio 2 be open sourced?
not yet, but i eagerly subscribe to Google scholar alerts for text-to-music papers in hopes of finding a solution once any emerges. Suno and Udio seem to be a few months to a few years ahead of SOTA text-to-music open source (which languishes in MIDI and non-lyrical content)
https://paperswithcode.com/dataset/magnatagatune this is probably a critical dataset for any future model
one could probably fine tune a TTS model to music mostly easily with a few hundred $$$ (my best guess) https://github.com/jasonppy/VoiceCraft new model and code drop for audio
MagnaTagATune dataset contains 25,863 music clips. Each clip is a 29-seconds-long excerpt belonging to one of the 5223 songs, 445 albums and 230 artists. The clips span a broad range of genres like Classical, New Age, Electronica, Rock, Pop, World, Jazz, Blues, Metal, Punk, and more. Each audio clip is supplied with a vector of binary annotation...
Stable Audio 2's paper is out.
Audio-based generative models for music have seen great strides recently, but so far have not managed to produce full-length music tracks with coherent musical structure. We show that by training a generative model on long temporal contexts it is possible to produce long-form music of up to 4m45s. Our model consists of a diffusion-transformer op...
please open source it!
```
music with vocals. Our focus is on the generation of instrumental music, so we do not provide any conditioning based
on lyrics. As a result, when the model is prompted for vocals, the model’s generations contains vocal-like melodies
without intelligible words. Whilst not a substitute for intelligible vocals, these sounds have an artistic and textural
value of their own. Examples are given on our demo page.
```
this makes me sad
maybe it could be fine tuned with an emphasis on text-to-music with vocal lines
SOTA text-to-music vocals is still proprietary et al
Indeed. The thing if not intelligible, could possibly for now be used to hum along the tune. Like those dance/trance songs with women who just hums after the song. Or like orchestral fantasy where the woman just says "aaaa, ooooo" along with the fantasy music
i guess one could:
- produce text-to-speech of the lyrics via the tool of your choice in file1.wav
- produce a song in the style one wants "anti lullaby, dubstep" whatever in file2.wav from stable LM
- using stable LM maybe transfer style file 2 to file 1
4)???
use text-to-speech and then use some style transfer
Must be similar to how people get game characters to sing
Transferring style file (in this case, the original song) to file 1 (text to speech of the character from preferred tool) and do whatever’s next
so how do i run this locally. lol, worth a try....
Yes why isn't this one available locally anywayz...
It's been explained a few times if you read back through the thread but the short answer is "revenue split with Audiosparx".
im just gonna continue crying ok?

so SD basically used these sparx clown's audio library to train and this didn't work out like with Getty hmm?
Doesn't surprise me, the music industry is more copyright anal-retentive than visual arts
whatever
they'll go in the dust like hollywood
This is an output from the FreeSound model. Sounds pretty good imo.
Do you have a source for that?
The source is the freesound audio model weights that are coming out really soon.
Is there a github page for that?
I have no idea how inference for that is done. Tools etc...
This is the official s.ai repo, but without the weights this doesn't do anything.
i don't care if it's not trainable and I have to use the "base audio model"
but is it possible to run locally and is it any good?
and can it do lyrics?
Like is there a webui interface for this like A1111
I can't do this without an interface
my brain isn't wired like that
ha wired.... i get it... :3
wait anime name?
has anyone else subscribed to the 'studio' tier of Stable Audio but not gotten the promised 60 minutes of upload capacity? Mine capped out at 30 minutes even though I am subscribed to the tier 'Studio'
i sent a support ticket in through the website and never got a response
No clue, sorry xd
I believe it’s Lycoris Recoil.
I'm just here tryna feel sorry for myself, then Stable Audio comes up with some crazy trumpet thing at the end XD: https://stableaudio.com/1/share/53e29eb2-15f5-433b-bee2-5e2432efb864
OH MY GOD this is incredible XD please post this on youtube or some meme page
It's not mine. Probably should have made that clearer.
Hello! This is very exciting to hear, since I commonly use Freesound to get SFX, samples and sound design, but almost never seem to find exactly what I'm looking for. Having an audio model based on that website would be revolutionary! Anyway, I haven't heard about that model variant anywhere else. I've heard there is a local version in the works, but this is the first time I've seen results and the dataset mentioned. Would you mind telling me where I can inform myself or where you got that information? Thanks!
Join the Harmonai discord 😉
Once upon a time, there was a little turtle named Xiaoming who lived in a beautiful lake. Xiao Ming enjoys exploring and making new friends very much. One day, Xiaoming heard that there was a mysterious garden in the forest near the lake, filled with various delicious fruits. He decided to search for this garden.




Are there any devs here interested in working on an audio project with stable audio?
This is like the saddest, loneliest SD room.
Tends to happen when no weights 🤷‍♂️
It's coming soon ™️ though.
i hope they release some version of stable audio 😦
i mean the community can then improve it perhaps
I was told late May. That's the most up to date number I have.
told by who? just curious of the source :3
Stable Audio lead: Fauno15.
kk, well let's hope we have something
That model is a 47 second model trained on Freesound's audio library so the idea is that the community could train off of that model when it releases.
agent 47 would be proud :3
how can i run this thing? i have magnet and audiocraft but this might be better?
no ai
the stable audio website has crashed or something. the live audio stream isn't running and nothing else works.
Why isn't Stable Audio offering the ability to extend a song?
because that's something that LLMs do natively, but is trickier with diffusion models
The Freesound 47 second open source stable audio model is still on track to be released soon ™️.
Does uploading audio work for anyone here? It's been a few days and it seems to still be broken, and the website and web app are currently barely even functioning.
I suggest making it open source so we could all help make it that much better. The tech is there, just the app is bonkers 🫥
Should be coming really soon ™️
what should? fixing the web app or..?
There's going to be a 47 second open source model trained off of FreeSound's audio library released really soon. You will be able to use stable audio tools with those released weights to run open source.
As for the web app, I can pass that up the chain of command and see if there's some bug to fix.
Are you trying to upload copyrighted music? They block copyrighted music btw.
Got it, thanks, no, I'm literally just trying to upload and record my own sound, whether it's beatboxing, guitar, drums, and nothing is working right now.
It even says to contact support
Since a few days ago, uploads stopped working completely, I've also seen various users report this on Twitter/x
This is the main differentiation by the way from stable audio to any of the other current tools is that you can upload your own audio and modify it.
Without this feature it's simply not as good as Udio, Suno, ElevenLabs, etc. (And it's a shame because I'm rooting for Stability and want them to do good)
it's the same when uploading audio btw (not just recording) - looks like the processing fails
Notified the service team about this, hoping it gets cleared up soon
It's coming soon guys, get your audio datasets ready for fine tuning. 🙂
Sweet! Any timeline? Everyone's clamoring over SD3, but this is just as exciting for me as someone who lacks talent to make music from scratch 🤣
Harmonai just had their office hours and they are aiming for the release of the freesound model next week, provided there are no delays.
Fingers crossed
That goes in line with their previous goals of having it out by the end of May.
That is fantastic news, thanks for the update! I really appreciate it
You can keep up with the news on the Harmonai discord as well. That's one of the best places to keep up to date on progress.
Just found it and joined. Thanks for the heads-up!
@tight anchor any updates about the audio uploading? still not working...
Not sure what's making it time out for you. It probably requires more information about your account to debug. Have you filled out a support ticket for it?
worked now.. checking if it works twice in a row!
also, does Stable Audio have an API by any chance?
oh good! It was working on my side, wasn't sure what was different on yours.
It's something we're working on, yeah
soon i have two new music hobby choices - new guitar amp, or more vram 🤔
@tight anchor Please add the ability to download the uploaded audio - it's super critical IMO, and should be easy to implement
right now you can only use it as "input", and you can download history outputs (but not uploads)
specifically relevant when I'm using the Stable Audio UI to record (not upload files)
Ran into an issue - when launching the file my antivirus told me it was dangerous. Should I allow it anyway 
No it's a virus
Can anyone from stable audio reply my email? I have sent emails to get stable audio's API and start API integration~~
wish to get feedback and start cooperation asap
Different
Say, I have a generation that's stuck generating. Can it be canceled somehow to allow me to generate again?
Udio and Suno can create whole songs with lyrics, whereas stable audio only creates audio snippets. They just do completely different things currently.
anyone else having issues with StableAudio timing out constantly when using input audio on generations?
looking at all of your generations, you're all having the same issue @storm briar
did the issues clear up for you yet?
been paying for stableaudio for about a week
was working fine for me up til yesterday
🤔
Yeah, that's what I was asking about above actually
No, still having issues...well since last night
Actually, no problems so far this morning
New system to accompany your sonic creations https://www.youtube.com/watch?v=9DbRJDitVhA
You can access this new patch, plus many more systems, experiments, and tutorials, through: https://linktr.ee/uisato
#touchdesigner #stablediffusion #visuals
Hoi, do you guys know if there's an AI audio upscaler/upsampler that can read a song and add the missing high-frequency detail that FLAC would otherwise preserve?
I highly doubt this. If it existed, it wouldn't exactly 'give it the quality it would otherwise have'. I think of the example of Samsung phones using AI upscaling on moon photos, and adding details that aren't actually present on the moon. If an AI upscales audio, even if using a relevant and accurate dataset, the result will not be 1:1 with what the audio would have been had it been recorded/processed/kept uncompressed. That said, I'm sure AI audio upscaling will become a thing and, at its best, be sonically indistinguishable from original uncompressed files. It'll probably be a while before we have that, because the use case is very niche
The thing is, with the moon photo, it's because the moon is always showing the same face, so Samsung could easily just store a bunch of photos with different shades of "sun", or yellow/redness, thus it tricks the user into thinking they took that photo.
But with AI upscaling/upsampling of audio, it'd be trained on 100s or 1000s of songs in FLAC quality for that upper range that MP3s don't have, and add the missing higher range.
You make a good point about the moon showing the same face. Those thousands of flac songs don't show the same upper-frequency face though, and their individual quality will vary widely. You will need a tremendously large dataset of each genre of music (to name one of the many desirable variables), in lossless quality for training. Most music is streamed or otherwise available in mp3/lossy formats.
side note, the average audience would rather listen to an mp3 than a flac, because they're used to hearing mp3 quality, and the additional information presented in the flac is perceived as artifacts. At least, that's what I've found with my students and peers. This isn't a recommendation to pursue lowres content by any means - just an interesting note. I prefer wav files any and all day.
I'm an audio engineer, but no expert on AI. My understanding is quite limited in that respect. I think the combination of audio upscaling being a rare use case (right now, at least) and the datasets for such training being so limited, would severely prolong the wait for that iteration of AI technology.
@coarse lantern 
Aye
Cause like with images, with fingers missing/melted into each other, it will of course most likely not be "great", but first one needs to train a model on the higher ranges :P
All I'm waiting for now though, is an open source version of Udio, which can actually make quite damn good songs
Hello, Version 2 is not working. Support did not reply to my mail. Is there a bug?
Ooh nice
I haven't been following this, are there sample generations somewhere?
hey guys whats the automatic1111 of stable audio open?
DionTimmer's gradio GUI probably, but I'm not sure it's updated to handle SAO just yet.
Stable Audio Tools (Main Repo):
https://github.com/Stability-AI/stable-audio-tools
oh nice they finally released something for audio 😮 now we need this in comfy somehow :3
Dion's on tour right now so he probably won't update his repo for a little bit.
@me when a comfyui implementation for stableaudio is ready
Ive spent 4 months working on an animation thats comprised of entirely open source ai tools
and this final hurdle is killing my spirit
sometimes it works; most times it doesnt
this is SPECIFICALLY an issue with Input Audio
can this technically be finetuned as well?
controlnet for audio 
Yes, you can fine-tune your own models now!
nice
do u know what's happening with the timeouts?
I don't unfortunately. I'm not as familiar with the website.
Might be a good question for #🤝|tech-support
Thanks for the link. I wish stability AI would not call a model with -nc license restriction "open source". There is - per se - nothing wrong with a -nc license model, it is simply not "open source" in the traditional software sense.
There's a Gradio interface built in to stable-audio-tools. If you've accepted the model terms on Hugging Face, you can launch it with: python3 ./run_gradio.py --pretrained-name stabilityai/stable-audio-open-1.0
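For anyone who'd rather script it than click through Gradio, here's a rough sketch of doing the same thing from Python. It assumes stable-audio-tools, torch, torchaudio and einops are installed and that you're logged in to Hugging Face with the model terms accepted. The calls and sampler settings follow the example on the stabilityai/stable-audio-open-1.0 model card as far as I remember it, so check them against the current card. `build_conditioning` and `generate_clip` are made-up helper names, not part of the library.

```python
def build_conditioning(prompt, seconds_total, seconds_start=0):
    """Build the conditioning list the sampler expects (one dict per batch item)."""
    if seconds_total <= seconds_start:
        raise ValueError("seconds_total must be greater than seconds_start")
    return [{
        "prompt": prompt,
        "seconds_start": seconds_start,
        "seconds_total": seconds_total,
    }]


def generate_clip(prompt, seconds_total, out_path="output.wav", steps=100, cfg_scale=7.0):
    """Generate one clip and save it as 16-bit WAV (hypothetical helper)."""
    # Heavy imports stay inside the function so build_conditioning works
    # even on a machine without the GPU stack installed.
    import torch
    import torchaudio
    from einops import rearrange
    from stable_audio_tools import get_pretrained_model
    from stable_audio_tools.inference.generation import generate_diffusion_cond

    device = "cuda" if torch.cuda.is_available() else "cpu"

    # Downloads the weights on first use (needs accepted terms + HF login).
    model, model_config = get_pretrained_model("stabilityai/stable-audio-open-1.0")
    model = model.to(device)

    audio = generate_diffusion_cond(
        model,
        steps=steps,
        cfg_scale=cfg_scale,
        conditioning=build_conditioning(prompt, seconds_total),
        sample_size=model_config["sample_size"],
        sigma_min=0.3,
        sigma_max=500,
        sampler_type="dpmpp-3m-sde",
        device=device,
    )

    # (batch, channels, samples) -> (channels, samples)
    audio = rearrange(audio, "b d n -> d (b n)")

    # Peak-normalize, clip, and convert to 16-bit PCM before saving.
    audio = audio.to(torch.float32)
    audio = audio.div(audio.abs().max()).clamp(-1, 1).mul(32767).to(torch.int16).cpu()
    torchaudio.save(out_path, audio, model_config["sample_rate"])
    return out_path
```

Calling generate_clip("128 BPM tech house drum loop", 30) would then write output.wav to the current folder. Expect it to be slow without a GPU.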
Congrats on the successful model release!
Thanks!
So here's the golden question: who's gonna be the first to make a repo for running this on a bot in discord?
I have created a Colab Notebook where you can try Stable Audio Open 1.0 immediately. Please feel free to use it.
https://x.com/xqdior/status/1798431457096114345
Since this is an announcement account with low traffic, if you find it useful, we would appreciate likes & reposts.
Nice work! Thanks!
are stable audio devs in the tech support channel?
because if not, it wont be solved in there
i mean this stable audio is not open source lol
i have heard that nodes will be added to comfyui? and will we be getting code for finetuning? :3
DionTimmer was messing around with comfyui node implementation a long while back. I wonder if anything came of that little experiment of his.
oh there needs to be a civitai website but for audios
RIAA:
Always considered giving ComfyUI a try but now I might as well do it just for Stable Audio 🤣

Dropped a node: https://github.com/lks-ai/ComfyUI-StableAudioSampler .. community is iterating from the model weights
gonna look at the gradio tomorrow and see if I can do this initaudio stuff in the Node
You gonna join us for office hours on the Harmonai discord tomorrow?
I'm planning on fine-tuning soon when I get my library in order.
remind me if its finished
i would want to try some hyperpop
i also wish stable audio has lyrics input...
the prompt adherence prob isn't good because it's kinda undertrained (i tried burger mukbang once)
tbh it stays stuck at 0:47 even though i set the end seconds to 20 etc.
Perhaps. What time?
gibberish amen break
hey i can't find the node inside ComfyUI
i installed it
and it doesn't appear in my search
.............................

I'm booked @ the studio. Though I am interested. Another time perhaps.
Another time ig.
udio keeps being updated
what are we going to do
how are we going to kick suno and udio's asses?
i think we should start by finetuning the model we have on actually good data.
Is there a way to use it in ComfyUI without an HF token yet?
yeah that's the only thing keeping me from using it offline
https://huggingface.co/audo/stable-audio-open-1.0 use this repository
Thanks, but this time around I'm patient enough to wait for a GUI. It's pretty neat - I simply didn't check on Audio AI as much... and here we are. 😄
yeah I'm using it with python directly, just thought the extension might allow you to change the repo
but there are GUIs already
There is a Comfy node - but that needs a HF token currently.
hmm why?
there are gradio GUIs also
Because it dl's the Model instead of looking for a local one. I guess there will be an update in a day or two.
I know there is Audiocraft, but there was no update for a month.
Is Stable Audio broken? These are my recent generations...
Almost never works and I'm paying...
no, i think the stable audio github has a GUI for their model
constantly getting charged credits and then getting this error
using the 2.0 model
Hi, I don't understand this part
I downloaded model.ckpt and model_config.json
device cpu? Also I don't see how to select the config file
I had the same problem. I put the checkpoint together with the config in a separate folder within the models folder.
For cuda you need to do this here:
Open cmd shell in the folder
activate venv with venv\Scripts\activate
pip uninstall torch
pip cache purge
pip install torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/cu121
I have AMD! I guess I need to install torch nightly
ty!
Oooooh. Ok.
If the model thing still doesn't work, try renaming the model_config.json to model.json
Heads up. That diontimmer repo was made before the release of SAO, so if you are having problems, you might want to try getting the main repo running first.
And maybe restart the whole thing. Took a while until i got it recognizing the model.
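The folder layout described above (checkpoint and config together in their own subfolder of the models folder, config possibly renamed to model.json) is just what worked for one user here, not documented behaviour. Under that assumption, a small sketch like this can tell you whether a folder looks right before you restart the GUI. `check_model_folder` is a made-up helper name and the file names are the ones mentioned in the chat.

```python
from pathlib import Path

# Config file names mentioned in the thread; either one is accepted here.
CONFIG_NAMES = ("model_config.json", "model.json")


def check_model_folder(folder):
    """Return a list of problems found in a model folder (empty list = looks fine)."""
    folder = Path(folder)
    if not folder.is_dir():
        return [f"{folder} is not a directory"]

    problems = []
    # Any *.ckpt file counts as the checkpoint (e.g. model.ckpt).
    if not any(folder.glob("*.ckpt")):
        problems.append("no .ckpt checkpoint found")
    if not any((folder / name).is_file() for name in CONFIG_NAMES):
        problems.append("no model_config.json or model.json found")
    return problems
```

For example, check_model_folder("models/stable-audio-open") returns an empty list when both files are present (the path is only an illustration).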
you mean this one? https://github.com/Stability-AI/stable-audio-tools
Ooofff, sorry, this is out of scope for me. I am happy that i got my thing barely running. But it feels like breaking every second. I am hoping for a usable comfyui implementation soon.
oh no problem
Yep. That's the official repo
is it only for training, or can you generate too?
I believe it does both but training requires a little more setup.
The diontimmer repo is using stable-audio-tools
ah but it doesn't have a gui
are there any prompting docs?
yea will need to finetune
So I purchased a subscription to Stability Audio and it says Im still on the free plan. Is this a glitch?
Does it work with 8gb of VRAM?
is there some ok gui for stable audio?
is there a way to make audio that is for example 4 seconds long without having 43 seconds of silence?
yes
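One way to do it, as far as I can tell: the model renders its full window (about 47 seconds) no matter what length you ask for, so you trim the result afterwards. A minimal sketch, assuming the audio is a list of channels with one list of samples each, at 44.1 kHz. `samples_for` and `trim_channels` are made-up names.

```python
def samples_for(seconds, sample_rate=44100):
    """Number of samples that cover the requested duration."""
    if seconds < 0:
        raise ValueError("seconds must not be negative")
    return int(round(seconds * sample_rate))


def trim_channels(channels, seconds, sample_rate=44100):
    """Keep only the first `seconds` of every channel."""
    n = samples_for(seconds, sample_rate)
    return [channel[:n] for channel in channels]
```

With a torch tensor or numpy array shaped (channels, samples), the same idea is just audio[:, :samples_for(4)].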
i just read this post and they say Stable Audio Open is also a diffusion transformer architecture?! is that true? https://wandb.ai/byyoung3/ml-news/reports/Stability-AI-Releases-Stable-Audio-Open--Vmlldzo4MjQyNTQ3
"The model details reveal that Stable Audio Open 1.0 is a latent diffusion model based on a transformer architecture. It leverages a pre-trained T5 model (t5-base) for text conditioning, converting text prompts into numerical embeddings that guide the audio generation process. The model was trained on a dataset consisting of 486,492 audio recordings, including 472,618 from Freesound and 13,874 from the Free Music Archive (FMA). All audio files are licensed under CC0, CC BY, or CC Sampling+, ensuring respect for creator rights while providing a robust dataset for training."
Yes SAO is diffusion based architecture, unlike Suno which is LLM based.
the transformer thing surprised me, not the diffusion thing
i thought only Stable Audio 2.0 was a diffusion transformer
ohh it works 🎧
Can anybody point me to a step-by-step tutorial about setting up Stable Audio Open as a standalone package, preferably with a Gradio or ComfyUI interface? Ameerazam08 has a nice free interface here. https://huggingface.co/spaces/ameerazam08/stableaudio-open-1.0
Adding a link to a ComfyUi workflow - Thanxx Nate https://github.com/lks-ai/ComfyUI-StableAudioSampler
any chance to use stable-audio-open-1.0 with cpu only?
i mean how long does a prompt generation take with an average 4 core cpu?
I've used SAO on my M1 macbook CPU only and it takes around 12 minutes to generate a sample from prompt
Saganaki22 has a one click installer - which worked for me. https://github.com/Saganaki22/StableAudioWebUI
Tried it out and it works for me. If you need a simplistic solution for installing, this one is a good choice.
daim its good xD
no its not
even version2 doesn't give me anything good
the old google AI music lm was actually good but low quality but they nerfed it
So this is out? 😮
Channel will be active? 😄
let's go anywhere i can try online?
lol wtf
sounds like it just gave up and made some random cartoon noises
This is a different open source model
Funny that you say that, my prompt said wacky cartoon sound effects
On Hugging Face Spaces
I got a chuckle out of the furry on a train track pfp. Nice.
Hi everyone
I used Stable Audio (free tier) on the website to mix some music, but it doesn't work.
All of my attempts to generate with uploaded input audio fail with a timeout 😦
Is anyone else having a similar issue?
The problem only occurs with uploaded input audio!
this was the prompt: "clown, Glitch , Goofy, Wacky, cartoon, sound effects, . clown. random sfx sounds, peenis , experimental"
This model compared to Udio is like comparing a starting guitarist to a rockstar
It's cool, but I just wish there was more genuine coherency
It's more for sound effects
In that case it is extra accurate at prompt following.
literally... starting to practice guitar.
ring ring ring
is there any tutorial on creating Loras for the new model?
a drummer could fine-tune on samples of their own drum recordings to generate new beats
ERROR: No matching distribution found for pedalboard==0.7.4
who knows how to solve this problem?
Defaulting to user installation because normal site-packages is not writeable
ERROR: Could not find a version that satisfies the requirement pedalboard==0.7.4 (from versions: 0.8.2, 0.8.3, 0.8.4, 0.8.5, 0.8.6, 0.8.7, 0.8.8, 0.8.9, 0.9.0, 0.9.1, 0.9.2, 0.9.3, 0.9.4, 0.9.5, 0.9.6)
ERROR: No matching distribution found for pedalboard==0.7.4
but Stable Audio requires 0.7.4.
Im not sure if this will help, but I tried this repo and it worked on my PC https://github.com/Saganaki22/StableAudioWebUI
Nobody I'm aware of has made a walkthrough except for @spring girder
00:00 - Getting started
03:18 - Installing libraries & model download
20:31 - Switch to Jupyter Notebook with A6000s
26:41 - Dataset upload & organization
40:10 ...
She was on the ball day one. Actually kinda impressive!
I wonder if they want to add ComfyUI support
no(
What gpu do you have?
4070 super
problem with ModuleNotFoundError: No module named 'packaging' but I've already installed packaging
Gpu bat
That´s so strange
There are a million tedious things I want an AI to do in my DAW. Not a single one of them ever involves generating low quality audio samples. We have tons of fantastic audio samples floating around Splice already, and also quite advanced tools which can synthesize mathematically perfect waveforms for anything else conceivable. Why would we want worse audio samples from an AI? Why are developers not instead building AI enabled DAW plug-ins to help producers? Instead of this pointless fart-generator, I want a DAW co-pilot like what's available for coding developers. I want to prompt my AI-enabled DAW to do tedious things that cost me time and money. Why isn't AI doing tedious things for creative people, rather than doing creative things for tedious people?
Example prompts:
Phase-align this group of multi-mic tracks, take this mono track convert to mid/side and generate side audio content to make stereo, take these three existing tracks and build an 8-bar riser out of them ending at this timestamp, take this Amen drum break and chop it up along transient onsets, dynamically suppress resonant frequencies between 200-800hz on this piano track, generate three harmonies to this lead vocal using 4ths and 6ths, create 8 background vocal harmonies to support this chorus and phase-align them all, group everything but the vocals and kickdrum into a bus and enable sidechain compression using the kick as input, replace the transient on this kick drum 10 times and let me pick the best one, take this audio sample and use it to create an Andalusian cadence in D minor, create a reverse reverb fade-in for this lead vocal for 1/2 measure, and on and on and on....
As a music producer, this is what I want from an audio AI, not deriving low-quality samples from high quality samples.
also, hi.
What you want is a large action model (LAM).
I like you. Welcome.
AI is absolutely incapable of that level of music understanding due to a tremendous lack of training data. If you fine-tuned ChatGPT 4 in conjunction with Suno on musical (mostly just tedious processing) tasks such as those, you might be able to get close. But obviously that isn't a possibility rn.
Workstation-wise, it'd definitely have to be AAX plugin format. One of my producer buddies uses the GPT API to have his Reaper instance do stuff based on voice commands, which is sick af and saves him a lot of time, but even then, it isn't capable of listening to the session / knowing it to that degree.
I too want that kind of functionality from an AI. I also want an AI that cleans my bathroom and does my taxes. But those don't replace the workforce in bulk, so they won't be as profitable for the big companies. Probably. So in the meantime, I'll just keep making music the long way. I'd lightly argue the fun way. 
Hard disagree, in fact one of these examples is already being worked on. The problem is developers are focused on the completely wrong issues when it comes to AI as a tool for music production. If producers were involved in the development of these tools there would be more than enough training data. There is a disconnect between Silicon Valley and the rest of the industry which caused this, and that needs to be fixed. So far generative AI has been a solution for problems which don’t exist. Stable Audio is basically the “Not Hot Dog” app from the TV show Silicon Valley, only less useful.
A little trick for widening a track using Combobulator by @datamindaudio.
Watch the full tutorial in the HCA Feed (if you aren't a Hardcore Abletoneer yet, subscribe today—link in bio).
#musicproduction #datamindaudio #combobulator #stereoimage abletonlive #ableton #abletontips #beatmaker #beats #electronicmusic #mrbill
Amen
this channel on fire now
Hard agree 
I don't keep up with this stuff so I speak ill-informed 
Not sure what I want but stitching songs together with outpainting seems like a fun idea.
what are you guys using as a ui?
Hi, I would like to ask: if we buy a "Professional" license, how does it work if we cancel it after 5 months? Will the music that was generated still be allowed in the project, or will we have to delete it?
Don't speak for all creatives and music producers. I couldn't wait for an open source sample gen. I'm training currently on my own samples.
If your goal is to turn high quality samples into a trash generator, sure. Knock yourself out. My point is there are fantastic uses for AI audio, which the developers aren’t doing because they lack insight into the needs of most people in the industry.
👋 curious, you switched to DiT in the stable audio 2.0, just like in SD3, but still use latent diffusion, while SD3 started using flow matching (FM). Is there a particular reason to not use the potentially more efficient FM in the stable audio 2.0 and onwards ?
stable audio is amazing for noise/experimental stuff, you can generate all kinds of crazy things, guess you think it's all garbage but hey theres more to samples than shiny commercial packs
Correct these days I need something that works quickly. I don't have time to roll a dookie log in glitter to see whether anyone can tell if it still smells or not
come on man show us your duck, post your music 😁
I'm just using the default Gradio included in stable-audio-tools right now. Tech too new to have a lot of shiny toys at the moment.
If you can't flip a sample into something great, it's not a problem with the generator, it's more likely a skill issue. But if it's such a need for "most" industry people, get a dev, some vc money and make it happen.
It's not a matter of flipping a sample, the point is we already have a squillion ways to get good samples already without inferior AI generation. What we need is AI tools which are time-savers to most producers. Sample generation is not.
Yes it's a sample generator, but if the community eventually develops an "auto1111" tool or creates extensions for audacity I could see a sample generator like this become helpful especially from the audio2audio aspect.
if you mean the AI should be doing all the hard work, hell yeah it should, but that's a vastly different problem than training a model for samples, you are talking about deep integration with DAW processes, VSTs, etc... not something we have today
Really early audio2audio example:
Like September 2022 early.
Organo audio source: https://youtu.be/ydUSm7BoMwk
Provided to YouTube by [Merlin] RealPlayazLtd
Mr Happy · DJ Hazard · D Minds
Mr Happy / Super Drunk
℗ Real Playaz Ltd
Released on: 2007-09-24
Auto-generated by YouTube.
You should stop speaking as "we". It's just your personal standpoint. AI sample gen is a big time saver. Instead of wasting hours searching for the perfect sample, I just describe it to the AI and get it.
banger
But it IS something we have today. Check the combobulator link I posted above, it solves a very specific problem using AI quite effectively. This is an example where the tools are applied to a problem and not the other way around.
post your music right now
i couldn't get stable audio tools installed. it kept giving me an issue with the packaging module
Made a visualizer for my track.
Images were generated using Stable Diffusion.
Overlay was made with Canva.
Visualizer built with After Effects.
anyone know how to get ComfyUI-StableAudioSampler running? i am getting the error: ModuleNotFoundError: No module named 'packaging'
i'll be your bestie for the restie
Same idk how to fix it
i tried a bunch of stuff last night its a pain
did you try navigating to the root directory of your local clone of the stable-audio repository and installing the packaging module?
You might try running pip install packaging in the root directory of your repo folder.
It's a dependency issue, that's all. You're missing a module that is required to run whatever it is you're doing.
You may have to trace the error back to a deeper problem if installing the module doesn't prove successful.
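One thing worth ruling out first: pip may be installing into a different Python than the one that actually runs ComfyUI (embedded Python, another venv, etc.). A small sketch using only the standard library. Run it with the same interpreter that launches the app. `report_packages` is a made-up helper name.

```python
import sys
from importlib import metadata


def report_packages(names=("packaging", "setuptools", "pip")):
    """Return the interpreter path plus the installed version (or None) of each package."""
    report = {"python": sys.executable}
    for name in names:
        try:
            report[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # Not installed in THIS interpreter's site-packages.
            report[name] = None
    return report


for key, value in report_packages().items():
    print(f"{key}: {value if value is not None else 'NOT INSTALLED'}")
```

If packaging shows up here and the error still appears while building flash-attn, the build is probably running in pip's isolated build environment. The flash-attn README suggests installing with --no-build-isolation for that case.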
i installed it to the env site-packages
it looks like its a bug with flash-attn>=2.5.0
it might just be incompatible with 3.12
(.venv) PS P:\ComfyUI-ZLUDA> pip install ../GitRepos/flash-attention
produces:
ModuleNotFoundError: No module named 'packaging' [end of output]
that's directly from github
pip install -r requirements.txt
You may need to upgrade pip to latest version
i have
"Might work for Windows starting v2.3.2 (we've seen a few positive reports) but Windows compilation still requires more testing. If you have ideas on how to set up prebuilt CUDA wheels for Windows, please reach out via Github issue."
i noticed lumina has a music model, anyone tried it?
ComfyUI plugin?
theres no such thing
https://github.com/Stability-AI/stable-audio-tools
cd stable-audio-tools
pip install build
python -m build
https://i.imgur.com/qeo724S.png
how to install it
pip install <*.whl>
i didnt get it to work yet
if i do ill do a video on it
@summer scroll have you tried with this one? https://github.com/Saganaki22/StableAudioWebUI
yes i tried it its under trained for now
@pine leaf bro, I already have Gradio running, and I bet you haven't configured the environment yet. Try installing the missing packages with pip install...
has anyone had the flash-attn install stuck here?
im on python 3.8, cuda toolkit 12.4, and torch 2.2.2+cu121
how can i start stable audio open interface again, it opens automatically when downloading it for the first time but now I can't find the .bat file to open it 😦
@opaque wren pro, I successfully installed flash_attn. If you are on Windows like me, run this command first:
(pip install torch==2.1.0 torchvision==0.16.0 torchaudio==2.1.0 --index-url https://download.pytorch.org/whl/cu121 )
Then check the releases page (https://github.com/bdashore3/flash-attention/releases) and download the flash_attn build that matches your environment (Python, torch, and CUDA versions).
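To pick the matching prebuilt wheel you need to know your Python tag plus the torch and CUDA versions. A quick sketch that prints them. It assumes nothing beyond the standard library, and torch is optional (fields stay None if it isn't installed). `environment_tags` is a made-up helper name.

```python
import platform
import sys


def environment_tags():
    """Collect the values usually encoded in a prebuilt wheel's file name."""
    tags = {
        # e.g. "cp310" for CPython 3.10
        "python_tag": f"cp{sys.version_info.major}{sys.version_info.minor}",
        "platform": platform.system().lower(),
        "torch": None,
        "cuda": None,
    }
    try:
        import torch
    except ImportError:
        return tags
    tags["torch"] = torch.__version__
    # torch.version.cuda is None on CPU-only builds.
    tags["cuda"] = torch.version.cuda
    return tags


print(environment_tags())
```

Compare the printed values with the wheel file names on the releases page and grab the one where all of them line up.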
Uff thanks for that, ill give it a shot
oh
but you can try the web demo they have to see for yourself. but i think Stable Audio Open is better
i cannot find it
on the github there are also links to their other demos https://github.com/Alpha-VLLM/Lumina-T2X?tab=readme-ov-file
oh
but i still like the lumina project. it's open source and they even plan to make a model that has multiple modalities. like it can make videos, images, audio and so on, all in one model
the image rendering isn't that good, but
are they the same ppl that made LUMA?
no, they are Chinese. but some people that worked at Stability made the Luma video model
yes they say its still under development
they will also release a text to speech model and other stuff
thats a big passion
16 color vae lumina t2i when
the problem with changing to a different vae is that you probably have to start training the model from scratch. that could be too expensive or time consuming
sad
couldve been 16 color channel at the start, i would want to see how would they do with text in images like sd3
https://stackoverflow.com/questions/78604018/importerror-cannot-import-name-packaging-from-pkg-resources-when-trying-to pip install setuptools==69.5.1 https://i.imgur.com/BuDDE3T.png
im noded up
those are rookie numbers :3
Why are stable audio servers always pending recently???
I just realize why don't they call the model "Stable Audio Diffusion". That would be SAD.
i asked and they said they are experimenting with 16 channels now xD
its called stable audio open. so its sao
YEAAH
im hoping they would fix the quality rendering, hands etc
id imagine what would it be with text generation in images
yes but a lot of these models are undertrained and then finetuned on a small subset of aesthetic images. its not really optimal
but it saves cost
i just installed the webui for stable audio but for some reason it's using my CPU for generating. does anyone know how i can change it to my GPU? i've installed all the dependencies. also stable diffusion runs on my gpu just fine so i'm confused lol
this is the best technology to tap out a beat and turn it into drums right ?
You can run nvcc --version at the command prompt to check whether CUDA is installed. Most AI programs rely on CUDA, which requires an NVIDIA graphics card.
- Go to the official website (https://developer.NVIDIA.com/cuda-toolkit-archive) to download and install CUDA, then configure your environment variables.
- Go to the official website (https://pytorch.org/get-started/previous-versions/) and use the pip command listed there to install the torch build that matches your CUDA version. Your program probably runs on the CPU because you installed a CPU-only build of torch.
I hope the above suggestions can help you.
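A quick way to test the CPU-only-torch diagnosis above is to ask PyTorch directly. This is only a minimal sketch: the helper name `describe_torch_build` is made up for illustration, and it relies on nothing beyond `torch.version.cuda` and `torch.cuda.is_available()`.

```python
def describe_torch_build():
    """Report whether the installed torch build can use an NVIDIA GPU."""
    try:
        import torch
    except ImportError:
        return "torch is not installed in this environment"

    # torch.version.cuda is None when the wheel was built without CUDA support.
    if torch.version.cuda is None:
        return f"torch {torch.__version__} was built without CUDA support"

    # A CUDA build can still fail to see the GPU (driver problem, wrong venv, ...).
    if not torch.cuda.is_available():
        return f"torch {torch.__version__} targets CUDA {torch.version.cuda}, but no GPU is visible"

    return f"torch {torch.__version__} sees {torch.cuda.get_device_name(0)}"


print(describe_torch_build())
```

Run it inside the same environment the web UI uses; a different Python or venv can give a different answer.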
still
is there a more active discord for stable audio
They already have joined.
i am not sure, something like this: "reverb, bass boosted"
What's the recommended text to music these days? Preferably also with vocalists actually singing :P
If you want vocalists, you probably aren't going to have luck here. You'll need an LLM model like Suno or Udio for that.
Vocals in SAO or Stable Audio 2.0 sound like The Sims talking
I can work with a llm if there's a gradio of sorts that can combine both music and vocal :P
i think you dont need a llm model for that. its just not implemented in non llm models
there is no open source llm that can do that currently. the closest thing we got is bark. but we dont have training code and it cant really do music rn.
Hmm indeed.
So for these, what git's will be needed for these? As sony's for instance doesn't have a config.json, so it can't be loaded into diontimmer's audio diffusion 
use stable audio its better than lumina for now
I can't even use lumina as there's no model config json included, and stable audio i mentioned can't load it
i dont think that lumina can be loaded with stable audio's code
lumina is a completely independent model with its own code
Gotcha. How do I find out what gradio/git one uses for each model on hugging? For instance for lumina's, same with Sony's.
Also, for stable audio webui one, how do I make it only generate 5 or 10 actual seconds and not imaginary ones? As less samples gives less duration, but also utter ear blasting nonsense
you need to find the github of the models, with the code. they often describe how you install them
but lumina has a web demo too so you dont need to install it
here is the lumina web demo http://139.196.83.164:8000/
and their github repo https://github.com/Alpha-VLLM/Lumina-T2X/tree/main?tab=readme-ov-file
Aye, i know how to install the gits that provides the how's. What code do you speak of that i can trace to a git?
And i prefer to load them locally :)
maybe u should use the pinokio installer, i have heard that its easy to use.
i only installed stable audio open so far
I just asked what code you were speaking of. Once i understand that part, then i can start looking/waiting for gits with said "code" for a git that corresponds to the model i want to try out
thats the code for stable audio open diffusion https://github.com/Stability-AI/stable-audio-tools
that has to be installed and then you need a folder where you put the model and the model config
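To check that a local model folder really holds both pieces mentioned above (a checkpoint and a model config), something like the sketch below could help. The file name `model_config.json` and the keys `sample_rate` and `sample_size` are assumptions based on how the Stable Audio Open 1.0 download is laid out; other models may use different names, and `inspect_model_folder` is a made-up helper, not part of stable-audio-tools.

```python
import json
from pathlib import Path


def inspect_model_folder(folder, config_name="model_config.json"):
    """Return (sample_rate, sample_size, checkpoint names) for a local model folder."""
    folder = Path(folder)
    config_path = folder / config_name  # assumed file name, see note above
    if not config_path.is_file():
        raise FileNotFoundError(f"no {config_name} in {folder}")

    config = json.loads(config_path.read_text())
    # Checkpoints are commonly shipped as .safetensors or .ckpt files.
    checkpoints = sorted(
        p.name for p in folder.iterdir() if p.suffix in {".safetensors", ".ckpt"}
    )
    return config.get("sample_rate"), config.get("sample_size"), checkpoints
```

A missing config raises `FileNotFoundError`, which matches the "no config json, so it can't be loaded" situation described earlier in the thread.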
Code? You mean link?
and i have not installed a lumina model yet so you have to look at their github and see how to install it https://github.com/Alpha-VLLM/Lumina-T2X/tree/main?tab=readme-ov-file
the program
Ah, now i get you
Sorry for the mass confusion from my end lol
xD
kinda weird how stable audio seems a little under the radar compared to image generation models
been trying to get my img2vid, stable audio, sd3, and sd-3d working.. almost there
yes
i guess you cant make corn with it lol
asmr
It would explode if we could not just make audio, but songs like Suno locally.
yes i think the model architecture could do it. just training data is the problem
All these other channels feel so relaxing and serene compared to the raging fire that is the SD3 channel hides
xD
look... im not going to take stability serious until there is the shitty flute lora for audio models: https://www.youtube.com/watch?v=nF7lv1gfP1Q
been trying to install for 30 minutes 😭 with pip install stable-audio-tools --no-cache-dir on my MBP
❯ pip list
Package Version
------- -------
pip 24.0
its not simple
for that error, pip install "setuptools<70"
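To see whether the pin above even applies to your environment, you can read the installed setuptools version from Python. This is a small sketch under one assumption taken from the thread: that versions 70 and newer are the ones causing the `pkg_resources` import error. `needs_setuptools_pin` is a hypothetical helper name.

```python
from importlib.metadata import PackageNotFoundError, version


def needs_setuptools_pin(version_string):
    """True when the major version is 70 or newer, the range the pin avoids."""
    return int(version_string.split(".")[0]) >= 70


try:
    installed = version("setuptools")
    print(installed, "-> pin needed" if needs_setuptools_pin(installed) else "-> ok")
except PackageNotFoundError:
    print("setuptools is not installed in this environment")
```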
how do u even figure this stuff out
I am from Rust land so all this stuff is like IQ 9999 for me
before there was ai there was me
Hey guys, we are interested in a Stable Audio Enterprise License, who should I talk to?
Singing Voice Synthesis (SVS) has witnessed significant advancements with the advent of deep learning techniques. However, a significant challenge in SVS is the scarcity of labeled singing voice data, which limits the effectiveness of supervised learning methods. In response to this challenge, this paper introduces a novel approach to enhance th...
I would just create a python environment, git clone it and install requirements.txt
StableAudio constantly times out or reports errors from AWS Sagemaker. Is there any support channel at Stability AI that can help with that? My emails and support tickets have not received any responses so far.
Stability cant pay amazon bills probably or its something else
But they have a open source audio model now
that's the old model and it doesn't support the mps backend for metal acceleration 😦
its relatively new actually
what is mps?
Metal Performance Shaders - it's Apple's GPU acceleration framework for the new Apple silicon Macs. You set mps instead of cuda as your torch device. Unfortunately, stable-audio-tools has cuda hardcoded all over the place.
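Picking the torch device at runtime instead of hardcoding cuda could look roughly like this. It is a sketch, not a patch for stable-audio-tools: `pick_device` and `detect_device` are made-up names, and code inside a library that hardcodes "cuda" would still have to be changed separately.

```python
def pick_device(cuda_ok, mps_ok):
    """Prefer CUDA, then Apple's MPS backend, then fall back to the CPU."""
    if cuda_ok:
        return "cuda"
    if mps_ok:
        return "mps"
    return "cpu"


def detect_device():
    """Ask the installed torch which backends are usable; 'cpu' if torch is missing."""
    try:
        import torch
    except ImportError:
        return "cpu"
    mps = getattr(torch.backends, "mps", None)  # absent on very old torch versions
    return pick_device(torch.cuda.is_available(), bool(mps and mps.is_available()))


print(detect_device())
```

The returned string can be passed wherever a device is expected, for example `model.to(detect_device())`.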
daim
guys anyone experiencing pending issue on generations?
my other two has been timed out and the third one is... still pending for like 3-4 days
It doesn't work at the moment. I have been getting 1 successful generation out of 30 - 40 attempts.
I would like to hire a dev to make a nice multi platform client for stable audio. If you are the right person please DM me
when running local, is there a way to use its api, at /generate or whatever?
same question
^the music that plays when you have three friends in common with comfy 
I'm running into walls trying to resume training on a Stable Audio Open model I made... Neither the wrapped or unwrapped models will resume, I'm getting a few variations of the same error every time:
"The size of tensor a (14) must match the size of tensor b (12) at non-singleton dimension 0"
I only changed the LR, warmup and amount and content of sample sounds in the model config, I'm not sure what's going on here.
Hmm, actually I now realized what this was: changing the sample prompts caused this error, which is not how this would work I think...
Anyway, been getting superb results already by finetuning on my own music, as the base model sounds kind of neutral and boring.
Some sounds with the base model.
The same after 1200 steps with batch size 38, training on 158 tracks made by me, with the default sample length and random crop on.
Loss went down really slow, I mean, that's already almost 50K updates, and I only got down to around 0.64 or something, which is kind of high for a diffusion model. On the other hand, the outcomes dictate the usefulness, not the analytics... And I can already see this as a very cool tool. Next I think I'll train on my stems, but they need a lot of work removing the silence, I'll need to find some batch tool, not sure if Audacity is up for that.
A clip from training, same model, I can definitely hear my influence.
maybe the people in the harmonai discord can help you
Oh right, that was the Discord for the audio model creators. I reported the weird stuff on Github already.
is anyone else having issues generating with stable audio when inputting an audio file? I thought I'd try audio to audio with a simple 20 sec long drum beat, but I get stuck on pending forever and it eventually times out trying to generate. I tried some other clips and it's still not working. I even bought pro to maybe not get stuck in a queue, but I still have not gotten anything to generate using audio as input. I can generate text to audio, so clearly the servers are up
For Stable Audio, is the upload limit a one time thing? What if I'm uploading the same track because I want to edit multiple times, or is it upload once - So I don't have to upload again if it's the same
I was having trouble when sending it audio in FP 32 wav format. The same file rendered to an int 24 format seems to work ok.
@light shoal ok i might try it got a a few generation working but sometimes still fails.
TBH, I've now got a couple that are just sat there spinning based on an int24 file, so I think I was premature thinking I'd solved it
Of my last 10 generations, only 3 have worked with the remaining 7 either timed out or looking like they are going to. I've put a support ticket in and I'll see what they say.
???
???
Go for it Kan.
hey y'all i just started getting into stable audio last week and while doing my research i got a bit confused. i keep seeing people ask about running it locally, but isn't that what stable-audio-tools lets you do?
i wrote a script using that and have no issues generating audio on my laptop (6gb vram). the one thing that i find mildly annoying is i can't seem to figure out how to make a clip that is under 47 seconds. like i can make one that is 20 seconds, and then there will be 27 seconds of silence. i'm guessing that is just how the model is trained, and the best results would be to leave it at 47 seconds?
you can also run it on their website, where they offer a newer 2.0. i've only poked at it a tiny bit, but i'm using ComfyUI, where i can set the latent length. peeking at their repo, the fixed length comes from this: sample_size: int = 2097152
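The 47 seconds follow from that sample_size once you divide by the sample rate. The sketch below assumes a 44.1 kHz sample rate (the value in the Stable Audio Open 1.0 model config, not stated in this thread) and shows one way to cut the trailing silence by slicing the output; `trim_to_seconds` is a made-up helper and the list is only a stand-in for a generated waveform.

```python
SAMPLE_RATE = 44_100      # Hz, assumed from the Stable Audio Open 1.0 model config
SAMPLE_SIZE = 2_097_152   # samples per generation, the value quoted above


def window_seconds(sample_size=SAMPLE_SIZE, sample_rate=SAMPLE_RATE):
    """Length of the fixed generation window in seconds."""
    return sample_size / sample_rate


def trim_to_seconds(samples, seconds, sample_rate=SAMPLE_RATE):
    """Keep only the first `seconds` of a mono sequence of samples."""
    return samples[: int(seconds * sample_rate)]


print(round(window_seconds(), 2))        # 47.55

clip = [0.0] * SAMPLE_SIZE               # stand-in for a generated mono waveform
print(len(trim_to_seconds(clip, 20)))    # 882000
```

For a tensor shaped (channels, samples) the same idea applies by slicing the last axis before saving.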
Every time I try to use the 2.0 model I get these errors- does anyone know a way to fix?
This error only occurs when I try to use an input track
did you ever get this fixed?
did you ever get this fixed?
No. It’s consistently broken.
have you managed to get in touch with support? I just messaged them twice this morning but not sure if I should actually expect a response lol
Never heard back from them.
@mystic geode I tried a few days later and it worked, but it still failed every now and again, and then the failed ones got stuck pending forever until it said I could not make any more requests because I had too many pending. So I gave up on it, and the results that I did get were not really good enough for what I wanted to use it for anyway.
Is it possible to run stable audio 2.0 locally? Or only version 1.0 ?
You can't run stable audio 2.0 locally only because the weights are not available. You can run stable audio open locally though. Different weights.
are you aware of any other ways to use stable audio 2 other than the stableaudio website? soopa frustrating not being able to reliably use the audio2audio feature, haven’t been able to get it to work for days
Not that I am aware of. Stable Audio 2 is through the website only as far as I know.
If you want more flexibility, Stable Audio Open is best for running locally.
anyone know why the result player is grayed out? its generating the audio flac
comfy up to date
this is probably useless for most ppl but i made a colab notebook for the open source model. i've never made a notebook before so it might have a lot of bugs. i've been using it to generate a big sample library. https://colab.research.google.com/github/xxristoskk/stable-audio-sample-generator/blob/main/Stable_Audio_Sample_Generator.ipynb
does anyone have a comfy workflow for stable audio? I'm trying to figure it out on my own but Comfy crashes when it tries to save the generated audio
just drop a flac file from this channel into comfy and it should load the workflow unless the creator removed it
Thanks!
Alright now I'm generating a 10 second audio clip @ 50 steps, getting 15it/s but it's taking over 10 minutes lol. How long does it normally take to generate a clip?
the full 47 seconds takes about 2-3 seconds on a 4090
on a 6gb 3050 it takes about 30-45 seconds
What's the sota for voice cloning?
hi everyone, I am new to SA and recently installed. Can someone please point me to a tutorial for getting started creating a custom model?
Good app for this sorta thing: https://github.com/Nerogar/OneTrainer
Check their discord for help (linked on the repo)
A clip from training, same model, I can definitely hear my influence
thank you!
Has anyone installed Stable Audio Open on a server and found a way to run parallel generations?
We have implemented stable audio on Replicate here : https://replicate.com/stackadoc/stable-audio-open-1.0
And the cog source is here : https://github.com/stackadoc/cog-stable-audio
If you need any help to implement the solution on your own backend, feel free to ask 🙂
sorry I don't know how things work here. I just made this beat and love to share. I am new here https://www.stableaudio.com/1/share/dba9c3a6-14ab-4ec9-96cd-1dc333065ed2
Does Stable Audio Open support generations in key?
With the base stable-audio-open-1.0, no, since the key is not present in the training dataset. But you may be able to support it if you fine-tune the model with the key in the training prompts. For the stable-audio 2.0, I don't know
is there any ai music/song generator that can generate a project file for daw software like fl studio? or a way to generate a project file for daw from an audio track? i'd like to make edits to my ai music generations
https://klang.io/blog/mp3-to-midi-daw/ This might not be what you want, but it looks like it converts your suno (or whatever) mp3 to midi to use in your daw project…
is there any guide on the UI options for SA explaining what things like cfg_scale and #steps affect?
https://huggingface.co/docs/diffusers/main/en/api/pipelines/stable_audio
See "guidance_scale" & "num_inference_steps"
Thank you! Looking at the inference steps I understand the relationship between steps and quality but not quite sure what slower inference translates to? Does that just mean it takes longer?
Yes, generation time increases roughly in proportion to the number of steps
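As a back-of-the-envelope check on how steps translate to time, the sampling loop alone takes about steps divided by iterations per second. The numbers below reuse the 50 steps at 15 it/s reported earlier in this channel; the estimate deliberately ignores model loading, text encoding and decoding, so treat it as a lower bound rather than a benchmark.

```python
def sampling_seconds(steps, iterations_per_second):
    """Rough time spent in the sampling loop; ignores model loading and decoding."""
    return steps / iterations_per_second


print(round(sampling_seconds(50, 15), 1))    # 3.3
print(round(sampling_seconds(100, 15), 1))   # 6.7
```

If a run takes minutes while the loop itself should take seconds, the extra time is being spent outside the sampler.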
How come that new 3D generator thingie doesn't get a channel? 😦
😂
hey, do we know when the api will be available? or if it will be available at all?
I posted a new song called "Sands of Time" (Sands of Time), a metallic metal song with Arabic melodies without vocals (Instrumental) 🎸🌟 which was previously produced with Stable Audio artificial intelligence 🤖🎧. Those who follow me may know it, but it is now available in an optimised version in high quality on SoundCloud! 🎵🚀
Song link: Sands of Time 🎶🔗:
https://soundcloud.com/tber-mohammed-mehdi/sands-of-time-ft-stable-audio
Please sign up for the site and support it with a heart button ❤️ - it really helps spread the word ✨
#Metal_Arabic 🎸 #Music_without_Vocals 🎶 #Sands_of_Time ⏳ #AI 🤖 #High_Quality 🎧 #SoundCloud 📻 #Mohammed_Mahdi_Altabar 🎤 #Stable_Audio
Close your eyes and let the desert winds carry you away. "Sands of Time" is an epic instrumental metal journey that blends the raw power of metalcore with soaring melodies and captivating Middle Easte
Hi,
Is there an API available for StableAudio that can be used for custom projects, or is it only accessible through Hugging Face/stable_audio_tools?
Vanishing Point 1971
Directed by Richard Sarafian
Cinematography by John Alonzo
Starring Barry Newman and Charlotte Rampling
Rescored by Innuendo by Nguyen Do
Featuring Robert Plant A.I.
Covering Miley Cyrus' Wrecking Ball
#carchase #vintage #ledzeppelin
FAIR USE DISCLAIMER:
As the original material is transformative in nature, uses no more o...
Hello, you all seem to know what you're doing when using this AI music gen. I just started today. Usually I write and play my own music, but since my neighbors don't like drums, guitars and the like, and I don't have access to a studio, I'm trying this.
I'm trying to put together a track for my YouTube channel, a gaming channel: an intro song for my vids, with my vocals and lyrics. Is there someone willing to take some time to help out and show me how I would accomplish this...
using this stable audio system..
How does one actually get in touch with StabilityAI? I have tickets in 'pending' now for over a month. I've emailed, I've updated my tickets with details / screenshots of the CONSTANT errors that I'm encountering (error 400 sagemaker) and when trying to use my own audio, it has failed (timed out) EVERY TIME that I've tried over the last couple of weeks. Things seem to be getting worse and worse.
Are there any 'official' StabilityAI / StableAudio mods / support people on this discord server ? Do any of you have ANY information on how to mitigate/resolve these issues using the 2.0 web interface? I'm so frustrated - not able to use what I'm paying for - and unable to reach anyone to find information about these issues. Any advice would be VERY APPRECIATED. Thank you!
A month later - still a constant problem and absolute silence from 'support' --- it's maddening as we're unable to actually use the service that we're paying for. Such a shame as StableAudio2.0 is so useful when it works. Please let me know if you ever hear back from them or find any workarounds to the perpetual 'Timed Out' (+filled queue that appears to be limited to 3, not 5 like the error states) --- or if you've any information on the Sagemaker Error 400 issues that come up with audio2audio prompting. I'll be sure to report back if I ever hear/learn anything.
I think that Stability AI is barely surviving as a business. This may be a team that is no longer staffed.
No one there. the models went self aware and will train themselves and lament until the electricity is shut off.
i worked for mrBeast, he's a fraud but instead its i worked for mrkrabs, he's a fraud (3)
i worked for mrkrabs, he's a fraud THE END
Are these voice generated in Eleven labs?
Hi there, i'm trying to get Stable Audio Open to work locally with 0 success so far. Since there is about 0 information about it anywhere, I'm not even sure it can make something else than music at this point.
Is it supposed to be possible to run it on a 3060 Ti GPU? I'm using Pinokio for testing, but so far I cannot generate anything, it just runs forever.
I have no idea what generation time i'm supposed to hope for or how to prompt or absolutely anything really.
Any idea ? Thanx
You have to provide your background logs so that others can help you.
Do I even have access to them on Pinokio? Also there is no error message, it's just running forever, but since I never made a gen and there is no info anywhere, I have no idea what to expect for speed or parameters
Have you installed CUDA? (translated from Chinese)
Sorry! I accidentally wrote that in Chinese. Did you install the CUDA version that matches your PyTorch (pt) build?
Supposedly Pinokio does that automatically, it worked fine with forge for flux/sdxl
If you have multiple Python installations on your computer, you may get errors here. Open your terminal and check that you are in the right environment. If that is fine, try a smaller model or lower parameter settings. The 3070's GPU may not be enough.
If it is the first startup, it will automatically install the missing startup environment and check whether the network is normal.
Yeah that's why I was asking, since I had no idea what to expect, like how long would it take normally to make let's say 5 second on 30 steps or something. I think any bad python release would just throw an error, since GPU is running full during the generation
My 4090 also runs at full GPU load. If you want to make high-quality works quickly with little video memory, the only option is to modify the core algorithm by hand and write a set of loops better suited to your device.
You can use Visual Studio to write the new algorithms you need.
which one do i download using AMD
Hello! I consider your audio service as one of the best in the game. 🙂 Therefore, I have a short question. Can I use the music I generate as a main content of my YouTube video? E.g. with static background or short animation for a video with music to study with? (something similar to lofi girl)
Like this music!🎶
Hello! Just wanted to share this with the community, if any of you are in or near Barcelona on October 4th you can't miss this IRL get together!!!
https://www.eventbrite.com/e/entradas-barcelona-ai-music-meetup-v10-1006681229657?aff=oddtdtcreator
anyone else getting issues using Input Audio on https://www.stableaudio.com/generate ?
if i generate with an audio input, the generation is permanently stuck on pending and i can't generate new songs
now shows
error - ClientError: Received client error (400) from model. See the SageMaker Endpoint logs in your account for more information.
I have exactly the same issue and I reported it a few days ago.
Last time I got the answer after one month. Let's see.
Hey there
Hope you are doing well
I am a senior full stack developer who has full experience in web and AI development
So if you have some projects, please let me know
thank you
https://github.com/typhon0130
https://figma.com/@typhon0130
/subscribe
/subscribe
now getting Error: Timed Out
@willow roost are there any devs that can take a look?
I also still have this issue
Hey everyone! My album made with AI tools (lots of stable audio in there) is finally out!!!!
https://open.spotify.com/intl-es/album/70V3ApjtDqTmXyKBFN2PVM?si=H83jLIl1SyOxldetuvMGKA
I really would love any feedback or thoughts or questions or whatever!!
Why can't I make an audio-to-audio connection? Whether it's an uploaded file or an uploaded record, the generated composition remains "pending" for a long time, before finally displaying an error message.
Never seen such a long solo in an AI service other than stable audio. can't believe what I generated
Hi, somebody know sources for custom audio safetensors? I found only the offcial 1.0 model.
kinda neat. what was the production process in terms of editing? I assume this wasn't generated start to finish in a single click
this is what an unedited single click gen from my model sounds like:
No I generated it from one take and the version in SoundCloud is just remastered
Hi!
I want to generate one-shot samples for music making. It's important to get precise notes, like C3. How can I do that?
i have two GPUs, one with 12 GB of VRAM and one with 16 GB. is there any way to run stable diffusion video using both? it would be a great help. i am a new learner.
Check msg @ruby wraith
Line-break appears to be ignored in the prompt field. Would suggest interpreting line-break as whitespace.
I.e.
first term
second term
seems to result in first termsecond term
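The suggested behaviour, treating a line break like any other whitespace, is a one-liner on the client side. This is only an illustration of the idea (the function name `normalize_prompt` is made up), not how the Stable Audio site handles prompts.

```python
def normalize_prompt(prompt):
    """Collapse line breaks and repeated whitespace into single spaces."""
    return " ".join(prompt.split())


print(normalize_prompt("first term\nsecond term"))  # first term second term
```

Until the prompt field does this itself, typing a space or comma at the end of each line avoids the "first termsecond term" result.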
Did the stable audio diffusion just get abandoned? As the audio tools hasn't been updated for half a year, and haven't found any new programs that can do the same/better with the models 
Can't even get it to work, and can't find the darn requirements.txt as my setup is just all broken for it.
I released my debut single last thursday :)) i used ai to generate the bulgarian choir effect throughout the song. Let know what you think :))
Even stableaudio.com is busted.
Is there an api for stable audio?
Is there any french people here or ... ?😌
anyone know what ai was used to make these music extensions?
https://www.youtube.com/watch?v=CTdMM2EtEOM
If you have any song requests, please let me know.
0:00 - Original
0:47 - A.I.
#teamfortress2 #ai #extended #remix #danger
Listen to Sands of Time (ft. Stable Audio) by Mohammed Mehdi TBER on #SoundCloud
https://on.soundcloud.com/MSGuhxTaybbg7shw9
Thanks for the support guys and listens
Its a old scam. Probably his account was stolen
Hi everyone, does somebody know the meaning of this thing mentioned on the subscription page?
Monthly upload amount
30 minutes
Cropped at 3 minutes
you get 30mins of audio/video at 3mins at a time
is this the limit of download of the audio we generate?
I mean I just got the pro subscription and I can generate 500 audio but download 30 mins?
you can upload a total of 30 minutes to stable audio, to use it as a source for prompting
KED KED KED 8B1TCH - OVRLL BGZ - HORRORCORE EXPERIMENTAL SCRATCH HIPHOP AGRESSIVE SERIAL EXPERIMENT
]
MADE WITH SUNO AI
CHECK MY INSTAGRAM: @overll_bgzy
#suno #sunoai #aimusic #aivideo #aftereffects #aiart #ai #phonk #hiphop #lofi #beat #beats #aesthetic #genmo #capcut #rap #rapbeats #horrorcore #horrorcorehiphop #brazil #brazilmusic #cyber...
Aight, new here. Been doing image gen with Auto111's SD for a bit.
Any faq or noob-friendly guide I can follow to set up a local audio gen?
The most current model might be YuEGP. You can see the installation guide on their GitHub. The git lfs install might be the only unusual step; the rest is straightforward. The Gradio app works great.
An astronaut on a white horse on Mars
Anyone else unable to download tracks atm? It only lets u download most recent. Cant download any other tracks. Please fix
Just gets stuck on download circle with .wav
OVERALL BUGZY - STRAY MAD (A SYNTHOAX SPACEOPERA) PART 1 AND 2
Check it out at www.instagram.com/overll_bgzy
CREATED WITH SUNO AI!
#suno #sunoai #aimusic #aivideo #aftereffects #aiart #ai #lofi #beat #beats #lofivideo #vhs #glitch #glitches #aesthetic #hailuoai #capcut #rap #rapbeats #anime #animeedits #surreal #surrealism #minimaxai #sy...
A journey through memories, echoes of words left unsaid. Moments that slipped away, feelings that linger. Let the sounds take you where conversations never could.
any good front ends? I'd like to experiment
I'm mostly familiar with Suno
thanks
so, any viable local front ends?
There's at least two different Github repos containing nodes for ComfyUI & Stable Audio
/linkwallet
Has anyone been able to get an API for stable audio?
@low fjord can we get any info on the future for stable audio? Last update was 9 months ago when stable audio open released and I was wondering if stability.ai have moved on.
@tight anchor anything you want to share?
or more accurately are able to share?
Didn't mean to put you on the spot Fauno sorry! I was just curious.
suffice to say, yes SA is still being worked on, I have a sync with Fauno later this week and I'll find out what we're allowed to talk about 👍
I'm one of the mods for the Harmonai (stable audio) discord and people are talking about other open source audio model releases lately, so I was wondering what S.ai has been up to.
oh cool, that's a fun server
We've got some good model improvements ready to go, as always our biggest blocker is an improved dataset
Do you think audiosparx would be willing to negotiate an update to their terms to allow for a public release or is that likely a dead end?
That’s one of many conversations we’re having
Hey, I am interested in an AI model or a neural network that doesn't modulate over the note duration. It shouldn't sound like a melody. Is there any?
It's almost a year since Stable-Audio was announced, is there any date for the release of the API?
does anyone know how that could be done ?
https://www.youtube.com/watch?v=8Np2JQjz5WE
create santa claude
Hi!
I'm working on a binary classifier to detect whether music is AI or human.
Do you guys happen to have a folder of Sparx 2.0 audio samples that I could train my classifier on? A few hundred would be super helpful
following up on this curious whether anyone has any tips
Can we run this locally?
Stable Audio 2 just got released
must try it
I once made this using Stable Audio 1.5:
https://soundcloud.com/tber-mohammed-mehdi/sands-of-time-ft-stable-audio
Which was epic enough and impossible to recreate using another AI.
Btw, Stable Audio spectral quality is better than Udio but It doesn't support lyrics and a lot of things like Udio.
Something is wrong with 2 in terms of notes, they are not working together as before. I remixed the same song and look at this.
Not sure if this is the place but is there recommendations for removing the music of a video while keeping the rest of the audio, like talking, footsteps, etc?
when is the Stable Audio page gonna be improved/workable? beyond just a simple generation thing
I love the Stable Audio UI, can we get some of that UI for Image generation
Hi, I don't know of anything like SD for that, but you can do it with something like iZotope RX or similar software. I think it is AI-trained for the task.
But I would also like to know if there is a stable app for working on audio with AI: editing, mastering, stem separation, etc.
Some time ago I tried SD audio and was able to generate, but I couldn't do anything else, and some Gradio app had things like mastering.
@loud robin UVR UVR5 yes there is. Oh maybe not depends on filters i guess. I used it to lift the voice only. once. Not quite the same.
Oh I didn't know that one, great
Hey fam! I am uploading original compositions and the system is flagging them. These compositions range from unreleased music that I have created to music I have released (both originals and remixes for which I have compilation rights). I'd like the system to key off of my work -- why is this not functioning as I expected? Thanks in advance!
OVERALL BUGZY - ДМОН - (РУССКИЙ EP) RUSSIAN EP
1.Если вы все еще хотите того, чего всегда
2.невозможным образом
3.Тоска Печаль
4.Уйди с моего пути
5.Элегия 1938
6.Я ухожу
Check it out at @overll_bgzy
CREATED WITH SUNO AI!
#новаяму...
Does anyone happen to know: if I have an AI voice model, is there a way to use it as a VST, or use it in the same way as one, to convert spoken audio in real-time?
How’s your health?
Vocaloid is a VST/VSTi that may work for this if used as an FX plugin/vocoder.
Does anyone know a really good text to speech voice AI? like a storytelling voice
hello
Yo! Trying to find out if Riffusion straight up stole a song! This sounds wildly familiar to me! Is this a released song and does anyone know the name of it?
Download shazam and soundhound, feed it to both, and see if it's recognized by any of them
Also, fun fact, all A.I models are trained on existing media lol. Nothing A.I makes is remotely original 
Do you guys know of a voice model processor akin to stable audio tools, but a docker like whisper/piper, so i can make my home assistant make different non-verbal noises? As i got a voice model i use for my HA, but i need it to make a sneeze for instance using that voice model 
to me, AI just mixes existing styles to make new styles
you mean tts
https://youtu.be/TCHXzX6vUcA that's why I feel like my song is a copy of some songs that I did not know
🎵 Dive into a Melodic Journey of Love and Destiny 🌌
Embark on an emotional voyage with this original Russian rock ballad, blending raw guitar energy, haunting acoustics, and driving rhythms. Inspired by the moody aesthetics of t.A.T.u and the surreal, poetic visuals of Adolescence of Utena’s iconic dance scene, this music video weaves a...
i'm currently trying to use stable audio-to-audio (api) to generate some samples that can be paired with the original audio.
So far, the results haven’t been great, they don’t really capture the style or feel of the reference clip.
Has anyone had better luck with this? I'd love to hear any tips, prompt structures, or API configurations that helped improve your outputs. Appreciate any help!
Hello, I have a problem!
I have lost my wav download in stable audio.
Who can tell me how to solve this?
What do you mean you "lost" it?
An important question: when you said you were using Stable Audio, do you mean you are running it locally, or are you referring to the website?
Does anyone have a working Space or guide to run the audio-to-audio style transfer shown in this demo?
https://arc-text2audio.github.io/web/?utm_source=catalyzex.com
Is there an APK for Stable Audio Small on Android? Why do they promote the app for ARM devices and not release an APK?
Here’s some 👂🍬 for you 🌕🦉
WRG🐰Neuro🌶️Spicy🌶️Sampler🥟:
14 of the hottest unreleased tracks from every genre.
https://soundcloud.com/whiterabbitgeometry/sets/a-kid
My song is now up https://youtu.be/0icOFbnn32U https://rumble.com/v6tkhv1--secret-lilies.html
A forbidden love anthem set in the halls of Lillian Girls' Academy.
Two Catholic schoolgirls defy societal chains, family expectations, and a gilded cage to protect their secret love. Inspired by Maria-sama ga Miteru, this glitch-pop track pulses with rebellion, whispered vows, and the haunting beauty of love that refuses to be silenced.
Ги...
🌴🎧 Welcome to the universe of Latin Fi
This instrumental track is a nostalgic, sensory journey through the colorful alleys of Latin America, where the rhythms of the heart mingle with the dust of the asphalt and the calm of tropical sunsets.
🎵 Style: Lo-fi Hip Hop / Chill / Experimental
🌊 Atmosphere: introspective, warm, urban,...
Empowered by https://www.runninghub.ai
My invitation code: https://www.runninghub.ai/?inviteCode=4a6c1dd9
Register and get 500 RH coins, and you can generate a large number of images and videos for free!
Music Generated by: https://ace-step.github.io/
Lyrics by QWEN2.5
#aimusic #aiart #aivideo #...
🕯️🐈⬛ OVRLL BGZY - TMC A8 (THEY MADE ME DO)
From the EP Latin Fi, this horrorcore-infused lo-fi beat drips with paranoia, late-night sirens, and glitchy mental echoes. Picture a black cat smoking on train tracks, eyes darting through the shadows — watching or being watched?
This track lives between urban anxiety and glitch dreamsc...
I cannot download wave files for songs generated since the end of April. mp3 and video files can be downloaded.
You can download mp3 and video files.
Is there a way to fix this error in Stable audio???
It happens when I put non-original music on the site.
This didn't happen before, it started today
Hello??
"Invalid or unsupported file type" seems explicit; you probably changed the input format.
I'm using YouTube music from games
This error wasn't happening before; it started in the last few days.
It isn't a copyright issue, because the site would warn about that.
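One way to test the "changed input format" guess is to check what the file really is, regardless of its extension. Audio ripped from YouTube is often a WebM or MP4 container even when the file is named `.mp3` or `.wav`. The sketch below reads the first bytes of a file and matches them against well-known signatures. The function name is hypothetical, and which formats the Stable Audio site actually accepts is not stated in this thread, so the result only tells you what you are uploading.

```python
def sniff_audio_container(path):
    """Guess a file's real container from its leading bytes, ignoring the extension."""
    with open(path, "rb") as f:
        head = f.read(12)
    if head[:4] == b"RIFF" and head[8:12] == b"WAVE":
        return "wav"
    if head[:4] == b"fLaC":
        return "flac"
    if head[:4] == b"OggS":
        return "ogg"
    if head[4:8] == b"ftyp":
        return "mp4/m4a"
    if head[:4] == b"\x1a\x45\xdf\xa3":
        return "webm/matroska"
    if head[:3] == b"ID3":
        return "mp3"  # ID3v2 tag, usually an MP3
    if len(head) >= 2 and head[0] == 0xFF and (head[1] & 0xE0) == 0xE0:
        return "mpeg-audio"  # bare frame sync: MP3, or possibly raw AAC
    return "unknown"
```

If a file named `song.mp3` comes back as `webm/matroska` or `mp4/m4a`, converting it to a real MP3 or WAV before uploading is worth trying.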

