#🐣┃suno-showcase | Suno | Page 2

chrome tapir Apr 28, 2023, 4:48 AM

#

haha

#

he nailed it

simple bison Apr 28, 2023, 4:50 AM

#

prompt: "[laughter] [laughs] [sighs] [music] ♪ [gasps] [clears throat]"
Output: strange noise, and guy gets startled awake and attempts to answer the teachers language question.

chrome tapir Apr 28, 2023, 4:52 AM

#

i think thats his way of saying i aint got the capacity for dat

simple bison Apr 28, 2023, 4:54 AM

#

welcome to the machine

chrome tapir Apr 28, 2023, 4:57 AM

#

guitar didnt work for me well

#

i have an idea

simple bison Apr 28, 2023, 4:59 AM

#

prompt: "[slide whistle] [music] [clears throat][laughter] [music] [laughs] [sighs] ♪ [gasps] — [slide whistle]"
output: 🔨 🔨 🔨 🔨 🔨 🔨 🔨

chrome tapir Apr 28, 2023, 4:59 AM

#

i cant stop thinking about the victim of the mosely

#

what was he trying to say

#

what the hell was that laff

#

haha

simple bison Apr 28, 2023, 5:03 AM

#

lost anvil Apr 28, 2023, 5:04 AM

#

Discord rules

simple bison Apr 28, 2023, 5:06 AM

#

chrome tapir i cant stop thinking about the victim of the mosely

i got more lore on the mosley... 20 years

#

"its safe to waze your fuldore feelings"

simple bison Apr 28, 2023, 5:08 AM

#

simple bison i got more lore on the mosley... 20 years

chrome tapir Apr 28, 2023, 5:08 AM

#

freestyle friday

simple bison Apr 28, 2023, 5:08 AM

#

chrome tapir freestyle friday

prompt?

chrome tapir Apr 28, 2023, 5:09 AM

#

(beat with lyrics) YEAH, YEAH, YEAH, YEAH, YEAH, YEAH, YEAH, YEAH, hey, hey, you, you, I don't like your girlfriend, no way, no way, I think you need a new one

#

didnt make it all the way

#

got hung up on the yeahs understandably

#

(beat with lyrics) seems good

simple bison Apr 28, 2023, 5:11 AM

#

simple bison

chrome tapir Apr 28, 2023, 5:12 AM

#

i made some crazy tongue twisters with chatgpt4

simple bison Apr 28, 2023, 5:12 AM

#

i had to screenshot the prompt because it is blocked by discord, i suspect repeating the same words or all the brackets...
but then Jairo Correa posts audio of the rules...
and i googled "the mosley victim" and there was an event 10 days ago, which did involve some of the things said in the audio clips, lol

chrome tapir Apr 28, 2023, 5:14 AM

#

wut

simple bison Apr 28, 2023, 5:15 AM

#

a hearing, medical, etc...

chrome tapir Apr 28, 2023, 5:15 AM

#

Flendiferous plibber-klorping slazzles zlungled slebbidly dlorbitant blurking klentacles, gleebulating qlibber-mlungulated vlivvers strandiferously, jlurching qlabberwabbled lipty-lpotch plibberations, whilst trondiforously clonking plaggled qlibberwinks, plargulating slibberwocked bligglets, and dlalumphing glabberdoodling rlizzwinks in plibber-sprangled tlazmires, vlurbing qlibber-splattered qlabber-splonks, flewting slabber-splorped plarfiggles splatteringly splangled, plibber-splurting rlizzulated slonktacles plibberwabbled, and slurbblingly qlibber-slurgled slibberwobble-slazzled plappledapples, plibber-plonking qlibber-splorped glabber-slurgles, plibber-plungulating plibber-slazzled dlentiferous glabberwocked spleebulations in plibber-sprangled plazmires.

#

someone try that with a long prompt

#

i cant get it in 14 seconds

simple bison Apr 28, 2023, 5:16 AM

#

chrome tapir (beat with lyrics) YEAH, YEAH, YEAH, YEAH, YEAH, YEAH, YEAH, YEAH, hey, hey, you...

LMAO
enthusiasm
betty betty betty betty betty betty betty baby YEAH! YEAH! YEAH YEAH!

chrome tapir Apr 28, 2023, 5:17 AM

#

at least the mic wasnt turned up to 15 like it is most the time

simple bison Apr 28, 2023, 5:17 AM

#

simple bison Apr 28, 2023, 5:19 AM

#

chrome tapir Flendiferous plibber-klorping slazzles zlungled slebbidly dlorbitant blurking kl...

simple bison Apr 28, 2023, 5:20 AM

#

chrome tapir Flendiferous plibber-klorping slazzles zlungled slebbidly dlorbitant blurking kl...

to work with the blood soil as the blood prize?

simple bison Apr 28, 2023, 5:21 AM

#

chrome tapir Flendiferous plibber-klorping slazzles zlungled slebbidly dlorbitant blurking kl...

I added this at the start and end [speaking fast]

chrome tapir Apr 28, 2023, 5:22 AM

#

are your temps set to .7?

simple bison Apr 28, 2023, 5:22 AM

#

chrome tapir are your temps set to .7?

I'm using WebUI, there is no temp setting

chrome tapir Apr 28, 2023, 5:23 AM

#

probably .7

simple bison Apr 28, 2023, 5:23 AM

#

just text + choose voice

#

[speaking fast] djfklglskgjhoewirjg;eiorjhwiqspteiucpnmxitzkqiaistuaiupjaeprfdijtgpetrjatgemverbzqyfvtdczsabqcwbsnudtfvybyghinjmokopls,cxpwkcmjvneuthybvruthvniejkmdicwopkslmxokwsnxjqiysgfoqwenwiuetywejnwfgosduhisugrpqueyrweiuqypqiupgiqusdgkfhgkgfvkbzcmvnncvmzfdvbjdfhglakdjghqeriutyqeoiuryqpiuweyqipwetogqeageveurfcsrdexredzaecryegbxruahwexrihjaeirjewrokmokg,oky,oukympjn,uypj,jukmjuhbndyouhdvnityirtiyegfjkgdskm,fbadfjygarfhnjalsruifagerkjmnlgurbhgliadnriluyghsilhgiyeahioaiyreotiuyerptuiyewriouytrfganfjc,bihn.,ihbx,rwihihbxvhmwvbxwrmfchmcsfdmhrmcfousdfuosrncirxgxvzyqzvqfnxurcfgncgmxbmxerxnqncfruoegmrvqeqvqpetewimtvwregbgjgvkfdvmgks [speaking fast]

simple bison Apr 28, 2023, 5:24 AM

#

simple bison [speaking fast] djfklglskgjhoewirjg;eiorjhwiqspteiucpnmxitzkqiaistuaiupjaeprfdij...

simple bison Apr 28, 2023, 5:25 AM

#

simple bison

chrome tapir Apr 28, 2023, 5:26 AM

#

what did you try to send me in general i just noticed

#

you got busted for sending porn again

#

haha

simple bison Apr 28, 2023, 5:26 AM

#

a link to the walmart music video

chrome tapir Apr 28, 2023, 5:26 AM

#

oh ok

#

i forgot about that banger

#

ok u can delete it i got it

simple bison Apr 28, 2023, 5:27 AM

#

hilarious song

chrome tapir Apr 28, 2023, 5:27 AM

#

hopefully AI can make a whole video like that soon

simple bison Apr 28, 2023, 5:28 AM

#

❤️ AI ❤️

#

[rap song] she got that louisiana purchase, louisiana protocol [rap song]

#

[rap song] basket basket basket basket basket basket basket basket basket basket basket basket basket basket basket basket basket basket basket basket basket basket basket basket basket basket basket basket basket basket basket basket basket basket basket basket basket basket basket basket basket basket basket basket basket basket basket basket basket basket basket basket basket basket basket basket basket basket basket basket basket basket basket basket basket basket basket basket basket [rap song]

#

i think when you make the prompt too big, it does whatever

#

[rap song] basket basket basket basket basket basket basket basket [rap song]

#

(beat with lyrics) basket basket basket basket basket basket basket basket (beat with lyrics)
girl... it's a song, not a task.

simple bison Apr 28, 2023, 6:18 AM

#

i made up a rap

chrome tapir Apr 28, 2023, 6:43 AM

#

hmm

#

its no we goin to walmart

violet narwhal Apr 28, 2023, 9:05 AM

#

exotic spoke Apr 28, 2023, 9:11 AM

#

simple bison i made up a rap

Do you have the npz file for this voice? This sounds really good.

simple bison Apr 28, 2023, 9:12 AM

#

announcer

chrome tapir Apr 28, 2023, 9:12 AM

#

simple bison Apr 28, 2023, 10:13 AM

#

exotic spoke Do you have the npz file for this voice? This sounds really good.

📎 announcer.npz

novel sage Apr 28, 2023, 11:04 AM

#

hello, is it too difficult for me to simply download this program from GitHub? Do I need to install pip first?

tardy topaz Apr 28, 2023, 11:51 AM

#

I'm going to make an installer over the weekend, it's a bit annoying

blissful pulsar Apr 28, 2023, 12:01 PM

#

create a venv first to keep things clean.

hybrid flame Apr 28, 2023, 12:30 PM

#

lost anvil Discord rules

Holy cow this is good.

ebon widget Apr 28, 2023, 1:33 PM

#

haha, amazing clips 🙂 btw we set up a channel #🐶┃bark-technical to encourage better sharing of npz files. hopefully that helps with finding prompts that are fun and clone well into new clips like travolta, jane etc

gleaming vale Apr 28, 2023, 1:57 PM

#

I thought i'd try to see how far we can push this.... and I think I broke journalism. Here Ive brought together a few AIs - gpt3, Bark, etc. now I can give any 1000 word document to my code, and it will, in a single click, spit out a video.. here's my test: https://www.youtube.com/watch?v=hyi1CgXbZCg - let me know what you think.

YouTube

AI News Media

Synthetic Media. The current state of regulation

Transcript...

VOICE1: Welcome everyone to the show. Today we are discussing a very timely and important topic - synthetic media and the potential harms it can cause. As many of you may know, synthetic media is media that has been produced with the help of artificial intelligence, and it includes things like deepfakes and AI-generated text, vide...

blissful pulsar Apr 28, 2023, 4:29 PM

#

blissful pulsar Apr 28, 2023, 5:41 PM

#

I am speechless .... story written by an AI ... read by an AI.

hazy whale Apr 28, 2023, 8:33 PM

#

gleaming vale I thought i'd try to see how far we can push this.... and I think I broke journ...

are you using adobe animate for autolipsync?

blissful pulsar Apr 29, 2023, 12:25 AM

#

blissful pulsar I am speechless .... story written by an AI ... read by an AI.

How do you do that ? just for one sentence it takes me dozens of attempts to get the audio just about right...

edgy mango Apr 29, 2023, 1:01 AM

#

gleaming vale I thought i'd try to see how far we can push this.... and I think I broke journ...

Are you using sad talker extension for this?

edgy mango Apr 29, 2023, 1:02 AM

#

blissful pulsar How do you do that ? just for one sentence it takes me dozens of attempts to get...

If you paste the whole deal into infinite it has greater consistency and chooses voice often based on entire text

blissful pulsar Apr 29, 2023, 1:03 AM

#

edgy mango If you paste the whole deal into infinite it has greater consistency and chooses...

Thanks

chrome tapir Apr 29, 2023, 3:18 AM

#

yo yo sunoheadz

#

dance beats or regular beats that is the question

#

both

#

oh shit

#

a lot of these have 3 second brilliant parts and then the other 8 are rough

#

cleean beat

chrome tapir Apr 29, 2023, 3:57 AM

#

what a bassline

#

ok we got some rock vibes there

#

130bpm

#

feel de riddim

chrome tapir Apr 29, 2023, 4:45 AM

#

this beat is so hard it said 14 seconds? nah i only need 7

#

lose controooooooLLLLLLLLLLLLLLLLLLLLLLLLLLLL

#

i cant blame him i put lose control in the prompt

chrome tapir Apr 29, 2023, 5:20 AM

#

wow i actually made a file that needed to be turned UP

#

actually i take that back it is pretty good at normal volume

desert vine Apr 29, 2023, 5:41 AM

#

Trying with the stitching, but many times it screws up and loses focus

desert vine Apr 29, 2023, 6:15 AM

#

gleaming vale I thought i'd try to see how far we can push this.... and I think I broke journ...

Please explain your process so we can try to make it as good as possible. 😄

desert vine Apr 29, 2023, 10:20 AM

#

If anyone has had a hard day and needs some extra appreciation. 😄

gleaming vale Apr 29, 2023, 11:04 AM

#

hazy whale are you using adobe animate for autolipsync?

no - I've written a python code to swap out lip positions based on audio volume
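A volume-driven lip swap like the one described here could look roughly like the sketch below. It is not gleaming vale's code: the function name `mouth_shapes`, the three shape labels, the frame rate, and the thresholds are all assumptions chosen for illustration. The idea is to measure loudness (RMS) over the audio covered by each video frame and pick a mouth image accordingly.

```python
import math


def mouth_shapes(samples, sample_rate, fps=12, thresholds=(0.02, 0.1)):
    """Return one mouth-shape label per video frame.

    `samples` is a sequence of mono float samples in [-1.0, 1.0].
    The labels and thresholds are illustrative assumptions.
    """
    window = max(1, sample_rate // fps)  # audio samples covered by one video frame
    quiet, loud = thresholds
    shapes = []
    for start in range(0, len(samples), window):
        chunk = samples[start:start + window]
        # Root-mean-square amplitude is a simple loudness estimate for the frame.
        rms = math.sqrt(sum(s * s for s in chunk) / len(chunk))
        if rms < quiet:
            shapes.append("closed")
        elif rms < loud:
            shapes.append("half")
        else:
            shapes.append("open")
    return shapes
```

Each label would then select which mouth drawing to composite onto the character for that frame; the thresholds usually need tuning per voice.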

gleaming vale Apr 29, 2023, 11:05 AM

#

edgy mango Are you using sad talker extension for this?

no - just infinite speaker, gpt3 and some python stuff i've written with gpt4's help. next week, i'm going to try to swap out my manga actors with video...

tawny saddle Apr 29, 2023, 12:25 PM

#

hello

edgy mango Apr 29, 2023, 12:56 PM

#

gleaming vale no - just infinite speaker, gpt3 and some python stuff i've written with gpt4's ...

bark infinity? gotcha. I was wondering how the animation was being cued to the sound for the talking. sad talker is supposedly able to do what splines was doing a few months ago but as a wav file drag and drop into the stable diffusion extension. Havent run it yet.

cyan pawn Apr 29, 2023, 2:01 PM

#

desert vine If anyone has had a hard day and needs some extra appreciation. 😄

which npz did u use?

pliant spruce Apr 29, 2023, 2:02 PM

#

desert vine Apr 29, 2023, 2:13 PM

#

Will try to post it here when I get home. 😉👍

gleaming vale Apr 29, 2023, 4:12 PM

#

edgy mango bark infinity? gotcha. I was wondering how the animation was being cued to the s...

I wrote python code to split the script by speaker, and run bark on each speaker in turn. Then used moviepy to create the video shots and cut them back together
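The script-splitting step mentioned here can be sketched as below. This is an illustration under assumptions, not the poster's code: it assumes the script uses upper-case labels such as `VOICE1:` (as in the transcript preview above), the preset mapping is hypothetical, and `synthesize` is a placeholder for whatever function actually produces audio for a piece of text.

```python
import re

# Hypothetical mapping from script labels to voice presets; adjust to taste.
VOICE_PRESETS = {"VOICE1": "v2/en_speaker_1", "VOICE2": "v2/en_speaker_6"}

# A labelled line looks like "VOICE1: some text".
LINE_PATTERN = re.compile(r"^\s*([A-Z][A-Z0-9_]*)\s*:\s*(.+)$")


def split_by_speaker(script):
    """Turn 'LABEL: text' lines into ordered (speaker, text) turns."""
    turns = []
    for line in script.splitlines():
        match = LINE_PATTERN.match(line)
        if match:
            turns.append((match.group(1), match.group(2).strip()))
        elif line.strip() and turns:
            # An unlabelled line continues the previous speaker's turn.
            speaker, text = turns[-1]
            turns[-1] = (speaker, text + " " + line.strip())
    return turns


def render_turns(turns, synthesize):
    """Call synthesize(text, preset) once per turn and collect the results."""
    return [synthesize(text, VOICE_PRESETS.get(speaker)) for speaker, text in turns]
```

Keeping one preset per label is what keeps each character's voice stable from shot to shot; the resulting clips can then be handed to a video tool one turn at a time.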

desert vine Apr 29, 2023, 6:06 PM

#

cyan pawn which npz did u use?

Here you go my man!

📎 female_soft01.npz

cyan pawn Apr 29, 2023, 6:13 PM

#

desert vine Here you go my man!

thanks

night spoke Apr 29, 2023, 7:47 PM

#

I used bark to convert voice lines in a software called "Crew Chief" which is a program that connects with racing games to provide voice guidance while racing https://www.youtube.com/watch?v=mHiSSWUfDb0

YouTube

Aer

Crew Chief with AI voice

An example of AI-generated voices for crew chief

ebon widget Apr 29, 2023, 9:55 PM

#

night spoke I used bark to convert voice lines in a software called "Crew Chief" which is a ...

very cool!!

brave bramble Apr 29, 2023, 9:58 PM

#

man :hi loli how are you today?
girl: hi Petir I'm good

#

man :hi loli how are you today?
girl: hi Petir I'm good

#

😭

night spoke Apr 29, 2023, 10:00 PM

#

ebon widget very cool!!

Thanks! It's a lot of work, thousands of wav files, but this voice model is really impressive

chrome tapir Apr 30, 2023, 1:04 AM

#

psychobot laff

#

catchy

#

if you ask bark to drop the bass it really drops the bass

#

dat sine sweep at 8 seconds holy shit

#

female singer

#

lets see if history file works for that

#

you can do it allllllllllllllllllllllllllllllllllllllllllllllllllllllll by yourself

green tartan Apr 30, 2023, 3:15 AM

#

my favorite thing so far. it improvised a little ditty.

chrome tapir Apr 30, 2023, 7:33 AM

#

edm style

plain skiff Apr 30, 2023, 9:56 AM

#

made with chatGPT4:
In the depths of the ocean, creatures glow,
Bioluminescence, a light show,
A jellyfish whispers, "I bet you can't see,
I'm 95% water, just like tea!" [laughs]

Volcanoes erupt, spewing lava and ash,
Their molten rock flows, in a fiery flash,
A mountain yells, "I don't mean to boast,
But when I blow my top, I make the best toast!" [laughs]

The Earth is round, spinning with grace,
A giant blue marble, floating in space,
A cheeky astronaut once said with a grin,
"Gravity's the reason we don't just fly off into the wind!" [laughs]

Einstein was brilliant, his theories profound,
E=mc², a formula renowned,
He quipped with a smile, "It's all relative, you see,
The faster I go, the younger I'll be!" [laughs]

In this world of wonders, mysteries, and jokes,
Nature and humor, together they coax,
A laugh and a lesson, they bring us delight,
In this beautiful world, we take flight.

granite quiver Apr 30, 2023, 9:57 AM

#

I tried to copy Scatman John's voice, not yet successful. The line "Scatman, fatman, black and white man, tell me about the color of your soul" from Scatman's World ended a bit creepy in my opinion.

stuck stag Apr 30, 2023, 12:00 PM

#

how do u run it with mps? (my mps is enabled, just wanna know how to make bark work with it)

edgy mango Apr 30, 2023, 7:53 PM

#

stuck stag how do u run it with mps? (my mps is enabled, just wanna know how to make bark w...

https://github.com/wsippel/bark_tts

GitHub

GitHub - wsippel/bark_tts: Oobabooga extension for Bark TTS

Oobabooga extension for Bark TTS. Contribute to wsippel/bark_tts development by creating an account on GitHub.

#

-- extension bark_tts

toxic helm Apr 30, 2023, 9:52 PM

#

I'd love more decent female voices, seems like bark is much better at male voices. This is the only one from the list I've found actually produces decent sounding results

quasi oyster Apr 30, 2023, 10:06 PM

#

using count floyd to get short form and long form generations

#

long egret Apr 30, 2023, 10:29 PM

#

long egret Apr 30, 2023, 11:18 PM

#

it's that infamous book story that Butters wrote, in South Park. this is an example of getting longer than a 13 second output by concatenating the output arrays into a single waveform before exporting that and compressing into MP3.
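The approach described here (generate in pieces, then join the arrays into one waveform) can be sketched as follows. This is a minimal illustration, not long egret's actual code: `split_sentences` and `generate_long` are hypothetical names, the regex splitter is a naive stand-in for a real sentence tokenizer, and `generate` is a placeholder for whatever call returns audio samples for one chunk of text.

```python
import re


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [part for part in parts if part]


def generate_long(text, generate):
    """Generate each sentence separately and join the samples into one waveform.

    `generate` takes a string and returns a sequence of audio samples;
    it stands in for the model call.
    """
    waveform = []
    for sentence in split_sentences(text):
        waveform.extend(generate(sentence))
    return waveform
```

Because each piece stays within the short per-generation limit, the total length is bounded only by how many pieces are joined before export.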

obtuse sparrow May 1, 2023, 12:19 AM

#

Roughly nine minutes of Bark voices being obsessed with mares: https://youtu.be/_tHnB4BpRRg

YouTube

Hazy Skies

[Suno.ai Bark] Mare Mare Mare

Over nine minutes of various expressing love and praise for mares using the latest broad text-to-audio AI; Bark. It's capable of all sorts of audio input from text including speech, singing, music, sound effects, instrument samples, screeching, strangeness, etc.

https://github.com/suno-ai/bark | https://huggingface.co/spaces/suno/bark

All im...

long egret May 1, 2023, 12:39 AM

#

britney not-spears - hit me baby, one more time as sung/read by suno-bark, concatenating on line with custom voice

torn ferry May 1, 2023, 1:49 AM

#

toxic helm I'd love more decent female voices, seems like bark is much better at male voice...

I'm a big fan of this one-- #🐶┃bark-technical message

long egret May 1, 2023, 4:52 AM

#

Really enjoying the different voices. This is the radio talk show.

quasi oyster May 1, 2023, 5:57 AM

#

used oobabooga, stable vicuna, bark, sd, and sad talker to make this:

#

https://www.youtube.com/watch?v=kIaFq7EahqQ

YouTube

boricuapab

Oobabooba + StableVicuna + SD + Bark = Time Traveling Indy Story

#oobabooga #stablevicuna #stablediffusion #bark

#

bark's able to read stories quite naturally

grim pumice May 1, 2023, 4:25 PM

#

I put together a video of some of the prompts I’ve tried with Bark:

https://youtu.be/jo8hAqmXmG0

YouTube

fofr ai art and experiments

Amazing text-to-audio with Bark, a prompting guide

See how weird and wonderful Bark can be with these experiments and prompting guide.

Try it out here:
https://replicate.com/suno-ai/bark

Code here, if you want to run locally:
https://github.com/suno-ai/bark

long egret May 1, 2023, 11:12 PM

#

@quasi oyster it's better to just add the music on top using nonlinear editor 😄

dusty lantern May 1, 2023, 11:21 PM

#

long egret May 1, 2023, 11:40 PM

#

made a longer form rendering of the rules for the server

fathom wadi May 1, 2023, 11:41 PM

#

mmkay lol

brittle cobalt May 2, 2023, 12:21 AM

#

okay yeah the music functionality is not overly precise at the moment lmao

edgy mango May 2, 2023, 1:14 AM

#

quasi oyster used oobabooga, stable vicuna, bark, sd, and sad talker to make this:

I was hoping this would be possible can you clarify how you got the sad talker working with ooba?

quasi oyster May 2, 2023, 1:39 AM

#

edgy mango I was hoping this would be possible can you clarify how you got the sad talker w...

was a bit of work to get to this final result, I didn't link sad talker with oobabooga, here's my workflow

generate a story from ooba
copy the generated story into bark and generate the audio for it
take a front view of one of your sd character generations and upscale it to 1536x1536
generate the sad talker character using the bark audio, I tried using the long 2 minute audio, but my pc ran out of ram, so I did it in sections
put it all back together in a video editing software

next girder May 2, 2023, 2:01 AM

#

interesting, I put in "MAN: [laughs] How about we [beep] this place up!" and it said an actual swear instead of [beep]! Interesting... usually beeps actually work

#

same prompt, for the same speaker ("v2/en_speaker_1") hahaha these are so weird

#

this is an instance where the [beep] was generated correctly

next girder May 2, 2023, 2:08 AM

#

long egret Really enjoying the different voices. This is the radio talk show.

also i really like this! how did you do it?

quasi oyster May 2, 2023, 2:20 AM

#

here's another raw infinite bark story result:

#

took it into adobe audio enhancer, but it changed up some words

#

@long egret how're you enhancing the audio?

long egret May 2, 2023, 2:31 AM

#

quasi oyster @long egret how're you enhancing the audio?

i am not enhancing it other than editing it in Kdenlive

#

the radio one isn't enhanced or edited at all. it's just straight out of Bark using announcer voice

long egret May 2, 2023, 4:33 AM

#

StableLM telling a story about hot dogs for some reason.

#

Finally we have a definitive answer on why six is afraid of seven @tardy topaz

granite quiver May 2, 2023, 9:15 AM

#

An old man.

icy shore May 2, 2023, 10:25 AM

#

Story in the style of a redditor (by ChatGPT), using the long generation (advanced) method as explained by official Readme

night spoke May 2, 2023, 12:33 PM

#

Version 2 of the crew chief for racing games. I have converted over 5000 wav files into the en_speaker6 model. Here is the result https://youtu.be/lqCAgZaknDg

YouTube

Aer

Crew Chief with AI voice v2

An updated example of AI-generated voices for crew chief

fast ferry May 2, 2023, 4:56 PM

#

not sure what is wrong but audio tends to distort as it progresses

#

using smaller model btw

pliant spruce May 2, 2023, 5:45 PM

#

hearty forge May 2, 2023, 5:58 PM

#

long egret StableLM telling a story about hot dogs for some reason.

I'm curious how long did this take to generate

gaunt laurel May 2, 2023, 6:15 PM

#

Hello ☺️ I'm completely obsessed with AI creations and Bark is becoming a firm favourite! I wanted to share this here with you. Made with Gen-2 and Bark. It's very silly, I hope you like it. https://twitter.com/emmacatnip/status/1651040268613763074?s=20

Emma Catnip (@emmacatnip)

"If I only ever, ever..."

#gen2 #bark #ai

long egret May 2, 2023, 6:28 PM

#

hearty forge I'm curious how long did this take to generate

about 4 minutes

quasi oyster May 2, 2023, 6:41 PM

#

updated to latest countFloyd commit

#

Im Puerto Rican and speak mostly spanglish, the spanish accent cracks me up

#

fallen rapids May 2, 2023, 7:23 PM

#

Loving the new update!

long egret May 2, 2023, 9:24 PM

#

@ebon widget aha i got it working

#

the prompt was:

{man} So, I was thinking I could come over around 3?
{woman} And what would we do?
...

long egret May 2, 2023, 9:42 PM

#

An episode of "I Think You Should Leave" with Tim Robinson, as written by ChatGPT

long egret May 2, 2023, 10:16 PM

#

If you liked that, here's another. Episode 2: The Talent Show.

ebon widget May 2, 2023, 11:01 PM

#

hehe awesome!

icy shore May 3, 2023, 1:46 AM

#

fast ferry not sure what is wrong but audio tends to distort as it progresses

Seems like you are using JonathanFly/bark. The technique it uses for longer generation keeps amplifying the previous chunk's style. I would recommend taking a look at the official guide on longer generations (as described in the README); it produces better output.

tardy topaz May 3, 2023, 2:29 AM

#

icy shore Seems like you are using JonathanFly/bark , the technique used for longer genera...

Cool. It looks like the difference is mostly that they use a fixed speaker prompt? On mine I defaulted to 100% feedback, but you can use --stability_mode to do that instead, FYI.

Then they add .25 seconds of silence between prompts. Probably good for the non feedback case but maybe not for progressive?
Semantic Temp lowered to 0.6
min_eos_p=0.05
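The quarter-second gap mentioned above can be sketched like this. It is an illustration only: `join_with_silence` is a hypothetical helper, the 24 kHz rate is an assumption about Bark's output (use the constant your install exposes), and the settings dictionary simply collects the values quoted in the message under made-up key names rather than any real API.

```python
SAMPLE_RATE = 24_000  # assumption: Bark's output sample rate


def join_with_silence(chunks, gap_seconds=0.25, sample_rate=SAMPLE_RATE):
    """Concatenate audio chunks, inserting silence between consecutive ones."""
    silence = [0.0] * int(gap_seconds * sample_rate)
    joined = []
    for index, chunk in enumerate(chunks):
        if index > 0:
            joined.extend(silence)  # pause before every chunk except the first
        joined.extend(chunk)
    return joined


# Values quoted in the message above; the key names are illustrative, not an API.
LONG_FORM_SETTINGS = {"semantic_temp": 0.6, "min_eos_p": 0.05, "gap_seconds": 0.25}
```

With a fixed speaker prompt each chunk is generated independently, so a short pause helps hide the seam; with progressive feedback the pause may be less useful, as the message suggests.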

rigid sluice May 3, 2023, 3:10 AM

#

Bark, animov(modelscope), AudioLDM, Blender & Generative AI add-on used: https://www.youtube.com/watch?v=ZbVR4fknEys

YouTube

tintwotin

How to make your own "Will Smith eating spaghetti" in Blender

Follow the instructions here for installing the add-on: https://github.com/tin2tin/Generative_AI

The narration is done with Bark and the music and sounds are done with AudioLDM. Video done with animov weights.

chrome tapir May 3, 2023, 6:31 AM

#

rigid sluice Bark, animov(modelscope), AudioLDM, Blender & Generative AI add-on used: https:/...

thats a nice suite you have

#

i might have to try blenderAI

quasi oyster May 3, 2023, 6:34 AM

#

I have yet to figure out how to use the sunoai jupyter notebook locally to setup a conversation, but was able to go about it using count floyd pretty easily

#

#

fast ferry May 3, 2023, 9:14 AM

#

icy shore Seems like you are using JonathanFly/bark , the technique used for longer genera...

Thank you, i will take a look again

tardy topaz May 3, 2023, 12:49 PM

#

icy shore Seems like you are using JonathanFly/bark , the technique used for longer genera...

Got it in last minute, barely tested though

#

@proud yacht So my one-click installer went fine 4 days ago when I tested it, randomly doesn't work this week, lol. Might just be a conda update? AHAH

marble pond May 3, 2023, 4:37 PM

#

long egret If you liked that, here's another. Episode 2: The Talent Show.

This sounds so good! What settings did you use?

long egret May 3, 2023, 4:51 PM

#

marble pond This sounds so good! What settings did you use?

upbeat hull May 3, 2023, 7:25 PM

#

long egret

what is this multimodel hub ?

long egret May 3, 2023, 7:58 PM

#

see my post about TRON in #🪦┃community-updates

quasi oyster May 4, 2023, 2:45 AM

#

https://www.youtube.com/watch?v=dzit5CFw3X4

YouTube

boricuapab

Oobabooga & Bark - Jonny Quest vs Robots

#oobabooga #stablevicuna #bark #jonnyquest

#

it seems that at the end of sentences after every period, the speaker seems to always choose to say either 'and' or 'umm'

ebon widget May 4, 2023, 3:23 AM

#

haha, you can probably lower the threshold min_eos_p to help with that?

long egret May 4, 2023, 4:23 AM

#

wild nacelle May 4, 2023, 8:23 AM

#

long egret see my post about TRON in #🪦┃community-updates

Very cool to see all the audio prompts, especially the music one like kpop_acoustic. I assume these are custom prompts that you created and use with npz files?

torn fern May 4, 2023, 8:47 AM

#

My first test 🙂 The poem/song is written by ChatGPT. At the end I had written "Thank you for listening!", which turned into "Thank you for, what's the name? ... ... ... John", haha.

fossil valley May 4, 2023, 10:47 AM

#

torn fern My first test 🙂 The poem/song is written by ChatGPT. At the end I had written "...

how did you create such a long audio? i am only able to create 14sec audio

torn fern May 4, 2023, 11:05 AM

#

fossil valley how did you create such a long audio? i am only able to create 14sec audio

I used this https://github.com/C0untFloyd/bark-gui it generates longer passages in chunks and combines them at the end

slow bane May 4, 2023, 11:59 AM

#

Hey everyone, welcome back to our channel! Today, we're going to explore sustainable living and resource management, and how small changes in our daily lives can make a big difference for our planet. If you're interested in making the world a greener place, then this video is for you! So, let's dive right in!

long egret May 4, 2023, 2:12 PM

#

A politician trying to explain that the Egg came first, and not the Chicken.

fast ferry May 4, 2023, 2:36 PM

#

tardy topaz Got it in last minute, barely tested though

Thanks for the update, it is working a lot better now.

grand ether May 4, 2023, 7:21 PM

#

{man} So, I was thinking I could come over around 3?
{woman} And what would we do?
{man} Let's just watch some anime memes [music] [songs]
{woman} [moan!]!

patent stone May 5, 2023, 5:23 AM

#

Hahaha, hello there, the weather is really nice today

severe musk May 5, 2023, 7:44 AM

#

#🐣┃suno-showcase how could this be possible?! today is still a rainy day!

#

test for huangxiaofa

robust sedge May 5, 2023, 10:59 AM

#

Anyone else exploring song generation?

#

Rock version this time. Mind blown!

quaint night May 5, 2023, 11:21 AM

#

bronze peak May 5, 2023, 11:44 AM

#

I don't know how

bronze peak May 5, 2023, 11:48 AM

#

patent stone Hahaha, hello there, the weather is really nice today

Do you know how to use it?

midnight sphinx May 5, 2023, 3:02 PM

#

introduce yourself

floral kernel May 5, 2023, 3:19 PM

#

floral kernel May 5, 2023, 6:43 PM

#

quaint night

how do you get such long outputs?

long egret May 5, 2023, 6:49 PM

#

use the long form generation notebook in the repo

floral kernel May 5, 2023, 7:00 PM

#

I tried but it doesnt seem to generate a sound file ):

valid harbor May 5, 2023, 7:05 PM

#

https://tenor.com/view/happy-teacher-appreciation-day-pencils-happy-teachers-day-appreciate-teachers-teachers-day-gif-17097525

Tenor

#

https://tenor.com/view/baby-yoda-may-the-force-be-with-you-star-wars-mandalorian-gif-17698540

Tenor

#

https://tenor.com/view/ضحك-صور-gif-23100362

Tenor

quaint night May 5, 2023, 7:12 PM

#

floral kernel how do you get such long outputs?

there are multiple UIs that allow >15s generation

tardy topaz May 5, 2023, 7:45 PM

#

floral kernel I tried but it doesnt seem to generate a sound file ):

self plug https://github.com/JonathanFly/bark

GitHub

GitHub - JonathanFly/bark: 🚀 BARK INFINITY GUI CMD 🎶 Powered Up Bar...

🚀 BARK INFINITY GUI CMD 🎶 Powered Up Bark Text-prompted Generative Audio Model - GitHub - JonathanFly/bark: 🚀 BARK INFINITY GUI CMD 🎶 Powered Up Bark Text-prompted Generative Audio Model

floral kernel May 5, 2023, 7:46 PM

#

long egret use the long form generation notebook in the repo

can you send me an example of the complete file for long form generation please?

long egret May 5, 2023, 7:48 PM

#

see JF's bark fork link there

quaint night May 5, 2023, 7:50 PM

#

well as long as we are self plugging https://github.com/rsxdalv/tts-generation-webui

though more people are familiar with JF's fork so you might get more support with it

GitHub

GitHub - rsxdalv/tts-generation-webui: TTS Generation Web UI (Bark)

TTS Generation Web UI (Bark). Contribute to rsxdalv/tts-generation-webui development by creating an account on GitHub.

tardy topaz May 5, 2023, 8:12 PM

#

quaint night well as long as we are self plugging https://github.com/rsxdalv/tts-generation-w...

Looks cleannnnnnn, and multiple models. An inspiration for when I clean things up

quaint night May 5, 2023, 8:14 PM

#

Thank you.
ngl gradio really has heavy limitations, I recommend doing research before investing too much time in a GUI with gradio

tardy topaz May 5, 2023, 8:14 PM

#

I finally went back and changed my text fields to number fields, like somebody who didn't discover them a week ago

#

Yeah I know, I am dying here

#

Like for example, the dropdowns. I kept trying to make them show a different name to the user than the actual value. And apparently this isn't actually a feature. Like, I thought that was a fundamental definition of a dropdown lol

#

So you have to pass a function and process it, like what

long egret May 5, 2023, 8:16 PM

#

#📚┃suno-school guys 😄

quaint night May 5, 2023, 8:16 PM

#

actually the amount of gradio pain is so large we could fill a channel with it, I'll create a thread in technical discussions

tardy topaz May 5, 2023, 9:07 PM

#

It's fascinating how if you use existing song lyrics, you don't need notes. This was sample #2 on a no-notes test, and it tracks the original melody pretty well!

#

Also I don't think line by line formatting makes a difference, or marginal if it does, it's just the lack of periods I believe

#

That acts like a music note

blissful pulsar May 6, 2023, 4:01 AM

#

Made with @tardy topaz's repo

tardy topaz May 6, 2023, 9:41 PM

#

I'm probably gonna be tied up for a couple days and never put out anything new, but there is a dev branch if you crave something new. https://github.com/JonathanFly/bark/tree/dev has some cool user templating stuff, and the main functions should be fine unless I broke it right before I pushed.

GitHub

GitHub - JonathanFly/bark at dev

🚀 BARK INFINITY GUI CMD 🎶 Powered Up Bark Text-prompted Generative Audio Model - GitHub - JonathanFly/bark at dev

lavish arrow May 7, 2023, 12:33 AM

#

@tardy topaz incredible stuff dude

#

the whoami speech is insane

gloomy helm May 7, 2023, 1:46 PM

#

how?

quasi oyster May 7, 2023, 4:36 PM

#

been having lots of fun converting short stories to audio
https://www.youtube.com/watch?v=H6aQ3NyPwPI

YouTube

boricuapab

Bark audio stories listening time!!

#bark #aivoices


untold smelt May 7, 2023, 5:53 PM

#

quasi oyster been having lots of fun converting short stories to audio https://www.youtube.co...

hello , how about time cost on a short story ? < 10 mins or ?

untold smelt May 7, 2023, 5:55 PM

#

quasi oyster been having lots of fun converting short stories to audio https://www.youtube.co...

and , add this to code , result would be 10x better .

#

quasi oyster May 7, 2023, 6:40 PM

#

untold smelt hello , how about time cost on a short story ? < 10 mins or ?

I didn't time each generation, but I would say on my WIN10 pc that has 2060 rtx super 8gb, and 32 gb ram, it took about 2-5 mins, the time traveling indiana jones one was probably between 5-8mins, using the large models with offloading

untold smelt May 7, 2023, 6:45 PM

#

so cost 15 min to generate 2 min audio ? maybe you too need update your software

deft junco May 7, 2023, 7:54 PM

#

run on gpu

#

much faster

untold smelt May 7, 2023, 7:56 PM

#

deft junco run on gpu

yes , solved , now it just cost 3.5 minutes

deft junco May 7, 2023, 7:56 PM

#

how would you get audio without using the notebook? i run plain python on pc and got output in bool format, playable with vlc, not plain windows format wav.

#

not sure why yet

untold smelt May 7, 2023, 7:57 PM

#

just use ffmpeg to convert it to mp3 or something

deft junco May 7, 2023, 7:57 PM

#

#audio(np.concatenate(pieces), rate=SAMPLE_RATE)
audio_array = (np.concatenate(pieces))
write_wav("bark_generation.wav", SAMPLE_RATE, audio_array)

#

makes it , but bool form

untold smelt May 7, 2023, 7:58 PM

#

as long as you get the audio , you can easily make changes to audio file .

deft junco May 7, 2023, 7:58 PM

#

i think it is the encoding part

#

what does the encoding?

#

i get audio

#

but my other projects get a windows playable format

#

just trying to figure it out now

untold smelt May 7, 2023, 8:00 PM

#

😹 i'm new too . don't have answer , yet

deft junco May 7, 2023, 8:00 PM

#

i'm new too

#

🙂

#

this is what i get but plays fine in vlc

#

not windows

#

download and try it

untold smelt May 7, 2023, 8:14 PM

#

deft junco this is what i get but plays fine in vlc

sounds perfect , are you using bigger model ?

deft junco May 7, 2023, 8:55 PM

#

no small models

#

small gpu

#

also plays in itunes

#

but not in my windows players lol

#

also got it to play from array only

#

like if you wanted a conversation with a chatbot

#

RTX 3060TI

#

#

Float values allow for more precise representations of the data, which can be important for maintaining fidelity during encoding and decoding. Additionally, some encoding algorithms may require float values as input to perform certain operations, such as normalization or feature extraction. Overall, using float values can help ensure that the encoded data is as accurate and representative of the original data as possible.
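One likely cause of the "plays in VLC but not in Windows players" problem above, assuming the generated array holds floating-point samples: `scipy.io.wavfile.write` picks the WAV sample format from the array's dtype, so a float array becomes a float WAV, which some players refuse. A minimal sketch of writing plain 16-bit PCM with only the standard library follows. `write_pcm16_wav` is a hypothetical helper, and it assumes mono samples in the range -1.0 to 1.0.

```python
import array
import sys
import wave


def write_pcm16_wav(path, samples, sample_rate):
    """Write mono float samples in [-1.0, 1.0] as a 16-bit PCM WAV file."""
    pcm = array.array("h")  # signed 16-bit integers
    for s in samples:
        clipped = max(-1.0, min(1.0, float(s)))  # guard against overshoot
        pcm.append(int(round(clipped * 32767)))
    if sys.byteorder == "big":
        pcm.byteswap()  # WAV data is little-endian
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)            # mono
        wav.setsampwidth(2)            # 2 bytes = 16 bits per sample
        wav.setframerate(sample_rate)
        wav.writeframes(pcm.tobytes())
    return len(pcm)
```

With the snippet from the chat this would be called as `write_pcm16_wav("bark_generation.wav", audio_array, SAMPLE_RATE)`; iterating over a NumPy array works, just slowly for long clips.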

floral kernel May 8, 2023, 6:56 AM

#

tardy topaz self plug https://github.com/JonathanFly/bark

Thanks for sharing, but please change the format of your git page, it literally damages my retinas to see that many emojis in one place, otherwise very nice

tardy topaz May 8, 2023, 8:23 AM

#

floral kernel Thanks for sharing, but please change the format of your git page, it literally ...

Probably not the readme file itself (because as far as I know there's no way to have two versions/themes in GitHub), but for the actual app I plan on having a non-silly emoji-free display mode

#

Well I could link to the different readme, that'd be easy

floral kernel May 8, 2023, 1:22 PM

#

tardy topaz Probably not the readme file itself, (because as far I know there's no way have ...

Is your fork less GPU intensive? It seems to run faster

tardy topaz May 8, 2023, 1:23 PM

#

floral kernel Is your fork less GPU intensive? It seems to run faster

I just defaulted offloading to CPU on, but it's a setting in the regular Bark. It's hard to tell the difference in speed

#

It might even be faster, for some reason (more free GPU memory?)

viral lynx May 8, 2023, 2:43 PM

#

Did you know? You can extract the source audio from a history prompt npz. here are some examples from the default history prompts

#

haven't been able to extract the source audio from the announcer yet though, as that one is in uint16 format, while the others are in int32 or int64 (although i did have to manually convert int64 since gradio didn't have it built in. (just do data / 4295229444))

#

nvm don't do that division lol

floral kernel May 8, 2023, 2:49 PM

#

That quality is insane, have you trained that model for those voices?

viral lynx May 8, 2023, 2:49 PM

#

floral kernel That quality is insane, have you trained that model for those voices?

these are the official voices, i just extracted the audio files they were based on

tardy topaz May 8, 2023, 2:50 PM

#

You can do more than that! I made it into a feature:

viral lynx May 8, 2023, 2:51 PM

#

interesting

tardy topaz May 8, 2023, 2:52 PM

#

I turned the mutation up too high, but you can generate more subtle variations. Just access the semantic prompt, and treat it like a new sample. In this version I also chop it up, so it's super diverged

#

I went a little overboard on the RNG for that one, but the more moderate one is really good if you have a weird noise or hum

#

it'll usually generate a variant without it

viral lynx May 8, 2023, 2:58 PM

#

i'm mainly looking into how bark history prompts work so i can attempt to make a voice cloner that generates the semantic prompt from the actual audio file instead of just generating one with the same text and then praying that it works lol

tardy topaz May 8, 2023, 2:59 PM

#

You'd have to train a model

#

here's a en_speaker_03 variant who doesn't hmmmm as much, probably

📎 en_speaker_3_var_10_1.wav.npz

viral lynx May 8, 2023, 3:00 PM

#

tardy topaz You'd have to train a model

yeah i figured

#

creating the training data will be easy but time consuming. and i'd probably use a markov chain to just quickly create a bunch of text for it lol

#

also the bark in my webui is like a monkeypatched frankenstein's monster from what it originally was lol

tardy topaz May 8, 2023, 3:04 PM

#

mine too, it's such a mess i keep not integrating my actual new stuff

#

ugh

#

i've got some real smooth long clips now

viral lynx May 8, 2023, 4:07 PM

#

funny to reuse semantics, you get a different voice. but same speech patterns (notice the "ssimilar" and the "like-")

small raven May 8, 2023, 6:53 PM

#

asda

#

||haha||

sharp mural May 8, 2023, 10:34 PM

#

Anyone knows how to create deep, rough voice like old man?

viral lynx May 8, 2023, 11:41 PM

#

sharp mural Anyone knows how to create deep, rough voice like old man?

Probably trial and error. Although my voice cloning method will probably be easier, but i haven't really finished it yet and i don't want to release it while unfinished.

boreal crystal May 9, 2023, 5:46 AM

#

How do you use this again?

plain sleet May 9, 2023, 5:47 AM

#

boreal crystal How do you use this again?

Same question, haha

tardy topaz May 9, 2023, 9:17 AM

#

sharp mural Anyone knows how to create deep, rough voice like old man?

There are some ways I have played with, but honestly the simplest way still works best. 1. Write a text prompt that sounds like a rough older male voice, something they would say 2. Generate 100 random voices, pick the best.

#

Save the best voice, use it for your actual text

quaint night May 9, 2023, 11:41 PM

#

not sure if someone has tried this prompt before but mildly amusing

#

same prompt, second generation:

tardy topaz May 10, 2023, 5:17 AM

#

I used this speech too much as a test sample to make sure nothing was bugged, starting to hear it in my dreams

quaint night May 10, 2023, 8:08 AM

#

tardy topaz I used this speech too much as a test sample to make sure nothing was bugged, st...

This seems like a Balenciaga video in the making

agile jungle May 10, 2023, 8:10 PM

#

warm pond May 10, 2023, 10:21 PM

#

tardy topaz I used this speech too much as a test sample to make sure nothing was bugged, st...

love that movie. The AI does a decent job but hard to match the real Fletcher

tardy topaz May 11, 2023, 1:06 PM

#

The very first music I tried was this silly Korean nonsense song and it's still one of the longest coherent clips, musically (also a bunch of others in the YouTube) https://www.youtube.com/watch?v=4pV9d25KqCE

YouTube

Jonathan Fly

ChatGPT draws a Salt Bread Cat and writes a song.

A silly experiment with multi-lingual AI text, drawing, and music.
다국어 인공지능 텍스트, 음악, ChatGPT 그리기 및 노래로 바보 같은 실험.

If you're seeing this silly experiment in your YouTube feed, I apologize, I checked the box that says "don't publish this video" and I thought that's all I have to do. But I haven't made a video in three years I forget how this works...


#

The second segment was one continuous run, fully feeding back the last clip as the full history for the next clip, no fancy merging or any tricks, but it somehow stayed coherent. The trick is just a single guitar I guess, not too complicated

#

I wish I had been saving exact generation parameters but at this time was totally trying random things

#

The only times I've gotten similar coherent Bark output is when I used a very well known song, and it literally outputs an approximation of the melody and chords. But this is the best so far with novel text
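The "feed the last clip back as the full history for the next clip" approach described above can be sketched as a simple loop. This is a hedged illustration only: `generate_chained` is a hypothetical helper, and `generate(text, history)` is a stand-in callable assumed to return the new audio plus an updated history, not a real Bark signature.

```python
def generate_chained(segments, generate, history=None):
    """Generate clips in sequence, feeding each clip's result back as history.

    `generate(text, history)` is a hypothetical callable returning
    (audio, new_history); it stands in for a real model call.
    """
    clips = []
    for text in segments:
        audio, history = generate(text, history)  # history carries the last clip
        clips.append(audio)
    return clips, history
```

Because each step sees only the previous clip's history, any drift accumulates, which may be why a simple single-instrument sample stayed coherent longer.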

topaz dust May 13, 2023, 12:36 AM

#

Hey Im new

fervent briar May 13, 2023, 1:13 AM

#

Just posted a song I made using bark!

https://www.instagram.com/p/CsKgu-5NWCh/?igshid=MzRlODBiNWFlZA==

weary adder May 13, 2023, 6:13 AM

#

Just started experimenting with bark, hope more sounds can be added

tardy topaz May 13, 2023, 6:24 AM

#

fervent briar Just posted a song I made using bark! https://www.instagram.com/p/CsKgu-5NWCh...

I got a 'video has no sound' error, FYI

tardy topaz May 13, 2023, 6:24 AM

#

weary adder Just started experimenting with bark, hope more sounds can be added

search this Discord for .npz to find a bunch. Hm actually that doesn't return files, but just scroll up in #🐶┃bark-technical message

acoustic umbra May 13, 2023, 10:58 AM

#

Can anyone tell if i can use bark for creating podcast audio and upload it on YouTube ?

fervent briar May 13, 2023, 2:44 PM

#

tardy topaz I got a 'video has no sound' error, FYI

Damn really?? Not seeing that on my end. Thanks for the heads up

tardy topaz May 14, 2023, 5:50 AM

#

One shotting full music seems tough, but you can gen good beats to build from

abstract basin May 14, 2023, 10:09 AM

#

very nice

#

Punjab Caretaker Chief Minister Mohsin Naqvi has said that staging protests is the right of every political party but “when those political workers reach Cantt, they convert into terrorists”.

“The worker of a political party cannot attack Jinnah House (Lahore Corps Commander House), a terrorist has done it,” he said, adding that around 400 people had gone inside the building while 3,400 were outside.

“No matter what happens, we will not sit idle until each and every person involved in this is arrested.”

He said that there was “no doubt” that these protests were “pre-planned”.#adio

turbid sparrow May 14, 2023, 2:14 PM

#

Painful to listen to. With this voice it just sounds like a foreigner who speaks Chinese very well.

#

How do you train a model yourself?

languid canyon May 14, 2023, 4:18 PM

#

I'd like to know too

#

How do you set this up?

heady fractal May 14, 2023, 4:46 PM

#

So, we did some benchmarking with Bark on an H100 and the results were very promising. Also, thanks @tardy topaz for the audio snippets. 😊

#

https://neocadia.com/updates/bark-open-source-tts-rivals-eleven-labs/

Neocadia

Bark: Real-time Open-Source Text-to-Audio Rivaling ElevenLabs

If the trend of improvement continues, we can expect that the newly available H100 GPUs will be able to perform real-time inference for every type of audio synthesis.

slim jacinth May 15, 2023, 1:11 AM

#

heady fractal https://neocadia.com/updates/bark-open-source-tts-rivals-eleven-labs/

This is dope!!

#

Can you share the source code for processing in batches? My understanding is bark out of the box doesn't support batch inference. If you guys built this, it'd be awesome to take a look at how you did it!

heady fractal May 15, 2023, 1:24 AM

#

We aren't doing batch inference, we're just batching up the requests in order.

What is batch inference in this case?

#

Is it just taking 10 sentences and running them at the same time or is it more advanced?

#

Source code for what we used either way can be found here: https://github.com/neocadia/bark/blob/feature/add-http-api/bark/serve.py

GitHub

bark/serve.py at feature/add-http-api · neocadia/bark

🔊 Text-Prompted Generative Audio Model. Contribute to neocadia/bark development by creating an account on GitHub.
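The distinction drawn above, queuing requests and running them in order versus true batch inference, can be illustrated with a small sketch. It is not taken from the linked `serve.py`; `serve_in_order` and `synthesize` are hypothetical names. The point is that the model call only ever sees one prompt at a time.

```python
import queue
import threading


def serve_in_order(requests, synthesize):
    """Run each text request one at a time, in arrival order.

    `synthesize` is a stand-in for whatever turns text into audio; it sees a
    single prompt per call, so the model never processes a true batch.
    """
    pending = queue.Queue()
    results = []

    def worker():
        while True:
            item = pending.get()
            if item is None:  # sentinel: no more work
                break
            results.append(synthesize(item))

    thread = threading.Thread(target=worker)
    thread.start()
    for text in requests:
        pending.put(text)
    pending.put(None)
    thread.join()
    return results
```

Batch inference, by contrast, would stack several prompts into one forward pass through the model, which needs padding and per-sequence stopping logic inside the generation code itself.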

tardy topaz May 15, 2023, 7:07 AM

#

https://soundcloud.com/jonathan-fly-620508219/bark-all-night-beats

SoundCloud

Jonathan Fly

Bark All Night Beats

Beats generated with the TTS model Bark


#

Not a song just some recent favorite samples

slim jacinth May 15, 2023, 2:58 PM

#

heady fractal Source code for what we used either way can be found here: https://github.com/ne...

Sweeet thanks for sharing!

rigid sluice May 15, 2023, 5:59 PM

#

Made with Bark: https://www.youtube.com/watch?v=ep47PDXQvwk

YouTube

tintwotin

'Monsters, Monstrous No More!' - text2video in Blender

Get the free text-to-video add-on for Blender here: https://github.com/tin2tin/Generative_AI
Voice by Bark: https://github.com/suno-ai/bark
Music by szegvari: https://freesound.org/people/szegvari/
Text by chatGPT


long egret May 15, 2023, 6:22 PM

#

blissful pulsar May 15, 2023, 10:21 PM

#

long egret

Gives me 90s radio vibes with the male voice

lofty flint May 16, 2023, 2:17 PM

#

rigid sluice May 16, 2023, 5:46 PM

#

Bark, Animov(Modelscope) and chatGPT: https://www.youtube.com/watch?v=gm50m_yEyCQ

YouTube

tintwotin

Driving Rain - text2video made in Blender

Free Blender add-on: https://github.com/tin2tin/Generative_AI


tardy topaz May 17, 2023, 6:35 AM

#

"What if Trevor Noah was French?" Preview of future Bark Infinity fun features. Don't hammer me with questions yet, still sorting it out and will do writeup later. I am big time under water with real life work so not till weekend at earliest - I need to chill on Bark experiments, like seriously seriously. But experiments like this is why Bark Infinity hasn't been updated. The future of Bark is bright. We haven't seen anything yet. 👀⏳➡️🌟

pastel venture May 17, 2023, 6:58 AM

#

I love you

spare cipher May 17, 2023, 10:14 AM

#

anyone here cloned any famous person voice?

#

like trump / elon musk / biden / david attenborough etc?

viral lynx May 17, 2023, 1:28 PM

#

spare cipher like trump / elon musk / biden / david attenborough etc?

i need data

tardy topaz May 18, 2023, 4:14 AM

#

Tricky to thread the needle between 'speak with any accent you want' and 'speak with a random speech impediment'

pliant spruce May 18, 2023, 5:04 AM

#

The Trump voice clones pause in the middle of speech but also don't stop talking while doing it because the only data they have for the clone is the real Trump - who pauses every few words, lol

dusk plover May 18, 2023, 6:12 AM

#

funny

jolly patio May 18, 2023, 7:10 AM

#

what is your name

pliant spruce May 18, 2023, 7:19 AM

#

@tardy topaz check this out lol

tardy topaz May 18, 2023, 7:21 AM

#

I hear something like that a lot. One thing I always hear is sound like that instead of applause, like in a talk show or crowd. The crowd always sounds like static.

#

So many weird artifacts

#

These were being messed with, so not really natural, but still weird

pliant spruce May 18, 2023, 7:32 AM

#

This is more of what I was going for

tardy topaz May 18, 2023, 7:33 AM

#

I think it's tricky, whispering has a very specific like microphone tone

pliant spruce May 18, 2023, 7:33 AM

#

Actually, I did get a few whispers

#

tardy topaz May 18, 2023, 7:35 AM

#

I think it'll work, just be trickier. I managed to use sets of voices to influence other voices to have similar accents, but a whisper is probably a little more subtle effect, so I bet you have to do a bit more work to tease it out

#

Even the accents more often than not just cause weird speech problems

#

I bet you just need a lot of good samples. Like if you had 1000 clear non whispering voices, and 100 whispers, you could probably sort of take the difference between them, and get an idea for what tokens to push

pliant spruce May 18, 2023, 7:38 AM

#

I'm trying to find the right words for generating cuz there are certain things you can say to influence it, but 90% of the time it'll just make all the results bad

tardy topaz May 18, 2023, 7:38 AM

#

It's probably not really worth it honestly, versus just randomly finding some cool voices that sound like they are whispering

pliant spruce May 18, 2023, 7:38 AM

#

Like a specific sentence will get the results I want

tardy topaz May 18, 2023, 7:39 AM

#

But like, that sort of general workflow, it might work for any style. Set of voices like X, versus set like Y, take the difference, use that as a nudge

#

that's the long term idea

#

But right now it's like 1/3 almost passable french accents and 2/3 people who sound like their lips were tied together or get stuck on a syllable

pliant spruce May 18, 2023, 7:40 AM

#

tardy topaz But right now it's like 1/3 almost passable french accents and 2/3 people w...

I've also tried tongue twisters and one of the results paused, laughed, and gave up on it

#

Half of the others got mixed up

tardy topaz May 18, 2023, 7:41 AM

#

Interesting

#

That's a fun result

#

Like, the model is not reading your text. But really it's just actually being smart about what a real human would sound like

pliant spruce May 18, 2023, 7:41 AM

#

Shy sheep show sheepish smiles. <- that was the prompt

#

One of the results paused to get "smiles" right

#

a brief pause

tardy topaz May 18, 2023, 7:42 AM

#

Honestly that's a great little showcase for what makes Bark different. Bark will screw up if you give a tongue twister! That's why it's so cool.

#

If it wasn't way too late AM I might try myself...

#

I did try and do Math using the force to keep going flag, you know, can bark add two numbers together? But it wasn't that interesting so far

pliant spruce May 18, 2023, 7:46 AM

#

Possibly

tardy topaz May 18, 2023, 7:47 AM

#

If you prompt something like, "You want me to shut up? Ok I'll be quiet. (Then some other sentence)"

#

You might get like grumble or a whisper

#

Like imagine a TV show scene or something that would be half normal and then switch

#

I don't like using [whispers] and stuff, just feels like the overall quality is worse

#

Though if you DO get a good voice, that's probably one you can use that tag with

pliant spruce May 18, 2023, 7:49 AM

#

I usually have more success with throwing [yawn] or [yawns] somewhere in the middle of a sentence, but if it doesn't work it ruins it

#

cause it makes the result sleepy

tardy topaz May 18, 2023, 7:49 AM

#

Yeah I agree, if you HAVE to use a tag, put it in the middle, between two normal text blocks

pliant spruce May 18, 2023, 7:50 AM

#

An example.

tardy topaz May 18, 2023, 7:50 AM

#

Man that's hilarious

#

I'm actually gonna run like a million tongue twister samples, someday. Make a 10 hour youtube video

pliant spruce May 18, 2023, 7:52 AM

#

And this one gets it right.

tardy topaz May 18, 2023, 7:56 AM

#

Have you tried like, ridiculously large and complicated words to pronounce? If Bark is good, the person will like pause, maybe think for a sec, and then struggle through it?

pliant spruce May 18, 2023, 7:56 AM

#

Not yet.

tardy topaz May 18, 2023, 7:56 AM

#

I haven't tried, but hopefully it's like that, that'd be cool

pliant spruce May 18, 2023, 7:57 AM

#

I think I tried superscript or something and it ends up saying random numbers and letters instead

#

either that or small caps

tardy topaz May 18, 2023, 7:57 AM

#

Hah. I wonder if there are actually some speakers that can like, perfectly do math equations. Like it must be in the training data, math classes on youtube or something

#

But not sure what the subtitles look like there

#

Probably too inaccurate

pliant spruce May 18, 2023, 8:01 AM

#

this may be because this phrase is used too often

tardy topaz May 18, 2023, 8:05 AM

#

Maybe random can do it because that's the text that created them. And then an existing speaker might struggle?

#

Like if somebody chose to say that word on TV

#

it's probably like, not a problem for them

#

I'm not sure if it can get stuff like, "You want me to say what? Blah" and then it should have trouble, right?

#

That's probably a pretty common pattern

#

One thing I wanted to try is like, a prompt that is only used as setup. Like you say (I can't pronounce that!) and it gives that to the speaker, renders it, and then uses it in the next sample. But it's not part of your final clip.

#

So you just use it to try and change the audio style

pliant spruce May 18, 2023, 8:09 AM

#

Haven't tried anything like that yet

tardy topaz May 18, 2023, 8:10 AM

#

Not a high priority. I bet it works but just super randomly, so really, not that useful

#

He meditated so hard he escaped our universe, leaving a giant hole, causing a traffic accident as you can hear in the horn. A short story with just the letter M, never mind six words.

#

That en_fiery voice is such a trooper, no matter what you give it. Even sings Baby Shark with genuine enthusiasm.

tardy topaz May 18, 2023, 11:59 AM

#

I'm really starting to genuinely dig the not-quite-singing but not-quite-talking way Bark renders a lot of songs. It's musical, but not sung.

heady fractal May 18, 2023, 12:30 PM

#

@tardy topaz I booted up your UI, looks pretty sweet. It's got only the Suno default voices in it though. Are yours in there yet!

#

?*

pliant spruce May 18, 2023, 9:16 PM

#

@tardy topaz Best one so far for whispering.

tardy topaz May 19, 2023, 12:53 AM

#

heady fractal @tardy topaz I booted up your UI, looks pretty sweet. It's got only the...

SOON. There are many I've posted in this Discord that are good, if you search

viral stag May 19, 2023, 3:27 AM

#

I'm really starting to genuinely dig the not-quite-singing but not-quite-talking way Bark renders a lot of songs. It's musical, but not sung

tardy topaz May 19, 2023, 3:32 AM

#

Honest to God Bark out of the Bark is basically a perfect "Spoken Sung" model. As best exemplified in the classic William Shatner "Rocket Man" clip that I can't believe is such bad quality on YouTube that honest to god I might have a better copy on VHS somewhere. How was this not preserved OMG https://www.youtube.com/watch?v=lul-Y8vSr0I

YouTube

The Museum of Classic Chicago Television (www.FuzzyMemories.TV)

William Shatner "Sings" 'Rocket Man' (1978) - BEST QUALITY!

From The Science Fiction Film Awards, William Shatner's unforgettable performance of Elton John's "Rocket Man".

Includes Karen Black's introduction of Bernie Taupin, and Taupin's introduction of Shatner.

Rock-It, Man... :-)

This aired on local Chicago TV on Friday, January 20th 1978.

About The Museum of Classic Chicago Television:

The Muse...


#

We need the best science and the best AI to restore that clip. All the alternatives on YouTube seem terrible. This is critical.

#

Maybe Bark... I can clear up Bark generations with enough sweat and luck. Once you can reverse semantic like Mylo is working on, and might be days away. Then encode all the lyrics into Bark, regenerate as clear. Or maybe it just works perfect with a single Shatner model and you don't need to do each lyric. Because Bark is that good, seriously.

pliant spruce May 19, 2023, 4:24 AM

#

@tardy topaz I have 29 good/okay results so far, do you want them :p

tardy topaz May 19, 2023, 4:36 AM

#

pliant spruce @tardy topaz I have 29 good/okay results so far, do you want them :p

I'm gonna try and do some basic grunt work first, bugs in Bark Infinity, install easier. But if you send them (drop box, google, whatever) I might take a look this weekend. Have you tested how the speakers generate if you use them? Do you need to use special text to get them to talk in a whisper?

#

It's okay if they don't match the text BTW!

#

as long as they make natural sounding audio outputs that sounds like a real person speaking

#

then it probably doesn't matter, the way I use them, which is just as a target reference point to nudge bark towards those tokens

pliant spruce May 19, 2023, 4:37 AM

#

I haven't tested yet, I've just been generating them. All of them match the text. One of them is 11 best results out of 200, another is 15 out of 100.

tardy topaz May 19, 2023, 4:38 AM

#

The important thing is, if you use the voice, does it sound like whispering? Or maybe it does but only if you prompt them correctly. The first is the best, the second is still useful, but keep in separate groups

#

Like whispering I mean

#

The first is really good though, since that's what we're going for, just make ANY speaker file whisper with no special prompt

#

BTW if you happen to get any really nice clear singers, send me!

#

I need more

pliant spruce May 19, 2023, 4:39 AM

#

tardy topaz Like *whispering* I mean

They are actual whispers

tardy topaz May 19, 2023, 4:39 AM

#

Oh can you try and vary your prompt? I know it's a pain

#

But I think one problem I am having is like, I used voices as reference. But all the voices were speaking THE SAME WORDS. So like if you 'more closely match a set of voices all speaking the same words' then that's kind of pushing it not just towards whispering, but towards those words.

#

I noticed if I more closely matched the French samples, the output was worse. But all the French samples were speaking the same sentence so this kind of makes sense

pliant spruce May 19, 2023, 4:41 AM

#

ok

tardy topaz May 19, 2023, 4:41 AM

#

If I tried to look at the samples and detect 'what does whispering look like' if they are all saying the same words, then that will also include the words "I'm whispering"

#

It's still useful though to have all the ones that are the same

#

But thought I'd mention, if you CAN do it, it's better

#

You can just use the voices to generate more voices!

#

If they still whisper, and it's high quality, then that's fine to get more diverse output

#

Even I could do that but it's just kind of a boring thing you gotta grind through

pliant spruce May 19, 2023, 4:43 AM

#

I can still try

tardy topaz May 19, 2023, 4:43 AM

#

All that said, you can send your current stuff. Maybe it just works!

#

dropbox or google share I guess?

#

Don't bother right now, tomorrow night earliest, and probably sat

pliant spruce May 19, 2023, 4:44 AM

#

I was gonna see if I could use another method like you said

#

maybe include I'm whispering in brackets and see if it doesn't break anything

tardy topaz May 19, 2023, 4:45 AM

#

If this does work I'll have to give a custom script, might be a while before I make this a Bark UI feature, not even sure how yet

#

The easiest way to improve that dataset, take each voice, and make like 100 unique prompts, give each one 2 or 3, save the best from the set. So ideally we have a set of all different words, and some voices can be there 2 or 3 times, is okay I think

#

Maybe take a book and use every sentence, try to cover all the possible basic sounds, is the idea

#

None of that might be needed, but if you're poking around anyway...

pliant spruce May 19, 2023, 4:48 AM

#

Maybe you should add a batch generation feature to make that easier then, lol

#

So, generate 10 results with X and 20 results with Y text

#

if that's not already possible

tardy topaz May 19, 2023, 4:48 AM

#

There is, on the dev branch

pliant spruce May 19, 2023, 4:48 AM

#

ok ok

tardy topaz May 19, 2023, 4:48 AM

#

I think, uhm, let me remember

#

Are you on the UI?
or the CLI?

pliant spruce May 19, 2023, 4:49 AM

#

The one you provided https://github.com/JonathanFly/bark

tardy topaz May 19, 2023, 4:49 AM

#

But the web user interface?

pliant spruce May 19, 2023, 4:49 AM

#

Says command line

tardy topaz May 19, 2023, 4:49 AM

#

Like do you use a web browser?

pliant spruce May 19, 2023, 4:49 AM

#

yup

tardy topaz May 19, 2023, 4:49 AM

#

God, I got to stop making jokes

#

It's a WebUI, so it's a joke because I put console output in the Gradio app

#

but software you know, not really the best avenue for humor maybe.

pliant spruce May 19, 2023, 4:50 AM

#

Mhm...

tardy topaz May 19, 2023, 4:50 AM

#

Is there an option to give it a folder of voices?

#

Let me check

#

There should be a checkbox like, "don't join the text"

#

so then you can put in a long text

#

and split the text however

#

and it's all separate

pliant spruce May 19, 2023, 4:51 AM

#

This ?

tardy topaz May 19, 2023, 4:52 AM

#

That's not it

pliant spruce May 19, 2023, 4:52 AM

#

ok

#

w/e don't worry I'll learn later

tardy topaz May 19, 2023, 4:53 AM

#

hmm, i don't think it's in that version

pliant spruce May 19, 2023, 4:53 AM

#

I doubt it

tardy topaz May 19, 2023, 4:53 AM

#

In dev branch there is this

pliant spruce May 19, 2023, 4:54 AM

#

Don't have that

tardy topaz May 19, 2023, 4:54 AM

#

You can use that now

#

git checkout dev from command line

#

I didn't put the folder input of NPZ files in there alas

#

but that checkbox might help you

pliant spruce May 19, 2023, 5:01 AM

#

tardy topaz `git checkout dev` from command line

you meant git

tardy topaz May 19, 2023, 5:02 AM

#

yeah

#

You might also find this useful:

#

So if you use the text again, it won't be split exactly the same, more diversity

#

It's like, whatever number you have, + that value or - that value, randomly. So if you have 150 as your goal size you might get 140 or 160. Then if you use
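The randomized split size described above can be sketched like this. It is an illustration under assumptions, not the fork's actual code: `jittered_chunk_size` and `split_text` are hypothetical names, the goal size is assumed to be measured in characters, and a fresh size is drawn for every chunk.

```python
import random


def jittered_chunk_size(goal, jitter, rng=random):
    """Return goal + jitter or goal - jitter, chosen at random."""
    return goal + rng.choice((-jitter, jitter))


def split_text(text, goal=150, jitter=10, rng=random):
    """Split text on word boundaries into chunks near a jittered size."""
    words = text.split()
    chunks, current = [], []
    limit = jittered_chunk_size(goal, jitter, rng)
    for word in words:
        candidate = " ".join(current + [word])
        if current and len(candidate) > limit:
            chunks.append(" ".join(current))
            current = [word]
            limit = jittered_chunk_size(goal, jitter, rng)  # new size per chunk
        else:
            current.append(word)
    if current:
        chunks.append(" ".join(current))
    return chunks
```

Running the same text through `split_text` twice will usually cut it at different points, which is the source of the extra diversity mentioned above.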

#

Honestly I'm worried dev is bugged, since obviously nobody is really testing it. But I will AT THE LEAST do a bug fix pass this weekend

pliant spruce May 19, 2023, 5:05 AM

#

This one is surprisingly close to the Jhin quote I used.

tardy topaz May 19, 2023, 5:05 AM

#

At one point when you use the split prompts thing, it was not properly clearing the last voice. So all the samples would sound the same. I can't remember when I fixed that, so look out

pliant spruce May 19, 2023, 5:06 AM

#

tardy topaz Honestly I'm worried dev is bugged, since obviously nobody is really testing it....

Will try it out later

tardy topaz May 19, 2023, 5:07 AM

#

Is that a common expression? I noticed a lot of that with like popular song lyrics in Bark

#

You get a surprising amount of matching cadence

pliant spruce May 19, 2023, 5:07 AM

#

It's a quote from a video game so I don't know.

tardy topaz May 19, 2023, 5:09 AM

#

Is it like a fighting game line or something maybe? If so I bet it's in the YouTube training data like 1000 times, for each match, lol. Just over and over.

pliant spruce May 19, 2023, 5:09 AM

#

https://www.youtube.com/watch?v=QwQ3i9L0j74 Most likely. The character is iconic.

YouTube

SkinSpotlights

Voice - Jhin, The Virtuoso

This is the voice for the new champion Jhin.
Purchase RP here (Amazon Affiliate - NA): https://amzn.to/2qZ3Bmv

Note: The Voice might not be final, you never know what tweaks Riot might make to it.

For League of Legends Related News Check Out Surrender@20:
http://www.surrenderat20.net/

Feel Free to Follow me on Twitter as well:
https://twitter...


#

His voice has music attached as background

#

Kinda like Bark

tardy topaz May 19, 2023, 5:09 AM

#

Oh yeah

#

I bet you hear that when you click on something in League of Legends

#

So just imagine the incredible number of times that's got to be on YouTube

#

I bet there's like a 'click on character who says iconic line' style in Bark, that you could trigger

#

sort of like how you can trigger 'this is a commercial' style, if you've done that

tardy topaz May 19, 2023, 6:17 AM

#

I love so much how if you just keep cranking up the weights on the French accent Bark actually almost REPHRASES YOUR TEXT PROMPT like a French person. Crazy that it somehow works. It usually doesn't, but even 1 in 10 or 20 is all you need to make a funny YouTube video or whatever.

#

Oh that's an old sample, hmn, still a bit of it but not the one I meant

torpid agate May 19, 2023, 6:26 AM

#

How do you "crank up the weights"?

tardy topaz May 19, 2023, 6:31 AM

#

Right now it's a mess of code and super hit or miss, but some day it will be a feature in my fork. Basically I look at a set of French versus English and try to up the chance of French tokens

#

So in the UI you will basically pick a 'target set of voices' and a 'reference set of voices'. The reference set is English voices, the target is French, and then it tries to find the main difference and increase the odds of those tokens in Bark, for your speaker. And honestly accents are the least interesting thing you might do with this format, you can think of many cool ways to use sets of voices like that. But accents seem like a simple case where I can check if the idea works.

#

Right now there's like 5 or 6 numbers that are super fiddly, like threshold numbers of 'how common should the token be in French', but probably there's a way to make it automatic based on some rules. But I am shelving this until I've made some basic updates or I never will, lol

torpid agate May 19, 2023, 6:42 AM

#

Oh I see, so there is some expected distribution of tokens in English vs. French, and you want to overexpress the French tokens? What's the interface for overexpressing tokens? Does it look like the SDWebUI one, thing:1.2?

#

I guess... is this a prompt engineering thing you're doing or is it model surgery?

tardy topaz May 19, 2023, 6:44 AM

#

Right now it's scaling the odds by the frequency in the target distribution, with a lot of cutoffs for outliers, and for favoring the most 'French but not English' tokens, or an especially hard penalty for tokens common in the English set but not the French one. A ton of hardcoded values I picked out of a hat, no science at the moment, but a proof of concept.

#

But the method right now is just trying some values, didn't work, up them a bit, worse, okay lower them, okay that worked. etc

#

a big mess

torpid agate May 19, 2023, 6:46 AM

#

I see. Does this work for all models in the public repo? It seems like it should if you're changing the frequency (are you basically adding silent "frenchy" tokens?)

tardy topaz May 19, 2023, 6:46 AM

#

It works for the 3 models I happened to be testing with, so I kind of assumed it was general

#

None of them happened to be Suno, but I was using Suno voices in the target set as French and English examples.

#

I'm just multiplying the odds. Like this token is 4x more common in French than English. So in the model, at the last step, you just slap the multiplier in there. I thought I would have to also consider tokens in front or back too, the order of tokens should matter. But it actually kind of just works with a general multiplier. Or a negative penalty for English.

#

Oh you can compare like speaker voice against average English speaker, and give their tokens a bit of an exemption from the penalties

#

otherwise you can wipe out og voice
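
A minimal sketch of the multiplier idea described above, assuming each voice is available as a plain list of integer token ids. The function names, the smoothing constant, the `max_ratio` cutoff, and the `protected` exemption set are hypothetical stand-ins, not code from Bark or from the fork being discussed. Multiplying a token's odds by a ratio is the same as adding the log of that ratio to its logit, so the bias is computed in log space.

```python
import math
from collections import Counter


def smoothed_freq(samples, vocab_size, smoothing=1.0):
    """Return a function mapping a token id to its smoothed relative frequency."""
    counts = Counter(tok for seq in samples for tok in seq)
    total = sum(counts.values()) + smoothing * vocab_size
    return lambda tok: (counts[tok] + smoothing) / total


def log_odds_bias(target, reference, vocab_size, max_ratio=4.0, protected=frozenset()):
    """Per-token logit bias: log(target_freq / reference_freq), clipped to +/- log(max_ratio)."""
    f_target = smoothed_freq(target, vocab_size)
    f_reference = smoothed_freq(reference, vocab_size)
    limit = math.log(max_ratio)  # cutoff so outlier tokens cannot dominate
    seen = {tok for seq in list(target) + list(reference) for tok in seq}
    bias = {}
    for tok in seen:
        value = math.log(f_target(tok) / f_reference(tok))
        value = max(-limit, min(limit, value))
        if value < 0 and tok in protected:
            value = 0.0  # exempt the speaker's own tokens from penalties
        bias[tok] = value
    return bias


def apply_bias(logits, bias, strength=1.0):
    """Return a copy of the logits with the bias added at the final sampling step."""
    out = list(logits)
    for tok, value in bias.items():
        if 0 <= tok < len(out):
            out[tok] += strength * value
    return out


# Toy data: token 1 is "French-only", token 3 is "English-only" but used by the speaker.
french = [[1, 1, 1, 2]]
english = [[2, 2, 2, 3]]
bias = log_odds_bias(french, english, vocab_size=4, protected={3})
biased = apply_bias([0.0, 0.0, 0.0, 0.0], bias)
```

The `protected` set plays the role of the exemption mentioned above: tokens the original speaker uses are never pushed down, which is one way to avoid wiping out the voice.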

torpid agate May 19, 2023, 6:51 AM

#

Can you show me how you over express tokens? I've seen this in the SDWebUI as well, but am not sure if it's custom or a package

tardy topaz May 19, 2023, 6:51 AM

#

Oh it's not models, sorry, I mean prompts

torpid agate May 19, 2023, 6:52 AM

#

This seems to be one of the libs:
https://github.com/damian0815/compel

GitHub

GitHub - damian0815/compel: A prompting enhancement library for transformers-type text embedding systems

tardy topaz May 19, 2023, 6:52 AM

#

Right now it's in the Bark core code, just my custom code

#

I should probably look at that, because I bet there's smarter ways to do this!

#

Oh that's weighting parts of the prompt. That would also be cool in Bark!

#

This is a lot more sophisticated than what I did, probably worth looking into

#

I'm not really familiar with Stable Diffusion internally so not totally sure how much applies

#

Gosh, the negative prompt is really fun in Stable Diffusion. I guess you could run one bark generation with a negative prompt, save that token distribution, and then run your positive prompt and try to penalize the tokens in the past negative. 90% chance this is completely useless but 10% could do something interesting?

#

Anyone know why LLMs don't have negative prompts, is it just useless?

pliant spruce May 19, 2023, 7:02 AM

#

tardy topaz Anyone know why LLMs don't have negative prompts, is it just useless?

No idea. Negative prompts have a huge impact in stable diffusion from my experience though.

tardy topaz May 19, 2023, 7:02 AM

#

It can't be as easy as the idea I explained, or else it would already exist. Probably it just doesn't affect the output the way it does in Stable Diffusion.

#

But it still might be interesting in Bark, since it's not quite the same.

#

jeez, also, there is a super easy way to make this actually useful that i'm pretty sure will work.

#

I am too overbooked though, just writing it down, not trying this

#

Honestly the french accents thing kind of covers this idea. I can probably work it in there. Essentially you can pick an .npz file, a past sample, as a negative prompt.

#

And maybe I can get that to work

#

However it's gonna need a LOT of tuning and tests, so you only penalize the right things. Rather than just like, 'the sound of the human voice'

#

But i'm sure it COULD work

#

I'm just imagining the Bark WebUI. This is literally like picking 500 .npz files now, in different menus. Like seriously out of control. hah

#

Gradio is not ready for this.

#

Unless I missed it, THERE IS NO FOLDER PICKER?

pliant spruce May 19, 2023, 7:11 AM

#

no idea

#

if there's a folder picker

tardy topaz May 19, 2023, 7:12 AM

#

pliant spruce No idea. Negative prompts have a huge impact in stable diffusion from my experie...

Can you think what negative even means in Bark? It's a kind of abstract idea! Negative to the raw audio sound, the emotion, the gender, it's not very concrete.

#

Best I can think of. Take 100 average English samples. Take the negative prompt. Find out what's most unique in it. Then penalize that. Maybe, possibly, that works.

pliant spruce May 19, 2023, 7:13 AM

#

tardy topaz Can you think what negative even means in Bark? It's a kind of abstract idea! Ne...

It would associate any word as a negative.

tardy topaz May 19, 2023, 7:14 AM

#

Like for example, if you had a negative prompt of music notes, with the same text as your positive prompt. What would you want Bark to do? Just be super formal and monotone?

#

I'm not sure what 'working correctly' means.

pliant spruce May 19, 2023, 7:14 AM

#

The word could be linked to other things so it's not 100% guaranteed either, like how I only got 10-20 whispering results out of 200, so it's like a 10% reduction on certain things depending on how they're used. I'm sure that yelling or other expressions have higher odds of appearing, so you could at least use it to filter out music, maybe it'll make the audio clearer if you use it that way?

tardy topaz May 19, 2023, 7:16 AM

#

I guess as long as it changes the output in any way you can detect, it's still fun. Just try shoving shakespeare quotes or rap lyrics in the negative prompt. Maybe it has a cool effect.

#

It's not gonna be essential like SD but has a chance of at least being a fun thing to try sometimes.

pliant spruce May 19, 2023, 7:18 AM

#

For sure

tardy topaz May 19, 2023, 7:18 AM

#

For whispering, negative prompt, "I HATE YOUR GUTS!!!!"

#

or something yelling like

#

Everyone is working on voice cloning. Somebody make a negative prompt and let me know if it does anything interesting. I want to try it without having to work out how to do it.

#

I don't really know a thing about Stable Diffusion. If it was like, trained with a negative prompt, then there's basically no chance this is useful.

#

Looks like it's not

pliant spruce May 19, 2023, 7:31 AM

#

tardy topaz I don't really know a thing about Stable Diffusion. If it was like, trained with...

It might be harder if everything isn't already tagged in Bark. Stable Diffusion has the benefit of the fact that the websites they scraped did all the tagging for them e.g. there's this one site that has 50~ tags per image.

tardy topaz May 19, 2023, 7:31 AM

#

I'm too sleepy to think this through. Will save idea for rainy day though.

pliant spruce May 19, 2023, 7:31 AM

#

And when you negative prompt in stable diffusion you're just adjusting the weight, adding brackets increases the strength of the negative prompt

tardy topaz May 19, 2023, 7:32 AM

#

It does look like they just hijack the sampler, kind of like I am, but I would have to read more about Diffusion models to know how similar it is

pliant spruce May 19, 2023, 7:33 AM

#

Oh, btw

#

When I had ChatGPT do that invisible prompt thing, what does that do exactly to bark?

#

Like, I had it edit the code this one time to let me include invisible prompts

#

That wouldn't affect the output

tardy topaz May 19, 2023, 7:34 AM

#

did it work?

pliant spruce May 19, 2023, 7:34 AM

#

Yes

#

So I could do "Insert text here" and add another sentence at the end that'd be invisible

tardy topaz May 19, 2023, 7:34 AM

#

Did it affect the output though?

pliant spruce May 19, 2023, 7:35 AM

#

Yes but it also increased the odds of noise

tardy topaz May 19, 2023, 7:35 AM

#

What was the part of the code it edited to do this?

pliant spruce May 19, 2023, 7:35 AM

#

Uhhh I can't remember this was last month

#

It let me do it in commandline

tardy topaz May 19, 2023, 7:36 AM

#

I can think of a simple way, like just treat it as separate samples

#

And throw away the one invisible one

pliant spruce May 19, 2023, 7:36 AM

#

But I think the noise was generated by the fact that the invisible sentence required using special symbols so the symbols were being included in the sentence which could be fixed

#

cuz i was using || to hide the last part of the sentence

tardy topaz May 19, 2023, 7:36 AM

#

There is a way I can add this as a feature very simply, so I might. But doing it at a deeper level would be hard.

#

But if you split the text, as new samples, that does lower overall quality

#

compared to just having one long one probably

#

But still, maybe the simple way is useful

#

What symbol do they use for invisible prompt in SD, do you know?

#

Or some other thing, if there is a standard

pliant spruce May 19, 2023, 7:38 AM

#

"||" the symbol didn't matter, it was just two of these

#

It was something to be included in the prompt that would be removed, and everything after it would be removed as well
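
A small sketch of the marker handling described above, assuming the hidden part is everything after the first `||`. The function name `split_invisible` is hypothetical; it is not the code ChatGPT produced, which nobody in the chat could locate. Stripping the marker before the text reaches the model would also avoid the symbols themselves being voiced, which was the suspected source of the extra noise.

```python
def split_invisible(prompt, marker="||"):
    """Split a prompt into (visible, hidden) at the first marker.

    Everything before the marker is spoken; the marker and everything
    after it are removed from the spoken text and returned separately.
    """
    visible, _, hidden = prompt.partition(marker)
    return visible.strip(), hidden.strip()


visible, hidden = split_invisible("Insert text here. || This part should not be heard.")
```

What to do with `hidden` afterwards (feed it to the model as context, or drop it entirely) is the open question in the conversation; this only covers the parsing.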

tardy topaz May 19, 2023, 7:38 AM

#

If you find the code, if it actually modified generate_text_semantic

pliant spruce May 19, 2023, 7:39 AM

#

the old bark code is simpler so i might check how i did it that way

tardy topaz May 19, 2023, 7:39 AM

#

then please show me cause I'm not sure offhand how to make it work at that level, without experimenting

pliant spruce May 19, 2023, 7:39 AM

#

think it was generation.py

#

ok

tardy topaz May 19, 2023, 7:39 AM

#

Maybe it just replaced them with pad tokens or something

pliant spruce May 19, 2023, 7:39 AM

#

I was using the first version of bark so idk how much you changed lol

tardy topaz May 19, 2023, 7:39 AM

#

But the idea is they should still affect the text. So if you write I AM YELLING

#

then the person starts yelling

#

I kind of think it just deleted them, and they didn't still affect the audio. But if it actually does work, heck, I'll just add that to Bark Infinity

#

chatgpt can be pretty smart, so it could have done it right

pliant spruce May 19, 2023, 7:43 AM

#

I'll try it again.

#

maybe it was this i needed to edit... def generate_text_to_speech

tardy topaz May 19, 2023, 7:43 AM

#

No hurry honestly I don't need more features, I need basic bug fixes.

#

The easy and fast way to do it: just split the text, and ignore that segment in the final audio. But still use a bit of it as the history_prompt for the next actual audio segment. I could do that simple version and it might be cool. Though I don't have the partial segment audio joining even in Bark Infinity yet...
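
The simple version could look roughly like this. It is only a sketch: `generate` is a hypothetical callable standing in for whatever produces audio plus a new history prompt, and `fake_generate` exists purely so the example runs without Bark installed. Unlike the description above, it hands the whole history to the next segment rather than "a bit of" it.

```python
def render_segments(segments, generate, history=None):
    """Generate every segment in order, but keep audio only for the visible ones.

    segments: list of (text, keep) pairs.
    generate: callable (text, history) -> (audio_samples, new_history).
    Hidden segments still update the history, so they can colour what follows.
    """
    kept = []
    for text, keep in segments:
        audio, history = generate(text, history)
        if keep:
            kept.extend(audio)
    return kept, history


def fake_generate(text, history):
    """Stand-in generator: one 'sample' per word; the history is the last text seen."""
    return [len(word) for word in text.split()], text


audio, last_history = render_segments(
    [("I AM YELLING", False), ("Please take your seats.", True)],
    fake_generate,
)
```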

#

I can't stay up half the night again, no rush, no chance I do anything with this soon.

#

All people want is easy to install Bark Webui or Colab notebook, and a nice set of clear speakers. Literally that's all I should do this weekend.

pliant spruce May 19, 2023, 7:49 AM

#

yup

tardy topaz May 19, 2023, 7:49 AM

#

To be honest I am a little confused how I ended up so deep in the audio stuff. The literal reason is that a bunch of really silly ideas keep working and that's addicting.

#

But I don't really need TTS so sometimes I do stop and wonder why I'm trying so hard to make Trevor Noah sing in a French accent. Not only do I not need this, does anyone? lol

#

That said, I bet he does a great French accent while singing. en_fiery always delivers.

#

It is actually ridiculous. I gotta do something with this stuff at least. bug fixes can wait a day, need to make at least one funny video with this Bark tech

pliant spruce May 19, 2023, 8:00 AM

#

what kinda video?

tardy topaz May 19, 2023, 8:00 AM

#

Since I just decided to do this, that part is yet to be determined.

#

But you know, singing, accents, something like that.

slim jacinth May 19, 2023, 3:19 PM

#

tardy topaz But I don't really need TTS so sometimes I do stop and wonder why I'm trying to ...

I do think you actually need Trevor Noah singing in a French accent

#

It is a basic human need ❤️

tardy topaz May 19, 2023, 3:20 PM

#

I had trouble combining both singing and french, but I didn't try too much, lol. But I agree

#

I think I just need more singing samples!

#

Or I guess French singing, then it's just one set. That should work

#

Singing in general works way less well than the accents. Just sounds like autotune most of the time. But the sample of singing I have is small

#

Also like, I'm not using like principal component analysis or other things I should probably be doing, just literally counting tokens. So not exactly optimal lol

#

Frankly it's just yet another thing that really shouldn't work the crude way I did it.

#

My singers are mixed with music, so that's probably why it goes into autotune mode

#

They are not all voice only

#

I mean maybe I can lean into the autotune sound. That could be also cool. Mainly every set requires a ton of fiddly guessing of thresholds right now, so there must be a smarter way to do this where it's not so fragile.

gilded goblet May 19, 2023, 6:25 PM

#

One application of a negative prompt for Bark would be generating logits with it as the prompt but negating them and then adding them to the regular prompt generated logits. (You probably want a control for the relative weight of the negative and positive prompts, too.) One problem that I foresee with doing this in the simple and naive way, though, is that because Bark gets things like language and accent from the prompt (regular and history), a negative prompt like this that is prompted in the same language as the main prompt may cause things like the language used to be less stable, which probably isn't the goal.
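
The naive combination described above can be sketched as follows, using plain Python lists in place of real model logits. The names `combine_logits` and `negative_weight` are hypothetical, and this is not an existing Bark feature; it only shows the arithmetic of negating one set of logits and adding it to the other.

```python
import math


def combine_logits(positive, negative, negative_weight=0.5):
    """Add the negated negative-prompt logits to the positive-prompt logits."""
    if len(positive) != len(negative):
        raise ValueError("logit vectors must have the same length")
    return [p - negative_weight * n for p, n in zip(positive, negative)]


def softmax(logits):
    """Convert logits to probabilities (shifted by the max for numerical stability)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Token 1 is strongly favoured by the negative prompt, so it gets pushed down.
combined = combine_logits([2.0, 1.0, 0.0], [0.0, 3.0, 0.0], negative_weight=0.5)
probs = softmax(combined)
```

The stability concern raised above applies directly: tokens that both prompts favour (for example, ones tied to the shared language) are also pushed down, so `negative_weight` would likely need to stay small.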

long egret May 19, 2023, 10:57 PM

#

guys, #📚┃suno-school maybe better for that stuff

pliant spruce May 20, 2023, 11:25 AM

#

@tardy topaz you're gonna have to run this one locally lmao

tardy topaz May 20, 2023, 11:26 AM

#

Are you on the dev branch, let me check. Recently I've been using a new method for clear voices.

#

I think maybe you can do it

pliant spruce May 20, 2023, 11:27 AM

#

Actually, when I convert it with ffmpeg, it works

tardy topaz May 20, 2023, 11:28 AM

#

Yeah you can't quite do it. Basically start with the clearest voice you can. Then, I'll explain later, you erase the coarse prompt. Which is or will be a checkbox. And you make the history prompt as small as possible, but still clear. And then make long 14s samples. You get a pretty good range of voices, pretty diverse, all are usually fairly clear!

#

I'll make this easy. It really reduces the worst noisy speakers.

#

There is some common feeling between the voices, they aren't THAT different, but plenty different to make a ton

pliant spruce May 20, 2023, 11:29 AM

#

did you want a french voice?

#

w/o singing as well?

tardy topaz May 20, 2023, 11:29 AM

#

if it's clear sure

#

though I need a bunch so no rush

pliant spruce May 20, 2023, 11:30 AM

#

Ok, I should be able to get one in a few results

pliant spruce May 20, 2023, 11:30 AM

#

tardy topaz though I need a bunch so no rush

just show me how to split it later so i can batch them

#

not now tho

tardy topaz May 20, 2023, 11:30 AM

#

I can make one for you

#

Wait let me see

#

Honestly it's late, let's table this, it's 7:30 am

#

haha

#

sorry but I almost forgot

pliant spruce May 20, 2023, 11:31 AM

#

i still think "high quality:" makes things better

tardy topaz May 20, 2023, 11:31 AM

#

interesting!

#

I mean, it's not impossible

pliant spruce May 20, 2023, 11:31 AM

#

it's actually VERY consistent

#

no joke

#

lemme send the result

pliant spruce May 20, 2023, 11:32 AM

#

tardy topaz I mean, it not impossible

first try

tardy topaz May 20, 2023, 11:33 AM

#

lol, probably just a good voice

pliant spruce May 20, 2023, 11:33 AM

#

it's not a fluke

tardy topaz May 20, 2023, 11:34 AM

#

don't sweat the voices. I'll set you up later this weekend with how I do it now. I wish i had done it like this.

pliant spruce May 20, 2023, 11:34 AM

#

ok

tardy topaz May 20, 2023, 11:35 AM

#

Also maybe with voice cloning, nobody needs to do any of this?

pliant spruce May 20, 2023, 11:35 AM

#

tardy topaz Also maybe with voice cloning, nobody needs to do any of this?

Randomly synthesized voices are still fun. People will also want voice model merging.

tardy topaz May 20, 2023, 11:35 AM

#

Well, you can't voice clone Barack Obama but french, or whatever.

pliant spruce May 20, 2023, 11:35 AM

#

tardy topaz May 20, 2023, 11:35 AM

#

Since he doesn't exist. So my stuff is still useful.

#

You can model merge now actually

#

Bark just kind of works. Not all the time but sometimes

#

It's not a feature but it could be

#

I do it like 1000x

#

it's useful

#

I will make a feature

#

it is super useful

#

Sometimes you just get like, actually, a perfect mix somehow

#

For example a TTS voice, with a very human voice, it's like half that person, half TTS. just works sometimes lol

#

Some of my voices I really sweat over. literally 20 or 30 model merges

#

haha

pliant spruce May 20, 2023, 11:39 AM

#

I'll have to learn that kinda stuff later for sure

tardy topaz May 20, 2023, 11:39 AM

#

It's just gonna be, pick two npzs

#

instead of 1

#

Actually it

#

will be a tool. because it doesn't always work. So I usually make like 10 versions

#

and then one was okay lol

#

I got a Donald Trump whisper, almost. But the voice changes too much. Still pretty close lol.

#

If you use the voice it's even more changed

#

but maybe fixable

#

I wasn't trying for whisper, actually one of the most clear whispers though

pliant spruce May 20, 2023, 11:44 AM

#

tardy topaz I got a Donald Trump whisper, almost. But the voice changes too much. Still pret...

whispering voices are very hard w/o directly prompting them to whisper

tardy topaz May 20, 2023, 11:44 AM

#

Yeah totally rng

#

only one I can remember lately

#

I saved it to mess with

#

try and make it work better

#

singing is similar. if you sing, voice changes

#

it's a really good whisper voice, lol

#

whispering is probably easier than accents, it's a pretty general sound

pliant spruce May 20, 2023, 11:47 AM

#

I think accents are easier

tardy topaz May 20, 2023, 11:47 AM

#

yeah maybe

#

should see if can voice clone a whisper

#

with mylo's thing

#

then i can use as the samples

#

instead of you finding them

pliant spruce May 20, 2023, 11:50 AM

#

cheat codes aren't fun but ok

#

@tardy topaz Here!

#

wait till 8 seconds.

tardy topaz May 20, 2023, 11:52 AM

#

haahh

#

you didn't type that?

#

IS that your prompt?

pliant spruce May 20, 2023, 11:53 AM

#

It's a prompt.

#

high quality: announcer: Hello passengers, this is your captain speaking. This plane is about to crash!

tardy topaz May 20, 2023, 11:54 AM

#

I was hoping bark wrote the last part

#

with the confused mode

pliant spruce May 20, 2023, 11:54 AM

#

LOL

#

the 2nd result is kinda funny

viral lynx May 20, 2023, 11:55 AM

#

tardy topaz with the confused mode

oh you want to see what my bark generated without being prompted to?

pliant spruce May 20, 2023, 11:55 AM

#

"hello passengers, this is your captain speaking, this plane is about to-" and it cuts off there

tardy topaz May 20, 2023, 11:55 AM

#

Sure

viral lynx May 20, 2023, 11:55 AM

#

so i made a voice based on a dantdm video, just the intro

tardy topaz May 20, 2023, 11:56 AM

#

Can you run it on a partial prompt, like my joke video?

#

And see what it does?

pliant spruce May 20, 2023, 11:56 AM

#

tardy topaz Can you run it on a partial prompt, like my joke video?

. . .how?

tardy topaz May 20, 2023, 11:56 AM

#

I mean mylo

pliant spruce May 20, 2023, 11:56 AM

#

oh

tardy topaz May 20, 2023, 11:56 AM

#

"What was six afraid of 7?" and then keep sampling

#

Why was

viral lynx May 20, 2023, 11:56 AM

#

this was not prompted, the prompt was completely different

#

"if you enjoyed this, like this video... check out AA-"

tardy topaz May 20, 2023, 11:57 AM

#

I had the notion of a 10 hour unprompted Bark YouTube video just endless nonsense

viral lynx May 20, 2023, 11:57 AM

#

lol

#

it's the semantics keeping it alive at that point, as spamming random semantics will drown out the voice

tardy topaz May 20, 2023, 12:00 PM

#

Interesting that it's a youtube line

#

though you used books as training?

viral lynx May 20, 2023, 12:00 PM

#

probably because the prompt was the intro

#

this was the audio i used for cloning

#

it probably recognised that it was an intro, and recognised it

tardy topaz May 20, 2023, 12:02 PM

#

Have you ever seen Whisper, it hears those words so much

#

they are banned in the raw code

#

lala

#

lol

viral lynx May 20, 2023, 12:02 PM

#

lol

tardy topaz May 20, 2023, 12:02 PM

#

Just give it any audio, if it's not sure, it outputs thank you for subscribing

viral lynx May 20, 2023, 12:03 PM

#

since it was trained on youtube videos?

tardy topaz May 20, 2023, 12:03 PM

#

Yeah presumably. It literally can't stop hearing it

#

Any time the audio is noisy, it says that

#

Am I hallucinating I can't find it now

viral lynx May 20, 2023, 12:06 PM

#

also, the voice cloning is easy to implement, and i provided some code snippets so you can easily implement it in your webui if you want to

tardy topaz May 20, 2023, 12:07 PM

#

I will, nice

#

I will maybe even train more

#

one thing I got is npz everywhere

#

and usually the prompt

#

not greatest diversity though

viral lynx May 20, 2023, 12:08 PM

#

yeah, there's a 4 and a 14 epoch model on the huggingface repo

#

you could train from there or train from scratch, it doesn't take long to train from scratch

#

and most of the mistakes it makes are not things you will pick up on that much as a human

#

like it might misclassify a token for another token, but if you heard them side by side there would barely be a difference

#

i believe bark, with its 10000 tokens, has a bunch of duplicate tokens which are interchangeable

#

at least from the perspective of HuBERT base

tardy topaz May 20, 2023, 12:10 PM

#

There's some funny stuff, like some tokens are like a description, or at least it feels like it. adds an effect to the whole clip

#

or removes

viral lynx May 20, 2023, 12:11 PM

#

also, the voicemod soundboard sounds pages have a lot of clips that are great for voice cloning as well

tardy topaz May 20, 2023, 12:11 PM

#

Not literally a token

#

but like, a chunk

viral lynx May 20, 2023, 12:11 PM

#

viral lynx also, the voicemod soundboard sounds pages have a lot of clips that are great fo...

the joe biden voice clone is actually based on the elevenlabs generated audio

tardy topaz May 20, 2023, 12:12 PM

#

I think I'll still take some time to tune the clones

#

You can still dial them in a bit

viral lynx May 20, 2023, 12:12 PM

#

tardy topaz or removes

possible, maybe a consistent sound

tardy topaz May 20, 2023, 12:12 PM

#

Yeah, background hums

#

sometimes go away for whole clip

#

or appear

#

not predictably but, if you're desperate, you can randomly delete. and try to get lucky

#

I kind of thought a hum would be temporal?

#

but it's like almost a little tag? total no idea here. I was just a little surprised

#

like it just changes where the prediction goes probably

#

but subjectively, it was like that

#

I don't know how semantic and language like, the semantic tokens actually are, maybe not impossible

viral lynx May 20, 2023, 12:16 PM

#

i gotta see if i can make bark generate infinite length (and probably decode on cpu from that point on)

tardy topaz May 20, 2023, 12:16 PM

#

What do you mean, in the actual model?

viral lynx May 20, 2023, 12:16 PM

#

or cut into chunks and then decode

tardy topaz May 20, 2023, 12:16 PM

#

You can chunk coarse and fine easily

viral lynx May 20, 2023, 12:16 PM

#

yeah

tardy topaz May 20, 2023, 12:16 PM

#

I think maybe you can put tokens into the inference space, but I didn't get around to trying that

#

Like instead of putting history where it should go. take up inference space

#

MAYBE you can use that trick to chunk semantic?

#

if you don't do that it just sounds bad

#

Like giving an actor a 3 word first part of a line

#

and nothing else

viral lynx May 20, 2023, 12:18 PM

#

you can chunk semantics probably, just make sure you have a good history prompt

tardy topaz May 20, 2023, 12:18 PM

#

I tried it a lot

#

2 words is the breakdown point

#

but it all just sounds bad

#

because it doesn't have enough context to perform the line

#

it works just bad

#

Bark in general, I find, give it a big text if possible. It's more descriptive.

#

So it sounds like you have an actor, right. And you give him a notecard with 2 words on it. He reads it. Then you give him another.

#

It just sounds wrong lol

viral lynx May 20, 2023, 12:20 PM

#

here's a fun experiment

tardy topaz May 20, 2023, 12:21 PM

#

hit me, maybe i tried it

viral lynx May 20, 2023, 12:21 PM

#

use a cloned history prompt, then generate without prompt with early_stopping=False

tardy topaz May 20, 2023, 12:21 PM

#

Oh that was literally my first idea yeah

#

honestly i tried to do that with WHISPER

#

but i couldn't figure it out

#

could it predict based on audio, what next tokens were likely, with no other input, based on the internal llm

#

so it's like speech to text but guesses what you say

#

I think you can do it now, in the cpp fork, but I didn't check

viral lynx May 20, 2023, 12:23 PM

#

damn sometimes i forget to add the quick kwargs and then it doesn't auto hide (gradio please implement element replacements)

#

since the point of the webui is more than text-to-speech, voice cloning was just something i wanted because i thought it would be cool.

tardy topaz May 20, 2023, 12:25 PM

#

There's so many easy features I need to add.

#

just mashing two prompts together, works pretty well

#

like a model merge

#

not always but enough

viral lynx May 20, 2023, 12:26 PM

#

just averages of 2 voices with the same semantics?

tardy topaz May 20, 2023, 12:26 PM

#

like, it should not work

viral lynx May 20, 2023, 12:26 PM

#

it should work with the same semantics

tardy topaz May 20, 2023, 12:26 PM

#

but you really get a nice hybrid!

#

usually have to render different size variants, pick the best

#

even like a robot tts, and human

#

it's like half tts

#

lol

#

even 3 prompts, not impossible

#

oh here's a fun one

#

have you tried just taking a speaker. delete every other token

#

they talk twice as fast. still sound pretty natural

#

haha

#

or the opposite, double token

#

i don't know why I was doing it but it's actually not even that unnatural
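
Both tricks are one-liners on a token sequence. This sketch assumes the speaker's tokens are a flat Python list of ids; which of the arrays in a saved voice this was applied to is not stated in the chat, so that part is left open.

```python
def drop_every_other(tokens):
    """Keep tokens at even positions, halving the length (roughly twice as fast)."""
    return list(tokens[::2])


def double_tokens(tokens):
    """Repeat every token once, doubling the length (roughly half speed)."""
    return [tok for tok in tokens for _ in range(2)]


tokens = [10, 11, 12, 13, 14]
faster = drop_every_other(tokens)
slower = double_tokens(tokens)
```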

viral lynx May 20, 2023, 12:29 PM

#

what about this though, instead of merging 2 voices by averaging, you extrapolate the difference from voice a to voice b onto voice b or c? like the add difference merging from stable diffusion webui

tardy topaz May 20, 2023, 12:29 PM

#

yeah did you see my accent work, a little like that

#

it's 8:30 am and I haven't slept I'm not sure I can actually explain

#

but I did in discord previous messages, using french sets

#

of voices

#

and singing

#

I think there's SO much you can do

#

with voices averaging, differences, using a set of voices as a penalty or target

#

the singing sounds like autotune, but I realized half my singing samples were music after

#

so actually, that was probably working correctly

viral lynx May 20, 2023, 12:32 PM

#

also, you keep talking about music, you can finetune bark with music to have it basically be bark but as musiclm

tardy topaz May 20, 2023, 12:32 PM

#

I wonder. Presumably if it could, base bark would be better though?

#

It must have seen a lot?

#

Oh nevermind I understand now

#

You mean finetune, but overwrite existing capability

#

Fully music Bark

viral lynx May 20, 2023, 12:34 PM

#

yeah

tardy topaz May 20, 2023, 12:34 PM

#

Yeah that would be cool. Even finetune to specific artist

#

If it's fast

#

I really think there's a billion things left to do in current model

viral lynx May 20, 2023, 12:35 PM

#

tardy topaz Yeah that would be cool. Even finetune to specific artist

that's what history prompts are for

tardy topaz May 20, 2023, 12:36 PM

#

that'd be ideal, i just assumed it would still not work great, but maybe

viral lynx May 20, 2023, 12:37 PM

#

it should be higher quality than the voice cloning with my model though, since it would actually use the same things as the original model did during training

autumn cloud May 20, 2023, 12:48 PM

#

@viral lynx great work! Are you integrating the cloning into the webui? I tried to test it but I was lost.

viral lynx May 20, 2023, 12:49 PM

#

i'll probably release my webui so people have something to play with cloned voices, (also, cloned voices are saved under the same name as the original voices, but with npz, in the custom speakers folder)

autumn cloud May 20, 2023, 12:52 PM

#

Your repo is meant to be used in conjunction with bark’s api right. I was just lost but I will wait for your ui and read the code.

viral lynx May 20, 2023, 12:53 PM

#

yes

#

you can create a voice clone without bark even installed too though

autumn cloud May 20, 2023, 12:55 PM

#

I was trying test_hubert but it’s expecting semantic.npy lol

viral lynx May 20, 2023, 12:56 PM

#

yeah, that's a file to compare to lol, you could technically just create an empty npy file called that, or disable the check

autumn cloud May 20, 2023, 12:58 PM

#

Oh, so it’s ok to comment out ‘original’

#

Makes a bit of sense now that’s why it’s a test, you are trying to see if they are identical

viral lynx May 20, 2023, 12:58 PM

#

yeah, you can remove the print as well

#

this here is how you can actually do voice cloning, as a developer

autumn cloud May 20, 2023, 12:59 PM

#

So how do you use the generated npy?

viral lynx May 20, 2023, 12:59 PM

#

you put it in the npz with the coarse and fine from the same audio

#

to make it easy, just use a different voice cloner, and replace the semantic_prompt.npy inside of the npz with the npy from here, make sure it's called semantic_prompt.npy
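The swap described above could be sketched roughly like this, assuming the speaker `.npz` holds arrays under the keys `semantic_prompt`, `coarse_prompt`, and `fine_prompt`. The function name and the file paths are hypothetical.

```python
import numpy as np

def replace_semantic_prompt(npz_in, semantic_npy, npz_out):
    """Copy a speaker .npz, swapping its semantic_prompt for the tokens stored in semantic_npy."""
    # Read every array out of the existing archive (coarse and fine stay untouched).
    with np.load(npz_in) as data:
        arrays = {key: data[key] for key in data.files}
    # Overwrite only the semantic tokens with the new ones.
    arrays["semantic_prompt"] = np.load(semantic_npy)
    # np.savez stores each keyword argument as <name>.npy inside the archive,
    # so the entry ends up named semantic_prompt.npy as required.
    np.savez(npz_out, **arrays)
    return sorted(arrays)
```

Writing to a new output path rather than overwriting the input keeps the original speaker file around for comparison.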

autumn cloud May 20, 2023, 1:00 PM

#

viral lynx this here is how you can actually do voice cloning, as a developer

I ran both and got nothing and assumed this was meant to be used in conjunction with bark

viral lynx May 20, 2023, 1:01 PM

#

yeah, correct, the semantic_tokens from that code can be saved to an npy

#

and that can be used inside of a history prompt for the cloned voice, but i'll make my webui public in a bit
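A rough sketch of those two steps, assuming the semantic tokens are an array-like already on the CPU and that a history prompt is a set of three arrays keyed `semantic_prompt`, `coarse_prompt`, and `fine_prompt`. The helper names and `my_clone.npz` are hypothetical, and the commented Bark call is an assumption that is not exercised here.

```python
import numpy as np

REQUIRED_KEYS = ("semantic_prompt", "coarse_prompt", "fine_prompt")

def save_semantic_tokens(semantic_tokens, path="semantic_prompt.npy"):
    """Save semantic tokens (any CPU array-like) to an .npy file."""
    np.save(path, np.asarray(semantic_tokens))
    return path

def load_history_prompt(npz_path):
    """Load a speaker .npz into a plain dict and check the three expected arrays exist."""
    with np.load(npz_path) as data:
        missing = [key for key in REQUIRED_KEYS if key not in data.files]
        if missing:
            raise KeyError(f"{npz_path} is missing: {missing}")
        return {key: data[key] for key in REQUIRED_KEYS}

# Assumed usage with Bark installed (not run here):
# from bark import generate_audio
# audio = generate_audio("Hello there.", history_prompt=load_history_prompt("my_clone.npz"))
```

The key check is only a guard against a badly assembled archive; it does not verify that the tokens come from the same audio.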

autumn cloud May 20, 2023, 1:03 PM

#

viral lynx and that can be used inside of a history prompt for the cloned voice, but i'll m...

Awesome! I will probably get more insight that way but I will still try your instructions now.

viral lynx May 20, 2023, 1:08 PM

#

autumn cloud Awesome! I will probably get more insight that way but I will still try your ins...

https://github.com/gitmylo/audio-webui

GitHub

GitHub - gitmylo/audio-webui: A webui for different audio related N...

A webui for different audio related Neural Networks

#

it auto installs when you run the run.bat, you can add whatever flags to it as well, i should probably document those

tardy topaz May 20, 2023, 1:30 PM

#

Bark is too powerful. It's so beautiful.

#

This is typically what you get, nice singing, but doesn't feel like Obama anymore.

#

BUT it can be done. You can keep backing up and hit a spot where it sings, and not change. SO GOOD.

#

I think the UI for Bark should, rather than picking a prompt, let you pick a location in the prompt instead. That would make some of this fiddling easy.

autumn cloud May 20, 2023, 1:59 PM

#

viral lynx https://github.com/gitmylo/audio-webui

Installed. Is there a tab I need to be in to clone?

viral lynx May 20, 2023, 1:59 PM

#

it's in the text to speech

#

just pick the bark model and it will load the stuff, as "speaker from" put "upload"

#

and you'll get a thing where you can upload an audio file

autumn cloud May 20, 2023, 2:01 PM

#

Thanks will try now. Restarting ui

viral lynx May 20, 2023, 2:03 PM

#

wild results sometimes

#

with no prompt, and squidward as history

autumn cloud May 20, 2023, 2:06 PM

#

Quite impressive. Just tried it

viral lynx May 20, 2023, 2:19 PM

#

yeah, with a good input audio you'll get really good results, + if you generated a really good result, you can download the speaker prompt from that generated audio, they are sometimes more consistent

autumn cloud May 20, 2023, 3:32 PM

#

With some effort, I managed to extract your implementation.

#

The code is a monster bro! How you pulled this off is quite impressive.

#

@viral lynx 🤝

viral lynx May 20, 2023, 3:34 PM

#

thanks

slim jacinth May 20, 2023, 3:35 PM

#

100% - super impressive

autumn cloud May 20, 2023, 3:46 PM

#

@viral lynx you never used the models you generated in the webui? Is there a reason why and how could I try those?

autumn cloud May 20, 2023, 3:46 PM

#

autumn cloud @viral lynx you never used the models you generated in the webui? Is t...

I mean the models you have in huggingface

viral lynx May 20, 2023, 3:47 PM

#

it downloads it from huggingface though?

#

#

the 14 is the epoch, the other one is on epoch 4, but i didn't rename it lol

autumn cloud May 20, 2023, 3:49 PM

#

Interesting, it didn’t download for me

#

Oh I see Hubert.pt and tokenizer.pth

viral lynx May 20, 2023, 3:52 PM

#

yep

autumn cloud May 20, 2023, 4:09 PM

#

@viral lynx what are your suggestions for audio length to clone from, and why do I sometimes get a different voice between chunks?

viral lynx May 20, 2023, 4:26 PM

#

around 6 - 10 seconds is usually great. sometimes you get a different voice, i recommend saving the npz that comes out of a good result, since that one is fully bark generated
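A small sketch for checking whether an input clip falls in that 6 to 10 second range before uploading it. It uses Python's standard `wave` module, so it assumes an uncompressed PCM `.wav` file; the function names are hypothetical.

```python
import wave

def clip_duration_seconds(wav_path):
    """Return the duration of a PCM .wav file in seconds."""
    with wave.open(wav_path, "rb") as wav:
        return wav.getnframes() / wav.getframerate()

def in_recommended_range(wav_path, low=6.0, high=10.0):
    """True if the clip length is within the range suggested in the chat."""
    return low <= clip_duration_seconds(wav_path) <= high
```

A clip outside the range is not necessarily unusable; the range is only the rule of thumb given above.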

autumn cloud May 20, 2023, 4:29 PM

#

Thanks for the hint. Will probably generate 10 samples then pick the best

tardy topaz May 20, 2023, 5:21 PM

#

Best use of voice cloning, no prompt, no stop, just get cool audio you barely hear from Bark typically!

#

Some of the best sound effects and music instrumentals, and like animal sounds, etc, feels a lot different than typical Bark sample

#

Less structured but also kind of more natural in a chaotic way, super neat

#

There's more sound effects in Bark than I thought

viral lynx May 20, 2023, 5:46 PM

#

tardy topaz May 20, 2023, 5:49 PM

#

Mylo has given so many ideas I can't move. You can train this in like 8 hours? You could try SOO many really wild, unbalanced, possibly absurd datasets, like 2 a day, and see what happens!

#

Maybe nothing for all of them and you stop on day 3, still cool

viral lynx May 20, 2023, 5:54 PM

#

tardy topaz Mylo has given so many ideas I can't move. You can train this in like 8 hours? Y...

again, not 8 hours, 20 minutes

#

the 8 hours is the amount of training data i had

#

but it trains faster than realtime

tardy topaz May 20, 2023, 5:55 PM

#

Nice

#

36 models a day

shell prism May 20, 2023, 6:59 PM

#

A fun rap AI taking over the world.

opal spear May 21, 2023, 12:16 AM

#

tardy topaz Bark is too powerful. It's so beautiful.

yoo how'd you get obama and trump?

tardy topaz May 21, 2023, 12:20 AM

#

Those are hand made, but you can also clone them now

#

Or both, which is actually still kind of maybe necessary

#

I still had to tweak the wav clones honestly by ear

opal spear May 21, 2023, 12:20 AM

#

I'm downloading that webui rn

#

so yeah

tardy topaz May 21, 2023, 12:20 AM

#

Nice

#

I kind of copied and pasted all that into my code. I may even update. Not a polished release but it works.

tardy topaz May 21, 2023, 4:07 AM

#

You can do both now. Clone automatically, model merge, etc. Though merging is not in the next update

hazy rain May 21, 2023, 2:20 PM

#

tardy topaz You can do both now. Clone automatically model merge etc. Though merging is not ...

Can you point me to where I should look for the training part? I've played with the inference a lot and I really like the flow of the voices

tardy topaz May 21, 2023, 2:21 PM

#

you mean voice clone? training no idea

#

just thought I remembered somebody trying a new language

hazy rain May 21, 2023, 2:22 PM

#

tardy topaz you mean voice clone? training no idea

Yep, voice clone, my bad, I thought it classified as training!

#

(trying to replicate my own voice 😅 )

#

Ohh this looks terrific, I'll give it a go: https://github.com/gitmylo/audio-webui

tardy topaz May 21, 2023, 2:26 PM

#

I should have done it, but I wasted the day, and now I'm tired

#

are you technical enough to install via conda yourself?

hazy rain May 21, 2023, 2:27 PM

#

Yep I'm a TD in VFX 🙂

tardy topaz May 21, 2023, 2:27 PM

#

I could push this version

#

but it doesn't have an updated yml for conda or pip list

#

someone would have to just figure it out

#

and i don't have time until maybe late today

hazy rain May 21, 2023, 2:27 PM

#

tardy topaz someone would have to just figure it out

I could PR that if I manage to figure it out

tardy topaz May 21, 2023, 2:27 PM

#

it does have the cloning

#

but it's like just a mess

#

for production

#

i mean sure whatever

#

let me just at least remove print statements

hazy rain May 21, 2023, 2:28 PM

#

Ahaha, well I've seen everything in the AI/Python world I'm immune to this now 😅

#

Hit me up with the link when you can and I'll contribute back if I can! Thanks Jonathan!

tardy topaz May 21, 2023, 2:30 PM

#

it's https://github.com/JonathanFly/bark so i'll push a new branch probably.

GitHub

GitHub - JonathanFly/bark: 🚀 BARK INFINITY GUI CMD 🎶 Powered Up Bar...

🚀 BARK INFINITY GUI CMD 🎶 Powered Up Bark Text-prompted Generative Audio Model

#

actually if you check for problems, that would be helpful

#

since i plan doing more tonight

#

maybe an hour

#

the cloning will work

#

not sure about generation though haha

#

this is a real mess, all in one file, just to do it in a couple hours

#

haha

#

I think I can make anti voice clone work, or at least, be occasionally funny. I've been saving buckets of clone tokens and trying things, instead of fixing critical filename crash bugs. audio as input, just alone. cool idea. it's just more tokens. even a vague style hint. a minute of audio is a lot of tokens.
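For a sense of scale, here is a back-of-the-envelope sketch of how many tokens a stretch of audio turns into. The rates and codebook counts are assumptions based on my reading of Bark's generation code (about 49.9 semantic tokens per second, 75 frames per second for the audio codec, 2 coarse codebooks, 8 fine codebooks); check them against the source before relying on the numbers.

```python
# Assumed constants, mirroring names used in Bark's generation code.
SEMANTIC_RATE_HZ = 49.9
COARSE_RATE_HZ = 75
N_COARSE_CODEBOOKS = 2
N_FINE_CODEBOOKS = 8

def tokens_for(seconds):
    """Estimate token counts per stage for `seconds` of audio."""
    return {
        "semantic": round(seconds * SEMANTIC_RATE_HZ),
        "coarse": round(seconds * COARSE_RATE_HZ * N_COARSE_CODEBOOKS),
        "fine": round(seconds * COARSE_RATE_HZ * N_FINE_CODEBOOKS),
    }
```

Under these assumptions, `tokens_for(60)` comes to roughly 3,000 semantic tokens and 9,000 coarse tokens, which is far more than a short speaker prompt.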

red helm May 21, 2023, 2:46 PM

#

I made a simple web api for whoever wants it, it has streaming and file based generation, and a simple short lived queue https://github.com/demandcluster/bulldog

GitHub

GitHub - demandcluster/bulldog: Unofficial Bark API

Unofficial Bark API. Contribute to demandcluster/bulldog development by creating an account on GitHub.

bold token May 22, 2023, 3:37 AM

#

Check out a sample short podcast I created using Bark: https://youtu.be/CW790VwEO9c

YouTube

Yedidya Harris

IoT Workshops in Agriculture

🎙️ "Tech Talk with Rob: IoT Workshops in Agriculture" 🌾

Join Rob in this insightful episode as he explores IoT workshops in the Faculty of Agriculture in Israel. Powered by AI and featuring seamless narration by Bark TTS technology, this podcast delves into how technology is revolutionizing agriculture.

Discover how IoT bridges the gap between...


wicked gull May 22, 2023, 7:02 AM

#

@viral lynx I am getting these errors while running the ./run.bat script. Plz help

viral lynx May 22, 2023, 7:16 AM

#

are you on linux? use the sh files instead if you're on linux

light tide May 22, 2023, 6:54 PM

#

tardy topaz One shotting full music seems tough, but you can gen good beats to build from

Like WAT how ???!?! I got work but bark will be what I mess with asap

worn oar May 23, 2023, 7:38 AM

#

shell prism A fun rap AI taking over the world.

This is really amazing. I am new to bark ai and new to the server.. this is one of the first things i've previewed-- I was curious how you got this rapper voice?

formal plover May 24, 2023, 12:27 AM

#

tardy topaz Best use of voice cloning, no prompt, no stop, just get cool audio you barely he...

is there any documentation for voice cloning with bark

tardy topaz May 24, 2023, 12:28 AM

#

It's a bit rough, it's only even been out since last Friday. I'll add some soon.

#

The short version in my fork is: upload a wav file, and you get a bunch of voices. Try the voices; maybe you get a really good one.

hazy rain May 24, 2023, 1:42 AM

#

tardy topaz The short version in my fork is upload a wav file, you get a bunch of voices. Tr...

Oh, bark infinity does voice cloning differently? That sounds interesting

edgy mango May 24, 2023, 3:22 AM

#

https://youtu.be/qlAqHpKrQrA

YouTube

Dagaz Wyrmspear

Sexual time travel

AI musings on using sexualized consciousness to travel through time. Ooba Booga WebUI with Stable Diffusion. Manticore 13b LLM with Bark TTS and Automatic1111's SadTalker extension.


tacit maple May 24, 2023, 4:16 AM

#

(this is post-processed btw, but the voice in the beginning is straight bark-infinity output)

rain violet May 24, 2023, 10:48 AM

#

formal plover is there any documentation for voice cloning with bark

I made a jupyter notebook that you can use https://colab.research.google.com/drive/1IA3c_R859nANerMARazCSrjc2UD3ws8A?usp=sharing

Google Colaboratory

somber rivet May 24, 2023, 4:16 PM

#

I had a bot I'm building summarize an article, then convert the summary to an audio file.

tardy topaz May 24, 2023, 4:19 PM

#

The prompt was pure laughs and I didn't even mean to join the segments. Perfection, honestly.

#

I didn't save all the .npz for each segment though, a travesty.

granite quiver May 24, 2023, 4:59 PM

#

That clip sounds like it came out of a horror movie.

grizzled shard May 24, 2023, 5:03 PM

#

had some similarly haunted. 🙈

white wadi May 24, 2023, 5:03 PM

#

using your self model ?

grizzled shard May 24, 2023, 5:03 PM

#

what self model?

white wadi May 24, 2023, 5:04 PM

#

voice cloning

grizzled shard May 24, 2023, 5:05 PM

#

ah. yes. tried to random generate while using a voice cloned npz

granite quiver May 24, 2023, 5:05 PM

#

grizzled shard had some similarly haunted. 🙈

Sounds like a haunted doll with a built-in voice box.

tardy topaz May 24, 2023, 5:06 PM

#

My clip is a totally normal Bark random speaker, just came out perfect.

white wadi May 24, 2023, 5:06 PM

#

have you checked behind you?

#

just kidding 🙂

tardy topaz May 24, 2023, 5:07 PM

#

The laugh at the end after you think it's over, jesus

somber rivet May 24, 2023, 7:20 PM

#

tacit maple May 25, 2023, 12:15 AM

#

Here's another one B)

lofty flint May 25, 2023, 2:09 PM

#

Today's podcast, made in one go (just one inference for each podcast, not manually picked from hundreds):
https://soundcloud.com/jacktalk/sets/jacktalk-today-20230524

SoundCloud

JackTalk

JackTalk Today 20230524

JackTalk is brought to you by ai.pictures. All content is generated with A.I.


grizzled shard May 25, 2023, 11:28 PM

#

probably not a good idea to go over 1 for the temperature. but it started so well...
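A generic illustration of why going above 1 tends to get chaotic: temperature divides the logits before the softmax, so higher values flatten the distribution and unlikely tokens get sampled more often. This is a standalone sketch, not Bark's own sampling code; as far as I know Bark exposes this through the `text_temp` and `waveform_temp` arguments of `generate_audio`.

```python
import numpy as np

def softmax_with_temperature(logits, temperature):
    """Convert logits to probabilities after scaling by 1/temperature."""
    scaled = np.asarray(logits, dtype=np.float64) / temperature
    # Subtract the max for numerical stability; this does not change the result.
    scaled = scaled - scaled.max()
    probs = np.exp(scaled)
    return probs / probs.sum()
```

Comparing `softmax_with_temperature([2, 1, 0], 0.7)` with `softmax_with_temperature([2, 1, 0], 1.5)` shows the top choice losing probability mass to the others as the temperature rises.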

crisp bone May 26, 2023, 12:27 PM

#

grizzled shard probably not a good idea to go over 1 for the temperature. but it started so wel...

me after the lobotomy:

obtuse slate May 26, 2023, 12:49 PM

#

lofty flint Today podcast made in one go (just one inference for each podcast, not manually ...

Super cool. Do you generate multiple segments of 10-ish seconds and merge them later ? The voices are still Bark-weird but you made an excellent use of this limitation, and that gives the podcast a certain deliberate crazy tone. Love it.

lofty flint May 26, 2023, 2:40 PM

#

obtuse slate Super cool. Do you generate multiple segments of 10-ish seconds and merge them l...

thanks! yes, its limit is 14 seconds, so it is a concatenation of many short clips
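Joining short clips like that could be sketched as follows, assuming each clip is a 1-D float waveform at the same sample rate. The 24 kHz default and the short silent gap between clips are assumptions for illustration, not a description of how this podcast was actually assembled.

```python
import numpy as np

SAMPLE_RATE = 24_000  # assumed output rate; adjust to match the generated clips

def join_clips(clips, sample_rate=SAMPLE_RATE, gap_seconds=0.25):
    """Concatenate 1-D audio clips, inserting a short silence between them."""
    if not clips:
        return np.zeros(0, dtype=np.float32)
    gap = np.zeros(int(gap_seconds * sample_rate), dtype=np.float32)
    pieces = []
    for index, clip in enumerate(clips):
        if index > 0:
            pieces.append(gap)  # silence only between clips, not at the ends
        pieces.append(np.asarray(clip, dtype=np.float32))
    return np.concatenate(pieces)
```

Setting `gap_seconds=0` gives a plain back-to-back concatenation.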

lofty flint May 26, 2023, 4:09 PM

#

obtuse slate Super cool. Do you generate multiple segments of 10-ish seconds and merge them l...

Next I will try to make a live comedian with this:

https://www.twitch.tv/chatwithalice

Twitch

chatwithalice - Twitch

Feel free to ask questions you want to learn!
