#✨|sdxl
1 messages · Page 58 of 1
I am using dreamshaperXL10 and loving it. it can do more realistic than the 1.5 version did
hmmm, the gallery does show off its versatility even if the sample uploads were blurry
I kind of like my current workflow but I shouldn't avoid it out of laziness
It is a little tricky to avoid sometimes yeah
I've also noticed
Messing about with img2img
Using the refiner gives photos a very painting look @trim orbit
lol @heady vale
What's your workflow
Does anyone know a method get seamless tiling textures in comfyui the same as it is in automatic?
There's a seamless texture node in WAS node suite but that's just a simple gradient overlap algorithm used as post processing that will muddy the picture.
Whatever auto tiling checkmark is doing, affects the generation every step of the way. I keep trying to figure out how it works but I get lost trying to follow function calls around 😅
Still new with comfyui, but could you copy paste the node and put each copy at a different setting then output to multiple preview images?
AFAIK, AUTO sets the unet to use convolution padding to circular
So the network always gives you seamless textures
I delete the outputs folder in side Comfy and then create symlinks to the folders comy expects to the actual locatins.
Have a .bat file I can run when I update COmfy to restore the symlinks
NB of you aren't familiar with symlinks these commads need to be run inside an elevated permissions cmd window
Triple Workflow Refiner-Base-Refiner https://drive.google.com/file/d/1LLIY5IGwGFQ7fwiAg1VpysPD6g2Xorpo/view?usp=sharing
Just submitted my 1.0 workflow for testing by Comfy :p
So are you saying its still not public 😦
I really don't understand that Symlink diagram ... ? 😦
I said the goal was late today or late tomorrow (its 10Pm for me)
if all goes well, it will be up in a few hours
can't wait and thx @high skiff
this is what I get when I ask for a space station orbiting neptune 🤦
Its all good I'm just being funny.... I really appreciate all the work you have been doing. 🙂
I find SDXL still likes word salad
it like, super duper likes to make a space station OUT OF neptune
asdhfkasdfjkdah accidentally loaded SDXL 0.9 in auto, now I have to wait a long time again
I just need Comfy's confirmation, and to find the sample prompt I wanna ship it with
If I have my models at C:\Stable-diffusion\models\Stable-diffusion - what would a Symlink look like to ComfyUI B:\ComfyUI_windows_portable\ComfyUI\models\checkpoints
@visual glade wake up. You've got mail.
Hes testing it lmao
lol... we the people... are hungry for ai art workflow
do you have a pic of architecture or vehicles etc
I can generate some
Noafter making sure the comfy folders are empty (as the frst line deletes them the format for you would be
FYI if you're doing thi sfor the first time on an exisiting install I would rename the folder in comfy from say checkpoints to OG_Checkpoints first
rmdir /s /q "B:\ComfyUI_windows_portable\ComfyUI\models\checkpoints"
mklink /D "B:\ComfyUI_windows_portable\ComfyUI\models\checkpoints" "C:\Stable-diffusion\models\Stable-diffusion"
I dig it
A space station orbiting neptune orbiting earth
I'm trying mklink in an elevated command window
neptune has been ensmallened. or earth embiggened
It does vehicles really well
just checked
crop of the low res
vs the upscale
this upscale is insane, I can't believe how well it works, especially with how fast it is lol
i tried to upscale and add more detail. and i reduced detail or unfocused the background or added weird new body parts to the subject lol
yeah, I had to work on a whole different way of doing it to get it to work well
waiting for the experts 🥳
Looks good. Im already getting similar results but yours is probably much faster
I increased the quality, full image gen at 2048x is taking about 45 seconds on my 3080
let me try architecture, tho from what i have seen, its probably gonna be amazing
this is high res fix after all, it fixes a lot of deformities
This worked a treat - thank you 🙂
Looks good. Nice work
you can also control how much it upscales, from 1x to 4x, as well as how much "fixing" it does
so you can choose to stay faithful, or change things
Thanks it's a nice workflow, pics are looking pretty good. I like the batch display.
my fix is going to ship at 33% fix
Here's some weird stuff using R-B-R - Prompt = Pointillism magic surreal grand guignol peruvian arpillera venice carnival psychedelic Avant garde art in the style of yoh nagao Eric madigan heck yayoi kusama
sdxl results set to tiling in automatic looks mangled compared to 1.5 🤔
I haven't done a lot of tests though to compare because auto is crashing too much and it takes several minutes to load the model each time
I have come to the conclusion that A1111 is getting left behind (to ComfyUI) in the SDXL Stakes - but that's just me!!! 😄
sdxl tiling will probably take a bit to rejigger for the 1024 resolution. inpainting at that scale can be different i believe
Automatic is genuinely just a way worse experience for SDXL, especially now that I have a working high quality and reliable high-res fix workflow, of which automatic is not capable of implementing
That seems spot on... A1111 just can't seem to do the same things
And the reason I say it's not capable of implementing it is, it is also based off of my mixed diffusion, which automatic does not have support for
tbf i'm using the dev branch and next time I launch auto it will be to 1.5 inpaint a bunch of my SDXL images
It was easier to include SDXL in Comfy because the interface is already all about arbitrary workflow. It's harder for A1111 because that was designed around a single model workflow not a dual-model.
The fact that automatic doesn't have any support for mixed diffusion puts it at a huge disadvantage for SDXL
... and of course, Comfy himself actually works at Stability AI 🙂
And it was not designed in a maintainable way so a lot needs to be undone in order to make it dual-model.
Pseudo and I worked together in order to port my mixed diffusion pipeline into diffusers, and it improved diffuser outputs considerably
To the point where the developer for diffusers has implemented it as his default method now
What about the styles prompt?
You can very much use the same prompt to cross both text encoders, however my research for my workflow has concluded that proper separations between terms for the first and second text encoder can result in even higher quality images
i'm using auto right now just playing around with the new hires stuff. i'm not using the refiner as i'm moving away from using it in primary passes now. i'll bring it into play for situations where i want it. i've found that the base model scales up very nicely.
just doing a straight 2x upscale pass in auto like i would before has been working quite nicely. just the vae takes SO long to cook and there are a few memory leaks. i wouldn't say incapable. Comfy is still better for SDXL simply because of the much more required optimizations in place.
@primal vault how would the styles prompt of this bepost-apocalyptic survivor overlooking abandoned city, overgrown ruins, hazy horizon, muted tones, artstation, concept art, detailed textures, art by maciej kuciarawould it be muted tones, artstation, concept art, detailed textures, art by maciej kuciara
I have plans to release a stripped down version of my workflow that only uses the base model, and one positive text encoder
Oh ok, so 3 prompts aren't necessary?
the new stuff is just switching to a different model for the upscale pass. but i haven't been doing that anyways. i'm thinking the refiner feels more like noise offset, only sometimes needed
How improved were the images while using 2 text encoders?
automatic tiling for SDXL gives me stuff like this lol
yeah, while that is true, Auto also doesn't have support for some functions that enable all three of my SDXL tricks to work
all three of my functions (Mixed diffusion, fractional offset, and SDXL high res fix) are not compatible with Auto cause it has no support for advanced sampler control
and i don't see it coding in all of that any time soon
the new branch seems to.
post-apocalyptic survivor overlooking abandoned city, overgrown ruins, hazy horizon, muted tones, artstation, concept art, detailed textures, art by maciej kuciara - done using R-B-R
have support for advanced sampler control?
different samplers on the second pass anyways. i'm not sure the advanced granularity that we see in comfy nodes are really an end user requirement. i've not seen a use case
They are required for any of my 3 tricks to work, of which, all 3 are very beneficial and efficient
Comfy and A1111 will probably never have exact feature parity and that's fine. A1111 does some things that Comfy doesn't and vice versa.
if Auto were to add the features I need to implement my fixes into it, I would be willing to, but it needs to add a lotttt of things, especially for my high res fix to work
which upscaler are you using?
my own custom workflow, which should be released in a few hours on the comfy wiki
nodnod, thanks ^^
no worries
ComfyUI is ahead of A1111 as you can lift the hood on ComfyUI - so to speak
Had some time to fiddle with settings today, Day 2 of trying to wrap my head around comfyui. I'm blown away by the detail some of the outputs are giving. 🤯 Spent tonight trying to do a face refine workflow for wonky human faces and got a little closer, but still not sure if I'm doing it right.
guess i'll have to wait to see the 1.0 release. all i hear is hype so far. your fractional offset trick honestly is a fail more often than it wins. i was whelmed by it. it's a neat novel trick that can be done with the way nodes work. not sure of it's use case still though.
Code quality and maintainability is a lot better. But there are some workflows (meaning processes, not node layouts) that A1111 handles a lot better.
I have some friends testing the crap out of my upscale right now, and its working very good across different styles, so I am happy for that
what does hires fix mean in a1111?
does it pass the image through the ckpt again?
yeah, my fractional offset is the least important, but the mixed diffusion and especially the high res fix are a big deal, which need the sampler control, otherwise they are not possible in auto
A1111 has Latent Mirroring; and a great Image Browser... !
altho, I will say, I found some more use cases for my fractional offset, but I will be keeping those to myself, as I can't give away EVERYTHING I learn :p
they are untested at the moment as well
or I should say unproven
I hope A1111 can somehow get updated to work well with SDXL and that Comfy also keeps getting better. They are different tools for different people. Not competitors.
The staff already the best parameters found on the bot? I am waiting for it 😦
freckles i thought it'd do well for but prompting for them works better
not that I know of, and I wouldn't really listen too much to their results either
SD.Next moves quite a bit faster than A1111 and the maintainer is more responsive. it's pretty good in the meantime
its data from chery picked pre-prompts, from people who were looking for aesthetics over prompt adherence, running in a less efficient way, with excessive steps, and tons of randomized parameters
It was all too inconsistent to draw any major conclusions from IMO, especially compared to the way my workflow runs, as my workflow is optimized for different samplers than theirs
I haven't tried it because I heard it has a lot of problems and crashes. But maybe that's not true.
they don't use the data raw like that. they evaluate the results themselves and the votes are just clues for them to hone in on what people are seeing
fair enough, tho I will say that for my workflow, I have already proven that their data is not applicable to the way I do things, as I ran some small scale tests to compared, and found very different conclusions to them in this community (specifically on samplers and step counts)
A1111 trashes my 8Gb VRAM RTX2070 in SDXL - ComfyUI is just that - my GPU is Comfy in SDXL!!!
Do you run different cfg and samplers frequenly? 🤔
it's worth a try for sure. you'll get updates a lot quicker
my latest auto1111 image with the hires pass on the bass model. i honestly think it's fine. The obvious ai artifacts come more from a phoned in prompt approach than anything else
the worst part of the auto process is the VAE takes minutes
no, but I did test them for the core part of my workflow, and found that most samplers were not compatible with my workflow as a whole (I believe 11 of them all crapped themselves), and then the ones I didn't pick were inconsistent
the sampler I use is reliably solid, whereas some of the others have a tendency to either be great or trash
I wonder if reducing the contrast is better as a negative text prompt or an offset lora added to the workflow
or both
I would say it consistenly gives you 80-90% quality results, where the others that worked were more like 40-100%
This is because Comfy loads and unloads models as needed. A1111 crams them all into VRAM at once which is faster if you have enough VRAM but a LOT slower if you don't. You can use --medvram in A1111 to make it behave somewhat more like Comfy in terms of memory.
Is there any way to remove the ones that don't work completely?
What is your current parameters now if you dont mind? I am usign dpm 2m keras with 20 steps 7 cfg most time
yeah, I omitted them from my tests
I have tried --medvram and --lowvram - but it pushes my GPU to its limit - spitting ERROR messages as it goes 😄
Did you list the ones that worked well?
yeah, its been close to the same since my 0.5 release
DDIM, Normal, 25 steps, 7.5 CFG is the go top rock solid base for my mixed diffusion base
Something else might be wrong then. I've looked at the code so I know what those options really do.
think i'm going to run it with --med-vram just so it decides to try to avoid doing that
let me see if I still have the list
A whimsical 3D illustration of Barbie and children, surrounded by colorful bubbles reminiscent of Pop Mart blind boxes, with a cartoonish style and playful details. Camera: Nikon D3. Lens: 35mm.
Dimm? Interesting, I was using it for fast gen in the past. I will run some tests too.
the ones that didn't have a tendency to explode were:
DDIM
Euler A
UniPC
and DPMPP SDE
I saw that movie - DONNIE DARKO?! 🙂
DPMPP SDE did have higher quality results some times, but consistenly had worse ones as well
honestly is a trip trip movie. i was reminded more of aeon flux / matrix white rabit
so it was a little all over the place hit or miss
I didn't reference that!
Wish comfy could do high-res fix pass it seems far superior to the refiner then normal upscale
I am preparing my reddit announcement for release right now
my workflow getting ready to drop can do damn good high res fix
I did this yesterday... not sure if it's of any help. 240 styles in one picture
Scott Detweiler's favourite model DPMPP SDE GPU
it can. just build the node chain up
Awesome can you please ping me when it drops I really need a workflow like that!
I'm too dumb to do it sadly lol
... and who knew that Karras actually works at nVidia!!!
i know. me too man. me too.
like, damn good
first image is crop of base 1024x, second is my high res fix at 2048x
Scott Detweiler has demonstrated a hi-res fix
Yeah that's so much better
!!!
Not sure why but high-res fix just makes the images so detailed
Is "Hi Res Fix" just a three word phrase for "Sharpening?!"
for my high res fix, the final image is 4x the pixel count
so it goes from 1024x1024 to 2048x2048, while also fixing and refining details and deformities
So Topaz GigaPixel and Topaz Sharpen AI - ... 🙂
I think A1111 Hires Fix isn't really "fixing" anything but it is just using multiple passes with the model and then stiching those together to make a higher resolution image.
ah yeah, that has the crusty pixel upscale look
SDXL VAE uses a lot of VRAM.
static in a lot of areas
I had that issue when I was starting my SDXL venture
its a cool image in general tho
I like it a lot :p
if you guys like cyborg rabbits here are more
naw thats just bad prompting on my part. i get that effect without any hires pass when i phone the negative in
ah, fair enough then
it comes from my magic word habits. "fujifilm"
the man with beard looks better on the left side(a lot more depth and texture) but the eyes are off
so that was without refiner pass?
or the other good one i use "magazine centerfold" but then you get that crusty offset print look
please make the dog the lead singer
base can do a lot of good gens
wait a sec
did somebody steal my workflow and pose as me on my own repository?
WTF
bro what lmao
how can they?
your acc got hacked?
Tries to envisage what "posing on my own repository" might look like as a prompt?!?!? 😄
oh, they added a low quality upscale ontop and then reused my whole read me linking to their donation pages instead of mine
wow lmao
that is scummy as hell
they forked yours?
like, they did credit me, but they took all of my writing and just slotted their name into it
Cyborg Rabbits in Post-Apocalyptic City
they legit took all my writing and just put their donation pages in its place lmao
you literally gave it a gpl
Get that Shinguular guy banned!
I know I did, for people to add to it, not pose as me
you don't really get to decide how people use the gpl though. that's a different kind of restrictive license
Unfortunate side effect of GPL and many other open source licenses.
looks like he might be chewing on it
alright, then 1.0 will be released more strictly, cause I am not cool with people copy pasting my work and posing as me pretending like their donation pages are mine lol
I'm sure some of that was involved.
damn, ppl can really fall into the trap, can you send the link to their repo
Omg I didnt realize how big of a difference the negatives make
A hub dedicated to development and upkeep of the Sytan SDXL workflow for ComfyUI - GitHub - ShinguLari/SinguLari-SDXL-ComfyUI: A hub dedicated to development and upkeep of the Sytan SDXL workflow f...
They made a fork, and just reused all of my test with their socials and stuff
it does say at the very end that it originated from me, but they even filled in my donation links with their own name and stuff
(we inch closer to release :p)
What happens if I report the repository?
for hires fix should we use both base+refiner or just refiner?
no idea, I would say don't worry about it
its all preset in my workflow
oh ok
whats wrong with it? seems to follow the gpl don't it
will check
yeah. if you want to maintain control of how it's used afterwards, you need a less permissive license.
not trying to be xeno, but i'm not surprised that fork is chinese
I thought that was japanese?
maybe i'm being a little xeno
😂
China has the tradition of "stealing everything" - and that's not blatant prejudice - that's a fact!
I didn't realize that the GPL would just allow people to pose as me and use my about me to promote their own donations and stuff
double edged sword these copyleft licenses can be
Prompt - Posing On My Own Repository!!!
I mean they are translating the readme and releasing tutorial videos in their native language
and substituting my name for theirs, and my links to my donation pages to theirs ._.
I cant read it so its hard to say if they are claiming its stheres but yeah a little sketchy for sure
oh I gotchu fam
oh well, I don't care much, their workflow seems to have been slapped together with no real knowledge of what they are doing, and they have gotten little attentionf or it, so I will just let it be for now
FOSS is freedom as in beer, but more importantly, free to do what you want with it
What is the difference between base model and refiner? Which one do I use?
All my CC0 (License) photos are for sale over in China ... but CC0 lets them do it without theft!
refiner is an optional second model for adding detail and refinements to finalize an image
refiner: it's not very useful. send your images generated with base to img2img and run the refiner at a ~~cfg ~~denoise of around .2. But it depends on what you're doing.
Then Should I use both of them?
hmm
is there already an updated upscaler model to use? other than 4x-UltraSharp?
sorrry denoising
my new workflow comes with one I found works just a litttllleee better, it will be on my github
I am preparing my reddit post for launch
at least for postapocalyptic cyborg rabbit illustrations, it's not very effective
I swapped from Voldy's WebUI to ComfyUI and I'm having real big troubles with the eye department. Is there a workflow for Comfy that would automatically find faces and redo the face area?
on the right seems cockeyed
@high skiff you workflow didnt include hires fix
dude
its not out yet lmao
I have been saying this for so long lol
I keep telling you lol
its releasing soon
like, within the next few hours soon
Wow very clean
was asleep at totally regular healthy human times didn't see this
yesterday I updated the Principled node a bit. Highlights include refiner clip/model being optional and an experimental N-Pass upscale that turns your GPU into a space heater
https://github.com/Beinsezii/bsz-cui-extras
kinda been neglecting my two non-principled workflows, might go back and update them some more sometime soon
Also despite refiner being optional, it's still not usable for 1.5 due to it using the SDXL conditioning nodes internally
I've updated every day so there might be more depending when you last looked
https://github.com/Beinsezii/bsz-cui-extras/commits/master
Couldn't you normalize the colors to fix this?
yes. I am asking why it often comes out like this
I just checked one of my gens and it doesnt seem to have that problem but I am using a lora so idk
which one?
the base model is trained without noise offsetting. it averages pixels towards a middle point unless you use a tooled lora to offset it towards darker images
that example is missing lights
mb. probably the same reason. it's trying to average the picture towards the midpoint. noise offset shifts it away from center
It does look a little washed out
yeah that's deliberate. it can be hard to see this with the naked eye, just provided the image as an example
that file is the offset lora
ah is that what that is
I have that
sometimes I use it, but I never notice it...but I am usually not looking at histograms just mashing generate like a crazy person
@high skiff What was that voice software you were trying the other day... RCV or something? I can't remember... I figured I could play around with that while I patiently go crazy waiting for your Comfy Release to drop?
yeah i hear that. you can often see it when it really rears it's face and you're looking for it. some prompts benefit better from offsetting so when you see that flat washed out look, try it
actually I was more curious than anything. not looking for hyper fine technical quality, mostly happy surfing the latent space for interesting stuff
yeah, RVC
Is this the one?
https://github.com/RVC-Project/Retrieval-based-Voice-Conversion-WebUI/blob/main/docs/README.en.md
I believe so
Can I do Donald Trump with this?
yes
yes
woah cool, I am looking to make a short written by ChatGPT, Voiced by elevenlabs, art directed by SD, and shot by runway. Was just looking for something I could use for music.
I have a trump model lol
Woo hoo
btw elevenlabs is amazing
elevenlabs is cool. but RVC can achieve way higher quality
Does is have emotional control?
It must be FANTASTIC... I mean The BEST model EVER... Truly AMAZING (in trump voice)
I find elevenlabs to be nearly indistinguishable from reality for some voices
oh, you have to put voice into it
ah ok, that's quite diff from 11 labs
This one is also RVC
so this is you singing?
so you put in a song and it transforms the voice to another artist?
I did make this one tho lol
it can, if you use it for that
hmm
it can work on anything that has a pitch really
and it doesn't just have to do voices
Like its only for singing?
Huh thats surprising
yeah
those singing ones, like the ballin, only take about 20 seconds to render on a 3090
this was me asking a rude chatgpt to recite the events of a very nasty film. then I had elevenlabs read it in attenborough
that took about 2 seconds on my 3090
Wtf
it'd help if I knew what any of these things were lol
Also did death singing Boss Bitch by Doja Cat, this one is supposed to be bad lo
Me listening to RVC
same thing with scarjo
Do one by Awkwafina?
I would have to train a model for her
Eau keigh 🙂
I mean on elevenlabs. they don't let you share
oops, nevermind
I was thinking of the wrong person
you can also use RVC for non voice sounds
you guys know Karoly from two minute papers?
like, if any of you guys have ever heard meow synth, somebody made an insanely good meow synth RVC lol
Meow synth singing ballin lol
lol every single thing you have mentioned except rihanna I don't know
Is that Mickey Mouse singing?! 😄
Its meow synth!
^..^<
Mr. Sandman, bring me a "El Gato"
Made with meowsynth and Fl studio
meow synth
thank you, I have been edified
now you can see how damn good that RVC us haha
gary from Spongebob lol
one of my favorites lmao
david attenborough quoting mike tyson
Some of this stuff is "too toilet"!!!!! 😦
speaking of, I have the sound of aa toilet flushing singing rap god by eminem lmfao
sounds more like em rapping into a toilet
luckily u havent heard ambatukam covers yet
Iron Mike: When your best choices are getting your ear bitten off, or cussed off. Do not engage.
I have the spectrum dial assistant doing the low tier god rant lmfao

U mene that whoosh is the toilet? Nawww 😦
Me, hanging with my homies ...
"let me transfer you"
LTG if he was a receptionist and not a streamer
Minecraft Level singing ballin lol
aw man, my elevenlabs sub has run out
WHAT A TIME TO BE ALIVE!
What a time to fourscore and seven years ago!
ahem
They indeed have an RVC of him lol
can RVC do text to voice?
i want to train my own voice then pipe audio books through them for my kids, but i haven't gotten around to training yet
ah boo
with elevenlabs it takes about 3 minutes.
tho, some people suggest using Bark into RVC, as Bark can do text to voice, but poorly, so yuou use RVC to fix it up
yeah there's quite a few that do it badly
I use NaturalReader to make my own AudioBooks - just feed it vanilla text - and ogg/wav/mp3 comes out!
ima show this to my mate, he's making a sarcastic assitant thing
it's cool but a flat delivery would get old quick I think
nahh. I listen to a lot of audiobooks, and I can't stomach a bad reader
I dont finish those
@high skiff I think I have RVC installed. How do I run it?
you could say the same about being blind, but i like to see 😛
Somebody made an RVC for the "DAMN DANIEL" and the "AR AR AR" part of this video lol
SUBSCRIBE
damn daniel ar ar ar, damn daniel ar ar ar meme, damn daniel ar ar ar 2x, damn daniel ar ar ar sped up, damn daniel ar ar ar one piece, damn daniel ar ar ar remix, damn daniel ar ar ar flight reacts, damn daniel ar ar ar flight, damn daniel flight, damn daniel original, damn daniel meme, damn daniel flight meme, damn daniel ar ar ar...
I have listened to the complete Moby Dick twice (19 hours each); and then Don Quixote (another 19)! 🙂
bro there are very good narrations of both of those
No matter 🙂
these two @spring fulcrum
yeesh
There are totally free vanilla texts at gutenberg.com
the go-web would be where i'd start, you need to get some voice data tho
Tried those and got this
Just finished my first XL lora 😄
did you download the .7z from huggingface?
They take forever to train and don't seem to have a very fast learning rate
how big is it?
800mb
SDXL is pointing towards a sea-change in our desktop hardware - or more renting GPU time! 😄
yea about to rent one tbh
took 7 hours for 16000 steps
phyrexian horror
83 I think?
🙂 i know had to do it
@spring fulcrum if you're on windows you can download from here (i think the RVC-Beta.7z) i have a previous version ending in 0528 https://huggingface.co/lj1995/VoiceConversionWebUI/tree/main, then i believe it will run with all the dependencies they packaged together https://huggingface.co/lj1995/VoiceConversionWebUI/tree/main
oooo
pretty cool results
but then you're going to need some vocie libs
I suspect when we get to SuperSDXL (or whatever it'll be called when it drops) 24Gb VRAM might not just be enough 🙂
or they'll use new tech so it will
My favorite card in all of MTG is Vorinclex, Monstrous Raider
Can you share your training params for that lora?
Right now, with SDXL 1.0 and pytorch compiled unets (base and refiner), I'm using 23GB ram at fp16
i mean didn't some dev peeps say we'll have 30 images generated per second one day soon lol
I prefer training in bigger batches, the results are better tbh
oh, thats not good
you must have used non optimized settings
yea definitely not good at all
on my 3090 I was doing BS 12
WTF
yeah
ok can you teach me pls?
all hail sytan
I don't have my 3090, so my SDXL LoRA tests were cut short
I have 3090
but @boreal bough Is a god at LoRA's
hes really good, much better than me ATM, as my 3090 was DOA
i need these settings before I run this training session overnight
yeah, i want to get better, i would just rent a 3090 24 for doing deambooths from the terminal
you need to make sure you have gradient checkpointing on, cache latents (to disk) on
For what rank?
Hey guys might be a really stupid question, but does it cost to train a model with Lora? I'm a real noob when it comes to this
If your renting a gpu, yes
If you use your own, only your electric prices and the gpu wear
@boreal bough posted this last night and i copied it down... it was a few people back and forth...
accelerate launch --num_cpu_threads_per_process $num_cpu_threads_per_process $repo_path/sdxl_train_network.py
--pretrained_model_name_or_path=$pretrained_model_name_or_path
--train_data_dir="$image_folder"
--output_dir=$output_dir
--logging_dir=$log_dir
--output_name="${train_name}-${versionname}"
--train_batch_size=$train_batch_size
--unet_lr=$ss_unet_lr
--max_train_steps=$max_train_set
--lr_warmup_steps=$lr_warmup_steps
--use_8bit_adam
--xformers
--mixed_precision=$mixed_precision
--persistent_data_loader_workers
--network_dim=$network_dim
--network_alpha=$network_alpha
--shuffle_caption
--keep_tokens=1
--caption_extension=".txt"
--lr_scheduler $lr_scheduler
--min_snr_gamma=5
--network_train_unet_only
--resolution=$max_resolution
--min_bucket_reso 512
--max_bucket_reso 2048
--enable_bucket
--save_every_n_epochs=$save_every_n_epochs
--save_model_as=safetensors
--save_precision=$save_precision
--seed=$seed
--network_module=networks.lora
It seems like I can do batch 8 with gradient checkpointing now
--cache_latents --cache_latents_to_disk --gradient_checkpointing --mem_eff_attn --bucket_reso_steps=64
accelerate launch --num_cpu_threads_per_process=2 "./sdxl_train_network.py" --enable_bucket --pretrained_model_name_or_path="A:/models/SDXL 1.0/sd_xl_base_1.0.safetensors" --train_data_dir="A:/Datasets/Concepts/nier/2B" --resolution="1024,1024" --output_dir="A:/Datasets/Concepts/TRAINING" --logging_dir="A:/Datasets/Concepts/TRAINING" --network_alpha="1" --save_model_as=safetensors --network_module=networks.lora --unet_lr=0.001 --network_dim=8 --output_name="2B_v1" --lr_scheduler_num_cycles="40" --learning_rate="0.001" --lr_scheduler="constant_with_warmup" --lr_warmup_steps="50" --train_batch_size="8" --max_train_steps="35800" --save_every_n_epochs="2" --mixed_precision="bf16" --save_precision="bf16" --seed="1234" --caption_extension=".txt" --cache_latents --cache_latents_to_disk --optimizer_type="AdamW8bit" --max_data_loader_n_workers="0" --keep_tokens="1" --bucket_reso_steps=64 --mem_eff_attn --flip_aug --shuffle_caption --gradient_checkpointing --xformers --network_train_unet_only
some results from my LoRA I did before my 3090 died
obviously on the Na'vi from Avatar
very small data set, so it would be way better with a full one
wait, your 3090 died?
That did work but nothing is in english... lol
few more tests I ran as well
Gah im really loving this lora It just took ages
ill copy his code and run it with my next one tonight
i have been off for 3 days
yeaahhh, thats the right level of fucked up and messy for phyrexian horror for sure
what happened exactly?
@spring fulcrum you'll need to get some trained models/files here's some, there are 2 types here, so you'll need to find RVC ones... https://huggingface.co/QuickWick/Music-AI-Voices/tree/main
I don't care to get into it. I am being refunded as we speak
any way to change the interface to english?
very very very short story, seller sold me the GPU with missing parts, tried for several days to get me to lose the money for it, blamed me, then the mods in the server destroyed him, and he apologized and he is now refunding me in exchange for the GPU back
I have another new 3090 coming in on wednesday, and its faster, highr quality, and comes with a 275 day EVGA warranry
(3090 FTW3 Ultra)
@spring fulcrum ha mind isn't like that i don't think there might be a setting somewhere for that hmmm... but when you get to loading in models you can follow here... https://youtu.be/qZ12-Vm2ryc?t=160
my socials:
email: nicholasmpetro@gmail.com
instagram: https://www.instagram.com/itsp3tro
soundcloud: https://soundcloud.com/nickp3tro
spotify: https://open.spotify.com/artist/5Kr7b5v6jHjPEUdjo4VyYK
ChatGPT can help if you have any errors, but this should be straight forward!
Have someone train models for you: https://www.customaivoices.com/
...
Thank you for helping scorp, I genuinely forgot all of the setup after doing it like a week ago lmao
SDXL has had an iron grip on my psyche
@spring fulcrum ha guy in the video just right clicks and hits 'translate'
lol thanks
hmmm... i really dont get his CLIP_G / CLIP_L stuff ... tried several combinations between prompt/style plugged to CLIP_G/CLIP_L, but the only working combination that gives me good results, is to put prompt+style to both... is there any use case to pass different values to them (and still giving good results)?
This is what I got when running the file you suggested downloading...
thats better,
for the same price?
$100 more, but arguably an even better deal than the first
Now he's just showing off...
haha
kohya gui, main branch, and use this config file
for anyone wanting to make loras
epoch and max epoch need to be adjusted. 40 for normal loras, 80 for very complex loras (faces/anatomy).
(also obviously adjust batch size, to whatever your card can handle)
repeat on dataset folders = 1
$700 shipped with tax for an EVGA 3090 FTW3 ULTRA with 275 days waaranty left
700?
nvm I ran the web.bat and i got the webui... Its mostly in english
dude, thats a great deal
there'll be a guide eventually on what exactly all settings do, but tl;dr. that config will work as long as your dataset isn't too bad
jelly
Did that rogue GPU work after all?
@boreal bough i'd love to get your setup and see how easy i can get like a runpod/vast setup
Just FYI, the 3090 has overheating ram issues that were fixed in the 3090 Ti
Since the 3090 has ram on both sides, but with 3090 Ti they fixed it and placed ram on just one side
Currently with a 4090, running stable diffusion in fp16 for 1 hour, I'm at 81C temp
it might just be as easy as git clone diffusers, or which ever lib you use, but gettin a linux machine to run it quick for easy training sessions would be nice
i only have 12gb locally so i have to reach out to rent
If you put muted tones, artstation, concept art, detailed textures, art by maciej kuciara in the style prompt, remove those keywords from the main or secondary prompt. The style prompt is mixed into both of them and you don't want it to be applied twice (unless you do)
Am I the only one getting this feeling of "low quality colors" ?
Add the negatives
Desaturated, hazy, washed out, low contrast, foggy, dull colors
To your negative
I mean if y'all are buyin pc's I can throw my 2080FE back in this tower, sell the unit as a whole and upgrade to a case that can fit my 40 series
I def already have an upgrade wishlist picked out lol
does the batch size of 8 run on 24 gigs?
I just recently upgraded my computer for the first time in 6 years lol
https://www.coreweave.com/gpu-cloud-pricing
80GB A100 for $2.21 an hour, dirt cheap
CoreWeave Cloud pricing is designed for flexibility. Instances are highly configurable, giving you the freedom to customize GPU, CPU, RAM, and storage requests when scheduling your workloads.
I'll even refurb the gpu. Got fresh paste and a pad for it
thx, kind of lost with the new prompting thing
All good
not that anyone here wants a 2080fe
I mean, it's a fine GPU for AI
A 4090 is $1600, so if a A100 (last gen, released with 3000 series) is $2.2, you can do 700 hours of work to make the 4090 worth it
A cup of coffee ever 1.5 hours
yep. batch 8 to still use your pc. 10~12 is max limit if you're going to sleep and not touching it anymore
fyi, batch 3 only runs a small bit slower - but then you can use comfy even while training
I've ran them at BS 4, 8, 10, and 12
so that makes testing your checkpoints during training much easier
I was doing that while training 1.5 checkpoints on my 3090
It was extremely useful to be able to just pull out one of the checkpoints and test it while the training was still going
just set your lora output folder as an additional lora folder within comfy, then you don't even have to move anything
so the main and primary prompt should look like post-apocalyptic survivor overlooking abandoned city, overgrown ruins, hazy horizon
extra_model_paths.yaml
With epoch 30, is that just like a skyhigh value, so you can stop the lora when its ready?
ok guys, I need a little community help to speed up my SDXL 1.0 worklfow launch
I need to find a program that can allow me to compared 2 images of different resolution side bys ide and export it as a collage
but you dont know the flexibility of owning a gpu and running it locally, you dont need to switch into the website and activate and deactivate or wait for any server probs
it's not high - it's cause we're running with repeat 1
expect the lora to finish in like 20~30min
I am struggling to find something that can do that ATM
so a grid?
ohh! that makes sense lol
also, usually epoch 20 is the 'final' checkpoint. i run till 40 to see if I want a small bit of overfitting
just one image next the other
faces/anatomy are still hard to get right, but overfitting solves that for now until I write a full guide on how to do it properly
... but my yaml file is already called extra_model_paths.yaml - so why wouldn't it load? 🙂
I tried ICAT, but it can't do images at different resolutions
are you expecting the lower res image to be upscaled so that the heights appear the same or do you want it 1:1
extra_model_paths.yaml.example
remove the example
I am expecting just 1:1
Its OK though, I've used Symlink instead
I need my 1024x images next to my 2048x images at the same scale
Oh, I didn't see the hidden extension! 😦
like this, but it can't export them like this
yes, 1024 up
sounds about right
how many image pairs do you have?
I wonder if a standard collage maker could work
trying to do 10 demo images
you could potentially do a photoshop script, though idk how it works
so 20 images, 10 pairs
i'd probably just use https://www.photopea.com/
Photopea Online Photo Editor lets you edit photos, apply effects, filters, add text, crop or resize pictures. Do Online Photo Editing in your browser for free!
you can use photoshop
I respect that, in that case I would recommend using a video editor
should I do it for you?
import upscaled and original as image sequences into resolve
export as a png sequence
ok, i think I found a site that can do it!
also, just in case anyones curious about file size. the full details & nuance of a face fit in dim 1 - we just use 8 cause we want the 8:1 ratio of dim to alpha.
setting it higher usually won't give better results, as dim 8 is big enough that I've fit the concepts of 100 complete dresses + faces + anatomy into it. So unless you're doing a full finetune level of lora with more than 5k images, dim 8 will be good enough
@high skiff you could make an 'action' where you record upscaling it to help quicken it up
gimp cult gimp cult
I don't use gimp either :p
which site is that?
completely useless :D
you can create a project of 1024x2048(assuming the res of smallest image 1024x1024), and paste 2 images side by side, resize the larger one to 1024x1024
no, thats the last thing I want
I need the bigger image to be full res
alright, I think I know what I have to do
I am just gonna run the 1024x images through area projection upscaling in comfy
in photopea you'd use edit -> image size -> 200 percent & nearest neighbor
what an amazingly roundabout way of doing it
on the 1024
area projection?
then you could place it on th 2048 with a 200 percent canvas increas from the edit menu
0 pixel filtering
Not sure what you're trying to do exactly but you can probably use Chainner
then, use 4096x8192 and make the 1024x1024 bigger by uncropping it
by dragging
Can batch resize, stack, and caption images
your face is a roundabout
This sounds like a 5 line Python script using Pillow.
magick mogrify -scale 200% lowres.png && magick montage -geomtry -0+0 loweres.png hires.png grid.png made your grid that'll be $40
Really comfy should have nodes for that stuff
there
yes, the high res fix for whatever reason messes with the white balance, I have not sorted out why
Right side more texture
@nimble heart yeah if he said like 100 i would have said something like that
upscaler mayhaps?
maybe, tho I think its a byproduct of any img2img, as it also raises the black levels, even with no upscaler
Texture mutes contrast
Don't you think it's a fairer comparison to use bilinear / bicubic filtering for the smaller image?
the world isn't ready to hear our truths of magick
but you can see in the background that the sky is ever so slightly different like he's saying
Meh, I don't reallt caew honestly
I am jsut trying to get this out to people
@nimble heart i mean to be honest i have to look up the answer for 5 minutes, so if it's 10 images i just do it by hand, same with like ffmpeg stuff lol
Still waiting for Sytan workflow release
@primal vault why are these highres and lowres images the same?
The more texture you add the more contrast shifts/mutes. Add texture with contrast compensation?
without really going into details the contrast thing is way worse normally. there's a filter to fix it in the workflow
using nearest is actually the most fair, cause its displaying the pixels at their actual size
it's like the gamma is + 0.2 points on the upscaled image
I disagree, they are texels not pixels
so he gotta filter it back to somewhat normal
I have a post processing node for after the high res fix that you can mess with to fix the black levels
If they were actually displayed at 1 pixel per texel then it would be "fair"
whats a texel?
there is no texture map
since he is doing a 2x upscale, there should be 4 pixels for every 1 1024 pixel, so the neighbor upscale will be perfectly accurate to the original assuming your image viewing software doesnt have an upscaler
Well its basically a pixel of an image lol, but if you are displaying it at a size mismatch with your monitor you would typically use bilinear filtering
why wouldn't they? the highres version is an upscale of the lowres version
If you displayed the image at native size it would in fact be a pixel
Nearest Neighbor can be used on continuous data but the results can be blocky. Bilinear Interpolation uses a weighted average of the four nearest cell centers. The closer an input cell center is to the output cell center, the higher the influence of its value is on the output cell value.
at first glance I thought that was two images of two geese staring each other down
i don't think he's looking for an interpolation
I am specifically not
A texel is a kind of sheep from an island in north Holland/Netherlands
if you stare at those cross-eyed they'll bite you
you get used to it.....
I have the ffmpeg high quality gif filter sewn into my heart
ffmpeg -ss 00:00:20 -i vid.mp4 -copyts -ss 00:00:20 -to 00:25 -vf 'split[0][1];[0]palettegen[p];[1][p]paletteuse' out.gif
Images are continuous, you would interpolate them unless they are specifically meant to be pixel art. If you are intending to resize an image then you are agreeing that it is interpolatable
he's wanting 'pixel' art to remain true to the data
If you wanted to do that badly enough you wouldn't rescale the image
potato, potato
I read both of those like "potato"
isnt it highres fix?
My neighbours out and about ... 😄
not the kind of highres fix that you know from 1.5, no. it's an upscale of the output image to double the resolution
runpod has em for 1.69$ per hour
oh, so in the workflow the upscaled image isn't being run through anoth ksamp?
no, that's something you can do with the img2img workflow
damn I totally recognize you from somewhere
but we should again paste the image, it would be great if you could add a highres workflow by passing the txt2img through ksamp again
sometimes I get these wonked out images, is this because my computer is shitting the bed?
I am wondering if lora effects Clip-G and Clip-L equally... does anyone understand how that works?
a1111 settings?
auto1111 is what im running on yea, settings 30steps, high res fix x2, latent upscaler, the usual
Oh wow, amazing price
what's the denoise on latent upscale
@copper rose #🔧|finetune message
0.4
posted it a bit more descriptive there
Oh wow, seriously those prices are amazing
Tysm caith
How tf
yuuuuh they they real ones
I'm not caith :(
LMAO mybad clicked reply on wrong person
we can all be caith ❤️
Cuz at those prices
we should all be Horatio
Well, if you aren't using tensorcores, 4090 is faster in fp32 than the H100
you can keep your files for a few per min/hr
But the H100 has more mem, and much faster tensorcores
and yes you can ssh in
Nice
I wish they had persistent disk
Like 100GB for $10 a month or whatever
any other upscalers I can use at a lower denoising strength? Latent gets kinda wonky at 0.6 with my lora
pixel based ones
so non-latent
runpod does have something like that, but the deal is that if someone else is using that rack of like 4 gpus or whatever they have, then you have to wait to get a gpu, but can always access without gpu
recommended?
Latent go brrrr
Ah I see, that seems like something AWS offers that runpod does not
But I guess cheap spenders can't be choosers
With AWS you can persist with no GPU compute currently running
yuh. some images work well with latent, others I use pixel then like 0.2 denoise to just sorta cram some extra details in
refiner is good for that as long as your lora isn't too distant
i mean you could do that too, run the cpu all the time without a gpu if you wanted
my lora is a little to complex for refiner I think
also @candid walrus you can view image generation paramters from a1111 and comfy using magick identify
So from like 4 years ago
If the rumors of the 5090 next year has 32GB memory are true, it would be nice
no man's sky RTX ON
$2000
here are 5 of the 10 demo high res fix images for my release. It should be out very soon
Left is base 1024x, right is my 2048x high res fix pass
??
getting some Annihilation vibes 🙂
no man's sky is a game
😄 😄
deconstructed model shoot
i'll check it out first thing in the morning, it's just going to be your main branch on your Sytan-SDXL-ComfyUI github right?
Annihilation was such a weird movie. Cool body horror for the first half then just anime bs for the ending
yes, and on reddit
previous shot
and comfy is releasing it on the example wiki timorrow
nice, looking forward to it
before the crash
she's running headless to reduce vram overhead
haha
@primal vault can I dm about your workflow?
is this your dog? 🙂
really like your images. lots of dynamic actions and atmosphere
Aw thanks!
is there a lib for comfy to generate forever (not add to queue), but something that on completion askes for another and just keeps going?
they're all like, 'this pos spaceship, got to push this thing again to get it started'
"Use the force"
yeah! it broke in the last frame
haha
i swear comfy had that in a right click menu somewhere
this mysterious unlabled checkbox generates forever
yeah i know a1111 has a right click for it, but i've just been loading up the queue before i step away, but it really starts to lag once i get past like 300 to add them
oh snap!!!
uncheck it while it's running to stop the gen forever
next you're gonna tell me you didn't know you could use control and arrows to adjust the weight of words
Does anyone know of a good program for captioning image sets manually?
actually i stumbled on that by accident 2 days ago
i was trying to select like a whole word or line or something like i do on my mac and it wasn't working, bceause windows, and suddenly it starts ticking up
Is it me or does it take around 9 hours to train a lora on a 3090?
i'm like damn, i didn't know comfy weighted, this is GREAT
damn. oh well, at least you already know you can comment out parts of your prompt with /* C markers */
// like this
follow Caith's guide in the finetune channel, it should take 15-45 minutes depending on size
ha i didn't know that one
@upbeat summit I have dm’d you
I have a question for you guys: any specific styles or subjects that are not very well represented in SDXL 1.0?
damn. I almost didn't know you could use brackets to {get|random|chosen|tokens} into your prompts, but I did
legend, thank you 🙂 #🔧|finetune
that's random right, damn comfy even does that out of the box
i had to install an extention for that on a1111
that's going to pair nicely with 'forever gen checkbox'
yea all it's missing is easier grid and inpainting workflows and it can completely replace a1111 for me
ah shit I think the new controlnet poser is specifically a1111 too I'll have to look
What is that
Oh you mean to create openpose images
specifically this one https://github.com/huchenlei/sd-webui-openpose-editor has hands and face support
I know there's a 3d standalone one but it's the buggiest fucking software I've ever used in my life and I installed Windows 8 the day it came out
Great success!
stupid question but is a 100 steps per image still the norm when training loras?
there's a full lora faq thing somewhere Caith always links to it
they got out safely
is there a way/node in comfyUI that lets me generate a batch of image at once, with one variable changing ? like cfg 4,5,6,...,10 ?
if you set cfg to an input and attach a primitive node you can set it to increment each generation
yes, like randomised seed, you can use randomised cfg for every render
then you can set "batch count" in extra options to like 10
Don't think "batch size" on latent input will work since the counts only randomize/increment/decrement every new batch
i made it a input node
and connected to a primitive
i got there value and control_after_generated as parameters
control-after-genrated increment
i can change that to increment, yes. how do i tell it to increment by 0.5 ?
cfg has a step of like 0.5 by default I think
how do i tell it to increment by 0.4 ?
or just manually change the cfg and press Queue each time, they all go into a queue and gen
that's the neat part, you don't
why is that the neat part ?
what i want to do is run one generation with one seed from 4.0 to 6.5, and then do the same for the next seed
its a meme
first.
damn I'll get you next time
can't catch this
or even better, as a matrix, i wanne do all combinations of cfg 4.0 to 6.5 (+0.5) and start_at_step 5 to 15 (+3)
I just rewatched it so it's fresh in my brain
nope.
mcmonkey or someone said there was a semi-official-kinda grid extension coming for it
so its not possible right now, no ?
is it possible to start the whole thing with parameters somehow from the outside ? like an api-call or something ?
yes
so could i realize it that way ?
Probably
how do i do this?
think it's basically just sending a serialized json over socket to the comfy daemon
so i gotta export my comfyUI thing into a json first, yes ?
yea you could probably just make the node graph in the actual ui then just use the python standard library's json lib to easily edit the values you want before sending it to comfy
how do i export my comfyUI thing into a json ?
never done it personally so may be more steps
the big "save" button
every image you make also has its params embedded as json data under the prompts meta field
baller
can i see in the graphical UI which node has which id ?
probably easier to just ctrl-f and search by title in the json
ah, that may work, yes
alternatively if you wanna be really really lazy you could just copy-paste the whole json into your python file and use an f-string to just replace the specific parts. no json deserialization needed
gonna have a 30 kb python file though
did you make a fine-tuning of images of your dog?
Put myself in the blade runner universe
hi all - does anyone know how to perform a "re-imagine" of a room?
anyone?
SDXL Is Amazing!
its just called img2img
or you can inpaint
use lots of noise, but try to stay clear of 0.9, use lower values, unless you use it on a single color area
do we still use the 100_(name) folder when training loras? or is this outdated?
wings on objects, seems like the model cannot put wings normally, it always looks cropped
oh ok!
and 100% of the time misalligned, on the back from all sides, if ya made a lora/emb id really apreaciate it
man this sucks ahahahaha why can't i replace my 6gb vram to more than taht
im even afraid to solder more vram because hell knows what will happen to the laptop
thanks ill give it a try
and vram isn't that expensive anymore! freaking hell
~50% of the gpu's cost 5 years ago was vram(less on lower)
now its like 10$(expensive type) per gb
and they still only go with 6 to 12
cant wait
why batmans sad?
sad? that's his happy face
it seems a little odd, can you try to do a 2 stage k-sampler img?
add the comics style in the 2nd one
i noticed that you lose a lot of quality with the comics if it exists in the first few steps, it tried to purge details sometimes
sorry, I don't understand what you ask me to do. I am using searge reborn comfy setup: https://github.com/SeargeDP/SeargeSDXL/tree/main/example
They say a watched pot never boils, but i'm 90% sure SDXL speeds up if i stare at the UI. 
Just posted my first two loras 'Grumpy Style' and 'Sparkling' to civitai https://civitai.com/user/humblemikey/models
lol yep
i have prompts in which the grumpy one would fit nicely.
sort of interesting even if it wasn't what i asked gpt4 ```Here's a high-level mapping of the changes:
Text Embedding and Projection: Both models use text encoders, but the new model has a larger embedding size (1280 vs 768) and includes a text projection layer. This suggests that the new model might be better at handling larger vocabularies or more complex language tasks.
Transformer Architecture: Both models use transformer architectures, but the new model seems to have a more complex setup. The new model includes a 'conditioner' component, which might be used to condition the model's outputs on some input, such as a text prompt.
Image Generation and Diffusion Models: Both models include components for image generation, but the new model seems to have a more complex setup with a 'first_stage_model' and a 'diffusion_model'. The 'first_stage_model' includes an encoder and a decoder, which are typical components of autoencoder architectures used for tasks like image generation. The 'diffusion_model' might be used to add noise to the model's outputs, which can help generate more diverse images.
Increased Complexity: The new model has more layers and larger tensor sizes, suggesting that it is more complex and potentially more powerful than the old model. This could mean that the new model is better at handling complex tasks, but it might also be more computationally intensive.
Attention Mechanism: Both models use attention mechanisms, but the new model seems to use them more extensively. This could mean that the new model is better at focusing on relevant parts of the input when making predictions.``` sd 2.1 vs sdxl
mmm well, nvm then
did someone have success in auto 1111 with the refiner?
Something beautiful using the R-B-R triple-process in ComfyUI
Any advantage in that setup?
yeh? just need to interupt the base generation when its about 80% complete then img2img it
really? oh crazy
it might work with steps too? like do 35 steps in base, then move to img2img and do another like 10 steps or sth like that
did u try that?
yeh it also works pretty well just doing normal img2img as well
Detweiler's R-B-R Triple-Process uses 3 steps Refiner, then 9 steps Base; then a further 8 steps Refiner - 20 steps overall.
normal img2img erases my details with the refiner
base
yeh it has a tendency to do that a bit which is why you cancel early to leave some noise in the base image
refiner, a lot of image loss
oh ok , maybe i can play around with steps too, will give it a try 😄
HydrusNetwork = best program. but takes a bit of effort to learn the first time. scales well from super small, till giant datasets for fast manual tagging.
https://hydrusnetwork.github.io/hydrus/index.html
FastCaption = super easy to use. 0 learning curve. (does take me 4x as long as on hydrus network, for a dataset of 40 images though - but that's cause I've properly learned hydrus)
https://github.com/lukemoore66/FastCaption
dunno i just generated a lot of images with all the same settings (different seeds) but the overall quality and prompt understanding was always way better in auto , i have no idea why
i have doubts on your giant dataset claim
Dont click if you dont like spiders
but cool if true
its just a personal feel, never would say thats "data"
20~40 is the new average, if you use my settings. depends on how good your tagging is.
to clearify that 😄
Spoiler-the-Spider
(giant datasets to me are 1 billion + images, large are 1 million, medium are 100k, small is everything below that)
Spider Warning ⚠️
my few images are not even small 😄 , its just a feeling haha
I feel slightly offended that my 70k cosplay images dataset is considered 'small' 🤣
also rip my miniscule raccoon dataset. only 6k images ❤️
my last scrape i deleted 900k images heh i really need to spin up a database to store latents, wasting a day everytime i extend my dataset is so dumb
that moment when you need a stack of A100 stacks to finetune your model. rip. XD
hopefully auto will implement the refiner so its usable, sounds not really like an option stopping at 80 percent
u have to time it by hand , right?
or is there any option to do it automatically?
yeh have to do it manually unless you patch the code
i think you need a higher denoise strength
unless that is the look you are going for
you can't. a1111 doesn't support keeping the noise.
you can stop early (without noise) - then pass that on.
but if you're already making concessions, might as well go base only on a1111.
dpm 2m sde + karras + cfg 6 + 25/50 steps
this was made by comfy, was more a reaction that we have to do anyhing manually 😄
i feel like Base and highresfix giving similar results to the refiner
if u upscale to like 1,2x
sytan will post his workflow today - highres fully working and looking damn nice.
even has better lora support than any other base+refiner setup, due to how he implemented it
oh i think its finally clicking why some people are so fanatical about comfy.. it allows them to become experts
My workflow is decent too now id say
At least im getting the results im expecting mostly
How to deal with token length in SDXL which is now restricted to 77?
the token length is the same in sdxl and sd
^ this
you chunk and average the chunks
which is why you usually want to stay below 75 tokens
serious, serious, serious, serious, serious, 1girl
Great job on SDXL! 🙂 We're have an open source project that runs automatic web searches and then evolves solutions to problems using GPT-4. We're then using GPT-4 to create image prompts for each solution and creating 400-500 images using SDXL API - it works really well! This is a use case that could never be accomplished with out AI images! You can check out he project and the generated images here: https://policy-synth.ai/projects/1/
cool project but most people would consider that spam
off-topic maybe?
Tried to make a comparison of the R-B-R (left) workflow versus B-R (right). Prompt was a tapestry of a cat playing volleyball, masterpiece, detailed, masterwork, excellent craft, same seed for both. R1 (3 steps), Base (9 steps) R2 (8 steps) versus Base (12 steps), R1 (8 steps). Ran using Diffusers
i dont see a promotion channel on this server
well it's an open source project run by a nonprofit foundation, we're not really promoting anything
then here its ok since its related to sdxl
just wanted to share this as the SDXL API is working so well for us 🙂
Alright, preparing some final things. Workflow will drop in a few minutes
sdxl 1.0 is still pretty undertrained imo
eyes, anatomy in general are really underdeveloped
undertrained and overtrained at the same time
until you use my res fix :p
@soft zealot You can start sharing your stuff
yeh you can nudge it to do the right thing but thats just a bandaid
im guess im not able to use a high res fix on my hardware anyway
you have to compare apples to apples
SDXL1 is already far superior to how SDXL1.5 was at launch
You shouldnt be comparong to Community trained & fettled 1.5 or 2.1 models
😄
cause those areas were passed on to the refiner. refiner is overtrained in exactly those areas
i would say its superior to 1.2 comparing apples to apples... 1.5 was pretty good for the model arch
chonk
good chonk ❤️
i do appreciate how accurate sdxl 1.0 has been with cat breeds... that was unexpected
hi guys, i have a question, does anybody ever tried SDXL lora training using learning rate 4e-7? can I know if there are any differences between 4e-7 and 4e-4 in terms of output artifacts and overfitting speed?
makes me want to go scrape wikipedia and yoink all of their images/captions
ahem...
4e-7 is intended for full finetuning - which is not LoRA
for LoRA you can use a lot higher settings
Hi guys! Is comming a specific model for inpainting? Actual sdxl-1-base give weird results
@spring fulcrum
#🔧|finetune message
and yes, there are differences, but they depend on a lot of variables. training speed only matters in context of the other settings
ahhh, I have so many butterflies in my stomach haha
hundreds of hours of work now, all in once place!
Is it possible to use SDXL loras on Auto1111?
congrats on shipping!
I am pooped lol
Many thanks to @high skiff for all his work (https://github.com/SytanSD/Sytan-SDXL-ComfyUI) and I'm pleased to report that his method of upscaling (at least on my 1080Ti) runs around 100% quicker than the method I was using with no loss of detail.
Previously E2E generation of a 1024x1024 plus a 2048x2048 would take 500-600 seconds, using this method that is now 200-250 seconds(depending non denoise value selected, I use between 15% & 20% so minimal changes from the original) for the same/better quality.Those of you on newer hardware YMMV as to the percentage gains seen.
My previous workflow was based on Sytans V0.5 and I have now adapted it to include his HRF/Upscale method from V1 adding a few tweaks of my own such as a box you can type the denoise % in directly rather then calculating manually in the Ksampler and also a universal seed input to all relevant nodes.
My workflow also includes a preconditioning step,Lora loader( I like to use the example offset LORA at a max of 0.6 from https://huggingface.co/stabilityai/stable-diffusion-xl-base-1.0/tree/main ) and a feature to write the Prompts data and seeed (still working on other details) to a .txt file alongside the generated images (matching file names).
Will be posting some quick & dirty examples in various style along with prompts used to generate them shortly.
A hub dedicated to development and upkeep of the Sytan SDXL workflow for ComfyUI - GitHub - SytanSD/Sytan-SDXL-ComfyUI: A hub dedicated to development and upkeep of the Sytan SDXL workflow for ComfyUI
Hi
what do you guys to get realistic skin and not just plastic, airbrushed looks? out of the box it pretty much looks the same to me as when earlier versions was up on the bots:
