#✨|sdxl
why you are using this if I may?
dude in the video used it
i mean what is the advantage over the stock node?
idk, never used comfy before in my life until today
Depends. Right now I'm using the refiner extension for A1111 and I typically have it set to somewhere between 20-40% of the steps. If I were using Comfy, I'd set the CFG to no greater than 4 (probably 2), and would put my steps at about 8 out of 21 total. But it very much depends on what I'm going for. Oftentimes, I'll run a plot script to find the look I'm going for and tune from there.
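A minimal sketch of the step split described above, assuming "8 out of 21" means the refiner gets 8 of the 21 total steps (roughly 38%, inside the 20-40% range mentioned for A1111). The function name `split_steps` is made up for illustration and is not part of A1111 or ComfyUI.

```python
def split_steps(total_steps: int, refiner_fraction: float) -> tuple[int, int]:
    """Return (base_steps, refiner_steps) for a base + refiner run.

    refiner_fraction is the share of the total steps handed to the refiner,
    e.g. 0.2 to 0.4 for the 20-40% range mentioned in the chat.
    """
    if not 0.0 <= refiner_fraction <= 1.0:
        raise ValueError("refiner_fraction must be between 0 and 1")
    refiner_steps = round(total_steps * refiner_fraction)
    return total_steps - refiner_steps, refiner_steps


# Assumed reading of "8 out of 21": 13 base steps, then 8 refiner steps.
base_steps, refiner_steps = split_steps(21, 8 / 21)
```

The same helper covers the A1111 case: pass the percentage as a fraction and read off where the refiner would take over.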
thanks
my gpu is too shit for a plot so imma just use those values lol
I'm doing all of mine on a 3080.
i got a 1080 ti
Job done
Hi folks, I've just finished training my character lora based on SDXL 1.0. It's on huggingface with the hosted inference API, link below:
https://huggingface.co/frank-chieng/michelleyeoh
you can follow the README and download the lora of your choice to do a refine img2img pass. huggingface is more interactive and mature than civitai
what should i do to fix this?
Inpainting.
how do i do that
Depends...what are you using?
this is what i did without any refiner at all on a finetuned model
tested upscaling too
stable diffusion checkpoint
No...as in, what tool?
text2img
I am gonna try making my dataset 4x bigger with some refined captions, and much more aggressive training, cause damn this is already amazing
THIS
Should have a button for send to inpaint.
yeah, doing img2img without refiner yields amazing results too
ok, now what?
Once it's in inpaint, use the tool in the image screen to mark the area you want re-done. Choose "only masked", adjust settings, click generate.
in ComfyUI
the refiner is absolutely destroying anime faces
idk if i'm doing it wrong
not just that! but everything
i think they need to rename to the definer
"base with no mods only refiner"
or disfiner
so do people just use the finetuned versions without any refiner at all ?
I use a CUI img2img wf
i use vanilla SDXL
and now my MODEL!!
jaja
lolwtf
why?
why not ? Chocolate is good
more like chocolate pudding
chocolate you say!!
I love it
That's a winner
@plush drift I love you! ❤️
I never even considered running 100 steps XD yeah. samples are working properly now
just tried it
Great! 😂
I am dying over here!! someone stop me please hahahahah I can't stop making these amazing funny things
I am happy that it was a helpful tip, cuz honestly, I will never understand why someone will spend 40 hours on training a lora
and then... oops,
it was poop
:"D you just described 50% of my loras
but tbf, I'm running various configs to see how it works out
you are running it locally? or GC?
locally. rtx4090 - so I can get through my tests relatively quickly
yes!
and my big 20~30h loras - I already know will work. But I'm just checking the progress of my database expansion
this is the way
no matter how hard i try, this hole in her suit will not go away
at 4k images it takes a full day. dont even wanna think about the final form of 30~50k images
might as well do a finetune at that point XD and see how close loras get to full tunes
adjust the prompt to be more focused on the suit, tune up the cfg a touch
I do not want to be an ass, as I have my personal recipes! but it's waaaaaaaaaaaaaaaaaay below that
trust me, more steps and longer time doesn't mean a better lora
let me guess, its all about the right dataset
goal is to stay super flexible - and yeah, not like 10 epochs + 1 repeat is "a lot" of steps
but I'm training multiple hundreds of concepts
in one lora?
its why I keep making loras for every 1k expansion, making sure my weights stay evenly distributed
yep. my current iteration has a bit over 100 concepts full learned, and fully flexible
not training via overfitting
so poses stay true to sdxl base
a mix of both, but sure, the right dataset is a major player
training via overfitting now takes me exactly 14minutes, for a lora that produces more than 80% good images
Are models fine-tuned from SDXL better than the SD1.5 ones? Or do they feel the same?
god i wish i had a 4090 to do that
finally can run controlnet for SDXL in comfyui 😆
pitstop! this is so fucking cute!!! must share!! ♥️
16gb is all you need. and you only lose like 4 min
Well use colab?
so I dont think anyone complains about waiting a whole 20 min
can you elaborate more on this?
the base sdxl model produces better looking anime images than the finetune, who would've thought
lets be real, I was using 3060 till last month and I got a 4090~! let's be real please hahahaha it's a beast
how will i lose 4 minutes
it takes hours on my setup
nowhere near 20 minute lora training
i dont have room to experiment with lora params because it will take a day to just train 1-2
no money for colab or renting gpus either
any chance there's a hidden gem comfyui node out there that would overlay a text string onto the image? e.g. add a caption to the image at x_y coords using python/pillow?
Wait you guys pay for google colab?
caucasian/asian/african/indian/etc.. <- so images of "woman" aren't biased towards any ethnicity
age groups are now properly working: 'girl' now produces images from ages 16~25. 'woman' produces 30~40.
I'm using more than 50% real photographs of normal people, so no bias towards studio like setups, or model photoshoots
outdoor/indoor tagging, so the concepts stay away from any other words
basically this lora is intended to remove the current existing biases on short prompts, and reinforce prompt control on longer prompts (so words aren't just casually ignored)
other than that, 'side view', 'rear view' and many posing words are trained via 500+ images, so they function really well, without being a control net clone
oh! so you are a nerd! i love it
ok now I get it, this is really some hard work cant wait to try your lora once it's ready
what should i say to get it to make more of this?
but make sure to nerdout on writing a proper bio or a guide so we can use it properly
use danbooru interrogate
its why I had to enable clip training 🥲 so that normal words are useable - else it genuinely would have needed a wiki XD
would you please stop spamming these children photos
wait! CLIP? I thought it's not working with SDXL
dude why does the refiner always nuke faces
yep. a whole guide on lora training will drop eventually ^^ for simple till ultra complex training
its why stuff like this gets posted occasionally XD
@shy kelp @buoyant flame kindly stop spamming children creepy photos
thank you
the face detailer strikes
Wtf
Don’t use the refiner problem solved lol. Refiner is only good at realistic stuff
💀
I'm assuming you found the same thing?
close
last i read you were having issues where you couldn't get your loss to go down. bullet list: what have you changed since then?
it seems as tho the issue was CUDNN
its either that, or a code error I had with my captions
regardless, I seem to have fixed it
my LoRA's were legit empty
like, learned nothing through over an hour of training
indistinguishable from base SDXL
skipped CUDNN as a whole
yeah that wild
I did have an error saying it couldn't find my captions, but without captions, it should have learned way more destructively, not less... so IDK
I did like 3 dozen runs
so far the very early tests for this realism LoRA are showing insane promise
like this one
Identical prompt, seed, everything, just with the LoRA enabled on the second
so skipping CUDNN, was that a setting in kohya or like a flag?
You have to do a whole separate process now to do it
(not that i have the same issue, just interested in the workings of it)
second one less cinematic to the max
so I just skipped that process
the point is the second one looks like a real person, not some random plastic guy
reminds me of
yes
You might have missed this setting for your caption extension. The default wasn't txt as I remember
are you using kohya?
this guy from spongebob lol
the problem was I used a friends caption script, which was a little bugged
so it named the caption files .jpg.txt or .webp.txt
Which Kohya in turn said it couldn't read
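A small sketch of how caption files named like this could be repaired, assuming the trainer expects `photo.txt` next to `photo.jpg`. The helper names and the list of image extensions are illustrative only and are not part of Kohya.

```python
from pathlib import Path

# Image extensions that the buggy script may have left inside caption names (assumed list).
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".webp"}


def fixed_caption_name(name: str) -> str:
    """Map 'photo.jpg.txt' to 'photo.txt'; any other name is returned unchanged."""
    p = Path(name)
    inner = Path(p.stem)  # 'photo.jpg' for 'photo.jpg.txt'
    if p.suffix.lower() == ".txt" and inner.suffix.lower() in IMAGE_SUFFIXES:
        return inner.stem + ".txt"
    return name


def fix_caption_files(folder: str) -> list[tuple[str, str]]:
    """Rename doubled-extension captions in one folder; returns (old, new) pairs."""
    renamed = []
    for path in sorted(Path(folder).glob("*.txt")):
        new_name = fixed_caption_name(path.name)
        # Skip files that are already fine, and never overwrite an existing caption.
        if new_name != path.name and not (path.parent / new_name).exists():
            path.rename(path.with_name(new_name))
            renamed.append((path.name, new_name))
    return renamed
```

Running `fix_caption_files` on a copy of the dataset folder first is the safer choice, since the rename happens in place.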
ah
you got a local windows install of kohya i'm guessing, so you've tweaked out the cudnn
no, I just didn't install it
installing it is an after step
it doesn't just come in kohya
gotcha
its a whole separate process you have to opt into
What command to use for generate images?
it even says now next to it as an option "Only do it if you have to, otherwise avoid"
Which I am not sure why it says that now... But it appears as tho I was in fact having issues cause of it
i'm currently just using the runpod with the setup .sh script, cudnn is already installed on the docker when i get there, the script then does tweak the install to match requirements but yeah, its all self contained there. I don't have enough vram to run local like id want it to
oh lord lol
are those bottom teeth or tic-tacs
less teeth = less tic-tacs
even on subjects, scenes, and clothes I didn't train on, with a prompt optimized for SDXL base realism, I still think my very early LoRA test is providing awesome results in terms of realism improvement
My name is Markiplier
ah damn, he deleted it lmao
the grain/scratchiness in my LoRA images is from under training, not the dataset images, just so you all know
the whole LoRA is very undertrained just to see if it would indeed do something
and it did!
how would i get the AI to render one of these
i'm just messing it does look mighty suggestive tho
So far the dataset is 60 images, trained for only 50 steps total per image
I wanna get that up to 200 images, 150 for humans, 50 for others
And I wanna restructure the tags as a whole, cause I know how to better trigger the LoRA's now
its like one of those clockwork doll keys
all images are at/around/above 4k res
maybe 'two letter Ps back to back' jk now i'm trolling lol
back to unsplash I go!
so with 50 steps total per image what does that put your epoch/repeats numbers
Hello every body.
I'm trying to use SDXL base 1.0 in the A1111 web UI and it doesn't work.
When I select the model in the checkpoint box it throws an error and goes back to the last checkpoint I had selected.
I use "git pull" to keep A1111 updated.
Thanks for any help.
10 epochs, 5 repeats
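For reference, the numbers in this thread work out as follows. This is a rough sketch, assuming kohya-style counting where every image is seen `repeats` times per epoch. The batch size was not given in the chat, so the default of 1 is an assumption, and `training_steps` is a made-up helper name.

```python
import math


def training_steps(num_images: int, repeats: int, epochs: int, batch_size: int = 1) -> tuple[int, int]:
    """Return (times each image is seen, total optimizer steps)."""
    seen_per_image = repeats * epochs
    steps_per_epoch = math.ceil(num_images * repeats / batch_size)
    return seen_per_image, steps_per_epoch * epochs


# 60 images, 5 repeats, 10 epochs: each image is seen 50 times.
per_image, total = training_steps(num_images=60, repeats=5, epochs=10)
```

With a larger batch size the per-image count stays at 50 while the total step count shrinks accordingly.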
best i can think of is try to use some sort of controlnet
like I said, very under trained haha
but god damn, is it already showing huge improvements
i mixed voxel lora and pixel lora
yes i like the style too
yeah i've done a disposable flash photo lora, on sdxl it started to change the faces, so my dataset isn't 'diverse' enough, though i then did try it on a 1.5 model and while it did change the faces less, i like how it did it at least. But yeah maybe this weekend i'll spend some time messing with the data set to get it more diverse
voxel syle, <lora:VoxelXL_v1:0.9>, a small village with people talking in front of the door, at night, backlight, miniature, tilt-shift, film, bokeh, professional, 4k, highly detailed, <lora:pixel-art-xl:0.75>
that'd be a dope video game
red was like 'there can be no shadows'
Shadown't
yoink what?
is that a yoink from a stock
oh, just a screenshot
its from unsplash
free high res/high quality stock photos
that one is a paid image tho
what if
was just admiring the BRIGHT
Was more ControlNet released for SDXL?
lmao
only canny now.....
I've now just linked those fields to my main width & height selectors
I would do the same, but for whatever reason comfy doesn't let me pair the value going from the width and height into the latent
so I have to have 2 different primitives for width and height
ahhh I'm not using Primitives because of that, I'm using an Int Node from Impact Pack (I think it is)
Yes it is. Experimental ImpactInt
sorry I misunderstood , you're on about splitting the output of empty latent image arent you?
You could do that with a custom node that reads the size from the latent but it's easier to have 2 inputs to it, either from a primitive or another node
oh no, I am just trying to use a basic primitive to control the size of the latent and the actual image width at once
its not that big of a deal
what is a big deal though is a new undocumented comfyui hotkey
ah which is what I do.
I have 2 ImpactInt nodes as the width and height inputs , They go to 2 reroutes in my busbar and then from there they split to the latent space and the text conditioners
I tried loading that cube image in comfy, it even claims bad models, although I select models I have installed
wait, like it disables it?
yeah like a lora, it just passes through instead
oh thats hot
SO HOT
that will make my data collection for my LoRA comparisons really easy
yeah lora wiring was a pita lol
what is this? AITemplateLoader
It says that node is missing
ah, so I cant run that workflow
oo thats hot
used face detailer in winston's workflow
did the lora get overtrained?
is there a way to install that?
If you said "in bed", then the lower left image worked
GitHub - FizzleDorf/AIT: This was originally written by: https://github.com/hlky
if you using SDXL make sure you download SDXL branch zip
i just want to check the face, but i dont think its like Selena Gomez
its not for the faint of heart who can't get things working on the first try and give up forever
I can't even figure out what dir to run that in, much less download a branch zip
@boreal bough
i installed it through Manager, but got a "not a valid win32 application" error.
As much as I want to try this out I feel the best way to run this is to wait for comfy to implement it. I’m too scared it’ll break something😅
its a hack job right now
I now git got it into my custom nodes folder...SDXL branch.zip Where is that found?
Lmao
Copy your comfy folder. Work on the copy.
I’m good lmao. My comfy folder probably like 30 gigs or something
@clever verge OK so trying this upgrade again and same thing as where I ended up.
For reference I deleted the entire original install and did a 100% fresh clean install
It all works as advertised providing I don't select sai-base in any of the 3 Primitives I use as remote controls
If I convert the Input to widgets and select sai-base same thing,
Ahhh thats interesting, in the new nodes its "base" not "sai-base" so maybe I do need to either replace the nodes or edit the jsons , hmmm
then if you need to revert to the copy because something broke, copy the models there
not needed anymore. we pushed it to master branch
maybe two issues going on here. your lora is little underfit and the facedetailer has settings that are putting the face a little off kilter
i feel like these detailer passes benefit from controlnet so much and today i want to try to get the canny model wired up to the facedetailer
OK, so I have git got the thing. Someone said the sdxl zip isn't needed, so I will try running this
Nightmare fuel
@clever verge renaming the "base" in the json to "sai-base" seems to have fixed it, although to be sure everything is aligned I'm going to completely replace all the nodes and primitives and use your default naming
i didnt get you
underfit lora meaning you didn't train long enough or had bad settings
thats great news
off kilter faces meaning when the facedetailer pass hits, it shifts the face a little to the left. like it looks pasted on
why do you need face detailer? are faces bad with whatever you are using?
sometimes. i find it helps. i don't "need" it. but sometimes with portraits it does so well.
it'll really drive eyes home
yeah true, even polished it
That's what it generated from your cube image
try heun sampler to see
honestly selena i think is a hard lora to make, her look has changed a lot through the years, so the data would have to be consistent to the look aimed for
nice! by how much did this boost your performance? I might work on a table showcasing it
No idea. I haven't been running comfy. It took a long time to generate the image
RUN!
@indigo carbon Have you tried the AIT VAE decode?
weird, that workflow should be a few seconds. what GPU do you have?
works only on FP16 VAEs
you can usually count on them to wing instructions, do one key thing wrong, and declare with certainty how things work
didnt use face-detailer
lol should have asked earlier instead of troubleshooting black screens
almost better. you might have a bad prompt or settings on your facedetailer pass then
didnt use facedetailer
i trained it using 16000 steps
thats what i mean. the ones with facedetailer might've been set up bad. those are better
doesn't seem like an underfit lora
if it's over fit then usually you'd see the subject in one pose, one expression, one outfit, one location
mode collapse
oh ok
you can use the AIT workflow I made and add load_VAE select SDXL_VAE_FP16, then use that on AITVAE decode/encode
i think you got a good lora there but that is a lot of steps for it. experiment with it at lower strengths to see what you get.
as for facedetailer, i find prompts like "an extreme closeup of subject, adjective eyes, adjective lips, expression, other detailing tokens" are good prompts
have you noticed improvements using that node?
how can i improve the lora?
not sure if worth the trouble was more experiments
piercings in the face detailer pass help a lot too for that goal
AITVae is available now?
don't think so. I'm generating images in ~10-12 seconds anyway
so, whatever
i dont know how face detailers work, so i used winstons
in a few cases i have. it's just another tool and isn't one i use on every generation. the workflow i got for it has two reroute nodes that i disconnect or connect when i want it
#✨|sdxl message this is what i set up
might be out of date
i'm still waiting for the day when the JWST catches a glimpse of one of these and it sees us see it
Hmm, sai-base is renamed to only base. Should be enough to just select base again instead of sai-base? I can test it to see if I get the same problem.
only on the FP-16-VAEs
I’m curious would this work with custom node samplers? Like impact pack and stuff that uses pipes
did you try it on my AIT workflow? I'm considering adding it
If so I may end up trying it.
i think ive tried a similar thing but i dont have that workflow atm
im suspecting some odd caching going on. If I went in and renamed base to sai-base all was happy, but if I put a new SDXLpromptstyle node in (after another clean install) it simply wouldn't accept an existing Primitive input, so I ended up with all new SDXLprompt style nodes & all new primitive nodes
they're out there waiting
Also do I just install the custom nodes from the GitHub or is there anything else involved with setting ait up?
i think clone the git, and load a workflow
takes some getting used to on the prompting and styling
just master branch, it will work. for live previews (with my workflow at least) you need the Comfy_manager live preview.
When one of those realizes I can see it...it will be terrified
Gotcha
Prompt work needs to improve on my end for sure
Why in an image of a cube do you have things like bad hands in negative?
Thinking about changing the handling of no matching style selected to revert to "base" style. Not sure if it fixes all your issue though.
in order to find out I would need to undo what I've just done, delete the upgrade, redo the upgrade...
errrm
🙂
Did you get this error in the console? "No template found with name '{template_name}'."
@trim orbit
no
they do look a little plasticy. have you tried generating with the lora at a lower strength?
@twri it literally just turned the style input red and went to prompt finished in like 0.5 seconds
0.7?
have not. i haven't given ait a shot yet since i've been playing with canny on sdxl a lot
on that cube image, if I make it a different size, it makes a stretched image. I must be missing something on the size
that's not what you want
I tried to reproduce but Comfy itself catches the error before my code even runs.
you didn't resize it properly. with AIT it's slightly complicated changing AR
I can't just change the size in the workflow?
oh well, Im sorted now ,. Just dont break it again lol
you can, but there are multiple values that need to be changed when changing size with AIT
you need to do math for that. in order to change the AR you need to change the upscale size and latent size. this is the reason I made two variations of my workflow- one for 4:3 and one for 1:1
what do i want
How do I set it to change?
There are nodes for upscale, but they don't show any numbers
probably no errors, lol. have you narrowed down what is causing the issue?
there is nothing in the upscale node properties to change
no will figure it out
yes there is.. if you don't understand how the AIT workflow works, you shouldn't be messing with the nodes too much.
try using just the one model for the face detailer. the top one, and leave the other two open
I want to change the size...don't want square images
I don't know the specifics, but that's how I got it to work for me
I'll try my best! Think I'm innocent this time, or so I hope. Possibly limitations of litegraph.
How does it work? If I don't understand now, I can never understand it or change it?
like this
lemme check
when I attempted to use the other two model inputs it gave me a barrage of errors
you can, change the values of the upscale to being exactly 2x the latent size, that's it
There are no values...at least I don't see any
I'm sure the other two models help with things, but I really don't know how to get them to work
i have these
well just try running it to the one input like I did. it might not be ideal but at least you'd narrow down what the issue is
you could also just wait for comfy to implement AIT himself. it wont require any knowledge or learning or nodes. its either on or off is how i read his comment
but what if I want to be required to have knowledge and learn and use nodes?
The nodes for upscale have no number settings, nothing to change as far as AR upscale, times to upscale
change ultralytics to mmdet?
if you have that option. I don't know why, it's just the only configuration I was able to get working without error
this works for everybody else.. you are the only one experiencing this. looks like a problem in your end
What problem? Are there supposed to be numbers there?
from what i'm seeing, changing sizes with AIT is a huge pain in the butt. i'm fine with the speeds i've got for now. i don't want to add more speed bumps to my workflows
yeah, there should be numbers.
pain in the butt, but how? What steps are taken to do it?
Odd that there aren't. Ive tried in three different browsers
i dont know. i just see this big back and forth about changing a size being difficult and needs to be done in multiple places. i'm sure i could get it done because it sounds straight forward, but it has to be done in the first place is my issue
there are different upscaler nodes. some have the option to adjust. some just use the default of the model
what browser do you run it in?
@gloomy barn what AR do you want, I'll make another version of my workflow for it.
I want to be able to change it based on the image
not have to have someone else make specific ARs
I believe by default they mostly do 4x
well i often change it multiple times in my sessions. thanks for the offer though. I'll keep an eye on what you're putting down you can be sure of it.
4x...that should be 4x the settings for width and height, no?
no, it's 2x
it doesn't matter the x
a 4x upscale means 4x the pixels. 2x each dimension on 2 axis
If I set 768 width and 1024 height, then it's whatever x those, no?
I think I can fix this by using some kind calculation nodes
instead, it seems to be using some pre-set value for the upscaled image
yes. what I do is 4x, then use "image scale by ratio"
ooo you're talking about automating all the different changes and only setting it once? brains
its 2048x1532
so it upscales by 4, then I multiply by 0.5 on each side
so brings out detail and then shrinks it
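The size arithmetic from this exchange can be sketched like this, assuming an upscale model that multiplies each side by 4 followed by an "image scale by ratio" of 0.5, so the result is 2x the latent size on each side. `upscaled_size` is a hypothetical helper, not a ComfyUI node.

```python
def upscaled_size(width: int, height: int, model_scale: float = 4.0, rescale: float = 0.5) -> tuple[int, int]:
    """Final image size after an upscale model followed by a ratio rescale."""
    factor = model_scale * rescale  # 4 * 0.5 = 2x per side with the defaults
    return round(width * factor), round(height * factor)


def megapixels(width: int, height: int) -> float:
    """Pixel count in millions, to compare the two routes."""
    return width * height / 1_000_000


final = upscaled_size(768, 1024)                       # upscale by 4, then shrink by 0.5
raw = upscaled_size(1024, 1024, rescale=1.0)           # the 4x output without the shrink
```

Skipping the 0.5 rescale on a 1024x1024 image is where the roughly 16 megapixel output comes from.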
yeah, we might be able to predate comfy's solution for AIT
I just always do 4:3, so this didn't affect me
I dont mind having to set it....but how?
4:3 tends to cut off people...not show them entirely
tall ones i just sorta eyeball in.
with 1.5 maybe but not with sdxl and king level prompting. maybe even lord level.
try that
eyeball what? how is it set? How is the final output size set if not calculated from the latent input numbers?
hell i bet a landed gentry could get away with uncropped xl generations at 4:3
I will try some with the default values
I like what the upscalers do, but then I don't necessarily want 16 megapixel image
this is a weird personal workflow i've developed for finding a good odd cropping i like. get a sweet phoot with a sweet subject. start+shift+s to get a screenshot selection, copy the subject with an eyeballed framing. save it. take that image's AR and calculate from there
sweet phoot
I cAN CALCULATE
mathematical!
How do I set it? How do I put in the numbers in the workflow?
you gotta have that sweet phoot first tho
the entire workflow breaks without it
I know the AR I want
I'm making a version of my workflow that doesn't need me making different variations for different ARs
sometimes the image will make an AR thats weird like 4:13.48 so i'll just round it
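The "take that image's AR and calculate from there" step could look roughly like the sketch below. Two things here are assumptions and not from the chat: keeping the pixel count near 1024x1024, and snapping each side to a multiple of 64. The snapping also handles the "just round it" part for odd ratios.

```python
import math


def latent_size_for_crop(crop_w: int, crop_h: int,
                         target_pixels: int = 1024 * 1024,
                         multiple: int = 64) -> tuple[int, int]:
    """Pick a generation size that keeps the crop's aspect ratio (approximately)."""
    aspect = crop_w / crop_h
    height = math.sqrt(target_pixels / aspect)
    width = height * aspect

    def snap(value: float) -> int:
        # Round to the nearest allowed multiple, never below one multiple.
        return max(multiple, round(value / multiple) * multiple)

    return snap(width), snap(height)


# A 3:4 portrait crop, e.g. a 768x1024 screenshot selection.
size = latent_size_for_crop(768, 1024)
```

The returned pair would go into the width and height of the Empty Latent Image node.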
ok
just be patient, give me a sec
the pre-made workflows are a double edged sword
does anyone know how to call a comfyUI json file from a python program ?
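One way to do this is through ComfyUI's HTTP API. The sketch below assumes a local server on the default address 127.0.0.1:8188 and a workflow exported with "Save (API Format)", not the regular workflow JSON. The function names are illustrative, and the file name in the usage comment is a placeholder.

```python
import json
import urllib.request


def build_prompt_request(workflow: dict, server: str = "http://127.0.0.1:8188") -> urllib.request.Request:
    """Wrap an API-format workflow in a POST request for the /prompt endpoint."""
    payload = json.dumps({"prompt": workflow}).encode("utf-8")
    return urllib.request.Request(
        f"{server}/prompt",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def queue_workflow_file(path: str, server: str = "http://127.0.0.1:8188") -> dict:
    """Load a workflow JSON file and queue it on a running ComfyUI server."""
    with open(path, "r", encoding="utf-8") as f:
        workflow = json.load(f)
    with urllib.request.urlopen(build_prompt_request(workflow, server)) as response:
        return json.loads(response.read())


# Usage (needs ComfyUI running locally):
# result = queue_workflow_file("workflow_api.json")
```

Values such as the prompt text or seed can be changed in the loaded dict before queuing, since each node's inputs are plain JSON fields.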
on the one hand they make things more accessible. but then people have no idea what they're really doing, lol
sometimes i like to do that golden ratio plus one more square. that's a nice tall AR
i forget the math on that one
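A possible reading of the math, assuming "golden ratio plus one more square" means a golden rectangle (1 wide, φ tall) with one extra 1x1 square stacked on top. That gives a height of φ + 1, which equals φ², about 2.618 times the width. The helper name is made up.

```python
import math

PHI = (1 + math.sqrt(5)) / 2  # golden ratio, about 1.618


def golden_plus_square_ratio() -> float:
    """Height-to-width ratio of a golden rectangle with one extra square added."""
    return PHI + 1  # identical to PHI ** 2


def tall_height(width: int) -> int:
    """Height in pixels for a given width at that ratio (unrounded to any grid)."""
    return round(width * golden_plus_square_ratio())
```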
The fing hat
easy to not cut yourself on a blade once you know it's a sharp blade on both sides. just use it right.
people though.
or if you're perverse like me then do it this sort of way so on your front end you put in a whole number to multiply by rather than thinking about converting 🙂
part of me wants to learn more and make workflows like that. but then I feel like I'd have to deal with a bunch of irate people, if anyone actually used them
My img2img one on civitai have something like 330 downloads now and I've not had a single person come asking questions
maybe the key is using common nodes
mine are a mess
I made sure to include a bunch of notes though
can you share a link to that one soles? i'm looking for a new img2img one to try, sd ultimate has too much going on
no such thing in this workflow
This SDXL focused Image2Image process utilizes your desired SDXL base model with the SDXL Refiner. To ease the process, all steps are automatically...
why won't it do the fingers i want!? make it do the fing!
well you don't have to use that workflow, but do what brings you joy
I dont know how to make one from scratch.
well you also don't have to use every node in it
most of the workflows I've downloaded had some sort of conflict or node I didn't have, so swapped out for things I could use
oh you've noticed some of my response to people then 🙂
nah, there's a right and wrong way to approach it I think
https://www.youtube.com/watch?v=AbB33AxrcZo detweiler's intro to comfy is essential really
Today we cover the basics on how to use ComfyUI to create AI Art using stable diffusion models. This node based editor is an ideal workflow tool to learn how AI art is generated, but also how you can really mess with the internal elements much more than you can with any other AI Art interface out there today. #comfyUI #stablediffusion
Install ...
has all the pro tricks
it's perfectly fine to ask questions and try to figure out issues
other youtubers are just looking for clickbait and "like and subscribe" junk it seems. det puts out real lessons
my biggest trigger is people who cant or wont read
I dont see in the workflow how it's setting the final AR after upscale. It's not calculated - it's set. However, I can't see where it's set.
can be difficult to get all the custom_nodes installed if you aren't technical. when i set up winston's workflow i had to use chatgpt4 to rewrite some of the bits where things wouldn't import - but it only took a few minutes for each one to fix. and worth it
this is the video where i learned how to drag out from a connection and then it'll give you a menu to make the new node there, with all the ones specific to that output in quick pick
so that helped
______________ CHANGES EVERYTHING!!!!!!!!!!!!!!!!!!!!!
people have trouble thinking abstractly
this is why prompt engineers are the future
abstract people have trouble thinking
I don't know about all that
I remember nodes like this...but can't recall if it's from the past or future
fragmented people maybe, but I think there's a balance
Days of Future Past
nodes or nodules. whats more fun?
There is a place for setting the AR: Empty Latent Image. But that's it. If I set it to anything but what it is, then the resulting image is still the same AR as original before I changed those numbers - it stretches
@clever verge Im an idiot, Was just trying to work out why I wasn't getting my .txt file anymore.
Guess I forgot to connect from the new prompt nodes lol
https://youtu.be/kkYaikeLJdc 3 hours old. he's on this one a little late.
❤️ Check out Lambda here and sign up for their GPU Cloud: https://lambdalabs.com/papers
Try it here online: https://clipdrop.co/stable-diffusion
How to run it: https://stable-diffusion-art.com/sdxl-model/
Run on all platforms: https://github.com/AUTOMATIC1111/stable-diffusion-webui
Alternative, modular solution: https://github.com/comfyanonymou...
I am making a video: Introducing AI Image Generation
Soon, something like this will be available to use both online and run locally. I expect it will take substantial VRAM to run this software, to generate images. It is coming soon!
i didnt get you, i am very new to face optimizer
Yeah I follow this channel and he's been pretty late with these. Like even weeks late
in order to be successful on youtube you must be prepared to make extremely uncanny thumbnails of yourself for each video you make. so when people scroll they'll see your creepy exaggerated expressions and click
he might've been waiting on a real paper but there's just a report
I hate those stupid faces
since his channel is all about papers after all
tilt shift
Here's another video I will make: Introducing the Internet
Soon, we will be able to share information on a world wide...call it a web....of connected computers.
would love a HOMM game in this style
Mee too
well I actually just tried to use it for the first time the other day. just use "bbox_detector". don't use "sam_model_opt" or "segm_detector_opt"
it might not be ideal but it'll work
I want to know how to do what's shown at 4:09. Make a video of me, use it to make a video of some character.
when neuromancer came out in 1984 describing a global cyberspace that a significant amount of people spent their time in for the near future, William Gibson was called a mad man
qwerty, I just tried to connect to the other two model inputs in all possible configurations that my install will allow and I got errors each time. the same errors you got. or very similar
yes
he's talking about what controlnet has done for 1.5 , referencing another video about it.
In the future, you will think things that become real. We are in the very beginnings of such generation.
why not call it a mesh?
I see. However, I'd like to know how to get such good output out of controlnet with 1.5
Good idea.
that video was a popular one on reddit. i don't think the author broke it down. same techniques everyone uses though. controlnet and ebsynth and tailoring it up a bit
the outputs I've seen have been much less stable than that one
there's no pushbutton results with those. the unstable ones are just half assed efforts
the stable diffusion subreddit really doesn't appeal to me these days. maybe I just have a skewed perspective. but it seemed a bit more constructive several months ago
people are making such videos. I can't even get the AR changed on a workflow
good results are skill and eye for detail
good results are the culmination of many bad results
yeah the most talented people don't often show off their failures or rough works
many, many failures
I think that's the case with about anything
but then people from the outside only see the wins and think there's some magic ticket
we see what's on stage for others, then we compare what's going on backstage in our production, find ourselves lacking
can you share the workflow
one minute
I have to get this Brave homepage changed but cant figure out how to change it.
what does?
no need for me making new workflows for different ARs, now it adjusts itself for every AR
I want this videogame too
wohaaa - those are awesome!
voxel syle, lora:VoxelXL_v1:0.9, None A group of madmax shoting at a giant mecha in the desert with dunes and sandstorm, atomic explosion in the background, miniature, tilt-shift, film, bokeh, professional, 4k, highly detailed, lora:pixel-art-xl:0.75
cinematic 3d voxel
atomic explosion missing sadly
what a great prompt. nice work!
i'd play the hell out of it
@gloomy barn this should fix your issue.
Thanks for making it. I am trying it now.
lots of nonsense there, but just look at the face detailer
linear node paths
very efficient, guys
ComfyUI doesn't have an auto LoRA loader from the prompt?
new arrangement?
it works. For some reason it takes a long time to generate once it gets to ksampler
that was from a few days ago
ah I see - it mutated 😄
but this looks very clean
I feel like in a few months ComfyUI workflows are gonna look like an RGB rainbow with the amount of modifications some people have😂 I’m all for it. Love customizability.
what card you using
4070ti
I usually just clear it and start anew, lol
which pack to get?
searge nodes
that doesn't make sense. I didn't change the KSampler settings. that's a problem on your end, I built the workflow in a near perfect way for AIT..
maybe I have to reboot
looks like tinyterra nodes as well
and maybe impactnodes?
once you install node packs restart comfy and refresh your browser
this is about the speeds you should be getting =]
man, they're in the comfy manager
i'll check
btw, I don't actually know what the node packs are by heart. I just google the names of the nodes
I had sdnext and comfy running at the same time. I guess it doesn't like that.
fighting for resources
What's SDXL's max coherent resolution?
it depends - everything that fits in 1024x1024 is in spec
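A rough sketch of the arithmetic behind "fits in 1024x1024": keep the pixel count near 1024*1024 and snap both sides to a multiple of 64. The function name, the 64-pixel step and the rounding rule are assumptions made for illustration; none of it is taken from SDXL itself or from any workflow shared here.

```python
import math


def resolution_for_aspect(ar_w, ar_h, budget=1024 * 1024, step=64):
    """Return (width, height) close to `budget` pixels for the ratio ar_w:ar_h.

    Assumption: both sides are snapped to the nearest multiple of `step`.
    """
    ratio = ar_w / ar_h
    width = math.sqrt(budget * ratio)   # ideal width before snapping
    height = math.sqrt(budget / ratio)  # ideal height before snapping

    def snap(value):
        # Round to the nearest multiple of `step`, never below one step.
        return max(step, round(value / step) * step)

    return snap(width), snap(height)


if __name__ == "__main__":
    for ar in [(1, 1), (16, 9), (9, 16), (4, 3)]:
        print(ar, resolution_for_aspect(*ar))
```

Going well past that pixel budget (like native 1920x1024) is where the duplication and mutations tend to show up.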
Intel arc's version for WebUI is heavily improved atm.
I can now generate ridiculously high resolutions.
I made many images in native 1920x1024 - you get duplication and mutations, but when it works, the details are really good
i had a minute so installed ait and dragged one of @indigo carbon 's images into it. big huge boy generated in 40 seconds
that is quite remarkable
40 is a lot for my workflow.. what gpu are you using?
it must've just been warming up the first time. this one took me 16 seconds
16 seconds is great, AIT is definitely functioning correctly
AIT is still only being used for workflows without the refiner stage, right?
nope, it works on the refiner in my latest workflow =]
gentleman and a scholar
also my latest workflow supports any AR, unlike my previous AIT ones, same efficiency as the others built for specific ARs.
AR = ?
aspect ratio
testing now
no the last one is only pixel art at 0.75
If you make it, you will be rich
?
ah ,I see
biggest lava lamp
IRL, this kind of stuff dont exist at this size
it also doesn't look as cool
by how much did it speed up for you? I'm making a table for average speed up caused by AIT
i'll try to do a side by side test
is there an easy way to "turn off" AIT happening in this workflow so i can do an easy side by side test?
this should do it, no?
yeah
ok will do it 3 times after model loaded and give results for each
``rtx 3090
without AIT (4.32 - 4.60 it/s)
29.95
29.88
29.68
with AIT (5.65 - 6.55it/s)
24.42
22.40
22.25``
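For anyone filling in a speed-up table, a minimal sketch that turns the 3090 timings above into a single ratio. The numbers are the ones posted; averaging the three runs and dividing the means is just one reasonable way to summarize them, and the variable names are made up.

```python
from statistics import mean

# Seconds per generation on the RTX 3090, copied from the runs above.
without_ait = [29.95, 29.88, 29.68]
with_ait = [24.42, 22.40, 22.25]


def speedup(baseline, optimized):
    """Ratio of mean baseline time to mean optimized time (>1 means faster)."""
    return mean(baseline) / mean(optimized)


ratio = speedup(without_ait, with_ait)
time_saved = 1 - mean(with_ait) / mean(without_ait)

print(f"speed-up: {ratio:.2f}x, time saved: {time_saved:.1%}")
```

That works out to roughly a 1.3x speed-up for this card and workflow, which matches the jump in it/s.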
@indigo carbon
Similar result using 3090
Coming soon
what CUDA version are you on?
Does anyone know a reliable way to prompt for an image to look down on a character or up at a character? I've tried all kinds of variations and tried using more natural language etc., but it doesn't believe me.
I ran into that a lot, used a lot of negative prompting
Cuda compilation tools, release 12.0, V12.0.76
Build cuda_12.0.r12.0/compiler.31968024_0
you can get a bigger boost from AIT by switching to CUDA11.8
think that will break anything else? lol
no, you might need to reinstall pytorch though
I don't think so. I get similar result with 11.8
but somewhat faster, aren't they?
4.x to 5~6
my cobbled up sytan workflow is on left. ait on the right. i was wrestling with both and i really think there's a big quality tradeoff for the speed on ait.
it did seem like i was getting different results generally, didn't take time to evaluate better/worse
just apply ait to sytan workflow
those are different ARs and gen settings. AIT is reproducible, it should be about the same image by using AIT with same gen params
well yeah i was tuning and tweaking where i could on each. i just wasn't getting anything i thought was good quality. it all looks like low resolution bad upscales. probably just doing it wrong. maybe xiao is right and its the settings not the ait
I was able to apply AIT to my custom workflow and also speedup with correct setting.
..
those have better consistent details. i guess it just doesn't like how i approach settings. i don't understand the step counts being used anywhere in comfy really.
i still think there appears to be quality tradeoffs on yours
looks more like a gan upscale than a diffusion
does anybody know how to call a comfyUI json-file via python?
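One way that should work: ComfyUI exposes an HTTP endpoint, and a workflow exported with "Save (API Format)" can be POSTed to `/prompt`. The sketch below assumes a local server on the default `127.0.0.1:8188` and a file called `workflow_api.json` (both placeholders). The regular UI-format JSON will not work here, it has to be the API export.

```python
import json
import urllib.request

# Assumption: ComfyUI is running locally on its default port.
COMFY_URL = "http://127.0.0.1:8188/prompt"


def build_request(workflow, url=COMFY_URL):
    """Wrap an API-format workflow dict in the JSON body the server expects."""
    payload = json.dumps({"prompt": workflow}).encode("utf-8")
    return urllib.request.Request(
        url, data=payload, headers={"Content-Type": "application/json"}
    )


def queue_workflow(path, url=COMFY_URL):
    """Load a workflow saved in API format and queue it on the server."""
    with open(path, "r", encoding="utf-8") as f:
        workflow = json.load(f)
    with urllib.request.urlopen(build_request(workflow, url)) as resp:
        return json.loads(resp.read())


if __name__ == "__main__":
    # "workflow_api.json" is a placeholder file name.
    print(queue_workflow("workflow_api.json"))
```

Before queuing, you can edit the loaded dict (seed, prompt text, and so on) by node id, which is what makes it handy for batch runs.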
I already made a comparison of that. image with AIT looked slightly higher quality
i've seen yours. i can't replicate. lk99 situation i guess.
i've only spent the past hour with it of course.
lk99 is the new "cannot reproduce" meme, is it ? 🙂
I made my comparison on human face and texture. I don't think they are much different.
idk man, mine come out consistently extremely high quality..
oh wait
I know what's happening to you
you are using CPU ARG, aren't you @trim orbit
Are you using GPU only option? @indigo carbon
no, GPU ARG. it interprets seeds in a better way
no idea what that is. it's telling me that ait is just not ready for workflows. i dont think i would've set any option that said use cpu for anything
just default comfy. i've never touched params or nothing
it's automatically like that. you can set in ComfyUI\Nodes.py to have the GPU handle seeds. it's way better that way IMO
no, the RNG that interprets the seed number. the number itself is always by CPU
that's like "31337" as a seed delta mattering
even with same seed, prompts, everything- the images with GPU ARG are slightly more crisp
i dont think that would matter and i trust comfy to have the best option enabled by default if there were any clear benefits to either. i think it's just a thing people need to worry about to reproduce results across tests.
crispyness isn't my issue though? it just looks like when i use AIT they're coming out of a gan upscaler
maybe, it might be because im used to GPU ARG, it's what A1111 used
this looks like something a GAN upscaler will make?
AIT never seems to make results worse, if not a little better. it's just an insane performance boost
once you find the right tweaks its zoom zoom zoom
yeah, and my workflow seems to use most of them
honestly, i keep coming back to yours cause i keep breaking it
minor tweaks for awesome results
I did make it more flexible by making it work with any AR, but yeah
thanks for the tips
apologies to those dyed in the wool coders & devs in the audience but.......
You know that old chestnut of trying to work out which package a node has come from when you're selecting from already installed nodes/packs?
Just use flipping notepad++ and search in files for the name of it !!
Why didn't I think of this before
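The same find-in-files trick as a small script, for anyone without notepad++. It assumes the packs live as folders under a `custom_nodes` directory and that the node name appears literally somewhere in the pack's `.py` files; the function name and the example path are made up.

```python
from pathlib import Path


def find_node_pack(custom_nodes_dir, node_name):
    """Return the top-level folders whose .py files mention `node_name`."""
    root = Path(custom_nodes_dir)
    hits = set()
    for py_file in root.rglob("*.py"):
        try:
            text = py_file.read_text(encoding="utf-8", errors="ignore")
        except OSError:
            continue  # unreadable file, skip it
        if node_name in text:
            # First path component under the root is the pack folder.
            hits.add(py_file.relative_to(root).parts[0])
    return sorted(hits)


if __name__ == "__main__":
    # Example path and node name are placeholders.
    print(find_node_pack("ComfyUI/custom_nodes", "FaceDetailer"))
```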
And with standard 1.0 checkpoint... I'm getting about a 50% speed increase over not using AIT.
that means you didn't set it up properly. other people using AIT report making images in about 10 seconds.
sytans without ait and sytans with ait. i made that gpu change you said too. 19 seconds without AIT and 25 seconds with AIT. thats when i just transplant the nodes into sytan's workflow with the offset lora on the base
I'm on a 3060 12 GB model. And... the only change I made to the workflow was to flip the checkpoint to the one I had
it shouldn't decrease speeds.. something is wrong
Definitely agree no way this decreases speeds
^
i just lose all fine detail when i have it in play. there's definitely a trade off. i'm on a 4080 using xformers
FP16 right?
what does comfy use by default?
Yeah there are some detail workarounds for sure
i thought bf16 was preferred where hardware supports it and Ada definitely does
there was a daily challenge in a AI group today themed Animal magician. SDXL did a great job!
odd. for me it makes slightly better results.. also for @upbeat summit and @strong field
did someone say animal magician
Is it possible that AIT uses slightly more VRAM and causes the 12 GB to be exhausted?
That would slow it down.
I'm on a 4070TI, it increases performance over x2
3090 with same results
generates all the images I share here in about 12-ish seconds
Prompting makes all the difference. I would say AIT is temperamental
way worth it for the speed though IMO
AIT's speed is influenced by the prompt? That's interesting. Parts of the model are compiled and parts aren't?
I meant quality is impacted by prompt for me
I can’t run the same copy and paste prompts from other workflows
It’s weird honestly can’t say I understand that part
I need to learn more about what AIT is and how it changes the behavior of the model.
It seems like a model should be a model. If AIT affects the output of the model it would be useless for many tasks.
no, it's not compiling any models, it's an optimized optimization
when i up the step count to ones like yours on sytan's workflow, they make the exact same images with or without now. but i get 35 second either way.
the two workflows don't exactly match up settings well but i tried
i was using ddim before. i guess it doesn't do well there
A neural network to interpret raw CUDA code? That's very interesting. I need to read up on that.
it also doesn't affect prompts, the reason @strong field describes differences with the prompts is likely because they are not used to my workflow.
Definitely possible
AITemplate (AIT) is a Python framework that transforms deep neural networks into CUDA (NVIDIA GPU) / HIP (AMD GPU) C++ code for lightning-fast inference serving.
Oh, that makes more sense.
it does very well with the workflow I built, idk why you claim it to be slower. it's about 2x speed for most people.
it was slower that one time. it's really all over the place. some cases it's faster some it's on par. that one time it was slower.
maybe it's an ada architecture thing
on my workflow? I get what you're saying, the first generation is longer because it starts up. but after the first one they should all have a consistent x2-ish speed up
its not drop in thats for sure. it's causing a lot of hitches
First gen is always slow. Gotta load model
your workflow i'll try again
It’s definitely a tinker toy
like, my GPU should be slower than your 4080, (its a 4070TI) and I'm generating stuff like this in about 12-ish seconds
But I’ve had much better results after understanding what’s going on
what is AIT?
AITemplate (AIT) is a Python framework that transforms deep neural networks into CUDA (NVIDIA GPU) / HIP (AMD GPU) C++ code for lightning-fast inference serving.
Basically you write code similar to PyTorch and it converts it directly to CUDA or HIP.
It's not something you can just enable. The developer of the UI needs to change a lot of their code. Maybe Comfy supports it. I don't think A1111 supports it yet.
with your workflow i get 17 second generations with ait and 25 second generations without. but this is not something i'm seeing transfer to any other situation
I must stop doing these 😄
that's still a dramatic speed up
why? they are really great 🙂
SDXL's magic special effects are so good
I have successfully made other workflows using AIT with significant increases in generating
i added a lora to your layout and it broke it. what's with the clip skip on sdxl? isn't that a novel ai thing that doesn't matter
Because it's like a drug 🙂
then you are right at home here 😄
the camel (giraffe?) is awesome
thanks
So my realism LoRA is a huge success
no, just different interpretations. no quality difference what so ever. but hey, 17 seconds is much faster than 25 seconds.. this means it worked =]
I haven’t been able to just drop this into Searge, Sytan, or Winston workflows. I had to start from scratch and slowly experiment.
nice, even has correct finger crease lines
testing some more
you solved your problems so
also, damn my new LoRA is so much better at easy realism than SDXL base lol
Lol
what was the problem at the end?
loras do a really great job of shoving the model towards latent spaces it already knows very well. so a little lora goes a long way
That’s a relief. SDXL is at its worst right now. Only improvements left
i got a theory that even the most basic lora with minimal training, will improve outputs
not too sure, either CUDNN issues, or a bug with my captions
either way, it was hell to figure it out
base SDXL
with my LoRA
might I add, I asked specifically for dappled lighting on her face through an open window
can you share the json? I'm still having problems
can't tell you how many times I go to scroll zoom-in with my mouse scroll on discord after working in the ComfyUI....
my settings are not for general use, they are VERY specific to my LoRA, as I have multiple trigger words, caption dropout, and it's all tuned for my exact number of images, one more or less and it will destroy all of the gradient checkpointing and accumulation
it also doesn't appear to be ideal yet
The panning and zooming!
😂 male chicken magic
ok. I'm looking to train a style but it comes out very different from original images
having fun with these
most loras are baked with meta data the settings are in them. if you dont want to share them @high skiff keep that in mind
I know, the LoRA is far from done
i see everyone having fun with loras
the current dataset is 90 images, I wanna go for at least 200 if I can
oh, i also trained it on a few animal images, so it also understands how to apply realism to other scenes too
like here
I can't train a good Lora. Not that fun......
@indigo carbon Do you think Loras can be integrated into an AIT workflow at some point? They can be used now, but for every Lora the generation process takes longer to actually start.
I asked it for a medium close up photograph of a pretty french woman with brown hair standing in a forest at twilight with a black dress on
Base SDXL vs my LoRA
possibly. might need to make more modules though
both got the shot right, mine got the time of day right, mine looks more realistic
it takes almost 10 seconds before the gen starts for me with a lora chained in the AIT pipeline
raccoon in a pine forest
SDXL vs my LoRA
i like the base better but that's my preference for warm tones instead. kodak baybi
why does it look so baked
I can agree, but I asked for twilight, not sunset, which is an issue
which one, base?
yea
The whole point of this LoRA is to prompt it as easy as possible
just a linguistic prompt
"A medium shot of a pretty woman in a forest"
Nothing else needed for realism
SDXL needs a lot of hoopla for good realism base images
i'll credit base with twilight. the sun is so low on that tree canopy that it's minutes from
can you do a green/purple one ?
long shadows are the harbinger of the twilight zone
thats not twilight, twilight is blue hour, it depicted golden hour, which is sun above horizon, in all ways its incorrect
here, let me get some more examples
I only did a few before I had to share haha
I’ve already achieved Lora injection into the workflow
oh i see what you're saying
yeah
yes. it's minutes from. i live in the woods. that evokes twilight to me because it's about to begin
I do like the sunset flare in the image, but I didn't ask for sunset
gets my stamp
but does it freeze for you too? It takes 5-10 seconds before it starts generating if a lora is used
very subjective takes either way. both images are fine and dandy. i don't know why you think yours is a slam dunk. honestly
it looks no different than another word added
what did you use to upscale ?
Agreed
latent, iterative or just upscaler ?
I do everything in ComfyUI
i know i saw the filename 😂
i switched to comfy as well because it's so much faster
I didn’t notice a slow down. Not at comp right now but I can check later
again i like that first one with the warmer kodak tones, but that's ENTIRELY a subjective preference. color temperature is something people have argued over since photography began. second coon's skull looks wrong too
If A1111 ever updates the way they handle VRAM it will be the same speed as Comfy.
This was originally written by: https://github.com/hlky - GitHub - FizzleDorf/AIT at sdxl
Beware, it’s a rabbit hole
I was also able to achieve sdxl style injections with good results
i don't know. i tried using it in a couple existing workflows i have. there's a whole different theory behind the step counts and i can't recreate quality when i try to do what i already know
looking at 70 step count images like "what?"
30% (22% to 37% for me) depending on what i'm doing
I want to go down the rabbit hole, but they still don't support SM75, so, I wait.
I didn't say it was, tho the LoRA is supposed to follow realism, which it does considerably better
Tho these images are not cherry picked, and are instead first gen comparisons, so I don't always show the best
depends. on my 4070ti it is over double speed..
Yes I noticed that too. The step counts matter ALOT but I don’t yet understand why
would you look at that, supreme being making a lora
What, how did you get footage of Sytan?
Lol
the skinwalker forgot to put on his skin
at lower step counts, numbers like 20, 30 being high, seems i get garbage when ait is enabled
From 3it/s (ish) to 6.9it/s
looks like gan
but how much work is it to do tho
50 / 25 are the lowest I could get
as of now if you have a 4000 series GPU it's as simple as adding a normal node
oh. sorry, not supported..
Pretty simple. Git clone the extension and use TDGs workflow
Oops yeah Sm80 only
sm what
sm dn
SM80= 3000/4000 series
I cannot believe the background coherency of my LoRA
its almost more impressive than the actual subject realism lmao
/sdxl generate a cat sitting on a couch
what does it do ?
base SDXL vs my LoRA
I'm holding out for better AMD support before I upgrade from my 3060. $1500+ for 24 GB VRAM is too much.
improves things not dying in the background of images haha
let me get a more extreme example
2024 also brings a new gen
no point buying anything now unless you got too much money or no gpu at all
@high skiff you keep saying realism but that's an art style that i don't think you're going for right?
Ima be honest, from other Lora’s that I’ve worked with and made it seems no matter what, Lora’s improve coherence and uniformity. Lol
photo realism
oh, I'm sure
I am just glad in this case that its at least realistic coherence, rather than the plastic look of normal base SDXL



