#✨|sdxl
I mean I'm used to these sounds from bad motherboard sound cards that are just not shielded correctly. it sounds very similar. but in this case - it's the hardware itself making the beep boop beep chirping sounds
and I thought I posted something beautiful: got fried directly 😅
yea 100% you can get a cheap external DAC on Amazon for like $40 to get rid of electrical interference on your headphones
won't fix the GPU microwaving the air though
don't be jealous 🙂
oh no - it's microwave rays? man, if I would have known that I could have made popcorn all the time when exploring the latent space
Wait a second. I think you gave me an idea.... Is there already a deep fried lora?
just put your cfg to 15 - done
So i order my chicken at CFG 15?
both my 1070 and XTX start to whine in uncapped game menus so it's a pretty universal issue. My 560 ti was the only card that didn't seem to whine, probably just not juicy enough.
yeah... 18 might make it too dry
So much to learn
XTX whines in AutoGPTQ and slightly in Exllama but not in any stable diffusion programs.
Someone try this mexican day of the dead, cemetary , mexican sugar candy, day of the dead tacos, day of the dead guacamole, day of the dead taquitos, halloween, fog, flat lay food styling, style of martha stewart
nah it's 5G waves. Too high frequency for popcorn.
be careful if you have vaccinated people in your room while your GPU is loaded though, it might superheat the chips in their blood.
bro, I live for superheating the chips in my blood.
oh I heard about that but somehow nothing happened 2 days ago 😄
I was updating to Android 14 so I missed it
same lol
no reptilian GF for me
also it didn't say in the ads but the very bottom of the patch notes mentioned if you plug your Android 14 into a computer you can use it as a webcam
like natively without droidcam or anything
ah nice. didn't know that. about time. thanks for the hint!
the 14 site spends like 2 pages talking about different lock screen clocks meanwhile the best feature is stuffed in an overflow menu at the bottom lol
yeah I skimmed the bullet point feature list and skimmed the changelog and thought - hmmm kay. not that exciting 😄
Freddy Krueger version of this from Bing. I put charred before all the food names
@icy brook You had to know one of us numbskulls would mash aether fire, cloud, and ghost together 😜
wow
The amount of steam this whale breach would make...
Looks like dang good suqgghi
What the heck is going on there? 🤣
This one was cool 🙂
Ok, one of you must have already done this.
I just see a regular old fire ant.
You got this
Ok, I toned down the fire and upped the cloud. This guy is pretty much quenched.
Have you tried burning ghosts?
Looks so good it's almost like he's gonna pop out of the screen and smack me in the face 😳
Get his LoRA's name out your fuckin' mouth!
Fire massacre 🤣
Ha, I'll have to try that. Ghost Rider wasn't a ghost... let me think a sec on who would work
Use Joachims other lora. Aether Ghost
You should input the classic Win XP desktop image through ControlNet and run it through Van Gogh 🙂
Just check the sample images’ prompts.
photo of fire that looks like a transparent translucent ghost etc
Or
photo of a transparent translucent ghost made of fire etc
Pretty interesting - looks like he's leaving his burning body
Ghost and Fire both at strength of 1.0
The transparency of Ghost mixes with smoke from the Fire Lora I bet.
Oh, nice
Since both share transparency.
Strange, it took a bit to get a single rider. The recipe for this one was Fire: 0.7, Ghost: 1.0, Cloud: 0.4
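A rough sketch of how a recipe like this could be written down as prompt tags, assuming the A1111-style `<lora:name:strength>` syntax. The file names below are hypothetical placeholders; only the three strengths come from the message above. In ComfyUI the same numbers would go into the LoRA loader nodes instead.

```python
def lora_tags(recipe):
    """Format a {file_name: strength} dict as A1111-style <lora:...> tags."""
    return " ".join(f"<lora:{name}:{strength}>" for name, strength in recipe.items())


# Hypothetical file names; strengths are the ones quoted in the chat.
recipe = {"aether_fire": 0.7, "aether_ghost": 1.0, "aether_cloud": 0.4}

prompt = "photo of a transparent translucent ghost made of fire " + lora_tags(recipe)
print(prompt)
```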
nice!
constrictors always look like scaly puppy noodles
chupacabra
Now put Chris Rock in the ring as well
pretty dope character posters!
I've wanted to upres these for a while, very happy how they turned out
hi can you use 1.5 loras on sdxl?
no
thanks
amazing style, which Model and lora is it?
protovisionXL. The only LoRA I typically use is the add-detail-xl one.
The magic is in the prompt on this one. 🙂
really good! loving the results
Yeah. I have a dumb workflow that has ipadapter+, controlnet, and a couple of loras. So I take a photo (or render or painting) and use it with the inkdrawing lora but also use controlnet with the canny edge preprocessor. And then I also do img2img with something like 75% denoising. It's dumb but you get very little difference in composition between the original image and the inkdrawing lora won't make it b&w.
if it works, it's not dumb
I'm generating images to train a new lora so that I won't have to push the inkdrawing lora to do colors + comic art.
oops I had it prompted for "iron man" for the dr strange image. It makes a little difference.
It fixes his face if prompted with his name
More importantly, skin tone; was really bad on the hands in the last one.
Yeah it does that like half the time no matter what. I've been generating three of everything and then for the images I'm going to train a new lora with, I am fixing things like that using photoshop + layers with the best images.
@upbeat summit Please, what is the node that's temporarily used instead of AIT called?
o.k. working now.
Thanks to all of you for the good reviews, likes and sample images for my two Inkdrawing loras on civitai. I didn't think they would get so much attention. Truly a great community. I am working on 2-3 more styles and hope you will enjoy them as well.
They work very well. I used them to generate images to train a new lora (I wanted non-b&w and more background details) for myself. Looking at the results, I probably could have found some old comic book scans to use for data, but your lora worked just as well, which is kind of surprising if you think about it.
Yeah, i was also a bit surprised that they actually support color. Well, it's a bit harder to prompt for, but it kinda works.
Yeah my new lora just does this with no specific prompts for colors. That's not a criticism of the inkdrawing lora because obviously it is made to make certain kinds of images and color is a bonus. And 100% of my training images came from your lora's output.
Anyway it's not that hard to get color or b&w images from the inkdrawing loras. They're way more useful than 99% of the pretty girl loras on civitai.
I was onto something like this too, but it is hard to not fuck up faces and hands even more when training on generated images. So i have moved on to new styles. Nice to see some creative use. I like your images.
I only used generated images that came from controlnet and img2img and threw out the ones with bad faces. Some janky hands definitely made it in though 😢 So most of the images were around this quality. I think I used about 150 images which may be too much. Idk.
Original dataset size of the DC version is only 95 images
That's probably reasonable. I haven't seen better quality when using a lot of images. But it feels wrong to train with only a handful.
It's easier to get bad images into your dataset if you have a huge one. And they will screw your lora
Hmm. I should probably check my images then and cut it down.
I am trying to get the jankyness a bit more out. And here and there a baseball player appears. It just needs more polish before i throw it out in the wild.
My solution would be once I could get good images from it, I'd start replacing the original baseball cards with generated images. And then that would risk distorting the style and causing problems. But it's easy to do.
finally a driver, but too slow a one
speed racer
about the coil whine: i paid $150 extra for my 4090 to get a card that was known for little to no coil whine, as this is a very common problem i once had with a previous graphics card. since then i always check forum posts and reviews first to try to identify a card that is known for little coil whine.
for the 4090 i think i also got one of the coolest designs - and it has no coil whine when using comfy ui:
My new workflow is available on GitHub now - it lets you turn off features and parts of the workflow by muting nodes - this reduces requirements and increases speed a lot:
https://github.com/JPS-GER/JPS-ComfyUI-Workflows
(check the black info box on the left side of the workflow for instructions and requirements)
Somebody asked for a look similar to this, in case it helps
Freaky...
very detailed
Pure composite (left) vs. Re-render of composite (right). The idea with the re-render is to help balance the colors and make it seamless. It's not a huge difference, but it definitely helps with improving the font. It makes the design "work together" in a subtle way that feels more professional. The absolute darks and absolute brights match in the re-render.
have you thought about using IPAdapter?
Load OG baseball cards as the reference images then generate new ones?
(for these I just googled marvel comic pages and grabbed 6 images to use)
hello 👋
how have you been? I haven't seen you in a while 😄
been busy with work plus a week in Prague
a week in prague. nice 
heard it's a wonderful city to be.
well 5 days, its nearly a week lol. But yes its a lovely city 🙂
ah here we go , Random Baseball Card Generator lol
cool. are you using the mini lora technique?
it's a neat idea. probably the best use of ipadapter 🤔
best video Ive seen on it is the devs own
Everything you need to know about using the IPAdapter models in ComfyUI directly from the developer of the IPAdapter ComfyUI extension.
👉 You can find the extension "ComfyUI_IPAdapter_plus" on github here: https://t.ly/yZsbm
I like that you can create .ipadpt files to reduce overheads. Create preloaded/ipadapted embeds
but all explained in the video
sounds complicated. hopefully I can figure it out 😄
its a good walk through
anyways busy afternoon day ahead.
I have the Rugby World Cup today
Wales (my team) v Georgia
England v Samoa
Ireland v Scotland
Also would like to try and half follow the F1 sprint race from Qatar lol
I'll give it a watch.
have fun watching WC 🍻
I have a case of Doombar to keep me company lol
sounds exciting. best companion to have ☺️
Tuack Man's rookie days. 🙂
They were randomly selected and none of the names mean anything to me lol
Yeah, for specific text, I render the text with a font in GIMP or Inkscape, merge it with the image, then re-render.
Double Ice-cream?
combination of two loras I am working on
it is originally a painter style:
maybe you can guess the artist, but it is still quite flexible especially combined with other loras
Game token numbers. Generating a font.
Hmm. Nope, this is the way to go after all. The font needs to be added in-game, but the icon needs to be rendered in-balance with the base card.
Crafting. Unique font for each number, but it takes a while. 🙂
which kinda art is this?
^ Gave it a shot: (Mantis shrimp, whimsical:1.3), highly detailed, Style of stippled oil pastel marker cartoon illustration, whimsical, bold black outline, character art, close up view, mottled dappled rich vibrant colors, (cross hatch shading:1.2), bold contour lines, 90s, best quality
Lol
Bing says I got it (haha)
Haha oh my
So cute
Ok Joachim. I am currently testing my "deepfried" Lora. It's looking tasty so far. @icy brook
do you guys think negative prompts are really needed for SDXL and all the various user-made SDXL models? And if so, what are the bare minimum negative prompts needed? thanks!
I'm trying to make realistic-looking images, btw
Negative prompts are less necessary for SDXL, but they still can be used effectively. I typically only keep 2 in my neg now on well-trained models: watermark, signature.
And whatever you do, don't get sucked into the bullshit that is something like bad hands, bad anatomy, etc. Results that are good from images generated with those terms in SDXL are simply happy coincidences. If you do need to use something like that, an effective term tends to be something like deformed, but often times, the best thing you can do is simply roll the seed and get another image which has just as good of a chance of not having issues as another word added to your neg.
As far as things that look photorealistic, you actually want to stay away from the term realistic as that's not what tends to get used with actual photographs of people; it is used with art that looks close to real, so you're actually doing yourself a disservice if you use that term in a lot of cases.
What does tend to help is to use terms that are related to photography. The size of the film, a specific lens, a specific camera type, an f-stop setting, etc.
anyone know how in the eff you find your own uploads on civitai now that the "My uploads" tab on your profile page is gone?
I say all of the above, but the biggest thing to realize is that, by far, the best thing you can do is understand all the basic settings for the tool you're using, be it A1111 or Comfy. If you know what steps, CFG, sampler, schedule, denoise, CLIP, LoRA, ControlNet, IPAdapter, etc. do, then you will be far better off than people who don't.
even their own bot shrugged and said the site must have changed since I was last updated with the information. ffs.
Thanks. This is also my impression after using SDXL for a few weeks now.
"Your Profile" and then "Models"?
I always just click on my profile, then choose from the menu on that screen.
Nope, models gives me zero
Filter?
I only have NSFW filter on
now none of mine are a model they are loras so that may be why
Published/Draft, All Time/Day/Week/Month/Year?
Hey, quick question here. Is it possible to generate images based on another image like a Logo for example. I want to create an animation for a client
nope, I have never been able to see my own work through profiles BUT I can on a general search
The Models page should show your LoRAs.
never has
Show us a screenshot of your profile page.
LOL
I turned the NSFW filter off (only one I had) and they all appeared so facepalm time
Boom.
Still 3 Filter
That's me.
Sorry. Idiot here
Well, at least it works now. Thanks
About to release a V2 of one of these as it is V2 and not simply 1.1
Noice.
Hey Soul. It's 2:0 for me. Remember when i helped the other poor guy with his Lora problem? 😜
I said the same thing you did at the same time.
Tie.
Are you vacuum sealing things again?
Yes
🤣
🤨 Ok. This time tie.
Question: in SDXL do parentheses and weights make a difference? Example: "side-view profile photo of a pretty white girl with ((natural elf ears))"
it does
is the double (( the most potent weight? any tips for getting elf ears? 🙂
use (XX:yy). (()) is rather dumb these days
Maybe a captain spock LoRa 😬
something like (xx:1.4) for example
Best to get used to the (term:X.X) format so you can set precise control. You can even use multiple decimal places like (big booba:1.437).
I've used it when I was just trying to nudge something the tiniest bit.
"something like (xx:1.4) for example" so what is the available range?
I mean you can use a transcendental weight if you wish BUT... lol
Depends on what you're doing...frankly, you can set it as high or low as you want, including into negative, but generally most things don't like beyond +/- 2.0
There are some things that work well with higher/lower. And it also depends on the rest of what you have going on.
training and warming up my room as the heater came on this morning and shocked me. Turned it off and am using this blast furnace card instead.
1.3 seems strong but tends not to distort the image. Use range btwn .8-1.4
and for less importance you can use [something], or a weight below 1 like (something:0.8)
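To make the (term:X.X) format concrete, here is a minimal sketch of how a prompt could be split into text and weight pairs. It only illustrates the syntax discussed above and is not how A1111 or ComfyUI actually parse prompts; nested parentheses, escaped \( \) and [bracket] forms are deliberately not handled, and `parse_weights` is a made-up helper name.

```python
import re

# Matches "(some text:1.4)", "(x:.8)" or "(x:-0.5)". Nested parentheses
# and [bracket] de-emphasis are NOT handled in this sketch.
WEIGHTED = re.compile(r"\(([^():]+):(-?(?:\d+\.?\d*|\.\d+))\)")


def parse_weights(prompt):
    """Split a prompt into (text, weight) pairs; unweighted text gets 1.0."""
    parts = []
    pos = 0
    for match in WEIGHTED.finditer(prompt):
        plain = prompt[pos:match.start()].strip(" ,")
        if plain:
            parts.append((plain, 1.0))
        parts.append((match.group(1).strip(), float(match.group(2))))
        pos = match.end()
    tail = prompt[pos:].strip(" ,")
    if tail:
        parts.append((tail, 1.0))
    return parts


for text, weight in parse_weights(
    "side-view profile photo of a girl, (natural elf ears:1.4), forest"
):
    print(f"{weight:>5}  {text}")
```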
my PC itself acts like a space heater when I am running AI inference all day. 🔥
Yes, if I do a huge batch run is when it heats up the most
@brave stream - One other thing to note about prompting...generally, the further back the term is in your prompt, the less it will be focused on. So another method if you need to stress something is to change the order of your prompts to move something more toward the beginning. That said, some things tend to naturally be stronger than others and so weights and place in the prompt are simply things you need to toy with to get the output you're looking for.
there are some things that are so heavy even loras refuse to touch them
I need to get a consistent breadcrumb coating. This looks already delicious....and stupid. I guess i have to retrain a lot.
And nearly every model out there has a far heavier bias toward female terms than male.
when long run, i limit gpu a bit, i mean overnight for example.
Almost Cheetos antlers.
soul you can make large baking pan 😄
10-4. The shuttle is ready to deploy the MBA or Massive Baking Apparatus to any destination on the planet. Just tell us where to drop it off, sir.
I think the "B" stands for Balls
I noticed on CivitAI when I want to upload a V2 I have to remake the entire pages over again. Is that the way?
Looking crispy
crunchy
you get it? bcz his name is fry...
Oh my....
can you do a few with sesame on them?
I will....later
Preparing training images at the moment
deep fried hulk @noble shoal
yes
The nose did not grow yet... Pinocchio?
it seems to work pretty great 😄 very cool
Hulk is a hard one for any of the trainings I do as he is so strong that when it works it has spread out all over the image
Might be the prompting but he is tough for me
Thanos is strong as well and is good for similar purposes.
In 2.1 I would use 1967 Corvette and that was the toughest one I could find. If it worked on it then you got it.
Nice
There are just some things that are so overtrained that a lora just doesn't want to work on them.
Abraham lincoln
locon might but they were pretty forget it. ia3 works on any then comfy comes along and kills it.
Aether Glitch meets upcoming Modern Oil Paint ( @icy brook @delicate kelp )
who remembers having this filter in photoshop? plastic wrap...
Me 🙂
Loved it
Awesome 💕
Would be awesome to see this with Glitch, @zinc cargo right?
Trying to make Thanksgiving camouflage
Are you making a plastic draped Lora? 🙂
V2 is about to be released
xl is fighting me though
I was told to ask you about something
A prime example. How do I keep the lora from spreading all over the gen?
You want the plastic wrapping to only show on the subject/object in focus?
Yes, or other things I have done.
Easiest to help would be to see part of the dataset. If you’d share in dm?
dataset I tried with and without a background
didn't matter
As I said it happens on anything I train
In the captions I will say "stand/sitting" in front of a solid white background.
BS wouldn't have anything to do with this, would it?
old, not sure if i have already posted it.
@icy brook
I will surely try and help. But I need to do some family stuff first. Will write later!
@vital ermine actually got some time now. Sure you don’t want to take it in dm? Feels easier
To discuss that way
DM
I am training it again at BS1 but 1h
crispy
deep-fried Colonel sanders
I love the left color and the right structure
Extremely challenging
Hi guys, what's the best upscaler to use with SDXL images?
There is not one specific upscaler that serves all purposes. You should explore https://openmodeldb.info/ for various needs.
next update will allow selecting individual ip-adapter sources (up to 5) and revision sources (up to 2). that makes it much more flexible regarding vram usage (and also performance):
soul is absolutely right that there is not a single upscaler for all use cases. my current favorite upscaler model is this one - mostly for realism: https://openmodeldb.info/models/4x-classicalSR-DIV2K-s64w8-SwinIR-M
What is real, anyways?! 🤣
the spoon sure isn't 😉
If you didn't tell me about the vase, would I have broken it?
I would ask you to sit down, but you're not going to anyway
@sinful falcon 😉
I installed automatic1111 to try to generate images with img 2 img, where the starting img is generated by bing. All A1111 is doing is generating copies of the same image
What is your denoising strength set to?
Sounds like you have your denoise value too low
0.75
Something is wrong then, that should change it a lot
Show a screenshot of your settings.
cfg 10
With a screenshot of everything, we could see better what may be going on.
Well, that shows denoise at 0.4 with a slightly different image in the results versus the input, which would make sense.
Have you guys ever seen this kind of formatting in an SDXL prompt? Is it legit? "(cyberpunk)"
Parentheses add weight.
sorry, i need to repost that
\(cyberpunk\)
here it is with more context: by yoneyama mai, 1girl, lucy \(cyberpunk\), solo, parted lips,
I changed the denoise after many attempts
probably someone bringing a prompt over from midjourney or something
Yeah, but what does output look like once you have something like 0.75 or 0.8 set?
from the 2nd screenshot you gave, it's zoomed out a bit, but it looks like there are some differences. Looking at the bottom-right wizard that is in the dark part, the area that looks like the head is light color in the top-right corner, but darker in the one below. The bottom-center wizard holding the wand has different glows around his wand and body. The center ring pattern seems different as well between the various images. Some have a narrow dark ring like the bottom-center, some have a wide dark ring like the top-left.
For a 0.4 strength, that would make sense.
finished. All the images have the same elements and lights. The difference is the texture of the sand and the rocks
It certainly is changing.
When I try to create text to image what I get is a desert landscape with some bright lights and maybe one wizard
Sure...because your prompt is pretty complex and is not likely to come out the way you have typed it up. The more specific and more elements you are trying to make happen in an image, the less success you're going to have. It's possible to get what you want without additional trickery, but it's exponentially difficult after a point.
AI image creation isn't following instructions the way a human would. It's converting terminology in your prompts into tokens that it's been trained on to understand patterns and then attempts to infer an image out of latent noise based on the patterns that token training has understood. If you desire something a bit less specific, you'll have a higher probability that your expectations will be met. Otherwise, you have to start looking into additional techniques such as ControlNet and LoRAs to get things narrowed down more.
so img2img is not controlnet?
That was all prompt + illustration style
img2img just moves the starting step up, with your image as starting point, while control net can be active through all steps if you want.
now that the discord bot misses controlnet I have to run offline SD to get what I want
if you fine-tune the starting step / strength of img2img you can also get good results. for img2img a few steps often make a huge difference, so you really have to try to find the best setting to get what you want.
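A small sketch of the "img2img moves the starting step up" idea, assuming the common rule that the denoise strength decides what fraction of the sampling schedule is run on top of the input image. Tools differ in the details (A1111 has a setting that changes this behaviour), so treat `img2img_steps` as a hypothetical helper for intuition rather than any program's exact logic.

```python
def img2img_steps(total_steps, denoise):
    """Return (first_step, steps_run) for an img2img pass.

    denoise = 1.0 runs the whole schedule (the input image is mostly ignored),
    lower values skip the early steps and keep more of the input image.
    """
    steps_run = min(int(total_steps * denoise), total_steps)
    first_step = total_steps - steps_run
    return first_step, steps_run


for denoise in (0.5, 0.75, 1.0):
    first_step, steps_run = img2img_steps(30, denoise)
    print(f"denoise {denoise}: start at step {first_step}, run {steps_run} steps")
```

This is also why small changes in strength can matter so much: with few total steps, each step that is skipped or added is a large share of the work.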
This is the aftermath lol
Go lower plane damnit
(Close up, pov:1.3), enraged cyberpunk wizard controlling massive horizontal red beam, red electric arc, red light flash, arid mesa crater, sand dunes, pock marks, haze, desert flora, dry
2nd image 
it wasn't installed correctly
is it a huge task to understand "five wizards casting energy bolts, in a circular formation" ?
You'll find that sometimes getting the number right is an issue.
Honestly, it's going to be a lot of trial and error depending upon how selective you may be.
for the thing you want to do you'll need controlnet. think of describing how to fill in the shapes of the image vs telling a narrative
I tried it with bing with this "Make me an image of 5 wizards stood in a circle casting energy beams towards a central point, make the image be taken from above looking down."
It's not got the number very well, but it's got the right idea.
Maybe try a similar prompt in SDXL
Why the "Make me" part?
Are you not using https://www.bing.com/create ?
I did it from Copilot
ah okay
so you have to tell it you want an image
makes sense
These are crazy cool
all based on basically two source images:
the rest is prompts and fine-tuning of settings for canny and ip adapter
Nice - I just saw your workflow pic way up above. Fun to see such unique outputs 👍
this combination is crazy. you can do so many things with it. and about every second run gives a great image. i'm not even posting everything, just if something is different enough.
for example i didn't post that one - if not all of them were so good i would surely have posted it:
or akira:
also have about 20 very nice horror houses i didn't post
for example:
Well done!
new lora?
Yes, we talked about it 2 days ago. I asked if there is already a "deep fried" lora. It was just for fun, but I took the opportunity to make one. 🤣
Thank you.
When you make your Aether LoRAs, do you also mis-caption the images on purpose?
I forget a lot. Train all the time and rest of it other work and family stuff.
Wdym? Exemplify
Ok, let's say I want to train the concept "made of xyz". Then I take a real image of xyz that looks slightly like another object and caption it "photo of a rock made of xyz". Or I just lie. For instance, I have a photo of chicken nuggets and caption it "photo of pebbles made of {Instance prompt}".
I would just name it what it is.
Great, then I produce chicken nuggets. Straight up lying has made it learn very well. But! I didn't try the other way
Might as well try that. But how did you get your hands on shaped water objects?
I create all images from scratch.
Added you as friend and sent a dm 🙂
can someone with the latest updates installed try sdxl zoe depth control net? it's broken for me and one other user at the moment. but as i did some other system updates, i'm not sure if it's a general problem.
Slow af, but my first time training a LoRA for SDXL and am doing it locally on my 3080. I pray for workable results. Probably won't even find out until tomorrow. Current speed is 20.44s/it. 😢
I'll figure out how to speed things up next time and I'm sure most of the fellas in here that train regularly know all the tricks to that already. But I'm excited to see how it ends up.
What Batch Size?
1
The fuck? I get 1.43s/it on a 4070

What optimizer are you running on?
Adafactor
You are already waiting nearly 12 hours? There must be something off.
I started it last night, yeah.
Oh, the concept of daytimes is not valid here. I have no idea what timezone you are in.
Eastern. Was confirming that I've been waiting for about 12hr already, yes.
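For a sense of scale, a quick back-of-the-envelope calculation using only the numbers quoted above (20.44 s/it, about 12 hours elapsed, and the 1.43 s/it reported on the 4070). The helper names are made up for this sketch.

```python
def iterations_done(hours, sec_per_it):
    """Whole iterations completed after `hours` at a given seconds-per-iteration."""
    return int(hours * 3600 / sec_per_it)


def hours_needed(iterations, sec_per_it):
    """Wall-clock hours for `iterations` at a given seconds-per-iteration."""
    return iterations * sec_per_it / 3600


its = iterations_done(12, 20.44)
print(f"~{its} iterations in 12 h at 20.44 s/it")
print(f"same iterations at 1.43 s/it: ~{hours_needed(its, 1.43):.2f} h")
```

Roughly two thousand steps in twelve hours versus under an hour at the faster rate, which is why the gap points at a configuration or memory problem rather than just the card.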
Sounds like you are using RAM as VRAM
Probably. My 3080 is only a 10G, not a 12G, so it might be swapping out to RAM if needed.
Ah, yes. That's a challenge.
That it is.
generate a photo
Now i want a photo too. 😕
Morning
Hello Sir, good day.
Sleepy wake up face (just an image of me right now)
I am 100% sure nobody will put it through SD....
Ha
as requested:
I should work on my prediction accuracy
Yeah I had 34s/it, but well, what do you want with a 4070 Ti with only 12GB🙃
🧐 The Ti is slower than the non Ti? How so?
Do you mean me with "he"? Now i am even more confused. I have a 4070 with 12gb.
I edited, I thought I was talking to someone else.
Training a Lora with only 12gb and getting 1.43s? How xD
Hmm
Must be some other settings that differ.
What parameters do you use?
A lot. What do you want to know
Network Rank for example
I just drop you that. Might be simpler
From my last lora
But i don't say that's the best config!! Don't trust me. And don't judge i have a lot of things wrong.
Let me know. Btw, i can send you my paypal details. Just ask
I feel your pain.
yeah...
that's why I gotta work with Runpod, and it still fucks up most of the time 🥲
Practicing is a pain when you gotta pay to use a GPU
the nightly versions of pytorch (2.1/2) use more vram when training loras in my experience at least
try to stick with 2.0 or what kohya recommends
the number of buckets also seems to matter
I am confused. Very confused.
I had a lora with 32 Buckets and still got 1.41s/it.
I dunno, there was also a bug where the vae would not release vram after encoding/caching images
second run after they're cached to disk would be faster
Anyway, i love my cheap ass crappy plastic Palit 4070. And i will make a safe copy of my current Kohya and never update 🤣
My Kohya was last updated on the 10th of August.
I updated once and couldn't do batch size 10 (sd 1.5) anymore, max is 5 now
boggles the mind
Never change a running system
as for sdxl I checked just now and I'm using 9.6gb of vram with batch size 1, which seems about normal, not sure how you manage below 8
Me? No i am using 11.7
oh I looked at this but didn't realize it was finished already
Sorry for the confusion. Thats just my normal SD stuff loaded
my fault, anyway I'm getting 2.25s/it on a 3060 which seems about normal too, will probably get a bit faster once the progress calc catches up
40s/it on a 3080 is quite slow but I guess it's because it's offloading to normal ram
I'm using 9.6gb/12gb vram without text encoder so likely just some differences in settings
The 40.78 is on a 4070 Ti, thats what grinds my gears
I think he even used my config file
you're training both unet and text encoder looking at the config but I'm not sure how much vram that adds
Is there an advantage of only training the unet?
a lot apparently, I'm ooming with text encoder too
mostly just to save vram
usually the result will be better with both trained, but sdxl has 2 text encoders so it's hard to say, I think kohya recommended training only unet
--network_train_unet_only option is highly recommended for SDXL LoRA. Because SDXL has two text encoders, the result of the training will be unexpected.
or so the readme says
I might give that a shot. Do I just have to set the learning rate to 0, or do I have to check something?
After this current epoch is done, I'm going to stop it, blow away my kohya, reinstall it fresh, and make a few adjustments to my config to see if I can't get the speed up.
with unet only you can also use --cache_text_encoder_outputs which should speed it up if you have captions
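A hedged sketch of how those two options could be added to a training command. Only the two flags are taken from the messages above; the `accelerate launch sdxl_train_network.py` base command is an assumption about a typical kohya sd-scripts setup and `build_train_command` is a made-up helper, so check your own install before copying anything.

```python
import shlex

# Assumed base command for a kohya sd-scripts SDXL LoRA run (not from the chat).
BASE_COMMAND = ["accelerate", "launch", "sdxl_train_network.py"]


def build_train_command(extra_args, unet_only=True, cache_text_encoder=True):
    """Append the U-Net-only / text-encoder-cache flags to a training command."""
    if cache_text_encoder and not unet_only:
        # The chat ties the cache option to U-Net-only training.
        raise ValueError("cache_text_encoder requires unet_only=True")
    cmd = BASE_COMMAND + list(extra_args)
    if unet_only:
        cmd.append("--network_train_unet_only")
    if cache_text_encoder:
        cmd.append("--cache_text_encoder_outputs")
    return cmd


# "--your_existing_args" is a placeholder for whatever you already pass.
print(shlex.join(build_train_command(["--your_existing_args"])))
```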
Very interesting. I am interested in speeding things up. Thank you
giant cron
What did you call me? #1072220168534642768
I don't really have any neat config file for sdxl but you can also do full fp16 training with the madebyollin vae which fixes nulls in latents
and sdpa is likely faster than xformers on torch 2.x but it might depend on card
I am giving the text encoder thing now a try. Let me check what stuff i potentially could train.
there shouldn't really be that much noticeable difference in output imo
eyes tend to get worse somehow
Workflow V28.0 now on Github - reordered the menu nodes, to allow better access on small screens and 21:9 displays.
https://github.com/JPS-GER/JPS-ComfyUI-Workflows
are you sharing that shnitzel?
Still a bit of work to do on that. I have 3 "unfinished" loras in the pipeline. Picking one of them now to work on, but sadly it isn't "SDXL_Breaded_and_Fried_V1"
I explored lora upscale this afternoon and my outcome is that it's really too inconsistent...Sometimes with 0.7 denoise, it works great, sometimes lot of duplications, variable "best upscale method". I also tried "SD-Latent-Upscaler" custom nodes, it's v1 and xl in those xy plot, I don't see any advantages even at low denoising..
in this case, all work well
in this one, lot of duplications. Both 0.7
is the latest model SDXL?
any other recommendations?
After clean install, adding --network_train_unet_only, moving learning rates from 0.0003 to 0.0004, changing scheduler from constant to cosine_with_restarts, & network from 256/1 to 32/8. I've dropped from about 40s/it to about 11s/it.
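A quick worked example of what the network change means, assuming "256/1" and "32/8" are rank/alpha pairs and that the trainer scales the LoRA update by alpha / rank (the usual LoRA convention). The layer size below is a hypothetical example value, not something from the chat.

```python
def lora_scale(rank, alpha):
    """Multiplier applied to the LoRA update under the alpha / rank convention."""
    return alpha / rank


def lora_params(rank, d_in, d_out):
    """Trainable parameters for one linear layer: down (d_in x rank) + up (rank x d_out)."""
    return rank * (d_in + d_out)


old_scale = lora_scale(256, 1)
new_scale = lora_scale(32, 8)
print(f"scale: {old_scale} -> {new_scale} ({new_scale / old_scale:.0f}x larger)")

# Hypothetical 1280 x 1280 attention projection, just to show the ratio.
old_params = lora_params(256, 1280, 1280)
new_params = lora_params(32, 1280, 1280)
print(f"params per layer: {old_params} -> {new_params} ({old_params // new_params}x fewer)")
```

So the new setup trains far fewer LoRA weights per layer while applying each update much more strongly, which is worth keeping in mind alongside the learning rate change.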
interesting, are you still using shared ram?
Sadly, yes.
that's a whole lot 
With unet only I am now at 1.07 s/it using 8.7 gb vram.
I have a feeling you're not training in fp16/bf16, since that's about the double you'd expect
It's set for bf16
It depends on what you mean with "tuning".
Wait, you have 100% Utilization?
It bounces up and down just a tiny bit, fluctuating between 95-100% most of the time, yesh.
Wow...the spammers are running rampant today.
(I just realized this isn't the gen chat for SD lol)
Are you playing cyberpunk while training? Mine is between 0 and 5%. 👀
I'm not doing fuck all at this point other than chatting on discord and browsing some sites. But that is interesting.
Micronomicon LORa )
Very interesting.
When you did your install, did you install cuDNN?
That was 2 months ago. I don't even know what I had for breakfast this morning.
this is like the only stuff you need in env
mamba create -n lora python=3.11 pytorch torchvision pytorch-cuda=11.8 -c pytorch -c nvidia
pip install -r requirements.txt
plus your favorite flavor of bitsandbytes
python 3.10 works too if you prefer that (3.12 doesn't)
bitsandbytes-0.41.1-py3-none-win_amd64.whl probably works best
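A quick way to check an existing env against the versions mentioned above. This is only a sketch: the supported set comes from this chat (3.10 and 3.11 work, 3.12 doesn't) and `python_supported` is a hypothetical helper, not part of any tool.

```python
import sys

# Per the chat: 3.10 and 3.11 work, 3.12 does not (at the time of writing).
SUPPORTED_MINORS = {10, 11}


def python_supported(version_info=sys.version_info):
    """Return True if the interpreter version is one the chat reports as working."""
    major, minor = version_info[0], version_info[1]
    return major == 3 and minor in SUPPORTED_MINORS


if __name__ == "__main__":
    current = f"{sys.version_info[0]}.{sys.version_info[1]}"
    status = "ok" if python_supported() else "not in the known-good list"
    print(f"Python {current}: {status}")
```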
Found the answer: No.
hrmmm....
Seems that the bag is about to explode from all the hot air that comes out of that place.
My butt's been wiped.
regarding torch 2.1 and cu121 (didn't realize it's stable channel now) I just made a new env and it seems to be working pretty well, no vram spikes
yep only 11.8 and 12.1 now showing on the pytorch site 🥲
precisely
My jaw dropped yesterday but I am afraid to touch it due to prior issues from the programs not it
never as simple as just rolling back if it doesn't work
what card do you have?
I nuked my comfy/ooba/kohya envs and upgraded all and everything seems fine for now
3060 12gb
Well, I am waiting on all the new PT for Ada based cards that are supposed to be here for 12.2
xformers 2.1 too
uses less vram and is faster but Ampere+ cards only
when that rolls out is when I will upgrade it all
wonder how it will compare to sdpa/flash attention
xformers is already better and 2.1 is about 30% better overall
sdp was such a let down. 😦
sdp is about even with xformers for me currently but that depends on card obv, flash attention is much faster but linux only so kind of a pain
I train more than I gen so I see it across all cards I have used from a T4 to a 4090. flash attention isn't even an option in my Linux to try.
I suspect the new xformers is using RT/TC but even without the speedup the less memory needed is direly needed for XL training
that alone wins over SDP as SDP does use more mem, or rather doesn't save as much as xformers does.
XL really wants a 43 to 80GB card so getting it to train on 24GB is a chore. I can't even use prodigy for a dreambooth I have to use Adafactor at BS1.
hmm wanted to see how xformers is doing with comfy now but the wheels don't seem to be updated for 2.1+cu121 yet
bing breaks like every other hour, nothing new
yep, that not being updated is why I am taking a hands off approach as I played that game a few times with this stuff and learned my lesson
Hey can anyone help... All my pictures look grainy.
Kaiju, monster, humanoid, giant, tentacles, ultra modern,futuristic, city, smoke, rain, digital illustration, fantasy, architecture, sharp focus, concept art, octane render, scary, 8 k
negative:
3d, blender, autocad, hills, human, people, man, woman, house, village, town, technical, architect, blueprint, temple, castle, mist, moon, modern, asphalt, garage. winter, floating, casino, apartment, out of focus, depth of field, lens blur
Euler a, 20 steps, 1024x1024
I don't get it
How is everyone getting these photo-quality images
Prompt is great. Add photoreal, cinematic. It also looks like maybe you need more refiner steps
And more steps. 30-40. Maybe a different sampler
I've tried both things, and still getting similar issues
Do you have an example prompt I could try, I am guessing I don't have something set up correctly
Do I need to download a VAE or something? I just downloaded the base 1.0 model
Adding photoreal, cinematic seems like it helped
Yes the sdxl vae. Looks like you have it though tbh
35mm is another photoreal look
I just have the base model, with whatever standard vae is in it.
You can try (photoreal:1.3) and put (blur,haze) in the negative
Cthulhu is in most of the base models (fyi)
What are your favorite controlNet models?
Yeah this is the quality I would expect. Still not sure why mine looks grainy. What's your prompt/settings?
It looks all artifacted
I use Auto1111
SDXL base + refiner
70 steps
Ah 70 steps. I guess SD XL needs a lot more steps to look reasonable, or else it causes these effects, maybe. I tried ~45, but I'll try 70
With 70 steps... better? but still weirdly grainy
you don't need 70 steps. 30 are usually enough.
I just downloaded the refiner model, and I think I've set that up now
cinematic photo by Mario Bava, monster, humanoid, giant, tentacles, ultra modern,futuristic, city, smoke, rain, digital illustration, fantasy, architecture, sharp focus, concept art, octane render, scary, 8 k 35mm photograph, film, bokeh, professional, 4k, highly detailed
neg : big hands, fake, fake hands, distorted, drawing, painting, crayon, sketch, impressionist
What seed, and other settings? I can try to reproduce the same image
Actually, changing the image size and switching to DPM is helping a lot.
switch at 0.8
yes was the sampler
I use 70 steps, but ok
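For a sense of what "switch at 0.8" means in step counts, a minimal sketch. It assumes the switch point is simply a fraction of the total steps and that rounding is to the nearest step; the UI's exact rounding may differ, and `split_steps` is a hypothetical helper.

```python
def split_steps(total_steps, switch_at):
    """Split a step budget between base model and refiner.

    Assumption: the base model handles the first `switch_at` fraction of the
    steps and the refiner handles the rest.
    """
    if not 0.0 <= switch_at <= 1.0:
        raise ValueError("switch_at must be between 0 and 1")
    base_steps = round(total_steps * switch_at)
    return base_steps, total_steps - base_steps


if __name__ == "__main__":
    for total in (30, 70):
        base, refiner = split_steps(total, 0.8)
        print(f"{total} steps: {base} base + {refiner} refiner")
```

So at 30 steps the refiner only gets a handful of steps, which is one reason more total steps can look like "more refiner".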
The changes I made from my original images were:
- Using DPM2 a Karras, instead of Euler.
- Changing the size from 1024x1024 to 1280x768
- Using a refiner model
I think (2) made the biggest difference
I tried setting to 70, which had a minor improvement, but (2) was a huge jump
refiner also
Thanks, I think I'll avoid square images. The square images seem lower quality. I've heard size affects the results a lot, but I was surprised how much
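If you want to try other non-square sizes, one rule of thumb is to keep the pixel count close to 1024x1024 and both sides a multiple of 64. The sketch below just enumerates such sizes; the one-megapixel budget, the 64-pixel step and the 7% tolerance are assumptions for illustration, not settings taken from the chat.

```python
def sizes_near_budget(budget=1024 * 1024, step=64, tolerance=0.07,
                      min_side=512, max_side=2048):
    """List (width, height) pairs whose area is within `tolerance` of `budget`."""
    sizes = []
    for width in range(min_side, max_side + 1, step):
        for height in range(min_side, max_side + 1, step):
            if abs(width * height - budget) <= tolerance * budget:
                sizes.append((width, height))
    return sizes


if __name__ == "__main__":
    # Show landscape candidates only, widest last.
    for width, height in sizes_near_budget():
        if width >= height:
            print(f"{width}x{height}  (aspect {width / height:.2f})")
```

The 1280x768 size used above shows up in this list, as does plain 1024x1024.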
Refiner model kinda is annoying because it has to keep unloading/reloading the models
I'll play with it and see how much it helps
nearly identical settings, but square:
if I ignore the weird human face and hands, the refiner does seem to be fixing the background graininess
Various Sizes, everything else the same
@dusk cape how about remove smoke?
In this case, I'm just copying the prompt andreac was using for reference
the original images I was getting had weird graininess regardless of prompt
o.k.
smoke was also in your original prompt
yeah but even in other prompts, I had the same graininess
like, just random landscapes or whatever
Like, look at this
Image to Image
landscape, mountains and forests, sharp focus, concept art, octane render, scary, 8 k 35mm photograph, film, bokeh, professional, 4k, highly detailed
I wanted to make an image roughly looking like the one on the left... the thing on the right just looks grainy, and blurred out
what denoise?
^ this is with like 0.6 denoise
try 0.35
0.4
Alternatively: applying the changes above (canvas size, refiner, and sampling method)
helps a lot
at 0.4, the blockiness of the original image impacts the quality too much
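Part of why the denoise value matters so much in img2img: in many UIs the number of sampling steps actually run scales with the denoise strength, so low values leave little room to clean up a blocky source. A simplified sketch of that relationship follows; the formula is an assumption about typical behaviour and `effective_steps` is a hypothetical helper.

```python
def effective_steps(steps, denoise):
    """Approximate number of sampling steps actually run in img2img.

    Assumption: the sampler runs roughly steps * denoise steps, with denoise
    capped just below 1.0.
    """
    if not 0.0 <= denoise <= 1.0:
        raise ValueError("denoise must be between 0 and 1")
    return int(min(denoise, 0.999) * steps)


if __name__ == "__main__":
    for denoise in (0.35, 0.4, 0.6):
        print(f"denoise {denoise}: about {effective_steps(30, denoise)} of 30 steps")
```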
Example using my new settings. Pretty faithful upres
not 100% accurate to the original image, but pretty good
with tilt shift
How many parameters does SDXL have exactly?
And how many does DALL-E 3 have?
is this image o.k. or issues as well, i have little bit bad vision
tried to replicate but too faaar from target 🙂
What do you do with images that are on the edge between deleting and keeping, but that you decided to keep? I think I found a solution for myself. How about you, @cyan crown and @vital ermine?
I don't know as I have so many images requiring so much space.
convert them to jpg
that is a disgrace and not for me
The ones I would most probably delete I will convert to BPG. XnView can read it, so no problem. It's a very efficient format; there are probably better ones, but I like this one.
San Francisco?
good night
playing around with views from inside the car recently
(denis villeneuve cinematic lighting shadow, detailed depth:1.1), eye witness journalism photo, professional expired polaroid film photo, towering giraffe San Francisco Chinatown smelling fruit stand, narrow street, pretty shops and restaurants,foggy morning, cinematic close up fruit stand, (horror, photoreal), dynamic pose, dynamic background, dynamic composition, dynamic lighting, realistic proportions, extreme detailed, ultra detailed, intricate details, highly detailed atmosphere, highly detailed textures.
can someone help? im trying to train but it aint working
macroscopic, detailed leaf
nice, love this art style
Nice. I'm trying to make views from the lateral windows but no results.
🙄
have an example image?
I would controlnet for this using a simple mask
(car interior, door handle)
Well, speaking of handle. Dall-E is the only one that "understands" those requests. I'll give your advice a try. ✌️🙌
perfect 24k gold diamond plated teeth works occasionally
I just tried with DALL-E 3, it refused 5 times
they wouldnt let me say 'chinatown' earlier.
Just wanted to see if it could do this? Guard rails suck when you just wanna be creative
wouldnt let me do gold diamond plated teeth only prompt, but I was able to get some wild stuff through, my account might be flagged
IP Adapters in action )
A1111 Unstable Diffusers XL for a model
The Long Room at Trinity College, Dublin. SDXL knows about specific places.
What places does it know about???
The Hall of Mirrors, Palace of Versailles
jesus CHRIST SDXL is great for gore/horror using my LoRA
WARNING FLESH MONSTERS
This is so damn cool lmao
I never would have assumed that my photo realism LoRA would bring horror to the next level
This one has to be the freakiest of them all tho lmao
lol. damn. tree beast
Huge Stable Diffusion XL (SDXL) Text Encoder DreamBooth training comparison
U-NET is always trained
PNG info (prompts) are in alts
All images are 1024x1024 so download full sizes
https://twitter.com/GozukaraFurkan/status/1711345323900100643
Informative
Nice composition
)))
remember they were giving hippos in kinder eggs?
wabbit
Sharing some dalle3 muppet battletoads
toads?
had to img2img that nice one 🙂
nice!
maek your time hahaha
unceunceunceunceunceunceunceunceunceunceunceunce
just in case people don't know, they should know. watch this, and now you'll know. https://www.youtube.com/watch?v=qItugh-fFgg
All Your Base Are Belong To Us.
The game is Zero Wing
Read all about it here http://en.wikipedia.org/wiki/All_your_base
muppet battletoads movie, cinestill cinematic, 35mm
i love that meme. from a time before we called them memes. i think it preceded "viral video" even. when the internet forum users all collectively realized we could edit photos really well
sdxl can get you nice stuff as well 🙂
how far we've fallen. from weird "freaks and geeks" style photoediting to just complete waifu endless fantasies.
i blame filthy frank
(Close up, POV, production still, detailed depth), Beautiful Evil queen, high cheek bones, raging tyrant posture, long jet black hair, tiny crown, furrowed brow, 35mm, high heeled boots, brutalist castle balcony view, purple red Hugo Boss bodysuit, high fashion,high collar cape, grazing natural light, early morning, pure shadows, analog photography
Anyone have problems with today's update of ComfyUI? Any KSampler fails... or is it just me?
Error occurred when executing KSamplerSDXLAdvanced:
'list' object has no attribute 'to'
File "A:\AI_Files\ComfyUI\execution.py", line 152, in recursive_execute
Waiting on SDXL renders be like
@lusty wolf it can be caused by a different node. I got this from an error on AIT
Haven't installed anything new... but I will remove some nodes and see...
guys, is it easy to transfer children's book characters and style with IP-Adapter?
Thanks for your help. I deleted the newest folders of nodes, including MTB and Fooocus nodes. Now it works. Dammit... a waste of time...
i couldnt do it yet
hit and miss
I'm looking for some help with inpaint dedicated workflow... if anyone got some 🙂
@uncut fiber ComfyUI working again...
Watch this. Might be what you're after? https://www.youtube.com/watch?v=ivi34PESgjU
Finally... You can paint on Image Refiner.
will watch in a sec. still 'miring your albino xenomorph in the city
is this video mute?
Yeah unfortunately... follow the pictures... 😜 Find Dr. Lt on the AI Lounge server.
Dua Peepsa 25 Pop star singer beautiful marshmallow creature, chocolate hair, round marshmallow head,tiny brown dot eyes, sugar Marshmallow covered, pastel-colored body, sugary texture, soft and squishy stature, candy-eyed gaze, bob cut, model pose, midriff shirt, contemporary, beautiful
@half cedar this all bing dalle3?
No - sdxl, I mark my Bing images
r2z2 all of these are your prompt with different LoRAs or styles
infinite possibilities
let me try
Very nice! Base model?
?
Lol
Ork in cheese pepperoni pizza armor
Killer unbelievable gen, stuck in my head lol
On a platter bier.
pizzasmiths today no longer know how to craft this
Berzerker Viking,wearing pizza cheese pepperoni armor,
yo!
DJ Crispy
And the audience is also cocking
im confused