#✨|sdxl
you should pay your attention to another profession

XD
I think it's tried to merge the N and T
2.12s/it for 3090 sound right to yall? on sde gpu
Which SDE
2M or the other one?
And actually 2s/it that's slow if you are doing 1024x1024
I get 2it/s on a 3080 at that res
i have some charm to some extent actually, huh
just sde gpu and karras, 872x1128
That seems slow to me
about, yeah
thats without refiner btw
SDE GPU is slow
2M is faster
let me check what I get
it's good to know,but the base model define the majority, i tested 1.5 and 2.0 plus either
Which is what I use, about double the speed
86 seconds for a batch of 4 just hurts
Oh that speed is at batch 4?
thats the one that failed my aesthetics tests in this server horrifically 😅
Been working well for me
2.29it/s should be what dpmpp SDE gives on a 3090
I think they forgot to mention they are doing batch 4
Which would explain the lower speed
indeed
is there a tutorial for installing it into the a1111 webui now that it's merged?
(I have it running on h100 server but I couldn't find the instructions for actually adding SDXL model)
you are using a pretty slow sampler, and likely also excessive steps
just using the same i've used on 2.1 for months
you can never change a human being's nature,otherwise Elon will not provoke and Muck will never respond for a cage fight, the only thing will change forever is being conquered or with love , which one will you choose
ok, what the hell are you on about lmao
like it was funny at first, now I think you are just actively trolling
the bots or robots will not make a decison emotionally nowadays, so we have a chance now still
if this turns out to be an elaborate hoax about us interacting with an automated instance of gpt-4, I'll be mad XD
pixel go brrrrr
I put "16-bit Sega" as a style
Yeah some of the style words do things completely different lol
find your true love ,that's the reason human survive
while I can always get the style I want, it's rarely using the words I'd expect. Half the time I ask Vit-H, and get weird names or styles - which then work perfectly, but why not the ones that should work???
Yeah it can be a little annoying sometimes trying to get certain styles
I've noticed that you sometimes need to either crank up the CFG, or heavily weight the prompt
whats the easiest way to ask Vit-H?
Pixel Art works though
so without doing anything fancy to share VAEs or Text encoders between the models,
- no CPU offload = 25GB VRAM, OOM on a 4090
- model CPU offload = 16GB VRAM, works great, adds about 10 seconds to the image gen
- sequential CPU offload = 8GB VRAM, adds 20 seconds to the image gen time
this is on a 5800X3D, the numbers are notably worse for a Xeon Gold
I have interrogator running in my old 1.5 webui
gotcha
sequential CPU offload with attention slicing and VAE tiling and VAE slicing = 90 seconds to gen an image, 2GiB of VRAM used
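A minimal sketch of the four setups listed above, assuming the numbers come from a Hugging Face Diffusers SDXL pipeline (the chat does not confirm the exact code). The method names called on `pipe` are the usual Diffusers pipeline methods; `apply_memory_mode` and `RecordingPipe` are hypothetical names for illustration, and the stub only stands in for a real pipeline so the snippet runs without a GPU.

```python
from typing import Any, List


def apply_memory_mode(pipe: Any, mode: str) -> List[str]:
    """Call memory-saving methods on a diffusers-style pipeline object."""
    if mode == "none":
        # Everything on the GPU (reported above: ~25GB, OOM on a 4090).
        pipe.to("cuda")
        return []
    if mode == "model":
        # Whole sub-models move to the GPU only while in use (reported: ~16GB).
        steps = ["enable_model_cpu_offload"]
    elif mode == "sequential":
        # Individual layers are moved on demand (reported: ~8GB, slower).
        steps = ["enable_sequential_cpu_offload"]
    elif mode == "sequential_min":
        # Lowest VRAM, slowest (reported: ~2GiB, ~90 seconds per image).
        steps = [
            "enable_sequential_cpu_offload",
            "enable_attention_slicing",
            "enable_vae_slicing",
            "enable_vae_tiling",
        ]
    else:
        raise ValueError(f"unknown mode: {mode}")
    for name in steps:
        getattr(pipe, name)()
    return steps


class RecordingPipe:
    """Stand-in for a real pipeline: records which methods were called."""

    def __init__(self) -> None:
        self.calls: List[str] = []

    def __getattr__(self, name: str):
        def record(*args, **kwargs):
            self.calls.append(name)
        return record


demo = RecordingPipe()
apply_memory_mode(demo, "sequential_min")
print(demo.calls)
```

With a real pipeline you would pass the loaded pipeline object instead of the stub; the offload modes are meant to be used instead of moving the pipeline to the GPU yourself.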
that 5800X3D, man
what a beast
Accidentally made a crap quality Mario level
sounds like a fun but terrible idea
gen image -> automate into mario level -> suffer XD
I see Minecraft still works as a style
I always struggled to get space helmets to work properly
Seems to actually put the head inside the helmet now
Yeah
it'll still gladly screw up helmet visors
It's not that blurry
just seems to be luck of the draw
it's so blurry my eyes are like "what the hell" lmao
best ive got of space helmets, it doesnt do a good job at those at all lol
a young guy walking through the stromy road as a part of a life.
stromy?
anyone know how to fix it?
i hate to nit-pick but everything this guy does is a part of life
so this is just in hindsight... but could it be that a lot of people cant get training to work, while some can on the same amount of vram - be due to their card not having bf16 support?
It would explain the huge difference in vram between different people
You need to install those nodes
nope, mcmonkey ran the test on a 2070
why tf does it switch between it/s and s/it?
3080 and 3090 are the first ones with bf16 hardware support for the RTX series.
mcmonkey did it in fp16 on a version of sdxl where that wasnt broken yet, so he doesnt count XD
Because sometimes you are doing more than 1 iteration per second and sometimes it takes multiple seconds to do 1 iteration
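A small sketch of that switch, assuming the progress bar follows the common tqdm-style convention (it/s at one iteration per second or faster, otherwise the inverse as s/it). `format_rate` and `per_image_rate` are hypothetical helper names, not functions from any UI; the second one normalizes by batch size so a batch-4 run can be compared with a batch-1 run.

```python
def format_rate(iterations: int, seconds: float) -> str:
    """Format speed the way a tqdm-style progress bar does."""
    if iterations <= 0 or seconds <= 0:
        raise ValueError("iterations and seconds must be positive")
    if iterations >= seconds:
        # At least one iteration per second: report iterations per second.
        return f"{iterations / seconds:.2f}it/s"
    # Slower than one iteration per second: report seconds per iteration.
    return f"{seconds / iterations:.2f}s/it"


def per_image_rate(seconds_per_iteration: float, batch_size: int) -> float:
    """Image-iterations per second, so batched runs compare fairly."""
    return batch_size / seconds_per_iteration


print(format_rate(229, 100.0))            # fast run, shown as it/s
print(format_rate(100, 212.0))            # slow or batched run, shown as s/it
print(round(per_image_rate(2.12, 4), 2))  # batch 4 at 2.12s/it, per image
```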
do u have a node source link?
Interstellar is exponentially hard for human beings
so use 0.xx
yeah i love how he's always comparing to us on old old versions "bcuz its what's lying around" 
No idea what those nodes are, try googling them
@uneven dove dusty i mean !

Can I get another one?
lmao this is just making weird stuff now
although fp16 training may be a thing again, as of today
no lol
the bf16 was needed for OpenCLIP
you get 'numeric instability' in the text enc
oh. oh well then 🥲
you can load the vae in fp16 tho now
I mean this kind of dusty.@uneven dove
ask for "burning man", then
what is four-dimensional space,time plus geographic,human beings survive is for love priority,if love is no longer exist,it's the time robot will tak prominent
@uneven dove sorry for the misunderstanding. Will you ?
four-dimension, like an altruistic version of measuring from zero rather than one? when in four-dimensional space, the time is embedded in coordinates, they are inseparable, you know, this why it's called space-time continuum no beginning or end
i do like the movie interstellar, one of my all time favorites
you have one image remaining
Can you get a well dressed guy moving away like the previous one. @uneven dove
u should ask the aliens if they exist, idk, four dimensions is all we can understand, but how can we grasp the fundamentals of higher-dimensional meaning with the existing deep learning strategy, who knows in future
dimension is just a way to move
Length. Width. Height. these are dimensions
time, another one
Hmm I do get a lot of blurry images sometimes. It's weird, you'll have a bunch that are blurry and then they aren't again
Why ?? @uneven dove
sometimes it works sometimes it doesn't
move forward,who can tell, That's one small step for man, one giant leap for mankind,To see a world or the galaxy in a grain of sand
you know pale blue dot ?
@dense chasm Go fucking watch rick and morty together guys.
Fuck off !!
////////////////_____________
you have no more image credits
Look at the image you send me.
Frank still has a ton
can u tell me what's the implication of Prometheus
u aren't aware that i'm currently chatting with u through a proxy agent across half of the world, lol..
using " shot on iphone" seems to work well at getting some of the blur out, also use photographer names that dont have any blur.
shadowsocks?
the universe has existed for billions of years, we are negligible, dinosaurs disappeared as well
nope, chinese exists for almost over 5 thousand years lol..
I tried to use Shrek in a prompt and it's done him dirty lmao
Big ass monster
I'm not sure if this is much better though
SDXL 0.9 ComfyUI
where is Caribbean pirate,captain,huh
sometimes i wonder why so many people want sdxl to work in a1111 so bad. like i get it, i started out in a1111 but at least give a thought to using comfy. 😅 ignore me tho i just wanted to get that out of my system
Just don't mind what others prefer or do and focus on yourself, applies to all aspects of life tbh
Why not, it has tons of extensions and people already have it built into their workflows
forget about it, u should come to China for a visit, the chinese are awesome, almost like the majority of American ppl, and you will fall in love with chinese food huh
it's called beating a dead horse
Personally I had already done the work of writing custom code for A1111 so I would have liked to see support earlier
ah yes, the sunk cost fallacy
the trick is to tell them to use the bot XD 50% of the time that is actually what they wanted. an easy and intuitive, no hassle, way of using sdxl
Who cares about the codebase dude all that matters is the actual ML programs
the other 50% are booba in your basement
Yeah but then they can't make their weird nsfw stuff
A1111 isn't even a good experience for users
so that argument falls on its face, too
See my previous response
They don't know that though. They press button, pretty picture comes out.
"who cares about the codebase" you just explained why you do lmfao
this
?
i understand theres a workflow aspect there, i get it. i guess for my workflow it just is easier to use comfy. may not be for others.
InvokeAI and Vlad's fork have SDXL support. will you switch? no, because you care about the codebase
I moved my workflow over to comfy actually
agreed, A1111 connection errors sometimes will pop up
But it took days
That's not a fallacy
It's actual time I could have spent doing something else
time to find out if samples are finally working in kohya x_x
it's just that your own reasons don't align with what you accuse others of.
technically, they should be
How so
i was using Diffusers on day one, not a whole lot of thought or work went into it. but you're wanting A1111 to work, because you "care about the codebase". but then, when we point out that keeping A1111 alive is "beating a dead horse", you say "who cares about the codebase?" - even though, you do. you stated you do. switching to comfyUI i guess you can bring that up if you want to muddy the waters, but the discussion started out as one thing, and that's what i'm responding to
base only
I don't care about the quality of the codebase I just care about having a shared platform for people to write code for and tie in their existing image manipulation programs into to create a unified workflow, and first mover advantage is a real thing
that's what Diffusers is, and that's not what A1111 is.
it's literally a single platform that reimplements everything in its own way
I tried to use "Red Faction" as a style prompt and it's taken the RED very literally
Diffusers is a shared library that's in use by InvokeAI, SD.next, my own bot, Huggingface spaces, yada yada
Everyone makes stuff for A1111 including non ML image manipulation stuff, diffusers is just an API for diffusion models specifically
naw, Diffusers is a subset of the Huggingface toolkit, they even have Transformers support
the Diffusers ecosystem is actually expanding more quickly than A1111's which is one reason Vlad is switching to it
most SD tools are released for diffusers before auto - people just immediately port it over
tbh he copied SGM code this time, not Diffusers
Comfyui doesnt even have some of the most basic image manipulation nodes you could find in even Chainner which is incredibly minimalistic, comfy is overly focused on SD
I doubt that but ok
(I mean, new advancements that start as arxiv papers, usually release for diffusers or for compvis codebase)
(but research code is usually trash anyway ngl)
i'm personally a fan of quality abstraction layers - lemme just use a single API that can translate to whichever backend supports my needs
comfy has tons of community-made addon tools, they're just not really centralized very well atm
Like I said not everything needs to be a bleeding edge ML solution, SD should be pipelined with lots of other image transformations that have been out for decades
but now you're asking for the level of tools/support that companies usually charge for
at that point, might as well use photoshop beta, which has everything you'd need - but is paid
dont forget this is open source (to a certain degree)
and also just trying to reinvent Diffusers.
on the plus side, due to its open source nature, you can be the change you want to see
so cool,automatic1111 adds-on extensions always connections errors, you are a AI professional experts as well🤝
This was the first thing that I noticed. Someone other than me needs to get on it and making it popular like civitai does with models. Heck, even civitai could add a spot for it.
You guys are misinterpreting what I mean, I just wish I didn't have to create a dozen custom nodes to make my workflow work in ComfyUI (13 actually)
what
For extremely basic stuff
dont think about it. its a chatbot XD
Create Pull requests with them in on Github so it can get put into the main thing
i heard there might be a way that'll get easier in the near future
Honestly just looks like a translation issue, language barriers are tough but you get used to it after a while in the professional world
u told me once before how to set the cfg values appropriately with sdxl 0.9 i remember still, appreciate for your remind dude
well yeah - but the thing is that comfyui hasn't been around as long as a1111
so its a genuine time difference.
(also, if you want your nodes to become permanent additions, send him a pull request - as long as its user friendly, and not counter-productive for future innovations, I doubt he'll reject)
I dont have a github and it would genuinely take me twice as long to make the code ready for a PR than it did to put it together lol
its not a chatbot, just a chinese man with poor english. sometimes hes hard to understand lol
fair enough
BTW I dont want to hear about A1111 having spaghetti code when my workflow ported over to Comfy looks like this
seems casual
the code for that is way more elegant and maintainable
I'm pretty sure there is more spaghetti there than in a normal spaghetti dish
sample generation during training... is... very sketchy - to say the least
already crashed for multiple reasons XD
Sure, now imagine I want to make a 3x3 grid instead of 2x2
You can easily clean that up tho. My blueprints in unreal engine were chefs kiss 👨🍳😘👌 compared to the spaghetti that I saw others have. Like jesus people, clean up ya work.
i don't know, sounds like a problem you're trying to solve using the wrong tool
I agree
i dont care what the code looks like. as long as i got an environment that i can do creative workflows in.
I don't feel like comfy is there yet. Amazing tool you can wire up for specific situations, but it doesn't integrate with other systems as well. I'll load up comfyui for specific tasks, but i'll load up auto to be creative and keep things flowing.
for tiling, i have python scripts i use, that GPT4 helped me write (because i honestly hate trying to figure out recursive loops with off-by-one errors)
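A hedged sketch of the kind of tiling helper described above (not the poster's actual script, which isn't shown). It computes the boxes for an arbitrary rows x cols grid using integer boundaries, so adjacent tiles share edges exactly and the usual off-by-one gaps or overlaps can't appear. `grid_boxes` is a hypothetical name; the (left, top, right, bottom) tuple layout matches what Pillow's `Image.crop` accepts.

```python
from typing import List, Tuple

# (left, top, right, bottom); right and bottom are exclusive.
Box = Tuple[int, int, int, int]


def grid_boxes(width: int, height: int, rows: int, cols: int) -> List[Box]:
    """Split a width x height image into rows x cols tiles, row by row."""
    if rows < 1 or cols < 1:
        raise ValueError("rows and cols must be at least 1")
    boxes: List[Box] = []
    for r in range(rows):
        # Each boundary is computed from the full size, never accumulated,
        # so rounding errors cannot drift across the grid.
        top = r * height // rows
        bottom = (r + 1) * height // rows
        for c in range(cols):
            left = c * width // cols
            right = (c + 1) * width // cols
            boxes.append((left, top, right, bottom))
    return boxes


# Going from 2x2 to 3x3 is just a parameter change.
for box in grid_boxes(1024, 1024, 3, 3):
    print(box)
```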
simple things like adding a lora to a prompt in auto, but it requires rewiring the whole net in comfy
its missing a lot of nodes - such as variables, which is why it looks like madness
also a UE4/5 user. So I feel the pain when using comfy and not having a lot of nodes T.T
especially the limited reroute options
it's true, btw, chatbots will be banned in China, it's tedious and ridiculous to do that in China, they should focus on their niche market tbh
i thought webo or whatever it's called, had one
china has a lot of laws that the chinese people just don't care about
that problem may or may not solve itself in a few days - as a voice in the wind has told me
i grew up being taught that the 1 child policy meant chinese people were all killing their daughters to get a son instead. turns out no, that never happened. those people just did the human thing and raised the daughter and declared the son.
frank i just hope we can show more of your countrymen, this kind of art
i feel it is harmless
i will not argue with you about it, but there are lots of tunnels,as i chat with u guys
people overstate how much enforcement the chinese government has over ALL of CHINA
the issues with comfyui are frontend issues because there's not that big of an abstraction between what you see in the frontend and what is sent to the backend
you nailed it. thanks for chiming in. Yeah. There's no abstraction view layer. I'm sure you of all people know it
there's a time and a place for this discussion, somewhere / somewhen else
you have no idea who i am, btw, who u care about, lmao
that was the only comment about it because it was pertinent. you've blocked me. move on.
i care bout everyone
inspecting illegal tunnel operation
i think it got their uniforms wrong...
it's a true facts,cuz China economy growth,being a while,ppl will not focus only in China ,Japan is another example
omg i made my ramen too hot
jealousy intensifies
and the issues with a1111/derivatives are core issues that are difficult to solve
coughgradiocough
Gradio has some great advantages.
it's been a good little gradio
sous chef shrek cooking gold flake ramen
The issue is how Auto1111 uses gradio
as an ex web dev/designer, I am deeply pained by this statement
it's not only because of gradio but that doesn't help at all because of how closely it ties the backend and frontend
I'm loving the fact that ComfyUI doesn't melt my computer, unlike A1111 rn
my first approach to a platform was that, but the second one was to decouple everything so that i don't need to, for example, use a Discord bot as a frontend to interact with my pipelines
it's just one of the frontends
Comfy , you seem to be building something much more purposed and I feel like you're using an MVC model with real engineering experience behind it. I like what i see you doing over there.
but i want that presentation layer
Freedom is like a breath of air, huh, i thought so once before, it's a trade-off in the end
not much air in a tunnel underground
If you separate the gradio code from the rest of the code there aren't many issues with Gradio. But yeah if you mix it in with everything else then it becomes a mess, but this applies to everything.
automatic1111 feels like the kind of engineering fishermen do to get their boat out as low effort as possible
well gradio has API clients that make interaction with it easy
the goal is to bring in fish afterall
Auto1111 isn't using api_names in the code and that makes using gradio as an api a bit harder, but that's the fault of Auto1111 and not gradio.
what you doin in the back there shrekky
EasyDiffusion was my ideal ui... it just never got enough love and support T.T
maybe,different circumstances,if as a global perspective,in a country side,there is win-win situation for good
i don't like the programmer ui of all these gradio derivatives. invoke looks nicer but i still think they miss the beat somehow. i often can't get it to install or stay up to date either. their dependency chain seems fragile
i can use them but i've never liked a straight up form field as a primary interface
comfyui was primarily designed as a backend and the UI I put on top was the easiest one that exposes all the backend functionality because my strength is not frontend dev
you achieved that goal so well. it's a very cool node editor
@uneven dove did you source the dataset to train?
shreking bad
whats the subject?
no idea what this is in reference to
@high skiff was saying that he was collaborating with you to train a model
I said we are considering it, but we don't know what we would be doing yet
Any plans for AnimateDiff support?
oh, i thought you guys were kickstarting it on 18th
i remember in Maya 3d, when i used it more back around 2004-2005, i would edit nodes in the graph, but that would wire up to the controls in the front end. and i could create custom controls that would link to different values in the graph. i'd love to see comfy get there one day but i understand that's a tall order.
yep that needs 1.0 to come out first so that we know what it even needs a fine-tune for
I never said that lmao
that's wild
Jamie, pull that clip up, where Sytan said we're kick-starting training on the 18th
All I said was he and I were planning on doing a finetune at some point
People can add what they want as custom nodes. AnimateDiff isn't really a core feature.
not sure of what, not sure of when lol
my bad, i dont remember
@uneven dove Oh also, how's the a100?
it crashed some more but idk, stopped worrying about it lmao
been playing with the 4090's memory issues in SDXL today because it runs out of memory when trying to generate, which pisses me off, it's a 24GB GPU
What is the distinction between controlnet and AnimateDiff in that regard?
u know ,when i am in a young age ,i recognized some fuxxing assholes from State of Michigan, and some good men from Denver,so it's hard to tell,huh
i'm too lazy to fix it in a precise way so i've gone ahead and simply enabled CPU model offloading on the 4090
the solution is to run away and hide, any time people you don't know, start walking toward you. that is what i do.
not towards me, the fucking asshole in Michigan screwed over Chinese girls in my age,but i agree with u, at that time, i can't do anything
good luck dal
That's got some pretty good detail on the helmet and stuff
Little scratches and stuff like that
small fine grained detail, it's so much better at that than sd 1.5. Even where 1.5 succeeded with finer detail, it almost seemed like a tiled aesthetic. something similar to photoshop's content aware clone tool. I don't catch those vibes from the details made by sdxl
I still can't decide what the best way is to use the text encoders height and width settings as you can get massively different results depending on what it's set as.
why yes training sampler, that is a blonde girl sitting on a chair...
"working" is a very broad definition it seems
This is the same as the previous image, only thing that's changed is the text encoder's height and width values
i dont even understand why a text encoder has a width and height and why it's so huge
i just go with it
I mean setting it too high, or too low gets you bad results
but it changes stuff quite a lot
you can enforce the bias, thats associated with existing resolution size + aspect ratio
useful for portrait photos, in non portrait format
reminds me of a young daniel day lewis as the butcher
I use it so it matches the aspect ratio, but these changes are just from multiplying the amount
I've found 2 times the latent image resolution seems to work best
But it can change generation to generation
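A rough sketch of the "multiply the image size" approach described above. The dictionary keys mirror the input names on ComfyUI's SDXL text encode node (`width`, `height`, `crop_w`, `crop_h`, `target_width`, `target_height`); the helper name `text_encoder_size_inputs` is hypothetical, and keeping the target values equal to the image size is an assumption, since the chat doesn't say what they were set to.

```python
from typing import Dict


def text_encoder_size_inputs(
    image_width: int,
    image_height: int,
    multiplier: int = 2,
    crop_w: int = 0,
    crop_h: int = 0,
) -> Dict[str, int]:
    """Scale the size conditioning while keeping the image aspect ratio."""
    if multiplier < 1:
        raise ValueError("multiplier must be at least 1")
    return {
        # Size conditioning: a multiple of the image resolution.
        "width": image_width * multiplier,
        "height": image_height * multiplier,
        # Crop conditioning: left at zero unless you are experimenting.
        "crop_w": crop_w,
        "crop_h": crop_h,
        # Assumption: target stays at the actual image resolution.
        "target_width": image_width,
        "target_height": image_height,
    }


for m in (1, 2, 3):  # same as image size, double, triple
    print(m, text_encoder_size_inputs(872, 1128, multiplier=m))
```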
oh wait. no. I mean the other node you can also plug it in
that one simply stretches it
you don't have to tweak every input 😛
But there's buttons to press
i just leave the target/crop values at defaults
haven't even played with them once
i have zero curiosity about those
all the more reason to leave them alone
you see, i don't know what i don't know. ergo, i don't know what i'm missing out on if i never see it. ergo, i never have to wonder, "what if i changed this setting?" because i simply ignore its existence.
Ooh what's the style? Makes me think of an Atari 2600 box art
Oh I see
pseudo's feelings about LoRA summed up in one sentence XD
imagine i change the Target value? and suddenly, now i don't know what the default output would have been.
that's a nightmare
For the one you replied to it's just "Pixel Art" with a fair bit of weighting
damn it worked well too. that really nailed the aesthetic
Is this diffusers? Why is the VRAM utilization so high at no CPU offload?
I mean I guess really it's the same as just changing the seed. There's so many possible results, it's not like it matters too much what you change to get the result you want.
it's high because it's SDXL
Huh interesting
lora gives rajiv the cold shoulder
inference? I'm staying below 20GB at batch size 8 no CPU offloading...
And sometimes you get very similar outputs.
This is left to right, Same as image size, double image size, triple image size.
there's many permutations of "no CPU offloading"
I saw kohya-ss has support for full bf16 training. i don't know what it means but the patchnotes say it only works with libraries built for linux as there's no compatible version of bitsandbytes for windows yet.
Would it be worth setting up wsl2 on my 4080 for that? what does full bf16 support even mean
bf16 uses more exponent bits than fp16 (wider range, but fewer precision bits), but hardware support could mean optimizations there?
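For reference, a small sketch comparing the two 16-bit layouts using only their bit counts: fp16 has 5 exponent bits and 10 mantissa bits, bf16 has 8 exponent bits and 7 mantissa bits. `float_format_limits` is a hypothetical helper; it derives the largest finite value and the spacing just above 1.0 from those counts, which shows why bf16 overflows far later than fp16 but resolves fewer digits.

```python
from typing import Tuple


def float_format_limits(exponent_bits: int, mantissa_bits: int) -> Tuple[float, float]:
    """Return (largest finite value, epsilon) for an IEEE-style binary format."""
    bias = 2 ** (exponent_bits - 1) - 1
    # Largest finite value: all mantissa bits set, at the top usable exponent.
    max_finite = (2 - 2.0 ** -mantissa_bits) * 2.0 ** bias
    # Epsilon: gap between 1.0 and the next representable number.
    epsilon = 2.0 ** -mantissa_bits
    return max_finite, epsilon


FORMATS = {"fp16": (5, 10), "bf16": (8, 7)}
for name, (exp_bits, man_bits) in FORMATS.items():
    max_finite, eps = float_format_limits(exp_bits, man_bits)
    print(f"{name}: max finite ~ {max_finite:.4g}, epsilon = {eps}")
```

The wider range is the usual reason bf16 is preferred where large intermediate values would overflow in fp16.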
The crop values I don't touch because I don't understand them and for example this is the same image as above but with a crop value lmao
almost looks like a great header for a document though
purpose
you dared to dream, Arron
I think I sort of understand what the targets are doing, but I don't understand the crops
Like again, same image, different crop value lol
so sdxl's primary rival is midjourney, we are all in Gotham city online🤣
honestly i don't even think the crop conditioning did a damn thing for the model to learn how they work
i think we'll see better understanding with "crop" values as the release and official documents pour out
how could it?
in the training, i saw that you can train one image with many different crops through that conditioning
refined models might benefit from the values more
The only thing I can think of how crop could be used is that you get a wide image, and what you want is too out of the way, so you use the crop. But it doesn't do what I'd expect
it all ties into the crop conditioning that is built into sdxl. something new they came up with
yeah i've noticed the Diffusers project just overrides the crop values to zero during training
even when it crops the images
tbf, training adds a lot of values simply because they exist - even if in practice they are bad or incredibly destructive
^
this is the "innovation" that SAI did, adding new conditioning values in the hopes that it'll Just Work
It's like you get this image and then you add crop 256 on width and
like what lmao
they almost added the aesthetic score to the base model but it made it stop following prompts very well
What is it doing
I'm grateful it's there - cause niche cases will find use for it. But it's not something that should be a core focus for anyone not experiencing issues with a lack of that setting
Experimenting with conditioning values is a good thing as long as you can check if it is doing what you want or not.
the goal of crop conditioning was to prevent the cropped face problem from 1.5 base. Now the model has data telling it the bottom half of someone's face during training is just a cropped off portion of it and there are 200 pixels above it still
git clone comfy with cloud usage like colab or diffuse locally, u should learn a lot anyway
inference exposes the crop value, but i don't think they're meaningful yet
the cropped face problem in 1.5 was, unsurprisingly, because they randomly cropped images when training 1.5
yeah
SDXL training uses data bucketing
you've blocked me. i don't think you're reading my posts.
Even tiny values change things, this is 10 on crop_w
thank god for bucketing x_x
the crop issue was solved by community fine-tunes for 1.5 and 2.1
I cropped around 2k images in total, before it became a thing...
dunno why SAI thought they need to add new conditioning values
More stuff to make it look more complex?
no it's just new research and new research almost never makes sense or does what you want
i can feel your temper huh
No temper, just interesting how little values can change so much
its less about inference and more about training at scale for pre-training, since it allows cropping of images to not affect context in images as the model learns to associate the cropped spaces with the context in relation to it. It does also help with inference but the mixed aspect training mostly fixed that. The conditions give a full map of image context for the model to work with and build associations around
img2video with continuous frames of pics, it will be fun to make a video with scripts
https://github.com/Stability-AI/generative-models/blob/main/assets/sdxl_report.pdf
"To fix this problem, we propose another simple yet effective conditioning method: During dataloading, we uniformly sample crop coordinates c_top and c_left (integers specifying the amount of pixels cropped from the top-left corner along the height and width axes, respectively) and feed them into the model as conditioning parameters via Fourier feature embeddings, similar to the size conditioning described above. The concatenated embedding c_crop is then used as an additional conditioning parameter. We emphasize that this technique is not uniquely applicable to LDMs. Note that crop- and size-conditioning can be readily combined. In such a case, we concatenate the feature embedding along the channel dimension, before adding it to the timestep embedding in the UNet. Alg. 1 illustrates how we sample c_crop and c_size during training if such a combination is applied."
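A minimal sketch of what that passage describes, written from the quoted text only and not from the report's Alg. 1 or SAI's code. It samples the crop offsets uniformly and turns each one into a sinusoidal (Fourier feature) vector, then concatenates them into one crop conditioning vector. The function names, the embedding size of 8, and the `max_period` of 10000 are all assumptions for illustration.

```python
import math
import random
from typing import List, Tuple


def sample_crop(height: int, width: int, crop_h: int, crop_w: int,
                rng: random.Random) -> Tuple[int, int]:
    """Uniformly sample (c_top, c_left): pixels cut off at the top and left."""
    if crop_h > height or crop_w > width:
        raise ValueError("crop must fit inside the image")
    c_top = rng.randint(0, height - crop_h)   # inclusive on both ends
    c_left = rng.randint(0, width - crop_w)
    return c_top, c_left


def fourier_features(value: float, dim: int = 8,
                     max_period: float = 10000.0) -> List[float]:
    """Sinusoidal embedding of one scalar (timestep-embedding style)."""
    if dim < 2 or dim % 2 != 0:
        raise ValueError("dim must be a positive even number")
    half = dim // 2
    freqs = [math.exp(-math.log(max_period) * i / half) for i in range(half)]
    return ([math.cos(value * f) for f in freqs]
            + [math.sin(value * f) for f in freqs])


def crop_conditioning(c_top: int, c_left: int, dim: int = 8) -> List[float]:
    """Concatenate the embeddings of both crop coordinates into c_crop."""
    return fourier_features(c_top, dim) + fourier_features(c_left, dim)


rng = random.Random(0)
c_top, c_left = sample_crop(1200, 1600, 1024, 1024, rng)
c_crop = crop_conditioning(c_top, c_left)
print(c_top, c_left, len(c_crop))
```

At inference time the same vector would be built from whatever crop values the user supplies, with (0, 0) meaning "nothing cropped".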
very cool report. they did train with buckets, but also random crops
its worth reading for anyone interested in the technical side of sdxl
i don't see how that does that, when it merely knows "i am a subset of an entire image", but those additional tiles aren't provided after with the appropriate crop values, are they?
eg. when an image is cropped, all cropped pieces get fed in?
the crop issue of 1.5 was never solved. it was worked around.
and if these conditioning inputs do affect training and help it so much, why doesn't it require fiddling with the values during inference to maintain the result?
fundamental flaws in the base model can only be solved with a new base model
it doesnt need to see all of an entire image when training at foundational scale. If it sees many images that are missing the left side and mention a person and then see some on the right and top and bottom with a person it will build proper associations for context in relation to the training augmentation (i.e. crop/scale)
Same, some of my earlier datasets were cropped because that was what I was told, then I padded my images until bucketing and just leaving the images as is became the norm.
that sounds like wishful thinking considering SAI has shown SDXL so many images of celebrities that it still doesn't even get their faces correct?
you'd need a pretty even distribution of training data for that to work, right?
the report is useful to read
a lot of research is applied
i wouldn't call it "wishful"
we can call a lot of SAI's approaches wishful thinking, especially putting SD 2.1-v through 1.9 million steps of training
thats not wishful thinking, thats how it works haha. Also its seen a ton of terrible images of celebs tagged that way too. We dont specifically map over celebs with a good set of them, one good fine tune would likely fix it but we arent trying to make the celeb model atm lol. Anyone is free to train them once its out
you can sure. but you haven't read the report is what that tells me. nothing personal.
i'll call it applied research towards a goal
well caption dropout should be helping those unseen prompt features be extracted, right?
this is why i blocked you lol. i did read the report. it is lacklustre. it doesn't describe how they arrived at many of their decisions or even get into the details of how it works / how well it works. it is a boring paper.
comfy agreed with some of my points, on what it is lacking
you know they tried T5 as a text encoder but that's not even mentioned in the paper?
i'm always hoping there's a second one coming, but they've not mentioned it yet.
pytorch cross attention seems to be a second faster on avg on a RTX2060S than xformers
21s vs 22s
someone was saying that earlier but with about a 3 second gap, on a 3080.
newer gpu (more cuda cores), more difference it seems
I'll also try it on 4090 once my lora is done in a few hours
more CUDA cores, Tensor cores, RT-TFLOPS, etc 😄
I've been offered a new RTX4090 by my employer but I'm not sure I want to have it.
Not sure if it's worth it.
i'd ask why they're giving you one? and why not an A100 instead? 😁
H100! you're outdated
are you trying to get them fired? that's expensive
Learning, keep on top of this
XD
Got OOM trying to do a 2nd pass lol
Thankfully Ultrasharp-4x is pretty good at game style images without wrecking them
you wanna gift us one?
Hmm you can get some unexpected stuff with img2img
even if you invest 1k, along with 40 other people for a "shared" H100, you'd only get 9 days to use it each year, in a fitted timeslot XD
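The arithmetic behind that estimate, as a quick sketch. It assumes 41 equal shares (you plus 40 others), a 365-day year, and no downtime; `days_per_share` is a hypothetical helper name.

```python
def days_per_share(total_shares: int, days_per_year: int = 365) -> float:
    """Days of exclusive use per year when time is split evenly."""
    if total_shares < 1:
        raise ValueError("total_shares must be at least 1")
    return days_per_year / total_shares


print(round(days_per_share(41), 1))  # you plus 40 other people
```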
i added the variant buttons to my discord bot so it's super easy to do img2img now. and i've been using a lot more as a result. it can be quite fun
god its gonna be expensive to rent...
so we're dismissing all of the pertinent information about crop conditioning in the report, because some of it was lacking ? oh okay. throwing out the baby with the bathwater seems apt.
throwing out the base ingredients, along with the finished chemical solution
(along with the research of all ways to not produce the chemical solution)
or something
just felt like an unfair comparison. no complaints
What’s better SDXL 1.0 Inpainting or Adobe firefly generative fill?
the report has a ton of information about why crop conditioning was used while building a foundational model. all i'm saying is that ignoring that was a problem. it created a huge misunderstanding here. When i said it was used for training, people came back to dunk on me by saying "nahuh, there's Bucketing!"
so i brought receipts and then now those receipts are being called inconclusive or lacking
it just seems very irrationally motivated to me
its subjective
Yeah it's cool, there were some people saying it doesn't work. But it absolutely does, just depends what you prompt it.
Is SDXL 1.0 Inpainting good tho?
photoshop fill has worked amazingly for me so far.
If I need a solution that works quick, and works well - I always switch to photoshop generative fill (but its also a paid service, so a bit of an unfair comparison)
Can you generate celebrities? It doesn’t work on Adobe firefly website but that’s what generative fill is on photoshop so idk
adobe will keep updating theirs and it'll keep getting better. SDXL will get controlnet inpainting plus community refinements. The best is the one that works best for your creative workflow that looks the best to your artistic eye
sdxl is my hobby generator
but if I need anything for work or real life situations - Photoshop generative fill is just plain reliable (since it was developed for real life situations)
objectively, adobe's will be limited in a variety of ways for legal reasons, since they're a commercial product
True
not a hobby tool. as a hobby tool it's equally good to sdxl at its best. It only shines when you genuinely need fixes for issues in photos that are sfw
Gotcha 👍
i'm sure you could fix nsfw photos if just the non naughty bits
had to use it for a sfw cosplay photoshoot - even that was too much XD even if you're trying to remove a stone in the background
Oof fr?
i haven't played with it much yet, but i bet they'll come to fix that. Fashion marketing and cosmetics will need capabilities. sex sells always
If someone has used the SDXL Inpainting, could someone post an example? I’m just trying to see if it’s worth downloading
cosplay is often tagged nsfw, so its a bit biased there. normal fotos work just fine
or crop, open as new image -> fix there -> move back to main image
can be worked around so easily lol
could probably set an action up to expedite that
Cause I’m trying to get a character to hold a gun and it blocks me, is holding a gun that bad?
Is there a workaround for it?
So Firefly blocks you from editing certain images (trying to follow)?
definitely read their compliance guide for it. You can get banned from the whole service.
Violence is specifically mentioned as a no go
probably the marketing department's idea. they don't want the adobe brand associated with guns in children's hands photos
just a long shot guess. don't read into it too much
Does someone have an example of an SDXL Inpainting?
ahhh they got compliance guides. yeah. legal and marketing shenanigans
that'll erode away as they find better ways to cover their ass
the spirit of adventure, 1955
Does someone have an example of an SDXL Inpainting?
Chill bro
yep img2img works lol
weird ping. i dont even have access to an sdxl model at my home.
I just don’t know if there’s a place where I can see inpaintings and thought u might know
¯_(ツ)_/¯
just download it and do it yourself lmfao
Balance this out with a dude
But I don’t wanna 😭
looks smooth because you're refining the output @eternal fog
I've not refined either of those, I'm just messing around and haven't got the settings down properly.
ah
i use 20 steps and 5.4 cfg on the img2img input on Base model.
those are hard-coded because it's a pretty narrow window where the i2i seems to work properly
decided not to let people mess with that parameter group
The denoise strength seems to have a pretty definite drop where it no longer looks like the input image at all as well
i'm at .8 on my examples
it can be rough
should get better results from this input (left side) with 0.73 strength
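(Rough sketch of why strength matters so much at 20 steps. This assumes the diffusers-style img2img convention, where strength decides how many of the scheduled steps actually run on the noised input; the bot's real implementation isn't stated here, so treat `img2img_steps_run` as a hypothetical helper.)

```python
def img2img_steps_run(num_inference_steps: int, strength: float) -> int:
    """Number of denoising steps actually executed for a given strength.

    Assumes the diffusers-style rule: the input is noised to `strength`
    of the schedule and only that part of the schedule is denoised.
    """
    if not 0.0 <= strength <= 1.0:
        raise ValueError("strength must be between 0 and 1")
    return min(int(num_inference_steps * strength), num_inference_steps)


for strength in (0.8, 0.73):
    print(strength, img2img_steps_run(20, strength))
```

Under that assumption, 0.8 runs 16 of the 20 steps and 0.73 runs 14, so a small change in strength already moves the result noticeably.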
Just to clarify this is 1.0 correct?
0.9
we do not have weights for 1.0 yet
pixar's spirit of adventure, 2018, cinematic

it seems that even if you stick to "1 megapixel area" resolutions, if you go too wide, integrity across the scene is violated
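(For anyone wanting to list the "1 megapixel area" sizes: a minimal sketch, assuming dimensions in multiples of 64 and a 1024x1024 target area. The 10% area tolerance and the `max_ratio` cutoff are made-up knobs for illustration, not values from the report.)

```python
def one_megapixel_sizes(step=64, target=1024 * 1024, tolerance=0.1, max_ratio=2.5):
    """List (width, height) pairs whose area stays close to `target`.

    Both sides are multiples of `step`; pairs wider or taller than
    `max_ratio` are skipped, since very wide frames seem to fall apart.
    """
    sizes = []
    for width in range(step, 4096 + step, step):
        # Snap the matching height to the nearest multiple of `step`.
        height = round(target / width / step) * step
        if height < step:
            continue
        ratio = max(width, height) / min(width, height)
        area_error = abs(width * height - target) / target
        if ratio <= max_ratio and area_error <= tolerance:
            sizes.append((width, height))
    return sizes


for w, h in one_megapixel_sizes():
    print(f"{w}x{h}  ratio {w / h:.2f}")
```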
it's consistent
That Low-key reminds me of 1.5 from 2 years ago…
Sure feels like 2 years ago lmfao
maybe you meant Craiyon or Dall-E Mini
Prolly
Converting stuff into game style is pretty good
I’ve seen it, it’s improved a ton
i was able to get a couple photoreal images from Craiyon a cpl weeks back
using OpenCLIP prompt style
reminds me of metal gear ps2 era
maybe ps3
its legitimately good at it
they scraped the entire web, thats obv that its way better
Adobe?
yep
they have adobe stock
they'll use the entire dataset
hey guys - where can I find more workflows? I'm finding a lot of workflows in comfyui using nodes I don't have. Is there any repo/huggingface with some?
unfortunately it's not very centralised right now
gigo applies there. they didn't use a generalized scrape for a dataset. they are also adobe and have been doing photography research for decades
they have data sets
leaders of digital editing research i'd say
if they want, they can even launch a very good txt2vid generator
civitai.com has a lot if you can wrangle the search function to show them.
oops wrong reply button. meant that for @amber fulcrum
oh ok
i don't think it can decide what it wants to do here, sort of half way between photoreal and not
The rain/water effects are cool though
nice! like shot from tekken cinematic
there are only sdxl 1.0 release candidates so far. 2 more days though so the feedback is probably locked in and they're refining the selected final run now
you can try the release candidates through the bots on the server last i checked
I'm trying to do a dragon ball z style but more realistic, but if I put dragon ball z in the top text encoder, it just does weird cosplay lol
put "cosplay" in the neg
Or whatever this is lmao
gokine using kaio-ken?
The results are "interesting" at least lol
thats how live action netflix adaptation should be done
Lmao, it's going a little heavy on the booba as well for some reason
so much boobscle getting flexed
I must be right on the edge on the prompts, because 1 seed it will be realistic, another seed it will be 3D Animation type look and then another it will just be Goku
And then adding movie still makes it actually look like anime?
I think it's just doing what it wants at this point
This one is amazing ^^
I'll do 4 in a batch and just show how different it's making stuff
(watch them all be the same now)
lol
maybe add "game screenshot" in negative?
cool style! what is the prompt
I think it's because my CFG is low
The prompt for that one is "Movie Still of a woman on a wet Japanese Street during the day, debris energy ball energy blast from Dragon Ball Z"
But it's not reliable. I've only gotten that type of style once.
What is that lol
when I tried Dragonball once I got extremely ugly results as if it was inspired by bad fan arts
gym girl
Yeah it likes to try Goku everything
like this
sus
lol it keeps using the "Ball" from Dragon Ball Z on other stuff
Giant Stone balls keep appearing
what about doing dragonballz
or dbz
dragonballz would be the same as I imagine ball is it's own token, but I can try dhz
yeah I think it is ruined by bad fan art
dbz*
got this
On it's own yeah I imagine it will do that
thats really good, 0.9 couldnt do that I think
This is 0.9
Nah, I'm just seeing what comes out
whenever theres two people it morphs their faces together
it sort of knows bleach, but merged multiple characters
is that ichizen
sdxl really understands linguistic context so much better. it could help to say "as characters from dragonball"
in some kind of full complete sentence
the context of tokens matters especially more so in larger models
I've seen Sytan's comfy ui brings in two different text inputs. I've been wondering how that works, but it's a nice approach.
thats better
he's really trying to intimidate her bangs with that gaze
hollow ichizen lol
Civitai I guess. For new nodes, use Comfy Manager to download missing ones.
stability is non violent
I wonder if they filtered that out with nsfw
hence the no guns, swords etc
nsfw isn't really filtered out
You can get it quite easily
I've even done it by accident
the discord bot censors words but it doesnt tell you
The bot does, but you can run it locally and you don't have that issue
yeah
I'm trying to get it to do Piccolo
It knows he's green because it's giving people green hair
But they just look "normal"
bleach visiting walmart arc
lmao, that's not Piccolo
pick up the reishibites, aizen
-kenpachi to aizen while seireitei is being attacked
in SOULMART
I think you cursed me or something, it's just generated naked titty for no reason
is there a tool only to upscale?
did it have a male nipple?
why is that so blurry though?
because that seems to be the way it goes
No, it was a full naked breast, how they should be
vhs lora
it's pretty good at drawing other people in bleach style, joe biden bankai
This isn't doing what I expect, but it's doing some decent images
is this workflow ok for comfy with refiner?
is this public!?
civital
try finding it on civitai
You don't use the refiner like that.
You use the Advanced KSampler, Start with the base and then do the last few steps with the refiner.
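(A sketch of that handoff as plain numbers, assuming two chained advanced sampler nodes with the `add_noise`, `start_at_step`, `end_at_step` and `return_with_leftover_noise` fields. `split_steps` is a hypothetical helper, and the 25 steps / 0.8 base share in the example are only illustrative values.)

```python
def split_steps(total_steps: int, base_fraction: float):
    """Settings for a base sampler followed by a refiner sampler.

    The base runs the first part of the schedule and hands over a
    still-noisy latent; the refiner finishes the last few steps without
    adding fresh noise.
    """
    if total_steps < 2:
        raise ValueError("need at least 2 steps to split")
    if not 0.0 < base_fraction < 1.0:
        raise ValueError("base_fraction must be between 0 and 1")
    handoff = round(total_steps * base_fraction)
    # Keep at least one step on each side of the handoff.
    handoff = max(1, min(total_steps - 1, handoff))
    base = {
        "add_noise": "enable",
        "start_at_step": 0,
        "end_at_step": handoff,
        "return_with_leftover_noise": "enable",
    }
    refiner = {
        "add_noise": "disable",
        "start_at_step": handoff,
        "end_at_step": total_steps,
        "return_with_leftover_noise": "disable",
    }
    return base, refiner


base, refiner = split_steps(25, 0.8)
print(base)
print(refiner)
```

Both samplers should be given the same total step count so the refiner continues the same schedule instead of starting a new one.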
all I see are 1.5 and 2.1 lora... unless your workflow redoes it with a pass in those
ahhhh
trump bankai lol
Bankai... Kyodaina Kōtei
Hmm, it can sort of do Psycho-Pass
Although I'm not sure where this outfit comes into it lmao
how do you make comfy use dynamic ram?
What do you mean
i have 16gb ram, i need ~24 to 30 to use sdxl + refiner
You shouldnt at all
that doesnt sound right at all TYP
odd, im getting lots of out of ram, it asks for ~24gb
SDXL know Ferrari F40 ❤️
Sometimes it feels like the background and foregrounds are totally seperate in xl
don't think so
looks like this atm
maybe there is a command for unload after use?
why are your images so small for a start
Im curious whats the point of seperate vae in xl
i think you are using dynamic ram to load it all, because combined it should be 12gb + 6gb
cpu mode
if you're going to clock down to system bus level, go all in
just use your cpu for genning at that point
a gpu only benefits speed when using the dedicated vram that is hardwired to the gpu
how much VRAM do you have @delicate grotto
6gb, its not a vram problem its a ram problem
seperately i can do almost 2048 using the sdxl
What are your launch parameters
none
xformers, normal vram
i think ill make a 2 stage for the pictures, save them as latent, and later on load them as latent into the refiner
what time is SD XL being released on Tuesday?
usually they broadcast the same day so ~between 18 and 20 europe time?
idk if it didnt link right @delicate grotto
howdy audi
Yeah I think its this as well
must be it, ill play with it tomorrow
for now without the refiner it is fine
thank you all for the tip
ive noticed this myself. is it trying to figure out depth by itself as part of composition?
I keep getting this annoying blur over the images when I use "Movie Still"
Like the whole thing looks a bit smudged
Maybe cause all the motion blur from slow shutter speed in film cameras
Yeah it might be, or film screengrabs
Trying to work out a better style word
Cinematic Camera Shot seems to work a little better
That's without refiner though
And it looks a bit plastic like
movie still doesn't seem to do that here
lol jlaw super saiyan's hair is even limpy
cigarette smoking intensifies
If you go too far with super sayian weighting it just turns into a dude
bro got bluetooth
argh lol, it's still doing the blur stuff
i think it's just your config or something
I don't see how, these are just a single shot through the base
It's got to be the prompting somehow
ok a little better
overwatch icon
i wanted it to climb the rifle, but it insists on holding it hehehe
i like this model
video game lighting
Yeah used RDR2 as a style for that and the wild west
the bot produced a NaN
Yeah
Thanks! So far I’m loving your workflow, it’s given great results. Are you planning on updating it as time goes on with new features?
And thank you for making it publicly available!
lmao
im surprised at how well it does porsche
close but her leg has fallen off
Is it just me or does that Porsche look a bit wide lmao
well maybe it thinks it's shoulders 😉
I remember trying to do hybrids with dall-e. it never worked
Told it to use scales instead
Seems pretty easy with this
Will Ferrell XL
nice hat!
18th
Thanks! 🙂 I am constantly fiddling with workflows, particularly a 2x workflow. Pretty much all the images I post will have the workflow in them. Feel free to use them, although I do use some other's custom nodes too. Getting a good image at 2x is tricky. Any ideas or questions, I'm happy to hear them. I mainly created the custom nodes for things I thought would be useful and that I am also capable of coding myself 😄
i just tried yours a bit ago mikey, that's a nice setup. I did hit some sort of fail state with it somehow. The first gen worked, but everything else after that gave me noise. Could be a comfy hiccup, I only solved it by restarting.
Is your workflow posted on Civitai or anywhere @west breach ? The custom node looks cool
what I've found so far with 2x is more noise added to the upscaled image is good for images with lots of details and small faces, really bad for close up portrait photos. So really have to reduce the amount of noise for close ups or you will end up with extra noses, eyes etc
LoRA?
no
there's an example workflow on my github for the prompt with style node https://github.com/bash-j/mikey_nodes
ah nice! thank you
The number of steps I have in that workflow is 92% for the base and then 8% for refiner, if you change it slightly you can get noise
i think my only critique is there's no option to do separate G/L prompting, but we'll see what happens once 1.0 launches. It's still wild territory, but i've found separate prompting to better for certain things, especially anime
that's the weird thing, i changed nothing
dragonborn!
yeah, the way that node works is it puts your prompt + the styles prompt from the selection into the G and then just the styles prompt into the L and the refiner
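(The routing described above, as a minimal sketch. This is not the node's actual code: `route_prompts` is a hypothetical name, and joining the prompt and style text with a comma is an assumption.)

```python
def route_prompts(user_prompt: str, style_prompt: str) -> dict:
    """Decide which text goes to which encoder input.

    G gets the user prompt plus the style text; L and the refiner get
    only the style text. The comma join is an assumption.
    """
    combined = ", ".join(part for part in (user_prompt, style_prompt) if part)
    return {
        "clip_g": combined,
        "clip_l": style_prompt,
        "refiner": style_prompt,
    }


routed = route_prompts("a dragonborn knight", "oil painting, dramatic lighting")
for target, text in routed.items():
    print(f"{target}: {text}")
```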
but as far as easy and clean goes, it's great
you can add your own styles to the styles.json. you don't need to reload comfy after you save the file, it should update as soon as you click the selection widget
one thing I want to add is the ability to use wildcards
thanks for all your work in the community! your nodes are super helpful and the latest workflow of yours - that's some clean layout - great work!
I was like - where is the rest? nope, it's just super minimal and compact 🙂
one of the benefits of being lazy I suppose 😄
do you have a twitter or something so I can give you a shout out next time I post some images?
since you are all being so lovely, I'll put together a 2x workflow too
no sorry
That 2x would be clutch. I have been trying to figure out how to implement quality upscaling into the workflow (any workflow tbh lol), with no success
ok so I just link to your github if that is alright
it's the hardest thing, I think you need different levels of steps and noise depending on the original image
Yeah it’s super tough. The way I was doing it in 1.5 was using ultimate sdupscale with controlnet tiling, but haven’t been able to make that work yet with SDXL in comfyui
would love to be able to lock workflows, so often stuff goes weird and it's because I accidentally clicked on a widget and changed the cfg to 20 or something 😄
right click -> pin?
my newest SDXL tutorial with comfyui : https://youtu.be/FnMHbhvWUhE
#ComfyUI is a node based powerful and modular Stable Diffusion GUI and backend. This UI will let you design and execute advanced Stable Diffusion pipelines using a graph/nodes/flowchart based interface. In this video I will teach you how to install ComfyUI on PC, Google Colab (Free) and RunPod. I will also show you how to install and use #SDXL w...
hopefully lora training coming for SDXL
tomorrow
how did i not know this?!?!? 
You can already use Kohya-ss to train LoRA
doesn't work with groups yet
Wait huh? Does this lock the fields as well as the location?
yeah the node is locked in position
no the widgets are still free to be changed
anyone ever notice that his eyes look like a butt?
Pepe’s?
yeah pepe's eyes
the manual is not complete, some stuff didn't work for me but this ComfyUI shortcut list helps if you haven't found them all yet: https://blenderneko.github.io/ComfyUI-docs/Interface/Shortcuts/?h=shortcu
How dare they not have left alt+left click haha
ctrl+drag also missing
ctrl + shift + v is a thing of beauty
yes 😄 I learned this from the sheet
just chain as many samplers as you want and don't have to reconnect everything
2x workflow on my github (also in this image)
@west breach so to be clear, that bottom text field is a negative prompt? I'm loving the simplicity of the nodes btw
yep, it should put positive and negative as the default if you add the node
the styles have a bunch of negatives already, which is why i left it blank
Sad i am getting some tensor errors from this workflow ^^
Makes sense. Finally, where did you find realesrgan_x2, can't find it anywhere lol
do you have the manager extension?
yeah
It's in the install models section
oh sick
whoa that resolution
decided to just use the 4x ultrasharp, no fancy tricks lol
Yeah Ultrasharp works really well for that sort of stuff
the wildcard support for my prompt with styles needs this widget, but I have no idea how to add it
feel like 1024x1024 is not a good res, getting way better results on 1280x1280
Just have a look at the code for that node and copy it
great colors and lighting
can't see it in the inputs
It shows up automatically when you have an input named "seed"
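(A stripped-down sketch of what that looks like in a custom node, assuming the usual ComfyUI node layout of `INPUT_TYPES`, `RETURN_TYPES`, `FUNCTION` and `CATEGORY`. The class name, the `|`-separated option format and the way the seed is used are all made up for illustration.)

```python
class WildcardSeedExample:
    """Hypothetical node that picks one option from a '|'-separated list."""

    @classmethod
    def INPUT_TYPES(cls):
        return {
            "required": {
                "text": ("STRING", {"multiline": True}),
                # The INT input is named exactly "seed" so the UI attaches
                # its seed control widget to it.
                "seed": ("INT", {"default": 0, "min": 0, "max": 0xFFFFFFFFFFFFFFFF}),
            }
        }

    RETURN_TYPES = ("STRING",)
    FUNCTION = "pick"
    CATEGORY = "examples"

    def pick(self, text, seed):
        options = [opt.strip() for opt in text.split("|") if opt.strip()]
        if not options:
            return ("",)
        # The seed is used as a plain index into the options.
        return (options[seed % len(options)],)


# Registration dict that ComfyUI looks for in a custom node module.
NODE_CLASS_MAPPINGS = {"WildcardSeedExample": WildcardSeedExample}

print(WildcardSeedExample().pick("red|green|blue", 4))
```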
I'm not sure if there are any other ways to get it
it is only mentioned in the app.js and widget.js
does refiner also destroy the result for u sometimes?
I called my input wildcard_seed, I'll change it and see if it works 🤞
so there the refiner does work 🙂
The eyes look fine in both to me to be honest
yeah.. well kinda. they look proportional better but they are now flat - no life
look at the eyecolor
one grey/green eye and one brown eye after refiner
before 2 brown eyes
feel like the refiner is a hit and miss 😄
the refiner is super finicky
thanks, that worked 🙂 not really a 'seed' as such, more of an offset but at least it's working 🙂
The refiner doesn't like eyes, I have noticed that
This is really good
Damn, I didn't know bulbasaur was chill like that
It struggles with him
I like the vapourwave style
is there a database for different styles?
Not that I know of, I'm just typing shit in
i meaaaan; that should be a given
most paintings/depictions of mermaids are bare chested
even in family friendly restaurants if there's mermaid pics they're bare chested lol
That's one scary looking monster
I love mixing nonsense stuff together lmao
Same prompt that made that but add Sonic into it
thats actually not bad lol
Nothing is wrong here.
lmao
are u guys scared of tomatoes?
No?
im asking honestly
Are you?
normally not, but i changed my mind
Maybe...
i hate tomatos but im not scared of em lmao
i dunno which i would get a better face , empress in this pic
This is your fault void
for the moment Ive been using
melted, compressed, artifacts, earrings, deformed, amputee
earrings because they always look shit
compressed and artifacts seems to help a tiny bit, but not much on quality
Strawberries are scarier.
dunno i just dont like the human on the tomato, but i guess thats a general model thing
would it would look more 3d
love*
or like an actual character, not so flat
but im really limited with generations anyway 😄
its probably bec. i miss inpainting, i mean faces on this distance looked way more worse on sd 1.5
so it seems to be fine
You can still in paint, it's just more of a hassle to do in ComfyUI I think
oh ok, i didnt know that 😄
SDXL is great.
I just hope the 1.0 has some real improvements from 0.9, has anyone seen improvements in the bot?
Tomatosaur
Interesting
Tomatosaur
Lmao
but faces kinda look generic 😄 when u just type in empress
g'over hurrrr
yup
Yeah SDXL makes a lot of the same face.
i think the same is true of the original 1.5 model too, do the usual tricks work? [person1:person2:.6], alternating person names, using the name of 1 person or person of a country, etc?
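(For reference, a sketch of what the `[person1:person2:.6]` trick does, assuming A1111-style prompt editing where a value below 1 is a fraction of the total steps. Whether the SDXL UIs and bots here honour that syntax is exactly the open question, and `prompt_at_step` is a hypothetical helper.)

```python
def switch_step(when: float, steps: int) -> int:
    """Step index at which [from:to:when] switches to the second prompt."""
    if when < 1:
        when = when * steps  # fractions are scaled by the step count
    return min(steps, int(when))


def prompt_at_step(step: int, first: str, second: str, when: float, steps: int) -> str:
    """Prompt used at a zero-based sampling step."""
    return first if step < switch_step(when, steps) else second


steps = 20
print("switch at step", switch_step(0.6, steps))
for step in (0, 11, 12, 19):
    print(step, prompt_at_step(step, "person1", "person2", 0.6, steps))
```

With 20 steps and .6, the first name shapes the early composition for 12 steps and the second name takes over for the remaining 8.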
mildly nsfw but mostly just cute. using roop with sdxl from faces made with sd1.5. wish i could get gfpgan wired up. thought i'd share cause i liked this result. clothes that don't look super skin tight
Her feet look weird but very nice overall
That shadow is kind of suspect too now that im looking at it
i agree i didn't even prompt for a car either. sdxl's extra details are so fire
Yeah
feet are probably weird cause i prompted for sneaker high heels
And the hand looks like she had an accident with a meat grinder
When is 1.0 releasing btw?
18th is the date they gave
the anticipation is killin me
Apparently, there will be a new refiner. I hope to see a lot of improvements there.
a new one?? hmm
The majority of the models on civitai are based on 1.5
What's a refiner?