#✨|sdxl
yeah exactly. gpt would do it, but it would put stuff in random boxes. tail over here, body on the other side of the yard type of thing.
even with regional prompting, I don't think I can match this sd3 shot
Miiiiigtt be possible maybe with comfy nodes for that
Def won't commit to saying I could do it though lol
"A high quality cinematic photograph of a Pickle clown shark tickle in a messy apartment, garbage clown freak shrooms, gun tree, missile launch snake, drug shrimp"
Certain combos of words just drive sdxl bananas
ok I'll try that one with claude
Claude rewritten prompt? Or does it have an img gen too
it's making the prompt
I think the censors are kicking in for the missile launch and drug shrimp. so what it's giving is seriously twisted.
Wow that prompt blows sdxls mind so badly it got carosello to generate photos... Nothing I've tossed at it just did that
Those are great
A pickle clown shark tickle, in the style of photorealism ADDCOL mushroom clown freak, in the style of surrealism ADDCOL gnarled gun tree, in the style of dark fantasy ADDROW missile launch snake, in the style of sci-fi ADDCOL drug shrimp chef, in the style of satirical realism ADDCOL messy apartment interior, in the style of gritty realism
whoa. better one coming up
this is right up your alley for random stuff smashed together.
A whimsical pickle head sporting a jester's hat, in the style of Salvador Dali ADDCOL
A shark fin protruding from a tangle of human limbs, in the style of Hieronymus Bosch ADDCOL
A bouquet of vibrantly colored fungi, in the style of Ernst Haeckel ADDROW
A pistol-gripped branch laden with bullets, in the style of Giuseppe Arcimboldo ADDCOL
A serpentine rocket exhaling smoke rings, in the style of M.C. Escher ADDCOL
A shrimp adorned with pill capsules and syringes, in the style of Yayoi Kusama
Dark arts goes haywire with this one "A high quality cinematic photograph of a Pickle clown shark tickle cannibal in a messy apartment, garbage clown freak shrooms, gun tree, missile launch snake, drug shrimp, song bong gong gang bang, warp slug, beep freak"
Is that Claude?
I love whatever that hat head is
With the eyes staggered in a ring lol
i'm impressed that it can do photos as well as illustrations
so the prompt was generated by claude, using SDXL regional prompter with dark arts model
ideogram doesn't care about violence, but it rejected this one. 🙂
probably because of the shrooms
it doesn't always work with dark arts, but the cute 3d render lora sometimes adds a depth to it.
claude 3 rocks. This was from highly detailed panther with jewelry in blue lighting with large paws and sharp claws is gripping into a tree
good stuff
yeah that's really good. the weapons and pointy faces are neat
Check out those with pnginfo
That was gpt 4 output
Dumped a big chunk in there for safe keeping lol
AI really is very good at prompting ai... Lol sounds like the opening of a Terminator sequel
here i am thinking that new 512 token limit for SD3 is awesome and you throw 732 tokens at it
whoa. that might be the best moby dick novel in the prompt box to image quality I've seen yet
I assumed that was regional prompter on your part
Hahaha
Nah just dumped it in
I think forge might be splitting the prompt in blocks and concatting them
I've noticed it always shows tokens used in increments of 75 and everything in that prompt affected the output
right
i'm assuming it's forgetting a lot of it, but it's also keeping a lot
1 of something is hard, 10 of something is easy
Those numbers are why I'm wondering about that
Once you hit 76 its 76/150, then 151/225 etc
Like it's splitting it into blocks of 75 and concatting all together
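A quick sketch of the counter behaviour described above, assuming the UI simply rounds the limit up to the next multiple of 75. The function name `counter_display` is made up for illustration and is not Forge's actual code.

```python
import math

CHUNK = 75  # usable prompt tokens per block, per the counter behaviour above


def counter_display(n_tokens: int) -> str:
    """Return the 'used/limit' string, assuming the limit grows in blocks of 75."""
    blocks = max(1, math.ceil(n_tokens / CHUNK))
    return f"{n_tokens}/{blocks * CHUNK}"


for n in (10, 75, 76, 151, 732):
    print(counter_display(n))
```

If the assumption holds, 76 tokens shows as 76/150 and 151 as 151/225, which matches the numbers reported here.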
I need to do some systematic tests on this
do they concatenate them? I thought they blend/average them, but I'm not sure.
In principle SD could take any number of tokens, but it was trained with a fixed length of 77 tokens (75 prompt tokens plus the start and end markers). Similarly, CLIP was trained with a maximum of 77 tokens, so it doesn't work well if you add more to it
Would you mind sharing your instructions to Claude?
Not to Claude, I mean sharing the instructions you use for Claude
How do you get Claude to generate images? or did it just give you a better prompt for it?
Thank you, let me try it
I want you to create a terse text to image prompt. For that image, I want you to split the image up into 6 pieces; 2 horizontal rows of top and bottom, and 3 columns per row, left and middle and right. I want you to describe what is in each piece, starting from top left, then top middle, then top right. Then for the bottom row, bottom left, bottom middle, and bottom right of the image. Use the word ADDCOL to delimit columns and ADDROW to delimit rows. Don't add words denoting which image piece it is. Don't mention more than one subject per image piece. Put each image piece prompt on its own line. A subject is allowed to span vertical image pieces. Determine an appropriate artistic style for the overall image and mention ", in the style of " with that style at the end of each prompt line, but before the row or column delimiter. Please make a text to image prompt for:
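For anyone who would rather assemble the grid by hand than ask an LLM, here is a minimal sketch of the output format those instructions describe: one line per image piece, ADDCOL between columns, ADDROW between rows, and the style tag placed before the delimiter. `build_regional_prompt` is a hypothetical helper, not part of Regional Prompter.

```python
from typing import List


def build_regional_prompt(grid: List[List[str]], style: str = "") -> str:
    """Join a rows-by-columns grid of subjects into an ADDCOL/ADDROW prompt."""
    rows = []
    for row in grid:
        # The style goes at the end of each piece, before the delimiter.
        cells = [f"{cell}, in the style of {style}" if style else cell for cell in row]
        rows.append(" ADDCOL\n".join(cells))
    return " ADDROW\n".join(rows)


grid = [
    ["pickle clown", "shark fin", "mushroom bouquet"],   # top left, middle, right
    ["gun tree", "missile launch snake", "drug shrimp"],  # bottom left, middle, right
]
print(build_regional_prompt(grid, style="surrealism"))
```

The last piece gets no delimiter, same as the prompts pasted earlier in the channel.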
The way you describe it makes more sense intuitively, I'm referring to the name of the comfyui node that behaves pretty similarly to BREAK in a1111/forge
I just did a test
checkpoint: darkArtsImages_v10Abyss, steps: 20, sampler: dpmpp 2m, scheduler: karras, seed: 2711398504
clown clown clown clown clown clown clown clown clown clown
shark shark shark shark shark shark shark shark shark shark
clown clown clown clown clown clown clown clown clown clown BREAK
shark shark shark shark shark shark shark shark shark shark
clown clown clown clown clown clown clown clown clown clown clown clown clown clown clown clown clown clown clown clown clown clown clown clown clown clown clown clown clown clown clown clown clown clown clown clown clown clown clown clown clown clown clown clown clown clown clown clown clown clown clown clown clown clown clown clown clown clown clown clown clown clown clown clown clown clown clown clown clown clown clown clown clown clown clown
shark shark shark shark shark shark shark shark shark shark shark shark shark shark shark shark shark shark shark shark shark shark shark shark shark shark shark shark shark shark shark shark shark shark shark shark shark shark shark shark shark shark shark shark shark shark shark shark shark shark shark shark shark shark shark shark shark shark shark shark shark shark shark shark shark shark shark shark shark shark shark shark shark shark shark
clown clown clown clown clown clown clown clown clown clown clown clown clown clown clown clown clown clown clown clown clown clown clown clown clown clown clown clown clown clown clown clown clown clown clown clown clown clown clown clown clown clown clown clown clown clown clown clown clown clown clown clown clown clown clown clown clown clown clown clown clown clown clown clown clown clown clown clown clown clown clown clown clown clown clown BREAK
shark shark shark shark shark shark shark shark shark shark shark shark shark shark shark shark shark shark shark shark shark shark shark shark shark shark shark shark shark shark shark shark shark shark shark shark shark shark shark shark shark shark shark shark shark shark shark shark shark shark shark shark shark shark shark shark shark shark shark shark shark shark shark shark shark shark shark shark shark shark shark shark shark shark shark
so if you have two chunks that are each 75 tokens, separating them with "BREAK" makes no difference
so i think it must be using that when you sail over the limit
(clownshark:87.3)
shark
clown
Volcano
Skyscraper
UFO
shark BREAK
clown BREAK
Volcano BREAK
Skyscraper BREAK
UFO
WTF?! @copper kraken
interesting. merge vs. also
Typing past standard 75 tokens that Stable Diffusion usually accepts increases prompt size limit from 75 to 150. Typing past that increases prompt size further. This is done by breaking the prompt into chunks of 75 tokens, processing each independently using CLIP's Transformers neural network, and then concatenating the result before feeding into the next component of stable diffusion, the Unet.
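The quoted description can be sketched in a few lines. This is only an illustration under assumptions: `fake_encoder` stands in for CLIP, `PAD_ID` is a made-up padding id, and the start/end tokens that a real implementation adds around each chunk are left out.

```python
from typing import Callable, List

CHUNK = 75
PAD_ID = 0  # hypothetical padding token id


def split_into_chunks(token_ids: List[int], size: int = CHUNK) -> List[List[int]]:
    """Cut the token list into blocks of `size`, padding the last block."""
    chunks = [token_ids[i:i + size] for i in range(0, len(token_ids), size)] or [[]]
    return [c + [PAD_ID] * (size - len(c)) for c in chunks]


def encode_long_prompt(
    token_ids: List[int],
    encode_chunk: Callable[[List[int]], List[List[float]]],
) -> List[List[float]]:
    """Encode each block independently, then concatenate along the token axis."""
    conditioning: List[List[float]] = []
    for chunk in split_into_chunks(token_ids):
        conditioning.extend(encode_chunk(chunk))
    return conditioning


def fake_encoder(chunk: List[int]) -> List[List[float]]:
    # Stand-in for the text encoder: one tiny vector per token.
    return [[float(t), 1.0] for t in chunk]


cond = encode_long_prompt(list(range(1, 733)), fake_encoder)  # a 732-token prompt
print(len(cond))
```

Note that the blocks are joined end to end rather than averaged, so a longer prompt gives the UNet a longer conditioning sequence.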
I don't think it does? Could be wrong
Ok I tried your prompt and it just said "RuntimeError: Sizes of tensors must match except in dimension 2. Expected size 7 but got size 8 for tensor number 1 in the list."
(claude:3)
shark
BREAK clown
BREAK Volcano
BREAK Skyscraper
BREAK UFO
shark
BREAK
clown
BREAK
Volcano
BREAK
Skyscraper
BREAK
UFO
Just gives me this, with or without the BREAK
shark BREAK clown BREAK Volcano BREAK Skyscraper BREAK UFO
Same checkpoint same seed same params?
RuntimeError: Sizes of shark queue must match except in dimension 2. Expected size 5 sharks but got size 87 for shark number 1 in the list.
You need more breaks in your break so you can break while you break.
Yes. It seems to discard all the "clown" tokens, or it thinks there is a real "clown shark"
Weird
Ah...Cascade
Oh that keyword isn't in comfy
The comfy equiv is the conditioning concat node
Take the parts on each side of BREAK in separate prompt boxes and "merge" them using the conditioning concat node
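A small sketch of the manual step, assuming BREAK is an uppercase standalone word as in the prompts above. `split_on_break` is a hypothetical helper; each returned piece would go into its own prompt box before the concat node.

```python
import re
from typing import List


def split_on_break(prompt: str) -> List[str]:
    """Split an a1111/forge style prompt into the pieces around each BREAK."""
    parts = re.split(r"\bBREAK\b", prompt)
    # Drop empty pieces and surrounding whitespace or newlines.
    return [p.strip() for p in parts if p.strip()]


print(split_on_break("shark BREAK clown BREAK Volcano BREAK Skyscraper BREAK UFO"))
```

It handles the one-per-line layouts tested above as well, since newlines around BREAK are stripped.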
I don't think I ever go over the 77 tokens anyway 😄
This was 732 tokens lol
I found last night that having gpt 4 generate a very detailed description of god knows what will almost always result in a very interesting image, even with it being 400+ tokens
Yeah, there's obv tons of prompt neglect but every chunk does indeed have an effect
I have the regular image generation prompt if you're looking to stop the prompt vomit from gpt
Act as a creative agent who generates a terse but highly creative image prompt derived from the prompt I send you. Include descriptive visual elements of the subject, lighting and surroundings. Specify an artistic style or camera settings at the beginning of the sentence, using descriptive elements that pertain to this artistic style. Include no more than 10 elements presented as discrete descriptors in one long sentence without story. Put the most important descriptive elements at the beginning of the sentence. Here's the prompt:
You can make it bigger or smaller by adjusting that 10 elements number
I bring that way down for cascade since it does best with prompts way below 77 as many have pointed out.
Thanks haha I will def give that a shot. Just been finding it interesting to see what happens when you use plain English descriptions and concat all the chunks together vs writing something that describes something similar that's within the 75 token limit
If you're trying to get something specific... Not a great approach
But this only works if you are using an extension like Regional Prompter, right? The ADDCOL stuff seems to have no impact in Automatic1111 with SDXL without an extension like https://github.com/hako-mikan/sd-webui-regional-prompter
Correct. You need that.
Set it to Attention mode instead of Latent, and turn on the option at the bottom that disables converting AND to BREAK. Do not set a resolution, column number, etc. below; the ADDCOL/ADDROW tags in the prompt itself will create the columns and rows.
You're welcome. lol
Even without Claude, I'll just take what it creates for me and start separating things out in the prompt box to have more influence on where things go
I like the little house on his back with the big ham radio antenna
Broadcasting shark waves
I also look forward to a better vae in sd3 so everything futuristic isn't "neon lit"
It looks quite edible 🤤
Thank you very much for sharing. I'm afraid this ADDCOL and ADDROW won't work with Draw Things, but still very helpful
Sometimes the local mixtral model can churn out good stuff as well.
Anthro horse in a bar setting, drinking beer, in the style of Impressionism.
ADDCOL
Anthro horse with a mustache and monocle, enjoying a pint of beer, in the style of Pop Art.
ADDCOL
Anthro horse wearing a Hawaiian shirt, holding a coconut filled with beer, in the style of Tropical Art.
ADDROW
Anthro horse wearing a beret, sketching another anthro horse, in the style of Cubism.
ADDCOL
Anthro horse sitting at a bar, sipping a martini, in the style of Film Noir.
ADDCOL
Anthro horse dressed as a bartender, pouring beer for a patron, in the style of Art Deco.
oh this is great
Uhm... just had time to check your workflow out... and that has nothing to do with using Differential Diffusion without putting the whole image through VAE encoding first.
Your source image gets turned into a latent right away - and there is no step that combines the original pixel image with the difdif'd latent/image.
I think we were not talking about the same thing / process.
You can do the same thing with image masked composite
Alright. I'll try that.
Yeah just add that at the end
You'll have to composite all the masks you use in total of course
But if you're only doing one in painting step then you don't need to worry about that
I got it. Now that of course also leaves the usual slight differences because the inpainted area got Re-VAEd of course. But at least now I can always choose what is more important to me.
Thanks.
how about you?
Visuals [#touchdesigner] and soundscape [lyrics taken from "I fall in love too easily"] by uisato.
For more experiments, project files, and tutorials, head over to: https://linktr.ee/uisato
#stablediffusion #aiart
how to change folder to which comfyUI saves images?
in the save image thing, just put foldername/filenameprefix
no leading slash
I am confused... Is it not possible to declare a full path? if it is, where would the subfolder end up?
it starts in the output folder, so if you put sdxl/comfy, it'll put it in comfyui\output\sdxl\comfy934893.jpg
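Roughly how that prefix turns into a path, as a sketch. The counter format and extension here are placeholders, not ComfyUI's real naming scheme, and `resolve_output_path` is a made-up name. Forward slashes are used for display; on Windows the separators would be backslashes.

```python
from pathlib import PurePosixPath


def resolve_output_path(output_dir: str, filename_prefix: str,
                        counter: int, ext: str = "png") -> PurePosixPath:
    """Map 'folder/prefix' onto a file under the output folder."""
    if filename_prefix.startswith(("/", "\\")):
        raise ValueError("prefix is relative to the output folder; drop the leading slash")
    prefix = PurePosixPath(filename_prefix.replace("\\", "/"))
    # Everything before the last slash is the subfolder, the rest is the file prefix.
    # The underscore plus 5-digit counter is a placeholder naming scheme.
    return PurePosixPath(output_dir) / prefix.parent / f"{prefix.name}_{counter:05d}.{ext}"


print(resolve_output_path("ComfyUI/output", "sdxl/comfy", 1))
```

Whether a full absolute path is accepted is still untested here, so the sketch just rejects it.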
haven't tried a full path though.
alright thanks
ah shweet
Did you test the Puppeteer prompt in XL? I'm actually surprised that he posted that one when it didn't include the puppeteer at all.
the one I just posted above is SDXL with regional prompter
an evil pupeteer looking down ADDROW
an evil pupeteer looking down ADDROW
evil pupeteer hands ADDROW
a cute rusty robot head made of garbage ADDROW
a robot torso made of garbage with a sign with "I M SD3-BASED" written on it ADDROW
rust robot legs
but yeah, he posts an SD3 picture and it doesn't actualy prompt adhere. that said, I think his prompt is kinda messed up, so I don't blame SD3 yet.
That's actually great. You've been playing with that extension all day, eh?
When I get moments. 🙂
Some from today.
Holy crap the one with Donald Trump and the scientist
I love these
Sorry if you were asked 300 times before but is this regional prompting in Comfy or in Auto1111/Forge
automatic. the functionality for it in comfy is extremely complicated, just ask clownshark batwing. he's spent the time to get it working in comfy but it's like a swiss watch in there.
automatic is just throw simple column and row markers and you're done.
I have seen some examples with noisy latent compositing and area compositing
but those require literal pixel values
that said, specific models with specific settings work best. others just don't do much of anything
these settings and model work really well
Thanks
https://huggingface.co/lllyasviel/sd_control_collection/tree/main?clone=true
https://huggingface.co/comfyanonymous/ControlNet-v1-1_fp16_safetensors/tree/main?clone=true
https://huggingface.co/thibaud/controlnet-openpose-sdxl-1.0/tree/main
is this a representative sample of the relevant controlnets for sd1.5 and sdxl?
am i missing anything significant?
automatic 1111, 1.5 and sdxl, for ControlNet v1.1.440
it's in the picture.
i'm saying that as far as i know, "InstructP2P" has only been created for SD1.5
alright thank you
there's some other biggies... there's inpainting and tile ones for sdxl
but also illumination ones for sd15
and the qr code monster ones, an opticalillusion one
mediapipeface
is there a SD3 channel already?
i'll get you a list of some later for sure
i've collected absolutely everything i could find
I guess for people that have been invited to the preview
Can any one say free ways to generate best ai images
nihaoya
Here is the image you requested.
It is possible.
red dog
🔴 🐕
been trying to train lora on sdxl but the speed, it feels unreasonable slow at around 15-30x slower than sd1.5 lora training, any1 got settings that work well with 2080ti? i use https://github.com/derrian-distro/LoRA_Easy_Training_Scripts
Looks like more SD3 pictures showing up on twitter
It's going slower because you likely don't have enough VRAM. If it uses any shared memory at all, it will be drastically slower. (Shared memory lives in your regular system RAM.) SDXL's base resolution is 1024 vs 512 for 1.5.
There's a huge latency overhead when having to shuffle calculations back and forth between ram<>vram if your vram is maxed out.
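To put numbers on the resolution part, here is a back-of-envelope sketch. It assumes the usual 8x VAE downscale and 4 latent channels; the helper names are made up.

```python
from typing import Tuple


def pixel_ratio(res_a: int, res_b: int) -> float:
    """How many times more pixels a square res_a image has than a res_b one."""
    return (res_a * res_a) / (res_b * res_b)


def latent_shape(width: int, height: int,
                 channels: int = 4, downscale: int = 8) -> Tuple[int, int, int]:
    """Latent tensor shape (channels, height, width) for a given image size."""
    return (channels, height // downscale, width // downscale)


print(pixel_ratio(1024, 512))    # SDXL base vs SD 1.5 base
print(latent_shape(1024, 1024))  # SDXL training latent
print(latent_shape(512, 512))    # SD 1.5 training latent
```

That is 4x the pixels per training image before counting the larger SDXL UNet, so the real memory gap is bigger than this arithmetic alone suggests.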
Does anyone have a working SDXL controlnet adapter for lineart?
This page is like a bad acid trip.
Architectural clausura. From a modern city to a city of the future: man, environment, technology
Subject:
The territory of the center of the residential area
Goal:
To design these spaces as a kind of transition to the city of the future, corresponding to historical and cultural traditions and images of identity, socio-demographic, natural, aesthetic, environmental and technological trends of modernity.
Tasks:
In the project, it is necessary to ensure a balance of three groups of criteria: environmental, aesthetic and functional.
Landscaping elements should use the latest achievements (existing or projected) in the construction and finishing materials industry, eco-technologies, “smart technologies” that contribute to the creation of an environmentally friendly living environment.
To offer a graphical interpretation of the phased implementation of the project from the present to the future.
To express the spirit of the place of design (linking to a specific territory you have chosen).
/prompt home blue
Here is the image you requested.
Sadly no not that I know of
that's very unfortunate since my entire workflow depends on me first drawing the lineart before letting ai color it in, so SDXL is essentially useless to me... unless there are some other ways?
Hi. Can anyone explain the difference between RealVisXL 4 and the Lightning variant? Plus, it's not explained in detail: is one supposed to use it with the SDXL refiner? https://civitai.com/models/139562?modelVersionId=344487
Use Turbo models with DPM++ SDE Karras sampler, 4-10 steps and CFG Scale 1-2.5 Use Lightning models with DPM++ SDE Karras / DPM++ SDE sampler, 4-6 ...
same model. the lightning one generates faster in less steps. like turbo model variants
Tks, so I guess lower RES as well. BTW: do you happen to know if it's expected to use the SDXL refiner with the normal variant?
you will get almost the same detail and res. No refiner needed
ok,tks
Lightning is like Turbo but works on normal SDXL resolutions
(however, most Turbo merges also work on normal SDXL resolutions. So it's subjective what you like more)
I never use Refiner and I think most people don't use it. Custom models are often much better than refiner anyways
I guess Turbo has its uses, but I just really want higher RES, so I stick with SDXL , Trying variants , but not sure if those need refiner or not
For landscapes I usually use base+refiner at 50/50, and get what I consider good results, but maybe I'm wrong
as said: the Turbo Merges support native SDXL resolution
With basic SDXL, that is , just trying variants now, I'm mostly into real landscape scenes
Turbo -> 1 step generation, but only 512x512
Er.. isn't Turbo supposed to work on 700 something pixels?
Turbo merge, LCM, Lightning -> 6 step generation, but 1024x1024 (and other) resolutions
yes, but turbo merges are turbo models merged into sdxl models
thus, they behave somewhat in between
Haaa, so Turbo merge is yet another variant?
yes
Ok, got it now
tks for explaining, so something like that RealVis Lightning should get comparable results but with 6 steps or so?
yes
I only use classic models with more steps if I need negative prompts or have very complicated prompts
cause these low step models need very low cfg
otherwise, I get MUCH better results with, say, DreamshaperTurbo than with SDXL base+refiner
if you want larger res gens you can use the Kohya deep shrink node in comfyui. Ive done many 3840 x 2160 native gens with turbo models
I've gone really huge before, using just regular SDXL. It's tricky but doable. I'm no expert but got decent results with ControlNet Canny and somewhat higher than regular denoise for upscale (in A1111). I'm not really into speed, just want better (realistic) results
I mostly use img2img
From this rough client provided as reference:
with a lora?
No Lora, it's just the dark arts images model from civitai
imagine prompt:A close-up image of a vibrant, colorful Jungle Juice Popper bottle surrounded by dense tropical foliage and exotic fruits, under a mysterious, moonlit jungle canopy. --v 6 --ar 16:9
Ok, got it now, a person gets out of touch for a month or so and everything is changed, so Lightning is actually a speedier Turbo version, so what models have base resolution of 1024? All Turbo merged,Turbo and Lightning or just some of those?
Myself I would not recommend using any fast / turbo models as they degrade the output too much vs clean XL (finetunes); the only method worth mentioning is DeepCache, but that also degrades.
Unless you need it do something video like.
No idea what DeepCache is, you have any link I can read about it?
No, I'm using mostly to enhance rough images composed on PS
Mainly landscape backgrounds or props/extras to include in my real work
You can download a image of my Rembrandt model page: https://civitai.com/models/282340/rembrandt-infinity-sd-xl-10
I believe all workflows were done with it.
But once again only if you want to degrade your image go turbo / fast / deepcache.
Civitai webpage isn't very mobile friendly is it?
Works fine on mobile for me.
What browser? Firefox doesn't pan
Firefox is what I'm using and I have no issues.
Chrome is ok,tks,
Skill issue. 😛
So, what is exactly DeepCache? Is that some kind of Lora?
Tks
I know things aren't always straightforward but assuming there's a (presumably small) quality degradation on Turbo and Lightning models is it a usual workflow to try stuff with those and then send to img2img to "enhance" with base+refiner SDXL?
And BTW: are custom (regular) SDXL based models meant to be used without refiner?
It's almost a full time job trying to keep up with the latest
I've read that on DeepCache but not certain it would work for me (I'm on Macintosh)
Speed is good , but ultimately I want the best quality possible and most of the time I need stuff to be at least 4x4k pixels
I don't know what to think about this one...if that was an actual glass sphere, odds are that the refraction would flip the internal image.
I agree, it's certainly not physically correct heh. maybe it's a portal
Yeah...it's mind-bending. It actually took a moment to register with me.
trying out some things 😉
heh...wild
I've got something interesting, but it's Cascade-related. I'll mention it here conversationally, though.
Essentially, I've found a word combination that's got an overwhelming strength to it. The example is that if I use the word "bumblebee", I can easily tilt cascade one way or the other by adding "transformer"/"car" or "insect". Pretty standard up to this point.
However, add the word "flower" and now you are locked into the insect and you pretty much can't escape that unless you re-add "transformer" or "car" at which point both a car and flowers will appear. But good luck with modifying the bee itself with anything other than generally strong tokens like batman, beans, or the usual list. (Obviously, there's more that will be overpowering sometimes as we've seen in all models.)
What's more though is that it's so strong in Cascade that this positive:
(police:1.1) [bumblebee:1] on a flower
with this negative:
(bumblebee:2)
still results in an image of a plain bumblebee on a flower. By all other logic, the bumblebee should pretty much be eradicated from the image.
(...and yes, I've also tried (bumblebee:-1) just in case something with brackets was broken, but even then, the negative should've had some impact.)
In essence, the combination of bumblebee and flower is a combination in Cascade that is rather overwhelming and I found that interesting. Typically the tokens that we find are overpowering are single words or they end up being names or styles.
That's a very interesting find! Thanks for sharing it. There are a couple of token combinations in each model that really fight for dominance.
I've done some research with Cascade when it was published, but there are so many new SDXL models I'm testing right now with amazing fidelity and SD3 arriving that I haven't spend much time with it the last few days.
But what I've noticed, as many others have, is that prompt tokens behave quite differently with Cascade. Longer prompts can overwhelm it and you get images that look like you used crazy high CFG settings. Also, distilled models, Lightning based but especially Turbo, are more restricted in versatility and sometimes it's hard to break out of concepts.
I'm noticing this a lot right now since I'm mostly testing Lightning based models and I'm dissecting prompts much more often to find the token that doesn't let me move in the latent space
there's certainly a technical reason for it (distilling etc), but this is how it behaves for me on the user side
Yeah, the long prompt issues were immediately apparent to me within moments of toying with it. It's great with most short prompts, though.
You've probably seen enough of my images and discussions to know how interested I am in TXT2IMG versus some of the other stuff. (Not that I don't use them sometimes, but prompt engineering is so fascinating!)
yeah - absolutely the same for me 🙂
draw arch count
Yeah, I like that compare tool as well.
epic
I see your batman, and raise you 1000 more
tell me guys how many gb you have on your video card and how long you are generating SDXL
24giblets
and how long you've been generating?
I have 4gb and it takes 30 minutes to generate
Ouch
yea u need at least 8gb otherwise it moves the model to ram and takes forever
Don't mind my English, I'm just learning
this took about 60 seconds on a 4090 with that new lightning lora. it's pretty great.
the legend of zelda
you're either very rich or you've already sold your kidneys😂
20 secs for a 1024x1024 img on forge with 12gb vram on a 3060 and that card is not rly expensive
yea laptops suck unless u buy one with a 4090 but those ones are rly expensive
I don't want to know what a 4090 on a laptop would be like for heat and fan noise
its the downgraded 16gb version
Plus, a lot of the laptop GPUs are not really the same as their desktop counterparts.
No the laptops are top notch just bad that they are cut down but I can play all the games on ultra
The biggest disadvantage is the cut-down graphics card(
Play City Skylines 2 on ultra. 😉
I generate a regular SD in 60 seconds, but SDXL takes 30 minutes
well I'm not interested in this game but I see what you're getting at)
Here is the image you requested
What about genshin impact
Here is the image you requested
ok that's funny. is that pandora?
you need blue men riding those things
and the many faces of his dog.
This next one is hilarious
hah i love the eyebrows
it's the one in the cabinet that gets you.
Working on a project where creatures are morphing into objects.
anyone have trouble inpainting on sdxl?
im trying to inpaint to fix eyes but it just doesnt work out right
I have just started using this one it might be over kill https://youtu.be/6OO37ZjjbSs?si=raxW3xnZZOMwzjFW its a bit weird because it uses ESAM to segment what you tell it instead of drawing the mask yourself.
Download the workflow:
https://drive.google.com/file/d/11iGOQIdmVh5uFTr748PTJlO5LUD4ICAN/view?usp=sharing
thanks
If you're doing inpainting, check out "differential diffusion" with comfyui. It's amazing and works with any model. The more steps you use, the better it blends
Here's an example of inpainting using the new differential diffusion method. Make sure you blur your masks (old inpainting was a binary black or white thing, this can handle gradients and you'll see that while it's diffusing)
And like I said, it works with any SDXL model you throw at it. This is just using the base 1.0 model. I wouldn't recommend it with turbo models because you really need more steps to get clean transitions
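What "blur your masks" means in numbers, as a toy sketch on a single row of mask pixels. A simple box blur stands in for whatever blur node you actually use, and `box_blur_row` is a made-up helper.

```python
from typing import List


def box_blur_row(row: List[float], radius: int) -> List[float]:
    """Average each pixel with its neighbours within `radius` (edges use fewer)."""
    n = len(row)
    out = []
    for i in range(n):
        lo, hi = max(0, i - radius), min(n, i + radius + 1)
        window = row[lo:hi]
        out.append(sum(window) / len(window))
    return out


hard = [0.0] * 4 + [1.0] * 4   # old style binary mask: keep | repaint
soft = box_blur_row(hard, 2)   # gradient mask: the change ramps in gradually
print(soft)
```

The hard edge becomes a ramp of in-between values, which is the kind of gradient the differential diffusion approach is described as handling.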
Here is the image you requested
/a girl
woman of stunning beauty, very exuberant, with long hair with a lot of phosphorescent white shine, big hair like Rapunzel, flowing, FULL BODY FLOATING in a dark CEMETERY full of mist dressed as if a cloud came out of it, white in color, ultra realistic, ultra quality, ultra detailed, "-c", in the background a large moon and white fireflies Pale makeup, illuminated skin, SHINY EYES, HDR, 8k, glitter particles and crystals throughout the image
Is it a boot like dream? How do I enter?
You don't. You read:
#1047610792226340935
do not enter the boot
sdxl has matured but color still bleeds badly. if you request anything in a color (clothing, eyes, etc.) then you'll likely end up with a lot of things in that color.... sigh
You learn not to specify colors in prompts and save that for inpainting. It's probably best to not expect a masterpiece artwork on the first generation since AI isn't at that level yet. Even artists today draw a rough draft first, then detail later.
diff diff creations
damn
experimenting with workflows and settings for SDXL-Lightning (Dreamshaper XL) + TCD (Trajectory Consistency Distillation) LoRA:
what does tcd do?
SIGMA 
I'll look that node up
Project page for Trajectory Consistency Distillation
interesting, thanks
TCD scheduler node from that repo works, sampler not so much - but samplers compatible with SDXL-Lightning (like Euler) seem to be working with TCD LoRA. though for pure SDXL-Lightning it's usually with eta 0, and for TCD you need bigger value
imma try that out cause I'm using lightning anyway
Anyone do outpainting in comfyui? I'm trying to turn a square image into a 16:9 one. It's doing a good job, but it's got these seam lines as you can see in the second image. Ideas on what to adjust? I've tried changing the feathering number but that didn't get rid of the seam lines.
remove them in photoshop and do an img2img pass to clean it all up
well, I'm trying to have it be part of my automation.
where it renders at 1024x1024 and then widescreens it all via json and thrown to the comfy api
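For the automation side, the padding needed to go from square to 16:9 can be worked out ahead of time and written into the workflow JSON. A sketch, assuming the target width is rounded up to a multiple of 8; the function and the dictionary keys are illustrative, not a confirmed node schema.

```python
import math
from typing import Dict


def widescreen_padding(width: int, height: int, ar_w: int = 16, ar_h: int = 9,
                       multiple: int = 8) -> Dict[str, int]:
    """Left/right padding that widens an image to the target aspect ratio."""
    target = math.ceil(height * ar_w / ar_h / multiple) * multiple
    extra = max(0, target - width)
    left = extra // 2
    return {"left": left, "top": 0, "right": extra - left, "bottom": 0}


print(widescreen_padding(1024, 1024))
```

For a 1024x1024 render that works out to 400 pixels on each side, giving 1824x1024.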
automatic1111 does this without giving seam lines.
so i know it's possible...
Ive not had perfect outpainting using any software, so automating it all will be tough
example of a1111 outpainting. no seam lines.
yeah im not sure what causes the seams. it does suck
openoutpaint for a1111 was really nice
i think it's not denoising the whole image, whereas a1111 is.
ok, that was actually the fix. denoising the whole image afterwards as another step
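The fix is two steps: pad and outpaint the sides, then run one more denoise pass over the whole frame. The first step needs padding numbers, so here is a small sketch of that arithmetic. `widescreen_padding` is a hypothetical helper, and rounding the target width up to a multiple of 8 is an assumption based on SD latents being 1/8 of the pixel size.

```python
import math

def widescreen_padding(width, height, aspect_w=16, aspect_h=9, multiple=8):
    """Return (left, right, target_width) for widening an image to roughly aspect_w:aspect_h."""
    exact = height * aspect_w / aspect_h
    # Round up so the result is never narrower than the requested ratio.
    target = math.ceil(exact / multiple) * multiple
    extra = max(0, target - width)
    left = extra // 2
    right = extra - left
    return left, right, target

# 1024x1024 square to (about) 16:9
print(widescreen_padding(1024, 1024))  # (400, 400, 1824)
```

Those left/right values would go into whatever pad-for-outpaint step the workflow uses, followed by the full-image denoise pass described above.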
from comfy
nice. surprising it didnt create something from the left over noise
I'm just going insane with left/right expansion. 🙂 it's absolutely just making stuff up, but it's still inline with the prompt since it's pulling from those.
yeah, I'll try my lorax gang members on the subway one.
hah, finally an image that fits my superwide monitor
that last step really fixes up any glitches, like hands holding swords not lining up.
Myrlin Clownroe.
Kissie kissie. 😉
The round table of faceless people.
The phrase coming soon with summer colors and a colorful golden rose in background
" coming soon" with summer colors and a colorful golden rose in background
The phrase coming soon with summer colors and a colorful golden rose in background
Who is this?
Dranald Aubrey Trumprake Graham
You can imagine a quirky, cute and sweet girl with long hair and braids
i would rather not
Here is the image you requested.
ready for school kids?
fill the shadow of the image with coffee beans
why u make me do dis?
Here is the image you requested.
No. thanks
Here is the image you requested.
i love dark arts, but playground 2.5 upscaled is pretty crazy
click open in browser
an anthropomorphic centipede-pope hybrid in a business suit giving a high five to a trump as a little boy who is smoking
wow that's awesome
what's the most photorealistic base model for portraits?
Check out imperfectportrait too
I just zoomed up that turtle. that's really good
Thanks im experimenting with the AND prompt syntax, things get pretty weird
with an extension or just in general?
Just general
So what's an example prompt look like where AND is helping in some way? vs. just batman and putin.
Something like photo of batman :1 AND photo of Putin :1
You can get something like this
Yeah is basically just combining the 2 prompts
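To make the syntax concrete, here is a tiny parser sketch that splits an AND prompt into (text, weight) pairs. It only illustrates the `prompt :weight AND prompt :weight` shape from the example above; `parse_and_prompt` is a made-up name, not any UI's real parser, and how the sub-prompts get blended during sampling is up to the UI.

```python
def parse_and_prompt(prompt, default_weight=1.0):
    """Split 'a :1 AND b :0.5' into [('a', 1.0), ('b', 0.5)]."""
    parts = []
    for chunk in prompt.split(" AND "):
        text, sep, tail = chunk.rpartition(":")
        weight = default_weight
        if sep:
            try:
                weight = float(tail)
            except ValueError:
                text = chunk  # the colon was part of the prompt, not a weight
        else:
            text = chunk  # no colon at all, so no explicit weight
        parts.append((text.strip(), weight))
    return parts

print(parse_and_prompt("photo of batman :1 AND photo of Putin :1"))
# [('photo of batman', 1.0), ('photo of Putin', 1.0)]
```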
Vladman
prompt: photo of batman-putin-centipede-bus hybrid - it gave me lots of wheels like centipede legs, batman's cowl horns/ears, and it put it in Russia. 🙂
Yeah i don't like using Break or |
where's the obligatory mushroom cloud given the setting? 😛
I made a thing that like one or two people would care about. If you're one of those couple of people, thanks for checking it out:
https://docs.google.com/spreadsheets/d/1es_BjyKJDg6LB-2sM0HpF1fLLoV7OYEDCPKWBJmh6nM/edit?usp=sharing
Start Here
This sheet created by soul. You can generally find me in the Stable Diffusion Discord, but if you're here, you probably know that already.
At the bottom there are tabs named for megapixel (MP) ratios in decimal format. Each is based off of the 1024x1024 megapixel standard, not the 10...
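A rough sketch of the kind of calculation a sheet like this tabulates: pick an aspect ratio, keep the pixel count near a 1024x1024 budget, and snap to a grid. The function name and the snap-to-64 step are assumptions for illustration, not taken from the sheet.

```python
import math

BASE_PIXELS = 1024 * 1024  # the 1024x1024 "1 MP" budget

def resolution_for(aspect_w, aspect_h, megapixels=1.0, multiple=64):
    """Return (width, height) near the pixel budget for the given aspect ratio."""
    total = BASE_PIXELS * megapixels
    width = math.sqrt(total * aspect_w / aspect_h)
    height = math.sqrt(total * aspect_h / aspect_w)

    def snap(value):
        # Round to the nearest multiple, never below one multiple.
        return max(multiple, round(value / multiple) * multiple)

    return snap(width), snap(height)

print(resolution_for(1, 1))   # (1024, 1024)
print(resolution_for(16, 9))  # (1344, 768)
```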
it's so beautiful. 🙂 I've actually spent way too much time trying to figure that stuff out. thanks.
You're one of the 2 people, then! ❤️
putin goes hard
He doesn't look hard in that image.
What does a sharp minded ballerina wear?
😛
how could you tell with the ballerina skirt in the way
I mean, if he's to be believed, you'd definitely know, even with that on.
lol
if you look closely.....
Forbidden burger.
time to feast
mmmm....rock loaf
@heady vale@gloomy lark
whats a highlight?
Ideogram 1.0
Don't really know actually. Found it funny though.
Ideogram 1.0
yeah I figured out the outpainting. it actually exposed me to the truth about latent space upscaling. no more latent to image to latent to image, losing quality at every step. so much higher quality images now, and far faster.
What are you referring to
Ideogram 1.0
That's a static image up above (the highlight), not a link
this is my new workflow. I'm sure it's old news to people such as yourself, but I was using ultimate sd upscaler for everything. for resolutions ~3k and below with the 24 gigs of vram I have, I can do much faster upscales, maintaining much higher quality by skipping all that.
and I can use ultra fast dpmpp 2m at 20 steps sampling and still get crazy high quality
Oh let me get you my workflow
the denoising takes care of all the stuff I used to spend tons of time with dpmpp_sde on, just so the fingers and hands would be right. this solves all that.
I can literally watch it fix hands and arms in realtime
with the fastest cheapest sampler there is.
There's a few really good ones I can get you in a min here
and it does a 1920x1080 res upscaled image in 7 seconds.
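For reference, a sketch of what a "latent upscale, then denoise" pass might look like as a ComfyUI API-format fragment built in Python. The node ids ("10", "11"), the 1.5x factor, the 0.5 denoise, cfg, scheduler, and upscale method are all placeholder assumptions, not the exact settings from the workflow being described. The four `*_ref` arguments are `[node_id, output_index]` links to nodes that would exist elsewhere in the full workflow.

```python
def latent_upscale_pass(latent_ref, model_ref, positive_ref, negative_ref,
                        scale_by=1.5, denoise=0.5, seed=0):
    """Build two nodes: upscale the latent directly, then resample it at partial denoise."""
    return {
        "10": {
            "class_type": "LatentUpscaleBy",
            "inputs": {
                "samples": latent_ref,
                "upscale_method": "area",  # placeholder choice
                "scale_by": scale_by,
            },
        },
        "11": {
            "class_type": "KSampler",
            "inputs": {
                "model": model_ref,
                "positive": positive_ref,
                "negative": negative_ref,
                "latent_image": ["10", 0],  # output 0 of the upscale node
                "seed": seed,
                "steps": 20,
                "cfg": 7.0,                 # placeholder value
                "sampler_name": "dpmpp_2m",
                "scheduler": "karras",      # placeholder value
                "denoise": denoise,
            },
        },
    }

fragment = latent_upscale_pass(["5", 0], ["4", 0], ["6", 0], ["7", 0])
```

The point of the shape is that the image never leaves latent space between the first render and the second sampling pass, so there is no decode/encode round trip in the middle.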
That are actually simple
Not your usual clownshark monstrosity
Ultimate kinda sucks
Dalle-3
Dalle-3
there's a few good methods
regarding the straight up latent upscale you showed, try getting NNLatentUpscale and swapping in that node, i've found it's usually better than the base one
yeah I tried the methods they have there, the "area" one is majorly better than the others.
Dalle-3
here's the simplest one that i liked the most when giving it just a bit of freedom to tweak or add details
if you're using regional prompting, you can't use that node, but you CAN use the "tile diffusion" node which patches the model
Ideogram 1.0
then feed that into regular ksampler etc with your regular regional prompts but upscale em first
so this is going from latent to image, back to latent, then back to image, similar to the old method I was doing. When I looked closely, I was losing quality of the image each time I did that.
switch the tiledksampler scheduler to exponential to get less change
this is better than latent upscale
you lose more detail from the latent upscale than from the conversions
but this was the best combo i could find including with the upscale model
the main advantage is you don't need as high of denoise levels to get satisfactory detail
tiledksampler takes a long time but i'm the happiest with those results, at least with the weird shit i tend to make
well, the high denoise I found was to fix the lack of correct limb placement by the quick simple dpmpp 2m sampler which is prone to that stuff.
the other reason i like tiledksampler is cuz you're sticking to the standard resolutions
you're not inducing mutations, you're usually fixing them if anything, unlike a straight up latent upscale
this is with NNLatentUpscale instead of the latent2img->upscale-img2latent
it might be better for your purposes
i tend to kinda like the weird details that sometimes pop up with more aggressive settings unless they flat out change the fundamental composition
I keep going back and forth between speed and highest quality every nook and cranny looking amazing. I ran that one, which took 75 seconds on mine, vs. 7 seconds. 🙂 I find I can riff faster and respond to cool ideas faster when it's 7 vs. minutes that I'm used to waiting.
this is pretty good for 7 seconds.
I played around a lot with turbo and lightning models in the last day, I thought the lightning stuff was amazing, until I started comparing it head to head against non-lightning. It's pretty much trash. amazing loss of detail and prompt adherence. lorax gang members with tattoos, bandanas, switchblades, etc etc. 75% of that was gone on the lightning version compared to regular.
So I bit the bullet and ordered more 4090 hardware instead. 🙂
if there's a loss in quality, unless that loss can act like a preview and you can just swap the settings back to get the same thing with details restored (never happens really...)... then it kinda sux
ha did you order a second one?
why stop at 2?
here's a faster method
hahahah do you have three??
this is straight up unsample resample
not faster than the latent upscale
but does give diff results
yeah I figure with SD3 coming shortly, and with all my scripts and stability swarm being able to use all of them at once, I'll just have the 3 4090's and the 3080 running in parallel for maximum silly pictures.
and there's opportunity to do some really interesting stuff by changing the prompts on the unsampler alone
love it haha you're insane
I found it!
it's a set of 6 upscaled high quality images against 2 models every 40-50 seconds now with a 4090 and 3080. so adding 2 more 4090's will speed that up by a lot. so I can throw prompts at it and get stuff back quickly to have fun just throwing ideas at it.
I meant to ask, were you around here when the stability ai bots were working here? how fast were they on requests?
nope, i wasn't
i first hopped on here shortly after cascade launched because i got annoyed at how crap the info on reddit was
With the new SD3 bots coming along, I'm wondering if there's benefit to extending what I have going to other people or is the bot stuff here going to cover any need.
2/18 that was when i joined this discord
yeah I was shortly after
i'd imagine it would cover any need of a certain type, but not the need the Sharkb0t is currently covering
ah ok
well, when I get the new hardware and can figure out proper queueing of multiple rapid fire requests, where the next request doesn't skip the line of the first one's images, I'll let you know.
probably a couple weeks.
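One way to get that "no skipping the line" behavior is a single worker draining a FIFO queue, so every image for one request finishes before the next request starts. A minimal sketch with asyncio follows. `render` is a stand-in for the real generation call, and all names here are hypothetical.

```python
import asyncio

async def render(prompt, index):
    """Stand-in for the real image generation call."""
    await asyncio.sleep(0)
    return f"{prompt}#{index}"

async def worker(queue, results):
    """Take one request at a time and finish all of its images before the next."""
    while True:
        prompt, count = await queue.get()
        for i in range(count):
            results.append(await render(prompt, i))
        queue.task_done()

async def run_requests(requests):
    queue = asyncio.Queue()
    results = []
    task = asyncio.create_task(worker(queue, results))
    for req in requests:
        queue.put_nowait(req)  # rapid-fire submissions just line up
    await queue.join()
    task.cancel()
    try:
        await task
    except asyncio.CancelledError:
        pass
    return results

def process(requests):
    return asyncio.run(run_requests(requests))

print(process([("shark", 2), ("pickle", 1)]))
# ['shark#0', 'shark#1', 'pickle#0']
```

With several GPUs the same idea would need one queue feeding multiple workers plus some per-request grouping of results, which this sketch does not attempt.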
cool
i finally got around to setting up tailscale so now i can use comfy wherever i want in town
neat. I'm finding that the discord bot method works best when I'm out. the ability to submit a prompt and then close the phone and check it later is great.
obviously full comfy is best if you're trying to adjust stuff.
but for just having an idea and want to see what that would look like, discord bot is great.
for sure
saw this on reddit the other day, so i made a couple
reminds me of the cruise ship in Fifth Element
big badda boom
^_^
another thing you can try too...
the second is with noise injection, this is a straight up latent upscale with NNlatentupscale
it brings a bit of those details back that you lose from latent upscales
the thing that sucks with them is a lot of more complex textures are lost
assets disappear or are simplified and textures get smoother
for comparison, this is the same NNlatentupscale except with tiledksampler
check out the grass
hah the person in the background is normal vs. dead body in the first 2
leeloo dallas multi-pass, falling past futuristic flying cars, multi-level, floating city, futuristic fashionable people.
and here's one last one for when it's imperative to preserve the source composition
yeah, like that high denoise I'm doing. I accept that it's going to give some weird things happening every now and then, but it's the tradeoff of returning lots of images quickly.
hah yeah exactly. torso crotch
but sometimes that's awesome
yep
and fish lamps 😆
but yeah absolute best i've found for preserving the original image has been tiledksampler or tiled diffusion for regional prompting needs
if you want high levels of detail, dpmpp_ancestral with 30+ steps
source composition is generally fine at 0.5 denoise, 0.55-0.6 if you want shit added, beyond that it'll go bananas
2s ancestral at 35 or 70 steps is my favorite. best looking out of all the samplers. second only to dpmpp sde's accuracy.
yeah for real
i love that sampler
res momentized is getting a real soft spot in my heart too
that is a very very good sampler
what are you outpainting with
I was doing the latest padding for outpainting. and then denoising after. that's actually how i stumbled upon the whole denoising of an image afterwards, and that led to upscaling the latent and denoising that.
then i also did dark art images render, with latent upscale but with juggernaut9rundiffusion.
that led to this one.
got a workflow? curious what you're doing exactly there
i used that some for a while, then ditched it cuz i got tired of patching subtle seams
so dark art, did a good one of this, but doing a juggernaut denoise massively increased the realism of his face and skin
the denoise afterwards gets rid of all seams
it worked really really well in sd15 but i've had issues with the inpainting models in sdxl
that was my complaint.
ahh so you're denoising the entire image afterwards
i said to the guy, a1111 has no seams! and then I thought, it's because they're denoising the whole image, not just the masked parts on the sides. voila! denoise the whole thing, no seams. looks great.
ha, yeah
that's another place where tiledksampler fn rules
diff diff could probably solve that issue too
what i've noticed too though is you still end up with issues with "soft" seams
where changes to the structure of the image have been introduced
sigh, so many things in sdxl are just so much crappier than sd15 and inpainting models are def one of them
ok here it is
makes weird compositions, but with realistic textures.
those weird dark arts compositions, that juggernaut would never do natively
yeah for sure
i need to get a workflow set up to make carosello img2img realistic outputs
that model is fn incredible for making the weirdest of the weird
and yeah regarding the upscale def give NNLatentUpscale a shot, maybe it'll suck in your case but i've gotten better results
so this made the dumplings look more photo style
it's just an interesting option.
I ended up just using this method without the separate checkpoint for the 7 second upscaling.
what i'd love to do down the road is develop a workflow that automates outpainting one chunk at a time in a ring around an image until it's doubled in size
then downscale it 50% and iterate a few times
then upscale and tiled resample for a while
it's interesting that you bring that up.
could do some really wild looking shit
So when we render at 1024x576 for example, what's in the image will be significantly different than if we do at 1536xwhatever. it's not just more texture details, but what makes up the image, the complexity of how many subjects etc is very different. I'm noticing with services like ideogram, they're getting very high complexity of subjects and interactions, all at very low res with their default 1280x720. Same with midjourney. very complex details, even when it's just 1024x1024. stable diffusion XL, doesn't do that. resolution directly dictates what's going on in the image.
I'm really curious to see how SD3 handles this. if 1024x1024 will be more complex now.
yeah i'm wondering if they're hiding shit
what i'd be tempted to do is upscale, get the extra details, then downscale to hide the artifacts
iterative upscale is great for adding
exactly. i've spent countless hours on the iterative upscale.
just to get that kind of complex world building.
ahh cool you've used that one then
when you see my crazy monster workflows, tbh they're not as crazy as they really look
usually i'm doing shit like reproducing an iterative node, or something conceptually similar that doesn't exist yet
so that i can actually play with all the parameters internally
so I just plopped that nnlatentupscale in there.
is it supposed to make things better?
or do i need to use that with the tiled ksampler? just with that one node swapped in, the image is a lot software now.
softer
switched back to the original upscale latent by and it's much sharper again
I just realized Ive been doing the ldsr plus workflow since I started using SDXL full time
So maybe it was only better with sd15
that's the partner of dpmpp_sde. slowest. upscaler. ever.
ok, the output of tiled ksampler, is different than just ksampler advanced with denoise that i was using
That scheduler does two passes per step, that's why
this one is the tiled ksampler
this is the original ksampler with advanced denoise I was using.
As you can see, on the crucifix in the second one, there's a little jesus trump with trump hair. the clear winner. 🙂
the details are very close, but the trump on the cross, and also if you look at her hair bun, it's a little better on the non-tiled ksampler.
but most important, ksampler with denoise takes 7 seconds, compared to 20 for tiled.
Yeah I'm sure there's some cases where you might get something more preferred that way
all that said, the "area" method is really the star of the show. all the others are blurry like that nnlatent upscale node.
Sapphire dumpling?
The stuff I've made tiled with that upscale model (whatever it's called, away from comfy now) has been the best pretty consistently
Nature stuff in particular
Ha
all that said, tiled is how to get real big. this latent upscale thing only works up to around the 2k-2.5k resolution, then it starts blowing up the 4090.
but for 1k res upscale to 1.7 or 1920 res? it's perfect.
Yeah
What I've found is the farther you get from native resolutions the worse latent upscale gets
That's obv going to depend a lot on the checkpoint too
definitely. If I set this to 2x resolution it starts putting pixelated squares everywhere. 1.5x seems to be the sweet spot
turned on fp8 yet?
Freak limbs all that and a loss of assets
i breach 2k with 16gb 4080
kohya hires fix is my favorite tool for raw gens at high res
Lava dumpling?
When Po becomes a fire bender? or lava bending would be earth bending wouldn't it?
dumplings are for sure po he invented them
times like these when it puts out stuff like this, i think i need to go back to the censored model
/me beautiful girl
hah good stuff. can't wait until these things can hold swords correctly
Here is the image you requested.
/me What a sweet weird cute elf girl
Here is the image you requested.
these anime models are hilarious. it doesn't matter if you say they're wearing power armor, they're no more clothed than without.
SDXL 0.9
i don't see that reply in his twitter posts or replies, so i feel like it can't be recent
Gotcha
yeah, it's gonna be pipelines i guess. something that takes "man running" and looks through its library of controlnet openpose files for that, then uses that on what the model IS trained on to give you a man running.
they say that's what dall-e does, but who knows. it's all black box
I doubt ideogram does that.
Have you heard of ELLA? basically it's equipping any SD based diffusion model with LLMs for enhanced semantic alignment, there's a whole paper on it, i think Ideogram might be using something similar, maybe SD3 too
yeah they said they'd release it in a week, it's been a week. 🙂 maybe this coming week.
having that kind of prompt adherence with the maturity of a lot of the existing models would be amazing.
with sd3 we're starting over again, i'm hopeful, but it's gonna be a while. kind of like with cascade, I personally feel like I've hit my limit with it until finetunes come out, which'll probably be never.
Yeah cascade isn't as big as i thought it would be, and the fine-tunes are worse than SDXL for some reason
This actually makes it look really good. Can you please share that workflow?
Just look at the second image, it's like two extra nodes from a standard workflow. The only node you'd need would be a mask blur node
Comfyui essentials has the mask preview and blur node. The rest are built in
The big key with differential diffusion is that you HAVE to have gradients in your mask, otherwise, it will just be standard inpainting. So the blur will make those gradients for you or you can do masks in photoshop
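A quick way to see what the blur is doing: take one row across a hard-edged mask and box-blur it. The hard mask only has 0s and 1s, while the blurred one has in-between values at the edge, which is the gradient being talked about. This is a plain-Python sketch on a single row, not the actual ComfyUI blur node, and the helper names are made up.

```python
def box_blur_row(row, radius):
    """Average each value with its neighbors within `radius` (window clipped at the ends)."""
    n = len(row)
    out = []
    for i in range(n):
        lo = max(0, i - radius)
        hi = min(n, i + radius + 1)
        window = row[lo:hi]
        out.append(sum(window) / len(window))
    return out

def has_gradient(row):
    """True if any value sits strictly between fully masked and fully unmasked."""
    return any(0.0 < v < 1.0 for v in row)

hard = [0.0] * 4 + [1.0] * 4   # one row across a hard mask edge
soft = box_blur_row(hard, 2)   # same row after blurring

print(has_gradient(hard))  # False
print(has_gradient(soft))  # True
```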
what? why? 😄
#Aishwarya Rai Bachchan transformed into a male character, wearing modern male dress, standing in a futuristic cityscape with neon lights and flying cars in the background, under a purple-hued evening sky, exuding a sense of confidence and mystery.
I made that for you, "modern male dress", not sure what you mean by that one
maybe this one
Tried out SuperPrompt-v1
very OP even with SDXL it seems?
WTF IT ACCIDENTALLY COOKED SO HARD
Lol
okay I'm actually going to use clip L and Clip G separately
to see if Clip G can extract that natural language intelligence
I also read somewhere that CLIP G is better at interpreting natural language, while CLIP L is better at interpreting tokenized prompts (our usual comma-separated prompts). Maybe that's the reason?
okay??? nice??
im plugging in the superprompt prompt into Clip G since its more like natural language
and all this with SDXL Lightning (DreamShaperXL)
man it would be great to have that 512 context length, I keep cutting from the prompt to fit in that measly 77 token limit
I'm using llava 1.6 Mistral to describe images in large detail and then feeding that into a Hermes2 model that's good at following instructions (The llava model sucks at following strict directions for some reason). It takes the wall of descriptive text and consolidates it down into really decent prompts. I wrote a few nodes for comfy to interface with ollama server to do it. The hard part is writing the right system prompt for the second stage.
Here's an example of my autocaptioner. I just point it to a folder of images, set the seed to 0 and increment, then queue up how many ever images are in the folder. It really does spit out some half decent captions for doing loras and you can put in your own custom tag. On a side note, I hate coding in python...
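For anyone curious what a two-stage setup like that could look like, here is a sketch of the two request payloads for a local Ollama server's `/api/generate` endpoint. This is not the actual node code: the model tags, prompt wording, and function names are placeholders, and the payloads are only built here, not sent.

```python
import base64

# Assumption: Ollama running locally on its default port.
OLLAMA_URL = "http://localhost:11434/api/generate"

def describe_payload(image_bytes, model="llava"):
    """Stage 1: ask a vision model for a long, detailed description of the image."""
    return {
        "model": model,  # placeholder tag
        "prompt": "Describe this image in as much detail as you can.",
        "images": [base64.b64encode(image_bytes).decode("ascii")],
        "stream": False,
    }

def condense_payload(description, custom_tag, model="nous-hermes2"):
    """Stage 2: ask an instruction-following model to boil the description down to a caption."""
    system = (
        "Rewrite the description as one concise image caption. "
        f"Start the caption with the tag '{custom_tag}'."
    )
    return {
        "model": model,  # placeholder tag
        "system": system,
        "prompt": description,
        "stream": False,
    }

stage1 = describe_payload(b"abc")
stage2 = condense_payload("a long wall of descriptive text", "mytag")
```

Each payload would be POSTed as JSON to `OLLAMA_URL`, with the text response of stage 1 passed in as `description` for stage 2. As noted above, the system prompt for the second stage is where most of the tuning effort goes.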
dayum
@graceful nexus Sorry for pinging you, but your SuperPrompt model has been cooking for me in SDXL, I hope to see the comfyui node one day. With SD3 it's going to be even more useful. I also tried plugging the superprompt prompt into clip_g and it gave some nice results as well. Thank you for your work.
btw i just checked what that upscale model was
LSDIRplus
about to try the new proteus rundiffusion model
Man, I love proteus. 0.3 especially, 0.4 grew on me, but he keeps going down this weird road and now he's fully embraced this specialized booru style Pony tagging method. Hard to get on board with proprietary stuff.
Oh damn is that what's going on with the new one?
Stability staff have commented on his tweets about the new model, referring to it as a virus, that him and some others are moving us backwards from natural language and back to a sd 1.5 style of comma delimited keywords.
Yeah, I saw that
Boy holding a cup is sdxl, boy, cup, holding, doesn't establish the relationship and only works well for single subject portraits.
The problem with sd15 style is you're then leaving it almost entirely up to the attention layer to decide what your composition is imo
Exactly
It has to guess
Yep
I've had the LLM give me that style back, and for single subject pics, it actually does better than full language. But if anything more complex, it falls apart.
Proteus is really good at prompt adherence, it was king before dark arts and deep blue came out. But now he's going backwards.
For the sake of prettier pictures.
Yeah
Well that sucks lol
Honestly the models already make pretty enough pictures
IDC about the aesthetic score shit at this point
It's all about getting the right composition now imo
It does seem like the actions issue will remain present like you've said
But cascade has shown us that they should at least have resolved some issues with other image types, like dark scenes in particular
Drove me fn crazy trying to get sdxl to generate an image of a truly dark scene, I don't think I've ever really succeeded, there's always been a significant light source injected or implied somehow
So talk about prompt adherence, I said unearthly beast emerges from portal at end of prompt. Kept getting this.
Only deep blue followed the more complicated prompt
Lemme try the dark thing.
these are epic
So what couldn't you do with sdxl concerning darkness?
Get darker than that
I said - completely dark room. Small anthropomorphic bear shining flashlight at his nervous face. Sense of extreme darkness.
So what would be an example of language that's darker
Not the darkest I've gotten with cascade, just what's handy ATM
But I've gotten it pretty easily to generate images with no strong diffuse light source off screen, no candle, almost pitch black
Not sure off the top of my head
But basically what the interior of your house looks like during a power outage at 1am where you can only see after your eyes have adjusted
Lol
hehe old sdxl image
I feel like plush animal is a wholly unexplored prompt tree for me, soon to be remedied.
a tiny lit candle in the middle of the wooden floor illuminating a legion of horrifying creatures in the room.
That is one horrifying creature.
finally got around to writing darkarts a much earned review
"This is a truly incredible checkpoint. I have 240+ on my SSD and this is my #1 go to for almost anything (the other being Carosello). Its prompt understanding is outstanding, and the qualities of the outputs are diverse, with a lot of unique styles (with as many checkpoints as I have, it's obvious how many are inbred). Absolutely fantastic work and I look forward very much to any future versions or checkpoints trained or merged by the author."
just checked, it's the first model released by that author
Yeah I went looking for his realism based on. I have to see if it's better than juggernaut
wait there's another somewhere by that author? i must've missed it
ouch
Yeah there's a whole bunch. Dark arts was a one off to handle this kind of content
spoiler that or delete it, that's an epilepsy trigger
dark arts looks like if Illuminati Diffusion was for SDXL
⚠️ @smoky patrol @uncut steeple
@slim wren @nova aspen @vestal breach
22 🤔
lol, i like how i got timed out in less than 90 seconds when i posted a pic of a batwinged vampire where the shape of her nipples was visible through her black dress... but there's a racist meme that's been sitting in general with images for nearly 18 hours now, and we've got epilepsy triggers flashing away unspoilered >_>
where do i go to generate images in this discord?