#✨|sdxl
Creature
looks cute and scary at the same time 😭
I just found this model called SSD-1B, it's amazing. Super fast, I loaded it for the first time so already some time spent there, and then even at 24 steps it did the whole image (combined with loading model) in 30 seconds
lol
Testing facial expressions
A couple of example results from my latest system, a video player with an audioreactive playhead. There’s a couple of things more to it, but I’ll leave it to my patrons to explore.
You can access this system plus many more in my Patreon profile: https://linktr.ee/uisato
#vfx #touchdesigner #audioreactive
yeah it's a distillation of the sdxl model. I would honestly recommend using lightning sdxl models like realvisxl lightning/dreamshaper lightning because they are better quality and even faster than ssd1b. They just require 4-6 steps instead of 25ish.
I use realvisxl but I do like how this one doesn’t always resort to like photograph quality
Plus this is super fast still at like 30 seconds total, including upscaling
Try dreamshaper or the base lightning as they can do basically all styles. Lightning models usually take like 2-3 seconds (no upscaling).
new forge release will load sdxl as nf4 and supposedly makes it sd15 fast
random prompting sometimes does stuff
SDXL + A1111
really nice
which checkpoint model do you use?
A special treated mix + some finetune. This is cleaned, removed random noise and many more
ah okay yeah
I really love making sci fi with these models
Me too, but it is hard to find good ones, especially for those resolutions.
yeah its true it can be hard to make it work
Finally it is working out of the box, i am enjoying it several weeks now 🙂
This is the 3rd checkpoint i used this method to clean, denoise and detangle. Can't wait until some more tools for Flux are available, then i try it there as well.
I like how contrasty it is
Up to now, Flux lacks a lot of sci-fi imho
yeah Flux has a few issues with subjects
The tech is insane, it just needs more variety
Until then, i like my SDXL 😄
yeah I am still on SDXL
Sure, it is the SDXL channel. I assume Flux images go to SD3, hehe (No offence SAI)
lol
yeah the SD3 channel seems to be Flux now
and sadly the cascade channel is gone
Cascade was good too, but next to a Flux BASE model, a lot of other finetunes look like ancient tech.
flux is a next level yeah
Funny thing, clarity comes with a price, random stuff like a Bolognese Sauce doesn't look right anymore. Too repetitive because random noise is removed.
Or the Pizza,lol
oh yeah that sauce is weird
As long as it does other stuff without open ends or vanishing wires, i can live with that.
American style house in red color with white background
A piece of paper is burning, the burn marks are in the shape of a trumpet
hi guys!! how do you avoid these kinds of artifacts?
Treeshark's latest workflow on L2 discord can fix this issue (hallucinations and tiles during tiled upscale)
Reduce the denoise, and change how the prompt is used in the upscale process
what is the L2 discord?
trying.. the thing is I like the results better when the denoise is high..
Then you need to use a mask to only change the main subject
I sent you the link in DM
I'm not affiliated BTW
I'm just a fan
i will try this
thanks!! highly appreciated.
also your image is 4k
you can just about get to this resolution using latent upscaling
if you combine all the tricks (SAG/SEG/PAG/FreeU/Res-adapter/Gradual Deepshrink or Hi-diffusion)
along with rescaling and tonemapping to allow very high CFG
its a lot trickier than tiled upscale but its an option
Can you post the original, before upscaling please?
i will research on this..
sadly the ComfyUI version of Hi-diffusion isn't quite as good as the Diffusers one
it still helps though
apparently full Hi-diffusion would require some modifications to comfy to work
oh wow.. this is nice.. did you use usdu?
That is part of the workflow, but the "no upscale" version.
It uses controlnet and multiple latent upscales to recreate the image first.
i didnt realize upscaling is this complicated..
There are many ways to do it, but avoiding the unwanted hallucinations is the tricky part.
This one had a slightly higher denoise and also used facedetailer to clean up the face.
Up to you, I use multiple
You need to mask it and then use the original image as the background.
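For anyone following along, the mask-and-composite step can be sketched like this. It's only an illustration on tiny grayscale grids with made-up names (`composite`, `orig`, `regen`, `mask`), not the workflow from the thread: where the mask is 1 the regenerated pixel is used, where it is 0 the original stays as the background.

```python
def composite(orig, regen, mask):
    """Blend regen over orig per pixel: out = mask * regen + (1 - mask) * orig."""
    return [
        [m * r + (1.0 - m) * o for o, r, m in zip(orow, rrow, mrow)]
        for orow, rrow, mrow in zip(orig, regen, mask)
    ]

# Hypothetical 2x3 grayscale images with values in 0..1
orig = [[0.1, 0.1, 0.1], [0.1, 0.1, 0.1]]    # image before the high-denoise pass
regen = [[0.9, 0.9, 0.9], [0.9, 0.9, 0.9]]   # image after the high-denoise pass
mask = [[0.0, 1.0, 0.5], [0.0, 1.0, 0.0]]    # 1 = main subject, 0 = background

result = composite(orig, regen, mask)
```

A soft mask edge (the 0.5 value) blends the two images, which is why feathering the mask usually hides the seam.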
The whole workflow is on my Patreon.
armageddon training camp
A DSLR photo of a portrait of Groot head, 3d asset
hi guys Im having this error while trying to load flux cnet.. any ideas?
where can you find this controlnet 5000.bin?
.bin is weights
there's a format that looks like .bin and .param as an alternative to .pth
.ckpt or .safetensors are more common for comfy
.bin/.param is what I used to use for NCNN in chaiNNer
.safetensors files are safer, would recommend them where possible
Hello! can anybody tell me why openpose isn't working at all here? i am using xinsir/controlnet-union-sdxl-1.0
My SDXL w/f with Face Detailer is giving this error (no controlnets, all models are sdxl) - error = mat1 and mat2 shapes cannot be multiplied (77x1280 and 2048x640)
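A hedged reading of that error, since nobody answered it here: the SDXL base UNet takes text conditioning that is 2048 wide (the 768-wide and 1280-wide text encoder outputs joined together), so a 77x1280 tensor hitting a 2048x640 weight suggests conditioning that is only 1280 wide (which is what the SDXL refiner uses) is reaching a base-model layer. That diagnosis is an assumption about this particular workflow. The sketch below only reproduces the shape arithmetic, with hypothetical helper names.

```python
def matmul_shape(a, b):
    """Return the result shape of a @ b, or raise like the error in the thread."""
    if a[1] != b[0]:
        raise ValueError(
            f"mat1 and mat2 shapes cannot be multiplied ({a[0]}x{a[1]} and {b[0]}x{b[1]})"
        )
    return (a[0], b[1])

TOKENS = 77
CLIP_L_WIDTH = 768    # first SDXL text encoder
CLIP_G_WIDTH = 1280   # second SDXL text encoder
weight = (CLIP_L_WIDTH + CLIP_G_WIDTH, 640)  # layer expecting 2048-wide context

# Both encoders joined: the shapes line up.
ok_shape = matmul_shape((TOKENS, CLIP_L_WIDTH + CLIP_G_WIDTH), weight)

# Only the 1280-wide conditioning: same failure as in the message above.
try:
    matmul_shape((TOKENS, CLIP_G_WIDTH), weight)
    error_message = None
except ValueError as exc:
    error_message = str(exc)
```

If this is what is happening, the thing to check would be which CLIP/conditioning is wired into the Face Detailer node.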
Generate anime animation expression package
Create a brush cutter monster
Dots!
cross sections of balls technically
Dots are tiny balls.
Joyoive. I too prefer Joyoive over Sony.
So I think Mangled Merge Matrix is ready. 3850 total loras. With this new one I used 3 different block combinations depending on the types of the loras being merged. 1 for concepts, another for styles, and a 3rd for text, although I'm not sure how much it benefitted from that one.
probably the most loras merged of any checkpoint
I'm interested to see how far I can take it. I'm surprised it's held up this far into it. Although going from strictly 2D for the Magic version to strictly realism was tricky. But I learned some new techniques in the process. I'm putting more focus into building up the clip for the next one.
I follow your attempts closely, if you are at the PC can you try the following prompt please with your latest mix:
'grids, lines and a circle'
and no negative.
Something like that will occur
#dream
Here is your requested image
I didn't have a whole lot of luck. It mostly gave me grid lines. I changed the res. from 832x1216 to 1024x1024 and changed the prompt from "grids, lines and a circle" and got better results. But not consistent.
from "grids, lines and a circle" to "a circle with grid lines in it"
Thanks for the test. It has nothing to do with luck. Each seed does something different of course.
It is the resolution i am after and the integrity of the patterns, how parallel are lines and how many defects are in.
No problem. I've heard of the circle test. The grid lines seem to throw it off though. I'm gonna try to improve the clip more for the next one and keep this test in mind.
I know that curiosity myself and i love research too, but for something AAA you can get away with a dozen carefully selected checkpoints 😉
So, why merge 4k LoRAs?
Just to see if I could and to learn and experiment with different merging methods. This version I explored block merging more and used Mateo's prompt injection for block references.
I wanted to see if merging into specific blocks could improve merge quality.
The lora merging process is automatic with code that utilizes the comfy API. So it's not such a tedious process.
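A rough idea of what driving a merge through the comfy API could look like. This is not the author's code and skips the block-merging part entirely. It assumes a local ComfyUI at the default address, the stock `CheckpointLoaderSimple`, `LoraLoader` and `CheckpointSave` nodes, and hypothetical file names.

```python
import json
import urllib.request

COMFY_URL = "http://127.0.0.1:8188/prompt"  # assumed default local ComfyUI address

def build_merge_workflow(checkpoint, loras, strength=1.0, prefix="merged"):
    """Build an API-format workflow: load checkpoint, chain LoRAs, save the result."""
    wf = {"1": {"class_type": "CheckpointLoaderSimple",
                "inputs": {"ckpt_name": checkpoint}}}
    model, clip = ["1", 0], ["1", 1]  # [node id, output index]
    for i, name in enumerate(loras, start=2):
        wf[str(i)] = {"class_type": "LoraLoader",
                      "inputs": {"model": model, "clip": clip, "lora_name": name,
                                 "strength_model": strength,
                                 "strength_clip": strength}}
        model, clip = [str(i), 0], [str(i), 1]
    wf[str(len(loras) + 2)] = {"class_type": "CheckpointSave",
                               "inputs": {"model": model, "clip": clip,
                                          "vae": ["1", 2],
                                          "filename_prefix": prefix}}
    return wf

def queue_prompt(workflow, url=COMFY_URL):
    """Send the workflow to a running ComfyUI server (not called in this sketch)."""
    data = json.dumps({"prompt": workflow}).encode("utf-8")
    req = urllib.request.Request(url, data=data,
                                 headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())

# Hypothetical file names, just to show the structure
wf = build_merge_workflow("base.safetensors", ["a.safetensors", "b.safetensors"])
```

Looping this over batches of LoRAs and feeding each saved checkpoint back in as the next base is one way such a process could be automated.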
merge quality.. do you mean prompt following quality or visual quality? Versatility?
All, sure.. but you can't have the cake and eat it all.
Both 🙂 Although I'm learning more about the importance of the Clip I don't think I have that part perfected just yet. Gonna tweak and test some things for the next one.
If anyone would like to play with it.
Merge it all! That's my motto. 😂
It most likely did. I'm excited about seeing what comes with Flux merges or any improvements that will come down the line to the entire merging process. I'm sticking with working with SDXL for now though until I decide a good time for a video card upgrade.
you guys should try training the checkpoint its way more rewarding and some of the outputs in training crease you up lol
I have with SD2.1. Found the whole process of building a database and labeling each image to be pretty tedious. Not to mention the time and money spent on the failures. I did enjoy making a loha with 71 different planes. But OMG the process of putting it all together ...
youll get better tho bro, I was pants on my first try
Maybe once I get a 24gb 9080 I might look into training again. I'm sure by then you will need 10 A100's for whatever latest model is out.
I run 48GB VRAM, and sometimes with my newer larger checkpoints it's pushing all 48
I need to look into the latest and greatest for labeling images. I could never really figure out a good way on 8gb vram.
do you have 2 cards with SLI or 1 big 48gb monster?
https://tensor.art/models/759856135286068673/FLUX-HYPER-TRAINED-DREAM-DIFFUSION-BY-DICE-V-1 hyper training that was a stretch
i have 2 RTX 4090s
dual flow watercooled rig
Looks awesome. Do you see a benefit with training on SLI other than with batch processing?
https://youtu.be/jEcQqhcAtOI?feature=shared me building my beast
I GET ASKED A LOT ABOUT WHAT SYSTEM SET UP I HAVE. SO I THOUGHT I WOULD PUT TOGETHER A VIDEO OF THE PHOTOS I TOOK WHILST I BUILT IT. IT'S A TOTAL BEAST OF A DUAL WATERCOOLED RIG. THERE IS A VIDEO OF THE FULLY BUILT UNIT AT THE END OF THIS VIDEO TOO.
I HOPE THIS ANSWERS YOUR QUESTIONS ON WHAT I RUN FOR USING THE AI TOOLS YOU WERE ASKING ABOUT...
That's a pretty serious rig. I would like to have something like that soon. I'd probably go with dual 9080's though since they are so much cheaper. Is the dual card SLI worth it?
3090's i mean. Didn't get much sleep last night.
i started out with it running a RTX 3090 with 24GB vram, which was sweet, until checkpoint training and running 3 webui at the same time was a bit of a ripper. So threw 2 RTX 4090s in her and haven't looked back
I bet! 👏
444
Released a lightning version
From the description to the trainingset prompts, later the captioning and scoring, down to the example prompts and images, all AI.
what do you use to caption?
utilizing an api to run images through?
Interesting. I haven't done much with llms other than chatgpt as far as captioning. Is what you're doing possible on 8gb vram?
short description up to..
long ones, just ask for it
i have 24gb vram, i don't look very often how much is used.
Ollama nodes are cool
I'm gonna look into it and see. What model do you run for captioning?
ok cool. cause I got buzz to use but just hate captioning manually lol
You have to play with it a lot. sometimes if you ask for keywords separated by commas you get 40 images fine, but then it starts writing essays.
I had to compare. Exaworld is on the left, Mangled Merge on the right. Both look really good. I think yours captures the essence of hollow knight a little better.
You are too kind. Yours has more details and color, mine is a bit faded because of the depth. Whatever one prefers, it is different.
These are from a quick merge of Exaworld + MM Matrix 50/50 merge, messing around.
Got it working. Thanks @shy kelp!
Enjoy 🙂
Boid.
My stomach and thighs are so thick, I can't keep my clothes open
So to illustrate what I meant about Pro and Dev, here is an image by Pro (note that 2 out of 4 times it fails to produce any text at all), and the images are all in these lines
member for 6 months just to troll so weak sauce like that? disappointing
how do i test an sdxl model using only prompts? only safe for work pls.
Prompt themes you want and if you like the output, the model is good. If you don't like the output, search harder or modify the model.
that one almost looks like a civil defense poster
trying to inpaint/outpaint in fooocus and I can't get rid of the phantom fingers
the way I see it, my setup is a great tool for manual artists if you wanted a piece to be 90% done and you can photoshop or draw the rest to your liking
my files are fcced though 😂 I'm trying to get everything in the same folder and it's impossible
I think I lost the logs on some images
A1111 + SDXL
https://civitai.com/models/667232/exa-world-addons
LoRA is up, you might want the EXA World checkpoint too.

1 girl
Fantastical
Thank you
Trying to think of some lore where these airships make sense
It's like if some propulsion method was accidentally discovered in the 1800s
So this is maybe like a WW2 era aircraft carrier but with advanced levitation
like a starship u boat
Interlude:
Easy Buzz begging now with SDXL
https://civitai.com/models/669935
#dream logo for a page named Animal rescuers Algeria
#dream
Recent research/experiment done in TouchDesigner and SD: https://www.youtube.com/shorts/VyZFzKAuqsk (more info in video's description)
After a deeply introspective and emotional journey, I fine-tuned SDXL using old family album pictures of my childhood [60], a delicate process that brought my younger self into dialogue with the present, an experience that turned out to be far more impactful than I had anticipated.
What's particularly interesting about the resulting visuals, is...
Here is the image you requested:
a mountain have many trees
I don't understand what difference the guidance scale slider really makes in Fooocus, but turning it up seems to have blown out the pictures with very high contrast.
she looks like she's about to tell you something really bad lmfaooo
Testing.
A peaceful, quaint village nestled in a valley, with simple houses surrounded by lush greenery. In the foreground, a teenage girl with a bright smile and hopeful eyes stands in traditional clothing, looking out over the village.
cant see very well but try first one set noise to enable and maybe try different int nodes
hi guys, does anybody know exact model/LORAS/settings that stablediffusionweb.com uses?
for some reason their outputs are so much better than anything i try.
@copper kraken im dead serious. i don't really know how to use AI. i just setup InvokeAI and downloaded SDXL turbo which is supposedly what they advertise to use, but my results are fucking garbage.
and when i prompt it on their website the results are incredible
i use same steps count and CFG scale, not sure what else i should do to replicate the results from there.
same seed too.
if someone could help me get close to that, would be super helpful
well but the thing is, stablediffusionweb.com claims to be using SDXL Turbo.
so this must mean they have some sort of fine tune on it that makes it so good, right?
look at the one generated on their website vs the one generated locally, lol
prompt for spongebob battle royal game logo
hmm what does that mean exactly?
looking it up rn
btw does it matter if i run it on my M1 mac?
or does that only affect speed
no idea don't have a mac
damn my favorite show, SpirbbBoog¥ Spin£bogk
wow that's actually beautiful and solemn
that's small detail im sure typography can be improved or omitted idc. I care about the quality of the image
IDK why but on my local the model insists on giving spongebob 2 mouths lol
the second one is a half decent rendering of Spongebob, idk what's happening behind him though
well what do i do in order to achieve this result locally?
for now it's enough for me, i just want to replicate these outputs locally
what are you running, Fooocus?
I only really have hands on experience with Fooocus to a degree but that is SDXL based
is the prompt the same for both, and what kind of negative prompts are on both
yes, same prompt, same generation steps, same CFG scale, same seed
but completely different results
the double spongebob on the first one could be a resolution issue
other than that check your styles
This is what i can see going out to their API from their UI
This is my local settings
This is their output (with the missing eye) (it's not perfect ofc but u can clearly see it's higher quality, this is what i want to achieve for now)
And this i get with the same prompt and the above (looks the same) settings (will send in few mins once it stops generating)
A1111 + SDXL
Thank you
you're now in my top 5 favorite ai artists
it also stylistically takes advantage of the glitches
Oh welp.. it is just the model and some bugs in A1111 which lead to such images. I am only an explorer
Yes,that was the target
The whole style was designed around these
genius
and this is what i get locally...
i dont understand why my local model insists on giving two mouths...
@torpid stag man I think your prompt leaves the program in full control of the scene and arrangement since you didn't specify anything like that; try being a little more specific.
Resolution too high or wrong upscaling
but in that case why does it work perfectly on their website?
I'll admit I'm really new to this stuff man; I've just been a little obsessed with SDXL the last few days but take what I say with a grain of salt
no if i use the same seed its the same
i just tested out different ones between their UI and local to see if anythings diff
yea yea for sure. getting a good result on their website i've gotten multiple
it's just that im trying to replicate their results locally
even their worst result is better than what i get locally
i use 1024x1024, i tried changing to 256x256 and it was just ugly blurry stuff.
whats upscaling? can i change it in settings??
turbo is 512x512
Try 512x512 as Clownshark recommended. Upscaling is at a later stage.
let me try it now
btw does it matter what "Scheduler" i use?
it's set to euler
Leave it there and experiment with them later. Lock the seed and change the scheduler from image to image and you will see which work well and which work better 🙂
this is what i get on 512...
no double mouth but low q
it is 512x512 and a turbo model.. i am not familiar with them, maybe experiment now with steps and the sampler.
the thing is i know desired steps should be 40 from their API
is turbo model just "faster" than the base SDXL?
try 10 steps
or is it supposed to yield better results
Turbo is just for speed, not quality
wouldnt less steps yield lower quality results?
got it. Im assuming the higher speed means quality will be lower though no?
i am trying now 10 steps
this feels like a super cinematic version of Streets of Rogue if you turned on the random environments mod
SoR is a procedurally generated game that loads one map at a time but it's broken up into different styles
if you put on that mod it mixes them up
with 10 steps...
where can i find the normal sdxl model (non-turbo)? i would like to try it
any reason you really want to use turbo
no
go to civitai.com
simply thats what i saw written on their website and i wanna replicate it
Not normal, but i use it now for several days
is this some fine tune of it?
Which UI do you use?
yes
i mean
the UI website thing that generated the good results
is called
and i just setup invoke.ai locally to run the models via local UI
Ah, you don't generate locally?
Then you have to stick with their model. I never used any online service.
i am generating locally, i am trying to replicate exactly the results from their website exactly to my local : )
because i initially used their service to test, i liked it, then i tried to gen locally with same params from their UI and the results are way different locally
That could be model, ui, seed, version of the model, how the random numbers are generated from seed, UI version..
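To make the "how the random numbers are generated from seed" point concrete, here is a toy sketch using only Python's standard library (not any UI's actual noise code). The same seed with the same sampler repeats exactly, but the same seed with a different normal sampler gives different noise, which is one reason matching the seed alone doesn't guarantee matching images across UIs or services.

```python
import random

SEED = 1234  # arbitrary seed, chosen only for the illustration

def noise(seed, method, n=4):
    """Draw n standard-normal samples from a freshly seeded generator."""
    rng = random.Random(seed)
    draw = getattr(rng, method)  # rng.gauss or rng.normalvariate
    return [draw(0.0, 1.0) for _ in range(n)]

same_a = noise(SEED, "gauss")
same_b = noise(SEED, "gauss")           # same seed, same sampler
other = noise(SEED, "normalvariate")    # same seed, different sampler

print(same_a == same_b)
print(same_a == other)
```

In image generation the same effect shows up with, for example, noise generated on different devices or by different libraries.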
average house interior in my hometown
I would forget the idea of pure replication anyway and start prompting what is in your mind. 2 days, and you are heavily addicted 😄
where are you, detroit?
honestly, I can see AI art getting more accurate and less prone to glitches but I don't think you will ever be able to get exactly the result you want with no skill. There are so many options and things to configure that I think it will always take a degree of skill to generate the right picture unless the computer can read your mind.
Start simple, play around, get experience.
Then examine 2 different, maybe a third checkpoint and examine the difference. In style, quality and prompt following.
From that point, you will know what you want and need. Don't run from checkpoint to checkpoint every half an hour. 'Learn' at least one checkpoint.
but I will say I like a lot of AI art styles because of how imperfect, random or glitched they are, and when people use models like Flux to get indistinguishable results I think that is bad for humanity because people will get lazy and use tools like that in place of real photography or exploration, or it could be used as propaganda. I'm not extremely biased either way but you need to be open minded to the pros and cons of this technology.
yesterday was all about transitioning art styles for me, that's what I learned how to do from a tutorial and when I got a result that was close I kept the seed and most of the prompts and started making fine adjustments to the sliders, styles and prompts
of course I'm a damn idiot and moving files around I lost the log for that whole project so I don't have all that info anymore
That is already too much for me 🙂 I've just played around with this tech for the last 2 years, hardly watched any video.
it was a really easy tutorial and because it's just fooocus I think anyone could wrap their head around it
I am pleased, if the image 'looks' good.
And enough to explore, 20sec later i need another one. 😄
he looks like an automaton
H
IP-Adapter adapts. Honestly a very powerful tool to push an SDXL model to a certain output.
For users who still only use SDXL models, I've created an SDXL checkpoint that runs on the Turbo base. It's hyper trained to a maximum 100%. The adherence isn't that of Flux but it's very, very good and the output image is the best you'll see from any SDXL checkpoint. See what you think of the images posted in the gallery. Cheers https://tensor.art/models/751494452436224082/TWIN-TURBO-Dream-Diffusion-By-DICE-v1 https://www.shakker.ai/modelinfo/3ed729873f034271bef568cfc92176aa?from=personal_page
1.8K runs, 33 stars, 2 downloads. TWIN TURBO Dream Diffusion Collection that brings total realism to life.I've had a lot of requests on how to use and add th...
Our hub provides members with exclusive access to an elite selection of AI image generation models, designed to produce superior quality images that stand out in any creative project
@ me if you know how Fooocus metadata works
#✨️ Mickey mouse
Here is your requested image
#bot
that s not how the bot works, cf #artisan-faq
I THINK so Brain, but where are we going to find a tattoo parlor open at THIS time of night? NARF!
Working on a latent upscale workflow to squeeze the most out of Mangled Merge Lightning for a Flux dataset. These are cherry picked. Still some kinks to work out.
For instance ....
Might need controlnet.
its working well in terms of avoiding errors
would be great if you could get the CFG-burn a bit lower
Scheduled_CFGGuider from Inspire pack and sampler_tonemap_rescalecfg.py from ComfyUI_experiments can help a lot
nice, you may like 1_particular creations https://www.instagram.com/1_particular/
Thank you, i don't like instagram.
mmm x? https://x.com/UnoParticular
Really?..
You say you didn't like instagram
Let me check..
I wasn't sharing instagram, just the artwork
A more neutral link here- Interesting LoRA material.. https://mixam.com/print-on-demand/65e91aa33457744a5a6ac2eb
I'm gonna look into them. Lightning is kinda weird that CFG has to stay around 1 especially with the newer pp samplers.
oh its lightning, never mind
those nodes may well mess lightning up
lightning is very delicate
Right!? I'll still try them out though.
when I want speed
I go for 15 steps, beta scheduler, unipc sampler
with regular sdxl
its nearly into turbo range
but with base sdxl models
I found the sampler tonemap node but can't seem to find the scheduled CFG guider. The tonemap helped but I'm gonna have to play around with it a little more later.
on the tonemap node if you put the rescale CFG to 0.5 that's usually good
but the tonemap level you gotta test lots of different %
its different every time
scheduled CFG guider is from inspire pack I think
https://github.com/ltdrdata/ComfyUI-Inspire-Pack
there's a perp-neg version, if you are a perp-neg user
rescale CFG at 0.3-0.4 is also ok
you might not like it, I have different taste to most users lol
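For context on what the rescale number does, here is a sketch of the general rescale-CFG idea on plain lists of floats. It is not the code of the tonemap node mentioned above, and mapping the node's "rescale CFG" setting onto `phi` here is an assumption. The guided prediction is scaled back to the spread of the conditional prediction, then blended with the unscaled result.

```python
from statistics import pstdev

def cfg(cond, uncond, scale):
    """Plain classifier-free guidance: uncond + scale * (cond - uncond)."""
    return [u + scale * (c - u) for c, u in zip(cond, uncond)]

def rescale_cfg(cond, uncond, scale, phi):
    """Blend guided output with a copy rescaled to the std of cond (phi in 0..1)."""
    guided = cfg(cond, uncond, scale)
    factor = pstdev(cond) / pstdev(guided)  # shrink the over-amplified spread
    rescaled = [g * factor for g in guided]
    return [phi * r + (1.0 - phi) * g for r, g in zip(rescaled, guided)]

# Made-up predictions, only to show the effect of phi at a high CFG scale
cond = [0.2, -0.4, 0.9, -0.1]
uncond = [0.1, -0.2, 0.3, 0.0]

plain = rescale_cfg(cond, uncond, scale=12.0, phi=0.0)  # no rescale
half = rescale_cfg(cond, uncond, scale=12.0, phi=0.5)   # the "usually good" value
full = rescale_cfg(cond, uncond, scale=12.0, phi=1.0)   # fully rescaled
```

Higher `phi` pulls the spread further down, which is roughly why it tames the burn at very high CFG.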
oh that's great, yes I just wanted to share it, I thought you were pissed off or something
he is a friend of mine, I didn't even know about that site
Processing A1111 photos (via metadata) in ComfyUI+LoRAs
I was confused earlier because I had the inspire pack but I wasn't seeing the sampler nodes ... I'm still not. I've updated comfy and the inspire pack, then uninstalled the inspire pack and reinstalled it. None of those worked for me. I also don't see any conflicts for it in the manager. Dunno what the issue is. shrug
It's a messy workflow, but I think I finally got it dialed in. It's using a 3 pass NNLatent Upscale process.
I brought the denoiser on that last ksampler down to .4 I think .5 was a bit much.
They are already here..
They hide..
But they are ready..
They will search you..
They will find you..
And then it is all up to you.
Dream your own end with the Cine Pony XL checkpoint and : https://civitai.com/models/698523
on the github page here:
https://github.com/ltdrdata/ComfyUI-Inspire-Pack
scroll down to Sampler nodes
and there is Scheduled CFGGuider (Inspire) and Scheduled PerpNeg CFGGuider (Inspire)
its better to avoid using the manager and just fetch lines of python yourself with wget and git pull really
so you could just rip out the code from here:
https://github.com/ltdrdata/ComfyUI-Inspire-Pack/blob/main/inspire/sampler_nodes.py
stick it in a node yourself
Guys I tried various style dora and they're much more consistent wtf
From various authors too
dora good ye
and the verb is "train"?
Finally got it working. I had to uninstall all of my custom nodes and start from scratch. I added a depth controlnet into the mix as well. Super messy workflow:
tone mapping, custom sigmas, thresholding, NNlatentupscale, noise injection, PAG, freeU, scheduled CFG guider
such an amazing workflow
congrats 😄
I think this is the coolest comfy workflow I have seen so far
beta scheduler and deis sampler are both great choices too
Thanks! It's still not perfect yet, but I think it has to do with the fact that it's a lightning model. Gonna test with vanilla mangled merge tonight.
Testing the LoRA
Is it flux?
Very good. Crisp detail.
It is just the model.
Which one is that... so many SDXL mixes and merges. Getting so confusing...
Flux. Not so good...
Wait until there is more variety and plenty finetuning happened.
Still some minor errors here, will redo the LoRA
It turns out a lot of the issues I was seeing with weird artifacts were due to it being a lightning model I was working with. I made some small tweaks to run it with a vanilla model and added a canny controlnet. Results are much better and details are controlled by the denoise on the last Ksampler. I embedded the workflow into this jpg. Not sure if discord keeps that in if you download it. Gonna test it now.
OK cool. the embedding works if anyone wants to try the workflow out.
It only upscales by 2x, but it's enough for what I'm trying to do with it. If you want to upscale more, you will have to find the right % formula for the 3 ksamplers. The last one should upscale as little as possible to stay close to the original.
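A small sketch of the "% formula" arithmetic for three passes, keeping the last pass as small as possible. The factors 1.4 / 1.4 / 1.02 and the snap-to-8 rounding are assumptions for illustration, not the values from the embedded workflow.

```python
def pass_sizes(width, height, factors, multiple=8):
    """Return the image size after each pass and the total upscale factor."""
    sizes = []
    total = 1.0
    for f in factors:
        total *= f
        # Snap to a multiple of 8 so the size maps cleanly to latent space
        w = round(width * total / multiple) * multiple
        h = round(height * total / multiple) * multiple
        sizes.append((w, h))
    return sizes, total

# Two larger early passes, then a tiny final pass to stay close to the original
sizes, total = pass_sizes(1024, 1024, [1.4, 1.4, 1.02])
for (w, h) in sizes:
    print(w, h)
```

With these hypothetical factors the passes land on 1432, 2008 and 2048 pixels, so the total is just under 2x.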
works well with higher denoise too. This is at .9.
What vae would be best with this? https://civitai.com/models/288584?modelVersionId=324619
I forget what a vae is
(Masterpiece), (Best quality), (Ultra HD), (Super detail), (Whole body :1.2), 1 girl, Chibi, cute, smile, flowers, outdoors, holding the camera, sitting on the roof looking out into the distance, with mountains in the background, amber, warm yellow, sunset, artistic sense, Quadratic style, white clothes,
I really love your project
its nice to see diffusion projects that are physically larger
Usual Portrait XL V2 released. From boring 'female portrait' images up to shiny beauties from long AI generated prompts, everything is in there.
https://civitai.com/models/499529/usual-portrait-xl
this is an interesting lora
could be good to apply at different strengths
on some images
what is the most efficient way to prevent SDXL from generating any text?
ive been working on prompting for logos but a lot of the time SDXL keeps generating text/typography on the logo no matter what my negative prompt is
perpneg is by far the most important thing
perpneg makes your negative really powerful
also perturbed attention guidance or smoothed energy guidance helps with prompt following
also there was a paper showing that scheduling negatives massively boosts their effects
so the negative is not running for all of the sigmas
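What "not running for all of the sigmas" could look like, as a minimal sketch: the negative prompt is only used for a window of the steps, and an empty prompt is used as the unconditional side everywhere else. The window (first half of the steps) and the function names are hypothetical, since the thread doesn't say which part of the schedule to use.

```python
def negative_active(step, total_steps, start=0.0, end=0.5):
    """True if the negative prompt should be applied at this step."""
    frac = step / max(total_steps - 1, 1)  # 0.0 at first step, 1.0 at last
    return start <= frac <= end

def conditioning_plan(total_steps, start=0.0, end=0.5):
    """List which unconditional side each step would use."""
    return [
        "negative" if negative_active(s, total_steps, start, end) else "empty"
        for s in range(total_steps)
    ]

plan = conditioning_plan(10, start=0.0, end=0.5)
print(plan)
```

In ComfyUI this kind of windowing is usually done by restricting the timestep range of a conditioning rather than in custom code.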
can u explain what that is exactly? im really new to this stuff. been just experimenting with diff prompting, settings and LoRAs
You could be using a model trained with watermarked images. Try another.
Yknow addons that you use to enhance image quality? Unless pony doesnt use vae...
Updated latent upscale workflow. Took out the first 2 passes altogether, cleaned things up and added grouped processes and replaced canny with sketch.
Workflow is embedded in the image above.
nice
Only issue now is how to keep denoise above .5 without adding weird things like extra buttons. Anyone got any ideas?
ipadapter tiled seems to help. here's the first test. the left is without and the right is with. Both at .6 denoise.
Added controlnet tile as the initial upscale with latent now being a second beauty pass. Also added clipvision ipadapter tiled to the 2nd pass.
does it somehow apply NNLatentUpscale multiple times?
cos 1.02 seems like a low multiple
I'm not trying to upres the 2nd pass too much. I'm still testing but my theory is that the smaller it is, the less can go awry, but it still gives extra details. I'm gonna try 1 at 1.02 and another at 1.25 and we'll take a look at the diff. I will leave all seeds the same.
make t-shirt
Both are with Denoise .53. On the left is 1.02 and the right is 1.25. Check out the underarms and her bottom lip. Also, I took out inject noise and the clipvision ipadapter. both were giving wonky results.
ah yeah I see
.53 is high too. I don't really like the underarms on either of them.
that NNLatentUpscale can be tricky
I personally do VAE decode, do stuff in pixel space, then VAE encode
yeah. it needs to be small. that's why I added the controlnet tile upscale.
TBH if it gave some extra details at 1.02 without side effects then that's great
cos this could be dropped into nearly any workflow that has multiple passes
It does. I turned down the strength on the sketch and depth. For some reason the higher strength was adding dots in black areas.
yeah control net is tricky its very strong
sometimes its a case of only running it for a short while too
Now that makes me wonder, cause i'm running it for 20 steps. Maybe 12 is better?
gonna test...
if you are looking to do complex workflows, upping the step counts helps a lot
on the 2nd pass upscale ksampler though?
not much of a difference between 20 and 12 steps. nice time saver!
ah yeah that's almost identical
I think that's as clean as I'm gonna get it tonight. The updated workflow is embedded.
Pass 1 on the left with controlnet tiles, Pass 2 on the right with latent upscale.
it does look better on the right yeah
looks very similar to pixel space noise injection
which is good as the latent method is faster than pixel space
hi guys, i learned recently a bit about finetuning and i have some questions about how dynamic it is.
im building an app where one could input their biz/startup/game idea and go through steps where they will be generating context about it (objective, target audience, biz model, etc)
every time you generate something, it is used as context for next time you interact with an AI
rn im also working on a step to generate a logo for that brand.
what im doing currently is im dumping the brand context into GPT and asking it to create a prompt for SDXL, which i then insert with some other prompt keywords to make sure it looks like a proper logo
the issue is GPT's prompts are kind of trash.
i was wondering how well fine tuning would work if i:
- Got a bunch of pairs of brand context dump -> ideal logo image
- Trained an SDXL model on them
i.e would it work well if the "prompt" is a huge context about a brand, and not rather some small trigger keyword?
or is that not how it works?
🤣
fairly certain this would not go well
you could look into Omost
its an approach that is very under-rated
the way Omost works is you prompt an LLM
and the LLM goes and makes masks to set up a Diffusers workflow
I've been exploring SDXL latent space and its really just not a smart model
the more tools you use to examine SDXL the more you realise
bottles are strongly associated with lavender
if you boost apples it boosts bananas
I don't wanna criticise but its possible you had the dynamic thresholding setting in a rly sub optimal setting
I believe its meant to be more like half cosine up than half cosine down
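To show what "up" versus "down" means for a scheduled scale, here is a sketch with a cosine ramp between a low and a high value. The exact curve the dynamic thresholding node uses may differ, and the 1.0 to 7.0 range is made up. The point is only the direction: "up" starts low and ends high, "down" does the opposite.

```python
import math

def cosine_up(step, total_steps, low, high):
    """Scale starts at low and rises to high along a cosine curve."""
    frac = step / max(total_steps - 1, 1)
    return low + (high - low) * (1.0 - math.cos(frac * math.pi / 2))

def cosine_down(step, total_steps, low, high):
    """Scale starts at high and falls to low along a cosine curve."""
    frac = step / max(total_steps - 1, 1)
    return low + (high - low) * math.cos(frac * math.pi / 2)

STEPS = 20
up = [cosine_up(s, STEPS, 1.0, 7.0) for s in range(STEPS)]
down = [cosine_down(s, STEPS, 1.0, 7.0) for s in range(STEPS)]
```

With the "up" shape the early steps run at a low scale, so the overall form gets less guidance while the later detail steps get more.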
Criticize all you want. I am always looking to improve. I am going to test that out now.
The guidance part of the workflow is currently only going into the initial generation and isn't part of the upscale process, but I'm gonna test that too. I didn't want things getting too crazy on the latent upscale pass. But I'm also testing having a second clip for that pass and seeing how having a different positive prompt changes things.
For instance, the first pass is on the left and the latent upscale on the right with the prompt changing to "a professional photograph" .52 denoise.
Pass 1, Half cosine up on the left, Half cosine down on the right.
lol
in your test I preferred cosine down
but on comfy discord someone posted comparisons and I preferred cosine up
I think it depends on the model
my own images were better with cosine up mostly but they do get more fragile
cos they spend early steps with low CFG
so the form might be off
yeah it's probably completely different for rectified flow vs diffusion
and then different again for distilled vs non distilled
yeah, with the lightning model I was setting it to power up
leaving the clip blank on the 2nd pass still makes women for some reason... biased model I bet. Also interesting when I changed the prompt to an anime illustration
funnily enough I spent most of the last few hours fighting a lightning model
was very tough
they have quite a narrow band of settings that they are okay with
they are so finicky. especially with guidance tools
yeah
this is partly why I was unhappy with flux
its gonna be similar issues, maybe not as bad, but similar
less image diversity also
this was good in my experience: https://civitai.com/models/216190/lcmandturbomix-lora-only-12mb-8-step-sampling-effect-is-superior-to-using-lcm-or-turbo-alone?modelVersionId=246747
at least for their base model Hello World, it worked well
its a hybrid of turbo and LCM
it seems like the user friendly approach though. they want to make it accessible to newbs who don't want to fiddle around with cfg settings.
yeah frankly that's better for the market
not good for the experimenters though
the average response to flux has been really positive, so they judged the market well
my only issue with flux is that so many people are training loras on it that I can't find a point in training any myself.