#✨|sdxl
someone on the wing, some...thing
tried something peaceful but SDXL understand ... ignite immerse imagination ... Its own way...
is there a way to make auto1111 understand meta info from comfy?
@noble shoal New LoRA test proved a substantial success
had a failure in it, unfortunately. Ran out of space mid test, but I was smart enough to save training states
same exact training as my previous new one that was a success, but training with bigger BS, more choppy caption drop out, 2x the epochs, and 1/6th the LR
Results are underbaked due to stopping early, but immensely promising
left is the successful previous one, right is the new one that finished way too soon
skin texture and lighting are massively improved, as well as background fidelity
I used some weight studying in training, so I know only .504/1 of the model has been tuned, so I can resume from where I was and do the other half, hopefully getting to about 0.91, which is right before it starts to deform and over-fit it seems
for example, by epoch 15/30 in the last training, the model was already at 0.907, and by 30 it ended on 0.94 and was way overtrained
While here at epoch 30/60, it's at 0.504, so hoping for about 0.9 by 60
if that proves a success, I will continue and re-train using my full 500 image dataset instead of the current 118
Hi guys, is there a download link for nmkd siax 200_1 upscaler please?
https://openmodeldb.info/models/4x-NMKD-Siax-CX
only siax on that upscaling dedicated page.
Where do I place it?
if you are in auto, same place you would put SD models, but instead put it in the folder that says upscale
or sorry, no
auto > models > ESRGAN
Okay thanks. There isn’t an upscaler called siax 200k_1?
I think it might be deprecated (outdated)
that one is already in Auto
But it tells me couldn’t locate the upscaler
could be this, it was trained for 200,000 iterations.
these should be your folders
Granted, my install is quite old by now
oh wow, you have a lottttt less
I am not sure in that case
then make a folder and put it there
put what there?
what model are u asking for?
i think esrgan is as good as realesrgan
I just downloaded the Siax 2000
where do I put it?
and Also I am looking for the ESRGAN_4x
ESRGAN folder
For Siax?
yes
idk what is a formula upscaler
based on FormulaXL model
I couldnt find ESRGAN_4x
could you please provide me a download link for it
Siax is esrgan 4x
check the page I sent you the link to and choose the one you would like.
https://openmodeldb.info
yea put it in esrgan folder too
In the dropdown list in the A1111 UI there is a list of upscalers, and ESRGAN_4x is separate from the Siax
I also downloaded the SDXL refiner
it is called sdXL_v10RefinerVAEFix, where do I put that file?
yes
where in models folder?
Stable-Diffusion
what is called
thats the name
same as where my models are?
It is called VAE in file name
That means the built in VAE is fixed
all models have a VAE inside of them, a dedicated VAE is if you want to use a different one
yes
put it in the same models folder as the others
its a diffusion model just like base, but its made to run after
Alright thanks
no problem, hope it helps
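For anyone following along, the folder advice above can be sketched as a small helper. This is a minimal sketch that assumes the default A1111 layout (models > ESRGAN for upscalers, models > Stable-diffusion for checkpoints, including the refiner). The `kind` labels, the function name, and the example file names are made up for illustration.

```python
from pathlib import PurePosixPath

# Folder names under "models" as discussed in the chat.
# The "kind" keys are hypothetical labels, not something A1111 defines.
MODEL_FOLDERS = {
    "checkpoint": "Stable-diffusion",  # base and refiner checkpoints
    "upscaler": "ESRGAN",              # ESRGAN-family upscalers such as Siax
}


def destination(webui_root: str, kind: str, filename: str) -> PurePosixPath:
    """Return where a downloaded file should go inside the web UI folder."""
    if kind not in MODEL_FOLDERS:
        raise ValueError(f"unknown kind: {kind!r}")
    return PurePosixPath(webui_root) / "models" / MODEL_FOLDERS[kind] / filename


# Example file names are illustrative; use whatever name your download has.
print(destination("auto", "upscaler", "siax.pth"))
print(destination("auto", "checkpoint", "refiner.safetensors"))
```

If the folder does not exist in your install, creating it by hand (as suggested above) should be enough, then restart the UI.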
screen of cmd?
it is fixed now
I just restarted the UI
idk what caused it
AttributeError: 'DiffusionEngine' object has no attribute 'cond_stage_model_empty_prompt'
I am getting this now
🤷♂️
(Photoreal, POV close:1.3),cosmic evil DMT realm headspace, horned hooved muppet baphomet, Vinton Studios claymation, cinestill cinematic, beautiful dim lighting, night time, contemporary, despair hope, pentagram on forehead, seated cross-legged, wings spread, both male and female traits
sounds like you were trying to generate Baphomet lol
?
happy birthday!?
Is there any way to use controlnet tile with sdxl yet? I tried all the stuff like LLLite with blur/ultimate upscalers and such, but it just doesn't look as good
ssd-1b is pretty cool as a side note
big sadge
maybe in a year or two sdxl will finally replace 1.5
no chance even we will have dalle 4 lol
It looks so legit 😂
sdxl has never looked so good
I am happy to announce this monstrosity: https://civitai.com/models/183354?modelVersionId=205793
Listen. I need to win this 4090 to keep my quality as high as it is at the moment. 😅
Kohya might be one of the most annoying tools to use. It's got dozens of settings that are not properly documented, tons of redundant features that interfere with each other, buggy and unreliable bucketing that seems to ruin training, and resuming from save states doesn't work.
Scam
Bro's getting a free ban
Yessir
@smoky patrol scam above
I would have loved to show you guys the results from my latest training for my realism LoRA, but kohya crashed a whopping 3 steps in
i 3 times the strength
Which model?
ssd 1b
Oh, no guarantees with that
Ah, see. Much worse
Feel free to report messages when you see them!
Is that 343 Guilty Spark?
If anyone is looking for something different to play with, I made a LoRA that creates caricature portraits. 😜 Many thanks to Wizard Whitebeard for sharing his horrible and wondrous Gildenface LoRA with me! It was my inspiration and I used it to create the initial training images for the first round of distorted images for this LoRA. I hope you can have some fun with it! https://civitai.com/models/181092?modelVersionId=203235
You ever have one of those days when you learn something that changes things for the better but still pisses you off because you didn't already know it?
wget --content-disposition ugh...but yay
Does that set the proper file name?
Yeppers
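For context on why that flag helps: `--content-disposition` makes wget name the saved file from the server's Content-Disposition response header instead of from the URL. Here is a minimal sketch of the same idea in Python, using only the standard library's header parser. The header values below are made-up examples, not real server responses.

```python
from email.message import Message
from typing import Optional


def filename_from_content_disposition(header_value: str) -> Optional[str]:
    """Extract the filename parameter from a Content-Disposition header value."""
    msg = Message()
    msg["Content-Disposition"] = header_value
    # get_filename() returns None when the header carries no filename.
    return msg.get_filename()


# Hypothetical header value, for illustration only.
example = 'attachment; filename="my_lora.safetensors"'
print(filename_from_content_disposition(example))
```

Without the flag, a download URL that ends in a numeric ID would typically be saved under that ID rather than the model's file name.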
Taylor Swift as a muppet on cctv
Guy on the left: "definitely her"
I see what that is
"portable pear"
welcome back, @hardy cipher
Wow, that's such garbage that I had to download it. PoP, you might enjoy my MS Paint LoRA. 😬
did you hand mspaint all the images?
Nope, I am lazy af. I used a magical thing called the internet to find reference images.
this lora does look like a gem though
I wonder what it would do if you're a mad lad and make it negative
🤔 hmmmmm....
perfect
Just made this fun and colourful lora. Check it out
https://civitai.com/models/184095/neon-style-xl
happy halloween
SDXL Tensor.art ... dozens of free generations/day
It has LoRAs, base and refiner, and Controlnet
It can be linked to (or is planned to) ComfyUI
Hi. I'm completely new to all AI stuff and need some pointers on how to get started.
I'm trying to use stable diffusion (Installed the easy diffusion version) to create some standard fantasy art (Knights, witches, monsters and so on) for use in TTRPG's I run. Character art, landscapes, locations etc.
But I have no idea how to start, just typing in prompts ended with some... interesting results.
Any pointers on where and how to start would be greatly appreciated.
You need good prompts and styles, use https://civitai.com/images to look for images and looks you like and build up some prompt templates
While you're doing that, take a look at the models of images you like, you'll eventually have a few models around
Then if you have the hardware for it you can install comfyui and a1111 and start learning more technical workflows (start by copying workflows).
I am trying to get realistic, lifelike images using SDXL for my website, so anyone can create a realistic, masterpiece-quality image in a few words without needing to write a long prompt
I tried fine-tuning SDXL and tried different prompts, but in the end I only get these types of images
Please can anyone help me 👀👀👀
I used this prompt
an old woman holding lamp, photorealistic
Happy Halloween
Nice bro
Awesome,
can you please tell the model name ?
looks like it's base model with a mummy lora and an art lora
you can drop it in comfy to see the workflow
Thanks
happy halloween you comfy peoples
workflow included
custom text chatbox node
Happy Halloween! 🎃 👻 🦇
...What is hidden in this image? 👀🌀 https://www.instagram.com/reel/CzEbjOMoEp_/?igshid=MzRlODBiNWFlZA==
lol I thought this was Pixelmind at first
the design is almost identical
||netflix||
Pixelmind was a text-to-image service that I worked on for about 4 months before Midjourney's beta started
I'm just commenting on the web design
Oh sorry I thought about model
go to civitai and search for some models/loras to help u out
@myg the more specific you are with the prompt, the better the results you will get. The model you download also matters, because the image is generated from what that model learned in training; "Realistic Vision V5.1" is a good one for real looking people. When you get deeper into it, you can use LoRA and ControlNet to fine tune your outputs.
Has this been posted here? System Memory Fallback for Stable Diffusion: https://nvidia.custhelp.com/app/answers/detail/a_id/5490
Did somebody say pixel? 👀
Asked in the prompt section but didn't get any response. Does anyone else prompt with Text Concatenation and Cutoffs? Seems to help a lot for specific detail. Wouldn't mind seeing other prompt workflows. This is using ComfyUi for those unfamiliar.
red paint redemption
Another classic. American Paint
Okay well apparently my lora can do the Netflix logo, just maybe not spelled right every time 
I gave it a 5/5 on civit 🙂
I saw. Thank you 🔥
Nah, it's just a MS Paint style LoRA
I bet it would still look really cool if used with mine
is it not possible to make embeddings with XL? (trying in automatic)
don't see why it would be impossible
almost all training in automatic1111 is flat out not working for SDXL. All we really have is Kohya
not quite as good as yours, but only lora I used here
must be from the covid life days. wearing those masks, people socially distancing
need a lora based on this theme
characters from the oregon trails game fighting dudes from fight club in the streets of a large city while 8 bit babies that drink gasoline and spit fire add to the mayhem and smoke cigarettes
a prompt on base sdxl?
Wdym?
is that a lora or something? it looks awesome!
Good Evening... whose neck is available...
Yes. Aether Pixel. Will release it soon.
Ohhh okay it's not out yet. It looks really great so far, can't wait to try it out!
My lora generates custom text, and I think that effect would be so sick with it
Cool!
just made this for somebody on another server
I feel like that particle effect would look so sick with these
How do you control what pixelates/from where?
Happy Hellowin )
Cutie Audrey Hepburn if she was alive today. 😉
I always had a huge crush on her even old and shit
Btw, that ain't Kathryn that is Audrey
Fixed. This is why I shouldn't do 3 things at once.
Do one and do it right.
Katharine never did it for me but she would be good as your tom boy friend to just mess around with. Later in Life she sort of went "I am Kate, and I don't give two shits anymore" on everyone.
new feature added to NV drivers for stable diffusion. mainly for users with 6-8gb cards https://videocardz.com/newz/nvidia-introduces-system-memory-fallback-feature-for-stable-diffusion
allows you to make it perform like it did a bunch of updates before
I thought they had that on Windows since long time oO
the question is: when do we get that on Linux, too
Ive not seen it before and they claim its new
weird. I always thought windows is using the RAM as fallback when vram is full 🤷
if you google that you get so many explanations about that. But I have no clue, I don't use windows for that
the new feature allows you to decide to allow that or not
previously you had no choice
Well this is interesting, implemented, I go from about 3 minutes upscale process, to 35 seconds upscale process.
But then I OOM if I try and push to my FaceDetailer node
oh yall chatting about it lol
yeah thats the down side. it may crash if it cant handle it
For straight up simple gens though, I get an absolutely monstrous boost in speed though,
vram is blazing fast compared to system ram
ram is fast, too, but the transfer between ram and vram is slow I guess
Yeah wonder where exactly that bottleneck is.
Imagine if that could get resolved and RAM could be utilized efficiently.
Could be a pipedream though haha
dreams in 64gb RAM
This one is so creative and well executed, care to share the prompt that managed to create it?
Needs bigger fangs, like this. I added "like sabre tooth tiger teeth" etc.
Could someone explain the difference for me:
yes
post the link
Use: support word "isopod-like" to achieve the right effect. SD1.5 version created as described in the tutorial: https://civitai.com/models/52697/t...
it was SD 1.5 and now they updated it with a new version for SDXL
Okay wouldn't it make sense for it to be 1.5 and 1.0?
It's the way people do their lora versioning. It's version 1.0 of their model based on SD1.5, normally you would see version 2 based on SDXL and eventually version 3.0 of the lora based on whatever version of SD is out. You can see Sergeant's workflow on civitai is version 1, 2,3,4,4.1 etc...just depends on the user and how they want to number it
Ooh glossy detail, nice
Spawn? 🤌🏿
✨
What the F
he's massive and ballistic 🫡
Bro has balls of steels 🤣
ball-istic 🪩
🥶
nice one
A Spawn/Venom mix.
Shameless self-promotion:
https://civitai.com/models/185138/sdxl-sanddrawing
looks good! 🙂
nice!
thanks!
hey guys, I'm happy with this result, after a long round of adjustments, what do you think?
Love the lighting; nice!
I like the overall image quality and the background details, but i can't unsee this neck 
😄
is it a general problem in sdxl that when u have something like sunlight in the bg, the person in front stands out a lot and looks overexposed? Like there is too much contrast
You can do that, although i can't 100% recommend it. Problem will be double features. Like SD 1.5 on 1024.
My smallest dataset was 85x111. Just for fun
ok cool thanks
have you tested loras vs ckpt for training people? Like do you think a lora produces better results of the subject over training a whole ckpt
I have not tested it since i have only a 12gb card on my hands. But feel free to ask in #🔧|finetune
This IPAdapter is way cool!
I've never wanted to eat a cat so much
sushity
Ya man. 😄
a very serious wizard made a small error in his transmogrification spell, he used the wrong form of you
cool style! alice in wonderpunk land
hey guys, can I upscale an image without using the Hires setting at the time I generate?
i am here
what did i miss
noice 🔥
bro 
No time to say, "Hello", goodbye!
Yes, but you will need extensions like Ultimate SD Upscale, ControlNET Tiles, and perhaps Multidiffusion—perhaps even all together—to get good results. But you WILL get much better quality from such methods
Here’s an example; it starts off with a crappy painting I did, then I put it into the AI at a low resolution, and then up scale it to 2.1K:
Cool
And also, if you decide to use them, you will need to read the GitHub documentation as it can be a bit confusing to use. In these images, I used ControlNET Tiles and Multidiffusion to create a high quality painting that doesn’t have the “Upscaling” texture that you get when you only use ControlNET Tiles + Ultimate SD Upscale
Good to know, thank you!
Only for A1111?
I personally used in A1111, but I believe it is possible to use these in ComfyUI
why does CR Latent Batch size increase s/it so much and increase the time it takes to generate by 1000x?
I put it at 16 since I want to generate more images at once
you need more vram for larger batch sizes
IPAdapter can go really slow the more Photo Inputs you use. I found that 3 inputs and Photoshop really slowed things; so closing Photoshop made things a lot faster! But Bingo - there is Online PS now - which makes no difference to the IPAdapter at all 🙂
An SDXL ComfyUI original ... using Generative Fill in Online PS
in auto1111 you can generate 100 images np
that is batch count. generating them one by one which any gpu can do
batch size generates x number of images at the same time
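The distinction above can be put into a few lines. This is only an illustrative sketch: the function and field names are made up, and it says nothing about actual VRAM use beyond the point made in the chat that larger batch sizes need more of it.

```python
def generation_plan(batch_count: int, batch_size: int) -> dict:
    """Summarize a run: batch_count passes, each producing batch_size images at once."""
    if batch_count < 1 or batch_size < 1:
        raise ValueError("batch_count and batch_size must be at least 1")
    return {
        "passes": batch_count,            # run one after another
        "images_per_pass": batch_size,    # held in VRAM at the same time
        "total_images": batch_count * batch_size,
    }


# 100 images one by one: only a single image's worth of VRAM at a time.
print(generation_plan(batch_count=100, batch_size=1))
# 16 images in one pass: needs room for all 16 at the same time.
print(generation_plan(batch_count=1, batch_size=16))
```

So on a card with little VRAM, raising batch count is the safer way to get many images, while raising batch size is what causes the slowdown or OOM.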
any node for batch count in comfy?
its on the toolbar. tick extra options box
Preserved...
Harry Frogger and the Galactic Jedi Shrooms
tron mechanic
guys i have to get a new video card i guess, my pics suck
gpu doesn't change quality of image
well i cant run xl and regular crashes alot from no vram
that's a separate issue lol
I have yet to see ComfyUI give VRAM errors
wish i could do something with this one im on. oh and the one i have is not NVIDIA
i have lots of issues eh? or comfyUI may solve most?
Cheaper to try ComfyUI than buying a new GPU.
Unless you're dead set on purchasing a new one
Does it let you allocate some regular ram to it by chance?
Boba Tron
not exactly.
How much vram do you have?
says 4 but i think 3.6
64 mobo ram, gpu is only part i didnt buy when i built this box
you may squeeze by with 4 in Comfy. It'll be slower as it'll likely use tiled vae for everything
yep, that's exactly what I was about to say
Do you think a $400 or less video card would work for SDXL
I use a 2060 Super just fine
i have 6400 XT
Wish it was faster, wish I had more VRAM, but I can get by with nearly everything
no 6400 RX sry
Probably generated over 10k images at this point, I couldn't even count
yea but 2060 has cuda and prob more stream threads
Yup, it's a good card for what it is
ohhh i see what you're saying
old AMD cards are a bad idea for AI imo
yes i checked its good price and quadruple ram
I think used 3090s/ti from ebay or something are the best budget rn, right?
u have the 12gb or 6gb?
Not sure who you're asking, my 2060s is 8GB. Not sure if there are other options for that
yea diff manufacturers put different fans and stuff on that chipset i think, they range from 6-12gb from what i've seen.
those 24gb topshelf or are there higher?
Consumer grade, I believe 24 is about the max
If you're gonna continue with this route, you won't regret getting more vram
Yes.
yeah, that must change in the future.. if we would want models that use something like T5 we would need a little more than that
I'm really curious what nvidia does with the 5k series, since between the 4k release and now, there hasn't really been anything released, and AI has exploded since then
I'm thinking 32gb cards maybe? it probably won't cost them too much to bump to that, then SAI could consider having a bigger text encoder on future models, and using something like LLaMa with SDXL would be a possibility
If they don't tread the right way with it, it'll open the doors for something else to really swoop in
nvidia
it has a 2tb ssd but i haven't read it yet, looked it up again to share https://videocardz.com/newz/asus-shows-off-rtx-4060-ti-graphics-card-with-m-2-ssd-slot
Interesting if it's comparable speeds as built in (it may specify in the actual article)
bumping the VRAM to 32 would definitely be smart from NVIDIA, considering they are actively making driver updates SPECIFICALLY for Stable Diffusion
so we could expect the 5k series to have some capabilities that allow SAI to start to consider making models with a fat text encoder
I personally don't expect too much from NVIDIA considering how much they fucked up the 4000 series, but judging by their recent moves; there is a chance that they will actually make the GPUs capable of such a thing
My fine tuned sdxl model running on nvidia A100 large
Model : fine tuned sdxl
Prompt : A mystical house in Forset, concept art
Model playground : https://ionium.web.app/sdxl
its nothing, just an additional slot to be efficient. nothing more. weak.
$15k gpu and u got that? which one u braggin about lol
Lmao bro it's SDXL you can try generation time is just 20 seconds for 2 images
I'm on a 4070ti and for me it's 30s for batch 4
steps?
takes me a little bit over 30 seconds to make stuff like this
why do you have that card? I dont know anyone that would have that... @velvet shore
I've 4070
I use AITemplate so the sampling is near instant
Yes, but I have some users also
Hehe 😁
I liek that pink
how many steps?
I did 60 there
like TensorRT, but a little faster and entirely flexible
just acceleration
im for real.. did you write it off? what kinda work?
I'll try Nvidia acceleration on A1111 as soon as possible
I just wanted to take reviews as I am making a complete fast stable diffusion web playground with many models
im looking at specs.. thats super insane. like having a 1995 ferrari f50 or something
I'd personally recommend ComfyUI, A1111 is outdated
world’s fastest memory bandwidth at over 2 terabytes per second (TB/s)
When u upgrade please dm me
I think it must be supercomputer
2 TB/s
Nvidia new drivers extension is for A1111
it's still not as fast as AITemplate. NVIDIA finished making that extension a few months ago, they just now released it because licensing or whatever..
also you'd need to compile EACH checkpoint individually to turn into TRT engines, AIT is architecture specific; not checkpoint specific.
in 10 years we should have specs of that a100. its happened time and time again for almost 60 years
H100 is even stronger
bro watch the language
just a heads up, typo on your main page "cinematic" vs "cinemantic"
world war 2 soldier werewolf, flamethrower expert, propaganda photo, Bernie Wrightson (American, 1948 – 2017) photography, 1986 aliens cinematic lighting by Denis Villeneuve
if i see stuff like that, i feel like i'm looking at brightprotonuke v1.0 (my first model had so much of those twirly artifacts)
if they do it'll be something like
5090 32GB: $2200
5090 24GB: $1600
5080 16GB: $1100
There's gonna be price pressure for sure given the state of the economy, as well as maintaining market share
nah, it doesn't hurt too much to bump the VRAM up
in manufacturing costs no but nvidia will 100% charge a shittone extra for the privilege of having 32
is my point
I just used the extras tab, which upscales the pic using GFPGAN or CodeFormer
that's dumb, more people will buy the 5000 series if it will have a normal price and 32gb
it would be more profitable and better for both sides
NV have limited capacity purchased with TSMC, so it's more profitable to use that time making server chips
if 1000 people buy a $2200 card with a 30% margin it's more profitable than 5,000 people buying a $1600 card with a 5% margin
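The arithmetic behind that claim, worked through with the numbers from the message. The unit counts, prices, and margins are the hypothetical figures given in the chat, not real NVIDIA data, and the function name is made up.

```python
def total_profit(units: int, price_usd: int, margin_percent: int) -> float:
    """Profit = units sold x price x margin (margin given as a whole percent)."""
    return units * price_usd * margin_percent / 100


high_margin = total_profit(units=1000, price_usd=2200, margin_percent=30)
low_margin = total_profit(units=5000, price_usd=1600, margin_percent=5)

print(high_margin)  # 660000.0
print(low_margin)   # 400000.0
print(high_margin > low_margin)  # True
```

With these made-up numbers, a fifth of the buyers at the higher margin still brings in about 1.65x the profit.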
though arguably the Titan class cards failed so it hopefully wont come to that
that's not considering the likeliness of SAI making bigger AI models due to the consumer cards being stronger. SAI brings NVIDIA plenty of attraction
the last few drivers are made SPECIFICALLY for SD
i dont think its a question of whether or not nvidia wants to make the best Stable Diffusion cards I think it's more whether they feel threatened by AMD's current cards
there's definitely been some movement with a lot of new ROCm support PRs being brought up in major repos, more specifically for the XTX
whether its enough to dampen the nvidia gouging is the question
speaking of; I managed to quantize T5 to 6bits and it was almost identical to FP32
what's the test?
quantizing can work well to your meat brain but might not work well when it's part of a diffusion pipeline is what im wondering
like the seemingly subtle losses could be magnified
just asked some general questions, responses are almost identical
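To make the concern about "subtle losses" concrete, here is a minimal sketch of plain round-to-nearest quantization and the error it introduces at different bit widths. This is an assumption for illustration: the chat does not say which quantization method was used on T5, and real schemes (per-channel or per-group scales, for example) behave differently. The weight values are made up.

```python
def quantize(values, bits):
    """Symmetric round-to-nearest quantization; returns dequantized values and the step size."""
    levels = 2 ** (bits - 1) - 1          # e.g. 127 for 8 bits, 31 for 6 bits
    peak = max(abs(v) for v in values)
    scale = peak / levels if peak else 1.0
    return [round(v / scale) * scale for v in values], scale


def max_abs_error(values, bits):
    """Largest difference between an original value and its quantized version."""
    dequantized, _ = quantize(values, bits)
    return max(abs(a - b) for a, b in zip(values, dequantized))


# Made-up "weights", for illustration only.
weights = [0.013, -0.4, 0.27, 0.9, -0.73, 0.05]
for bits in (8, 6, 4):
    print(bits, max_abs_error(weights, bits))
```

Each value is off by at most half a step, and the step grows as bits are removed. Whether those small per-weight errors stay small after passing through the encoder and then the diffusion model is exactly the open question raised above.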
I think we could have a quantized T5 as a text encoder? or will that need adaptations to the other components of the model before training
no idea. Pixart still hasnt released their weights so our only hands-on demo is DFIF which is certainly one of the models
there's no way to do that, the only way to test an encoder in a diffusion pipeline is if a compatible UNET has been created. this means SAI must be 100% sure about each component before they start training it
DFIF's T5 isn't quantized I think?
dont think so
i dont have it installed anymore
haven't run it since i got my xtx
could probably add it to my diffusers CLI script ig
I can try to quantize whatever version of T5 DFIF uses then replace it with the included one I think?
i dont speak in runes
To give you an idea, these 2200 USD become the value of a motorcycle here in Brazil, it's always like this, waiting for the 50 series to be launched before being able to buy the 3070. 🥺
buy like three 3060 12 GBs and string them together using deepspeed or something
what's the worst that could happen
I just read the repo and people talking about the repo, will give it a try in some time
"Text encoder needs a bit above 8GB of VRAM loaded in 8bit, so the whole thing should be able to run on 10GB cards." this means the UNET takes 2gb
it should take less if they load it in 6bit though
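A rough way to estimate that saving, assuming weight memory scales linearly with bits per weight. This is a weights-only sketch: it ignores activations, framework overhead, and any layers kept at higher precision, and the 8 GB starting figure is the one quoted from the repo discussion above.

```python
def scaled_weight_gb(gb_at_from_bits: float, from_bits: int, to_bits: int) -> float:
    """Scale a weights-only memory figure from one bit width to another."""
    if from_bits <= 0 or to_bits <= 0:
        raise ValueError("bit widths must be positive")
    return gb_at_from_bits * to_bits / from_bits


print(scaled_weight_gb(8.0, from_bits=8, to_bits=6))  # 6.0
print(scaled_weight_gb(8.0, from_bits=8, to_bits=4))  # 4.0
```

By this estimate, going from 8bit to 6bit would free roughly 2 GB, if the quoted 8 GB is mostly weights.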
oh you mean dfif uses 8bit by default?
yep, according to people on the repo
fucker OOM's my 24GB xtx so I think 10GB is bullshit
unless diffusers has made memory improvements since
I have 12gb, it should OOM for me if it did for you as well. I'll test it out
dfif was like the first thing I tried on my XTX and I had to reduce the res below base to render cause the 2nd upscale would OOM
I'll also change the code to use 6bit on the text encoder to see what it changes as a diffusion pipeline
maybe the script I used was just trash idk. i made my own diffusers script for SD models and it works almost as well as comfy for the most part
i really think i gave up on upscaling
trying like 12h now and still having the same issue
no upscaled
upscaled , RIP Details
for SDXL run either ultrasharp for art or swinir for photos then re-run XL at the final res after the upscale with like 0.3 denoise to clean up the artifacts
its the same for me, kinda no matter if sdxl or normal sd...
it always deletes any details, i prefer using extras tab now
left img2img, right extras tab
I usually use something else and it works insanely well for all stuff
every detail always get smoothed out, i hate it
do both together
i dont understand why others dont have this problem
or at the very least preprocess your img2img input with an unsharp mask or something
they do you just dont see all the failed results lol
have no idea what u mean 😮
oh they are on A1111, that must suck
the only solution for me is running the upscale with 0.01 denoise
then it looks "ok"
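One way to see why 0.01 denoise barely touches the image while 0.3 cleans it up: in img2img, the denoise strength decides how many of the sampling steps actually run. The sketch below is modeled on how the diffusers img2img pipelines pick their step count, as far as I know; other UIs may round or schedule differently, so treat it as an assumption.

```python
def img2img_steps(num_inference_steps: int, strength: float) -> int:
    """Number of denoising steps actually run for a given denoise strength."""
    if not 0.0 <= strength <= 1.0:
        raise ValueError("strength must be between 0 and 1")
    return min(int(num_inference_steps * strength), num_inference_steps)


for strength in (0.01, 0.3, 0.5, 1.0):
    print(strength, img2img_steps(20, strength))
```

Under this model, 0.01 denoise at 20 steps runs no denoising steps at all, so the result is essentially the plain upscale, while higher values let the model repaint (and sometimes smooth out) more of the detail.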
sharpen the image in GIMP/Photoshop/something then feed the sharpened image into img2img so SD can pick up on the details easier
sometimes helps
mh... i mean i could just use the extras upscale, seems to work better for me at all
if you're talking auto, what I do in other uis equivalent is extras upscale first then img2img after
mh might try that
or maybe its just i like the grain of the pixelation more and its matter of taste
"grain of pixelation" is certainly some words
your meat brain fills in the missing details at low res better than your computer's sand brain
you should look at SD.Next's sampler configs lmao. They're all hardcoded at like the opposite settings they should be using.
like ddim is set to "linspace" I think for the timestep spacing which I dont think any SD models use???
and the set alpha thing forgot the exact name is wrong along with step offset
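For anyone wondering what the timestep spacing setting changes, here is a small pure-Python sketch of the three spacing modes, modeled on the options the diffusers schedulers expose ("linspace", "leading", "trailing"). It is a simplified illustration, not the library's code, and the assumption that SD checkpoints are usually configured with "leading" plus a step offset of 1 comes from the published scheduler configs as I remember them.

```python
def ddim_timesteps(num_train: int, num_steps: int,
                   spacing: str = "leading", steps_offset: int = 0) -> list:
    """Return the sampling timesteps, highest first, for a given spacing mode."""
    if spacing == "linspace":
        if num_steps == 1:
            return [0]
        # Evenly spaced from 0 to num_train - 1, both ends included.
        ts = [round(i * (num_train - 1) / (num_steps - 1)) for i in range(num_steps)]
        return ts[::-1]
    if spacing == "leading":
        # Integer stride starting at 0, shifted by steps_offset.
        ratio = num_train // num_steps
        return [i * ratio + steps_offset for i in range(num_steps)][::-1]
    if spacing == "trailing":
        # Count down from num_train so the last training timestep is included.
        ratio = num_train / num_steps
        return [round(num_train - i * ratio) - 1 for i in range(num_steps)]
    raise ValueError(f"unknown spacing: {spacing!r}")


print(ddim_timesteps(1000, 10, "leading", steps_offset=1))
print(ddim_timesteps(1000, 10, "linspace"))
print(ddim_timesteps(1000, 10, "trailing"))
```

The modes start and end at different timesteps, so a model sampled with a spacing it was not set up for sees noise levels that do not match what the sampler assumes.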
yeah, ComfyUI is easily the best UI; there is no function it can't do that others can
simply because you create the functions
i feel like a proper diffusers frontend could be competitive but right now the only major one I know of is SD.Next which sucks ass
comfy is omega specialized for SD so it generally outperforms diffusers but diffusers is super flexible. Train an XL model with velocity prediction instead of epsilon and comfy cant sample it correctly
- the non-sd pipelines in diffusers are fun
anything based on diffusers can't really be competitive because the library isn't organized correctly
What makes you say that?
could maybe use a touch of deduping lol
the whole point of a stable diffusion library is to be able to use as much stuff as possible at the same time
diffusers just isn't good for that
maybe i should look back into making a library with libtorch directly instead of through pytorch lol. probably a ways beyond my experience though
is that with both scalers for the full (1024?) output? that's the part that ate like 28 gigs for me
and sd.next isn't the only frontend that uses diffusers, invokeai also uses it
invokeai was actually faster than A1111, but not as fast as ComfyUI
it's not hard to be faster than a1111
from what I recall invokeai doesnt support other models outside of SD which is a big part of diffusers.
right now I just have my own cli script so I can do hacky stuff like fix the DDIM timesteps in 2 lines of code instead of trying to make a patch for a whole ass UI
yea that's diffusers. Usualy a few % behind comfy and a large chunk ahead of A1111
diffusers xl 1024 for me is like 2.95 it/s on diffusers with split cross, 3.17 on comfy with subquad, and i think like 2.6 or something on auto last I checked.
with a triton compiled unet I can get like 3.39 on diffusers
I recently improved subquad so it should OOM much less on AMD now
Oh nice. Guess I'll try re-enabling smart mem for a bit
I also heard AIT is about to be officially built in UI?
i compiled a flash attention 2 wheel but when I tried it on a llama model I guess it doesnt support the shuffled forward v2 fn needed for inference so big sad
with my recent comfyui refactors it's very simple to change the scheduling and v prediction, etc...
I need to write a node to expose that functionality
lol i saw that. I made a cursed node that just
model.model.model_type = comfy.model_base.ModelType.V_PREDICTION
return (model,)
and it actually worked
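For readers wondering what that flag changes: a v-prediction model outputs a "velocity" instead of the noise, and the sampler has to convert it back. A minimal scalar sketch of the standard relationship, assuming a variance-preserving schedule where alpha² + sigma² = 1. This is generic math for illustration, not ComfyUI's code, and the function names are made up.

```python
import math


def make_noisy_and_v(x0: float, eps: float, alpha: float, sigma: float):
    """Forward relations: x_t = alpha*x0 + sigma*eps, v = alpha*eps - sigma*x0."""
    x_t = alpha * x0 + sigma * eps
    v = alpha * eps - sigma * x0
    return x_t, v


def v_to_x0_eps(x_t: float, v: float, alpha: float, sigma: float):
    """Recover the clean sample and the noise from a v-prediction."""
    x0 = alpha * x_t - sigma * v
    eps = sigma * x_t + alpha * v
    return x0, eps


alpha, sigma = 0.8, 0.6                      # 0.8^2 + 0.6^2 = 1
x_t, v = make_noisy_and_v(x0=1.5, eps=-0.4, alpha=alpha, sigma=sigma)
x0, eps = v_to_x0_eps(x_t, v, alpha, sigma)
print(math.isclose(x0, 1.5), math.isclose(eps, -0.4))
```

If a sampler treats a v-prediction output as if it were epsilon, it skips this conversion, which is why a v-pred model sampled as an epsilon model produces garbage.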
alright, so T5 loaded in 4bit takes about 4gb VRAM, easily enough for most cards
Good night
when used as a component of the model the precision of the encoder barely does anything to the quality/coherency
SD at least is very sensitive to imprecision in the text encoder outputs
if you inference it in fp16 vs the fp32 (with weights in fp16) you will notice lower quality images
I tested that on DFIF, to see if SAI has a reason not to use a bigger text encoder on future models
so the natural solution is a new brainfloat8 type. All exponent no precision
god damnit i just pulled master and now the vpred node doesnt work. 0/10 literally unusable
https://github.com/comfyanonymous/ComfyUI/blob/master/comfy/model_base.py#L117
you just have to set that model_sampling to the right class
8bit T5
4bit
i feel like you'd enjoy Rust's trait system from the way you make a shell class that exists only to inherit from two others lol
Still, the degradation isn't too bad, I think a bigger text encoder would be a good idea for future SD versions
CLIP's language model is probably too small to have a coherent language understanding, so that at least certainly needs to be switched
Using After Detailer in ComfyUI to adapt the face and it actually yields pretty good results.
all great
What does the envelope reaction mean?
what sampler do you guys prefer?
Good Models to generate Air-Planes like B-24 Liberator?
I haven't extensively looked into it, I usually go Euler as of now but have no particular reason why.
thanks. theres so many options its overwhelming lol
can i get some help running sdxl on auto1111? on comfy everything works fine
but auto gives me black screen
Briefly looking at what I just posted i will stick with Euler or UniPC. Obviously take your time to look through and see what works best towards your goal.
Anybody know where you can apply for money grants for LoRA's and stuff?
Cause I am spending so much time per day on this realism LoRA, and while it's currently handily beating results from realism models, I have a ton of work to go, and this is turning into a full time job, and I just can't support myself as it is
as it is now, this is my new workflow I have to do for every single image in my dataset:
find the image, favorite the image, pixel peep the image, download the image, hand crop the image, down sample the image into a custom bucket res, rename the image, and tag it by hand following a strict formula
And I have to do this 500 times for my next test, which is likely gonna have at least a week just for the collection
If I deviate at all, the quality falls off significantly
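The "down sample the image into a custom bucket res" step could be sketched like this. The bucket list and the function name are assumptions for illustration (the custom buckets actually used are not given in the chat); the idea is just to pick the bucket whose aspect ratio is closest to the source image.

```python
import math

# Hypothetical bucket resolutions near a 1024x1024 pixel budget.
# These are illustrative values, not the poster's custom buckets.
BUCKETS = [
    (1024, 1024),
    (1152, 896), (896, 1152),
    (1216, 832), (832, 1216),
    (1344, 768), (768, 1344),
]

def nearest_bucket(width: int, height: int, buckets=BUCKETS) -> tuple:
    """Return the (width, height) bucket whose aspect ratio is closest to the image.

    Ratios are compared in log space so that 2:1 and 1:2 are treated symmetrically.
    """
    if width <= 0 or height <= 0:
        raise ValueError("image dimensions must be positive")
    target = math.log(width / height)
    return min(buckets, key=lambda b: abs(math.log(b[0] / b[1]) - target))

if __name__ == "__main__":
    for size in [(4000, 3000), (3000, 4000), (1000, 1000)]:
        print(size, "->", nearest_bucket(*size))
```

The hand crop would still be needed afterwards to match the chosen bucket's aspect ratio exactly before resizing.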
if anybody knows where I could apply for a grant or funding or something, I could really use it cause I am an unemployed 19 year old doing something I am passionate about, but I can't keep doing this if I am not at least making some income off of this.
I can't just dump hundreds of hours down the drain for nothing, its not sustainable
Civitai
But you'd get paid in "buzz", which unfortunately you can't exchange for money atm
Thats not exactly what I meant, but I do appreciate the suggestion. I was meaning like actual funding, not offering some buzz for their limited rewards 😅
So just... somebody to give you money to make your own models...?
Yes. Its more common that one would think, and I have just put in some applications from some other people who reached out in DM's. However any OTHER suggestions would be extremely appreciated
Hoping the results I am having can interest LAION enough
Indeed or Only Fans like the rest of the world. Sorry to say it but seems like everyone is doing this for fun.
Client requests could help.
Yeah I have no idea then. I've done commissions before but they always want something specific. Or want something.
Yeah, this is more of a very powerful LoRA I am working on that is honestly handing fine tunes their ass, but the scale at which I wish to do the next iteration is just too big for me to commit to without some support from a benefactor. I have been told about a few I am trying out, crossing my fingers
I just made my first about a week ago
It solves text generation for stable diffusion XL 
try with kofi/patreon so ppl can support u each month
I have been making them for a long time, but this new one I am working on is by far the most promising I have ever seen, and EASILY the most time consuming one I have ever seen as well haha
if ppl like your models/loras they will support you but not always sometimes they just dont care 😔
I am not looking for the support of individuals at the moment, but the support of groups/orgs that see what I am doing at the moment. I am holding out on a few options right now
Yeah I mean obviously I'm new to the model training side of things, but it's the highest rated XL model on civit this week, and I don't see how I'd really make a dime off of it
But if you want to just turn a profit, there's many ways to do that.
We are working on very different types of LoRA's. All of my previous ones have been like that as well, and I felt the same way
but I am hundreds of hours deep into this one, and I likely have hundreds more to go
but my small scale tests are so promising, I think it could be attractive to training groups
I was a programmer for many years but I fell in love with creating art when AI tools came around, and now I do it full time
Both as my job and side hustle
with less than 1/25th of my full dataset, only 1/4th trained, I am already crushing dedicated realism finetunes
100's? yes. Good? I have yet to see one ATM
I guess I don't know what your standards are lol
I am stretched so thin I don't have many good examples of my newest tests, and my newest yet failed unfortunately, but I have some really old examples, and trust me, the results are much better now
from my experience ppl give u money for a lora when they want a lora from certain char or celebrity that they cant find anywhere
here are some tests from my much older model. this is less than 1/25th of my full data set, trained only 50% of the way. top left is mine, top right is realistic vision XL, bottom left is Realism engine XL, and bottom right is RealStockPhotoXL.
Text LoRA's are dope. I have no idea how to train one, so thats all magic to me haha
I don't think I've ever seen one before tbh
but yeah, those results from above are less than 1/25th my dataset, and only about 1/4th the full training potential even from there
also, they require no key words
i don't quite know what that implies though
like the alpha version of my LoRA was trained on 25 images just to test and see if it worked
top left is base SDXL with a 100MB LoRA compared to 3 of the best realism finetunes for SDXL
and that one is already a step behind my hybrid 1.5 version, which is a step behind my 2.3 failed training which is only 40% of the way done
And that new one is less than half the quality I am expecting from my 3.0 demo
and the 3.0 is 1/5th the size of the 4.0
yeah if there is a direct relationship between the size of the data set and the quality/capability of the LoRA, I'm not sure what it is
my new results are so promising, but I can't properly share them cause they are not done, and are very volatile
oh, sorry I left that out
my LoRA allows for control over the following:
lighting luminance, lighting direction, Subject, crop, aperture, color grading, location, time of day, and more
in addition to not needing any of that info to produce the results above
With no trigger or activation terms?
Reminds me of Army Men: Sarge's Heroes the old game for some reason lol..
if you wanna control those values, you need to prompt them, but if you don't, it works as seen above
so its like a DSLR, it shoots good in "auto" or you can dial it in with manual to get better looks
I don’t tend to use the Extras tab for upscaling because utilizing Stable Diffusion itself for upscaling along with those extensions makes for dramatically better results than using dedicated upscalers alone. I only use the Extras tab for upscaling when I want to have a raw “backup” upscale that I can edit and “collage” for inpainting into the real upscale created with the extensions if SD created messed-up parts
Well if it really is the best realism model for SDXL, then maybe if everyone else agrees they may help fund your following projects
ok, here we go, I have another example of just how good my new failed LoRA is
Left is base SDXL, 2 is my realism 1.0 (in the results above), 3 is 1.5 hybrid, and 4 is my new crashed training that only got 40% done before it crashed. And you can already see how promising the 4th one is here, even severely under trained
And I have new tricks up my sleeve that are showing potential to increase quality greatly again, then a 5x increase in dataset, then another trick a friend is writing from scratch, and then another 5x dataset increase
Also, this LoRA is made to upscale, so its really 2048x base with support for 1024x images
prompt was "Blonde woman in forest"
no negative
you can already see how much more realistic the last one is, barely even 40% trained
Sure, but like I said, I'm not convinced that a larger data set is always a good thing
^^^facts
its better for the specific variables I am testing, and every dataset increase so far has yielded much better results. Also, the full size dataset will be accompanied by full text encoder training with subject breakdown using nested folders, courtesy of a research partner
my original dataset was 20, then 60, then 90, 120, next is much much more
I guess we'll see how it stands up when you release it
Oregon Trail remaster anyone?
also, my results beat out the competition wayyy more if I cherry pick them lol
but I chose general prompts from my research partners
dang that's some sick lookin pixel art
hell yeah it is
you have died of dysentery
@bright valley my progress so far from base, without the last one even being fully trained
The key, ([Pixel | Voxel]:1.1)
on a subject and location not represented in the dataset
I trained my lora to do pixel art the best I could, it does 8-bit, 16-bit, and 32-bit
actually here, let me try something
Female Maximo!
here's 8-bit
How do you prompt the text you want?
Drop a link?
You use ComfyUI or Auto1111?
comfy
Noice
i usually use this VERY simple workflow to make shit with it, takes 6s per gen
It doesn't work perfectly everytime, it's only v1.0 and a work in progress, so sometimes you need to go through a few seeds to get the correct spelling
but all things considered, I'd say it's a HUGE step forward already
Is there a picture I can drag in for the workflow on CIVIT?
@bright valley here you go
My LoRA is trained on foliage, the others are not lol
TY
why do the first two images look like trash tho
cause one is base SDXL, and the other is realistic vision
I mean
realistic vision sucks at animals
I am not spamming triggers and negatives
just "Photograph of a white tiger in a forest at dusk"
its not always as big of a win, but my model is trained specifically for forest images
i guess if somebody wants to gen animals in the forest, youre the guy
or people
anything in the forest, my LoRA destroys the finetunes lol
cause I originally trained it to do just that lol
Personally, I tend to front a lot of my prompts with “RAW Photo, taken with Provia” to get higher quality images—though sometimes I also use “stock photography” to get higher quality. I also prefer to keep my prompts relatively short if possible
ngl the panther in the second one looks pretty diesel
and thats why I am making such a big dataset, cause the most images in my dataset are in a forest, and it shows cause my LoRA does forest portraits really good
hopefully that works out for you
A portrait photograph of a woman wearing a purple dress in a forest
which is the good one in that
remember, this one is only 40% trained, which is a shame
the one that doesnt look like from a stock photo
not yet no
I am just showing different examples. The one on the right is mine, and has more accurate lighting on the face and chest. The light is from above, and thus the chest is brighter than the neck and face
My LoRA is made specifically above all else for better lighting
if you make some shit with it, I'd love to see!
Trying one now will post it
🤘
if you have any you want me to try, let me know
yeah try my lora with it
I asked from the side, not a profile. Her body is from the side
I don't care for any of the results
see if you can make the most realistic letters possible
I am not testing with other LoRA's, just any prompts you'd like
this LoRA isn't even fully trained, not made to pair with other LoRA's lol
oh, well I'd say finish the dang thing then
bruh, thats why I need funding lmao
mines not made to either but people do it constantly
the next training is likely gonna be at least 100 more hours of work
cause of all of the improvements I am making
unfortunately nobody is going to pay you to do that though
so you can either do it or not do it lol
We've yet to see that
There are a few groups that do it all the time, some of which I have just reached out to
give people money to make their own LoRA's?
I mean, I work in the industry and I've never heard of it
RunDiffusion paid a ton for JuggernautXL, a model my LoRA already bitch slaps in realism
Clip -2 1024x1024
fk yeah
thats dope :p
Clip -2 Upscale 2x
went straight for the pixel art i see
if anybody has any realism prompts you want me to compare, let me know
clip 1 and clip 2 are the same exact image
Was wondering if the clip would cut back on some of the artifacts, going to dial it back to -4
I am sure there are plenty I would fail at cause of my small dataset
Like always?
Pretty sure it does nothing in sdxl
dude the first version of this one was 25 images
and it kicked ass somehow
good and big are not directly related always
some concepts are a lot easier to train than others, and it depends a lot on how you captioned it too
try clip skip 12
Good to know, it has still been in my workflow.
It was a "trick" used in 1.5 for certain models
just skip ALL of clip lol
how about achieving custom text generation in stable diffusion 
maybe 2 also, but was necessary for some in 1.5 I know for sure
-12 lol
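For context on the clip skip numbers above, a minimal sketch of what the setting selects, assuming the usual conventions: Auto1111 counts "clip skip" as a positive number where 1 means the last text encoder layer, while ComfyUI uses a negative index where -1 means the last layer. The function names are invented for this example, and the layer outputs are stand-in strings instead of real tensors. It does not test whether the setting has any effect in SDXL.

```python
def select_layer(hidden_states: list, stop_at: int):
    """Pick which text encoder layer's output gets used for conditioning.

    hidden_states: per-layer outputs, ordered first layer to last layer.
    stop_at: negative index in the ComfyUI style (-1 = last, -2 = penultimate).
    """
    if not -len(hidden_states) <= stop_at <= -1:
        raise ValueError(f"stop_at must be between -{len(hidden_states)} and -1")
    return hidden_states[stop_at]

def a1111_to_comfy(clip_skip: int) -> int:
    """Convert an Auto1111-style clip skip (1 = last layer) to a negative index."""
    if clip_skip < 1:
        raise ValueError("clip skip starts at 1")
    return -clip_skip

if __name__ == "__main__":
    # Stand-ins for the outputs of a 12-layer text encoder.
    layers = [f"layer_{i}" for i in range(1, 13)]
    print(select_layer(layers, -2))                   # penultimate layer
    print(select_layer(layers, a1111_to_comfy(12)))   # the very first layer
```

On a 12-layer encoder, -12 keeps only the first layer's output, which is why it comes up as a joke here.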
I know somebody else who has done it as well, a few weeks ago
LOL
I have 0 clue how you would ever tho
Still pretty cool though
I don't train stuff like that
Link it
let me find their examples, IDK if they posted it, but it can do sentences
with SD only and no control net. Just a LoRA
he trained it at only 768x res I think, and he uses a second diffusion on top to upscale. I am not super sure, but he had some even better results I didn't save
looks like dalle tbh
looks like stock photos
yeah its tough for me to glean anything from a bunch of images
also, you can see the images names lol
I have no idea how they made them
is their model not out anywhere?
it starts with an A
if I can find him I will ping
found!
@noble shoal Any info on your text LoRA?
more of his examples he has sent here
Theyre cool, but don't mean much of anything without knowing how they're created
hmm, it looks like he doesn't have his LoRA out, so IDK actually
Strange, if it can do all that
but its for sure not dalle, cause of the content he had it make lmao
Pic or it didn't happen.
I honestly have no idea, I will have to see what he says
he told me he trained it at a low res, so it got the shape of the words, then used img2img to upscale
and now it can gen full sentences without any control nets or anything?
or thats how he makes the images
this is when he was showing it off
not like full full sentences, but what I showed
"NO PINEAPPLE ON PIZZA"
Not like a super eloquent sentence, but a string of words properly spelled in order
lol
Yeah I was just wondering if he has a model and the lora and presses generate and gets that
I get it man
Same lol...
still meaningless without knowing the workflow
its just a LoRA according to him
@bright valley
he trained it at like 768x res, gens with SDXL at the low res, and passes it to normal SDXL to upscale if he wants to
thats what he told me at least
Yea got bumped...
actually wait, when did Dalle 3 even get launched?
yeah, for sure
what was their name
sarah
Adamantium
also, a friend of mine retrained all of SDXL from nothing using just his single GPU, and the results are shockingly good for having destroyed all the weights and rebuilt them lol
but yeah, they go by sarah
hes actually the owner of the research group I am in, and he spent so much time on it, OMG
its no finetune, just a completely new base for SDXL to train finetune on with terminal SNR and gamma correction, as well as v-pred
seems a little redundant, no?
no, its not even close to normal SDXL
it looks nothing like it
and it trains completely different
he took base SDXL, nuked it to incoherence, and then trained in Terminal SNR, Gamma, and V-pred from scratch using his own dataset of like 500k images he scraped from all over the place
so he changed the way the model diffuses as a whole
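For readers unfamiliar with the terms, here is a hedged sketch of what "terminal SNR" and "v-pred" usually refer to: rescaling the noise schedule so the final timestep is pure noise, and training the model to predict a velocity target instead of the noise. This follows the commonly published formulation and is not the retrained model's actual code, which the chat does not show. The function names and the example schedule are assumptions.

```python
import math

def rescale_zero_terminal_snr(betas: list) -> list:
    """Rescale a beta schedule so the last step has zero signal-to-noise ratio."""
    # Cumulative product of alphas, i.e. alpha_bar_t.
    alphas_bar, prod = [], 1.0
    for beta in betas:
        prod *= 1.0 - beta
        alphas_bar.append(prod)
    sqrt_ab = [math.sqrt(a) for a in alphas_bar]
    first, last = sqrt_ab[0], sqrt_ab[-1]
    # Shift so the final value is exactly 0, then scale so the first is unchanged.
    sqrt_ab = [(s - last) * first / (first - last) for s in sqrt_ab]
    ab = [s * s for s in sqrt_ab]
    # Convert alpha_bar back to per-step alphas, then to betas.
    alphas = [ab[0]] + [ab[i] / ab[i - 1] for i in range(1, len(ab))]
    return [1.0 - a for a in alphas]

def v_target(alpha_bar: float, x0: float, eps: float) -> float:
    """Velocity target: v = sqrt(alpha_bar) * eps - sqrt(1 - alpha_bar) * x0."""
    return math.sqrt(alpha_bar) * eps - math.sqrt(1.0 - alpha_bar) * x0

if __name__ == "__main__":
    # Simple linear schedule, chosen only for illustration.
    steps = 1000
    betas = [1e-4 + (0.02 - 1e-4) * i / (steps - 1) for i in range(steps)]
    rescaled = rescale_zero_terminal_snr(betas)
    print("last beta before:", betas[-1], "after:", rescaled[-1])
    print("v target at zero SNR:", v_target(0.0, x0=2.0, eps=5.0))
```

At zero terminal SNR the input at the last step contains no image signal, so predicting the noise there carries no information, whereas the velocity target reduces to the negated clean image. That is the usual reason the two changes are made together.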
now thats what I call a Blood moon 
It's trained on all my own artwork, so I love how it puts horns on shit like I do

I don't fully understand what he did, but he basically tore apart all of SDXL and retrained it on different papers with drastically different tools and stuff
Hugging face has been pestering him to release it, and he finally did after several months of work
yeah seems like everything you talk about isnt done or isnt out
he also retrained the whole base of SD 2.x to 1080p, and it can rival SDXL sometimes
it is out lol
just a sec