#✨|sdxl
Put male or man or boy in your positive prompts. If you still get females, you can put 'female, woman, girl' in negative prompt
Though... How exactly do I edit the prompt so I get something like... Marx from Kirby Super Star?
Combining things can also be done in other ways like through prompt editing
If you're new to auto1111, I'd suggest you at least skim through this whole page of features and try to experiment with things you like.
You can write exactly that, and put things you don't want in the negative prompt. But I should tell you that many checkpoints will not even know what characters you are talking about. They will make generalised images. That's why we use LoRAs and TI for specific characters and styles.
Okay, so how exactly do I make a LoRa and train it to recognize Marx from Kirby?
I downloaded Detail Tweaker LoRA. Where do I put it?
My GPU is still working, so I am gonna say I think we did in fact fix it
In your stable-diffusion-webui folder, there's a folder called models, and in that there's the Lora folder. Put it in there
Training a lora won't be practically possible on your PC without GPU. You'll have to use Google Colab for that. You can find videos on youtube for lora training
In the stable-diffusion-webui folder, look for the embeddings folder. Put the TI in there
Thx
I'd also suggest to not leave the negative prompt empty. At least put "(worst quality, low quality:1.5)" in your negative prompt. That will make your images wayyy better.
Got it
Weigh two geau
Weird Gingerbread Look (like Mulhouse in France) SDXL 0.9
Can LoRAs, Embeddings, and Models be trained with pre-existing photos or do they need to be done with photos that are being created at the time of training???
You make a dataset of the photos you want to train on and then train a lora or embedding from that. So, yes, you would use pre-existing photos.
awesome... I was just wondering if me creating like a hundred of several different styles without actively training something was wasting my time... This is awesome... It's like I am prepping now.
These videos are older now, but I seem to remember them doing a good job with the explanation: https://github.com/bmaltais/kohya_ss/#tutorials
I'm waiting until this is all possible in ComfyUI at least for now. I do also have VLAD A1111 and A1111 installed and running though
It's also worth pointing out that the process for sdxl is not really any different than 1.5 except that you would want images at least 1024x1024.
I train LoRAs using kohya but then use the LoRAs in a1111 and comfyui. I haven't found a more reliable way than using kohya.
They are. I have folders for the originals at 1024x1024 and folders for 4096x4096 upscaled
I keep my folder structure pretty well organized... every different prompt also gets a new folder for both originals and upscales
Can I train my own models with the upscaled images and have a model that will out of the box use 4096x4096???
It's pretty unlikely that that will work at all, since higher res uses more vram and you'll just run out at some point. I used to use 768 for sd1.5 and now use 1024 for sdxl. However, that doesn't mean that you have to use square 1024x1024 images. The trainer does kind of crop/scale to appropriate sizes for the training.
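(A rough sketch of the crop/scale idea, assuming the trainer uses some form of aspect-ratio bucketing. This is not kohya's actual code; `make_buckets`, `nearest_bucket`, and the 64-pixel step are made up for illustration.)

```python
def make_buckets(target=1024, step=64, min_side=512, max_side=2048):
    """List (width, height) pairs whose area stays at or below target*target."""
    max_area = target * target
    buckets = []
    for w in range(min_side, max_side + 1, step):
        # Round the height down to a multiple of `step` so the area never exceeds the budget.
        h = (max_area // w) // step * step
        if h >= min_side:
            buckets.append((w, h))
    return buckets


def nearest_bucket(width, height, buckets=None):
    """Pick the bucket whose aspect ratio is closest to the image's."""
    buckets = buckets or make_buckets()
    ratio = width / height
    return min(buckets, key=lambda b: abs(b[0] / b[1] - ratio))


print(nearest_bucket(1024, 1024))
print(nearest_bucket(1920, 1080))
```

So a non-square photo would get scaled and cropped to the closest bucket instead of being forced into a square, and the pixel count (and VRAM use) stays around the 1024x1024 budget.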
well anyways thats a problem for a different day. I'm going to bed. see ya tomorrow
oh and thank you for the info... I've never done any training of anything before so I appreciate the help.
Here's some images I've generated
That last one I'm hoping to make a prompt that has the style implemented properly
what is lycoris?
you're getting lots of bloom that adds the 100% humidity and fog feeling
i think if you remove the bloom, you might perfect this style a bit more
Do I do that in negative prompts or something?
There's like a half dozen different lora versions that work slightly differently and have different names. lycoris is one of them.
How do I remove the bloom and get cleaner lineart and make my prompt look like it's actually from an anime?
ComfyUI - Saturday's first batch!
interesting ty
i think so, you can also remove the prompt that adds it up alongside it
What words do I use in the negative prompt? Bloom, I got that. Anything else?
And what words do I use that give my prompt a crisp, clean anime look?
Such Beauty
Anything similar you can think of: bloom, glow, haze, low quality, stuff like that. It is hard to see exactly what the effect of a negative prompt is unless you keep the seed for some "bad" images and experiment with the negative prompt to see what helps the most.
Alright
Keep in mind I'm merely using Stable Diffusion to get some cool references for any projects I'm planning on working on in the future
If you know the name of a particularly bloom-filled anime artist or show or something, you could also try that in the negative prompt.
And I'll definitely train a LoRA bot to recognize certain characters with certain prompts once I get back from vacation
Great rendering
I need to look into getting a more photographic aspect to my output - qwerty's stuff is majestic!
@timid sonnet also posted some good renders. I couldn't replicate that workflow though, as there is another model being used after the refiner called 'test.ckpt'. No idea what that is.
ComfyUI SDXL 0.9
Boss!
Excellent
Panda with a baseball cap
Use / dream in Bot Channels Shubhamk
Thanks a latte!
vlad1111 not recognizing the sdxl vae 😭
I put it in models/VAE
refreshed / restarted several times.
nice one! 
holy shit, the sdxl bot renders Elon so real. it's just weird that only a portion of celebs generate well while the others suck
in the style prompt you can use enhance
and also clean/clean art or anything, neg:shiny/bloom/jpeg artifact etc...
sdxl is quite new so im not familiar with lots of ways to deal with it, 2.1 had a lot of ways that fail in sdxl
rule of thumb is to find the prompt that adds the bloom always, and add description around it without causing it
for example instead of asking for pink hair, you can say
prompt, pink foreground, prompt
neg:pink background
this way you will most likely get the main char in pink stuff, and negative anything that you dont want pink
is 1.0 going to be much better than 0.9?
from emad roughly 10% better
that's why they delayed it
because they want to make it much better?
not sure if much better, but they did want to finetune it a bit
ok. i hope the community fine tune it a lot, because 0.9 results dont seem that much spectacular compared to community fine tuned 1.5 models
it might seem like that at first, but remember that this is the general model... and it competes with the top bangers in 1.5/1.4 and 2.1
and it is not fine tuned to any style yet
in other words, this general model can be trained to be something that surpasses them
i know. Apples and oranges.
I dream with having something like midjourney running on my pc
i just hope that training in sdxl is going to be as fast as in 2.1
well, mj is prob a mix of lots of models
it is definitely run on stable diffusion
i noticed lots of hand artifacts that are common in 1.4/5 in it
also, they leave metadata out, so im suspicious of them quite a bit
yes i think so too. Probably it is some kind of AI improving prompts and selecting SD models.
it must be one of the best guarded secret on the internet.
meh, most likely they just pass it through several different models for a single image
and using the user prompt they selectively choose which to pass it through in the process
anyway, until it leaks we wont know
@visual glade Sorry to bother you. Tried to sift through your codebase in search of the path to my custom node. Found folder_paths.py but it's not helping me. Is there a neat way to get the path of a custom_node directory? I want to read a json within my custom node. Maybe the object itself contains its path?
e.g.
```python
import json
import os

# Directory that contains this custom node's Python file
p = os.path.dirname(os.path.realpath(__file__))
file_path = os.path.join(p, 'styles.json')
with open(file_path, 'r') as file:
    data = json.load(file)
```
Sweet, thanks!
Writing a node that uses the style data we got from SAI the other day, makes it easier to use predefined styles.
check out https://github.com/bash-j/mikey_nodes
nice! looks straightforward. good job 🙂
Never written python before 😉
The refiner on its own seems to make decent images at 512x
Or should I say it composes well at 512x and then doubling to 1024 with a hires fix ends up with a pretty decent image
Wonder how we can exploit that
But also makes no sense why they trained sdxl base on 1024 and refiner on 512
Wait… is this why latent upscale delivers poor results?

well that image was upscaled to 1024, if you try do a 1024 image from scratch you get a mangled mess
Exactly!
So inherently they trained the refiner on 512 while sdxl was trained on 1024.
Seems like pretty big oversight
so maybe try to use the ultimate upscaler at 512?
And reason for problems
Yeah
hmm.. but 512 the output is not great
Hmm. I wondered if they realized this.
refiner 0.9 with no base beforehand
sdxl only supports a few resolutions properly, just make sure to use them. and why are you not using the base?
refiner is not good at offsetting the noise
I'm just experimenting
Thanks. I know how to use SDXL properly. I'm just playing around with things, I noticed if you drop the resolution for the refiner to 512, it can compose images properly like a 1.5 model, although the overall look is not great
I don’t think u understood. Testing refiner alone yields good images, but they are trained at 512 so there’s a little bit of an issue there.
yo i think my SDXL is fucked, ALL my images are tileable. help?
did you tick tiling checkbox under the sampling method?
nope
I turned it on only once and now it stays in that state, even after a restart
sorry, I'm not sure 😦
did you try a different checkpoint?
what other SDXL checkpoint is there????? did 1.0 release yet?
I mean a 1.5
oh, that problem doesn't occur on 1.5
@west breach NODE_DISPLAY_NAME_MAPPINGS does not seem to be doing what I expect it to. Any idea?
```python
NODE_CLASS_MAPPINGS = {
    "SDXLPromptStyler": SDXLPromptStyler,
}
NODE_DISPLAY_NAME_MAPPINGS = {
    "SDXLPromptStyler": "SDXL Prompt Styler",
}
```
I just use the NODE_CLASS_MAPPINGS
Trying to mimic the built-in nodes but it fails.
I'll try that.
Well, then the node name changes and also title. Weird.
got it =]
Whats the cause of granular gens? Its like textures made of dots
Is it happening to you in some prompts?
Do you have an example?
Yes hold on
it should, we just need new controlnet models
Curious, what was the fix? Or did it just randomly fix itself lol
SAI said they have them ready, they are gonna release them with 1.0
Try a lower CFG value at 7 or less, set the scheduler to simple
ok
yeah, I found 6.5 to work best with SDXL, well, at least on A1111
Oh I had it on 10 for some reason
A separate CFG of 3.5 can work well for the refiner
idk about the refiner. it's not implemented in A1111 yet
straight out from Emad himself
You need to change to sdxl branch tag 1.5
I'm on dev branch, it doesn't
Tag 1.5?
I am, no option or dropdown menu for the refiner
Img2img
but heck, I'm making the best AI images I generated rn lol, so I ain't complaining
I'm not up for clicking 4 buttons after every gen I do. also it's not the same.
Nah
Just make a bunch of base
Then refine the one you like
Maybe that doesnt work for you
IDK man, Idk if this even needs the refiner
Some artstyles dont need it for sure
Some get the image even worse
But for some others it does a really good job
Like photorealistic
Idk, using the refiner on IMG2IMG isn't like using it on comfyui, and I don't like comfyui
I guess I'll be patient until A1111 implements it without the need for img2img
👍
What sampler are you using?
I find DPM++ 2M Karras to work extremely well for SDXL
dpmpp_2m_ancestral for base and dpmpp_2m for refiner
Thanks
the head is so overexposed
wdym
i can see you there
depending on the image you might need to tweak the amount of work the refiner does
yep, im mumbling atm between 0.5 to 0.25 in 20 steps
@west breach Found the fault, didn't import everything in __init__.py
oh yeah, that caught me off guard too
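(For anyone hitting the same thing: a minimal sketch of a custom node module, assuming the usual ComfyUI convention of a class plus the two mapping dicts. `ExampleStyler` and its suffix are hypothetical, not the real SDXLPromptStyler.)

```python
class ExampleStyler:
    """Hypothetical node that appends a fixed style suffix to a prompt."""

    @classmethod
    def INPUT_TYPES(cls):
        # One required multiline text input.
        return {"required": {"text": ("STRING", {"multiline": True})}}

    RETURN_TYPES = ("STRING",)
    FUNCTION = "run"        # name of the method the UI calls
    CATEGORY = "examples"   # menu category, chosen arbitrarily here

    def run(self, text):
        # Node outputs are returned as a tuple matching RETURN_TYPES.
        return (f"{text}, clean lineart",)


NODE_CLASS_MAPPINGS = {"ExampleStyler": ExampleStyler}
NODE_DISPLAY_NAME_MAPPINGS = {"ExampleStyler": "Example Styler"}

# The package's __init__.py then has to re-export BOTH dicts, e.g.
# (assuming this file is saved as example_styler.py):
#   from .example_styler import NODE_CLASS_MAPPINGS, NODE_DISPLAY_NAME_MAPPINGS
#   __all__ = ["NODE_CLASS_MAPPINGS", "NODE_DISPLAY_NAME_MAPPINGS"]
```

If only NODE_CLASS_MAPPINGS makes it through `__init__.py`, the node probably still loads but shows up under its class key instead of the display name, which would match the symptom described above.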
yeah seems like 0.5 X 20 steps X unipc X karras works wonders
so basically the stock sdxl is great for anime and art, and the refiner to make everything look more realistic
Probably time to push it to github. 😄
anyone running sdxl with vlad1111? how did you get sdxl vae to work? 🙂
I put it in models/VAE, but it isn't being detected for some reason.
I've redl'd the vae, restarted, reloaded etc. nothing seems to work.
I use it on A1111 and it works on the dev branch
dev or release candidate? but A1111 doesn't support the refiner, I think?
yeah, no refiner yet, but damn, it sure does work lol
high CFG
I used 6.5, what do you recommend?
how many steps?
60
for 6.5 ~20 is good, 7 is good at ~25, 8 between 20 and 30
damn 60 is a lot, good for details though
lower cfg gives the model more "freedom" but it starts to slack in details at a certain area
idk man, the detailing got insanely better with 4.5
try it with some other stuff
its interesting, it is as if sdxl reacts completely differently to CFG than the rest of the models
this is what you'd usually expect with 3.5 CFG
an amoeba like feeling
this can usually be fixed only in img2img or higher cfg passthrough
try doing the first few steps with a low CFG like 5, then pass to a 2nd base sampler with like a 9 CFG
mmm didn't try 9 cfg yet, i did see it completely burned at 16
it doesn't burn if you do a few steps with a low CFG first
ill try a few combos with the refiner, 2.5->9
I mean you can use high CFG and not get burn if your prompt is basically a novella, but for short prompts a low cfg generally works best.
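(A tiny sketch of the "low CFG first, then high CFG" idea as a per-step schedule. The function name and the 5/9 defaults just follow the numbers mentioned above; how you wire it into two samplers is up to your UI.)

```python
def cfg_schedule(total_steps, switch_step, low_cfg=5.0, high_cfg=9.0):
    """Return one CFG value per step: low_cfg before switch_step, high_cfg after."""
    if not 0 < switch_step < total_steps:
        raise ValueError("switch_step must fall strictly inside the step range")
    return [low_cfg if step < switch_step else high_cfg for step in range(total_steps)]


print(cfg_schedule(6, 2))
```

In a node graph this would correspond to two base sampler passes: the first covers the steps before `switch_step` at the low CFG, and the second picks up the same latent for the remaining steps at the high CFG.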
ok, seems like the refiner has a certain bias
low gravity mode.
inflation is hitting this nation hard
dude, SDXL is damn close to nirvana with a low CFG
try doing the crysis character
i think that what the refiner does is add stuff that look more cool and more expressive
look at how to get the path to the current file with python
oh damn, 8 steps?
where is SDXL 1.0??
yep 8 steps
Check out our weekly event #1087493421209485393 
is there a node that would allow me to save images in their own folder? and the images generated names are based on the type of model or lora i switched to?
just for extra organization
the current one can do it if you do: output_folder/prefix
@comfy I tried this (I know it's not an official documentation) but it didn't work. was it ever implemented? https://blenderneko.github.io/ComfyUI-docs/Interface/SaveFileFormatting/
yes that docs should work
are there any codeformer nodes available for comfyui? roop uses it on automatic1111 afterwards to fix the low resolution transplant that roop does. i can't seem to find any way to achieve this same post restore fix on comfyui.
not the most important thing. i should just be training faces with loras
hmm okay. since the search and replace feature would be super powerful. the filenames always just gave out %variable% instead of populating it
beauty
make sure you do: %node_name.widget_name%
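(Roughly what that substitution does, as a standalone sketch. This is not ComfyUI's implementation; `fill_prefix` and the `widgets` dict are made up to show why an unmatched name comes back as the literal %variable%.)

```python
import re


def fill_prefix(prefix, widgets):
    """Replace %node_name.widget_name% tokens using a {(node, widget): value} dict."""

    def repl(match):
        key = (match.group(1), match.group(2))
        # Unknown tokens are left untouched, so they show up verbatim in the filename.
        return str(widgets.get(key, match.group(0)))

    return re.sub(r"%([^%.]+)\.([^%.]+)%", repl, prefix)


widgets = {("KSampler", "seed"): 42}
print(fill_prefix("out/%KSampler.seed%_img", widgets))
print(fill_prefix("out/%Foo.bar%_img", widgets))
```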
I'll try that -thank you!
I think there's a codeformers node on civitai somewhere
can I trace back my exact prompt and settings in ComfyUI as I can in A1111 (Image Browser)?
yeah drag the image on the UI
Wow! Does SDXL do "proper" text already?
not sure how i can do that, save image only lets me change filename_prefix to a node or a widget
around ~60% of the time, yeah
So drag each image and it'll reconstitue itself on the UI? Good to hear
it follows text with all caps very easily
0.9 yes
click on the "ComfyUI" written in the widget and it will let you change it
its still baking
IDK about 1.0 tho
the real comfy ? the dev ? 😮
just wanted to drop a big thanks then, you revolutionized my way to use SD
Sytan told me so, but really happy they took you in !
you dont mean this right?.... cuz it only lets me change the value
i got two big node packs off civit that said they had face restoration stuff. was pack and impact pack. they're chock-full of nodes but their face restores don't seem to be applicable. they're set up for upscaling and i don't think they use codeformers. I can't search civit outside of popular packs though. The search functions succcck
how am I supposed to find nodes on civit?
like this?
there's a reason i come to an active community to ask.
so many new nodes, nice
nice. maybe that might work.
i'll have to figure out how to install though. should just have to copy the facerestore folder into custom_nodes? or is it more involved. that repo looks more involved
yeah copying that facerestore folder should work
https://civitai.com/?query=comfyui&view=feed even this shows zero results. civit just sucks for discovery. we really need a new central host. these guys have capitalized on getting in early and are phoning it in
yes
I am using text to speech and making a file (available soon) of many Prompts I have in a Gen AI Magazine ...
Shows a bunch of results for me lol
interesting
they're pandering to the anime / nsfw audience mostly and nothing useful seems to get popular there because they've designed it, intentionally or otherwise, so waifus are maximum signal boosted rather than anything quality. i mean shit. the search function doesn't even do its most basic job.
Clear your filters
i'll clear cookies. filters are clear
shit i'll just make a new account. i criticize civit for waifu shit all the time and my account there is my discord account. i wouldn't be surprised if public opinion got my account shadowed
logged out and tons show up. go figure
I'm torn between ComfyUI's simplicity and capacity to work straight out of the box; and the easier extensions available at A1111 - especially Latent Mirroring and Image Browser - but both are superlative software packages!
Is it possible to use date-functions to create subfolders too?
sorry for ranting about civit's lack of nodes when they're clearly all there. feel a little gaslit by civit admins or whatever bug causes the search to die when i sign in
almost
well that actually worked
its interesting that you left the last step 6/7 with noise and sent it to the refiner with additional 21 steps
this is how the refiner is supposed to work, but im unable to send a 128x128 latent into the k-sampler with a different resolution
any idea how it is supposed to look in comfy?
Comfy would have a VAE Decoder step between the refiner and image.
The way this is presented is a bit strange, because the base can't produce a 128x128 latent without it looking like complete shit.
And the Refiner doesn't make images bigger
depends on the layout. many of them just feed the latent from the base to the refiner. see sytan's
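(A hedged sketch of that base-to-refiner handoff as plain settings dicts. The key names are modeled on what the advanced sampler node exposes as far as I know, so treat them as assumptions; `handoff_plan` itself is hypothetical.)

```python
def handoff_plan(total_steps, base_steps):
    """Split one denoising run between a base pass and a refiner pass."""
    if not 0 < base_steps < total_steps:
        raise ValueError("base_steps must fall strictly inside the step range")
    base = {
        "add_noise": True,                    # base starts from fresh noise
        "steps": total_steps,
        "start_at_step": 0,
        "end_at_step": base_steps,
        "return_with_leftover_noise": True,   # hand over a still-noisy latent
    }
    refiner = {
        "add_noise": False,                   # refiner continues, no new noise
        "steps": total_steps,
        "start_at_step": base_steps,
        "end_at_step": total_steps,
        "return_with_leftover_noise": False,  # finish denoising completely
    }
    return base, refiner


base, refiner = handoff_plan(total_steps=28, base_steps=7)
print(base["end_at_step"], refiner["start_at_step"])
```

The latent goes straight from the first pass into the second at the same resolution; only the finished latent gets decoded by the VAE.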
Is that latent resolution different to how we create an empty latent in comfyui
Because if you put in a 128x128 latent, it would just produce a mess
Right. I did say between the refiner and the image. 😉
The numbers you give to the 'empty latent' node are divided by 8.
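(The arithmetic spelled out, assuming the usual Stable Diffusion VAE with 8x downscaling and 4 latent channels. `latent_shape` is just an illustrative helper.)

```python
def latent_shape(width, height, channels=4, factor=8):
    """Return (channels, latent_height, latent_width) for a pixel resolution."""
    if width % factor or height % factor:
        raise ValueError(f"width and height must be multiples of {factor}")
    return (channels, height // factor, width // factor)


print(latent_shape(1024, 1024))
print(latent_shape(512, 512))
```

So a "128x128 latent" in a diagram and a 1024x1024 empty latent node would be the same thing, just measured in latent cells instead of pixels.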
Oh ok, so that is doing exactly what the main workflow is doing
yep, seems the same to me
it looks weird
wait, divided by 8? mmm so that's it
hey, isn't SDXL1.0 supposed to release tomorrow? I don't know why I think that, but I remember something like that.
I can't remember where I saw it (it was on this Discord though), but is there a node/workflow that can customize the file name output in ComfyUI?
no
said a week so tue. i wouldnt hold my breath for dates.
Probably Wednesday the earliest
wait, so what's happening tomorrow? I remember Emad or someone saying something is happening on the 23rd
The Save Image node has a box to change the output filename
Can it use variables like time and the first few words of the prompt?
Maybe you are thinking about Emad tweeting 3 on the 20th? https://twitter.com/EMostaque/status/1671121009817034759?s=20
Tomorrow Sunday happens
It'll be like a week or so so nearly there and all the variants are being ready for the big launch where we will have a nice chat as a community.
dates moved around a lot
so, next week is promised?
Not sure about that. If it doesn't you could probably modify the node to do it
"week or so"
or just another way of them saying it's delayed again
Nothing has been promised
is either a week up until infinity
OK thanks, I was hoping to get the equivalent of the AUTO1111 naming ability, like below
"samples_filename_pattern": "[datetime<%Y-%m-%d %H-%M-%S>] [prompt_spaces] ([seed])",
Doesn't look like it does that at the moment but you can edit the SaveImage class in the nodes.py file.
It's line 1119
honestly, I'm more excited for the A1111 support for the refiner than for SDXL1.0, I think 0.9 is already better than expectations
that's up to the devs of that ui to implement it
I doubt they won't, A1111 will be forced to do so, most of this community uses his WebUI
I imagine its a pain in the arse to do it because of how hacky the A1111 WebUI is
I don't think the devs of a1111 get paid for their work so
don't blame them if they don't have it implemented
It sounds like SAI have been giving them a hand though
From what's been said in here by staff
i've been offering help wherever i can but for the most part auto's just goin at it himself
they did confirm they will implement the refiner in an efficient way, but the question is when
we've gotten some fixes merged into SGM to make auto impl easier
Thanks, I found the repository, it was https://github.com/bash-j/mikey_nodes and the python file has code showing the addition of date/time to the filename. I'm about as good with Python as I am at levitating, so I'll see how little damage I can do 🙂
also, will using the refiner when it's implemented cause performance issues?
if it's properly implemented not really
or will it just unload the base then load the refiner
I am using Base in A1111 - are u telling me that Refiner does not yet function inside A1111?
nope
I know, right?
you can try it right now in comfyui, the performance is pretty much what you are going to get if it gets implemented properly
You mean the excellent results I am getting out of Vlad A1111 SDXL are Base and Base Only?!
vlad has the refiner properly implemented because he's using diffusers
There's a little delay in passing off from the base to the refiner, but I think that's more because of VRAM
You could change Line 1152 of nodes.py to: file = f"{datetime.now():%Y%m%d_%H%M%S}{filename}{counter:05}_.png" also add "from datetime import datetime" to the top of the file.
idk about vlad, i'm talking about A1111
Thanks, would an update from ComfyUi override that though?
Yes
Base only is fine for most things, the only place I've seen the refiner provide a decent improvement has been on eyes and fingers
It keeps crashing because of 100% VRAM usage (8GB), and even using --medvram or --lowvram it still gives memory warnings
you can copy the save image node to your own custom node and do the changes there
then it won't get overwritten by an update
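(A standalone sketch of the filename part you'd carry into your own copy of the node. `build_filename` is hypothetical; the pattern just mirrors the datetime prefix suggested above, with underscores added as separators.)

```python
from datetime import datetime


def build_filename(prefix, counter, now=None):
    """Build '<timestamp>_<prefix>_<counter>_.png' with a zero-padded counter."""
    now = now or datetime.now()
    return f"{now:%Y%m%d_%H%M%S}_{prefix}_{counter:05}_.png"


print(build_filename("ComfyUI", 1, datetime(2023, 7, 22, 9, 5, 3)))
```

Keeping this in a custom node instead of editing nodes.py means an update of ComfyUI leaves it alone.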
💩
OK, I'll have a go once I'm brave enough 🙂
Personally I think having the date on the filename is a little pointless, when you have the file creation time anyway
The prompts I get, but if you use long prompts you can't fit it all on
@west breach Pretty happy with it now. Thanks for the help. https://github.com/twri/sdxl_prompt_styler
SDXL responds well to the templates
But I guess that's expected. 😄
he's sleeping
tried adding it but getting no option to connect other nodes to it
right-click on the node and convert text_positive and negative to input
fiddling with a few custom styles too, might come later
Just d/loaded - will have fun using this!
Cheers!
I do. He left for several reasons, however I don't think I want to stir the pot by mentioning any of them, and just know that he left the server because of negative interactions
prompting with 'Tilt-Shift Photography' is awesome 🙂
I just prompted: "Mediterranean city port with a flying spaceship attacking. it is firing a laser beam to the city causing many large explosions, Tilt-Shift Photography"
Is the thought right now that SDXL might be just one model? or will it more likely be the base and refiner for 1.0
Is pseudo the one who constantly leaves the discord and reenters under a new name or was that another regular that I'm thinking of
i asked him on the other server
but he was a valuable asset for this community
Pushed a few new styles to the repo.
There is a new refiner in the bot now so I guess they're going to keep having base + refiner.
Also, @wicked frigate , does Automatic1111 have access to SDXL1.0 so his UI can support it right after it's released?
it's the same architecture as 0.9
We had experiments with bigger changes that would need code changes but seems the big changey ones won't be 1.0
(a shame, I like some of the bigger changes, but perhaps a future model)
i thought , the refiner was baked into the file or smth
yeah the big changes that break everything are going to be a future release
any hint what the big changes could be?
it depends what works
Well I’d say it’s for the best to not make super big code changes for 1.0 yet, since people will get upset lmao. 1.0 will most likely be the new standard. At least I hope it is.
1.0 is the same model arch, the only thing that will be different from 0.9 is the recommended workflow
i did find a difference when using SDXL and a 1.5 model (e.g. ghostMix). i think SDXL makes the prompt much more varied: when i type "a decorated hallway", SDXL tends to generate more varied versions of a hallway, whereas ghostMix has like 1 hallway theme/color scheme. is this a training thing or just 1.5?
Which I’m guessing will be discussed in the super stage?
I'm just saying that because it behaves a bit differently vs 0.9
you'll see when you get your hands on it
McMonkey: Oh, what kind of big changes are you experimenting with?
a weightlifting panda with a serious era doing a lot of effort, muscular and heavily tattooed through all of his fur with tattoos of Vietnamese warrior symbols leaving traces of their designs through his fur, in a dark and ultra modern weight room with a red background and light effects, lifting forcefully, as high as possible, above his head a weight bar with large cast iron discs, showing the effort in the gesture and squeezing hard the helm, ultra realistic, fully respecting physical laws for more realism
use the bots, this is for sdxl chat not generating images
Is that 0.9, 1.0 or something else?
open the image in comfyui and you'll see
#bot-x
it's SDXL + a few tricks + WD beta3 illusion
What is the broken FennecEarSizeMultiplier node that looks like a LoRA?
it's something I'm experimenting with, it's named that way because that's one of the effects it has
Damn, SDXL really responds well to style prompts.
Mediterranean city port with a flying spaceship attacking. it is firing a laser beam to the city causing many large explosions, Tilt-Shift Photograph
"Mediterranean city port with a flying spaceship attacking. it is firing a laser beam to the city causing many large explosions, Tilt-Shift Photography"
go to the bot channels
Hello, where can i find sdxl 1 ?
@errant yarrow Here's tilt-shift photo. Is that what you had in mind?
It's better
Yeah, I feel the same way. And this is one single try.
Tried updating ComfyUI and LoRA's are still quite slow sadly.
I committed the tilt-shift and a few new styles. If anyone else wants to fiddle with it, it's here: https://github.com/twri/sdxl_prompt_styler/
stability ai's prompts have prefix sai- and my custom ones are gen-
Thinking about giving the possibility to block either negative or positive prompt templating, could increase flexibility.
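(A sketch of how that templating with optional blocking could look, assuming each style is a JSON entry with a `{prompt}` placeholder. The style text below is invented for illustration and `apply_style` is not the repo's actual function.)

```python
import json

# Hypothetical style entry; the "gen-" prefix marks a custom (non-SAI) style.
STYLES_JSON = """
[
  {"name": "gen-tilt-shift",
   "prompt": "tilt-shift photo of {prompt}, miniature look",
   "negative_prompt": "blurry, noisy"}
]
"""


def apply_style(styles, name, positive, negative="",
                use_positive=True, use_negative=True):
    """Wrap the user's prompts in a style template; either side can be skipped."""
    style = next(s for s in styles if s["name"] == name)
    if use_positive:
        positive = style["prompt"].replace("{prompt}", positive)
    if use_negative and style.get("negative_prompt"):
        # Style negatives go first, then whatever the user typed.
        negative = ", ".join(p for p in (style["negative_prompt"], negative) if p)
    return positive, negative


styles = json.loads(STYLES_JSON)
print(apply_style(styles, "gen-tilt-shift", "a city port", "bloom"))
print(apply_style(styles, "gen-tilt-shift", "a city port", "bloom", use_negative=False))
```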
Nvm, without noticing it I see that I compared LoRA and Base with different steps.
stuff~
:3
guys, how do i use sdxl lol
(you'd have to beg that info out of Dustin or Joe, not my work and I don't want to steal other people's announcements from em)
A: download ComfyUI, or auto webui's dev branch, download the model, put it in, use it.
or B: pop over to the bot channels #1100170312106127410 and do dream commands there
or C: clipdrop/dreamstudio
thank you very much ❤️
Just take a screenshot of their discord messages and spread that around without telling them about it, its as easy as that
Styles are grunge, dystopian, cubist, constructivist, pixel art and film noir
and a simple prompt "beautiful woman, age 35"
when is sdxl released?
july 26th, or so. maybe.
. you can use the bots right now tho!
im already getting insane results with sdxl 0.9 cant wait to see how 1.0 changes this.
will the 1.0 be available over the API customers?
wat
this channel has had 500 messages in it today alone
500 is too low
on 18th, there were too many
yeah the day that was the initial target for release that we missed and delayed a bit longer, where there were tons of messages from people talking about the release target, the leak of the delay, etc.
I sent around 102 alone
there are very specific times this channel gets REALLY active lmao
before that too, we had too many msg's too
today its too idle
just imagine whats its gunna be like wednesday haha. this channel is gunna blow tf up
when?
on wednesday?
theres an event. https://discord.com/events/1002292111942635562/1130919343941758978
Emad's gonna join?
no idea.
damn, i wouldn't be available, I would be watching oppenheimer 🤦♂️
@livid cradle may I ask you how you achieved such results on the beverage commercial photos?
im building my own magic prompt to give me this results.
yet, due to a bug in SDXL masking, it keeps fucking up my original image... can't make this consistent and keep the brand intact... can anyone help with that?
using this (SDXL generated can) and trying to position it in various scenes. sometimes i get lucky, sometimes i just get freaking frustrated.. im working on the prompt only to generate the backdrop and placement...
Would be nice if i could feed my prompt a variable with the image so it only builds around it and takes that as the main subject of the canvas.
but as you can see in the last image, it clearly generates another can behind my original init_image... waiting on someone from the SD team to address this masking bug in the image to image api
I'd love a can of 'coocunt' where can I find one?
everyone loves coconut, that's why my marketing photos will be so appealing 😛
No I don't want coconut, I want coocunt as advertised.
Maybe not as good as above
see... that's what i keep saying... it modifies the text on the init_image which is fully masked... can't understand how this is not a big problem and immediately addressed by SD developers?
They need you on their team.
Strange, thought. But probably 1.0 it is addressed?
if i had known how to fix it i would have passed them the data... but unfortunately i don't know where the issue is, i just reported my findings.
now i started a cocunt trend on this room...
haha, it's a great word.
anyone from this room on good terms with any developer from here? at least they can take a look at the masking problem... 😦 im so frustrated that this keeps happening.
But you're using 0.9 right?
It's a bit like using the BETA, it ain't gonna be quite right. I find little oddities with 0.9 which I am confident won't exist in 1.0
The basic results are 10x better than anything I would get from 1.5 base. So it is encouraging for an unfinished product.
yeah i know it's beta. but hopefully this masking issue is going to be addressed with the release of 1.0 (i don't think it will, i reported it only a few days ago) so hopefully some big hearted developer will look into it and try to fix it. 😄
This is better, had to change seed.
Is that a can of cu...oh dear.
Gotcha, cuud. Sounds the same, tastes the same?
like coconut 😉
Truly delightful.
always loving a lick
As long as they don't describe this beverage as dry, then I am in.
And as long as we keep on running around the terms and not triggering autobot, I'll see myself out xD
(really good wordplay mate though)
You get good at it when dealing with finicky prompts.
English is not my native language so I must miss out on the fun. 😄
Oh what is your native language?
Swedish
idk exactly what you're currently doing or the API stuff, but, from the perspective of just, how to do the thing you're doing in SD (in general, not even really SDXL specific) what you can do in this case is txt2img a plain background without any can mentioned in the prompt, then just apply the can over top, and use masked img2img to redo the background with the txt2img as a low % init, so it's less likely to get confused by the can
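A rough sketch of the "apply the can over top" step, in case it helps to see it spelled out. This is a toy with images as plain Python lists, not any real SD or API call. The helper names and the mask polarity (1 = repaint, 0 = keep) are assumptions, so check what your img2img endpoint expects. `restore_product` is an extra safety step that is not part of the suggestion above: it copies the protected pixels back after generation, in case the model still touched them.

```python
# Toy sketch: an image is a list of rows of (R, G, B) tuples, a mask is rows of 0/1.
# Assumed convention: mask == 1 means "free to repaint", mask == 0 means "keep".

def paste_product(background, product, top, left):
    """Paste `product` onto a copy of `background` and build the matching mask."""
    canvas = [row[:] for row in background]
    mask = [[1] * len(row) for row in background]   # start with everything repaintable
    for y, product_row in enumerate(product):
        for x, pixel in enumerate(product_row):
            canvas[top + y][left + x] = pixel
            mask[top + y][left + x] = 0              # product pixels are protected
    return canvas, mask

def restore_product(generated, original, mask):
    """Copy protected pixels from `original` back onto the generated image."""
    return [
        [orig if m == 0 else gen for gen, orig, m in zip(gen_row, orig_row, mask_row)]
        for gen_row, orig_row, mask_row in zip(generated, original, mask)
    ]

background = [[(200, 200, 200)] * 4 for _ in range(3)]   # plain 4x3 grey backdrop
can = [[(255, 0, 0)] * 2]                                # 2x1 stand-in for the can
canvas, mask = paste_product(background, can, top=1, left=1)
```

`canvas` and `mask` would then go to the masked img2img call, and `restore_product(result, canvas, mask)` would run on whatever comes back.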
And a seemingly sensible answer has emerged.
Here's a new flavor, got a bit sharper too.
I was thinking about this workaround as well, it's a good idea, but its going to require multiple api calls, longer wait time, and more expensive on me... so i need this to work as initially designed.
dude i would pay you 20$ for just this
The designs are lovely. Definitely something you would expect to see on some Mediterranean billboard
Given how fast I got here I'm amazed.
no joke I do a lot of ai for over two years but those image creations get me every time. this is truly art
I'm a photographer (for fun) and these days I think I might as well sell my camera. AI is it.
My workflow was to give GPT4 some context about what a good prompt is and then suggest to have commercial themed prompt styles. Feed that back into json and use it with my prompt-node along with my custom prompt.
im actually building a product placement, product photoshoot ai app , would anyone be interested in early access to such thing? 😄
im actually starting to think i should build this in public and ask users feedback as we advance the development 😄
Yeah, looks nice. Bit sharper than my style but im certain it could be improved upon.
Yes.
I worked with a company that sells high end spirits and we did a lot of lifestyle photography for those products. Seems to me there could be a market for such an app.
Had a dedicated studio for it.
@livid cradle what platform is this? never seen this ui. is this a homebrew :)
yes, its not just a SD webui like all the others, it's going to be a platform dedicated for ecom owners, influencers, marketing experts, product photographers and small business owners specialized in product photography. it's work in progress.. hoping i will have the MVP by mid-August.
thanks and best of luck with it
I never considered this for product photography, and by extension social media, but it works really f'in well
sorry for the stupid question but is there a special hf space or notebook you are using for this. i'm often too lazy to get my own setup running haha
Most of us are running it locally on our computers
What prompts do you use for those bottles?
A bottle of Dalmore whisky, sitting on a wooden table in a scottish pub, with a lit fireplace in the background.
SDXL responds quite well to 'conversational' prompts I find
That is awesome
and yes it responds very well to it. I really like how it has been improved so much
ive coded my own prompt generator that is focused on generating the best prompts for photography scenes and shots. called it Magic Prompt 😛
I'm excited to see 1.0
Yeah, I hated so much that I have to specify not wonky eyes. Tf is that.
So this is a massive leap.
Is your Magic Prompt Public?
so inside the app you just have to say your product type a few words like "a can of coconut flavored energy drink" and based on this prompt it will enhance it to this:
Mountainous Set-up: for a can of coconut flavored energy drink: Create a scene with a granite rock in the midst of a clear mountain stream against a majestic Alps mountain range under a clear blue sky. The scene is illuminated by the vibrant, high-altitude daylight, characterized by dominant greens, whites and blues, with a focus on creating a refreshing sensation. Incorporate surrounding alpine flowers and ferns to enhance the product's natural sensation. The scene composition should be a low-angle shot, capturing the product emerging from the water against the mountainous background with a low-angle to emphasize the product and the towering mountains, creating a refreshing, invigorating, and exhilarating atmosphere. The overall style of the scene should be realistic, vibrant, and breathtaking.
Solid prompt. Let's all use it for free.
are you going to have it hosted or is it going to be open source and on github
feel free to use it.. i generate unique prompts each time in various styles, scenes, backgrounds and a whole other more categories not shared here 😛
I'm gonna put that prompt in, see what comes.
I would be happy to test it and provide feedback
paid and hosted application.
Good prompt.
Same issue with somewhat messed up text, but yeah, workable.
Text is not what SD is perfect at, but I guess nothing that photoshop won't solve.
some early results.. using same image as base image.
im not interested in the text on the can... it's best with no branding so no copyright issue on my promotional page. anyone will use their own products to enhance.
this is just a sample i use to test.
but the results are promising... various styles and scenarios same product.
i just wish someone from the SD team would say something about this image to image masking bug i reported. (meaning they are working on a fix for 1.0) 🙂
Yikes
It might not make it into a fix for 1.0. It seems like it's too close to final release for any major bug fixes... It could become a hotfix after release but it takes a lot to fix big bugs like that I would imagine.
I used to work for a company that made proprietary software and fixes used to take months after I would submit them... It would first come through a customer. I would submit a jira bug ticket.. then tier 2 would pass it to dev. dev then goes to QA... then back to me for final testing... then it would be staged for implementation in the next release.
anyone know what cfg and steps are used for the bot / clipdrop?
Not sure what they have set on the backend... but theres a good chance they are either set low to reduce server load and offer better speeds, or there is some sort of "rule of thumb" guide they use for optimal results based on whichever sampler they use for the image.
I wonder if one can make a good photo for fashion magazine cover
probably.... its all about the right prompt
there are some folks in here that are geniuses with prompting. I have seen so many great images in here lately.
How To Use SDXL in Automatic1111 Web UI - SD Web UI vs ComfyUI - Easy Local Install Tutorial / Guide https://youtu.be/eY_v5IR4dUQ
Our beloved #Automatic1111 Web UI is now supporting Stable Diffusion X-Large (#SDXL). In this video I will show you how to install and use SDXL in Automatic1111 Web UI. Moreover, I will show how to use SDXL LoRAs and other LoRAs. Furthermore, I will do image generation and speed comparison between Automatic1111 Web UI (SD Web UI) and ComfyUI.
Sour...
sdxl working great
but no refiner support yet
I mean doing it only with SD / SDXL - probably not a high chance, but if you're mixing in some photoshop skills and something similar - for sure
that too
wanted to say I found out about your channel like a few days ago and I'm more than amazed and grateful. is there any way we can donate to show some appreciation? (I don't have too much but if there is something like this already lmk😂 )
Photoshop is great for finishing up final works
especially with the new AI tool in photoshop
got a chance to use it and while it isn't as powerful as SD / SDXL - it is so much more useful for specific tasks
I'm on it. Still a simple prompt, so If you want another style probably easy to adapt.
For the bot it's deliberately variable so they can gauge ratings.
Shoulder pads are the latest fashion this year I've heard. So much so that Kanye wears them.
You can do unconditional samples from SDXL by using an empty prompt
Gives you interesting random images from the training set
It can do some pretty insane stuff that's for sure.
Is that from an empty prompt?
No, custom style and a simple prompt.
Ah I ran like 30,000 images from different seeds of an empty prompt
This was the first image I got.
My uncle got me hooked on RVC, and I've been using it all day lol
I'm taking a day off from SDXL research, and then I will be back at it again tomorrow with some new LoRA tests
lol like the audio thing?
@paper phoenix I'm not sure if you saw, but I fixed my 3090
Yeah, the real-time voice changer, it's incredible
It works so damn good, and it sounds so real
its pretty good. ive been messing with the one that changes vocals for songs. it does work very well
Where can you get the changer?
Did the model on the bot change recently? I got some amazing pics a few days ago but now with the same prompt they are all really, really bad
not sharp, over contrasty, not even following the prompt fully
They are constantly changing things, so they can figure out what's the best when people vote on them.
So different model candidates. different steps and CFG as well
I see, is there any way to know when/what changes or do they keep that quiet?
Keep it quiet on purpose so it doesn't introduce bias in voting
Once 1.0 is released I imagine it will be more consistent
Hopefully, I felt the images got worse after 0.9 in the bot.
hopefully, because this change is like night and day. It went from 'actually good' to 'my first prompt'
Could just be a change in CFG to be honest
you wouldnt even guess these were the same prompt
You can get massive swings from just steps in CFG
watch 1.0 suck and everyone stick with 0.9
Hey I already moved my main lora model to 0.9 so xD
seeing this 'cooked image' look worries me about the models they got for sure
It's fine, they'll be using "bad" settings on purpose
I think he's asking for the realtime voice changer
I know what it is but I don't remember the link
yeah they're definitely using bad cfg's and sample steps on purpose
that
I've got the links for RVC and the other one
Oh, it's literally just called real-time voice changer, it's an AI that you can run locally on your hardware, and there's a huge community that supports AI voice LoRA's for it
It's quite scary how good they are.
Links referenced in the video:
Realtime Voice Changer - https://github.com/w-okada/voice-changer
RVC training (colab) - https://youtu.be/9wu6LSue_dU
RVC training (local) - https://youtu.be/hB7zFyP99CY
Come join The Learning Journey!
Discord - https://discord.gg/Mym3MxcvWg
Github - https://github.com/JarodMica
TikTok - https://www.tiktok.com/@j...
I think it's this one - https://github.com/w-okada/voice-changer
Makes it kinda hard to vote lol. "Is this garbage better than also garbage?" I dont like either what do they want from me lol
I couldn't get the realtime one working well, but RVC was good and then the other one based on Tortoise for normal TTS was decent too
you choose the best of the two. lol
@high skiff What fixed your 3090?
If they are both shit just don't vote
ya what fixed it?
The current idea is that there was something wrong with the mounting pressure on the back plate
The seller and I went through hell and back doing everything, so I decided to consider the card completely dead and tear it apart and rebuild it, we tested various different things with thermals and none of them yielded any better results, but then I completely disassembled it a second time and put it back together, and then the issue stopped
thats usually the route I go down
The backplate does use very short thread spring screws, so it's possible that through shipping they got jostled out and the card wasn't making proper contact, or the card was making too hard contact on one side which could have been causing issues
Whatever it is, GPU's been solid for 15 hours now, and I've been hammering the hell out of it
I have talked to Joe Penna directly, and he said if both results are bad, do not vote
He said voting for bad images just hurts the algorithm, so be selective with the votes you give, and make sure they are for results that you are happy with
Glad it fixed it, a weird one for sure.
That was a huge concern that I had as well, so I brought it to his attention and he informed me that we are not supposed to vote for all images, only the good ones
Undoubtedly, however the GPU is running rock solid, and its performance is incredible
What channel do we go to to vote on images?
Before, generating two images on it back to back would instantly blue screen it, now I've generated over 400 images, trained a LoRA, been messing with real-time AI voice changers for several hours, ran nearly a full hour of stress test and heat test benchmarks, and all sorts of stuff
Any of the bot channels, or showdown
good to know, looks like I will be not doing anything for a day until they tweak the settings. Its just generating cooked image after cooked image with the same pose over and over again.
Yeah, it very much seems like SDXL has 5 days of being bad, 1 day of being good
I do wonder how people vote on it though. Are they actually voting on if it was close to the prompt, or if it just looks good.
It's very weird, their results are very inconsistent and you'd think that they would at least have an idea of what they are doing at this point, and cut out all of the things that are obviously bad
Considering it once gave my feral lion tits
I think we know the answer
I talked to Joe, and he said the most consistent voting alignment is people legitimately just clicking random buttons
He said something about the bot being able to throw out votes when they're cast too fast in succession
But some people will just vote on a nicer looking picture, even if it's not exactly what they asked for.
im guessing they vote on what looks good. bad for prompt adherence but good for model output
He said a large group of users form these side biases, to where they will vote almost undividedly for either A or b all the time
Almost like they think that one of them is one setting and the other is the other, and they form a bias to one of the sides
I know some people have been voting for bad ones on purpose for some reason
I think it was someone that had just decided that SAI were doing it "wrong"
So was being an ass
i get self-conscious about that when i'm voting in my channel, and it just happens that a series of A or B is better
Maybe my luck is just shit today lol. I did the 'resize' and it did a tall one, so I did it again and it did the same size, and then the same size again, and then finally a different size. Then I did the restyle and it did a sticker, then I hit it again and it did another sticker...
Side note, Kandinsky 2.2 should be pretty concerning for SAI lol
It's not as good as SDXL, but it's a pretty big leap over 2.1
And it can generate some pretty damn good looking images still
The average SD user is not smart and doesnt look more than 5 min ahead in time, I dont think they need to worry lol
Yes, seen that model too.
2.2 is a huge leap
if all the people going 'no tits = not using sdxl >:(' doesnt say as such lol
I've had instances where I've voted one side a lot more than another but that's because the images were better.
A cinematic photograph of a white tiger standing in the middle of a forest looking through the leaves of a shrub with dappled lighting from above
Kandinsky 2.2 vs SDXL
Obviously SDXL is doing better, but Kandinsky is improving far more rapidly
I remember when Kandinsky 2.1 came out and I was all for it
but no one wanted to implement or use it
why?
Because russia made it
not even close. plus, kandinsky is tuned, SDXL isn't yet
what prompt did you do for those?
While I do agree that SDXL looks way better than Kandinsky, my statement was that Kandinsky is catching up really fast
Kandinsky has made a hell of a lot more progress to the quality of their images in the last month than SDXL has
do "realistic, Ice dragon, desolate, intricately detailed, artistic lightning, particles, beautiful, amazing, highly detailed, digital art, sharp focus, trending on art station,"
yeah, SAI is now in a slumber lol
Kandinsky vs SDXL for:
A portrait photograph of a beautiful woman with long and curly blonde hair and red lips smiling in a white gown in a chapel with gorgeous and colorful stained glass out of focus in the background
I think STXL looks more polished, but I think Kandinsky has an overall better vibe to it personally
*SDXL
Granted, the resolution difference is huge
SDXL has significantly more detail
Also, I accidentally generated them at different aspect ratios
Let me try SDXL over again
your average SD user will not move to anything new unless it can do boobs
God damn some of the settings on SDXL look terrible today
even if the model would be better in the long run
WTF is this bro
This one looks better, but still looks pretty off
yeah I stopped genning because its literally not following my prompt any better than 1.5 now
I'm curious, would you mind sharing what words you use for that sort of teal / turquoise leaf color?
What's off about that one... It looks pretty good to me
"oh you want it in the style of X and Y artists, both people I did recently? here is a photograph style image"
It does look good, it's just not as good as I would be hoping for this late in the SDXL process
It honestly looks like something that I could still pretty easily generate using 0.9 on my own computer
Although I will say there are sometimes when the 1.0 candidates really do just hit it out of the park
Likely just not the best settings for that image
This is my prompt: positive_prompt: cinematic photo a white tiger standing in the middle of a forest looking through the leaves of a shrub with dappled lighting from above . 35mm photograph, film, bokeh, professional, 4k, highly detailed negative_prompt: drawing, painting, crayon, sketch, graphite, impressionist, noisy, blurry, soft, deformed, ugly, headdress
Interesting, I wonder if it's the term cinematic
What was your prompt for that one?
Dunno why I still have headdress in the negative lol
just do a style, it just removes any negatives from your prompt for some reason lol
A portrait photograph of a beautiful woman with long and curly blonde hair and red lips smiling in a white gown in a chapel with gorgeous and colorful stained glass out of focus in the background
dont need negatives when it wont let you do them for some reason
greetings
Howdy
Ice Dragon
I'll try ice dragons tomorrow. They look great!
!!Warning!! This is what happens if you put a duck in the microwave
also, I made a prompt template that works surprisingly well on SDXL, well, at least on the tuned version I made of it
SDXL 1.0 finally coming out next week
Exciting to see what the community eventually votes on
after 30 images rendered of ice dragons these are my favorites
I really like that electric looking glow
This one is a bit freaky
but the style could be promising!
Yeah going to try it in a sec with something else
Didn't keep it for a different animal
nice
Hmm, using "Anime Movie" seems to go very Ghibli
I guess I can see why though
It seems to have decided this is how you sit in a chair
Getting better. Slightly changed the prompt
lmao
I think all her insides have been pushed up
There's almost no hype, almost no discussion, but foremost you have to prove you can at least do something better. Realistically you have to do everything better, most of the time. People dismiss LLMs that aren't the 'overall' best even if they're best at one thing or another.
ohhhhh
my god my heart just sank
came back to my PC OFF
Its a new windows install, the screen timed out while I was getting food lmfao
Anyone know why in the world SDXL can't generate a racoon? Like, it doesn't upset me but it's weird that it'll do literally anything else but that 😂😂
works fine when i try it in local - quite probably there's an overeager word filter accidentally blocking you there, as the name of that animal is also a racial slur in some obscure contexts
it's probably not the style you're looking for but I got this one 😉
That's fair but it was strange cause that last one was literally, "prompt: racoon style: photograph aspect: 1:1" and it made a picture of a camera. Just super strange. It being a racial slur is a good point, could be a weird word combo. Was just super strange.
I. Love. This. 😆😍
bots?
It was Bot 8.
it's the nsfw filter, that is too hard core, has word "coon" in raccoon, so it blocks it from generating raccoons, try the local version
or try something like trash panda
hopefully it understands u mean raccoon
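To illustrate why a filter like that would trip on an animal name: this is only a guess at the mechanism, nobody here has seen the bot's actual filter. The blocklist entry and both function names are made up for the example. A naive substring check flags any prompt that merely contains a blocked term, while a whole-word check does not.

```python
import re

# Hypothetical blocklist with the single term mentioned above.
BLOCKED = ["coon"]

def blocked_substring(prompt):
    """Naive check: flags the prompt if a blocked term appears anywhere inside it."""
    text = prompt.lower()
    return any(term in text for term in BLOCKED)

def blocked_whole_word(prompt):
    """Stricter check: flags only when a blocked term appears as a standalone word."""
    text = prompt.lower()
    return any(re.search(rf"\b{re.escape(term)}\b", text) for term in BLOCKED)

for prompt in ("raccoon", "racoon", "trash panda"):
    print(prompt, blocked_substring(prompt), blocked_whole_word(prompt))
```

Under that assumption, both spellings of the animal get caught by the substring check, and "trash panda" passes either way, which would match what people are seeing on the bot.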
Ahhhh that makes sense.
Trash panda might get some good results.
Question:
How can I train something that will help SDXL generate images at higher resolutions out of the box... Is it possible to train a model using upscaled images at 4096x4096 so that the model I use will generate images at 4096x4096 instead of 1024x1024... Is this something that is possible to do or is there a reason that it is normally 1024x1024.... I guess my real question is how does stable diffusion evolve the resolution of generated images??? If my method is possible is there a way I can help you guys get it there over time? I have an RTX 4090 and a fairly beefy PC setup.
generating at 4096 is gunna be insanely hard on the gpu and theres no reason when you can just generate at 1024 then upscale, or use something like ultimate sd upscale or hires fix to get 4096.
ya but... it is progressing over time... sd1.5 was at 512x512 and so on... My question is how can I help with that progression.. They seem to be developing a way for this to evolve over time. I just want to see if there is a way for me to help them get there.
They have been doing it within graphics card abilities
for some comparison, 512x512 is 262,144 pixels, 1024x1024 is 1,048,576 pixels, and 4096x4096 is 16,777,216. thats a huge jump.. i understand a 4090 can probably do it. but im just saying at this point. it is a heck of a lot faster to generate 1024x1024 then upscale and get same quality.
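The pixel counts above are easy to double check. A small sketch (the `pixels` helper is just for illustration) that also prints how many times more pixels each size has than 1024x1024:

```python
def pixels(side):
    """Number of pixels in a square image with the given side length."""
    return side * side

base = pixels(1024)
for side in (512, 1024, 4096):
    count = pixels(side)
    print(f"{side}x{side}: {count:,} pixels ({count / base:g}x the 1024 count)")
```

So 4096x4096 has 16 times the pixels of 1024x1024 and 64 times the pixels of 512x512, before even counting any extra memory the model itself needs at that size.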
if you want a good 4096 model youd have to train one yourself and you wouldnt be able to do that on a 4090 at all at that scale of resolution.
youd have to code something completely new if you wanted an efficient new 4096 model. or just wait until we have the tech and graphics to handle such massive sizes
@sharp robin trash panda worked 😂
upscale?
almost always theres a way
I do use upscalers
huh. i was saying 1024x1024 image then upscale to 4096 is faster than generating a 4096 image itself
oh thought i was replying to Scorp
that's super cute 🙂
Lil guy made it to the showdown 😆
does comfyui have prompt editing like [woman:goblin:0.5]? Readme doesn't mention it
no but you can get very similar results with either conditioning combine or concat
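For anyone who hasn't used it, here is a simplified sketch of what `[woman:goblin:0.5]` means, based on my reading of the a1111 feature docs: use the first text until a switch point, then the second. A number below 1 is treated as a fraction of the total steps, otherwise as a step number. The regex and the `prompt_at_step` helper are my own illustration, not a1111's parser, and nesting and edge cases are ignored.

```python
import re

# Matches [from:to:when] with no nesting (a simplification).
EDIT = re.compile(r"\[([^:\[\]]*):([^:\[\]]*):([0-9.]+)\]")

def prompt_at_step(prompt, step, total_steps):
    """Return the prompt text in effect at a given 1-based sampling step."""
    def pick(match):
        before, after = match.group(1), match.group(2)
        when = float(match.group(3))
        # Below 1: fraction of total steps. Otherwise: absolute step number.
        switch_step = when * total_steps if when < 1 else when
        return before if step <= switch_step else after
    return EDIT.sub(pick, prompt)

for step in (1, 10, 11, 20):
    print(step, prompt_at_step("a [woman:goblin:0.5] portrait", step, total_steps=20))
```

With 20 steps, the first 10 see "a woman portrait" and the last 10 see "a goblin portrait", which is why the result lands somewhere between the two.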
hoo boy that's a lot of node
Some of my recent projects favorites
Where can I put some of these where they can be voted on?
I've only been using SDXL on Comfy, if I use A1111 do I use Hires fix as well?
And is the refiner for img2img after txt2img is done?
yeah a1111 styled prompt editing would be wonderful to have. In SD2 I mixed a lot of TI embeds and other tokens together to create interesting things.
really powerful stuff
there's this but I haven't tried it yet: https://github.com/taabata/Comfy_Syrian_Falcon_Nodes
so I found that the less human more "other species" the better it works. Dragonoids/dragonborn are easy, government lizard people are doable, "woman with green skin" is very hard. I kinda managed to make it slightly more doable with a super cursed dual-prompt node layout that I 100% wouldn't recommend to anyone, and the color still bled into her hair so that's fun.
maybe if there's some fantasy races with a really strong weight that's basically just "human but purple" it'd work a little better. Like a Twi'lek but with hair.
Is there any documentation on what these do?
I kinda get what combine would do, but not sure about how concat would work.
interesting - thanks for sharing your research!
but at least you can do hybrids now
it's like the balance meter in a tony hawk game where the left is feral monster and the right is normal human and it's constantly trying to throw you to one or the other
concat is like BREAK in a1111, combine is like AND
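A toy picture of the difference, as I understand it. This is not ComfyUI code: the lists stand in for tensors, and the equal-weight average in `combine` is an assumption. Concat makes one longer token sequence out of both prompts, while combine keeps the two conditionings separate and blends the noise predictions they each produce.

```python
def concat(cond_a, cond_b):
    """Concat: one conditioning whose token sequence is A followed by B."""
    return cond_a + cond_b

def combine(pred_a, pred_b):
    """Combine: A and B stay separate; their noise predictions are averaged."""
    return [(a + b) / 2 for a, b in zip(pred_a, pred_b)]

tokens_a = [[0.1, 0.2], [0.3, 0.4]]   # 2 tokens with 2-dim embeddings
tokens_b = [[0.5, 0.6]]               # 1 token
joined = concat(tokens_a, tokens_b)   # 3 tokens, seen by the model as one prompt

noise_a = [1.0, 0.0]                  # stand-in prediction when conditioned on A
noise_b = [0.0, 1.0]                  # stand-in prediction when conditioned on B
mixed = combine(noise_a, noise_b)     # halfway between the two predictions
```

That would explain why concat behaves like one long prompt with a chunk boundary, and combine behaves more like two prompts pulling on the same image.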
ok, never used BREAK in a1111
yeah I noticed this as well. when I did a lot of prompt editing with SD2 I always tried to find this sweet spot and then try to push it in the direction of the mix or style I wanted to explore. but it's not working the same for me right now.
right now my no. 1 issue with my prompts are the hands
I've been stuck on this for 2 days lol
I just gave up and try to compose the shots in such a way that the hands aren't visible
without controlnet you're basically fighting rngesus
with cheats
yeah - I tried so many things now. In some prompts it works alright, but I got so many images that I find interesting but the hands...
I think it's a fundamental problem with diffusion. It interprets fingers more like scales or hair and not like a constrained structure.
this is one of the better ones (1/50?) of the prompt I'm currently working on but it's still problematic.
Also it doesn't understand kinematics, so it may internally consider things like a closed fist and an open hand two completely different structures.
No one wants to answer my questions?
?
I just get one where the arms look decent and run the hands on a fast inpaint for like 50 batches
Only if you want a hi res fix
I think the problem is that hands exist in this weird space between fluidity and rigidity, as well as having many small parts which don't take up significant space so the amount of pixels it has to figure it out isn't that high.
I tried using hires fix on A1111 and it took a long time to generate
Do I use the refiner in Img2img only
Sorry, I haven't used SDXL with A1111 yet
hands have millions of possible poses and configurations. very complex little sausages with very indiscernible patterns
Exactly. If they were more fluid 'flaws' wouldn't be so noticeable, if they were more rigid they'd be more predictable. But they're neither
faces while having many expressions only have one layout so they're not bad. The pattern is so simple you can see faces on everyday objects or the moon :^)
i hate how the refiner tries to make a face out of everything. wish it didnt. lol. cuz it works well, but damnit it tries too hard to make faces that are not meant to be faces
yeah, that's definitely part of it. so I guess no spell casting for now till we have ControlNet or some fine-tuning that helps. I tried all the tokens and prompt techniques I can think of right now
but hands have many different layouts, so I'm guessing it treats them more like a "pattern fill" such as fur or scales and just kinda adds fingers until the gap is filled
Honestly it won't surprise me if at some point we end up with a refiner that identifies problematic areas like hands and then passes them off to a specialized model that only does hands or whatever the problem is
think they mentioned investigating a "many experts" model like that
the latent space is trolling again
Hmm, this keeps randomly happening when using the latent composite
Yeah there's been some speculation that's how GPT4 works, no confirmation yet but it makes sense
giving you the middle finger
yeah. I guess image segmentation is the way to go anyway. background, foreground, single objects and subjects
Coincidentally the 'many experts' model is similar to how the brain works with individual regions assigned to types of tasks. They're not completely separate but they are specialized
no joke. generated a couple of minutes ago
that aint bad
Agreed. I also think image segmentation will likely play a role in the next round of training. Instead of just training on the image, dissect it into its parts so the ai can 'understand' what each component is and how they fit together
one of those so close you can taste it kinda results
yeah there are many pieces to do that already available. it just needs to be in a nice package now 🙂
sdxl that's not what a fucking woman looks like
be honest, did you type horny woman into your prompt?
okay after 300+ test images to fix hands - time to move on to a different project and return some other time 😄
sounds similar to the crop conditioning they did in sdxl
with big pumpkins
"woman with red skin"
oh I just dragged a workflow into comfy and firefox quit existing that's nice
When I used Latent Composite it turns everything into flowers for some reason lmao
no prompt or when it doesn't understand it does mostly nature images for me
I'm not completely familiar with that but I think that's more about breaking the image into chunks whereas I'm talking more about defining what's actually in an image for the AI
like Meta's new segmentation model would let you pull an object out of an image entirely
Yeah I don't know if I'm doing something wrong, or it's SDXL, but Latent Composite just goes to shit
yeah. there are a few conditioning layers. its about attention conditioning and i think what you're proposing could tie in with the same kind of work they're doing with the crop conditioning attention goals.
the sdxl report is a pretty neat read
i know what segmenting is already. Its one of the coolest arenas of computer vision imo
segment anything is huge and the disruption waves it has caused haven't made landfall yet
in some 16:9 sizes that are outside of 1024x1024 like 1600x904 or 1820x1024, I get a couple of pixels with a white gradient either on the bottom or right of the image - in almost all generations
Definitely. Even the basic application of identifying objects and/or parts of objects are huge. Then you add the potential to increase the amount of data AI can pull from already existing images because it now has context and I think it's almost certainly going to lead to big gains
with textual inversions, you can already kind of do that. it takes the alpha mask as an input to put attention only on the subject left unmasked
AI is going to end up out of our comprehension, our brains have not evolved enough to be able to even grasp some of the ideas or understandings that AI will eventually present
by that point we'll have become augmented intelligence and the ai machinations will be our own
Honestly I haven't looked for an actual example of how accurate textual inversions are at it but I'd be shocked if it's anywhere near Meta's model
oh TI embeddings have nothing to do with segment anything. it's for training embeddings for stable diffusion i mean
We've only recently discovered how the human body continues to go through puberty and a girl can begin menstruation (her period) even with zero brain activity
just an example of how a mask focuses training already
it's not that far out of an idea
Interestingly this is what Isaac Asimov speculated the end game of robotics to be, a state where robots and humans were indistinguishable. A lot of people take the wrong message from his works because they had to have conflict and inevitably included 'hostile' AI/robots, although really the point was more that it was the people that were using them wrong.
If that's the case then technically our mind is essentially a separate entity from our body so why wouldn't we be able to extract the human consciousness into something else?
yeah i'm a big fan of asimov and the other greats of scifi past
if it exists and happens in the universe, it's not magical and it can be engineered
well it's still finicky but I think those new nodes are on the right track. Color doesn't bleed into the background and it's so green it looks photoshopped
You know how this is going to go - Matrix, Skynet - you don't give the AI rights and it will rebel 
i think a problem for our future could be that most of the world is run by older people that reject new concepts of technology therefore they will perceive it as hostile even though it is not
@upbeat summit 1824 x 1024
combine node seems to alternate between conditioner A and B for each sample
suppose we gonna give toaster rights now too then! rarwararawrar
you say that like your toaster doesn't already have rights you slave-owning colonizer
Eh those types of stories always attribute a level of emotion/malice to the AI. Like there's no reason skynet would care about if it exists or humans wipe it out. The much scarier part of AI, at least to me, is that it actually doesn't care at all.
Any form of life that is essential to our ecosystems or any form of life that can display higher level intelligence should be given rights
my toaster loves its life in my home! i treat it right!
