#sd3
Hello! I have questions regarding the "commercial use license" for SD3. Where would be the best place to ask about that?
shorter token quantities.. still not as good as I expected. Workflows embedded
nah, it's certainly better on paintings, try watercolor, or pastel acrylic
the actual composition of gemstones -> all things that are lost on flux
oh I remember watercolor on the service, that was good
I just remember making regular oil (impressionist) paintings on the service and it looked better than offline
that's what I'm mostly after
if i personally was serious for commercial stuff, i would contact SAI directly to answer some questions i guess, idk
strange trojan horse.
Big Ben is going down
damn
lol
and. sd3.5 does blood. flux doesn't want to.
nice
fair enough, I was hoping someone around here had some experience and understanding right off the bat :v
But thanks for replying!! I'll be looking around their website then
i mean maybe someone can answer you idk :3
I remember sd3 large being very good at gore and flux being almost nonexistent
so its nice to see that still working
like a store made out of guts or some stuff like that
did a spider-pig earlier, yeah, it's good at that. for certain better than flux
flux feels very neutered
If you aren't making enough from your ai art to buy a new Ferrari, prob don't need to worry.
no idea what i put in as prompt. modern architecture is weird man
i like the mana, health and poison potions on the table haha
Aortic tomato ketchup spurts from an open wound, where smashed and jagged candy bones protude thru the skin...
(the example pic)
What? Flux does blood just fine? Though I never did try lakes of blood...
nah, it didn't do it fine at all
Nice!
at best a few splotches, but nothing in the realm of the thing you'd want, if you were gunning for it
I just can't get it to make a good impressionist oil painting
holy hell
that looks like a cover for some movie, nice
white background 😮
eh I give up on paintings
I swear there's something wrong with the VAE or whatever, the images have a speckled look to them
CCTV is okay
kek
🤣
I figured as much but I wanted to exercise due caution so it doesn't bite me in the ass later on.
Essentially I wanted to make graphic t-shirts but I have little physical artistic talent. So I've been having SD draw things that I want, and I'll mash them together using GIMP and edit them to my liking then slap them on a shirt.
I'm not sure of ANY legal jargon so reading the "https://stability.ai/community-license-agreement" makes my head hurt....
Not as good as MJ, but not too bad. Flux Dev lol
the part where it just didn't hold up for me -> it didn't want to apply the blood anywhere
it's just splotches, and non-applied blood
you have to do some "prompt engineering"
Septo-prompt
aight, lets overload the prompt
thomas got weird when he snorts his coal
As between You and Stability AI, You own any outputs generated from the Models or Derivative Works to the extent permitted by applicable law.
i like it
ugh...i REALLY should go to bed
Perhaps you should create some loras
... i never make/use loras. pure prompter only
that can only get you so far
nah, gets me exactly where i want to. just want to have some fun and post few images
^^
nice
hey where is cat with 4gb of vram on this glorious day?
yea, "extent of applicable law" is ambiguous enough that idfk what limitations that entails 😭
yea i tried to contact him lol
Photo of Criminal in a ski mask making a phone call in front of a store. There is caption on the bottom of the image: "It's time to Counter the Strike...". There is a red arrow pointing towards the caption. The red arrow is from a Red circle which has an image of Halo Master Chief in it.
Flux does this first try
or definitely consistently
lets be honest here man, you are most likely not gonna generate anywhere near 1 million dollars :3
he lost his account, it's @rapid pivot now
I appreciate the texture on everything though, flux is made out of plastic
this one is made out of scratched plastic
plastic booba
that's what I figured :v
dont worry man, il give you a small loan of 1 million dollars
holding swords is usually hard for AI, they dont hold the swords correctly
Hey Im not really a technical user or anything, Ive been running a1111 SD 1.5 on my AMD gpu since last year
Will SD3 work on my rig? Or should I just stick to 1.5 until I get better hardware ( wont be building a new PC until end of 2025 )
ye video game stuff is always fun
well, time to go to bed ^^ cyaz
gn sweet prince
gn
yeah flux got this right
i only played cs 1.6
oh look no bokeh
okay what the fuck comfyui you are destroying my space
what are you offloading
bruh at least say what card it is and what vram it has
(bokeh:9999) in the negative prompt :3
The free HF SAI image generators for 3.5 won't generate certain images. Though their Flux HF generation areas will do the same prompt just fine.
its an rx 590, 8gb of vram
its not powerful at all, my SD 1.5 caps out at like 900x900 ish
but i get nice results
are you not generating locally?
i think you need 12gb for flux and sd3. so until then you should try out sdxl or pony
Oh damn okay, Ive seen a lot of pony so I guess Ill switch to that. thanks!
12gb vram and 32gb ram is like the minimum these days for the cool stuff
Flux and SD3.5L work fine on my 8GB VRAM / 64GB RAM
oh damn, thats cool
nice
i often get vram errors on my 12gb card though
yea i built this PC in 2019 and with win10 ending in late 2025, my pc doesnt meet sys reqs for win11, i was just gonna wait and upgrade the whole rig at that time and probably get a 12gb vram nvidia card
what is the generation speed? @dusky thistle
Mage just added 3.5!!!! ❤️
she works at hot topic
30/30 [00:55<00:00, 1.85s/it]
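For anyone comparing speeds in this thread, a progress line like the one above can be turned into numbers. This is a minimal sketch, assuming the line is a tqdm-style bar reporting seconds per iteration (`s/it`); `parse_progress` is a hypothetical helper name, not part of ComfyUI or tqdm, and it does not handle bars that report `it/s` instead.

```python
import re


def parse_progress(line: str) -> tuple[int, float]:
    """Extract (total_steps, seconds_per_step) from a tqdm-style progress line."""
    match = re.search(r"(\d+)/(\d+)\s*\[.*?,\s*([\d.]+)s/it\]", line)
    if match is None:
        raise ValueError(f"unrecognised progress line: {line!r}")
    # Group 2 is the total step count, group 3 the seconds-per-iteration figure.
    return int(match.group(2)), float(match.group(3))


steps, seconds_per_step = parse_progress("30/30 [00:55<00:00, 1.85s/it]")
estimated_total = steps * seconds_per_step
print(f"{steps} steps at {seconds_per_step} s/it is about {estimated_total:.1f} s")
```

The product of steps and s/it lines up with the elapsed time shown in the bar itself, which makes it easier to compare runs that use different step counts.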
kk
idk who that guy is

did you make a new account? :3
haha
lol
Anyone know a good link to educate me on installing and using SD3.5-Large-Turbo in ComfyUi on my PC?
like, what files exactly and where to put them? PC requirements to feasibly run it? etc?
the comfyui github readme has enough instructions to get you started, especially with example workflows and instructions where to place each file
thx, I'll check it out
now that is way better
i like the blend of colors
seriously, if this thing is anywhere near as trainable as cascade, it's gonna be amazing
i wish they would bring back the Cascade feature of mixing pics
cause you can easily blend styles
When Alioth finds The Mask.
How so?
im disapoint man, where are the Thanos pics? :3
the workflow from matteo and comfy both refused to do 1920x1024
I haven't had issues with alternate resolutions:
So, all these pictures you guys are sharing here, I could save them to my computer, load them into ComfyUI to see the workflows used to make them? Am I understanding that right?
Gone; reduced to atoms.
wait did you specifically prompt for that kind of style?
can you tell me the prompt
My prompts are all in the images. I never strip the metadata
if it wasn't a glitch this would be very cool
i also assumed glitch at first
@dusky thistle My, God, man, I don't normally grab image WF but Holy Toledo. Damn, lol.
impressionist oil painting
how come it works so well for you
maybe my sampling is bad or im just giving it a certain context that makes it snap out of the style
I dont understand the comment. How come what works so well? Is it not supposed to?
https://github.com/ClownsharkBatwing/RES4LYF nodes are here just fyi
using stochastic sampling with third order RES, CFG++, all custom implementations
I was going to train some SD3.5 stuff in a bit, but not getting 1920x1024 sucked
lol
why 1920 x 1024? Strange resolution
this is going to need a lot help
must be evenly divisible by 64
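A quick way to check the sizes mentioned in this thread against that rule. This is only a sketch: the base of 64 is taken from the message above and is left as a parameter, since the actual constraint may differ per model or node. `check_resolution` and `snap_down` are hypothetical helper names.

```python
def check_resolution(width: int, height: int, base: int = 64) -> bool:
    """True when both sides are exact multiples of `base`."""
    return width % base == 0 and height % base == 0


def snap_down(value: int, base: int = 64) -> int:
    """Largest multiple of `base` that is <= value, never going below `base`."""
    return max(base, (value // base) * base)


# Sizes quoted elsewhere in this log.
for width, height in [(1920, 1024), (1600, 896), (1440, 816)]:
    ok = check_resolution(width, height)
    snapped = (snap_down(width), snap_down(height))
    print(f"{width}x{height}: multiple of 64 = {ok}, snapped down = {snapped}")
```

Note that 1440x816 is a multiple of 16 but not of 64, so if that size renders fine the real requirement is probably looser than 64.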
those are the nodes in the link for RES4LYF
they're my samplers
welcome to comfyui, we have spaghetti
it's not the red ones I care about, it's all the ones I'm scared to touch: they're there but not connected, some with all the text and some without
i mean you can start with an easier workflow and work your way towards more complex stuff
Yeah, Straight outta Compton, yo
my spectator nodes be chillin
Yeah, just hanging out, ready for action
they ready to snipe a fool from afar
They are known as sleeper agents.
sleeper nodes
"These are not the droids you are looking for"
im looking for Android 18
must be heaven
oh, well I did a pip install -r requirements.txt all good. Did a manual install of opensimplex. All good.
I do an install missing nodes it says all good I have none. :/
It might be the size and not the resolution
Here is an alternate made from 1600 x 896
not worried about it I am going to go train it
his workflow is missing stuff
and it can't find them. Not a game killer
of course I am in legacy mode as I want no part of the new comfy way.
Workflow and image info is in the pic
I can't wait to train sd3.5 as Flux bent me over
what stuff is missing?
I suspect the issue with 1920x1024 is that the res is too high
ye but thats the thing he told you to instal
at 1440 x 816 it is already hunky dorey
the trainers are being written as we speak
i know, but you need to install his custom node
I did. I guess you missed that.
(default, no Lora)
As I said, no worries
oh so you are saying, its some extra stuff even aside from his custom node?
okay, I updated all and NOW it works
kk
no missing stuff. I already got his, but it was something in the backend, and I had already updated right after comfy added 3.5
that's weird... those are in that repo
I know.
oh it works now?
I even closed it down and restarted same. I then said to update all and it worked
yes
yep
that means I updated 3 times today
lol
@dusky thistle your nodes in the cli are doing this
oops, that's a debugging thing, lol
forgot to remove that
should be gone now
1920x1024
some of these can be nice wallpapers
what is that border?
prompted for vhs tape
huh
At the moment, I'm really pleased with the variety within a single prompt, especially the abundance of different faces. Yes, there are issues with anatomy and complex interactions, but I believe these can be resolved with fine-tuned models. Overall, the model turned out well. It's only fair to praise the SAI team for this. The criticism paid off.
i've been sayin it for a while: please, god, let SAI release 8b
i can hear the popo already
Quite impressive variety of comics and cartoon styles
i like it
popo is interested in fake family photos? xD
haha
Same prompt each time
they are coming for the VHS tapes
guys how much time does it take to generate an image using SD3.5 at 1024x1024px with an 8GB gpu?
i really like the shadows on this image, the 3d effect is really trippy
a bit slowish, it's bearable. It's loaded to ram for me
on 3070 btw
i dunno. but it's about 3 minutes on my machine and i have 16gig. you might consider using turbo
the sky 😦
prompt for clouds
why the filth in the sky?
lots of leftover noise in that one
yes
artifacting it looks like. prompt for stuff in the sky, you'll be less likely to get that
is your workflow in that image?
nah, it is Clowns
ddim
switch to beta for the scheduler
k
lol that says nothing, like how much is slowish? give me solid numbers, how much ram does the 3070 have? im guessing 8gb bc you responded? for me sdxl is like 15-30 seconds and flux is like 80-200+ seconds
to be precise, it will be more than sdxl and less than flux
give me a prompt, i'll run it on my machine and tell you how long it takes me
30-40 seconds for 15 steps i havent stopwatched in second
30-40 seconds on 8gb of vram?
the workflow in this uses the 3 encoder node. you prompt each encoder to its strengths
beta it is there but barely
yeah and 64GB of ram
what's your prompt?
a very high quality extremely detailed 4k cinematic photograph, a very sharp 4k 8k masterpiece, of a snowy mountain trail captures the stark beauty of winter in the high altitudes. The trail winds through a dense forest of tall pines, their branches heavy with fresh snow that clings to every needle, creating a soft, white blanket that muffles the sound of the world. The ground is a thick layer of snow, the surface untouched except for the clear imprints of bootprints and the occasional small animal tracks crisscrossing the path. The sky above is a brilliant, cloudless blue, the cold air crisp.
to start with, drop the 4k and 4k 8k terms. they do nothing. let me see what i can get out of the rest of it
i have a laptop 8gb gpu + 32 gb of ram so I doubt ill be able to reach those 30 second times but ill be happy with 40-50, flux can't do faster than 80 on Q_8 flux-d variants
keep in mind I measured it now when everything is loaded in ram. First time loading in ram is limited by your drive(in my case 500mb/s ssd)
it slowly chugs and I can literally see my ram filling and then its 30-40s
I downloaded Turbo which can run on 6 steps but I tried to bump the samples up a bit
which sampler are you using there
sd3.5L has a bit of trouble with that scene
i tried throwing in a few different quality terms and it helped a bit, but not enough
3 minutes with a 16gb vram card? that seems strange, for example on my 12gb card, with flux, its like a minute MAX at those resolutions
good point, indeed it does run faster on subsequent attempts, so it's 30-40 after it's been primed let's say
yeah exactly, 3.5 is almost same like 3
beta
that's the scheduler
i make it a point not to argue with sharks. especially sharks with clown makeup
euler the default
kek
euler kinda sucks tbh
it's the simplest but it's just not very accurate
I think. not sure I closed it down as I go to linux and train 3.5
a lot of discussions for trainers and a lot of funky stuff in 3.5
the SAI decisions made are wonky at best
@sacred geode
3.5 also shows they didn't throw a lot of money at it (for obvious reasons)
workflow is in that
this is what i'm currently using for my sampler
has anyone tried SD3 with LongCLIP yet?
it worked
res_2s in the lil drop down is almost as good and 50% faster
i would love to know how you know this, and some of the other stuff you've stated
I kind of thought the result was a tad better
it depends also what you are going for i guess. realistic stuff is usually better with the dpm samplers, for cartoony stuff, euler is fine, but then again im no expert, just speaking from some experience i had. now if you want to use custom samplers, then thats a whole other thing. not to mention noise injection techniques and so on
would be nice, right?
i have several hundred images that show you what it can do. you have to learn how to talk to it though
yeah. since you're wrong
i've come to believe that sampler selection is more objective than i used to
where is this node?
hiding in his shark tank ;)
opinions and what we see in the code. You have yours and we have ours. Yes, I said we not just me.
has like 20 samplers all rolled into the same framework
oh trust me, mine is not an opinion.
okies
once they're implemented in a similar manner, you start to see it's really just a question of how accurate it is
RES is basically a patched up version of DPMPP
to fix some issues with the math
have you tried matteo's node he created for the flux blocks?
nope
same neural network
no wonder it's called res4life
that is easy to understand
so his node should work with SD3.5
what node is this?
yepp I memorised that chart right away
the test is tomorrow morning
matteo released a node several weeks back that allows you to adjust all the blocks individually
i don't think it's in his essentials pack though.
matteo is one of the stable diffusion gods we have
he's a good dude too
and major comfy node programmer :) - they just don't pay him
😦
course he doesn't work for them ...
i learned a lot of cool tricks from his videos
@dusky thistle your prompt - workflow's in the image
lol
so t5xxl is only 256 tokens?
what is that part about 77 /256 tokens coming from the CLIP encoders? anyone care to elaborate? I understand that the smaller CLIP L is limited to 77 tokens and then G/t5 are supposed to be the bigger ones
there are 3 encoders that sd3.5 uses. not just t5xxl
yeah i get that
you still don't want to go into ramble mode, even with t5xxl - and you do want to give it your rich detailed prompt. you want to give clip_l the ambient, artsy, background, fine details. and clip_g the 'just the facts ma'am' black and white text. then they don't battle each other
yeah i understand that much, just wondering if that chart is saying that t5xxl is limited to 256 tokens
what chart are you looking at?
this one, top left, i thought t5xxl had way bigger context width like 1024+ tokens
the way i read that chart is that it's 77 tokens for clip_l and t5xxl + 77 for clip_g -
clip_g is your workhorse. it does most of the heavy lifting
there are longclip fine tunes out there
that might work with this, they work with the other models
some things are better, some aren't... still tons of artifacts/structural issues, sd3.5L just doesn't understand this scene well
it would be worth trying at least
this guy makes them https://huggingface.co/zer0int
its a nice way to get a variation of an image also as the new clips interpret prompt differently
they don't like to put bins (garbage cans) as much for example
yeah i use the longclip finetunes from zer0int for my flux / sdxl workflows
there's also a better clip_l out there with better text understanding
if you read from their huggingface model page, that 77/256 is for training stages, but t5xxl should be around 256 i guess. the others should be 77.
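To make the per-encoder limits discussed above a bit more concrete, here is a rough sketch that flags prompts likely to run past each encoder's window. The budgets (77 for clip_l and clip_g, 256 for t5xxl) are taken from this conversation and are not verified here. The token count is a crude word-and-punctuation split, which is an assumption: the real CLIP and T5 tokenizers split text into subword pieces and usually produce more tokens than this. `rough_token_count` and `over_budget` are hypothetical names.

```python
import re

# Budgets as quoted in the chat; treat them as assumptions, not specs.
TOKEN_BUDGETS = {"clip_l": 77, "clip_g": 77, "t5xxl": 256}


def rough_token_count(text: str) -> int:
    """Crude estimate: one token per word and per punctuation mark."""
    return len(re.findall(r"\w+|[^\w\s]", text))


def over_budget(prompts: dict[str, str]) -> dict[str, int]:
    """Return {encoder: estimated tokens over budget} for prompts that overflow."""
    overflow = {}
    for encoder, prompt in prompts.items():
        excess = rough_token_count(prompt) - TOKEN_BUDGETS[encoder]
        if excess > 0:
            overflow[encoder] = excess
    return overflow


prompts = {
    "clip_l": "soft winter light, muffled quiet, fine snow texture",
    "clip_g": "snowy mountain trail through tall pines",
    "t5xxl": "A snowy mountain trail winds through a dense pine forest under a cloudless blue sky.",
}
print(over_budget(prompts))
```

Splitting the prompt per encoder like this follows the advice earlier in the thread about giving each encoder the part it handles best; the check only warns when one of those parts is getting long.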
issues are just in the branches more than anything, it has trouble resolving them
i think a finetune will clear this up
the autoCFG guy made the best text encode node I have ever seen
with 856 configuration examples lol
https://github.com/Extraltodeus/Conditioning-token-experiments-for-ComfyUI
yeah i was using that one until i learned I can use the longclip finetune over the clip_l finetune, they say they manage to improve the context width from 77 tokens to something like 248
last time i used the longclip, it produced some very bad results tho... idk
maybe it's better now
its tricky
using alternate clips
my main checkpoint these days is New Reality on SD 1.5 and the zer0int better clip does worse on that model
well that's the thing with components, people like to experiment and mix and match
i think it makes the most sense to always use the encoder your weights were trained against, and not to swap them out post-training
probably yeah its an ablation
still using 1.5 eh? :3
I could put a clip switch node in the workflow maybe
and then toggle it every now and then
yeah my main is SD 1.5 now
1.5's the only one that understands certain terms correctly
now that Blepping ported DiffuseHigh to comfy
you can make SD 1.5 image at 2048x2048 or more without any upscale, all in one pass
https://github.com/blepping/comfyui_jankdiffusehigh
this is currently the best method on Arxiv so its awesome that we have it now
Lol at that name of that repo
haha yeah
his hidiffusion one is also called janky
interesting.. i actually didnt know about diffuse high. i know there was something called ultra diffusion or whatever, that is kinda similar, but il check that one tonight as well and see how it works out for some of my 1.5 stuff
ultra diffusion yeah that's actually clown's repo
https://github.com/ClownsharkBatwing/UltraCascade
whoops
there's this but probably not what you meant ```Ultra-Resolution Cascaded Diffusion Model for Gigapixel Image Synthesis in Histopathology```
that could be extremely useful
yeah my main interest in diffusion models is actually what they can do outside of making images
making images with them is just a great way to learn
shit i cant find it 😦 il try on github
yeah. there's a wide range of things they're being used for across all industries now
don't worry I forget the names of good papers for weeks at a time lol
diffusion language models is a funny one
the output texts are very different to LLMs
ah i got it... https://jingjingrenabc.github.io/ultrapixel/
ah that's the same as Clown's repo
is it?
yeah his repo is basically getting that model to work well
well it all makes sense now lol
I really need to try it some time
its a bit tricky for cloud as it needs quite a few files in the right places
stable cascade was awesome when it came out, but then they announced sd3, but sd3 was terrible... LOL
lucky for us, flux arrived not too late and then we have sd3.5 i guess
@dusky thistle lol, it's really a small world eh, didnt know you did the ultrapixel repo as well
the ultrapixel repo is a port
i did a full rewrite https://github.com/ClownsharkBatwing/UltraCascade
runs faster, more vram efficient, integrates natively into comfyui
yeah I mostly skipped flux
I was looking through my download folder the other day and I really didn't do that much of it
gonna just focus on SD 3.5 now
you skipped flux? how dare you sir... :3
the original code for ultrapixel was honestly a complete mess
yikes
deleted tens of thousands of lines of unused code prolly
lol
all kinds of crazy issues with weird hacks etc that degraded quality, undocumented behavior that diverged from the paper etc
which is why i just did a rewrite
it looks fantastic though
was worth the trouble
yes awesome work man
@weak trellis 82 seconds on the second run, 8GB vram, 32GB ram, 20 steps, 1024x1024
what really broke things through to the next level was implementing PAG for cascade
i remember pag, dont really use it these days
PAG is the biggest image quality upgrade in the last 2 years in my opinion
but only when it is suitable
oh, the other key thing was i trained my own version of stage B lite
results are much better than with the full stage B
images like the above would look like butt
the training for stage B got fucked up i guess cuz it did need patching up
wait, was PAG before or after SAG?
cool thing though is now you can generate directly at 2560x1536 or even higher without even clearing 11gb vram
it came after
ah
i got SAG implemented with cascade too, it's helpful as well
also invented RAG (random attention guidance lol) which is great for some photographic styles
what compression factor you use the most for cascade?
PAG was nuts though
i don't use compression factors
it makes shit way too confusing imo
yea
the key is getting the dimensions for latent C nailed down
B is just a superscaling model
the best resolutions for C are 24x24, 18x30, and 24x40
then you can kinda pick whatever you want for B
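The latent C sizes listed above can be turned into a small lookup. This is a sketch under assumptions: the chat does not say whether the pairs are height x width or width x height, so they are treated here as (short side, long side) and oriented to match the requested image. 18x30 and 24x40 share the same 3:5 ratio, so a `prefer_large` flag breaks the tie. `pick_stage_c` is a hypothetical helper, not part of any Cascade node pack.

```python
# (short side, long side) pairs as quoted in the chat; orientation is assumed.
STAGE_C_OPTIONS = [(24, 24), (18, 30), (24, 40)]


def pick_stage_c(width: int, height: int, prefer_large: bool = False) -> tuple[int, int]:
    """Return (latent_width, latent_height) closest to the image's aspect ratio."""
    target = max(width, height) / min(width, height)

    def sort_key(option: tuple[int, int]) -> tuple[float, int]:
        short_side, long_side = option
        ratio_error = round(abs(long_side / short_side - target), 6)
        area = short_side * long_side
        # Ties on ratio are broken by area: smallest first unless prefer_large.
        return ratio_error, -area if prefer_large else area

    short_side, long_side = min(STAGE_C_OPTIONS, key=sort_key)
    return (long_side, short_side) if width >= height else (short_side, long_side)


print(pick_stage_c(1024, 1024))
print(pick_stage_c(2560, 1536))
print(pick_stage_c(2560, 1536, prefer_large=True))
```

Stage B's output size is then chosen separately, as the message above says.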
would you say B is superscaling or refining?
superscaling
there is SEG as well which was a sequel to PAG
its less powerful though but somewhat prettier
and refining
SEG likes to put curtains up and PAG doesn't for some reason
so wait, what VAE are they using for sd3.5?
curtains? it's trying to censor the booba man
cascade doesn't understand much more than a 10-15 word prompt, but it's unbelievable with aesthetics
it's one of the last models that was actually trained on artist names etc
cascade 2: electric boogaloo
its also a colour thing where PAG likes orange/yellow/red contrasted with black/purple
whereas SEG likes pastel colours
just oozes style
but i remember cascade having problems with eyes, like it makes them heterochroma, or however the hell you spell that word LOL
I remember me and @icy drift making some TCG waifu cards or whatever with cascade LOL
oh no, i think it was like animals
can you link me SEG? i didnt see that one
that is the clownshark style
hahah
you made some loras or checkpoints if i recall haha
yeah, which somehow have more downloads than real models get
some weird botnet has latched on and is repeatedly downloading my garbage in an infinite loop i swear lol
there are a few nodes but I recommend these ones for PAG and SEG
they have a really good rescaling called SNF
https://github.com/pamparamm/sd-perturbed-attention
my HF is basically a trash heap
there's no reason it would be some popular destination
i mean dont lower yourself like that man, i think you are doing some cool stuff
and now some real SD3.5L
the Anthropic botnet is crazy apparently
oh i just mean in terms of how i organized it
i just uploaded shit without bothering to name anything carefully or organize
every few days, just heaved another lora into the same folder with some partially descriptive name
https://huggingface.co/ClownsharkBatwing/CSBW_Style/tree/main nearly 2k downloads last month
- image on the left using default CLIP L, t5xxl fp8, triple encoder node
- image on the right using the fine tuned CLIP L (by zer0int), t5 fp8, triple encoder node
- I can confirm that LongClip doesn't work with SD3, i think zer0int will have to make a new node for it to properly merge it
so weird
ah "Smoothed Energy Guidance", SEG, just wanted to see what it stands for lol, i mean il check it out tonight too.
shit i have tons of stuff tonight, sd3.5, seg, ultrapixel, diffuse high, and prob like 10 other things i forgot as of this moment
quick find 10 more
lol
We need some Raid
wait where is the SNF you mentioned? i dont see it
so is segg better than pag?
shadow legends?
smooth energy guidance?
I think SEG is better yeah
however PAG is a bit stronger
SEG looks more attractive to me
where is snf
```full - takes into account both CFG and Guidance.
partial - depends only on Guidance.
snf - Saliency-adaptive Noise Fusion from High-fidelity Person-centric Subject-to-Image Synthesis (Wang et al.). Should increase image quality on high guidance scales. Ignores rescale value.```
ah
if you are upscaling PAG is better I think
or fixing broken stuff
the original SAG is also very good, its the best for clarity or sharpness still
i would like to see an anime in that style haha
so how's 3.5? any improvements?
from 3.0? yes
damn pyramid flow, videocogx, tons of flux loras and accessories, sd35, ultrapixel, sana and omnigen coming... madness...
its .5 better
sd3.5 large is still bad at hands. prompt: a man show his fingers with ring in the street.
yeah face fix and hand fix are needed for now
the ones from impact pack are fine, they do it all in 2 nodes
impact pack is awesome for a lot of things
could do flux for hand fix
then a second pass with SD 1.5 for hand fix 2nd pass
face fix I would be okay doing 2 passes with SD 1.5
but wait... there's more
is there any negative prompts for bad hands?
flux is good for the same prompt
dat moment when flux 1.1 weights release tomorrow, i mean that would be funny considering we just got sd3.5 LOL
i kinda dont want flux to change bc new weights would kinda invalidate all the existing work/loras
but what is the price of invalidation compared to greatness
man that bokeh is too strong
needs some work
it can do really small details it seems
yeah, depending on how you prompt
why don't sd3, sd3.5 or flux have an inpaint model?
in which interface?
flux has an inpainting control net
I agree though, it needs inpainting models, or patches like the fooocus ones for SDXL, or a brushnet
flux is never going to get those. but SD 3.5 will, as soon as someone comes up for air from everything else they're suddenly developing and creates one
this one? https://github.com/alimama-creative/FLUX-Controlnet-Inpainting It does not work well
yeah, cause flux isn't trainable. and you have to train controlnet. give people a few days to work with 3.5 and someone'll start on controlnet for it
that was the one I was referring to yeah
I haven't tried it yet
I personally use the powerpaint v2 brushnet for inpainting
or the SDXL union pro max control net
or maybe krita is a good solution as it's a paint program and also runs stable diffusion
krita looks great but I need it automated
Where is this Clip found?
they are here https://huggingface.co/zer0int
sup
@craggy crest how many images do i need to get a controlnet trained? For SDXL it was like 100k or more
any idea why this prompt isn't generating anime
retro anime style, a woman with long flowy blonde hair, wearing bold makeup, in her casual t-shirt and denim pants, navel cutout, indoors, solid red background. dramatic perspective.
i have no idea, i'm not the one that trained any of the controlnets in the past
the estimates for Flux union control net cost were in the tens of thousands of dollars sadly
will be similar for SD3
some are coming from SAI though, which is good
where do you get this estimate from good sir
can't remember, either reddit or discord
is that like, 2 weeks of H100s?
H100s are about 2 dollars an hour or so
nah maybe 4
first came Alibaba, now we have Alimama :3
no you can get them for 2
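The back-and-forth above is easier to follow with the arithmetic written out. This is a sketch only: the budget figure is a placeholder standing in for "tens of thousands of dollars" and is not a number anyone in the chat gave; the 2 and 4 dollar hourly rates and the two-week window are the ones quoted above.

```python
HOURS_PER_DAY = 24


def gpu_hours(budget_usd: float, usd_per_gpu_hour: float) -> float:
    """Total GPU-hours a budget buys at a given hourly rate."""
    return budget_usd / usd_per_gpu_hour


def gpus_sustained(budget_usd: float, usd_per_gpu_hour: float, days: float) -> float:
    """How many GPUs the budget keeps running around the clock for `days`."""
    return gpu_hours(budget_usd, usd_per_gpu_hour) / (days * HOURS_PER_DAY)


ASSUMED_BUDGET_USD = 20_000  # placeholder for "tens of thousands", an assumption

for rate in (2.0, 4.0):
    hours = gpu_hours(ASSUMED_BUDGET_USD, rate)
    gpus = gpus_sustained(ASSUMED_BUDGET_USD, rate, days=14)
    print(f"${rate:.0f}/h: {hours:,.0f} GPU-hours, about {gpus:.1f} H100s for two weeks")
```

So "2 weeks of H100s" at that kind of budget means a cluster of a few dozen cards rather than a single one, and doubling the hourly rate halves the cluster.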
flat comic line drawing...
I love digimon
a little close but not quite
what is your prompt?
flat comic line drawing, a woman with long flowy blonde hair, wearing bold makeup, in her casual t-shirt and denim pants, navel cutout, indoors, solid red background. dramatic perspective.
i had digimon cards when i was lil, but never knew how to play, so i trashed them
good
joins us on the yu-gi-oh cult
lets burn inside churches together
and btw im using large turbo model
sd3.5 has fingers conveniently tucked away, but so far i like the aesthetics
@bitter hearth
nice.. how did you manage that?
flat anime style comic line drawing, a woman with long flowy blonde hair, wearing bold makeup, in her casual t-shirt and denim pants, navel cutout, indoors, solid red background. dramatic perspective, by Katsuhiro Otomo
that can definitely narrow it down! Or use a reference image of the style you like
ok
controlnet is a huge model so ...
did you know flux can't get this prompt right? a woman in large white t-shirt, bare legs, at a beach as the sun is setting. the ocean water is deep turquoise with orange hue in the sky. but sd3.5 did it at one go
I took the artist name off: "flat anime style comic line drawing, a woman with long flowy blonde hair, wearing bold makeup, in her casual t-shirt and denim pants, navel cutout, indoors, solid red background. dramatic perspective"
Oh wait, I was using SD3.5
with Flux, you have to mention anime a couple of times, and definitely want to add an artist reference
no, im talking about sd3.5 for anime related query
it cheated, she has a nose and lips, NOT anime lololol
to render anime with flux i usually use lora
like this
With flux you can get most things with prompt only (but not nsfw)
there that's better, no nose nor lips
hot
is she single?
nope lol
nice
My first sd3.5 lora
how are you doing anything anime
fairly easily
you are fast.
@bitter hearth @bitter hearth
cats with no fingers, great choice
what is your opinion about current trainability?
I think it is better than flux. 24GB able to do fp16 training with bs1. 100 images, 4rank, 2hours

I noticed in the sample images that hashtags were used in the prompt. Is the hashtag list extensive? Like could I assume if I wanted a steampunk feel for my picture that I can put #steampunk? what does the hashtag do?
I'll say this much: after a ton of flack, and deserved, Stability stepped up and delivered pretty much what was hoped for after their initial announcement in March. They had teased the 8B and it was what everyone had been pining for and now they delivered. So a big kudos to them.
I'll even add that although it cost them in more ways than one, the stumbles that led to Flux were a bonus to the users. Perhaps not to them, but certainly to the community at large.
Stability have smashed it yeah, amazing model
We all now have two large and stellar models
I really like how flux looks for some things I just don't like the distillation
it's good, but im waiting for the usual stuff, controlnets, finetunes, and also training loras
there's one lora
I like choice, and competition, and we have it in the spades now
someone here made first one very fast
no im gonna train my own lora :3
it can have more potential than flux in the long run, if it is actually better at tuning
I'm looking into making loras again
It used to be the competition of the mini 2B models. Not anymore
we are getting updated 2b too btw
some kind of hybrid can be good
make image in flux then second pass in SD3 then upscale with SUPIR (which is SDXL)
Now we have commercial level quality
Well aware. Heh
all we need is to fix anatomy and this is dream model
sd 3.5 doesn't treat the keyword girl as an immature toddler
it does if you mention 3 yo girl
Can you hires with 3.5?
i dont want to
I actually thought the announcement image with a girl on grass was a hugely funny choice
I just mean its flexible lol
probably no, tiling is the only way
yeah flexible i like it so far
it goes funny after a certain number of steps

I rendered 1600 x 896 images with no issues
Stupid 
DiTs don't scale as well as Unets in latent size due to the positional patchwise embeddings
so tiling is the way with Flux and SD3
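A rough sketch of what tiling means here: the large target canvas is split into overlapping tiles, each small enough for the model to process at a resolution it handles well. The helper names `tile_starts` and `tile_boxes`, the 1024 px tile size, and the 128 px overlap are illustrative assumptions, not values taken from any particular upscaling node.

```python
from __future__ import annotations


def tile_starts(length: int, tile: int, overlap: int) -> list[int]:
    """Start offsets along one axis so tiles of size `tile` cover `length`."""
    if tile >= length:
        return [0]  # a single tile already covers the whole axis
    stride = tile - overlap
    if stride <= 0:
        raise ValueError("overlap must be smaller than the tile size")
    starts = list(range(0, length - tile, stride))
    starts.append(length - tile)  # last tile sits flush with the far edge
    return starts


def tile_boxes(width: int, height: int, tile: int = 1024, overlap: int = 128):
    """Return (left, top, right, bottom) boxes covering a width x height canvas."""
    boxes = []
    for top in tile_starts(height, tile, overlap):
        for left in tile_starts(width, tile, overlap):
            right = min(left + tile, width)
            bottom = min(top + tile, height)
            boxes.append((left, top, right, bottom))
    return boxes


if __name__ == "__main__":
    for box in tile_boxes(2048, 2048):
        print(box)
```

Each box would be processed on its own and the overlapping strips blended afterwards; the blending step is left out of this sketch.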
I managed to fit both, text encoder and sd3.5l into 12gb vram but don't see speed benefits compared to when model was offloading into memory
whichever UI you are using might still be automatically offloading
I am not sure
same was with flux
literally have no reason to not use fp16
however, it is super fast to load and prompt calculation instant too
how about
wow you just made 8k upscaler
no it's 4x the size of the latent
scary numbers
then i reduced that to .50
so thats like 2x the latent
oh I see
yeah 2x is fine
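The arithmetic in this exchange, as a tiny sketch: a 4x upscale followed by a 0.50 rescale nets out to 2x per side. The 1024 x 1024 starting size and the helper name `final_size` are assumptions for illustration, and the factors are treated as per-side (linear) scales.

```python
def final_size(width: int, height: int, upscale: float, rescale: float):
    """Apply an upscale factor and then a rescale factor to each side."""
    factor = upscale * rescale  # 4 * 0.5 = 2
    return round(width * factor), round(height * factor)


if __name__ == "__main__":
    print(final_size(1024, 1024, upscale=4, rescale=0.5))
```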
I thought you were joking by making low quality 8k lol
I'd recommend lanczos for up and area for down
or bicubic if the same node will sometimes be up or down
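The recommendation above, written out as a small selection rule. The function name `pick_resample_filter` and the `shared_node` flag are hypothetical; the returned strings are just labels for the three filters mentioned, not options read from any specific UI.

```python
def pick_resample_filter(scale: float, shared_node: bool = False) -> str:
    """Choose a resampling filter name from the scale factor.

    scale > 1 means upscaling, scale < 1 means downscaling.
    shared_node=True models a node that may be used in either direction.
    """
    if scale <= 0:
        raise ValueError("scale must be positive")
    if shared_node:
        return "bicubic"  # compromise when direction is not known in advance
    return "lanczos" if scale >= 1 else "area"


if __name__ == "__main__":
    print(pick_resample_filter(2.0))
    print(pick_resample_filter(0.5))
    print(pick_resample_filter(0.5, shared_node=True))
```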
default size any of these before upscale
What will stability do after sd3.5?
live happily ever after
what i have loaded into comfyui ... the model is 16gb, the clips are 6gb+ so that's about 22gb but unlike flux i don't feel my system is carrying a lot of weight... and it takes me roughly 10 seconds to render images with sd3.5
i heard the plan is to use the massive windfall to make a hostile takeover bid for microsoft
yeah, it's a much, much smoother experience than flux
how is that possible ... i like it btw
yeah, i'm liking sd3.5L quite a bit so far, i think this is a really promising base model
can't wait to get going on some training projects
and i'm really happy they released it
sai's 8b was the model i've been hoping all year would be the one we could get our hands on
i anticipated the model would stress the system memory but to my surprise it's so smooth i don't feel a thing
im feeling this inclination to put flux aside
it's like how in the winter, if your water heater craps out, the best thing to do before the inevitable ice cold shower is to go roll around in the snow in your underwear for a while, cuz then that shower will feel nice and toasty
Not me. I don't play favorites. I will happily use them both.
Use ALL the tools!
there are a few reasons i say that, flux is bad at full body images, and also anime with flux is not the same as sd3.5, and the big factor of memory management that sd3.5 is handling pretty well
you also cant use flux for nsfw even for aesthetic styles which you can with sd3.5
I hear you. Except since I don't think I have ever produced an anime image... It is not a terribly big factor
Flux has a very narrow range, yes, but its a good tool for what its good at. Dont toss out a paring knife cause its not a chefs knife
My thoughts exactly.
yeah, good thinking, but sd3.5 feels like a nice change from heavy memory
Even windows paint has a use or two
I was able to produce a great new logo for my YT channel with Flux, and am infinitely grateful. And don't think I did not try them ALL out there. Commercial and open alike
It is :) learn it.
yep... got three trainers on my system here, cuz why not? none of them can do everything that the others can do
Link to channel?
Laugh all you want, but I use it quite often to add a nice red outline square or circle to my tutorial articles
My comment was serious
Same
Hi, I'm Albert, and in this channel I will cover chess in all its flavors and variety. I have been a player for over 30 years, and a chess writer with thousands of published articles to my name.
I am now returning to active play. So you will find a series called Study Chess with Me, in which I invite you to join me on the journey to share and ...
Oh i need to watch you, im lousy at chess
Playing chess?
writing about it, photographing it, and teaching it, yes
way cool
does stable fast 3d model go here as well?
This guy already planning another dish
https://huggingface.co/SG161222/RealVis_Large_V1.0
loving the aesthetics of sd3.5L
you should post that in the L3 discord's weekly challenge
Ideogram V2 are firing shots
their new inpainting thing looks like the best inpainting
let them fire :)
weirdly the best open source inpainting control net is on Hunyuan-DiT
I'm gonna try it but it needs unpickling first
i believe if you're on pytorch 2.4.0 or higher you can safely load pickles
depending on how it's done in the code
ah okay will look into this
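For context on why pickled checkpoints need care: a pickle can reference arbitrary importable callables, which then run at load time. `torch.load` has a `weights_only=True` option that restricts what the unpickler may construct, which is presumably what "depending on how it's done in the code" refers to. The stdlib-only sketch below inspects raw pickle bytes for import opcodes without executing them. The helper name `references_globals` is hypothetical, and a real checkpoint usually wraps its pickle inside an archive, so this shows the idea rather than a drop-in scanner.

```python
import pickle
import pickletools

# Opcodes that make the unpickler import a module attribute by name.
GLOBAL_OPCODES = {"GLOBAL", "STACK_GLOBAL"}


def references_globals(data: bytes) -> bool:
    """Return True if the pickle stream asks the loader to import something.

    pickletools.genops only parses the stream; it never executes it.
    """
    return any(op.name in GLOBAL_OPCODES for op, _arg, _pos in pickletools.genops(data))


if __name__ == "__main__":
    plain = pickle.dumps({"weights": [0.1, 0.2, 0.3]})
    by_reference = pickle.dumps(len)  # functions are pickled as importable names
    print(references_globals(plain))
    print(references_globals(by_reference))
```

A stream made only of plain containers and numbers needs no imports; anything that does import names deserves a closer look before loading.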
do i place the carnivorous blobplant into that pot
you can try, just use gloves
how OLED's grow in urban decay
prompt: on the left side of the image is a dark skinned man wearing black jacket. On the right side of the image is a white skinned man wearing tshirt. both men are running away from an explosion that can be seen in the background
and special effects explosion
sd3.5 doesn't take negative prompt or does it?
cause their workflow example sets cfg to 1 and has conditioning zero out
it responds pretty crazily to neg prompts, as in, not well
it does, yes
you can neg prompt just fine with flux but for whatever reason, it's very touchy with sd3
but best practice is ALWAYS don't use them
afaik cfg = 1 means the negative prompt has no effect
that's flux only
there very well may be a way to use it effectively and consistently so, but i haven't found it (yet?)
for SD3.5, i set cfg at 4
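Why cfg = 1 would cancel the negative prompt, as a minimal sketch of the standard classifier-free guidance mix: the output is the negative-prompt prediction plus `scale` times the difference between the positive and negative predictions. At scale 1 the negative term drops out. The plain Python lists stand in for model predictions, and whether a given UI or model applies exactly this formula is an assumption here.

```python
def cfg_mix(cond, uncond, scale):
    """Standard classifier-free guidance: uncond + scale * (cond - uncond)."""
    return [u + scale * (c - u) for c, u in zip(cond, uncond)]


if __name__ == "__main__":
    positive = [1.0, 2.0]      # prediction for the positive prompt
    negative_a = [0.5, -3.0]   # prediction for one negative prompt
    negative_b = [9.0, 9.0]    # prediction for a very different negative prompt

    # At scale 1 both negatives give the same result: the positive prediction.
    print(cfg_mix(positive, negative_a, 1.0))
    print(cfg_mix(positive, negative_b, 1.0))

    # At scale 4 the negative prediction changes the result.
    print(cfg_mix(positive, negative_a, 4.0))
    print(cfg_mix(positive, negative_b, 4.0))
```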
how many steps / seconds per image you guys getting on the turbo model ?
8 steps i think?
so for now i'm just leaving it blank mostly
4 steps as you can see in my sampler
i have yet to find anyone that uses negative prompts correctly anyway
on my rtx 3060 it takes about 10s / image
it is turbo variant's workflow
that's for large model i think, but large turbo uses cfg 1
where did you find that?
their workflow example on huggingface
most ppl do just dump word salad into theirs, yea
and even those that don't, show by the way they use them that they don't really understand why they exist and what they are for
without negative prompt
try my workflow
yea, prolly
im using the same workflow
except you have some parameter values different
smaller than sdxl
probably. and are you prompting each encoder separately?
no, my prompt goes in one positive txt encoder
ill try in few min
doesn't fp16 require about 32gb of ram and up?
my workflow has the triple encoder node, and you're supposed to prompt each of them to their strengths
so - you're not using the same workflow at all
no. i'm using that and i have 16 gig vram
hmm i thought you converted them into widgets
nope
download the workflow i just posted and open it and look at it
there is no supposed to, but you can
i have it
i was talking about my specific workflow i posted to him
btw this scaled t5x has better result as the author says
probably. i have a better clip l, don't have that t5 though
when you say better clip l, are you using this? https://huggingface.co/zer0int/CLIP-GmP-ViT-L-14/tree/main
his latest, yes
sad
see when im using large turbo its not a unet loader, its a checkpoint ... along with triple clip loader
gguf uses unet loader
i ran the q8 sd3.5 gguf, if you're doing something like that the node needs to be updated
there's a little tiny problem with that. SD3.5 does not use unet - so i'm not sure what that's doing
this anime quality is effortless with sd3.5
and they have good eyes
Sana
not sd1.5 mellow eyes
Link for Sana please?
its only demo but they launched it here https://nvlabs.github.io/Sana/
3.5 is fun :) isn't it?
@bitter hearth try this prompt: by artist "Chris Van Allsburg":cute, big-eyed, flaxen haired by artist "Jasmine Becket-Griffith", by artist "Cyril Rolando"
the Sana paper examples are a bit better than the demo
I can't get the demo working at 4k, only 2k at most
but the paper has 4k examples
because the demo is 1024px demo
ah okay that makes sense
oh okay yes i tried to run the same 18 gguf
Sana Demo
but where vae?
this is nice
baked into the model?
Sana can be decent for quite a lot of prompts
you'd have to load the vae node separately
i mean how to load from node
okay thanks
OmniGen just dropped! https://huggingface.co/Shitao/OmniGen-v1
@fossil pagoda
double click and type in load vae
same error idk why
Long necks this SD3.5L
yeah there are so many good drops
Yemeni influencer!!!
update the extension, it does work
(OmniGen) Okay, less impressive. Failed the text and the feather in his helmet. Still haven't tested physics or long prompts.
This is an anime drawing of an armored knight holding a sword. The knight has a large red feather in his helmet. He is riding on a giant chicken. In the background, a castle is exploding in a large fireball. The smoke rising from the exploding castle spells out a word in the sky, and it says: "OmniGen".
This year has been really productive, but unfortunately, no one has managed to catch up with DALL-E 3, which is a real shame. I'm still amazed at how OpenAI achieved such results and why others can't do the same
model there has "1024" label
how to update? in node manager i dont see any updates for Unet loader gguf
yeah but their 4k images in the paper were done with the Sana-1.6B1024px model
its probably not optimised inference anyway so we will find out when it drops
check installed extensions, press "try update" and restart
a bit weird things, but I am looking forward to the weights
however, not sure if they are going to release this week due to sd3.5L and M
๐ค
:)
there is no update button so ill just reinstall
I haven't tried Dall-E. Can you give me examples where it beats Flux? I know the leaderboards all have Flux ahead of it.
yea, dalle is unbelievable
you can use it free https://www.bing.com/images/create
it's what bing uses
it follows prompt better, quality worse, especially now with crazy quantization
it's also heavily censored
it wasn't back then, but people were generating crazy stuff, so it is now
it is just 4 steps?
it's from openAI, it's always been heavily censored, and got more so once microsoft got involved
Normally, but I used 5 or 6 for these
forgot to write prompt?
Not always
well, thats fine too, there is also merge between both models on civit
that is negative
i've used it since day one. always
i was getting great results with exactly that
that's not what was in your screen shot.
just changed
oh okay
Yes, I made my own with Flux Dev/Schnell and some Loras that I always use. That cuts down on VRAM usage and I get Dev output with Schnell number of steps
are you using the same VAE he is?
Trying to install OmniGen locally... Failed again.
that's better. take your steps up to 8 and do NOT leave those other two encoders blank
I think they heavily used synthetic data, like automatically generated cgi with perfect captions, so that is why it is so smart
you need to give each encoder a prompt - t5xxl gets the detail rich prompt. clip_g gets the shorter, black and white description, and clip_l gets the ambient, background, artsy, fine details
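The per-encoder split described above, as a small sketch. The function name `build_encoder_prompts` and the example prompt strings are hypothetical; only the three encoder names and their roles come from the message. The blank check mirrors the advice not to leave any encoder empty.

```python
def build_encoder_prompts(detailed: str, short: str, ambience: str) -> dict:
    """Map one prompt to each text encoder, refusing blank entries."""
    prompts = {
        "t5xxl": detailed,   # the detail-rich prompt
        "clip_g": short,     # the shorter, plain description
        "clip_l": ambience,  # ambient, background, artsy, fine details
    }
    blank = [name for name, text in prompts.items() if not text.strip()]
    if blank:
        raise ValueError("blank prompt for: " + ", ".join(blank))
    return prompts


if __name__ == "__main__":
    prompts = build_encoder_prompts(
        detailed="An armored knight riding a giant chicken in front of an exploding castle.",
        short="knight on a giant chicken, castle explosion",
        ambience="anime drawing, dramatic smoke, warm firelight",
    )
    for encoder, text in prompts.items():
        print(encoder, "->", text)
```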