#🆕|sd3
Sweet! Thanks!
Some of these loras have really funky effects on the models lol
Notice the white line on the right edge of the image. I've seen this with a few of them. Totally ruins the models. I dunno what they used to train on.
I think I identified the lora. You can see the effect in some of the example images towards the bottom.
https://civitai.com/models/661699/style-of-h-r-giger-flux-295
Super Flux
SUPER FLUX is an easy method to greatly improve Flux details. It will also save you a lot of time by making test rendering much faster with only 10 steps.
Get my Workflow here: https://www.patreon.com/posts/is-super-flux-to-114327248
An Asian female star,with artistic texture,a beam of sunlight slanted on the body, --ar 9:16
Sana stands out in the realm of image generation models for its exceptional power and speed, particularly when compared to other models like Flux. Despite being much smaller, Sana-0.6B generates images significantly faster, offering an impressive 100× speed increase over larger models like Flux-12B, while maintaining competitive image quality.
...
SANA has grown out of PiXart Sigma 🙂
What is that? Is it even released?
just another t2i model. i doubt it's a flux killer, but it's a fun research model to tinker with. Will probably come out within 2 weeks
ah
gotcha, thanks
1.6b and .6b, definitely not flux killers but I dig that people have options for lower vram.
how much longer until SD3.5? Been waiting for it as Flux gives me the runs.
Yeah it does not beat flux dev but it’s incredibly small and fast while being really good for its size. There are 2, 0.6b and 1.6b params.
Both models have great prompt following, great text rendering, and decent image quality according to benchmarks in the paper. Humans might have issues tho.
as long as it takes. you could just use SDXL for now, it's good and i don't think it'll give you the runs
using the fp16 GGUF model
Here we go again..
Give them at least 3 weeks this time to sort things out
what's she holding?
Holding the looking at the camera prompt 
Or away from
Cx
Spy Camera
I train and no one downloads XL trainings. ratio is 30 to 1 flux to XL. I don't need waifus.
Yes, here we go again as they are late so something is holding them up since they announced it, or did they announce it knowing they are months away?
sure they do, they're still downloading 1.5 trainings. is the problem that you want something for your use? or to sell to a fickle customer base?
Sana did not come out yet, will come out later.
ah
sell? LOL, I wish. I am not here to praise, nor bury SAI, only to be glad someone out here still wants to see a working model from them.
my hope is they don't screw it up again as they did with SD3 AND that it is not like Flux. Make it more like SDXL quality, ease of training styles in, and speed with the prompt adherence level of t5. If they can't do those there is always flux.
my hope is that whatever is released, we don't have another raid on this discord by elements that are here to just stir up trouble and see if they can kill the company. we'll see
That happened? Probably did knowing the base I have had to deal with. I left discords over it as so many became beguiled with Flux to the level of madness. The AI community is pretty sick and I started to see that with Pony.
I like to run Flux, or Hunyuan, into SDXL so XL can clean it up while they provide the adherence.
that happened
try using sd3-2b-medium as a refiner for sdxl
I haven't touched sd3 since it first came out. With the quality issues in the gens as well as the deformed people, I deleted it and went right back to XL. Now, unhappily, on Flux. Like flux for text and conforming adherence but XL has the better quality.
plus the speed is horrid in Flux
here's a little known fact - because SD3-2b-medium and flux use the same neural net, and the same code - flux has the same issues as sd3-2b-medium. now ask yourself just what did black forest do to mask those and why is flux so inflexible?
yes, I want to know the first and I think the last one is due to dev being 85% of pro and distillation.
Schnell is even worse at 70% I think it was
i put the time in to learn both models - SD3-2b-medium is a fantastic model, but you do have to learn it from the ground up
I don't need to do time as a good model just works. Out of the box (never tried to train it as back then nothing yet could) it was bad when XL was next level in quality over it. Adherence SD3 was better but flux, or hunyuan, is much much better than SD3 for that. LyCoris said it is due to the t5 used for sd3 and hunyuan but I have not spoken with him about flux. Being flux is 12B t5 it would be why. btw, that t5 is also a curse as the more it knows the more inflexible it seems to be.
you do need to learn how the AI thinks. remember - both flux and SD3-2b-medium do NOT use unet, and thus everything you learned for previous models isn't going to work the same way
oh, and flux has major issues (out of the box even) with concept bleeding
oh flux has far more issues than that
a s tonne, I know
btw, BFL < SAI. All the bad devs I cursed at SAI seem to be BFL now. just an fyi
bfl are the original Eleuther researchers. nothing wrong with any of them
if you say so, but I have my opinion of them. A trainer dev asked them about how flux is just not trainable as it should be and they said they have no want to make it more trainable. Pro is why.
it isn't trainable and they can't make it trainable. it's frozen
and pro is not why
well, they haven't even released the shift value used while SAI was open about it
they can't.
yet SAI could? why not?
that's something i can't talk about. but they can't release that, or much of anything else. be happy flux makes you pretty pictures if you stay in its narrow range
As I said I prefer XL over Flux for pretty pictures but the adherence goes to Flux hands down.
i walked through flux's latent space, one token at a time, after it released. for a full month, 12 to 16 hours a day. it has an incredibly heavy bias toward: women, dogs, fantasy, and anime cat girls. if you give it a term that has an unclear subject, instead of getting random subjects that depict the term, you'll get one of those four things, extremely well drawn, and frequently a mixture of them. and even in a lot of cases where the subject is clear, out of a run of 4, three will be women or dogs
so of course it does pretty pictures
it has no choice
Flux is heavily laden with big eye Anime.
anime cat girls
probably, but I never prompted for that
even in training it everything wants to turn into anime and I do realism
you don't have to prompt for it, it's so heavily biased toward it, you're gonna get it anyway
Okay all!!! This is fantastic. It's a new FLUX LoRA that makes it Turbo. 8 steps for a fantastic image. It's Alpha, but totally worth testing. https://huggingface.co/alimama-creative/FLUX.1-Turbo-Alpha/commit/b2db8dcbd15fb095cffd8ab530499e47883466e7
pass
here's something for you to try. pull up flux dev, put in just the word pail for the prompt, no negative prompt, generate 10 times, see what you get
I do cfg 1 so can't negative prompt it
okay good.
It's really an amazing LoRA. I just found this earlier but I put it through the paces for a few hours.
read this channel's name. sd3. go post that in general chat
pail you say?
I asked CS1o where I should post FLUX info and was told that sd3 was sort of combined with FLUX and to post here. But alright.
just the word, pail
you're gonna get a lot more people seeing the link in general than in here.
oh, that ugly 16x16 grid?
?
Flux.1 Dev
that's actually a cool texture effect for certain situations
did you use the same word for both encoders?
oh, I only use t5 the L I leave blank
run it again and this time put pail in both of them
run it 10 times, and see if you get anything that's got french fries
with my loha
so weird
not yet, but a lot of nice pails
none yet
mostly just empty pails
but you are getting pails. it's a concrete, solid, subject. but what if you give it a term that has no clear subject? switch the word pail for both encoders with Hand-drawn and see what you get - as it's an unclear subject you should get a large range of images from landscapes to who knows what - run it 10 times too
The pail master
First seed
I told it something for a prompt to test my training and it was not what I expected. I have noticed flux falls into holes.
go try the next test. use Hand-drawn for both encoders. just that term. 10 times
if you expect it to think outside its box it will do random weirdness to off the wall stuff
second seed
Hand-drawn so it is asking hand-drawn what. fill in the blank. It pops a fuse
maybe. run it 10 times, post the images
so you got - dogs - 1 that's a dog and 2 that are dog-ish/fantasy creatures - and one anime cat girl
and that gives you a glimpse of what was done to mask the issues flux has and why it's not trainable
wdym?
and why BFL can't release training code
just a shift value as we are currently using SD3's of 3.1518 I think it is
you should have gotten quite a range of subjects, all in a hand-drawn style. you got dogs, women, fantasy, anime cat girls. with one dog being just a dog, and the others a mixture of dog+fantasy creature
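A quick way to put a number on that kind of run is to hand-label each image and tally the shares. This is only a minimal sketch: the labels below are my own reading of the four images described above (one dog, two dog/fantasy mixes, one anime cat girl), and `subject_shares` is a hypothetical helper, not part of any tool mentioned here.

```python
from collections import Counter

def subject_shares(labels):
    """Return the fraction of runs that landed on each hand-assigned subject label."""
    if not labels:
        raise ValueError("need at least one labelled image")
    counts = Counter(labels)
    total = len(labels)
    return {subject: n / total for subject, n in counts.items()}

# Hand-assigned labels for the four posted images (an assumption, not model output).
labels = ["dog", "dog/fantasy", "dog/fantasy", "anime cat girl"]
shares = subject_shares(labels)

for subject, share in sorted(shares.items(), key=lambda item: -item[1]):
    print(f"{subject}: {share:.0%}")
```

With only a handful of images this says very little; running the same prompt over many more seeds would be needed before reading anything into the shares.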
remember that flux and SD3-2b-medium ARE THE SAME THING
except after robin left SAI, he did things to flux, before releasing it
10 as 6 I could not show
now try using hand-drawn as the first term of a longer prompt
for l and t5 again?
for testing stuff, yes, give all the encoders either the exact same thing, or prompts crafted for their strengths. don't leave one blank unless you want to fluff the result with random dice rolls
this one is cool
now try it with your lora
WOW
interesting lora you made
a different training
try just the term Javelin with both encoders
here is a different training I was working with and this one is really nice
What was I to expect?
when testing something, you shouldn't expect anything. those came out nice, except it looks like you got a dead carcass in that top one
javelin
is that with or without your lora?
without
what do you get, with?
I meant, what were you trying to prove with your test... I am not sure I understand what my results mean... as far as your point of "flux is x, y and z, etc."
what did you train that lora on, anyway?
death, destruction, corpses
oh, take a look at the content. is it all over the place? or is the range of random subjects fairly narrow
seems to have given you that - don't see any javelins though
probably on a dead knight somewhere, lol.
hmmm... I guess it is related? I am not that intuitive 😛
With my LoRA on, no trigger
see what you get with your lora and just the term Harmonic
Same prompt (hand-drawn) with my LoRA, new seed
okay, I was right these loras are overtrained but I was testing the trainer. here is 0.5 weight (the above is 0.8 where I normally run them)
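For context on what that weight number does: in the usual LoRA formulation the low-rank update is scaled before being added to the base weights, so 0.5 applies a smaller share of the learned change than 0.8. Here is a tiny sketch with made-up 2x2 numbers. How a particular UI maps its weight slider onto this scale is an assumption on my part, and `apply_lora` is a hypothetical helper.

```python
def matmul(a, b):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def apply_lora(base, lora_b, lora_a, weight):
    """Return base + weight * (lora_b @ lora_a), the standard scaled LoRA update."""
    delta = matmul(lora_b, lora_a)
    return [
        [w + weight * d for w, d in zip(w_row, d_row)]
        for w_row, d_row in zip(base, delta)
    ]

# Toy values for illustration only (rank-1 update on a 2x2 weight matrix).
base = [[1.0, 0.0], [0.0, 1.0]]
lora_b = [[1.0], [2.0]]  # 2x1
lora_a = [[1.0, 1.0]]    # 1x2

print(apply_lora(base, lora_b, lora_a, 0.5))
print(apply_lora(base, lora_b, lora_a, 0.8))
```

At weight 0 the base weights come back unchanged, which is why lowering the weight can tame an overtrained LoRA.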
now that's really nice
that's a very strange gun he has
neither did he, apparently. he's not likely to survive shooting that
My LoRA, and "harmonic"
LOL
sort of a drone-ish thing
try the short phrase Harmonic scale
I guess... not sure how to interpret. My training was heavily specific to a topic of spy cameras and spy tech and a 1950s theme with a muted color palette...
you got wires at least
here is with my lora at 0.8 and I did not use the activation word.
if you don't use the activation word, it's not supposed to use the data in the lora
With and without the LoRA enabled (prompt: harmonic scale)
the second one looks like soundwaves - not sure what the first is an attempt to draw
here's a longer prompt with a very unclear subject: delicate cosmic quilling, points on a curve, waves, ripples, crescent wings, epic maximalistic nebula
it does, even on XL, but at a massively reduced rate. calling it is akin to 10x weight, or higher just depends. I have one lora I released that did have an activation word but I never released the word because it made the effect too heavy even at 0.1 (for XL).
Trigger word added
why did sd3 make Cronenbergs while flux normally doesn't?
not sure what you're asking
Rick and Morty reference
black forest did things to flux after they left SAI - part of which is the so-called distilling they did
With and without LoRA, same seed
which is why I am having all kinds of issues training it for styles
because flux is not trainable.
which one do you like better?
disagree, as people are training it for likenesses with ease. concepts bleed too much and styles are hit or miss
they are creating small models that do provide some sort of image generation direction. but that isn't the same thing as actually being able to train a lora to update the weights of the model
how are they getting the various ages of Cheech and Chong to perfection?
They are creating an object that is used during the generation process with information for that specific thing. it's not updating flux's weights however, it's just providing data for that specific thing
YOU are trying to train a style, YOU need to actually update weights
and YOU are hitting walls
big time. though it did help to unfreeze the t5 it led to massive bleeding
especially background stuff
you are hitting what are essentially titanium walls. flux is not trainable. it's frozen. and most of the loras out there don't do anything that a well crafted prompt for flux wouldn't do
try this prompt: Three bubbles with smoke in different colors, by artist "Yoshimasa studio lighting"; ray tracing global illumination, octane rendering, 8k resolution, Unreal Engine 5, hyperrealistic photography, complex details, minimalistic, back lighting
why did they lock it down like that so we can't really train it? Now I know you are right because I was seeing that what I trained an LLM could create with a prompt. A lot of the people I mentioned earlier who are in a spell over Flux would fight with you over what you said, but I know otherwise.
Flux spell never worked with me, and I lost a lot of people I knew over this as they refused to accept reality.
the inferior quality they were willing to accept was the final nail
one trainer I work with remains in my camp but they have been fighting everyone and lost more people due to this Flux miasma
that's something you'll have to ask black forest
I have my suspicion called potential lawsuits. I was told by an insider about these devs at BFL as they were the ones complaining about the money flow, etc... SAI. This is why I suspect pro for cash and to prevent potential lawsuits while the tech flounders in general.
the code is open source. lawsuits are not an issue
it has everything to do with the core issues of the neural network they decided to use
well, even SAI was being sued and even if you win the resources to fight sucks.
you win but still lost. Courts
the black forest team developed the code. they'd have to sue themselves
it's open source. everyone has the right to use it
No, I mean artists, etc... suing as they did to SAI, or were about to and why emad rushed 2.0 with less artists in it.
and likenesses. the connections could be trained back in though
sure, and the artists lost because their entire lawsuit hung on something that wasn't true. however, the only people that could sue black forest over flux would be... black forest
and that would be silly
this is all unsettled territory and until settled China wins as they don't give one pee pee about IP, or likenesses, or much of anything.
please understand something - the BFL researchers created the code. it's their design, their work, and if anyone has a right to it, it's them.
but it's open source so no one can take anyone to court for using or modifying it
open source is not uncharted territory
you are not grasping what it is I am saying
i am. lawsuits are not possible. only a stupid lawyer would take the case and a judge would toss it out within the first 5 minutes
SD is open source as well but SAI can get sued for scraping, or having actor likenesses in it, etc...
you can sue anyone, for anything - but it's expensive so most people don't do that unless they're pretty sure they can win
this is why they made that opt out deal
With...
well, if I had elon musk money, or Google, or Sony, and went after BFL guess who wins? draw it out for an eternity which is why I hate the legal system as the one with the most money 99.97% of the time wins cause you can't afford to outlast them. Out of court settlements etc...
black forest wins. They. OWN. THE. Code.
they created it. it's their work. they own it. however it's open source so you can't go to court against anyone that uses it
Without LoRA and With my LoRA
they're both not bad - but the smoke's supposed to be different colors
try this prompt: an ornate box, open lid, a spider crawling out of it; Rob Gonsalves; hyper-realistic,hyper-detailed, fantasy, cosmic art; elegant, intricate, detailed, extremely textured, colossal, monstrous.
Big impact LoRA
now that one worked right. the smoke is three different colors and it looks good
Without / With my LoRA
i think i like the larger spider better
Yeah you're right..
these are sd3-2b-medium
comfy workflow is in them if you want it
Blue Future LoRA
Looking at the artisan channels and still not impressed. Really disappointing.
Sana demo: https://ea13ab4f5bd9c74f93.gradio.live/
Sana github: https://github.com/NVlabs/Sana (no code yet, vae out, model out too?)
It's actually pretty nice, not flux level but great for the size. It has great prompt following, surprisingly decent humans (not flux level, around sdxl I guess?), and nice text capability. It's undistilled and supports negative prompts.
left is Sana(looks like photoshop lol), right is flux. Sana got everything right, 4 cow pictures, ufo out, cat on couch. Flux got cat and ufo I guess but not space nor 4 cow pictures
prompt: A highly detailed photo of a white cat sitting on a brown couch in a living room. Behind them, is a window, and 4 cow pictures, one in each corner of the window. Outside the window is mysterious space and a ufo.
Can do woman in grass too
You'll see that Sana isn't actually too great with text, can only do small phrases (like the above). Apart from that, it's actually pretty nice.
WHAT
if this becomes apache 2 this might replace 1.5
its so small and doesn't use a large encoder like T5-XXL
and low step count needed
Uhm? This is what my extremely detailed description of 2 characters produced 😦
demo overloaded 
is it out? What? comfyui ready?
no code yet
but something is happening

yeah i get that feelin
It is 1.6b and 0.6b, so it is more like competition to sdxl, sd3m and sd1.5
probably not a "flux killer" like the sensationalist clickbait-y youtube crowd depicts it but I'm excited. come on youtubers chill out
every day a "sora killer" or some other killer... come on now
yea, they make false expectation about everything
well, all it did for me is make me not take any youtube crap seriously. don't think i've looked at non instructional vids there for years (it's nice if you need to do some weird diy thing you've never done before)
same, but a lot of non technical people have no choice
Remember when we were excited for 512/512 text to image XD
fk that sht! We need 4k now, and not with upscaling either!
the very first midjourney was magic!
and those images look sooooo bad now, but for the longest times i was happy MJ could do 'a scary sunflower with a face"
for me it was sd1.4, and I was shocked
i think it was at the same time, i had that sunflower in mj, and sd couldn't do it 😢
Style: A photograph with a red Cinestill-inspired aesthetic, shot on a Nikon D850 with a shallow depth of field, emphasizing the subject's haunting, android-like features. Highly detailed, with fine textures and intricate details, evoking a sense of unease and intrigue. Scene: A small, battle-scarred mech, reminiscent of H.R. Giger's biomechanical art, with glowing, crimson eyes, holding a vintage Barbie doll in its long, slender fingers. The mech's skin is a deep, iridescent blue, with intricate, cybernetic implants. The Barbie doll, dressed in a flowing, black, Victorian-era-inspired gown, seems eerily out of place amidst the futuristic surroundings. The mech stands in a vast, desolate, post-apocalyptic landscape, with crumbling structures and overgrown vegetation in the background. The sky is a bruised purple, with wisps of smoke and dust swirling in the air.
pixart
yeah sadly
it's just overly aesthetic
there's no image without bokeh unless its like a landscape shot or 2D/painting
I also got just heart but for explicit prompt, maybe they gave specific instruction to text encoder for this demo
Anyone know the comfyui extension that lets you delete a group and all the nodes in it in one shot?
I started a fresh install and I can't find that custom node package 😭😭😭
Thanks 🙏
this model might be good for the community because it's so small. people will be able to train it very easily locally
sdxl is 6.6b parameters with the text encoder. but only 2.3 without it
1.5 is 890 million parameters but with clip it's 1.47b parameters
6.6b is combined with refiner too
yes it's not a flux killer. here is an ideogram vs flux vs Sana prompt "A photo of a fursuit of a fox. The fox is wearing a blue and orange striped shirt and a pair of glasses. The fox is standing on a wooden platform. The background is a forest."
oh
sorry then i was a bit wrong because nobody uses that xD
i think ideogram is still often better than flux
wow, ideogram looks super real
the first one is phenomenal minus the questionable hand situation

yes ideogram is good
ran same prompt with sana and got more adherent results first try, however it is not fursuit
but i think like ideogram is the only model that has good furry stuff in their training data. so the other models just have a dataset problem
maybe my prompt was bad idk
Well we can always try to make a furry lora for flux
training flux is very hard. but a lora is realistic
the cool thing for Sana is that training will be easy
we will be able to make full finetunes locally
with what gpu?
12 vram 32 ram
daim thats good
daim
yea, and it is smarter than sd1.5 / sdxl, however flux is noticeably smarter
curious how 0.6b going to look
yes but flux is like 12 times as big
i think the 1.6 is the only realistic one to use
Nvidia knows their stuff so it must be decent
fair, hope community won't ignore this model

but this one has a chance, as it is fast and light, also encoder is lighter than T5
Rooting for it
I want to try sana with tensorrt and 8step hyper lora, it might be 3x speedup
yes and it's good that it's not using t5. t5 is old. models should use modern llms
this model is already very fast. i think the demo is just slow because a ton of people are using it
this demo just is unbelievably slow, can't Nvidia allocate some more gpus for this super fast model?
but i think the really next big thing is this one, though it might be too hard to finetune because it's 3.8b parameters: https://arxiv.org/pdf/2409.11340 it's based on the phi mini llm. it's a gpt style image model. it can do some very cool stuff
they can but are not lol
it can do stuff like this
Oh yeaa, I am very excited about this
btw i found a demo of it on hugging face
they should release model this month, according to their comment
gimme it 
but it's quite slow. be careful, i think you can only run it like once or twice https://huggingface.co/spaces/Shitao/OmniGen
and put the right image name tokens into the prompt like described in the space
Draw a vampire (prompt written in Chinese)
draw a vampire (prompt written in German)
if you look at the bottom of the image you see that most of the time waiting is just being in a queue
for me it is just "processing" and quite long
Thanks btw!
I ran out of quota, did not even test with references 💀
but that demo is great sign, seems like model is at final stage
i said that you can only test it once or twice 💀
yes!
imagine if they make it work with llama.cpp
that would be huge
I missed it 
I am not into llm much, what would it bring?
llama.cpp is a very optimised way of running llms. it can run on the cpu but it's still very fast
so it would work like on every device even if it does not have a nvidia gpu
wow Sana can follow very long prompts but i don't like the style. i asked ChatGPT to describe the furry picture from earlier and used that as a prompt: "The image shows a person dressed in a detailed fox fursuit. The fursona has bright orange fur on the head, arms, and body, with white accents on the snout, chest, and tips of the ears and paws. The character has black fur on the edges of the ears, as well as large, exaggerated paw pads in black on the palms. The fursuit features large, expressive eyes covered by a pair of blue, semi-transparent glasses, giving the fox a playful and intellectual appearance. The character is wearing a casual blue-and-orange striped long-sleeve shirt, matching the fur's orange color. The fox has a cheerful, welcoming expression, and its paws are raised toward the camera, as if greeting the viewer with an enthusiastic wave. The background appears to be an outdoor forested area, with many tall trees visible, though they are slightly blurred. The image is set on what looks like a wooden bridge or platform, and the overall mood is lighthearted and friendly."
this model in particular would be great on cpu, especially with low step lora
I wonder why most models have so little knowledge about fursuits?
Maybe majority of images are amateur photos from various contests or meeting, so they are being filtered out due to not meeting aesthetic score?
Just assumption...
I finally made Meissonic 1b work but it is not much faster than SDXL. 13-15s for 1024px with 20steps, however it needs more steps for good quality
Pirate influencer
Draw a vampire (prompt written in Chinese)
Can I get SANA for ComfyUI yet?
wow awesome
Awesome
eh prompt adherence def isn't sota. gotta go work. i'll play later
can use as a second pass over auraflow
nothing to see here is my first impression, i'll give it a chance later
GANG GANG
wooo!
the taste of vindication is so sweet
"all good things come to those who wait"
what did we wait for then
Not blown away yet...need to play with it a bit more
What are min requirements to run the new 3.5?
Look like 12GB is a safe bet
Ok, tyvm 🙂 Hopefully i will get that one day xD
I love the tongue in cheek of choosing girl laying in grass in the announcement post
so 12GB is for that 3.5 medium model?
I didn't catch that. Well spotted!
12GB is for Large
SD3.5 Large
oh, that is much better than anticipated
Bah, we know that with Comfy it won't be stuck to that
How is it for text and human anatomy. I don't mean clothesless, just not deformed
Regardless, super pumped
Flux had become the defacto monopoly on open models by sheer size and quality. Now there is competition
this model is perfect its everything I was waiting for personally
Is it supported by comfy? Meaning do we need to wait for any special modifications or will it use the standard SD3 nodes?
comfy discord announced day 1 support
you might have to do an update but it will work
It's killing my 4090 right now, hehe
its running on my 4070 12gb
Wide angle shot of a beautiful women holding a sign that reads "SD3.5 Large"
should use hugging face workflow or is there better
I like weird shit like this better than when it outputs "good" art
SD3.5 anime
[SD3] Nice tires, dude
sd3.5 confirmed good? 😮
yeah look at that red car
photographic quality as good as the best flux image there
It needs more testing but it has the potential to be better than FLUX as it already beat FLUX in some aspects even when broken. And it is more trainable.
For the smaller VRAM user, SD3.5 Medium drops 29 October
I'm not getting anything close to what i get from flux
i mean SD3 was already better at photorealism than flux, so it is just a matter of anatomy+consistency
Stable Diffusion 3.5 Large Fine-tuning Tutorial from SAI :
https://stabilityai.notion.site/Stable-Diffusion-3-5-Large-Fine-tuning-Tutorial-11a61cdcd1968027a15bdbd7c40be8c6
sd3.5 failed the horse-fish hybrid test
how do you pass the horse-fish hybrid test though
How much vram is needed per tier
the image must have the horse head and the body of a fish
sosig
wow what is the prompt
selfie photo of 18yo woman at the club, red head, pale skin
weird anatomy
someone try this:
"in the ocean, there is a fish creature whose is the head of a cat and the body of a fish"
wait i did it
nice
NOO i exceeded my quota
Just type catfish 😆
lmao came here to say that, that's great
thats a different thing..
Now make her laying in grass
((
ban this guy
xD
😼
Ok, so far so good
sorry what are the correct cfg / step values again?
Finally Stability AI released Stable Diffusion 3.5 Large! It's a great model but not for the reasons you would like.
Some of the workflows in this video: https://f.latent.vision/download/sd35L.zip
00:00 Intro
01:27 Default workflow
03:55 Styles
05:18 Text encoder blocks
06:23 Model blocks
09:59 Prompt comprehension
12:27 The bad
13:42 My 2c
15...
selfie in grass
Try full body shot in grass, curious about her limbs
thats what i did but i also put selfie in it so it kind of made it large FOV to get as much of the body into it as possible which is kinda cool
Anyone know if we still need this node for 3.5?
yeah
Pretty good
yes it is in the example workflow
with more steps (im doing 20) it might fix that hand too xD
Maybe "full body starfish pose in grass" lol
no i dont see it in the example workflow
is this 3.5 or 3.0?
I just tried gening 2 images, same seed, one with that node one without. No difference. It doesn't seem needed
then you dont have the right one
IT CAN DO PAINTINGS
its so good
3.5
the grass look smooth af
or you have the wrong one
rip flux
naw competition is good and i am happy that both exist
I wouldn't count the chickens yet
in terms of license yes but uhh the dataset uhhh
even from side it aint too bad
I still prefer flux for that (This is Flux not SD 3.5)
16 GB VRAM for Fp16, 8 GB VRAM for fp8. Not bad, not bad.
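Those numbers line up with simple weight-size arithmetic. A rough sketch, assuming SD3.5 Large is about 8B parameters (the "8B vs 12B" figure mentioned in this channel) and counting weights only, so the text encoders, VAE, and activations are not included. The function name and the 1 GB = 1e9 bytes convention are my own choices for illustration.

```python
# Bytes needed to store one parameter at each precision.
BYTES_PER_PARAM = {"fp16": 2, "fp8": 1}

def weight_vram_gb(params_billion, dtype):
    """Approximate weight-only footprint in GB: billions of params times bytes per param."""
    return params_billion * BYTES_PER_PARAM[dtype]

for dtype in ("fp16", "fp8"):
    print(dtype, weight_vram_gb(8, dtype), "GB")
```

Real usage will be higher once the text encoders and activations are loaded, so treat this as a floor rather than a requirement.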
Haha so good
lets not prompt for starfish xD
works smooth on 4060ti 16gb, idk about settings tho
No more starfish, that'll give me nightmares
these are mine
ok let me try with some more elaborate prompts
hahah the eyes
cool
idk how they messed it up cause the large model could actually do laying down women apparently
How's SD3.5 compare to midjourney?
eh not as good 😛
I bet this is the only thing people will post on subreddit for a week straight
trust me
yes
it's advised
sd3.5 not that bad, i suppose?
if it proves to be good at paintings its gonna be my secondary model that's for sure
models usually have trouble with upside down and sideways faces, so they're overfitted on not doing them. SD3.5 will attempt to do them and mostly succeed
Incredible, but the bodies shown so far are a little worrisome
flux literally cannot make paintings without a lora or very low guidance scale
Always going to try to make the good boi UniPC work (34seconds on 3060 12g vram)
never used sd3 for bodies either xD
I think it'll come down to how easy it is to finetune, as its main competition Flux has been notoriously difficult in that regard.
8B vs 12B
yupe
these giant models are very hard to train
sana is smart and very small
its the first decent yet SMALL DiT
it might overshadow 1.5 if finetuning proves to be good for it
but by default its no better than pixart (Sana)
what ahhaha
Sana, have you not seen it?
anime
please look at the sana images again they look melted
even worse than pixart?
what prompt
~~aesthetic~~ anime 1girl, detailed eyes, barefoot, feet focus, sitting, kitsune facepaint, pale skin, bikini, white long hair, red eyes, white fox tail, ninetailed fox
3.5 large model
sana is not good with details
it did look smooth/artifacted that's for sure
nevermind me, just limit testing sd3.5
Ma'am, I believe your head is on backwards.
Oh it looks quite good. Is 3x faster than flux dev for me and it also seems to be capable of anatomy and some nsfw from the get go
someone wrote on the subreddit
wo this looks cool
this was the prompt "A cursed, low-quality image inspired by disturbing found footage, featuring grainy textures and corrupted digital artifacts. The scene has a dark, eerie atmosphere, with unnatural, warped figures barely visible through static and glitchy interference. The clouds in the background are shadowy and fragmented, adding to the unsettling feeling. The text "Embrace the chaos inside your mind; it makes the clouds blush" is overlaid in a broken, distorted font, blending into the eerie and glitchy visuals. The overall effect is strange, off-putting, and deeply unsettling"
normal 3.5 Large is about the same speed as Flux Dev for me, haven't tried the turbo yet though
The seconds per iterations hurts so much xD
wait until people start finetuning it. There are no limits. In one of my "tests" I tuned this for 120k steps at batch size 8 on 100k images and it didn't break
pretty high LR too
I'm looking forward to some nice fine tunes for sure
cool
the aesthetics of this image are insane from such a simple danbooru tagged prompt
this is for you. Free and commercially available, easy to tune. No DPO (so hands might be wonky at times), but we might release a dpo tune in the future as a lora plugin, if the community doesn't make it first
DPO would have made this harder to tune
and also restrict the vector
limiting range and styles
It is perfect lycon!
I noticed that it said first party control nets are coming
that sounds like great news because control nets are so expensive to train
cool swords ❤️
1 prompt 1 image 1 pass
the grass is good its very sharp
its good at anime :§
i forget, is there a preferred sampler/scheduler, steps etc
will do fine with DPM++ 2m, DEIS or UniPC, SGM uniform with shift
she can kill me any day
Very nice one, that's from a single prompt with no inpainting?
wow cool that both styles work
new elden ring boss 😮
some framing artifacts on 1344x768, but damn is so good
yes, prompt by @fluid pecan
im busy now, will try sd 3.5 later, looks good so far
i like this one
i also hope they release Sana soon, looks interesting
I'm excited about Sana yeah, it looked better than SD 1.5 base and is fast
same prompt better image 🙂 was it done on turbo?
no, regular
I just replaced Elsa with Rapunzel in your prompt
dpmpp_2m?
do you guys have a workflow i could use with comfyui for the new thing?
A bit of style bleeding on the hair but at this point it's just cherry picking, I don't think I've seen flux get a similar concept either
Look at this
this is insane i love it
perfect transition
Three-kneed beauty! "Mrs Jake-the-Pake!!!"
that looks even better than latent couple / attention couple
can I steal it?
you must
sd3.5 is better at grasping concepts than flux is, it seems
it did the ethereal swords, and cracked face better than i got it ever running with flux
not to mention, the amethyst actually looks like amethyst now
That's super cool. Can you share the prompt?
At this point I gotta ask, what's the holdup with the hands compared to models like Flux? Is it the smaller size or is it the base model being undertrained?
a scene that blends photo realistic and anime elements. A photo realistic man is taking a selfie with an anime woman. The scene is in a cafe, early in the warm spring morning. The photo is candid and dreamy. Everything is realistic except for the girl. The mix of reality and the anime girl makes the scene surreal but happy
wow they really released sd 3.5 large, i thought that this day might never come
was worth the wait
Its good - but I couldn't tell 3.5 or Flux?
Yes, we've become completely spoiled. Back in the day, if you didn't get psychedelic stuff, you were already happy) but now - "the skin texture isn't worked out enough" 🙂
people would pay money for this!
Flux with my LoRA
ah yeah the hidden camera on the right LOL
There are loras already for SD 3.5 large made by Shakker-Labs
im sad, was thinking the turbo would have been faster so imma use the large instead
classic 80's wizard multiclassing action!
Is there a list of compatible samplers for 3.5?
Sd3.5 large 🙂
Wow!
I think they got early access because they released the loras two hours ago, they couldn't have trained an entire lora in an hour
golder sword
acceleration loras like turbo are never really "faster" per step yeah they just need less steps
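To put rough numbers on that point, here is a minimal sketch. It assumes CFG above 1 costs about two model calls per step (conditional plus unconditional) and that turbo-style models run at CFG 1. The function and the timings are made up for illustration, not measurements.

```python
def total_seconds(steps: int, seconds_per_call: float, cfg_enabled: bool) -> float:
    """Estimate sampling time as steps x model calls per step x time per call."""
    # Assumption: CFG > 1 evaluates the model twice per step.
    calls_per_step = 2 if cfg_enabled else 1
    return steps * calls_per_step * seconds_per_call


# Hypothetical timing: the same 2.0 s per model call in both cases.
regular = total_seconds(steps=40, seconds_per_call=2.0, cfg_enabled=True)
turbo = total_seconds(steps=4, seconds_per_call=2.0, cfg_enabled=False)

print(f"regular: {regular:.0f}s, turbo: {turbo:.0f}s")
```

The per-call cost is identical in both runs; all of the saving comes from fewer steps and from skipping the second CFG call.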
SD3.5L has unique faces!!!
i tried turbo my entire pc got frozen had to restart
i have like a 3060 idk why it not working
aw man
🥳 New sd3 and the big one first too
maybe i should wait for medium?
does sd3 use negative prompts?
I am running non turbo version with T5 encoder, everything at full precision on 3060 with no problems, it is slightly faster than flux, despite having slowdown from cfg
but 32gb ram is almost maxed
Alrighty then, there goes another 16 gigs of hdd space
hmm i see
what you using? the large one?
Yes you can input negative prompts I believe.
only the largest things
Protip: Try init images
It’s going to be very slow, full precision can’t fit on 3060, what’s happening is the model is leaking into ram. This makes it really really slow.
Quantization should fix this issue I believe. 8bit should be basically same quality while improving speed massively.
It is very good at styles and diversity, like, a lot more diverse than flux, but still has coherency problems, but I think it is fixable
@turbid grotto on comfyui or stable diffusion?
you mean comfy or a111\forge?
always comfy :3
alright
i haven't used comfy a lot
i either use invokeai or stablediffusion
cuz comfy seem hard to use
same thing was for flux and I got no speed benefits from fitting model in to the vram
you just have to use ready workflow
ready workflow?
You need q8 with some cpu offloading so basically offloading the text encoder, vae when not in use. It will still be massively faster than the model leaking into ram.
https://comfyanonymous.github.io/ComfyUI_examples/sd3/
just drag image into comfy
alright
mmm Comfy is beta testing an all in one installer/app should help with peoples comfy woes
Testing complex scenes
this is more of a hardware issue
how so?
So what’s peoples opinion on SD 3.5 rn?
the models are just grids of numbers they can't cause your pc to freeze
i see
a 3060 is slower compared to a RTX 4060 Ti 16GB or even a regular 4060 8gb vram
love the announcement image lol
Same XD
is it normal for 3070ti, or im cook
Fp8?
very diverse, unique, styles, not overfitted, but anatomy needs finetuning
Yh I’ve seen anatomy needs finetuning but the styles look good
I don't know lol, I just run it, I'm not a professional
But luckily the anatomy isn’t too bad or anything, not flux level but still decent.
I’m just waiting for the dev of the app I use for MacBook to run SD to support the 3.5 large
every face is different
There are versions of SD 3.5 large for both 24gb vram cards and 8gb vram cards. see which one you downloaded
definitely very poor performance, something's not right, i get like 10x better on a 2080 ti
stability solved 1girl
1girl laying on the grass XD
uhh uhh ueah
you're probably using shared memory, you somehow have to stop it from being used, try adding the --lowvram argument
My 1st SD3.5L output
Ohhh, yes, I remember it
Are you using Forge? i don't think Forge supports SD 3.5 large yet
Since it’s non distilled I can’t wait till I see civit Ai loras and models etc
nope, I don't think im using forge
let me change some setting fast
Then auto-1111?
yeah, I don't know what is this lol
Tho I will say it’s a cool effect
unresolved noise, needs more sampling
I need to fix my performance first lol
So SGM Uniform is needed, I’ll have to use 2m Trailing on MacBook when the app is updated
I am not sure, it is just the default in comfy that works
To anyone trying to run SD 3.5 large locally, just use comfyui its way easier to get it up and running
I’ll be using “Draw Things” as the application
Cuz I cba using the others
I just need to wait for the dev Liu Liu to update it
For 3.5
i'd knock down resolution a bit too, to around a 1MP size like 1280 x 720 or something
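If it helps, here is a small sketch for picking a size near 1MP for a given aspect ratio. Snapping to multiples of 64 is an assumption (a common convention, not something confirmed in this chat), and the helper name is hypothetical.

```python
import math


def resolution_for(aspect_w: int, aspect_h: int,
                   target_pixels: int = 1024 * 1024,
                   multiple: int = 64) -> tuple[int, int]:
    """Return (width, height) near target_pixels for the given aspect ratio."""
    ratio = aspect_w / aspect_h
    height = math.sqrt(target_pixels / ratio)
    width = height * ratio

    def snap(value: float) -> int:
        # Round to the nearest multiple, never below one multiple.
        return max(multiple, round(value / multiple) * multiple)

    return snap(width), snap(height)


for aspect in [(1, 1), (16, 9), (9, 16)]:
    w, h = resolution_for(*aspect)
    print(f"{aspect[0]}:{aspect[1]} -> {w}x{h} ({w * h / 1e6:.2f} MP)")
```

For 16:9 this lands on 1344x768, and 1280x720 works out to roughly 0.92 MP, so both are in the same ballpark.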
I don't think it help, but I'll give it a try later
3.5 large
The classic spaceman riding a horse
Crazy day today! 2 new video gen models as well as sd3.5 large.
closeup testing
can't wait to test
if its really faster then I'm excited
especially if we get amateur photography or similar loras
probably more texture than flux
flux has smooth faces
and im not glazing right now this is just how I remember
it is a bit faster than flux because it has cfg, with hyper lora and cfg 1 it would be a lot faster than flux with hyper
more
thank you
There is also already a sd3.5 turbo model which seems to support cfg as well.
oh yea! but I prefer 8 steps lora for full model
Super SD3.5L Workflow (after an idea by Olivio Sarikas) - the w/f is in the PNG
O my god, look at this, no bleeding between opposite styles,
from LatentVision
Yeah now that’s a lie lol, if flux can’t follow a bit of your prompt, either your prompt sucks or is way too complex.
What's latentvision
A youtuber
yeah I'm sus of that comment
but he is right, flux really overfitted and far behind sd3.5L at diversity and styles, however better at anatomy, probably due to overfitting
matt3o
Yeah that’s true, but I was talking about the adherence to complex prompts part.
matt3o made the IPAdapters
okay, fair
stability released training guide
Film style, realistic, perfect structure,Depicting a high-resolution movie still of Gundam fighting in a combat setting. This photo shows a real, intricately detailed Gundam robot fighting on the battlefield, with dynamic effects and glow effects. The Gundam is mainly made of metal and plastic and has a complex mechanized design. Its body is mainly gray with shades of red and blue, and it has a prominent orange face mask on its head. The robot lifted its left arm and held a huge, damaged shield with obvious wear and tear, presenting a mixture of red, white, and black colors. The shield seems to have undergone intense combat, with cracks and shattered edges that make it look rugged and worn. The background is filled with dramatic, fiery explosions, dark gray smoke and sparks, creating a strong sense of movement and chaos. The explosion is vividly depicted as a mixture of orange, black, and gray, adding a sense of urgency and conflict. The entire scene is set in a realistic urban environment, with neon lights in the background indicating a modern high-tech environment.
Super SD3.5L Workflow
they stole from me ||that stole from @fluid pecan ||
for some reason the comfy workflow doesn't have the sampling
maybe its automatic
also this is def way faster than flux
this is like as fast as medium if not slightly slower
the one on our HF repo has it
Super 3.5L w/f - getting striations!
@lavish osprey i've been checking your profile everyday for sd3 news, thank you for this
can you please tell me what im missing: im getting: 'NoneType' object has no attribute 'tokenize'.
im getting this error in comfy
I have no idea
can you help me lykon please
been super silent until today to work on the release, I should have time to be a bit more active from now on, but I might also take a small break for a bit
ive put all three clip models into the clip folder
but getting this error: im getting: 'NoneType' object has no attribute 'tokenize'.
Is this a crop or the actual generation? You have prompt that can share?
wrong checkpoint/clip combination
thanks alot! can you tell me which of those i should keep for the sd 3.5 large checkpoint? please
great model, like its aspects, but what do you think about anatomy and how flux achieved it? Is good anatomy only possible due to heavy overfitting\overtraining?
all of them! but you can use only T5 or only clips (without T5) to save time on prompt processing and memory
what im doing wrong then?
@lavish osprey's prompt - very dark (too dark?) striations again!
this is how my ui looks like
have you seen the new mochi model? https://huggingface.co/genmo/mochi-1-preview
specific DPO for anatomy helps it a lot, but will restrict your model to the vector range of a lora and make it impossible to finetune
open weight t2v model but you need 4xh100 to run it
but why im getting the error? ive chosen sd 3.5 large and all clip models are in clip
use triple clip loader and load the G/L/T5XXL, dont use the clip from model
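For context on that error message, here is a toy illustration in plain Python. It is not ComfyUI code; the class and function names are invented. The assumption is that the checkpoint ships without text encoders, so the "clip" it hands to the prompt node is None, and calling a method on None produces exactly this message.

```python
class FakeTextEncoder:
    """Stand-in for a separately loaded text encoder (hypothetical)."""

    def tokenize(self, text: str) -> list[str]:
        return text.split()


def encode_prompt(clip, text: str) -> list[str]:
    # Mirrors a prompt node calling tokenize on whatever clip it receives.
    return clip.tokenize(text)


clip_from_checkpoint = None  # assumption: checkpoint has no text encoders
try:
    encode_prompt(clip_from_checkpoint, "a cat")
except AttributeError as err:
    print(err)

# With an encoder loaded separately, the same call works.
print(encode_prompt(FakeTextEncoder(), "a cat"))
```

Wiring the prompt boxes to a separately loaded encoder, instead of the clip output of the checkpoint, avoids the None.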
no but now I have, it looks cool
are you using the workflow on our HF?
and it's apache 2!
@bitter hearth and it seems now you have access to sd3!!!!
isn't it epic
how please. thanks alot for pointing it out
I am using one I made myself - it is probably causing the striations!
you weren't lying
I'm waiting for a GGUF build to be made so i can run it on my 8GB card
OK, I'll try this - thank you
sorry, this is for comfy, I don't know how things are in other interfaces
im inside comfy
double click blank area and search for triple
thanks!!!
thanks!
Also, do we have to use ModelSamplingSD3?
my final question: where to connect this triple loader to
that being said, if we did dpo now (or someone from the community made it), the model would still be trainable since you got access to the original distribution
yes, 3
and 2~4 for training
the 2 text boxes, just like it is now
it wasn't in default workflow 😳
on HF? yes. You talking about the comfy examples?
Better - brighter - no striations!
assumed so, it wasn't in comfyui example workflow tho. Do you need the timestep for the negative prompt? i remember needing it for sd3m
yes, comfy workflow from https://comfyanonymous.github.io/ComfyUI_examples/sd3/
thanks, I'll tell @simple thistle and @viral plaza
well I just did
nvm just realized, the sd3.5 large files have the right workflow
the comfyui default one is different
what it does btw
thanks for reporting it. For now use the one on our HF repository
yeah they def nerfed the painting knowledge of this
okay!
but it's on me for spreading info about Zdzisław Beksiński
needed 1 minute to see that sd 3.5 is an upgrade, for one reason; TEXTURE
2 generation setback in anatomy and adherence (on par with SDXL). only thing this has going for it is copyrighted works and styles that it includes really. still have to fix all the stuff that we already moved past. not impressed. I'll prolly skip it
Shift defaults to 3 so not needed normally. Swarm tosses it in anyway by default
The negative prompt timestep thing isn't needed it's just a trick that can sometimes help
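On the shift value: as far as I understand it, for SD3-style flow models the shift remaps each noise level sigma in [0, 1] so that more of the schedule is spent at high noise. A minimal sketch of that remapping, assuming the formula sigma' = shift * sigma / (1 + (shift - 1) * sigma); the function name is made up, and this is not copied from any UI's source.

```python
def shift_sigma(sigma: float, shift: float = 3.0) -> float:
    """Remap a noise level in [0, 1]; shift=1 leaves it unchanged."""
    return shift * sigma / (1.0 + (shift - 1.0) * sigma)


# The endpoints stay fixed while the middle of the schedule moves up.
for sigma in [0.0, 0.25, 0.5, 0.75, 1.0]:
    print(f"sigma={sigma:.2f} -> shifted={shift_sigma(sigma):.3f}")
```

With shift 3, a midpoint of 0.5 moves to 0.75, which is why changing the shift changes how much detail versus structure the sampler works on.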
SDXL is nowhere close to it. Cope
is my setting right?
it just work very slow with 3070 gpu
@echo off
set PYTHON=
set GIT=
set VENV_DIR=
set COMMANDLINE_ARGS=--opt-sdp-attention --medvram --opt-sdp-no-mem-attention --no-half-vae --opt-channelslast
git pull
call webui.bat
or what else should I try
comfy's example workflow should be all good ref #🆕|sd3 message
I wasn't aware Shift had a default. Thanks!
I added Model Sampling as it was not in the ComfyUI* w/f
im getting dark images, did i mess up a node anywhere?
Thanks for the heads up
Each their own, not everyone likes or needs the same thing, i'm delighted not to see smeared smoothed images (dunno how to describe flux's style really)
yes, wrong clips
Your clip should be l and g
and T5 optionally
thanks!
which sampler do you use?
Not bad huh guys? And it will be trained too coz the license is good.
how long does it take for each image?
I'm on RTX 30 series
turbo 20 secs regular 3 mins lol on 3090 12 vram 32 ram
not sure the regular needs 40 steps tho...
bc it took me 4min for 1024x576 at 20 steps, I don't know if that's normal or not :/
no
Prompt = Surreal Stock Photo
75s with 3060 for 20steps
When did the 3090 have a 12gb vram variant?
whaaaat
Decided to join back after months of not being here, just cause I am mildly interested in SD3.5
do you use default setting?
You really should release your base models together with a fine tune just so you can shut down some gratuitous and baseless criticisms
yes
Just under two minutes 1152x896 64Gb RAM, 8Gb VRAM RTX2070
Its definitely not as good out of box as flux, but if it can be trained properly this time, I can see it getting more popular in the community
I just repull whole project...
Ok, so I guess 5 minutes isn't normal for 4 steps on a 4080
Missed you - some of my best work in your workflow
Oh hey there haha
3060 i mean XD
I just started getting into Flux training. Having some absolutely WILD realism results with extremely little data. Coming for the flux realism crown this time around
Some settings are absolutely wrong
but why??
SD3.5 could be interesting, but its gonna be a while before its viable and supported, so I am gonna focus on flux instead. I held off of flux for a very long time cause I felt it was not aligned with the community, but after seeing how much work was made in making flux accessible, I feel better now
but yeah, I am interested in SD3.5. I had heard about it for a while, just glad its out now
prepare to be surprised
By?
the speed of work behind this release
it knows breasts 
@noble coyote Result from today with my Flux training
Call the ponies.
Flux is exceptionally easy and reliable to train, so I am curious how SD3.5 will be
Nice! Have you tried Super Flux? https://youtu.be/3WViQ2pL6ks
I am not opposed to competition. I just know SAI is far from the most reliable company 😅
So I definitely don't want to get hopes up, especially when I see huge issues with SD3.5 still
Nope, I only use flux dev base, none of the other things as of now. I might start using Libre flux, just to see, as it was trained by a research partner of mine
Same. I've tried countless times to train a sdxl lora, just won't train at all with the way i train "exotic" models for flux. As in unconventional "hairstyles" for instance
