#🆕|sd3
Mechanical Insect LoRA
Hey bro, how's it going? Do you have a workflow for CogVideo img2vid? Most of the workflows I've found are way too complicated, and they use LLMs for captioning the source image (I only wanna do text to video, I can caption it earlier)
I think these workflows are generally decent: https://github.com/kijai/ComfyUI-CogVideoXWrapper/tree/main/examples
Thank you! I´ll try it
the facial expression on the rock LOL
Create an eerie, gothic world filled with whimsical, exaggerated characters who inhabit twisted, shadowy landscapes. The scene should blend dark, muted colors with vibrant accents, capturing a sense of fantasy and isolation, while evoking both beauty and unease.
Dali Flux LoRA
feather
i2i Florence2/GGUF_Flux
amazing
Housing market in a nutshell.
Jelly cake
Flux1.Dev.fp8 in PortraitMaster
👍
Thank you for using comcom analytics.
"comcom analytics" supports all community managers (moderators and server owners) by stats, visualization, and analytics.
If you have any questions, feel free to ask us!
Your dashboard
Help
Support server
Other languages
en: help
ja: help Japanese
diamond painting
extraterrestrial control society
Any way you can share your prompts? They don't appear to be embedded in your uploads.
Sure.
Which image(s) ?
I like a lot of your stuff, so it would be great if you could turn on metadata somehow, lol. I don't know what tool you use.
But if you can't or don't want to, you could start with just the one I linked.
https://pyramid-flow.github.io
New CogVideoX competitor, based on sd3 and produces pretty good videos. Only 2b and produces results similar to 5b CogVideoX.
Pyramidal Flow Matching for Efficient Video Generative Modeling
"Flux models now support the NVIDIA TensorRT software development kit, which improves their performance up to 20%. Users can try Flux and other models with TensorRT in ComfyUI."
"The Flux models’ dev and schnell variants were downloaded more than 2 million times on HuggingFace in less than three weeks since their launch."
HOW do we get away from you?
I see your motif is a monkey - is that what we'll become?
@fossil pagoda
how did you convert flux to TensorRT? Every time I've tried it, it says OOM on my 4090.
was a datacenter card
I wonder if TensorRT for Flux will work on my 8Gb VRAM RTX 2070?
oh ok. have you tried taking the resulting file and using it on a local 4090 or something?
My Rene Magritte LoRA
(schnell+lora)
I haven't but that should work
Same... but I haven't tried it on the smaller quantized models (if it even works)
someone should just do it once and put it on hugging probably
could you upload it to HF? 🙂
PokeZom LoRA
ok will have a go
this guy needs to release a version of it without the zombies. would be great for general stuff
ran the same prompt against the stability sd3 ultra api. Not sure what's up with the subject doubling.
always a risk when an edge is 1344+
Yeah this is just using their 16:9 option in the api.
never quite found out what Ultra actually is
its substantially better than SD3 Large
it has more high frequency detail which implies something like noise injection, noisy stochastic sampling or an upscale followed by a downscale
So any news in the Flux world? I mostly used Schnell on Comfy, as Dev takes a lot to generate. Couldn't ever run the nf4 versions
I have a RX 6700 and Ubuntu
I'm sad that I cannot really test stuff like with SD XL because the generation time, even the loading time of the model
not sure cos AMD
Yes, I read about some improvement for Nvidia, apparently
Flux Dev generation time is Interstellar time generation for me, at least usually results are great but I cannot play much with it
At least Schnell sometimes delivers amazing results, but it seems like Dev always has some twist
some people prefer Schnell designs
And I kind of lost interest in generating in SD XL as I presume I would get more interesting stuff with Flux
oh great to know
it takes a long time for tooling to get made, over a year
at the moment SD 1.5 and SDXL are strongest models
cos they have the most tooling
also it seems there aren't many new SD XL model improvements, like it has reached its limit
what do you mean by tooling?
At the same time, some people make great images with whatever model, so, I don't know, but I know Flux follows prompts better and that's amazing, I'm tired of just praying in SD XL
tooling as in software stuff
Really, in SD XL with ip-adapters and ControlNet, the possibilities are endless
yeah at the moment there is so much to explore
there are tons and tons of Flux loras, at some point I wonder if they should be integrated or something
someone on this server had a go
they make checkpoints called "mangled merge"
they did a big one for SDXL and a newer, smaller one for flux as an experiment
actually I haven't tried much, as usually in my xp Loras need a lot of tries and setup until they work as intended, if they are any good in the first place... so because of the long generation times, I don't want to just get frustrated
I was wondering if something like that would just make the model a real mess, as many Loras are actually bad
it did, apparently
the SDXL one is cool but he said a few times the latent space gets wacky
Indeed
My new hobby is to just download and test flux LoRA s ... 100 per hour.... Should be done in a year 🤪🤪🤪🤪
yeah there are so many its amazing
there was a good realism one on reddit today
turns out you can outpaint a flux image using SD 1.5 and it will continue the image
even though SD 1.5 could not create that image from scratch
Wow that's pretty neat. I think someone brought out a new in painter/outpainter for flux too within the last day or so
Flux is so good it influences other models to be better.
any news about sd3? 
Cleopatra?
the original vs outpaint is noticeable but pretty nice. SD 1.5 kind of lacks the lighting effects/texture or whatever, that kind of mist in the original.
You can try to do it with SD XL
I think it will generate better
just some tries, not perfect but better lighting already?
Purz Face Projection LoRA
Hey, you beat me to it... I haven't gotten to that one yet 😛
Purz Dried Flowers LoRA
Purz Neon LoRA
thanks that's nice
this example worked a little better
this is 100% SD1.5 though
and one from SDXL
Adequate depiction of South Korea.
yeah the problem with SD1.5 though is
there was no mention of Korea/Korean in the prompt though LOL
Purz VHS Box LoRA
wow it learnt the layout really accurately
pretty impressive... I don't know, I have something against 1.5. It makes pretty believable professional photography stuff, but it's usually too crowded, like it cannot compose the whole image, so it just adds stuff...
There was a user that posted SD 1.5 and the quality was impressive but the composition was like that
it does have a higher amount of small objects than sdxl yeah
really incredible
I asked him once and he said model is Epic Photonism
which is a great model
and then tiled upscale
Very cool
Eaullama
DonM Illustration Styles LoRA
Imagine/ gourmet platter of hot dogs
It tastes funny. something happened to the meat when it went through.
Yes I haven't taught craziness to AI. something is always lost.
Comic Book Vintage LoRA
Dirk Lasermaster.
Yeah xd, I should have tried it with flux, somehow food was better with sdxl in some cases
I have slowly been working on a full Trump deck. 12 down. Many to go.
let me guess isometric something 
Actually, no... I think I prompted 3D Game Objects or something like that. It is a short prompt
The actual prompt:
toolbox 3D game technology
wow just that prompt, no "trending on artstation" or anything
flux
Kinda amazing I can do all of this in 1-step with schnell, nothing else.
prompts: “4-image grid”, “the word “schnell” made out of cake”, “gta5”
1 step? wow
Quantized too lol
Yeah with 1-step, however anatomy is messed up, 2 step helps very much in anatomy.
Flux... fer sher
Meh, I am waiting for no step 👀
the massive blur is hard for other models
no step sounds nice, you just get the image lol
I noticed with flux pro 1.1, the blur is even more
maybe it takes a strong model to understand blur
my own flux lora, fluxdev
Working on 2 new versions of Mangled Merge for Flux. 'Matrix' which focuses on realistic loras and 'Magic' for 2d loras. 'Matrix is almost done. This image is with an additional 228 loras on top of the original 230 from v0, 55 more loras to go for this one and then I am going to work on Magic. I started merging 4 loras at a time in sets of 3 and then smoothing with 2 versions of the Della merge method that I created a couple weeks ago.
Mangled Merge is a really interesting project
Thank you. I'm gonna try finetuning on top of it once these two versions are complete.
do you find it very different to base dev now?
I haven't tested just yet, I want to try and get all of these loras merged first. But one thing I am finding different from v0 is that cartoons or painterly images don't work anymore. I can get plastic 3d rendered looking things but not paintings or illustrations. Of course I could be wrong. that's just my initial findings based off of 1 seed.
with realism, you lose some aesthetics too.
yes. I'll show you a couple instances... Mangled merge on the left vs dev on the right.
its done a good job lowering the blur yeah
MM on left Dev on right. Keep in mind it's still WIP needs some more smoothing. But the prompt was "anime,cyberpunk, A young girl with large, apprehensive eyes stands amidst a cacophony of disembodied, bloodshot eyes. Render this in a gritty, expressionistic style, emphasizing jarring color contrasts and impasto brushstrokes. Employ a worm's-eye view to enhance the feeling of being watched. The girl's face and the surrounding eyes should be the focal points, illuminated by an unseen, sickly green light source. The background recedes into an indiscernible darkness."
So anime basically goes out the window in a lot of cases.
Same here MM left Dev right. Painterly images is still doable.
seems to add signatures for painterly things. I'm not using any negatives for these.
yeah the changes are pretty big here
Yeah. I'm finding the Della merge method pretty interesting too. I coded 2 different versions, 1 that follows the paper closely and another that works better with merging models patched with loras. I don't think either are the greatest at merging loras in but I'm finding they work great for smoothing overfit models.
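For anyone following along, here is a minimal sketch of what folding LoRAs into a base weight usually looks like: each low-rank update `B @ A` is scaled and added onto the original matrix. This is the generic additive merge, not the Della method described above, and every name in it (`matmul`, `merge_loras`, the example scales) is made up for illustration.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def merge_loras(base, loras):
    """Return base + sum(scale * B @ A) over all (B, A, scale) LoRAs.

    base is out x in, B is out x rank, A is rank x in.
    """
    merged = [row[:] for row in base]  # copy so the base weights stay untouched
    for B, A, scale in loras:
        delta = matmul(B, A)
        for i, row in enumerate(delta):
            for j, value in enumerate(row):
                merged[i][j] += scale * value
    return merged


base = [[1.0, 0.0], [0.0, 1.0]]
lora_a = ([[1.0], [0.0]], [[0.0, 2.0]], 0.5)   # rank-1 update touching row 0
lora_b = ([[0.0], [1.0]], [[4.0, 0.0]], 0.25)  # rank-1 update touching row 1
merged = merge_loras(base, [lora_a, lora_b])
print(merged)
```

With hundreds of LoRAs the deltas pile up on the same weights, which is one plausible reason a smoothing pass afterwards helps.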
prompt: make a logo
No.
JustEat_v1.safetensors
Can you make a hairstyle changer?
Anybody know if there's a way to have eye occlusion on Reactor?
If my model has glowing red eyes, it'd be nice to have face swapping with those glowing eyes
PortraitMaster+Flux
Prompt: "An Art Nouveau style storybook page depicting the Whispering Woods scene. The page features elegant, flowing borders with organic floral and vine patterns. At the top, the title 'The Whispering Woods' is written in a beautifully ornate script, surrounded by decorative elements. The scene shows the shadowy forest with graceful trees, fireflies lighting a winding path, and curling roots, all framed by the decorative border. The colors are soft and magical, with deep greens, silvery moonlight, and glowing gold from the fireflies. The layout feels like a page from an enchanted storybook, with a mixture of imagery and text space."
if you are a branding specialist, please imagine a image branding picture for Midea Industry Group, the picture is about good lifes in the city, should contains the elements of home appliance, air conditioners, new energy cars, e-bike, energy storage products, people are enjoying happy life created by Midea, the picture should reflect the company value of nature and sustainability
No.
Generate high resolution render output from this
Teachers suspected of 'fleecing' students of their buzz
I have been trying for days to achieve the same results as XY on my local installation. However, I am failing.
I know that they use Stable Diffusion XL but nothing else. I have a sample prompt here.
"Create a cartoonish illustration of a muscular, anthropomorphic rat. The rat should have exaggerated, bulging muscles and a confident expression, resembling a bodybuilder. It should be holding a red dumbbell, showcasing strength. The rat should have large ears and a cartoonish face with a smirk. Incorporate a bright and bold color scheme with black as the background to make the character stand out. At the bottom of the image, add the text "GYM RAT" in a large, bold, red font that looks hand-draw"
Attached is the comparison.
Can anyone help me how to get a similar result locally? I have tried it several times and the results were always better with stablediffusionweb.
my settings
https://nvlabs.github.io/Sana/
yeah, I think stability is kinda cooked if above is good as shown.
Sana is 0.6b parameters, competitive to flux dev, undistilled, supports 4096 res natively. 1 sec for a single img on a decent gpu.
Looks interesting!
@craggy crest look dm 
uses Gemma2-2B instead of T5-XXL
I would personally not put sana 1.6B so high up on the GenEval score, but for how small it is it really makes up for it.
if its really good to train for, PurpleSmartAI/Astraliteheart might love this
Idk how far they are into training for AuraFlow
Yeah I think flux dev is still the best but its amazing for it's size, undistilled, 4k native generation, and incredibly fast.
those scores get weirder and weirder, how is pixart-sigma slightly below sdxl and way above lumina-next while pixart-sigma and lumina are in the same ballpark and definitely progress compared to sdxl. And then we have flux-schnell above dev. Meanwhile SD3, which just falls apart on 60% of my prompts, is above pixart and lumina, which manage sane results on most prompts
if this sana model handles a variety of styles similarly to as pixart, it could be fun to use, the demo images have this very noisy grainy look though, time will tell.
I agree with their scores, for what it's worth. Lumina can get pretty mushy, and Sigma feels a bit below SDXL to me
SD3 is only bad if there is a person in the prompt
Guess we value different things :p
SD3m outputs are just awful to my eye, boring full frontal photo's of whatever i prompt with no atmosphere at all
SD3m is overtrained on photos a bit
which would artificially boost its benchmark scores a bit
this is the sort of image benchmarks are testing for: https://www.researchgate.net/publication/328362220/figure/fig4/AS:1086455329890335@1636042544303/Samples-of-LSUN-bedroom-dataset.jpg
just like generic photos of a room
so models that are overtrained on that style do really well in benchmarks even if they can't do other styles
yeah, and geneval seems automated object detection
says nothing about coherence, sadly also noticeable in those sana images, that word "fast" on the cat's sign, the letter T is halfway out of the sign 😬
I took a longer look through the paper
its gonna be good I think
competitive scores on all of ImageReward, FID on MJHQ-30K, DPG-Bench and GenEval
and its 100 times faster than Flux Dev
Does anyone know how to run flux with vae and clip l included (I have to run t5 separately) in comfyui?
What I mean is , is there a way to load only the T5 model?
there is a load clip, not the clip vision one
can't with default comfy nodes
if you only have a full checkpoint file
I would say these are unusably bad, but at those speeds and resolutions... Hmm.
Just ignore the included Clip-L and use the DualCLIPLoader node. (You shouldn't be prompting clip-l anyway. Use the flux prompt node and leave clip-l blank or else you'll seriously hurt quality. Do some tests with a frozen seed if you don't believe me.)
depends so much on the seed and prompt whether its good to have Clip-l
to a good extent I agree though
The Weight Family are nearing the end of their incarceration!
Clip-l adds very fine detail imho
Well, that could be true. My tests used multiple sequential seeds and just showed that overall Clip-L hurt performance. On a seed-by-seed basis, maybe you would find exceptions.
Opinion? Did you test this? (I did not.)
Yes, tested way back when Flux was released
Will test.
The detail added was very very fine
Not sure what that means. What am I looking for here? (The problem with prompting Clip-L is you get concept deformations. E.g. the teeth of a mimic chest bleed out onto the floor around it.)
for the most part I just leave stuff as default for prompting (so I use both Clip-L and T5)
and then I feed Florence 2 node to it
I'm not much of a prompter
My prompts are very specific, and I roll anywhere from dozens of times to hundreds to get one image that actually follows my prompt. Different strategies I guess, but you should give it a try and see if you prefer the results.
Is this for Sana?
SD3 Medium
Isn't SD3 worse than Flux though? Did I miss a new API version?
If you plan and plan your prompts, you can get workable images with SD3 - but it mangles human anatomy. But with Flux, it's so much easier
yeah my prompts are super basic, stuff like this: (Photo:1.3) of a street in a city. There are taxis and lamp posts. There are bins and plants. There is a garbage can and a drain.
Yeah you're not even specifying composition. I don't know if clip-l would hurt in that case.
It's like asking an out-of-shape guy to take off his shirt though... Come on dude. 😨
I looked at your tests before
did you test T5 with full text, with tags for clip?
I feel like your tags for clip might have been too long for clip to handle maybe
That might be a thing too. I saw one of my favorite youtubers using just a few short tags for clip. I'm testing now to see if using the same prompt for t5 and clip adds detail, but I'm not sure that's what @noble coyote meant.
I tried as well some of the new fancy clip or clip long fine tunes
but I never got better results from them
T-5 only null-test. Prompt:
In this RAW photo, A furry anthropomorphic hamster live-action anime girl is holding a sign above her head. The sign says "CLIP-L FINE DETAIL". She is wearing a denim jacket and sneakers. She is standing in contapposto. Her silky, glossy brown and cream-colored fur is highly detailed and shining in the warm sunlight. She is smiling widely, with sparkling eyes. Her denim jacket has a rich fabric texture. Her shoe laces are untied. She has long, wavy, glossy brown hair flowing down her shoulders. She is standing in the park on a field of grass and wildflowers. Fluffy white clouds drift through the blue sky overhead. The photo is taken on an antique polaroid camera and is extremely highly detailed.
Doing clip now with same prompt, then will try with a few short tags, and just check detail changes.
(Same seeds / settings etc. Only varying prompt for test.)
(Also I'm using one of those fancy Clip-L versions.)
No significant detail gain from just pasting same prompt into clip.
Trying short tags next.
clip tags: RAW photo, extremely highly detailed fur, hair, fabric texture, eyes, grass, wildflowers, clouds, film grain
Very definite and obvious detail gain!!! 🥳
That's fantastic.
Wonder if I have time to test a difficult prompt and a portrait...
wow nice
yeah this matches my experience
Clip-L with maybe 6-10 tags is nice
on models with Clip-G, sometimes Clip-G can be good with just 3-4 tags
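The split that worked in the test above can be sketched as a tiny helper: the full description goes to T5, and a short comma-separated tag list goes to Clip-L. The function name `build_flux_prompts`, the dict keys, and the 10-tag cap are illustrative assumptions based on the rule of thumb in this chat, not part of any library, and the description string is a shortened stand-in for the real prompt.

```python
def build_flux_prompts(description, tags, max_tags=10):
    """Pair a long T5 prompt with a short, de-duplicated Clip-L tag list."""
    cleaned = []
    for tag in tags:
        tag = tag.strip()
        # Skip empty tags and case-insensitive repeats.
        if tag and tag.lower() not in (t.lower() for t in cleaned):
            cleaned.append(tag)
    return {"t5": description.strip(), "clip_l": ", ".join(cleaned[:max_tags])}


prompts = build_flux_prompts(
    "In this RAW photo, a hamster girl is holding a sign above her head.",
    ["RAW photo", "extremely highly detailed fur", "hair", "fabric texture",
     "eyes", "grass", "wildflowers", "clouds", "film grain"],
)
print(prompts["clip_l"])
```

The two strings would then go into the separate Clip-L and T5 text fields of whichever Flux prompt node you use.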
These textures are really something else. I love it. Out of time though. Gotta go.
okay bye, thanks for tests
Also these were all with the 8-step hyper lora.
in my testing this one I put before maxed out T5: (Photo:1.3) of a street in a city. There are taxis and lamp posts. There are bins and plants. There is a garbage can and a drain.
unless you need very specific things
Always. 🙂
Depends on the model architecture and how well it was trained. Speaking of models, I saw that that new tiny Nvidia model uses Gemma instead of t5. Guess they found a way to make it work and apparently, it does a much better job with understanding. But that makes sense, since t5 is almost archaic by ML standards now lol
yeah that is true different diffusion models will make better or worse use of T5
and yeah the usage of Gemma in the new Nvidia model is exciting
its the Pixart team apparently
SD3 still hasn't released better weights than SD3 medium?
No
They've gone silent.
Well with that upcoming Sana, they are probably panicking.
Not heard of that one
Nah, they're laughing, sana looks less than stellar :/ https://nvlabs.github.io/Sana/
The more i look at those images the worse they become, if all you need is speed, maybe people will see value as some gimmick to run a model on phone or something, but it looks not so great to me
Well, the SD3 release wasn't anything to celebrate either 🤷🏻♂️
I'd like to know if Stability still exists as a company and if they intend to continue releasing models. It's been complete silence since Flux released.
I'm pretty sure there's still the intent to release something, as it was teased in their fine-tuning guide, which is kinda sort of official communication and maybe more official than lykon's hints on twitter, but how well SAI's new thing works is anyone's guess
I really had high hopes for this new pixart / sana thing 😥 but it's just "fast"
It's too small to be any good...0.6B
Isn't there a 1.6B version too?
yeah flux has been amazing so far.
You are right, example images like this looks pretty bad.
^..^<
but its insanely fast
It's much, much faster than sdxl, sd3, Pixart, and 105x faster than dev at 4k.
It can natively generate 4k images at a speed 105x faster than flux, and it's undistilled too. It will use way less vram than flux dev.
Dev does seem clearly better, but you can gen 10+ images way faster than a single flux gen and probably get a better img than dev.
I have to say though, humans might be an issue tho, they don’t show them in many poses.
Yeah sd3.5 was supposed to be available for some people for testing but idk what happened to it now. 8b was decent, worse than flux in most things but non distilled and more pleasant aesthetics by default.
No news about 8b or 2b now from what I saw.
who cares how fast it is when it cannot make any consistent image? The generations are all messed up
They aren’t messed up from what I see. There are some artifacts but even flux has them. All diffusion models have them(like the text being weird in the bottom left of the cat img).
It is lower quality than flux dev for a single gen, as I said, but way faster. It won't replace it or anything but it will be a solid alternative.
all images I saw so far were full of errors. Much worse than sdxl
help
wonder if SAI is going to actually release 8B or just wait for it to become obsolete :/
Yeah idk abt that, did you check the page? Sdxl has an incredibly hard time writing a single word, and has very bad prompt following compared to the newer models and this one.
has anyone heard anything new from Flux team aside from their blackforest twitter account?
Help me Clownshark. Help me make real movies.
no dude, trust
SDXL is a god at generating images
all we hear from it is true
no base model compares to "sdxl" lmao
(meme)
first time I see sana stuff
looks very fast, interesting stuff

if Sana really is 100 times faster than dev
then we could do tiled upscale up to 8k
and then downscale to 1024
in the time it takes dev to do the image normally at 1024
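A rough back-of-envelope check of that idea, assuming the 100x figure holds per 1024x1024 tile, treating "8k" as 8192 px per side, and ignoring tile overlap, extra upscaler passes, and VAE decode time (all of which would add cost):

```python
def tiles_needed(target_px, tile_px=1024):
    """Number of non-overlapping square tiles covering a square target."""
    per_side = -(-target_px // tile_px)  # ceiling division
    return per_side * per_side


speedup = 100  # claimed Sana speedup over Flux Dev (from the chat, unverified)
tiles = tiles_needed(8192)
relative_cost = tiles / speedup  # in units of "one Flux Dev 1024 image"
print(tiles, relative_cost)
```

Under those assumptions the 8k tiled pass fits inside the time of one Dev image, with some headroom left for overlap.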
so are we still waiting for sd3 something or is that whatever was delayed here now
checking in after a few months
We are waiting for sana now 
nothing new yet, nothing known, please hold restructuring
Any news about SD3.1?
two weeks

I don't really know what you wanna say with that. Sana is fast because it is not a transformer method.
They use linear attention, which is faster than normal attention, but also leads to inferior results. To compensate for that they add convolutional nets.
yes its fast, but it suffers from all disadvantages of this architecture
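To make the trade-off concrete, here is a toy sketch of softmax attention next to a kernelized linear attention with a ReLU feature map (one common choice for linear attention). It is a generic illustration in plain Python, not Sana's actual implementation, and the function names and tiny inputs are made up. The point is only the cost: softmax attention scores every query against every key, while the linear form summarizes keys and values once into a small matrix and reuses it for every query.

```python
import math


def softmax_attention(Q, K, V):
    """Standard attention: every query scores every key, O(n^2) in tokens."""
    out = []
    for q in Q:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(len(q)) for k in K]
        peak = max(scores)  # subtract the max for numerical stability
        weights = [math.exp(s - peak) for s in scores]
        total = sum(weights)
        out.append([sum(w * v[j] for w, v in zip(weights, V)) / total
                    for j in range(len(V[0]))])
    return out


def relu(x):
    return [max(0.0, a) for a in x]


def linear_attention(Q, K, V, eps=1e-6):
    """Kernelized attention with a ReLU feature map, O(n) in tokens."""
    d, dv = len(K[0]), len(V[0])
    kv = [[0.0] * dv for _ in range(d)]  # sum over tokens of relu(k) outer v
    k_sum = [0.0] * d                    # sum over tokens of relu(k)
    for k, v in zip(K, V):
        fk = relu(k)
        for i in range(d):
            k_sum[i] += fk[i]
            for j in range(dv):
                kv[i][j] += fk[i] * v[j]
    out = []
    for q in Q:
        fq = relu(q)
        norm = sum(a * b for a, b in zip(fq, k_sum)) + eps
        out.append([sum(fq[i] * kv[i][j] for i in range(d)) / norm
                    for j in range(dv)])
    return out


Q = [[1.0, 0.0], [0.0, 1.0]]
K = [[1.0, 0.0], [0.0, 1.0]]
V = [[1.0, 2.0], [3.0, 4.0]]
soft = softmax_attention(Q, K, V)
lin = linear_attention(Q, K, V)
print(soft)
print(lin)
```

The two functions do not produce the same outputs, which is the quality side of the trade-off: the linear version replaces the softmax weighting with a cheaper similarity.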
it is also a bit funny because they act like "we have a DiT which is super fast and does not need position embeds - nobody achieved that yet!"
although SDXL is using exactly such an architecture: convolutional backbone, no position embeds
SDXL is using real attention however
and that's what you see in the results. Its superior to Sana
If you want super fast results use Paella. its a convolution only architecture
its bad as Sana but its super fast
cause its using CLIP. I think we all learned now that you need a proper LLM as CLIP has no good text understanding
it might be possible to bootstrap T5 to it like Ella anyway
I sort of don't like judging image models by their text generation ability
I know some people need that, but it feels like a more niche usecase
Did you check the images in the project page or arXiv? Sana base beats paella by a very large margin and even sdxl in a few cases while being much smaller.
Maybe you are comparing it to fine tuned sdxl models, base sdxl is not very great.
Yeah true, even sdxl can get decent text rendering with enough training I believe. I think to judge a model, it’s more of a mix of prompt following, image quality, humans, and text rendering.
lol, whats the difference of fine tuned sdxl models to sdxl base?
its same model architecture
No I’m not talking about architecture, I’m talking abt image quality.
yes, if I make a new model and train it on better data I can beat other models that are trained on bad data
I know
and I say: their architecture is probably not good
that's all i care about 🤷 They won't beat any other good image model with that. i would rather wait for a next PixArt then
sana is the next pixart.
there is only one author shared between both?
anyways, seems a step in the wrong direction to me
I would have much rather had the exact opposite yeah
a model that is 100x slower than flux but with better image quality
I would be also fine with a Flux light
what SD3 is supposed to be when it ever comes out
or some new and cool architecture like MAR
the alleged samples of SD 3.5 2B looked great
I don't trust anything SAI post anywhere ^^° I will just wait until its released
yeah its hard to know
Hunyuan-DiT is slowly improving as well, they might yield a good model
the latest Hunyuan-DiT is not even that bad, so long as you do a tiled upscale or progressive upscale
it does need multiple passes
I don’t see anywhere that it’s worse than normal dit, it just says that they replace normal attention with linear attention which loses some quality but then they replace ffn with their mix-ffn which makes it regain the quality.
yeah, thats already ridiculous to me. Replacing attention with linear attention makes quality worse. Thats a good sign that linear attention is just not good for this task
using convolutional resnets can compensate that. Yes. We know. You can replace the complete network with conv resnets (see Paella). You will just not reach the quality of DiT
you see the problem in all their example images. They are full of errors and lack any global coherence
also their choice of making the VAE compression rate larger is a mistake in my opinion
generating images from latents using an L2 loss makes images blurry and low quality. To get around that you need either a GAN or a diffusion model. I think VAEs are usually trained with a GAN-like loss to mitigate this issue. But that has its own downside. You just get into trouble if you make the compression ratio too large
linear attention seems to be better for very small things, smaller than the size needed for 512x512+ diffusion models
the scaling trends for linear attention seem to be okay for small compute amounts but it then falls off a cliff
which makes it useful for certain things but 512x512+ diffusion models are maybe too demanding
Idk seems pretty coherent in the example images, not perfect but when are diffusion models perfect?
The vae part is true too imo, supposed to be comparable to sdxl vae which honestly is not too great compared to the 16ch vae. Can increase training speed and inference speed but still.
I found that SDXL and SD 1.5 VAE still does well if your final image is 4k+ resolution
i.e. tiled upscale
then the VAE limitations are minimised
at 1024x1024 scale, the SD3/Flux VAEs are much better yeah
Their ae paper https://arxiv.org/abs/2410.10733
We present Deep Compression Autoencoder (DC-AE), a new family of autoencoder models for accelerating high-resolution diffusion models. Existing autoencoder models have demonstrated impressive results at a moderate spatial compression ratio (e.g., 8x), but fail to maintain satisfactory reconstruction accuracy for high spatial compression ratios (...
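To put numbers on the compression trade-off, a quick sketch of latent sizes for a 1024x1024 image. The 8x rows correspond to the 4-channel and 16-channel VAEs discussed above; the 32x, 32-channel row is an assumed example of a high-compression setting, not a figure taken from this chat.

```python
def latent_shape(image_px, downscale, channels):
    """Latent (channels, height, width) for a square image."""
    side = image_px // downscale
    return (channels, side, side)


def latent_stats(image_px, downscale, channels):
    """Spatial positions and total latent values for a square image."""
    c, h, w = latent_shape(image_px, downscale, channels)
    return {"positions": h * w, "values": c * h * w}


configs = {
    "8x, 4ch": (8, 4),
    "8x, 16ch": (8, 16),
    "32x, 32ch (assumed)": (32, 32),
}
for name, (downscale, channels) in configs.items():
    print(name, latent_stats(1024, downscale, channels))
```

Going from 8x to 32x cuts the spatial positions by 16x, which is where the speed comes from, but each latent position then has to describe a 32x32 pixel patch.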
if the rest of the model was good I would feel okay with SDXL VAE
not ideal but would be okay
a bigger problem with SDXL was the lack of ZTSNR, which pretty much every new model fixes

its probably dropping rly soon
I think people will be happy with it when it drops (despite lots of memeing)
the previews looked good
I wouldn't mind if they trained for a bit longer if there is still benefit
previews?
they were on this server so they should come up in the search
user revealedinadream_70414 claimed to have sd3.5 images, so search that username
it was a month ago #🆕|sd3 message maybe #🍥|anime message and #🆕|sd3 message it seems, and a few more they posted were 3.5?? possibly
thanks! Sadly, there is no complex examples, but it seems model was undertrained a month ago, so it will probably take some more time
Those things are so weird to eat
Is there a new sd3 yet?
8 months ago...
Which model made this? It is Flux level hands
Also, what happened to the model from the paper, did it just disappear or what? It looked awesome
the 8B
Is the one in the api really that great? If so, why has it taken another 8 months and it's still not ready?
Ultra pipeline in the API in particular is very good yeah
they don't give it out because they struggle with money
yea that's basically like 8 years ago in AI years
seems like the plan is to just let 8B become obsolete, then release it
it was obviously ready to release 6+ months ago
8B seems kinda small today that we got flux 😁 @dusky thistle
But tbh it looks like the smaller the model the more efficient somehow, look at pixart for example, or sd3 (which was badly trained but for its size it was good in some aspects)
8b is big enough, i think
what flux has really done for us is put to rest the idea that we need models to be tiny for them to be adopted by the masses
no one cares about how small your model is
they can just quantize it
one of the arguments against releasing great/big models was also that we wouldn't be able to train loras
well, we can drop precision there too and still get good results, we can swap blocks and finetune 12b on a freakin 4090 (which i'm doing, it works)
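As a toy illustration of "just quantize it": symmetric absmax quantization of a handful of weights to 8-bit integers and back. Real formats such as GGUF Q8 or nf4 work block-wise with their own layouts, so this only shows the basic idea, and the function names are made up.

```python
def quantize_int8(weights):
    """Map floats to integers in [-127, 127] using one shared scale."""
    peak = max(abs(w) for w in weights)
    scale = peak / 127 if peak else 1.0
    return [round(w / scale) for w in weights], scale


def dequantize(q_weights, scale):
    """Recover approximate floats from the integers and the scale."""
    return [q * scale for q in q_weights]


weights = [0.5, -1.27, 0.003, 1.0]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)
max_error = max(abs(a - b) for a, b in zip(weights, restored))
print(q, scale, max_error)
```

The rounding error per weight is at most half the scale, which is why large models tend to survive the drop in precision reasonably well.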
Please generate a 1900*1200 sized wallpaper showing an artist's representation of a neuronal network.
Come and get your Balloon...
m
I like how there is always a camera lens hidden somewhere now
its like an easter egg
Solution = “looking at the camera” or “looking at the viewer” and Negative Prompt = “camera”
DecadeTW Auto Prompt
I saw that video last night but haven't gotten around to adopting it. Do you see many improvements?
Really clear and sharp. Its like the Refiner stage after the Base in SDXL
Cool. It looked pretty solid in the video. Do you see speed improvements?
Using GGUF_Q8 is always much speedier than Flux.Dev
Unfortunately I'm stuck with Dev at least until I get all these loras merged. 143 to go. Then I get to play around after quantizing.