#sd3
They probably picked those who stubbornly kept praising SD3 despite the terrible release. LOL
That isn't my question. My question is where the rumor of a 3.5 version started. Who said this, or wrote this and where?
Mix of Toy Camera, Blue Future and Wraith_BW all at Civitai (by andreac75)
supposedly Prem said things at AI conference San Francisco 2024, haven't actually seen any confirmation
ok, well, that is interesting if true. If they are rebuilding a larger or better version of SD3, it will be fascinating to see what they do
I mean, despite all the other SD projects, like Audio, 3D, video, etc., the image AI was always their central calling card
And if Flux has shown anything, it is that you can not only give a full monster 12B to the community, but still be able to market and monetize one of your own (AKA Grok 2)
If the new SD3.5 cannot match up to Flux - then perhaps SAI ought to go and do something else?
nice, trust me bro confirmation
@linaqruf_ @oron1208 Wait for 8b. It's basically Flux without distillation and heavy hands dpo. This should make it easy to finetune (and dpo).
We're also trying a new scaling down mechanism for mmdit, the new 2b is gonna work much better.
It won't and doesn't have to.
- SD3.5L is 66.6% size of Flux
- It won't be distilled and probably no heavy DPO
- It should be easier to train and run
- Better License
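A quick sanity check on the 66.6% figure, assuming it comes from the 8b count mentioned for SD3.5 Large and the 12B count mentioned for Flux earlier in the chat (both are taken from the messages above, not from any official spec):

```python
# Parameter counts in billions, as quoted in the chat (assumptions, not official specs).
SD35_LARGE_B = 8
FLUX_B = 12


def size_ratio_percent(small_b: float, large_b: float) -> float:
    """Return the smaller model's size as a percentage of the larger one."""
    return 100.0 * small_b / large_b


ratio = size_ratio_percent(SD35_LARGE_B, FLUX_B)
print(f"SD3.5L is roughly {ratio:.1f}% the size of Flux")
```

This only compares raw parameter counts; it says nothing about VRAM use or speed, which also depend on precision and architecture.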
Plenty of hints on Twitter that something is brewing, though Lykon seems more occupied finding all the faults in Flux than promoting SD3 these days
I'm barely scratching the surface of what Flux can do; SD3 also just makes, you know, too many bad pics to be worth the time
And to be fair, I've been hearing that something would be released for a loooong time. Seeing the weights is believing. But taking all the chaos SAI has been in into account, I understand things moved slowly or not at all, and only started again in the last month(s)
I feel the API of 8b looks nicer style wise (not all smoothed) than flux but generates many more flaws and has much less prompt understanding. Still, being able to actually use it will be nice
I wish them luck and hope to finally move from SDXL 
Wasn't that the 10th and 11th? Did it actually get "announced"?
Dunno - just saw this in passing ...
Prolly just hearsay, rumour, chit chat etc
Anyone still using SD3?
Lykon (@Lykon4072): Wait for 8b. It's basically Flux without distillation and heavy hands dpo.
sounds perfect
those are the two big problems with flux, the distil and the baked-in look
still waiting 
yeeees!
But we will probably get noticeably worse hands than Flux without overbaking (just a guess)
main problem is that at this point Flux has a huge support by the community
Yea, it won't be a problem
not really
only a few flux-specific nodes and tools have come out
would actually say the opposite, that flux doesn't have much tooling yet
flux seems to have better prompt following, text rendering as well but sd3.5 isn't distilled so some advantages
basically the community will get one more amazing choice, win-win for us.
i'm pretty sure tho, by the time sd3.5 large is released, black forest labs might release the text to video model.
SD3.5L has own advantages and if Stability release training scripts with weights and maybe couple of controlnets, it is gonna be a good start
the best prompt following is Auraflow V2 I think
not the latest Auraflow V3
it got a bit worse in that version
Really I hope so. Let's see in a couple of weeks
btw, do we have a good finetune of Flux already? Last time I heard, it was losing adherence
finetunes are much better now, there were a few bugs. there are already hyper/turbo loras as well
for Lora yes, for model I don't know
no big checkpoint yet
Loras seem to be very good now
mine yes
been having trouble with some Civit flux loras
some are very nice but some are very overfit
a couple break the image above a very low strength
I liked these ones
well because it turns out that loras also work very well with small network dim and steps
BTW I just won the Civ training contest for flux
wow nice
Does anyone know what this is? It is by the creator of realvisxl and I don't quite understand what that name means. Did he figure out how to reduce Flux's size down to 1b parameters?
https://huggingface.co/SG161222/RealFlux_1.0b
it seems the model is still training
if you look on the realvis civit page for SDXL he talks about it
its not cooked yet
jugger team are also cooking
version 1.0b of RealFlux finetune
and that's what I'm saying.. If big teams moved to flux SD3.5 must be really good
because training a model has costs
They move to flux cause SAI is tonedeaf and no one knows WTF the StableDiffusion future brings, don't really think it's about preference, it's taking the only sota model out there, despite the shortcomings
Where can I download sd3.5 ?
it won't beat it but it will be an alternative, its faster/smaller and has more knowledge. not as good as flux in many important things like prompt following/text rendering/humans but still a great alternative.
not out yet, will probably take a long time(months)
I really hope so
Why are people talking about it then ?
Some people got early access of the api for testing.
Ah got it, thanks everyone!
Congrats!
thanks
Ah, not me 😭
yeah looking forward to that, do you have any links for it? i wanna check it out.
no news yet
there was some big news today cos Fal.ai got funded https://blog.fal.ai/generative-media-needs-speed-fal-has-raised-23m-to-accelerate/
someone from Black Forest invested lol
I saw the reddit post, they are thinking about video gen as well, which is pretty cool!
https://www.reddit.com/r/StableDiffusion/comments/1dm1kpv/pixart_team_joins_nvidia/
there's also the lumina team
they released an LLM recently that can make images
so maybe after this they will make a diffusion model again
yeah the aesthetic was good but kinda took lot of time and imo not worth it.
This seems a very promising llm image gen method(model not released yet, but will soon): https://github.com/VectorSpaceLab/OmniGen
It's very impressive imo, its very small, just a measly 3.8b params and has no text encoder but supposedly performs as good as sd3 large in t2i.
The most impressive thing is that it can do reasoning, editing, step by step images, deblurring, and everything controlnets can do in just 3.8b params.
wow yeah looks good
So they're saying now that they will actually release 8b?
Id be thrilled if they did that
2b is... We all know... 8b is pretty fn good though
ARS Midjourney LoRA mixed with a little sci_fi_future LoRA
Yep just read through their paper and that looks 100% legit and 🔥.
The model shouldn't be super capable at that size, but if the architecture works (and this is an absolutely awesome-looking architecture), you should be able to initialize from Pixtral and get something similar with SotA results.
Open source image-to-video from CogVideoX just dropped!!! https://huggingface.co/THUDM/CogVideoX-5b-I2V
(They had text-to-video public weights, and they had image-to-video private weights demoed in HF space for a while, but the image-to-video weights are now open. Downloading and testing now.)
Steps are happening! No errors so far...
Flux allows use of living artists in your prompts (and training)
Might end up being useful for upscaling or AnimateDiff or something IDK.
Image generated by Flux (on local PC), and video generated by CogVideoX also running on local PC!
I wouldn't call it good or useful by any means, but I'm going to try some other subjects and see what I get.
Why am I suddenly getting ridiculous times for Flux???
Oh it's because of the low resolution for CogVideoX images duh.
Over 8 minutes per video... Oh well. I'm rendering two tests now: a censored test and a spaghetti-eating test. I might leave it at that, or I might try some others if I think of anything I really want to know.
Hey have you seen CogVideoX-Fun?
No what's that?
Its a modification of CogVideoX https://github.com/aigc-apps/CogVideoX-Fun
Its more flexible
I bet this model can be dropped in. I will try this next.
ImgToImg work?
Its based on CogVideoX, so if IMG2IMG works on CogVideoX then CogVideoX-fun should be able to do IMG2IMG as well
Oh. I mean ImToVid
It can do Img2vid and vid2vid
wow yeah this is a really great sci-fi building
the shape is so complex
1024x1024x49 sounds great
The finetune was made by Alibaba PAI and open sourced by them
the original was 720 x 480 so going to 1024 x 1024 is big
ye that worked ok
It needs deepspeed and 15 isn't compatible with windows. Will try manual install.
Success. Downloading models.
Can you free-up space by placing your Models Folder on a separate drive at all?
I can clear out HF's cache. And I can delete some obsolete stuff accumulated in my AI folder, like auraflow and infinigen.
Wow ComfyUI is WAAAYYYYY slower than the CLI! I have almost finished my breakfast and it's still rendering.
Maybe the resolution is different. I guess I'll find out when/if it ever finishes.
Video rendering needs meadow-sized RAM
Just wondering why Comfy is slower than using Python in the console. It's the same model more or less.
Console was 8-9 minutes. Comfy is already at 20 minutes and the GPU is still pegged at 100%.
maybe node isn't quite done right
I'm no expert - I am an artist with a soupçon of technical know-how!
I think I accidentally used a higher resolution. 480 vertical pixels for the console and 768 vertical pixels for Comfy. I thought the 768 was referring to the width.
the Ali one can go up to 1024x1024 though
I might queue a test for that before I go to work, but at this rate I'm going to have to cancel the current render.
Thanks. I felt the same way ... Have to see if it was mainly flux or the Lora doing the heavy lifting 🤭
I have to kill the Comfy render after 32 minutes running...
I need to generate base images and queue up some renders before I leave for work. Oh well. I'll try a lower resolution and hopefully I'll have some results when I get home.
Leaving a few test running: Volcano erupting, spaceship flying, superhero, swimmer, running, dancer. No idea how many will get done, but I'm out of time.
Your image looks great! Could you share your workflow?
The w/f is in the metadata of the PNG - click on it - then open-in-browser - right-click and d/load
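For anyone who would rather pull the workflow out with a script than drag the PNG into ComfyUI, here is a minimal sketch using only the standard library. It assumes the image was saved with the workflow JSON in a PNG `tEXt` chunk under the keyword `workflow` (my understanding of what ComfyUI's default save node writes) and that the metadata survived the upload; the file name in the usage note is hypothetical.

```python
import json
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def read_png_text_chunks(data: bytes) -> dict:
    """Collect tEXt chunks (keyword -> text) from raw PNG bytes."""
    if not data.startswith(PNG_SIGNATURE):
        raise ValueError("not a PNG file")
    chunks = {}
    pos = len(PNG_SIGNATURE)
    while pos + 8 <= len(data):
        # Each chunk: 4-byte big-endian length, 4-byte type, body, 4-byte CRC.
        length, ctype = struct.unpack(">I4s", data[pos:pos + 8])
        body = data[pos + 8:pos + 8 + length]
        if ctype == b"tEXt":
            keyword, _, text = body.partition(b"\x00")
            chunks[keyword.decode("latin-1")] = text.decode("latin-1")
        if ctype == b"IEND":
            break
        pos += 8 + length + 4  # skip header, body and CRC (CRC is not verified here)
    return chunks


def extract_workflow(data: bytes):
    """Return the embedded workflow as a dict, or None if the image has none."""
    raw = read_png_text_chunks(data).get("workflow")
    return json.loads(raw) if raw is not None else None
```

Usage would look something like `extract_workflow(open("downloaded.png", "rb").read())`. If it returns `None`, the metadata was probably stripped somewhere along the way, which is why downloading the original file matters.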
Got it, Thank you very much!
Lol
Audioreactive Video Playhead system, now with real-time MIDI control + 21GB of new timelapses, and SD configurations.
LK + UBridge + Smartphone → TDAbleton → TouchDesigner
You can access these project files, plus many more systems, tutorials, and experiments, through: https://linktr.ee/uisato
Nerdy Rodent, that u?
Does anyone here use JoyCaption?
I have been using it a lot as a local install lately to get descriptions for images with no prompt info for img2img.
I think it works great, its in alpha and blows my mind at how good it is, but was curious to see how it would work with different models.
Does anyone know how I could change the model or if that is possible with this type of thing. I think they are using a quantized VLM and not a typical LLM.
If anyone has any ideas or opinions please let me know. Or if you also use it let me know how your experience has been with it.
I will post the model folder structure below maybe that will help:
Images made using Flux + LoRAs into Ollama img2img
Is there any way to feed more guidance into that setup. I like it but I would like to do something more like this and it never seems to work with that IF image to Prompt node.:
Ollama, Florence2 and Jan.ai are my go-to prompt generators. I couldn't get JoyCaption to work!!!
You can try my workflow if you want. It has JoyCaption, maybe it will work for you. You may want to or need to make some changes to it. It is the first workflow I have ever made from an empty workspace. Also if it does work I would like to know how I can improve the workflow. I don't understand how all this works as well as some of you.
NYJY Nodes won't load 😦
I'm trying to find their github page. I remember I had to do something specific. I also had to translate most of the page to english, lol. but then it worked.
I needed all 3 of these:
That last one, like NYJY, won't load
also this was translated from their github page. I dont remember if I had to install pytrans myself or if one of those node installs did it automatically.
I think it also downloads the model on its own the first time so it may take a while:
Maybe you have to install pytrans
This is on the NYJY Page
I wish I could help you to get it working. It is pretty cool.
CXH-JoyCaption also does not load 😦
Oh well, its time to disable a whole bunch of nodes I guess until it starts to work ...
lol I do that too. Its a tedious process.
i don't think comfyui makes a great image tagging gui and people creating workflows for that are kind of wasting their time.
imo.
taggui among others exist. joy caption looks like a neat model but i've not seen how it's any different from WD tagger. I think it's trained on porn better. Hence the "inclusive" part of the description.
what is taggui?
https://github.com/jhc13/taggui simple gui for managing captions in a folder
van Gogh-y type stuff - Ollama img2img, with Flux output
Those look awesome.
Thank you. I bet this would work pretty good for tagging non ai art images too. Like say product images for Etsy or any ecommerce store.
love the whale tail in the shore. surrealism and van gogh? yes please
i use it for auto captioning, then cleaning those all up and manually tagging
has a bunch of models. i don't think joytag in it yet. has blip2, wd tagger, florence 2, and a few others.
i'm going to start experimenting with the other models a bit. you can prompt some of them and instruct them on how to describe the image, so if there are particular tag styles you prefer, that would help
the UI is suited more to tags. not as good for long natural language tags, but it still manages
Once you upload any image (except porn/gore) to fineartamerica, it tags and describes it for you
yeah there are many ways to do things. i prefer a ui that streamlines it all since i manually caption hundreds of images in a session
I wish there was like a text merge node. Where you could take a node that you put Specific LoRA keywords into, then you take that AI output text and combine them leaving you with the AI image description and at the end or bottom you have your LoRA trigger words. Cause getting AI to include LoRA triggers without changing them is like impossible.
pretty sure there are string concatting nodes
programmers don't use natural language. something intuitive like "text merging" is called "string concatenation"
WAS Nodes has concat
Algae
Can you please put the exact node name in the chat so I can search it in the node add section?
@noble coyote recommended this one. i've used it before was-node-suite-comfyui
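What a "text merge" boils down to, as a minimal sketch: take the AI-written description and append the trigger words verbatim, so the LLM never gets a chance to rewrite them. The function name is made up for illustration and is not the API of WAS Node Suite or any other node pack; the trigger word is borrowed from a prompt elsewhere in this chat.

```python
def append_triggers(caption: str, triggers: list[str], sep: str = ", ") -> str:
    """Join an AI-written caption with LoRA trigger words, leaving the triggers untouched."""
    caption = caption.strip()
    # Drop empty entries but never alter the trigger text itself.
    tail = sep.join(t.strip() for t in triggers if t.strip())
    if not tail:
        return caption
    if not caption:
        return tail
    return f"{caption}{sep}{tail}"


print(append_triggers("Zombies running and screaming through a city", ["m3t4ll0g0"]))
```

In a ComfyUI graph the same thing is done by wiring the LLM's text output and a plain text box holding the triggers into a concatenate node, then feeding the result to the prompt encoder.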
Vincent!!!
(He reincarnated and bought a 3D Printer and made another ear!!!)
also didn't catch syphilis so now he's calm and cool instead of manic and schizo
lol
Does anyone know if there is a way to get Ollama to list the models and allow one to be selected instead of having to type in the exact model? I have several and it's hard to keep up with the exact names.
ya if you're talking about in ComfyUI there are nodes that do that, the one you are using might not
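Outside of ComfyUI, a local Ollama server can report its installed models over its REST API, so nothing has to be typed from memory. A minimal sketch, assuming Ollama is listening on its default address and that the `/api/tags` endpoint returns a `models` list with a `name` field per entry; the helper names are hypothetical.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434"  # Ollama's default local address (assumption)


def parse_model_names(tags_response: dict) -> list[str]:
    """Pull the model names out of an /api/tags style response."""
    return sorted(m["name"] for m in tags_response.get("models", []))


def list_local_models(base_url: str = OLLAMA_URL) -> list[str]:
    """Ask the running Ollama server which models are installed."""
    with urllib.request.urlopen(f"{base_url}/api/tags", timeout=10) as resp:
        return parse_model_names(json.load(resp))


def pick_model(names: list[str], index: int) -> str:
    """Select a model by its position in the listing instead of typing its name."""
    if not 0 <= index < len(names):
        raise IndexError(f"choose a number between 0 and {len(names) - 1}")
    return names[index]
```

Calling `list_local_models()` needs the server running; printing the list with numbers and passing the chosen number to `pick_model` gives the exact tag to paste into a node. From a terminal, `ollama list` shows the same information.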
https://github.com/AIrjen/OneButtonPrompt/pull/224 oh cool this works in the new forge again
thats a really interesting extension, since the same codebase is the comfyui node too
Hahhaha
I have two Frenchies so i have a soft spot for images of Frenchies 🥰🥰🥰
I will let you have the prompt; and will make sure to do some more
Actually, d/load the images yourself, as they contain the workflow
Thank you, kind sir
Does anyone know if there is a way to get an llm to offload itself after creating the prompt so it can allow for resources to be used for the rest of the process like upscaling, facefix, LoRAs, etc.
Maybe a node with an unload model boolean for on = true or false
Placed after the prompt is generated. Any ideas are welcome?
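If the LLM is served by Ollama, one way to get the "unload" switch is the `keep_alive` field of its generate endpoint: as far as I know, a request that names the model and sets `keep_alive` to 0 tells the server to drop it from memory right away. A minimal sketch under that assumption, with hypothetical helper names; other backends will need their own mechanism.

```python
import json
import urllib.request


def build_unload_request(model: str, base_url: str = "http://localhost:11434") -> urllib.request.Request:
    """Build a POST that asks Ollama to unload `model` immediately."""
    payload = {"model": model, "keep_alive": 0}  # 0 = do not keep the model resident
    return urllib.request.Request(
        f"{base_url}/api/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def unload_model(model: str) -> None:
    """Send the unload request to a running Ollama server."""
    with urllib.request.urlopen(build_unload_request(model), timeout=30) as resp:
        resp.read()
```

Running `unload_model("llava:latest")` right after the prompt text comes back should free the VRAM before the upscale and face-fix stages start. Some Ollama nodes expose a keep-alive setting directly, which would do the same job without extra code.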
Yikes I am testing dalle3 and images really do look considerably worse than flux. Pretty good human anatomy though.
Those two emoji together make me think of that game where you see someone doing that below their hips and they get to punch you twice for looking. UNLESS ||without looking you know they're doing it and poke your finger through their ring, thus granting you right to punch them twice||
You're six months late...
anyone got some experience with flux hyperparams for character lora
i got 30 images, complex character, network dim 16 not enough to capture fur patterns
wanted to do LR 0.00025 with cos restarts and adam and just do lots of steps
but apparently people use way less steps and get good results
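To see what "LR 0.00025 with cos restarts" does over a run, here is a small sketch of a cosine schedule with hard restarts. The peak value comes from the message above; the step count and number of cycles in the usage note are made-up examples, and real trainers may add warmup or decay the peak between cycles, so treat this as the textbook shape only.

```python
import math


def cosine_with_restarts(step: int, total_steps: int, num_cycles: int,
                         peak_lr: float = 0.00025, min_lr: float = 0.0) -> float:
    """Learning rate at `step` for a cosine schedule that restarts `num_cycles` times."""
    cycle_len = total_steps / num_cycles
    # Position inside the current cycle, from 0.0 (just restarted) towards 1.0.
    progress = (step % cycle_len) / cycle_len
    return min_lr + 0.5 * (peak_lr - min_lr) * (1.0 + math.cos(math.pi * progress))
```

With, say, `total_steps=1000` and `num_cycles=4`, the rate starts at the peak, falls to half the peak at step 125, approaches zero near step 250 and then jumps back to the peak. Fewer total steps just squeeze the same shape into a shorter run.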
a happy batwinged frog playing a harp in the air
you need LLAVA not LLAMA
(the vision models are mostly LLAVA based or Florence from Microsoft.)
Llama is a text llm
ohhh
do you have one you would recommend that is similar to the one I have?
do they have abliterated models? I guess that is how they categorize uncensored.
I haven't seen any vision models specifically for uncensored image descriptions.
Let's dig here:
JoyCaption uses:
Both of these for this image description node
Does anyone get this when running Img2Img
WARNING: IFImagePrompt.IS_CHANGED() got an unexpected keyword argument 'image_prompt'
it can be done though
wrong variety of tomato
trying out this new schedule free support in flux
meta research's new big training thing
https://github.com/kohya-ss/sd-scripts/pull/1600 meant this one
woh. its fast.
I find that my setup "gets cold feet!" I ask it to make the images as if van Gogh had painted them ...
It comes back with a lame excuse saying it cannot do that, or I should modify my input etc.
I let it run a few turns, and then lo and behold! The van Gogh begins to show up.
I've had the same problems with DallE-3 - lame excuses, and after a few goes it complies.
I've always said "computers and software are like cricketers: they sometimes drop the ball!"
My answer to you: p e r s e v e r e
I've had results with llama, llava, zephyr, qwen ...
Qwen2.5, Zephyr, Llama3, Llava2 all work for me
I'd use Claude Sonnet but I don't want to pay!
img2img using #Ollama and #Flux in ComfyUI
img2img Ollama and Flux plus LoRAs
img2img Ollama and Flux plus LoRAs
there's not a bot
"Ollamba!!!"
i read that as "Ollambada" like the dance
Available on all platforms : https://bfan.link/world-beat
Kaoma - The Lambada (also known as Llorando se fue)
The full-screen HD official video of the worldwide #1 smash hit record from 1989
I like when it fakes small print
Eve had a cunning plan: if she could do that with just one apple, what would happen if she had one hundred apples?!
Ollambada etc
I'd eat her apple.
Har har!!!
this discord has higher file size limits than others
oh its cos its level 2 server boost
thank god for that 50mb
yeah I need it cos I never go below 4-6k any more
I'd rather wait for the slow generation times than go lower
A pirate-themed Furby standing on comically tall peg legs, depicted in the style of an oil painting. The Furby is of normal size but standing on exaggerated peg legs, with stormy seas in the background. The mood is dramatic, with dark clouds and turbulent waters adding to the pirate atmosphere, while the Furby maintains its characteristic fuzzy, round appearance. The scene captures a sense of adventure and whimsy, blending the quirky appearance of the Furby with pirate aesthetics.
Borrowed from DallE Theme of the Day
"Minimalist rugged oil painting in faded earthy blue hues, capturing delicate details in vast solid patches. A weary and anxious queen wearing a night robe standing near an open stained glass window in the high tower of a castle, holding a candle in a simple holder. View from out of the window. She gazes into the starlit night, as a distant search party of horse riders at the bottom of the castle run away on a dirt trail into the trees."
flux colours are so much nicer than SDXL
Does anyone know how models get on ollama.com?
I am really interested in trying this one but have no idea how to get it on ollama if it isnt on the website.
aimagelab/LLaVA_MORE-llama_3_1-8B-S2-siglip-finetuning
That reminds me I've been meaning to check out Pixtral 12b. I wish I could run it in Comfy somehow...
@icy drift Pixtral 12b is very good for its size, but this is a better choice: https://huggingface.co/openbmb/MiniCPM-V-2_6
It's similar in terms of quality but is considerably smaller, faster, and supports video.
@cursive frigate That model isn't really that great. Although it uses llama 3.1, there are still far better alternatives. It only barely beats llava 1.5 which is kind of ancient compared to models today. The above one I told is better and ollama supports it from what I see.
Can you send me a link for this one from ollama, I am having trouble finding it. The filtering and search on ollama is not great.
MiniCPM-V-2_6
If this is the one you are recommending that is.
Yeah here: https://ollama.com/library/minicpm-v
Thank you.
That model works really well. thanks.. I have to go make dinner. I'll post some images here later.
Nice model... thank you.
I love the R2D2s everywhere.
interesting window glass she's got her foot in
It's elastic transparent aluminum
Mix of Spandex, Aluminum and Lexan
too cute award!
Becoming harder and harder to tell, but there are still a few weird things here and there
ollama run minicpm-v:latest
My installation errors out - Error: llama runner process has terminated: GGML_ASSERT(new_clip->has_llava_projector) failed
minicpm 2.6 works well on my ollama installation, which version have you installed?
"ollama run minicpm-v:8b-2.6-q8_0" , take into account that this uses around 10gb of vram
There are other versions that use a bit less vram
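A rough way to see where the "around 10gb" comes from, and how much a lower quant saves. The bits-per-weight values below are what I understand the GGML block formats to work out to (8.5 for q8_0, 6.5625 for q6_K), and the 8B parameter count is read off the model tag; this counts weights only, so the vision projector, context cache and runtime overhead come on top.

```python
# Approximate storage cost per weight for two GGML quant formats (assumed values).
BITS_PER_WEIGHT = {"q8_0": 8.5, "q6_K": 6.5625}


def weight_gb(params_billions: float, quant: str) -> float:
    """Estimate the size of the quantized weights in GB (1 GB = 1e9 bytes)."""
    return params_billions * BITS_PER_WEIGHT[quant] / 8.0


for quant in BITS_PER_WEIGHT:
    print(f"{quant}: about {weight_gb(8, quant):.1f} GB of weights for an 8B model")
```

So the q6_K tag should shave roughly 2 GB off the q8_0 one, which can be the difference on an 8 GB card, though it still may not fit entirely once the overhead is added.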
... just loading it into my Ollama w/f ...
Error: llama runner process has terminated: GGML_ASSERT(new_clip->has_llava_projector) failed
I have it running now on 8Gb VRAM
When I say "running" - prompt box states "Failed to fetch response from Ollama" - so not running then! 😦
:C
Maybe you can try with this one (its a q6 quant) "ollama run minicpm-v:8b-2.6-q6_K"
I get an error message - llama runner process has terminated: GGML_ASSERT(new_clip->has_llava_projector) failed
Ollama and qwen2:0.5b
I was running Ollama v0.3.9 - now I've upgraded to v0.3.11 - minicpm-v:8b-2.6-q8_0 works well ๐
First fruits
Nice!
If you were trying to be more descriptive you could say what model in ollama, i know I'd be curious which models but im not going to keep asking you
just saying Ollama is like saying i used windows to generate the image
My goto model is llava2:latest; then llama3:latest.
Qwen2:0.5b is also cool; and Zephyr:latest
My latest addition is minicpm-v:8b-2.6-q8_0
ya i saw you/someone mention that, looked good, will try eventually also
Make sure you have Ollama v0.3.11 for minicpm to work!
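Checking the installed version against that minimum is easy to get wrong by comparing strings, since "0.3.9" sorts after "0.3.11" alphabetically. A minimal sketch that compares the numbers instead; the function names are hypothetical and it assumes plain dotted versions like the ones quoted in this chat.

```python
def parse_version(text: str) -> tuple[int, ...]:
    """Turn 'v0.3.11' or '0.3.11' into (0, 3, 11)."""
    return tuple(int(part) for part in text.strip().lstrip("v").split("."))


def is_new_enough(installed: str, required: str = "0.3.11") -> bool:
    """True if the installed version is at least the required one."""
    return parse_version(installed) >= parse_version(required)


print(is_new_enough("v0.3.9"))
print(is_new_enough("v0.3.11"))
```

The installed version can be read with `ollama --version`. Versions with suffixes such as release-candidate tags would need extra handling, because `int()` will reject them.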
minicpm gives a radically different look to llava2
you do a ton, but I'd also suggest trying llama finetunes, there are some better than the base, im still exploring but this one was a big improvement: ajindal/llama3.1-storm
Just installing this one ...
minicpm has great image-making - yet poor prompt coherence!
Someone said it was good for producing video
llama3.1-storm successfully installed
A candle burning in the vacuum of space, with cosmic swirls of color replacing traditional smoke and flame shapes. The candle's flame merges with celestial elements, creating a blend of fantasy and abstract art, with vibrant colors and surreal space forms dancing around the candle.
https://civitai.com/models/783736 for that cheesy 80's low budget sci fi look.
this looks sweet for recreating a more authentic retro sci fi style tbh.
Thanks! I always love seeing AI but with that retro cheese style.
Davinci Flux LoRA
William Mortensen LoRA
hi, I wasn't able to run the flux nf4 models (because AMD GPU and stuff I couldn't solve), I wonder if there is something new I could try
what's this ollama thing, something extra in the generation process?
Ollama runs llms or multimodal llms to help enhance prompts or convert images into prompts. It's pretty popular since it's fast, uses low vram, and is very easy to use.
thanks
how is it installed or used?
I believe there are comfy ui nodes for it, you can also use it normally, here are the instructions: https://github.com/ollama/ollama
The only "negative" thing about it is that most of it is basically just llama.cpp with a slight bit of code added to make it simpler, yet it's much more famous, even though they fully rely on llama.cpp to support new models and basically all the hard code but don't really mention it.
ty!
Oh! Llama
hey, saw this post a couple days ago, what are your thoughts?
from what I read that optimisation away from comfy has been around for a few months now
and it seems that it could be a nice efficiency saving
just nothing huge
hey guys cat bot here once again on my custom youtube modpack run episode 2
Well, he [is] a zombie, so the third arm is plausible? 🤷‍♂️
Steel Polished LoRA
Timeless LoRA
do flux loras tend to need a low strength?
the amount of Civit loras I find where they are burnt out at 1.0 strength is weird
I'm often setting it to like 0.1-0.5 at most
depends on how they are trained. I noticed that with 2k or more steps and 8 network dim they look overburnt, so with my new loras I use lower values. BTW I put the strength in every model I publish
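Why turning the strength down tames a burnt LoRA: in the usual LoRA formulation the learned update is a low-rank product that gets added to the base weight, multiplied by the strength slider. A toy sketch in plain Python, assuming that standard formulation (base weight plus strength times alpha over rank times up times down); actual UIs may apply Flux LoRAs with extra details, and all names here are illustrative.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def apply_lora(weight, lora_up, lora_down, alpha, strength):
    """Return weight + strength * (alpha / rank) * (lora_up @ lora_down)."""
    rank = len(lora_down)  # lora_down has shape rank x in_features
    scale = strength * alpha / rank
    delta = matmul(lora_up, lora_down)  # shape out_features x in_features
    return [[w + scale * d for w, d in zip(w_row, d_row)]
            for w_row, d_row in zip(weight, delta)]
```

Because the update is scaled linearly, a LoRA that was trained too hard has an oversized delta, and running it at 0.3 simply adds 30% of that delta. At strength 0 the base weights come back unchanged.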
it converges very quickly and doesn't over fit. i can't seem to give it a learn rate it hates.
well, low ones.
Some do... my own Magritte one is quite strong... needs to be around .60 in most cases. Similarly, my SpyWorld one is a bit strong and can work at 1.00 but mostly better at lower strengths.
DISCLAIMER: I have no clue how any of this works ... my LoRAs were done with default settings on either Replicate, Civit or local
Also I noticed that some loras at 1.0 strength tend to draw too many things in a picture
Fantastic Realism LoRA
Above is 1.0 strength, Guidance 3.5
Strength .80, Guidance 4.0
Ollama Flux Fantastic_Realism LoRA
Ollama Flux Fantastic_Realism LoRA
If I use the trigger on my LoRA, it kicks into overdrive LOL. Same strength, same settings as before, just added the trigger and goodbye prompt LOL
DallE Theme-of-the-Day Prompt = A surreal, imaginative scene featuring an AirPod Pro floating in mid-air with glowing sound waves emanating from it in multiple directions, creating an entirely new dimension of audio. The sound waves transform into vibrant, swirling patterns that ripple through space, merging with abstract, colorful landscapes that represent different sounds and environments. The AirPod Pro is white and sleek, with its details highlighted by the surreal light and spatial elements around it, giving a futuristic and immersive effect.
Fabulous Realism LoRA on a van Gogh style image
more people added into sd3.5 testing, dpo soon i hope
hopefully, I'm really looking forward to SD3.5
need an undistilled model with 16 channel VAE
Desolation and Lines LoRAs combined
Victorian Gothic Horror LoRA
Anatomica v9 LoRA
more Anatomica
wonderful
can you try the prompt adding "Metal Logo" ?
hang on
prompt was m3t4ll0g0 Text : "WHOA". metal logo. zombies running screaming
interesting, I'm too far along in my current multi-concept lora to restart it with a new scheduler, but I'll have to check that out. It reminds me of early XL and prodigy, which converged way way faster than AdamW
i'd run some smaller tests before dedicating time to it. i did a 200 image dataset and it took it in well. mostly doing a lot of 30 image sets.
i wish civitai had an easier way to manage versions. this is gonna be a mess with all these quants in a few months.
I scrolled through a bunch of new messages really quickly
I could see them
The whoas

A surreal landscape depicting the equinox with the Sun positioned exactly overhead, casting minimal shadows. The scene is inspired by the mystical and symbolic style of Alejandro Jodorowsky's 'The Holy Mountain.' Elements of surrealism and esotericism are present, with abstract shapes, towering mountain-like structures, and enigmatic figures standing in meditative poses. The colors are vivid and otherworldly, with golden, deep blues, and crimson hues blending together. Rays of light emanate from the Sun, creating an ethereal and almost divine atmosphere, reminiscent of a dreamlike and spiritual world.
This girl is the amalgamation of all girls.
she is nobody and everybody
Quigglestink!
guys, does quantized Flux work with controlnet?
Anyone test this? dev one will come a bit later it seems(still in training)
samples are up https://civitai.com/models/788550?modelVersionId=881836
Yeah samples actually do seem pretty nice, but not sure if it still has as good prompt following and text rendering as normal schnell.
I might switch to this model
cos I already liked Schnell's compositions and layouts more
Yeah schnell composition was more creative than dev and even pro's I believe. Not quality tho but realflux might help
I can't handle Schnell's tendency to make foam-noise out of details like building windows and flower fields, and I already get 6-step renders out of Dev with the Hyper lora. But this is 💯 the best looking Schnell output I have seen so far. For close-up stuff, I bet this is good enough. Downloading now to see what it can handle at 4 steps.
Here's a 4MP image with this model generated in 28 seconds with 3 rounds of 4 steps each. Notice the grainy mess of the food products.
Here's the exact same prompt and workflow using the Schnell base model. Notice the crisp definition of every item on every shelf.
you need distilled cfg, did you use that?
I think this model was finetuned for realistic textures, and in the process it lost some general object knowledge.
You need the optimal settings, try it with this
Euler Beta
Sampling Steps: 4-6
Distilled CFG Scale: 3.5
CFG Scale: 1.0
Yeah I followed the directions and used the same for the base Schnell, just for consistency.
Also, I was only testing the 4-step performance, because I already have 6-step dev.
I wonder if there's some specific subject where it could outperform. Hmm.
Oh interesting, both seem to have cons and pros. Real flux's looks better, the cart is not weirdly opened and a weird shape, the human behind is pretty weird(big head but small legs) but some objects behind in realvisxl's are mushy.
It does look better than normal schnell for sure but doesn't fix all flaws. Can you try text with it?
Trying now with base, and then I'll try with real. This is the last test I'm doing though, gotta get some editing done.
Target text is: "Does this mawashi make me look fat?"
Base model:
Yep, thanks for testing!
In this case I think it got a better result overall, and the texture is much more realistic. The win goes to real. Gotta edit though. See ya.
To upgrade pytorch + cuda do I need to be in the ComfyUI_windows_portable folder? Or do i need to be in the python_embeded folder?
its realvis model
they are always only for photos
Seems considerably better at text as well so its actually pretty nice. Probably going to replace hyper for me when a nf4 comes out.
yeah 2nd one seems much better
Nice!
thanks
What's the best way to copy JUST the style of an image?
Cubiq's ipadapter nodes within comfyui
I'm not sure where else to ask this... It seems like the perfect place...
Does anyone know if Nerdy Rodent has a discord server?
nope. just his twitter account and his youtube channel
he's pretty good about responding on twitter though
Thanks @topaz valley and @dusky thistle https://github.com/cubiq/ComfyUI_IPAdapter_plus/tree/main/examples
I'm just going through them now, do you know which example? Kolors and image didn't do much
To me it doesn't seem like it does anything more than copy the image like a noise / blur, it's not matching the style.
The top is the style I'd like to copy
Okay so for example:
The top left image is the style that I would like,
It's not BAD but is that the best quality I'll get?
Auto 11 is the way to train a lora?
Yeah I know, that's why I'm confused.
Yes, pixel art.
But I think I'm just going to train locally.
Is the traintrain good?
I don't know what idk civit :goat: means
civitai?
metropol parasol
Schnell
well, well, well
Vaguely abstracted modernist oil painting in an expressionist painterly style. A young child stands by the glass wall of a zoo, a gorilla sitting on the other side in its leafy green enclosure. They both hold up a hand to sign a kind "I love you". Subtle imperfections and splattery effect. Bold textures.
I hope they cook something good, sd3 was very disappointing.
SD3 2B is a very good model. the only reasons it was disappointing are 1. people didn't bother to learn how to use it and 2. it isn't unet, and has a couple of core issues. however, it is much much better than flux, which is seriously broken, extremely rigid, and massively overfit for several concepts to mask the same core issues that SD3 2B has
yet people are falling all over themselves to use flux because robin HID those issues and they haven't noticed them
I respect your opinion.
i did the testing - and the work - to drill down and figure out why sd3 2b has the issues it has, and just spent hundreds of hours and the last month walking through flux's latent space. it's not an opinion, it's hard facts
Sure
i honestly don't care if you believe me or not
i do agree with what you say, hope sd3.5 won't flop, distilled models are mid
we'll have to wait and see, won't we?
appreciate your fondness for sd3, however can we tone back on the constant trying to "win" others over? definitely okay to have differing opinions. @craggy crest
i'm not trying to win anyone over
too late for that
yes, which is why I used quotes. A lot of your history here is arguing that others are wrong, and you are right. So, I'm asking to tone back so the environment is nice and clear, @craggy crest
we'll have good movies now ?!

fruit 
some of the stuff that is known not to work on Flux because it's distilled might actually work
I'll post a miku in anime
SDE and ancestral sampling work now if done right
and it's possible SAG/PAG will work if you tonemap the inevitable CFG burn away
there might also be a sneaky way to get something like tiled control net without training one
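One possible reading of "tonemap the CFG burn away", as a minimal sketch: apply classifier-free guidance as usual, but compress the size of the guidance step with a Reinhard-style curve so that a very high scale cannot push the prediction arbitrarily far. The function name `cfg_tonemapped`, the `max_norm` parameter, and the use of plain Python lists in place of latent tensors are all assumptions for illustration; this is not taken from any particular sampler or node.

```python
import math

def cfg_tonemapped(uncond, cond, scale, max_norm=1.0):
    """Classifier-free guidance with a Reinhard-style cap on the guidance step.

    uncond, cond: equal-length lists of floats standing in for the
    unconditional and conditional noise predictions (hypothetical stand-ins
    for real latent tensors).
    """
    # Standard CFG step: scale * (cond - uncond)
    delta = [scale * (c - u) for u, c in zip(uncond, cond)]
    norm = math.sqrt(sum(d * d for d in delta))
    if norm == 0.0:
        return list(uncond)
    # Reinhard curve: roughly linear for small norms, saturates toward max_norm
    mapped = max_norm * norm / (norm + max_norm)
    factor = mapped / norm
    # Same direction as plain CFG, but with a bounded step length
    return [u + d * factor for u, d in zip(uncond, delta)]
```

With a small guidance step the result stays close to plain CFG; with a large one the direction is preserved while the step length stays below `max_norm`.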
so many issues
can't you just use a1111 
I actually tried the other day to read a1111 code
it's very confusing
(every piece of code is confusing to me)
and did you find anything?:D
yes. and turned the information in to the developers
it's secret? :D I'm just nosey

diffusers is probably the nicest overall code base out there for this
A1111 is essentially legacy code at this point, it's not really made well for scaling
there are UIs that run on diffusers too like SD-next and Invoke, it's not always command line
sd next is pretty cool
UI is questionable but
kinda cool
it runs a tiny bit better with zluda than a1111 on my pc, not anything worth using still 
Yeah sd3.5 8b already looks good, and it's not even fully done training yet from what I have heard. Hope they release the 8b one instead of 2b this time.
SD4.0 in a couple of weeks.
Not sure if I am doing something wrong. Maybe someone can load up this workflow and check it out, but these images seem to be grainy or pixelated.
Any advice here would be appreciated on this.
I won't run that, looks like an upscale only, so without the original image i can't say
And yes. It's a bad upscale
This is the original that I am using to do img2img with.
Here is the base image it spit out for this example
here is the bad upscale.
Do you want to change the original that much?
Imagine: chotta bheem white dress
It's better, right? i'll try to build a workflow tomorrow
looks like SAI are a going concern I guess
i want to keep the colors, lighting, and subject characteristics close to the original. the only real change i want is for it to go from a rendered cgi image to a realistic looking photo if possible.
the images i produced so far are not really the end goal. I just couldn't figure out the upscaling issue with that workflow. It's one I got from Nerdy Rodent
could you tell us a bit more about
the img-to-img method you used
and then the upscale method
i left it embedded in the images. I'm already off the computer for the night. I'll upload the json for the workflow in the morning.
ah ok, I don't need the JSON, I can get the workflow from the image when I have a server up
we're in a rough spot for upscaling at the moment due to a lack of good Flux control nets
so the choice is to either use Flux with no control net, which requires re-rolling tiles a lot
or to use SDXL with SUPIR (the best control net around) or other SDXL tiled control nets
if the image is very large (and so the number of tiles is high) then SD 1.5 can be good too
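To make the tile-count point concrete, here is a small sketch of how the number of tiles grows with image size in a tiled upscale. The helper `tile_grid`, the 1024 and 512 pixel tile sizes, and the overlap values are illustrative assumptions, not settings from any specific upscaler or workflow.

```python
def tile_grid(width, height, tile=1024, overlap=128):
    """Return (columns, rows, total tiles) needed to cover an image.

    Adjacent tiles share `overlap` pixels, so each extra tile only
    adds `tile - overlap` pixels of new coverage.
    """
    stride = tile - overlap

    def count(size):
        # Integer ceiling of (size - overlap) / stride, with at least one tile
        return max(1, -(-(size - overlap) // stride))

    cols, rows = count(width), count(height)
    return cols, rows, cols * rows

# A 4096x4096 target with 1024px tiles (assumed SDXL/Flux-sized tiles)
print(tile_grid(4096, 4096))
# The same target with 512px tiles (assumed SD 1.5-sized tiles)
print(tile_grid(4096, 4096, tile=512, overlap=64))
```

Every tile is a separate generation, so without a control net each one is another chance for a bad roll, which is why re-rolling becomes painful as the grid grows.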
with flux even memes are high quality
I started over and I am working with a new workflow. It seems to be working so much better. Hopefully going forward we get some better upscale models.
Anyone got an IPAdapter for Flux - but not X-Flux - as my VRAM isn't up to it?!
Until Matteo makes one, I am out
A close-up, intense view of a baseball catcher signaling a low curveball, with the focus on the right hand giving the sign. The catcher's fingers are clearly extended downward, hidden behind his legs, showing two fingers as he discreetly calls for the pitch. His left hand holds the mitt low to the ground, but the real attention is on the precise and subtle movement of the fingers, communicating strategy in a tense, pressure-filled moment. The dirt-covered ground and beads of sweat on his hand add to the intensity, while the shadowy atmosphere heightens the focus on the sign itself.
I tried this in SDXL and SD3 base models and they were a big fail... Flux does it properly
This one, not so much
anyone seen the blueberry model on: https://artificialanalysis.ai/text-to-image ?
what did you try
I just see blurry

Exactly that ...
Prompt is:
smooth color gradient, representing as many colors of the spectrum as possible
SDXL and SD3 generated a muddled mess...
I am sure it is a Skillz issue though
where do u get blueberry?
show the one you got from flux
you can go through the https://artificialanalysis.ai/text-to-image/arena for the actual arena but it's on the leaderboard tab, it's at the top, above flux pro
The one I posted was from Flux...
4 second generation time with cfg would be 6-8b model, maybe sd 3.5?? There's two models on there both close in elo so guessing they are doing a/b testing
I thought that was a fail it looked ugly 

sample generation
doggo!
now this is cool @sacred jewel 
these are flux dev + realism lora only?
also using ClownSampler from my repo which is using a heavily modified version of the refined exponential solver (RES)
Is it publicly available?
yep, in my res4lyf repo on github
i installed it but it does not show up in comfy
Alright first time using comfyui, lets see how it works.
This is a bit freaky... It had nothing to do with the prompt I initially gave it or the image I put in for img2img...
I call it the demon llama and the monk.
lol
Although this will come across as COMPLETELY obvious, stock images of basic stuff are going to just die as a market
I needed an image, on a clean background, of random stacks of books. Random sizes, colors, and age
Flux Pro:
Ideogram 2.0:
(among many for both generators)
I mean with such ease, why would anyone waste time or money on stock images of such?
I show both, not as a competition between the two, but to show that any top generator can do the job
Nice
errors at the console on startup maybe?
they stay red and it's not in the manager yet
I'm trying to run the flux controlnets too and they don't work. do they work with the schnell fp8 version?
With Adobe, they try to guarantee that their Stock Photos have not been based on copyrightable material
Most AI, on the other hand, has shedloads of copyrightable material behind it ... !
Have some furry @sage burrow 
Simple answer - Copyrights. lol, money was the reason
You can save $15 and nobody gf, but a business can't afford such a risk. For them such a mistake could cost $15M
Nude
@errant dust have you seen "The Billion Dollar Code" on Netflix? Opposite example though - Google stole Planet Earth algo, and then won lawsuit vs German founders.
I've been offline for a couple of weeks, what did I miss?
and why aren't there really many new flux loras and checkpoints? I made a few just to prove it could be done, hoping that others would create a bunch
Apartment building on the street with beige clinker bricks, neighboring buildings with red clinker bricks
A charming chalk drawing of a futuristic spacescape, featuring a campsite with tents, sleeping bags, and outdoor essentials, the sky is a glimpse of outer space with stars and comets. The landscape radiates warmth and comfort, bathed in a golden glow that entices viewers to explore its hidden secrets. Looming over the campsite is a sleek, modern space station, connecting to the lunar surface via a shimmering energy bridge that glows with life.