#🏞|general-with-images
1 messages · Page 166 of 1
/generate
Create a beautiful fertility hospital logo realistic
me over here on a a770 generating hunyuan video gens
they must've fixed hunyuan's vae decode tiled for native
they added temporal size to the native node
Oh right, was meaning to try that one as well. Could you share the workflow you use? Or the git to the nodes needed?
The nodes are native.
Also, need to one day start messing with comfyui on deck as well 
it's a unet model so you load it with diffusion model loader
that alongside emptyhunyuanlatentvideo
that's in the workflow on the blogpost that comfy just made
and modelsamplingsd3
Could you share the workflow and link to the model you use? As i can then go from there 
You could also just go to the comfyanonymous examples and get it
because it's there
Oh,. there, ghotcha xD Was looking in the announcement channel for some reason lol
here's the workflow. you'll need to open this in the browser, then just save it and load it into comfy - dont' try to save it right from this post - click on it, open in browser, then save it. https://blog.comfy.org/p/running-hunyuan-with-8gb-vram-and and follow all the instructions in the blog post for models, vae, etc
If you are running on consumer hardware, utilizing hunyuanfastvideo's lora alongside a q4 gguf model is recommended imo
i'm on a 4060, and the models used in the workflow work... okay... but are fairly slow. so gguf might be a better way to go
thanks :) Cause i've been trying to test the new longer 10 second 1.1 model of gguf, but still can't for the life of me get it to make videos for longer than 2-3 sec 
you running out of vram?
Nope. Just turns out to be that short
i can't do longer than that with 16g vram as i run out
currently interpolating a tv show episode, so gonna try in roughly 30 min
what parameters do i change to make it longer?
Just total frames?
total frames
Gotcha. Gonna test afterwards and report back :P
The model was trained on a framerate of 24fps
so it must be 24+1fps
an endframe
max i've done is 384x256 at 121fps
also look at the blog post and try changing the various settings if you do run out of memory
you can always change the speed/FPS in davinci resolve if you need to
Fps i change with RIFE node. Or just pull it through flowframes
it also doesn't work well at an AR that isn't what the comfy workflow is using
at all
Same thing i'm doing here actually. Just either do within workflow, or separate with flowframes
Comfyui has RIFE and Film VFI as nodes.
I combined that alongside a 2x fast compact upscaler (nomosuni compact 2x)
Oh man, gif dithering xD
That isn't a gif
also
the original video is 512x384, so yeah
dithering gauranteed when both interpolated and upscaled
that happens when you change the aspect ratio. it doesn't like most AR
Huh, send me that workflow, wanna see if i can make it less artifacty out of curiosity :P
this isnt true, like LTXVideo it comes down to total prompting.
Hunyuan video has a spatial decoder that allows it to handle much lower resolutions
AR is fine as long as it is nearing 16:9 or letterbox ratio
idk about 9:16
it's not? that's odd, since that's what happens with EVERY SINGLE TEST i ran yesterday with an AR that wsn't what is in the comfy demo workflow
And then here's me at low resolutions generating coherent walking animations
but you know best, as always
even at 384x256
@dry blaze i'll be on the L3 discord if you need me
Lol he can't defend himself at all
Every time, man.
Here's me generating zombie knights
Is there a tool out there similar to tensorRT, but for models that isn't SVD to speed up generations for images and videos? Like animatediff, gguf and hunyan?
As at least with images, other faster processing nodes exists, such as AItemplate, but it seems to be broken on every new confyui version for some reason. but curious if aitemplate can be used with older video models
No idea what the L3 discord is
Don't bother.
Literally every time I've conversed with that man it becomes an argument.
I'm not the only one either.
Here he comes.
mostly because you're wrong every time.
You leave every time without affronting your argument in any way
love how you argue with the programmers
I don't care that you can code. I clearly am getting outputs that are operational at lower resolutions
You can't disprove that, so leave and go to L3.
Oh no.. what did i start lol
torch.compile
Oh well. Well, i'm not knowing of the discord he mentioned, so i'm stuck here anyways 
He's generally disliked anyways 🤷♂️
For aitemplate?
I've had DMs from other people regarding him
you started nothing. i tried to help you, the resident troll lord jumped in to start a fight and arguement per normal
I'm a troll supposedly
If you actually had any argument maybe you'd stay a little longer.
But you don't.
no its an alternative
if you are able to, using deepcompressor for 4-bit quantisation is either the current best or close to it
torchao int8 AoT compile is also very strong
holy crap, i need docs on how to do all of that, and read a "explain like i don't code" article of what each is/does, as i appear to be quite out of the loop xD
so long as you have pytorch 2.5 cuda 12.1 or 12.4
and the model is a DiT not a Unet
then torch.compile is easiest and fastest to do
Quant part i know at least. Making lower bits of it to save space and time by making it "less "lossy", and more "sacrifice slight visual for speed and size"
cuda moment
i cry inside
oh yeah without cuda things are rough
Crap, outdated pytorch. 2.3.1+cu121
Especially when you try to run comfy on steam deck via distrobox because arch xD
What can you quantisize? Any model? And how small can you quant them to? As if i can quant to any smaller, i wanna have some fun xD
Also, just realized, sorry for mad multiping 
flux 1.58 bit just came out like an hour ago lmao
you can quantise most models in fact
Oh sweet! I guess i know what i'm gonna have fun doing xD
What's the main difference between these though?
deepcompressor has a special new method for dealing with the large amounts of outliers you get if you quantise to 4 bits
Oh neat! Though, it failed. Host key verification failed. upon trying to install with poetry
I appreciate you guys working on video 🙂
They have released a vid -> vid but its not great from what I see
zebra
that is a car
Pretty cool
test - @fresh venture (fast)
-# Create, explore, and organize on midjourney.com
Hey @worldly quest
Use the buttons below to subscribe, or to link an account that already has a subscription.
[Please check the bottom of the channel for more information.](#🏞|general-with-images message)
My pizza emoji is so hot it became lava 🔥
-# flux.1-dev Q8, open-genmoji lora
Total chaos. Scattered data, messy inventory, delayed orders, and stressed logistics
So Long, and Thanks for All the Fish
Dragon flies cinematic effect colour splash
Create a similar image
New technique / AI Flipbook style animation - https://www.youtube.com/watch?v=e-F7rtctxHs
Technique consisting in a new synthetically trained AI model [FLUX.D LORA], a little bit of Python, and some human[?]made editing.
You can access this LORA as of today through @civitai, and full project files [1760 images + prompts + Py files] through: https://linktr.ee/uisato
#animation #ai #design
New year tree
@night crescent found this mentioned in the project's readme. maybe this offers insight.
oooh, thanks, that was it
Photo via Kodak portra 160,young pretty girl, 20 years old. She has blonde hair, blue eyes, pale skin. Split into four images, Shot of different angeles, white background --style raw --v 6.0
this isnt midjourney lol
If I wanted to draw this image, can someone sugest a prompt?
realistic or anime
but yeah the prompt i used:
masterpiece,best quality,amazing quality,very aesthetic,absurdres,newest, 1girl, short black cheongsam, short sleeves, red with gold branch floral pattern, red floral pattern black mask, black gloves, high heels, sitting on table, crossed legs, black hair, hand near head, hand on knee, side view, cherry_blossoms,
didnt use the tag latex or plastic that your image has because that would generate a whole nother image
could modify the dress length and other details as you see fit as i feel like a normal one was a bit too long and short a bit too short
wow!
Hello boyz und girlz
CC-Super is now available on CIVITAI
https://civitai.com/models/1102583
Crie uma imagem Mulher poderosa de 27 anos, loira, cabelo médio em ondas suaves, vestindo vestido de seda azul-marinho ou preto. Postura confiante, com uma mão cruzada ou segurando bolsa minimalista. Ambiente luxuoso, pisos de madeira polida, paredes cinza escuro com detalhes dourados, luz suave entrando por grandes janelas. Móveis elegantes, mesa de vidro, pintura abstrata na parede, transmitindo sofisticação e exclusividade.
it took me 27s to generate a picture like this in 7900xt, i think it's pretty good
help me to make a cartoon cute image of the parrot in the picture
Create a picture of elephant
create a picture of messi
@night crescent something like this?
some lora's dont have a trigger word they just work
but if your using illustrious
try using: toriyama_akira_(style) (see msg below)
i know, i was just saying one of them did use one
toriyama_akira_\(style\)
it works with akira toriyama on illustrious, but it's not as close as the images i see on civitai
could be the model
used mistoon_anime for that one
used WAI-NSFW-Illustrious one for this one (same prompt)
the DBSbroly model looks nothing like any of the pictures i see
Dragon Ball Super Broly Movie (Series Style) [Illustrious] was made on the wai nsfw one and you gotta copy their settings
since it's not exactly toriyama there, it's a specific style, and the lora just doesn't replicate it, so i dunnno what i'm doing wrong
tried using the same prompt as the xemple images too
like euler A or
remade this image but instead of Eular A its now DPM++ 2m + Karras
so sampler does influence it a lot
yeah but did you try the DBSbroly model?
ah didnt download it. i get the style using their artist style name (changing up my main prompt for it entirely) you get pretty close
yeha without a lora i get something ressembling the style a bit
but the DBSbroly style is more unique, and i can't replicate it with the lora
like i'm trying to do cloudstrife with it, and it looks like this :
i used the same prompt as preview image (positive and negative), and just removed the character description to replace it with cloud strife
i think its just too syleized on the sayans in the description any image in the gallery below isnt even close to their main examples if it isnt a sayan
yeah you're right, i got fooled by the special effects
i'm trying the other ones right now
this sounds like bad advice but generally if any lora is below a few hunderd downloads i lower my expectations quite a bit unless its extremely niche
i've got good results up until now, but dragon ball, no deal, lol
thanks anyway!
managed to get something good with the manga style art lora, but the clothes are nowhere near what cloud should wear, even with the correct prompts, mmmh...
i made a model for emoji generation but its based on sd 1.5 not flux or sdxl or anythign so its not quite as good but im trying to optimize it to run locally on mobile devices (like apple genmoji but for noto emojis)
unfortunately android development is hard and i cant figure out how to quantize the weights
/analytic visible:3d风格一张图以红色为基调,上面祝福语是大吉大利周围装饰图标金元宝
3d风格一张图以红色为基调,上面祝福语是大吉大利周围装饰图标金元宝、红包、烟花、梅花
Thank you for using comcom analytics.
"comcom analytics" supports all community managers (moderators and server owners) by stats, visualization, and analytics.
If you have any questions, feel free to ask us!
Your dashboard
Help
Support server
Other languages
en: help
ja: help Japanese
Also huge, starting price on a 5070 is very reasonable vs release of the 3070 price
Thank you for using comcom analytics.
"comcom analytics" supports all community managers (moderators and server owners) by stats, visualization, and analytics.
If you have any questions, feel free to ask us!
Your dashboard
Help
Support server
Other languages
en: help
ja: help Japanese
is there any model that generate images like these
and what type of images are these anyways? like what kind of paintings
i've seen them being used quite alot in philosophical reels and things. I find them quite soothing to see
love these birds 😄 and the fish is cool
scalpers are going to decimate that supply 😦
Im gunning for a founders edition
yay Tifa. 😄 This image is actually pretty good because the background has a very midgar-feel to it with the industrial look. And her character face came out pretty well. I didn't think flux knew her very well.
Good luck! You might have to camp-out somewhere a week in advance. 😛
Stores in the netherlands dont got such a fierce competition luckly
ah right on
thank you 😊
these are neat too, whimsical...
I swear it looked like it was moving.
Hmm.
I think I have one major issue.
I need an upscale model that doesn't cartoonize it.
nope this one https://openmodeldb.info/models/4x-Nomos2-hq-atd
the Helaman ones are good
@glacial tinsel if you want results that are somehow "more realistic" than this, you're likely just looking for a particular photograph aesthetic. not realism
You could try prompting using words like: unsaturated colors, sepia, impressionistic, post-victorian, sketch, painting, brushstrokes, etc. Give it a try!
Assuming you are doing a second denoising pass after upscaling, I recommend trying 4xUltrasharp. I’ve done comparison tests on a number of different upscaling models with Flux and always keep coming back to Ultrasharp even though it is old and not as fancy as the others. It just seems to strike the right balance between adding details without unpleasant patterning, suppressing oversharpening of high contrast edges, and avoiding both boosting film grain and suppressing it so much that detail is lost.
I also recommend not using the tiled VAE encode/decode nodes that are in core — they have a problem with inconsistency between the tiles that varies from run to run. The custom node package “Tiled Diffusion & VAE for ComfyUI” contains what appears to me to be a better implementation, at least with fast mode turned off. The quality also degrades much less when you decrease the tile size than the core implementation.
VAE tile decode nodes have an issue with consistency between detiling
what model were you using here?
it causes visual artifacts
I've noticed that when running both hunyuan and ltxvideo
A curious ermine weasel with sleek brown fur walks gracefully along a winding rocky pathway. The path is lined with moss-covered stones and small boulders in varying shades of gray and earth tones. Dappled sunlight filters through overhead tree branches, creating dynamic shadows on the ground. The weasel moves with characteristic quick, fluid movements, occasionally pausing to investigate its surroundings. Its long, slender body weaves between the rocks while its alert black eyes scan the environment. The scene has a natural, photorealistic quality with soft depth of field. The lighting is warm and natural, suggesting late afternoon golden hour. Small ferns and arctic wildflowers peek through the rocks, adding touches of green and color to the scene. The weasel's fur appears slightly glossy and detailed, moving realistically with each step. Camera angle: low to the ground, following the weasel at its eye level, creating an intimate perspective of its journey.
LTXVideo 0.9.1
I need to find a way to pull better prompt comprehension and motion coherence out of the model
Stable Diffusion 3.5 large
The custom node I mentioned does much better than the core node. I tested both by fixing the seeds of both first and second stage and just changing the tile size of the tiled VAE decode at the end. Varying between 512 and 1536 in increments of 256 and running an image comparison routine on the results (subtracting one image from the other and multiplying the results by 10), I found the custom node to give almost identical results, only varying slightly on various details spread around the image — the differences are difficult to see even with a flicker test between two images. The core node gives much greater differences, both in the details and in large scale aspects like tone and brightness of each tile. Using tiled encode and decode is important to avoid memory issues when upscaling to large sizes and doing tiled denoise, so I recommend the custom node for highest quality results.
I'm not using tiled decode nodes anyways
🤷♂️
I'm doing a full decode
and then running it through rife
How are you denoising the second stage after upscaling? Flux doesn’t like being run above 1 MP.
Ultimate SD Upscale
That tiles it by itself
I don’t know what rife is. Ultimate SD Upscale is tiled denoising.
But a worse implementation, because the tiles don’t communicate. Using Mixture of Experts, the tiles can communicate during denoising, which helps consistency and hallucination.
This simply doesn't matter with redux clip vision being used alongside an upscale controlnet
Eh, I haven’t bothered with the controlnets. They seem to take up too much memory, slow down inference, and my workflow works fine without them. Flux is generally a lot more resistant to tiling artifacts than other models. MoE just helps it that last little bit.
That's why I use the controlnet and redux on Schnell.
The main image is made with dev alongside 4 anti-burn nodes
3.5 cfg, negative prompt setup.
If it works for you, that’s fine. I don’t mess with cfg or negatives.
Having a higher CFG with flux helps a ton, It allows for better overall prompt comprehension, and combined with other anti-burn methods really makes the image shine
I've used CFG with flux since day one yeah
I haven’t had much trouble with prompt comprehension. Sometimes Flux is a little too opinionated about image aspects that are associated with, but not stated in, the prompt, which gets annoying. But most of the time tweaking the prompt to be better helps a lot.
Tokenization bias that comes from relative tokens
Where you have dog, so you have "tail" but it isn't mentioned
Negative prompts address that issue
even moreso when you realize having a negative prompt with flux fill
allows ridiculous levels of control
lol so much misinformation when you do this server search query. i remember cw being one of the loudest mouths spreading "flux can't do negative prompts" ever since i got here
when used effectively, negative prompting is a massive boon and a good tool in the kit
i would fully agree that "bad anatomy" and "bad hands" are probably not going to work. they're probably relics of sd15 negative embed prompting that got popular. like there were embeds trained on those tokens specifically
🤷♂️ We need a net specialist who can pull out Flux's bias tokens from its trained dataset
Either that or a hell of a lot of trial and error
has anyone done embeddings tests withh flux? i only ever find loras on civit
🤷♂️ I'm only now just trying out pytorch 2.7 dev on arc
I've tried out InstantIR and got very disappointed by it
redux from flux with an upscale controlnet works better
redux and controlnet upscale kinda beat everything imo
nothing seems to be as good at mainting coherency and semblance to the original image while also massively improving details
SD 1.5 embeddings work on Flux lol
that might be seeming to work with confirmation bias. it shouldn't . sd15 embeddings would use clip g and flux only has clip l
sdxl is clip g and clip l yeah. sd2 embeddings might work on flux since it's clip l
sd3 is clip l g and t5
i think i'm getting that right
maybe yeah
I always use T5 Ella with SD 1.5 so I forget what the normal way is
i miight have dysliexic'd it. sd15 is clip l and sd2 is clip g. But the same point should apply. flux uses open clip
we're all long past the normal ways 😄
I hope companies don't do multiple text encoders again
yeah you can use Qwen with flux now lol
https://huggingface.co/Djrango/Qwen2vl-Flux
its the vision version too
LLMs are losing tokens if i seen the research right. i probably haven't. But maybe .. maybe the next gen of image models won't have locked in tokens either
just an abstract notion that i have no specific idea about
patchwise or continuous tokenizers are a thing yeah
i saw a youtube, where a guy was talking about the current limitations of image generation models and he brought up an interesting example. all clock faces have 10 and 2 hands, no matter what time you prompt it to show.
hmm not sure I had a look on Civit, searching "clocks" for Flux images and there was a range
if you put the flux filter on you see improvement
hey can anyone here point me in the right direction for making images like this? https://encrypted-tbn0.gstatic.com/images?q=tbn:ANd9GcQQ8lJN_IGxOPJl1iHT2wa_X3ON1QeJgTjbTA&s
I just wanna use imgpaint or whatever it is to do this for my friends and myself, i think is funny
good call
Hi all, does comfyui create images with simple prompts like Fooocus? I’ve been trying AUTOMATIC1111 but I can’t get consistent results or even anatomically correct images. I just started using Fooocus but it seems a bit slower to me although it’s very easy to create good and consistent images. Also, the stop or skip buttons are unresponsive for me.
Thank you in advance!
Comfyui itself does not enhance prompts on it's own. You'd need a node that gives it that function.
In this case you could use flux prompt enhance node, llm_party, ollama prompt text encode, etc...
what gpu do you have? using forge? you might need special launch tags. or just use swarmui instead
familiar looking lil companion screen thur
https://www.youtube.com/watch?v=jEcQqhcAtOI you scrubed 4 of the other pc build vids . how soon till this one?
I GET ASKED A LOT ABOUT WHAT SYSTEM SET UP I HAVE. SO I THOUGHT I WOULD PUT TOGETHER A VIDEO OF THE PHOTOS I TOOK WHILST I BUILT IT. ITS A TOTAL BEAST OF A DUEL WATERCOOLED RIG. THERE IS A VIDEO OF THE FULLY BUILT UNIT AT THE END OF THIS VIDEO TO.
I HOPE THIS ANSWERS YOUR QUESTIONS ON WHAT I RUN FOR USING THE AI TOOLS YOU WERE ASKING ABOUT...
because computers cant change ever
that was a stolen vid that they were pretending was them. crafted alt persona. but you know this
they never did a real face reveal
https://www.youtube.com/watch?v=NimnGpBEEvY og video of that guy. not the dice guy
Get the TABS & Backing Track here: https://www.fatfreecartpro.com/i/12ipi?card
More Tabs & Backing Tracks here: https://mikisantamaria.e-junkie.com
My online school / mi escuela online: http://www.escueladebajistas.com
Follow me on instagram! http://instagram.com/mikisantamaria
Bass: Modulus Funk Unlimited (tuned D-A-D-G). Seymour Duncan pick...
don't blame me for whatever reputation your employers catch. thats what you should say
@crimson dawnOk, so he seems to be just as confused as I am. Anyways, dude seems to be sketchy, and I don't blame you for not liking him. He kicked me from his server, and when I joined back he told me "We don't do drama here" and then banned me before blocking me. I don't blame him tho, I must look like a fucking weirdo messaging him about something so stupid
Anyways. I am not gonna let this bother me anymore. I am not whoever you think I am, and its kinda ridiculous that you can't see that. Please just leave me out of this, cause I have employers in this server, and I don't need them thinking I am some jackass who keeps harassing you with messages for some malware ass game.
This whole thing is wild, and I wish you a good one, but stop slandering me, thanks man.
i never brought it up before until today so i won't again. just don't poke the bear.
Not planning on it, cause I am literally not that guy 😭
true story
oh my god lmfao
Currently working on some stories and making characters.. I love how this turned out but I struggle to keep her facial stuff .. How can I keep her facial to another prompt with different outfits etc? I have copied the Seed number and keep the look prompt but I only change the clothing ..
Lora, you need a lora
Or photoshop the face in
A lora for the face?
Yeah, character lora
This is an O'C Character so only doing it in prompts.. So I cannot use the Controlnet to keep the face or something then.
I have to make an Lora just for that character's face? But if I can't keep the same face when doing the character .. I can barely do a Lora with it as I need couple images that are the same?
Hmmm there is something. Its Not optimal and i haven't used it befkre
So if I make 20 characters (high number for now) I need a lora for each?
Havent watched the video but basically here
https://www.youtube.com/watch?v=TTk00FLc2kY faceswapping
Discover how to use FaceDetailer, InstantID, and IP-Adapter in ComfyUI for high-quality face swaps. Achieve flawless results with our expert guide.
Note: If you encounter difficulties locating the IP-Adapter Unified Loader node, please ensure that you upgrade ComfyUI to the latest version. Additionally, you should download the "IP-Adapter Plus" ...
But hair is still a thing
Theres a reason visual novels mostly have only the main cast and the rest as faceless males/females tbh
So I should just try to make like 50 images and take as many as close as possible with the face and do a lora? 🙂
I would recommend trying ipadapter but training many loras takes a lot of time
And if you do it on civit, some money too
Yeah, I am trying to learn how to do it on my PC but keep getting errors etc, trying to follow a video but i fail haha 😄 I manage to make 1 lora but don't understand koyas_ss yet. But I have not done much more, but maybe time to do it 😄
Its definitely a learning curve but projects like these always take time
I'm experiencing the issue (ComfyUI). Any help?
GroundingDinoSAMSegment (segment anything)
Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu! (when checking argument for argument mat2 in method wrapper_CUDA_mm)
Oscuro panorama
”Let The Damnation That Is Your Failure Die With Your Blood”
image of a horror woman
we also like kats.
I mean the alternative would be furries in my case :D
Hi everyone! I found https://deepai.org/ that allows me to select a model and generate good images (example attached), but I have little control and I cannot customize. Every time I generate an image, it resets and the whole style changes.
I tried running Sogni, a mac app, and select some model. But then the quality is not high.
Any recommendations of an app or free online service you know I should try?
like the images @viral frost produced - how was this done? online? some service? locally? what tech?
Ah probably locally but you need a pretty good pc for that
Before i recommend anything, what are your computer specs?
Hmm we had people here running it locally on a m2
Oh yeah, this app I tested uses SD models
But i dont think you can run the newest stuff on it like flux
Cs1o made a great guide for apple
https://github.com/viking1304/a1111-setup
Oh yeah, I heard of Automatic1111
But if you get stuck i recommend throwing a post into the help channel as i dont know much about apple
I'll give it a try in a bit and if there's anything interesting to share, I'll send an update
thanks for sharing!
You do need certain models though to run but you can find those on civit ai under the tab models > filter
Sdxl is okay for realism. Illustrious is getting really popular for anime
Checkpoints are what you need for to start
Once i wake from my nap i can show some stuff in the ui. Just apple stuff i do not kno
Nice!
A quick question about working with a model.
I noticed that the generated image inherits trends from the model. So, if the model has dropshadow below the character, you'll get it in the generated image regardless of instructions. Am I right?
And if the character is partly cut (like you see half the figure), the result will be similar. You cannot ask it to produce a full character
Hmm style can be heavyly influenced by the checkpoint yes but producing a full character should be possible
But even then styles can be changed with lora's
what's lora's?
Hmm lets say a checkpoint does a lot of stuff pretty ok but not great
A lora does a little stuff but wayy better
total newb here. I don't even know what a checkpoint is 🙂
Like a checkpoint knows that emma watson is a girl but a lora can make her likeness really well
Ah the model sorry
Oh ok 😄
Iike sdxl juggernautXL
gotcha
Or sdxl illustrious
and lora is what then? settings? another type of model?
Basically a mini model you throw ontop of the main model
Ahhh. I'm really learning here
So lets say juggernautXL (main) + Emma Watson (lora)
To get emma watson in that style
Oh cool
Oh I see the loras on civitai
If you disable the pg filter its mostly gonna be what you expect its gonna be so for beginners sake keep it on lmao
Or use civitai.green
Its the sfw variant
Lmk if you managed to install a1111 I'll rest my eyes a bit. After im free to hop on a quick call to tour the thing
Been up working since 5 am lol. (Almost 9pm rn)
Thank you! I've been up the whole night with the kids, so I feel ya
I'll give it a try and see where I get
really appreciate it
is there not any other repo that isn't a1111
from what i know its behind the other ui options now
its been a while
maybe a concept lora lol
For mac its what i saw in the pinned msg in tech-helpdesk
I’m running Flux dev with ComfyUI on an M3 Max 64GB.
You can find my workflow in my image metadata, but you’ll have trouble running it with just 32 GB.
I wonder how good the GGUF quantization methods would do for mac
Since you're already on CPU
"CPU"
I think some people do quantized, but I’ve never tried. MPS doesn’t support fp8, so I don’t know how it’s being done. I’ve heard that T5 always needs to be run as fp16, no matter what you’re doing. Even though bf16 is kind of slow, I’d rather run the full model and not have to think about whether or not I’m suffering from quality loss.
Well, comfyui-gguf has a t5xxl GGUF model
If that is operable then that'd be an option.
What am I doing? Just, you know...lying on the street, sucking up beans through a straw.
old prompts (like sd1.4 old) with xl models...
Anyone know how to get the AI to actually listen to me?
Hair isn't even brown, and it doesn't have the tiny devil horns as I stated in prompt
This is what I have btw
Have you looked at Nomos8k_atd_jpg?
That is an insanely good restoration upscale model
great at denoise
I love running my old prompts. 🙂 I feel like my prompts were more creative back then.
These are the upscaling models that I've tried. I haven't found any of the HAT, DAT, or ADT models to be better for the purposes of an interim upscale, with denoising afterward. I think this is because what makes the image good for denoising must be different than what makes it good for viewing. I was hoping that the _hq models, since they were trained on high quality images, would produce better results, but in actuality they took the subtle film grain from the initial generation and amplified it, producing very noisy final images after second stage denoising.
yell at it
I'd use the nomos8k for a pre-inference pass into ksampler
and the ksampler will handle the details
in this case tho
its just schnell with redux and controlnet upscale
and thresholding etc
origina
Well, all I know is that I tried all of them (with frozen seeds) and looked at the details using flicker tests and I always came back to 4xUltraSharp.
Ultrasharp barely changes the image. It doesn't seem to add details.
Which is good for inference of course, given that the image itself isn't noisy
I've used a couple that aren't on this. LDSR, InstantIR, SUPIR, AuraSR, and Latent Interpolated upscale
Generative-based upscaling methods.
InstantIR and Supir are heavily prompt dependant.
LDSR is slow but it's a good clean upscaler
in terms of just normal upscalers, I've tried nomos8k, nomosuni compact otf medium, remacri, esrgan, bsrgan, and ultrasharp
Not gonna lie, with how controllable redux and the controlnet tile is (strength-wise) it is probably the best upscaler right now just because of image retention
for flux do you use the jasper tile or the union one
I think I'm currently using the jasper tile one.
at 0.5 strength
ah okay yeah I need to try it
been using the union one all this time
think I picked the wrong one lol
If you are of age, simply go to Civitai and look for them there.
This isn't a place for discussing stuff like this (since i think it's very clear you're looking for nsfw content generation)
Can someone assist me with training SDXL (RealVisXL v5). I've tried alsorts, but get no result, at epoch/step 0 is almost identical to 5000 steps.
Real character lora.
here's my settings in OneTrainer
I've tried learning rate 0.003, 0.0003, 0.00003, 0.000003
Masked training on/off, image repeats 10-50, batch size 1,2, 4. train text encoder 1/2 on/off. EMA cpu/off, Train UNET on/off. Change the keyword, remake the tags with different generator (WD1.4, joytag, Florence-2-large)
I'm completely stumped.
I've tried other SDXL checkpoints also, nada
4070 ti super, 20+ hours of training, nada
hair left on my head from scrathing = nil
im getting the 9090 TI SUPER
the side says 50990 now thats one futuristic card
dang, early access!
I would have rented this game from Blockbuster.
I'm particularly interested in the **RTX 5099 Text **version
Cant see who the mods are but watchout chat another scam on the loose userid 1325847535612334102
Dont engage em. You can spot them by DM wording, usually also links with url shorteners.
oh yeah. i dont care about engagement. in the offchance its a real person typing (i had a few that being the case) any moment wasted on me is a moment not spend on another person
Fair logic
Stay safe guys
one time i got one to even block me after i kept asking for feet lmao
Do anyone know what I need to use in the Script to be able to try different Lora strengths? Like 1, 1.1, 1.2 and so on?
could you send your profile .json?
Someone has sent me a setup that seems to work quite effectively and I don't have a specific profile, I was just using a preset one - #sdxl 1.0 LoRa
Oh no not the watermark with controlnet
I got to say ai images being watermarked is peak irony
Dragon Ball "Super" 🤮
same reaction i had seeing that pfp lol
Wat dis? Imma poke it.
juggernautXL or Flux but id delete that image homie before a mod comes and 🔨
Hi everyone,
I’m trying to place a real heat pump into a real photo via Stable Diffusion’s Inpaint.
I need it to look perfectly real, including the logos and details, but those keep getting distorted.
Any tips or tutorials for achieving fully accurate branding?
Thanks!
If you can wait till 6-7pm my time (after work) i can show you how i do it
Pic i did for lolz, ai fixed the monitor impaint edges only
For example, I need these camera angles (photo 1) And this is the promt: 2.5D, 4k, realism, UHD, Cartoon style, white background,maximum detail, Image in the center: One-story house, overgrown with grass, abandoned, ((Side view, 40 degrees altitude 3.0)) the result of the generation is this (Photo 2)
I also tried to make a top view like here is an example (Photo 3) For promt: 2D, 4k, realism, UHD, Cartoon style, white background,maximum detail, Image in the center: One-story house, overgrown with grass, abandoned, ((View from above 3.0)) And here is the result (photo 4)
Please help me figure it out
in general, the result of the latter can be achieved from the latter, but how to make a position that would generate a clear front or side view?
In fact, I need to make images in such a way that it would be possible to clearly set the viewing angle.
Thank you! Thats very kind.
Should I be impressed that the "stars" out the window are perfectly reflected in the mirror?
this chat is really awesome, love you everyones ideas and help.
Hmm I think these were flux dev out of the box. But this is a pretty common generation (and these aren't even really that good.) likely can be done with some xl models, and certainly with loras.
Video games, boxes, consoles, book covers, Lego set boxes, etc, all come out pretty fun
I've purchased a Macbook Pro M4 Max w/ 36GB RAM and 1TB drive (already filled up more than half). Will Automatic1111 run well on that? I wanted to check before spending a bunch of time configuring it to find out it can't handle it.
No I havent' tried anything yet. That sounds pretty cool, especially for a noob like me with big dreams! Would you have a link by chance?
https://youtu.be/Tk61auVeF0s?si=Xm8bv_U-4odQsraU
had some fun inserting myself on back to the future stage in comfyui. Enjoy!
Upscale and then impaint
And it depends on the checkpoint on how the eyes come out
vshshbsjss
i found this on web how to make this i have promps but dont work is there a checkpoint for it?
Lora probably
Or their specific checkpoint
I want it but need to know what is the checkpoint
thats like asking me what oven my neighborg used for his pizza
but if you just seek something similar:
https://civitai.com/models/716982?modelVersionId=801785
is that a policy i.e. strictly only stability.ai images are to be posted in here. (I am running flux through stablediffusion.cpp)
Ah no, i said that comment because it was hella nsfw
Flux de-distilled is pretty powerful.
Doesn't need flux guidance nodes at all. Can be set to a 3.5 in dynamicthresholding and 100 CFG in ksampler
hands a lil messy buit I assume I have settings that aren't yet optimal for this model.
I need to look more info up
is it based on dev or schnell?
Schnell.
neat!
It seems the one I'm using is non-commercial.
I am wrong.
There are two schnell de-distilled flux models
but it isn't this one.
ahh okay. the license is why i ask. thanks for clarifying
👍
honestly im probs gonna still stick with base dev since i already have a well-operational workflow for it
dynamicthresholding allows me to use negative prompts eitherway
At least until there is a definitively better model
that is open-source
took your image, ran it through perplexity claude 3.5 sonnet to get a prompt
threw that into flux prompt enhance node to get varieties
It's generating rn
white monster with da red berries
An intricate 3D render of a small alien-like creature with pale white skin and large black eyes, its fur a kaleidoscope of red berries scattered across a reflective surface. The mysterious creature, with its elongated fingers and claws, is the focal point, while the gray background adds depth. The image is a stunning, captivating visual experience, capturing the creature's emotional connection with its natural surroundings.
only 50% done
furry alien berry grabber
Flux.1 Dev and SDXL's base resolution are both 1024x1024.
Flux being better at prompt comprehension than any current open-source model
Also
for the details
it's the sampler, dpmpp_2m and scheduler sgm_uniform
for flux.1 dev
👍
oh uh i also use like 4 cfg anti-burn nodes
to allow myself to use negative prompts with base flux.1 dev
Original prompt: bizarre unbelievable action photograph of a white Buttwinded+thirsty! grinbending on a alien meatball pile diagram meal artist product, intricate, chaotic, smoky, realistic lighting, octane render, unreal engine, dreamlike resolution, best resolution isdom gumballs STERY0LDE, 4k, fisheye, 8k, cinematic lighting
GRINBENDING
airbending but for smiles
I bet you smiled when you read it!
Thank you
Does that pencil have a pen nib? 😄
@proud dagger Can you point me towards the downloads for Cosmos clip and vae please?
I've got the model, and I think I have the correct clip (not certain), but really struggling to find the VAE.
if you use SwarmUI it'll autodownload, if you need to manually download for some reason they're here https://huggingface.co/comfyanonymous/cosmos_1.0_text_encoder_and_VAE_ComfyUI/tree/main
Thank you!! I'm using ComfyUI 🙂
Hi everyone! I’m Valtteri Hirvonen, a professional photographer based in Finland, and I’ve been working worldwide for the past 20 years. Lately, I’ve been diving headfirst into AI art, blending my photography background with the endless creative possibilities AI offers.
These images are part of my personal experiments with surrealism and new techniques. I love exploring the space where reality and imagination collide—creating scenes that feel strange and unexpected. For me, it’s all about pushing creative boundaries, seeing where the process takes me, and having fun along the way.
If you’re interested, you can find me under my name on most social media platforms or visit my website: www.valtterihirvonen.com.
And please excuse me if this isn’t the right place for this kind of introduction—I’m totally green with Discord… 🙂
Thanks for checking out my work!
https://stablediffusionweb.com/image/25622118-robot-woman-with-removed-face-plate any idea what checpoint this is?
again lol, https://stablediffusionweb.com/#faq
Stable Diffusion is a deep learning model that generates images from text descriptions. Use Stable Diffusion online for free.
so tbh prbbably a lucky result as the prompt + neg is a terrible prompt
Ok
Quick question and I don’t know if this is a tech-support question so I’m asking it here. When I use Lora’s in forge I usually just click on it, and it post the name of the Lora in my prompt do I still have to use trigger words in order to activate it?
If the model instructions tell you to use the trigger word, you should.
What if it doesn’t have trigger word the Lora is called Add Detail. Do I just click it?
It’s full name is add-detail-xl
Just click it, yep
Also out of curiosity, have u done anime characters before, and if you have how did you do the eyes or do you just use photoshop?
I don’t know why but for some reason to me 3d eyes are easy, but anime eyes are really hard.
Thousands of times
Not sure what you mean by "do the eyes"
You can use the adetailer plugin or generate a higher res to get more eye detail
Prompt, Lora, or do you use Photoshop
(or segmentation if you're using SwarmUI like me)
prompting
you can do allot with just promptig, just gotta know how to word and place it
Yeah just prompt. Most anime models use tag based prompts, so "yellow eyes" or whatever will give you that
Are you stuff like high definition, eyes, perfect eyes, high detail, eyes, red eyes. Stuff like that
I use
Nah you don't need that junk
no
Find an image you like and check the meta
if you open in bowser and save from there meta is good, the preview here is compressed with no meta
K thank u
What does free u do
Oh so is it already in stable diffusion or do I have to download it?
Do I have to install it?
yep
Yeah I had to. Thank you
yw
? Am I doing something wrong
😂
😅But what am I doing wrong
Not my prompt got it from tutorial
i dont see anything wrong other than youre over prompting the negative
i think your chasing perfection from an imperfect tool
as advanced ai image generation and ai in general as come, we're still very early in the infancy of it all
take the time to learn the tools aswell will be beneficial in the long run
I have been watching tutorials and that what it said to do. Thank you. 😆 Any suggestions
suggestions for what
In you character eyes you have extreme details in it
my prompt for that was extremely simple
positive:(masterpiece, absuredres:1.2), 1girl, purple eyes, delicate pointy ears, upper body, close up face
negative: 3d, monochrome, text, (female child, loli:1.2)
What model or Lora
Hi! I've been trying to use my friends who know more about image generating than me (i'm such a tech noob I don't know where to start). They've tried to generate a picture I'm looking for that I want to put on a t-shirt for a friend of mine that I know would make her so happy. But we've been unable to get close to the result I'm hoping for. Is there anyone who could help me? Preferable if you can try to generate the picture I want for me if possible, cause my friends have already tried several times and they seem to lack the skill in getting what I want. I'd owe you a favour and it would make me and my friend so happy 🙂 If you can help, please DM me and I can tell you the picture we've been trying to create. It's totally safe for work, it's just very specific and don't want to bore everybody with the details.
Did you figure out what you were trying to do with eyes?
Transformer 😄
Quick question I want to post a few ai arts of where the AI messed up, but in a way created a masterpiece, the problem is it shows legs is it OK? There’s no nudity or anything.
I just want to show how cool the art is. Especially since it was a AI mess up.
Anybody know what model they used for this? I really like the eyes.
Probably used illustrious or flux
Specific model is kinda iffy to ask
I think I found it “ Animagine XL V3.1” did a compared analysis with this image and they look similar in terms of eye shape, nose shape, and jaw shape.
Between what
What brand flour did your neighbor use for their cake. It could be a merge, could be a lora
I am no ai or got the metadata 😭
Lol I see your point😅
Do you know of any good anime models to merge with this one?
Animagine is a solid one but illustrious is the current popular model
Wainsfw or NTRmix
But different architecture but you get similar results
And it knows like most characters on danbooru
general-with-imagesThe magic magic carpet is like in the fairy tales about Shahrizada
general-with-images
hi there
which model can draw this kind of image?
I tried several models, and even though I entered no eyes, no face, no hair, the final image still had eyes and hair. Or do I need a model or Lora specifically designed to generate this kind of character?
Does anybody know where I can download the anything v4 model. It is on hugging face but there are two options one is download test model, and other is download the inference config.
Yes.
随便生成一张女性图片
Found it 😆
Does anybody know what dynamic thresholding does?
/Style: Aardman-inspired claymation style, reminiscent of Wallace & Gromit or Creature Comforts.
Scene 1: A Day Unlike Any Other
Mood/Color/Style: Soft morning light with warm, golden hues casting long shadows. Style: Aardman-inspired claymation with visible tool marks and rounded forms, in the style of Wallace and Gromit.
1.1 Establishing Shot (Bird's-Eye View) + Slow pan across the zoo + The modern zoo at sunrise + (empty action) + A high, wide view of the zoo. Curved pathways, glass enclosures, lush foliage. The Matriarch's untouched trail to the waterhole is visible. The Fox is subtly seen watching the Matriarch's enclosure from behind a bush. Squirrel scurries across the path + Soft morning light, golden hues, long shadows + Aardman-inspired claymation, visible tool marks, rounded forms, in the style of Wallace and Gromit.
What did he mean by this. You can't generate images here
https://stablediffusionweb.com/image/25622118-robot-woman-with-removed-face-plate what for checkpoint is this on Stable diffusion
Buddy i told you the last time 😭 its base SDXL. Look on their FAQ
Literally basic sdxl 1.0
No merges, no extra trained
Illustrious model?
I did a workflow to use Illustrious img and upscale it with Flux and upscaler. Really nice
i have issue running flux to upscale with
had a bad system crash by running comfyui direnctly on windows
had to reinstall windows, and like people were talking about using comfyui within container for security reasons, i had to use windows WSL that lets you run ubuntu natively inside windows but that has a drawback with memory, so flux throws OOM when i try to refine with 2 different models
Comfy in a container. My God, why go to all this trouble?
How much ram/Vram do you have? Your specs?
you need container for security reason, cause custom scripts can mess up your system badly
vram is 12gb, ram 32gb
The whole world works more or less like that, I'd say. And unless you install a dubious extension, maybe...
Comfy is open source, and so are its extensions. It's all very closely monitored. Using WSL 2, a container, will just waste your time.
I have the same. It's more than enough for Flux gguf Q8.
12gb is enough for dev
i ran flux before the incident
no need to run the crappy distilled ones
yes it is
now not taking chances again after system crash
and incidentally now that i get oom for running comfyi within ubuntu i dont play with flux much now
get comfy, normally, get wavespeed node, use the first block cache at 0.12
Crashes with this type of AI can mostly be explained by a lack of RAM, as your system uses pagefile.sys for large loads. A Dramless SSD, for example, can easily be knocked out.
not a ram situation
what happened was my windows file got corrupted apparently from malicious code
the thing is i can run flux within container too but when it comes to fixing faces with refiner which needs me to load 2 models thats when i get oom now
download more vram
in all seriousness, turn down the highres
if you have it at 2
turn it 1.5
Even lower if you still get that error
How do I do that
it was a joke
.. you can't it's a joke
….oh
lmao
Cuz I was able to generate images with Hires fix a while ago but today it’s starting to become problematic
8GB
highres at 1.5 or 1.25
highres fix takes up a good amount of vram cause its upscaling mid generation
Okay I’m trying this at 1.25
||in swarm i regularly do it at 2 and when i impaint i forget to disable it so it does it again at 2||
Also 8gb gang here
Oh 1.25 is good for me
Oh boy not even withstanding 1.4
alrighty well found your limit. try to stay within in to not oom
if you want additional upscaling at that point
youll wanna upscale it through img2img
Any magnification algorithms besides latent to make things easier?
QUIERO UNA CHICA DE ANIME DE CUERPO COMPLETO EN TRAJE DE BAÑO
WHO CAN TELL ME HOW I CREATE THE IMAGES HERE, I'M NEW
go to #🤝|tech-support and check the pins for install guides, youll need use your own hardware, importance is gpu and how much vram it has, and what brand (nvidia, amd), the pins will cover all the questions you will have for installation, anything else after that you can ask in tech support or general chat for advice on how to generate good images,
MY MEMORY IS 16GB AND RAM 8
has anyone figured out how to make AI comic books? im an artist working on a comic, and would love to use AI art to speed up the process but its been churning garbage for me for a year now. i even tried making a lora with my characters
like this is my hand drawn panel. step by step. i give it to stable D hoping itll complete a step for me
im using img2img with prompts calling on my character lora + comic book panel loras. these are the results:
Multi-scene generation in SD has always been janky, just do it panel-by-panel, and then put the images back in place.
It will also help SD better understand your prompts too.
Tho I do recall seeing a comic panels LoRA on civitai, so maybe your LoRA + something like that can work
jea ive got 3 different comic loras that just dont work with img2img. they can make a gorgeous txt2img but then it wont have the vision/characters i wanna use. the time i spend fixing it could be done drawing the perfect vision instead
you right let me try individual panels instead jea.
so the panels make some AWESOME facial expressions for me to use. but the body is still horrendous lol. so im just taking the faces XD
Hi, why is the model hash different? I downloaded the model fresh from civitai, but it creates very different results and looking at the model hash it is also different. What am I doing wrong?
Anyone use this with Fooocus?
nothing. also 1.5 is extremly outdated. dont uss that if you can. use pony or illustratious sdxl models
I like the style of 1.5 models. It's more realistic than most pony models.
For SDXL I got too less VRAM.
But why does the hash change? I try to reproduce the same picture, as I did ~1 year ago with the same model, but I get totally different results
more realistic? how lol, 1.5 is the oldest base model trained at 512px, logically speaking, thats impossible lol cause xl is trained at 1024px,
hash is just version identifier
try ruined fooocus, more updated fork of foocus
Yeah, resolution-wise the XL models are better, sure. But my GPU can't handle it, that's why I am using SD1.5. For me it has good enough results and it does a good job in photorealistic images
but whats up, i use it
ah okay, make sense
cos once you get a taste of xl quality, hard to downgrade
same thing when youll play with flux
but flux makes my 12gb cry
I am not convinced of flux images. Most images I am seeing online are too clean. They probably need more fine-tuning, but nevertheless I can't use that reliable anyway
So how comes it that the version identifier has changed? Even the sample pictures have a different hash, than the has in the description of the model on civitai
I am talking about photorealstic pictures
cause hash is a specific version of a model, if any changes are made to it like additional training etc, it updates the hash this explains it better rthan me
in the grand scheme of things, hash doesnt play a role
in the 2 yrs ive been here answering questions thisis the first time someone's mentioned the hash to me
🤓 hashings fun
I am not that deep into those topics, but hashing is nothing new for me.
I was just wondering, why I am getting two very different results, when using the same configuration and same model. The only difference I noticed was the hash.
Shouldn't be the model hash the same?
You'd know that flux is more than good at photorealism with finetunes and loras
sampler and scheduler choice also are heavily output-dependant for flux, which dpmpp_2m sgm uniform being generally recommended for photorealism
rough example, 2nd pass would easily fix artifacts at 0.4 denoise
Somehow managed to get a VERY NSFW model to generate this.
Dog hands!
What model is it?
Yum. How appetizing. 🤮
Let me guess - "A ham shaped like a rhino beetle."
Some astronaut fun. Created today.
Top left image
Three legged chicken
three legged chickens
Actually yknow what
I should set up an upscaling workflow
see if i can fix it
You added graininess into your prompting, or a lora i assume
that realistic analog photo feel
uh nope i fluxified it
I need a defluxing lora
Ooh, so close. It's actually a bizarre unbelievable action photograph of a large delicious detailed steamed ham on an operations couch, goblins and thelikebecomeing instructions labels to the left and using gorilla oil technology (final fantasy)ram like technology gogurt device, jj abrams et al, professional photography
I'm not trying to achieve realism anyway 😅
I added grain in upscale process. I like grain...
Grain is good.
Grain gives the illusion of realism
I know you're not, though. It's just a little pinpointer.
Yep, well noticed anyway. And easy fix.
It seems the reason why my outputs were so clean was due to the upscale model I used
nomos8k atd jpg has JPG noise removal 🤷
Imma do an output with just 4x ultrasharp
I prefer 4xffhqdat. It gives more realistic and a bit softer look.
i wouldve expected helaman to release a decent overall upscaler in over a year by now
surely there is a better one that is more recent accuracy-wise
though tbh guess it doesnt matter if you upscale then inference
Using that upscale model with a film grain node
on dpmpp_2m sgm uniform schnell ultimate sd upscale + flux redux + flux controlnet upscale
My result
Only problem
WE GOT MORE THREE-LEGGED CHICKENS
I might have
to bring up mty film grain by .1
just to get the same effect you're going for
had to also decrease guidance down to 2.2, where i had 3.5 as default