#🏞|general-with-images
Hey all. Anyone interested in helping me with some generations?
I have been running some tests on CivitAI using some of the web models on there, though am not happy with the results. I unfortunately lack a PC capable of running SD locally, so CivitAI has been my best free-to-use option so far.
My goal is to re-create some images like the one attached here. As far as I know, this is a real, life-size prop animal and is not AI generated.
man dancing in moon
🌛
Draw a young man with white hair on a chair, next to a cold white cloth
👴 🪑 ⬜
#artisan-faq if you wanna generate in this discord
she tamed the angry flowers
railway and data
quickest wan workflow at the moment?
Not the exact same but I mean the appearance
Make me a fun logo with the colors purple, white, black, gray and white; the business name is The most virals products shop
SD 1.5
how to move from this...
to this?
I'm trying my best with prompting but failing every time
I don't understand .... what are you trying to do here ? your prompt is for asuna at the beach => you get asuna at the beach.
please roast my lora of my dog (sdxl base image ) 🙂
IDK this is well above average SDXL lora so its hard to criticise it
Curious. Did you use "blue Police box" or "TARDIS" in your prompt?
You must be a detective! 😄
sorry was away, are you talking about my lora ?
yeah its nice
ah thanks 🙂 was not sure I was on the right path
ye this is way better than most SD 1.5 lora
the main thing to be aware of is to try not to overfit it
if you overfit it then it can start making the model less flexible, or have strange visual effects
but its a tricky subject there is not like one clear definition and metric for if something is overfit
You might want to delete this one for two very specific reasons.
the social definitions of SFW versus NSFW are not very logical
by the usual social standards that is SFW
When I think "SFW" I think "Rated PG" or "kid friendly"
oh I interpret NSFW too literally maybe then
Yeah and there is so many parameters, the images, the training parameters, the base model, it is quite hard tbh
LOL I like the cinematic lighting
yeah lora training has many param
tricky
intel arc is slowly catching up to cuda
latest 2.7 pytorch nightly now includes triton
not working yet but still
Hi
hey, nice one, which model are you using ?
Thanks! I'm constantly changing, but I think that may have been this one https://civitai.com/models/998740/shuttle-mixes
new gemini flash 2.0 EXP on aistudio
idk if anyone can help, but idk how to get from this to this, i've been trying literally every setting, and the left pic is as far as i can get
Where did you get the pic on the right? Did it mention Loras? Did it mention using custom nodes, scripts, or extensions?
There's all sorts of utilities to get from left image to right
change your dimensions.
that's probably the easiest advice
This too, the image on the right is using 560x840 while your image is 512^2
i got embeddings, 1 lora, and 2 models
ahh ok ty for that, i'm new to this
dang spam bots
Hey @wispy spindle or @wheat girder, two bots hit all channels
a second bot has hit the server
@wheat girder this is another one to add to the bot and block
what left pic?
It seems to be a spamming bot, it's copying alissa's message from above.
a little more details? are you trying to do a video or something else?
i think hes struggling with detail and pose i suppose
but he needs afterdetailer
maybe
maybe also a vae
used to get those grey wash images when i had none
looks like SD 1.5
you can make good images with SD 1.5 but there are a bunch of things you need, it's not easy
compared to SD 3.5 or Flux where you can just run basic workflow
for SD 1.5 you essentially need
- FreeU or some other tool that deals with the same issue
- one of the fancier guidance methods
- one of the methods for dealing with CFG-burn effects
but it all needs manual tweaking
so I can only advise to use modern model like SD3 if you are still learning
I tried doing some inpainting for a D&D map.
Does this look plausible for a dragon's breath attack?
yup. looks very good.
thanks, I'm still trying to get it more snowy and perhaps a bit less chaotic though
unfortunately this is pretty slow. large image size involved. takes almost 5 minutes per image
If the dragon is aiming at a particular target, you would tend to get those smaller scorch marks.
If the dragon is just strafing, you will get very long, more narrow scorch marks, as he flies overhead and just shoots at everything in front of him.
And don't forget arcs if the dragon lands somewhere and then breathes fire.
Maintain a human face. Wear a yellow safety helmet with the letters "KFB" on it. Wear glasses. The person is 40 years old. The background is modern architecture. Clothing: a white T-shirt with a collar. engineer
A few pictures I've made recently. Working on making character art for some of the pathfinder table top role playing games that I'm in.
On the left is my creepy summoner and on the right is my pyro goblin alchemist.
I love how it's like, cartoony pony but the background is like an artistic painting. ^_^
A chibi-style character standing outside in the morning, looking at the ground covered in fallen cherry blossom petals. They have a thoughtful or slightly sad expression, holding a tiny petal in their hand. The background features a soft, pastel-colored spring garden with gentle sunlight filtering through the trees. The petals swirl around in a light breeze, creating a dreamy and poetic mood.
What do you mean with "the same issue"
You can probably just save the image at a lower resolution, inpaint, then add it back to the picture
maybe downscale and then later upscale as well
yeah. Probably the wrong toolset. anyway. I heard gemini has some really good automatic image modification thingies. maybe I can try that.
"A highly detailed 3D model of Enkidu, the wild man from The Epic of Gilgamesh, in a full-body pose. He has a muscular, untamed build, covered in wild hair and fur, with an intense and primal expression. His eyes are fierce, and his posture is strong, standing confidently in a wild, natural environment, surrounded by animals like bulls and wolves. Enkidu's body language conveys his raw power and connection to the natural world. The image should be ultra-realistic, with dramatic lighting emphasizing his wild nature. The model should be presented with no background (transparent PNG)."
Thats oddly specific lol, youd need a lora for enkidu
🦹♂️ Here you go
the unet is imbalanced and the skip connections need balancing with the output of deeper blocks when the model is used
its just how it is
First time I heard about that
Do you have anything to link to for this? Might be time to go try SD1.5 again
its a weird thing yeah
that the models have this
if you haven't read FreeU paper that is best place to start https://arxiv.org/abs/2309.11497
In this paper, we uncover the untapped potential of diffusion U-Net, which serves as a "free lunch" that substantially improves the generation quality on the fly. We initially investigate the key contributions of the U-Net architecture to the denoising process and identify that its main backbone primarily contributes to denoising, whereas its sk...
but skip connections get mentioned in a lot of other places
some methods exploit them for editing and stuff
1.5 is prone to making artifacts which are basically bright spots. No way to really get rid of it since it's a flaw with how the VAE was trained.
lol yeah I remember when that went viral on reddit
The bright spots and blue spots are both problems that occur when someone merges checkpoints and doesn't know what they're doing.
The only VAE issues I've seen are JPEG pixelation issues (artifacts) if the VAE was badly trained.
Too many people train on lossy JPEGs not knowing what they're doing.
@celest sigil
sorry for bothering, I want to ask: I'm trying to learn basic ComfyUI, and when I generate an image the result is really weird. So I tried AUTOMATIC1111 with the same model, prompt, cfg and resolution, and the render is good. So why is the render in ComfyUI so weird? Are my nodes wrong, or what? I need some help and advice haha. sorry if my English is bad, and thank you
For starters, A1111 & ComfyUI use different prompt weighting systems. Check this out for more info: https://github.com/BlenderNeko/ComfyUI_ADV_CLIP_emb
Have we seen anything new in terms of video or is Hunyuan still the recommended king?
I never got great results off it.
Wan2.1 is a new capable series of video models
Speaking of wan2.1, I just installed it today and made some of my first videos. Exciting!
That's awesome, and gives me some hope for my future usage of wan2.1 What GPU do you have?
4070
It was easy to install and run and I'm smooth brained. Though i had to ask a few questions, poke gpt, and watch a video
It takes forever to generate videos though. Like 5-10 minutes to generate 1 second
SD1.5, no bells & whistles. WIP 1080p base model. Too high weights killed my subject making this completely SFW by accident. (No Prompt included)
if you animated that, it might look like a waterfall
an artwork that reads the name سحر in Arabic typesetting font
Finally Gemini did it after a few tries: a correct name in Arabic
gemini is google, and with google translate available, i would expect it would do the best job
does Google Translate have Arabic glyphs for vision? I would say Google Fonts, more precisely, rather than Google Translate. Still, it should have parameterised them in the model, and any font page can render Arabic text, so this is not something extraordinary.
give me a phrase
The only words I know, you can blame 4chan for. "halal" and "haram" (they're not curses)
Well that font is everywhere, the arial or the arabic typesetting standard font but still google produces other fonts:
I gave it 10 images of the name with different fonts, look what It produced:
to me, since i can't read that - it's pretty, squiggle lines
i have hard enough time reading latin letters and english words
still correct Arabically
i'd love to have a language talent, but my brain just doesn't do that
language is culture and culture is language and power is language and language is power
i mean, i can't even count to 5 in arabic - forget reading anything more complicated
If you ever find yourself in need of it or wanting to learn, I recommend Pimsleur above all others for spoken language learning.
i'll try it. i failed duolingo spectacularly
It's what the CIA uses to train their Analysts so it's highly recommended.
Fair. Arabic is hard even for me as a native speaker, as it has many forms, but thankfully I've been attached to it since childhood, so I know a lot of under-the-hood rules and uncommon classical words and how to write Arabic poetry
always best to teach kids multiple languages when little if you can. that's when the human brain is most tuned to learn languages
Actually AI is getting better when talking about Claude Arabic performance, It needs only better vocalisation, grammar and correct typography in terms of image generation
but it's an LLM, not an AI image or video generative model
yeah and I don't think AI will break language barrier as culture is very dynamic thing
all this technology we're running around making images and videos with is computer vision - that's been around a long time - and wasn't really developed for what we use it for. if you haven't researched that yet, you should
developed for other things? non-cultural usages?
sure :) medical stuff comes to mind
https://youtu.be/2w8XIskzdFw?feature=shared this is a good spot to start
By the end of this course, learners will understand what computer vision is, as well as its mission of making computers see and interpret the world as humans do, by learning core concepts of the field and receiving an introduction to human vision capabilities. They are equipped to identify some key application areas of computer vision and unders...
maybe, but before ChatGPT there was an AI called Spleeter for music separation, and looking at the tools I was using before 2018, I think AI has been heading in this direction from the start
we've had AI since 1957 - and computer vision has been being worked on for decades
sorry for bothering.. do you guys know why segment anything is not showing? I just installed ComfyUI Layer Style, and when I search for segment anything it doesn't appear
There's nothing in that node pack called "segment". Did you install "Segment anything" pack?
This is what layer style has for segs
umm i don't know actually.. but after installing layer style, this is included, isn't it?
Try searching for that name, and not "segment".
umm.. it still doesn't appear
hi guys, I will do today a presentation about sd to generate architectural images for a university
is it possible to keep a negative prompt somewhere in the settings?
for example, I can't generate NSFW images; can I put that somewhere in the settings, or only in the negative prompt?
I want to avoid showing these words
Check startup in terminal to see if they failed to load.
or you could install a NSFW censorship extension eg : https://github.com/AUTOMATIC1111/stable-diffusion-webui-nsfw-censor
thanks ^^
You could also use/create an embedding/style if your intention is simply to hide nsfw words from showing during the presentation. Or both the "nsfw" negative embedding and the extension. That way you're minimizing the chance of getting something nsfw AND censoring it if you still end up with some raunchy output.
"Rainforest terrarium without water, Portland cement rock formations "Hyper-realistic geological formation with volcanic rock layers, f/8 depth of field (foreground branches at 1/2 meter), dual soft lighting (5000K main/4000K fill), weathered texture (Ra 3μm surface roughness), moss-covered areas (30% coverage), dry branches system (crown radius 8cm), environmental fog (30% concentration), 16K resolution with 32 samples/pixel, photorealistic style (75%) + illustration enhancement (25%), CRI 95 lighting, natural color balance (4500K), volumetric dust effect, rock layer thickness gradient (40-60cm), Fibonacci spiral composition, subsurface scattering on moss (#7CB342), environmental color (#1A1A1A) backdrop"
Require higher details: promotion"sample rate 64 samples/pixel" + " texture detail 2048×2048"
the link of my presentation
https://www.youtube.com/watch?v=eAHk1r5zFQA
At the invitation of Professor Fernanda Moreira, I took part in her workshop "Inteligência Artificial para Representação de Projetos de Arquitetura" (Artificial Intelligence for Representing Architecture Projects), speaking about Stable Diffusion applied to generating architectural images.
A special presentation celebrating the 10th anniversary of the Architecture and Urbanism program at the Universidade Federal de Goiás (UFG)
00:00...
Guys, this was a presentation about SD for an Architecture course here in Brazil. The presentation is in Portuguese but I thanked the SD discord server and the whole community for their help and company 🙂
I left the link to the server as well and invited people to join in
Dang that was fast, you were asking bout the censorship only yesterday. I'll try to watch it later on, my portuguese is pretty thin tho :p so I might have to turn on subtitles. Skimming through I see that you're showcasing txt2img, basic img2img, controlnet, styletransfer. That's a lot of stuff to explain in a single course.
thanks ^^ In the end I didn't show anything NSFW 🙏 It was not a course, it was a presentation on using AI to generate images for architecture, for the university. I introduced SD and tried to show a bit of the whole picture
generate a really funny picture of a friendly and happy ghost that is sitting in a chair while being interviewed for an executive job
I'm surprised you went with that, I would have just done a GIF of Ghostface from the Scream parody "Scary Movie"
lol
For flux, could you guys recommend better Ultimate SD settings?
i wouldn't use that sampler with flux
Thanks, could you recommend something else?
note that the sweet spot for flux is 672x1024 or 1024x672 - so you'll get the best look to your images if you stick with those for the image sizes. try heunpp2 for the sampler and simple for the scheduler
My input image is 512x1024, so I'm just squeezing under, just trying with your suggestions now 👍
i know, but there's a reason for the specific numbers i suggested
I'll try and find a doc on why that is, I need to have the base 2 to compress these images properly.
the reason it is - when flux first released, a number of us did some indepth testing. that's the ratio that works the best. we didn't bother to write up what we found so you won't find a doc on it
but it has to do with the size of the images in the data set it trained on
you can always crop the images before you compress them if you need to
here's my flux latent space walkthrough https://docs.google.com/spreadsheets/d/1VDt_9U59QXwXUhFP0T32HhMg6_Iem76aW4C2u_B3CsI/edit?usp=sharing you can see clearly how it reacts to words and what it knows
I fully understand the trained resolutions, I just assumed you meant flux was different than any of the other generative models, testing with 672 to see if it changes much.
As for the heunpp2 and simple, it takes longer and is almost identical (it doesn't look better) to the previous settings I have, so I'll keep mucking about with these.
Cropping or rescaling will break the aspects I need.
For instance this is dirt that is good, and the upscale weird "dirt"
i'm sorry. i will point out this is the stability.AI discord - and that SD 3.5 large is much much better than flux in all respects - and suggest perhaps you switch to it
I've used 2.5 and found flux d to be way better, I have no problem trying it out, are you sure it's way better? 😄
not sd3-2b-medium. sd 3.5 large
and yes, i'm certain it's better. by miles
let me know if you get stuck. remember - that with every AI you first want to learn how it thinks - so one or two word prompts, generate multiple times, study the results
Hate to ask here 😄
But anyone have a good workflow for 3.2 I can test out, I just searched and nothing good came up.
Also, the two differences between 512 and 672, they are roughly the same in terms of detail,
it's 3.5 not 3.2 and sure.
this one allows you to prompt the 3 encoders separately
Thanks mate, greatly appreciate your help, this may have solved my entire issue!
they each have their own strengths and if you give them all the same prompt, they'll battle with each other
and this uses sd3.5 large as the main model with sd 3.5 medium doing the upscaling
That was a slog to get HF to work with me and the token.
For me it's hard to decide what one's better, as they both have their pros.
But I will say, the encoding is 3x better
anybody have any luck generating pixar-like images?
3D Animation Diffusion is what you're looking for.
oh seeing some nice results, cool ty
Can someone help me find a yolo.pt file for detecting full body. I am really struggling to find it 😦
"A retro sailboat docked at a small beach pier at sunrise, with coconut trees swaying in the breeze, gentle waves lapping at the shore, and a large, glowing sun rising over the horizon, in a 1950s-inspired illustrative style with a serene, pastel color scheme and sharp silhouettes.
Just tried it, is pretty nice
A picture showing Two main types of bone include spongy (trabecular or cancellous) and compact (cortical) bone. osteons in compact bone and trabeculae in spongy bone. Figures showing osteons in compact bone and trabeculae in spongy bone, including osteogenic cells, osteoblasts, osteocytes, and osteoclasts and blood vessels.
can anyone give me a simple wan workflow with loras? please? i beg you ! 😊
Give me the model requirements and I'll try and make one.
Is it on HF?
loras and model you mean?
Yeah, what you'd need to have it running on comfy
2.1 example
thanks, where do i load the loras 🙂
Just add a load lora node in between the text encode:
hook up the model and the clip
Do you mind if I DM you?
kinda
i am not dming sorry
but do appreciate the help
i am trying to
load the lora
so basically a third node from the clip to lora and i connect the model node to the base model and lora yes?
you'll figure it out 👍
thanks bro all my hearts ❤️
designing textures for video games might be easy nowadays with AI
Hmm not really, it might help but youd still need to edit a lot
Using it for inspiration sure
But without the base textures like metal etc its gonna look out of place
ohh... mmm...
i2v is almost impossible to get any AI to do a turn around with - you'll have better success if you start with an image of a character partly turned but it'll still likely rotate the character to be looking directly at the camera
I could forward one, but it's NSFW. I have seen some that were more than successful.
I just don't know what combination of tools they used to make it.
a little extreme but there is this https://civitai.com/models/1346623/360-degree-rotation-microwave-rotation-wan21-i2v-lora
Can someone point me towards a good tutorial to get the best possible likeness while training my own likeness lora?
SDXL
no one ever takes this advice but my advice is to use Stepfun even if you have to block-swap it constantly
its an upgrade to wan
but you need to take care your power consumption does not exceed what would be worth it
if your vram is below a certain amount then the amount of block-swapping would make this the case
but if you have 24GB vram its probably fine
these were my personal notes:
- rank and batch size as high as hardware allows
- alpha = grid search from alpha = 0.5*rank to 2*rank
- learning rate = grid search from 1e-6 to 1e-3
- weight decay = grid search from 0 to 1e-4, possibly up to 1e-2
- gradient clipping = grid search from 1 - 2, possibly up to 5
- optimiser = adamW
- schedule = cosine
- warmup = 2.5% to 10% of steps
grid search just means try numbers in that range
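For anyone who wants to see what "try numbers in that range" looks like in practice, here is a minimal sketch that only enumerates candidate settings from the notes above. It does not train anything. The rank of 32, the number of points per range, and the names `log_grid`, `search_space` and `candidate_configs` are all assumptions made for illustration, not part of the original notes.

```python
import itertools
import math


def log_grid(low, high, points):
    """Return `points` values spaced evenly on a log scale from low to high."""
    step = (math.log10(high) - math.log10(low)) / (points - 1)
    return [10 ** (math.log10(low) + i * step) for i in range(points)]


rank = 32  # hypothetical; the notes only say "as high as hardware allows"

# Ranges taken from the notes; how many points to try per range is an assumption.
search_space = {
    "alpha": [0.5 * rank, 1.0 * rank, 2.0 * rank],  # 0.5*rank to 2*rank
    "learning_rate": log_grid(1e-6, 1e-3, 4),       # 1e-6 to 1e-3
    "weight_decay": [0.0, 1e-4],                    # 0 to 1e-4
    "grad_clip": [1.0, 2.0],                        # 1 to 2
}


def candidate_configs(space):
    """Yield one dict per combination of values in the search space."""
    keys = list(space)
    for values in itertools.product(*(space[k] for k in keys)):
        yield dict(zip(keys, values))


configs = list(candidate_configs(search_space))
print(len(configs))  # 3 * 4 * 2 * 2 = 48 combinations
```

A full grid grows quickly, so in practice it is probably cheaper to sweep one setting at a time (learning rate first) rather than train all 48 combinations.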
hi im just wondering what style and prompt structure is used to create images in this style? thanks
Picture of a jet
✈️
video of an idiot
dude even looked up to read the text
i mean a lot of people use google translate in here
i just point them to #artisan-faq
Can 4060 run this?🙏
Ti? Maybe, normal ver has 8gb vram so hmmm either out of memory errors or its gonna take a very, very long time
Hm 5080 takes 10-30min
I can proceed it before I go to bed, afterall AI thing is lucky draw lol
As long as there's no out-of-memory error I don't mind waiting, but at least let me wait lol
You know, waiting isn’t a matter, yet out-of-memory-halt is the real hidden devil
You know, running hub shit can project a preview of the prompt for quick and dirty via this https://www.runninghub.ai/ai-detail/1901651290458316801
AI App - Easy and straightforward way to create a surrealistic CG fantasy artwork.Output: 1 original Image (832x1216px)1 2x upscaled image (1664 x 2432px)
If I can wait and something sexy, then I can get this shit
what is most used and most compatible wan checkpoint for loras
480p? 720? gguf?
I am not good with these choices
hello i would like to generate images in this style of drawing with stable diffusion could someone please help me when i do it it makes an image that has nothing to do ?
"Aesthetic wallpaper inspired by Oriental Five Elements, gold and wood fusion, soft green leaves with intricate designs, shimmering golden branches, interplay of emerald and gold, tension of balance, delicate mist, hidden golden glimmers, minimalist abstract, ideal for mobile, 4k"
Design a vibrant and eye-catching Spotify playlist cover featuring a dynamic collage of musical elements. In the center, a vintage vinyl record spins, surrounded by colorful musical notes and abstract sound waves. The background is a gradient of electric blues and purples, evoking a sense of rhythm and energy. Scattered around are small illustrations of various musical instruments like a guitar, keyboard, and drums, adding a playful touch. The overall mood is lively and energetic, capturing the essence of diverse music genres coming together in harmony.
Curious as to why checkpoints/loras always transfer so poorly
And this is from one nvme to another.
Because file transfer for single files is high, but you're transferring 1,554 of them. Cropping won't save you here.
Is there a way to use AI to digitize a logo from a photo?
Yes, but each is 300MB. So it should easily be a few 100MB/s at the minimum
And cropping?
It's 150GB worth of loras, so it should be quite high, even for nvme's at sequentials. It'd be way slower if it was few 100KB per file :P
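A quick back-of-the-envelope check on the numbers quoted in this exchange (150GB across 1,554 files). This is only a sketch: it assumes 1 GB = 1000 MB, and the throughput values are hypothetical placeholders, not measurements from either drive.

```python
def average_file_mb(total_gb, file_count):
    """Average file size in MB, assuming 1 GB = 1000 MB."""
    return total_gb * 1000 / file_count


def transfer_minutes(total_gb, throughput_mb_s):
    """Minutes needed to move total_gb at a sustained throughput in MB/s."""
    return total_gb * 1000 / throughput_mb_s / 60


# Figures quoted in the chat: 150 GB spread over 1,554 files.
print(f"average file size: {average_file_mb(150, 1554):.1f} MB")

# Hypothetical sustained speeds, only to show how the estimate scales.
for speed in (100, 500, 1000):
    print(f"{speed} MB/s -> {transfer_minutes(150, speed):.1f} min")
```

By this arithmetic the average file comes out closer to 100MB than 300MB, so the two quoted figures don't fully agree, but either way the files are large enough that per-file overhead should be small.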
#🏞|general-with-images pixar style
anyone know if this image was generated using stable diffusion?
Anything
@feral lagoon Of course. Use the IPAdapter style transfer.
DM me
And this can basically just take the logo in the photo and remake the same logo digitally? Is there a good guide you know of for this?
Just look at the hands 😄
Every AI messes up hands, that's not really giving it up
I'm trying to create images similar to that one, but can't find the right AI for that
MF
guys
it has been a rollercoaster tonight
won a bid on ebay for a 3090
(bear with me)
received it today ! total surprised ! (the mail is shit in the UK you pay for next day delivery, you are happy if you got it the week after)
anyway
got the card
got a minute for my previous card that serves well
open the egpu (x chroma) ( yeah i got a laptop)
clean it quickly, all good
try to put the new card in
try again...... try again.... do NOT FIT .....
I repeat do not fit !!!!
oh god depressed
google it, the fu%$£%£$ing bracket too large
had to use a bloody plier on my enclosure to make it fit, I was thinking I got no fucking idea what I am doing, I might just have destroyed the egpu and the card trying to make it fit
but guess what
card recognized and soon you gonna have more picture of my dogs !
sorry for the rant, it just happened and I had to say it to someone
on this, gonna mind my own business
https://www.youtube.com/watch?v=Enf2P1_Mofs
Provided to YouTube by Entertainment One Distribution US
Only The Rugged Survive (feat. RZA) · Wu-Tang
Legendary Weapons
℗ Entertainment One Music
Released on: 2011-07-26
Auto-generated by YouTube.
How many steps?
20
Denoise?
0.7
i reinstalled stable diffusion now because
first time i installed it my interent was cutting off often so idk if there may have been corrupted files
i will try again
@feral lagoon Inside Comfy Manager search for IPAdapter. Click on the github link and scroll down. There is how-to video on the page.
That's likely too high denoise. Try 0.1 and increment it gradually with successive gens until you get a better result. If that doesn't do it, then I don't know, I don't use i2i that often.
will try thanks!
@slate drift leave that discord
Its spam/scam
I'd recognize baby yoda's influence anywhere, those eyes and ears...
what model is it?
which model is that ?
Thanks. Yep, Flux.
do you by any chance know what Euler a is?
is it a name for euler_ancestral?
Yes.
good. anime waifus this is the type of thing we all love to see. Waifudiffusion has a special place in my heart
WaifuDiffusion is ancient now and garbage. Download Aniverse v4.0 and thank me later.
hi, what should I choose to get a fast GPU for SD? Some of the ones I pick take 15 minutes to start SD, others 45... What should I look at?
DLPerf
and try to get DLPerf at least 30
thanks Neon
what are these links above?
bots
I picked one with DLPerf 30, it took 8 minutes to start 👍
I didn't know about this
Can I keep my files on Backblaze and just start an instance on Vast when I need to use SD? Is it for something like that?
I'm trying the different preprocessors of the lineart controlnet. I know in theory the differences between them, but I don't see a difference in the generated images
Img2Img for Controlnet Lineart
Preprocessor Realistic
Preprocessor Coarse
Preprocessor Anime Denoise
Preprocessor Anime
they are practically random, I don't see the patterns according to the preprocessors
yeah its fast download
The preprocessing is used to determine the lines from the input image. So if you want to see a difference, compare a realistic image and an anime image as source with different preprocessing models. Not your accident on a white paper 🙂
Now I understand, I will experiment this! 📚
Flux dev or sd3.5 most likely considering the coherence
tried sd3.5 but it doesn't really generate what i want
probably
Yeah probably controlnet was involved
@vagrant dust I made a study
Original images
Prompts:
1° - exterior architecture of a school by walter gropius, wood, glass, steel, realistic image
2° External scene of a biomechanical castle in a landscape of trees, with a lady in front of it
Realistic
Coarse
Anime Denoise
Anime
I understand that the preprocessor must be matched to the quality of the line image. It should be closer to the control map, maybe?
But in practice I don't see a real difference, everything looks random
What am I not understanding?
hey guys,
Hasn't anyone noticed a big change in Comfy's performance lately?
I have the impression that it consumes more ram/Vram than before, or that it doesn't organize the vram><CPU/Ram cache as it used to.
It's having problems now, and makes me do virtual ram (pagefile.sys) whereas before it didn't... I use the same workflow, the same config, the same models... Nothing has changed on my end.
3080Ti, 32gigs, 3700x, 970 Evo+
@grand walrus ?
oh bro i use aniverse too im just reminiscin'
I understand completely. I still think about F111, F222, the Abyss Orange Mixes, etc.
"Turn this image into a Studio Ghibli-style animated portrait. Use the soft color palette, whimsical background, and facial features inspired by Ghibli characters. Style it like a scene from 'My Neighbor Totoro' or 'Spirited Away'
"Turn this image into a Studio Ghibli-style animated portrait. Use the soft color palette, whimsical background, and facial features inspired by Ghibli characters. Style it like a scene from 'My Neighbor Totoro' or 'Spirited Away'
Guys u think it’s a good idea getting a 5070?
Draw peaceful village with ghibli style
i think you're looking for chatGPT's image converter
Yeah!! Can you help me ?
probably? it's on chatgpt
i dont have an openAI account though
Is there any option in dc to generate a pic to ghibli style ?
heyy everyone where can i make request? i cannot see any channel for it
Hey guys. I'm making some static RPG backgrounds and I've generated a lot of images like:
I want to use outpainting (with stablediffusion webui) to extend these backgrounds and make full scenes.
Is this possible? Have any of you done something like this before? Any model recommendations?
yes, it's possible. I did it a long time ago; I don't remember the details, but other users can help you 😉
Note: you handled the consistency and composition of the characters very well
PM
Image on the right is with inpaint using the settings you posted.
The teeth are kinda better, the eyes are fine, just a little looking up too much, but the skin color is horrible.
yea it's too bright in this case
Happens to me a lot with inpaint
I'll try now.
It's not as bad, but still not the right color
That was with 0.3 denoise
yea much better
With 0.2 it's better, but you can see a clear color change around the mouth
I guess I could try to paint only the mouth and eyes
Here's with adetailer
also bad
you can select the eye only model in adetailer
are you in txt2img or img2img?
img2img
yea there adetailer isnt as good to use
I'll try in text2img
i tried a test with Adetailer in txt2img with its default settings
Yeaaa... that's like, not really an improvement
Looks about the same.
Little sharper?
But I kinda like the first image more
It's odd, when I look at the two image you posted I like the right one better, but when I click on them and zoom in I like the left one better
This was eye only in text to image
But I don't know what it looked like before, so I don't know if it's better or not
but the eye looks good
Yeah, not bad.
Gonna try something, I have a hunch, maybe if I remove "freckles" from my prompt it will work better. I think it might have an issue with red freckles on green skin. Probably confusing the AI with those colors mixed on a face.
yea adetailer as well as inpainting mostly smooth out freckles or blush, and they sometimes struggle with colored skin
clean faces work good
Yeah, I've been having a heck of a time with this goblin girl I've been working on for weeks lol. I had much better success with my last character.
yuuuup, there we go, when I removed freckles it was a LOT better
Yes, working on OC's
Playing a goblin alchemist in a pathfinder game and I wanted art
Phew, thanks for helping me through that, it's great to have someone to talk to about this stuff. I think the freckles might have been the issue. ^_^
nice no problem 🙂
and yea dont bother with codeformer or gfpgan, they are outdated, and codeformer is for realism anyway
Maybe you can help me with another issue, that might not be an issue, but a limitation? I wanna generate images larger than 1024x but I think Illustrious doesn't like that. When I try to make an image larger than 1024 it distorts my character's legs and makes them lumpy
illustrious is like pony models, based on SDXL.
SDXL models are trained on a 1024x1024 resolution.
So staying near that gives the best results but you can also go a bit larger like most people use illustrious at 832x1216.
Going too far above a trained resolution will result in duplicates and deformations.
So most users use upscaling to enlarge the image and also get a better quality. You can do that with the Hires fix for example
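The sizing advice above can be turned into a quick calculation. This is only a minimal sketch: the helper name `sdxl_size` and the choice to snap each side to a multiple of 64 are assumptions for illustration, not something taken from a specific UI. It keeps the pixel count close to the 1024x1024 training budget while matching the aspect ratio you ask for.

```python
import math
from typing import Tuple


def sdxl_size(aspect_ratio: float,
              target_area: int = 1024 * 1024,
              step: int = 64) -> Tuple[int, int]:
    """Return (width, height) close to target_area for a width/height ratio.

    Hypothetical helper: the 64-pixel step is an assumed convention.
    """
    if aspect_ratio <= 0:
        raise ValueError("aspect_ratio must be positive")
    # Unrounded dimensions that would hit the target area exactly.
    width = math.sqrt(target_area * aspect_ratio)
    height = math.sqrt(target_area / aspect_ratio)

    # Snap each side to the nearest multiple of `step`, never below one step.
    def snap(value: float) -> int:
        return max(step, round(value / step) * step)

    return snap(width), snap(height)


print(sdxl_size(1.0))         # (1024, 1024)
print(sdxl_size(832 / 1216))  # (832, 1216)
```

Anything much larger than what this returns is where Hires fix or another upscaling pass comes in, instead of raising the base resolution.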
Yeah, I can hires 2x, but if I try to do more I get the lumps again
yep, make sure the denoise is not higher than 0.5
Yeah, I try to keep denoise around 0.3-0.5
It's frustrating cus when I make an image that is say, 1280x1280, I can see how much better the face is rendered, it looks soooo good, but the body turns to lumps. Makes me sad >_<
yea best is then to limit the width to a smaller value
so the person dont have space to deform xD
lol, yeah, I've noticed that in images where the character is closer to the screen and less of the body is rendered, like an image from the waist up, the image looks a lot more detailed. But this makes me sad too cus I like the whole body look.
I want my cake and I want to eat it too 😠
yep exactly! full body is the hardest as you have less pixels for the face and hands
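A rough way to see the "fewer pixels for the face" point is to count them. The fractions below are illustrative guesses only (a face taking about 1/8 of the frame height in a full body shot, about 1/3 in a close portrait), not measurements, and the function name is made up for this sketch.

```python
def face_height_px(image_height: int, face_fraction: float) -> int:
    """Approximate face height in pixels for a given share of the frame height."""
    if not 0 < face_fraction <= 1:
        raise ValueError("face_fraction must be in (0, 1]")
    return round(image_height * face_fraction)


# Both fractions are assumptions chosen for illustration.
full_body = face_height_px(1216, 1 / 8)  # face is ~1/8 of the frame height
portrait = face_height_px(1216, 1 / 3)   # face is ~1/3 of the frame height
print(full_body, portrait)  # 152 405
```

With these assumed numbers the face in the full body shot gets well under half the pixels per side that it gets in the portrait, so eyes and teeth have far less room to resolve.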
So basically, I need to give up on my dreams is what you're saying? XD
if you need perfect hands then maybe or use "hands behind back" as tag xD
for the face we have adetailer
I don't have too much issue with the hands, mostly the face.
Is it advisable to double or triple or even quadruple up on the adetailer? I see you can set it to do up to 4 masks at once.
Like say, one for the body, one for the face, and one for the eyes?
thats useful yea if you want specific stuff upscaled:
there are for example a lot of community models for adetailer, for like glasses, furry faces, eyes, etc:
https://civitai.com/search/models?sortBy=models_v9&query=adetailer
Oh, here's another question, prompt related, drives me nuts, but any idea how to stop stable from generating symmetrical backgrounds? Both sides always seem to be more or less a mirror of each other. I've tried prompts like "Asymetrical background" and stuff, but doesn't help much.
Ooh, interesting
prompt for a background would be the easiest
instead of letting the ai decide
It's just hard, cus my prompts keep getting longer and longer and that's not good, right? XD
yea too long is not that good
I'm trying to keep it down to like, 240 tokens,
I often get up to 300 ish
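For a sense of what 240 or 300 tokens means in practice: CLIP's text encoder takes 75 prompt tokens at a time (77 with the start and end markers), and the A1111 web UI handles longer prompts by splitting them into 75-token chunks. A small sketch of that arithmetic follows; the function name is made up for illustration.

```python
import math

CHUNK_SIZE = 75  # usable prompt tokens per CLIP chunk


def prompt_chunks(token_count: int) -> int:
    """Number of 75-token chunks a prompt of this length is split into."""
    if token_count < 0:
        raise ValueError("token_count must not be negative")
    # Even an empty prompt is encoded as one chunk.
    return max(1, math.ceil(token_count / CHUNK_SIZE))


for tokens in (75, 240, 300):
    print(tokens, "tokens ->", prompt_chunks(tokens), "chunk(s)")
```

So a 240-token prompt and a 300-token prompt both land in four chunks, which is one way to think about how much there is to trim.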
It's so hard to keep it short, there's sooo much to describe just for the character's outfit
But thanks for the heads up on the adetailer models, that's awesome!
no problem yea there are some nice ones 🙂
Hmm, do I have to use ones that say they're specifically for illustrious?
Like how you have to with loras?
nope they work with every model
its just inpainting
the models are only for detecting stuff in the image
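Put differently, the detection model only hands back boxes, and whatever checkpoint is loaded does the inpainting inside them. Here is a minimal sketch of the box-to-region step, assuming a detector that returns pixel boxes as (left, top, right, bottom). The function name and the 32-pixel padding are illustrative guesses, not ADetailer's actual code or defaults.

```python
from typing import Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom) in pixels


def padded_region(box: Box, image_size: Tuple[int, int], padding: int = 32) -> Box:
    """Grow a detection box by `padding` pixels and clamp it to the image."""
    left, top, right, bottom = box
    width, height = image_size
    return (
        max(0, left - padding),
        max(0, top - padding),
        min(width, right + padding),
        min(height, bottom + padding),
    )


# Hypothetical face detection on an 832x1216 image.
print(padded_region((100, 50, 200, 150), (832, 1216)))  # (68, 18, 232, 182)
```

Because this step only deals with coordinates, it does not care which base model produced the image, which is why the detection models are not tied to a checkpoint family the way loras are.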
Oh cool, good. So they add onto adetailer and you just select it as a mask when you wanna generate in text to image?
nice nice.
This is great, making some more head way! Every week I grow stronger and stronger. ^_^
Thank you thank you. ^_^
Have you had much luck with things like mosaic or is that not worth the hassle?
here is a good example of full body with adetailer off and on
left is on and right is off
thanks discord for switching them... xD
lol
Yeah, huh, that was the full body one? It looks like it only changed the face, ya?
It did a good job on the face though
no that was only the face
you dont need a body switcher for full body
Ooh, you meant full body as in a full body image.
yea sry xD
Gotcha, I thought you meant the whole body thingy in adetailer. gotcha gotcha
any cool extensions you recommend? I'm gonna look more into this adetailer model stuff when I have a moment.
I got the booru tag thing, that's super helpful.
Is the refiner worth using? I'm not sure if I can tell a difference with it on vs off.
there is one very important extension. Its called Booru tag autocompletion.
As you use Illustrious models you need to know that these were like pony models trained on Booru Image board tags.
The extension suggests these tags to you like autocorrect and shows how commonly each tag is used, so you can prompt much better and more effectively
yea i should read before typing
Yeah, I've got that one, it's a big help. totally!
nope its not recommended to use
it was made for sdxl in the early days
but hires fix is much less resource heavy and works better
Okay, what about kohya hrfix integrated?
that can be used like hires fix
I've tried it on vs off and I'm not sure if it's good or not.
Any scripts I should be using in the script box at the bottom?
X/Y/Z script if you want to endlessly compare stuff with each other ^^ like samplers, steps, loras, checkpoints etc
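The idea behind that kind of comparison is just a grid over every combination of settings, one image per cell. A minimal sketch in plain Python, not the script's real implementation: the function name, the dictionary keys, and the checkpoint name `model_a` are all made up for illustration.

```python
from itertools import product
from typing import Dict, List, Sequence


def xyz_grid(samplers: Sequence[str],
             steps: Sequence[int],
             checkpoints: Sequence[str]) -> List[Dict[str, object]]:
    """List every sampler/steps/checkpoint combination, one dict per image."""
    return [
        {"sampler": sampler, "steps": step_count, "checkpoint": checkpoint}
        for sampler, step_count, checkpoint in product(samplers, steps, checkpoints)
    ]


# "model_a" is a placeholder checkpoint name.
grid = xyz_grid(["Euler a", "DPM++ 2M"], [20, 30], ["model_a"])
for cell in grid:
    print(cell)
print(len(grid), "images to generate")
```

The cell count multiplies quickly (2 samplers x 2 step counts x 1 checkpoint is already 4 images), so it helps to vary one or two axes at a time and keep the seed fixed.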
I've heard about that one, not sure I need it though
Oh, here's a question that I have been struggling with for a while. Most of the illustrious models I've seen recommend using Euler a for sampling method, but when I talk to chat gpt for advice and ask it what sampling method is best for fantasy character art it says hands down DPM++ 3M SDE Karras. I've tried both and it seems that DPM does look a little better, but I'm not 100% sure there's that huge of a difference and I'm curious why Euler a is always recommended on the model description pages.
euler a is cleaner
dpm++ samplers add more noise into the image
that can cause weird issues with some models
but more noise means also more details
Hmm hmm. I see, I see. gpt says stuff like, dpm is sharper, has better lighting, and produces higher quality images?
yea it doesnt have much clue about SD
What schedule type is best with euler a? I often use karras with it.
karras is the best to use it with.
if you leave scheduler on auto it will always use the recommended one for each sampler
I like the R-ESRGAN 4x+ Anime6B for hires, that's solid, right?
For normal sampling steps, should I go crazy and do 100? XD I usually do 30, but I notice things do kinda look better with more steps, like 50 or 60 but not sure it's all that better.
Sorry, just basically asking a million questions lol. Sorry sorry.
no problem yea the upscalers are okay to start but there are also some great community ones:
you can find them here:
https://openmodeldb.info/
For SDXL and illus i also use 30 steps its a perfect balance between speed and a good output
more steps doesnt mean better quality, the sampler then just tries to add something more, can be for example a new hair strand or a sixth finger
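Pulling the settings from this conversation together, here is a sketch of what a request to the A1111 web API could look like (the web UI has to be started with `--api`, and the endpoint is `/sdapi/v1/txt2img`). The field names are the ones I recall from that API, so treat them as assumptions and check the `/docs` page of your own install. The prompt is a placeholder and the function name is made up.

```python
import json


def build_txt2img_payload(prompt: str, negative_prompt: str = "") -> dict:
    """Build a txt2img request body with the settings discussed above."""
    return {
        "prompt": prompt,
        "negative_prompt": negative_prompt,
        "sampler_name": "Euler a",    # the sampler most Illustrious pages recommend
        "steps": 30,                  # balance between speed and output
        "width": 832,                 # common Illustrious/SDXL portrait size
        "height": 1216,
        "enable_hr": True,            # Hires fix instead of a bigger base size
        "hr_scale": 2,
        "hr_upscaler": "R-ESRGAN 4x+ Anime6B",
        "denoising_strength": 0.4,    # kept at or below 0.5
    }


payload = build_txt2img_payload("goblin alchemist, full body")  # placeholder prompt
print(json.dumps(payload, indent=2))
```

This only builds and prints the JSON. Sending it is a plain HTTP POST to the local web UI address.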
ya lol, that makes sense.
clicks link
Wow, a whole new rabbit hole to get lost in lol
I have no idea where to begin
UltraSharp for realism is good
Thanks, I'll look for that one.
I think I've heard of that one, is that for upscaling in the extra tab, or for hires?
or UltraMix Balanced
both
takes notes
Ooh, I see. Okay, cool. So I can get something that works better than the anime one I currently use.
Most Anime Users use Fatal_Anime
Remacri is also good and Lollypop
Thanks. I'll look into that
you can test with x/y/z and compare 😛
Sometimes I get some fairly good stuff, but often it doesn't render well for me. the struggle is real.
I've been struggling to figure out why sometimes I get a good image and most the time I get a bad image.
looks awesome!
Thanks! I just cant seem to replicate the style consistently.
Like this.
It's not nearly as good
My best pictures seem to be with bad prompts that shouldn't work but for some reason do. Then when I try to fine tune my prompt, cut out bad tags, shorten things to be more concise and to the point, I get a worse image. I don't get it XD
I think one of the main differences between those two pictures is the second image is with the "full body" prompt to include the whole character, from foot to head, and so it had less pixels to render detail with for the face and such.
yep thats it
Does that really make that big of a difference?
It's crazy
Like, even the lighting is way better in the first picture
Anyway, thanks so much. I need to go touch grass and get off the computer lol. You've been a massive help. Really appreciate it. Thanks again.
I have a lot of good new info to research now and a lot of helpful advice.
no problem! yea there is a lot to learn and test with SD ^^ have fun!
cute goblin girl!
Thanks. ^_^
epic transformation
Can anyone please explain or tell me where to find a "how-to guide" for creating images and polishing my existing photo for use as a Facebook profile pic?
Noob here
@halcyon burrow - consistency with your character looks good there. Which method are you using?
consistency with outfit and landscape, just not quality and lighting. I don't think I know what you are referring to when you ask for method? I'm still very new.
Consistency, meaning your character looks the same across multiple images. Just detailed prompting? Or using IPAdapter, a Lora, or something else?
I have 6 loras in the prompt and the prompt is almost 300 tokens long.
idk what IPAdapter is.
But thanks! yeah, I'm happy that the character consistently looks the same. ^_^
I also told the prompt to base the character off of Tifa Lockhart, so that helps with consistency as well.
I use illustrious as well, if that's helpful to know. I really like the checkpoint model illustrij v8.25. The creator has newer versions but I like 8 the best.
LOL... Tifa huh? I wonder if anyone around here ever generates Tifa images.... 😆
The illustrious models are tough for me with their severe reliance on artist tags. And you gotta be careful of 'ai face'.
what's "ai face"?
But yeah, Tifa is a common one for sure XD
The booru tag extension helps a lot when prompting for illustrious.
It is kinda funny though, my image is based on Tifa Lockhart, but it honestly doesn't look like her much. This is in part because in the prompt I stated "based on" instead of saying that it is her. Seems to make a big difference.
You can see it clearly in my character's bracers on her arm though
It's the phenomenon where there is a very common (or a few common) faces that all look the same for characters people generate. 🙂
Now with anime-style art, this is sort of its own thing, as there tends to be a lot of similarities in the characters anyways.
But for some of the more "realistic" or "uncanny valley" models (thinking back to like the RevAnimated model, when that got big), you run more into the AI Face.
I find that most models are really prone to that ai skin look
Even in anime ones
But because people don't prompt around it etc it's the most common too
Checking to see if I have an example. Problem is I have like 10k images saved and no way to easily search anything 😄
AI Face....
Versus images like this that are far more interesting. (Though this could have used a face segmentation...)
Anyways, here's a Tifa. I'm going to go get a snack. 😛
Aw, yes, I see what you mean. I have noticed that a lot of faces look similar nod-nod
@dry crow Thanks to you 😄
nice one! np 🙂
I love the owls lil hat :3
Just the lil hat ? 😭
I like the whole picture ofc!!
The little hat just made me giggle~
I find that some of the tags that are supposedly bad prompts are not actually that bad or rarely even exactly what I desire
Hey guys i'm struggling here, how can i make ghibli kind of images from portraits in stable diffusion? like chatgpt
Without changing the facial structure
This Eid: the wish of every Muslim man and woman 🇵🇸🤲
Anyone here who didn't fall into the trap of OpenAI's Ghibli style? That's not even Ghibli style despite the adherence to facial features. This is Ghibli style
Not this
Or this
Its a combination of the different Ghibli styles (movies)
So its very similar but different
Haha we all love little hats
@jaunty prawn Now giggle.
how do i create ghibli images?
Whats with the obsession with Ghibli style?
seems very different to me.
Even the most powerful AI they say of Sora - OpenAI cannot draw "kissing forehead" correctly
My friend totally had this controller... his dad worked for Nintendo.
Create Ghibli style image
Millions of people had that controller, it's the standard Nintendo controller. Nothing special.
Turn it into Ghibli art
Not sure if serious... 🤔
Just made epic profile picture for me lol far beyond Ghibli style drama
nice! Love it
😴
Anyone know a flux workflow where I can have the exact same background and same character, but just with each image showing the character doing different poses?
I make an image with just a background and generate a separate image of the character with a transparent background and paste the character into the background. But the shadow and lighting are not correct when you do that. So I don't know what to do
Spam
If you joined them, leave, it's spam
To troll the original artist with… which I’m not even going to touch it… it’s not even the actual style.. just particals
Let's spread real Ghibli
ghibli this picture
.
@copper matrix
Ahh you didn't change the image resolution
If you were to match it in "aspect" ratio you'd get better results
Hmm though you're also using SD1.5 i see
On an 1024*1024 image
What gpu do you have?
nvidia rtx 4070
The model didn't really turn the Chinese family into robotic humans. For example, they still look like regular people. What is the main issue here? Also, my original resolution is 768x1408. If I generate an image at 512 resolution, wouldn't that result in low quality? What's the logic behind choosing 512x512 and then upscaling afterward? Is that the correct workflow?
And regarding increasing the resolution, which tab under the 'img2img' section should I use? I couldn't find it, and even though I watched tutorials I still don't fully understand how it works. 😦
With a 4070, i recommend looking for SDXL models
Hmm ai gets confusing because old content gets promoted on youtube
The logic on that one though is that SD1.5 was trained and made for 512x512 images
While sdxl is made for 1024x1024
I think ControlNet doesn't support 1024x1024
For example I created this it didnt exist before. text2image
My final goal is to convert an existing image of mine into anime style using img2img, but I just can’t get it to work. That’s my only real problem. 😦
When using txt2img, I can enable the 'Hires. fix' feature, which generates the image at 512x512 and then upscales it to 1024x1024.
But in img2img, I can’t find the 'Hires. fix' button at all.
So how am I supposed to upscale after generating the image at 512x512 using img2img?
I just can’t figure out the logic behind this!
I think it can not generate hands and limbs anatomy 😄
if you want higher resolution, use a newer model. SDXL might be sufficient for your purpose
controlnets can deal with any resolution. It's just important that the control image and the generated image have the same resolution
btw "hires fix" is nothing more than generating an image at low resolution, upscaling it, then doing img2img to improve details
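Those three steps can be written out as a tiny pipeline, which also shows how to get the same effect from the img2img tab by hand: generate small, upscale, then run img2img on the enlarged image. This is only a sketch of the order of operations. The three callables are placeholders for whatever your UI or library provides, and the 0.4 default denoise is an assumed value.

```python
from typing import Callable, Tuple, TypeVar

Image = TypeVar("Image")


def hires_fix(txt2img: Callable[[str, int, int], Image],
              upscale: Callable[[Image, float], Image],
              img2img: Callable[[str, Image, float], Image],
              prompt: str,
              base_size: Tuple[int, int] = (512, 512),
              scale: float = 2.0,
              denoise: float = 0.4) -> Image:
    """Generate at low resolution, upscale, then refine with img2img."""
    if not 0.0 <= denoise <= 1.0:
        raise ValueError("denoise must be between 0 and 1")
    width, height = base_size
    low_res = txt2img(prompt, width, height)   # 1. low-resolution generation
    enlarged = upscale(low_res, scale)         # 2. plain upscale, no new detail
    return img2img(prompt, enlarged, denoise)  # 3. img2img adds detail back


# Stand-in functions that track only the image size, to show the flow.
result = hires_fix(
    txt2img=lambda prompt, w, h: (w, h),
    upscale=lambda img, s: (int(img[0] * s), int(img[1] * s)),
    img2img=lambda prompt, img, d: img,
    prompt="placeholder prompt",
)
print(result)  # (1024, 1024)
```

The denoise value in the last step is the trade-off: too low and the upscaled blur stays, too high and the refine pass starts redrawing the image.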
I downloaded this model
in your case you want to generate an anime version of yourself. img2img is not the best way of doing that
probably something like IPAdapter is better suited
you can try img2img and controlnet (I would use depth controlnet instead of canny though), but you will need a lot of noise to get your image changed into anime style and then you will lose the similarity between your face and the generated anime face
this example for ipadapter should be exactly your usecase:
thanks, I also tried something and this is my result, but I couldn't understand the logic of using inpaint to fix fingers and feet
how r u creating those
mine just tries to make it realistic and it looks bad
like here i tried to create a car
this shit happens when i type something like catgirl
like it looks pretty bad
looks like sd 1.5
I'm guessing that's your problem
(Although you can do pretty amazing things with just sd1.5)
Since it is AI we dismiss it as bad. but if this was hand drawn by a person it would be very disturbing.
Like that would be the series entitled "Gertrude"
create an image of a bear in a Superman outfit, the bear from Burr Bear
Create an image of a bear in a Burr Bear superman the bear outfit
(original) deepseek coder 33b instruct / flux unchained by SCG hyfu 8-step / flux hands 000016 lora / flux detailer v3 000007 / fluxdetail fluxhands unchained RAW photo, showcasing an awe-inspiring evening view in a stylish loft apartment within a futuristic cityscape teeming with life. The room features floor-to-ceiling windows extending at an angle across the full length of the image overlooking a bustling urban backdrop. Soft, golden hues of sunset light filter through these expansive windows, casting dramatic shadows on the luxurious furnishings and the man sitting by the window. High-definition resolution highlights intricate details in the modern architecture and design of the cityscape outside. The television is positioned on a table to complement this panoramic view, creating a well-equipped leisure space for entertainment.
Fusion of future aesthetics and natural themes, flowing glass texture, refraction effect
dall3, of course. 😄
some day? 🙂
Well by the time they release it im sure better stuff would be around
yep, keeps changing. dall3 is still my favorite for a while, though.
Here is a very noble woman who recently owned Microsoft a big tech company in the US and they say Ps will not change US. just wait in the upcoming years to see a new US.
Chat what u think the “best” flux loras are for portrait fashion photography?
admittedly SD 3.5 is better at skin
get novaanimexl if ur looking for anime
its good :3
Ibtehal Abu Al-Saad: One Woman Against 10^6 Men. Here are more details about the brave stance of Ibtehal Abu Al-Saad against the use of artificial intelligence in Gaza 🇵🇸✊
After being expelled from the hall, she went to a press interview and said that she fears the consequences of her uprising against Microsoft less than she fears unwittingly contributing to writing code that would be used for the genocide in Gaza. 💔
And the issue of using artificial intelligence in warfare is not new; it is documented and has been reported by independent newspapers before. 📰 Mohamed Mahdi Al-Tabar even received a warning when he exposed the matter in the OpenAI Discord, causing him to leave the heavily censored server. 🤐
But this virtuous activist, this fearless exposer—“a witness from among their own” (but she turned to be among Palestinians)—rose up even more forcefully against one of the AI syndicates on the company’s 50th-anniversary celebration day, making an honorable stand that even the Arab rulers themselves could not record, despite their awareness of this crime. 👏👑
And it is not only Microsoft that uses AI for genocide in Gaza; there is also a report accusing Google of being complicit in this heinous crime against humanity. ⚠️ You can watch the video “How AI Tells Israel Who to Bomb – Vox,” which documents in detail this new sadism and fascism. 🎬🔍
#Expose_Technology_Crimes #AI_for_Killing #Ibtehal_Abu_Saad_Rises #Palestine_Will_Not_Die #Boycott_Complicit_Tech_Companies
Designed with artificial intelligence by Mohamed Mahdi Al-Tabar 🤖🎨
You may use this image (but please credit the source and link to the designer’s personal page) 📷📢
why doesn't inpaint work with the FLUX model?
because flux aint no inpaint model
generate augment code vs cursor image
A study I'm doing: putting a 3D image into a photo
3d model from blender
image where the 3d model will be inserted (as context)
I got these results with inpaint
Applied lineart on this and got a more realistic result
I can play with the shape by lowering the controlnet weight
This will be interesting. Gonna try making a lora that uses 5 seconds, up to minutes worth of videos.
Though, i don't know how to make it train batches of frame sets yet.
I already reloaded the UI
Oh boi, thank heck for batch scripts lol.
Had GPT make me a script for every range of scene changes and happenings to describe everything at proper intervals for this one 11 sec video clip. And that's one out of 10's of clips i have plans for lol
How do I download Flux for the webui version of Stable
is this an error message? My SD won't open
ok, thanks. It was the second instance that I tried
hey, is this guy a bot??
he deleted the message
Has anyone tried HiDream I1? It’s a new 17B t2i model released under the MIT license and supposedly has done well on some benchmarks and arenas. It uses T5, CLIP, and llama 3.1 8B. https://huggingface.co/HiDream-ai/HiDream-I1-Full
#🏞|general-with-images Three East Asian children sitting on lush green grass, chatting and laughing. One cheerful boy holding a smartphone, gazing at the screen with bright eyes and a radiant smile, sunlight creating soft bokeh in the background, vibrant summer atmosphere, anime-inspired illustration style with soft shading and warm color palette.
Another (relatively) new model, apparently autoregressive: Lumina-mGPT-2.0. https://github.com/Alpha-VLLM/Lumina-mGPT-2.0
Just saw this, too: someone has attempted to distill T5-XXL to keep just the visually-meaningful stuff. Could potentially be useful to reduce memory use with Flux and SD3, but I haven't tried it yet. https://huggingface.co/LifuWang/DistillT5
generate a picture with a dog in
That living room would be noisy as hell. Interesting image, though.
Unfortunately, this doesn't seem to work with Flux in ComfyUI on MPS. It just generates noise.
Any Stable Diffusion/AI in general badasses can tell me how does one generate more images in different poses of the same character wearing the same clothes/armor?
This is as close as I could get, I think. Idk though
I tried using ControlNet and all but it didnt seem to work very well
try 'character sheet' in your prompt.
I did, thats how I got the lil sprites, but I need to make separate ones too
I just need to know if theres a trick in general to keep the same looks/armor
You can use a Lora for your character, and/or IPAdapter.
Or decide Wonder Woman and Mega Man should duel.
Fun fact: "Wonder Woman" was not in my prompt anywhere. Thanks AI.
It uses image inputs along with prompts to copy composition, style, characters, etc.
I have not personally used it, but many have.
Isnt that the same thing ControlNet does?
Or, well, hmm... maybe I did many many many months ago. 😄
Is it an addon on Stable Diffusion webui or is it a separate thing?
It is different than ControlNet, but can be combined with Controlnet.
I know it is supported with comfy, I'm not sure what else.
Im on Automatic11111 or whatever its called
I'm sure someone has published a tutorial on the Y'tube.
In fact, I think the creator publishes videos. They might be a bit heady.
I think it is limited to 1.5 and XL models.
Im using ChatGPT as my guide up to now lmao
gggg
no. A controlnet is "laid over your image", so it influences the image directly. An IPAdapter is more like an "inspiration". The model takes style or content from your conditioning image but it does not copy it 1:1
help me, I'll pay, how do I make sure that I throw photos into stable diffusion and the model automatically finds clothes and puts them away? I'm creating a telegram bot with this functionality, please help me, I'll pay!
Hello
This style > Ghibli
Again AI showcasing bias? This ruined Rockman integrity. Please delete 🥹
You never know what it'll do next!
You seem really upset about ai bias when its bias comes from the images in its dataset
I've had random characters show up sometimes due the scene, no big deal
Best wallpaper. I wonder if you guys can help me enhance it further with your inspiration, like by removing some artifacts or introducing something else
Seems like a depressing wallpaper, why not have one that celebrates the culture of Gaza instead?
You seem to advocate a lot for it so some positivity could be nice
But if that's the vibe you're going for I'd make the right half of the street a bit darker to match the left
Can't ignore the darkness for celebrating the culture. It is not a subject matter of the moment but I hope to do it soon. Unfortunately we don't live in a pink world and yeah I will edit the right side now to see though It seems to me like a flashback scene from an Anime
You could have the subject look into the camera, Mohammed.
that would be destructive unless I use Sora
hmm otherwise, she seems to be a bit "clean" considering the devastation in her surroundings
maybe some grime or dust
seems darker now
It is symbolic rather than realistic but the presentation itself should be enhanced
I would like to make it more realistic so that it will not look like AI
I would like it to have a 3D overlap visual illusion. It did do it but it is very flat and not smooth
Get a non cartoon model
seems nicer but the anatomy can still be enhanced further
What are you using?
Here, have a Tifa.
an Tifa?
Hey Omnia, what model is it?
That was 4o.
RayFlux FP8 (4 min a gen on my m2 pro MacBook with 16gb ram but I can’t complain it works)