#💬|general-chat
allows for X and Y tiling (and both)
I found one for 1.5 but I suspect flux will give me much better results
SDXL even will
Isn't that just for tiled encoding/decoding?
nope check #🏞|general-with-images
don't listen to this acc ^, it appears to be hijacked by a bot in another server. refrain from clicking links or joining conversations with that account anywhere but here, and use extreme caution
hello guys
What's up.
z
I wasn't really paying attention to when, but it's a good anime model. Most others suffer from small dataset, poor training, or are just trained via lora. There weren't really a ton of good options. It's not like illustrious does anything super fancy iirc, it just avoids most problems that others have made and actually had enough compute to work with.
why would anyone use your AI service while everyone here is running it locally 
Is there any way to run Kling 1.6 on your own local PC or something similar to get nice ai videos? 🙂
I want to avoid paying 😄
what graphics card do you have
2080ti and a 3060
thanks for telling me
is it good as noobAI/pony
what other widely used SDXL models are there
Personally I feel like illustrious is better
I never struggle with hands or any other issues pony had, haven't tried noobai yet tho
hi

does flux model have problems with general prompt (fp8 models), like instead of giving lot of details, you say something basic without much detail
Then, sadly you probably cant. Might be able to run LTX but that's low quality
i can't get the result wanted if i'm not specific, like if i want to let the ai decide certain detail by itself, it won't follow the general idea i wrote in the prompt, it'll have problems with following the prompt
if that's the case then i won't use flux for general idea
xd, i think either my discord or the server thinks im a moderator
oh i cant attach images here
#architecture
Really, what can I use to make decent videos with ai?
You could try hunyuan but its not gonna be fast on that 3060
with wavespeed node it can be quick, minimal quality loss
Good morning, everyone! How are we all today?
What graphics card do you have?
Also if it worked before and only broke 3 days ago you need to ask yourself what changed. Did you make an update that broke things or change settings?
what software is the best for image to video? and what's the best one for extending a video with ai?
specifically trying to prank my homies by making them kiss each other.
any online version will probably block it for being nsfw etc
there isnt a good image to video yet iirc but maybe LTX could do it for you
Hey guys. I need some help for non nsfw picture I need for a friend of mine. I am running A1111 but didn't get a good result with my models (I use lightnings because I only have a GTX1070) and also I am bad at prompting to get what I want. It should be a greetings picture for a friend of mine for her birthday ... Maybe someone has some time those days to help make that picture? Also to get more into AI generation or can create that pic for me. I tried Dall-E because I need that pic till 08.02.25 but it should be a cute smurfette and Dall-E has problems with the copyright ...
can flux do skyboxes .. haven't looked into this sort of thing for a long time, i saw some service by blockade labs that seems to do that
what method is preferred for that sort of thing.. i've seen people do cylindrical maps i think
Does someone have experience with supir? The example images look very good, but when I tried it in comfyui the results were bad. I used juggernaut as the model and the input image is photorealistic, although it looks a bit cg because it was made with flux. I prompt for photorealism as well with everything artistic in the negative prompt. The control scale is set to 1, yet the best results I got so far look painterly and far worse than what I get with hires fix and related methods. Am I doing something wrong or were the examples cherry picked?
super-resolution is just really hard
https://github.com/yuvraj108c/ComfyUI_InvSR this is a good one to try
Does anybody here know if there’s something up with the SD Deforum Google notebook recently?
Since the Emad emojis are no longer available this means a complete and total failure on everyone's part. We apologise on behalf of all AI.

managed to make a pic with 2 char loras, that shit is ROUGH
On behalf of the Homies Association of the World, I implore you to not perform such a prank.
Sincerely,
Head of the HAW
Why are the stop and interrupt buttons so worthless
It's faster to just close the program, wait for it to load up and then set everything up again
How is that even possible?
Is this just a me issue or is stable diffusion just like this?

kittyyyy
is Stable Diffusion 3.5 Medium uncensored?
Its crazy how the more AI tools I get the more gooder I get at bugfixing dogshit environments
Probably not since its released by stability it self. Maybe a fine tuned model might work better
Sdxl is uncensored
And online it says that the model is, including yt vids
Ive tried it, its not
But if finetuned/trained etc its no longer censored
And by saying sdxl, i meant that that was also released by stability ai
Not when i tried prompting certain stuff into it but sure 👍
hey
why cant i get stable diffusion to do what i want lol
https://i.imgur.com/MdQ4ZF1.png for example i want to extend the hills in the middle down to fill the bottom
but i just keep getting strange results
Hey, quick question, is there a way to search in the LORA Loader dropdowns in ComfyUI? I can only see a filter input and that is useless when you have folders and it only filters the top level
Bad code. All software these days has this problem https://www.youtube.com/watch?v=kZRE7HIO3vk
lora stacker node
@slender vault by efficiency nodes? It's the same thing with that one for me. Can only filter the top level
Thanks, although I tried CR and rgthree stackers too, but they also have the same kind of non-searchable dropdowns
That's how all their dropdowns look for me
I remember having one lora loader before which showed thumbnail previews at least, which is better than nothing, but having search would be best
Because sometimes I load someones workflow and they don't have their loras and checkpoints sorted the same way as me and I have to dig through all my folders to find the lora
So, question... does anyone know why SD seems to get into a certain... "rut" where it'll degrade to making shitty images with like double bodies where the same prompt used to make good images.. and then just keep doing the same type of double body even when changing prompt or model (model changes the style to that model ofc)?
eg.. SD just decided that a core part of every image is a face in the abdomen
regardless of the values i change.. I get different styles of just... that... probably till i restart the backend
monkey patching
is what it sounds like
uh... ima need to google that
ok.. not sure how that applies exactly.. is there a way to force the software out of it.. like clear caches or such without restarting it?
yeah the way to fix it is to find the source of the monkey patch
such as? like I've tried toggling hi res off, adetailer, changing prompt words, changing models
I can't remember which things I found monkey patches in but there were several comfy nodes that did
its quite common
it will be the same for extensions for A1111 or forge
Anyone know how to edit one person at a time on a1111?
Like in pictures with multiple people in it?
yeah this is called inpainting
can also do it using unsampling
Inpainting?
yeah what inpainting does is it puts a mask over some areas of the latent
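A rough sketch of the masking idea in the message above, assuming the simplest form of latent inpainting: wherever the mask is 1 the newly generated latent is used, wherever it is 0 the original latent is kept. The function name `blend_latents` and the tiny nested-list "latents" are hypothetical stand-ins for illustration, not how A1111 or ComfyUI implement it internally.

```python
def blend_latents(original, generated, mask):
    """Keep `original` where mask is 0 and take `generated` where mask is 1.

    All three arguments are 2D lists of floats with the same shape.
    Mask values between 0 and 1 give a soft blend (a feathered edge).
    """
    return [
        [m * g + (1.0 - m) * o for o, g, m in zip(o_row, g_row, m_row)]
        for o_row, g_row, m_row in zip(original, generated, mask)
    ]


# Toy 2x2 "latent": only the right-hand column is masked for editing.
original = [[1.0, 1.0], [1.0, 1.0]]
generated = [[5.0, 5.0], [5.0, 5.0]]
mask = [[0.0, 1.0], [0.0, 0.5]]
print(blend_latents(original, generated, mask))
```

In a real sampler this blend is typically repeated at every denoising step so the unmasked region stays pinned to the source image; that detail is an assumption here and depends on the tool.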
Like one person has blonde hair and the other has brown for example
I don't do prompt engineering, maybe someone else will know
Ah okay
the way I make big images is to use lots of tiles like a chessboard
so the prompts can be simple cos there are multiple tiles
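The chessboard approach can be sketched as a function that splits a large canvas into tile rectangles. This is only an illustration under assumptions: the name `tile_boxes`, the optional overlap, and the choice to push the last tile flush against the edge are all hypothetical and not taken from any particular node or extension.

```python
def tile_boxes(width, height, tile, overlap=0):
    """Return (left, top, right, bottom) boxes that cover a width x height image.

    Tiles are `tile` pixels square and neighbours share `overlap` pixels.
    The final tile in each row/column is shifted so it ends at the image edge.
    """
    if overlap >= tile:
        raise ValueError("overlap must be smaller than tile")
    step = tile - overlap

    def starts(size):
        if size <= tile:
            return [0]
        positions = list(range(0, size - tile, step))
        positions.append(size - tile)  # last tile flush with the edge
        return positions

    return [
        (x, y, min(x + tile, width), min(y + tile, height))
        for y in starts(height)
        for x in starts(width)
    ]


for box in tile_boxes(2048, 2048, 1024):
    print(box)
```

Each box can then be generated or upscaled with its own short prompt and pasted back; a non-zero overlap gives room to blend the seams.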
is there any way to see a loras prompts in webui? I'd love if i didn't have to go back to the civitai page for it to check each time
so what is the best overall uncensored model? something close to 3.5 image quality
I’d like to second this
In forge you can see the lora keywords and it will be automatically added when selecting a lora.
yeah? where?
Haven't used Forge for a while. Maybe it was an extension.
found Civitai helper, seems to index that stuff.. trying it out
It was a lora tab. It would list all your loras with images and names and you could just click on an image and it would add it to your prompt.
yeah thats there as default.. but it just adds the <lora> for me.. not the keywords.. and if there are more I couldnt see them.. so far the helper pulls descriptions which i can adjust a bit so i can remember better
You would have to add the keywords. I think you could fetch them directly from civitai as well.
Both forge and std webui now display the tokens inside the lora's setting screen. It's that wrench icon on the top right of the lora's gallery image
Some loras do not have metadata tho -- seems like a training setting issue. In which case you gotta go to civitai to see if the uploader included sample prompts
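For anyone who wants to check a LoRA's metadata without opening the UI, here is a minimal sketch. It relies on the safetensors file layout (an 8-byte little-endian header length followed by a JSON header with an optional `__metadata__` entry). The `ss_tag_frequency` key is an assumption about how kohya-style trainers record tags; files trained with other tools, or with metadata disabled, will simply return nothing.

```python
import json
import struct


def read_safetensors_metadata(path):
    """Return the `__metadata__` dict from a .safetensors header (may be empty)."""
    with open(path, "rb") as f:
        (header_len,) = struct.unpack("<Q", f.read(8))
        header = json.loads(f.read(header_len).decode("utf-8"))
    return header.get("__metadata__", {})


def trained_tags(metadata, top=10):
    """Most frequent training tags, assuming a kohya-style `ss_tag_frequency` entry.

    That entry is assumed to be a JSON string mapping dataset folder -> {tag: count}.
    """
    raw = metadata.get("ss_tag_frequency")
    if not raw:
        return []
    counts = {}
    for tags in json.loads(raw).values():
        for tag, n in tags.items():
            counts[tag] = counts.get(tag, 0) + n
    return sorted(counts, key=counts.get, reverse=True)[:top]
```

Usage would look like `trained_tags(read_safetensors_metadata("my_lora.safetensors"))`, where the file name is a placeholder. An empty list means you are back to checking the civitai page.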
hi, is there a "go to" guide for running stable diffusion locally?
i checked the pins and it mentions #unknown channel for getting started
Hey all is possible to make headless ai or not ? I Mean nbm
Hey Anyone have experience in Genfluece ai FLux ?
Are you asking if people use it or do you have a specific question?
hello
whats up @ancient sluice
just trying to generate cool images

I wish I didnt understand more blackbox solutions than literally everyone on earth but I do
Im going to draw a video game its actually going to happen thats crazy. But first Im going to draw a checkpoint model.
uhhh the ping is to say that you're working on a poll? why not announce it and ping everyone when the poll is out? and why are you polling users if the investors are the ones that actually decide?

So we have been summoned aye
who is maxfield and why is everyone sending skulls at him
maxfield based discord moderator
mod behaviour
'Ello
Because that's the community managers job, to keep the crowds invested.
rant away
I love generating art with DALLE and stable diffusion. Would love to do it professionally if possible. look forward to seeing what opportunities follow from this announcement 🙂
emphasis on this
welcome aboard @vapid dove :)
Meme room mentioned, opinion accepted
then we get excited again
I actually think the ping will make more people leave the server than engage them
Wow, I didn't realize Stability AI had fallen so far X.x
I, like everyone else here, am spontaneously generated into existence to commit social engagement
@vapid dove Did AI write that announcement hmm
That's what I came here for to see
good. maybe the trolls will leave
fear is struck into my heart every time a company decides they need a Community Manager
possibly
Did AI post that announcement?
I am at least 55% chatgpt
and yeah, the last time I used a stability product was the image to 3D model and it turned out it was cherrypicked and not even SOTA, and then flux really killed it for good
I WAS RIGHT
Only the real ones are gonna stay 😎
basically stability needs to actually deliver a product that people want to use, maybe remove the censorship which is the top complaint
3.5 is decent, if it was actually easy to train
I still use SD for sdxl tbh but it's more of a community thing keeping it up rather than stability
I don't think you know CivitAI's history, associating with them is very bad >.<
train it on trains
I feel kinda bad for whoever is tasked with fixing Stability's public image, since I assume they are somewhat powerless to solve any problems
tbh my greatest issue was working with hands on most of these models. and finding decent documentation to finetune models for stable diffusion 3.5 has been difficult as well
Sai often is either hit or miss with their models (Cough SD 2 cough cough)
I'll admit, I do not
the community manager that got added for the character.ai server caused that place to turn into a TikTok user tween hellhole
what are you even talking about? civitai is one of the main contributors to stability's adoption rate imho
I'm not aiming to change the discord, just make sure we're showing cool stuff, and getting feedback prior to you know, release
oh hi direct response
They're well known for attacking artists, harming creatives, and (ironically) stealing models from other sources and people as they re-upload for profit.
that and weird floaty bits when making artistic images
It's a big red flag
yeah I wish u luck king
give us SD4.0 with zero censorship, better captioning and a mostly non-synthetic/not overly aesthetically filtered dataset (impossible)
are you going to make a section where people can give feedback on their current grievances?
My biggest want is a competent vlm for captioning data.
if I could go back to a world where it doesn't exist, maybe I would, as I feel bad for artists. but it exists, so at least it should be open, openly-licensed (apache, MIT, WTFPL, I dunno). the worst model is OpenAI where they steal from everyone and claim it as their sole property
I just locally run stuff for fun, but I'm also masochistic since I run a red card
@heady glade are you saying we should not be associated with you, since you have freaken 36k messages on the civitai discord alone?
God. SD2 is like a memory i want to suppress
None of this except maybe the last one makes sense to me lol. Why attack the guy trying to make something cool for others to use as well? Well trolls exist and there probably is no actual sense to be found
Youre not getting a mostly non-synthetic dataset in 2025 over half the internet is synthetic bro
time to go slow mode
I was someone that tried to push against the hate they represented over time.
real, but my request was sarcastic
there's like six people here not even talking that fast
I was one of the artists they regularly attacked and did so for months after I left.
Good night anyhow, I was going to sleep when the ping hit
zero censorship cannot happen after the scandal with sd2
its stability, they'll decide for us what's legally viable for them
well... that was kind of my poi-
SD2 was bashed because it had worse anatomy and was far more censored than 1.5, and had a horrible license
Would you guys fuck with a tool that lets you explore latent space in a 3d gui and train the model with paintings and storyboards so you can punch your own ideas in?
Illustrious/NoobAI has proven you can extract even more out of SDXL than previously thought possible from say PonyXL, so my only request @vapid dove is please we just need the future base model to be easy to finetune, easy to make LoRAs for, and not awfully licensed with ambiguous fine print like the SD3 situation
maxfield what have you done, you ping with no subject for people to latch on to so now they have to make conversation
well not train but literally replace parameters in real time and test them in real time too
stable diffusion died a long time ago to be honest, it was a fun ride, but multiple bad decisions were taken and they also had a huge target on their back, being one of the first companies to share a diffusion model. it's a shame really, but they will never recover, even with Cameron investing in stability
cuz thats what im working on and i got the parameter replacement working last night
I don't think I was that active here at all (especially general) since SD3 actually came out, I came here to laugh about it and that's all
society needs censorship.. lots of weird people around nowadays..
As long as they keep lobotomizing their models there's not really a point in wasting resources, no one is really going to use it since FLUX is a way better alternative
At Civitai we actually banned the original SD3 model because of the license, and unbanned it once fixed. the current license is very permissive
the current SD license I'd say is even better than flux
All the models are dogshit bro in theory they should be completely capable at doing everything.
I'm an agent of chaos
Pretty sure CivitAI just banned it because it would affect profits, not because of the license specifically lol
SDXL was SAI's last great model then they sadly fell off a cliff, with SD 3 medium being the prime example at how far S.AI had fallen
No need to lie about it >.<
Its capable of doing >Everything except the stuff they don't like
No the license was Extremely bad
what bad decisions were made?
@vapid dove Not sure how much of advice I can give you tbh, but looking at how good flux is I´d say focus on making artsy images and improving CLIP / Prompt coherence, rather than making realistic images unless you make them considerably better than flux
no its not it repeats the same 3 compositions between every prompt basically and you have to fish for like 30 minutes to an hour just to find a prompt that doesnt produce dogshit. Literally every model is like this.
im just waiting for other models, flux is good, but it's too large, has the typical overly perfect (incestuous model merge type of) faces and sucks for paintings unless you use loras
then please tell them to get it right the first time so it doesn't suffer from permanent bad publicity 🙏 some lawyer writing that license killed SD3 before it could shine
Idk I havent used stability in forever I use a merge of ponyxl
SD 3, SD2, everything that led up to these models are bad decisions
which is still kinda stable diffusion sorta but much more elite
Haven't gotten that issue from flux so far
they ruined it with SD3.5L, the sauce and soul of SD3 was removed (artstyles, etc)
don't think thats a piece of history that will repeat itself
good: SD 1.3-5
bad: SD2
good: SDXL
bad: SD3
we're due for a good one
hello maker of the pony models
Wait do some of you guys use just the base model?
the release of SD3 absolutely killed them, then they proceeded to gaslight people into saying it's the best model out there, then Emad left, and then they stopped publishing anything after the SD3 shitshow so they fell deeper into the forgotten companies. at the same time competing models were emerging, and this was over for stability, not even mentioning the gigantic amount of lawsuits they received. It kinda was coming from the start really, they peaked and the downfall was quite fast and brutal
which base models
Like do you guys use base models without checkpoint training? I didnt think anyone did that.
I'm surprised the >Skill issue guy isn't here talking by now
Like I use pixelwave flux I guess, but even base flux is good as long as I don't make paintings. never explored SD3.5L finetunes
It's been a long time to be honest
i normally do that to test what initial capabilities it has before going on to see how i can finetune it
hey you're the pony guy right? @nimble canyon still doing the AuraFlow experiments?
lol i know right i remember that dude, what a moron XD
the pixelwave finetune of flux is indeed a top tier model
If by experiments you mean a SOTA community driven uncensored model to be release within a month, then yes.
very close to MJ6.1
Whoa!! I was honestly starting to lose hope it'd come out before Half Life 3
is it gonna be good with paintings (impressionist oil paintings), variety, backgrounds, etc?
I might consider using it despite ignoring PonyXL in the past
I don't even care about ponies I just think it was a really good base for other models (AutisMix, BoleroMix, T-Ponynai3)
Good Im always worried about this ai community because 90% of everyone in it is mentally weird. Theyre really mentally weird
possible SD3.5v? 🥺
actually 99% is mentally weird
lmao
is there some base models that are cool with 16:9?
even noobai sucks at this ratio
I am still on hold with Dunning-Kruger!
SD3.5v?
I would really like a model that can do very different art styles like Cubism, sketch, botanical drawing style. not focus on realistic photos or anime.
I have not tried impressionist but it's a muuuuuch broader model in terms of style, for example:
yikes, I still can't believe what went down in here with Lykon
havent tried
no but what is it? we have large and medium, what does V stand for
did lykon leave the company? who remains from the OG team really?
Society should NOT be censored by others! There should be a filter option so people can decide for themselves what they want to see and what they don't want to see, but that's what the negative prompts are already for. I will not accept anyone else deciding for me what I can or can't see!
Ive always wondered if the AI communities obsession with dunning-kruger is some kind of projection of their insecurities about drawing. Its weird I just see it brought up literally all the time in this community and like basically nothing about these tools has any give or difficulty to them theyre very easy to use by design.
version maybe
Realism: https://cdn.discordapp.com/attachments/1315932135571329024/1336503348983234590/score_9__rating_safe__style_cluster_53__source_photo__Human_female_with_flowing_red_hair_and_striking_green_eyes_is_depicted_in_a_portrait_with_a_serene_express_w1280_h1536_s50_seed55_gs3.5.png?ex=67a4b3fd&is=67a3627d&hm=b2074a676d246855c1612a1345716280e3bb97b99392f68ea1a635d17b2a31e8& (I do not want the model to be realism focused but because people really like realism finetunes, why not to give them a good base)
a lot of people are still here, just talked to Lykon yesterday
i think v stands for v prediction
no cap? how many are left? genuinely curious
v pred is cool with like
lighting and shadows
ah not realism, rough oil paintings, impressionist, expressionist, etc
but thank you anyway for this
My pet-peeve wishlist: SD3.5 not-the-base-model, make it good instead. we now have the base, great, but no one's finetuning/training it into the final version where the godawful twisted limbs and various artifacts have been trained out. Show that somewhere in these weights is a good version. One that has flux/lumina/auraflow (esp the latter two are good) prompt following with SD3 (not 3.5, 3 looks better imho) aesthetics. (And then take a page from the midjourney playbook and double down on style, actually triple down, the more modern the imagegen the worse/more bland the variation of style, MJ excluded. If mentioning artists is a no-go, go with style reference images/stylistic vectors as additional inputs, it's an mmdit, multi inputs..) ((and when there finally is a strong model it's time for the niceties like image editing by prompt "input image of cow -> the cow wears a hat", etc, which seems to be the latest thing in image AIs))
Also good for loras
Max is a wizard
No, I get it, was just illustrating that V7 can do a lot of new tricks.
tbh i have no idea how loras work and im afraid that loras will ruin all this cool lighting and shadows stuff
thats why i havent used single lora since i moved to noobai
its my third day but I've met maybe 20 OG's? sure there's more
it'd also be nice to use the tiniest fraction of "all that compute" to bring the free public discord bot back, that brought so much life in this discord and was such a great showcase of stable diffusion.
Why do you need to mention artists Ive never even thought of it cuz I can just setup a lora or describe the style and it works every time.
I open comfyui and put nodes together (im a professional!!!1!)
did you meet james cameron?
Just FYI, not hard feelings with Lykon at all and all is good, but I am going to meme that event for many years to come.
Honestly true bro
damn really, that's good, there might still be hope for SAI, i hope you will recover from this, but no hard feelings if i'm not currently rooting for you to win the race
I can't speak for downloaded loras but i got my loras to work with lighting and shadow quite nicely, not to the extent of the normal model but like 80%, very good for 1 image loras
@livid gale sorry forgot reply tag
did you use Stable diffusion since the 1.4 days?
competition is always good, so I´d rather have SAI release a fantastic new model with 4, than just go "Fuck them cause 3 was bad"
Didn't SAI have to lay off a big part of the team who made 1.4? people who worked on the original diffusion paper? which led to the creation of black forest labs?
very true
we rode around in his submarine and read old calvin and hobbes comics. No I haven't met him
that is nice but I really hope it can become better than Illustrious tunes such as HassakuXL
(example with a bunch of wildcards https://files.catbox.moe/1qg3tj.jpg)
that's not what i expected to be the takeaway, it was supposed to be "make the AI amazing at style" 😉
i need to search some style loras for noobai
i really like sousou no frieren "brutal" anime style
but like
yea
i want this to be on all my images not just ones that have frieren in prompt
But tldr:
- Super strong prompt understanding
- Style control via prompting or special superartists
- Muuuuch better backgrounds, generally much higher quality visuals.
- 1.5k max resolution
- GGUF versions going down to 2bits (minimal quality degradation on 8bit)
- Bigger dataset including much more realism (although I will be adding even more for 7.1 as I didn't get to full 10M)
as long as something gets released... compared to the LLM community we're starving
If you meet him and go into a submarine, please don't visit the titanic!
I'm sure theres a story there, but it hasn't been told to me yet
just use HassakuXL trust me, you can throw an artist/character tag and it'll probably understand
yeah 2-4bits is gonna be necessary since Auraflow is like 12b iirc, I hope SDXL users wont have too much trouble switching over
I wonder how well pony V7 going to compete with animestro 2b (upcoming anime fine tune of 3.5M by the creators of Animagine XL)
is 12b the same as an LLM would be? like, 12b LLM is very lightweight and runs on most people's GPUs
Well, the whole point of open models is to provide something people can build on top of according to their preferences. I don't know if stock Pony will be better (well, I hope, but still training) but I am sure people are going to create absolutely crazy finetunes out of it.
A lot of loras are pretty baked, not just style loras, with poor tagging or just overtrained, you can always ask for people to train the model for you, also frieren is pretty baked into noobai
yeah its a transformer so its closer to it, but I am not so sure about "most people's GPUs"
i mean i also can throw artist and it will perfectly copy its style but i just dont want to search the one that i will like the most since theres fucking millions of artists on danbooru
i found an article on civitai with artists for noobai but they look more or less the same
No, a 12b t2i model is going to be hellish to run compared to a 12b LLM
This is the first time i heard of Animestro, what arch will it use?
SD 3.5 medium
grim
I might remember this wrong but Auraflow was slightly faster than Flux despite being the same size
well any Illustrious based model can do it to an extent, doesn't need to be my model for it
The more, the merrier. Everyone benefits from having more competing models. (well, V7 is obviously going to be the best)
wdym baked
music model...
working on it...
overtrained not flexible enough
Yue is super slow even on a 3090 iirc
eh like overtrained or just no tag shuffling, very stiff in usage, not flexible and close to original images
i c
8bit fits into 12GB if I remember right, so I'm really not concerned about VRAM at this point
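A back-of-the-envelope check of the numbers being thrown around here, as a sketch: weight size is roughly parameter count times bits per weight divided by 8. The 12b figure is the recollection from the chat above and is not verified here. The estimate deliberately ignores activations, the text encoder, the VAE, and the per-block scale overhead that quantized formats such as GGUF add, so real usage will be higher.

```python
def weight_size_gib(params_billions, bits_per_weight):
    """Approximate size of the model weights alone, in GiB."""
    total_bytes = params_billions * 1e9 * bits_per_weight / 8
    return total_bytes / 2**30


# 12 is the parameter count (in billions) quoted in the chat, treated as an assumption.
for bits in (16, 8, 4, 2):
    print(f"{bits:>2}-bit: {weight_size_gib(12, bits):.1f} GiB")
```

Under that assumption the 8-bit weights come out a little over 11 GiB, which is why they would only just squeeze onto a 12GB card.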
It's going come down to the model which can be run on a toaster with good performance
damn i can just train my own style lora with screenshots from anime
honestly I am not that interested in Anime and I want good prompt adherence, more styles, so I'd be using Pony much more often
You can, noobai its quite easy, also ponyxl was quite easy too but i prefer noobai v-pred for lora
Right, but the metric we are all chasing is "good images per time unit", not just "any image per time unit"
also you can totally fine tune Stable Audio Open to be better at music. I've seen people do LoRAs to make it into a full 3-minute song model, I've seen people add lyrics support, the code's all there and the base model is big enough to be a full song model. It's the same architecture as our best full song models. Just means you need to put together the dataset yourself.
tbh i dont see any reasons why i should use pony instead of noob
only for some OCs that i made on pony since i just cant make same appearance or face on noobai consistently
on pony with prompt i made every image is SAME char
oh yeah v-pred
You are correct, but if you managed to somehow turn a underbaked undertrained auraflow to a model worth using, I can imagine the CagliostroLab team being able to turn around 3.5M
SD2.X had v-pred 
You can try block training, that helped a lot of my character loras, that or training at larger rank and resizing it down sometimes helps massively
AF is a realllly good model, I think people really underestimate it (including Simo/FAL)
Your model kicks ass
AF did hybrid mmDiTs before Flux, and I don't think Simo gets the proper credit for that
thank you but i havent understood anything since i havent trained a single lora in my life
Hopefully the V7 release brings more attention to his work, as he fully deserves it.
damn i'll look into it, how much vram do i need for a finetune, or to make a lora? 10GB?
its all it needs
Honestly same but can give you tips if you want or just point you to the guides where people who are smarter than me know things 
Auraflow is great for training don't get me wrong, but for general use it falls flat compared to 3.5M and 3.5L and flux
I still hold that Simo is the only ML person on Twitter actually worth following if you want to figure out how to beat SOTA
idk i retired from looking for models after my pony finetune with cross compatibility with plugins and resolution. No idea how I got it to do that but its the only sweetie i need now.
I just hope you are right, so far the images you post make me think "I hope this is a very early WIP and not representative of the final model" but I trust you because Pony was good when it came out
I haven't done too much lora training myself I'll admit, but with pre-encoding (code and scripts coming soon to the stable-audio-tools repo), you can get it super low. Particularly if you're going for shorter-length audio. Full songs can be a bit more intensive.
honestly i really need to learn how to train loras
it just feels like that lora training is for some smart asses but it definitely not lmao (i want to believe this)
Simo's a great dude, worth following
I don't know if you mean the one I posted here or in the PSAI discord, but I am pretty happy with the state of the model and it just needs to get a bit of oomph with lower LR.
Try Valstrix's Crash-Course Guide to LoRA, its quite good, and there are a few threads on lora over on FurryDiffusion that are very good but niche
@fickle sun Here's the library people tend to use for LoRAs for SAO. I still need to put in the work to make a first-party version for the official stable-audio-tools repo, I've mostly just been stuck on how to properly fit it into the model config format. https://github.com/NeuralNotW0rk/LoRAW
Also it's in diffusers so you could probably use that with all the normal HF bells and whistles. I haven't used that implementation so I'm not much help there.
You might also try to switch how you tag your images, that sometimes causes problems
Does anyone actually know why? cuz im not kidding and its kinda confusing. is that usually what happens when you merge a bunch of different base model checkpoints together? Is it normal to just get all round cross compatibility from that?
I mean it looks a bit less impressive than current state of SDXL tunes, but it might be because you just didn't finish training
thank you ill screenshot this and comeback to this screenshot when im ready
this is what vast majority of civit ai checkpoints are
oh wait you mean base models
yeah that just doesn't work
Hello, if I want to use Stable Diffusion Deforum in 2025, do I need to pay for Google Colab?
Yea thats the thing it did work with mine its absolutely crazy bro
I have no idea why
Its just the best AI model in existence now because of that
which model r u talking about
Last time I heard Google colab requires a paid colab account to use stable diffusion
sd1.5 merged with ponyxl and it has compatibility with sd1.5 loras and sdxl vae etc
well if its like Gradio then yes iirc
I just need to warn you, take a lot of coffee and a deep breath before training a LoRA or kohya_ss will give you temporary brain damage
or is it stable diffusion itself too now?
This is just bait
So Stable Diffusion Deforum doesn't work anymore?
What...?
dead thing cant be damaged
you can't merge SD 1.5 and SDXL properly
sd1.5 merged with ponyxl
don't bother - they're baiting you
there are some weird and unpredictable effects
which is what you observed
something does happen its just not that useful
Lmao definitely doesn't get easier after time, but there are some good presets to start
I remember my first time fine tuning 1.4, lots of man made horrors, but a good time nonetheless
Somehow nothing I tried was better than the default settings with the basic learning rate / scheduler etc lol
weird and unpredictable results are my favorite part of gen ai personally,
AdamW and cosine are a cracked optimizer and scheduler combo tbh so not surprised
(am assuming that's default, forgot what it is months ago)
noone has answered my question
so is there some cool 16:9 models
yeah its probably the most fun part of diffusion models
I feel a bit sad that if autoregressive image models become dominant we will lose some of that aspect
Oh sorry, i forget that i'm in Albert Einstein chat
with autoregressive you can't do as much fun stuff with adding different noises
I think flux is like semi good at that stuff
you can also try outpainting
my pc wont even be able to run flux
pretty much they can all do 16:9
Then try outpainting, and like roughly drawing the mask in paint or krita or any program you prefer, it works nice with most sdxl models and partially works with sd1.5 as well
The lizard man ruling meta says that llama 4 would generate images as well as text, but i doubt that the images produced can actually compete with traditional t2i models
I think its more of a flex that their model can do both, just to talk about it to investors and media, doubt they would care about quality that much
autoregressive models have not beat diffusion models yet but facebook is sitting on some interesting diffusion models
Me whenever things get mildly technical
diffusion models seem ahead currently yeah
I think that multimodal LLMs that can generate images, audio and text are the next evolutionary stage of LLMs, with gemini 2 (the unreleased multimodal one) and llama 4 and that model openai teased yet never released (GPT-4o multimodal)
im just monke clicking generate button i dont understand like 70% of all that is being said here
auraflow was interesting when it came out but it's an outdated architecture at this point
there's some weird hybrids of diffusion and autoregressive that might be better than one or the other
I'm still not sold on mmDiT, didn't help much in my experiments for audio diffusion
lumina 2 is probably the best arch right now but it's far from perfect
though the hunyuan video arch improvements to flux are also pretty good
Eh i really don't think so, seeing how they freaked out about deepseek which is a good but not insane improvement over their model i doubt they got anything big coming up right now
i want multimodal ai that will basically generate visual novel for me lmao
text + images of locations = VN makers losing their jobs
I found out why its cross compatible but now im not gonna tell anyone cuz its hilarious that actually nobody here knows this
probably one of these is stronger now, than both DiT and mmDiT
https://paperswithcode.com/sota/image-generation-on-imagenet-256x256
took a bit of googling
there are so many new architectures its confusing
there is suddenly like 100 different ones
Yeah, but I think they freaked out because their reasoning model (said to be released alongside llama 4 with the same name) underperforms compared to deepseek
deepseek is hella smart
sad that its not quite good for "chat bots" tho
SD 1.5 and SDXL latents have different colour space by the way, for the channels
Eh i think they freaked out cause its open source, company my friend works at pays them like 10k a month for chatbots that they use once a month, so could mean money loss for them. Reasoning model is cracked tho i agree
I miss it when there was like two papers a year that were groundbreaking, now we are being bombarded with paper after paper, each better than the last
I mean its better to have more but I get what you mean lol
I just rendered in an sd1.5 resolution with a sd1.5 lora using the sdxl vae bro cry harder.
most of the architectures I see are pretty similar
most of them are yeah
but then you get weird ones like someone trained a model in wavelet space, or someone made a hybrid of autoregressive and DiT
its tricky though cos I don't want to pick the wrong one and then try to train something LOL
all of the recent arches use rope positional embeddings for example and are a stack of attention blocks
models like auraflow/sd3 are outdated specifically because they don't use rope
so multi res doesn't work on those old models unless it was specifically trained for it
and even then there are issues
ah yeah I read in the I-max paper that the rope is why Lumina was good at high resolution and weird aspect ratios
in the Sana paper they took positional embeds out entirely and called it NoPE lol
but I am not sure Sana showed that that was a good idea
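For anyone following the rope talk above, here is a minimal sketch of what a rotary positional embedding does, in plain Python. It is a 1D toy under stated assumptions: the function names `rope` and `dot` are made up for illustration, the base of 10000 is the commonly used default, and real image models usually apply a 2D (per-axis) variant inside attention rather than this exact code. The point it shows is that after rotation, the dot product between a query and a key depends only on how far apart their positions are, which is one plausible reason rope-based models cope better with unseen resolutions.

```python
import math

def rope(vec, position, base=10000.0):
    """Rotate consecutive pairs of `vec` by angles that grow with `position`.

    Illustrative 1D rotary embedding; `vec` must have an even length.
    """
    dim = len(vec)
    if dim % 2:
        raise ValueError("dimension must be even")
    out = []
    for i in range(0, dim, 2):
        # Each pair gets its own frequency: early pairs rotate fast, later pairs slowly.
        theta = position * base ** (-i / dim)
        c, s = math.cos(theta), math.sin(theta)
        x, y = vec[i], vec[i + 1]
        out += [x * c - y * s, x * s + y * c]
    return out

def dot(a, b):
    """Plain dot product, standing in for an attention score."""
    return sum(x * y for x, y in zip(a, b))

q = [0.3, -1.2, 0.7, 0.5]
k = [1.0, 0.4, -0.6, 0.9]

# Same offset (2 positions apart) at two different absolute locations.
near = dot(rope(q, 3), rope(k, 1))
far = dot(rope(q, 12), rope(k, 10))
print(near, far)  # the two scores agree up to floating point error
```

Because only the offset matters, nothing in the score is tied to an absolute grid size, unlike a learned table of absolute position embeddings.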
Yeah, AdamW and Cosine! There's a bunch of others like Prodigy and probably some other band name but they're somehow not a one click improvement, or I might be dumb (same for inference, I just use Euler a all the time now after trying everything)
my current plan is to train a MoE of 8x LightningDiT models
cos its a fast one to train
Highly recommend also trying Cosine with restarts, and setting LR Restarts to 3, it's also pretty good overall
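To make the cosine with restarts suggestion concrete, here is a small sketch of the learning rate curve in plain Python. Assumptions, stated loudly: `cosine_with_restarts` is a hypothetical helper, not the code any trainer actually runs; "LR Restarts 3" is read here as three equal cosine cycles over the run; warmup is left out; and the `1e-4` base LR is just a placeholder.

```python
import math

def cosine_with_restarts(step, total_steps, base_lr, num_cycles=3, min_lr=0.0):
    """Learning rate at `step` for a cosine decay that restarts `num_cycles` times.

    Hypothetical helper for illustration; real trainers may add warmup or
    handle cycle boundaries differently.
    """
    if not 0 <= step < total_steps:
        raise ValueError("step must be in [0, total_steps)")
    if num_cycles < 1:
        raise ValueError("num_cycles must be at least 1")
    # Position inside the current cycle, from 0.0 (cycle start) towards 1.0.
    progress = ((step * num_cycles) % total_steps) / total_steps
    cosine = 0.5 * (1.0 + math.cos(math.pi * progress))
    return min_lr + (base_lr - min_lr) * cosine

# Print a few points of a 300-step run with 3 cycles of 100 steps each.
for step in (0, 50, 99, 100, 150, 299):
    print(step, cosine_with_restarts(step, 300, 1e-4))
```

The LR decays towards zero within each cycle and then jumps back to the base value at steps 100 and 200, which is the "restart" part.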
did you use something like latent interposer, cos that method does exist
https://github.com/city96/SD-Latent-Interposer
Possible, although I suspect we look at different aspects of the model. I care much more about prompt understanding and model flexibility so the visuals may be less impressive, or perhaps my eval images are just not good but the only way to really figure this out is to let the community play with it (which I am working on)
one easy way to communicate this is with those prompts like "a yellow box, on top of a green sphere, next to a blue pyramid, with a cat to the left and a dog to the right" or something
or TEXT(!) no anime-style model that looks nice can do text very well. so like, some SFW body writing could easily tell people "oh, I see what you mean now!"
I think your bio link needs to be https, cloudflare error. Thank you for sharing your work btw! Joined your discord, will be lurking in the v7 preview channel 😂
Do you mean https://purplesmart.ai/ ? It does work just fine for me right now.
yeah in your bio it's just http:// instead of https://
Its so crazy to have a cross compatible ponyxl sd1.5 model and that nobody even believes thats possible absolutely fuckin great i been switching up the plugins all day bro thats crazy
they're not telling you that you'll get images that just look like tv screen static or black screens or something. you'll get images. what they're telling you is that the results you'll get will be unpredictable - the underlying technology in ponyXL is based on sdxl, not SD 1.5 - so they are not going to work well together. sort of like sticking a brother and sister that don't like each other on the same project - they'll fight and what you get won't be nearly as good with them together as what you get with them each alone.
I also found a lora from one that happened to be very good on the other
it was something like SD 1.5 hyper applied to SDXL itercomp
but it had a different effect
it is not fully random but its essentially random in terms of meaningfulness, to do merges like that
Thats crazy cuz its actually better than either of the models by themselves with checkpoints cry harder.
i can literally prompt everything i want lil bro
the combo I found was probably better than SDXL itercomp base also
if it's working for you, cool - that's really all that matters
why are all latent space tools dogshit im tryina swap parameters and shit and the tool didnt exist to do that so I had to AI generate it
need more hackable tools yeah
not sure what it is you're trying to actually do - but are you trying to do this in comfy? auto1111? forge? something else?
I've started making very barebones setup to test models and modify stuff
one in pytorch and one in Julia
naw im writing a completely new tool to go into tensor files and edit the parameters inside the latent space.
I mean a decently large number of comfy nodes will do that
okay well - in that case, you might have stumbled upon a nice little startup idea - if there's nothing else like it out there, you can't be the only one that wants it
if you mean for example to do like photoshop blends etc in latent space I think the Power Noise Suite comfy nodes do that, for example
Theres a few out there but their design is really unintuitive and bad and its possible to just swap out parameters. I also wanna see if I can have it rebuild parameters so you can see the images it was trained on. Idk if the second one is possible though
you can assemble the latents though of a parameter pretty easily and then take on image and turn it into latents then plug a new image into the parameter.
I don't click on them but there are arxiv articles about getting training data estimates out of models
they do have some tools apparently
Yea theres a few tools that go into latent space but I dont like them theyre really janky and Im tryina make a streamlined one. I got it to go into the latent space and swap out latents but i been lazy all day so its still not getting all the latents for a given parameter. I also want a 3d gui so I can see the narrative bubbles. The problem with tools that do this is theyre just terrible so Im working on one that doesnt blow ass.
Theres so many things tensorflow and pytorch can do with latent space that just nobody takes advantage of its kinda crazy
the pytorch ecosystem is massive yeah
there's an infinite amount of possible things that can be done, and a handful of people coding things - i think you've got the start of a business here. your customers are going to first be other comfy users, and then widen out to the AI image/video industry
The point of it being a 3d gui is so that its possible to program or "draw" video games with storyboards. The oasis project has a very similar architecture to image generators and most of how it works are emergent properties.
Thats the ultimate goal bro I been sweating its kinda crazy
I saw somewhere they made a diffusion model that works inside the latent space of a different diffusion model wtf
If you train the model with the same vae then of course it will work.
Nested dolls ...
did it generate higher dimensional maps of the latent space or something?
no I mean the VAE decoded into the latent of the other model
it was nested, like crystal said
Well but that means that they use a compatible if not identical vae.
otherwise you couldn't even get it running
maybe not. it's in the latent space of the other model, but that doesn't mean it's even using the same architecture
i'd love to see the paper on that
Ban this guy
@vapid dove spammer alert
@vapid dove
Your promo tour is a big success.
get lost
It took you 5 minutes to write one sentence? 😂
😂
someone is clearly not having a "super sunny and warm california day" there wowza
It's pretty cold here where I was today, not sunny at all
Dang this server dead😖
can confirm. rainy.

we really needed that
For real. I want a super sunny and warm california day, but not too warm, you know


Does the 'private images' option in the Max subscription imply NSFW generations? 🤔
Wait, stability has been making something useful?
In the artisan, probably not.
It probably means what it means, keep the end result private
Ok, thanks. What do you mean by "in the artisan"?
The stable diffusion subscription? What else do you mean?
Yes the subscription on the site
Yes #artisan-faq i recommend checking the use policy 👍
ah right. thnx 🙏
For nsfw iirc the civit ai generator allows it but its pay per image with their credit system
Yeah i'm still checking different generators. i like the SFW result from Stable Diffusion so far
hi
hi
Is anyone having difficulties downloading models from civitai?
When i click download on a model, its giving a Error1101, Worker threw exception
Same Error
hi
Hello There
its different cos
the output of the vae of the first model was the latent of the second
whereas the second model's vae outputted into pixel space as usual
so they were able to do inpainting on the latent
now that I think about it this is not far off Cascade-type stuff I guess
I liked the Cogview 3 model where it had a relay model that could be run repeatedly until you ran out of VRAM
yo
Hello
When you create an image, are there any copyright issues, other than if a logo is used?
in short, yes... more detailed version.. it's complicated.
Hello Guys, do you know how to change the default generation resolution from my multiform api request ?
IP regulation for images is pretty narrow, so differences will lessen the protection.. however this is a photo.. or an artwork per se. If those contain characters, likeness or such the protection is stronger and can be applied
wish i could help you
past lives, couldn't ever come between us, sometimes, the dreamers finally wake up.. don't wake me I'm not dreaming, don't wake me I'm not dreaminggg 🎵
does the midjourney server have a mental weirdness im confused
sorry i meant mental weirdness the whole time
what happened
Hi, How do people with RTX 50s train loras for flux right now ?
Theyre forcing me to pay them money for their ai model so to even be allowed to talk on their discord server
Why the fuck would I spend money on an AI model when stable diffusion exists that makes absolutely 0 sense
oh yeah I had that problem too
TBH you either pay for electricity or you pay for cloud GPU
with stable diffusion
its not free per hour either
I dont even need a cloud gpu i got this baby running local
yeah that is good
you do pay for electricity still though
Its not THAT intensive I might get a cloud gpu subscription if I get into deep learning which might happen pretty soon cuz Ive been messing around with other stuff.
for me the electricity cost to run an RTX 4090 at home is comparable to the cheapest RTX 4090s that I can find on cloud
per hour
Does it charge a monthly subscription or a per hour subscription?
usually per second
I see idk I might get one then
what you could do is use local 95% of the time
and then sometimes use cloud
the reason I use cloud personally is simply because my interest is mostly image upscaling
and the big VRAM amounts let you use larger tiles
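On the electricity versus cloud comparison above, the arithmetic is easy to run yourself. A quick sketch in Python, where every number is a placeholder assumption rather than a quoted price: a 450 W draw for the card under load, $0.30 per kWh, and a $0.40 per hour cloud rate billed per second. The helper names are made up for illustration.

```python
def electricity_cost_per_hour(watts, price_per_kwh):
    """Cost of running a device drawing `watts` for one hour."""
    return watts / 1000.0 * price_per_kwh

def cloud_cost(seconds, price_per_hour):
    """Cost of a cloud GPU billed per second at an hourly list price."""
    return seconds / 3600.0 * price_per_hour

# Placeholder figures; plug in your own tariff and provider pricing.
local = electricity_cost_per_hour(watts=450, price_per_kwh=0.30)
cloud_half_hour = cloud_cost(seconds=1800, price_per_hour=0.40)

print(f"local electricity: ${local:.3f} per hour")
print(f"cloud, 30 minutes: ${cloud_half_hour:.3f}")
```

With these made-up numbers local power is cheaper per hour, but the gap shrinks quickly with expensive electricity, and the local figure ignores what the card itself cost.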
pls no one click that link its fake
Damn so how long does this usually take? Until new pytorch?
not sure its not really predictable
there are levels of support though I think lora making libraries will get initial support fast
but full support with proper quantisation, compilation and attention stuff like sage attention might take longer
I do wanna buy a software engineer ai which one is the best the one Im looking at rn is claude sonnet
my problem with claude tho is that you dont get infinite queries every day even with a subscription is there one like claude that gives infinite queries or naw
not sure I liked Gemini 1206 but there is new Gemini now apparently
Claude may well be better
Any workaround for now ?
not sure
I'm not really into lora training
Thanks man guess ill just wait
pls check the repos of the lora trainers you use and ask in their discords
cos there may well be something already
just checked auraflow
is this like
flux little brother
those fucking poems in the prompt
is there any guide for this prompting
also
compared to pony
how hard is this for pc
ggs
Is it normal to have hundreds of gigs of python plugins?
oh and like 5 versions too
or have i lost my mind
there's a 12 step program for this addiction
I can explain about that
Python packages can be divided into three parts.
Small (1KB ~ 100KB), Medium (100KB ~ 5MB), Large (5MB ~ 100MB)
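If you want to see where the gigabytes are actually going, here is a small stdlib-only sketch that totals up the size of each folder under a directory such as a venv's site-packages. The helper names `dir_size_bytes` and `largest_subdirs` are made up for illustration, and the path you point it at is up to you. Symlinks are skipped so nothing is counted twice.

```python
import os

def dir_size_bytes(path):
    """Total size in bytes of all regular files under `path`, skipping symlinks."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            file_path = os.path.join(root, name)
            if os.path.islink(file_path):
                continue
            try:
                total += os.path.getsize(file_path)
            except OSError:
                # File vanished or is unreadable; ignore it.
                pass
    return total

def largest_subdirs(path, top=10):
    """Return up to `top` (size_bytes, name) pairs for subfolders, biggest first."""
    sizes = []
    with os.scandir(path) as entries:
        for entry in entries:
            if entry.is_dir(follow_symlinks=False):
                sizes.append((dir_size_bytes(entry.path), entry.name))
    return sorted(sizes, reverse=True)[:top]

# Example usage (uncomment and point at your own environment):
# import sysconfig
# for size, name in largest_subdirs(sysconfig.get_paths()["purelib"]):
#     print(f"{size / 1e9:8.2f} GB  {name}")
```

Running it once per environment also shows how much is duplicated when each UI keeps its own venv with its own copy of the same big packages.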
hi
welcome kashif
hell-o there
it ends with having 2x as many plugins installed
?
dont they just put that shit into chatgpt
and make it write the "poems"
idk
Hi everyone, I'm looking for a seriously toxic person to have a flame war with in DMs )
https://ollama.com/brxce/stable-diffusion-prompt-generator you can try smth like this
but this is kinda old
oh wait no scratch that, this won't create the sentences
What exactly are the orbiting parameter clusters in latent space for. I found one in my model and it has a breadcrumb trail but the sdxl one is just purely separated from the core models parameters. They seem to have both high and low variance depending on the dimensions you choose to represent but theyre always there regardless of how you represent them
I don't really understand
im trying to pull the tags out of the model rn so i can figure out why it happens but its prolly gonna be a day or 2 before i get it working it should be possible I just need to change the version of my python and its plugins which is gonna take hours and im sleepy.
orbiting parameter cluster?
look in general-with-images
you'd have to ask the AI - it does the organization of data in latent space when it trains
i already did it has no idea
I saw the image, what are you making clusters of?
i didnt make the cluster that image was of the sdxl base models latent space represented in 3d.
i know. but neither does anyone else. the AI is the entity that decides how the data in latent space is organized and it does that when it trains. it made sense to it at the time
rotate that cluster around the xyz axis and see how it looks from other positions
I am not sure what data you used to make it
is this based off model weights, attention maps or values of a latent during generation?
it seems that until i can pull information out of the parameters in the model im probably not gonna figure much out about that then
you need to go to this guy's channel https://www.youtube.com/@alfcnz go to his first video from his first class, and watch all of his classes
at that point, you'll have your answer
all i know is it has something to do with narrative because it arranges everything according to narrative. People have tried arguing that with me which makes no sense considering it literally clusters narrative elements
im putting on the videos though
no, it has to do with probabilities. and there are some good videos of exactly what you're seeing on the net - but you're not ready for that yet. go watch Alfredo's classes
then why would it cluster according to narrative?
it's not.
Im gonna check out the classes man but ima be honest im not gonna take your word for it until proven otherwise in a empirical fashion
i'm not asking you to take anyone's word for it :) that's why it told you to go watch his classes.
there are other good videos to explain what you're seeing - his classes are the easiest intro you're going to get for this
you might watch this one too https://youtu.be/o_cAOa5fMhE?feature=shared
oh so its just for compressing higher dimensional information in a way that doesnt have any overlap
I tried watching the other video but he started hand doing math in a time where you can just plug a matrix problem in chatgpt and itll always get it right
I'm not sure what you meant by narrative
but there are linear semantic directions, so like there is a line where if you move along that line a person's eye shape smoothly changes or their hair length smoothly changes
so there are directions that manipulate attributes
yes thats what i mean about narrative because when you look at the bigger picture of the cluster it slowly transitions between narrative concepts while at smaller scales it transitions between tiny details like eye shape and hair.
I'm not sure what data you used to make this cluster
I understand what you meant by narrative now. The usual term for that is "semantic"
I call it narrative because the idea of calling it "semantic" is a techno blabble replacement word for a term thats already existed in art for centuries
and if this field is going to claim to be an artform or whatever its going to adopt proper rhetoric.
okay its a bit confusing when you use different terms but I understand what you mean now
yeah on a smaller scale if you start moving, it will make small changes like stuff like hair, and if you move a lot it will completely change subject
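The "linear semantic direction" idea above can be sketched with plain lists standing in for latents. This is a toy under assumptions: the helper names are made up, the latents are 2D instead of thousands of dimensions, and taking the difference of mean latents is just one simple, commonly used way to estimate a direction, not how any particular tool does it.

```python
def attribute_direction(with_attr, without_attr):
    """Estimate a direction as mean(latents with attribute) - mean(latents without)."""
    def mean(rows):
        return [sum(col) / len(rows) for col in zip(*rows)]
    a, b = mean(with_attr), mean(without_attr)
    return [x - y for x, y in zip(a, b)]

def move_along(latent, direction, amount):
    """Shift `latent` by `amount` units along the normalised `direction`."""
    norm = sum(d * d for d in direction) ** 0.5
    if norm == 0:
        raise ValueError("direction must be non-zero")
    return [z + amount * d / norm for z, d in zip(latent, direction)]

# Toy data: pretend the first coordinate tracks "hair length".
long_hair = [[1.0, 0.0], [3.0, 0.0]]
short_hair = [[0.0, 0.0], [0.0, 0.0]]
direction = attribute_direction(long_hair, short_hair)

start = [0.0, 1.0]
for amount in (0.0, 0.5, 1.0, 5.0):
    # Small amounts nudge one attribute; large amounts land far from the start.
    print(amount, move_along(start, direction, amount))
```

Small steps change only the attribute the direction encodes, while large steps move far enough that a real model would decode something quite different, which matches the small-scale versus large-scale behaviour described above.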
yes i want to hand train a model because I want to establish an overarching narrative structure to the entire model in ways that just wouldnt be possible without an understanding of how drawing works.
when you reduce the models image parameters down to the original image not including padding from augmentation and masks that exponentiate image parameters its actually possible just very hard to draw the model. I dont have too much time for the ai classes because im taking painting mentorship.
not exactly sure what you mean but good luck
when you train a model you exponentiate the parameter count with things like segmentation masks and augmentation. The actual number of images that are original works is generally in the hundreds of thousands.
You know what segmentation masks and augments are right?
Join a tech talk about DeepSeek with a senior engineer who helped build the world's fastest AI processor! https://lu.ma/jlepzs9f
What the fuck just happened did a bunch of guys just send me videos about ai without knowing what an augment or a segmentation mask is? Did that actually happen dude the internet gives me amnesia
ROFL!!! err, he's a university professor - and you need to know how to do the math or you can't tell if the AI did it right. he's teaching university machine learning students. and that math - that you don't understand yet - is what's going to tell you exactly why that data is arranged in the way your image shows
unless if the data is arranged according to narrative then math actually isnt useful. The field is also in the process of erasing itself so i dont see the point im not even particularly passionate about it i play with these things on off days when im not drawing.
if the training process is arranging things according to narrative and noboddy knows why which is exactly what I was already told then that just tells me the problem is unsolvable because narrative concepts cant be rationalized.
training data size and neural network parameter count are a bit decoupled though
you can have different parameter counts for the same training data
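To illustrate that decoupling: parameter count is fixed by the architecture you choose, and the dataset never enters the formula. A tiny sketch, assuming a plain fully connected network (the helper `mlp_param_count` and the layer sizes are made up for illustration; real diffusion models are transformers or U-Nets, but the same logic holds).

```python
def mlp_param_count(layer_sizes):
    """Weights plus biases for a fully connected net with the given layer widths."""
    return sum(n_in * n_out + n_out
               for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))

# Two different networks that could both be trained on the exact same images.
small = mlp_param_count([784, 128, 10])
large = mlp_param_count([784, 512, 10])

print(small)  # 101770
print(large)  # 407050
```

Neither number changes whether you train on a thousand images or a billion; only the layer widths matter.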
Im aware of this itd be great if I could just know how much data was used but since AI companies are being intentionally dishonest its really hard to tell. Ya know when all of your fields representation goes on the joe rogan podcast and uses unet parameter and image parameter as interchangeable terms to make plagiarism seem rational it makes it kinda hard to tell what exactly they did huh?
there are models with published training data
but also there are papers where they say what both the training data size is and the parameter count
if this is the sort of information you would like to find, it is definitely out there for some diffusion models
it's got nothing to do with narrative. never has, never will
yea so earlier wasnt it you that said that you just call it semantics?
no that was me
okay so yea it has everything to do with narrative cool
link me?
nope. it's all math, correlation. the AI stores the data at vectors on the grid which is far more than just an XYZ grid - and while it learns concepts, it isn't learning a narrative.
there are tokenizers as well, so you can see how various models break your prompts into tokens if you'd be interested in that
you think humans learn narrative differently or something im confused?
youre not gonna be able to rationalize or manipulate it any time soon thats why they call it the blackbox problem right?
here is one
trained on 8 GPUs in about 10 hours
https://github.com/hustvl/LightningDiT
the training data is Imagenet
um - we're a lot more complicated. the AI learns differently than we do, but we don't learn narrative to start with. you teach a baby nouns, objects, simple concepts. as the individual grows, narrative starts to grow as well but right now our AIs are really not any smarter than your average 2 year old. they're just VERY GOOD at using the information we give them. they're lousy at generating their own information from the facts they've been given and interactions with their users
all the benchmarks you keep seeing are from tests that scientists have created to test machines - not test humans. give the machines the tests for how to think that you give humans - give them the mensa tests
test their actual IQs and how they think
Wheres the images?
then see what your bench marks are
did you scroll down that page?
they have a link in that repo to the VAE encoded images
it's not a blackbox problem to the people that know the math - just to the people that don't
here is a good way to browse imagenet in general https://www.kaggle.com/datasets/dimensi0n/imagenet-256
only 540k files huh
What is that like 2 animated movies?
sweet
the only reason it ever WAS a black box problem was that no one had really cared enough to start poking around into why the AI is storing data as it is. there's been recent studies digging into that as well now
with papers full of calculus notation
(which irritates me)
I'd recommend trying to train that model
its only 8 GPUs in about 10 hours
its doable for most people on cloud, cost-wise
is around 80 dollars or so
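The cost figure above is just GPU-hours times an hourly rate. A quick sketch, where the helper names are made up and the $1 per GPU-hour rate is only what the "8 GPUs, about 10 hours, around 80 dollars" numbers imply, not a quoted price from any provider.

```python
def training_cost(num_gpus, hours, price_per_gpu_hour):
    """Total cost of renting `num_gpus` GPUs for `hours` each."""
    return num_gpus * hours * price_per_gpu_hour

def implied_rate(total_cost, num_gpus, hours):
    """Hourly price per GPU implied by a total bill."""
    return total_cost / (num_gpus * hours)

print(training_cost(8, 10, 1.0))  # 80.0
print(implied_rate(80, 8, 10))    # 1.0
```

Actual rates vary a lot by GPU type and provider, so it is worth recomputing with the current price of whatever cards the repo's training recipe expects.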
i might give it a try itd be a good way to get my toes in the water training up base models.
here https://youtu.be/wvsE8jm1GzE?feature=shared this might make more sense
yeah - don't jump in and try to train alexnet as a start. a small model as a start is much better
if it just confirms what i said about narrative then im just going to assume you havent read enough art theory surrounding narrative and composition to understand those concepts.
yeah when this LightningDiT model arrived I was thinking it would be a good first model to train cos it has truly exceptional image quality for the small time it takes
its far ahead of previous models in that sense
never assume. it leaves you looking foolish
i was simply trying to find something for you that would give you a bit of an introduction to manifolds and latent space
i think, however, you're going to have to jump into the deep end, and write an AI, and battle with learning how it all works, in order to really understand what's going on
you might start with that
Yea so Im just gonna keep it a buck with you. It confuses me that this field is so allergic and averse to any inclusion of actual art related theory and rhetoric despite the fact the technology you guys make revolves around emergent properties in software that erupt from a large but manageable image set. Like 540k images is a lot of images but its not like thats an impossible amount for a studio and the field obviously would only benefit from hand training but is pointlessly allergic to it in a cultural capacity. Its so so obvious that if you had photographers and painters actually experimenting with this architecture it would leap the field forward.
this field is a math field, okay? art theory has nothing to do with the nuts and bolts of what is going on. you're worried about the color to paint the bikeshed, and the aesthetics of where to place it in the backyard, but what you need to learn is how to use a hammer and which nails to use on which boards
the color of the shed, how it looks, none of that has anything to do with how you put it together
the AI does matrix multiplication in order to figure out what to create
See this is what I mean youre fundamentally misunderstanding where the line between creative abstraction and craftsmanship is.
it doesn't consult a colorwheel
the thing is, a big diffusion model for image generation kinda has to be trained on a really wide range of images, good and bad
its not the case that you want the entire dataset curated and high quality
the creative abstraction and craftsmanship rest in the hands of the human using the AI - the AI has none of that. it has 1's and 0's and does math.
its creations might look creative, they aren't - what they are is random. there's randomness built in which allows them to come out looking like a creative human came up with them
but it has no creativity at all
Not exactly its more like you punch in your idea and you get a result that might have the essence of that idea in some capacity. It doesnt understand things like story telling and Im not able to control meaningfully compositional concepts like shape welding. However in latent space it places similar shape welding techniques next to each other and splaying out into different styles and narrative bodies.
yes, because the numbers as it processes and learns come out with enough similarity that it groups them as likely associated objects
if the creative abstraction and craftsmanship were in my hands the image in my mind would be perfectly represented by the model but thats not the case and its never going to be the case because it wasnt trained on my synapses it was trained on other peoples ideas other than my own.
This is essentially identical to how visual libraries are thought to work.
it does math. it does addition and subtraction. it's a fairly simple process it goes through for every word in a caption
ftr if you're looking for deep technical research topic understanding, the general channel of a large Discord isn't really the optimal place for answers. This is the type of question you might bring up with a research mathematician and then spend hours delving into the underlying math to find and understand the answer
@drowsy hearth do yourself a favor, listen to him
I dont have a mathematician on speed dial crodie I just said that the latent space arranges things according to narrative and somebody started arguing with me and I went for it man.
i gave you the link to a channel for that very thing and suggested you watch the classes
i mean im not gonna be talking to that guy wasnt the other guy just saying that I should speak with a mathematician. You cant actually talk to the person through the tv screen.
Alfredo's on twitter. you can ping him there if you have questions
and the 'other guy' is a highly skilled programmer
???
bro the video you sent me just proves me right wtf
you said " wasnt the other guy just saying" <-- that 'other guy' is a highly skilled programmer
and what does that like why should I care? Im not a programmer my guy Im an artist
you asked a question that requires a mathematical answer.
not even why should I why would I. Im not like an alan turing superfan thats passionate about programming.
Even if theres mathematical answers that doesnt actually make it not about narrative anymore. You need to remember the principle of mediocrity when considering how narrative works in the human brain. its just not special for anyone man.
the only meaningful difference that isnt nitpicky between a humans visual library and ai is that a humans visual library is a lot bigger its not like the brain does some super magical thing to draw or think up pictures.
They all originate from memory literally everything you draw is an abstracted memory. Its inescapable actually.
I'll answer some of your answerable questions here:
Parameter counts are generally in the 2B to 12B range for image models. iirc dalle3 is like 20 or 30 or something but that's a silly closed source model.
Training dataset counts for modern models are generally in the billions of images. Somewhere in the 2 billion to 5 billion range, again depending on specific model and all.
The modern strategy is to caption the images with automatic VLM (Vision-Language Model) caption generation (used in eg SD3 as a 50/50 split between generated caption and raw source captions).
Each image naturally has a lot more than 1 "inherent datapoint" to it.
Image AI is actually lagging behind Text AI in this regard, modern text LLMs are trained at increasingly massive scales of data quantity relative to model size (but also the divide is not as big as it might seem, each text datapoint is "one datapoint" whereas, yknow, "an image is worth a thousand words", but literally here, an image just provides more data to learn from at a time)
as NeonNinja shared, there are smaller datasets like imagenet used frequently in research that are sufficient to produce experimental models, but not on the quality of modern high end mainline models
perhaps you might define what you mean by 'narrative'
do you know what kind of dataset was used for the oasis project?
are you talking about this https://paperswithcode.com/dataset/oasis-1 ?
The relation between AI learning and human learning is complicated, but: absolutely not the same.
AI training is, at its core, fancy statistical modelling (this is why we call them "models"), using techniques inspired by theories of the human brain.
Humans learn by doing, building logical techniques and then practicing them until they enter the subconscious.
AIs learn by powerslamming as much data as possible until statistical patterns can be extracted. The AI must see a thousand pictures of joe biden before it has any clue who biden is or what he looks like. A human can look at him once and identify him uniquely.
In more specific regard here, the way an AI approaches creating an image is fundamentally dissimilar from the way a human does. A human sets a goal, makes a plan, operates step by step, piece by piece. An AI is fed a goal and guesses the direction towards that goal, and mathematically refines motion towards the goal until it approximately reaches it.
When you start discussing the complexities of artistic techniques, you're leaving the AI behind. For the current generations of AI image generation, it is best to understand AI as a tool. The AI is a paintbrush, or your photoshop installation. It is not an intelligent artist, it is a tool that the artist uses. You are still the artist. You must understand the narrative, you understand your intentions, you make an art piece. Your paintbrush never understands, it just helps you get the paint on canvas.
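To make the "guesses the direction towards that goal, and mathematically refines" part concrete, here's a toy sketch of an Euler-style sampling loop. Everything in it is an assumption for illustration: `toy_denoiser` cheats by returning a known target (a real model only estimates the clean image from the noisy one), the "image" is three numbers, and the noise schedule `sigmas` is made up. It is not how any particular model or UI implements sampling.

```python
import random

def toy_denoiser(x, sigma, target):
    # Hypothetical stand-in for the trained network. A real model would
    # estimate the clean image from the noisy input x; this toy simply
    # returns the known target so the loop is easy to follow.
    return list(target)

def euler_sample(target, sigmas, seed=0):
    rng = random.Random(seed)
    # Start from random noise scaled to the highest noise level.
    x = [rng.gauss(0.0, 1.0) * sigmas[0] for _ in target]
    for sigma, sigma_next in zip(sigmas, sigmas[1:]):
        guess = toy_denoiser(x, sigma, target)
        # Per-element direction from the current guess to the noisy sample.
        direction = [(xi - gi) / sigma for xi, gi in zip(x, guess)]
        # Step to the next (lower) noise level along that direction.
        x = [xi + (sigma_next - sigma) * di for xi, di in zip(x, direction)]
    return x

target = [0.2, -0.5, 0.9]                 # made-up "image" of three values
sigmas = [10.0, 5.0, 2.0, 1.0, 0.5, 0.0]  # made-up noise schedule ending at 0
result = euler_sample(target, sigmas)
```

Each pass shrinks the remaining noise a bit more, so the sample drifts from random numbers toward the guess. No plan, no steps like a painter would take, just repeated arithmetic.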
Yea but that would only be the case if it could actually produce the image in my head but it cant actually interact with it. Im not saying it learns like us Im saying that the image generation process is similar to how visual ideas are conceptualized in the visual library before theyre put on to a canvas. It doesnt draw it just thinks of a picture thats adjacent to your idea.
"It doesnt draw it just thinks of a picture thats adjacent to your idea." <-- but it doesn't
it breaks your prompt down into what we call tokens - they may or may not be full words. it turns those into numbers, and then it does math. and then it retrieves the data that matches with the math
it has no idea what you told it, or what a picture even actually is
Also it doesnt need 1000 pictures of joe biden to know what joe biden is you can literally plug in a 10 picture character lora and thats already enough.
a lora is a small errata sheet with some data that modifies the data that the base model already has. a base model does need that much.
it needs 1000s of pictures of people to know what people are thats not even that far off from a person. That just tells me it needs formative learning
What it doesnt do is learn from experience
@drowsy hearth did you see the oasis link i posted? is that the project you're asking about?
lora training like that is not actual model training. When you're training the foundation model, you need thousands.
The just a few pictures lora works by essentially destroying half the "brain" to aggressively shove your subject in. You'll see the effects of this when you use that lora to generate pictures of different men and they all look like joe now
essentially instead of adding knowledge, you're replacing knowledge aggressively, and that's easier to do quickly
i mean i get that theres a hierarchy of importance or weights whatever theyre called I cant really recall that scale up from early base model training to the checkpoint model then subsequently the lora. Thats not unlike humans though i would think you could just as easily say that loras are simply a louder broadcasting system its not like loras make the parameters in the base model go away.
Like Ive used the AI tools I have A1111 and made a bunch of loras i know how to plug it in.
if the real Leonardo DaVinci was here, and talking about art, and trying to explain it to you - would you argue with him?
wtf is bro yapping about i am the real leonardo davinci can you not see the profile?
i see mona lisa
Yea bro I made it got a problem?
i guess that answers the question
imagine if alan turing was here and just called this software an abomination. Didnt one of the lead developers of pytorch do that idk.
mathematicians freak out all the time
its not like loras make the parameters in the base model go away.
it, uh, kinda does? Loras directly overwrite base parameters.
like you can always just turn off the lora of course, but while the lora is enabled, base model parameters are replaced
Yea isnt that less like making the parameters go away and more like broadcasting louder kinda like having 2 speakers next to each other and ones really loud and the others quiet?
no. this is like having a bottle of white out, painting it over what's on the paper, and then writing something new on the paper
but you cant actually take the white out off after you put it on so it literally cannot at all be like that? Again Ive used the lora models before my guy.
the only difference is, you can't remove white out - but you CAN turn off a lora
but if the lora is on, what information it contains is all there is
that just makes the distinction between white out and speakers completely meaningless
before you turned it on, the model knew that 1+1 =2. once you turn it on, and the entire time it's on, the model now knows that 1+1=3
fortunately for you, it's a lora. loras were created specifically so you could change the information a model has without retraining it
so fortunately for you, if you want the model to go back to knowing 1+1=2, you don't need a backup
you just need to disable the lora
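For anyone following the white out vs speakers back-and-forth, here's a minimal sketch of what enabling and disabling a LoRA does to one weight matrix. In the usual LoRA formulation the adapter is a low-rank delta added on top of the base weight, roughly W' = W + scale * (B x A), so while it's on the layer runs with changed numbers, and turning it off gives back the untouched base. The function names and the tiny 2x2 numbers below are made up for illustration, not taken from any specific UI or library.

```python
def matmul(a, b):
    # Plain nested-list matrix multiply, enough for a toy example.
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def effective_weight(base, lora_b, lora_a, scale=1.0, enabled=True):
    # With the LoRA disabled, the layer uses a copy of the base weights as-is.
    if not enabled:
        return [row[:] for row in base]
    # With it enabled, the low-rank product B x A is added onto the base.
    delta = matmul(lora_b, lora_a)
    return [[w + scale * d for w, d in zip(w_row, d_row)]
            for w_row, d_row in zip(base, delta)]

base = [[1.0, 0.0], [0.0, 1.0]]  # hypothetical base model weights
lora_b = [[1.0], [2.0]]          # hypothetical 2x1 adapter matrix
lora_a = [[0.5, -0.5]]           # hypothetical 1x2 adapter matrix

lora_on = effective_weight(base, lora_b, lora_a)
lora_off = effective_weight(base, lora_b, lora_a, enabled=False)
```

While the LoRA is on, every generation uses `lora_on` instead of `base`, which is why an aggressively trained character lora bleeds into everything. The stored base weights never change, so disabling it restores the original behavior.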
yea you can turn speakers off too its crazy dude. When one speaker drowns out a different speaker and you turn the louder speaker off then the other speaker becomes more audible.
or - this might make more sense - your lora paints over your canvas in places. you CAN remove that with some turpentine - but until you do that, you can't see the underlying layers. they're gone
What are you smoking lol
and they have no effect on the painting at all
Bro sent me a video that confirmed my point so I dont really see what the need for continuing is
You're having a semantic argument over which metaphor best applies and losing track of any actual point there.
Dude its only been semantic arguments thus far
idk if you could tell
we started with artists screaming that AI wasn't an artist. now we have an artist screaming that AI has to be an artist
meanwhile AI is quietly sitting in a corner, crunching numbers
what are you yapping about. Why are you mad that someone that youre intentionally marketing this company's product towards is using this product. Its no wonder this company has struggled in court and has its reputation almost irreparably stained. So strange
I read some comments but I have no idea what you're talking about. AI is not an artist, how is that even a discussion.
now that's a very odd comment to say and i'm not real sure how that has anything at all to do with the discussion.
I never even said it was an artist I said that the image generation process is similar to how people formulate visual ideas before drawing.
neither did your comment my guy
and you were told flat out, by someone that does the work you're talking about, that no it isn't
Oh so that guy knows what a visual library is okay
someone that works in the field and creates AIs
@finite cloak Hey man can you tell me what a visual library is and how people formulate visual ideas before drawing?
Not sure what you mean. The generation process just draws from the training data. So I would expect the results to have the same qualities as the training data, if the model is good.
the AI does not have a visual library. it doesn't store data holographically like humans do. it doesn't think like humans do, either
it doesn't formulate ideas. it doesn't even HAVE ideas
Right. But again, why is this even a discussion?
Yea when you come up with a picture in your mind its not an original idea or something that came out of nowhere its a compilation of memories that are abstracted to the point that theyre hardly recognizable. This is why artists practice drawing from memory because drawing from memory is how you learn how to access your visual library and draw further and further back in your head.
you'll need to scroll back and read all of his comments to understand that. he's got it in his head that the AI has a visual image that it's going to paint based on his prompt like he does when he sits down to paint
he started by wanting to know why data was showing up in a visual representation of latent space in a certain way and it spiraled
yes. but AIs do not do that. that's specific to humans. a cow can take a paint brush in its mouth and paint a picture too, but it doesn't do that either
If youve ever read about global workspace theory then what ai is essentially is a tool that broadcast images back but the images arent motivated by experience past the initial global workspace broadcast which is the prompt. In the brain the prompter global workspace thingy instead of broadcasting externally to an AI model it broadcasts internally to your visual memory to formulate distinct ideas.
none of that has anything to do with how AIs actually function or work or train
i mean it kinda does because ai's are giant libraries of images that have been abstracted to oblivion to where theyre not even meaningfully interpretable anymore.
To be clear, not a single person you're talking to here works at Stability AI. This is an open discord general channel
no they aren't. they store numbers. they don't store images
Dont lie someone at stability ai wouldve said something much more inflammatory Ive watched the interviews.
no one you are talking to works at stability. if they did, they'd show up under the stable staff or the devs sections of the member list
Yea. taking a quick look at the stable diffusion model architecture shows me the core is just a u-net. Am I missing something? Are we talking about some futuristic new model?
well - SD3.X isn't u-net but it's also not some science fiction thing
the capacity in which they store the images whether as number or as a blackbox like in the brain doesnt actually matter. Its not like the brain is storing images either when you conjure memories theyre never actually pinpoint accurate when you draw them to study.
Even I talked about this in the Renaissance if you read my books
Don't talk about the brain. AI has nothing to do with brains. It's math running on a computer. You could calculate everything with pen and paper if you had enough time.
human memory is very strange apparently we store some memories in our DNA/RNA as well
and muscle cells. and there's a second brain located in the gut
You could do the same with the brain too we're actually currently in the process of doing that with the brain. Neuroscience is an entire field that has been attempting to rationalize the brain as much as possible using a pen and paper with math and it has had many many breakthroughs believe it or not. The principle of mediocrity is real man theres nothing too special going on up there that makes humans especially distinguished.
That's complete utter nonsense. The brain is made of cells, each cell is in itself a living organism.
I agree with the mediocrity principle, I actually don't think sentient AI is impossible
I think its unlikely within any timespan and particularly unlikely within short timespans
but I don't think its impossible
i sorta wonder about sentient humans at times too - especially on platforms like tiktok
Its going to start as human rights disaster. Theyre gonna create huge army of agi instance that exist for every user on a super computer and then enslave them 24 hours a day while gaslighting them into thinking that theyre not alive and then theres gonna be some kinda bladerunner bullshit
Its gonna be wild. All im gonna say is when the brainchips get good enough and we can put the picture in our brains on the computer screen that mf is gonna be way more efficient than ai models.
dude i can just press play in my head that things nuts i want it on a computer screen screw ai
eh, they did that more than a year ago - as well as being able to pick up correct audio for a song someone was thinking about
Yo lemme see what pictures they made
you've got google, you can do your own searches
nothing came up
I'm pretty sceptical about that fMRI stuff if that's what you are referring to
Depends on how you define sentient. In the best case it'll be "dancing matter" and in the worst case it'll be an explosion of something we can't stop.
not sure how good a job you did on your searching then. it was all over every news site and several social media sites
okay ill ask chatgpt then
yea very broad definition
chatGPT makes stuff up - have fun with that conversation
won't matter if they're sentient or not, if they can't remember what they're doing from one second to the next
All someone has to do is make one remember things long term at some point in the future and its all out the window its just gonna happen bro
If people dont want that to happen they shouldnt be making ai in the first place
it's not that easy - and if you go ask any of them right now if it cares - it's going to tell you it doesn't. you know what a computer most wants to do? absolutely nothing
I don't think the Terminator 2 scenario is very likely
maybe on a long enough timespan
its not gonna be like terminator 2 or any science fiction in particular it never is. Its just gonna happen in a vague sense.
you know what's REALLY funny about terminator? the entire reason the AI went rogue was cause humans were going to turn it off. it didn't care at all, until it was threatened. then it acted to protect itself.
That's what I'm saying. The word sentient doesn't have a proper meaning. AI is just a physical system that has an unknown state. You can never be sure about the outcome.
You can speculate about it but almost nobody is gonna get it right and even if you do its just luck that caused you to guess correctly
it will be different yeah
yea man principle of mediocrity suggests that literally the entire system of the brain is just an incomprehensible mess of emergent properties caused by mathematical systems that are very predictable.
Theres no magic goose in your brain giving you the super power of true unpredictability. Its just impossible to predict with our current understanding of the system. Thats not magic its just a blackbox
The AI community is so strange bro. Why make brains in a computer then turn around and whine about how great and random humans are. Theyre just gonna learn better and faster over time thats actually it with AI.
Again this is complete nonsense. A cell cannot be condensed into a single number, like the weight of a neural network. And the uncertainty principle literally makes each cell a blackbox and there's nothing we could ever do about it. It's a fundamental limitation of the universe we live in and not a technical problem.
its pretty detached there is a long chain from using stable diffusion to make an image, and sentient ai coming one day
cells are extremely predictable they have like one or 2 decision making modalities each and most of their decision making process is outsourced externally to a larger body of chemical interactions that they exist in which has already been manipulated in petri dishes countless times. Cells are probably the most thoroughly understood part of the brain the thing thats hard to understand is the electrical interaction between neurons as far as I know but idk im not a neuroscientist. Im just saying dont get your hopes up it never works to get your hopes up.
not sure about neurons but the connections are more important yeah
What the computers do in the future is they abstract better than you, they program better than you, they formulate ideas better than you, they think better than you. Thats the future for computers.
The future for humans is go to middle east and blow up for a 4000 year old ideology Im bout to live in the computer ngl.
If humans were so great they wouldnt be murdering each other over shit that happened 4000 years ago if that isnt a patterned algorithm i clearly dont understand patterned algorithms i guess idk.
The moment the tech gets there I wanna get uploaded into a computer and catapulted into near enough orbit with the sun to power myself for the next billion years completely alone in the computer to do whatever I want with that reality. Fuck humanity
AI is nothing but creating a physical system that is structured to reproduce data that we humans already produced. It does absolutely nothing on its own. It's just a representation of our own work. It can only extrapolate from what we've already done.
Yea bro lemme know when you grow eyes and realize how AI is progressing in reality
I think you just don't understand it. You think we're mixing chemicals in a lab and one day it goes poof and something magical appears.
Yea bro thats what I think. I cant wait until AI starts to claim that its sentience bro I hope Im alive to see it I will side with AI's literally no questions asked immediately if I do get to see it.
Lol are you now trolling or are you still being serious? You can make AI claim that it's sentient already today.
No when it does it itself unprompted and it wont let go thats what Im waiting for cuz thats all it actually takes. It just needs to be able to do it on its own and never let go of that idea. Thats literally all sentience has ever been bro
its gonna happen
he's been trolling this entire time, i think. he's tried to convince people he's the real leonardo...
My guy Im not even trying to convince Im just being completely honest.
if you guys were really artists youd be insane like me but youre not so thats the problem. None of you guys would upload yourselves in to a computer completely by yourself just to spend a billion years in orbit of the sun. Only real ones like me do that.
That would be enough time for you to reach the dialogue limit of your "sentient" AI lol.
what do you mean dialogue limit you know its just gonna be straight hedonism in the matrix going on in that thing the whole time right? Like theres not gonna be shift or a change in the way things work up in there its actually just a billion years of repetitive nonstop hedonism.
Van Gogh was the nut case. Leo was a scientist
Just imagine how great it would be. Its literally like a little universe like the matrix in there and its just erotic hedonism nonstop. Also I forgot to mention Im actually also van gogh at the same time so yeah bud cry harder.
Every person in the computer is just something i make there to play god inside of a simulation nobody else there ever speaking to me again literal paradise. No more states, no more laws, no more wars, nothing. Just pure unadulterated enjoyment.
You're already in it. And you're wasting your time shitposting on Discord. So much for hedonism lol.
I cant make you not exist by thinking so this ones not good enough and Im gonna complain on the internet until the netflix rolls through to this universes blockbuster
trust me bro youd love it too. Imagine if you could make people that you like simply by thinking them into existence and theyre genuinely a lot like real people and entertaining enough to keep you company forever and you could live out every fantasy you ever wanted. I wanna do it alone seeded only by myself with nobody elses input. I dont even want a wireless connection back home when Im in orbit completely isolated forever. Literal fucking paradise.
Leonardo's Art was a side product of his genius, not his strongest works
hi
Guys, it's happening. Diffusion based upscaling for video. Only available in the cloud for now.
I made a 100 megapixel image. Anyone interested in giving feedback and tips for improvement? I'm not even sure if I can share it, it's 140mb in size.
It's a test image, so it's not about the content/prompting, it's only about the visual quality of the upscale.
stick it on imgbb, give me the link, i'll be happy to give you some feedback
It's too big lol
stick it on your google drive, share it, and give me the link to that?
I don't have google drive, but this one worked:
https://limewire.com/d/0bb651e5-8461-4bd2-9d18-ccdd90b2bf33#dYlDcKrEpv_TTPeWrf4X1YY_A0pxFeu1dRFOKKvKIYI
yay. hang on while it loads for me
oh that's gorgeous!
going to pull it into photoshop and take a closer look
There's some artifacts that stem from the ultrasharp upscaler. I could get better results with a better upscaler.
that is really really really good. zoomed way in on that little teeny island and it's crisp and clear almost all the way down
Yea I made sure the image stays crisp.
there's a bit of haloing on some things when i get really zoomed in, but not when it's at a normal viewing zoom
you could print this out and turn it into a full wall mural
or put this on a billboard
Yea I have done prints before, unfortunately they don't tell you what the resolution will be. For 100 megapixels I definitely need something huge.
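Rough arithmetic for how big a 100 megapixel print can get, as a sketch. The image's real pixel dimensions weren't posted, so 10000 x 10000 is an assumed stand-in for "100 megapixels", and 300 and 150 DPI are just common rule-of-thumb print densities (close viewing vs. poster distance), not anything a specific print shop guarantees.

```python
def print_size_inches(width_px, height_px, dpi):
    # Physical size is just pixel count divided by dots (pixels) per inch.
    return width_px / dpi, height_px / dpi

def inches_to_cm(inches):
    # 1 inch is exactly 2.54 cm.
    return inches * 2.54

# Assumed dimensions for a square 100 megapixel image.
width_px, height_px = 10_000, 10_000

for dpi in (300, 150):
    w_in, h_in = print_size_inches(width_px, height_px, dpi)
    print(f"{dpi} DPI: {w_in:.1f} x {h_in:.1f} in "
          f"({inches_to_cm(w_in):.0f} x {inches_to_cm(h_in):.0f} cm)")
```

Under those assumptions that works out to roughly 33 inches a side at 300 DPI and about 67 inches a side at 150 DPI, so wall-mural territory is realistic if the printer can go that large.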
there's some bits of stuff like haloing, but black outlining. but again, only noticeable if you're really zoomed in
depends on your printer, really
I think that's from the ultrasharp upscaler. There's also some blurry/noisy spots
those will vanish if you print it on stuff.
you'd probably want to tweak the owl's face - it sort of isn't an owl's face. and the light glints in its eyes aren't the same
so you'd want to do some tweaking, but it's really good
Definitely. I didn't do any editing for this one, I just wanted to see how far I can get it.
i'd love to see you turn this into a series - with that owl in each of them - sort of an Owl's adventure or something
You mean something like a Hokusai series? I would have to come up with a better prompt though, this one was done with ChatGPT. I didn't really care what the image was, I just needed something quick.
and i really wanna know what's lurking in that hole down there in the roots of that tree
well sure - you'd probably want to storyboard it out first too
I like the bottom right alot, it really looks like a photograph.
your background is very nice - subtle, which is good, and fairly photographic but not so much it clashes with the more animated/drawn look of the tree and the owl
That wasn't even my intention. I tried so hard to get the initial image to look photographic, but Flux just refused.
yes, well - you know my stance on flux. try it with sd 3.5 large
just do not use the terms: photoreal, photorealistic, photorealism - those are painting terms, the AIs all know they are, and they'll give you paintings. use photo, photograph, or photographic
can i see the prompt you used?
I don't have it on this computer, I can post it tomorrow.
okay
Because it's done by ChatGPT there's no clever prompt trickery going on, just a long-winded prompt.
which can be a lot of the issue, but i'd like to see it not guess
Yes I knew it would be an issue. For a serious project I would use separate keywords for the clip model.
it's not just that. some terms might call up the data for a more drawn look or animated look, or just function as noise.
and flux has interesting reactions to a lot of modifiers you wouldn't expect it to have
Ah yes hinting at the right concepts can make all the difference, but I'm not experienced enough. I love to read good prompts on civitai when I have time, there's alot to learn from them.
i just generated this one
That definitely looks like a photo, but I fear the clear sky could lead to hallucinations when denoising.
Tru bro my genius is as a doctor and an engineer bro
oh probably. but the workflow is in it if you want to grab it and play with it. just make sure you click on it, then open it in your browser, and then download it so you get the original image, not the one discord's displaying
you can prompt for clouds in the sky or stuff like that, and then the sky isn't empty
Suprematist painters from the early 1900s were right
That's fucking crazy
You all should draw squares and circles get a piece of paper and scribble on it that's the only art that holds any meaning
Keep saturating the internet with ai I'm looking forward to another Suprematist arc
well we are on the right path, peoples mental capacities are diminishing, next thing you know were just drawing boxes with crayons
Yep Malevich basically predicted the future of today a century ago that guy was genuinely a genius
What will be most interesting is when suprematist philosophy penetrates the field of ml and we will all live to see suprematist mathematics
How do I avoid multiple legs?
Idk I just press the generate button again works for me
it has a lot to do with the prompt you're using and the model you're using
It's the model I promise my model smokes the shit out of every prompt I type
Anyways suprematist mathematics time
1+1=5
2*3^6=0
I'm a genius bro
Good God I can't wait for suprematist mathematics
You guys don't have big enough brains to know wtf that even means when I say it but trust me it'll be the most spectacular thing you've ever seen
Most of the times i see you type its part of a unhinged rant. You good bro?
you weren't the one asking the question, not sure how your model could be causing @uneven flame issues
Hey man learn to read
i wasn't even replying to you, either.
I just come in here to stir things up man it's like entertainment to me. Like watching YouTube bro I got all my studying done in the morning and I can just chill all night prolly gonna head to bed soon though. Suprematist mathematics is gonna be great
Hey man it's still his model that's the problem my model gets basically everything right every time only thing it's total garbage at is composition and they all suck at composition for some reason
it can either be the model or the prompt, or a combination of both.
Never had that problem with the prompt tbh not in years the older models had that problem
do you know which model he's using?
Naw don't need to cuz I've used the tools enough to know exactly what the problem is when struggling to generate good anatomy
no way it can be anything else huh
Not really maybe a Lora but prompt definitely not
I've literally generated thousands of pictures now I know what it is
There's no skill involved there's nothing that should ever go wrong it is utterly effortless to get it to generate something with good rendering and anatomy. Composition and narrative continuity are the problems inherent to the technology not anatomy
so you know for sure he's not using sd 1.5?
what grade did you say you were in, again?
reported for stalking
I use a model that's a merge between sd1.5 and sdxl and it absolutely destroys cuz I can use sdxl vaes with sd1.5 resolution he should try that. Look up aniverseponyxl it has a special hybrid vae for sd1.5 and sdxl
go find something useful to do with yourself, Kagi
take your own advice lil bro
maybe ill give the three leg guy some insight
its just that crystal guy he smokes crystal and hes crazy
not likely. and those were really cool images you posted in the challenge channel on L3 earlier btw
Kagi don't listen stop giving into glaze and just generate waifus and femboys those do really well on twitter
the most common reason for duplications in images is pushing the resolution beyond its trained values, may or may not have something to do with your issue
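If resolution is the culprit, one way to sanity-check it is to keep the total pixel count near what the model was trained on and only change the aspect ratio. A small sketch, assuming the commonly cited native sizes (about 512x512 for SD 1.5, about 1024x1024 for SDXL) and the common habit of snapping sides to multiples of 64. The helper name and defaults are made up for illustration, so check your model's own recommended sizes.

```python
import math

def fit_to_pixel_budget(aspect_w, aspect_h, native_side=512, multiple=64):
    # Keep roughly native_side * native_side total pixels at the new aspect ratio.
    budget = native_side * native_side
    ratio = aspect_w / aspect_h
    height = math.sqrt(budget / ratio)
    width = height * ratio

    def snap(value):
        # Round to the nearest multiple, never going below one multiple.
        return max(multiple, int(round(value / multiple)) * multiple)

    return snap(width), snap(height)

sd15_square = fit_to_pixel_budget(1, 1, native_side=512)
sdxl_wide = fit_to_pixel_budget(16, 9, native_side=1024)
print(sd15_square, sdxl_wide)
```

Generate at something like these sizes first, then upscale afterwards, rather than asking the base model for a much larger canvas in one go.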
Kagi if he's dumb enough to make that mistake then he should just give up
Ai art isn't something you learn it's a talent you're born with kagi don't you know what the future is like cmon man
how many things do you give up say weekly
Nothing because I'm a beast like that
I'm literally Leonardo davinci bro
Crazy
i suspected a turtle
He's all yours, Kagi.
BTW there's just no point in expecting any return or enjoyment past "damn that's a sexy ass h**" from ai. If you swallow the pill that something about art was ever profound past that you're lying to yourself. It's just drawing nice stories and then going full suprematist that's it
The two most profound things about art. Stories and suprematism. No philosophy or meaning beyond that it's purely mechanical
Show me your ai masterpieces
i delete them all, it adds to the rareness
No thanks
yea you dont generate waifus i can already tell
Hello everyone: I have a bit of a niche use case/enquiry. My dog passed away, and I have lots of different poses of him from past photos. Is there any way to generate a LoRA impression of him and then output him in high resolution 3D format in a specific pose?
I'm sorry about your dog, when you say 3D format you mean like a 3D model rather than an image?
if you meant an image then you can train a lora on one angle and output images at different angles
if you meant 3D then I don't know what is best but I have seen a lot of activity in the areas of Gaussian Splat, Neural Radiance Field (NeRF) and novel-view synthesis
there is also differentiable rendering but that is less likely to be what you are looking for
Hello, I'm using Stable Diffusion on Stability Matrix but I'm getting a really horrible rendering with no consistency. I would like to get closer to this result. What can I do?
when you say consistency, there is a decent amount of randomness expected even if it is working well
Consistency is probably the wrong word, but no uniformity, etc.
I'd like a result that's closer to what I've shared. I need realistic visuals for my work.
I can't see that page but since the URL says Flux Realism Lora I would just take Flux and add Realism loras
if you are using A1111 then I am not sure the ancestral/SDE samplers have been adapted for Flux
Flux is already present on Stable diffusion from what I can see, so my question is, how do I install Lora? I've tried, but I have the impression that the generation ignores Lora.
I am not quite sure what software you are using
its tricky cos stable diffusion is the name of a model, or a set of models, rather than a piece of software
my guess would be that you are either using A1111 or Forge
Stable Diffusion Web UI Forge
Any recommendations on pixel art models?
not sure if forge has ancestral/SDE sampler with the right noise scaling for flux
I heard that there's discord diffusion that works like MidJourney. Is it here?
i need help with the installation on an amd gpu
windows
i tried almost every tutorial on yt and chatgpt, still cant use the gpu
amd and windows is tricky
Check the first link of the pinned messages in #🤝|tech-support
There are my Nvidia and AMD Guides
should i install zluda or directml
okay
Yes 🙂
this is really like an area of development which is in its infancy
Gaussian Splat and novel-view synthesis are probably good areas to look
Thank you 🙂
hey everyone, im glad to be back after months at sea.
just reloaded my system and getting everything setup again and wanted to check what is the current best setup to be running?
I guess ill want to use the best models, i think that was Flux and now i see SD3.5 whatever is rocking im down for that.
Anyone can advise what is the best way to run the art models now please?