#🆕｜sd3 | Stable Diffusion | Page 8

wide pagoda Jun 5, 2024, 10:42 AM

#

What about 512x512, does it just generate cropped images?

noble coyote Jun 5, 2024, 10:42 AM

#

storm saffron Jun 5, 2024, 10:43 AM

#

Sigma is a bit of an oddball though, it uses very long captions (300 tokens) rather than the 77 people are used to, so to get the best from PixArt you need to type in a large prompt or expand it with an LLM

#

SD3 still has that 77 token limit though right?

noble coyote Jun 5, 2024, 10:44 AM

#

storm saffron Sigma is a bit of an oddball though, it uses very long captions (300 tokens) rat...

Yes, my best PixArt-Sigma output is when I use Jan.ai to prepare a "natural language prompt" as input 🙂

#

T5 is better when natural language is input ...

storm saffron Jun 5, 2024, 10:45 AM

#

I've been using zephyr7b in comfy

noble coyote Jun 5, 2024, 10:45 AM

#

In fact T5 was conceived to use natural language

storm saffron Jun 5, 2024, 10:46 AM

#

Probably shouldn't discuss pixart too much in the SD3 channel of SAI though. Feels weird.

noble coyote Jun 5, 2024, 10:47 AM

#

But as too few people can get their hands on SD3 - we have to talk about something ... 😄

dull star Jun 5, 2024, 10:48 AM

#

interesting, Pixart (DiT) had the entire image noisy/distorted when doing a higher res version on any model

#

at least, if I recall correctly

noble coyote Jun 5, 2024, 10:51 AM

#

I'm @ClipDrop SD3 now ... anybody got an interesting prompt for me to try?! 🙂

#

OK, gone to Openai - Daily Theme for a prompt 😄

dull star Jun 5, 2024, 10:53 AM

#

Photo of Criminal in a ski mask making a phone call in front of a store. There is caption on the bottom of the image: "It's time to Counter the Strike...". There is a red arrow pointing towards the caption. The red arrow is from a Red circle which has an image of Halo Master Chief in it.

#

ah im late

noble coyote Jun 5, 2024, 10:53 AM

#

A moody and atmospheric fight scene between stick figures rendered in a realistic digital painting style. The scene is set in a dark alley, illuminated by a single streetlamp casting dramatic shadows. Two stick figures are engaged in a fierce battle, with one delivering a powerful punch knocking the other backwards. The realistic digital painting technique gives the stick figures texture and depth, making them appear almost human-like. The atmospheric lighting and detailed background enhance the emotive intensity of the scene

noble coyote Jun 5, 2024, 10:54 AM

#

dull star `Photo of Criminal in a ski mask making a phone call in front of a store. There ...

I've tried that one ... was OK

dull star Jun 5, 2024, 10:54 AM

#

oh you were the one who tried it

#

I forgor lol

storm saffron Jun 5, 2024, 10:55 AM

#

noble coyote A moody and atmospheric fight scene between stick figures rendered in a realisti...

Pixart

noble coyote Jun 5, 2024, 10:58 AM

#

A futuristic terrarium with a unique and captivating design. Inside the terrarium, miniature bioluminescent plants and flowers emit a gentle, glowing light, creating an ethereal and otherworldly atmosphere. The plants are arranged in a way that mimics a mystical, enchanted forest. Tiny, intricately detailed fairy houses and pathways are nestled among the foliage, adding a whimsical element. The terrarium's glass is etched with delicate patterns that reflect and enhance the inner glow, making the entire piece a captivating focal point

#

Create a square visual representation inspired by the DIY spirit and featuring prominent use of safety pins, styled as a monochrome street snap photograph from the punk era. The image should focus on a close-up of a single person's face dressed in typical punk fashion, caught in a candid moment as if walking through the streets and looking back over their shoulder. The photograph should emphasize sharp contrasts and capture the gritty, raw aesthetic of punk culture. The person should have a brightly colored mohawk or spiked hair, appearing self-cut and dyed to emphasize the DIY ethos, though rendered in black and white. They should be sticking out their tongue and raising all fingers towards the viewer, conveying a rebellious and defiant attitude. The person should wear dark eye makeup, with black eyeliner and shadow. Accessories such as safety pins used effectively but sparingly as earrings, body piercings, and prominent decorations on their clothing should be visible. The person's eyes should have a nihilistic, vacant expression, emphasizing a sense of emptiness or detachment. The background should feature a gritty urban scene with graffiti-covered walls, capturing elements of street art that align with the punk aesthetic. The monochrome style should highlight the rebellious spirit of punk with intense, sharp contrasts and a raw, edgy feel. The photograph should have a candid, spontaneous feel as if the person was unexpectedly captured while walking through the city and looking back. Add a grainy texture to the photograph to enhance the raw, unpolished aesthetic.

#

This punk prompt is a monster!!!

Create_a_square_visual_representation_inspired_by_the_DIY_spirit_and_featuring_prominent_use_of_safe_1.png

Create_a_square_visual_representation_inspired_by_the_DIY_spirit_and_featuring_prominent_use_of_safe.png

Create_a_square_visual_representation_inspired_by_the_DIY_spirit_and_featuring_prominent_use_of_safe_3.png

Create_a_square_visual_representation_inspired_by_the_DIY_spirit_and_featuring_prominent_use_of_safe_2.png

#

SD3@ClipDrop

#

SD3@ClipDrop prompt = A chubby Shiba Inu named XiaoShuai, standing on two legs with its belly exposed, not holding any items. XiaoShuai is wearing a cute human outfit that includes a brightly colored shirt and shorts. The Shiba Inu has a warm, tan coat typical of its breed, with expressive eyes and a playful expression. Scene in warm tones, realistic style, capturing the essence of a cozy home atmosphere

A_chubby_Shiba_Inu_named_XiaoShuai_standing_on_two_legs_with_its_belly_exposed_not_holding_any_ite.png

A_chubby_Shiba_Inu_named_XiaoShuai_standing_on_two_legs_with_its_belly_exposed_not_holding_any_ite_3.png

A_chubby_Shiba_Inu_named_XiaoShuai_standing_on_two_legs_with_its_belly_exposed_not_holding_any_ite_2.png

A_chubby_Shiba_Inu_named_XiaoShuai_standing_on_two_legs_with_its_belly_exposed_not_holding_any_ite_1.png

viral plaza Jun 5, 2024, 11:26 AM

#

wide pagoda What about 512x512, does it just generate cropped images?

kinda. rn the SD3-Medium model mostly just makes it work at 512, idk if the release version will be the same or not

#

well not perfectly, but, like, relative to what you'd expect for borking the res entirely

pine cedar Jun 5, 2024, 11:28 AM

#

Is this the 2b model of sd3?

wide pagoda Jun 5, 2024, 11:29 AM

#

So it does closeups, but with "reasonable" framing

low stone Jun 5, 2024, 11:30 AM

#

pine cedar Is this the 2b model of sd3?

No, that's the 8b beta on the api

pine cedar Jun 5, 2024, 11:30 AM

#

low stone No, that's the 8b beta on the api

amazing picture

noble coyote Jun 5, 2024, 11:34 AM

#

What SD3 does have is enhanced visual acuity, even with poor limbs, fingers and faces (at times).

dapper basalt Jun 5, 2024, 11:38 AM

#

If Stable Assistant sticks around and stuff, I'm gonna keep it I think. It's an awesome tool to finetune what you are trying to get. For example I had it optimize prompts. It'll be great once it's fully featured. So once weights are dropped and it isn't necessary to burn through credits really quickly, and you can use it as a tool on the side, it'll be worth it.

low stone Jun 5, 2024, 11:41 AM

#

noble coyote Jun 5, 2024, 12:16 PM

#

dapper basalt If Stable Assistant sticks around and stuff, I'm gonna keep it I think. Its an ...

I can never understand why Stable Assistant gives you 150 prompts/150 pictures a month (for $10); and ClipDrop gives you 300 prompts/1200 pictures a month (for the same $10)?

dapper basalt Jun 5, 2024, 12:17 PM

#

SA needs the money and knows people will jump on SAI quicker. Cause of the new tech and brand name

noble coyote Jun 5, 2024, 12:18 PM

#

Twice the number of prompts and eight times the number of pictures 😄

#

AFK

dapper basalt Jun 5, 2024, 12:19 PM

#

Yeah true

muted dove Jun 5, 2024, 1:11 PM

#

noble coyote A moody and atmospheric fight scene between stick figures rendered in a realisti...

Boltning Hyper

storm saffron Jun 5, 2024, 1:19 PM

#

@viral plaza do you know if it's just the standard T5 encoder that everything else's T5 uses, or is this specific to SD3?

viral plaza Jun 5, 2024, 1:21 PM

#

storm saffron @viral plaza do you know if it's just the standard T5 encoder that ever...

publicly available google/t5-xxl

storm saffron Jun 5, 2024, 1:21 PM

#

viral plaza publicly available google/t5-xxl

Phew, I don't have to download it agaaaaaain.

viral plaza Jun 5, 2024, 1:23 PM

#

https://huggingface.co/mcmonkey/google_t5-v1_1-xxl_encoderonly the encoderonly file loads more fasterer tho

storm saffron Jun 5, 2024, 1:23 PM

#

viral plaza https://huggingface.co/mcmonkey/google_t5-v1_1-xxl_encoderonly the encoderonly f...

🙂 I've already got one of those.

viral plaza Jun 5, 2024, 1:23 PM

#

nice

muted dove Jun 5, 2024, 1:24 PM

#

storm saffron Pixart

Also Pixart-Sigma

storm saffron Jun 5, 2024, 1:24 PM

#

Pixart uses the same one at FP16

storm saffron Jun 5, 2024, 1:24 PM

#

muted dove Also Pixart-Sigma

Nice!

#

Another one from pix

muted dove Jun 5, 2024, 1:28 PM

#

Same 😄

noble coyote Jun 5, 2024, 1:30 PM

#

viral plaza https://huggingface.co/mcmonkey/google_t5-v1_1-xxl_encoderonly the encoderonly f...

Can this be used inside PixArt-Sigma at all?

muted dove Jun 5, 2024, 1:34 PM

#

noble coyote Can this be used inside PixArt-Sigma at all?

I'll let you know if it loads...after it downloads

wild remnant Jun 5, 2024, 1:36 PM

#

storm saffron Jun 5, 2024, 1:41 PM

#

noble coyote Can this be used inside PixArt-Sigma at all?

Yes. I'm using an fp16 xxl t5 right now.

muted dove Jun 5, 2024, 1:45 PM

#

storm saffron Yes. I'm using an fp16 xxl t5 right now.

Are you using the encoder only version though?

storm saffron Jun 5, 2024, 1:45 PM

#

muted dove Are you using the encoder only version though?

Yep

muted dove Jun 5, 2024, 1:45 PM

#

How? I get AttributeError: 'NoneType' object has no attribute 'get'

#

Loading T5 from 'F:\ComfyUI_windows_portable\ComfyUI\models\t5\t5-xxl-encoder'
!!! Exception during processing!!! 'NoneType' object has no attribute 'get'

storm saffron Jun 5, 2024, 1:46 PM

#

#

I'm actually using a BF16 one, oops.

muted dove Jun 5, 2024, 1:47 PM

#

The clue was still there though, thanks! 🙂

#

Changed path_type from folder to file...now works

storm saffron Jun 5, 2024, 1:48 PM

#

https://huggingface.co/city96/t5-v1_1-xxl-encoder-bf16 if you want the BF version (which may be more accurate than FP16)

muted dove Jun 5, 2024, 1:51 PM

#

I seem to have all of them! 🤣

#

No wonder I'm running out of disk space

#

I have mt5-xl too

#

15GB

storm saffron Jun 5, 2024, 1:54 PM

#

You can probably find a smaller xl as well.

muted dove Jun 5, 2024, 1:55 PM

#

I have flan-t5-xl-encoder-only-bf16, which is 2.5GB

#

storm saffron Jun 5, 2024, 2:31 PM

#

muted dove

Does that work as well?

muted dove Jun 5, 2024, 2:32 PM

#

Not tried

storm saffron Jun 5, 2024, 2:37 PM

#

I wonder if the outputs are compatible.

noble coyote Jun 5, 2024, 2:38 PM

#

muted dove I have flan-t5-xl-encoder-only-bf16, which is 2.5GB

Trying bf16 now ...

storm saffron Jun 5, 2024, 2:39 PM

#

Nope, just tried, xl doesn't work, it's a different tensor size

#

xxl only
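The "different tensor size" here is most likely the encoder's hidden width. As far as the published T5 v1.1 configs go, the XL variant emits 2048 numbers per token and the XXL variant emits 4096, so a model trained against XXL embeddings cannot consume XL output directly. A minimal sketch of that check follows; the function name and the expected-width constant are illustrative assumptions, not ComfyUI or PixArt code.

```python
# Hidden widths (d_model) as listed in the published T5 v1.1 configs.
T5_V1_1_HIDDEN_SIZE = {"xl": 2048, "xxl": 4096}

# Assumption for illustration: the image model's text projection was trained
# on XXL embeddings, so it expects 4096 features per token.
EXPECTED_TEXT_WIDTH = 4096


def check_encoder_compat(variant: str, expected: int = EXPECTED_TEXT_WIDTH) -> bool:
    """Return True if the encoder's output width matches what the image model expects."""
    return T5_V1_1_HIDDEN_SIZE[variant] == expected


for name in ("xl", "xxl"):
    status = "ok" if check_encoder_compat(name) else "shape mismatch"
    print(f"t5-v1_1-{name}: d_model={T5_V1_1_HIDDEN_SIZE[name]} -> {status}")
```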

noble coyote Jun 5, 2024, 2:41 PM

#

bf16 works - not made a noticeable difference - it might if I was doing photorealistic though (I will try that later!)

#

Trying XXL ... it's working ... again, no discernable difference?!

sterile pendant Jun 5, 2024, 3:19 PM

#

noble coyote bf16 works - not made a noticeable difference - it might if I was doing photorea...

Now you see why nobody uses llms at high precisions. You can look up perplexity vs quantization to see more about it.

dull star Jun 5, 2024, 3:28 PM

#

I remember in the GPT-J days I wanted to run LLMs, but the best we could do is load-in-8bit lmao

storm saffron Jun 5, 2024, 4:38 PM

#

noble coyote Trying XXL ... it's working ... again, no discernable difference?!

BF16 and FP32 are very close in precision, FP16 loses precision in the sorts of ranges DiTs have.
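A note on the formats: bf16 keeps fp32's 8 exponent bits but stores only 7 mantissa bits, while fp16 has 5 exponent bits and 10 mantissa bits. So fp16 is actually finer for ordinary values, and what it gives up is range: large values overflow and tiny ones flush to zero. The sketch below shows this with the standard library only. The bf16 helper simulates the format by truncating a float32 (real hardware usually rounds), and the sample values are arbitrary picks for illustration.

```python
import struct


def to_fp16(x: float) -> float:
    """Round-trip through IEEE half precision (5 exponent bits, 10 mantissa bits)."""
    try:
        return struct.unpack("<e", struct.pack("<e", x))[0]
    except OverflowError:
        return float("inf")  # value is outside the fp16 range


def to_bf16(x: float) -> float:
    """Simulate bfloat16 by keeping the top 16 bits of the float32 encoding.

    This truncates instead of rounding, which is enough for a demonstration.
    """
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    return struct.unpack("<f", struct.pack("<I", bits & 0xFFFF0000))[0]


for value in (0.1, 1000.1, 70000.0, 1e-8):
    print(f"{value!r:>10}  fp16={to_fp16(value)!r}  bf16={to_bf16(value)!r}")
```

Small values such as 0.1 come out closer in fp16, while 70000.0 and 1e-8 only survive in bf16.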

noble coyote Jun 5, 2024, 4:42 PM

#

storm saffron BF16 and FP32 are very close in precision, FP16 loses precision in the sorts of ...

I'm running a lot of art which does not need a great deal of artistic precision - do you mean precision in detail; or in prompt coherence?

storm saffron Jun 5, 2024, 4:43 PM

#

noble coyote I'm running a lot of art which does not need a great deal of artistic precision ...

Mathematical precision. 😄

noble coyote Jun 5, 2024, 4:43 PM

#

I come to all this primarily as an artist ... glad to know that mathematical precision exists!! 😄

low stone Jun 5, 2024, 4:54 PM

#

the city96 guy has a 3 gig version of the t5 for hunyuan, and I did a side by side comparison. Details on the face, lines on the clothing, various stuff like that are suddenly not as sharp or don't make sense. They're not major, but it's definitely noticeable.

muted dove Jun 5, 2024, 5:22 PM

#

low stone the city96 guy has a 3 gig version of the t5 for hunyuan, and I did a side by si...

Isn't the T5 just an LLM that helps improve the prompt? I'm surprised it has any impact on the details in the image.

low stone Jun 5, 2024, 5:22 PM

#

no that t5 is the thing actually doing the encoding for the image model.

#

it's converting the words to numbers.
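To make "converting the words to numbers" concrete, here is a toy stand-in for a text encoder: words become integer ids, and each id becomes a small vector. This is not T5. A real encoder uses a learned subword tokenizer and attention layers so that each vector depends on the whole prompt; every name and size below is a made-up assumption for illustration.

```python
import hashlib

EMBED_DIM = 8  # toy width; a real T5-XXL encoder emits far wider vectors


def toy_token_ids(prompt: str, vocab_size: int = 32000) -> list[int]:
    """Map each whitespace-separated word to a stable integer id (toy tokenizer)."""
    ids = []
    for word in prompt.lower().split():
        digest = hashlib.sha256(word.encode("utf-8")).digest()
        ids.append(int.from_bytes(digest[:4], "big") % vocab_size)
    return ids


def toy_embedding(token_id: int, dim: int = EMBED_DIM) -> list[float]:
    """Derive a deterministic pseudo-embedding for one token id."""
    digest = hashlib.sha256(str(token_id).encode("ascii")).digest()
    return [byte / 255.0 for byte in digest[:dim]]


def toy_encode(prompt: str) -> list[list[float]]:
    """Return one vector per token; the image model conditions on this sequence."""
    return [toy_embedding(t) for t in toy_token_ids(prompt)]


vectors = toy_encode("a red fox in the snow")
print(len(vectors), "tokens x", len(vectors[0]), "numbers each")
```

Because the image model only ever sees these vectors, a lower-quality encoder can change image details, which fits the side-by-side comparison described above.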

teal fossil Jun 5, 2024, 5:23 PM

#

viral plaza publicly available google/t5-xxl

Which is great. I tried it in TagGui and it's pretty neat. 🙂

low stone Jun 5, 2024, 5:23 PM

#

hunyuan has a separate prompt expander model that they also include if we want to use it, but we have better stuff like gpt4 and llama3 for that.

dull star Jun 5, 2024, 5:25 PM

#

OH SHIT STABLE AUDIO HERE?

#

https://stability.ai/news/introducing-stable-audio-open

Stability AI

Stable Audio Open — Stability AI

Stable Audio Open is an open source model optimised for generating short audio samples, sound effects and production elements using text prompts.

#

non commercial license as expected

#

cant wait for comfyui implementation

dusky thistle Jun 5, 2024, 5:48 PM

#

hope like hell we someday get the full weights for the full version

#

i'd go wild with that

teal fossil Jun 5, 2024, 5:53 PM

#

@viral plaza On another note - could we theoretically load CogVLM instead of T5 XXL? At least in TagGui it gives me better results. Too bad it's quite VRAM hungry and CogVLM2 is Linux only...

teal fossil Jun 5, 2024, 5:59 PM

#

dull star OH SHIT STABLE AUDIO HERE?

Let's hope we get a good UI for that soon! (looks like audio-webui isn't actively maintained anymore)

dull star Jun 5, 2024, 6:01 PM

#

honestly comfyui would be fine for me

teal fossil Jun 5, 2024, 6:05 PM

#

dull star honestly comfyui would be fine for me

Ooh... RIGHT! I didn't consider that... absolutely.

dull star Jun 5, 2024, 6:11 PM

#

I bet we'll see some addon implementation a few days later

#

maybe around the time SD3 2B comes out

raven fern Jun 5, 2024, 6:11 PM

#

viral plaza https://huggingface.co/mcmonkey/google_t5-v1_1-xxl_encoderonly the encoderonly f...

is this the one we will need to download when sd3 releases?

fair spruce Jun 5, 2024, 6:12 PM

#

dull star Jun 5, 2024, 6:12 PM

#

raven fern is this the one we will need to download when sd3 releases?

no, you should download the bf16 one

#

smaller filesize, faster loadtime, less memory used up on CPU Ram

raven fern Jun 5, 2024, 6:12 PM

#

ah nice, i like the bf16 version anyway :3

dull star Jun 5, 2024, 6:12 PM

#

👍

fair spruce Jun 5, 2024, 6:17 PM

#

dull star Jun 5, 2024, 6:18 PM

#

wait a sec, the link to the model that Alex sent is the same filesize... maybe it's fp16 as well

#

oh its from him, mcmonkey

raven fern Jun 5, 2024, 6:18 PM

#

mcmonkey d luffy 🙂

fair spruce Jun 5, 2024, 6:19 PM

#

raven fern Jun 5, 2024, 6:19 PM

#

calm down Musk

fair spruce Jun 5, 2024, 6:19 PM

#

Show Musk go on

#

they said to me 🙂

raven fern Jun 5, 2024, 6:19 PM

#

good one

fair spruce Jun 5, 2024, 6:31 PM

#

#

#

#

rustic junco Jun 5, 2024, 6:39 PM

#

When using the api to generate images I get finish_reason: 'CONTENT_FILTERED',
seed: 2950283743
}

It blurs the image. Any way to say that it should not filter content ?
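For anyone scripting against the hosted API, a small sketch of branching on that field so blurred results can be detected and skipped. The `CONTENT_FILTERED` value and the `seed` field come from the log above; the `SUCCESS` value and the helper name are assumptions.

```python
def describe_result(result: dict) -> str:
    """Summarize a generation result based on its finish_reason field."""
    reason = result.get("finish_reason")
    seed = result.get("seed")
    if reason == "CONTENT_FILTERED":
        return f"seed {seed}: filtered by the hosted service, image is blurred"
    if reason == "SUCCESS":  # assumed name for the non-filtered case
        return f"seed {seed}: ok"
    return f"seed {seed}: unexpected finish_reason {reason!r}"


# Fields copied from the log line above.
print(describe_result({"finish_reason": "CONTENT_FILTERED", "seed": 2950283743}))
```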

low stone Jun 5, 2024, 6:42 PM

#

rustic junco When using the api to generate images I get finish_reason: 'CONTENT_FILTERED', ...

Nope. Have to wait for the 2b on June 12 for that.

rustic junco Jun 5, 2024, 6:42 PM

#

low stone Nope. Have to wait for the 2b on June 12 for that.

Where did you get that information ?

low stone Jun 5, 2024, 6:43 PM

#

rustic junco Where did you get that information ?

Look here. #📣｜announcements message

rustic junco Jun 5, 2024, 6:45 PM

#

low stone Look here. https://discord.com/channels/1002292111942635562/1002292398703001601/...

Thank you very much! 🙂 So are we sure that we will be able to turn off that filter ?

woeful spindle Jun 5, 2024, 6:45 PM

#

rustic junco Where did you get that information ?

The diffusion gods have revealed their divine will unto us

low stone Jun 5, 2024, 6:45 PM

#

rustic junco Thank you very much! 🙂 So are we sure that we will be able to turn off that filt...

The local version won't have filters. It'll just be capable of something or it won't. There won't be any blurred images.

fair spruce Jun 5, 2024, 6:46 PM

#

low stone Jun 5, 2024, 6:51 PM

#

#

Hah it couldn't quite do it.

rustic junco Jun 5, 2024, 7:01 PM

#

low stone The local version won't have filters. It'll just be capable of something or it w...

Do you know what format the image-to-image mode has to be ?

#

I tried to send an url and tried to send the image as base64

#

both times i get a bad request back. I also changed the mode to image-to-image but it doesn't seem to work
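One possible cause, offered as a guess: the hosted SD3 endpoint appears to expect `multipart/form-data` with the image attached as a binary file part, not a URL or a base64 string. The sketch below only builds the form and sends nothing. The field names (`prompt`, `mode`, `strength`, `output_format`, `image`) are recalled from the public docs and should be treated as assumptions to verify; the helper name is made up.

```python
def build_img2img_form(prompt: str, image_bytes: bytes, strength: float = 0.6):
    """Build the (data, files) pair for a multipart/form-data image-to-image call.

    Field names are assumptions recalled from the hosted API docs; verify them.
    """
    if not 0.0 <= strength <= 1.0:
        raise ValueError("strength must be between 0 and 1")
    data = {
        "prompt": prompt,
        "mode": "image-to-image",
        "strength": str(strength),
        "output_format": "png",
    }
    # The image travels as a binary file part, not as a URL or base64 text.
    files = {"image": ("input.png", image_bytes, "image/png")}
    return data, files


# Placeholder bytes stand in for the contents of a real PNG file.
data, files = build_img2img_form("a watercolor fox", b"\x89PNG...", strength=0.5)
print(sorted(data), list(files))
```

With the `requests` package, the pair would be passed as `data=data, files=files` to `requests.post`, along with an authorization header.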

cedar gale Jun 5, 2024, 7:02 PM

#

raven fern Jun 5, 2024, 7:05 PM

#

what am i looking at lol

fair spruce Jun 5, 2024, 7:06 PM

#

#

cedar gale Jun 5, 2024, 7:08 PM

#

raven fern Jun 5, 2024, 7:10 PM

#

kek

cedar gale Jun 5, 2024, 7:10 PM

#

#

fringe rain Jun 5, 2024, 7:17 PM

#

1dog

raven fern Jun 5, 2024, 7:23 PM

#

cedar gale

that looks like a picture from an actual commercial LOL

past flame Jun 5, 2024, 7:23 PM

#

"Inside of you, there is 2wolves"

cedar gale Jun 5, 2024, 7:23 PM

#

I was bored. 😄

#

#

I think the coca cola drip is the best tho.

#

rain current Jun 5, 2024, 7:31 PM

#

ideogram

cedar gale Jun 5, 2024, 7:32 PM

#

Noice.

rain current Jun 5, 2024, 7:33 PM

#

I am very impressed by ideogram. I took your image and used describe, then generated the resulting prompt... it has a lot of potential

#

Although sometimes it makes real crap

cedar gale Jun 5, 2024, 7:34 PM

#

#

It's fun, I like using SD for making outrageous stuff. 😄

raven fern Jun 5, 2024, 7:35 PM

#

now im thirsty

rain current Jun 5, 2024, 7:36 PM

#

So then....

cedar gale Jun 5, 2024, 7:37 PM

#

Yeah that kills em faster.

#

#

This might speed up things.

raven fern Jun 5, 2024, 7:39 PM

#

lol

#

put a poison bottle

cedar gale Jun 5, 2024, 7:39 PM

#

#

😦

#

ALOHOL.

#

Sad panda.

#

rain current Jun 5, 2024, 7:42 PM

#

agony

cedar gale Jun 5, 2024, 7:45 PM

#

turbid grotto Jun 5, 2024, 7:46 PM

#

glif-stablediffusion-3-cds899-iplbat6xwporf2u7b8j1fv5a.jpg

glif-stablediffusion-3-cds899-lj2tfvnya9v1gnshc83ciy57.jpg

glif-stablediffusion-3-cds899-pyl8bg8hjsz6otro0h1206c3.jpg

rain current Jun 5, 2024, 7:46 PM

#

The patient may not survive...

cedar gale Jun 5, 2024, 7:47 PM

#

They will be fine.

turbid grotto Jun 5, 2024, 7:48 PM

#

some people say text is a gimmick, but I have a hell of a lot of fun with it and it is good

glif-stablediffusion-3-cds899-w0x2erlgrtk5q1e1h264efs8.jpg

glif-stablediffusion-3-cds899-eq05otw61jlg3hf40eqfnxo4.jpg

dreamy sundial Jun 5, 2024, 7:49 PM

#

turbid grotto some people say text is a gimmick, but I have a hell of a lot of fun with it and it is ...

yeah it's pretty good for a base model

cedar gale Jun 5, 2024, 7:52 PM

#

turbid grotto Jun 5, 2024, 7:56 PM

#

I don't really understand how glif.app hosts SD3. It takes 8s per gen, while someone from SAI (or it was in the paper) said it takes 40s or so on a 4090 with 50 steps. Do they have something more powerful? But why is it free then, how do they profit?
At first I wasn't sure that it is the real sd3, but it is not similar to any other service

turbid grotto Jun 5, 2024, 7:57 PM

#

dreamy sundial yeah it's pretty good for a base model

and it is still undertrained

rain current Jun 5, 2024, 7:57 PM

#

cedar gale

raven fern Jun 5, 2024, 7:58 PM

#

i can fix her (in photoshop...yea...)

cedar gale Jun 5, 2024, 7:58 PM

#

Yup she looks healthy.

rain current Jun 5, 2024, 8:03 PM

#

with sdxl refiner

rain current Jun 5, 2024, 8:10 PM

#

turbid grotto I don't really understand how glif.app hosts SD3. It takes 8s per gen, while so...

I haven't tried it, but could it be fake? Lately, I see a lot of https://huggingface.co/spaces/markmagic/Stable-Diffusion-3
// I see that it no longer works; it was working yesterday

turbid grotto Jun 5, 2024, 8:18 PM

#

rain current I haven't tried it, but could it be fake? Lately, I see a lot of https://hugging...

I also thought about that, but what is it then? It is definitely not Dalle3, Ideogram looks very different too, SDXL could never achieve such prompt following...

#

At the beginning I got fooled by one "free sd3" site. Decided to check the metadata and found "Ideogram" there) But that doesn't seem to be the case here

low stone Jun 5, 2024, 8:26 PM

#

turbid grotto I also thought about that, but what is it then? It is definitely not Dalle3, Ideogra...

Well could just be using the sd3 api

storm saffron Jun 5, 2024, 8:27 PM

#

low stone Well could just be using the sd3 api

Considering the source code was literally `exec(os.environ["CODE"])` they were probably using the initial signup free credits on the SAI API

rain current Jun 5, 2024, 8:27 PM

#

I think some people are using Ideogram. It's easy to make calls to Ideogram via the endpoint URL. I do it sometimes by inserting the bearer token, and it works well

turbid grotto Jun 5, 2024, 8:31 PM

#

low stone Well could just be using the sd3 api

yea it could, but it's just too generous, I didn't even hit the limit once. Some of these sd3 glifs have about 30k plays while also having gpt4\claude prompt rebuilding or sdxl refining on top

turbid grotto Jun 5, 2024, 8:33 PM

#

rain current I think some people are using Ideogram. It's easy to make calls to Ideogram via ...

I made some comparisons but couldn't find obvious similarities between Ideogram from their website and glif

hallow lion Jun 5, 2024, 9:08 PM

#

1 week.

#

😏

hallow lion Jun 5, 2024, 9:10 PM

#

rain current I haven't tried it, but could it be fake? Lately, I see a lot of https://hugging...

its like hl2 in russia came out before hl2 in america came out

#

😄

#

shady characters pushing SD3 in the backrooms and alleys

low stone Jun 5, 2024, 9:22 PM

#

Sd3/sdxl refined

cunning lintel Jun 5, 2024, 9:24 PM

#

first thing to try when weights drop

A_frenetic_cleaning_girl_inspired_by_the_works_of_Bruce_Timm_hurling_a_pile_of_old_el.png

dull star Jun 5, 2024, 9:27 PM

#

LMAO

#

I really wanna see the difference in prompt adherence

low stone Jun 5, 2024, 9:32 PM

#

Sd3/hunyuan

cunning lintel Jun 5, 2024, 9:36 PM

#

dull star I really wanna see the difference in prompt adherence

i just want to try dumb contradicting stuff like "luminescent Rose madder Bright lilac huge crooked (fur:1.9) (alien:0.2) (Titanoboa:0.5) (Meerkat:0.4) amethyst eyes. full body. Boreal Forest, puddle, centered, sunlight, focused, uniform light background, photo-realistic" and T5 is probably too smart for that :p (though it doesn't even look half bad)

luminescent_Rose_madder_Bright_lilac_huge_crooked_fur1.png

dull star Jun 5, 2024, 9:37 PM

#

I never got this actually. Can T5 also have prompt weighting?

cunning lintel Jun 5, 2024, 9:42 PM

#

no idea, i'd think why not but it prob needs a new implementation
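The `(text:weight)` syntax in the prompt above can at least be parsed independently of the text encoder. The sketch below covers the parsing step only. How a T5-based pipeline would apply the weights is left open in this chat, so nothing here is claimed about SD3 or ComfyUI; the function name and regex are illustrative.

```python
import re

# Matches "(text:1.5)" style emphasis as used in the prompt above.
WEIGHTED = re.compile(r"\(([^():]+):([0-9]*\.?[0-9]+)\)")


def parse_weights(prompt: str) -> list[tuple[str, float]]:
    """Split a prompt into (text, weight) chunks; unweighted text gets 1.0."""
    chunks = []
    pos = 0
    for match in WEIGHTED.finditer(prompt):
        plain = prompt[pos:match.start()].strip()
        if plain:
            chunks.append((plain, 1.0))
        chunks.append((match.group(1).strip(), float(match.group(2))))
        pos = match.end()
    tail = prompt[pos:].strip()
    if tail:
        chunks.append((tail, 1.0))
    return chunks


print(parse_weights("huge crooked (fur:1.9) (alien:0.2) amethyst eyes"))
```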

low stone Jun 5, 2024, 9:43 PM

#

Refined, it's really neat

low stone Jun 5, 2024, 9:44 PM

#

cunning lintel i just want to try dumb contradicting stuff like "luminescent Rose madder Brigh...

#

#

Wow these came out great

#

#

purple furry snakes in the shape of "SD3" are wrapped around the arm of will smith who is taming a lion

cunning lintel Jun 5, 2024, 9:49 PM

#

grin, asking a bit much 🤣 one day, imagegens will happily oblige, i hope 😉

low stone Jun 5, 2024, 10:20 PM

#

cunning lintel grin, asking a bit much 🤣 one day, imagegens will happily oblige, i hope 😉

Other than the text portion, pixart did it more accurately.

hallow lion Jun 5, 2024, 10:34 PM

#

sigma slow

low stone Jun 5, 2024, 10:56 PM

#

this is with 512 sigma

storm saffron Jun 5, 2024, 10:58 PM

#

hallow lion sigma slow

If you think sigma is slow, wait till sd3, apparently it's also quite slow. I believe lykon said 30 seconds per image on a 4090? 50 steps.

dull star Jun 5, 2024, 11:06 PM

#

isn't that 8B?

#

and also 50 steps too

#

In early, unoptimized inference tests on consumer hardware our largest SD3 model with 8B parameters fits into the 24GB VRAM of a RTX 4090 and takes 34 seconds to generate an image of resolution 1024x1024 when using 50 sampling steps.

faint breach Jun 5, 2024, 11:07 PM

#

50 steps yeah. 30 seconds for that isn't so bad. thats 1.5 it/s at megapixel resolutions. not sure what the problem is

#

not even any optimizations done by the community yet
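The arithmetic behind the "1.5 it/s" figure, as a quick check. It uses the 34 seconds and 50 steps from the quoted passage, plus the rounder 30 seconds mentioned in chat; the helper name is made up, and the result ignores text-encoder and VAE time.

```python
def iterations_per_second(steps: int, seconds: float) -> float:
    """Average sampler throughput, ignoring text-encoder and VAE time."""
    return steps / seconds


quoted = iterations_per_second(50, 34.0)   # figures from the quoted passage above
rounded = iterations_per_second(50, 30.0)  # the "30 seconds" used in chat
print(f"{quoted:.2f} it/s at 34 s, {rounded:.2f} it/s at 30 s")
```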

dull star Jun 5, 2024, 11:08 PM

#

well for 8B that's pretty good, but for 2B, a 4090 doing 1.5it/s isn't that good imo
I hope MODE13 meant 8B

#

and yes, optimizations

faint breach Jun 5, 2024, 11:09 PM

#

a lot of the time may come from t5 layer too. faster without it and many won't want to use it

#

the way a lot of civit users prompt, t5 won't be necessary

dull star Jun 5, 2024, 11:09 PM

#

yeah lmao

hallow lion Jun 5, 2024, 11:09 PM

#

if i can get under a minute 1024/1024 30 step ish i can live

dull star Jun 5, 2024, 11:09 PM

#

1girl, big boob, intricate won't need T5 for sure

hallow lion Jun 5, 2024, 11:09 PM

#

now piling on loras controlnets ipa adapters will be a different story

faint breach Jun 5, 2024, 11:10 PM

#

i've been using Omost which takes like a minute to make the prompt code in the first place lol

#

i'll be fine

hallow lion Jun 5, 2024, 11:11 PM

#

dull star `1girl, big boob, intricate` won't need T5 for sure

what about "2 girls, one with A cup boobs and one with C-cup, intricate"?

dull star Jun 5, 2024, 11:11 PM

#

sadcat

#

im sure PonySD3 will get it right first try

faint breach Jun 5, 2024, 11:12 PM

#

there's gonna be a lot of debate over the usefulness of t5. many will declare it to be another sdxl refiner layer since they won't see immediate gains

#

another prediction i have. controlnet won't work as well on larger sd3

dull star Jun 5, 2024, 11:14 PM

#

how so

faint breach Jun 5, 2024, 11:14 PM

#

same problem with sdxl. more parameters means less influence

dull star Jun 5, 2024, 11:17 PM

#

interesting

#

then again, wouldn't the MultiModal nature of it make controlnets just superior?

#

multiple input streams and such or whatever

#

rather than some hacky solution

faint breach Jun 5, 2024, 11:19 PM

#

i'm not saying i have an educated prediction

dull star Jun 5, 2024, 11:21 PM

#

and I don't even know what input streams are

low stone Jun 5, 2024, 11:27 PM

#

hallow lion what about "2 girls, one with A cup boobs and one with C-cup, intricate"?

there was a thread on the sdxl channel today where andreac75 who makes a lot of top notch loras is just deleting all their stuff off civit because it's all just porn now. They didn't want to see all the heinous stuff people were making with their stuff anymore.

faint breach Jun 5, 2024, 11:32 PM

#

and the porn side of the community has this notion where if you even slightly don't want to see that, you're very obviously a fundamentalist christian oppressor

patent acorn Jun 5, 2024, 11:50 PM

#

rain current I haven't tried it, but could it be fake? Lately, I see a lot of https://hugging...

ermn error

Screenshot_2024-06-06-06-49-08-131_com.android.chrome-edit.jpg

trim arrow Jun 5, 2024, 11:52 PM

#

sd3 hasn't even released yet...i think.

patent acorn Jun 6, 2024, 12:01 AM

#

ik

upper snow Jun 6, 2024, 12:35 AM

#

arguably the more important question: deep floyd stage 3 wen?

hallow talon Jun 6, 2024, 1:16 AM

#

I wonder how long after SD3 2B releases will we be able to start training LORAs and finetuning?

solid violet Jun 6, 2024, 1:23 AM

#

trim arrow sd3 hasn't even released yet...i think.

developer API has been out...
https://stability.ai/news/stable-diffusion-3-api

Stability AI

Stable Diffusion 3 API Now Available — Stability AI

We are pleased to announce the availability of Stable Diffusion 3 and Stable Diffusion 3 Turbo on the Stability AI Developer Platform API.

astral grail Jun 6, 2024, 1:33 AM

#

if it's very easy, why is stability not officially doing that?

cinder junco Jun 6, 2024, 1:51 AM

#

astral grail if it's very easy, why is stability not officially doing that?

Do you want it to take even longer to release the weights?

astral grail Jun 6, 2024, 1:52 AM

#

well if "very easy" means it would just take 1 day longer, then I think that would be worth delaying it 1 day for, yeah 😄

cinder junco Jun 6, 2024, 1:53 AM

#

astral grail well if "very easy" means it would just take 1 day longer, then I think that wou...

Of course. I think Stability would just do it if they thought it were THAT easy.

low stone Jun 6, 2024, 1:53 AM

#

astral grail well if "very easy" means it would just take 1 day longer, then I think that wou...

That's just going to be a comfy node update later. The weights don't need to wait for that

cinder junco Jun 6, 2024, 1:54 AM

#

Easy probably means "you don't have to go through special gyrations to get the loss to decrease, but you still need to train with a lot of compute and data".

#

And rather than spending all that compute on training the model on other resolutions, I'd prefer that the issue with the positional encoding get fixed so that the existing model is more flexible.

low stone Jun 6, 2024, 1:56 AM

#

The downside of tiled upscaling is that it takes many times longer than standard upscale image and denoise at 0.5.

#

It's probably why they're not doing it on the api and instead upscaling sd3 with sdxl turbo

astral grail Jun 6, 2024, 1:58 AM

#

low stone That's just going to be a comfy node update later. The weights don't need to wai...

I'm quite sure @viral plaza in the message I replied to was saying that SD3 can easily be trained for higher resolutions, so no, that cannot just be a comfy node, it needs training

low stone Jun 6, 2024, 1:59 AM

#

All of these models are 1024 res trained.

#

It would take a massively larger amount of time to train the base models on millions of images at higher res. It would take massively more money to buy that gpu time, which a near bankrupt company can't do.

#

So instead, they're delegating that to the community, which makes sense.

#

I believe that they have the best of intentions at this point. The reality though, is that we may never see 8b or even 4b if someone doesn't come along and fund them somehow.

#

So we should just get 2b going and do what we can with that until/if we get surprised with something better down the road.

sterile pendant Jun 6, 2024, 2:04 AM

#

faint breach same problem with sdxl. more parameters means less influence

Nah, cnets were kind of a hack job to implement into unet based architectures. They should be exponentially better with DiT once people start rolling them out.

hallow lion Jun 6, 2024, 2:19 AM

#

yes take 2B and run with it

#

dont do another Cascade!

low stone Jun 6, 2024, 3:16 AM

#

bitter hearth Jun 6, 2024, 3:27 AM

#

#

#

Stupid black bars

#

sadcat

#

#

"action movie" in the prompt, really likes to add fire to things

patent acorn Jun 6, 2024, 3:36 AM

#

bitter hearth "action movie" in the prompt, really likes to add fire to things

looks more like a video game to me

bitter hearth Jun 6, 2024, 3:37 AM

#

patent acorn looks more like a video game to me

thomas I have unreal engine in the prompt

#

What you said makes all the sense :p

patent acorn Jun 6, 2024, 3:38 AM

#

ohh

bitter hearth Jun 6, 2024, 3:45 AM

#

mighty pulsar Jun 6, 2024, 3:46 AM

#

/help

bitter hearth Jun 6, 2024, 3:46 AM

#

The green reaper..

patent acorn Jun 6, 2024, 3:47 AM

#

bitter hearth

thats a pickaxe not a scythe..

#

i heard its pretty bad at scythes too

#

later when we get to see it finetuned

bitter hearth Jun 6, 2024, 3:48 AM

#

#

Loooong scythe

patent acorn Jun 6, 2024, 3:48 AM

#

thats a scythe now

hallow lion Jun 6, 2024, 3:49 AM

#

at least Friday is close

bitter hearth Jun 6, 2024, 3:50 AM

#

#

it's not bad at the scythe itself it seems, but same problem with guns, the holding part

#

#

patent acorn Jun 6, 2024, 3:53 AM

#

bitter hearth

yep diffusionhand

bitter hearth Jun 6, 2024, 3:54 AM

#

Well

#

Fingers are always bad

#

#

hallow lion Jun 6, 2024, 4:24 AM

#

bitter hearth

Looks like even those 4GB of VRAM are overheating...

sterile pendant Jun 6, 2024, 4:35 AM

#

I wonder if we will be able to rig up a multigpu setup for sd3 in comfy with any form of ease. I'd use my second weaker GPU for t5 inference and my main one for sd3, that way I wouldn't have to swap models constantly or do t5 inference on the CPU...

#

I know that in the python code, it's usually as simple as saying cuda0, cuda1 or CPU, so it might be pretty easy to slap into a node

#

I just don't know if comfy would natively like it with the smart memory management stuff
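
A minimal sketch of the placement idea described above, assuming PyTorch-style device strings (`cuda:0`, `cuda:1`, `cpu`). `plan_devices` is a hypothetical helper; in a real setup the GPU count would come from something like `torch.cuda.device_count()`, and whether ComfyUI's memory management cooperates with this is exactly the open question in the message.

```python
from typing import Dict

def plan_devices(gpu_count: int) -> Dict[str, str]:
    """Pick a device for the diffusion model and for the T5 text encoder."""
    if gpu_count >= 2:
        # Main GPU runs SD3, the weaker second GPU runs T5.
        return {"diffusion": "cuda:0", "text_encoder": "cuda:1"}
    if gpu_count == 1:
        # Single GPU: keep it for SD3 and fall back to CPU for T5.
        return {"diffusion": "cuda:0", "text_encoder": "cpu"}
    return {"diffusion": "cpu", "text_encoder": "cpu"}

print(plan_devices(2))
```

The returned strings would then be passed to each model's `.to(...)` call, so neither model has to be swapped out between prompts.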

turbid grotto Jun 6, 2024, 6:19 AM

#

low stone there was a thread on the sdxl channel today where andreac75 who makes a lot of ...

Is there alternative?

#

||not hf pls||

dusky thistle Jun 6, 2024, 7:06 AM

#

turbid grotto ||not hf pls||

What's wrong with hf

turbid grotto Jun 6, 2024, 7:19 AM

#

dusky thistle What's wrong with hf

idk it is just not that convenient if you don't know what exactly you need

#

could be skill issue

weary snow Jun 6, 2024, 8:14 AM

#

What is likely going to be the minimum requirement for Stable Diffusion 3 Medium ?

cobalt moon Jun 6, 2024, 8:23 AM

#

weary snow What is likely going to be the minimum requirement for Stable Diffusion 3 Medium...

4GB

#

to be fair, in 2024 almost everyone has at least 4GB

viral plaza Jun 6, 2024, 8:23 AM

#

astral grail if it's very easy, why is stability not officially doing that?

Technical things being easy doesn't mean business things are easy.
Like first -- Our technical team could release a thousand models... but how do we organize marketing around them all, how do we get safety testing on them all, how do we get legal signoff on them all, how do we get licensing organized, etc. etc.
But also: if we decide to spend three days training 2B-2048, that's three days not spent training the 8B. (Opportunity cost: every action no matter how easy comes at the cost of something else you could've done with that same time/effort).

in short: something being technically easy doesn't mean a company like Stability can or should do it internally.
In fact, that's part of the point and value of open releases: Stability doesn't have to do everything, we just open release the models and what info we can, and there's ten thousand other people who will each obsess into any given subsection of work and make that

viral plaza Jun 6, 2024, 8:25 AM

#

astral grail I'm quite sure @viral plaza in the message I replied to was saying that...

I'm pretty sure there's a way to software-patch the pos embeddings and make them friendly to other scales. Not 100% confident but similar things have been done before (see eg RoPE scaling for LLMs)
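
For reference, the LLM trick mentioned here (linear position interpolation, one form of RoPE scaling) can be sketched in a few lines: positions for a longer sequence are squeezed back into the range the model saw in training. This is only the generic idea. How SD3's positional embeddings are actually built is not described in this thread, so `interpolated_positions` is a hypothetical illustration and not a patch for SD3.

```python
from typing import List

def interpolated_positions(target_len: int, trained_len: int) -> List[float]:
    """Rescale positions 0..target_len-1 so they stay inside the trained range."""
    if target_len <= trained_len:
        # Nothing to do: the model already saw these positions in training.
        return [float(i) for i in range(target_len)]
    scale = trained_len / target_len
    return [i * scale for i in range(target_len)]

# A grid side twice the trained length gets half-step positions.
print(interpolated_positions(8, 4))
```

For a 2D image grid the same rescaling would be applied separately along height and width.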

viral plaza Jun 6, 2024, 8:26 AM

#

raven fern is this the one we will need to download when sd3 releases?

no just wait until launch day and we'll have clear proper ways to get any required files

viral plaza Jun 6, 2024, 8:27 AM

#

dull star wait a sec, the link to the model that Alex sent is the same filesize... maybe i...

oh yeah lol wasn't gonna upload an fp32 file in the year of our quantization 2024

imo we should be using a 4bit quant, but, need proper software support first
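
Back-of-the-envelope numbers for why precision matters here. The parameter count is an assumption (roughly 4.7 billion for the T5-XXL encoder alone, chosen because it matches the ~10GB fp16 and ~20GB fp32 sizes quoted elsewhere in this thread), and real 4-bit formats add some overhead for scales that this ignores.

```python
def model_size_gb(params: float, bits_per_weight: float) -> float:
    """Approximate weight storage in GB (1 GB = 1e9 bytes), ignoring overhead."""
    return params * bits_per_weight / 8 / 1e9

T5_XXL_ENCODER_PARAMS = 4.7e9  # assumption, see the note above

for name, bits in [("fp32", 32), ("fp16", 16), ("4-bit", 4)]:
    size = model_size_gb(T5_XXL_ENCODER_PARAMS, bits)
    print(f"{name}: about {size:.1f} GB")
```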

cobalt moon Jun 6, 2024, 8:28 AM

#

4 bit quant?

#

as in LLM's Q4?

viral plaza Jun 6, 2024, 8:28 AM

#

teal fossil @viral plaza On another note - could we *theoretically* load CogVLM ins...

CogVLM is an LLM that takes image inputs, designed to generate text descriptions of those images
T5-XXL is a text encoder-decoder generic pairing, in which SD3 uses only the encoder, to convert prompts to inputs.

These models are not interchangeable, they're not even in the same category

dull star Jun 6, 2024, 8:29 AM

#

viral plaza I'm pretty sure there's a way to software-patch the pos embeddings and make them...

I hope it gets figured out cause I don't know about how good it will upscale images (tiled) with depth of field in them

viral plaza Jun 6, 2024, 8:29 AM

#

cobalt moon as in LLM's Q4?

yeah

cobalt moon Jun 6, 2024, 8:29 AM

#

oh you're talking about T5

#

yeah that makes sense

viral plaza Jun 6, 2024, 8:29 AM

#

for example i use a 4bit T5 for https://github.com/mcmonkeyprojects/translate-tool and it works great

cobalt moon Jun 6, 2024, 8:30 AM

#

there seem to be quite a lot of model variants for T5 though

#

there's an fp16 version by theunlikely

viral plaza Jun 6, 2024, 8:30 AM

#

(T5 is an architecture intended to be finetuned, there's tons of variants out there, it's kinda silly that we use a base model of T5 for sd3 tbh)

viral plaza Jun 6, 2024, 8:31 AM

#

cobalt moon there's an fp16 version by theunlikely

there's an fp16 encoderonly T5-XXL intended for SD3 here https://huggingface.co/mcmonkey/google_t5-v1_1-xxl_encoderonly/tree/main

#

(which is on HF because the SGM training code uses HF so i uploaded that to dodge the long slow load on the full fat fp32 pair file)

dull star Jun 6, 2024, 8:32 AM

#

Loading fp32 T5, was always the slowest part when using deepfloyd and pixart lol

cobalt moon Jun 6, 2024, 8:33 AM

#

just get oof with OOM when using pixart lol

#

I haven't try Tiled KSampler though

noble coyote Jun 6, 2024, 9:01 AM

#

"... obsessing into subsections ... groan, if only!!!" 😄

sterile pendant Jun 6, 2024, 9:05 AM

#

dull star Loading fp32 T5, was always the slowest part when using deepfloyd and pixart lol

Pixart has a couple of smaller versions in fp16 and bf16. Drops the size to around 8-10GB and speeds loading up a lot. You can find links to them on the extra models comfy plugin page

#

Though the absolute easiest way to deal with the slow loads is to just get a cheap $40 NVMe drive in the 2-5 GB/s range, and models load in seconds

muted dove Jun 6, 2024, 9:07 AM

#

sterile pendant Pixart has a couple of smaller versions in fp16 and bf16. Drops the size to aro...

The one in the link is only 10GB

patent acorn Jun 6, 2024, 9:19 AM

#

viral plaza CogVLM is an LLM that takes image inputs, designed to generate text descriptions...

huh i dont understand, then how do yall caption sd3 datasets?

viral plaza Jun 6, 2024, 9:20 AM

#

dunno, probably not

#

there's work on them but nothing release-ready last i looked

patent acorn Jun 6, 2024, 9:20 AM

#

oh cogvlm i see

viral plaza Jun 6, 2024, 9:20 AM

#

it's possible those get finished soon after and released or something, or community takes over, idk, we'll see

cobalt moon Jun 6, 2024, 9:21 AM

#

it's pretty weird that some people remember Stability saying it will release ControlNet models on day one

viral plaza Jun 6, 2024, 9:21 AM

#

CogVLM is an AI model that generates captions, not a dataset. The captions that don't come from CogVLM are just whatever captions came with the source images

patent acorn Jun 6, 2024, 9:22 AM

#

ight captioning with my own brain then

viral plaza Jun 6, 2024, 9:24 AM

#

if you're doing your own dataset, yeah just write your own captions

sterile pendant Jun 6, 2024, 9:24 AM

#

muted dove The one in the link is only 10GB

yeah, we were talking about the fp32 versions from before though. it was like 20gb.

viral plaza Jun 6, 2024, 9:24 AM

#

if you're doing a megascale dataset (millions or billions of images or whatever), the collection tools to build datasets like that usually can copy out original source caption data

patent acorn Jun 6, 2024, 9:25 AM

#

viral plaza if you're doing a megascale dataset (millions or billions of images or whatever)...

for instance im planning to finetune BTD6 Lora on SD3, well i have to make my own cogvlm finetune..

#

its like 400+ images iirc

viral plaza Jun 6, 2024, 9:27 AM

#

I expect the HuggingFace team will publish finetuning info on day 1, followed later by the various community finetuning software vendors (kohya, onetrainer, etc)

late compass Jun 6, 2024, 9:40 AM

#

viral plaza I expect the HuggingFace team will publish finetuning info on day 1, followed la...

i have one question

#

diffusionhand

#

How to get membership or early access?

storm saffron Jun 6, 2024, 9:45 AM

#

viral plaza CogVLM is an AI model that generates captions, not a dataset. The captions that ...

Do you know what the system prompt and prompt was that was used in CogVLM? I've had some good and some bad results from using it.

teal fossil Jun 6, 2024, 9:56 AM

#

I got CogVLM2 running yesterday and my first prompt got responses that copied the whole question before answering. Made it absolutely useless for captioning. ^^
I rewrote the prompt and then it was way better.
The only persisting problem so far (which I can understand is unimaginably hard for an AI to get right) is that right and left hand / arm are swapped when a person is facing the camera.
I'll have to manually check all captions because sometimes it gets it right, most times not. 😅

sterile pendant Jun 6, 2024, 10:00 AM

#

storm saffron Do you know what the system prompt and prompt was that was used in CogVLM? I've ...

Most VLMs are good to go out of the box, since that's literally their sole purpose and what they were trained to do. But if you're trying to get a specific format, you have to do a little bit of system prompting. I spent a few days a while back making one for llava 1.6 Mistral and it was around 1000 tokens long with examples in it. You can also use RAG, but the info still bloats up the context size. I know llava 1.6 ate up ~2000 tokens for the image information alone, so I didn't use RAG for it when I only had a 4096 context size (no sliding window or other optimizations)
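
The budget described above works out as follows, using the approximate figures from the message (4096 context, ~2000 image tokens, ~1000 system prompt tokens). `remaining_context` is a hypothetical helper for the arithmetic only.

```python
def remaining_context(context_size: int, image_tokens: int, system_prompt_tokens: int) -> int:
    """Tokens left for the user prompt, any retrieved text, and the generated caption."""
    return context_size - image_tokens - system_prompt_tokens

left = remaining_context(context_size=4096, image_tokens=2000, system_prompt_tokens=1000)
print(f"{left} tokens left")  # prints "1096 tokens left"
```

With roughly a quarter of the window left, there is little room for retrieved documents on top of the caption itself.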

storm saffron Jun 6, 2024, 10:25 AM

#

sterile pendant Most vlms are good to go out of the box, since that's literally their sole purpo...

The most annoying thing is it starting every caption with "In this image" or "The image contains" or variations of those, which I then have to take out. Also I had one VLM that just kept on about handbags for some reason (it was a Microsoft one).
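
Taking those openers out by hand can be scripted. A small sketch using a regular expression: the list of openers is an assumption based on the two examples in the message plus a few obvious variations, and `strip_opener` is a hypothetical helper.

```python
import re

# Openers to remove from the start of a caption (case-insensitive).
_OPENERS = re.compile(
    r"^\s*(?:in this image,?\s*|the image (?:contains|shows|depicts|features)\s*)",
    re.IGNORECASE,
)

def strip_opener(caption: str) -> str:
    """Drop a boilerplate opener and re-capitalize what is left."""
    cleaned = _OPENERS.sub("", caption, count=1).strip()
    return cleaned[:1].upper() + cleaned[1:]

print(strip_opener("In this image, a cat sits on a fence."))
print(strip_opener("The image contains two dogs playing."))
```

Captions that do not start with one of the listed openers pass through unchanged.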

lavish pier Jun 6, 2024, 10:34 AM

#

Anyone has suggestions for an image to text/prompt model? I am trying to get a prompt from an image to generate similar images

dry wave Jun 6, 2024, 11:21 AM

#

storm saffron The most annoying thing is it starting every caption with "In this image" or"The...

an easy way to avoid that is by using a predefined output prefix

#

question: Write a short caption for this image.
response: This image shows [autocomplete from here]
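
The prefix trick above can be sketched as plain string handling: the prompt ends with the start of the answer, the model continues from there, and the prefix is dropped afterwards. The `question:` / `response:` layout is copied from the message. Both function names are hypothetical, and the actual model call is left out because it depends on the VLM being used.

```python
def build_caption_prompt(question: str, response_prefix: str) -> str:
    """Prompt that already contains the first words of the answer."""
    return f"question: {question}\nresponse: {response_prefix}"

def finish_caption(completion: str) -> str:
    """Turn the model's continuation into a standalone caption."""
    text = completion.strip()
    return text[:1].upper() + text[1:]

prompt = build_caption_prompt("Write a short caption for this image.", "This image shows")
print(prompt)

# Pretend the model continued the prefix with this text:
print(finish_caption(" a red barn at sunset. "))
```

Because the model never gets to choose its own opening words, the "In this image" variations do not appear in the continuation.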

sterile pendant Jun 6, 2024, 11:44 AM

#

The other trick you can use, granted it's double the time, is to use a second pass with llama3 instruct (or any good recent instruct model) to clean up the captions and format them better

cedar gale Jun 6, 2024, 11:44 AM

#

Or use taggerui and discourage from caption:

sterile pendant Jun 6, 2024, 11:44 AM

#

They are starting to make some llama3 based vlms now, so that should be awesome. Think there's a llava next llama3 someone is working on

sullen moss Jun 6, 2024, 11:46 AM

#

1 week... 🤗

sterile pendant Jun 6, 2024, 11:47 AM

#

But up until now, the best one I've used that was the most reliable was llava 1.6 Mistral. Granted, that's for consumer level hardware and easily runs on 8GB VRAM. Mistral was a beast until llama3 came out (again talking about smaller models you can run on a normal PC)

storm saffron Jun 6, 2024, 11:50 AM

#

https://www.ollama.com/library/llava-llama3

llava-llama3

A LLaVA model fine-tuned from Llama 3 Instruct with better scores in several benchmarks.

sterile pendant Jun 6, 2024, 11:59 AM

#

here's an example of a long description version i made a while back. it was part of a two pass system where this would be fed into a prompt generator llm. the system prompt for this is ~700 tokens

viral plaza Jun 6, 2024, 12:00 PM

#

late compass How to get membership or early access?

memberships program is @ https://stability.ai/membership
i don't think there's sd3 early access

viral plaza Jun 6, 2024, 12:01 PM

#

storm saffron Do you know what the system prompt and prompt was that was used in CogVLM? I've ...

i don't know the answer to that sorry

late compass Jun 6, 2024, 12:01 PM

#

viral plaza memberships program is @ <https://stability.ai/membership> i don't think there's...

i have some matter if you can come to dm

#

please

viral plaza Jun 6, 2024, 12:01 PM

#

should use CogVLM2 anyway instead of the original, would be better and prompting it will be different

late compass Jun 6, 2024, 12:01 PM

#

habby

viral plaza Jun 6, 2024, 12:02 PM

#

sterile pendant They are starting to make some llama3 based vlms now, so that should be awesome....

i don't trust llava at all for architecture reasons -- it just has images fed into the context token space of the LLM, vs cogvlm has active transformer layers dedicated to the image

sterile pendant Jun 6, 2024, 12:03 PM

#

viral plaza i don't trust llava at all for architecture reasons -- it just has images fed in...

have you tried llava 1.6 mistral? it's pretty decent

#

(for its size i should add, being able to run on an 8GB card and all)

#

obviously, for larger scope and better hardware, i'm sure there are better alternatives

sterile pendant Jun 6, 2024, 12:29 PM

#

storm saffron https://www.ollama.com/library/llava-llama3

i'm playing around with a different version called llama3 llava next 7b right now and it's pretty good so far. the next versions of llava are based on 1.6. i think the one you linked is based on 1.5

teal fossil Jun 6, 2024, 12:33 PM

#

cedar gale Or use taggerui and discourage from caption:

That doesn't work as intended most of the time.

teal fossil Jun 6, 2024, 12:34 PM

#

viral plaza i don't trust llava at all for architecture reasons -- it just has images fed in...

In my tests CogVLM always performed better than Llava1.6.

#

Btw @viral plaza is there a time on the 12th we should look forward to?
A Midnight release? Late evening in the West US?

cobalt moon Jun 6, 2024, 12:48 PM

#

you can say midnight release or 12pm release in UK

#

... or just random drop

viral plaza Jun 6, 2024, 12:50 PM

#

teal fossil Btw @viral plaza is there a time on the 12th we should look forward to?...

Thaaat's a good question, will ask internally

radiant hound Jun 6, 2024, 1:17 PM

#

teal fossil I got CogVLM2 running yesterday and my first prompt got responses that copied *t...

What image captioner did you get cogvlm2 to work with?

teal fossil Jun 6, 2024, 1:19 PM

#

radiant hound What image captioner did you get cogvlm2 to work with?

I went on a lengthy adventure yesterday (with a lot of help from Llama-3 70B & an OT community member) to get the repo of TagGui running on WSL. I'm very very happy that it works now. The next time a repo asks for Triton or bust - I know what to do. ⚡

noble coyote Jun 6, 2024, 1:23 PM

#

Omost asks for Triton - and when you install it (the only version available!) - it tells you it is the wrong one?!

lucid swift Jun 6, 2024, 1:33 PM

#

RGB gaming sausage

#

#

noble coyote Jun 6, 2024, 1:37 PM

#

On another note entirely - I've been trying Stability Audio (via Pinokio) - and it's quite some fun!!!

lucid swift Jun 6, 2024, 1:38 PM

#

noble coyote On another note entirely - I've been trying Stability Audio (via Pinokio) - and...

the local mode?

#

i wish i would know how to fine tune it xD

noble coyote Jun 6, 2024, 1:40 PM

#

Pinokio puts it all into a local VENV yes

lucid swift Jun 6, 2024, 1:40 PM

#

noble coyote Pinokio puts it all into a local VENV yes

did you know that the model got leaked like a week before it released

cobalt moon Jun 6, 2024, 1:40 PM

#

noble coyote Pinokio puts it all into a local VENV yes

how much GPU does it need?

lucid swift Jun 6, 2024, 1:40 PM

#

cobalt moon how much GPU does it need?

i think 7gb vram

cobalt moon Jun 6, 2024, 1:41 PM

#

that ComfyUI node said 8GB VRAM

noble coyote Jun 6, 2024, 1:41 PM

#

Here is my first groove ... an arpeggio of a C9 chord https://drive.google.com/file/d/1lG2fArbCe2AauplHH2BhEAsUaKF3mKih/view?usp=drive_link

cobalt moon Jun 6, 2024, 1:41 PM

#

too

noble coyote Jun 6, 2024, 1:41 PM

#

Pinokio uses Gradio

lucid swift Jun 6, 2024, 1:41 PM

#

noble coyote Here is my first groove ... an arpeggio of a C9 chord https://drive.google.com/f...

i don't have access, just post the audio in discord

noble coyote Jun 6, 2024, 1:41 PM

#

I have an 8GB RTX 2070 and it works OK, about 45 seconds (100 iterations) to get a 47 second mp3 file

#

lucid swift Jun 6, 2024, 1:42 PM

#

sd3 "Spectrogram"

noble coyote Jun 6, 2024, 1:43 PM

#

cobalt moon Jun 6, 2024, 1:44 PM

#

that doesn't look like a spectrogram though

#

but close enough

lucid swift Jun 6, 2024, 1:45 PM

#

RGB gaming catheter

noble coyote Jun 6, 2024, 1:46 PM

#

lucid swift RGB gaming catheter

Is my audio interesting enough for you to continue your interest in Stability Audio at all?! 😄

lucid swift Jun 6, 2024, 1:47 PM

#

noble coyote Is my audio interesting enough for you to continue your interest in Stability Au...

yes i want to finetune the model but i don't know how

#

i already installed it

low stone Jun 6, 2024, 1:50 PM

#

noble coyote Jun 6, 2024, 1:53 PM

#

low stone

"Comms Room Ruby Portal!!!"

lucid swift Jun 6, 2024, 2:18 PM

#

#

#

#

dull star Jun 6, 2024, 2:45 PM

#

I'm sorry if this has been asked or answered already, but will we have to truncate prompts ourselves if we make our own Loras or Finetunes?

patent acorn Jun 6, 2024, 2:52 PM

#

i hope sd3 is able to make this

Screenshot_2024-06-06-20-35-54-180_com.mobile.legends.jpg

low stone Jun 6, 2024, 2:54 PM

#

patent acorn i hope sd3 is able to make this

#

No

patent acorn Jun 6, 2024, 2:54 PM

#

awww that sucks

low stone Jun 6, 2024, 2:58 PM

#

#

Sigh. Still no.

#

Maybe 2b on the 12th will be better

dull star Jun 6, 2024, 2:59 PM

#

with 2B we can cherry pick easier too

#

can't wait to mess around with text encoders

low stone Jun 6, 2024, 3:01 PM

#

#

This is just one failed attempt after another

dull star Jun 6, 2024, 3:03 PM

#

poor calf is out of focus

low stone Jun 6, 2024, 3:04 PM

#

Yeah when it's local I can just tell it to render 30 of them, with tons of various llm variations of it. Not gonna do that against a paid api.

dull star Jun 6, 2024, 3:04 PM

#

fr

#

low stone Jun 6, 2024, 3:04 PM

#

Hah

#

Can you get it with the cat snuggling the oxen though and taking a pic of both of them

dull star Jun 6, 2024, 3:05 PM

#

damn, they don't want to cooperate

dull star Jun 6, 2024, 3:06 PM

#

low stone Can you get it with the cat snuggling the oxen though and taking a pic of both o...

huh?

low stone Jun 6, 2024, 3:06 PM

#

dull star Jun 6, 2024, 3:06 PM

#

is this ideogram?

low stone Jun 6, 2024, 3:06 PM

#

oxen is sitting next to a tabbie cat who is taking a selfie of both of them. Background is a farm.

#

Dalle

dull star Jun 6, 2024, 3:07 PM

#

oh

#

looks pretty good

#

we could overfit Loras to make something like this

#

We'll see how 2B looks like

#

or fully trained 8B

#

Festivalman do you think that skipping 4B is a good idea?

#

It makes sense to me, you will have 2B, which is what the majority of people will be able to run, 800M for low end users, then 8B for enthusiasts or small businesses if the Enterprise membership isn't too expensive

#

4B might stick out like a sore thumb and would delay the other models

low stone Jun 6, 2024, 3:15 PM

#

tabbie cat that is snuggling with an oxen while taking a selfie of both of them. Background is a farm.

#

Dalle/sd3

#

I think sd3 did a good job, it's technically the selfie pic itself

low stone Jun 6, 2024, 3:19 PM

#

dull star Festivalman do you think that skipping 4B is a good idea?

Every size of sd3 they release will segment the user base that much more. I'm on the side of releasing as few as possible.

cunning lintel Jun 6, 2024, 3:20 PM

#

if anyone wants to challenge sd3, try something like this (Dalle) (saw the pic on reddit a long time ago, sadly sd3 really doesn't shine with abstract concepts like "mustache from her hair" 😢 )

#

thinking about it, that image is extra evil as it also has hands 😂 (this is sd3, closest i got, but that mustache, is so much more real :p)

In_a_sun-drenched_Japanese_classroom_cherry-blossom-haired_Sakura_creates_a_makeshift_.webp

low stone Jun 6, 2024, 3:24 PM

#

#

Yeah I'm getting the same thing

#

Did a good job with "anime" though. It really looks good

#

Sharper than deepbluev4 that I've been using I think

wild remnant Jun 6, 2024, 3:25 PM

#

lucid swift Jun 6, 2024, 3:41 PM

#

patent acorn Jun 6, 2024, 3:42 PM

#

low stone tabbie cat that is snuggling with an oxen while taking a selfie of both of them....

ok the sd3 is well done

dull star Jun 6, 2024, 3:42 PM

#

is this supposed to be mewtwo?

desert garnet Jun 6, 2024, 3:45 PM

#

nah thats mewfour 💀

sterile pendant Jun 6, 2024, 3:51 PM

#

low stone Every size of sd3 they release will segment the user base that much more. I'm on...

That's fine though. I think people will see that fine tunes aren't going to be as needed and that loras will be more of the go-to. Eventually someone will make an xadapter to share the loras between editions.

low stone Jun 6, 2024, 3:56 PM

#

sterile pendant That's fine though. I think people will see that fine tunes aren't going to be a...

Yeah but you can't share Lora against different base models. These different size models will literally have different training against different images.

coral sable Jun 6, 2024, 3:59 PM

#

octopussy SD3

#

Often feels like SD 1.5, lots of trial and error. But when the high roll hits it's really good. 6 more days sadcat watwow

dull star Jun 6, 2024, 4:03 PM

#

yeah 2B will be probably more consistent, especially with Finetunes

teal fossil Jun 6, 2024, 4:05 PM

#

Hmmm... currently curating all the CogVLM2 captions... which are nice, but still utterly filled with "suggestions", "seems to be", "possibly", "appears to be" and other nonsense...
--> and now I'm wondering if removing them is actually stupid, cause those might still be in the SD3 training captions as well.

dull star Jun 6, 2024, 4:07 PM

#

teal fossil Hmmm... currently curating all the CogVLM2 captions... which are nice, but still...

if those appear at the end like I remember them being, then those are probably truncated anyway...

#

or they had a good system prompt for it anyway

#

I do hate those assumption parts of the captions though
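
One way to speed up this kind of curation is to flag captions that contain the hedging phrases instead of reading every caption in full. A minimal sketch: the phrase list comes from the examples quoted above and would need extending, and `find_hedges` is a hypothetical helper that only reports matches and rewrites nothing.

```python
import re
from typing import List

HEDGES = ["seems to be", "appears to be", "possibly"]
_HEDGE_RE = re.compile("|".join(re.escape(h) for h in HEDGES), re.IGNORECASE)

def find_hedges(caption: str) -> List[str]:
    """Return the hedging phrases found in a caption, in order of appearance."""
    return [m.group(0).lower() for m in _HEDGE_RE.finditer(caption)]

captions = [
    "A man in a blue coat walks a dog.",
    "The man appears to be smiling, possibly at a friend.",
]
for c in captions:
    print(find_hedges(c), "|", c)
```

Captions with an empty result can be skipped, which leaves only the flagged ones for manual review.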

coral sable Jun 6, 2024, 4:08 PM

#

dull star yeah 2B will be probably more consistent, especially with Finetunes

coping for the best Copege

dull star Jun 6, 2024, 4:09 PM

#

8B is heavily undertrained

#

but it is coping for sure

#

we can't expect images to suddenly look 100x better

coral sable Jun 6, 2024, 4:10 PM

#

i'd sacrifice consistency for variety anyday. Variety is amazing on SD3

dull star Jun 6, 2024, 4:10 PM

#

Oh I mean like anatomical consistency or whatever

#

it will have variety

#

they talked about the model overfitting, causing differing seeds to become useless

#

so I think we're going to be okay

#

maybe the Turbo model(s) will have that though

#

and finetunes

coral sable Jun 6, 2024, 4:11 PM

#

it can have both, I'm just sayin what is more important to me. Had lots of fun with the API already

dull star Jun 6, 2024, 4:11 PM

#

same

#

#

8B is really good at paintings, so I hope 2B Base will be good at it too

#

same dataset after all, no?

#

should be similar enough

low stone Jun 6, 2024, 4:13 PM

#

Playground 2.5 is an example of almost no variation across seeds.

dull star Jun 6, 2024, 4:13 PM

#

btw @coral sable idk if you have seen this, but this is a TRULY raw image from 2B, no upscaling (From Lykon)

#

this is how the face looks upclose

#

remember, no highresfix or other tricks

coral sable Jun 6, 2024, 4:15 PM

#

dull star should be similar enough

bigger dataset, same tech. Wouldn't call that the same dataset

dull star Jun 6, 2024, 4:15 PM

#

catlurk I see

coral sable Jun 6, 2024, 4:15 PM

#

dull star btw @coral sable idk if you have seen this, but this is a TRULY raw ima...

yea saw that, really good for a base model

dull star Jun 6, 2024, 4:16 PM

#

I wonder if the 16 channel VAE is just that much better for smaller details such as eyes

#

what's unfortunate, is that like with Pixart, highresfix has an issue

#

higher resolutions will create very very noisy artifacts

#

I suppose by the time we get it, they'll fix the pos embed code, or the community will within a week

#

if Tiled upscaling is okay, then I'll just do that then for the time being

coral sable Jun 6, 2024, 4:24 PM

#

highresfix/adetailer still needed, details drop significantly on anything that isn't closeup. But that's to be expected as consumer hardware is what SD3 is targeting

dull star Jun 6, 2024, 4:26 PM

#

yeahh

coral sable Jun 6, 2024, 4:26 PM

#

I wonder if there will be a launch event on the 12th like there was with SDXL

dull star Jun 6, 2024, 4:26 PM

#

what was the event again?

coral sable Jun 6, 2024, 4:26 PM

#

I mean just discord call with bunch of people

dull star Jun 6, 2024, 4:26 PM

#

ah

coral sable Jun 6, 2024, 4:27 PM

#

Stability guys presenting new tech

dull star Jun 6, 2024, 4:27 PM

#

oh so like the center stage type of call?

coral sable Jun 6, 2024, 4:27 PM

#

yea

dull star Jun 6, 2024, 4:27 PM

#

I can't wait to remake Tekken intros as Live Action movie stills

#

I wanna see how good img2img is with such an intelligent AI

#

oh heck I could have tried that with pixart already lmao

#

I keep forgetting that it can do img2img too

#

you guys have any plans on what to use SD3 for that you couldn't or wouldn't bother to make with previous models (too much controlnet or regional prompting involved)?

hallow talon Jun 6, 2024, 4:34 PM

#

dull star you guys have any plans on what to use SD3 for that you couldn't or wouldn't bot...

I'm hoping it's a lot easier to do images with multiple people in it without their features (hairstyle clothes etc) blending together with the other person's.

trail frost Jun 6, 2024, 4:42 PM

#

Is there any new update on pixart models

dull star Jun 6, 2024, 4:47 PM

#

the 2K model is the biggest model released so far

#

there's lora training code

#

its in diffusers

#

nothing much to be honest that I know of that is "new"

viral plaza Jun 6, 2024, 4:47 PM

#

dull star I wonder if the 16 channel VAE is just that much better for smaller details such...

yeah the new VAE is majorly beneficial for small details

dull star Jun 6, 2024, 4:48 PM

#

nice

viral plaza Jun 6, 2024, 4:57 PM

#

teal fossil Btw @viral plaza is there a time on the 12th we should look forward to?...

I'm told there is not a specific time we can give in advance, the actual release might be a process over a few hours (model weights, code, etc.) rather than necessarily all go at once

viral plaza Jun 6, 2024, 4:57 PM

#

coral sable I wonder if there will be a launch event on the 12th like there was with SDXL

no event planned

teal fossil Jun 6, 2024, 4:59 PM

#

viral plaza I'm told there is not a specific time we can give in advance, the actual release...

Thanks for asking. Usual routine then (while I'm probably all giddy). 😄

sterile pendant Jun 6, 2024, 5:02 PM

#

low stone Yeah but you can't share Lora against different base models. These different siz...

Which is why I brought up: https://showlab.github.io/X-Adapter/
I'm sure someone will make something similar for the various sd3 models. Who knows, maybe by the end of the year you'll even see one that allows you to use sdxl loras with sd3.

X-Adapter

Adding Universal Compatibility of Plugins for Upgraded Diffusion Model

#

They won't be perfect 1:1s though, but should be close enough if it's done right. Maybe like 90% accurate

dull star Jun 6, 2024, 5:16 PM

#

viral plaza I'm told there is not a specific time we can give in advance, the actual release...

I wonder if comfy and your UI (stableswarm) releases the update earliest

fair spruce Jun 6, 2024, 5:41 PM

#

dull star Jun 6, 2024, 5:43 PM

#

nice

dull star Jun 6, 2024, 6:36 PM

#

yes nvidia I will keep buying your 24GB high end cards for 3 gagillion dollars sadcat

raven fern Jun 6, 2024, 8:42 PM

#

cedar gale

🤣

turbid grotto Jun 6, 2024, 9:42 PM

#

glif-stablediffusion-3-cds899-sogti4n0larlfkke3qj4qhiv.jpg

glif-stablediffusion-3-cds899-qftd2liuy9b68gykrw67p7jo.jpg

#

glif-stablediffusion-3-cds899-xhhwhqbasz3oi2yn6tf40eip.jpg

low stone Jun 6, 2024, 9:52 PM

#

turbid grotto Jun 6, 2024, 9:56 PM

#

low stone

Is this refined?

low stone Jun 6, 2024, 9:56 PM

#

yeah with 0.3 denoise

turbid grotto Jun 6, 2024, 9:57 PM

#

low stone yeah with 0.3 denoise

I like it

#

glif-stablediffusion-3-cds899-lc1o2qj93sht2d92h00sz8bd.jpg

low stone Jun 6, 2024, 10:02 PM

#

image1-Ultra-a-mentally-insane-homeless-man-on-a-new-york-city-street-corner-is-holding-a-sign-that-says-I-got-the-2B-early.png

raven fern Jun 6, 2024, 10:42 PM

#

lol

dim sinew Jun 6, 2024, 10:50 PM

#

turbid grotto

lol - I have an 8gig 3070ti, it’s just never enough sometimes. (Not doing ai just yet)

dull star Jun 6, 2024, 11:01 PM

#

T5 bf16 on CPU with SD3 2B fp16 will be good for you 🙏

raven fern Jun 6, 2024, 11:01 PM

#

soontm happemad

dull star Jun 6, 2024, 11:02 PM

#

it takes about 10-15 seconds for T5 to process the prompt, but once that's done and you don't change the prompt
you will be able to change anything else (CFG, Sampler, resolution, etc) and get instantly to generating the picture without having to re-encode the prompt
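
The behaviour described here is a cache keyed on the prompt text. A minimal sketch, where `fake_t5_encode` is a stand-in for the slow T5 call and `PromptEmbeddingCache` is a hypothetical class; how any particular UI implements this is not shown in the thread.

```python
from typing import Callable, Dict, List

class PromptEmbeddingCache:
    """Run the (slow) text encoder once per distinct prompt."""

    def __init__(self, encode: Callable[[str], List[float]]):
        self._encode = encode
        self._cache: Dict[str, List[float]] = {}
        self.encoder_calls = 0

    def get(self, prompt: str) -> List[float]:
        if prompt not in self._cache:
            self.encoder_calls += 1  # only here is the slow path taken
            self._cache[prompt] = self._encode(prompt)
        return self._cache[prompt]

def fake_t5_encode(prompt: str) -> List[float]:
    # Placeholder for the real encoder: one number per word.
    return [float(len(word)) for word in prompt.split()]

cache = PromptEmbeddingCache(fake_t5_encode)
for cfg in (4.0, 5.5, 7.0):  # changing CFG does not touch the prompt
    embedding = cache.get("a lighthouse at dusk")
print(cache.encoder_calls)  # prints 1
```

Only a changed prompt takes the slow path again. Sampler, CFG, and resolution changes reuse the stored embedding.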

#

so the future is bright

raven fern Jun 6, 2024, 11:02 PM

#

the future is now, thanks to science

dull star Jun 6, 2024, 11:03 PM

#

or just don't use T5, we don't know how much worse it's going to be without T5

raven fern Jun 6, 2024, 11:03 PM

#

according to the research paper, the difference is not that huge

dull star Jun 6, 2024, 11:03 PM

#

yeah but they also claimed stuff like DALLE3 level prompt adherence and stuff so... 😬

raven fern Jun 6, 2024, 11:03 PM

#

yea i guess will see :3

dull star Jun 6, 2024, 11:03 PM

#

but I do believe that images without Text might be decent without T5

#

if it's as smart as Pixart-Sigma without T5 then it'll be good

#

cause it's still a massive improvement over SDXL

dim sinew Jun 6, 2024, 11:17 PM

#

dull star T5 bf16 on CPU with SD3 2B fp16 will be good for you 🙏

Can you break this down for a noob?

low stone Jun 6, 2024, 11:18 PM

#

dull star if its as smart as Pixart-Sigma without T5 then it'll be good

pixart has a 300 token context window. sd3 is limited by clip length

dull star Jun 6, 2024, 11:18 PM

#

low stone pixart has a 300 token context window. sd3 is limited by clip length

yeah its funny

raven fern Jun 6, 2024, 11:18 PM

#

its kinda already broken down :3 long story short, you dont have to run everything on gpu

low stone Jun 6, 2024, 11:18 PM

#

it's really the benefit of the t5 stuff over clip

dull star Jun 6, 2024, 11:18 PM

#

dim sinew Can you break this down for a noob?

SD3 2B, which is the model that will get released on June 12th, will run just fine on 8GB of VRAM

#

you can use SD3 2B with T5 (a text generation model, which can also be used as a text encoder for image generators), which increases its text capabilities (and possibly its accuracy to the prompt)

#

but it's expensive, it takes up a lot of VRAM
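
Back-of-envelope numbers for why that matters on an 8GB card. This is only weights (parameter count times bytes per parameter); activations, the VAE and CLIP are ignored. The 4.7B figure for the T5 encoder is an assumption for illustration, not something stated in the chat.

```python
BYTES_PER_PARAM = {"fp32": 4, "fp16": 2, "bf16": 2, "int8": 1}


def weight_gib(params_billions: float, dtype: str) -> float:
    """Memory needed just to hold the weights, in GiB."""
    return params_billions * 1e9 * BYTES_PER_PARAM[dtype] / 2**30


VRAM_GIB = 8.0
sd3_gib = weight_gib(2.0, "fp16")   # SD3 2B in fp16
t5_gib = weight_gib(4.7, "bf16")    # assumed T5 encoder size, bf16

print(f"SD3 2B fp16: {sd3_gib:.2f} GiB")
print(f"T5 bf16:     {t5_gib:.2f} GiB")
print("SD3 alone fits in VRAM:", sd3_gib < VRAM_GIB)
print("SD3 + T5 both fit in VRAM:", sd3_gib + t5_gib < VRAM_GIB)
```

Under those assumptions the diffusion model fits comfortably on the GPU, while adding T5 next to it does not, which is where the "T5 on CPU" suggestion comes from.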

dim sinew Jun 6, 2024, 11:19 PM

#

I’m a hobby photographer, and I am just stepping into wanting to create AI generated landscapes for my fantasy world I’m starting to write.

dull star Jun 6, 2024, 11:19 PM

#

ah

#

If you are looking for mostly portraits of people or subjects, or simple landscapes, current models are fine for that.

pseudo stone Jun 6, 2024, 11:20 PM

#

will we ever get sd3 large

dull star Jun 6, 2024, 11:21 PM

#

if you want to make detailed scenes with meaning, then pixart and SD3 2B will be the perfect fit

dull star Jun 6, 2024, 11:21 PM

#

pseudo stone will we ever get sd3 large

yes

#

we will get all models

pseudo stone Jun 6, 2024, 11:21 PM

#

when tho

dull star Jun 6, 2024, 11:21 PM

#

800M, 2B, 8B

but for now, 2B is the most optimally trained one

pseudo stone Jun 6, 2024, 11:21 PM

#

like a month after medium

dull star Jun 6, 2024, 11:21 PM

#

if not two

#

8B is a VERY large model

pseudo stone Jun 6, 2024, 11:21 PM

#

I hope

#

so

dim sinew Jun 6, 2024, 11:21 PM

#

dull star If you are looking for mostly portraits of people or subjects, or simple landsca...

No no… like deep sci-fi/fantasy stuff. Really describe a scene and generate.

raven fern Jun 6, 2024, 11:21 PM

#

2B with all the toys will keep you occupied enough until the other models :3

dull star Jun 6, 2024, 11:22 PM

#

dim sinew No no… like deep sci-fi/fantasy stuff. Really describe a scene and generate.

ah yeah, then SD3 2B or Pixart will let you describe your scenes in detail and give you the best result you can achieve offline.

low stone Jun 6, 2024, 11:22 PM

#

image0-Ultra-In-a-vibrant-neon-lit-cyberpunk-inspired-sci-fi-illustration-a-motley-crew-of-grotesque-alien-monsters-with-razor-.png

#

sd3/pixart/hunyuan

dull star Jun 6, 2024, 11:23 PM

#

Pixart is completely free, SD3 2B needs licensing for commercial use (we are awaiting more info about this, but if you are just making stuff for your own enjoyment, i.e. personal use, then this does not affect you).

raven fern Jun 6, 2024, 11:23 PM

#

4D chess 😮

low stone Jun 6, 2024, 11:23 PM

#

dim sinew No no… like deep sci-fi/fantasy stuff. Really describe a scene and generate.

can you describe an example scene?

dim sinew Jun 6, 2024, 11:25 PM

#

dull star Pixart is completely free, SD3 2B needs licensing for commercial use (we are awa...

Well, probably eventually will need commercial usage.

dim sinew Jun 6, 2024, 11:25 PM

#

low stone can you describe an example scene?

I’m honestly creatively brain dead atm. Been a long day

low stone Jun 6, 2024, 11:27 PM

#

dim sinew I’m honestly creatively brain dead atm. Been a long day

#

sounds good, here ya go

dim sinew Jun 6, 2024, 11:29 PM

#

low stone

Not sure if that’s a building generating energy, or a weapon that landed and started embedding itself into the moon to destroy it…

dull star Jun 6, 2024, 11:30 PM

#

I like SD3 for paintings too

low stone Jun 6, 2024, 11:30 PM

#

dim sinew It sure if that’s a building generating energy. Or a weapon that landed and sta...

Here is the image that you requested:

dim sinew Jun 6, 2024, 11:30 PM

#

I’ve also toyed with training a model for a specific side project idea. But I have no idea whether you start with full compositions or with individual objects by name.

dim sinew Jun 6, 2024, 11:31 PM

#

dull star I like SD3 for paintings too

I love this!!!

dull star Jun 6, 2024, 11:31 PM

#

thanks

low stone Jun 6, 2024, 11:31 PM

#

dim sinew I love this!!!

dull star Jun 6, 2024, 11:31 PM

#

if all else fails, I'll still gladly use SD3 for paintings

low stone Jun 6, 2024, 11:31 PM

#

basically just get a chat going with llama3, and keep changing elements until it does what you want.

dull star Jun 6, 2024, 11:31 PM

#

low stone Jun 6, 2024, 11:32 PM

#

dim sinew I love this!!!

dim sinew Jun 6, 2024, 11:32 PM

#

low stone basically just get a chat going with llama3, and keep changing elements until it...

I don’t even know what this is!

dull star Jun 6, 2024, 11:32 PM

#

its a text generator like chatgpt

#

or rather, an AI chatbot

#

you can ask for suggestions for prompts and etc

low stone Jun 6, 2024, 11:33 PM

#

yeah, it's llama3 language model, running locally on ollama, which has the open-webui front end so it looks like chatgpt.
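
For anyone who wants to script it instead of using the chat UI, here is a minimal sketch of asking a local Ollama server to expand a short idea into an image prompt. It assumes Ollama is running on its default local port with the `llama3` model already pulled; the instruction wording and the helper names are made up for illustration.

```python
import json
import urllib.request

# Default address of a locally running Ollama server (assumed setup).
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_request(idea: str, model: str = "llama3") -> dict:
    """Build the JSON body for a single, non-streaming generation."""
    return {
        "model": model,
        "prompt": "Rewrite this idea as one detailed image prompt: " + idea,
        "stream": False,
    }


def expand_prompt(idea: str) -> str:
    """Send the request and return the model's text (needs Ollama running)."""
    body = json.dumps(build_request(idea)).encode("utf-8")
    request = urllib.request.Request(
        OLLAMA_URL, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())["response"]


payload = build_request("a strange building on the moon")
print(json.dumps(payload, indent=2))
```

Calling `expand_prompt("a strange building on the moon")` only works with the server up; `build_request` can be inspected offline.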

dull star Jun 6, 2024, 11:33 PM

#

dim sinew Jun 6, 2024, 11:34 PM

#

low stone yeah, it's llama3 language model, running locally on ollama, which has the open-...

I need to know where to go to get all this stuff 🙂

low stone Jun 6, 2024, 11:35 PM

#

https://ollama.com/ https://github.com/open-webui/open-webui

Ollama

Get up and running with large language models.

GitHub

GitHub - open-webui/open-webui: User-friendly WebUI for LLMs (Forme...

User-friendly WebUI for LLMs (Formerly Ollama WebUI) - open-webui/open-webui

solid violet Jun 7, 2024, 12:59 AM

#

low stone https://ollama.com/ https://github.com/open-webui/open-webui

llama generates images now?

#

oh yea but not for commercial use*

low stone Jun 7, 2024, 1:19 AM

#

solid violet llama generates images now?

so using that combination of tools and a comfyui backend, you can have the chat interface generate images based off llama3 responses just by clicking the picture icon on the response.

#

so you can have a chat back and forth, generating images along the way

#

add this, change this.

solid violet Jun 7, 2024, 1:25 AM

#

you’ve just changed my whole world man. thank you this was really insightful. where did you learn all this?

#

self portrait, “allegory of the cave” depicts @solid violet led to the light by @low stone

low stone Jun 7, 2024, 1:28 AM

#

i do this stuff for work and spend way too many hours at home doing it.

low stone Jun 7, 2024, 1:51 AM

#

image1-Ultra-Vivid-neon-outline-of-text-SD3-TooBee-with-glowing-energy-waves-emanating-from-its-engines.jpg

agile hornet Jun 7, 2024, 2:13 AM

#

Will SD3 be usable in Fooocus?

low stone Jun 7, 2024, 2:14 AM

#

eventually

#

man this ultra on the api for sd3 is 8 cents per image now. I've thrown a good amount of money at it, but that's Dall-E money already. I think I'll wait for the 2b at this point

patent acorn Jun 7, 2024, 3:23 AM

#

cedar gale

This is my fav so far

cedar gale Jun 7, 2024, 3:37 AM

#

patent acorn Jun 7, 2024, 3:39 AM

#

where 2 bees

gusty trail Jun 7, 2024, 3:40 AM

#

In the imagination

hallow lion Jun 7, 2024, 3:53 AM

#

is this lama thing worth it?

raven fern Jun 7, 2024, 4:01 AM

#

https://tenor.com/view/2b-gif-2822797880514591387

Tenor

agile hornet Jun 7, 2024, 4:17 AM

#

cedar gale

According to Terrence Howard 1 Bee+1Bee=3 lol

noble coyote Jun 7, 2024, 4:44 AM

#

low stone can you describe an example scene?

There are many competitions on Discord - and there are some very good prompts available - I borrow many from the OpenAI Daily Theme ...

noble coyote Jun 7, 2024, 4:45 AM

#

cedar gale

(One Bee + One Bee) or not (One Bee + One Bee) - that is the question?! 😄

noble coyote Jun 7, 2024, 4:46 AM

#

agile hornet According to Terrance Howard 1 Bee+1Bee=3 lol

Eric, the half-a-bee?

noble coyote Jun 7, 2024, 4:59 AM

#

low stone

Using Llama 2 Chat 7B Q4 - this is its result - Against a dark, starry backdrop, a sleek, metallic structure rises from the lunar surface, its gleaming facade reflecting the pale light of a distant sun. The building's fluid lines blur the distinction between form and function, as if it's alive, pulsing with an otherworldly energy. As the moon's gravity pulls at its base, the structure's edges begin to distort, revealing a glowing core that seems to be the source of the energy.

noble coyote Jun 7, 2024, 5:19 AM

#

#

#

... no glowing core - sadly!

#

sterile heath Jun 7, 2024, 7:05 AM

#

I mean it depends how it’s implemented. You don’t even need to run both on the same device or at the same time. You could generate an embedding, save it, load the next model and pass the embedding to it if you wanted to be silly. You can also just prune, quantize etc.
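
A toy version of the "generate an embedding, save it, load the next model" flow. The two stages never need to be in memory at the same time. The "embedding" here is a placeholder list of numbers and both function names are hypothetical; a real pipeline would save the actual encoder tensor instead.

```python
import json
import os
import tempfile


def encode_and_save(prompt: str, path: str) -> None:
    """Stage 1: run the text encoder alone and write its output to disk."""
    # Placeholder "embedding": one number per word. A real setup would call
    # the T5 encoder here and then unload it to free memory.
    embedding = [float(len(word)) for word in prompt.split()]
    with open(path, "w", encoding="utf-8") as f:
        json.dump({"prompt": prompt, "embedding": embedding}, f)


def load_for_generation(path: str) -> list:
    """Stage 2: with the encoder gone, read the saved embedding back."""
    with open(path, "r", encoding="utf-8") as f:
        return json.load(f)["embedding"]


with tempfile.TemporaryDirectory() as tmp:
    saved_path = os.path.join(tmp, "prompt_embedding.json")
    encode_and_save("a glowing core on the moon", saved_path)
    restored = load_for_generation(saved_path)

print(restored)
```

The same split works across devices: stage 1 could run on CPU (or another machine) and stage 2 on the GPU.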

teal fossil Jun 7, 2024, 7:22 AM

#

https://tenor.com/view/dune-stilgar-lisanalgaib-fremen-lisan-al-gaib-gif-13633185611076219756

Tenor

#

Me next Wednesday if SD3M turns out to be what is promised.

#

Every time I work on a Dataset or train a LoRA Atm... I can't wait to play around with that sweet Sweet 16-channel Vae.

cedar gale Jun 7, 2024, 7:45 AM

#

cedar gale Jun 7, 2024, 8:12 AM

#

hallow lion Jun 7, 2024, 8:25 AM

#

😄

muted dove Jun 7, 2024, 8:32 AM

#

cedar gale

+tax = $3500

cedar gale Jun 7, 2024, 8:33 AM

#

😛

#

#

The RTXX 4099 costs the same tho.

radiant ledge Jun 7, 2024, 8:35 AM

#

this one even has two fans

cedar gale Jun 7, 2024, 8:46 AM

#

#

modern rover Jun 7, 2024, 8:58 AM

#

how to create a image

#

who can tell me ?

hoary pilot Jun 7, 2024, 8:59 AM

#

M

muted dove Jun 7, 2024, 9:00 AM

#

modern rover who can tell me ?

#artisan-faq

cedar gale Jun 7, 2024, 9:06 AM

#

#

tardy crag Jun 7, 2024, 9:31 AM

#

Will the model released on June 12th have the possibility to input an image or will be just text to image?

weary crystal Jun 7, 2024, 9:56 AM

#

low stone

Quick question: did you use the SD3 API or a locally running ComfyUI with the API exposed?

compact forge Jun 7, 2024, 10:22 AM

#

tardy crag Will the model released on June 12th have the possibility to input an image or w...

You will be able to download and use it for anything you could with sd1.5 or sdxl

cobalt moon Jun 7, 2024, 10:28 AM

#

compact forge You will be able to download and use it for anything you could with sd1.5 or sdx...

are you even answering the question

#

yeah i think festivalman hooked up the SD3 API to Ollama / the open source LLM webui

compact forge Jun 7, 2024, 11:03 AM

#

cobalt moon are you even answering the question

Sorry for answering the question

bitter hearth Jun 7, 2024, 11:14 AM

#

#

#

#

#

last 2 waow

#

wait not last one, this is very cute

dim panther Jun 7, 2024, 1:39 PM

#

A living room with a large window, hardwood floors, and a fireplace. In the center of the room, a stylish couple is examining different sofa options, including a modern gray sectional, a tufted leather chesterfield, and a mid-century inspired loveseat. The room is filled with natural light and the couple appears to be deep in discussion, considering the size, color, and style of each sofa to determine the perfect fit for their space.

lucid swift Jun 7, 2024, 1:42 PM

#

bitter hearth last 2 waow

cool stuff

lucid swift Jun 7, 2024, 1:43 PM

#

cedar gale

yt thumbnail generator xD

cedar gale Jun 7, 2024, 3:11 PM

#

low stone Jun 7, 2024, 3:17 PM

#

weary crystal Short question did you used the sd3 API or a local running comfyui with api expo...

I use both. It all ends up at their api in the same way.

low stone Jun 7, 2024, 3:18 PM

#

cobalt moon yeah i think festivalman hook up the SD3 API onto Ollama / open sourced LLM webu...

Specifically with Ollama/open-webui, that's going to their supported sdxl json which gets sent via api to a local comfy. As of this minute, it doesn't support sd3 api for the image functionality. It only supports dalle / a1111 / comfy, with their prebuilt workflows and then your prompt or the llm response
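
Roughly what "workflow JSON sent via API to a local comfy" looks like. ComfyUI accepts a POST to `/prompt` whose body wraps the workflow graph under a `"prompt"` key. The workflow below is a one-node fragment and the node id `"6"` is an assumption for illustration; it is not open-webui's actual prebuilt workflow.

```python
import copy
import json

# Fragment of an API-format workflow: node id -> class type and inputs.
# Only the text-encode node is shown; a real graph has many more nodes.
WORKFLOW_TEMPLATE = {
    "6": {
        "class_type": "CLIPTextEncode",
        "inputs": {"text": "", "clip": ["4", 1]},
    },
}


def build_comfy_payload(workflow: dict, node_id: str, text: str) -> dict:
    """Copy the template, drop the prompt text into one node, wrap it."""
    filled = copy.deepcopy(workflow)
    filled[node_id]["inputs"]["text"] = text
    return {"prompt": filled}


llm_response = "a sleek metallic structure rising from the lunar surface"
payload = build_comfy_payload(WORKFLOW_TEMPLATE, "6", llm_response)

# This string is what would be POSTed to the local ComfyUI /prompt endpoint.
body = json.dumps(payload)
print(body)
```

The deep copy keeps the template reusable, so each chat response can be dropped into a fresh workflow.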

noble coyote Jun 7, 2024, 3:30 PM

#

I am 'Omosting' ...

cedar gale Jun 7, 2024, 3:46 PM

#

noble coyote I am 'Omosting' ...

Sounds hot.

noble coyote Jun 7, 2024, 3:46 PM

#

A beautiful water nymph, bejewelled

#

Using Omost LLM Setup

cedar gale Jun 7, 2024, 3:46 PM

#

Aye it is pretty dope.

noble coyote Jun 7, 2024, 3:46 PM

#

https://github.com/lllyasviel/Omost

GitHub

GitHub - lllyasviel/Omost: Your image is almost there!

Your image is almost there! Contribute to lllyasviel/Omost development by creating an account on GitHub.

cedar gale Jun 7, 2024, 3:48 PM

#

Tested it a while, I made these:

noble coyote Jun 7, 2024, 3:49 PM

#

Would they have been harder to make without Omost?

cedar gale Jun 7, 2024, 3:49 PM

#

Yes.

#

Things like "on top" and "in front" are too hard otherwise.

#

Also things like bleed between 2 objects.

#

Omost is regional prompting on steroids for lazy people. 😄

noble coyote Jun 7, 2024, 3:51 PM

#

I'll try some 'positional' stuff

outer charm Jun 7, 2024, 4:01 PM

#

So... You can download sd3 now?

runic tusk Jun 7, 2024, 4:05 PM

#

No.

outer charm Jun 7, 2024, 4:05 PM

#

Oof, thanks soul

runic tusk Jun 7, 2024, 4:05 PM

#

Trust me, literally everyone will know when it's available. You won't miss it. It'll be posted everywhere.

low stone Jun 7, 2024, 4:06 PM

#

cedar gale Tested it a while, I made these:

Are these sd3 or omost? Impressive if omost

cedar gale Jun 7, 2024, 4:07 PM

#

These are omost.

cedar gale Jun 7, 2024, 4:24 PM

#

noble coyote Jun 7, 2024, 4:30 PM

#

Omost - here is my 'positional' prompt - a giraffe is wearing a top hat. On the hat sits a green frog, eating a red apple and reading a blue book. A snail sits next to the frog. A butterfly sits on the book

#

Ultimately very, very poor positional take-up.

#

#

Yet the Omost text does specify all of this prompt ... it just doesn't deliver?!?!?!?!?

swift portal Jun 7, 2024, 4:33 PM

#

Hi

cedar gale Jun 7, 2024, 4:33 PM

#

noble coyote Omost - here is my 'positional' prompt - a giraffe is wearing a top hat. On the ...

I think this is too much.

noble coyote Jun 7, 2024, 4:33 PM

#

Keep it simple?

#

As I say, Omost picked up on all this and laid it out neatly section by section ...

woeful spindle Jun 7, 2024, 4:37 PM

#

noble coyote Yet the Omost text does specify all of this prompt ... it just doesn't deliver?!...

I dunno if we'll be able to use sd3 instead of sdxl in omost but if we can, that's gonna be a huge improvement

#

does it say anything about changing the model on the github page?

noble coyote Jun 7, 2024, 4:38 PM

#

If Omost is this disappointing though ...

woeful spindle Jun 7, 2024, 4:38 PM

#

you cant really say "disappointing"

#

cus it's using sdxl

#

we all know how sdxl performs

#

we can see a clear improvement there

noble coyote Jun 7, 2024, 4:40 PM

#

Nerdy Rodent did an Omost Video on YT - he opens the code and drills down to the model name ...

#

Someone said it was RealVis SDXL?

cedar gale Jun 7, 2024, 4:42 PM

#

I think so but you can just swap the SD XL model.

#

You just need to convert it to diffusers and route it via config.

noble coyote Jun 7, 2024, 4:42 PM

#

cedar gale I think so but you can just swap the SD XL model.

Tell me how? 🙂

#

When I'm Omosting, I swap-out Torch 2.0.0-cu118 (Xformers in ComfyUI) for 2.3.0-cu121

#

... and then back again when using ComfyUI

cedar gale Jun 7, 2024, 4:48 PM

#

DM-ed you some stuff to avoid spam here. 🙂

woeful spindle Jun 7, 2024, 4:49 PM

#

cedar gale DM-ed you some stuff to avoid spam here. 🙂

Could u send me too

noble coyote Jun 7, 2024, 4:55 PM

#

Give me a prompt which will work in Omost, svp?

lucid swift Jun 7, 2024, 5:03 PM

#

sd3

dull star Jun 7, 2024, 5:06 PM

#

horse

#

watching a video

#

of a fox eating a hamburjer

#

SD3 is truly the best

#

🙏

brisk cipher Jun 7, 2024, 5:16 PM

#

dull star SD3 is truly the best

Where can I try sd3?

noble coyote Jun 7, 2024, 5:17 PM

#

FABLAN@Glif

trail frost Jun 7, 2024, 5:17 PM

#

Stability api

noble coyote Jun 7, 2024, 5:18 PM

#

FABLAN@Glif is free

#

Free SD3

#

Omost

brisk cipher Jun 7, 2024, 5:19 PM

#

noble coyote FABLAN@Glif

Link

#

Can?

noble coyote Jun 7, 2024, 5:22 PM

#

brisk cipher Link

https://glif.app/@fab1an/glifs/clv4ca3aq0004xklr050jusec

glif

glif - SD3 Photography Preset by fab1an

dull star Jun 7, 2024, 5:35 PM

#

@brisk cipher https://glif.app/@fab1an/glifs/clv55nhjy0003yr19n09wz0np if you want a completely raw one where you can do paintings and other artstyles, and even write negative prompts, I recommend this

glif

glif - StableDiffusion 3 (with negative prompt) by gliffyglif

#

gives you the most control

noble coyote Jun 7, 2024, 5:45 PM

#

Omost - Peppa and Cthulhu

noble coyote Jun 7, 2024, 5:52 PM

#

dull star @brisk cipher https://glif.app/@fab1an/glifs/clv55nhjy0003yr19n09wz0np ...

From SD3@GlifyGlif

glif-stablediffusion-3-with-negative-prompt-torcello-uspbsbpct937kieixzaj9j2q.jpg

glif-stablediffusion-3-with-negative-prompt-torcello-wyim99m1mrwq3enbmxowld7l.jpg

glif-stablediffusion-3-with-negative-prompt-torcello-mji05d61nmi1qrexwnrasdfg.jpg

past flame Jun 7, 2024, 7:01 PM

#

noble coyote Omost

Omost is not remotely SD3

rain current Jun 7, 2024, 7:10 PM

#

Absolutely, at the moment the only thing (in my opinion) that competes with understanding the prompt and with good quality is Ideogram

#

Example

sick cedar Jun 7, 2024, 7:18 PM

#

rain current Example

Is that SD3? 👀

rain current Jun 7, 2024, 7:18 PM

#

ideogram

sick cedar Jun 7, 2024, 7:20 PM

#

rain current ideogram

It's not too far off then. SD3, I mean. We are also comparing an undertrained SD3 model with a fully trained Ideogram. So I'd say that's pretty good.

#

In my opinion, i think SD3 did better with the Horse looking at the TV monitor prompt.
The burger on the screen looks more coherent on SD3.

rain current Jun 7, 2024, 7:22 PM

#

I'm not saying it's better than SD3, especially considering that SD3 will have more training, and with controlnet, etc. It's just that, as of today, it's the only thing I see that's similar and free

sick cedar Jun 7, 2024, 7:23 PM

#

rain current I'm not saying it's better than SD3, especially considering that SD3 will have m...

Oh for sure yh.

rain current Jun 7, 2024, 7:32 PM

#

But I also see that it is more obedient to the prompt "slightly plump". I'm not sure if 2B is capable of being that precise.

rugged tartan Jun 7, 2024, 7:54 PM

#

I am seeing a significant difference between CORE and ULTRA in the API. The first image is coming from CORE, the second is the result of ULTRA, same prompt and same seed. Ultra seems to ignore the details of the prompt. Is anyone here experiencing the same issue? In general everything done with SD3 in the API resembles the quality of SD-1.6

Prompt: Picture of a Anime-like real life woman with intricate face paint baroque style, Japanese facial features, realistic skin texture, photographic, photo-realistic

7b3cd045fed23c1167116f3e6be7c6a2f753b50bf63ec551f0593531bb7d269c.png

0cc983379bf86a6b1603063969d19fd458c3d01ac6be761d49a26584f73ea5cc.png

rain current Jun 7, 2024, 8:00 PM

#

cedar gale Tested it a while, I made these:

(ideogram + sdxl refiner)

viral plaza Jun 7, 2024, 8:03 PM

#

rugged tartan I am seeing a significant difference between CORE and ULTRA in the API. the firs...

if core fits your preferences, use core. Core is generally best stable setup and Ultra is best experimental/beta setup, rn Ultra is using experimental SD3 models and doesn't have the xl heavy finetuning that core has

rugged nova Jun 7, 2024, 8:10 PM

#

cedar gale I think so but you can just swap the SD XL model.

If you don't mind, would you explain how to swap the model for Omost? I would appreciate it.

#

Nvm, someone made a fork that lets you select a model https://github.com/runnitai/Omost/tree/main

GitHub

GitHub - runnitai/Omost: Your image is almost there!

Your image is almost there! Contribute to runnitai/Omost development by creating an account on GitHub.

dreamy sundial Jun 7, 2024, 10:29 PM

#

bitter hearth Jun 7, 2024, 10:51 PM

#

vapid radish Jun 8, 2024, 12:01 AM

#

SD3 VS My SDXL Merge
I know portraits are not going to be SD3's strength until people finetune the art out of it.

#

I cannot wait to upscale image with SD3 locally though.

dull star Jun 8, 2024, 12:08 AM

#

that's pretty good for a base model

#

and it doesn't have that typical finetune look to it

gray flicker Jun 8, 2024, 12:40 AM

#

girl

lucid swift Jun 8, 2024, 12:42 AM

#

rain current Example

what prompts did you use for the hyena

dusky thistle Jun 8, 2024, 1:22 AM

#

gray flicker girl

Here is the image you requested.

bitter hearth Jun 8, 2024, 1:33 AM

#

hallow lion Jun 8, 2024, 2:08 AM

#

make an image of the sd3 model training in the gym

#

coz it needs more training

hallow lion Jun 8, 2024, 2:08 AM

#

bitter hearth

is this what the fellow kids are into these days

cedar gale Jun 8, 2024, 2:09 AM

#

hallow lion Jun 8, 2024, 2:09 AM

#

elephant tho

#

:<

cedar gale Jun 8, 2024, 2:09 AM

#

😦

hallow lion Jun 8, 2024, 2:09 AM

#

text is the new hands

#

diffusionhand

cedar gale Jun 8, 2024, 2:09 AM

#

bitter hearth Jun 8, 2024, 2:10 AM

#

hallow lion coz it needs more training

hallow lion Jun 8, 2024, 2:11 AM

#

lol why a ghost

bitter hearth Jun 8, 2024, 2:11 AM

#

You didn't specify

hallow lion Jun 8, 2024, 2:11 AM

#

🤣

bitter hearth Jun 8, 2024, 2:11 AM

#

runs away

#

cedar gale Jun 8, 2024, 2:15 AM

#

raven fern Jun 8, 2024, 3:30 AM

#

lol

patent acorn Jun 8, 2024, 4:07 AM

#

sd3 request: man crying because there are no food in the fridge

real terrace Jun 8, 2024, 5:00 AM

#

glif-sd3-photography-preset-mariotteboyle-pm0lmliolotn422mw68d7pp5.jpg

hallow lion Jun 8, 2024, 5:00 AM

#

patent acorn sd3 request: man crying because there are no food in the fridge

Asking for a friend? 😄

patent acorn Jun 8, 2024, 5:01 AM

#

not interested

real terrace Jun 8, 2024, 5:01 AM

#

patent acorn sd3 request: man crying because there are no food in the fridge

patent acorn Jun 8, 2024, 5:02 AM

#

2nd is better though he should face the fridge

#

also

#

there are still FOODS!!

#

shouldve been an empty fridge i forgot

real terrace Jun 8, 2024, 5:07 AM

#

patent acorn there are still FOODS!!

touché, that's true, well it is not that smart

real terrace Jun 8, 2024, 5:11 AM

#

cedar gale

glif-sd3-photography-preset-mariotteboyle-xv90j2fsyq9kehastfjzy0fn.jpg

patent acorn Jun 8, 2024, 5:11 AM

#

ough

noble coyote Jun 8, 2024, 7:09 AM

#

past flame Omost is not remotely SD3

Does Omost not have good prompt coherence?

cedar gale Jun 8, 2024, 7:11 AM

#

#

cedar gale Jun 8, 2024, 7:20 AM

#

patent acorn sd3 request: man crying because there are no food in the fridge

#

#

Went for depressed instead of crying. 😄

bitter hearth Jun 8, 2024, 7:30 AM

#

#

sadcat

noble coyote Jun 8, 2024, 7:33 AM

#

Just trying Anytest Controlnet ...

cedar gale Jun 8, 2024, 8:13 AM

#

bitter hearth Jun 8, 2024, 9:02 AM

#

noble coyote Jun 8, 2024, 9:11 AM

#

Originals made using Portrait Master. ComfyUI+Anytest Controlnet - prompt = beautiful leopard, sunny glade, hat, cinematic lighting. [NightVisionXL, dpmpp_2s_ancestral, karras, 30 iterations]

#

Originals made using Portrait Master. ComfyUI+Anytest Controlnet - prompt = beautiful peacock, sunny glade, hat, cinematic lighting. [NightVisionXL, dpmpp_2s_ancestral, karras, 30 iterations]

#

Originals made using Portrait Master. ComfyUI+Anytest Controlnet - prompt = beautiful fox, sunny glade, hat, cinematic lighting. [NightVisionXL, dpmpp_2s_ancestral, karras, 30 iterations]

pseudo vault Jun 8, 2024, 9:25 AM

#

What's the best generator for producing text?

noble coyote Jun 8, 2024, 9:27 AM

#

Ideogram

#

Or try ComfyUI and the Harrlogos2 LoRA

pseudo vault Jun 8, 2024, 9:27 AM

#

Any usable on a phone? (Android) Not near my computer anytime soon lol

noble coyote Jun 8, 2024, 9:28 AM

#

Ideogram may be

#

Or free trial Adobe Firefly

#

Firefly can be fiddly on a phone

hallow lion Jun 8, 2024, 9:44 AM

#

Nothing on the phone works.

pseudo vault Jun 8, 2024, 9:45 AM

#

@noble coyote Thanks pal

noble coyote Jun 8, 2024, 9:45 AM

#

pseudo vault @noble coyote Thanks pal

Just trying on my Android phone ... Ideogram

patent acorn Jun 8, 2024, 9:46 AM

#

cedar gale

nice

noble coyote Jun 8, 2024, 10:04 AM

#

pseudo vault @noble coyote Thanks pal

I logged in to Ideogram (free-to-use) via my Google account, and have generated these 3 images via my Galaxy A9 phone - SD3 will have to be very (very) good to get to this standard of text!

dull star Jun 8, 2024, 10:21 AM

#

It may actually get these correct, it's just that the layout might look too 2D or pasted on. We'll see eventually.

noble coyote Jun 8, 2024, 10:31 AM

#

west haven Jun 8, 2024, 11:19 AM

#

A staff member in the office is wearing yellow clothes with three conspicuous letters: PPT.

zenith talon Jun 8, 2024, 12:17 PM

#

ddf

dim sinew Jun 8, 2024, 12:24 PM

#

noble coyote https://glif.app/@fab1an/glifs/clv4ca3aq0004xklr050jusec

If I gave you a 4031x2687 image of squirrels to re-run with whatever prompt that was, could you convert it?

dim sinew Jun 8, 2024, 12:31 PM

#

low stone so using those combination of tools and a comfyui backend, you can have the chat...

Yeah -- I'm deff looking to eventually use stuff for commercial purposes. Doing some research for a project right now. Need to learn how to do stuff in general, and then start working on building a model for the plans we have.

rain current Jun 8, 2024, 12:42 PM

#

lucid swift what prompts did you use for the hyena

A close-up of a hyena's face. The hyena appears to be in a contemplative or playful pose, with its front paws placed near its chin, as if it's holding its face. The photograph is in black and white, emphasizing the texture of the hyena's fur and the intricate details of its facial features. The background is blurred, putting the focus entirely on the hyena.

low stone Jun 8, 2024, 1:00 PM

#

noble coyote

One of the big benefits of ideogram is their magic prompter. It's crazy good at making great scenes with little input.

dim sinew Jun 8, 2024, 1:02 PM

#

dull star @brisk cipher https://glif.app/@fab1an/glifs/clv55nhjy0003yr19n09wz0np ...

When I saw this I was like ... Oh look, it's a descendant of Frog from Chrono Trigger.

vapid radish Jun 8, 2024, 1:16 PM

#

noble coyote

cedar gale Jun 8, 2024, 1:36 PM

#

rain current A close-up of a hyena's face. The hyena appears to be in a contemplative or play...

trail frost Jun 8, 2024, 1:47 PM

#

Im so excited for sd3 open release

#

Can we use that in our projects like sdxl 1.0?

dull star Jun 8, 2024, 2:21 PM

#

if you use it and not make money, you can use it with no difference at all

#

it doesn't affect people like us

#

its only for the people who want to make money using the model who might have a bit of trouble

#

it'll get sorted out at release.

urban arch Jun 8, 2024, 2:46 PM

#

cedar gale

The Goat needs 6 legs to be accurate.

dreamy sundial Jun 8, 2024, 2:53 PM

#

vapid radish SD3 VS My SDXL Merge I know portraits are not going to be SD3's strengh until pe...

what?? faces are clearly the strength of sd3, there's a ton of variety in the faces, all sd 1.5 and sdxl faces look the same, when finetuned it will be mj6 quality for sure

vapid radish Jun 8, 2024, 2:54 PM

#

dreamy sundial what?? faces are clearly the strength of sd3, there's a ton of variety of faces,...

I have never met anyone with eyes like this...

#

But an upscale with SD should help, it's just too expensive on the API right now.

#

dreamy sundial Jun 8, 2024, 2:56 PM

#

vapid radish I have never met anyone with eyes like this...

this is a super undertrained base model, you're comparing it to an extremely finetuned model with a workflow and highres fix, of course it's not gonna be as detailed

#

while all the people on sdxl look exactly the same

#

result of overtraining

vapid radish Jun 8, 2024, 2:57 PM

#

dreamy sundial this is a super undertrained base model, you're comparing it to an extremly fine...

No, they were both 1024x1024 raw outputs.

dreamy sundial Jun 8, 2024, 2:57 PM

#

and merging

dreamy sundial Jun 8, 2024, 2:57 PM

#

vapid radish No they where both 1024x1024 raw outputs.

but still you chose the worst result that sd3 could have given you

cedar gale Jun 8, 2024, 2:58 PM

#

vapid radish I have never met anyone with eyes like this...

Looks kinda hot tho. 😛

vapid radish Jun 8, 2024, 2:58 PM

#

I'm not saying it is bad I really like it.

dreamy sundial Jun 8, 2024, 2:58 PM

#

i've had far better results with better prompting, gpt4o really helps a lot in most cases

vapid radish Jun 8, 2024, 2:58 PM

#

This is what SDXL looks like after my workflow (6K zoom in), I'm sure SD3 will look great as well.

dreamy sundial Jun 8, 2024, 3:01 PM

#

vapid radish This is what SDXL looks like after my workflow (6K zoom in), I'm sure SD3 will l...

it's crispier higher res yes, but the skin looks like plastic and eyes soulless, like doll, and i'm sure every image is almost the same, no variation, almost all sdxl models look like that

vapid radish Jun 8, 2024, 3:01 PM

#

dreamy sundial Jun 8, 2024, 3:01 PM

#

sd3 face variation is on par with mj6

#

raw SD3 outputs

vapid radish Jun 8, 2024, 3:03 PM

#

dreamy sundial it's crispier higher res yes, but the skin looks like plastic and eyes soulless,...

I can do less smooth skin, a lot of people just prefer that look.

dreamy sundial Jun 8, 2024, 3:04 PM

#

dreamy sundial sd3 face variation is on par with mj6

also the people are looking in different directions, sdxl people always look straight ahead and the eyes are focused on the camera in a soulless way

past flame Jun 8, 2024, 3:11 PM

#

noble coyote Does Omost not have good prompt coherence?

It uses a chatbot to plan out where the given object will be in an image and then uses normal SDXL with some regional prompting extension

#

It has nothing to do with how SD3 architecture works
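
To make the "chatbot plans where things go, then regional prompting" idea concrete, here is a toy layout plan. This is not Omost's actual output format; the `Region` class, the fractional boxes and the sub-prompts are all assumptions for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Region:
    """One planned area of the canvas with its own sub-prompt."""
    prompt: str
    box: Tuple[float, float, float, float]  # left, top, right, bottom (0..1)


def to_pixels(region: Region, width: int, height: int) -> Tuple[int, int, int, int]:
    """Convert a fractional box into pixel coordinates for a given canvas."""
    left, top, right, bottom = region.box
    return (
        round(left * width),
        round(top * height),
        round(right * width),
        round(bottom * height),
    )


# What a planning LLM might hand back for the giraffe prompt (hypothetical).
plan: List[Region] = [
    Region("a giraffe wearing a top hat", (0.125, 0.125, 0.875, 1.0)),
    Region("a green frog sitting on the hat", (0.25, 0.0, 0.75, 0.25)),
]

for region in plan:
    print(region.prompt, "->", to_pixels(region, 1024, 1024))
```

Each box would then be turned into a mask and the matching sub-prompt applied only inside it. The image model underneath is unchanged, which is why the results are still bounded by what SDXL can do.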

noble coyote Jun 8, 2024, 3:14 PM

#

dreamy sundial also the people are looking at different directions, sdxl people always look str...

Often my SDXL 'subjects' give no clue that "there is a camera at all!" Until I use Face Detailer; then every subject is looking straight-to-camera!

sick cedar Jun 8, 2024, 3:33 PM

#

vapid radish

We're also using an undertrained version of SD3 currently too!

dull star Jun 8, 2024, 3:37 PM

#

I can't wait for 2B

#

I really really hope the paintings are good 🙏

#

just 4 days

dreamy sundial Jun 8, 2024, 3:37 PM

#

dull star I can't wait for 2B

me too man, imagine what things companies like leonardo.ai will do

hallow talon Jun 8, 2024, 3:38 PM

#

I wonder how long after the release of 2B that it'll be supported in A1111? I can use Comfy but it's not ideal

dreamy sundial Jun 8, 2024, 3:42 PM

#

#

vapid radish Jun 8, 2024, 3:47 PM

#

dull star I really really hope the paintings are good 🙏

Paintings seem pretty damn good to me.
https://glif.app/@Jib/runs/wy2hokyaknzijufrcujo3l4d

glif

glif - Jib's run of SD3 plain

dull star Jun 8, 2024, 3:57 PM

#

yeah 8B is good

#

I wanna see how good 2B is

civic quail Jun 8, 2024, 4:25 PM

#

what are the requirements for sd3

robust junco Jun 8, 2024, 4:25 PM

#

low stone Sd3/sdxl refined

How about doing SDXL-> SD3 or SD3->SDXL->SD3?

civic quail Jun 8, 2024, 4:26 PM

#

civic quail what are the requirements for sd3

i have 8gb vram 32gb ram

dull star Jun 8, 2024, 4:26 PM

#

robust junco How about doing SDXL-> SD3 or SD3->SDXL->SD3?

refine with SD3?

#

🤨

low stone Jun 8, 2024, 4:29 PM

#

robust junco How about doing SDXL-> SD3 or SD3->SDXL->SD3?

I was fooling around with denoise yesterday. The biggest issue with sdxl is multi subject refining. It just wants to make everyone the same thing. I managed to upscale by having the various refining stages with various denoise levels, but it was super finicky and needed different denoise numbers per image, so in the end it wasn't useful long term. I'm really hoping that refining with sd3 will solve this because it understands multi subject.

#

#

#

For example.

#

Ella is really good at complex multi subject. But it just makes the sdxl upscale's job really hard

dull star Jun 8, 2024, 4:55 PM

#

https://x.com/Lykon4072/status/1799377331003556059

Lykon (@Lykon4072) on X

"Ultra" now on Stable Image Services
#SD3

#

probably posted here already

robust junco Jun 8, 2024, 4:57 PM

#

low stone I was fooling around with denoise yesterday. The biggest issue with sdxl is mult...

yet have you tried the way mentioned? 🙂

low stone Jun 8, 2024, 4:58 PM

#

robust junco yet have you tried the way mentioned? 🙂

That would be 16 cents per image at this point. This new api pricing is too expensive now. I can't see myself using it anymore.

#

I get 1000 images a month from ideogram for $20 and I can img2img with sdxl or sd3 2b when it comes out
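
A quick sketch of the per-image arithmetic behind that comparison, using only the figures quoted above (16 cents per image on the API, $20 for 1000 images on the subscription). The helper name is illustrative, and the API figure is taken at face value from the chat rather than from a price list.

```python
# Per-image cost comparison using the numbers quoted in the chat.
# Assumption: the subscription's 1000 images are all actually used.

def cost_per_image(total_usd: float, n_images: int) -> float:
    """Average cost of one image in USD."""
    return total_usd / n_images

api_cost = 0.16                                   # "16 cents per image"
subscription_cost = cost_per_image(20.0, 1000)    # "$20 for 1000 images"

print(f"subscription: ${subscription_cost:.2f} per image")
print(f"API is roughly {api_cost / subscription_cost:.0f}x the subscription price")
```

On these numbers the subscription works out to about 2 cents per image, so the API route is roughly eight times the price per image.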

robust junco Jun 8, 2024, 5:00 PM

#

low stone That would be 16 cents per image at this point. This new api pricing is too expe...

yes, I see, I just thought you might have done it already

low stone Jun 8, 2024, 5:00 PM

#

Sd3 is dalle price now, but we all know that for most things, it's not as good (the censored api sd3). Obviously local sd3 is a different story.

mortal mesa Jun 8, 2024, 5:06 PM

#

which one is Ultra

left parrot Jun 8, 2024, 5:15 PM

#

civic quail i have 8gb vram 32gb ram

That should be enough to run the 2B model that will be released on wednesday, but not the huge 8B model that's coming eventually. 2B seems to be plenty good enough for most uses!

dull star Jun 8, 2024, 5:15 PM

#

low stone Sd3 is dalle price now, but we all know that for most things, it's not as good (...

yeah considering Ultra is a whole SD3 workflow, it's less acceptable that it's DALLE price

vapid radish Jun 8, 2024, 5:26 PM

#

dreamy sundial Jun 8, 2024, 6:04 PM

#

left parrot That should be enough to run the 2B model that will be released on wednesday, bu...

i hope we could run at least the 4B model on 8gb vram

dull star Jun 8, 2024, 6:06 PM

#

dreamy sundial i hope we could run at least the 4B model on 8gb vram

they might skip 4B

left parrot Jun 8, 2024, 6:07 PM

#

yeah, the 4B doesn't seem like it has that much of a use case, stuck between the top quality 8B and the 2B that's balanced between performance and output quality

#

I suspect most finetunes and lora's will be based on either 2B or 8B

turbid grotto Jun 8, 2024, 6:32 PM

#

low stone Ella is really good at complex multi subject. But it just makes the sdxl upscale...

what about upscaling with ella + kohya deep shrink?

cedar gale Jun 8, 2024, 6:33 PM

#

#

#

low stone Jun 8, 2024, 6:34 PM

#

turbid grotto what about upscaling with ella + kohya deep shrink?

I'm upscaling with ella for that first increase from native 1.5 res up to 1.5x that, but then going sdxl after that. the problem is that sd 1.5 checkpoints aren't very good compared to sdxl which are amazing at this point.

#

so it'll keep multi-subject until high res, but compared to sdxl refinement (when that works), the quality difference is night and day.

turbid grotto Jun 8, 2024, 6:34 PM

#

dull star they might skip 4B

hope there will be a way to quantize 8b :sadcat:, because I feel like 4b would be a perfect option for 12gb cards

turbid grotto Jun 8, 2024, 6:35 PM

#

low stone I'm upscaling with ella for that first increase from native 1.5 res up to 1.5x t...

oh thanks

#

I found sdxl hyper really good for upscaling with only 4-8 steps

low stone Jun 8, 2024, 6:39 PM

#

turbid grotto oh thanks

#

that's what my ella workflow looks like.

#

first 2 on left are ella native, last 2 are sdxl.

#

so as you can see, the sd 1.5 checkpoints just don't have the content in them. they're very lacking in training outside porn.

#

yeah i'm using sdxl hyper now all over the place. it's amazing

raven fern Jun 8, 2024, 6:40 PM

#

vapid radish

does it know Bob Ross style? :3

dreamy sundial Jun 8, 2024, 6:43 PM

#

dull star they might skip 4B

why?

dreamy sundial Jun 8, 2024, 6:44 PM

#

turbid grotto hope there will be a way to quantize 8b :sadcat:, because I...

or 8gb cards too

#

they promised to release all versions

#

2b is too low and 8b too high, i think 4b is what majority of people would use

cedar gale Jun 8, 2024, 6:45 PM

#

dull star Jun 8, 2024, 6:45 PM

#

turbid grotto hope there will be a way to quantize 8b :sadcat:, because I...

Store at FP8, load in FP16, less space used and iirc, less VRAM too maybe
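
A back-of-the-envelope sketch of that idea, assuming the size of the weights is simply parameter count times bytes per parameter (1 byte for FP8, 2 for FP16). The function name and the round 2e9 / 8e9 parameter counts are illustrative assumptions; text encoders, the VAE and activations are not counted, so real VRAM use would be higher.

```python
# Rough weight-size estimate: parameters x bytes per parameter.
# Assumption: "2B" and "8B" are treated as exactly 2e9 and 8e9 parameters.
BYTES_PER_PARAM = {"fp8": 1, "fp16": 2, "fp32": 4}

def weight_size_gb(n_params: float, dtype: str) -> float:
    """Size of the weights alone, in GB (1 GB = 1e9 bytes)."""
    return n_params * BYTES_PER_PARAM[dtype] / 1e9

for name, n_params in (("2B", 2e9), ("8B", 8e9)):
    for dtype in ("fp8", "fp16"):
        print(f"{name} @ {dtype}: {weight_size_gb(n_params, dtype):.0f} GB")
```

Under these assumptions FP8 storage halves the file on disk. If the weights are upcast to FP16 at load time, the in-memory figure is presumably the FP16 one again, so any VRAM saving would depend on keeping them in FP8 on the GPU.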

dull star Jun 8, 2024, 6:45 PM

#

dreamy sundial why?

already too many choices to split the community

vapid radish Jun 8, 2024, 6:46 PM

#

raven fern does it know Bob Ross style? :3

It will give you Boss Ross if you like (take that Dalle.3).

turbid grotto Jun 8, 2024, 6:48 PM

#

low stone first 2 on left are ella native, last 2 are sdxl.

yea sdxl is nice, can't imagine how good upscaling can be with 8b :blush:

mortal mesa Jun 8, 2024, 6:49 PM

#

there is no 4B and community split talk never seems to affect anything

raven fern Jun 8, 2024, 6:49 PM

#

vapid radish It will give you Boss Ross if you like (take that Dalle.3).

i mean sure... :3

turbid grotto Jun 8, 2024, 6:50 PM

#

dull star Store at FP8, load in FP16, less space used and iirc, less VRAM too maybe

hope it's gonna work well

#

however, 2b could be really enough with controlnets and other stuff

turbid grotto Jun 8, 2024, 6:55 PM

#

dreamy sundial or 8gb cards too

maybe with some optimizations? Even sdxl (2.6+something) takes about 8gb on my pc

dreamy sundial Jun 8, 2024, 6:56 PM

#

turbid grotto maybe with some optimizations? Even sdxl (2.6+something) takes about 8gb on my p...

it runs perfectly on my 8gb card, really fast also 12 seconds an image

turbid grotto Jun 8, 2024, 6:56 PM

#

oh, I will run out of ram for 8b :agony:
even loading sdxl peaks roughly at 14gb

turbid grotto Jun 8, 2024, 6:57 PM

#

dreamy sundial it runs perfectly on my 8gb card, really fast also 12 seconds an image

is that the standard model with 20 samples?

dreamy sundial Jun 8, 2024, 6:57 PM

#

i think it was 15-20 steps i dont remember

turbid grotto Jun 8, 2024, 6:58 PM

#

1024px, 20sp, sdxl = 19s on 3060
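
To put the two timings on the same footing, here is a small sketch that converts them to seconds per step. It uses only the numbers from the chat (19 s for 20 steps on the 3060, and 12 s for somewhere between 15 and 20 steps on the 8gb card); the helper name is illustrative, and it assumes the whole time is spent sampling, ignoring model load and VAE decode.

```python
# Convert "seconds per image" into "seconds per sampling step".
# Assumption: total time is dominated by the sampling steps.

def seconds_per_step(total_seconds: float, steps: int) -> float:
    return total_seconds / steps

rtx3060 = seconds_per_step(19, 20)        # 1024px, 20 steps, 19 s
card_8gb_low = seconds_per_step(12, 20)   # 12 s, if it was 20 steps
card_8gb_high = seconds_per_step(12, 15)  # 12 s, if it was 15 steps

print(f"3060: {rtx3060:.2f} s/step")
print(f"8gb card: {card_8gb_low:.2f} to {card_8gb_high:.2f} s/step")
```

The resolution of the 12-second run is not stated in the chat, so the two figures may not be directly comparable.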

dreamy sundial Jun 8, 2024, 6:58 PM

#

not bad

turbid grotto Jun 8, 2024, 6:58 PM

#

dreamy sundial it runs perfectly on my 8gb card, really fast also 12 seconds an image

then it is really nice

#

I didn't expect

dreamy sundial Jun 8, 2024, 6:58 PM

#

turbid grotto then it is really nice

yeah, on comfy tho

low stone Jun 8, 2024, 7:02 PM

#

#

this is gonna be great when we have sd3 at home

cedar gale Jun 8, 2024, 7:36 PM

#

low stone Jun 8, 2024, 7:43 PM

#

#

do not taunt happy fun nordvpn

#

#

censorship dragon getting you down? use nordvpn.

cedar gale Jun 8, 2024, 7:53 PM

#

low stone Jun 8, 2024, 7:55 PM

#

#

cedar gale Jun 8, 2024, 7:57 PM

#

low stone Jun 8, 2024, 8:11 PM

#

#

Release the weights!

cedar gale Jun 8, 2024, 8:14 PM

#

low stone Jun 8, 2024, 8:14 PM

#

compact forge Jun 8, 2024, 9:10 PM

#

is the artisan bot using sd3 release candidate already?

hallow lion Jun 8, 2024, 9:45 PM

#

vapid radish

those hands are transcendental

#

she has 2 rings but ill let that slide

#

i guess that could happen

bitter hearth Jun 8, 2024, 10:13 PM

#

Nothing wrong with 2 rings

#

thomas

#

#

thomas

rich iron Jun 8, 2024, 10:27 PM

#

low stone censorship dragon getting you down? use nordvpn.

can it protect your chicken from Dokken?

vapid radish Jun 8, 2024, 11:23 PM

#

dull star Jun 8, 2024, 11:32 PM

#

very epic

raven fern Jun 9, 2024, 1:56 AM

#

elden ring boss 😮

bitter hearth Jun 9, 2024, 2:49 AM

#

raven fern Jun 9, 2024, 2:50 AM

#

very sharp details

bitter hearth Jun 9, 2024, 2:50 AM

#

raven fern very sharp details

edge of technology

raven fern Jun 9, 2024, 2:51 AM

#

cutting edge

patent acorn Jun 9, 2024, 3:03 AM

#

edging

bitter hearth Jun 9, 2024, 3:11 AM

#

patent acorn Jun 9, 2024, 3:17 AM

#

hello sd3 make me lego ninjago characters

low stone Jun 9, 2024, 3:23 AM

#

#

a tornado of library books, weird supernatural ghosts flying around sliming the library

raven fern Jun 9, 2024, 3:24 AM

#

slimy ghosts

low stone Jun 9, 2024, 3:28 AM

#

scenic shadow Jun 9, 2024, 3:47 AM

#

Has there been any word on if any finetuners got early access?

silver sluice Jun 9, 2024, 3:51 AM

#

does anyone have a ComfyUI workflow that's compatible with SD3? I realize it's not released yet but I'm guessing for those who do have early access there's already a workflow developed for it? I just want to make sure I get the nodes installed and ready for launch day

wild remnant Jun 9, 2024, 4:15 AM

#

hallow talon Jun 9, 2024, 4:48 AM

#

I wonder how long after release of the model that we'll be able to train and finetune using Kohya? I'm definitely excited about the possibilities with SD3 when it comes to finetuning