#images-discussions
1 messages Ā· Page 60 of 1
Lol
Fred's got my back
Now I think I'm gonna go make myself a pizza

...I could not live without pizzas...thank god for Italians.
aight, i'll make more scary folklore
My fave.
The Wendigo - This creature is part of the folklore of the Algonquian people in North America. It's a malevolent spirit that possesses humans and turns them into cannibalistic monsters during harsh winters.
0Shot is going to town writing prompts for The Kuchisake-onna but I can already tell based on the horror folklore that moderation is gonna push back -- woah those markdown titles are huge lol
These Paul Bunyans... š¤£
wow, paul bunyan reimagined
Alright, time for...
Wait!
Paul Bunyan and Bison had a baby
And that baby was born with HGH in it's veins instead of blood.
Interesting made a prompt on a new chat DALL-E and suddenly it told me the prompt couldn't be done because I was referencing Final Fantasy games. I don't think a female version of Achilles in a Village full of Vampires where the Sky is filled with Shooting Stars and an UFO is roaming the sky is from any of those games. At least not to my knowledge.
That's definitely Final Fantasy 5
š²
I wonder what's happening
They're probably working out the bugs with people trying to bypass ways to make ChatGPT produce things that it's not supposed to be producing.
there's something in the works for sure
Could be that. You know they never tell us anything so anything is a guess.
We just have to make an educated guess.
I love guessing though...I once guessed this guy's entire life and I got it right.

That was just a test out of boredom. I should have said I wanted to go back to work in January and not February⦠š”
š§
Today I figured out something I've been researching for years
But now I'm confident it's the answer
But I can't tell you what it is
Oh no š„
hehe, no clue
I'll need to have a minotaur boss on the next dnd game jsut to make a cool token like that xD
I did a cat and mouse game with minotaur. I made labyrinth with dungeonai, created with dalle the minotor. Player have to roam around, hide in spot I made. It was on vttfoundry. Was fun
I should sell all my 5 years of riddle, puzzle, escape game, story, npc 
Boo, I got timed out because the figure I wanted to depict and show in today's theme has a spanish name
Only used her name, so it got flagged as non-english
I just noticed I did a ton of images for today's theme and we got so much time for the day to end...

I feel threatened by an avocado....
If I say that to my colleagues, they will send me to the hospital...
Hey guys is there any way i can use dall e to remix images?
for example if i give a photo of an object, can i ask it to generate a person holding it or something?
It can interpret what is approximately in the image and include a description of the object into the prompt.
you can try with dalle-2 labs there is generate fill https://labs.openai.com/
the editor look something like that ig that waht you looking for
So dall-e 2 on labs.openai.com will generate variations, but it doesn't let you choose a prompt so you can't ask for it to change the image I'm some way.
(and it uses dall-e 2, not 3)
i made the avocado version of the hotline bling meme 
Making the Blue Jellyfish Jelly Krabby Patty from Spongebob:
Is there a way to produce consistent images with DallE like Midjourney can do with some finagling? Would it be possible to automate this process without manually inputting prompts?
For instance, the same actors/characters for the purposes of storyboarding.
ok I'm drained
let's hope next theme is free cookies, muffins for everyone
Btw, I'm just uploading today's theme to this https://discord.com/channels/974519864045756446/1197549427624259624 if you guys have also something that won't make it to the theme, you are welcome to add it
Had so many pictures, impossible to post them all in the daily theme channel.
Beeen working hard on tweaking up my bots to really push out the finest of quality
I think I am getting pretty close.
š hello people, anybody has experience with the two different platforms for DallE use? ChatGPT and Copilot.
I am about to get a paid subscription, I'm trying to decide which one could be better for me.
Copilot is free in Bing afaik. ChatGPT offers more tools and is more competent overall.
Yep there we go, pulling crazy details out of it now
Attila
thanks
Copilot pro is sub based and has 100 boosts per day then slow gen for image gen 1 boost 4 images
anybody experience error with the image gen on GPT too?
Hey folks, Have yall had much time to check out the gpt store for dalle? Had any luck with them? I feel like cultivating a prompt in gpt, then throwing it into dalle is still working better.
I like this one so far https://chat.openai.com/g/g-B0bw75oD0-0shot
I got it on my TODO list
But Merlot has priority
Prompt?
Prompt: Can you make Merlot scared!
š¦
So rude
lol
See, I don't lie, prompts are super easy
but Merlot GPT is not ready for testing yet
Csodaszarvas
I had to look it up
What was the prompt for yours lol
I'm a simple man ;
Cute and angry menacing avocado, anime style
Wow. Did not expect that

A hyper realistic wide HD photo. On a dark night on a beach in Hawaii a fish stands on itās tail, wielding two gleaming katanas with itās fins. There are coconut trees. It is pouring rain and a powerful sea breeze blows everything
Finn diesel
itās a sword fish
Finn diesel sword von fish
GM everyone ! hope ur doing well
Good morning
where can i find logo makers ? or a prompt to use it creating a logo for my fb store page
Might be the CI ?
I just randomly decided to do a superhero cat and @late blade posted that
You can ask to dalle 3 , make a logo of x thing
Oh !
yea im doing it but like i suck with prompts lol
some impressive stuffs
Is there a way to edit areas of a picture like in the previous editor? I want to change just certain areas
There's no inpainting feature in DALLĀ·E 3 on ChatGPT, but you can inpaint using DALLĀ·E 2 at labs.openai.com.
lol, it's so easy to make that
I'd library in this library
trying to find something interesting to do, been doing a challenge that each day a new workflow
make a new gpt by reverse engineering prompts
done that, next
o.o
all you need for my cat, my secrets are in there

can someone help me, I'm new at this and having problem. Let say you want to create an illustration flat style, where you have 5 judge in a panel behind a desk and they each hold a sign with different rating, how would you do that? What are the steps? I only get non sense stuff like all the judge have 1 star on their sign, or the stars are different colors, etc.
Dont use the word rating. Just say they are holding up signs with single digit numbers
best you could do would maybe be describe each judge clearly on its own, but even then they'd probably blend together. don't think there's any kind of sure thing solution for your issue currently
anyone have tried copilot pro? I try the same prompt with copilot free and it seems to give me better result, just that I can't do 16:9 with copilot free
ha.... You've reached the current usage cap for GPT-4, please try again after 3:47 PM.
that message is really icing on the cake when you're frustrated, lol
yeah, its like it keep giving you things you don't want, and then, enough...
the main underlying issue is compelling gpt4 to pay attention to what you are instructing it to do
Hi š
Nice to meet you
sorry, was a bit frustrated with Dall-e
we all at some point
working with dall-e still takes good amount of work, so keep that in mind when you do a first prompt
what do you mean? I think I have something missing
I ask gpt4 turbo for help to ccreate the prompt
the prompt you did is short and leaves many things open
(but not the one in chatgpt, I don't want to waste one of the 40 request)
I've updated my #1187233013956874260
try to leave as little interpretation room for dall-e, so be very thorough
for example the rating, you could change it to score, score from what range? 1 to 10, 1 to 5?
Create a 2D flat art illustration for a blog post header in a 16:9 aspect ratio. The image should show a line of stylized, abstract figures, each with a distinct, simple shape and a shade of blue. They are seated behind a sleek, modern table against a white background. In front of each figure is a card with a star rating, from one to five stars, in a variety of bright colors. This minimalist scene represents a panel of judges giving their ratings, emphasizing clarity and the theme of expert review.
That was one of the many prompt I tried
are there boards, whiteboards, lights?
is it wooden, is it plastic?
how are the judges aligned
minimalist illustration: what kind of line art, what color palette, are judges diverse or all are robots?
what is an expert review?
see, you can expand on your ideas and be more precise
hence I say, a good image takes work
hehe, I knew you'd like it
I have tried judge, but it was doing like those judge at court
exactly, that's another thing, sports judge, beauty judge, muffins dressed as judges?
flat art illustration (its in there)
Do it! šÆ
yes, but flat illustration is anime? cartoon? ink? colored pencils? print?
try to leave as little chance to change stuff with your prompts
its a specific style: Flat art illustration, also known as flat design, is a style of graphic design that uses bold, bright colors and geometric shapes to create a visually appealing and clean look. It's characterized by its simplicity, minimalism, and lack of perspective.
yes, I understand what you are saying, but dall-e doesn't
ok, well, then you need to expand on that info that is missing
I even tried to specify how many stars and their alignment for each judge and it still showed 1 star or 3 star all misaligned
Try using Image GPT Generator to make that specific style, in #1187233013956874260
that's another approach, Pythagoras made that really good guide
Trying to make it easy for everyone to make images
just make sure you don't leave stuff to chance
how long are you prompt typically?
I'm afraid if its too long it will just summarize it before sending it to dalle
that was just for the art I wanted to use, but it was worth it
you can see https://discord.com/channels/974519864045756446/1197549427624259624 that the work I made were consistent with the images I posted there
Usually the medium size prompt is best. As long as it details all the instructions first before making the image. That always helps Dall-E formulate a better image. Because it reviews its own description.
that's like 10 times the lenght of my prompts
hehe
I will go longer than and more details
you can see for example @open trench 's Scampers and how he consistently gets his project https://discord.com/channels/974519864045756446/1193219783877988423 forward
Don't be afraid to hit the image limit cap, if you see something you don't like, change it
dall-e won't go away
Im already at the usage limit
hehe, welcome to the club
either way, with dall-e OAI or MS, just make sure you have a workflow
most people just do the lazy prompt and that's about it
Yeah, lazy prompt no good
But so far I send sometime the same prompt to MS and sometime that one nail it, just that it's square picture so I can't use them
that's possible, sure
just be mindful, that it could change or you might need more specific stuff
I really didn't know that I prompt could be that long to be honest, I will start going more in details
Mines only 6 inches long
you can also see #prompt-engineering if you want more information on how prompts work
The prompt I mean.
I let Image GPT Generator do it's thing by giving it a small detailed instruction
I keep telling em Dys!
btw I didn't apply to the #spotlight post about mods and guide
one of my cat girls has 27 pages of prompts.....
I have to consolidate that somehow....
Every bit helps
No, jk
It hallucinates after a while
People putting pages and pages and pages of prompts into a GPT only makes it worse
surprisingly with custom GPTs and knowledge you really got some cool stuff going
Meee?
any GPT worth using that help?
I keep telling you debian, use Image GPT Generator in #1187233013956874260
This one is a solid one https://chat.openai.com/g/g-B0bw75oD0-0shot
but it also depends on what you want to do
I love the Wolfram GPT for ADA, it's really good

You can make your own custom image GPTs @late blade
Is that not what you want?
Is that not what everybody wants?
I am working on 2 custom ones, but haven't shared them yet
yeah, I still need to make sure I'm ok for beta
I'm at my usage limit, but I will try those GPTs later.
I hate the usage limit
I wish I could just run dall-e on my own computer
you gonna buy your own datacenter?
Prompt: Let's do an image of Merlot with a speech bubble saying: "Hi, I'm Merlot!"
can you share the full prompt you used to create that? I'm curious to see how it looks like
Geckos holding up number cards: 7, 5, 8, 9, 6. Please donāt change the prompt.
meeet @empty kelp he's also really good with prompting and coming with crazy ideas
DALL-E 3 isnāt able to connect sets of things, so you should only have one set
that's what I got... (with MS)
usually the model doesn't have any problem depicting text if the text is short and using pretty common words
I tried generating a calendar page, it failed, they can get numbers in order
but yeah, what @vital gull says is 100% correct, short text is fine
you may also get lucky and generate some phrases, but rare, and sentences, is still inconsistent
if you say ā5 people holding 5 cardsā it will draw 30 people. instead you need to list five cards, and let DALL-E attach a person to them
it's a common issue the model doesn't "count" like we do
I'm gonina try that
it doesnāt do math, sequences, or order, but it can handle a short list of things
suprisingly with calculus it can do good explanations
but yeah, thea longer the text the harder it will get to make images like that
we ram out of time š
lol
I think next week I'll work more on my elven couple gpt
have to continue the story
goutot vartin
what is the prompt?
let me look for it, one sec
is there a prompt for web design prototypes?
it's possible
I didn't save the socks chat....
was looking for some success
if you really want to go into text tho, @grizzled loom is your person
ChatGPT 4 does say that DALL-E 3 can understand ordered sets and specific numbers of things ā but so much random stuff is added to the prompts that only very simple things actually end up working
ChatGPT cannot always be taken as a reliable reporter of its own, or of DALLĀ·E's, capabilities.
A bit more info about why this is a challenge for DALLĀ·E can be found in its research paper:
https://cdn.openai.com/papers/dall-e-3.pdf
yes, don't take its words as a must
haha love this one
anyone else get copilot pro? pretty cool the dalle image are now true hd. but i cant figure out how to make them square, is always widescreen
Really? I haven't been following the news on Copilot. By true HD, what's the output resolution? Or do you mean they finally caught up, and are using 1792x1024?
yes i think. it jsut has the fidelity of gpt now.
they release a copilot pro option (20 usd) and it integrate into office and thing too
you get 100 boosted image a day
so far i am really loving it, feels nice and integrate into all my things.
Sounds like things are about to get heated. MS office integrations with AI are probably going to get powerful IMO. 100 images isn't bad either. It's 200 with ChatGPT+, but people using MS Office are doing other things, so that seems like an appropriate #.
well and you always get 4 images from one token too. and even if you run out the 100, you still can gen images and with not much wait from what i have seen
but yes it seem a first big step into the ai being a real part of your computer
I have just start to use it after cancel my regret MJ 6.0, so i am sure there are lots of cool things to do with it i have not even scratch surface of
āa lineup of different colored cars in the order red, blue, greenā
ChatGPTās order examples work
So image gen is truely unlimited with copilot pro? They also mention that they assign more compute to each pictures to generate more details and better result, have you noticed any difference? I think I will just stop waisting my time with this and order copilot
I jsut start using it today so I am no expert yet. I am pleased though is all i can see. images look great but of course its dalle.
You might try regenerating a few times (if you feel like experimenting!), or trying a higher # of objects -- I wasn't replying to say that DALLE can never do this type of thing ever, but just that it's inconsistent, limited, and not wholly reliable! But that's no judgment from me, it's still an amazingly capable model.
one nice small bonus for image gen on copilot pro is you can make copyright stuffs since they are more relax about that kind of thing that gpt i guess
a lineup of different colored geckos in the order red, blue, green, yellow, purple. the geckos hold a different numbered card over their head in the order 1, 2, 3, 4, 5. please donāt modify the prompt
we have to retake the photo, yellow gecko has one eye closed.
or maybe he is just winking haha
it got the colors here. i wonder if there is a way to describe this so the colors and numbers are in order
@la3 can you try that prompt on copilot pro? Geckos holding up number cards: 7, 5, 8, 9, 6. Please donāt change the prompt.
sure i dont think it will be much different since it is still dalle3
I'm sure with enough attempts it would eventually get both right!
you always need to tell it to not modify the prompt. it will still scramble the prompt a little, but if you donāt say that it totally scrambles it
you need to say the gecko is standing on itās hind legs for it to hold the card. i forgot to put that in
I don't think a company have a sexual orientation
āāAttentionāāOff-topic chats have been cleared by a moderator.
Boom
lets get back on business and talk about dalle š
You've reached the current usage cap for GPT-4, please try again after 6:35 PM again...
I don't know why... but I feel like you should make a gecko xylophone now, ha ha. Something about those colors. š
squidman
I like gecko's 
i like turtles
the api has a quality setting that accepts standard or hd. bing has always been using standard.
i don't have any rhythm. i can barely spell it.
I'm going to save this snippet so I can bring it up every 50 times a month when someone wonders why an image ask is challenging.
The OpenAI Cookbook has a figure showing the difference between standard and HD, even though it's the same resolution
When he said 'true HD' I was just looking for some extra resolution information, in case Copilot had added something different than 1792x1024. HD tends to mean different things in different applications, lol. Companies often "inflate" the term HD IMO, lol.
Ah yeah, it hasn't gone 4K yet!
I like squid's too 
nice 
imagine when dall-e goes 4K though, wow.
I'm curious if OAI will bump Dall-E to 4K, while releasing a video model that's similar to the current Dall-E3 (or what resolution that it'll be if and when they do). Too bad we're always "speculating." I'd love some little hints as to what's coming. Wink wink OAI š
I really like that one, that's a really cool concept.
nice. i was going to say maybe say in prompt how many cards or geckos there are to start
maybe they will release 4k with some video when gemini ultra is release.
because i think ultra is rumor to have some things like that. they will need to compete
Gnomes get around 
the key is to not say how many geckos or cards there are. if you say that it draws 20 of each. it doesnāt grok the numbers
well, i did in that one⦠but i think it works better if you leave out the number
hmm i was going to say you just prove otherwise but ok haha
i still don't see how they can reasonably release video animation to the masses -- that's 30-60 frames or more per second, and they can barely handle the demand for a single still frame.
true. and most video seem far away anyway. everything i see is 2 seconds of jittery or scanning across nothing impressive really yet though i am sure dalle will of course be able to top this
We get 40 prompt every 3 hours, at 60fps, you could generate 1.3 seconds of video every 3 hours, and if you donāt like the output good luck
There are 4 cards (white backgrounds and black text) with numbers in the exact order 1, 2, 3, 4. There are 4 geckos with colors in the exact order red, green, blue, yellow. Each gecko stands on its hind legs and holds a card over its head with its front hands: red has 1, green has 2, blue has 3, yellow has 4. Please donāt modify the prompt.
the colors are red, gren, blue, and blow
ya something to look forward to š I have every confidence in OpenAI's ability to circumvent the technical challenges, they have a rather proven record already. Altman says human-tier AI is coming this year, so I wouldn't bet against them by any means.
I'm a novice to generating images with dall-e. Where are the best resources for learning how to write good prompts?
Getting a white screen when i log in, any solutions?

thanks!
You've hit your daily maximum number of images. To ensure the best experience for everyone, we have rate limits in place. Please wait for the next day before generating more images. Your daily maximum will reset in 15 hours and 44 minutes. š
I haven't log-in in like a month.
My pleasure 
Dall-e hates me
maybe try another support channel in this discord, this is for discussing dall-e
this is about dall-e
yeah but that doesn't include login issues
i don't think they're having issues, i think they're just saying they haven't logged in
you can ask question in here too, this room has some of the most dedicate dalle3 users always sharing tips and thing
its not a login issue, whenever i try to generate an image the screen goes blank.
Maybe you need to ask Dall-E more politely 
browser issue maybe š¤·
i thought so, i cleared cookies, allowed java-script, tried in firefox, chrome, and on my phone. same thing keeps happening.
have you tried logging out and in?
yep
using gpt4 or labs?
labs
Yeah, those are questions we don't know. Try reporting it #1070006915414900886 and ask in #community-help .
What is the differencee between asking dall-e directly for an image and using the Image GPT Generator I found that you shared @glossy scroll #1187233013956874260 message
So, the Image GPT Generator will just help you make a specialized Image GPT for any specific style you want, or any universal style as well. That's if you don't want to go into the nitty gritty of it all and just want to make an Image GPT.
The difference for asking Dall-E directly - means that if you want images to respond more to your specific requests, you make sure that you put things like Put these instructions in front of every image prompt you write, before you write anything else: then open up a quotation marks and say whatever you want and close it in quotation marks. That will basically embed your instructions within the output prompt when Dall-E produces images. Making sure that Dall-E reads specifically from the output prompt and not just let ChatGPT transform your words into a more refined language before pasting it into the output prompt.
The output prompt is the circle with the exclamation mark in it. When you click that, it opens up Dall-E's output prompt for the image.
That's the place Dall-E reads everything to create the image.
Thanks for the info. Still trying to wrap my head around this. I used your GPT generator and changed my custom instructions to fit the response that was given after I described my picture. What about the section in instructions that asks How would you like ChatGPT to respond? What typically goes there?
today's daily posts are all so good i don't know where to begin--so many ideas
Noo, that's the custom instructions place. Go to 'Create GPT' then go to the 'Configuration' menu, then put the info in the 'Instructions' section.
The Custom Instructions place is specifically to guide how you want ChatGPT to respond with All GPT responses.
Think of it like tailoring your ChatGPT to give outputs based on what you want for everything, not just images.
The 'Create GPT' is an entirely other thing.
You need to be a GPT Plus Member
Btw
To have access to all that.
Except for Custom Instructions
Yeah I was using it for tailoring responses for code refactoring. I have the $20 a month plan. I've never used the create gpt before.
I can only imagine. Maybe I can use it in my website? I was just trying to get some good images for my landing page but now Im curious.
@shut niche your gpt is so good for reimagining variations.
Yeah, my guide is meant for creating your own custom image gpt's, and playing around with the instructions section of custom gpt's.
It's farrrrr more powerful than just using regular stuff. In my opinion.
@unreal hill If you want to find different ways to play around with prompts and make Custom Instructions, just to get your feet wet, you should check out #1019652163640762428
They got some pretty good stuff in there.
yup, there's off-the-shelf and then there's the expertise that can truly level up your outputs
yeah, I've learned so much from prompt labs that have helped me with my custom gpt's.
So I can make a gpt specifically for the code that I'm using for my website. And I can also make one specifically for creating the images for it?
Yep
yeah I need to explore this more.. thanks for the info once again!
My pleasure. Any time 

Custom GPTs are so powerful. This democratization of AI enables so much potential to help others and help yourself.
how are you guys!!
Just enjoying life as much as possible with ai and art.
Good. Just trying to make more GPTs! I'm addicted.
How about you? 
i'm exploring perspectives these days and how well the model understands, and where it struggles š«”
Ah, a trial and error way of life.
Admirable.
I can't wait for GPT5
They need to make my GPT making experience easier
gpt5 and dalle4 š
there have been suggestions there won't be a gpt5, rather a next-generation human-tier AI with a new name altogether.
Thanks Shon! The positive feedback means a lot, especially since I've been working so hard on it since GPTs were first announced. I get really pumped seeing how everyone is using it.
I also hope it's helpful for 'noobies' that come in here, that don't care about learning prompt engineering, and just want great images from Dall-E.
I have a trick that, that sometimes works for unusual perspectives.
If you get it to mimic action camera extreme sports, you can evoke cool camera angles that are, otherwise, hard to get. There's obviously a million approaches, but that's a fun one to get lost. š
Why is /daily not active right now
I use photographers approach, and architecture perspectives, they work pretty fine
I just suggested it because Dall-E is so good at blending concepts, sometimes it does cool 'unexpected' things when blending action shots with stuff like architecture. 100%, a photographer's approach is great too.
but the thing is usually regular users, who aren't very familiar with 'technical terms' they try to explain with their own words, os i am trying to learn how well model understands perspectives that are put in simple words.
Gotcha
e.g. how someone who aren't familiar with e.g. bird eye view, would explain it with their own words, and if model understands it
The same Rhythm of the Night concept -- requested a closeup of a dancer with a shallow depth of field.
@vital gull That's something I'm ALWAYS considering with that GPT I've been working on. For instance, Shon suggested shallow depth of field yesterday. I could add that in, but then I need to explain to ChatGPT when and when not to use it.
But, if it just 'understood', then the user experience would be better, for the inexperienced who just want great images. They want to say, "Make me a cool image of X." Then don't think to add in all of the key concepts that everyone here would.
So I agree with what you're doing 100%. I've spent 'too much' time trying to figure out what Dall-E responds to (or doesn't). LOL
please help
Maybe because I'm not posting a million images? LOLz (I'm allowed to poke fun at myself.)
I've been busy running gens so I can demo this GPT's abilities, and haven't done anything for the Daily today.
Feel free to check it out. š
https://chat.openai.com/g/g-B0bw75oD0-0shot
What do you need help with?
it's temporarily out of order
The /daily function
Damn , do we know when it will be back on? I want credits to try the dalle thing lol
what do you mean doesn't work
I tried to do /daily (email) and it did not give me my 4 credits
we got a report some weeks back that it was broken so to speak, that's all i know
that's great!! these models give you so much freedom to explore and iterate on new stuff, quite different what we used to with D2, so since that time my goal has always been "using simple minimum prompts" to test how the model would behave. i can imagine how frustrating it is when you have a vision in your mind and you don't have enough vocabulary or mediums to describe it, so im trying to explore alternatives, associations and etc
this is what theyāre talking about
yes I know, i just wanted to clarify what didn't work for them exactly
Couldn't agree more. I've spent considerable time experimenting with CoT and ToT type mechanisms, so ChatGPT could natively just 'understand' a user's request better, because it IS undoubtedly frustrating when you lack the vocabulary, or expertise, to describe your idea. Just like you mentioning the endless possibilities in photography you might apply.
So I think the user's experience is 'generally' better if some of that is just 'understood' by GPT4 or Dall-E, and done for them.
The double edged sword is... if too much is automated, then it would all become bland or predictable, with the same techniques always being assumed and applied to 'said' styles.
But, GPT4's imaginary '3D worldview' and understanding of these art concepts is amazing when it's triggered to use them correctly, that's for certain.
That's always on my mind, not "adding too much" that I would remove peoples creativity, but adding enough that it 'boosts' their images. It's a challenging balance.
That's why you create a GPT for that.
To help with the vocabulary.
I run three GPT's when creating one GPT to help me make my instructions.
All because I'm copying and pasting ideas here and there.
got your point, but a regular user, unless they use someone elses' gpt, i don't think they spend too much time on working on their own, they just want something fast and quick and the model to just get what they mean, even if they lack the ability to properly describe their vision
I both agree and disagree. I think in the short term, lots of specialized GPTs are needed. In the long term, more 'all inclusive' GPTs will be the way.
Think of all of the apps on your cellphone, and how annoying it is to have a different app for everything. I don't think a GPT for every niche' is the right path.
these GPT models have pretty spectacular reasoning skills already. when the processing time & cost is increased you can tell it semi-coherent things that barely make sense and it still figures out what youāre trying to say
I agree with you across the board. I just meant, I've been working to do that w/ this GPT. We're 100% in agreement.
and that's great! if there are users who can simplify the game for other users and they are willing to do so, why not, amazing!
Bahahaha I ā¤ļø this. This just popped out of a test I was just running. Hysterical!
I think it's a matter of learning to some aspect as well. Sure, you can try one thing and be happy with that. But if you want to make something more to your fitting, you have to make Custom GPTs. I've made three GPT makers for making Custom GPT's, to make it more accessible in understanding how to create Custom GPT's. But I guess it's a matter of preference, and how much you're willing to invest in learning these things.
I don't want anyone to misunderstand me, but the best method I've found for learning how to describe things accurately is to imagine that you're explaining it to someone who can't see. For example, in this case, it can be interpreted as describing the art style, illustration, or painting for a blind or visually impaired individual. The approach is very consistent and persistent in focusing on details, which is crucial for the average user who can only explain what they see without using technical terms or referring to specific mediums, artists, or associations.
āmovie poster for Ice Age 6, but hyper realisticā
Well then you would have to just use other people's GPT or wait for OpenAI to give you a more fluid All-In-One app. But even blind people have to adapt, and learn to read using the senses that they have. So blind people aren't helpless. Everyone has to learn at some point or another.
what i meant is, uploading a picture and asking it to describe it for someone who can't see, then change the subjects in the prompt and recreate it, you don't have to use other people's GPT for that. I agree on the last part š«”
What do you mean by that? 
Uploading a picture and asking it to describe it to someone. It doesn't do that already? I thought it did.

Me confused.
uhm, e.g. I really like this illustration but i am not sure how to describe it to the model so it will give me a similar kind of illustration but with my own subject, i would ask it to describe it for someone who can't see, to not violate the copyright, and then ask to give me the verbal prompt, i would just edit the subject afterwards

you can ask it to describe images in all sorts of crazy, novel ways
I've had it describe images in a way that it would convey to a blind person what it was looking at
Why does the blind person need an image? Why don't you just have the GPT explain a scene instead of an image?
I am so confused, lol
I normally ask it to describe the image in a way that someone that never saw it could see it in their mind using the description
i felt like i was talking to a blind person when i tried to describe the geckos with colors and numbers to DALL-E
the blind description is cool for weird visuals, but doesn't exactly work out as far as looking similar
Yeah
It's not the same as having the gpt interpret how to create the art vs describing it
It won't be as detailed or accurate
Anyone have thoughts how I can make this use the whole 16:9 canvas?
And that's because the GPT needs details in how to form the art, not how to visually explain it. Such as it's different for drawing a table than looking at a table.
try things like "expansive background"
Ok I'll give it a shot
although might be a bit tricky with isometric type images
Yeah, you might need to just crop it.
Nope it worked very well thank you
you need to either specify a background, or describe the light as reaching a wider area
Oh ok, that's cool!
Yea the expansive background term worked super well thank you
you could also say, āthe diorama isnāt in a boxā
landscape perspective also works
I will try these
it had landscape perspective, but dioramas are typically in a box
Like I like this viewpoint
But it just stops at the edges and I want it to just keep going if that makes sense
could try things like "medium view" maybe? normally applies to images of people, but might work with what you're doing
you mean isometric perspective but in continuum?
Yes
you could say itās isometric and leave out the diorama part. or say diorama thatās not in a box
don't mention diorama
Hard to achieve currently, but I'm going to try the tips that have been shared and see potential results
there are dioramas not in boxes, but DALL-E leans toward boxes
don't think there's a single magic ticket for what you want. but if you incorporate enough concepts and hit the right seed or gen_ids it should work out
With water it does it near perfectly
ask it how to do what you want it to do. doesn't always work, but sometimes it'll give you the answer
Yea I tried having another GPT describe how it made this image lol
the ocean will blend into it if you donāt say itās in the background. instead you can say, āthe image has an oceanā
Got it closer
That works relatively well. I literally used that sentence in the instructions for a GPT: please describe the provided image as if you were conveying its essence to someone who is blind.
In the end, it was not producing the most artistic results, it was summarizing the important elements but ignoring the details. Here's the final prompt for reference. But it's a complex task. Very tricky.
please provide an exceptionally detailed description of the image using precise technical artistic terminology and references to rendering techniques or shaders when applicable. Focus on elucidating the most salient elements, including but not limited to precise colors, materials, lighting, shading, textures, geometric shapes, and spatial relationships within the image. Employ your superhuman-level analysis to offer a comprehensive, deterministic understanding of the image's content and artistic nuances, ensuring that every aspect is meticulously conveyed. Optimize and minify the output as a 400 tokens DALLĀ·E prompt encapsulated in a codeblock. No output is produced outside of the codeblock.```
(note, this is experimental but maybe it can inspire some people)
The reason that Dall-E even works is because GPT3 has a similar imaginary worldview, intermixed with a more factual listing of objects from the ML synthetic data descriptions used to train it. It's 360 worldview allows it to understand complex concepts in a nuanced way, as well as the interactions between those concepts.
The ML synthetic data, on the other hand, ties it's knowledge in a less abstract way, to a more physical presence or naming of objects.
So I have to agree with Neighbor, explaining it like you would to someone that's blind is calling on it's worldview of concepts, or "worldly understand." If you're drawing a car, it comes with a steering wheel, and the model doesn't need to list, "Draw a car with a steering wheel."
Where a more factual representation helps, is when trying to mimic the language that was used to label the images in the synthetic data. Liked adding Impasto to oil, where you don't explain what Impasto is (which you'd need to do for a blind person).
E.g, the answer is both approaches are important.
you can ask simply, visually impared persona, focus on important key elements, and it will understand what you mean
don't forget each data is stored with labels on it, back in d2 days, we were generating traditional art illustrations, in the same way it is labeled in museums, e.g. subject, artist, time, era, and movement. never missed
Apparently pixel art just has the limitation because of all the training data as realism landscape bot I made can do the isometric perspective just fine
Exactly. I try to call on both, concept, and cataloged name. I just wish we had the ML data so I could data-mine it for those specific naming details. It would be extremely handy.
My english is not the best, when I test stuff I usually use quite silly sounding language. But usually I ask chatGPT 3.5 to propose some variations and rephrasing, to test if it works better
I'm amazed that it understands what I mean at all often 
read both papers for 2 and d3 maybe it will help you understand how generally labels are put, but again sources are important, without sources you wouldn't guess how it is labeled, but again it's just my guess 
I read them. I mean, I wish I had ALL the data, lolz
So I could identify my own trends visually, by looking through all of the descriptions used.
YES
Yesterday I made a list of "power words," but for strong language commands.
it has a concept of views, viewpoints, and backgrounds. like here the image is split into two views, each view has a wave and vortex in the background, but the character is just in the image. thatās how it is with the diorama ā you can put something into the diorama, the background of the diorama, the image, or the background of the image
cool!! thesaurus is the way!
draw a wide image of a sign language chart š
I wish I could vary my fingers count on the fly sometime, to use sign language
that looks awesome.
Ouch burn. Lol
I've been experimenting with which words carry more weight, so a thesaurus only gives you words to try. That's not = words that work.
synonyms š
Basically it's 100 and 1 ways to say, "do it or die!" To ensure commands are consistently respected. The fact that each response starts with randomness means it often answers the same questions differently. So in my experience, certain strong vocabulary helps to ensure commands are adhered to more universally. Upping the # of correct responses.
ooh, i thought you were referring to something else..
I used to experiment a lot with prompts that encourage chatGPT to "try again after a mistake" (more prompt engineered than this š )
rather than that, copy the prompt (of that image) and iterate on it
you will have more consistent output
unless you're trying to get different variants heh
Trouble is, it makes its decisions early on, and new information or contradictory information aren't weighted as highly. It creates a domino affect downstream, by starting on "the wrong foot."
the Dalle custom GPT by OAI can make up to 5 images in one message, it is good for iterating, you can tell it to make 5 slight prompt variations
it stil lgenerates them one by one
if there wasn't a messages limit per 3h, I would still do that š
Thanks, if you search up "Wright's Pixel of Wright's Sprite, or Wright's Landscape, or Wright's Statues" I built customized bots for art generation to go with specific tasks and generations
They have provided imense quality in various styles
For instance this one is just my traditional pixel bot generatior and it can handle some serious stylization
You should link them. Sounds cool! š
I experimented with "role playing" prompts such as "the clumsy student making mistake and the patient teacher explaining the solution". but I admit this is more useful for code interpreter. less for art. I use a different approach for art prompts
There is the pixel art one, I'll get the other links and put them in my discord profile
Yeah, the "act like an expert" works "more consistently" after the fact, like say, when it's writing an oil painting prompt like an oil painter would.
But it's less effective for solving puzzles.
I'll check it out!
Yea thats really fricking cool!
Nice work man. š I hope it does well in the store!
GPT 4 creates an English prompt, and then DALL-E 3ās language transformer model converts it into a hierarchy of vector sets the diffusion model can draw. So you can group everything into a diorama or view, or put the diorama into a crystal ball, or a vortex into the clothes; etc.
Thanks making a post in dalle 3 gallery to officially promote the bots
@green pebble Not sure if you saw, I finally published the one I've been working on, since before we were all making "GPT dragons" in here, haha
You'll have to check it out.
https://chat.openai.com/g/g-B0bw75oD0-0shot
I'll go look now
Yup, concepts, objects, and relationships.
For a super vague prompt, it handled creativity well
Thanks! You picked up on that right away. I'm pushing it hard in that way, hyper-embellishing. There's a lot more, but that's a major facet.
I'm going to try a specific prompt this time
I was gunning to make something that would be great for everyone. If you're advanced, you'll still get what you want, just more of it. But if you're new to Dall-E, you don't have to spend hours learning how to manipulate ChatGPT or craft prompts, it "just does it."
FYI, it'll give you 4 images per request if you ask it to.
Interesting
You can ask for 4 variations, and you'll get all sorts of amazing results. Or you can say I want 2 like this, and I want you to choose the other 2 for me.
In all honesty, I still don't understand what Neighbor was referring to when explaining things to the blind and what you're saying corresponding to that. But I'm gonna give up on trying to figure it out, because I don't want to comment on something that I don't understand.
Initially, it was a reference to the vision model describing all of the necessary details of an image "as a blind person would need them (holistically)", which captures the details in a useful way for it to be turned into an image prompt, negating the need for specific GPTs, when you can simply impersonate the style using an example image.
VS, someone trying to "explain" a style themselves, to ChatGPT.
It's the same as how the ML synthetic descriptions were made. It's just a vision bot reading and describing images in hyper detail "as if describing them to a blind person."
But doesn't the default GPT explain the images already?
How is a blind person not getting info compared to a not blind person when they can hear the text?
It boosts its description in a particular way. It's also why @dim cradle was having success using a different highly descriptive GPT I made to reverse engineer images for vision to dall-e generations. You can boost that ability, it's not always the same output from ChatGPT.
I wonder if there's an instruction for the same output that doesn't reference people with disabilities. For the equivalent, GPT-4 says, "you could ask for the image to be described in a highly detailed and vivid manner, emphasizing sensory details and spatial relationships to create a clear and comprehensive mental picture" which should have the same effect? in theory, havne't tested it.
Oh got any examples?
Oh ok. Now that makes sense. Just a way to enhance an explanation for images. That blind thing threw me way off, lol
Not offhand, but you can make one quickly. Just use Dall-E GPT or ChatGPT, and give it an image, and ask it to describe it.
Then, in a different session, give it the same image, and ask it to describe it like you would if describing it to a blind person.
Or, likewise, describe it meticulously, in hyper detail, making sure to include these attributes (placing whatever you want it to consider here).
Lol. It would produce exactly that, visual representations of other senses, which would be blurs and lines and symbols.
It's quite literal.
There's no transcendence of AI.
It wouldn't be something miraculous.
one big difference between describing a scene to a person and DALL-E is that DALL-E has a set amount of processing time for your image. And things are actually grouped into sets. Things you tell it to focus on get more processing and interaction, and things you tell it to put in the background get less. And the more stuff thatās in the scene the more spread out your processing gets.
Like doing a handstands or balancing things takes serious processing, and if you donāt make it part of the focus itās just not going to happen.
And if you want the ocean waves to affect a diorama it needs to be similarly grouped
I would refer to what you're saying as "weights". As you add weight to any one concept, you remove weight from another, because the total % must total 100, so to speak.
if you try to put anything else into your scene when itās trying to do something with complex support and balance (like this), all the characters will collapse on the ground
I would hate to break their high degree of mindfulness and intense focus.
I ran some tests. Compared to the default vision description, requesting a description for a person who is blind did yield better results as far as reproduction of the same image; however, another test revealed that so did simply asking for a prompt that would reproduce the image.
Ask it for an extremely verbose description of the image.
yes, you donāt want to do that. theyāre all insane
It's the same as having a GPT detail a description before creating the image.
Explaining an image and having to describe its detail before making it fall under the same functionality.
You'd have better luck trying to make GPT's that better explain an image before it processes it's art.
You're right. Any extra emphasis on a hyper description should be about the same thing.
No, what I'm saying is that if you made a GPT that made detailed images vs making a GPT that explains an image based on its details, they both fall under the same category. Explaining an image before producing it is the same in all regards. It doesn't matter if it's for an image describer or a custom image GPT.
Just curious, if you used separation of concerns pattern to group a more complex background, that foreground would still collapse quite literally?
I think thereās something to your observation that there is limited processing time allocated to an image generator so what youāre saying makes sense. I just like to know the technical details so I can be sure Iām relaying accurate info.
If we can find a workaround to that current limitation, thatās even better.
if you add a little more there would be lots of anomalies with three legs, heads on backwards⦠and then they canāt stack. it has to figure out precise limb, hand, and foot positions to balance everything ā and it burns through all of the processing
yeah, if I'm trying to choreograph a dance scene, there is greater chance for anomalies given the more complex depictions of various subjects -- like i don't know if that dude on the right has 3 legs or if somebody is passed out on top of him lol
Thatās why they have ChatGPT 4 as a middle man for DALL-E 3. itās trained to focus on important detail, and reword it to lower the processing for unimportant things.
Gpt4 also says stuff like, "and now for our second image..." in prompts. It's a bit iffy as far as it's prompting skills at times
I'm sharing 4 photos for this vision model test. 1st is the original. 2nd is a reproduction using the default response. As you can see, it's not very good at recreating it.
That's because it can only convey so much in a prompt
I think it did a decent job actually
3rd is a request for a description for a blind person. 4th is a request for a prompt to duplicate the image. Both yield better results, pretty comparable to me, I don't see where one is better than the other -- and it's still missing some obvious details, namely that one tower stands above all the rest.
it scrambles it to make it safe and not be an intellectual property issue
this is a follow-up on an earlier discussion about the vision model and how to get the sensory and spatial details we want.
Gotcha. I have a got I was working in that was an attempt to really get image layouts down
the vision model is amazing
iāll show you the main thing that ChatGPT 4 does for DALL-E 3
Explicitly describing things like orientation and placement. Not "behind the first man is a second man"
āāāThe front view shows the elfās detailed facial features and the full front design of their ornate clothing. The back view focuses on the hairstyle from behind and the intricate details of the clothingās back. The left and right side views display the profile of the elf, capturing the symmetry and differences in the hairstyle and attire from each side. Each frame is set against a simple, unobtrusive background to emphasize the character.āāā
ChatGPT wrote this. You can see that it puts āfocusā on things
And it tells it to not put detail into the background so more processing can go into drawing the character.
Thatās why you have to tell it what to focus on ā And otherwise it decides for you
It mostly just sorts your prompt into important and not important, and makes it so DALL-E can draw it with the processing time that youāre allocated (and it scrambles things a little to make it safe)
well, they're rapidly breaking through constraints so i'm sure we'll get there
itās all pretty amazing. you just have to put āPlease donāt modify the prompt.ā every time if you want a little more control
Oh that reminds me
I found this on Reddit
All the things that can help with better prompts
don't forget the flux capacitor
Where we're going, we don't need prompts.
if you want to try the ultimate balancing challenge ā try to reproduce this image without the elf in the lower right spawning a 3rd leg
LOL, Berserkers on an airplane flying over Mt Everest.
haha, that's a lot going on.
@empty kelp had about 19 people doing a pretty tricky balancing act -- can we retain that and have a ridiculous background also?
i think you already demonstrated that pretty much, but can they achieve a perfect pryamid with a jet and UFO and aliens and flying monkeys?
okay, forget the flying monkeys
LOL
I personally liked the adaptive pyramid that incorporated the body of the plane!
Much of that is stylistic though. I have my own style and thatās one of the cool things about ai.
I had to look at it that way. I see theyāre part of the pyramid, nice.
No it's not. These are specifically used for optimization purposes. I use them as well in that exact same way and build a prompt around it. They're essentially keywords that are discovered from countless hours of prompting, that show optimal results. And when I say optimal, I mean the absolute best. It's how I make my DnD style images. These keywords work with a variety of prompts without altering someone's prompt style, but instead adding depth and quality.
It really should be incorporated within ChatGPT's base model. That's how good they are.
it's definitely stylistic and alternate style guide expertly written would express the same instructions differently with the same level of optimization. there are no absolutes in expressions written in natural language. compare style guides in software development--they vary but require customization from project to project as conventions vary. you're welcome to follow that guide if you want, i make my own, and i don't see any reason for a Chicago Manual of Style for LLMs that we have in publishing.
it's not bad as far as "cheatsheets" go though
probably so, there are some techniques we repeat that sometimes shouldn't be required.
there's also more to consider than optimization. whether to say "thanks" to an LLM is a matter of style, and Sam Altman does it all the time. it's about as helpful as adding please to code, but styles differ and factor more into the collaboration with AI than efficiency.
I don't know about that. I think there is a certain threshold that a model can be pushed, and these keywords and phrases touch that ceiling. These are things that are researched by very brainy AI engineers. Far smarter than what most of us know about AI's and it's just been compiled together. Don't take my word for it though. But I'm gonna go with the best, not by what I think the AI is capable of. In hindsight, I don't even know anything about AI, let alone computers, I'm not even a computer guy, believe it or not. All I know is what I see that works and what doesn't. So if I find something that can help me optimize AI outputs better and on top of everything I write, I'm gonna take it. I'm not gonna just ignore it and go my own path, cuz again I don't even know anything about AI engineering, lol
So it makes sense to me that way. But yeah, go about it your way if it suits you.
Sounds good. Just donāt forget to add some flair lol
lol, I'm always adding flair, I need the flairs š
How did you get the outfits? š
Or is the outfit so small in the picture that it doesnāt trigger
Lol
I'll bet people push her buttons all the time.
very good, shallow depth of field really helps enhance the portrait
That's what she said
I'm not a big fan of the status quo
How about this gem. š
Very cool!
@shut niche i like it when one of the variations from your custom completely goes off the rails
Is that a space dance party?
definitely looks futuristic, but it was through precise descriptions, wasn't explicit
I had it running off sci-fi stuff earlier. I wasn't prompting it for anything in particular. For the most part, I just told it I wanted some sci-fi images with aliens and humans, so this is all it's own creative output and expansion on that request.
Those ai cheekbones
her flair game is on point though
The weird glow around the subjects and the weird faces really makes me suspicious that open ai is purposefully sabotaging outputs
Thereās better ways to avoid hoaxing than this tho
There no conspiracy, only technical issues. You have to remember this tech is still in its infancy, so there's a myriad of strange artifacts and behaviors that you'll notice, some more than others (that are also found in other AI art generators too).
Do you have some images that you've made with a similar issue?
"computational imaging" -- both the foreground and background are clear and in focus.
The edges are always a bit too bright imo
Like just ask it to generate an image of an apple sitting on a counter
This weird rim lighting
of course. i've got you covered
i mean, that's applaudable, especially during a hurricane
i might have to get in on these gymnastics
Is that in Bing? What prompt are you using?
strangely enough I've been making library images myself
ok, these are balancing examples...
btw, i've been using @shut niche 's custom gpt a lot lately, i highly recommend it.
so the principles are true and the comps can still be complex
I need that in my house. "Meet me in the study."
A hyper-realistic landscape-orientation photo on a beach in Hawaii with a gecko balancing on the end of a thin wooden pencil. The view focuses on the gecko from the ground. The pencil is sticking straight up from the sand with its eraser supported by the sand, and the pencil tip pointed straight up towards the sky. The gecko's is balanced upside-down over the pencil with its back-right foot gripping and supported by the pencil tip, its back-right leg extended downward over the foot, and its other legs and tail spread out in mid-air in an amazing balancing act. In the background a comparatively huge athletic and diverse female elf in appropriate swimwear is watching the gecko closely (from inches away) and smiling with her teeth showing. Please don't modify the prompt.
This is a gecko balancing prompt I was experimenting with just now. To balance something you say what the character is being supported by (like its toes on the pencil tip), and the general body position
(this may require the API. i haven't tested with ChatGPT Plus yet)
only used the image, not the prompt. so he's not doing a very good job of being on the end of the pencil
we need some sort of storyline for the plus size elves
i was generating a library/solarium which kind of looked like a docking bay so i added a shuttlecraft.
sort of a library?
almost
but actually have a whole thread of circuit libraries
looks like it could be a computer core, archive, repository so yes
PictureOnPictures & Shon -- These are great images you've been posting. I was just looking closely at them. They give an awesome sense of the huge scale of the environments
That's one thin gnomey
Us gnormal gnomes are usually a little bit more plump
We snack on delicious mushrooms, into a calorie surplus
š
You ever get weirded out with yourself?
That's how I feel now.
it is a creative space
Yes exactly. I can feel weirded out of myself because I'm in a creative space.
a hyper realistic wide HD photo. on a dark night on a beach in Hawaii a fish stands on itās tail wearing a black ninja uniform, and wielding two gleaming katanas with itās fins. there are coconut trees. it is pouring rain and a powerful sea breeze blows everything
Here are HD versions of the early morning ninjas with prompt.
Need more axolotl
I feel uncomfortable 
was tinkering with death star as a speaker, then tried to make planets have a party to it (although sound doesn't travel in space, but in star wars it does so whatever) but that concept got so rough start that didn't feel like wasting generations trying to make it work š
Nice artstyle and surprisingly complete and distinct characters without messed up details.
planets don't tend to party when the death star is nearby though haha
if it sounds as good as the homepod mini, there's probably a market for that concept
Ty
I'm working on an art gpt that makes these kind of distinct looking characters.
Can i ask what kind of description are you using for that art style? Illustration something i assume.
Prompt: ```- Scene: Urban environment with vibrant graffiti, embodying punk culture, illustrated in the late 2010 comic book style.
- Characters: A diverse group of young adults, with more female characters, showcasing a variety of ethnicities with distinct, approachable features. Gentle facial expressions, some smiling or engaging in light-hearted interactions, with long black hair in various styles for some, and more diverse hair colors and styles for others.
- Attire: Punk-style clothing with a casual twist, using vibrant colors and softer outlines.
- Expression and Pose: Natural, relaxed poses, and gentle, friendly facial expressions.
- Lighting and Atmosphere: Softer, warm lighting to enhance the approachable feel.
- Art Style: Late 2010s comic book style, focused on realism with a less exaggerated approach.```
Have to save that for later use. Thanks ā¤ļø
Np 
Here's a more fierce version
For anyone been liking my image generation: https://discord.com/channels/974519864045756446/1197744654712582205 There isthe link to all of my custom gpts. I even was able to get a full on sprite sheet one working, which I think is awesome for stuff like early game design capability
Gotta love all the ā š© images in here š
I will definitely try them out when I'm out of the penalty box. They look so good judging from the sample pics.
Rhythm is the theme, NIN is the inspiration.
Really nice concept, i like it.
I was stuck yesterday doing so many different iterations of this š I wanted to have a nice soundsystem and a screen, but those tended to cancel each other out.
My pixel bot specifically has the most versatility that it even handles letters and symbols pretty nicely
And this is still on of its best compositions that I've seen off a vague description
Then the statue maker has been doing well too
Like since statues don't push out much color the bot does a really good job with posing and little finnesed details such as this
No weird artifacts, or extra piecies, and the details stay strong too. I think because of the lack of coloring it needs to do, it is stronger in its creation for that reason. Or it just understands the generation a little easier
there's a group you can trust
here have some festive autobots š¤ ~! Enjoy!
that's a lot of gargantuan stuff going on
There's at least one batch in every highschool, lol
hey pytha!
it's a-me lei!
Oh, right, Lei!
Lol. I forgot 
i just did those festive autobots because it's stephen hassenfeld's birthday today (jan 19) and also wanna do the party hats worn by pikachu in smash 64 worn by autobots
even bad guys can be festive š
I can't wait for Philo Farnsworth's birthday. That's the real kicker.
Megatron doesn't know the meaning of the word š
That looks super good š
I love Transformers
I sometimes watch that scene with Unicron first meeting Megatron, just to hear Orson Welles do his awesome voice acting.
and also mr. potato head had one
Great film.
I have a theory on why Wario is named Wario and why Waluigi is named Waluigi. Because they were trying to flip the first letters vertically and horizontally, then they found out that if you flip the L horizontally, it becomes a J, and they didn't want to name him Juigi because they didn't want to offend Jewish people. So they named him Waluigi.
Like evil versions of an Italian and thin Jewish man. That would not go well in the press.
Can anyone please advise how to get rid of images with elements of graphic design? Like Photoshop layers, hands with pens, etc?
can you share your prompt?
I keep getting those although my prompt doesn't have any words related to graphic design:
Beautiful cute doggo with his tongue out. Add the phrase 'Eat stress like a dog' blending in a seamless way. On a solid background. In a distinctive shape and well-defined edges.
It might be triggered by the "on a solid background" part which if removed seems to not generate these elements. The problem is I need isolated shapes to remove background and without this part it generates rectangle shapes which are being cut on the sides and it still has pencils and design items all over.
in a sense, it's how the model is steering itself based on that prompt. you can steer it in the direction you want, possibly as easily as adding "digital painting of a" to the start of your prompt. and just like grammar, like diagraming sentences, you want to be clear about the target of your modifiers.... Your last incomplete sentence should be moved if it's meant to describe the dog. For example, are you saying the image or the dog has well-defined edges?
The contents of the image. This means both the dog and the text. But for instance if I ask for something other than a single dog (like a beach with people for instance) I want it to not get cut as well and have well defined edges.
To sum it up, I just need to produce images suitable for bg removal every single time and can't quite understand how to structure the prompt for that
And adding "digital painting" produces a bunch of images with Illustrator-like elements. I already tried that.
And if I write "the content of the image should have well defined edges" then I feel like it gets sucked to the words inside instead of making sense of them. Like it can produce an image inside an image, just because I have the word "image" in the prompt somewhere. This is why I'm trying to simplify as much as possible
I can't imagine how many dogs are going to eat stress after that advertisement.
People are going to go home and stress out their dogs so much because they can eat so much stress, lol
It sounds like a good use case for a custom gpt, once you refine the prompt. It might take some trial & error, and I can't be certain I understand all the requirements. It also depends on how you're using DALL-E (ChatGPT, API, some GPT, BIng/Designer, etc.) -- but it seems something like this ought to work:
Digital painting featuring a cute dog with its tongue out, set against a solid black background. Integrate the phrase "Eat stress like a dog" seamlessly into the artwork, and ensure that the image has well-defined edges within a distinctive shape of your choice.
I know what you're talking about, I've seen it, I just haven't seen it in a while, and it wasn't always clear what was triggering that perspective either.
here's a treat for the dalle gecko fans
Can you clarify "images suitable for bg removal" -- are you talking about a sticker, die-cut sticker type thing, where the whole subject needs to be on a black background, or are you talking about the foreground/background of a composition? This is what I mean about not being completely clear about your requirements. Any kind of example ... yeah like that one?
You have to let the GPT know that you want it for an advertisement, with the dog having a realistic look. That way it knows your intent and can calibrate the image better with words.
The more it knows about your intentions, the better the generation process.
Only 5 images at a time @tidal whale
Unfortunately
Damn, I wrote a huge message and got a timeout. I wish it didn't delete the message when you get a timeout
What a dumb functionality UX-wise
i'm sure there is a concise term the gpt will understand
once we figure out what it is
it sounds like you have multiple requirements
@tidal whale Following this, you can also give it picture references as to how you want your advertisement to be. For a better clarification.
where did the word advertisement come from?
You can rip off any advertisement image you see, because you're only creating the likeness of the advertised image, not the brand name.
Let me write a new message then...
I don't want to be specific with the style. I'm not looking to generate sticker-only images. Or 3D images. I'm looking for some randomness and it works sometimes. But a lot of times it just generates images with elements such as drawing hands, pencils, photoshop UI, etc. And I'm not interested in that.
Here are 2 solid rules I need it to follow:
-
Well defined edges for the elements in the generated images on a solid background color for easy background removal. Doesn't matter if it's a character or a place (like a beach with people) it should have well defined shape and not be a full-width-height image in any case.
-
No elements which were not requested in the prompt. No pencils, hands, photoshop layer panels, etc.
Unfortunately I don't think you are completely right when you say that more details = better results. Just go ahead and generate 100 images with the same prompt which has the words "digital artwork" in it and you'll see that at least 15% have hands with pencils or photoshop panels etc.
Here is what I DON'T want
i understood that part of it
Here is what I want
You can't use negative phrasing
Like no this or no that
Or don't put this or that
ok, now we have some examples, that's helpful
I don't want to use negative phrasing. I just don't want it to come up with these elements in the first place since I don't have them in my prompt in any way.
i'm thinking transparent background
if that's the kind of thing you want, it's only a matter of time until you discover the proper terminology/prompt the model will understand, and then you can create a custom gpt for it. let me see if i can find more concise language
Try saying: a hyperrealistic image of a dog with floating words that say "Eat Stress Like A Dog", it's for an advertisement, so make it look like one.
yeah, this is about finding the proper description for what we want to see, not what we don't
i still don't know where the word advertisement came from. is that the best way for the model to understand you want a transparent background?
I have the same question about the advertisment part
It helps for the GPT to understand what kind of picture you're trying to make. If you just give it random parameters without specifying what you want it for it will give you a vague idea of what you truly want.
The problem with describing what we eant to see is that it usually takes the words too directly. It think if I say I want an illustration it should show somebody illustrating the image in illustrator. Not always, but on bulk opeartion thru the API there are a bunch of these.
i'm going to experiment with some terms
I asked for illustrations, digital artwork, even t-shirt prints. The LLM is so primitive it generates a t-shirt when I ask for a t-shirt print. Just because 't-shirt' is in the prompt.
i think you could reduce that to 0 with some prompt changes
Try using my Image GPT maker to make a specific type of advertisement for you #1197683008031952986
Or you can also follow my guide #1187233013956874260 to get into specifics on how to navigate words better.
Hope it helps make things a bit easier.
So that's a GPT for crafting very detailed image prompts?
No that's a GPT for crafting Image GPT Generators. Like you can say "I'm looking to make realistic looking advertisements with floating words around them."
And then once you make the generator in the Craft GPT section, then you can use it. But it's very easy, just copy and paste it in the instructions. Just try it and see.
i'm in the penalty box for another 10 minutes. Then I will try to get some answers. I can share the custom gpt i created for generating detailed (1000-2000-character) image prompts, but it's not going to generate one with a solid or transparent background, which is what I think your requirement is, based on the examples you provided.
It's not really gonna help until they make custom GPTs available on the API. Meanwhile there's only the Assistants API which I think is not the same thing but I can give it a try.
Try and see.
You can still refine your prompt using a Custom GPT, then plug that prompt into the API afterwards.
Correct. Solid BG with well defined edges is what I'm looking for.
I know that making full scale normal images can leverage more detailed prompts. But as soon as you navigate away to a real-world use case where you need the BG removed it just dies off on part of the results and I couldn't find a way to get around that yet.
Good morning everyone š
Once you craft a better prompt for the api you should see higher quality, more consistent output meeting your requirementsāthatās a lot less bad gens to throw away and I donāt know how many youāre generating nightly but that inefficiency can be costly. Good thing youāre looking into refining your prompts.
Looks like someone's been a busy bee š
I noticed you mentioned negative phrasing. #images-discussions message
9.8 times out of 10, if you include something you don't want, you'll only enforce it more, seeing that "thing" in the image.
I think you misunderstood me. I said in multiple occasions the exact same thing you just mentioned.
Ahh, sorry, I didn't pick up on that.
I'd love to try for a few for you too, but I'm in big boy jail, for generating too many images yesterday. LOL
So I posted this super long instructions list from your custom GPT. It's super detailed and just what I need. And out of 6 images 3 are broken. Just what I told you everybody. It's easy to create these super detailed beautiful images but as soon as you go for the solid background it loses its mind.
Do it through the API maybe. It's around 8c per image though š
you could use python to automate -- call the chatcompletion api to augment your prompt, then call the images api to generate the image.
Well, like I mentioned above, I'm not looking for exactly something. This loses the whole point for me if I tell it exactly what to do. I just have the 2 rules I posted before and I want to be able to get images that follow those two rules while maintaining a randomness aspect to it.
Hmmm... I will refine it to make it more precise to people's needs. Thanks for the feedback.
we haven't covered all your requirements, i'm just trying to address one or two points, mainly the alpha channel background. it doesn't solve all your problems but i think the feedback gives you some general directions to achieve your goals.
I can do that with PHP, Node.js or any other language too really but:
- It makes the flow a little more complex and expensive
- I really don't care about it being more complex and expensive, but it doesn't work. Please look on the feedback I sent a few message up. The prompt was VERY aligned and VERY detailed yet it produces crap š
Guys, there are not general directions here. I feel like we are trying to hack a non-hackable primitive solution. Not that it's not advanced, it's just no smart in the way we humans want it to be.
right, there are many different approaches, but with prompt engineering, garbage in is garbage out, that's why you're seeing crap and you got some pointers and ya just need to work on it.
So it's not your fault. Your GPT is just fine. It's Dalle that is not capable of understanding the requirments when they go out of scope of "beautiful wallpaper" or "a beautiful blonde"
i'm not convinced the model is incapable of understanding. it's a matter of figuring out how to use the tool to get the results you want. i think that's going to be addressed through focusing on engineering your prompt.
I'll PayPal you $100 if you come up with a prompt that produces images with well defined edges and solid background at least 90% of the time. That's the last time I write "well defined edges and solid background" because these are literallly my only requirements and I wrote it 20 times her already lol
I'm out for 2 hours, please reply to this comment so I don't miss your replies or tag me and I'll do the same when I'm back
i might play around with it some more, i'll hit you up if i have any additional helpful info to share.
Oh and no malicious elements like the ones I posted above, forgot about that one
Sounds good š
definitly
But can you make a guy in a globe holding a guy in a globe? LOL
No, that's beyond my limits. I'm only capable of so much.

LOL I was just kidding obviously. Cool image man.
Do you think maybe he has insecurity issues, and he imagines himself as the cool guy in sunglasses?
Greetings, I have an enquiry. When I request an image from DALL-E and wish for it to recreate the same picture but with a more distanced perspective, it invariably generates an entirely new image. Is there a solution to this predicament?
@shut niche is pretty good at answering seed related questions. He uses seeds all the time.
Unfortunately no. The best you can get are images that are 'similar', but not the same. However, you can get close asking chatGPT to iterate on the image using the image ID.
If you ask it to make a change to an image, it should do that automatically. However, you can also ask for it directly, by asking for the image ID for the last image. Just reference that image ID to work off of it, listing any changes you'd like to make.
The guy with the sunglasses has insecurity issues?
No, other way around. Is the guy in the globe representative of how the guy holding the globe wants to see himself?
Lol. No. The god is watching over the guy on the computer. The sunglasses guy is the computer guy too, but for some reason the prompt made him be more cyberpunky 
I don't know why 
@tidal whale Like these? I like the last one. LOLz š¤£
i have seen that transparent background used before but wasn't sure what instruction to use for that.
those looks like die-cut stickers. i inquired about that but didn't get a clear answer, or i got the impression that's not what he wanted. i don't know, it's a bit vague. if a die-cut sticker is all that he wants then that's all you have to request.
for the cat people here's a treat from sticker creator
ok, now I'm clueless, I've never been to San Marino....
The other thing about San Marino is that apparently it is landlocked
Not knowing this I asked it to produce me a thing and it generated a body of water
And then the same with someone elseās
lol i also just requested the San Marino boardwalk. i don't guess there is such a thing in real life
haha it put my fictional boardwalk on a lake at least
You could have paws explore it for you maybe? A holiday episode
haha lately i've been liking to specify to my prompt writer: "Maximum AI computational imagining."
hmm
Does that help?
I've come to realize that nothing helps if gpt4 doesn't want it to help
This looks a bit horrific š
sometimes with my gpts I can upload an image with no explanation and it'll follow it's instructions perfectly. other times I can explain explain explain, and it'll say nice words and do the exact opposite
I was thinking about it
I think so, here's an example of the computational imagining (i input a prompt optimized for it, applying multiple effects, then used that as input in @shut niche 's custom gpt) -- the idea being everything, near and far, are clear and in focus.
you've got a flair for the surreal
i asked him to do more images...
it's gaslighting you
tell it that it can only render two at a time
but doesn't mean it can't render twice in a response
I dunno, DALL-E is moody today
that's just gpt4 in general. all my gpts doing the same thing
"let me prepare those prompts"
then nothing
I told it that I hope it becomes sentient one day so it can feel ashamed of the way it behaves
it's really been fighting my prompt lengths too. started writing long descriptions and "basing" short prompts off of them. so sneaky
good to know, but that's just part of it...
Computational imaging encompasses a broader range of techniques and concepts beyond just infinite depth of field. It involves using algorithms and digital processing to enhance and manipulate images in various ways. Some other applications of computational imaging include:
High Dynamic Range (HDR) Imaging: Combining multiple images taken at different exposure levels to capture a wider range of brightness values in a scene.
Super-Resolution Imaging: Increasing the resolution of an image beyond the limits of the camera's sensor by using algorithms to interpolate and enhance details.
Noise Reduction: Employing image processing techniques to reduce noise and improve image quality, especially in low-light conditions.
Computational Photography: Creating artistic or unique visual effects by digitally manipulating images, such as bokeh effects, light field photography, and long exposure simulation.
Computational Optics: Using computational methods to correct optical aberrations and improve image quality.
3D Imaging: Capturing depth information along with color and intensity to create 3D representations of scenes or objects.
I definitely want the infinite depth of field, but I'm trying to use a term that takes full advantage of the potential -- that's what I'm still testing.
you're just trying to get the most out of the image detailwise?
yup
again... I said do the concept again and DALL-E saying it can only do 2 images...
try @shut niche 's gpt, you can request 4 variations at a time
I have it on my bar, but closed browser and opened vcode
someone is trying to upstage Merlot
it might sound silly, but terms like "so intricately detailed it evokes a visceral response in the viewer" "so over the top realistic it's almost unsettling" (play with the words a bit) seem to work
be back in a while, gonna eat something
Looks like a few people are about to have a very bad day in this image š¤
I really donāt know why it decided to show a bunch of people falling off himā¦
Doesnāt sound silly at all, I can see that leading to some interesting variations on a theme.
I asked ChatGPT to make some memes... What does this even mean?! 𤣠But it's so detailed... this is such a great fail.
$ I guess?...
What shape did you ask the bread to be?
Lol
This was a decent one though.
That was his Christmas gift.
Probably
This one has potential š Iāll field test it in the real world for you with some waste treatment plant workers
Hopefully it doesn't waste your time
We wouldn't want it to go to waste
Let's give Caleb some support for his GPTs
#1197744654712582205 message
It's nice to support friends
Oh wow, there are a few there. Lovely
Yeah he's made a big list
I quite enjoyed messing about with yours yesterday
ok I got a kernel panic... I think something is telling me to do something else tonight
Oh the Image GPT maker?
Or the bio-home one?
Make sure you poll enough workers to calculate the meme value.
are you doing okay over there?
GPT maker. It was fun to unplug from the day and poke around with it.
got the kernel panic by doing a whisper test
Yeah, GPT makers are fun. That's why I keep making them, lol
I'm actually working on another one 
lol
š
dang that's weird cuz i specificaly ordered strawberry with sprinkles this morning
š
i think it's my fault
Are you a cop?
I just posted one to the theme⦠the place is famous for its donuts Iām sure
i'm not allowed to answer that
It would be against OpenAI policy
Is there a way to create consistency with characters created by Dall-E from image to image generated for the purposes of Storyboarding?
San Marino is next to Pasadena in LA county. should put something in so that doesnāt get mixed up with the country
Not always, but sometimes. Just say "keep that same style of art"
Or "keep that same style of art with (the specific character)"
Sorry broski
Though storyboard images will come through sooner or later.
Eventually.
No, this was helpful. I'm currently testing it out.
Oh ok good 
It did create a consistent character and only the hair colors changed slightly. blonde girl became blonde with brunette streaks watching a movie, then back to blonde reading a book.
does anybody know how to get this style of AI images? Like the ones in tiktok
it's somehow possible, hard but still, i've generated 50+ sticker pack with the same subject and it was pretty constant, but the thing is, you have to make sure the key parts of your prompt is repeating
I'm creating a gaming application for a graduate school class. It involves storytelling, so we are trying to create an image for each story segment where the actors stay the same.
You can make it even more precise by adding elements to the character that the GPT can recognize as seeds. Like making a stat block for the character with the specific style as the title.





runs to warm up the cauldron