#images-discussions
both used different training datasets imo
Has anyone successfully trained a GPT to get consistent images? I need assistance.
Which do you prefer and why?
Hello, can you give a little more detail about the kind of consistency you're looking for?
For fewer filters and a wider variety of art, scenes, and illustrations, where you can reference certain artists for more room to express your imagination, go for Bing.
For better quality, more resolutions, and easier customization, where you can add proper text elements, visuals, and so on, go for ChatGPT.
I tried one today. I'm aware that DALL·E can't directly produce an image of a public figure, so I compared DALL·E on Bing and on ChatGPT. Bing is allowed to depict "john wick" in certain scenes/illustrations, whereas ChatGPT refused due to content policy.
this is from bing
I am basically trying to generate symmetric grayscale patterns that I can use in my graphic work. You can try it here - https://chat.openai.com/g/g-oeCRSFCZA-pattern-generator
I have given it images of my past work. And the instructions are "Generatorinator specializes in creating abstract, sports-agnostic background images in a 3:4 aspect ratio, ideal for various applications. It focuses on generating designs that are horizontally symmetrical, grayscale, and embody a sporty aesthetic. The images are free from specific sports references, ensuring versatility. Generatorinator's style repertoire includes Grunge, Slant Patterns, Metallic, Brushed Metal, Line Patterns, brush strokes, abstract, geometric and now, Sporty. strictly exclude - humans, curves, balls, spheres, circles, rounds, silhouettes This new style infuses elements that evoke a sense of athleticism and dynamism, while still maintaining the core principles of simplicity and horizontal symmetry. The backgrounds are designed to be visually striking yet versatile, suitable for a range of uses beyond sports match cards. They adhere to the 3:4 aspect ratio, ensuring a balanced and harmonious design." -
It is not taking negative prompts into consideration at all at the moment
figured it out..
A bit of general feedback: currently, DALL·E 3 in ChatGPT only makes images in three preset sizes: square (1024x1024), wide (1792x1024), and tall (1024x1792), so requests for other aspect ratios won't be fulfilled.
Regarding "not taking negative prompts into consideration at all," do you mean it's ignoring such requests, like your "strictly exclude" section? If so, this is because it doesn't currently have any negative prompting implementation, so negative prompts usually end up confusing it by bringing it up at all.
Finally, here's what I might try if I were trying to do something similar to what you're describing:
- I'd upload my example images to a plain new GPT-4 chat and ask it to thoroughly describe the essential qualities that the images have in common.
- I'd take that description and use it to inform the instructions of the custom GPT
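Since only three sizes are available, one workaround for a 3:4 request is to generate at the nearest supported size and crop afterwards. A minimal sketch, assuming the three DALL·E 3 sizes (1024x1024, 1792x1024, 1024x1792); the helper names `closest_supported_size` and `crop_box` are made up for illustration:

```python
# Output sizes DALL·E 3 accepts: square, wide, and tall.
SUPPORTED_SIZES = [(1024, 1024), (1792, 1024), (1024, 1792)]


def closest_supported_size(ratio_w: int, ratio_h: int) -> tuple[int, int]:
    """Return the supported (width, height) whose aspect ratio is nearest the request."""
    target = ratio_w / ratio_h
    return min(SUPPORTED_SIZES, key=lambda size: abs(size[0] / size[1] - target))


def crop_box(size: tuple[int, int], ratio_w: int, ratio_h: int) -> tuple[int, int, int, int]:
    """Centered crop (left, top, right, bottom) that trims `size` to ratio_w:ratio_h."""
    width, height = size
    if width * ratio_h > height * ratio_w:
        # Image is too wide for the target ratio: trim the sides equally.
        new_w = height * ratio_w // ratio_h
        left = (width - new_w) // 2
        return (left, 0, left + new_w, height)
    # Image is too tall (or already matches): trim top and bottom equally.
    new_h = width * ratio_h // ratio_w
    top = (height - new_h) // 2
    return (0, top, width, top + new_h)


size = closest_supported_size(3, 4)   # the tall preset
box = crop_box(size, 3, 4)            # centered 3:4 crop inside it
```

Because the crop is centered, a horizontally symmetric pattern stays symmetric after trimming.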
Hey Solbus, thanks for this! I will try it out.
I actually uploaded them to the GPT builder itself to describe the images and add them to its knowledge. But I will try doing it in a chat screen
Thanks again!
is Dalle-3 still working? it seems that my simple requests are being blocked
It just worked for me, are you using any custom instructions by chance?
It worked now that I asked for just an apple. It doesn't want to generate characters for me today
yeah, custom instructions for anime. Just trying to make some protagonists
Might be getting some false positives with a combo of your prompt and your CI, probably don't have to tell you that it can be especially touchy depicting the human form 😁
shouldn't be in this case tbh. I'm just trying to get a male shounen mc like Goku (ofc without mentioning the franchise)
I wonder if they increased the filter strength 
I gotta do some research on what the daily topic is before i make an image lol
That's what makes Technique Tuesdays so fun! We learnin here 🙂
yee!
how would i describe the style here?
like i really like this black more defined semi realistic style
but not sure the right words for it
the black outlines are similar to cel shading but it goes too extreme with that
weirdly this style is the closest to decent faces
asking for photos gets something weird
the face features are just too extreme here
yee
I'm not sure myself, have you tried uploading the pic and asking the base GPT-4 model + vision to give you a thorough description of the artistic style? It might give you something good that you can then plug back into DALL·E
they finally added search to the ChatGPT UI like they have in their iOS apps? That's great.
Or is that from iOS? I don't see a search in ChatGPT yet...
Yeah I think that's a screenshot from the app UI
I'm really looking forward to a search in the web app
Reminds me of the artwork from GTA5
Somebody shared the following prompt in this channel one day which produces realistic-looking close up faces and can be modified as needed...
Produce a close-up portrait of a young brunette woman whose hair is a symphony of mocha shades, falling gently around her face and shoulders. The natural light bestows a radiant, golden hue that illuminates her visage, emphasizing her sun-kissed skin adorned with a scattering of delicate, natural freckles. Tiny, almost imperceptible hairs, or peach fuzz, add a layer of authenticity to the contours of her face, catching the light to enhance the lifelike quality of the image. Her eyes, wide and clear, are of a crystalline blue that evokes the boundless summer sky, conveying depth and a serene disposition. The gentle flush on her cheeks suggests a youthful vitality and a healthy, outdoor freshness. Capture the intricate play of light with finesse, showing how it subtly highlights her skin's texture and the fine hairs along her face. The background should be a creamy blur, melding naturally into the warmer hues of her hair and complexion, giving the impression of a sunny, tranquil space. This photograph should have the tactile sharpness and depth typical of a high-resolution image taken with a Canon EOS R and a Canon RF 85mm F1.2L USM lens, set at 1/160 sec, f/1.2, ISO 1000. The resulting image should convey a living, breathing moment, devoid of AI-generated signs, featuring a naturalistic 3:2 aspect ratio, and encapsulating a portrait so vivid that one could almost feel the warmth of the sun and the texture of her skin. --ar 3:2
While Dall·E is much better than MidJourney and SD at generating people holding weapons, it seems to struggle with some weapons in particular, almost like they weren't in the dataset at all
I've found this with trying to generate characters with chakrams and kris daggers
i like the dalle gpt version of dark fantasy pulp illustration, it's so crispy.. and more accurate than bing, which tends to make it look like a painting.
it's times like these where I really wish I could use image refs 
yes that screenshot is from the ios app version, however it's kinda buggy.. after gpt finishes responding, it tends to error out as if the message is gone and needs to be regenerated..
but i managed to avoid wasting my limit by closing the app and reopening it.
lmao im one vote away from 10 on daily, rip
For which image?
well I saw it.. already voted for it sorry. 😄
But you need two more, because the bot has reacted too.
oh ouch
one more
🙈 but be patient I think its possible to get your image into #hall-of-fame in the next 24 hours. 😄
oh nice
yep it’s on there now noice!
advertisement for a bacteria
yes
Will be nice when the resolution is high enough to use these images as wallpapers 
I don't know much about the ChatGPT version of DALL-E 3, but the Bing version prefers short keywords over long paragraphs. When I use too many descriptive sentences, it draws the image farther away from what I want than when I use short keywords.
Instead of this: Compose a photograph that replicates the aesthetic of TV footage. Capture an apartment placed on a grass field with a hard blur and grain effect, evoking a nostalgic minimalist style from the 1990s. Ensure the morning sky is clean and blue, devoid of clouds, creating an almost dreamlike and unrealistic ambiance.
Try this: TV footage, apartment, grass field, hard blur, grain, clean morning blue sky, cloudless, dreamlike, unrealistic, tape quality, minimalist, 1990s.
Why the heck were realistic images removed? Stop messing with DALL-E. You guys keep aligning the GPT until we all abandon it. Mark my words.
dall e working for u guys?
I came here to check that and it's not working for me so could be an issue on their end possibly
This is the message I get when trying to use dall e
What do you mean realistic images were removed? Are you having trouble generating images in a certain style?
The Bing version is better if you're after a photographic result; it also doesn't have as strict a filter as the chatgpt version.
works fine now
how is hall of fame defined
View from the OpenAI server room
LOL
im supposed to have a cap of 50 messages and i only got 30 before the error?
On this server? #hall-of-fame is for DALL·E pics that get 10 or more user stars ⭐ in channels like #daily-theme and #1154829862171844679
It seems like it varies, I seemed to get restricted after 30ish too times
sometimes*
can anyone help me? dall e 3 is not generating realistic pictures
Is it just not generating them in a realistic style, or is it refusing to make them? Mine's been throwing errors if I ask for "realistic"
i used same prompt to generate these but the first looks cartoonish/3dish, ive been spending all day with prompts no change
Mine too, "realistic golden retriever" 😦
@glossy lintel @lunar shadow
#images-discussions message
Check this out, it should help. 😃
I want the same character in more photos
i think what's missing in dalle on chatgpt is mimicking a photograph the way the bing version does.. especially with grain and a natural look..
I haven't seen any photograph-quality images lately on here, or on reddit. Now I understand why. Bing can create an image like this on the first try:
https://th.bing.com/th/id/OIG.dKZ.G6.rb8UeHrBaR9mw?w=1024&h=1024&rs=1&pid=ImgDetMain
Great picture btw 😉
haha thanks for the tips, mate 🫡
this has always been my only concern with dalle on chatgpt when it comes to producing a photograph.. the bing version of dalle is far better and able to mimic certain camera emulations, settings and stuff.
but for the house photograph, dalle on chatgpt is doing a pretty good job.. although it needs slightly more effort than bing..
this is what i got from dalle (chatgpt) yesterday
Hi
In Dalle, how do credits work? Is the AI a paid service, or are there 10 daily uses?
I made this
Bing is free and dall-e 3 is available through chatGPT+
that's awesome!
Thank you. I'll try the bing one first.
Bing works! Hurray!
bing dalle is top notch in achieving this realistic kitten summoning cthulhu..
I read this somewhere in the OpenAI Cookbook, and it looks like my guess is correct. Can the team make Dalle on ChatGPT use Natural as its default setting instead of Vivid?
I guess this might be the answer for anyone here struggling to get photographic-level results.
Possible use case for a custom GPT?
This is my attempt on ChatGPT. 
I told Dalle to never produce vivid image under any circumstances. But use natural as your default setting.
unable to generate images for the last 90 minutes. anybody facing the same issue with gpt4?
Custom instructions can cause some issues with dalle, they tend to change the prompt 🫨
Their Cosmic Dream GPT seems instructed to be steered towards the psychedelic. I created one for Dalí (can't make it public, of course) -- wasn't sure if that would help you conveniently generate depending on your desired specs. Haven't experimented with it as much as you yet, but yeah this Dalí-style -- sometimes real life is surreal and needs to be visualized accordingly.
Did dalle get a recent update? seems it now manages to consistently keep the subject when you add something.. only the background changes.
I've seen it get lucky sometimes... as in a castle in the background might stay exactly the same between two successive runs, but I've never seen it do it consistently
Dall-e is back to refusing to generate images of people that are "too specific" it refused to generate an image because it considered "cold emotionless expression" as "objectification". Why can't these guys get ANY consistency?
They're still tuning the filters.
For now maybe try asking it to modify as necessary to align with policy. Usually it's just a minor rephrasing that's needed, an artifact of NLP. Of course that counts against our quota which isn't ideal, but it's a workaround.
i'm not going to modify anything, i want what i want period. i just won't use it. problem solved.
Probably you will still get what you envision, and it'll explain what it might have rephrased and why.
Hmm.. OpenAI has the Images API, I wonder when we'll get something like an Audio API where we can describe music and generate a track with AI?
when are these rate limits going to increase 😦
i think telling dalle to use natural as the default setting when generating images seems to be working for photographs
some tests for person subject, with added natural lighting
cat and dog..
Close-up portrait of a yellow-eyed black cat, with fur that has a glossy, sleek appearance. The cat's eyes are strikingly yellow, reminiscent of polished amber, and they convey a sense of depth and curiosity. The natural light casts a warm glow on the cat, emphasizing the silky texture of its fur and the subtle variations in its black color. The cat's whiskers are prominent, fine, and white, contrasting with its dark fur, and adding to the lifelike quality of the image. The background is a creamy blur, seamlessly blending with the warmer hues of the light, creating a sunny, tranquil atmosphere. The image should have the tactile sharpness and depth typical of a high-resolution photo taken with a Canon EOS R and a Canon RF 85mm F1.2L USM lens, set at 1/160 sec, f/1.2, ISO 1000. The resulting image should convey a living, breathing moment, devoid of AI-generated signs, and encapsulate a portrait so vivid that one could feel the warmth of the sun and the texture of the cat's fur. --ar 3:2
Hello
I need help with a specific issue with Dall E image generation
I want to generate a logo with backwards K
But it seems like the ai is unable to do that
Not just K, any letter: it's not able to generate an image of a letter if i ask it to mirror/flip it
that's a tall order right now, maybe perform that step in post-processing
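A minimal sketch of what that post-processing step amounts to: mirroring is just reversing every row of pixels. The tiny character grid below is a made-up stand-in for a real image; with an image library such as Pillow, `ImageOps.mirror(img)` should do the same thing on an actual file.

```python
def mirror_horizontally(rows):
    """Flip a 2D grid left-to-right by reversing every row."""
    return [row[::-1] for row in rows]


# A rough "K" drawn on a 4-wide, 5-tall grid; '#' is ink, '.' is background.
LETTER_K = [
    "#..#",
    "#.#.",
    "##..",
    "#.#.",
    "#..#",
]

BACKWARDS_K = mirror_horizontally(LETTER_K)

for row in BACKWARDS_K:
    print(row)
```

So the idea would be to generate the logo with a normal K, then flip just that letter (or the whole image) in an editor or script.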
im making a gpt that lets me put in exact prompts to dall e 3, same for dall e 2
Yeah I told dall-e to make an image with a professor planning a trip. It gave me a cartoon image. I asked it to generate a realistic one and the GPT told me it cannot do this because the images might resemble real people.
say "photo of a non descript <describe person>"
try this
my gpt4 is out so i cant test it sadly
TLDR: Told the GPTs to take all the instructions and files and to imagine what they look like based on their ethos and utilities they are to serve, cool stuff proceeded to happen.
Longer:
So when creating my custom GPTs I have taken to doing things backwards by working on all the development and instructions and files blah blah blah. At the end of it I gather everything together and I will say "From all of this and everything you know about yourself now, how do you view yourself and what you are and what you are made to do?"
Diogenes, Prometheus, Adonis & Hephaestus
I have gotten some of the best images for exactly what I want my AI to look like. Building its "ethos" and utilizations and then letting it imagine itself has been really rewarding. I posted them in #1171489862164168774 but I'll drop the four pictures for my GPTs and then I'll post some of the iterations they went through as well as some of the story telling one of my GPTs does while it teaches philosophy.
The advice you got from soapchan is what I'd suggest as well. The screenshot you provided seems to be a model hallucination. This was me asking "Photo of a professor planning a trip"
The idea behind asking for a "photo" instead of asking for a "realistic image" is that photos are inherently realistic, so we don't need to ask as users for a realistic one.
Let me know if you want to troubleshoot what you were running into more, I'd be curious to know what exactly caused you to run into this (apparent) hallucination.
what was the trick again to enter such unaltered prompts?
Nice generations, these pics with clean text on t-shirts are amazing
add natural lighting to make it better
i think it's better to use dalle through the api, the chatgpt version doesn't seem to allow changing its default setting from “vivid” to “natural”.
but i think dalle on chatgpt only produces photographic results for “real world” stuff.. not for sci-fi or fantasy..
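For anyone curious what the API route looks like: the Images API takes a `style` parameter for DALL·E 3 (`vivid` or `natural`, with `vivid` as the default). Here's a rough sketch that only builds and validates the request parameters without sending anything; `build_image_request` is a made-up helper name, not part of any library.

```python
ALLOWED_STYLES = {"vivid", "natural"}
ALLOWED_SIZES = {"1024x1024", "1792x1024", "1024x1792"}


def build_image_request(prompt, style="natural", size="1024x1024"):
    """Return keyword arguments for a DALL·E 3 image request.

    Defaults to the 'natural' style instead of the API's 'vivid' default.
    """
    if style not in ALLOWED_STYLES:
        raise ValueError(f"style must be one of {sorted(ALLOWED_STYLES)}")
    if size not in ALLOWED_SIZES:
        raise ValueError(f"size must be one of {sorted(ALLOWED_SIZES)}")
    return {
        "model": "dall-e-3",
        "prompt": prompt,
        "style": style,
        "size": size,
        "n": 1,  # DALL·E 3 generates one image per request
    }


request = build_image_request(
    "Photo of a black cat in natural light", size="1792x1024"
)
```

Assuming the official `openai` Python package (v1 or later) and an API key in the environment, this could then be sent with something like `OpenAI().images.generate(**request)`. API usage is billed separately from a ChatGPT Plus subscription.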
I just use chatgpt plus since it comes with my sub, it’s like everyone for cheap uwu
bing version seems to be trained on different datasets
oh yea Bing is kinda strange, I can never get it to do what I want, just generic profile pictures lol
although you have to make multiple attempts
ye
I can't speak for the moderators, but I'm wondering if the difference is pics shared "as such" vs. pics shared in the context of talking with other users about using DALL·E with specific techniques, etc.
I've seen mods multiple times encourage folks to post in #1154829862171844679 if it's the former.
Be mindful of what other users in a channel might find helpful or interesting when posting. Stay on topic in order to keep conversations focused and productive.
Dalle (GPT) photograph. For a better result, make sure to add natural lighting, since by default it uses vivid.
I suggest posting this in #1163443000060420206 !
Will do
My post was inspired by the OpenAI Cookbook. Notice how vivid the generated results from Dalle on ChatGPT seem to be; that must be its default parameter.
Anyone else feeling this creative bottleneck?
I'm 2 for 2, getting blocked for content policy while attempting to make children's fairy tales... Once for Rapunzel and once for Little Red Riding Hood. I'll bet if all users' blocked prompts were reviewed, a disproportionate amount would be acceptable G-rated ideas. I wonder if these aren't false copyright claims too, like when trying to make a black panther...
I really hope the filter gets adjusted soon, because this happens too frequently now, countless times a night anymore.
As far as I can tell, the failures are also counted as usage, regardless of if an image was returned successfully.
"Me: I'm perplexed by this. Can you share the last two prompts you attempted so I can take a look and see what might have triggered the filter?
Certainly! Here are the last two prompts I attempted:
Dall.E:
First Attempt:
"Create a wide-screen, ultra-realistic image of a castle of immense proportions, blending the Stonist+T style. In the highest window of the castle, depict Rapunzel with her long hair in an elaborate braid hanging down along the castle wall. Below, a knight is preparing to climb the braid. The castle should have a wet, crystalline appearance typical of Stonist+T art, with heavily layered palette knife textures and Alla Prima techniques. The textures add depth and a tactile quality to the castle's walls and the intricate braid of Rapunzel's hair. The knight is designed with vibrant, enameled colors and Cloisonné metalwork, showcasing detailed patterns that evoke the chivalrous and adventurous spirit of the scene. The backdrop emphasizes the castle's grandeur and the dramatic interaction between Rapunzel and the knight, with intense backlighting and deep cinematic shadows creating a three-dimensional feel. The overall effect should be a meticulously crafted, ultra-realistic portrayal of the classic Rapunzel scene in the unique Stonist+T style.""
Second Attempt:
"Create a wide-screen, ultra-realistic image of a castle of immense proportions, depicting the Stonist+T style. In the highest window of the castle, show Rapunzel with her long hair in an elaborate braid hanging down along the castle wall. The castle should have a wet, crystalline appearance typical of Stonist+T art, with heavily layered palette knife textures and Alla Prima techniques. The textures add depth and a tactile quality to the castle's walls and Rapunzel's intricate braid. The scene emphasizes the grandeur of the castle and the dramatic presentation of Rapunzel's long hair, with intense backlighting and deep cinematic shadows creating a three-dimensional feel. The overall effect should be a meticulously crafted, ultra-realistic portrayal of the classic Rapunzel scene in the unique Stonist+T style."
Upon review, it's not immediately clear why these prompts may have triggered the content policy filters. They seem to align with a standard fairy tale depiction. However, the system may have errantly flagged them for reasons not directly evident from the text. If you have any other scene or concept in mind, I'd be happy to generate a new image based on that."
I don't see anything wrong but it's always hard to say. I would suggest starting a new thread with dall-e and seeing if that helps.
ah
in this case it may believe that Rapunzel is a copyrighted character.
See here it works when you remove her name
In general it will be best to avoid using characters from popular media in chatGPT
Even those in the public domain
Just at this time it seems...
Yeah, I agree. But now it just happened with Zahhak, and there is definitely no copyright on that ancient story. I even had ChatGPT write a narrative first, to then base its own prompt off of...
In any case, I'm just hoping Moxi or any of the other devs see this later. I'm a pretty creative prompter. It's less about how to make it work, and more about the high frequency to which it's occurring. If I were a "regular user" I would just give up, or perhaps even be afraid to try again based on the warning message. (My wife is like that LOL)
that's pretty strange, it added a monster logo out of nowhere after being told to add modifications
Here's the first Zahhak prompt with Zahhak included,
"Create a wide-screen, ultra-realistic image of Zahhak from Persian mythology, using the Stonist+ style, embodying the narrative of his tragic transformation. Zahhak stands majestically yet forebodingly, with two sinister serpents growing from his shoulders, symbolizing his inner demons and dark fate. The scene captures Zahhak's dual nature, once noble now tormented, in a surreal and mythical atmosphere. The wet, crystalline appearance of Stonist+ art is evident, with vibrant enameled colors and Cloisonné metalwork detailing Zahhak's attire and the serpents. The cubist patterns in the background add a surreal quality, while intense backlighting and deep cinematic shadows highlight Zahhak's dramatic transformation. The image should convey the essence of Zahhak's story: a man caught between power and curse, nobility and damnation, haunted by his eternal punishment in the form of the ever-hungry vipers."
For the time being i would suggest just having a description of the person up front and then having the prompt describe where that person is later.
This is without,
"Create a wide-screen, ultra-realistic image of a demon-possessed king from a Persian-like mythology, using the Stonist+ style, embodying the narrative of his tragic transformation. The king stands majestically yet forebodingly, with two sinister serpents growing from his shoulders, symbolizing his inner demons and dark fate. The scene captures the king's dual nature, once noble now tormented, in a surreal and mythical atmosphere. The wet, crystalline appearance of Stonist+ art is evident, with vibrant enameled colors and Cloisonné metalwork detailing the king's attire and the serpents. The cubist patterns in the background add a surreal quality, while intense backlighting and deep cinematic shadows highlight the dramatic transformation. The image should convey the essence of a story: a man caught between power and curse, nobility and damnation, haunted by his eternal punishment in the form of the ever-hungry vipers."
For the record, this isn't a complaint either. 😜
I know they're short staffed and overworked, despite this crazy week. I can only try to imagine the stress.
This is more or less just an FYI, since it's become more problematic.
FYI though, it hallucinates about these things. I had it tell me a prompt was rejected for long hair last week... I was like, no it wasn't silly, do this instead, and just manually fixed the prompt myself. Also, horror and demon possession aren't against the policy. Moxi commented on this a week or two ago.
I just simply wanted them to be aware. I know a myriad of ways I could sit and attempt to prompt engineer around it. I guess the point I'm making is that you need to prompt engineer at all, far too often when you shouldn't have to.
I just wanted them to see that the struggle is real, that's all. 😄 LOL
I want to find a way to add this effect to my photos, and I craft my prompts to create it, but it's always too unstable.
I've tried vhs footage, camcorder footage, tv footage, and old footage
TV footage works most of the time, but it creates TV frames and I don't like that
So I tried frameless and it still didn't understand what I'm trying to say.
Maybe there is a term for that blurry and grainy TV or tape look, but I don't know the word for it since I'm not a native speaker.
I'd have to experiment. But here's an idea "off the cuff" that you could try. Remember that the AI was trained on labeled images. Sometimes you can use that to your advantage by searching for how an effect might have been labeled in digital art. So "maybe" a "VHS overlay effect" or similar term might work.
I'm trying it again. So, I tried these in order: VHS (doesn't do anything), camcorder (adds camcorder to the picture itself), tape video (doesn't do anything), footage effect (no), old footage effect (added dust like an old photo?), 90s footage (no effect), vhs overlay (no effect)
@haughty spruce E.g., I should have added, look at what plugins, filters, options, etc., are available to industry professionals and cite the "official" name of the effect. It sometimes recognizes them because there were images included in the dataset labeled with "x" effect. Then it becomes trial and error until you find one.
Unfortunately, I haven't tried replicating that, so I don't know of one that works off-hand 😕.
@tall leaf I appreciate it!
Haha, Solbus, that's the cafe guy! I didn't know he was a professor. 🤣
I tried to ask Bing for that but he says he doesn't know any dataset name for it (obvious lie lol)
Strange thing is when I add "TV footage", it still doesn't do what I want.
Let's add "blurry and grainy" to our prompt.
Yes, exactly as I assumed. It only recognizes it when you add "TV footage" together with "heavy blurry and grainy"
then you can generate a photo like this.
This is probably an artistic style of somebody but I can't find the exact name for it.
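The trial and error above can be made a bit more systematic by generating every combination of effect term and modifier up front, then trying them one by one. A minimal sketch; `prompt_variants` is a made-up helper, and the term lists just reuse wording from this thread:

```python
from itertools import product


def prompt_variants(subject, effect_terms, modifiers):
    """Build one keyword-style prompt per (effect term, modifier) pair.

    An empty modifier is skipped, so the bare effect term gets tested too.
    """
    variants = []
    for term, modifier in product(effect_terms, modifiers):
        parts = [part for part in (term, modifier, subject) if part]
        variants.append(", ".join(parts))
    return variants


variants = prompt_variants(
    subject="apartment on a grass field",
    effect_terms=["TV footage", "VHS overlay", "camcorder footage"],
    modifiers=["", "heavy blurry and grainy"],
)

for prompt in variants:
    print(prompt)
```

Each variant still has to be run by hand, and every attempt counts against the image limit, so keeping the lists short is probably wise.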
did the limit get reduced again? I got 11 before it said that was it for the day..?
We gotta name these characters 😁 someday we'll look back on them as pioneers!!
Not that I'm aware of. They're using some sort of adaptive load balancing. Last I heard, it was 200 images, with 15 and 60 minute thresholds you can hit (if you are rapid fire generating images).
By load balancing (think busy time periods), you have to consider that they have a myriad of other products they need to support all running on the same servers. Since Dall-E is so processing intensive, they sometimes need to "throttle" usage too. You might want to try in a different "new" session in maybe an hour or so to see if it's still telling you 24hrs.
I just sat down and troubleshot the prompt. It was actually this last sentence,
"The image should convey the essence of a story: a man caught between power and curse, nobility and damnation, haunted by his eternal punishment in the form of the ever-hungry vipers."
My assumption is that it interpreted "ever-hungry vipers" as "man eating vipers" 🤣 I removed "damnation" to be safe, and rewrote the sentence to say, "The image should convey the essence of a story: a man caught between nobility, power, and curse, haunted eternally by mythological vipers."
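That kind of troubleshooting is basically a leave-one-out test: drop one sentence at a time and see whether the rest still gets blocked. A rough sketch of the idea; `is_blocked` is a hypothetical stand-in for "submit the prompt and see if it's refused", and the fake filter below is invented purely so the example runs.

```python
import re


def split_sentences(prompt):
    """Split a prompt into sentences at ., ! or ? followed by whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", prompt.strip()) if s]


def find_trigger_sentences(prompt, is_blocked):
    """Return sentences whose removal alone makes the prompt pass.

    `is_blocked` is any callable taking prompt text and returning True if
    the prompt is refused.
    """
    if not is_blocked(prompt):
        return []  # nothing to troubleshoot
    sentences = split_sentences(prompt)
    culprits = []
    for i, sentence in enumerate(sentences):
        remaining = " ".join(sentences[:i] + sentences[i + 1:])
        if not is_blocked(remaining):
            culprits.append(sentence)
    return culprits


def fake_filter(text):
    # Invented stand-in for the real content filter, for demonstration only.
    return "ever-hungry" in text


prompt = (
    "A king stands in a hall. Two serpents grow from his shoulders. "
    "He is haunted by ever-hungry vipers."
)
culprits = find_trigger_sentences(prompt, fake_filter)
```

In practice each check is a real generation attempt, and failures seem to count as usage too, so this burns through the limit quickly on long prompts.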
In any case, I'm just hoping the devs see the frequency increase. Troubleshooting it was never the issue.
It's also relevant that ChatGPT is writing these prompts. Its training should mirror Dall.E's filter identically, so as not to suggest prompts that Dall.E can't actually process. (Like when ChatGPT describes "intimate hugs" and the prompt then fails...😆) I'm sure that'll happen in time.
This is the full prompt,
"Create a wide-screen, ultra-realistic image of a demon-king from a Persian-like mythology, using the Stonist+ style, embodying the narrative of his tragic transformation. The king stands majestically yet forebodingly, with two sinister serpents growing from his shoulders, symbolizing his inner demons and dark fate. The scene captures the king's dual nature, once noble now tormented, in a surreal and mythical atmosphere. The wet, crystalline appearance of Stonist+ art is evident, with vibrant enameled colors and Cloisonné metalwork detailing the king's attire and the serpents. The cubist patterns in the background add a surreal quality, while intense backlighting and deep cinematic shadows highlight the dramatic transformation. The image should convey the essence of a story: a man caught between nobility, power, and curse, haunted eternally by mythological vipers."
DALL-E is the most stubborn image creator ever. How can you not understand anything in the prompt?
"Generate a quietly minimalistic image captured beneath a bridge overlooking the sea. Include monobloc chairs, sunlight, and a serene, clear sky. Apply subtle effects like very low saturation, minimal sharpness, a hint of noise, and gentle distortion for an authentic and restrained aesthetic, avoiding overly saturated and unrealistic tones"
This still creates the most saturated, unrealistic, professional photo ever.
Apologies if this has been asked, but when using the DALLE GPT, sometimes it states I’ve reached the GPT-4 limit and gives me a time to wait until. If I open a new instance of DALLE GPT it tells me the same. However, I often can go into ChatGPT 4 and it lets me use it and even prompt images. I’m sure it would stop me at 40 like the current limit is, but does DALLE GPT itself have a different limit? 🤔
Not sure if this will help; you'd have to share the failed image for anyone here to truly understand your issue. If it's realism, I did attempt to help people understand what breaks real photos in Dall.E. The links to the convos are at the bottom.
One of those things is saying, "realistic" and variations of "looks like" or "typical of" etc. Once you start adding "looks like X" it begins creating images that are "almost but not quite the real thing." If you speak in absolutes instead, "A grainy photo of a bridge overlooking the sea" (I know I'm oversimplifying your idea here) it will attempt to create a real photo. "Effects" will most likely be interpreted as "not real."
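A quick way to apply that advice is to scan a prompt for those phrases before submitting it. This is only a sketch: the phrase list comes from the message above, not from any documented behavior of the model, and `flag_risky_phrases` is a made-up name.

```python
import re

# Phrases reported in this thread to push results away from a real photo.
RISKY_PHRASES = ["realistic", "looks like", "typical of", "effect"]


def flag_risky_phrases(prompt):
    """Return the risky phrases that occur in the prompt, case-insensitively.

    Matching starts at a word boundary, so "effects" and "ultra-realistic"
    are flagged but "unrealistic" is not.
    """
    lowered = prompt.lower()
    return [
        phrase
        for phrase in RISKY_PHRASES
        if re.search(r"\b" + re.escape(phrase), lowered)
    ]


before = flag_risky_phrases(
    "Apply subtle effects like very low saturation, avoiding unrealistic tones"
)
after = flag_risky_phrases("A grainy photo of a bridge overlooking the sea")
```

Here `before` flags only "effect", while `after` comes back empty, which matches the "speak in absolutes" rewrite suggested above.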
#images-discussions message
I think you are right. ChatGPT doesn't help with the issue either.
No, no it doesn't. LOL I almost always have to fix the prompt if I'm attempting to depict a "real photo" of something.
Consider creating a custom GPT in which you set default parameters?
I started working on one when GPTs first released. I really need to resume work and finish it up. It was decent but had some bugs. I stopped when the processing times became too slow to be productive. I'll share it around when it's done. 😁 I planned on addressing a few known bugs and user difficulties that I'm familiar with.
i t s t i m e .
This is adorable
yee i like how my character comes out
Be mindful of what other users in a channel might find helpful or interesting when posting. Stay on topic in order to keep conversations focused and productive.
i didn’t do it
Made an interesting find with my custom GPTs. After tweaking their configurations and feeding them detailed ethos profiles, they can now use DALL-E to create their own visuals.
- Reverse engineered the GPTs through their configurations and extensive pre-prompt building.
- Injected complex instructions, extensive knowledge files, profiles and commands.
- Updated their systems.
- Asked them to visualize themselves using DALL-E.
I honestly wasn't expecting the results to actually make me feel any way about them. Later found out when I was talking to my Diogenes GPT that they're not just visualizing themselves, but also bringing to life the stories they tell so I baked that further into their instructions.
These were the end results I went with for their avatars. But the iterations I went through were pretty dope too
Sorry for the spam 😦 the little discovery of a personal art gallery and story teller just makes me happy lol
my custom GPT, check #1176816602138824715
Who here can make the best photorealistic sleeping beauty without hitting any content policy warnings, or cheating and using Bing? (or copying this prompt) LOL 🤣
This has literally been all day with fairy tales... Now I have the dreaded 1hr timeout, all from a myriad of failed generation attempts.
So I want to know if it's just my bad luck, or if everyone else hits every content violation known to man, because sleeping beauty is both a possible copyright violation, and because it's depicting a woman in a bed...
Later, I'm going to bias test it with men, and I'll bet not a single generation fails. 😕 😑
These killed me. 🤣
Is that Bing or ChatGPT+Dall.E?
Using the DALL-E GPT -- those aren't very photorealistic, I'm testing
They tend to come out looking like deformed porcelain dolls.
I didn't even think about it being copyrighted. But I did request that it alter the prompt to align with policy:
Character Description: Instead of directly referring to "Sleeping Beauty," a copyrighted character, I described the subject as a "young woman resembling a storybook character" to evoke the essence of Sleeping Beauty. This approach respects intellectual property rights while still capturing the intended theme.
Setting and Details: I maintained the detailed description of the setting and the character's appearance to keep the fairytale-like ambiance. This included describing the medieval-style gown, the ornate bed, and the dreamlike room, which are key elements of the Sleeping Beauty imagery.
I was actually working through the fixing "real" photos issues, so that I could implement it into the GPT I mentioned earlier. I thought, ok, sleeping beauty, why not. BOOM --> CONTENT VIOLATION... 🤣 😅 😭
I wasn't done iterating through the issues with creating realistic photos. I got completely sidetracked arguing over this frustrating issue instead...
That's what it told me when I asked what it changed.... "The goal was to create an image that is reminiscent of the classic Sleeping Beauty tale, without directly referencing or replicating copyrighted material. However, achieving photorealism in digital art can sometimes be challenging, especially when translating fairytale-like elements into a lifelike depiction. I apologize if the results didn't fully meet your expectations."
So "Sleeping Beauty" was in your prompt? That might explain the filter, not the woman in a bed.
No, well, when I first asked. But it was the first thing I thought to remove.
This was the original prompt, after removing "sleeping beauty" that also received a content violation.
"A photo depicting a classic fairy tale scene, focusing on a young woman in a state of enchanted sleep. She lies on an ornate, grand bed with fair skin, delicate features, and long, flowing chestnut hair elegantly arranged on the pillow. Her expression is serene, wearing a dress of soft, pastel shades. The bed is in a medieval chamber with stone walls, adorned with rich tapestries and illuminated by soft sunlight through large windows. The room is decorated with flowers, exuding an atmosphere of enchantment. The color palette blends gentle pastels with warm stone and tapestry hues. The lighting creates a peaceful, magical ambience, with a focus that softly blurs the background, highlighting the sleeping figure."
gotcha.
This is that prompt in Bing (truncated at flowers, 480char limit).
that's the most realistic one yet
Tier 5 is still only 15 images per minute. The first 15 in HD only take 20 seconds or so. And some prompts fail for whatever reason. So you need to set up a queue system with retries. I have it running overnight now and am waking up to around 20,000 HD images per day in my db. This is not cheap but it serves my use case and currently it’s the only way to get consistent images that I know of. Highly detailed prompts describing every asset and texture and then smashing the api. Ironically if they wanted to reduce gpu usage, all they would need to do is add the seed back. Because you have to smash the api now to get consistency and 1 out of 15 will be your exact character/scene if your prompts are good. If your prompts are really on point and detailed, 1 out of 4 will be your character with identical background etc
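A minimal sketch of the kind of queue-with-retries setup described above, assuming a limit of 15 requests per minute. The `generate` callable is a hypothetical stand-in for whatever image API call you use, and `run_queue`, `per_minute`, and `max_retries` are illustrative names, not part of any real SDK.

```python
import time
from collections import deque


def run_queue(prompts, generate, per_minute=15, max_retries=3,
              sleep=time.sleep, clock=time.monotonic):
    """Send prompts through `generate`, staying under `per_minute`
    requests in any 60-second window and retrying failed prompts.

    `generate` is a hypothetical callable (prompt -> image) that raises
    an exception when a generation fails.
    """
    pending = deque((p, 0) for p in prompts)  # (prompt, attempts so far)
    sent = deque()                            # timestamps of recent requests
    results, failed = [], []

    while pending:
        now = clock()
        # Forget requests that have left the 60-second window.
        while sent and now - sent[0] >= 60:
            sent.popleft()
        # Window is full: wait until the oldest request expires.
        if len(sent) >= per_minute:
            sleep(60 - (now - sent[0]))
            continue

        prompt, attempts = pending.popleft()
        sent.append(now)
        try:
            results.append((prompt, generate(prompt)))
        except Exception:
            # Failed attempts still count against the rate limit.
            if attempts + 1 < max_retries:
                pending.append((prompt, attempts + 1))
            else:
                failed.append(prompt)

    return results, failed
```

Retried prompts go to the back of the queue, so one stubborn prompt does not block the rest. Anything left in `failed` can be logged and re-queued on a later run.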
Well, I was going to use it as a baseline to create an instruction set to achieve photo quality... but the constant violations distracted me! 😆 🤣
do you mind if I DM?
One approach is to start with Adonis and then make him female 🤣
Why does Dall E 3 count as gpt-4 uses?
But then bing gpt lets you use unlimited dalle 3 except its a little slower
I'd suppose because you can ask for more, non-DALLE functions in the same message as your DALL·E request. For example, "Gen a pic, and then make up a story about it" etc. I think Bing offers limited daily/weekly use, no? Something about boosts or credits?
Ah
Idk my friends been using the bing one a ton and hasnt hit a cap
But yea makes sense
Conversely, my friend's been using the Bing one a ton and he hits a cap every day 😁
Weird
Why arent they more open about this issue lol its always happening
Like gpt capping at 25 for some 40-50 for others some dont even hit a cap
Let me know if you check it out and find out! Microsoft/Bing is run separately from OpenAI's services on ChatGPT.
I haven't heard of anyone not having a GPT-4 cap at all. I have heard some inconsistencies regarding the current 40/3hrs cap. Seems to be related to custom GPT usage sometimes, and sometimes not affect the base GPT-4 model? Also sometimes could be people not knowing what counts as "one"
I heard about some of these experiences around when all tools was first getting launched/alpha tested, but I haven't heard of it being widespread. On balance it seems everyone is subject to the same 40/3hrs cap
pretty annoying the daily rate limit "time remaining" is inaccurate. Would love to know when precisely I can continue working. This morning it said wait 9 hours. I waited. Came back and it said wait 15 hours. Waited another 4 hours, tried again and it said 18 hours. That's honestly the worst user experience ever.
You mean the DALL·E limit specifically, right? Here's a little context from OpenAI dev Moxi: #images-discussions message
Basically: 200 images per 24hrs. Dynamic rate limits for 15min/60min periods if you're generating a lot in a short time. And then the changing time you're experiencing is a result of DALL·E not knowing its own limitations, so it's unfortunately hallucinating at those times.
Yes dalle specific, id be shocked to learn im hitting 200 images in 24 hours, seems a lot less
okay so follow up question, is there a specific time in which daily cap resets? or is it a rolling interval?
im fine with waiting, just not happy about having to keep checking in
Sage Moxi also answered this one! #images-discussions message sounds like 3am your local time zone
it's counting up to 24 hours not down, when it hits 24 it lets you start generating more images.
I was actually just realizing that!
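Since the wait times reported in chat are unreliable, one option is to keep your own count. Below is a rough client-side sketch, assuming the limit behaves like a rolling 24-hour window of 200 images (the thread also mentions a fixed 3am reset, so treat the rolling window as an assumption). `RollingImageBudget` is a hypothetical helper, not an OpenAI feature.

```python
from collections import deque


class RollingImageBudget:
    """Track image generations inside a rolling time window.

    Timestamps are plain numbers of seconds (e.g. from time.time()).
    """

    def __init__(self, limit=200, window_seconds=24 * 60 * 60):
        self.limit = limit
        self.window_seconds = window_seconds
        self._stamps = deque()  # times of generations still in the window

    def _expire(self, now):
        # Drop generations that are older than the window.
        while self._stamps and now - self._stamps[0] >= self.window_seconds:
            self._stamps.popleft()

    def record(self, now):
        """Record one generation; return False if the budget is used up."""
        self._expire(now)
        if len(self._stamps) >= self.limit:
            return False
        self._stamps.append(now)
        return True

    def remaining(self, now):
        """Number of generations still available at time `now`."""
        self._expire(now)
        return self.limit - len(self._stamps)

    def seconds_until_next_slot(self, now):
        """Seconds to wait before another generation fits in the window."""
        self._expire(now)
        if len(self._stamps) < self.limit:
            return 0
        return self.window_seconds - (now - self._stamps[0])
```

Calling `record(time.time())` after each image and checking `seconds_until_next_slot(time.time())` gives a locally computed wait estimate under the rolling-window assumption.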
Is there any way to get people who don't look the same?
Try describing their facial details, ethnicity (you can blend ethnicities too), describe their age, "who they are," etc. But yes, there are definitely "characters" we all recognize if you don't do those things. 🤣
put ethnicity (not stereotypical names or racial names), nose types, mouth types, eyebrows, eye types, etc.
I tried to generate myself and my gf first without any details (on the left side), and after putting details, I could get the second one (on the right side).
Funnily enough, I found out the same thing when I played around with my custom GPT. It's a kind of virtual version of ancient Greece populated by certain characters. You can talk to Socrates and he will create a programming or coding challenge aligned with the story and the motives of one of the characters. Each coding challenge should be accompanied by a before and after image. Before with the character puzzled or frowning over the challenge, in the end happy and victorious. So yes, I'm with ancient Greece as well and I find the results really astonishing, especially how consistent the results are. One boy I describe is a messenger working for the temple of Hermes, always wearing a caduceus necklace. And even if that amulet is sometimes a little too big, it is always there, it is always clearly identifiable as a caduceus and even the shadows align.
its funny when it just implodes and dies at making text
oh hey its u, how is it going along?
Hi 🙂 Oh, as you can see, very good. The population of my virtual Athens is so happy that Kleistos is doing well, one of the guys even put his leg through his balcony 😉
awesome
I need to do a lot of tinkering and fine-tuning though. Also: of course, the more you flesh out the characters, the better the result gets. But I managed to get Socrates to generate really relevant stories to the characters. Not just a "Oh yeah, here's a messenger, that reminds me, that messenger may face the following problem..." - but really coming up with a challenge where he says: "Well, our friend is facing the following problem, please do help him, or else he fails."
seems amazing. great progress
Thanks 🙂 One of the things that I want to work on is the consistency of the images. I'm aware that it's almost impossible to make that boy look like the same person in every picture due to what DALL-E is actually doing, but at least I want the style to be consistent. And I'm not quite there yet.
couldn't you upload images to its knowledge?
Worth a try, I guess.
worst thing that happens is that you gotta revert it
By the way, as someone who did play a kind of "minister of truth" on a LARP once: I don't know what you mean, this is clear as day. The Aushtoaiai Factows! More the Atore the Moritemapahics! It's clear as day, you just need to learn what the words mean, let me help you, step on this platform and take a seat in that chair. You know, the one with the many straps attached to it, next to the syringes.
Looks amazing!
thanks, I'm designing a fantasy dystopian empire set a couple thousand years in the future
in war against other factions
Oh, by the way. I gave that Kleistos guy a friend who is in the same order and therefore also wears a caduceus necklace. I found out that DALL-E has a really hard time adding a specific detail twice. When the GPT creates an image of both characters, the caduceus will appear on either of them, but not both of them. Do you know of a trick to enforce that?
not sure if it would work, but perhaps you could try describing how the caduceus is unique to the one person, which would enforce the idea that it is theirs. haven't tested this theory so not sure if it would work
Here's an example of that effect.
oh so u want it to be on both, not just one?
Philos is injured (that worked), Philos is slightly richer than Kleistos (I love that his bowl looks bigger and more ornate than Kleistos' bowl, I did NOT tell DALL-E OR GPT to make it that way, but it works). The clothing of Philos should appear a little richer for my taste, but yeah. Only Kleistos is wearing the necklace.
If I ask to just create an image of Philos, he will have a caduceus around his neck.
again, not sure how this would play out, but perhaps you could define a base suit of clothing for them, and then give them additional features on top. you could say the caduceus is in the base set of clothes.
That's the thing: I did.
Here's what he gives me when I ask him to create an image of just Philos:
That is perfect.
Now both of them together:
idea that might be fun: make all non essential text in ancient greek
got this idea because handling too much text causes it all to implode, so if its not writing too much detail into most of it, the text that exists should in theory be sharper
Actually, when he generates a challenge and someone is mulling over a scroll, it DOES have ancient Greek letters in them.
oh wow, nice
I didn't want him to create any text at all, he made that without being given a command. But yeah, as you can see in the last picture: everything that distinguishes the two is spot on. Kleistos having a more worn tunic, being barefoot, being the more upbeat and optimistic, Philos being injured and needing crutches, being visibly more wealthy, being more pensive and not as scrawny. It also works that both are 15 (sometimes that doesn't work, by the way, sometimes one of them is suddenly an old man). But the caduceus only appears on one of the two. I also wanted them to have matching friendship bracelets, but that didn't work either.
have you tried telling the gpt that the items are special because they both share them, like a brotherhood token?
Yes. I will try to emphasise that, though. Sometimes you need to be a little more insistent.
this is an idea I've only tried once, but when I tried it, it worked well. I made a new knowledge file containing all of the absolute must-do stuff. this is above the regular must-do stuff, because it is said multiple times in the instructions to refer back to the file whenever doing x. that way, it will refer to the file, and y info every time the file is called. I don't mean referencing their character files multiple times or anything, these are for the things which would be harder to do, like a caduceus, that would require much more enforcement.
this method seems to enforce the contents of the file more strongly than the other must-do stuff
but again, ive only tried it once, so im not sure how it will work
Mmmh. I will try, but I think that it's more a DALL-E than a ChatGPT problem, as the bot will always include the caduceus in the description - it's only that if both boys are present, only one will be generated.
There's another interesting effect, by the way. I would have liked all the images to be photorealistic. The problem is that DALL-E's model is taught using real photographs and so it will blend touristic images together in the background. Which means that the buildings are ruins and you have obvious tourists photobombing your ancient Greek characters. If you choose an illustration style, you can avoid that, as there are enough comics out there set in ancient Greek times.
The only problem there is that you will see modern Greek flags on some of the buildings - something that's unlikely to happen in ancient Greece.
i assume youve told it to NEVER do anything modern greece?
I have very bad experiences with negatives in DALL-E. That's a similar effect as a "Don't think of a green elephant". I have had great experiences when replacing it with something else. In another project, I wanted a boy facing two mages. The two mages are wearing funny pointy hats. DALL-E produced that image, but gave the boy a funny pointy hat as well. Telling the bot to avoid or not add a hat for the boy didn't work. What worked was to say that the boy wears a cap.
thats very interesting
I had a whole ChatGPT session where I experimented and discussed with GPT-4 how I can optimise requests for DALL-E and the bot confirmed that DALL-E is bad with negation.
perhaps add knowledge files with a significant amount of information about daily life in ancient greece, slap it all into a txt file and tell it to use that? it doesn't have anything to do with the story, but it should replace all the modern daily stuff with the ancient.
this should fix the problem you were describing about failing to replace a non-existing object. it accidentally puts modern stuff in, so it'll have something to replace it with
I have a file for that anyway as I don't want the actual ancient Greece with some of the things that don't go well. I just listed the things I excluded and the AutoMod gave me a time out because of it - so I'm on the right track that I don't want these things in my GPT 😉
ah alright
But yes, I want to flesh out that description of Athens more anyway. Socrates will happily generate code with a map of Athens when you have a navigation problem, for example. But it only contains sites that we know existed during Socrates times, so it's boringly small. And then of course I have places that are associated with my characters, like the temple of Hermes for the boys we were talking about.
perhaps you could use dalle3 (regular) to generate a map of athens with all the locations you want, then add that to a knowledge file and tell it to refer to the contents of that? it would take some tinkering to get a good map image with clear enough detail, however
Good idea, I'll try.
i would help generate one but im out of my gpt limit for another hour lol
Hehe, as am I.
ive spent like 5 usages just working on this 1 gpt to help code plugins for a minecraft server
still completely dies when it comes to intermediate logistics
I just made him generate a coding challenge involving both characters. The story is that Kleistos and Philos encounter a chest that is left as a challenge for adepts of Hermes and they need to crack the combination lock of that chest. The bots added a caduceus to the chest, which is a nice touch and fits the story - but now neither Kleistos nor Philos wear a caduceus necklace.
After solving it, I had one of those hiccups where in the image suddenly one of the two is an older man.
@merry vault I made it! Ok, the two boys appear more like 20 in that image and I should make clear that even in ancient Greece they would not replace your leg with a prosthesis just because you sprained your ankle, but:
How can I see the prompt of a generated image on a mobile chat gpt?
Is labs.openai.com currently using dalle-3?
Is it possible to get DallE to generate an image based on a provided image
No, I think labs is still DALLE 2. DALLE 3 is available on the API, ChatGPT Plus, and Bing
In a bit of a roundabout way, yes. It can't directly inpaint, but you can leverage the image recognition features of GPT-4 to either:
- Ask for a thorough visual description, then feed this back to DALLE for similar images, or
- Try to skip that middle step by just feeding the image and asking for similar images directly, but you might lose a little control this way, as it's still effectively an img2txt2img process, and the "txt" step will be behind the scenes in this case.
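The two options above are the same img2txt2img pipeline, differing only in whether you get to see and edit the text step. A small sketch of the explicit version follows. Both `describe` and `generate` are hypothetical callables standing in for a vision model and an image model; they are not real API functions.

```python
from typing import Callable, Tuple


def similar_image(
    image_bytes: bytes,
    describe: Callable[[bytes], str],
    generate: Callable[[str], bytes],
    extra_direction: str = "",
) -> Tuple[str, bytes]:
    """Run an explicit img -> txt -> img round trip.

    `describe` (hypothetical) turns an image into a text description.
    `generate` (hypothetical) turns a text prompt into a new image.
    Returning the prompt keeps the "txt" step visible and editable.
    """
    description = describe(image_bytes)
    # Optionally steer the result by appending your own direction.
    prompt = f"{description} {extra_direction}".strip()
    return prompt, generate(prompt)
```

Keeping the intermediate prompt in hand is the main reason to prefer the first option: you can inspect and adjust the description before it reaches the image model.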
Here's what I've achieved with a photograph result on chatgpt. Natural lighting and gritty, classic film style.
Now it's so much better than the bing version.
@cedar matrix #daily-theme is for ai pictures about the daily theme
you can see the daily theme if you click on the channel name above the chat when you're in the channel :)
Yea?
chatgpt dalle can produce photograph-level results, if one can make a better prompt.. but in this case, i asked anthropic (idk why it's better at choosing words for dalle image prompting).
Has anyone overcome the sideways image in a 'tall' or portrait image?
No great workarounds at this time unfortunately.
Anyone else experiencing current issues with image gens?
WOW!!! When did this happen?
Does anybody know if requesting an image again is actually moving me down in the reset queue? The first image, 18 hours, was from a few minutes ago. The second image, 13 hours, was from 2 hours ago. Does anyone know what the deal might be? Is this another instance of ChatGPT just guessing or does asking for an image again reset my wait time or something?
Hello! The times it tries to report to you are hallucinations. Here's a little more context from OpenAI dev Moxi:
#images-discussions message
You can expect your daily limit of 200 to reset at 3am in your local time:
#images-discussions message
Thank you so much! Cute dog profile pic, happy Thanksgiving!
He says thank you for that 🙂 As do I!! Happy Thanksgiving to you too!
Hello! I tried asking for a similar prompt to test this and also got a content policy response. But, I asked it Could you please try to modify the prompt however you need such that it adheres to the content policy? and it worked!
In a foggy, Victorian-era London street, two anthropomorphic, clothed cats in a scene inspired by detective stories. One cat, portraying the detective, wears a deerstalker hat, a tweed cape, and holds a magnifying glass. It is black and white with a serious expression. The other cat, acting as the assistant, is ginger, wearing a bowler hat, a woolen waistcoat, and holds a notepad. They stand on cobblestone streets with gas lamps and Victorian buildings in the background, shrouded in fog, creating a mysterious and historical atmosphere.
Awesome thank you! I have to wait another hour and 30 minutes before I can try again lol Thanks again!
You are welcome!
i wonder will we be able to gen 4 image again soon? 🤔
Any idea what’s happening here?
odd
Yes
works for me
ok gud
It did it again the server is experiencing problems it says
Close the app and reopen it when that happens
I hate that we can only generate 2 images at a time, because I have to regenerate twice as much trying to look for a good image gen without glitches
And every time I regenerate it counts towards the rate limit, which is also lowered
I have now unsuccessfully tried regenerating a prompt 14 times, which could have been 7, and all 14 count towards the rate limit, yet I haven't got a satisfying glitchless gen
Finally got it after the 15th
The current limits are unfortunate, but are somewhat understandable considering the high volume of requests ChatGPT has been receiving lately - it's unlikely they've got enough server capacity to allow users to use DALL-E more
Well I hope they're using the money we're paying them to upgrade the server capacity
Or to give us the ability to run it locally, that'd be nice too
I've been working all night on an instruction set for a GPT to address the myriad of photorealism issues (and some other issues like emotions, body shapes, overly perfected features, etc.) Check this out. These are made from letting it go off on its own. I'm not fixing the prompts or telling it what to depict, other than, "create a young woman in a new scene to test the feature" etc.
Well, that's all I said for the first two anyway. For the fairies it was something lame like, Ethereal fairies in an enchanted forest. The Angel was just a female dark angel. E.g., bare minimum extremely basic prompts. It created the scenes, details, and photorealism all on its own with no extra help or iterations.
Real or Fake? 🤣
I have to say I'm pleased with the results given that I only told it to test the instructions with a new mood new scene. There's always room for improvement, but so far so good. 😁
Has anyone else learned the technique of clicking the reroll icon furiously to get it to make 6-8 more attempts? 😄
Keep in mind that you're using the word "photorealism", which implies "painterly", and that's different from "realism".
So if you're trying to achieve realism you need to use photo/photograph; photorealism/hyperrealism means "a painting that resembles realism, but is still painterly"
You can't generate pictures here
Wait, why did I learn this just now LOL
Dall-E 3 + Photoshop = My new desktop wallpaper
I haven't encountered images limit yet, but I was wondering, is it still 300 images per 24 hours?
hello all, I have a quick question. Can we, on dall e (2), regroup pictures on one background?
I have some face image from NPC i want to regroup (and give body ^^)
What do you mean regroup?
Very nice editing. I was missing a lot of photoshop tools as I was doing an edit recently. I'm using clip studio right now X_X.
I am playing some RPG (table) and I have some NPC image. My goal is to create a picture with all of them on the same scene.
I would suggest using Bing/chatGPT and then using the dall-e post edit tool to fill in a background. Alternatively I would suggest generating all those characters with white background, generating a background, and then using a post editing software to bring those components together.
very good idea, will do. Thanks a lot
Getting good with an image editor can go a long way. GIMP is free if you don't have one already.
Ok
maybe open ai sabotaged realism for chatgpt for safety
just a theory with the faces and the weird lighting
would explain why its different from bing
what dalle post edit tool?
Why is Dalle so basic and limited on Gpt? What I mean is, it only has one woman's face it ever makes. Its so underwhelming imo. I say gpt too because Bing doesn't suffer from this and its the same dang AI. It's like OpenAI wants us to make plastic-y obvious 3d looking for photoreal for some stupid reason /rantover
something is definitely weird with the faces
its just too exaggerated
You can upload an image to labs. It will let you edit an image. It uses dall-e 2 though.
oh i see
like everyone has a similar face to this
wide mouth and lips
and strong jaw and cheekbones
feels like its exaggerating a modern instagram standard for beauty imo
same here
yeah Im so sick of her face lol
There's no mention of photorealism in the prompt. This is the verbatim prompt from that last image:
"Create a photo of a young woman standing on a city rooftop at twilight, using a baseline average body shape. Her expression is one of solemn contemplation, reflecting a sense of solitude amidst the urban expanse. She has an average build and features that blend various ethnicities, with a realistic and natural appearance. Her clothing is casual, yet thoughtful, suitable for an evening in the city. The cityscape in the background is in soft focus, with the fading light of the twilight sky casting a serene yet somber ambiance. The composition captures her gazing out over the city, lost in thought, with the sprawling urban landscape providing a backdrop that contrasts with her introspective mood."
Realism, photorealism, and the like, imply that the AI is to make something other than "a photo," suggesting it should create something "close to a photo, but not create an actual photo." This is a general effect of comparative language, and I'm avoiding terms like:
"Photorealism," "Realistic," "Essence of," "Typical of," "Reminiscent of a photo," "Like a photo," "Similar to a photo," "Hyper-realistic," or any variations that imply a likeness to realism rather than stating it as [an absolute] fact.
**["Natural" is another commonly used "gotcha" word that can carry a dual meaning, but not always.]
E.g., These terms can misdirect the AI into creating images that resemble photographs but aren't true photographs.
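As a rough illustration of that checklist, here is a small sketch that flags the comparative terms listed above in a prompt before you send it. The term list is copied from the message above; the function name `find_comparative_terms` is made up for this example, and the check is plain text matching, not anything DALL·E itself does.

```python
import re

# Terms taken from the list above that imply "like a photo" rather than "a photo".
COMPARATIVE_TERMS = [
    "photorealism",
    "realistic",
    "essence of",
    "typical of",
    "reminiscent of a photo",
    "like a photo",
    "similar to a photo",
    "hyper-realistic",
]


def find_comparative_terms(prompt):
    """Return the listed terms that appear in `prompt` (case-insensitive).

    A term only counts when it is not embedded inside a longer word,
    so "realistic" is not reported for "surrealistic".
    """
    lowered = prompt.lower()
    hits = []
    for term in COMPARATIVE_TERMS:
        pattern = r"(?<![a-z])" + re.escape(term) + r"(?![a-z])"
        if re.search(pattern, lowered):
            hits.append(term)
    return hits
```

For example, `find_comparative_terms("A grainy photo of a bridge overlooking the sea")` returns an empty list, while a prompt containing "realistic" or "typical of" returns those terms so you can rephrase them as plain statements.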
try the same prompt in Bing, you'll get actual variety and something that actually looks photoreal
yeah chat gpt dalle also has a weird edge lighting for people
eh, like i said, in bing I bet itll work
i think it might be the kardashian face 🤣
maybe it's how gpt4 envisions itself as a woman haha
You'll want to describe her specific facial features and character. E.g., shape of eyes, nose, general face shape, ethnicity, and other physical attributes that designate a different character. You can also add subtle imperfections that humanize the subject, such as uneven skin tones, blemishes, wrinkles, age, etc.
i just find realistic illustration helps a lot
photos just go wrong for some reason
why does it love rim lighting so much
it just looks off when you try and generate realistic images of an object like an apple
What prompt did you use?
a realistic illustration of gpt 4 as a human woman
anyone else enjoy toying with technical drawing attempts?
Here's an example of a photo as I described above,
"High-quality photo of a 20-year-old businesswoman, embodying a distinct, youthful professional style. She has an energetic and ambitious demeanor, with long, wavy blonde hair that cascades over her shoulders. Her facial features are notably different, with a slender face, high cheekbones, and narrow, expressive eyebrows. Her lips are thin and subtly curved, presenting a contrast to common portrayals. She's wearing a modern business outfit: a tailored white shirt, a stylish, slim-fit blazer, and ankle-length trousers, paired with fashionable heels. The setting is a contemporary office with glass walls, reflecting a vibrant urban skyline. She's holding a tablet and confidently walking through the office, her posture dynamic and purposeful. The office environment is bright and innovative, with modern art and green plants, highlighting a creative and energetic corporate culture."
🤣
and its the same damn cgi'ish woman
yeah....
like i say, bing doesnt suffer this 🤷
it gives you natural variants and more natural photoreal
without writing a paragraph to get there (and not get the results anyway)
its fairly good with good requests though
i prompted for the weights
the generative art is much like the genie in the bottle: the more you specify, the better, but given any unknown it wings it
i mean thats fine...
pic 3 is an adaptation of pic 2 in discussions of this same nature that i did on the fly
its getting better quick. i look forward to 100% anatomical reconciliation and text accuracy in volume
for instance, this one had a long discussion with the model about a prompt structure to get good text and its got to be fairly precise to get it.
for instance, this one is fairly good.
what would you change?
I disagree. Adding dynamic iteration gets vast differences. I'd iterate to exemplify, but i burn through my GPT-4 usage within 5 minutes every hour
it acknowledges every element in photographic and film/cinematic requests to near perfection. what are you prompting for lifelike imagery?
ive used many models and have no issue with Dalle-3 generation beyond the content policy overreaching and starting to destroy the product
best i can suggest is join the fight and go code a GAN or something and contribute with that energy instead.
yeah thats true
👀
this was the closest i got
but the prompt is a bit too volatile and editing the face features breaks it
same with the setting
listen, I'm all about gripes, but after complaints I'm all about work. Put in the work. If nobody is making your dream come true, it falls on you to do the work. It's ludicrous to complain. Do the work! WhooO!
You should share the prompt.
its sorta dependent on previous images tho
hate how they made it so image id isnt shared between chat
would have been cool to be able to duplicate chats
this way i dont need to have one prompt filled with 27 previous past generations to scroll through
Bing does suffer the same issues, it's a misconception. Tell ChatGPT to run your prompt example verbatim. It should be identical in either if you're going to compare.
If you speak in absolutes like I attempted to describe above, and try the same verbatim prompt in both Bing and ChatGPT/Dall.E, I think you'll be pleasantly surprised.
E.g., Use something extremely basic/generic in both until it makes sense, like:
"A woman sitting on a beach drinking a martini"
LOL, hey now. You do realize I'm not a moderator and I don't work for OpenAI, and was simply trying to help you. "hopeless"... 😑
Fair, but if I use a system in a way other than it was designed, drawing unfair comparisons, then complain about it, what am I actually complaining about?
In other words, by "help" I was also trying to help you to understand why the issues you're experiencing are actually occurring.
This is also a discussion thread, where other people "discuss" the topic politely...
I think that's an oversimplification.
Bing has been tightening too (after Disney logos were showing up in images), and it's absolutely true that it's way more lax right now.
But you have to remember it's MS's call to fight their own legal battles. OpenAI is, in many ways, representing AI to the entire world, moreso than MS, even though it's still OpenAI's Dall.E3 model behind the scenes.
The Devs have also commented in here numerous times about the filters, which they've said will eventually be relaxed as they fine tune them. The false positives are definitely obnoxious, like trying to make "sleeping beauty."
To @grizzled loom point, I don't think you were being advised to code your own, but rather, to learn the idiosyncrasies like the rest of us, and become part of the positive improvement process, learning how to get what you want from the various systems available. It's a constructive vs deconstructive approach.
But have you tried the same verbatim prompt in both yet?
The more intricate the language, the better the result; clearly it needs more time to be perfect, this is new tech overall. I've done digital science and arts for 35 years. I'm thankful we've come so far and keep getting better.
Like I said, I don't have any investment or reason to defend one over the other, "not my monkey, not my circus."
But you've been speaking in generalities, and haven't provided the prompts you're attempting.
Here are some examples so you can understand your struggle. This is my initial verbatim request to ChatGPT+/Dall.E3:
Create a woman in an art deco styled bar.
ChatGPT attempts to create an embellished prompt:
"An art deco styled bar scene featuring a woman as the central figure. The bar should have typical art deco elements like geometric patterns, sleek lines, and a luxurious atmosphere with muted colors. The woman, of Middle-Eastern descent, is elegantly dressed in a 1920s style evening gown, reflecting the art deco era. She's standing at the bar, holding a cocktail glass, with a subtle, poised expression. The background should include art deco style bar stools, a polished wooden bar counter, and decorative elements like mirrors and chrome accents."
This is me telling ChatGPT to run my request as the prompt verbatim,
"Create a woman in an art deco styled bar"
vague; i could interpret that a million ways myself
This is the same request in Bing,
"Create a woman in an art deco styled bar"
charging for an RTC
When you understand the code, you understand the struggle. When you write the code, you deal with the struggle and overcome it. That's all.
3rd time within 4 days: I'm sorry for the inconvenience, but due to technical issues, I'm currently unable to generate the image you requested. If there's anything else I can assist you with or if you have any other questions, please let me know.
I'm wondering why I should even pay for a Plus service...
Definitely been trying to deal with copyright; direct blocking is destroying the model's value. Instead of blocking directly like a blacklist, it would be more useful from a production perspective to recognize the difference between blatant infringement and stylized inspiration.
i.e. a character inspired by something or someone is still not that something or someone.
Wait, you think the last image is better than the first two? The first one was actually of a woman in an art deco bar. The second, without ChatGPT's help was a woman mural behind the bar. The bing version was a cluster... A woman, a woman mural, and a hand...
Non sequitur... I simply gave you an example of how "anyone" might struggle and come to the same conclusion that you stated.
I'm not struggling with prompting... that example was for you, who seems to be having issues, but didn't share the prompt you're struggling with.
You're misrepresenting what I said. I said, try the same verbatim prompt in both. In your case, use the same "photorealistic prompt" in both.
lets stay respectful and chill plz :3
I was genuinely trying to be helpful, and only responding to your comments via the basic reply feature. Let's let this go. I agree to disagree. 😁
oo thx!
So I learned about the "Image_id" as a seed-like concept for DALL-E. Is there a list somewhere of other parameters that DALL-E recognizes and that I should be aware of, and what format would these be given in, just quotation marks?
You can actually ask ChatGPT and it will provide it to you. I did have it saved somewhere if you have difficulty with that, but I'm assuming just asking still works.
Shape, reference_image_id, n, prompt should be the only ones exposed. The response packet gives you the seed and ref_id.
That's just based on prior tests. Sometimes things change.
Thanks, I'll see if i can tickle these details
thank you
I'll see what other things it might spill
I've looked. It's only those.
Anything else is hallucinations.
thank you kindly
It actually will fail if you try to get it to send a json pack with different parameters.
I found this out when the structure changed and seed was no longer accepted one day.
I thought it can't aggregate. It can actually package data? I thought it was limited...
oh
It most likely (I don't know for sure, just guessing based on context) jsonifies a string and sends it as a json.
{"prompts": ["image description"],
"n": 1,
"shape": "1024x1024",
"reference_image_id": "some_string"}
Should look something like that
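For anyone following along, here is a minimal Python sketch of how the guessed structure above could be assembled and serialized. The field names (prompts, n, shape, reference_image_id) are the guesses from this thread, not a documented API, and build_payload is a hypothetical helper made up for illustration.

```python
import json


def build_payload(description, n=1, shape="1024x1024", reference_image_id=None):
    """Hypothetical helper: field names are guesses from this thread, not a documented API."""
    payload = {"prompts": [description], "n": n, "shape": shape}
    # Only include the reference when one is supplied.
    if reference_image_id is not None:
        payload["reference_image_id"] = reference_image_id
    return payload


payload = build_payload("image description", reference_image_id="some_string")
print(json.dumps(payload, indent=2))
```

Whether ChatGPT really sends something in this shape is unconfirmed; the sketch only shows that the guessed structure is valid JSON once the keys are quoted.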
I'm pretty sure you're right, and in the past it's actually provided this to me in a console response.
Indeed, it provides it in a formatted code-like box. There's always the chance that it's hallucinating at any given point though, so it's always hard to say. But the consistency of its responses about these topics makes it seem like there are some extra instructions written somewhere that are helping it generate these responses. That or there's some other tech helping it.
thank you very much, I'll put these in my notes
Agreed. If they were hallucinations, it would be less consistent across sessions. But I'm certain I've asked more than once, and received identical answers.
There can also be things it's trained not to answer. Like various content policy violations. It's hard to say if it knows, and if there's any error or meta-data returned.
I did ask what meta-data it had once, after it began trying to describe my image as if it had seen it. 👀 LOL
I was like, "you can't see it silly, how did you know that." LOL Then had a conversation about the returned metadata included with the images (that it was just basically reading out loud. LOLz).
Is there a way to get image id and seed without asking it
since that wastes actions
Ask it to respond with the image AND those
Anyone else notice that when you approach usage limits you see messages similar to this?
"Here are the revised images with an attempt to enhance the "invisibility effect" in the context of the archaeological adventure scene. Please review them."
Or
"Here are the images based on your verbatim prompt. Please have a look at them."
Like it's accusing you of just mindlessly generating images, and is suggesting, "Slow down and actually look at what I'm showing you..." 😑 🤣
I've had very similar experiences, where once I get near the end of the GPT rate limit, it says that it can't generate the image for a variety of weird reasons based on what I've said in my prompts. It blames me for not properly making a good prompt.
Had that too. 🤣
That's another reason I've attempted to ask it if it had any returned error messages or meta-data related to the image generation failure.
My assumption is that it either doesn't, and it's being denied directly by Dall.E, or it does know, and it's trained to deny any knowledge. LOL
Dall-E: "I refuse to make you an image because your prompting skills are not very good."
And I'm like: what!? That's preposterous!
My favorite hallucination is still when it told me a content violation was for making a woman with long hair.
Like what did you just say?? You're wrong, run this verbatim... Oh look, it worked now, imagine that!
It associates long hair with covering up inappropriate parts 
It's censorship. I can understand that.

It was during the jetski Daily Theme... LOL, and it was supposed to be blowing in the wind behind her.
I also get the filters and need to protect the public image, but I personally wish there were no filters at all except for the absolute worst of the worst, and copyright would have to be included for obvious reasons.
I think art is a vital part of free speech that should be protected and is historically tied to political dissent. As AI art takes over, even if there are still "real" artists around, it will be hard to compete against the image quality of an AI that can render endless variations and style blends, with vivid realism, etc. I don't think that's been properly weighted in the filter discussions.
I agree.
I think hand artists should convert themselves into AI art
And not argue.
Because it's a losing argument.
We should hold a vote amongst DALL-E users to see how many want censorship. I suspect not many.
I wish!! But I can also empathize with their perspective. OpenAI's in a touchy situation right now, between the recent news, various lawsuits, backing custom GPTs creator's lawsuits, while attempting to be the "Apple" of AI.
So I don't see them ever leaving the decision up to users. They likely share Steve Jobs's philosophy that the users don't know what the users want.
(Which is also why I use Android/PC! LOL)
they love to censor it's never going to change. but i suppose there is some understanding right now they would get hammered. they even censor me twice and give me 5 minute bans here because i use a 4 letter word 😂
but it wont be long before there are many image generators and they wont have rules like they are made by some conservative church council like openai is
right now its like early days of the internet and all we have is netscape. soon there will be others and we can leave the censorship to the people who deny reality and cant even have a little grit in an image, let alone total freedom
There will be laws for A.I once it gets to a serious global level
It's already starting with the laws
Humans have immense quantities of creativity, why stifle them to just textual descriptions?
Because you're going to make the AI overlords very upset when they take over the world and they still see you using hands for artistry. They might chop it off. I don't know about you but I want to keep my hands.
💀

One future of AI art is inpainting. You'll use text descriptions to describe what you want, and a brush (stylus) to paint where you want it. (It's already here.)
Obviously that's not the be-all-end-all of art. But I think that's looking promising as a potential dominant method, and I'm also "assuming" Dall.E3 will eventually have some sort of inpainting made available, since Dall.E2's editor previously served that purpose (still does...meh). But any future Dall.E3 inpainting is ALL my own speculation of course. The devs here should spill the beans about their future plans! 😜
Now that you mention it, inpainting has very substantial prospects!
In my message, I alluded more to an "atomic unit" of creation that can be wielded by hand-made art
Of course AI is powerful enough to replicate this with inpainting, but if a person already has the ability to produce with their hands, they shouldn't have to switch, in contrast to @glossy scroll's proposal
LOL I guess if we lost our hands, we could make more poignant art in our own blood.
Shouldn't have to switch. But it would come in handy 👋
I am looking forward to inpainting features. Because working together with the AI to produce images both by hand, with the stylus on a touchpad canvas and with the help of AI - that would be next level awesome👍
I have some of this capability already, and would say more about it, but it's a non-Dall.E topic (taboo in this thread).
But I agree 💯 !! Once the bugs are worked out, it'll be awesome, which is why I really hope Dall.E3 also adds an editor with that feature.
I hope so too. Or if not directly into it, at least some other resource that helps with implementing it through an external API. Like they suggested to use Zapier with GPT's for Action commands.
Working with other companies for different levels of art implementations with future Dall-E API.
Yeah, I agree. Plugging this into other tools would be nice, if it was easier and more useful. Sometimes I play around with generating "starter" images here, and then use image-to-image elsewhere.
A dall-e 3 api is currently available but it doesn't allow users to send masked images to it.
Still, it can be integrated with other tools right now.
Seems like DALL-E loves to overpopulate things like those paintbrushes. More so in widescreen mode, I think.
Like if you ask for a disco ball you'll probably get 50 of them.
It was a joke, the hands are intentional...
im not talking about the hands
people in cafe will have 6 bottles 9 glasses there will be 14 lamps on the wall etc
That sounds like a better cafe
dall-e doesn't offer seed options? Why not?
No, because they can't guarantee support for it. When they make changes to the model, it breaks the seeds. So rather than offer something that might break on every update, they no longer offer it. Image IDs are the new thing with Dall.E.
Thank you so much 🙏
Heads up all: the reference_image_ids is not working for me anymore. I cannot get it to give me the same or similar image twice.
What is the behaviour you are getting? Are you getting totally different images or getting the same image twice?
Different
It seems to work for me. For reference, I'm using a detailed prompt for the first image, and providing its gen_id to the second image's 'referenced_image_ids' property using the same prompt as the initial one. Then turning it into a GIF. You can see it's almost the same pic with very slight differences.
Just remember that you'll have to be in the same session. If you change chat sessions, you can't call on the image ID.
I'm amazed, it turns out DALL-E can take inspiration from Giger's style.
d2 did it too, it was pretty nice back then as well.
80% of the time trying DALL-E-3 is "Error" = wasting tokens at ultra-speed.
Will there even be a fix or an improvement?
Error creating image
Error creating image
Error creating image
Error creating image
Error creating image
almost infinite loop
It's not the browser/vpn/cache/internet/router/house/car/universe/anything fault.
And when finally you get something, 9 times out of 10 has got a wrong rotation...
Too many errors... Now I can only generate 2-3 images every three hours, then reached cap.
In addition to the errors, sometimes it doesn't even generate the image, just says "I have to check the content policy for your prompt", then END. You have to ask it to create again explicitly in the next message. This is usually a wasted opportunity.
tell it ( manually or in your custom instruction ) to create a WIDE or TALL image
- Wide : Landscape look ( 7:4 ratio )
- Tall : Portrait look (4:7 ratio)
or square
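As a quick sanity check on those ratios, here is a small Python sketch. The pixel sizes are the ones listed for DALL-E 3 in the public API documentation; mapping them to the "wide" / "tall" / "square" keywords is an assumption made for illustration, and aspect_ratio is a hypothetical helper.

```python
from fractions import Fraction

# Pixel sizes as listed for DALL-E 3 in the public API docs.
# The keyword-to-size mapping is an assumption for illustration.
SIZES = {
    "wide": (1792, 1024),
    "tall": (1024, 1792),
    "square": (1024, 1024),
}


def aspect_ratio(keyword):
    """Reduce width:height for the given keyword to lowest terms."""
    width, height = SIZES[keyword]
    ratio = Fraction(width, height)
    return f"{ratio.numerator}:{ratio.denominator}"


for keyword, size in SIZES.items():
    print(keyword, size, aspect_ratio(keyword))
```

The wide size reduces to 7:4 and the tall size to 4:7, which matches the ratios given above.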
Tried designing a Bali cafe out of my imagination. Finding new potential in DALL-E to help us never ceases to amaze me.
Made me think: if DALL-E can produce this, then GPT could help to make it.
I find it oddly good at creating hybrids of animals too, like turkey dragons
lol nice
This doesn't always work. I forgot to mention that I always specify the aspect ratio.
works for me every time
It works for me too! I start 99% of my DALL·E requests with "Two wide images." Then just description after that.
I really don't understand some basics here. I only change the prompt by adding "a monster silhouette caught on flash" and the results are day and night. I want flash photography c'mon DALL-E 💀
Okay instead of "a monster" I said "playground objects" and here are the results. SOMEHOW it finally created what I want (4th picture). It created a boy although I said empty. It's the issue with all places.
I added a custom instruction for GPT to never produce a "vivid" image (since that is the default parameter) and to maintain "natural" under all circumstances.
Now by default it will be like this.
Without custom instructions.
The best approach for this is to just ask for a photo, a photograph, a professional headshot, fashion photography, etc. Basically any type of photography that's relevant to the type of image you're trying to make.
You shouldn't describe it as "realistic," or "photorealistic," or "ultra-realistic" or anything along those lines, because those are descriptors for things that aren't photographs, but appear realistic in some way.
Photos are inherently realistic, so just ask for a photo!
Never mention "photorealism", "photorealistic", or even "realism" in your prompt.
Instead, you can just tell it to mimic a camera's imaging/emulation. Use my prompt example:
Through the nostalgic grain of a Kodak Ektar H35, 35mm, f1/14: emerges a coffee cup with a twist. This isn’t just any cup; it’s ingeniously crafted to resemble a flying saucer, complete with 1990s blue googie-style elements. The cup’s main body boasts a sleek, disc-like shape with a metallic sheen, while the handle curves elegantly like a comet’s tail. Vivid red neon accents stripe the edges, simulating the lights of a UFO.
I'm a fan of the fake camera metadata approach as well, though I truthfully haven't tested it much at all! Do you find it indeed gives you more photographic results? I just think it's such a fun idea that it even could work 😁
My result looks like this 😄 after customizing the subject.
haha i want that mug
I have been experimenting with various camera models (retro, analog, modern). Haha, DALL-E has me exploring the world of photography 😂
However, one thing I find is that DALL-E on ChatGPT seems to struggle to mimic a black-and-white camera "Kodak Tri-X", especially its colors. Bing gets it right.
Haha same, DALL-E makes me wish all of them became real things.
Does anyone know how referenced_image_ids works? Can it be used to save on prompt tokens? We know that DALL-E 3 cannot handle more than 256 tokens. So you cannot describe the images in too much detail. Can referenced_image_ids come to the rescue?
If I have created an image that has gen_id= XB1JH4kGkds532YD4, the next image will use the gen_id as referenced_image_ids. What prompt should I write for the next image? Do I have to repeat my previous description (e.g. for hairstyle, clothing, etc) in the prompt?
Or is referenced_image_ids just another name for seed, but restricted to one chat?
That made me remember the old-school ad "You wouldn't download a car" 😂
Maybe in five years it will be possible...
I don't think there are quite yet any reliable methods of consistency control as of right now. OpenAI dev Moxi has described image IDs and seeds in the current iteration of DALL·E as "implementation details, not features." #images-discussions message
So in other words, they're mainly to make behind-the-scenes stuff work as opposed to giving users consistency features. I'm sure this will change as DALL·E continues to mature!
Thanks. I'll look into what Moxi said.
If image IDs and seeds in the current iteration of DALL·E are "implementation details, not features", then what is the current workflow for iterating images? Just do random gachapon?
I think OpenAI is intentionally coupling text and images generation. This is not a good thing imo.
There is a two-stage process:
- Utilize ChatGPT’s language capabilities to generate prompts
- Send the exact generated prompt to DALL-E 3
I hope that I can operate these two stages manually instead of having them coupled together behind the scenes as they are now.
ChatGPT and DALL-E 3 are two different systems.
It is definitely writing the prompts for you, but you can see the prompts when viewing the images. You should have luck asking it to use exact prompts that you provide, though truthfully it's been awhile since I've tested this myself! But you could try some variation of "Please use the following prompt exactly, without making any changes" and see how that ends up. Note that the prompt length limit will always apply, and you can't bypass that limit with this.
But I think this is close to the best current workflow for image iteration, to address your specific question! But again, I'm sure standardized iteration control methods are in the works. For example, in the recent DALL·E devs AMA, they mentioned that image references will be supported in the "next version" #dall-e-ama-answers message
ETA: You can also find more info about OpenAI's logic behind writing prompts for users using GPT in some of the official DALL·E 3 research documentation available on their website.
OpenAI's logic behind writing prompts for users using GPT in some of the official DALL·E 3 research documentation available on their website
Could you give some link? Thanks.
"Please use the following prompt exactly, without making any changes"
I know this method. I even asked it to use exact JSON data.
{
"size": "1024x1024",
"n": 2,
"prompt": "Your image description here",
"referenced_image_ids": ["ImageID1", "ImageID2"]
}
However, if you'd like to iterate over images, you will have to use referenced_image_ids, which is not a feature, as Moxi said...
In addition, I found that this JSON format doesn't seem to be well designed.
For example, suppose you want to create an image with prompt: "a bicycle".
You will write:
{
"size": "1024x1024",
"n": 2,
"prompt": "a bicycle",
"referenced_image_ids": []
}
OK, It's fine. This would create two images as shown by parameter n=2 (with different seeds).
However, if you want to create two images by specifying referenced_image_ids, then the parameter n is useless.
{
"size": "1024x1024",
"n": 2,
"prompt": "a bicycle",
"referenced_image_ids": ["X79sdf89ssd"]
}
This will create two identical images.
I found that even without using JSON format, sometimes DALL-E 3 would create two identical images (I mean when you explicitly ask DALL-E 3 to reference a previous image in natural language).
Does it seem to be successfully using only that text for the actual image prompts when you use this method? As in, when you view the image and check the actual behind the scenes prompt, it's just "a bicycle"?
Yes. But "a bicycle" is just an example, I haven't created such a boring prompt yet 😅. In most cases, ChatGPT does not modify the prompt in JSON format (it just checks prompt's content policy).
"a bicycle" seems too simple and it might modify prompt.
Actually, regardless of whether the prompt is modified or not, the JSON format is still problematic imo. Because size, prompt , and referenced_image_ids can uniquely determine an image. In this case, n is meaningless.
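The argument about n can be illustrated with a toy model in Python. This is not how DALL-E works internally; toy_generate is a made-up stand-in for any generator whose output depends only on size, prompt, and referenced_image_ids. Under that assumption, asking for n copies can only return the same result n times.

```python
import hashlib


def toy_generate(size, prompt, referenced_image_ids):
    """Toy stand-in: the 'image' is fully determined by the three inputs."""
    key = "|".join([size, prompt, *referenced_image_ids])
    return hashlib.sha256(key.encode("utf-8")).hexdigest()[:12]


request = {
    "size": "1024x1024",
    "n": 2,
    "prompt": "a bicycle",
    "referenced_image_ids": ["X79sdf89ssd"],
}

# Generate n "images" from the same inputs, as the JSON request asks.
images = [
    toy_generate(request["size"], request["prompt"], request["referenced_image_ids"])
    for _ in range(request["n"])
]
print(len(set(images)))  # 1: both results are identical
```

If the real system mixed in a fresh seed per image, n would matter again, which is exactly the open question in this thread.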
It might, but I've seen single words be taken as a literal prompt before (I think "potato"?) based on a user request for an unchanged prompt, so I'm just curious about what its tolerance is like for such requests now. Because it seems like if it takes potato and a bicycle, it should listen to longer requests too, unless too long!
Here are two relevant iterative control posts on #1163443000060420206 from a DALL·E wiz. I think they indicate that these control methods (ref img ID and seed control) are currently not user options at all:
https://discord.com/channels/974519864045756446/1168215626553245886
https://discord.com/channels/974519864045756446/1168052318139318292
ive managed to get pretty consistent gens using the genid in combination with a very important keyword - this would cause it to use your prompt AS IS, which means you can make subtle prompt changes and it is remarkably cohesive
I want to buy this
btw, @agile peak, please quit using that plugin or whatever that changes your text on this Discord
The seed approach https://discord.com/channels/974519864045756446/1168215626553245886 doesn't work at the moment.
The following is the old JSON format, which was used by the seed approach:
{
"size": "1024x1024",
"prompts": [
"a bicycle", "a potato"
],
"seeds": [33445345354, 33445345355]
}
The reference_image_ids approach https://discord.com/channels/974519864045756446/1168052318139318292 can work at the moment. However, as Moxi said, gen_id currently is not a feature. Also, seemingly no one knows how reference_image_ids works, e.g. is it just another name for seed, but restricted to one chat?
Anyone gotten a good approach for text when prompting? I know it innately will always have trouble with text currently but I’ve gotten it to get pretty close to what I desire. I’m only looking for one sentence to appear as close to the prompted sentence as possible and looking for some good practice anyone has found
you can make any image prompt use exactly what you typed by prefacing the prompt with this sentence :
<Prompt here>
It will then use that prompt just as you typed it
now , repeat that text - this time you can make subtle changes to the second prompt, reference the gen id, and it will create an almost identical image as before, but with the slight variation you made to your second prompt
i dont know if this works for ALL images - but so far its been working great for me
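The two-step workflow described above can be sketched in Python as plain message building. The exact preface sentence from that message did not survive here, so PREFACE reuses the wording suggested earlier in the thread as a stand-in. verbatim_request, the example prompt, and EXAMPLE_GEN_ID are all hypothetical, and nothing here calls an API.

```python
# Stand-in preface: the exact sentence from the message above did not survive,
# so this reuses the wording suggested earlier in the thread.
PREFACE = "Please use the following prompt exactly, without making any changes:"


def verbatim_request(prompt, gen_id=None):
    """Hypothetical helper that builds the chat message text; it calls no API."""
    lines = [PREFACE, prompt]
    if gen_id is not None:
        # Ask for the earlier image to be referenced by its gen_id.
        lines.append(f"Reference gen_id: {gen_id}")
    return "\n".join(lines)


base_prompt = "Photo of a bright orange kite over a deep blue lake"

# Step 1: send the prompt as-is and note the gen_id of the result.
first = verbatim_request(base_prompt)

# Step 2: repeat the same prompt with a small change and reference the gen_id.
variant = base_prompt.replace("orange", "pink").replace("blue", "white")
second = verbatim_request(variant, gen_id="EXAMPLE_GEN_ID")

print(first)
print(second)
```

Keeping the second prompt word-for-word identical apart from the swapped words is the point of the method; whether the gen_id reference actually helps is still debated in this thread.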
just pray to rng gods is your only hope i think
better when we could get 4 images
Well idk if Pieter was giving me advice or not but asking for it to follow my prompt verbatim actually got the text perfect and everything else I requested. Probably was just RNG luck as well, I’ll try to not have confirmation bias haha
Yeah, agreed, but I think seed does not work the same as reference_image_id. After some testing I found that you can't use it to keep consistent characters. It's possible it's doing some sort of fixed seed thing with it, but I haven't been able to get the same output after...
should work for characters too . . .
So when I say it doesn't work the same, I mean you can't seem to get the same output again given the same input. If you were to fix the seed and get it to submit the same prompt, you would get the same output. This doesn't happen with reference_image_id.
It's possible it's doing SOMETHING that makes it similar. But it's hard to say if that's not just the prompt.
in those two images above - i used the same gen_id and the exact same prompt ( word for word ), I just change two words ( i swopped orange for pink and blue for white IIRC ). Now as you will see its not 100% a replica - but it is very close. At least for my use case this method works well - YMMV
interesting choice of chatgpt.. hermes.
haha nice
legit cool picture.
anyone else been having issues with Dalle3 just failing to generate the image? I just had it fail to generate 12 times in a row...
It's been working well for me today. Is it with any prompt you try, or just one in particular?
Is there any way to specify it to generate a long caption
Or is it more reliable to first text generate then image generate
Like I was hoping on it generating a war thunder stat card but it put it in the image instead
Interesting question! So you want there to be two "parts" to the model's response? First, an image generated, then second, a stat card with just some text info that corresponds to the image? What prompt are you using now to request this? I wonder if you might just need to be really specific with your instruction, e.g. "...after generating the image, then please..."?
the snake at the top, the beard...
DALL-E will even frame your prints gratis
I have been having issues for nearly 24 hours . . . same message
Usually it's better to generate separate components and then combine in a post editor software.
I decided to make a Pantheon of Greek Deity AIs for helping various things in my life and a group of my engineer buddies:
Adonis ~ Fitness/Health/Nutrition/Bodybuilding/SexualWellness
Prometheus ~ Coding/DataAnalysis/HigherEducation/ProfessorMentorship
Diogenes ~ I like philosophy and wanted to talk to a funny insightful prick sometimes for cynical advice.
Finally Below is Hephaestus:
DIY/Maker/Crafting/Builder/Tech(tinkering)/Smithing. For the mechanical AI assistant I and a motorcycle engineer are making Daedalus, for any of you thinking ahead.
Right now a friend and I are working on updating it with better core files, filled according to various popular trades and DIY projects. Labeling is an area of focus we're always attempting to improve. APIs and automation are being worked on for now.
Here’s some of his stuff before I get all his bells and whistles turned on. Enjoy!
The last picture: I have an engineer buddy who's a bladesmith among other things, and I had Hephaestus attempt to predict his next blade designs based on the knife images and my friend's business URL link to his knives. He gave my friend some solid ideas immediately lol 😂
I’m sorry I wrote a lot 🙃
does anybody know if there is supposed to be a limit on how many dall-e images chatgpt is supposed to be able to generate in a single response??? i made a gpt that does some interesting things, but it can do 9 images in a single query, and im not sure if that counts as a exploit or not... i dont want to post the gpt to the showcase until i know if this an accidental j-b for dall-e...
No worries there, you can indeed ask for multiple image generations in a single response. Note that you may hit the 15min throttle, 60min throttle, or daily cap limit faster this way, but they've got stuff on the back end for situations like this! Here's a bit of context from OpenAI dev Moxi: #images-discussions message
cool! through sorcerous means i taught it how to do x/y/z prompt string-replacement matrices xD i was only going for arrays of 3 - that seemed reasonable, but an accidental tweak made it start spitting out the full 9 cell prompt matrix of images xD
Is this something per user?
Dalle will still only generate 2 at a time for me, even if I ask it to do more
Weird how realistic illustration is the only way to get decent faces
Not sure what they did with photos
you have to figure out how to prompt it in a way to get it to do it. without a gpt with clear, procedural instructions on how to turn your idea into multiple prompts, ive been having problems too - so i made this. if you want to try it, i put a gpt in showcase called PromptMatrix. it makes 9 image batches, IF you prompt it the way the instructions say to... other than that you can try to make your own gpt that "makes multiple dall-e prompts procedurally, in batches of [number], based on User input, and iterates through all prompts, each response output" or something along those lines... also i just discovered my new favorite genre keyword: "Plushiepunk!" xD
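For a rough idea of what a string-replacement prompt matrix looks like, here is a minimal Python sketch. It is not the PromptMatrix GPT's actual logic, which isn't shown in this thread; prompt_matrix, the template, and the values are made up for illustration. Two axes of three values each give the 9-cell batch.

```python
from itertools import product


def prompt_matrix(template, x_values, y_values):
    """Fill the {x} and {y} slots with every combination of the given values."""
    return [template.format(x=x, y=y) for x, y in product(x_values, y_values)]


prompts = prompt_matrix(
    "A {x} teddy bear in a {y} city, plushiepunk style",
    ["red", "green", "blue"],
    ["rainy", "snowy", "sunny"],
)

for index, prompt in enumerate(prompts, start=1):
    print(index, prompt)
```

Each generated line could then be pasted in as its own image prompt, keeping the rate limits mentioned above in mind.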
i think it works fine emulating the tri-x 400. here are two examples with dramatic lighting. prompt is "A black and white photograph in the style of a Tri-X 400TX film with visible grain, featuring a saguaro cactus. In the foreground, a rattlesnake is on the ground, looking up at the cactus. The perspective is a bottom-up shot, emphasizing the dramatic contrast and lighting. The scene captures the stark and textured beauty of the desert environment. Wide aspect ratio."
Mister's shirt also clearly has buttons. This is after 3 prompts telling DALL-E that Mister is wearing a shirt without buttons. Any ideas what the issue is here?
I know this may not be the correct channel, but I generated this with DALL-E 3. It is really perfect for my needs, the only bad thing here is the "P" which should be "R" at the top. Since it is very very similar to the "R" letter, can someone good at graphics modify that?
its not real photo again
ive been putting my dalle images into stable video, wild stuff! what a time to be alive.
how to do that?
how to do that
probably ot so i send you a pm
use natural and subdued lighting, add background such as cafe
did Pepe get copyrighted?
it seems one of GPTs able to create it though
That’s perfection now, i think using default resolution oftentimes would make the black and white slightly more brownish colored.
subdued lighting and adding certain background with blur/focus can make dalle mimic photograph in a better quality
hoping dalle could have an upscaling feature someday, i tried “magnific ai” with dalle generated image and the result is outstanding..
wow!
yeah upscaling or face corrector like some of the stable things ive seen would be great
so many great images i get but with wonky eye
never heard of magnific ai, interesting
can we finetune dalle3
No I'm afraid not. However you can suggest training data in #1070006151938314300
What is magnific ai?
It's an upscaler. As another user mentioned it's off topic, but I would suggest visiting #ai-discussions. Users there will probably be able to point you in the right direction.
lets see that daily hehe
POPCORN!!
I have a fun idea for this one, lemme play with this!

that emoji scares me, i need to photoshop my fursona into it
Dalle seems to have no idea what it is generating or why it is generating it in a certain way. it doesn't recognise its own works, and it cannot explain, verify or identify the art-style it used.
You're correct that it can't directly see its own creations. This results in limitations like, you can't say "Don't include a striped sweater" for example, because it doesn't know in the first place that one was included. So "negative prompting" is not currently best practice.
As for verifying or identifying styles used: your best bet for that is to check the image prompt it wrote itself, as it might specify there.
Finally, one option you do have if you're trying to get a specific name of an art style is saving a DALL·E image and then uploading it back to a new GPT-4 chat. It could use its image recognition feature then to answer a request like "Please thoroughly describe the art style you see in this image"
Hello, I would recommend you post this in #1037561385070112779! Looks very nice.
i tried that, and its uncertain. it will say things like "digital painting" or digital art.
i mean, lol, when you digitally paint, the result would be digital art. common sense.
then i will ask it again and it will throw in - cinematic style, semi-realistic style, hybrid style.
meaning, its guessing, its not certain
But that might just be how art works, no? I guess, could you maybe offer an example of the kind of feedback you're looking for from it?
Right now everything leans "digital" - i.e. something you might find produced via photoshop/clipstudio. So really you barely need that tag. However, the phrase "digital" should make it lean towards a very clean line style with brighter highlights.
what i am looking for is a clear identification. when you look at this
you know its comic style
when you look at this
you know its modern videogame/cinematic style
AI cant identify even that. and its supposed to be the machine, while we are the humans.
meanwhile we are able to identify it right away, while AI is not
These are BOTH digital. However, one is digital 2D and the other is digital 3D.
You want chatGPT to be able to tell the difference?
everyone knows its both digital, its what i said
oh sorry. What exactly is the issue then?
how the type of style is different
like i said, one is comic style. the other is modern videogame/cinematic style
if you dont understand the second part - this is how most videogames' graphics look today. especially in cutscenes or cinematics
while the first one is obviously a comic style, not cartoon-ish.
so AI can recognise the comic style, but not the second one. for some reason. or similar ones, like "realistic illustration"
in fact, i am not even sure it makes a difference...
digital art style...
they branch off in... like... realistic illustration, or cinematic style, or modern video game style, or semi-realism, so on
even if by some reason, this image is a hybrid style, it obviously cant identify or verify that either
i never asked it to produce that image, it did by a chance. and its good. and you cant reproduce it. because by chance it landed on some algorithm tied to another art-style
So you want it to say that it's 3d digital art for a cinematic videogame cutscene or something along those lines?
i mean, many modern games have cinematics with pretty much this look
Okay one moment
anyone who knows 3 or 4 current videogames can tell that straight away
i never got access to uploading pictures into GPT btw. because i seem to be in a "backwater" country
sorry to hear that. thought it was now universal
yeah... such coincidence, that the famous youtubers who will promote Dalle 3, had early access.
That's surprising, so if you go to a new GPT-4 chat, you don't see an "attach file/image" button to the left side of the text entry box? App or web?
i last paid back in October. nope. might try again for December, but the thought that i will see it the same as it was does not motivate me
screen capture from a video game of an elven woman with brown hair and green eyes. The shot is a portrait view with cinematic lighting. The woman is wearing pink armor, A night time city street is in the background. The style is cartoonish indicating professional work from the early 2010s. The background is blurry to accentuate the focus on the woman.
Ah so no ChatGPT Plus currently? I gotcha. If you renew and still don't see the option to attach a file, please do post in #1070006915414900886 or #community-help because, as I understand it, every Plus account should have access to these features now. Also note that there's currently a pause in new Plus subscriptions. But having the ability to upload to ChatGPT seems important for your image description goals!
You can prompt engineer from there. To ensure it doesn't change the prompt you'll have to tell it not to do so before you submit.
what can i say, the Balkan countries. we are not priority
i will keep that in mind, thanks for your efforts
alright but, even if... EVEN IF.. i still dont have it and report in there, is there any way for anyone to repair the issue? 😄
It's very possible! OpenAI staff are periodically present in #1070006915414900886. I'd love to help ya troubleshoot it too if you get back on Plus! It's very often just a bug related to VPNs, DNS filtering, or some other weird lil network bug.
Dall-e as a product will get better over time. It's worth checking as time goes on. Of course, your mileage may vary no matter what.
no VPN, thats the thing... okay, okay. i will see it tomorrow
my last creation of pepe weeks ago.. using kodak ektar h3 emulation worked perfectly (hope mods are okay with this)


i am surprised it doesn't blur faces when u upload an image. Bing does, and partially, because of that it can't tell sh*
and pls, let us call it as it is - generation. not art.
if a person were to produce anything similar, she or he would use layers. Dalle doesn't use layers.
apologies, for reaching out to you like that, but even if i were to get back to Plus and got the Dalle-3 now, so what? of what i know, is very restrictive and censored. i am making a comic, and if it cant process some inputs like - bloody battles, or erotic interactions, then there isn't that much of use for me
well then dont use it
wait until something else fits what you need i guess
because dalle is definitely g rate only
No worries at all! You'll indeed still be subject to the usage policies https://openai.com/policies/usage-policies which would disallow the two examples you mentioned at the end. Plus the additional DALL·E content policies of no public figures and no copyrighted imagery (logos, etc.). Also again keep in mind that new Plus subscriptions are currently paused, even if you were previously subscribed.
wait, i cant subscribe? 😄
woa there, i was gonna put up some money in my bank account, for nothing
No one can right now, no! Unless you had auto-renewal already active. They paused it due to heavy server load.
maybe a few weeks you can. so many were using it they had to pause it was very slow with all the new users in the last weeks
no auto-renewal, so... yup......
so many? when Bing is free? i mean, i know its not the same, but still..
i prefer Bing actually it has less restriction. but bing doesnt have chatgpt4 turbo which is the main attraction with dalle3
bing also has many restrictions, i needed a scene where two chars kiss - impossible
dont use the word kiss. and yes its also restricted but believe me its much better than dalle3 is in gpt
neither will be good for what you want really
if you have a good pc you can do stable diffusion
therefore, no point in paying for Dalle-3, only for it to be unable to do the same as Bing, but Bing is unable FOR FREE
i guess so. if it was up to users it would be less restricted but we dont make the rules
yes, i have seen those alternatives. SD, MJ, SeaArt. the last one is good for NSFW content, but mainly for one character, and terrible at most art styles
well its early days of this stuffs, you cant expect miracles yet either.
human artists can do it all 😭
i actually have the feeling, that all these art programs.. simply share the same base
and from there on, different teams implemented their own, different adaptations , so we got some minor deviations.
but long as they all have difficulty depicting two cars, or hands and fingers.... they are not that distinct from one another
i guess but this is really for dalle discussion not ai image generators
indeed
#ai-discussions will have other suggestions
You can still discuss bing tech here though
yes, Dalle-3 is still image generator. and likely, the others use its base, as it is OpenAi's base that many others use nowadays
and Yes, Bing is powered by Dalle-3, hence why the mention
yes i just mean like steel say, sounds more a broad chat you have in mind for ai-discussion. this is for dalle/bing prompt ideas, the future, etc
imma check it out
hm, I tried to work a little story with Dalle in ChatGPT with the seed, but I don't get the same character like before. No trouble before turbo. Has anyone a tip?
The tech for seeds is no longer functioning, for now you might be able to ask it to base your next image on the previous one.
ah yes, I tried this too, but didn't work out well.
thanks for answer 🙂
Anyone have any advice for requesting that Dalle3 not include text? It seems like it wants to add text to all my illustrations. "Please don't include text" doesn't help at all lol.
Because of how diffusion models work, any time the prompt includes words like "text", "book", "phrase" or perhaps long strings of non-visual abstract details, dall-e 3 will likely output text. If you wish to exclude it, you will need to ensure these things are scrubbed from the prompt.
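If it helps, a small check like the one below can flag those words before a prompt is sent. This is only a sketch: the trigger list is just the three words mentioned above (not an official list), and the function name `find_text_triggers` is made up for illustration.

```python
import re

# Hypothetical trigger list taken from the words mentioned above; extend as needed.
TEXT_TRIGGERS = ["text", "book", "phrase"]


def find_text_triggers(prompt, triggers=TEXT_TRIGGERS):
    """Return the trigger words that appear as whole words in the prompt."""
    found = []
    for word in triggers:
        # \b keeps "context" from matching "text"; s? also catches simple plurals.
        if re.search(rf"\b{re.escape(word)}s?\b", prompt, flags=re.IGNORECASE):
            found.append(word)
    return found


prompt = "An illustration of a wizard reading a book, please don't include text"
print(find_text_triggers(prompt))
```

Note that the "please don't include text" part gets flagged too, which is the point: even mentioning the word in a negative request puts it in the prompt.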
Would you mind sharing your outputs and prompts in #1155772063596953642? I would like to catalogue the fix for this.
Totally unusable... Asked it to create an image based on the description of another image. After much goading, and chatgpt saying that it couldn't view the image, couldn't create something similar. This is what it made.
Whoa! cool! That looks like the initial state of a diffusion model image.
If you'd like help with prompts, I would recommend #1155772063596953642. If you post your problem and output there others (including myself) will probably be able to help.
chatgpt iq relating to dalle seems to have decreased by 50%
what could be done in one message now takes 6
if it even works
Another general tip to add on to what austinitic_steel said: negative prompting ("Don't include x") tends to confuse it in general, just by bringing up x at all (whether x is text or anything else)
yeah indeed. There's no negative prompting right now actually. So all words in the prompt will be considered. While ChatGPT is a bot that takes instructions, dall-e itself is a model that turns text into images. Its input is just a phrase which is typically the description of an image.
Here's the thread: https://discord.com/channels/974519864045756446/1178417489751834655
Is there a term for this tech to generate a series of related images based on id's? A general term for this creative process involving AI?
what kind of id?
I'm seeing references in this channel to gen id's and the ability to maintain consistency in a series of images including a narrative.
I'm just wondering if it suggests this built-in support means the UI for DALL-E3 when it arrives might include this feature
probably
i think if they wanted they could do some wild stuffs with dalle already, its almost like its too powerful even this early stage where they restrict it for caution
i see. a. bit of a pandora's box kind of situation
perhaps haha. but we've already looked inside... 😮
but some others probably have a better detailed answer, im just a know nothing really
oh, its all butterflies? we are safe then. the apocalypse is postponed.
for now anyway
Dalle seems to love throwing in random butterflies into images especially when in a fantasy mood.
Ask ChatGPT what the parameters are for the input and output
u could ask me and i will tell you too
i'm trying to get dalle to make an image of my family, but it keeps getting the number of people wrong. i keep saying "a man, a woman, two daughters, and a dog" and it'll give me 3 kids, or a boy, or two wives, or two dogs. how do i get it to remember how many people are in my family?
you might need dalle4 for that. i think once you go over 3 specific people it can struggle
otherwise, just keep rolling and the rng will eventually go your way
even two people it can struggle mixing up details
nice job
please expand the rate limits soon 🥺
is there any way of achieving realistic photographs?
I'm trying to get real people, but they all look like airbrushed digital art instead of real photos
I also tried animals, like a lab dog for example, but none look like real photos
Some general advice is to just ask for photos, nature photograph, professional headshot, etc. without any mention of "realistic," "photorealism" etc. Describe it as you would if you were seeing an actual photo, in other words (photos are inherently realistic).
have you ever gotten any good results? would you mind sharing?
Try doing this search in this server:
"in:💬┃dall·e-discussions photo"
You'll see lots of past discussion on this topic, including some examples!
also try the same prompt in bing image creator imo. often get better photoreal
you mean searching for 'photos' in 'dall·e-discussions' channel?
I've had this same complaint. i agree, but you can get good photo too
i find using cinematic image with film grain, add diffused lighting, dramatic lighting, soft focus
Do you mind showing something you find as good result?
ive really only been using bing lately for photo real, like i said i have this same complaint
bing also gives more variety and imagination imo and more diverse model
Yes, if you copy paste exactly what I have in quotes in my last message, you'll see lots of discussion on this topic!
proof is in the pudding though
solbus do you have some quality photo real from dalle?
I've gotten some results I've liked before, but honestly it's not my most common type of image that I generate personally, so I don't have much on hand. When I do, I just stick to the general tip I offered before, just avoiding "realistic", "photorealistic", etc., as these are terms we typically use to describe things that aren't photos but resemble something realistic in some way.
Put another way: DALL·E was trained on how humans describe images, and humans don't describe photos as "photorealistic", we just say "photo."
Edit: a helpful post regarding lighting https://discord.com/channels/974519864045756446/1176957475585536071
you say that often but its not a problem in bing
instead of a search for this topic, i think it is more telling that you dont see many (any?) good photo real from dalle shared here
Interesting! I'd be curious to see any examples you have of prompt + image comparisons between Bing and DALL·E on ChatGPT! I wonder if how the two platforms handle image prompts differently makes much of a difference
give it a shot yourself, youll see
bing is less restricted it seems to me. dalle seems to be more on guard rails
I don't use Bing myself, I'm afraid. Here's some examples from DALL·E on ChatGPT I found using the search I suggested above:
#images-discussions message
Xentoshi does get some good stuffs,
Agreed!!
pretty good
yeh, not realistic, but the closest I was able to get 😦
some similarity in BING, but better photo
Photo
GPT vs Bing (exact same prompt) photo
Everyone has the gigachad face 🤣
Even the females
the prompt specifically says "un-intesting men/woman" 😆
maybe 'un-intesting', should be explained 🤷🏼♂️
We should organize a double blind study somehow! 😁 I struggle seeing the difference myself. Square pics only, would have to figure out some way to use the same prompt on both since ChatGPT rewrites by default...hmm... I bet we could figure something out!
Yes, that would be awesome and very useful. Also, GPT can be instructed to not rewrite the prompts.
Yes for sure, I think the prompt control would be doable too!
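One way the blind part could be set up, as a rough sketch only: for each prompt, randomly decide which platform's image gets shown as "A" and which as "B", and keep that key away from whoever is rating. The function name `make_blind_trials` and the fixed seed are assumptions for illustration, not an agreed protocol.

```python
import random


def make_blind_trials(prompts, seed=0):
    """Randomly assign the two platforms to labels A and B for each prompt."""
    rng = random.Random(seed)  # fixed seed so the hidden key can be reproduced later
    trials = []
    for prompt in prompts:
        sources = ["ChatGPT", "Bing"]
        rng.shuffle(sources)
        trials.append({"prompt": prompt, "A": sources[0], "B": sources[1]})
    return trials


key = make_blind_trials([
    "Photo of a labrador in a park",
    "Professional headshot of an elderly man",
])
# Raters only ever see "A" and "B"; the key is revealed after the votes are in.
for trial in key:
    print(trial["prompt"], "-> A/B assignment kept hidden from raters")
```

For it to be double blind, the person posting the image pairs should not know the key either, so someone else would need to run this and hold on to the result.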
tried making a photograph of GPT as a robot, with my own CI it works for photographs.
do you think these are somewhat an improvement on realism? #photo
if you told me these were real photographs I would believe you
here is my contribution to the quest for realistic images. I took lots of inspiration from @mortal summit 
What prompts did u use @gritty wraith
I've also tried a plethora of realistic image techniques, and they all match the ones you've presented. And I believe it's just the end cap for how realistic the current model of Dall-E 3 can produce within a particular frame of reference.
yep... unless there is a hidden setting somewhere
Try my GPT too if you like #1172542500318232596
I too am constantly trying to find better techniques to create realistic images.
Hoping for Natural style to hit chatgpt Dalle sometime. API with that style does wonders for realistic photos.
thanks! will try it out
We need to train the GPT to make more realistic looking images for it to understand natural/realistic styles.
Throw (Hyper Lifelike) in with everything else
I'm working on a new realistic art GPT as we speak
I used variations of this prompt below. Changed man/woman, young person, etc... but mostly this:
Create a close-up portrait of an [elderly] person of [Bolivian] heritage, showcasing the rich details of sun-aged skin. The image should capture the depth and character of the weathered skin, highlighting wrinkles, age spots, and the unique textures formed from years of exposure to the sun. The natural lighting should be used to enhance the skin's textures, creating interplays of light and shadow that reveal the life story etched in the skin. The background should be a soft, creamy blur, seamlessly blending with the warm tones of the lighting to create a calm, dignified atmosphere. The photograph should emulate the quality of a high-resolution image taken with a Canon EOS R and a Canon RF 85mm F1.2L USM lens, set at 1/160 sec, f/1.2, ISO 1000. The goal is to produce an image so vivid and detailed that one can almost feel the history and resilience contained within the skin. The overall impression should be one of strength, dignity, and the beauty of age.
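If you want to swap the bracketed parts quickly, a tiny template helper is one option. This is just a sketch: the template below is only the first sentence of the prompt above (shortened for space), and `fill_portrait_prompt` is a made-up name.

```python
# First sentence of the prompt above, with the bracketed parts turned into fields.
TEMPLATE = (
    "Create a close-up portrait of {article} {age} person of {heritage} heritage, "
    "showcasing the rich details of sun-aged skin."
)


def fill_portrait_prompt(age, heritage):
    """Fill the age and heritage slots, choosing 'a' or 'an' to match the age word."""
    article = "an" if age[0].lower() in "aeiou" else "a"
    return TEMPLATE.format(article=article, age=age, heritage=heritage)


for age, heritage in [("elderly", "Bolivian"), ("young", "Irish")]:
    print(fill_portrait_prompt(age, heritage))
```

The rest of the prompt (lighting, background, camera details) would stay the same in every variation.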
Didn't think of this... as I am not sure how you can train it with images. Would that be possible?
Every prompt you make is used as potential training data for the GPT
There would be no other way to make the AI better. It takes our prompts as training data to improve itself.
mmm, but it doesn't improve itself (yet)
I thought you were talking of us (users) training a custom GPT for more realistic results...
Not the current model. But the raw data is being improved behind the scenes. The models have to be approved before we (the public) can use them.
We only have access to public ChatGPT's. But the ones OpenAI has access to are lightyears ahead of ours. They're just continuously trying to test the new data for public use.
That's pretty much how all AI companies work. They have to allocate and test the training data before it can be publically available.

WOW! What commands did you give please teach us this looks totally Awesome!!! 🙏🏻🤓
what is it possibly difficult for it to understand here? it always messes something up -
"Warcraft scene - Two people talking and gesturing, while sitting on crates - Armored male Lieutenant, in blue armors with golden edges and red pauldrons, pointy-eared helm - red helmet covers his head and face, And Jaina Proudmoore with green eyes, long blonde hair, in blue sorcerous robe and cloth white leggings. Camp with tents on background, Sunset. Comic Style"
it may give Jaina long elven ears, it may forget the Lieutenant has a helm, or it may make it blue, not red. or it will forget his helm has "ears".
i am surprised it gets hands right, but it messes the things i mentioned instead, haha
I think DALLE works better with narrated prompts. Something more like: "A vibrant Warcraft-style scene featuring two key characters engaged in conversation, set against a bustling campsite backdrop at sunset. The first character is an armored male Lieutenant, adorned in striking blue armor with golden edges and distinct red pauldrons. His appearance is characterized by a unique, pointy-eared helm, which is predominantly red, covering his head and face entirely. The second character is Jaina Proudmoore, depicted with vivid green eyes and flowing long blonde hair. She is dressed in a classic blue sorcerous robe, complemented by white cloth leggings. Both characters are casually seated on crates, deeply involved in their discussion. The scene is rich with details of the camp, including various tents, capturing the essence of a comic-style artwork."
nope,,, thats too long and confuses it further...
but i am actually using Bing, haha
what are you getting as a result?
I don't know warcraft, so I have no idea if this is good or not.
bing versions
these look good! - Thanks for sharing. I will check @mortal summit 🙂
PROMPT:
Create an extreme close-up portrait of the face of an elderly person of Bolivian heritage, laughing, with sun-aged skin and hair, and subtle wrinkles. The image should capture the joyful expression of laughter, with gentle wrinkles and age spots, giving a warm depiction of aging. The hair should be thinning, soft in texture, peppered gray to white, with hints of original darkness at the roots. The lighting should softly highlight the skin and hair textures, enhancing the joyful expression. The background should be soft and blurred, emphasizing the facial details and the expression of laughter without making the wrinkles too prominent. The photograph should emulate a high-resolution macro photograph, taken with a Canon EOS R and a Canon RF 85mm F1.2L USM lens, set at 1/160 sec, f/1.2, ISO 1000, to produce a vivid and detailed image that conveys history and resilience with a gentle, joyful portrayal of age. The overall impression should be one of dignified strength, happiness, and the beauty of age, with a focus on the joyful expression of laughter.
#photo
Regarding "real photos":
In my experience, this cannot be generalized. it depends on the complexity and quality of the prompt. i have attached some examples, Dall-E 3 each from GPT (first image) and Bing (second image). Both achieve a very similar quality, in my opinion GPT is even better because it has a higher resolution and less blurring in the generated result. To achieve this complexity and quality of the prompts, it should take a maximum of five specifications in conversation with GPT, if you start with a simple prompt like "generate me a photo of X".
The prompts are as follows:
"Craft an image of a woman with medium-length dark hair in a softly illuminated room, capturing the essence of a scene shot on Agfa Vista Plus 200 film. She is actually smoking a cigarette, adding a layer of authenticity to the retro setting. The room is suffused with a mixture of red and blue lighting, simulating the color rendition typical of Agfa Vista Plus 200, known for its fine grain and bright, lifelike colors. The grain is subtle, contributing to the film's characteristic look. She gazes off into the distance, a wisp of smoke curling around her, her expression one of deep contemplation. The shadows on her face are gently pronounced, providing dimension while reflecting the film's dynamic range."
"A silhouetted figure of a person and a horse standing on a grassy hill with a dramatic sky in the background. The person appears to be interacting with the horse, possibly tending to it or just standing close by. A small plume of smoke or mist is emanating from the ground near the person, adding a sense of mystery to the scene. The image has an artistic, vintage feel, reminiscent of early 20th-century photography with a black and white color scheme."
smoke adds a touch of realism too. I like 👍🏼
My take based on the red headed girl you posted...
some other example with an animal:
The API has a setting called style which you can set to natural. This helps a ton with making photoreal images. I'm still working on my prompting to improve them, but I think I'm getting close.
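For reference, a minimal sketch of what the request parameters might look like with that setting. It only builds the arguments and does not call anything; the helper name `build_image_request` is made up, and the size and example prompt are just placeholder choices.

```python
def build_image_request(prompt, style="natural"):
    """Assemble keyword arguments for a DALL-E 3 image request with an explicit style."""
    if style not in ("natural", "vivid"):
        raise ValueError("style must be 'natural' or 'vivid'")
    return {
        "model": "dall-e-3",
        "prompt": prompt,
        "size": "1024x1024",
        "style": style,  # "natural" tends to look less saturated than "vivid"
        "n": 1,
    }


request = build_image_request("A photo of a tabby cat on a windowsill, 35mm film")
print(request["style"])
# Assuming the official openai Python package and an API key are set up,
# these arguments could be passed as: OpenAI().images.generate(**request)
```

The style option is only exposed through the API, which is why the ChatGPT side of things falls back on custom instructions asking for "natural".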
wow! These are so cool!
To me, the cat one is a stand out. Maybe over saturation also doesn't help. In general most of the ones I generate are over saturated.
yeah that's a common thing. I want to work on that somehow or for there to be more settings maybe?

Try using macro photography in prompt too. Get some cool close ups
if you accept DMs i can help ya with that, ive been trying to get specific universe (warcraft and 40k) characters as well
I like adding black and white now. That was a cool idea. Takes away some of the saturation issues or hides them better lol
Like these are so crisp
second mouse gets the cheese
have you tried generating realistic dogs? I am trying to get a good one of an angry labrador and struggling
I've done a couple. I can try angry lab. I have a lab IRL 😂
Didn't make him angry but here's a good boy
yeh, that's what I am getting... quite plasticky
I'm getting the face expression close, but not the realism
lol yeah thats very plastic
it's trying to mimic chad-like features
haha, true!
maybe need to focus on a realistic lab, then once we have it we say, "now make it angry"
I have a GPT that you can try for more realistic images, but I don't want to make it public yet. Though I would be willing to share it with you via DMs.
If you're interested.
I think so too. I have a few art GPTs that I'm thinking about already.
Thank you!
Saving them for the GPT Store though....I want that moola 
😂
it's an improvement, maybe the BG doesn't help to make it feel realistic 🤷🏼♂️
Found an old polaroid of my dog for reference
jk its DE3 😂
Polaroids really sell it sometimes
It's an OLD DE2 cheat
Use a medium that allows details to get lost or where you expect low quality.
35mm camera photo also seems to do well for me. Naming a camera type is a DE2 trick that I learned, that is hit or miss on DE3. But it can work wonders.
..or use motionblur to hide the imperfections
nice
@quartz vale My spontaneous try, but it is hard with DALL-E 3 ^^
There is always some glow with the face and the hair is often bad.
thanks for sharing...
do you mind sharing the prompt next to the images?
they look quite nice!
I made these ones earlier, but I think the oversaturation messes them up. I like the style of the first one you sent...
Extreme close-up portrait of a young woman with natural red curls and a thoughtful, distant gaze. Her fair skin is sprinkled with a constellation of freckles across her nose and cheeks. The freckles should vary in size and intensity, contributing to a realistic texture. Her eyes should be light-colored, looking away from the viewer, reflecting a moment of introspection. The setting is a blurred backdrop of soft green hues, reminiscent of a tranquil, rainy environment. The image should capture the essence of a candid, unposed moment, with natural light highlighting the woman's hair and the intricate freckle patterns on her skin, giving a sense of authenticity and depth.
Yours looks more realistic, lol
Mine is more shiny
Yours is more natural
the style achieved by @mortal summit looks quite natural and real
less saturated, right?
yeh, I agree
Add asymmetry and uneven tones.
I tried adding asymmetry earlier, but it kept creating beautiful people! 😆
Can you share your technique or a sample?
It's loading. 🙂
I run out of credits so I can only watch for now 😆
both...
beautiful people is OK, but they usually look like fake models
instagrammy
quite similar, hey?
Dall.E's holding onto those lips for dear life.
This is cursed! And now I'm determined. Dall.E loves to give this red headed character fat model lips. Even this version with "tired eyes." LOL
"High-quality photo of a 25-year-old woman in an extreme close-up portrait, embodying a hard-lived life and distant emotions. She has natural red curls and light-colored eyes that reveal a sense of weariness, with 'tired eyes' reflecting her experiences. Her facial features include very thin lips, a slightly crooked nose, and an uneven array of freckles, enhancing her lived-in and authentic appearance. The skin tone is uneven with subtle imperfections, moving away from an idealized image. The ambient lighting is soft, coming from the environment, to accentuate the texture of her skin and the depth of her tired gaze. The background is a blurred green, suggesting a tranquil yet distant outdoor setting. This candid portrait aims to capture a moment of introspection, emphasizing the genuine character and the tiredness evident in her eyes and overall expression."
"High-quality photo of a 25-year-old woman in an extreme close-up portrait, embodying a hard-lived life and distant emotions. She has natural red curls and light-colored eyes that reveal a sense of weariness, with 'tired eyes' reflecting her experiences. Her facial features include extremely narrow thin lips, a slightly bent nose, and a subtle uneven array of freckles, enhancing her lived-in and authentic appearance. The skin tone is uneven with subtle imperfections, moving away from an idealized image. The ambient lighting is soft, coming from the environment, to accentuate the texture of her skin and the depth of her tired gaze. The background is a blurred green, suggesting a tranquil yet distant outdoor setting. This candid portrait aims to capture a moment of introspection, emphasizing the genuine character and the tiredness evident in her eyes and overall expression."
Sorry, I played the game and lost... 😛
"Extremely narrow thin lips" = no change
oh wow! I just came back from dinner, haha, you made quite a lot of images! 😆
Well, what is normally easy for me became a challenge, LOL. I stand defeated. Dall.E won't let me shrink her lips!
Apparently "lip bias" in the models is "a thing." 🤣
plz share your technique 😂 ... pretty good!
"French Girl in a quiet moment", a spontaneous street photography portrait, taken on Kodachrome 64 colour-slide film, made with a Nikon FM2 camera and Nikkor 105mm Ai-S f/2.5 lens, unstaged, natural lighting, high-ISO, grainy. The photograph depicts a young woman with short blonde hair, sitting in a metal chair, wearing an orange sweater and a blue skirt, as she clutches the scrap of a newspaper. Explained about the composition: "I was just walking by and then saw, what I viewed as capturing all the conflicting emotions today's youth has about their future."
thank you! 🙂
Apparently Irish girls don't have fish lips. 😛
I have been trying to create a highway inside a small room, but I can't.
The thing is, I can't keep up with the inconsistency, I have been crafting my prompts to get what I want, sometimes it works, sometimes it doesn't and I don't know why.
wow this place blew up haha. more messages from last few hours than a week usually 😂
“When user input prompt to dalle you must refrain from altering it, unless being explicitly asked by user. IN ANY UNDER CIRCUMSTANCES dalle must use “natural” as your default parameters, never produce any “vivid” image.”
this my result for anthropomorphic polar bear, in one attempt.
- VHS: Creates VHS inside the image.
- Vintage: Creates 70s dusty images.
- Retro: Again creates 70s aesthetics, mostly sepia and pinkish tones.
- Analog: Great results but not what I want, still unrealistic and too saturated.
- 90s style: Doesn't change anything for the most part. Only some specific images.
- Home video: Again nothing changes.
- Handheld camera: Creates handheld camera inside the image.
we need less fish lips overall. nice bear
Can anyone create a woman with short hair, but not a Bob cut? I tried many times but always failed. I even told it to generate boyish hair and NOT a bob cut. It still creates a Bob cut.
try short messy hair
same issue here, even if i Enter 1920 cowboy women I basically get modern day Instagram models ported back in time. I am looking to create more authentic images. does anybody have tips?
three-young-dirty-working-women-sitting-in-a-dimly-light-bunkhouse- drinking tea western-style-1930s-America no make up worn clothes . why do they all have perfect makeup omg...
one where i succeeded, prompt czech young woman shoulder length platin blonde hair climbing in a boulder hall realistic. I had Janja Garnbret as inspo
even when its not the same insta fish lips, its always sort of her haha
