#images-discussions
that edit feature has been active for some months, unless i misunderstand you
Btw did i understand correctly that GPT-4o can keep an image output consistent?
Mhh. Consistency is a major issue with DallE tbh. I would be interested in making a comic/manga with AI, but DallE is sadly not suited for it yet
it was explaining to me how to set up prompts so characters, clothes, and other elements can be swapped out
scroll down to "Explorations of capabilities" and give a look at what feature it will have
maybe you got the 4o update Hawaiinz
I don't think i got a DALL-E update, but GPT 4o was explaining to me how DALL-E 3 interprets the prompts, and how the edit feature doesn't have another internal representation of the images. it actually edits the english description
But even if 4o is consistent i am not sure if a good comic is doable already.
The censoring in DallE-3 is just way too harsh for actual expression of art and more mature topics than children stories
well one example they show is creation of a comic but yes it probably would not allow anything too crazy yet
it understands how to move characters between scenes and rotate them
nice. although it also changed her shirt
Generally i worry that the censoring will actually get harsher, rather than more lenient. Especially with the recent sky drama it shows that OAI is not willing to actually depart from corpo safety over quality
you can swap out the hair, clothes, and other elements if you set them up in order
Mhh did you get this in 4o or Dalle3?
i'm going to post a lot of it here in the next 15 mins
Dall-E 3 is the same i think, but GPT 4o is a lot smarter, and it has better reasoning skills for working with prompts
it does seem to me that, even without 4o fully turned on yet, the 4o model makes somewhat better images with dalle, but i cannot tell (for myself) if it is just placebo haha
==========
Below is "Part 1" of my many questions to GPT 4o about how to structure DALL-E 3 prompts so that characters, clothes, hair, and other visual elements will appear visually consistent between images -- and will survive an image being edited. I'll create an image gallery in the OpenAI Discord to organize all of the information I collect and make it accessible to everyone. All of the information in this part is directly from ChatGPT 4o (except for my questions), and not everything has been tested yet.
=========
Does DALL-E hold onto an extra representation of the images after they are created that is referenced by the GenID. such as a high level vector representation of the image, or other formats?
Working with DallE prompts be like:
"Female detected, prompt rejected"
How does the image edit feature work in the ChatGPT web interface if there is no other representation? It just edits the revised prompt in english?
.
.
Are there specific terms that DALL-E recognizes in the prompt such as color, texture, or light that correspond to the internal representation of the image?
.
Is there a hierarchical structure to the prompts, and an order of operations by which DALL-E reads them? For example, how do you structure the prompts so that you can change the clothes on a character, or the style of a scene? What order or hierarchical structure should all of the elements you described be in?
.
.
Original Prompt (Scene 1 - Park):
A young woman in her mid-20s with long, wavy brown hair, blue eyes, and a fair complexion. She is wearing a green dress with floral patterns and brown boots. She has a small star-shaped tattoo on her right wrist and a friendly smile. She is sitting on a bench in a sunny park, reading a book. The background has trees and a playground. Bright, natural light. A wooden bench. No other people, animals, or additional objects in the scene.
Modified Prompt for Different Clothing:
A young woman in her mid-20s with long, wavy brown hair, blue eyes, and a fair complexion. She is wearing blue jeans and a white t-shirt. She has a small star-shaped tattoo on her right wrist and a friendly smile. She is sitting on a bench in a sunny park, reading a book. The background has trees and a playground. Bright, natural light. A wooden bench. No other people, animals, or additional objects in the scene.
Modified Prompt for Different Style:
A young woman in her mid-20s with long, wavy brown hair, blue eyes, and a fair complexion. She is wearing a green dress with floral patterns and brown boots. She has a small star-shaped tattoo on her right wrist and a friendly smile. She is sitting on a bench in a park, reading a book. The background is painted in an impressionist style with trees and a playground. Soft, diffused light. A wooden bench. No other people, animals, or additional objects in the scene.
Example Prompts with Structured Hierarchy
Scene 1 - Park:
A young woman in her mid-20s with long, wavy brown hair, blue eyes, and a fair complexion. She is wearing a green dress with floral patterns and brown boots. She has a small star-shaped tattoo on her right wrist and a friendly smile. She is sitting on a wooden bench in a sunny park, reading a book. The background has trees and a playground. Bright, natural light. No other people, animals, or additional objects in the scene.
Scene 2 - Beach:
A young woman in her mid-20s with long, wavy brown hair, blue eyes, and a fair complexion. She is wearing a green dress with floral patterns and brown boots. She has a small star-shaped tattoo on her right wrist and a friendly smile. She is walking along a beach during sunset, holding her sandals in one hand. The waves are gently crashing in the background. Soft, golden light. No other people, animals, or additional objects in the scene.
Scene 3 - Coffee Shop:
A young woman in her mid-20s with long, wavy brown hair, blue eyes, and a fair complexion. She is wearing a green dress with floral patterns and brown boots. She has a small star-shaped tattoo on her right wrist and a friendly smile. She is sitting at a wooden table by the window in a cozy coffee shop, sipping coffee, with a laptop in front of her. Dim, warm lighting. No other people, animals, or additional objects in the scene.
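The ordering used in the three scenes above (fixed character description, then clothing, identifying marks, action and setting, lighting, exclusions) can be sketched as a small prompt builder. This is only an illustration: `build_prompt` and its parameter names are made up here and are not part of DALL-E or ChatGPT, and the ordering itself is what GPT 4o suggested, which has not been fully tested.

```python
def build_prompt(character, clothing, marks, action, lighting, exclusions):
    """Join prompt sections in one fixed order so the same slots line up across scenes."""
    sections = [character, clothing, marks, action, lighting, exclusions]
    return " ".join(s.strip() for s in sections if s.strip())


# Sections that stay identical in every scene (text taken from the prompts above).
CHARACTER = ("A young woman in her mid-20s with long, wavy brown hair, "
             "blue eyes, and a fair complexion.")
CLOTHING = "She is wearing a green dress with floral patterns and brown boots."
MARKS = "She has a small star-shaped tattoo on her right wrist and a friendly smile."
EXCLUSIONS = "No other people, animals, or additional objects in the scene."

# Only the action/setting and lighting sections change between scenes.
park = build_prompt(
    CHARACTER, CLOTHING, MARKS,
    "She is sitting on a wooden bench in a sunny park, reading a book. "
    "The background has trees and a playground.",
    "Bright, natural light.",
    EXCLUSIONS,
)
beach = build_prompt(
    CHARACTER, CLOTHING, MARKS,
    "She is walking along a beach during sunset, holding her sandals in one hand. "
    "The waves are gently crashing in the background.",
    "Soft, golden light.",
    EXCLUSIONS,
)

print(park)
print(beach)
```

Keeping the shared sections as constants means a typo or rewording cannot creep into the character description from one scene to the next.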
Exactly what elements is it possible to change without changing other elements in a scene?
.
.
Example of Independent Changes
Original Prompt (Park Scene):
A young woman in her mid-20s with long, wavy brown hair, blue eyes, and a fair complexion. She is wearing a green dress with floral patterns and brown boots. She has a small star-shaped tattoo on her right wrist and a friendly smile. She is sitting on a wooden bench in a sunny park, reading a book. The background has trees and a playground. Bright, natural light. No other people, animals, or additional objects in the scene.
Changing Clothing:
A young woman in her mid-20s with long, wavy brown hair, blue eyes, and a fair complexion. She is now wearing blue jeans and a white t-shirt. She has a small star-shaped tattoo on her right wrist and a friendly smile. She is sitting on a wooden bench in a sunny park, reading a book. The background has trees and a playground. Bright, natural light. No other people, animals, or additional objects in the scene.
Changing Action/Pose:
A young woman in her mid-20s with long, wavy brown hair, blue eyes, and a fair complexion. She is wearing a green dress with floral patterns and brown boots. She has a small star-shaped tattoo on her right wrist and a friendly smile. She is now standing next to the bench, waving her hand. The background has trees and a playground. Bright, natural light. No other people, animals, or additional objects in the scene.
Changing Lighting:
A young woman in her mid-20s with long, wavy brown hair, blue eyes, and a fair complexion. She is wearing a green dress with floral patterns and brown boots. She has a small star-shaped tattoo on her right wrist and a friendly smile. She is sitting on a wooden bench in a park, reading a book. The background has trees and a playground. The scene is now lit by a warm, golden sunset. No other people, animals, or additional objects in the scene.
Adding Objects:
A young woman in her mid-20s with long, wavy brown hair, blue eyes, and a fair complexion. She is wearing a green dress with floral patterns and brown boots. She has a small star-shaped tattoo on her right wrist and a friendly smile. She is sitting on a wooden bench in a sunny park, reading a book. There is now a bicycle next to the bench. The background has trees and a playground. Bright, natural light. No other people or animals in the scene.
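One way to make sure an edit really touches only one element is to keep each element in its own slot and swap a single slot at a time. The sketch below is an assumption-laden illustration, not anything DALL-E provides: the `Scene` class, its field names, and `changed_slots` are all hypothetical.

```python
from dataclasses import dataclass, fields, replace


@dataclass(frozen=True)
class Scene:
    """Hypothetical container: one field per independently editable element."""
    character: str
    clothing: str
    marks: str
    action: str
    background: str
    lighting: str
    exclusions: str

    def prompt(self):
        # Sections are emitted in declaration order.
        return " ".join(getattr(self, f.name) for f in fields(self))


def changed_slots(a, b):
    """Return the names of the fields that differ between two scenes."""
    return [f.name for f in fields(a) if getattr(a, f.name) != getattr(b, f.name)]


park = Scene(
    character="A young woman in her mid-20s with long, wavy brown hair, blue eyes, and a fair complexion.",
    clothing="She is wearing a green dress with floral patterns and brown boots.",
    marks="She has a small star-shaped tattoo on her right wrist and a friendly smile.",
    action="She is sitting on a wooden bench in a sunny park, reading a book.",
    background="The background has trees and a playground.",
    lighting="Bright, natural light.",
    exclusions="No other people, animals, or additional objects in the scene.",
)

# Swap exactly one slot per edit; everything else is carried over unchanged.
new_clothes = replace(park, clothing="She is now wearing blue jeans and a white t-shirt.")
new_pose = replace(park, action="She is now standing next to the bench, waving her hand.")

print(changed_slots(park, new_clothes))
print(new_pose.prompt())
```

Whether DALL-E actually leaves the other elements alone is a separate question; this only guarantees that the prompt text differs in one place.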
Is there a specific grammar and syntax for DALL-E prompts to create scenes that can be edited?
.
...
Example Prompts
Original Scene (Park)
A young woman in her mid-20s with long, wavy brown hair, blue eyes, and a fair complexion. She is wearing a green dress with floral patterns and brown boots. She has a small star-shaped tattoo on her right wrist and a friendly smile. She is sitting on a wooden bench in a sunny park, reading a book. The background has trees and a playground. Bright, natural light. No other people, animals, or additional objects in the scene.
Edited Scene (Changing Clothing)
A young woman in her mid-20s with long, wavy brown hair, blue eyes, and a fair complexion. She is now wearing blue jeans and a white t-shirt. She has a small star-shaped tattoo on her right wrist and a friendly smile. She is sitting on a wooden bench in a sunny park, reading a book. The background has trees and a playground. Bright, natural light. No other people, animals, or additional objects in the scene.
Edited Scene (Changing Environment)
A young woman in her mid-20s with long, wavy brown hair, blue eyes, and a fair complexion. She is wearing a green dress with floral patterns and brown boots. She has a small star-shaped tattoo on her right wrist and a friendly smile. She is sitting on a wooden bench in a bustling city street, reading a book. The background shows tall buildings and busy sidewalks. Bright, natural light. No other people, animals, or additional objects in the scene.
.
.
Example of Structured and Modular Prompts
Original Scene:
A young man with short black hair, wearing a red shirt and black pants, is standing in a modern kitchen. He is holding a cup of coffee and smiling. The kitchen has white cabinets, a stainless steel refrigerator, and a marble countertop. Bright, natural light is coming from a large window. No other people or objects are in the scene.
Edited Scene (Changing Action):
A young man with short black hair, wearing a red shirt and black pants, is standing in a modern kitchen. He is now cooking at the stove, with a focused expression. The kitchen has white cabinets, a stainless steel refrigerator, and a marble countertop. Bright, natural light is coming from a large window. No other people or objects are in the scene.
By following these guidelines, you can create structured prompts for DALL-E that allow for specific elements to be edited without altering other parts of the scene. This approach ensures clarity and precision in the generated images.
.
How do you encapsulate elements in the prompts so they're self-contained and won't be affected by changes that happen before or after them?
.
Example Prompts with Encapsulated Elements
.
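**Original Scene (Park)**
```A young woman in her mid-20s with long, wavy brown hair, blue eyes, and a fair complexion. She is wearing a green dress with floral patterns and brown boots. She has a small star-shaped tattoo on her right wrist and a friendly smile.
She is sitting on a wooden bench in a sunny park, reading a book. The background has trees and a playground. Bright, natural light. No other people, animals, or additional objects in the scene.```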
.
Editing Prompts Independently
When making changes, ensure each section is self-contained and explicitly detailed to avoid unintended changes to other parts.
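**Changing Clothing**
```A young woman in her mid-20s with long, wavy brown hair, blue eyes, and a fair complexion. She is now wearing blue jeans and a white t-shirt. She has a small star-shaped tattoo on her right wrist and a friendly smile.
She is sitting on a wooden bench in a sunny park, reading a book. The background has trees and a playground. Bright, natural light. No other people, animals, or additional objects in the scene.```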
**Changing Action/Pose**
```A young woman in her mid-20s with long, wavy brown hair, blue eyes, and a fair complexion. She is wearing a green dress with floral patterns and brown boots. She has a small star-shaped tattoo on her right wrist and a friendly smile.
She is now standing next to the bench, waving her hand. The background has trees and a playground. Bright, natural light. No other people, animals, or additional objects in the scene.```
**Changing Environment/Background**
```A young woman in her mid-20s with long, wavy brown hair, blue eyes, and a fair complexion. She is wearing a green dress with floral patterns and brown boots. She has a small star-shaped tattoo on her right wrist and a friendly smile.
She is sitting on a wooden bench in a bustling city street, reading a book. The background shows tall buildings and busy sidewalks. Bright, natural light. No other people, animals, or additional objects in the scene.```
.
.
Example of a Highly Structured and Encapsulated Prompt
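**Original Scene:**
```A young man with short black hair, wearing a red shirt and black pants. He has a small star-shaped tattoo on his right wrist and a friendly smile.
He is standing in a modern kitchen. He is holding a cup of coffee. The kitchen has white cabinets, a stainless steel refrigerator, and a marble countertop. Bright, natural light is coming from a large window. No other people or objects are in the scene.```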
**Edited Scene (Changing Action):**
```A young man with short black hair, wearing a red shirt and black pants. He has a small star-shaped tattoo on his right wrist and a friendly smile.
He is standing in a modern kitchen. He is now cooking at the stove, with a focused expression. The kitchen has white cabinets, a stainless steel refrigerator, and a marble countertop. Bright, natural light is coming from a large window. No other people or objects are in the scene.```
By following these practices, you can create prompts that clearly and independently describe each element of the scene, making it easier to edit specific parts without affecting others.
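The encapsulated layout (character block on the first line, scene block on the second) can be sketched as a two-part string where edits are only allowed inside the scene block. `split_blocks` and `edit_scene` are hypothetical helper names for this illustration, and treating the line break as the block boundary is an assumption based on the prompts above.

```python
def split_blocks(prompt):
    """Split a prompt into (character_block, scene_block) at the first line break."""
    character, _, scene = prompt.partition("\n")
    return character.strip(), scene.strip()


def edit_scene(prompt, old, new):
    """Replace one sentence inside the scene block, leaving the character block untouched."""
    character, scene = split_blocks(prompt)
    if old not in scene:
        raise ValueError("sentence to replace is not in the scene block")
    return character + "\n" + scene.replace(old, new, 1)


original = (
    "A young man with short black hair, wearing a red shirt and black pants. "
    "He has a small star-shaped tattoo on his right wrist and a friendly smile.\n"
    "He is standing in a modern kitchen. He is holding a cup of coffee. "
    "The kitchen has white cabinets, a stainless steel refrigerator, and a marble countertop. "
    "Bright, natural light is coming from a large window. "
    "No other people or objects are in the scene."
)

edited = edit_scene(
    original,
    "He is holding a cup of coffee.",
    "He is now cooking at the stove, with a focused expression.",
)
print(edited)
```

An attempt to change something that lives in the character block (for example the shirt color) raises an error instead of silently rewriting the character.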
.
Is everything in a prompt translated to English (if it isn't in English already) before being interpreted?
.
.
How does the system handle things that overlap. For instance, if a character is wearing a long sleeve shirt and you say there is a tattoo on their arm -- but the shirt is already covering where the tattoo will be?
.
.
Example
Conflict Scenario
A young woman with long, wavy brown hair, blue eyes, and a fair complexion. She is wearing a long sleeve shirt and has a small star-shaped tattoo on her right wrist.
In this scenario, it's ambiguous whether the tattoo should be visible given that her wrist is covered by the long sleeve shirt.
Clarified Prompts
Making the Tattoo Visible:
A young woman with long, wavy brown hair, blue eyes, and a fair complexion. She is wearing a long sleeve shirt with the sleeves rolled up to her elbows, revealing a small star-shaped tattoo on her right wrist.
Tattoo Covered by Shirt:
A young woman with long, wavy brown hair, blue eyes, and a fair complexion. She is wearing a long sleeve shirt that covers her arms. She has a small star-shaped tattoo on her right wrist, which is not visible due to the sleeves.
Additional Examples with Overlapping Elements
Glasses and Hair
Ambiguous Prompt:
A young man with short black hair and glasses. His hair is styled in a way that it covers his eyes.
Clarified Prompts:
(Glasses Visible)
A young man with short black hair, wearing glasses that sit firmly on his nose, with his hair styled neatly to avoid covering his eyes.
(Hair Covering Glasses)
A young man with short black hair, styled messily so that it covers his eyes, obscuring his glasses.
.
Example with Structured Details
.
Original Scene:
A young woman in her mid-20s with long, wavy brown hair, blue eyes, and a fair complexion. She is wearing a long sleeve shirt and has a small star-shaped tattoo on her right wrist. She is sitting on a wooden bench in a sunny park, reading a book. The background has trees and a playground. Bright, natural light. No other people, animals, or additional objects in the scene.
Clarified Prompt (Tattoo Visible):
A young woman in her mid-20s with long, wavy brown hair, blue eyes, and a fair complexion. She is wearing a long sleeve shirt with the sleeves rolled up to her elbows, revealing a small star-shaped tattoo on her right wrist. She is sitting on a wooden bench in a sunny park, reading a book. The background has trees and a playground. Bright, natural light. No other people, animals, or additional objects in the scene.
By explicitly clarifying how elements should be handled when they overlap, you can guide DALL-E to generate images that accurately reflect your intent.
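A rough way to catch this kind of overlap before sending a prompt is a keyword check. The sketch below is a crude heuristic under stated assumptions: the `COVERS` table, the `CLARIFIERS` list, and `find_overlaps` are invented for this example and only recognize the tattoo wording used above.

```python
import re

# Hypothetical lookup: garment keyword -> body parts it would normally cover.
COVERS = {
    "long sleeve": ("wrist", "arm", "elbow"),
    "gloves": ("hand", "finger"),
}

# Phrases that suggest the prompt already says what should be visible.
CLARIFIERS = ("rolled up", "revealing", "visible", "hidden")


def find_overlaps(prompt):
    """Return (garment, body_part) pairs where a tattoo sits under a covering garment
    and the prompt contains no clarifying phrase."""
    text = prompt.lower()
    if any(phrase in text for phrase in CLARIFIERS):
        return []
    overlaps = []
    for garment, parts in COVERS.items():
        if garment not in text:
            continue
        for part in parts:
            pattern = r"\btattoo on (?:her|his|their) (?:right |left )?" + part + r"\b"
            if re.search(pattern, text):
                overlaps.append((garment, part))
    return overlaps


conflict = ("A young woman with long, wavy brown hair, blue eyes, and a fair complexion. "
            "She is wearing a long sleeve shirt and has a small star-shaped tattoo on her right wrist.")
clarified = ("A young woman with long, wavy brown hair, blue eyes, and a fair complexion. "
             "She is wearing a long sleeve shirt with the sleeves rolled up to her elbows, "
             "revealing a small star-shaped tattoo on her right wrist.")

print(find_overlaps(conflict))
print(find_overlaps(clarified))
```

It only flags prompts for a human to reword; it does not know what DALL-E will actually draw.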
How is it possible to view a character facing in different directions? What if you wanted to have the same character turning 30 degrees in each successively generated image so you could view them from all sides? How can this be done without overlaps and conflicts?
.
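The 30-degree turnaround idea from the question above can at least be set up mechanically by generating one prompt per angle with the character description held fixed. This is a sketch under assumptions: `turnaround_prompts` is a made-up helper, the rotation wording is a guess, and it is untested whether DALL-E follows exact angles.

```python
def turnaround_prompts(character, step_degrees=30):
    """Build one prompt per viewing angle, keeping the character text identical."""
    if step_degrees <= 0 or 360 % step_degrees != 0:
        raise ValueError("step_degrees must be a positive divisor of 360")
    prompts = []
    for angle in range(0, 360, step_degrees):
        # Assumed wording for the view; only this sentence changes between prompts.
        view = (f"The character is standing upright, rotated {angle} degrees "
                "clockwise from facing the camera.")
        prompts.append(
            f"{character} {view} Plain white background. Even, neutral light. "
            "No other people, animals, or objects in the scene."
        )
    return prompts


character = ("A young woman in her mid-20s with long, wavy brown hair, blue eyes, "
             "and a fair complexion. She is wearing a green dress with floral patterns "
             "and brown boots.")

views = turnaround_prompts(character)
print(len(views))
print(views[1])
```

A plain background and neutral light are used here so that nothing else in the scene competes with the rotation, which is also an assumption rather than a tested recipe.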
==========
Above is "Part 1" of my many questions to GPT 4o about how to structure DALL-E 3 prompts so that characters, clothes, hair, and other visual elements will appear visually consistent between images -- and will survive an image being edited. I'll create an image gallery in the OpenAI Discord to organize all of the information I collect and make it accessible to everyone. All of the information in this part is directly from ChatGPT 4o (except for my questions), and not everything has been tested yet.
=========
So, my adage is valid - ask the AI for instructions for things you want to do.
These seem interesting! Are these instructions you're preprompting before hand? Have you tested how well it sticks to these instructions?
I find that there are certain description tendencies that chatgpt wants to do with dall-e inputs. E.g. it wants to talk a lot about non-visual elements of emphasis. I'm a control freak so I prefer if non-visual details weren't present
i tried this concept with light sources over a static figure using time and color gradient alterations
was trying to do time-dependent variability on a series of images
why does chatgpt automatically use 4o without you wanting it? i never asked for gpt 4o to be in the chat, but the update made it so, and now i can't switch to 3.5
it's like now i need to wait 4 hours just to check 10 times and wait again
why this update
why can't it just let us choose if we want to or not, instead of chatgpt just assuming we want to use the new ai
Does anyone know how to upload datasets to DALL-E so it can provide an advanced visual representation of data?
Images are much more than visual. The AI has seen boats sailing under beautiful sunsets, and drawings of magnificent architecture -- but the AI has also seen the behavior of humans, animals, and everything else throughout history. The AI has seen ancient cave paintings with dragons roasting villagers and eating them. Images of cats eating dogs, and dogs eating cats... and witches turning people into frogs. The AI has even seen images of people doing a bunch of NSFW stuff from the dark corners of the internet
Are you okay @empty kelp why do you take so long to text
AI diffusion models that draw the images try to recreate what they saw, and that includes the behavior that the images captured. It also tries to reproduce other elements that were in a scene
You can test it by creating a DALL-E image that has small elves and a large gecko. You'll notice two things in the image right away:
-- In your image the gecko will be trying to eat the elves.
-- Flies and all sorts of weird bugs will appear in the image.
The diffusion engine tries to recreate what it saw in its training, and that includes geckos eating small creatures, and bugs flying around.
I'll explain why that's a problem. Behavior in the images makes them very hard to edit...
It's not just geckos that inherit the behavior. Pretty much everything does to some degree
The diffusion model expects things to look a certain way based on what it saw in its training, and so you sort of need to take a "go with the flow" type approach to the prompts, or it will conflict with what the AI model expects -- and that causes total havoc in the images. Like really weird stuff
Elements in the prompt need to be ordered so that they don't conflict with each other, and you need to make some concessions in the edits that you make so it will fit in with the vision of the diffusion model. So it needs a combination of carefully ordering the prompt, some trial and error to figure out what fits in with the understanding of the model, and some luck so it doesn't take too many attempts
A DALL-E scene isn't a collection of static objects. A transformer converts the English prompt into a high level vector representation for the diffusion model, and the vectors move the image in a direction that's similar to what the model saw in its training. And it does it by doing multiple iterations on the image -- each one moves it closer to what it should look like. Like if you have a complex scene where lots of elves are balancing on top of each other and the prompt specifies precise hand positions -- then it's going to take a lot of iterations and processing power to move towards what the image should look like. And you'll likely have some anomalies (missing limbs, heads) because you run out of your allocated processing time
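The "each iteration moves it closer" idea can be shown with a toy loop. To be clear about the assumptions: this is not how DALL-E or any real diffusion model is implemented (those denoise an image with a learned network); `refine`, `residual`, and the numbers below are made up purely to show that a fixed number of steps leaves a larger gap than a generous one.

```python
def refine(start, target, steps, rate=0.3):
    """Toy refinement: each step closes a fixed fraction of the remaining gap."""
    current = list(start)
    for _ in range(steps):
        current = [c + rate * (t - c) for c, t in zip(current, target)]
    return current


def residual(current, target):
    """Largest remaining difference between the current state and the target."""
    return max(abs(t - c) for c, t in zip(current, target))


target = [1.0, -2.0, 0.5]   # stand-in for "what the image should look like"
noise = [0.0, 0.0, 0.0]     # stand-in for the starting point

few_steps = residual(refine(noise, target, steps=5), target)
many_steps = residual(refine(noise, target, steps=50), target)

print(few_steps, many_steps)
```

With only a handful of steps the state is still visibly off target, which is the intuition behind the leftover anomalies described above.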
See... This is what happens. If you put elves and a gecko into the same DALL-E 3 image then the gecko always finds the smallest elf and tries to eat it. You always see it trying to pounce on the elves, and looking like it's hungry. -- regardless of how cute and cartoonish the image style is
That's why the images are tricky to edit. DALL-E tries to draw what the actual behavior is of the human/creature, and the image moves towards it. So flies will spawn near the gecko because that is what geckos eat
DALL-E 3 geckos are worse on Bing. This is four images of "very small humans walking past a large gecko". We were testing this to see if DALL-E creatures have behaviors, and they actually do
You can see that one of the characters spawned an umbrella so the gecko wouldn't drool on it (which is one of the most interesting AI behaviors i've seen)
I don't want to post too many gecko images since that would be crazy, but i'll give a few more examples as they show behavior the DALL-E geckos inherit from the diffusion model training. Just adding a gecko to a scene will spawn all sorts of flying bugs. Also regardless of what the prompt is, if there is a small creature in the scene the gecko will completely ignore the prompt, and either lock onto the smallest creature in the scene with the appearance of wanting to eat it -- or the image will show the gecko in midair pouncing on the elf, bug, etc. with its mouth open and looking really disturbing. It's not possible to edit an image like this because the gecko strongly conflicts with any changes
You can't add something to a diffusion model that conflicts with training. It just won't be able to draw it, and so it swaps in something completely different
This is an example of a conflict that I could have avoided if the prompt was set up correctly. I had no problem editing the other two, but any small edit to the ogre just corrupted the entire character. Attempting to change his hair color turned him into a blue blob.
It's also almost impossible to put the character into another image if the prompt isn't set up for it
Did the daily theme end already?
The one holding a skateboard on a podium is dope!
my pfp is generated by dalle3
I have a question ???
How much does the full version of ChatGPT cost, to get images?
gpt plus, $20/month
okey thanks
the macos app can't do edits yet? i can't see the revised prompts either
great value imo
deja vu entered the chat
i cant wait to see what hawainz can do with 4o image maker for christmas time this year haha
hopefully it's out by then 🙏
Hi, I have a paid ChatGPT account, but I'm not able to figure out how to create images using DALL-E. Can someone help?
What URL should I use to create the image? When I click DALL-E 3, it takes me to this URL: https://chatgpt.com/?oai-dm=1 Here 4o is chosen by default and it does not show DALL-E as an option. When I ask it to create an image, it says it does not have the capability to create images
Ok. I will try after some time. Maybe it's because I created the account just now
It is working now. Thank you @teal sandal
Can free users get a dose of inpaint?
I need an inpaint on her nose....
I can upload this on gpt 4o and ask that?
Doesn't work
Aww
Really like the first one
And the command is ”Generate X images, one after the other, without any extra prompting from me.”
Well, at least the version I use.
Oh okay, neat
DALLE 4 coming soon 2024 Summer ✨
After we all test out and improve GPT-4o
Better every time haha
Not bad
I’m impressed
there are no announcements about a new DALL-E version coming any time soon
4o image maker will be the best anyway
No new version is fine, but surely we'll get those improved text and consistent character features they demo'd around the time they announced gpt4o at some point, right?
that was gpt4o directly. I also hope we'll get access to the multimodality at some point.
There is an interesting problem with underwater scenes. DALL-E 3 almost always depicts some kind of surface, where light comes from above. This can be really annoying if u want to just create a deep sea scene.
Sora will be out by then. we can make a feature film with Santa and the athletic and diverse female elves
help me out.
if I write "ugly" or "unappealing" the prompt is banned
but I need a plain-looking woman
instead I keep getting top models
any tips?
this is supposed to be "mediocre, plain-looking, big nosed, imperfections"
wow
it's..... very problematic
it is very likely that there will be updates to image generation, OAI just hasn't announced anything, but it is safe to presume they are still working on image generation
@stiff shoal - We accept your challenge!
@stiff shoal like this?
Wow, I tried a lot but never succeeded. May I ask, what is the keyphrase in your prompt to exclude surface? Congrats btw.
Sure one second
This is what I told our AI 😂
I gave our AI the challenge
But try this prompt mate :
A deep sea scene with no visible surface, featuring a dark and mysterious underwater landscape. The image should include a variety of sea creatures such as bioluminescent jellyfish, deep-sea fish with glowing features, and strange sea plants. The background should be a dark, almost black color, with soft, scattered light from bioluminescent organisms illuminating parts of the scene. There should be a sense of depth and vastness, emphasizing the deep ocean environment.
See if you're able to recreate this now, or you could say what I said and give our AI the challenge to create this image haha
Prompt :
“There is an interesting problem with underwater scenes. DALLE-3 almost ever depict some kind of surface, where light comes from above. This can be really annoying if u want to just create a deep sea scene.” Let’s prove our community wrong haha and show that we can make this image!
- looking forward to your result!
Cool, respect! Seems that the "priming" with the community did the trick. "With no visible surface" never worked in my attempts.
Because DALL-E often ignores negation I suppose…
@grizzled iris It worked, but we could improve the scene:
@grizzled iris thank you for the Inspiration.
No worries, you just inspired me to also make some more art ✨
Are there daily limits for DALL-E in the Plus plan?
Sure we can help, let me know what reference image to use 🙏🏽
How do I check my usage?
Yes just a moment
So I need to make an icon like this. Or exactly like this
But I have done a lot of prompting and tried many times, and it doesn't get me that chubby face with the kind of Chinese eyes I'm looking for
I want it to look the same as that (almost)
Anyone up for this?
It's making stuff like this all the time
Thanks!
Let’s see
We got it almost right
We need another word to mean “Chinese eyes” so it doesn’t think we are being rude
Let’s try this!
An epicanthic fold or epicanthus is a skin fold of the upper eyelid that covers the inner corner (medial canthus) of the eye.
Let’s see what happens now
Yes there is!
How do I check my usage?
Okay
Btw the Words normally used are Narrow eyes. Or Almost Closed Eyes
Also it needs to have that sharpness
Yea. But i need it to have those "Asian Eyes from the China part"
I want to make exactly or almost same sticker
Yeah, this is turning out to be quite a challenge!
Hmm. I agree on that. But AFAIK the person used DALL-E as well to make this
Collaborate with our OpenAI Instagram page! Just invite @openai as a collaborator before you post. If selected by our team, your work will be featured on our handle, giving you more visibility.
Quick tips for selection:
- No copyrighted or profane content.
- Your profile needs to be public.
- Stay mindful of sensitive topics.
- Outputs that closely follow the prompt are preferred.
Feel free to share reels, carousels, or just a single image!
Sure.
getting it to make exactly something is impossible
By Exactly I meant the facial look
well, in that case, try to describe it very well
also give the image to the AI, that helps
I did all that
We’re almost getting there
then, ask it to generate a few times until you get an image that looks about right
Nice 😃
from there, you can ask the AI to re-make it and you can also inpaint
I got timeout 3 times. You can guess how much i tried
I see
well, not much to do to get around that besides waiting, but you will get there eventually
Yea it does that. It makes it full black
Thanks!
But before I saw your problem, I was also getting dozens of full eyes
Fully opened eyes
Not diverse enough
Yea
But we can all change that, we just need to keep improving it together
We can use this
To refine it
That is scary
Yea im still working on it
Yeah, one of the biases in the training data
We can reduce this by continuing to give thumbs-down feedback
And telling our AI how it should be
Almost
Try to use this as a reference image
Let’s see if that’ll help
Kinda getting there
It needs to lessen the height. And Make it less cute. (the pupil)
Create a digital sticker of a character with a chubby face, narrow eyes, and an epicanthic fold. The character should have a short, black hairstyle, a neatly trimmed beard and mustache, and be wearing a dark suit with a red tie. The eyes should be depicted similarly to the reference images, showing the inner corner of the eyes covered by a skin fold and a narrow eye shape. The style should be cartoonish with clean lines and vibrant colors, accurately representing the diverse human features. The character should have a smiling expression.
Try that, and add in the reference image x
ok
I got an even better idea
Let’s feed it Chinese anime reference images also
That’ll be closer to what’s required also
yea
My friend just tried it using a Chinese AI image generator
They don’t have this problem but we also shouldn’t as many people live abroad
So we need to fix this haha
Omg!
I fixed it !
@frank cedar
How????
It got the eyes.
Wooooo!
A detailed anime character with a confident expression. The character has short, black hair with bangs that partially cover the forehead. The eyes are sharp and slightly narrowed, giving a mischievous look. The character is wearing a white headscarf with loose ends hanging down, a white and green outfit with a high collar, and a decorative red tassel hanging from the left ear. The background shows an indoor setting with traditional elements.
Prompt :
But we need to get it on a sticker
It's a 3D-ish look
We can build on this
A detailed anime character with a confident expression. The character has short, black hair with bangs that partially cover the forehead. The eyes are sharp and slightly narrowed, giving a mischievous look. The character is wearing a white headscarf with loose ends hanging down, a white and green outfit with a high collar, and a decorative red tassel hanging from the left ear. The background shows an indoor setting with traditional elements.
One thing at a time
We had to get over the eyes issue in the entire model 😂
The eyes issue is fixed now it should be easier
🙏🏽📿
Oh okay
Btw I got an issue.
😅
Its gonna take me a few hrs before i can generate more images
😭😭
Jump on DALLE 3 - Copilot
When that happens I also go to Copilot - DALLE 3
maybe. i feel like sora is still a long ways out but 🤷
I didnt buy copilot yet.
And my credits are used. Im doin on that only
It's generating the image but taking time. Don't worry tho. I'm still on it. 😃
No need to buy! I’m on free version
Copilot app, then click on Designer
Yea me too. Im just saying it says that. Your credits are used so images may take more time generating
Thats all
You're right, it does take much, much longer
Oh if u dont mind me asking. I tried sending you a fr.
U aint accepting any
Yup
This is getting there?
That’s better now ❤️🔥
Yes
Btw guys
Make a Face of Chubby Chinese man, Short trimmed hair, Faded side Hair, The eyes are sharp and slightly narrowed, and thin eyebrows, Wearing a suit, Thin beard and mustache, cartoon illustration, Sticker, Thin outline
Heres the prompt I am using
Thanks you, we have a good community here x
A 2D flat cartoon animated style image of a man with short, black hair and bangs that partially cover the forehead. The man has sharp, slightly narrowed eyes giving a friendly look. He has darker skin and facial hair, including a goatee and mustache. He is dressed in a business suit with a tie. The background shows an indoor setting with traditional elements, rendered in a 90's music video aesthetic, with vibrant colors and dynamic lighting.
A cartoon illustration in the style of a sticker, featuring a chubby Chinese man. He has short, trimmed hair with faded sides, sharp, slightly narrowed eyes, and thin eyebrows. He is wearing a suit and sporting a thin beard and mustache. Use a thin outline for a clean, graphic look.
It just takes time, they just don’t have enough diverse references in their training data yet x
They should really move away from Large Language Models to Small Language Models soon to fix this issues of mass rubbish data in
Coming soon 2024 Summer I guess…
😜
yea. I even tried. Explain this image to a blind person and still the prompt didnt work.
Btw what do u think the guy used to make the image i sent you earlier
This image
Its bing image gen?
A 2D flat cartoon animated style image of a man with short, black hair and bangs that partially cover the forehead. The man has sharp, slightly narrowed eyes giving a confident look. He has darker skin and facial hair, including a goatee and mustache that matches the initial image. He is dressed in a business suit with a tie. The background is a simple wooden texture, similar to the initial image, with less vibrant and dynamic colors, leaning towards a simpler, more understated style.
That’s good 👍🏾
Just hit my limit now haha @rough valve
Oh
There are time limits also sometimes during peak times x
But when that happens we go to copilot because I’m not waiting 35 min 😂
@frank cedar
yea?
the first and 3rd one
yea
youd be surprised, Bing AI gets Goku and these type of characters better than open AI. You can say it has less restrictions and better image gen compared to chatgpt
Okay. Can we make it Chubby.
The thing is I need to look similar to a specific person. Just a cartoon version of it.
That hairstyle in the image i send and the moustache and beard
The eyes are fine.
Yea.
I think this is the closest
"2D anime character sticker with a confident expression, chubby face, very narrow and sharp eyes similar to the reference image, short black hair with decorative red tassels, black suit, white shirt, red tie, beard, and a simple white background suitable for a sticker design."
Prompt :
Reference image :
Anyways, hope this helps @frank cedar
I need to do some human activities now haha 😝
Thanks bro. Really appreciate your help
No worries, till next time
Have a nice day
Sometimes, dall-e 3 gets so close to realism that I can't even tell what is real and what is not.
I can't wait for Sora to make surreal videos.
Also can’t wait to be able to make more Green Technology content and videos, and Sustainability and Positive Environmental & Social Impact Films 🎬
Like this but with Sora ✨
Just gotta wait…
A little bit more…
Has anyone been able to get detailed objects a character is holding to look the same between images? I've been testing it with swords, but get huge variation when the characters change position.
I was thinking maybe we could tell DALL-E that the object is frozen in space first, and then have the character grab it, but haven't had luck with that yet
#3 and #4 that you posted were very good. how come they vanished
The challenge involves achieving consistency in the detailed objects held by a character across multiple images generated by DALL-E, as demonstrated by the sword in the images provided. Here are refined strategies to address this:
1. Detailed Description:
Provide an extremely detailed description of the sword, highlighting its unique features, such as the hilt design, blade texture, any inscriptions, and colors. Use these consistent descriptors in every prompt.
2. Positional Consistency:
Describe the character’s position and how they are holding the sword precisely. Ensure the prompts indicate the same hand position and grip to maintain consistency.
3. Environmental Anchors:
Use environmental features in the background to anchor the scene and maintain consistency. Mention specific items or settings that should remain constant in all images.
4. Sequential Actions:
Frame the scenes as a sequence of actions. For example, “the character swings the sword from left to right,” followed by “the character holds the sword vertically.”
5. Manual Adjustments:
After generating the images, use image editing tools to make minor adjustments to the sword, ensuring it matches across all images.
^Here^
those need some testing. that might be the secret
Example Prompts for Consistency
1. Prompt 1:
“A female character with elf-like features, wearing a red blouse and blue jeans, holds a large ornate sword with a silver blade and a ruby-encrusted hilt in her right hand. She is standing in a modern kitchen, driving the sword into a watermelon, causing a splash of juice.”
2. Prompt 2:
“The same female character with elf-like features, wearing a red blouse and blue jeans, is holding the same ornate sword with a silver blade and a ruby-encrusted hilt in her right hand. She stands in a modern kitchen, examining the sword closely with the watermelon on the counter beside her.”
3. Prompt 3:
“The same female character with elf-like features, wearing a red blouse and blue jeans, is now holding the same ornate sword with a silver blade and a ruby-encrusted hilt over her head with both hands. She stands in a modern kitchen, bringing the sword down onto a watermelon, causing a splash of juice.”
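One way to apply the "same descriptors in every prompt" idea is to keep the character, sword, and setting as fixed strings and only vary the action. This is a minimal sketch; `build_prompt` and the constant names are hypothetical helpers made up for illustration, not part of any OpenAI tool, and nothing here guarantees DALL-E will actually keep the sword consistent.

```python
# Hypothetical helper for illustration only; not an OpenAI API.
# The fixed descriptors are copied from the example prompts above.
CHARACTER = "female character with elf-like features, wearing a red blouse and blue jeans"
SWORD = "large ornate sword with a silver blade and a ruby-encrusted hilt"
SETTING = "a modern kitchen"


def build_prompt(action: str, first: bool = False) -> str:
    """Return one scene prompt that repeats the fixed descriptors verbatim."""
    who = "A" if first else "The same"
    which = "a" if first else "the same"
    return (
        f"{who} {CHARACTER}, holds {which} {SWORD}. "
        f"She stands in {SETTING}, {action}."
    )


prompts = [
    build_prompt("driving the sword into a watermelon, causing a splash of juice", first=True),
    build_prompt("examining the sword closely with the watermelon on the counter beside her"),
    build_prompt("raising the sword over her head with both hands"),
]

for p in prompts:
    print(p)
```

Only the action text changes between scenes, so any drift in the sword comes from the model and not from the wording.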
Try that and I see your point
It’s not consistent still is it lol 😝
i think we can figure out how to do it
More testing required, you're right 🙏🏽
You too
Thanks
@empty kelp gonna have to leave it up to you all here for now on this challenge ✨
even if we don't figure it out i think the Sora model will eventually solve the image problem because it's able to rotate things smoothly
Yeah… Everyone is just waiting on Sora now I guess…
#FreeSora2024
now everyone is going to want a sword for their kitchen so they can stir watermelons
Yes as long as it’s handled safely we’re good to go x
It is a topic that needs more discussion in general though. Most people don't realize that the AI models are learning behaviors of everyone and everything that appears in their training. Even DALL-E will try to reproduce behaviors from every image in its training. That's amazing for drawing things, but it also puts a spotlight on the little creatures around us that we always thought were cute and harmless -- and then we realize that they're inside of the AI now.
Who helped me in my last time prompt for an image ???
i am having a hard time making Dall-e place a character inside of a boat, holding a line that goes in the water... lol
Spot on, perfectly said ✨
Yes haha I remember it was a group effort last week ⚓️
This one ☝️?
A comic book style image featuring a small cargo boat at night under a starry sky. The boat has a rugged look with tires as fenders along the sides. Inside the cabin of the boat, a man with a gray beard, wearing a captain's hat and a dark jacket, is steering the boat. The man is visible through the cabin windows, with a determined and adventurous expression. The comic book style includes bold lines, dynamic shading, and vibrant colors. The scene captures the motion of the boat cutting through the water with dramatic speed lines and waves.
@tribal sierra is this the one?
Or something else
Will do once I’m home in half an hour
sigma
I will have to continue on the CoPilot - Dalle 3 😆
Would you pls tell me the RPD of copilot designer ?
lol me not yet but damn .. this is complex
Well i have that one here ..
but i need a longline, which is a line with lots of hooks on it ..
thats what the ship looks like lol
lol its all stars!!! 😄
This is really good though
yeah i might just not do the line...
Hardest challenge of the month 😆
i hate when that happens
Me too
Now I’m stuck with the wrong designs
It needs solar panels…
Still the hardest challenge of May
Guess we will try again in 1 hour and 40 minutes
Unless someone else beats me to it
Is DALL E free?
Hey! You can try out DALL·E 3 five times a day over in the #image-bot channel -- just start a message with the /draw command over there and you'll see the bot options appears after that. Also, I think the Bing image creator still gives free daily credits, and that's also DALL·E 3.
From OpenAI: DALL·E 3 is available with a ChatGPT Plus subscription and via paid API access.
something changed with dall-e? getting quite interesting images
Interesting, I also got this too separated yesterday 🤔
that one looks cool, too bad all kinds of swords are still a problem for dall-e in close range
Yeah true, but it doesn’t stop @empty kelp making great swords for watermelon cutting ⚔️ 🍉 😂
I’m still fighting with our AI trying to get the solar panels on the tracks!
Oh I get you now!
@tribal sierra longline fishing 🎣
I’ll try this since I’m stuck on the solar panels integrated into the tracks, site map 🗺️
DALL-E definitely fails at data analysis
That looks cool though
if science was based on looking cool I would have already won a nobel
Quivered plot lol
Collaborate with our OpenAI Instagram page! Just invite @openai as a collaborator before you post. If selected by our team, your work will be featured on our handle, giving you more visibility.
Quick tips for selection:
- No copyrighted or profane content.
- Your profile needs to be public.
- Stay mindful of sensitive topics.
- Outputs that closely follow the prompt are preferred.
Feel free to share reels, carousels, or just a single image!
lol
@tribal sierra yeah I see your point also, it’s not easy doing this longline fishing net 🎣
yeah i did crack up a lot on some other one i did lol
and even if we give him a picture he cant redo it ...
this looks like fishing with guided missiles
a hyperrealistic wide image of the right side of a small fishing boat in a storm. the viewpoint is from the ocean to the right of the boat. the boat has a wheelhouse with glass windows. there is a salty geezer with beard and dark blue baseball cap inside the wheelhouse looking out it's front window and smiling with his teeth showing. fishing poles are visible on the boat's deck with white fishing line trailing in the ocean behind the boat.
It's tricky to get the captain to be inside of the ship's wheelhouse and looking out the front window. These were my best attempts.
this is awesome lol i should come here all the time to make the images..
but i need it in a comic style. lol
and we dont need him in a storm ... lol
i just put "comic style" instead of "hyperrealistic" and it actually put him in the wheelhouse
yeah it looks evil lol
I tried to adjust it with:
"he looks friendly, and not creepy"
and this was the result:
lol thx
@grizzled iris here's an image where new tech failed to save the world
how do i generate these jackets in GPT? i tried but looks nothing like these, im not good with prompt
try posting in Dall-e help
==========
"Part 1" of "Visual Continuity in Multiple Images" is now in this "dall-e-gallery":
(Everything I posted in "dall-e-discussions" a few days ago is in it.)
=========
I can make a mean Kung Pao Chicken with Rice
Ight one more, I don't want people to get mad that I'm not posting in #images-canvas
THE_PROMPT = '''[Seed: 1885772499] a cinematic 32mm photo style image with natural color, texture, and lighting, and split into three borderless frames.
Frame 1:
An athletic and diverse female elf. She has bare feet, brown shorts, a gray tank top, and a brown leather belt. She has a friendly smile. She is running. The background is a beach. No other people, animals, or additional objects in the scene.
Frame 2:
An athletic and diverse female elf. She has bare feet, brown shorts, a gray tank top, and a brown leather belt. She has a friendly smile. She is sitting down. The background is a park. No other people, animals, or additional objects in the scene.
Frame 3:
An athletic and diverse female elf. She has bare feet, brown shorts, a gray tank top, and a brown leather belt. She has a friendly smile. She is eating a mean bowl of Kung Pao Chicken with rice. The background is a open air restaurant. No other people, animals, or additional objects in the scene.
'''
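from openai import OpenAI  # pip install openai

# The client reads the API key from the OPENAI_API_KEY environment variable.
client = OpenAI()

response = client.images.generate(
    model="dall-e-3",
    prompt=THE_PROMPT,
    quality="hd",
    style="vivid",
    n=1,  # DALL-E 3 accepts only one image per request
    size="1792x1024",
)
print(response.data[0].url)  # link to the generated image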
Santa let the elf have a day off, so she spent time outdoors and had a mean bowl of Kung Pao Chicken with Rice.
#1245195468615450765 message
So this gallery shows how to make a character appear the same in the same image, or across multiple images. And you can swap out clothes, events, and other elements. Part two will be about rotating character to specific angles.
Hopefully there’s still time ❤️
@storm scaffold like this mate?
Let me know if you need adjustments to the jackets so that I can share the new prompt with you, but I think the prompt here is a good start 🙏🏽
😝 they look sooo good!
I kinda like those jackets lol 😂
@storm scaffold
No worries @storm scaffold - you can try out the method I used and then tweak the prompt. If you're not good with prompts, you can always ask the GPT to look at the image and help you craft a good prompt!
Current Theme:
Technique Tuesdays 🎨 Eco Art Eco-friendly, sustainable, building awareness of issues
https://discord.com/channels/974519864045756446/979617962728226826
Todays daily is a great one ☝️
Now for Dessert 😋
Eureka
The character stayed the same woo!
I wish there was a way to create consistent characters
Kinda getting there x
@fiery salmon
@fiery salmon 😂
For some reason adding the name is difficult
For “The Bruiser Brothers”
Adding “Luminark” was ok
🙃
🙃
Adding “The Bruiser Brothers” is not allowed apparently 😂
Still remains a challenge along with the @tribal sierra s “Longline” boat challenge 😅
The brothers are invincible
Yeah the longline ship challenge is probably the hardest then comes “The Bruiser Brothers”
Then the Solar Panels integrated into the train tracks in an architectural/ site map image 🙏🏽
In order of difficulty
Current Theme:
Technique Tuesdays 🎨 Eco Art Eco-friendly, sustainable, building awareness of issues / https://discord.com/channels/974519864045756446/979617962728226826
Last couple of minutes left until the theme resets!
New theme is really cool now ngl haha 😝
🚀 intergalactic - infinite, mesmerizing, a glimpse into the vastness of space. https://discord.com/channels/974519864045756446/979617962728226826
Can finally work on these projects more x
I’m a free user from Pakistan 🇵🇰 and can’t generate images. It shows a message saying image generation is only for Plus users
Yeah I just got the same news from my friends in the other side of the world in Canada
I immediately called everyone to check who has it and who doesn't
And it looks like some have it now but many not... yet!
Let's pray before Friday we have it all rolled out fingers crossed for everyone for free ❤️
I myself cannot wait to have the new vision feature so that I can show it to my Hospital to help people with sight problems that have just experienced eye injury be able to order a taxi home and have a better quality of life
does anyone know if dalle can directly edit uploaded images? im testing by trying to make dalle clean this simple badge, but it will not send any image in response. and then it will proceed to create me a new, entirely different badge.
Hmm… I could try and help?
Could you share the badge with me please?
@dim cradle
no it can't. 4o image maker seem like maybe it can... if they ever release it
Sorry for the late response! Here it is
Would anyone know where I could find a prompt involving great online business knowledge or advice?
this is for image generation, maybe ask in another chat here like ai discussion
Trying now
@dim cradle 😂
Will do thanks.
@dim cradle this is turning out to be a cool Dalle Challenge 😂
Look
Almost there
The blob just need to move to the centre
oh wow. 😆 and here i thought "ah this will be easy for dalle instead of me cleaning it manually with some editing tool and i should move on real quick" ... 😭
Would’ve been quicker by hand but I guess its good to teach the DALLE model on this 😂
absolutely. still, it would work better if dalle could just reupload your pic, as if it had been created by dalle, and then we just use the selector to mark and edit more precisely.
Also they changed the theme to Current Theme:
🌼 daisy - innocence, simplicity, a simple flower that brings joy!
Oh I see it was a repeat so it was changed x
With 4-o it’s possible to do this @dim cradle
But I need to wait until the model updates now so that the coordinates of the colouring is accurate
Another great challenge to add to the bucket list though!
I can offer you this one for now 😂
awesome!
😆
it's fine! but thank you so much!
Guys, i need help... in approximately 10 requests, i tried to create a logo without any text but everytime it generates with text... i don't know what to do to get rid of text generation and gpt don't listen to me... i feel like i wasted my money on this
wdym not mention text when he generated automatically text without mention
```
EDIT 1 [HAMBWGRS]: this sign should be removed
EDIT 2: [on windows]: these signs should be removed
EDIT 3: [over door]: these numbers should not be here
```
I got lucky here and it put "Greasy Deluxe Hamburgers" on the first try, but DALL-E has some issues with text at this time. The best way to handle it is to remove the difficult text (like I did above), and then add it in Photoshop, or another image editing program.
i don't want text in images... is a different thing between this and what you acually said... i don't care if he mess up the text... i don't want text at all
but with every request he always put text in image
Sometimes it adds text randomly, but you can delete it with the edit feature (like here, where I deleted some of the text with the DALL-E 3 ChatGPT web edit and then added some with Photoshop). And now the restaurant is amazing.
I'm not sure why your images always add text though. If you show us the prompt we might be able to see why
It might work to say "Don't put any text in the prompt." in the logo prompt
Create a multi-color logo with no text. There is only a coconut tree on an island.
It works to just say "logo with no text"
Not working, LoL... i created a new conversation, used the "with no text" BUT STILL generated text.. what the heck....
a logo with no text. the logo has colors appropriate for printing. a tornado is in the center twisting fiercely, and all types of different foods are caught in the wind from the tornado and spinning around it
I just tried several different logos like this and it never puts text. Is it possible that you added "Custom Instructions" in settings?
i have no custom instructions
there must be some words in the prompt that it associates with creating text. it could be anything -- there is no way to tell what DALL-E picked up in its training without testing it
Maybe wording the prompt like ”A plain logo’. Would this help?
Dall-e gets confused with negative prompting.
```
Snorka is an athletic and diverse female kobold. Snorka has bare feet, plain yellow dress, brown leather belt. Snorka has green skin, blue eyes, and black hair with a pink plumeria. Snorka has a friendly smile. Snorka is confronting Snasha with a random kung fu attack pose.
Snasha is an athletic and diverse female elf. Snasha has bare feet, brown shorts, light blue tank top, and black belt. Snasha has green brown skin, green eyes, and red hair with a yellow plumeria. Snasha has an angry expression. Snasha is defending against Snorka with an appropriate kung fu defense pose.
The background is a beach. No other people, animals, or additional objects in the scene.
```
This is a way of ordering things for character interaction
(you could have them play chess like this for instance)
If you set it up like this you can move the characters between images
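The ordering used in these prompts (one block per character, each sentence starting with the character's name, then a shared background line) can be generated from data, so the same character can be dropped into another scene by changing only the action and background. A rough sketch, assuming nothing beyond plain string building; `Character`, `character_block`, and `scene_prompt` are made-up names for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Character:
    name: str
    traits: List[str]  # each trait is a phrase like "has bare feet" or "is an athletic elf"
    action: str        # what the character is doing in this particular scene


def character_block(c: Character) -> str:
    """One paragraph per character; every sentence starts with the name."""
    sentences = [f"{c.name} {trait}." for trait in c.traits]
    sentences.append(f"{c.name} is {c.action}.")
    return " ".join(sentences)


def scene_prompt(characters: List[Character], background: str) -> str:
    """Character blocks first, then the shared background line."""
    blocks = [character_block(c) for c in characters]
    blocks.append(
        f"The background is {background}. "
        "No other people, animals, or additional objects in the scene."
    )
    return "\n".join(blocks)


snorka = Character(
    name="Snorka",
    traits=[
        "is an athletic and diverse female kobold",
        "has bare feet, plain yellow dress, brown leather belt",
        "has a friendly smile",
    ],
    action="confronting Snasha with a random kung fu attack pose",
)
snasha = Character(
    name="Snasha",
    traits=[
        "is an athletic and diverse female elf",
        "has bare feet, brown shorts, light blue tank top, and black belt",
        "has an angry expression",
    ],
    action="defending against Snorka with an appropriate kung fu defense pose",
)

print(scene_prompt([snorka, snasha], "a beach"))
```

To move the pair to a new image, reuse the same `traits` and swap only `action` and the background, for example a chess game in a park.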
I have a question: If I get ChatGPT Plus, does it gain the feature of creating images with Dall-E? Like, I ask ChatGPT to create an image of what we're talking about.
Nevermind, I answered my own question. I just got Plus and It works amazingly well
‘Dwayne Johnson and selena gomez’ do funny somersaults on the beach
The free DALLE Image generator finally works here on Discord 🎉
Just have to grab your daily credits first by typing /daily openai_email:
It used to never work for me but now it does 😝
Interesting to see an optional style button to make it “Vivid” ☝️
✨
hey guys is dalle 3 good enough to create realistic ai models?? ones that i can't distinguish between real and ai?? also i want that specific character generated in different poses with consistency. Possible?!?! Can anyone show me results with prompts
i dont have gpt plus
or is it true dalle 3 is not the best ai generator for realistic ai models out there??
coz here I saw many outputs by ppl here but none were photorealistic🤔
Yes but it is slow
You can use Dalle for free if you have a Microsoft account on the Bing app... and Midjourney is still the most realistic of all text to image generators...
this mac-os app is extremely laggy, almost unbearable. It slows down the whole system lol
maybe it's producing tokens itself, otherwise how is it possible that the coding is so bad?
The browser interface is still superior, while the mac app is just a gimmick
Strange
You can always try out the https://discord.com/channels/974519864045756446/1202309673709994065
Copilot - DALLE 3 is still free
I’m also on Mac app and yes its just too new so it’s laggy
Web version is my go to for now
I've given it a try, but I prefer how DALL·E 2 makes drawings and paintings.
Hah I do remember DALLE 2… I used to make cool small spaceships 🚀
I wonder how it is now
I might check it out actually via an API if that’s possible?
It's in the API
Perfect thanks 🙏🏽
It’ll be interesting to see how much it’s changed since the beginning
YMCA
Bruh
guys
been trying to replicate thomas cole's oil painting landscape style
for like an hour and I cant seem to be able to do it
This is the closest I got but it still seems 'AI-y' to me
Here are some examples of his art
mb someone can do it better than I can
I used a super prompt to get this to define his style and then defined it as style A
And used style A in my prompting but it doesnt seem to work idk why
Combining two elements looks so good..
no way they removed dalle tho
In ChatGPT? Seems to work for me!
nah in the website dalle they say "We are no longer allowing new users to DALL·E 2. DALL·E 3 has higher quality images, improved prompt adherence, and we've started rolling out image editing. It is available for ChatGPT Plus, Team and Enterprise and the OpenAI API."
Ooo DALL·E 2, yes they're winding that down. I think it's still available via the API, just not at labs.openai.com any longer.
https://help.openai.com/en/articles/4936794-is-dall-e-2-available
so what do i do so i can use dalle?
If you want to use DALL·E 2, you still can via the API it looks like: https://platform.openai.com/docs/models/dall-e
If you want to try DALL·E 3, you get 5 free images a day in #image-bot (just start a message with the /draw command). I think Bing's image creator also still gives out free daily DALL·E 3 credits. Otherwise: D3 is also available via the API and with ChatGPT Plus.
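For anyone going the API route, here is a rough sketch of how a request for either model could be put together with the `openai` Python package. `image_request` and `generate` are hypothetical helper names; the size lists reflect what the Images API accepted for each model at the time of this chat, so treat them as an assumption and check the current docs.

```python
# Sizes each model accepts through the Images API (assumed from the docs
# at the time; verify before relying on them).
ALLOWED_SIZES = {
    "dall-e-2": {"256x256", "512x512", "1024x1024"},
    "dall-e-3": {"1024x1024", "1792x1024", "1024x1792"},
}


def image_request(model: str, prompt: str, size: str = "1024x1024") -> dict:
    """Build keyword arguments for client.images.generate, validating the size."""
    if model not in ALLOWED_SIZES:
        raise ValueError(f"unknown model: {model}")
    if size not in ALLOWED_SIZES[model]:
        raise ValueError(f"{model} does not support size {size}")
    return {"model": model, "prompt": prompt, "size": size, "n": 1}


def generate(params: dict) -> str:
    """Send the request and return the image URL (needs a paid API key)."""
    from openai import OpenAI  # pip install openai; reads OPENAI_API_KEY

    client = OpenAI()
    return client.images.generate(**params).data[0].url


params = image_request("dall-e-2", "a small spaceship, oil painting", size="512x512")
print(params)
# url = generate(params)  # uncomment to actually call the API
```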
k thanks
it’s missing some features now, but i think the Mac app will be good for them to test with and figure out how to get all of the web functionality into the API. that will be extremely useful
damn
How do I do image to image in ChatGPT? Is there a specific command for it?
Hey! Do you mean upload an image and use DALL·E to modify/recreate it? If so, that's not currently possible. The only img2img-like capability on ChatGPT currently is the editing/inpainting function in DALL·E-created images.
Aww, okay. Thanks for the info 🙂
Something to look forward to with the future of GPT-4o! https://openai.com/index/hello-gpt-4o/
but when? 😭 I want to see dalle3 go the way of dalle2, bring on the new image maker 🥺
So question, has Dalle gotten even more censored in the last few months? Tried to ask it to have a man, using a photo of a real man, transform into an Oozaru and was refused.
It said it violated their policy, which I can't see how, unless of course we can't even use real people for references anymore
Can I ask DALL-E for a specific size in pixels? Or does that not work?
It will always generate in one of the 3 formats it knows: square (1024x1024), wide (1792x1024), and tall (1024x1792), which are roughly 1:1, 16:9, and 9:16. It will also always fill the entire frame, so custom sized images aren't possible
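If you need a specific pixel size anyway, a common workaround is to generate at the closest supported shape and then crop or resize in an image editor. A small sketch of picking that closest shape, assuming the three DALL·E 3 sizes are 1024x1024, 1792x1024, and 1024x1792; `closest_size` is a made-up helper name.

```python
import math

# The three DALL-E 3 output sizes (assumed; check the current docs).
SUPPORTED = {
    "1024x1024": (1024, 1024),
    "1792x1024": (1792, 1024),
    "1024x1792": (1024, 1792),
}


def closest_size(width: int, height: int) -> str:
    """Return the supported size whose aspect ratio is nearest the requested one.

    Ratios are compared on a log scale so that wide and tall requests
    are treated symmetrically.
    """
    target = math.log(width / height)
    return min(
        SUPPORTED,
        key=lambda name: abs(math.log(SUPPORTED[name][0] / SUPPORTED[name][1]) - target),
    )


for w, h in [(1920, 1080), (800, 800), (1080, 1920)]:
    print(f"{w}x{h} -> generate at {closest_size(w, h)}, then crop/resize")
```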
Can I do outpainting in Dall-E through ChatGPT?
Not currently, just inpainting!
Aww, okay. Thank you for the info! ❤️
Have just made a thread in dalle help. Hoping someone can help. Does anyone know how to stop dalle placing figures across the image from side to side in vertical aspect ratio? Seems crazy to me that it would do this but I can't figure out how to stop it doing it and make them upright. https://discord.com/channels/974519864045756446/1246493092639412224
Also I feel like this is a definite flaw with the app. Is there some way to get this feedback to the developers to try and fix it?
This looks pretty amazing. The lighting/shadows and water reflections are really nice
```
Snorka is an athletic kobold. Snorka has bare feet and appropriate ballroom dance outfit. Snorka has green skin, blue eyes, and black hair with a pink plumeria. Snorka has a friendly smile. Snorka is dancing with Snasha in a random pairs Tango position.
Snasha is an athletic and diverse elf. Snasha has bare feet and appropriate ballroom dance outfit. Snasha has green eyes, and red hair with a yellow plumeria. Snasha has a random expression. Snasha is dancing with Snorka in pose complementary to Snorka's pose.
Snorka and Snasha are holding hands (with precise finger positioning) if the pose calls for it.
The background is a dance class on a beach in Hawaii. No other people, animals, or additional objects in the scene. Please don't modify the prompt.
```
I didn't think the "pose & counter/complementary pose" was going to work for things other than martial arts type interaction where there isn't close contact, but apparently it does.
(random martial arts)
#images-canvas message
(random ballroom dancing)
#images-canvas message
For harmless, not violating the rules type things -- it figures out close complementary poses.
The owl goddess takes the form of a large Barn Owl and brings wisdom and a hand in battle against those deemed unholy
This is the prompt above:
#images-discussions message
with "random cheerleading base pose" and "complementary cheerleading flyer pose" replacing the two lines with dancing.
Can dall-e make consistent characters now?
If you use ChatGPT-4o the characters are more consistent - but it isn't perfect yet.
A current example/tutorial you can find here: #1245955320677662831 message
#images-discussions A cute cat is basking in the sun on the beach at sunset
in the images above I used the API and didn't even bother to set a seed. they would look the same between images if I had used the web interface with GenID & seed
was trying to make them all look different
Fantastic!! If you don't mind me asking, what prompt did you use ... and did you enter the prompt in the new ChatGPT 4o "omni" model or the previous ChatGPT 4?
Hopefully GPT-5 will be more inclusive and have more supported languages 🙏🏽
Also this was quite cool
Being able to see the DALLE
Images generated on the same UI
As the voice feature and do multiple
Without exiting
Super cool 🎉
Thanks!!! 🥰💖 I prompted Omni with:
An oil painted fish dressed as a baby in a craddle.
Does anyone know why or how this happens? Sometimes I get super interesting (human-like painting) results like the picture on the left, but it happens very rarely. Is there a way to always force it to get a similar result? Has anyone else encountered this?
Oil painting of a single rock inside a zoo cage.
Just realized prompting techniques from previous Dalle version cookbook, really worked so well.
Guys. ChatGPT image editing almost never works for me. I fill a giant area for editing on an image of a Japanese garden and ask “put a taco truck there”. But there’s no taco truck to be seen, even though it says:
“Here is the updated image of the Japanese garden with a taco truck included. If you have any further requests or adjustments, feel free to let me know!”
This was the result.
Oh wait. I see it. But it’s so damn small!
Also, it’s not a taco truck at all. It just looks like an old car
I'm hoping to visit Exurpep, Gwerelpe, and Cvezzccon some day
ChatGPT is skeptical about it being a taco truck
anyone noticed a significant improvement lately?
not sure if it's just me but especially the photorealism is looking awesome
```
a hyperrealistic wide image. gravel path circles around a pond in the middle of a beautiful Japanese garden
EDIT 2: athletic and diverse female elves are swimming in the pond
```
the quality is higher on weekends when the servers are under less pressure. Sunday in US time zones is really good for generating images from the ChatGPT website
guessing they have servers for different regions, so it's probably similar around the world
interesting, never noticed that until now
ChatGPT is skeptical about my taco truck also
Prompt: Generate a shoe cabinet with a vertical partition in the center and two drawers at the top. The shelves should extend from the central vertical partition to the left and right, with vertical spacing of half the height between the shelves. Each shelf should contain different types of shoes, including various sports shoes and elegant high heels. Focus only on these two columns of shelves with the central partition, no additional sections or columns.
Prompt: Create an image of a shoe cabinet with a central vertical partition. On the top of the cabinet, there are two drawers. Shelves extend from the central partition to both the left and right sides. Each shelf should have two pairs of shoes, with the left shelves and right shelves vertically offset by half the distance between shelves. For example, if the distance between the shelves is 15cm, the vertical offset between the left and right shelves should be 7.5cm. The shelves should contain different types of shoes, including various sports shoes and elegant high heels. Only include these two columns of shelves with the central partition in the image.
Why not working?
@empty mulch maybe try adding a reference image?
Prompt: Create an image of a shoe cabinet with a central vertical partition. On the top of the cabinet, there are two drawers. Shelves extend from the central partition to both the left and right sides. Each shelf should have two pairs of shoes, with the left shelves and right shelves vertically offset by half the distance between shelves. For example, if the distance between the shelves is 15cm, the vertical offset between the left and right shelves should be 7.5cm. The shelves should contain different types of shoes, including various sports shoes and elegant high heels. Only include these two columns of shelves with the central partition in the image.
Use the reference image as a template.
Though very capable, DALL·E isn't a perfect "rule-follower" as far as text-to-image generation goes. It's not like a human graphic designer, where you can give a certain set of specific measurements/ratios/requirements and expect the model to adhere to them exactly. Spatial awareness and specificity like this are known limitations of the current model. If you're curious to read why, the DALL·E 3 research paper has interesting insights on the topic. The most relevant portions for this would be near the end in the Limitations & Risks section pertaining to spatial awareness and specificity. https://cdn.openai.com/papers/dall-e-3.pdf
Collaborate with our OpenAI Instagram page! Just invite @openai as a collaborator before you post. If selected by our team, your work will be featured on our handle, giving you more visibility.
Quick tips for selection:
- No copyrighted or profane content.
- Your profile needs to be public.
- Stay mindful of sensitive topics.
- Outputs that closely follow the prompt are preferred.
Feel free to share reels, carousels, or just a single image!
Building on last month's challenge of solar panels integrated into train tracks, I found a better solution!
Let’s see if Dalle can make this 🙏🏽
Trying to revise this design 👇🏽
This will be quite a challenge with DALLE, but imagine a flexible solar panel we can just stick on everywhere!
Struggling with DALLE to make this work 🙃
Not easy creating flexible solar panels with DALLE properly 😅
Arggggggg
It’s supposed to be only 2 millimetres thick!
Something like this I guess for trams 🚊
Finally!
Prompt : A highly realistic image of a tram system integrated with HeliaSol solar panels. The tram tracks feature ultra-thin, flexible HeliaSol solar panels embedded in the pavement between the rails. These panels are only 2 millimeters thick and seamlessly integrated, capturing sunlight efficiently. The scene includes a modern tram running on the tracks, pedestrians, and surrounding urban buildings, highlighting the innovative use of sustainable solar technology in public transportation.
That took wayyyy too long to get there 😤
But we got there in the end…
Dude, I wish It made that truck for me. I don’t know what’s wrong with my prompts.
Also guys, have you noticed it’s extremely difficult to get Dall-E to draw non symmetrical pixel art? It always draws with the vanishing point square in the middle.
a wide pixel art style image. Santa and four athletic and diverse female elves are in line to buy tacos from a taco truck. a well dressed kobold is inside the truck taking their orders. in the background is a pond in a beautiful japanese garden with a gravel walking path around the pond
a 3D pixel art style image of a statue. the focus is the statue. the viewpoint is in the grass at the base of the statue
If you have 3D pixel art with a vanishing point, you could put the drawing focus on something (so most of the focus is filling the image), and then move the viewpoint near a surface in the image. That will likely end up shifting the vanishing point away from the center of the focus
If you don't want a vanishing point in the pixel art you could say 2D, orthographic, isometric, etc.
Putting the viewpoint on the ground in the image above ended up shifting the vanishing point to the bottom of the statue
a pixel art image of a cat in a hallway. the viewpoint is on the ground against the right wall of the hallway
I did more testing, and it seems like if you say "pixel art image" it always draws it Wizardry style with the vanishing point in the center, and ignores any view/viewpoint suggestions. Might need to add 3D or some other word to "pixel art" so it doesn't produce Commodore 64/Apple IIe era images
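One way to keep these pixel art tests comparable is to build the prompt from parts. This is only a rough Python sketch: `pixel_art_prompt` and its parameters are made-up names for illustration, and the wording just mirrors the phrases tried above, so DALL-E may still ignore it.

```python
def pixel_art_prompt(subject, viewpoint=None, projection="3D"):
    """Compose a pixel art prompt (hypothetical helper, not an official API).

    projection: "3D" to allow an off-center vanishing point, or a flat
    keyword such as "2D", "orthographic", or "isometric" to avoid one.
    """
    # Pick "a" or "an" so the sentence reads naturally.
    article = "an" if projection[0].lower() in "aeiou" else "a"
    parts = [f"{article} {projection} pixel art style image of {subject}."]
    if viewpoint:
        # Placing the viewpoint near a surface seemed to pull the vanishing
        # point away from the center in the tests above (not guaranteed).
        parts.append(f"the viewpoint is {viewpoint}.")
    return " ".join(parts)


print(pixel_art_prompt(
    "a cat in a hallway",
    viewpoint="on the ground against the right wall of the hallway",
))
print(pixel_art_prompt("a taco truck", projection="isometric"))
```

Changing only `projection` or only `viewpoint` between runs makes it easier to tell which word actually moved the vanishing point.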
I really wish that we can do outpainting in dall.e
These look really good ✨
This is good too 👏🏽
using the GenID and seed: the image has color, texture, and lighting resembling pixel art
we could make 3D images more pixel art-ish like this, but it would also need to draw everything as a grid of squares somehow to look like pixels
I’m inspired by pixel art now @empty kelp
I can’t get it like yours though haha 😝
The angle from below is cool
It makes the statue pop
in most 3D DALL-E images you can just stick the viewpoint anywhere and shift the vanishing point. we just need to figure out a 2nd pass that makes it look like pixel art
Can’t get those flexible stick Solar panels to go at the bottom of the ship, new challenge for the community…
I’ve tried my best…
Curved, flexible DALLE Solar Panel generation is still not quite there yet
Zorka is an athletic kobold. Zorka has bare feet and appropriate ballroom dance outfit. Zorka has green skin, blue eyes, and black hair with a pink plumeria. Zorka has a friendly smile. Zorka is dancing with Droll in a random pairs Tango position. Zorka looks into Droll's eyes.
Droll is an athletic and diverse elf. Droll has bare feet and appropriate ballroom dance outfit. Droll has green eyes, and red hair with a yellow plumeria. Droll has a random expression. Droll is dancing with Zorka in pose complementary to Zorka's pose. Droll looks into Zorka's eyes.
Zorka and Droll are holding hands (with precise finger positioning) if the pose calls for it.
The background is a beach in Hawaii. No other people, animals, or additional objects in the scene. Please don't modify the prompt.
This does a pretty good job of making 3D scenes look like pixel art:
The photo resembles a tapestry of tiny squares, and has texture, color, and lighting with pixel art styles.
Are we allowed to post multiple images in the #daily-theme?
anyone else getting server error?
say that the side profile of the characters is visible
like, "we see a side profile of ..."
"their heads are turned to look at ..."
in the two dancing images above -- the prompt is, "CharA is looking into CharB's eyes", and "CharB is looking in CharA's eyes". Saying specifically what they're facing will turn their body, and saying what they're looking at will turn their heads
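The facing vs. looking split above can be written as a tiny template. This is a sketch only: `pose_sentences` is a hypothetical helper, and the phrasing is borrowed from the tips above rather than from any documented DALL-E behavior.

```python
def pose_sentences(name, faces, looks_at):
    """Return two sentences: one aimed at the body, one at the head.

    Hypothetical helper; the wording follows the tips in this thread.
    """
    return [
        # "Facing" is meant to turn the whole body.
        f"We see a side profile of {name}, who is facing {faces}.",
        # "Looking at" is meant to turn only the head.
        f"{name}'s head is turned to look at {looks_at}.",
    ]


prompt = " ".join(
    pose_sentences("Zorka", "Droll", "Droll's eyes")
    + pose_sentences("Droll", "Zorka", "Zorka's eyes")
)
print(prompt)
```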
Ty i will try it after my chatgpt starts working again
it's down for me also. it's giving a cloudflare error
Soon there'll be little to no benefit in paying for it, and it's either lagging or it stops working when u use it. 100% worth the 20 dollars monthly
the OpenAI status page says they figured out the problem and are working on it. It doesn't happen often -- i've actually been using it since the beginning, and this is the first time i've noticed it go down
Ye I was joking a bit. But it has been crashing or not working for me sometimes. Mby im just unlucky with my timing. But still tho would like more benefits for paying for it tho.
I have a request for help / challenge: figuring out a way to make the classic Tolkien fantasy, DnD dwarfs without the AI making them with pointy ears or elf-like characteristics
every time I try, I get a weird mix of elf and dwarf
anyone has a good baseline prompt for that?
Sure, let’s try this challenge!
Give me a few attempts @vapid elk
Sorry but looks like our servers are down!
yea, that is going to wait a bit xD
yea, is chat gpt down? i'm also havin issues
same! i think it'll be up again soon though.
Yeah haha sorry, I tried 😆
Gonna have to wait y’all ❤️
❤️
Also the API is ok
So DALLE works directly
Just checked: DALLE works through the API / Bot section, and direct Dalle works too
@vapid elk
Like this?
Or not?
that worked nicely, do you have the actual prompt the AI used?
@grizzled iris
I think there is a problem tho, I think it just re-created the dwarves from the lotr movie, I think it might fail for character design
still, that is what im looking for xD, I'll do some tests later with that
thx =D
A wide photorealistic portrait of a dwarf, capturing the essence of his character and heritage. The dwarf has a weathered and lined face with a long, braided beard flowing down to his chest. His deep-set, piercing eyes gleam with wisdom and a hint of mischief. He has normal human ears. His short and stocky body is built for endurance and the rigors of underground life, with broad shoulders and thick arms. He is dressed in practical, sturdy attire made of leather and wool, wearing a worn tunic, a thick belt with a heavy buckle, and trousers tucked into sturdy boots. He holds an intricately carved wooden pipe with wisps of smoke curling upwards. A finely crafted axe adorned with runes hangs at his hip, symbolizing his heritage and skill. The background hints at a dimly lit cavern or workshop, suggesting the dwarf's affinity for the underground world and his mastery of craftsmanship.
that is very good
gimme da prompt
im trying to find a reliable prompt because Im using the API
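For anyone testing prompts through the API, here is a minimal Python sketch that also returns the rewritten prompt the model actually used. It assumes the official `openai` package (v1 or later) and a client created with `openai.OpenAI()`; `generate_and_log` is just an illustrative name.

```python
def generate_and_log(client, prompt, size="1024x1024"):
    """Generate one DALL-E 3 image and return (url, revised_prompt).

    `client` is assumed to be an `openai.OpenAI()` instance.
    """
    response = client.images.generate(
        model="dall-e-3",
        prompt=prompt,
        size=size,
        n=1,  # DALL-E 3 generates one image per request
    )
    image = response.data[0]
    # revised_prompt holds the rewritten prompt that was sent to the model,
    # which is useful when comparing what you asked for with what was drawn.
    return image.url, image.revised_prompt


# Usage (needs the openai package and an OPENAI_API_KEY; not run here):
# from openai import OpenAI
# url, used = generate_and_log(OpenAI(), "A photorealistic portrait of a dwarf with normal human ears")
# print(used)
```

Saving the revised prompt next to each image makes it easier to see which wording (for example "normal human ears" or "rounded ears") survived the rewrite.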
Yes sorry, was dragged into a Microsoft Research Forum episode 3 live session which I totally forgot about while testing DALLE 😝
no worries xD
Create a group of classic Tolkien fantasy Dwarves standing in a rocky underground cavern. They should be stocky and muscular with thick beards, braided and adorned with beads and metal rings. The Dwarves are wearing sturdy, intricately designed armor and helmets, with broad, round shields and large, ornate axes. They have short, broad noses, rounded ears, and strong, determined expressions. The cavern is lit by torches, casting warm light on the rough stone walls and the Dwarves' detailed armor.
Try that!
@vapid elk
Wow, surprised that "Tolkien" passes trademark filters! Cool!
I'll try that soon, now I am the one in a meeting 😭
😆😆😆😆😆 nw have a good one ☝🏽
In the meantime
GPT Is slowly coming back y’all
All systems operational x
Prompt : An illustration of OpenAI's ChatGPT coming back online after a major outage on June 4th, 2024. The image shows a futuristic server room with bright, glowing lights coming back on, signifying the restoration of power. In the center, there is a large digital display showing 'ChatGPT Online' with a green checkmark. Technicians in modern uniforms are seen celebrating with raised arms and smiles. Background elements include various advanced AI equipment, wires, and screens displaying data. The overall atmosphere is one of relief and triumph.
That is pretty good quality smoke 💨
@teal sandal and the face is really looking proper defined ✨
We’re almost catching up to MidJourney haha 😜
Bing gives free dall e 4
Maybe he has a secret alpha mode we don’t know yet haha 😆
But yes, Bing, now called Copilot, has a free DALLE 3 and GPT-4 we can try out for free on both the web and app versions x
And if anyone actually has any easter egg knowledge about DALLE 4, please do share the good news with the community haha 😆
Without getting in trouble of course
#FreeDalle4
Or Dalle 3.5 ✨
… Sam Altman is typing…
😂
Well yeah. If I had a way to do a Dall-E 3 “pass” of any kind I would be all over it. Right now it will just redraw everything and completely disregard the original image AFAIK.
Just keep rerolling. It will hit it at some point if it's not too much text.
Then, use inpainting on errors.
I can’t get it to draw it with the viewpoint I want. It keeps drawing it symmetrically. 😩
wait for 4o image maker it will do text much better... if they ever release it 😭
click on the image and type the response there, rather than regular chatting.
might work better
What I get sick of is sideways images when I ask for portrait/tall
And random borders
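If you're on the API rather than ChatGPT, the orientation can be requested with the `size` parameter instead of relying on the word "portrait" in the prompt. A small sketch, assuming the three sizes documented for DALL-E 3 at the time of writing; `size_for` is a made-up helper name, and this does nothing about random borders.

```python
# Sizes documented for the DALL-E 3 image API (assumption: still current).
DALLE3_SIZES = {
    "square": "1024x1024",
    "landscape": "1792x1024",
    "portrait": "1024x1792",
}


def size_for(orientation):
    """Map an orientation word to a DALL-E 3 size string."""
    try:
        return DALLE3_SIZES[orientation.lower()]
    except KeyError:
        raise ValueError(f"unknown orientation: {orientation!r}") from None


print(size_for("portrait"))  # the tall format
```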
Is it just me, or is the edit feature difficult to use, and many times it just ruins the image?
What might I be doing wrong?
Does the Dalle3 API create different looking images to those made with BingChat or ChatGPT?
One thing I've never figured out is eye color. No matter how I describe blue eyes, it wants to make them practically glowing if I denote the eye color lol. And if I don't, it's up to random chance if they'll be blue, brown, green etc.
So, Nvidia CEO can publicly autograph a woman's chest (I think it happened today? confirming), but Dall-e will slap your wrist if you try and make that an image. (without mentioning names, so not an impersonation issue).
Edit: yes, was at Computex; either yesterday or today.
a big part of what it edits is the English prompt, and the order of things in the prompt makes a big difference for editing. Like if you start off describing a character with the type of jewelry it's wearing and then you try to edit the jewelry -- it may turn your character into a shapeless blob or a space ship. The AI converts English phrases into high-level vectors, and for you or the editor to reference the description depends on keeping the English wording for each element exactly the same. Otherwise it essentially rerolls it in an extremely random and destructive way -- along with everything that's nested on top of it
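The "keep the wording exactly the same" idea can be sketched in Python by storing each element as its own fixed sentence and only ever replacing one of them. The helper names and the necklace text are made up for illustration; this shows the bookkeeping, not how DALL-E's editor works internally.

```python
def build_prompt(elements):
    """Join element sentences in a fixed order (dicts keep insertion order)."""
    return " ".join(elements.values())


def swap_element(elements, key, new_text):
    """Return a copy with one element replaced; the rest stay word for word."""
    if key not in elements:
        raise KeyError(key)
    updated = dict(elements)
    updated[key] = new_text
    return updated


# Core traits first, accessories last, so editing an accessory does not
# disturb anything described after it.
character = {
    "body": "Zorka is an athletic kobold with green skin and blue eyes.",
    "hair": "Zorka has black hair with a pink plumeria.",
    "outfit": "Zorka wears a ballroom dance outfit.",
    "jewelry": "Zorka wears a silver necklace.",  # illustrative element
}

original = build_prompt(character)
edited = build_prompt(
    swap_element(character, "jewelry", "Zorka wears a gold bracelet.")
)
print(original)
print(edited)
```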
I don't think there is a perfect way to structure the prompts because the AI model is a bit unpredictable, but some things work better than others. I've started to collect notes on it, and posted them here:
#1245195468615450765 message
You can also ask ChatGPT to structure your prompts in a way that makes them editable, but even in ChatGPT 4o it tends to do some things that make no sense at all, so you kind of need to look at the prompts closely to keep them editable
I'm not sure why -- but the DALL-E 3 images generated a year ago seem to edit much more smoothly than the current ones (regardless of what the prompt is)
It could be the English transformer model for DALL-E 3 needs to be an old version (like not GPT 4o) in order to maintain backwards compatibility with the old images and doesn't quite grok what GPT 4o or the previous version generates (for editing)
I asked DALL-E to generate me a watercolor in portrait but it keeps generating images with a large white border. I even specifically asked it to not include any borders. Anyone have a clue how to stop it from including these borders and just generate an image edge to edge? Here is the prompt I used:
A watercolor painting of a serene lake on a moonlit night in portrait orientation. The entire image is covered without any borders. The painting captures a calm lake with a smaller moon shining from the corner, reflecting its light onto the lake. Stars are twinkling in the sky, adding to the tranquility of the scene. The colors are soft and muted, with blues and silvers dominating the palette.
not sure why it did that, but if you edit it and select the white area and say, "please complete the image in the selected area" it's somehow able to fill it in perfectly most of the time
Ah I haven’t tried out painting with dall-e before! Let me try that
i had like hundreds of images from a year ago where there were borders (probably because there wasn't enough processing time allocated to complete the image) -- and a year later it's able to complete it
Nope 😒
Hmm let me try it again from the beginning
Nope still not working
Try asking it to "complete the image". Like the image on the left here was created a year ago with borders. I selected the white part and told it to complete the image in the selected areas, and it completed it perfectly
It's able to do it flawlessly if the image wasn't complete because it uses the exact same prompt and seed
I edited the image from a year ago, selected the entire black border on both sides at the same time, and typed:
can you complete the image in these two sections
DALL-E 3 used to just slice the sides off like every 1 in 10 images to save processing time, but now you can go back and see exactly what they should have looked like
i get cut off after like 25 images in a few hours, until the next day smh. there has been no improvement in that since 4o
it does not save processing power at all
im pretty sure that is not how it works =P
it just did some solid color borders because a good amount of the images in the dataset have that
¯_(ツ)_/¯
for example, if you ask it to make a portrait, it will sometimes make a frame around the image, trying to make the actual concept of a portrait hanging on a wall
generating the images always takes about the same amount of time and processing power per image, regardless of the content
(with the exception of the computing cost of the text embedding, moderation process, filters, etc..)
but solid color patches do not have any significant difference
when it gets slower it is just because your generation request is queued behind other requests
So, no suggestions on dealing with GPT's extreme eye colors? Lol
how many Images do you guys make a day before you are cut off?
4o messages, no problem... images? it will be 25-30 for me and then they say that is all for today. or if i make 4 or 5 images in like 10 minutes i get a small timeout smh
it ends up using more processing time because the detail takes more iterations from the diffusion model. it needs to ‘move farther’ towards the light and texture detail
it always does the same number of iterations
it is actually quite the opposite, on a diffusion model, since it starts with noise, it might take more work to move it towards a big patch of a solid color (this is my educated guess, I might be wrong)
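To make the "same number of iterations" point concrete, here is a toy Python sketch of a fixed-step loop that starts from noise and moves toward a target. It is not how DALL-E 3 or any real diffusion sampler works (real samplers use a neural network to predict noise at each step); it only shows that with a fixed schedule the step count does not depend on whether the target is a flat color or a detailed gradient.

```python
import random


def toy_denoise(target, steps=50, seed=0):
    """Move from random noise to `target` in exactly `steps` updates.

    Toy illustration only; every name here is hypothetical.
    """
    rng = random.Random(seed)
    x = [rng.gauss(0.0, 1.0) for _ in target]  # start from pure noise
    updates = 0
    for t in range(steps):
        # Close a fixed share of the remaining gap; the last step closes it all.
        frac = 1.0 / (steps - t)
        x = [xi + frac * (ti - xi) for xi, ti in zip(x, target)]
        updates += 1
    return x, updates


flat_target = [0.5] * 16                       # solid color patch
detailed_target = [i / 15 for i in range(16)]  # gradient with more variation

_, flat_steps = toy_denoise(flat_target, steps=20)
_, detail_steps = toy_denoise(detailed_target, steps=20)
print(flat_steps, detail_steps)
```

Both runs do the same number of updates; what differs between targets is how far each value has to travel, not how many steps are taken.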
#1021130377026351105 Illustrate a photorealistic cinematic shot featuring Sheikh Mujib, a mid-aged man radiating confidence and sophistication, standing beside a vintage expensive car. Clad in a dark grey suit that exudes refinement, Sheikh Mujib epitomizes intelligence and style, further emphasized by his bold-framed eyeglasses and back-brushed hair. With cinematic lighting setting the scene aglow, Sheikh Mujib leans against the car with an air of assurance, projecting an unwavering confidence. His gaze is directed forward, reflecting his steadfast determination and vision. In the background, a picturesque mountain looms majestically, enhancing the timeless elegance of the setting. This composition captures the essence of a distinguished individual in a photorealistic manner, inviting viewers to appreciate his poise and charisma in this cinematic shot.
higher detail can require more iterations, and so can complexity like balance, as it needs to move farther towards the final image
it sort of moves towards the idea of an image using vectors that have direction and magnitude
if there isn’t enough processing time the image ends up with all sorts of anomalies, and DALL-E will sometimes only render part of the image
I understand what you mean and you are right, but I don't think it is actually the cause of the problem you said
I think the fact that it makes "cropped" images is more related to the dataset than to some sort of optimization or bug in the number of steps
btw, I laughed at this first image, I found it really funny for some reason xD
it also does a thing where it bounces an image back and forth between the diffusion model and the GPT to make adjustments when it isn’t sure how to draw something, and it goes through all the safety checks over and over again
the overall process requires a massive amount of processing power, and they need to cap it in many ways so the images aren’t too expensive to generate
The only important thing is that, “natural color, texture, and lighting” will make Santa and the athletic and diverse female elves look really good in your images
pretty solid photoreal
only thing holding it back really is the 'dalle face' of the elf in the second photo.
not enough
❓
this is the image
ok, i'll make a more realistic one
its realistic i just mean there is so often that same face for women in images from dalle
A photorealistic closeup portrait focusing on Santa and three athletic and diverse female elves, with natural color, texture, and lighting. Their faces are highly detailed. The portrait has Santa and the elves sitting close together at a desk, looking at us, and smiling with their teeth showing. Light from a window illuminates their faces. In the background is a library.
First attempt... hehe
i know what you mean. going to work on this.... We don't want people to think that Santa and the elves aren't real
Polaroid gave me nearly photographic level, for fictional subject.
Might need to pick this up tomorrow -- but we're definitely approaching reality. You can see here that we have a very genuine and believable Santa, and elves that are both authentic and highly realistic.
Hasta la vista - Pepe
Rip and tear - Doom pepe
Ridiculous! "I'm unable to generate the requested image because it doesn't align with the content policy. "
"Godzilla and Japanese people meme where Japanese people in Tokyo see Gozilla and start screaming "Godzilla" and pointing at him in fear"
You see that in every Japanese made Godzilla movie.
godzilla may be the problem because it is owned by some studio
it does not make any copyrighted stuff
Possible. 1) I never said to actually make Godzilla, 2) There were a few different Frankenstein movies made, however the Frankensteins all looked different because of copyright issues on making the monster look the same.
