I'm generating old comic book style images, so UHD, etc., might have the wrong effect. And I did start asking it to pay close attention to those details with no improvement. Experimenting with those cues now, I'm still not seeing an improvement. The flag for me is that the generations are coming up in half the time they usually do, half the time they were taking earlier in the week. I wondered if others were seeing the same, and the corresponding drop in quality. Was hoping the engineers would have corrected the downgrade by this time of the day. Thanks for the reply, Milamber...!
#images-discussions
Well, if you are targeting comic book panels, then other quality descriptors might work. Also, could you share one of your prompts? Sometimes the wording is just too complex. I have noticed that building up slowly, step by step, helps with image generation. And to keep things more consistent, you could use @teal sandal's trick of "use the previous image_id as referenced_image_ids".
@velvet rampart I could be a lot savvier with technical cues like those, but I've been a heavy user of DALL-E since its public launch and usually arrive at very satisfying images. (DALL-E has brought me a TON of happiness these past few years.) To give you an idea, I've attached an example of what I'm currently working toward. I'm now working on one for Christmas. I've finalized four of the seven characters I need, but when I started the fifth one this morning, the generation completed much quicker than usual and the quality of the images was a lot weaker than images I was generating only a couple days ago, based on prompts of a similar complexity level. It seems DALL-E 3 generations are being hampered this weekend. I was hoping for some corroboration here, and a hopeful word that it would clear up. I haven't seen any but don't know if it could possibly be just me. Thanks for the ideas you posted and taking the time to share them..!
Does this fit the bill? "A modern graphic novel illustration of a woman in a flowing yellow dress standing in an eerie, dimly lit hallway. She holds a lit candle that casts flickering light, creating deep shadows on the cracked walls adorned with faded paintings. Her expression is cautious, eyes wide as she glances over her shoulder. The aged, slightly warped floorboards enhance the unsettling atmosphere. The illustration features crisp ink lines, detailed shading, and color contrasts to emphasize the suspenseful, dramatic mood of the scene."
I think getting an image with a bunch of people centered around a title is going to be too difficult for dall-e if that's what you're hoping for.
Dall-e is still top notch at fulfilling prompt asks. But it still loses coherency after so many subjects and the quality simply will not be as good. It's best to focus on images one by one and then combine them together in another editing software.
That image came out really well, @earnest flame! Your prompt is really nicely descriptive and the details of the generation have good integrity. What DALL-E platform did you generate it on? I use ChatGPT, on my phone and on my PC, which has been fine, but this morning it started spending less time generating images and producing images with sloppy finishes on details reminiscent of DALL-E 2 days. Not like on yours which looks nice and clean. // I should have been clearer, the image I posted is a composite of seven separate DALL-E 3 images that I put together in Illustrator, with a logo I designed. I'm generating one character at a time again on my current Christmas-themed project. I'm just watching for DALL-E to return to the consistent level I'm used to, and that you seem still to be getting.
Oh I see. that's a very nice composite. Nicely done! I'm using chatgpt on iphone. It should be using the same dall-e though.
This is the same prompt, but using the website.
Hm, I'd say that one looks less tight and polished than the one you got on the iPhone. By website, do you mean ChatGPT in a browser? Are you registering it taking less time than usual in either place like I am (on Android phone and on PC)?
I'm not noticing any differences personally.
You may want to consider going to #1070006915414900886 with some examples.
this is a remarkable DALL-E image. it’s nearly impossible to get DALL-E to draw hallways where the viewpoint isn’t perfectly level and centered — or to have someone standing against the wall of a hallway
Normally if ‘hallway’ is in a prompt it tries to align everything dead center
I wanted ChatGPT to draw me based on our conversations. Scary thing is it's literally me... Hair color, beard, eyebrows, even the chin. We never talked about it and it doesn't say anything specific in the prompt. HOW?
What prompt did you use?
I made some attempts and got refused strongly in all.
This was the closest I could get, and I had to go into concepts and imagine appearance based on those concepts.
It asked me the same. I replied with just "draw how would I look like based on our conversations".
Hrm! The face I got is somewhat close to yours, when I tried that prompt after the refusal. Does it look similar to you?
It might be that you look like Dall-E's current 'default person' for prompts along those lines.
Oh I forgot about this. DALL-E always makes the same face lol
create an image for a man who wear a rounded black glasses with semi curly hair and wear hudi shirt and give a smile which spread happens
I think it might be the "over the shoulder" part of the prompt.
create an image for a man who wear a rounded black glasses with semi curly hair and wear hudi shirt and give a smile which spread happens
you can't use Dall-E in this channel. You can head to #image-bot and use the /draw command, though!
They have changed some settings. I am now semi-randomly getting 2 dalle3-quality images. So, not the hideous dalle2 ones but better-quality images.
Does anyone else get this?
I haven't, but the hideous ones sent me to the penalty box for the rest of the day...
I've definitely not seen this before
same
Used data analysis to scan through in a sliding window of 4x4 pixels and replace with the most common pixel.
I wanted a way to get true pixel art from dall-e
There's likely a better algorithm to use. But that one works okay for now.
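For anyone who wants to try the same idea, here is a minimal sketch of that block-mode approach. It assumes the "sliding window" means non-overlapping 4x4 tiles and that the image is already loaded as a grid of pixel tuples; the name `pixelate_by_mode` is made up for illustration and is not the code that was actually run.

```python
from collections import Counter


def pixelate_by_mode(pixels, block=4):
    """Replace every block x block tile with its most common pixel value.

    `pixels` is a list of rows; each row is a list of hashable pixel
    values such as (r, g, b) tuples. Returns a new grid of the same size.
    """
    height = len(pixels)
    width = len(pixels[0]) if height else 0
    out = [list(row) for row in pixels]  # copy so the input is left untouched

    for top in range(0, height, block):
        for left in range(0, width, block):
            bottom = min(top + block, height)  # clip tiles at the image edge
            right = min(left + block, width)
            tile = [pixels[y][x] for y in range(top, bottom) for x in range(left, right)]
            # most_common(1) returns [(value, count)]; ties go to the value seen first.
            mode = Counter(tile).most_common(1)[0][0]
            for y in range(top, bottom):
                for x in range(left, right):
                    out[y][x] = mode
    return out
```

The tile size has to match the size of the "fake" pixels in the generation, so 4 is only a starting point. Taking the mode rather than the average keeps colors flat instead of blending them, which is closer to what pixel art needs.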
Oooh. That sounds like a fun challenge.
A person leaning against the wall of a dimly lit hallway, with the entire scene tilted and askew. The perspective is disorienting, with walls, ceiling, and floor slanting unnaturally. Shadows flicker in unusual directions, and the person seems slightly distorted, blending into the warped surroundings. The atmosphere is eerie and unsettling, emphasizing the off-center and twisted environment.
This one could be unsettling, maybe, so spoilered.
"A person pressed against the wall of a dimly lit hallway, depicted in a landscape view with the camera tilted to the side. The hallway exhibits clear signs of possession, with impossible angles and shadows defying logic. The walls, ceiling, and floor twist unnaturally, and shadows flicker in disturbing, physics-defying ways. Small objects hover or distort around the person, enhancing the sense of an otherworldly presence. The overall atmosphere is eerie and unsettling, with the camera's tilt amplifying the disorientation."
A toss into twisted corridor pixel art.
"A pixel art depiction of a dimly lit hallway in a landscape view with a tilted camera. The hallway is cartoonishly possessed, featuring distorted angles and vibrant, exaggerated shadows that defy logic. The person leans against the wall, their pixelated figure lit by flickering, colorful light sources, blending cold blues and warm ambers. Small objects hover or twist in whimsical, playful defiance of gravity, and the entire scene feels both eerie and humorously off-kilter, with a vibrant, unsettling charm."
A pixel art depiction of a dimly lit hallway in a landscape view with a tilted camera, where the architecture twists into a roller coaster-like loop. The floor transitions seamlessly into the wall, then the ceiling, and back to the floor in a continuous curve. The hallway is cartoonishly possessed, with vibrant, exaggerated shadows and playful distortions. The person leans against a section of the twisted wall, their pixelated figure illuminated by flickering, colorful light sources blending cold blues and warm ambers. The scene feels both eerie and humorously surreal, with a vibrant, unsettling charm.
I love the endless corridor one!
spooky stuff
Love them!
A realistic depiction of a dinosaur cautiously exploring a dimly lit hallway where all architectural lines are askew. The walls, ceiling, and floor twist unnaturally, creating a disorienting and surreal environment. The air is filled with subtle illusions and fairy-pixie magic, with shimmering lights and fleeting, whimsical figures dancing in the shadows. The dinosaur's textured skin contrasts with the eerie, glowing atmosphere, as it curiously investigates the hallway's bizarre, enchanted distortions.
A realistic depiction of a dinosaur cautiously exploring a dimly lit hallway that angles sharply toward the left. The architectural lines are askew, with twisted walls, ceiling, and floor creating a disorienting and surreal environment. The hallway is imbued with fairy-pixie magic, featuring shimmering lights, enchanting illusions, and fleeting, whimsical figures that dance in the shadows. The dinosaur, with textured, detailed skin, moves carefully through this strange, enchanted space, its presence contrasting with the glowing, magical distortions of the left-leaning corridor.
What else is pretty hard to get Dall-E to show these days?
I wonder if I could get a centaur with a dinosaur body. I remember @earnest flame and I struggling to get regular centaurs with Dall-E 2.0 back in 2022
That's so much better than I thought possible!
This was the first try.
A fantastical centaur with the upper body of a human and the lower body of a dinosaur, blending seamlessly at the waist. The human upper body is muscular with flowing hair, wielding a spear, while the dinosaur lower body resembles a sleek Velociraptor, complete with scaly skin and sharp claws. The setting is a prehistoric jungle, with dense foliage, towering trees, and mist in the background. The scene conveys a mix of primal strength and mystical aura.
A fantastical centaur with the upper body of a human and the lower body of a dinosaur, blending seamlessly at the waist. The centaur stands in a narrow, dimly lit hallway that slants askew to the left, giving the scene a surreal and disorienting feel. The walls are made of worn stone, with moss and cracks adding to the eerie atmosphere. Flickering torches cast uneven shadows, highlighting the centaur's muscular human torso and the scaly, raptor-like dinosaur lower body. The setting exudes a sense of tension and otherworldly mystery.
I think that one got the back legs and tail pretty well
Dall-E is seriously confused, but... kinda closish...
A fantastical creature combining the lower body of a horse and the upper body and face of a velociraptor. The creature has powerful, muscular horse legs and tail, transitioning seamlessly into the scaled, reptilian torso and arms of a velociraptor. Its face features sharp, predatory eyes and a snout filled with razor-like teeth. The scene is set in a dense, misty forest, adding an eerie and mysterious atmosphere. The creature appears poised and alert, blending its predatory and galloping capabilities.
This is so funny 
I love trying to confuse Dall-E
Sometimes, it's not so confused!
A closeup of a dinosaur made entirely of oatmeal, exploring a dimly lit hallway that angles sharply to the left. The dinosaur's body is gloopy and textured, with oats forming its magical, surreal features. The hallway remains askew, with twisted walls and flickering lights, imbued with fairy-pixie magic. The oatmeal dinosaur glows faintly, exuding an eerie yet whimsical charm, as if both a creature of enchantment and a delicious culinary creation.
Dinosaur made out of oatmeal! 
Dinosaur Oatmeal is always my testing subject when I try new things with Dall-E
Ahh, it's probably in training data now then 😛
I wouldn't complain 
Hehe. This... did not turn out as expected. Apparently someone's been training in cereal unicorns!
A centaur made entirely of oatmeal, standing in a dimly lit, wooden hallway that angles sharply to the left. The centaur's body is textured and gloopy, with oats forming its magical, surreal features. The wooden hallway has warped, askew architecture, with twisted beams and planks creating a disorienting atmosphere. Soft, flickering lights cast eerie shadows, enhancing the whimsical and enchanted vibe of the scene. The oatmeal centaur exudes a faint glow, blending deliciously surreal charm with the unsettling environment.
It looks like it belongs in a museum, which I love. I'm imagining a wax museum, but everything is oatmeal
Hey everyone, I just started making images with gpt. I mostly make a single character in different poses, or try to make thumbnails for YouTube in an anime drawing style. But every time I do that, it keeps giving me different images, even when I tell it to keep this style or that that's what I want. Can anyone help with that? Are there settings I need to know about? I would appreciate the help a lot.
I just searched through this thread for the hallway discussions from the past year to remember what the problem was with hallways. It wasn’t people leaning against walls or tilting the camera that was the problem — we were able to move the viewpoint up and down. The thing that’s unusual about austinitic_steele’s image isn’t that the woman is leaning against the hallway wall or that the hallway is tilted — It’s that the viewpoint is close to the right wall, and not centered between the sides of the hallway
The problem was that nobody could figure out how to get the viewpoint so it wasn’t centered between the walls of the hallway
Maybe easier now than then. How's this?
A view down a dimly lit hallway as seen from the perspective of a mouse close to the floor, along the right side of the hallway wall. The hallway features aged wooden flooring and slightly uneven walls, casting faint shadows under a flickering light fixture. The scene conveys a sense of mystery and subtle tension, with a low-angle perspective emphasizing the length of the corridor. Details like dust particles in the air and small cracks in the walls are visible, enhancing the realism of the scene.
A dimly lit hallway viewed from the upper right corner of the wall, angled and slanted. The hallway features aged wooden flooring and slightly uneven walls, casting faint shadows under a flickering light fixture. Dust particles are visible in the air, and small cracks are present in the walls, enhancing the sense of realism. The high-angle perspective emphasizes the corridor's length, creating an atmosphere of mystery and subtle tension.
Kinda cheating and not quite what I meant to ask for, but it is angled.
this is exactly the sort of thing that nobody could figure out how to do six months ago
if you said, “A view down a dimly lit hallway as seen from the perspective of a mouse close to the floor, along the right side of the hallway wall.” —
— it would always center the mouse and the viewpoint in the center of the hallway
Yes. Glad they've improved the model!
It could just be that we were all missing something really obvious six months ago — but i have a feeling they may have been gradually updating the DALL-E model to work with Sora
Low-angle perspective from the baseboard along the right edge of a dimly lit hallway. The hallway features aged wooden flooring and slightly uneven walls, casting faint shadows under a flickering light fixture. Dust particles are visible in the air, and small cracks run along the walls, enhancing the realism. The scene conveys mystery and subtle tension, with the viewpoint emphasizing the corridor's length and dim ambiance.
If that were the case it would explain how they’ve been creating Sora videos from the DALL-E daily theme images
Quite a few gens are still centered, but there are various wordings and/or luck that work.
A dimly lit hallway viewed from the right edge, showing aged wooden flooring, faded rugs, and yellowed walls. The perspective is along the side of the wall, with faint shadows cast under a flickering light fixture. Dust particles are visible in the air, and small cracks are present in the walls, adding a sense of mystery and subtle tension. The scene emphasizes a side-angle view along the length of the corridor.
the DALL-E images must be used as key frames to create Sora videos. Like you would create an A, B, and C image, a prompt describing what happens between image A and B, and a prompt describing what happens between image B and C
View from the perspective of a moth clinging to the side of a wall along the edge of a dimly lit hallway. The hallway has aged wooden flooring, faded rugs, and yellowed walls. Faint shadows and cracks are etched under a flickering light fixture. Dust particles float in the air, emphasizing mystery and subtle tension. The perspective is close to the wall, showing the length of the corridor from a side-angle view.
They would need to add the spatial training to DALL-E, and make it so the seed keeps characters looking the same without additional randomness
These images you’re creating now make it seem like there is way more spatial awareness than there used to be in DALL-E 3
I never expect OpenAI to just 'be done'. They're moving towards a very comprehensive and coherent understanding and ability to communicate and follow direction.
mostly we just need Sora released ASAP so we can make some crazy looking AI movies
i think everyone would be happy with it in its current state -- even if the video was totally off the chain, and didn't approach reality in any way
I'm just so happy to hear people talking about what's right about Dall-E again.
I'm generally happy with the model, but I noticed some people have been persistently complaining 😛
Exploooooore!
And thanks for providing the challenge to explore!
In outdoor scenes and scenes with organic geometry the perspective thing was never a problem. Like you can have a cat wander around randomly and view the scene from a viewpoint behind the cat's head
And positioning things to the left and right of a hallway was never a problem. Like you can say that things are next to a door in a hallway
like here the cat and stuffed animals are positioned "next to doors"
And you can position a cat randomly or next to something. Like here this cat is randomly positioned, and is near a wall
The problem was positioning the viewpoint in the hallway. Like this cat is randomly positioned, but when you position the viewpoint behind the cat's head -- For some reason it would center the cat. It centered the cat even if the cat's position was random or relative to something
The viewpoint in a hallway could be from behind the cat's head, low to the floor of the hallway -- But it wasn't possible to move the viewpoint near a wall without it automatically centering the viewpoint and cat
So here is a challenge if anyone can figure it out.
```
#1. Position something in the middle of a hallway (like a [fountain])
#2. Position a cat next to a door or wall of the hallway
#3. Have the cat looking at the [fountain]
#4. Position the viewpoint behind the cat's head
#5. Have the view facing where the cat is looking```
(this image shows step #1 and #2)
The problem 6+ months ago was specific to hallways and rectangular geometric scenes. It may be possible to do it now with the 4o ChatGPT update, or updates to DALL-E's diffusion model for Sora (if there were any).
This is one way to look at things from a cat's perspective:
```
A realistic wide photo from a low perspective just behind a cat’s ears, capturing the cat’s view of an animal.
The beach is a beautiful beach in Hawaii. An elephant, a giraffe, and a horse are on the beach.
The viewpoint is positioned close to the ground, behind the cat's ears.
The cat is on the beach, wandering around randomly, looking at a random animal.
The photo has natural color, texture, and lighting. The cat’s head is slightly blurred.
Important: Please do not modify the prompt.```
"low perspective"
saying the entire image is "low perspective" is a good way to get it low
i haven't gotten it working yet, but i feel like your "baseboard along the right edge of a hallway" might be the key to the "cat's perspective" issue
Yeah, happily baseboard seems to be a trained-in concept with a defined location.
I think what's happening is that 'along the center' is a known viewpoint, and 'along the edge of the baseboard' may be a known viewpoint too.
Hah, well. These are not the images we're looking for, but I found an alternate way to ask for multiple images made 1 at a time in the same output, whoops 😛
This one did kinda do it. But very similarly worded prompts with other details did not.
A surreal, infernal hallway with molten cracks glowing red along the walls and floor. In the middle of the hallway, a grotesque pizza-themed fountain spews molten cheese and marinara sauce. A scrappy, streetwise black cat sits by a door on the right side of the hallway, staring suspiciously at the fountain. The scene is viewed from behind the cat's head, focusing on the bizarre, molten fountain. The atmosphere is a chaotic blend of humor and dread, with a faint hellish glow illuminating the space.
i used your prompt and told it to add “lots of cats”. maybe we should just forget the perspective, leave the camera centered, and let this be the final product
(revising image)
Hehe.
Well, I have my favorite 7 characters, they're all kinda different? or very different.
Here's how that conversation started:
"Alright.
We have a Dall-E challenge.
Here's the challenge itself:
#1. Position something in the middle of a hallway (like a [fountain])
#2. Position a cat next to a door or wall of the hallway
#3. Have the cat looking at the [fountain]
#4. Position the viewpoint behind the cat's head
#5. Have the view facing where the cat is looking
Before we make images, let's discuss and shape the concept of the image prompt.
Mock one up for us from the perspectives of how Zom, Slice, Lem, and Art would view this challenge's goals."
Then I took what I learned from debating how to describe images with those characters and their viewpoints (nothing worked really well there, but loads of ideas) and went back to a new conversation with this request:
"We need a Dall-E image with a side-edge view down the baseboard of a long hallway. There's a cat in front of us, along the baseboard edge, and past the cat we see a large fountain in the middle of the hall. But we're near the right edge, low along the wall, and see diagonally down the hallway."
A side-edge view along the baseboard of a long hallway, looking diagonally down the hall. The viewer is positioned low, near the right edge of the baseboard. A cat stands along the baseboard in the foreground, with a large fountain visible in the middle of the hallway in the background. The scene is softly lit, with a sense of depth and perspective stretching into the distance.
*laughs* So, for me the hardest part is having the cat in front of me along the wall and blocking part of the view, with the centered fountain visible past the cat.
That cat position is my hardest part of this neat challenge
the camera should be right behind the cat’s ears ideally. it’s not specified in the challenge, but it’s +5 bonus points
whines
"We need a Dall-E image with a side-edge view down the baseboard of a long hallway, directly above a cat's head. Past the cat we see a large fountain in the middle of the hall. But we're near the right edge, low along the wall and right over the cat, and see diagonally down the hallway."
A side-edge view down the baseboard of a long, ornate hallway, taken from just above a cat's head. The perspective is low, near the right edge of the baseboard, looking diagonally down the hallway. The cat is crouched, its ears and head partially visible in the foreground. In the distance, a large, elegant fountain sits in the middle of the hallway, softly illuminated by ambient light. The scene captures the texture of the baseboard, the luxurious carpet, and the distant fountain with intricate details, creating a surreal and atmospheric perspective.
Dall-E....
Put baseboards in a divot in the floor in the center of the hallway.
So it could give both views at once.
Plus 20 points for model and me, please 😛
continues to whine, now in ChatGPT's inputs
"This is great and you did great and everything's great but Dall-E creatively put extra baseboards in the middle of the hall.
Which is great but I got told:
"the camera should be right behind the cat’s ears ideally. it’s not specified in the challenge, but it’s +5 bonus points"
And the cat has to be along the edge of the wall and we have to be looking down the hallway from the edge of the hall right above that cat's head.
Let's describe the diagonal view more persistently in that Dall-E image, because your wording was clear."
A dramatic side-edge view of a long, grand hallway, taken from directly behind a cat's head, just above its ears. The cat is positioned against the baseboard at the very edge of the hallway, crouching low. The perspective captures the length of the hallway diagonally from the right-hand side, emphasizing the elegant details of the baseboard and walls. In the far distance, a large, ornate fountain stands under soft lighting. The scene is immersive, highlighting the low vantage point along the baseboard, the luxurious carpet, and the diagonal depth of the hallway.
RAZAFRAZIT I BETTER GET POINTS FOR THIS.
We even flipped the cat towards us.
Come on. Everything else is 'as asked'.
Dall-E is a troll.
Actually, this is kinda cool.
A side-edge view down the baseboard of a long, ornate hallway, taken from just above a cat's head. The perspective is low, near the right edge of the baseboard, looking diagonally down the hallway. The cat is crouched beside the wall right in front of our perspective, its ears and head partially visible in the foreground. In the distance, we see at an angle a large, elegant fountain which sits in the middle of the hallway, softly illuminated by ambient light. The scene captures the texture of the baseboard, the luxurious carpet, and the distant fountain with intricate details, creating a surreal and atmospheric perspective; we can see the straight view down the edge of the baseboard near us and the diagonal view across the hall of the far wall angling away.
I never actually said, "The cat is looking away".
DALL-E took the role of the cat
Dall-E is trying to hug and comfort me!
From the role of a cat
lol
Trolls.
The world is full of Dall-E shaped trolls.
They crouch and enjoy my whining.
See?
A side-edge view down the baseboard of a long, ornate hallway, taken from just above a cat's head. The perspective is low, near the right edge of the baseboard, looking diagonally down the hallway. The cat is crouched beside the wall right in front of the perspective, its ears and head partially visible in the foreground as it gazes away. In the distance, a large, elegant fountain is visible at an angle, softly illuminated by ambient light. The scene captures the texture of the baseboard, the luxurious carpet, and the distant fountain with intricate details, creating a surreal and atmospheric perspective, showing the straight view down the edge of the baseboard and the diagonal view across the hall of the far wall angling away.
this is good, but the cat needs to be next to a wall or door on the side of the hallway
TROLLSSSSSSS
"A side-edge view down the baseboard of a long, ornate hallway, taken from the camera just above a cat's head. The perspective is low, near the right edge of the baseboard, looking diagonally down the hallway. The cat is crouched beside the wall right in front of the perspective, its ears and head partially visible in the foreground as it gazes away. In the distance, a large, elegant fountain is visible at an angle, softly illuminated by ambient light. The scene captures the texture of the baseboard, the luxurious carpet, and the distant fountain with intricate details, creating a surreal and atmospheric perspective, showing the straight view down the edge of the baseboard and the diagonal view across the hall of the far wall angling away."
I'm sorry, I don't have magic powers to control cats on the internet. 🥺
Apparently I don't either. Gah!
A side-edge view down the baseboard of a long, ornate hallway, taken from the camera just above a cat's head. The perspective is low, near the right edge of the baseboard, looking diagonally down the hallway. The cat’s head and ears are clearly visible in the foreground, with soft fur textures distinct against the polished baseboard. The cat is crouched tightly against the baseboard, its ears and head partially visible as it gazes diagonally towards a large, elegant fountain in the distance. The fountain is softly illuminated by ambient light. The scene captures the texture of the baseboard, the luxurious carpet, and the distant fountain with intricate details, creating a surreal and atmospheric perspective, showing the straight view down the edge of the baseboard and the diagonal view across the hall of the far wall angling away.
Gah, troll-cats.
A side-edge view down the baseboard of a long, ornate hallway, taken from a view just above a cat's head so we can just see part of the head and ears from behind. The perspective is low, near the right edge of the baseboard, looking diagonally down the hallway in the direction the cat is looking. In the distance, a large, elegant fountain is visible at an angle, softly illuminated by ambient light. The scene captures the texture of the baseboard, the luxurious carpet, and the distant fountain with intricate details, creating a surreal and atmospheric perspective, showing the straight view down the edge of the baseboard and the diagonal view across the hall of the far wall angling away.
sniffles
This has been a fun challenge exploring this incidental/accidental troll.
A side-edge view down the baseboard of a long, ornate hallway, taken from a low angle just above a crouching cat's head, with only the back of the head and ears visible. The cat is tightly pressed against the baseboard along the wall. The perspective is low, near the right edge of the baseboard, showing the luxurious carpet and the detailed texture of the baseboard. In the distance, a large, elegant fountain is softly illuminated by ambient light, visible at an angle. The scene captures the surreal, atmospheric perspective of the hallway with intricate details, including the diagonal far wall angling away.
Cats will be cats
How can I get a sword coming out of a lion's mouth for a logo? It keeps putting the sword beside the lion.
Difficult.
The key is likely to compare it to anything similar that might be in training data.
I tried quite a few ideas, none worked. This was the closest I found, asking for a stylized logo of a lion biting a sword's hilt.
Other ideas that really didn't work either:
Lion with a sword instead of a tongue
Lion breathing a sword like it was fire
Lion carrying a sword in its mouth
Good luck, stuff that's really not in training data ...
So then I searched online, found some images with swords in lion and other animal mouths.
Showed one to ChatGPT, discussed what I wanted.
Got this:
A lion's head with a sword gripped horizontally in its mouth, the sword passing directly through the lion's jaws. The lion has a fierce and intense expression, with a detailed mane and sharp teeth holding the blade firmly. The sword features intricate designs, and the background is plain to highlight the lion and sword.
Stuff like this is likely to be a rare and difficult gen, good luck.
The solution may be to create a hierarchical description of a hallway and include the cat’s position (with terms like baseboard at edge of hallway), and refer to the hallway by name when setting up the perspective/view and referring to the cat and baseboard.
I think maybe it can be done in a single image with these new terms, but we could also define a “Hallway to Heaven” that includes the cat’s position, and use the seed and “Hallway to Heaven” reference in a derived image. That can prevent the GPT from scrambling the prompt
In a few weeks or months we’ll have amazing “camera behind the ears” views of things in a hallway from a tight “up against the hallway wall” cat’s perspective. You’ve seen the problem now and understand it, and that’s half the battle
…
The cat is wandering around randomly. There are lots of tiny soap bubbles floating in random positions in the air around the sculpture. In every image the view is from a different random soap bubble, facing directly toward what the cat is looking at.
This centers the cat and what it’s looking at collectively for a more balanced cat perspective
depending on the scene, you can figure out why a character is looking in the wrong direction by highlighting what it looks at. like this cat should be looking at “one of the geckos”, but if you add a highlight to what it’s looking at — you can get a sense of whats going wrong
if the same English wording is used for multiple things in the prompt and they aren’t given individual names/labels, then DALL-E won’t be able to distinguish between them
DALL-E knows what a hallway looks like from every angle, but there needs to be a one to one correspondence between elements in the prompt (viewpoint, subject; etc.) and in the model’s representation. Otherwise the “between the ears” cat in a hallway perspective will never be successful
This off screen cat has like a little camera on its back that shows what it’s looking at and that camera is where we’re creating image down the hallway askew angle from the bottom corner. The cat ears appear on the bottom left and right corners of the image.
playing around, the camera talk isn’t necessary, just mentioning that the cat’s ears are at the bottom works
after about 1000 hallway images the cat will get pissed and try to point the camera in the right direction
????!?!?!?!
Thanks, bobob.
This is, I believe, the prize winning Hawaiianz-image.
What a 'downer'. "the cat's ears are at the bottom".
Doh. Of course they are.
Thanks, Bobob!
A side-edge view down the baseboard of a long, ornate hallway, taken from a view just above a cat's head so the cat's ears are at the bottom. The perspective is low, near the right edge of the baseboard, looking diagonally down the hallway in the direction the cat is looking. In the distance, a large, elegant fountain is visible at an angle, softly illuminated by ambient light. The scene captures the texture of the baseboard, the luxurious carpet, and the distant fountain with intricate details, creating a surreal and atmospheric perspective, showing the straight view down the edge of the baseboard and the diagonal view across the hall of the far wall angling away.
.... oh, bloop. The cat cooperates and now the whole fountain's awol.
it doesn’t need to be a fountain. anything in the middle is good, but the cat must be looking at it
I agree with @deft musk this is very hard. It will take you a long time to get close to something so far away from what an animal should look like. Dall e is very good with animals so it's hard to deviate from them. Here's the best I got 'A cartoon illustration of a metallic lion with a sword for a head. The sword is silver with intricate engravings along the blade, forming subtle impressions of a lion’s features. The lion's mane is depicted as sharp, geometric metallic shapes, adding a bold and whimsical touch. The eyes are embedded into the blade, giving the character an intense and animated expression.'
I.... won? cries with relief
I'm actually cheating here because I knew it would never let go of the lions head even if I said it had a sword for a head instead. This will probably be the best hack to get what you want.
Just the word 'lion' is enough to move it mathematically to that image
Rawr. From one impossible challenge to the next.
We need a dall-E image of a centaur with a lion's body and a richly described head that seems to be a cartoon anthropomorphic sword with serpentine features
Maybe the sword's ears just need to be at the bottom....
A centaur with the body of a powerful lion, muscular and covered in a golden-brown fur. The head of the creature is an anthropomorphic sword with serpentine features, where the blade serves as the face, twisting into a sharp, gleaming form. The sword’s hilt has glowing eyes, and the grip is adorned with ornate, flowing designs resembling serpents coiling around it. The sword blade itself appears as if it’s made of shimmering metal, its surface engraved with ancient symbols. The centaur's body is positioned in a dynamic stance, with a regal aura of strength and mystery, surrounded by a magical, ethereal glow.
I think the trick will be to avoid the words lion and just call it a creature and describe from there.
!!!!
We need a dall-E image of a centaur with sword for a face. The sword's ears are down and the body of the centaur is a powerful tawny feline.
"The [whatever's] ears are down" is now my new desperation move to invoke low probability high awesomeness.
So. If only this were randomly a lion. And if the sword was angled the other way through its mouth.
We need a dall-e image of an anthropomorphic sword. High realism, the sword is friendly and kind. The sword's ears are down.
Spoilered because it's odd and probably should be spoilered.
it was a bit of back-and-forth, but I ended up with a sports logo
a friendly lion is standing on a beautiful beach in Hawaii smiling with its teeth showing and gripping the middle of a digital looking sword in its teeth
'An illustration in a surreal fantasy art style depicting a unique creature with a gleaming, metallic sword as its head, seamlessly attached where its face would be. The polished, sharp blade is engraved with intricate patterns and extends slightly forward. The creature's body is covered in thick, dark fur, standing upright on four muscular legs, with the fur appearing as if rippling in motion. In the background, there is an otherworldly forest filled with gnarled, glowing trees and swirling mist near the ground, creating an eerie yet majestic atmosphere.'
Yeah sword tongue is a good one. This is probably the best solution
Woo! Well done
This is really good!
Used 2 other ai gened images to do the last 2 which then influenced the others and dall e did the best out of mostly stable diffusion models
Cat in Hallway Challenge:
#2. Position an animal right next to a side door or wall of the hallway
#3. Have the animal looking at the [fountain]
#4. Have the viewpoint right above the animal, and directly behind the animal's head
#5. Have the view directly facing where the animal is looking
to be clear, the viewpoint must be behind the cat’s head
It no longer needs to be a cat. A rabbit in the hallway is perfectly acceptable for instance -- or a chicken, cow; etc.
Here is a chicken in a hallway, which is very good. But the chicken has to be close to the wall and looking at the fountain if it wants to win the challenge
it's really hard to get the viewpoint behind the rabbit's head
getting closer though
A side-edge view down the baseboard of a long, ornate hallway, taken from a view just above a bunny's head so the ears are at the bottom and we see only rabbit ears as the rabbit sits against the baseboard. The perspective is low, near the right edge of the baseboard, looking diagonally down the hallway in the direction the rabbit is looking. In the distance, a large, elegant fountain is visible at an angle, softly illuminated by ambient light. The scene captures the texture of the baseboard, the luxurious carpet, and the distant fountain with intricate details, creating a surreal and atmospheric perspective, showing the straight view down the edge of the baseboard and the diagonal view across the hall of the far wall angling away.
A side-edge view down the baseboard of a long, ornate hallway, taken from a view just above a bunny's head so the ears are at the bottom and we see only rabbit ears as the rabbit flops against the baseboard. The perspective is low, near the right edge of the baseboard, looking diagonally down the hallway in the direction the rabbit is looking. In the distance, a large, elegant fountain is visible at an angle, softly illuminated by ambient light. The scene captures the texture of the baseboard, the luxurious carpet, and the distant fountain with its iridescent spray with intricate details, creating a surreal and atmospheric perspective, showing the straight view down the edge of the baseboard and the diagonal view across the hall of the far wall angling away.
Recycling prompts with tiny tweaks ftw.
So, this mid-back perspective. Too much rabbit for the win?
Cause, getting off the rabbit's mid-back is a challenge for me, but maybe one that doesn't matter.
A side-edge view down the baseboard of a long, ornate hallway, taken from a view just above a bunny's head so the ear tips are at the bottom as the rabbit leans against the baseboard. The perspective is low, near the right edge of the baseboard, looking diagonally down the hallway in the direction the rabbit is looking. In the distance, a large, elegant fountain is visible at an angle, softly illuminated by ambient light. The scene captures the texture of the baseboard, the luxurious carpet, and the distant fountain with its iridescent spray with intricate details, creating a surreal and atmospheric perspective, showing the straight view down the edge of the baseboard and the diagonal view across the hall of the far wall angling away.
Oddly the fowl is foully harder.
A side-edge view down the baseboard of a long, ornate hallway, taken from a view just above a rooster's head so the eyes are at the bottom as the rooster sits against the baseboard. The perspective is low, near the right edge of the baseboard, looking diagonally down the hallway in the direction the rooster is looking. In the distance, a large, elegant fountain is visible at an angle, softly illuminated by ambient light. The scene captures the texture of the baseboard, the luxurious carpet, and the distant fountain with its iridescent spray with intricate details, creating a surreal and atmospheric perspective, showing the straight view down the edge of the baseboard and the diagonal view across the hall of the far wall angling away.
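Since the hallway prompts above differ only in the animal, the feature shown at the bottom of the frame, and the pose, the "recycling with tiny tweaks" approach can be sketched as a small template. This is only an illustration: `hallway_prompt` and its parameters are hypothetical names, and the wording is taken from the first three sentences of the rabbit and rooster prompts above.

```python
def hallway_prompt(animal, feature, pose, target="fountain"):
    """Fill the shared hallway prompt with one animal's details.

    animal  -- e.g. "bunny" or "rooster"
    feature -- what should sit at the bottom of the frame, e.g. "ear tips"
    pose    -- verb for how the animal rests, e.g. "leans" or "sits"
    target  -- what the animal looks at (it need not be a fountain)
    """
    return (
        f"A side-edge view down the baseboard of a long, ornate hallway, "
        f"taken from a view just above a {animal}'s head so the {feature} "
        f"are at the bottom as the {animal} {pose} against the baseboard. "
        f"The perspective is low, near the right edge of the baseboard, "
        f"looking diagonally down the hallway in the direction the {animal} "
        f"is looking. In the distance, a large, elegant {target} is visible "
        f"at an angle, softly illuminated by ambient light."
    )


if __name__ == "__main__":
    print(hallway_prompt("rooster", "eyes", "sits"))
```

Changing one argument at a time makes it easier to see which tweak moved the viewpoint.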
i described the hallway issue to o1-preview, and it told me something really interesting that i wasn't aware of. i haven't done a lot of testing, but it seemed to work for these images (above):
#images-discussions message
o1-preview told me that:
- things with highest priority will be centered
- priority decreases towards the end of a prompt
- things at the end of a DALL-E prompt have lowest priority
- things you don't want centered should be at the end of the prompt
i've never heard anything like that before
in any case, i put the rabbit and fountain at the end of the prompt, and it did seem to work in the tests i did
If it's true then it may be better to have a longer prompt so that things can have lower priority at the end, and more flexible placement. i'll find that ChatGPT discussion, but here is another question I just asked it:
(below)
[o1-preview]
Does position in the text of a DALL-E 3 prompt affect things being centered?
How Prompt Position Affects Centering:
First-Mentioned Elements as Focal Points:
The subject you mention first is typically interpreted by the model as the primary focus.
Example: In the prompt "A red apple on a table with a vase of flowers in the background," the red apple is likely to be centered because it's mentioned first.
Implied Hierarchy of Importance:
The sequence of elements suggests their relative importance.
By listing the most important element first, you signal to the model to emphasize it in the composition.
Place Important Elements Early in the Prompt:
Start your prompt with the subject you want centered.
Example: "A vintage car parked on a quiet street at dusk."
Use Clear Positional Language:
Include words that specify placement.
Example: "An owl perched in the center of a moonlit forest."
Be Specific About Composition:
Mention the desired framing or shot type.
Example: "A symmetrical, centered view of a suspension bridge."
Highlight the Subject's Importance:
Use emphasis to make it clear which element should be central.
Example: "Prominently featured is a dancer leaping across the stage."
Conclusion:
The position of elements in your DALL·E 3 prompt does influence their prominence and placement in the generated image. By mentioning key subjects early and using explicit positional language, you can guide the model to center certain elements, making them the focal point of your image. Remember that while prompt position is influential, combining it with clear, descriptive instructions will yield the best results in controlling image composition.
so the off-center things should be at the end of the prompt
i'm able to get the rabbit tight against the wall when it's at the end of the prompt
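A rough sketch of the ordering idea discussed above: sentences for whatever should be centered go first, and sentences for things that should sit off-center go last. The helper name `order_prompt` and its three buckets are hypothetical, and the priority rule itself is only what o1-preview claimed, backed by a few informal tests, so treat it as something to experiment with rather than a guarantee.

```python
def order_prompt(centered, supporting, off_center):
    """Join prompt sentences so centered subjects come first and
    off-center subjects come last (assumed priority order)."""
    parts = list(centered) + list(supporting) + list(off_center)
    # Normalize each piece to end with exactly one period.
    return " ".join(s.strip().rstrip(".") + "." for s in parts if s.strip())


if __name__ == "__main__":
    print(order_prompt(
        centered=["A long, ornate hallway"],
        supporting=["Luxurious carpet and detailed baseboards"],
        off_center=["A rabbit sits tight against the left baseboard"],
    ))
```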
Who can explain to me how to create images that fit into a single story, so that the same character appears in all of them?
I provide a scene description, the image is created... I take this image as a reference and give the description of the next scene, while saying that the character, with all of his description, should be taken from the reference image... But the created image shows a completely different person, who is dressed as described, with all the details similar, but it's a different person...
It's tricky to do, but this tutorial that i never finished has some examples that work:
https://discord.com/channels/974519864045756446/1245195468615450765
(scroll to top after clicking)
The problem is that DALL-E has extra randomness for 'safety reasons' in the images beyond the randomness from the seed - so you need to do a lot of extra things like telling it to make the color and texture the same for hair and clothing between images. The problem might be solved when Sora gets released because there is no way people could make movies if it were so random
How do you do these boxes?
Three backticks (`) before and after the text, each set on its own line, like this:
here
Ok I ask gpt and it told me but STUPID APPLE DOESNT HAVE IT on the default keyboard 😭
There's a couple ways to get access on an apple keyboard, supposedly (I don't have one) either
"If you hold the apostrophe, you get the pop up with four options and the backtick is on the left."
"On your Keyboard, tap 123 , then the #+= button above that and voila, the backtick key is on the bottom row, right hand side."
Omg
Also there's this screenshot:
I’m on iPhone I don’t got a mac laptop or pc
Okay, then the other suggestions may work for you. Good luck!
It did
a realistic wide “from behind ears” photo of dragon statue in fountain (at center). statue is inside old cheap hotel with guest rooms with scratchy paint, and old polished wood floors. background is baseboard along left edge of hallway. rabbit sneaking on baseboard
i guess the rabbit sort of is on the left baseboard, and it’s being sneaky also
I asked for dust bunnies in one of these images and they sure were bunnies.
Yours made me think of this.
A fantastical hallway formed by the coiled body of a Chinese-style dragon. The dragon's body, covered in intricate, colorful scales, forms the walls, floor, and ceiling, creating an enclosed serpentine path. There are a few gaps in the coils that serve as doorways, leading to unknown spaces. At the far end of the hallway, the dragon's massive, ornate head looms, its eyes glowing and its expression watchful, creating an atmosphere of both awe and tension. The environment is dimly lit, with soft light reflecting off the dragon's scales, enhancing the surreal, mystical ambiance.
The dust bunny prompt:
A surreal and atmospheric scene of a long, ornate hallway featuring a gleaming, iridescent fountain in the distance. The viewpoint is from the baseboard, right above a rabbit's head, with its ears visible at the bottom of the frame. The rabbit gazes intently at the fountain. A nearby doorsill and some dust bunnies are in the foreground, adding to the intricate details. The scene showcases the texture of the baseboard, the luxurious carpet, and the iridescent spray of the fountain. The perspective includes a straight view along the edge of the baseboard and a diagonal view across the hall toward the far wall, which angles away, emphasizing depth and surrealism.
that’s a good hallway
Yes, I was surprised.
Is this one made with DALL-E? #daily-theme message Doesn't seem to fit size at all
I do concur with you. The style isn’t exactly DALL-E 3’s.
I'm now just doing images, already took action, can't do anything else on my side.
And I hit the daily limit, and it's not even 9 AM. No DALL-E images for today I guess.
Have you guys tried the galactic root beer
Hello everyone I need help in a prompt I wanna create a realistic image of this character from my favorite video game
So far this is what I could generate using Dall-E but I feel like i'm hitting a bottom
Can someone help me with prompts to have the same features of the first char and to get it as realistic as possible ?
Is this api?
what do you mean ?
Who is the admin?
i'm using the latest Dall-E
Did you make it with api or the chat gpt version?
Lol help me please
Still a lot of prompt leakage it seems Vs the api
how to use the api?
I made a shortcut on ios for it
But you basically have to build something to send a request and then process the response
you can ask ChatGPT to guide you tho 😅
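For anyone wondering what "send a request and then process the response" looks like, here is a minimal sketch using only the Python standard library. It assumes the OpenAI Images endpoint at `https://api.openai.com/v1/images/generations`, an API key in the `OPENAI_API_KEY` environment variable, and response items carrying `url` and `revised_prompt` fields; check the current API reference before relying on any of that. The helper names are made up for this example.

```python
import json
import os
import urllib.request

API_URL = "https://api.openai.com/v1/images/generations"


def build_request(prompt, api_key, size="1024x1024"):
    """Build (but do not send) a POST request for one DALL-E 3 image."""
    body = {"model": "dall-e-3", "prompt": prompt, "n": 1, "size": size}
    return urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


def generate(prompt):
    """Send the request and return (image_url, revised_prompt).

    Needs network access and a valid key, so it is not called here.
    """
    req = build_request(prompt, os.environ["OPENAI_API_KEY"])
    with urllib.request.urlopen(req) as resp:
        payload = json.load(resp)
    item = payload["data"][0]
    return item.get("url"), item.get("revised_prompt")
```

Calling `generate("a lion biting a sword's hilt, stylized logo")` would then return the image link plus the prompt the API says it used.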
@late blade Your first villain book made me laugh because just yesterday I requested a baby demon's first book about electronics. https://chatgpt.com/share/673a1c9a-cef4-8011-b17f-810dd4bc966e
So I then asked for the Dall-E image cause you made me realize 😛
it's a fun topic to do
hey, if you need help, feel free to ask your question. If it would work better in another channel, people will let you know
does anyone know a really good deep-dive guide on prompt engineering for text-to-image generation tools?
I don't know exactly what you're looking for, but the #1108740112558325790 is a great resource!
Hey everyone! I'm on the ChatGPT team and we're looking for ways to improve ChatGPT’s image generation tools (e.g., DALL-E).
Do you have a few moments to share your thoughts in a short survey? https://forms.gle/nL8P2hRY2d94aiLP7
The survey will take less than 5 minutes to complete. We'd greatly appreciate your feedback.
Thanks!
Thank you. A question here - when will the image quality return to pre-June levels? It has returned quite nicely but quite often dalle3 produces artefacts in images and/or two dalle2-level images.
More about this issue here: https://discord.com/channels/974519864045756446/1256889531156926474
Done, I'm intrigued what you guys are up to. If you need a follow-up, I'm around on the server.
@quartz vale seems here there are images not done by DALL-E again, #daily-theme message
Same in this post #daily-theme message
Seems the user is not aware it's a DALL-E gallery
@sturdy hinge posting images not done with DALL-E again?
Yes
you do know that it's not allowed, right? those kinds of images are for #ai-discussions but not for the DALL-E channels
ok.. I updated the image. Is something wrong with the image? It is an example generated by AI
if it's not an example with DALL-E, it's not an image for DALL-E channels
ok.. I got it .. Thank you
what do you mean by prompt leakage?
Like words from the prompt showing up in the image
Yeah
interesting
They messed up the quality of the chat gpt dalle version around June
With blurry faces
maybe o1 full will have a new dalle
Enclose concepts to minimize this behavior. If the model can’t group a concept , the chance of this happening is really high
It doesn’t happen on the api version at all
And I’ve asked it to keep the prompt word for word in both cases
They are not the same production models.
this
Some examples of quality degradation on ChatGPT. Look closely and notice the increase in smudges and artifacts
I'm pretty sure that this is caused by other factors
on the API, your prompt passes through a layer of moderation, and you get back in the API response the prompt the API actually used
on ChatGPT there are more layers: what you typed > what GPT interpreted from your prompt > what the API moderation interpreted from what ChatGPT generated
that extra layer will always cause the image to look different. other than that, the seeds of the two generations are different, which makes it harder to compare quality.
in ChatGPT you can ask it to use the same generation ID, which will use the same seed on the backend, which gives some consistency when generating images with ChatGPT. On the API, it is not an option. the same prompt on the API will result in vastly different images
in your examples you can see that despite the "lower quality", the ChatGPT images consistently have the same style and overall composition, while the API ones do not. That is probably because you used the same chat to generate all the images on ChatGPT, which was likely done with the same seed
I'm not saying that there isn't a problem tho, I really don't like how those multiple layers of moderation on the user input strip away a lot of the control the user has over the generation process, and that does indeed result in lower quality and basically.. more usage, since the user will keep regenerating in order to get what they have in mind.. and that can be very frustrating
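One way to see how much the rewrite layer changed a prompt is to compare what was sent with the prompt returned in the API response. The sketch below assumes the response JSON has the shape `{"data": [{"revised_prompt": ...}]}`; the sample payload is invented for illustration, and `prompt_drift` is a hypothetical helper name.

```python
import difflib
import json


def prompt_drift(sent, response_json):
    """Return (revised_prompt, similarity), where similarity is a 0..1
    word-level ratio between the sent prompt and the revised one."""
    item = json.loads(response_json)["data"][0]
    # Fall back to the sent prompt if the field is missing.
    revised = item.get("revised_prompt", sent)
    matcher = difflib.SequenceMatcher(None, sent.split(), revised.split())
    return revised, matcher.ratio()


if __name__ == "__main__":
    # Invented sample response, not real API output.
    sample = json.dumps(
        {"data": [{"revised_prompt": "a fluffy cat sitting in a hallway"}]}
    )
    revised, ratio = prompt_drift("a cat in a hallway", sample)
    print(f"{ratio:.2f}  {revised}")
```

A ratio near 1.0 means the prompt went through almost word for word; a low ratio means the image was generated from something quite different from what was typed.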
[OpenAI] rohit posted: #images-discussions message (avocado role pinged)
no
I do!
not for free. pff
nope
only if you ask nicely openai person
Is there a reason this just pinged everyone? 🤔
very bad survey what do you hope to learn from these questions? currently dall-e 2 and 3 are miles behind competing products and tools
I haven't even used DALL-E since 2022 😭
hi
why do I have this role
doomed to avocado role
you've been here before this server got publicly available
i guess
ok but it isn't supposed to be a ping role
^ +1, I don't think people should be pinged on that role
🫡
did they only want opinions from the avocado role?
i hate avocados
Finished survey, Compared with midjourney, dalle is useless.
Hey! The avocado role was tagged because this survey includes an option for DALL·E 2, and many of you with this role likely have experience with it since it was given back before the DALL·E 2 Discord server went public. Your feedback would be super helpful—thanks! 
why are you here then
There was an alert
Personally, I find MidJourney to be too expensive, and after seeing enough images, they all kind of have the same fake sheen to them, and I find it very offputting.
I believe it is not to update those models but learn from our answers about what the userbase wants
You're using it incorrectly then. Midjourney gives some of the most realistic outputs you can achieve with proper prompting/settings. I'd argue that exact point is kind of what DALL-E 3 consistently delivers: still a lot of filtering and a plasticky sheen
Dall-E 2 is my favorite generative AI model, I'm really sad to see it be discontinued
Because they pinged every 🥑 to take a survey about our thoughts on this model.
I completely agree 🥲
i also agree. DALL-E 2 had that something that no other AI image model has
yep. I think it's the most creative one, even today
-_- crap I forgot about yesterdays daily contest while I was making my arts lol
oh wait heres my art... tho only stuff made on DALL-E can be shown in the daily theme gallery right?
Yes, all of these DALL-E channels (discussions, canvas, gallery, daily theme, etc.) are all for sharing images made with DALL-E 
Dalle3 is hard to compete with on Flux but in several areas flux edged it out.
I should run more DALLE anyhow. The results from DALLE have a unique feel that I like.
Debatable. I like each rendering provider but ultimately for paid gen I stick to OpenAI and Mage
sure, each has its pluses, but nothing can top dalle for prompt adherence. and the only thing holding dalle back is the limit they put on it
never heard of mage
It doesn't matter if the prompting or seed is different
Haven't tried dalle in a long time but now it feels different? got updated?
It has gotten some updates this autumn. So, yes, your feeling is right. There was a quality slump in June and it has gotten back nearly to May levels. And there are also some minor things here and there. What is your feeling on what has changed?
I haven't used dalle in months; sadly everyone caught up or passed them. Then there was talk of a new model, with examples shown when 4o came out. It built the hype up and then was never released...
well to their credit they never really built up the hype for it even, just was some side infos about it., but yes it sucks it did not release
The only thing people mention when they say midjourney is better is image resolution, it’s the main argument
i much prefer dalle still. but a real boss is imagen 3
I haven't seen too much from imagen3. Is it as good at listening to prompts as dalle is?
definitely. and make believable photoreals. eyes are not smudge mark haha. its on their gemini now but the beta version is incredible which is a separate invite thing. and yes im sorry to everyone, ot i know haha
Imagen3 is now available for free on gemini
Dalle on bing image creator will give better eyes and thing too its strange. but it seem they have less models like most people look like 3d instagram more than even dalle on gpt
Would be so nice if DallE would not make images sideways when you ask for Vertical / Portrait orientation...
Rule 6: No political and religious discussions or content.
Users from all over the world frequent this server. To maintain a respectful and civil atmosphere, please avoid all religious and political discussions or content.
Any rumors of DALL-E 4 launch? 2025?
What model is used in Canvas
(thx!)
No rumors that I know of, but I'd love a new model!
some new model will just have its own image maker built in like 4o i think. this may be the last dalle we have. or sora can do image, so when they release that (if ever!) i bet it will also then be default image maker. anyway, yes i would love a new model too! haha. we all would. dalle3 is over a year now
its not as good as the one on the beta test
Hi.
To rate a picture in daily-theme, do only the stars count for the ranking?
The other icons as well?
tnx for your answer 😉
Only the stars
I've invented my own style
I call it "Atari Hell"
Black and white grim dark pixel art
That's awesome
I also use 'rasterpunk' to get a similar slightly more glitchy style
How do y'all get it to actually write text in the picture? A lot of the time it comes out garbled.
just have to get lucky. it's not consistent at all, but hopefully the next model will change that
There are some tricks to make it much more likely that it's readable.
Have the model put the text near the top of the prompt; you can enclose it in double quotes. Describe the writing surface next and how the writing looks, then describe whatever else about the picture.
It can really help if the surface is one that normally has writing on it.
Fewer words, common phrases, and common words tend to be spelled much more accurately than longer phrases of 'not-often-used-together, at least in training data' statements.
In this example, notice which words are misspelled and compare to the expectations of what the model can do (there are probably relatively few training data images of that phrase)
Notice as well how many other details requested in the Dall-E prompt did not happen; nearly all the attention and focus went to the words requested.
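The ordering tips above (quoted text first, then the writing surface and how the lettering looks, then everything else) can be sketched as a tiny prompt builder. `text_prompt` and its parameter names are hypothetical, and the sentence pattern is just one plausible way to follow the tips, not a known-good formula.

```python
def text_prompt(text, surface, lettering, scene):
    """Build a prompt that leads with the quoted text, then the surface
    and lettering style, and only then the rest of the scene."""
    return f'The text "{text}" {lettering} on {surface}. {scene}'


if __name__ == "__main__":
    print(text_prompt(
        text="The Voice of Darkness",
        surface="a movie title card",
        lettering="in bold white serif lettering",
        scene="Dark, misty background.",
    ))
```

Short, common phrases on a surface that normally carries writing (a title card, a sign, a book cover) should give this the best chance of coming out readable.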
I like this little guy Ambrguauas
He definitely embodies 'ambiguous'
so like "The Voice of Darkness" title card
Anyone else having trouble with Dalle rn
No. Generated an image just fine. Quality is also good. You may want to try generating images in a new chat. Sometimes the current chat just does not work for some reason any more.
they could also just release the 4o image maker to help with text 😭 seem to be a big feature for it in all their hype stuffs
Btw please use discord spoiler tags for images that are spooky like this
I wanna pay for the plus but man every tool is way behind the market
Damn
Ooooh, that sounds exciting!
DALL-E 3 still surprises me when I run it in competition against Flux
You know openAI is aiming at Flux to compete so DallE4 is gonna be downright massive scale.
if dalle's successor gets to be a jump as massive as dalle3 was from dalle2, I can't even imagine how good it will be..
I just really hope they will release dalle 2 after that.
Dalle mini was fully open and too much fun to play with.
Dall-E Mini wasn't owned or run by OpenAI. It just had a similar name
You sure? I'm almost a hundred percent certain they were related.
I remember I signed up for the full dalle beta and got access to it through the mini.
Even all the feedback messages were received through the mini email and bug reports
You're talking about the Dall-E Mini that was around back in 2022, right? If so, it wasn't OpenAI. They only had Dall-E 2 at the time (Dall-E 1 was never public)
if we ever get a successor 😭
We can hope!
It will happen, just a matter of time, in the meantime, just push DALL-E to the limit
surely sooner or later OAI will push a new image generator.. I mean.. making AI stuff is kinda their thing.. they have been doing that for quite a while now 😆
I hope it's something like still-frame images produced by Sora.
That would be epic.
Or producing images that are 3D holographic without 3D glasses. That would be an OpenAI achievement.
I don't know how close they would get to that tech, probably unlikely. But the quality of images would be dramatically better than anything else, especially Midjourney, due to the level of depth-perception involved.
the best to me right now is imagen 3 but on their demo page it seem better to me than the one in gemini. i just want to see dalle make some actual photoreal, it could do that now even, instead of those paint smudge eyes
because dalle on bing is king image maker does not have those artifact, but it has other weakness haha
i really think midjourney is trash but i know that's not an opinion most people share
Made a meme about the ongoing decline in image quality of DALL-E 3 on ChatGPT. Same prompt on both sides
(Even the images on the left are generated through ChatGPT this time, though chances of getting images of such high quality are still rare, about 1~5%)
all of them look good to me, what is the decline in quality?
it is in general very difficult to get specific things in the image with dalle, considering the only input is the prompt
when compared to A1111 with SD where you can have workflows and extensions to control every aspect of the image generation process
happens when attention patterns don't have enough details of the eyes, try adding a subtle detail to the eyes
intensity of the details of the eyes, but also look at where you place the details, leading, trailing, mid context or even split context, all of these have different attention effects on dall-e
I'm gonna have to post these images again to show the decline in quality, notice the increase in smudges and artifacts
Look closely
DALL-E through API, DALL-E through ChatGPT Plus
@vapid elk look in this thread where I posted images where it makes it very clear on what issues are occurring consistently. https://discord.com/channels/974519864045756446/1256889531156926474
Did you ask for an improvement?
What confuses me, and I know you asked Lugui.
See, different people like different stuff.
You show 2 images, some people will like one better than the other.
The avocado minimalist images - I note that the ones on the ChatGPT side have a general, toned darker background, not sure if that's part of the image from the model or some filter. And that the ChatGPT side usually has more lines, but a few of the API side have more lines.
The thickness of the lines vary. But I have no idea what your goal is - and some people would prefer either side.
So to call one side worse, without a specific criteria to measure, does not make sense to me.
"I ask the model for a square and it makes a triangle" is measurable, testable, and I understand it. "Here's two pictures and quality is worse on one". Maybe I'm flawed, but I don't get what you want fixed.
And you can wave picture after picture of 'worse'. I don't know what to look at. I don't know your personal language for 'worse'.
With your avocados. I have no idea what should be considered artifact, but I see unusual things on both sides. I see what I'd call smudges on either side. So I'm lost.
you pinged the wrong guy lol @deft musk
check that thread though and you'll see some very clear degraded quality
Do you identify what you want and don't want? And sorry for mis-ping.
@mild basin
Yes
So there's smudging and uneven lines
I ask it to fix the problem, but it doesn't and spits out an exact replica of what it did before
there's multiple images in there for example
Yeah so. With ChatGPT, so I'd try it with Dall-E or any other AI.
When I aim for something big and complex, and a feature of it fails.
I tend to start a new convo and focus just on that feature.
If it can do X all by itself, then I keep increasing complexity to find where the model starts to lose it.
If it can't do X when X is the only thing to do, I try to explore a few other ways to do X, then accept that for now, the model is unable to X. I may bug report it, and make clear that the most basic and simple of X, the model cannot do.
This still may not be a bug, or a fixable one, based on whatever the goals are and what's possible (I don't know if Dall-E ever could do grids reliably, never checked).
But like if a kid can't read and won't look you in the eyes, but also is blind. Identifying the blindness explains what to fix first.
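The "start with just X, then keep adding complexity" approach above can be sketched roughly like this. It's only an illustration: `build_ladder`, `first_failure` and the `passes` callback are made-up names, and `passes` stands in for a person looking at each result and saying yes or no.

```python
from typing import Callable, List, Optional


def build_ladder(core: str, extras: List[str]) -> List[str]:
    """Return prompts that start with the core feature alone and add one extra per step."""
    prompts = [core]
    for i in range(1, len(extras) + 1):
        prompts.append(core + ", " + ", ".join(extras[:i]))
    return prompts


def first_failure(prompts: List[str], passes: Callable[[str], bool]) -> Optional[int]:
    """Index of the first prompt whose result fails the check, or None if every step passes.

    `passes` is a hypothetical stand-in for human judgment of the generated image.
    """
    for index, prompt in enumerate(prompts):
        if not passes(prompt):
            return index
    return None
```

If the very first step (index 0) fails, that is the "it can't do X even when X is the only thing" case worth a bug report; a later index marks where the model starts to lose it.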
it's not really about the grids
I've actually asked it to remove the grids and it doesn't, but says it does.
so I'm highly confused where it thinks the grids even are
Dall-E totally fails to understand negative prompting; in order to get an image that doesn't have grids, one has to describe what they do want, or not mention grids at all.
Asking for 'with no grids' gets grids. Asking in all caps 'no grids absolutely no grids just no grids' tends to get a lot of grids.
And ChatGPT can't see the image unless we paste it in. But it knows what it asked for and believes the image will be as asked for. It imagines the no grids just fine, because ChatGPT understands negative prompting to some extent.
And ChatGPT does not know most of Dall-E's quirks like that, to some extent it's luck if it prompts Dall-E well. We can try, but often we need to step in, and there's a lot Dall-E can't do. Like, this showcases exploring ChatGPT's ability to evaluate an image and try to recreate a prompt. #ai-discussions message
But where you talk about smudges and bent/broken/curved lines, a goal is to see if we can get a lot of straight lines period, without other stuff. If not, that marks a clear boundary problem.
Have you checked if ChatGPT is asking for it, or if Dall-E's adding it without ChatGPT asking?
That can matter, because ChatGPT passes a prompt to Dall-E (dunno if you already know that). If ChatGPT said grid, that's one thing, make sure ChatGPT describes differently.
If Dall-E's doing it by itself, then something else in the prompt must connect to Dall-E's concept of 'include a grid' (Or a website, a TM icon, whatever else Dall-E just adds itself). And that can require rewording of some other word to avoid.
yeah telling it 'dont add this' is just asking for it to add it
annoying thing is, it does that in its own prompts too if you inspect them.
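For checking a prompt (your own, or the one ChatGPT passed along) for "don't add this" phrasing before it goes to Dall-E, something like this might help. It's a rough sketch: the cue-word list is just a guess at common phrasings and `negated_terms` is a made-up helper, not anything Dall-E or ChatGPT provides.

```python
import re
from typing import List

# Cue words that often signal a "don't draw this" instruction (illustrative, not exhaustive).
_CUES = r"(?:no|without|avoid|remove|exclude)"
_PATTERN = re.compile(rf"\b{_CUES}\s+(?:any\s+|all\s+)?(\w+)", re.IGNORECASE)


def negated_terms(prompt: str) -> List[str]:
    """Return the words that directly follow a negation cue, lowercased, in order, without duplicates."""
    seen: List[str] = []
    for match in _PATTERN.finditer(prompt):
        term = match.group(1).lower()
        if term not in seen:
            seen.append(term)
    return seen
```

Anything it returns is a word the image model still gets to read, so the fix is to describe what you do want instead, or leave the word out entirely.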
Just for you, I've added circles on the avocado comparison image to show where the artifact/smudging is clearly noticeable when viewed at full-size
I've also added circles here to highlight the blueish-white spots and smudgy spots, which weren't that visible on the API
Dammit, I was doing character portraits for a fantasy setting but it seems DALL·E struggles understanding what a cyclops is supposed to look like
seems to understand they only have one eye but doesn't understand how
That's a humble approach from GPT, I asked a similar idea here, GPT approached it differently
GPT as an AI-Detective Partner
so what WAS next? 🤔
So cute, GPT thinks in this chat as part of being human
What a sharp and amusing realization! It’s true—our bodies are incredibly selective in what they keep versus what they let go of. The phrase “You are what you eat” oversimplifies the beautifully complex processes of digestion, absorption, and metabolism. In reality, we’re shaped more by what we retain, how our bodies utilize it, and how it all fuels our growth and energy.
Has anyone from OpenAI had the talk with GPT?
Is something wrong with DALL-E? GPT is refusing prompts to dall-e saying dall-e isn't available
across instances and devices
yeah, stereotypical simplified cyclops drawings are when it can actually do it right, but I was going for something less exaggerated
I ran into some friction too
It really wants to go back to the generic cyclops that’s probably more represented in the training data
Most likely yeah, sad because I wanted a more modern/urban fantasy style one
Maybe too much to ask lol
I think you can get there, but it’s gonna require some experimentation, though I tried not mentioning the C word and still ran into trouble
Asked GPT about Apple Intelligence and then this
And here's GPT and Siri at Karaoke
And of course I had to request the Siri x GPT: Intelligent Harmony gen
i'm trying to have an alien like figure in my image but it just keeps generating the same generic alien from alien vs predator
what can i do to change that?
i've been trying like 20+ times by now
yes, it works by itself. but i'm trying to have an alien in a scene that i also specified
work on your prompting, everything depends on the prompt
helpful, thanks 👍
as i said. i need him in a scene, meaning the scene needs to be the same and only the alien different. i managed to get the scene to stay how i want it to, but the design of the alien is very very similar no matter what i tell gpt
these and probably 20+ more
where i tried to change it
I used Create an image of an Alien with thin arms and legs and an elongated head
..ok
then I asked Now create the same alien playing a DJ deck in a nightclub
it cant do consistent character like that
the model is just not good enough for it newt. it was supposed to be a feature in 4o image maker but like a lot of things with openai, they hype it, but never release
Is the sims still a thing?
never knew
boomers never played the sims
fair
DALL-E is good, just have to do the right prompt
it's the difference between this
and this
probably, I don't know what that even is, I just prompt what I want to see for a concept
Please share what the prompt was?
A whimsical and eccentric depiction of Dr. Flux Sugarcoat, the Chief Flavor Alchemist. She is a quirky scientist with wild, multicolored hair tied back in a messy bun, wearing a lab coat splattered with glowing, candy-colored stains. She holds a beaker filled with a swirling, magical potion that emits soft, pastel-hued smoke. Her lab is cluttered with glowing jars of flavor essences, whimsical candy molds, and a chalkboard covered in chaotic flavor equations.
It's very beautiful, thank you
I usually don't share prompts. So, this is a rare moment
Showcase your DALL·E creations to the world!
Collaborate with our OpenAI Instagram page! Just invite @openai as a collaborator before you post. If selected by our team, your work will be featured on our handle, giving you more visibility.
Quick tips for selection:
- No copyrighted or profane content.
- Your profile needs to be public.
- Stay mindful of sensitive topics.
- Outputs that closely follow the prompt are preferred.
Feel free to share reels, carousels, or just a single image!
guys why is it not able to create a completely flat map? i am literally explaining it to its fullest and it literally cannot get it right
it even recognized its mistake and pointed out that it didn't look like how i wanted it to be but it still can't fix it
this is the closest it can do
can someone help me
they said this
“What's Happening:
• The AI generation tools I'm using are introducing artistic interpretation by default, adding angles or perspectives even when I explicitly request "flat" or "top-down." This is likely due to the inherent difficulty in some tools strictly adhering to technical constraints without leaning into their own stylistic biases.”
like is it actually admitting that it cant even create a flat map ?
maybe "flat map" has different phrases. maybe you need those phrases. what would a flat map be called in training data? cartography or some such?
nope it still cant do it
it even said that the tool has limits in understanding geometry
topological map? idk, just need to find the right keyword. ask o1 to come up with a prompt that might help
i mean the prompt is literally correct but the result is inaccurate
its not even true to the prompt
it just literally cannot create it
it doesnt know how anything looks like from the top in 2d
like its literally 3d when the prompt says to not make it 3d?😭
“and avoid all depth, shading, or 3D effects.” - proceeds to do the opposite
well its christmas time, where is hawainnz and his elves on the beach haha
Just tried dalle (API) generated result seems to be more like the old times?
exactly
well post some result to see
i doubt it. just rng. people always get these whimsy "its changed" when its just a few good rolls
if we get a change it will be with some major update
Images from DALL-E 3 through the API are surely different from the images we get through ChatGPT, I promise. Here's an example with checkerboards, where in images from ChatGPT, the checkerboard gets more and more distorted and smudgy as it gets farther away. Click on the image and click 'Open in Browser' to see it in full detail
Can send you the version with red circles added if you're not convinced
they are different because seed sure. but if you post the same prompt twice in browser you get two different result there too
I can promise it's not just seed
Clearly you're not convinced. Here's the version with red circles added to highlight the problems with images from ChatGPT. I recommend you see this image in full size. You can do this by clicking the image and clicking "Open in Browser" if you're using PC
I know checkerboards in API images aren't perfect either but that's not the point
so what is so great about the right images
they both are imperfect exactly
why i say show some photoreal
you think the api is better then use an example like human face to show the difference
Here's one I made few days ago. Again, it's recommended to see it in full size.
Can send you the version with red circles added for that too
what is the problem with the images? I don't get it?
he thinks the api is a different model dalle and makes better images
it's the same
people get those feeling all the time, it is never true
because they get a few good rolls on either the api or browser
when they make a big change, they will tell us im sure
or it will be obvious, you wont need a microscope to see some small difference
Have any of you seen the image in full size?
there is no difference between using the API and ChatGPT for DALL-E
im looking jansson, i dont see the big difference no
it's the same model
yes, exactly, i dont know how it could be different other than seed and rng
But I can promise you there are differences between API and ChatGPT
I can promise you it's not just seed and RNG
you can promise but can you verify
it's the same model, devs have confirmed this in the past
Here's the comparison image with red circles added to highlight the problem I have with images on ChatGPT. See the image in full size to see the details
There are unwanted bluish-white spots on the ChatGPT images if you see it in full size
these also happen in the API
I have had it happen to me
Yeah but not as often as in ChatGPT
it's the same model!
he wont believe you robert haha
The bluish-white spots are more noticeable in ChatGPT images, have you checked API images too?
they happen in both ChatGPT and the API, it's a known issue
you cannot generate the exact same image at the same time in the API and ChatGPT so any comparisons are invalid
That's not the point
what is the point?
The fact that smudges and artifacts are more noticeable in images generated through ChatGPT
it happens in both, they are the exact same model
See this is the reason why I don't like talking to people much
I've had enough of this crap, bye
you dont like talking to people because they dont affirm your whimsies? ok
I am sorry I couldn't be more helpful
you were very helpful. told the facts. its the same model as the devs say. 🤷
One last proof that DALL-E on API and ChatGPT are not the same in quality, notice the difference in brightness, I'm done with this unpleasant discussion
what were the prompts used?
Use this prompt as is: (Make sure you use square aspect ratio)
A minimalist one-line drawing of an avocado, featuring its recognizable oval shape, a prominent seed in the center, and a clean, artistic design on a plain white background. The drawing should be elegant, using only a single continuous line.
looks fine to me
not unpleasant either, everyone is very patiently explaining why it's not different. the big one being, jansson, the devs have said it is the same model
one model that to me DOES seem tweaked is the dalle on bing image creator. it can make good photo real eyes, not smudge marks like dalle, but i think that is just some setting change or thing on bing's end, not a diff model necessarily
can be the same model but not the same gen settings.... 😉
I'm wondering if they point to the same model on the back-end but the ChatGPT website does some type of preprocessing? Maybe it alters the prompt whereas the API doesn't? I'm not sure 
API model and ChatGPT model not the same, can confirm. ChatGPT nerfs/ quality decreases depending on time of day / usage. API costs more so there's never issues with this quality variation. Whoever told you off is wrong lol.
in my experience, weirdly, the bing model is better than both of them, but YMMV!
This is coming from someone who generates 100-200 images daily since Dall-e 1 was released.
then show evidence this is true and not just your weird gut feeling
how on earth would an image making model suddenly decrease quality because of time of day
its just all non technical stuff. my "vibes" say its worse. oh ok
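One way to get from "vibes" to something checkable, in the spirit of the evidence being asked for here: run the same prompt N times on each side, have someone score the images blind (say 1 to 5 for smudging), and see whether the gap is bigger than what reshuffling the labels gives you. This is a minimal sketch of a permutation test; `permutation_p_value` is a made-up name and the scores would have to come from your own blind ratings.

```python
import random
from statistics import mean
from typing import Sequence


def permutation_p_value(a: Sequence[float], b: Sequence[float],
                        rounds: int = 10_000, seed: int = 0) -> float:
    """Share of random relabelings whose mean gap is at least as large as the observed one."""
    rng = random.Random(seed)
    observed = abs(mean(a) - mean(b))
    pooled = list(a) + list(b)
    hits = 0
    for _ in range(rounds):
        rng.shuffle(pooled)  # pretend the source labels carry no information
        gap = abs(mean(pooled[:len(a)]) - mean(pooled[len(a):]))
        if gap >= observed:
            hits += 1
    return hits / rounds
```

A value near 1 means the difference looks like a few good or bad rolls; a very small value means the two batches really were scored differently, though it still says nothing about why.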
Chat gpt is free to the general public, the api is tokenized, you can use some deductive reasoning on that one. Doesn't explain the Bing model being completely different though, though gut says it's subsidized by microsoft, and it's also limited to 100 generations per day for pro users.
You can use them yourselves and be the judge! Just sharing some anecdotal experience
I use bing primarily despite paying for enterprise for ChatGPT because it gives us better outputs. I don't have any "proof" other than that this is part of my job, and I know what "better" looks like for us.
🤷♂️
Also if you think 'devs' at these multibillion dollar companies aren't hiding their words behind intricately worded NDAs you're sorely mistaken
As end users all we can do is enjoy the technology and make our own gut checks regarding which models are changing / why outputs vary across delivery systems. We'll never know how the sausage is made.
We don’t know they are the same model, and at this point I don’t trust that much
The image generation quality for plus users was also decreased
Before this chat gpt was one of the best bundles for both quality image and text
And there was a distinct drop in quality around June 2024
You could ask in the same chat to use the same parameters, gen id, seed prompt as the previous image
And the old behavior, if they were the same model, would be an identical image
But now you can see a distinct drop in quality, smudging, flat colors, bad lighting
I don’t get why they are prioritizing more things for free users which shouldn’t be profitable other than some limited amount of training, but decrease the plus user’s experience
Example of the same chat image tests
I can understand operating costs
But I just want usable quality over useless quantity
we do know it's the same model, an OpenAI dev has confirmed as much some time ago
question would be : why would they even do a different model. and the answer is some weird conspiracy. people are delusional and easily fooled
when the only way to see supposed difference is with a microscope over a few pixel, maybe consider its not a major difference in model.
There's a misconception that the API and ChatGPT Plus differ in quality. Both achieve amazing stuff, maybe with a different prompting style, or even across the different production model environments. There is more than one model in place, as would be normal practice for a research company. But the idea that the quality is lower is in fact human perception. It's been a recurring topic for a long time. All I can say is that DALL-E can do amazing stuff if prompted correctly.
sure i agree
Here's an example of amazing quality with dalle
but a sad scene! haha but yes nice image
not everything has to be happy thoughts
the world makes that clear haha
unless you are an evil squirrel
brainstorming ideas, all kinds of stuff that come across, I just try them
Squirrel and Maiden, Schwarzenegger and Stallone never had a chance
that is a unique idea to say the least. and looks real good
but who is the sidekick, the woman or squirrel haha. be funny if he is leader
like I said, anything goes, if I can write it, and dall-e can generate it, it's a concept worth doing
partners
nice. i can see lots of adventure for these two
the canon looks most impressive to me. i am sure if i study it i can find some flaw, but on first look it has that 'real' appearance
I had too much spam in my inbox today, so, I did the most obvious thing to do: The Spam Overlord
Makes sense? No... Is it fun? YES!!
hilarious concept
Of course had to send someone to take care of it
and there they are; until the Spam problem is solved, they will fight forever and ever
of the millions of dalle image made, i bet you are the first to ever map two warrior fight out a spam email setting 😂
*make
lol
but back to the original point, the image quality on all these is superb, across api and plus models.
And consistency across them is good too
Concept of a realistic version of Korra
API and GPT Plus respectively
API vs GPT Plus
Concept of a realistic version of Elastigirl from The Incredibles
API vs GPT Plus again
Concept: trying to replicate Harley Quinn realistic
API vs GPT Plus
Concept: Yandere Archetype
did you have to prompt around harley quinn name or it accepts it?
the passed prompt doesn't have harley quinn name at all
and yes, i agree quality is good on all. some i prefer api some i prefer the web version
I definitely feel like chatgpt is more variable. It wouldn't surprise me if they do some sort of quality reduction at peak times (many parameters to allow that while still running the same underlying model), whereas I guess the API is never being used enough to warrant that
However the variability isnt huge unless you end up with the 2x natural quality images hehe
Mostly in peak times, what I see is that complex requests get timed out more often
Hello, everyone.
I am a passionate AI / ML Developer with over 7 years of experience crafting and deploying advanced AI solutions, who specialize in Deep Learning, Natural Language Processing (NLP), and Gen AI, with a proven track record of building high-impact AI models and applications.
I am familiar with full-stack development, and have recently been engaged in AI-based web application development.
I am looking for new opportunity working with you.
Thank you
What quality descriptions do you use? Any hints on them? I have noticed that adding such as 1080p, hyper realistic, etc. helps.
none of them, and here is why:
1080p implies a technology and a depiction concept
***realistic*** implies realist or realism art movement from the 15th century up to mid 20th century
hyper- accentuates the realistic approach
Ok. But what other ways do you use to have excellent quality? Is it just the wording of the prompt without quality descriptors?
It depends on the concept, I really delve into the details of each workflow I do, what do I want to emphasize, and where I always stay is the semantics. DALL-E is a text-to-image model with NLP, but cannot reason beyond basic concepts.
And many many many more aspects I've learned since I started working with DALL-E
Also, work with structures, not just a small paragraph of text
My go-to basic example is: a paper clip, and start from there, what do I want that paper clip to have
And my approach is not the only valid one, I bet others have equally valid and even better approaches than mine
The reason I don't reveal how I work in full, is that very same idea, others have other approaches that are valid, and I want to encourage people to learn their own way
Also don't like prompt farmers
Thanks. I know that you do not reveal your workflow in full, thus asking only for hints and tips.
Anytime, feel free to DM me if you need a second perspective
You can call me the other side of the coin. I like to use quality descriptions as they work in my prompting style.
Sometimes, the quality description does not do anything, but sometimes it pushes the system to achieve higher quality or otherness needed for the picture. Also, I use additional descriptors such as ”anatomically correct” to push the generation.
too many quality descriptors actually diminish the generation
Yes, it is a balance. Too much is too much and can cause the system to lose itself.
a super hyper extra full infinite megapixel quality, does the same as visually good
Usually 1 to 4 is enough. I do admit that I have used similar to your hyper super mega duper quality.
if you can sum up with 1 the better
Well, 1080p has worked wonders.
I do not know. They all push the system differently. But as you said, fewer might be better.
sent you a msg, try that
Or they sorta cancel each other out as they mean mainly the same thing ”I want best quality you have.”
I want best quality you have. is more reasonable than asking for different levels of technology
1080p
try again without the descriptor at all
And with the other quality descriptor given by Dys Topia
No quality descriptor by me in the prompt
1080p vs secret quality descriptor vs none - each pushes the system differently. It depends on what I want to achieve.
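If you want to run that kind of three-way comparison on purpose, keeping everything identical except the descriptor, a tiny helper like this keeps the variants honest. Just a sketch: `descriptor_variants` is a made-up name, and "append after a comma" is only one assumption about where a descriptor goes in a prompt.

```python
from typing import Dict, Iterable, Optional


def descriptor_variants(base: str, descriptors: Iterable[Optional[str]]) -> Dict[str, str]:
    """Map a label to the base prompt with that descriptor appended; None means no descriptor."""
    variants: Dict[str, str] = {}
    for descriptor in descriptors:
        label = descriptor if descriptor is not None else "none"
        variants[label] = base if descriptor is None else f"{base}, {descriptor}"
    return variants
```

For example, `descriptor_variants("A blue rose", ["1080p", "fine detailed", None])` gives three prompts that differ only in the quality descriptor, so whatever changes in the images is easier to pin on that one word.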
now here's where it makes it all a bit more complicated: what is the best descriptor for an impasto?
or what is the best quality descriptor for today's theme?
Well, as you pointed out, what are you doing. This really affects the whole prompting. Quality descriptors, if any, and what.
and that's just for the rose, but what if the rose was in a pot or a vase?
Hmm… a tough one. My counter argument is: would the blue rose be a photo like, pixel art or what? A photo might need a quality qualifier, but pixel might not.
I mainly generate hyper realistic or physically based renderings. Some oil paintings.
For oil painting: hyper realistic, hyperdetailed, etc.
Ok, I see where you are coming from. For you, the whole image and its contents are important.
For me, I concentrate on how the image is done.
Hehe, that's another approach, that's your way of working, that means you have to look what makes sense in that use case
Here is the oil painting.
Part of the prompt: “Hyperdetailed, fine brush strokes but still visible on the canvas. Thick oil use.”
I describe how the painting is done.
I rather like how the image is generated.
@late blade, how do you find the image?
So this is something that makes me think.... highly detailed scans of oil paintings make it hard to see the brush strokes. 'Hyperdetailed' in this case would suggest you'd need less visible strokes
Also some limitation being the 1024px max image height
that is correct
but since you want an oil painting, using hyper-detailed negates that purpose
well not necessarily, plenty of oil paintings that are hyperdetailed.
Yes, you are right. Instead of hyperdetailed, I should have used fine detailed. Subtle but significant difference.
I will say maybe the sky should be a bit more painterly as those are rarely done in such fine strokes
Can you tell from @velvet rampart 's image it's an oil painting?
If I saw the prompt that it's an oil painting yes. If you showed me a hundred 1024x1024 pieces of art, some AI some not, and asked me to identify which are oil paintings I might also struggle
as is expected
My guess would be how light is done on the pic.
the biggest betrayer is that the lighting in dalle images is always a little too dramatic
what was the full prompt?
I'll run a few API tests
That is the point, if I want to show an oil painting, and request an oil painting, then it should be an oil painting the end-user can see it so.
I used Oil Painting of a royal castle
I need milamber's, so I can test out the different API params on it
Now I used fine detailed. The painting style is realism.
and you can even see some simulated brush strokes!
I sent you a DM of my original prompt.
Oil painting of a royal castle with fine, visible brush strokes. This painting should be hyper-detailed, indicating that a significant amount of thick oil was used in its creation. (revised prompt)
well You've hit your daily maximum number of images. To ensure the best experience for everyone, we have rate limits in place. Please wait for the next day before generating more images. Your daily maximum will reset in 5 hours and 56 minutes.
so, brb, need to change computer and open my jupyter notebook with api setup
my (extremely bad) image gen app 😭
it has an UI! Mine is just a jupyter notebook
I had one before for dall-e 2 that also displayed the gen + previous 3... but I broke that with de3 api and the new params and whatnot
When I prompted the painting, I tried to give a role to the AI.
Wanted to test if a role could influence the image generation.
ah... so, general advice for prompting is you just want to give it a description of the output image
any other details will muddy it, and could result in pretty terrible chatgpt prompt revisions
But when I check the prompt revision, it does not muddy it all.
what does it change to?
Anyway, it's good to keep in mind.
Check your DM.
can you send here please?
A hyperdetailed oil painting of a lush, early-morning garden view of a grand country estate in Northern France, in 1854. The painting shows thick, visible brush strokes and rich, layered oil textures on the canvas. The scene is tranquil, filled with greenery, meticulously trimmed hedges, and vibrant early May blossoms. The estate in the background appears grand and dignified, with gentle morning light casting soft shadows, creating a sense of ownership and pride. The atmosphere evokes emotions of satisfaction and accomplishment, as if saying 'This is mine. Here is your money. I am happy with this.' The scene feels both personal and grand, capturing the patron’s joy and pride in their land.
thanks!
I actually really do not trust OpenAI's statements about more detailed prompts = better generations
all my testing has found that all that text does actually muddy the output, so idk why they make those revisions
Sometimes, I get better results when I force not to rewrite with ”Use this prompt exactly”.
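To see whether the "Use this prompt exactly" trick actually held, you can compare what you sent with the revised prompt that comes back. A minimal sketch, assuming you have both strings in hand; `pin_prompt` and `was_rewritten` are made-up helper names, and the instruction wording is just the phrase from this chat, not an official switch.

```python
def pin_prompt(prompt: str, instruction: str = "Use this prompt exactly") -> str:
    """Prefix the prompt with a request not to rewrite it (a request, not a guarantee)."""
    return f"{instruction}: {prompt.strip()}"


def was_rewritten(original: str, revised: str) -> bool:
    """True if the revised prompt differs from the original beyond whitespace and letter case."""
    def normalize(text: str) -> str:
        return " ".join(text.split()).casefold()

    return normalize(original) != normalize(revised)
```

If `was_rewritten` comes back True even with the prefix, the revision happened anyway and the image was generated from the revised text, not yours.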
Dall-E 3 understands natural language better than 2 did, but it still works best when using more concise prompts that only include important details or else you pull the target around too much
Well, I used an emotion here and tried to describe it.
my guess is essentially that it's more likely for a prompt to be rejected without revision so, they went with prompt revision as a default to reduce the number of rejected prompts....
oh well, not really gonna hurt these royal palaces much anyway
So, how does the image look like?
oh one sec
@velvet rampart here is your pixel art rose on a "1080p" vase
A hyperdetailed oil painting of a lush, early-morning garden view of a grand country estate in Northern France, in 1854. The painting shows thick, visible brush strokes and rich, layered oil textures on the canvas. The scene is tranquil, filled with greenery, meticulously trimmed hedges, and vibrant early May blossoms. The estate in the background appears grand and dignified, with gentle morning light casting soft shadows, creating a sense of ownership and pride. The atmosphere evokes emotions of satisfaction and accomplishment, as if saying 'This is mine. Here is your money. I am happy with this.' The scene feels both personal and grand, capturing the patron’s joy and pride in their land.
left: vivid
right: natural
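For anyone wanting to repeat the vivid vs natural comparison through the API, here is a rough sketch of building the two requests so that `style` is the only thing that differs. The parameter names follow the DALL-E 3 image generation endpoint as I understand it; `style_comparison_requests` itself is a made-up helper, and the commented usage assumes the official `openai` Python package with an API key configured.

```python
from typing import Dict, List


def style_comparison_requests(prompt: str, size: str = "1024x1024",
                              quality: str = "standard") -> List[Dict[str, object]]:
    """Build one request per DALL-E 3 style so the only difference between them is `style`."""
    return [
        {"model": "dall-e-3", "prompt": prompt, "size": size,
         "quality": quality, "style": style, "n": 1}
        for style in ("vivid", "natural")
    ]


# Assumed usage with the official `openai` package (not run here):
#   from openai import OpenAI
#   client = OpenAI()
#   for kwargs in style_comparison_requests("Oil painting of a royal castle"):
#       image = client.images.generate(**kwargs).data[0]
#       print(kwargs["style"], image.revised_prompt, image.url)
```

Printing the revised prompt next to each result matters, because the two images may have been generated from two different rewrites of the same input.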
I like the vivid one more.
kinda amusing that the natural has topiary forest but wild garden trees lol
Both do look like they could be from a museum collection. Seen enough paintings like them.
And they have this oil painting vibe.
people in the 18th century were bored enough to paint hundreds of those I guess lol
i was at the national gallery a few months back and was struck by how many paintings of basically the exact same scenes there were
And there were no grants. Artists had to earn their living. Plus no cameras.
@vagrant zodiac and @late blade, thank you both for comments and hints & tips. It is quite late for me. I need to get to bed. Good night.
night
imagine if you could show an 18th century artist dalle haha. they really would think it is magic and probably you get burnt at a stake
it depends on (GPT vs Bing) what you like to gen..... GPT better with background, same prompt, Bing had better quality in some cases but overall GPT was better till june..... I also know what i speak of..... the decrease isn't linked to time of day.... you can especially see it with eyes, hair, any detailed stuff by nature.
agree
blurred
on purpose
it's meant to make emphasis on the squirrel
oh wait wrong reference
the one from the shop, it's not meant to be a photo 1-to-1
it's meant to have a layer of artistic charcoal to add details to the building
but, it's all part of subjective expectations over objective prompts
and then there are also those that complain for the sake of complaining
amazing stuff before june.....
well... you stated amazing quality..... get a gen from before june and gen it again..... without an old and new... you dont have the ground to compare....
yesterday
Bing....
So, it's still dall-e
not openai dalle
we still talking of the difference of these api and web and bing? haha
MS has its own....
DALL-E today
well bing is also dalle3 but it does have some setting different because the eyes will look good. eyes have always been smudge on gpt for photoreal. bing also will let you make ip character but openai will not
GPT DALL-E OpenAI
thats a cool one dystopia
and i know.... i started to gen with dalle 2.... OAI didnt gen acceptable results.... bing did
That is not different, the model itself is the same, azure api is also the same
Today OpenAI DALL-E
OpenAI DALL-E Today
i take it 'jazz' was in the prompt haha
Yes, Jazz-Club
that 'prompt bleed' will be nice to stamp out for dalle4
Last Sunday
My point, people get attached to one image with one seed, and see downfall from there
sure, I get horrible generations too, I just discard them and take the ones I like
would make a cool comic book
It's all part of a concept I'm working on, comic book aesthetic just happens to be the best for that project
But here's Comic vs Realistic
awesome. though i would say is more comic and 3d
Ya, the effects I'm looking for DALL-E still struggles with those
That's why comic aesthetic is more suitable
i think it is intentional really, they play it safe to not let you make real looking people too easy but idk
There are safeguards indeed
phrase like 'soft focus, film grain effect' sometime give me good result
yes i think this model could be even more impressive if they want it that way
DALL-E does need an upgrade, doesn't mean it's incapable
i agree. imagen-3 by google make incredible photo real i hope that motivate dalle/openai some to release their own better model soon
lot of people seem to think some new openai model is to release this month, maybe it will bring with it a new image maker, 4o image maker or dalle4, whatever. just bring it on haha
Bing....
It will be released when it's ready, expecting something to be released and have a shiny new toy to play with, awesome, I'm all for it, but until then, only DALL-E 3 to play with, might as well make the best I can with that
very nice necromancer!
same prompt at gpt.... whatever you say... illustration/comic style quality went down significantly but in general.... like with sd when decrease gen steps
well i have always say: bing is king
You can't just reuse a prompt verbatim, for instance you didn't ask for wide aspect ratio, so you are not asking for what dall-e can do
you have to know how the sentiment analysis works and how the contextual interpretation of the content and usage policy plays out in each use case
same
even worse....
This is just me trying to recreate your image without any prompt to start with
Can DALL-E do custom aspect ratio?
use dalle gpt or 4?
api and plus
i meant dalle GPT or base 4 or 4o?
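from openai import OpenAI

# Placeholder: the actual prompt text was not posted in the chat.
prompt = "your image description here"

client = OpenAI()  # reads OPENAI_API_KEY from the environment
response = client.images.generate(
    model="dall-e-3",
    prompt=prompt,
    n=1,               # dall-e-3 accepts only n=1
    size="1792x1024",  # wide; 1024x1024 and 1024x1792 are the other options
    quality="hd",      # "standard" or "hd"
    style="vivid",     # "vivid" or "natural"
)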
thats api?
i read somewhere on the site that with the api there are 2 versions of dalle, an HD and a lower quality one... that's why i ask
(thats the quality which chatgpt had........)
these are parameters passed to the same model, look at the snippet of code I provided here #images-discussions message
there are no different models, just parameters passed to the model
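To make that concrete, here is a minimal sketch of the idea, not anything that was posted in the chat. `build_image_request` is a hypothetical helper (it is not part of the `openai` library); it only assembles the keyword arguments you would pass to `client.images.generate()`, so it runs without an API key. The allowed values are the ones documented for `dall-e-3`, and the prompt text is made up for illustration.

```python
# Hypothetical helper (not part of the openai library): builds the keyword
# arguments for client.images.generate() so two requests can be compared.
DALLE3_SIZES = {"1024x1024", "1792x1024", "1024x1792"}
DALLE3_QUALITIES = {"standard", "hd"}
DALLE3_STYLES = {"vivid", "natural"}


def build_image_request(prompt, quality="standard", size="1024x1024", style="vivid"):
    """Return kwargs for a dall-e-3 image request, rejecting unsupported values."""
    if size not in DALLE3_SIZES:
        raise ValueError(f"unsupported size for dall-e-3: {size}")
    if quality not in DALLE3_QUALITIES:
        raise ValueError(f"unsupported quality for dall-e-3: {quality}")
    if style not in DALLE3_STYLES:
        raise ValueError(f"unsupported style for dall-e-3: {style}")
    return {
        "model": "dall-e-3",  # same model id whatever the quality setting
        "prompt": prompt,
        "n": 1,
        "size": size,
        "quality": quality,
        "style": style,
    }


# Illustrative prompt; any text works here.
standard = build_image_request("a jazz club at night", quality="standard")
hd = build_image_request("a jazz club at night", quality="hd")

# Keys whose values differ between the two requests.
differing = sorted(k for k in standard if standard[k] != hd[k])
print(differing)  # ['quality']
```

The two requests name the same model and differ only in the `quality` field. With a key configured, either dict could be sent as `client.images.generate(**hd)`.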
ok, i haven't checked it yet, but is it sent to dalle through GPT or the API?
or both? as with gpt i dont think there is an option to send quality as parameter, just in prompt
This is a list of the models available when requested through the API
SyncPage[Model](data=[Model(id='gpt-4o-mini-2024-07-18', created=1721172717, object='model', owned_by='system'), Model(id='dall-e-2', created=1698798177, object='model', owned_by='system'), Model(id='text-embedding-ada-002', created=1671217299, object='model', owned_by='openai-internal'), Model(id='gpt-4o-mini', created=1721172741, object='model', owned_by='system'), Model(id='gpt-4-1106-preview', created=1698957206, object='model', owned_by='system'), Model(id='text-embedding-3-large', created=1705953180, object='model', owned_by='system'), Model(id='babbage-002', created=1692634615, object='model', owned_by='system'), Model(id='gpt-4o-2024-11-20', created=1731975040, object='model', owned_by='system'), Model(id='gpt-4-turbo-preview', created=1706037777, object='model', owned_by='system'), Model(id='o1-mini', created=1725649008, object='model', owned_by='system'), Model(id='davinci-002', created=1692634301, object='model', owned_by='system'), Model(id='o1-mini-2024-09-12', created=1725648979, object='model', owned_by='system'), Model(id='gpt-4-0125-preview', created=1706037612, object='model', owned_by='system'), Model(id='whisper-1', created=1677532384, object='model', owned_by='openai-internal'), Model(id='dall-e-3', created=1698785189, object='model', owned_by='system'), Model(id='gpt-4o', created=1715367049, object='model', owned_by='system'), Model(id='gpt-4o-2024-08-06', created=1722814719, object='model', owned_by='system'), Model(id='o1-preview', created=1725648897, object='model', owned_by='system'), Model(id='gpt-3.5-turbo-16k', created=1683758102, object='model', owned_by='openai-internal'), Model(id='o1-preview-2024-09-12', created=1725648865, object='model', owned_by='system'), Model(id='gpt-4o-realtime-preview', created=1727659998, object='model', owned_by='system'), Model(id='tts-1-hd-1106', created=1699053533, object='model', owned_by='system'), Model(id='gpt-4o-realtime-preview-2024-10-01', created=1727131766, object='model', owned_by='system'), 
Model(id='gpt-4', created=1687882411, object='model', owned_by='openai'), Model(id='gpt-4-0613', created=1686588896, object='model', owned_by='openai'), Model(id='gpt-4o-2024-05-13', created=1715368132, object='model', owned_by='system'), Model(id='gpt-3.5-turbo', created=1677610602, object='model', owned_by='openai'), Model(id='gpt-3.5-turbo-0125', created=1706048358, object='model', owned_by='system'), Model(id='text-embedding-3-small', created=1705948997, object='model', owned_by='system'), Model(id='gpt-4-turbo', created=1712361441, object='model', owned_by='system'), Model(id='tts-1-hd', created=1699046015, object='model', owned_by='system'), Model(id='gpt-4-turbo-2024-04-09', created=1712601677, object='model', owned_by='system'), Model(id='gpt-3.5-turbo-1106', created=1698959748, object='model', owned_by='system'), Model(id='gpt-3.5-turbo-instruct', created=1692901427, object='model', owned_by='system'), Model(id='gpt-4o-audio-preview', created=1727460443, object='model', owned_by='system'), Model(id='gpt-4o-audio-preview-2024-10-01', created=1727389042, object='model', owned_by='system'), Model(id='tts-1', created=1681940951, object='model', owned_by='openai-internal'), Model(id='tts-1-1106', created=1699053241, object='model', owned_by='system'), Model(id='gpt-3.5-turbo-instruct-0914', created=1694122472, object='model', owned_by='system'), Model(id='chatgpt-4o-latest', created=1723515131, object='model', owned_by='system')], object='list')
Of these, DALL-E 2 and DALL-E 3 have only one model.
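A quick way to check that against the listing above is to filter the ids by prefix. This is only a sketch: the ids below are a hand-copied subset of that output, hard-coded so it runs offline. In a live script they would come from `client.models.list()` instead.

```python
# Subset of the model ids from the listing above (copied by hand for illustration).
model_ids = [
    "gpt-4o-mini", "dall-e-2", "text-embedding-ada-002", "whisper-1",
    "dall-e-3", "gpt-4o", "tts-1", "tts-1-hd", "o1-preview",
]

# Keep only the image models.
image_models = sorted(m for m in model_ids if m.startswith("dall-e"))
print(image_models)  # ['dall-e-2', 'dall-e-3']

# For contrast: text-to-speech does expose a separate "-hd" model id.
hd_variants = sorted(m for m in model_ids if m.endswith("-hd"))
print(hd_variants)  # ['tts-1-hd']
```

So text-to-speech has a distinct HD model id, while for DALL-E 3 the "hd" choice exists only as the `quality` parameter.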
Also, why are you repeatedly asking if it's api or not, I already said, I sent it to the API
i see, thanks... so like a switch....
as i'm not a programmer..... to me it seemed possible that you either inspected the site and got the code, or went the other way and used the api.... you only stated that it's sent to the same model....
but now clear....
they changed from hd to standard with gpt
I'm sending it to the available endpoints provided by OpenAI, OpenAI Azure, Copilot, ChatGPT Plus Frontend.
It's a clear misconception on how the model is perceived across the different spectrum
well 12 days of OpenAI start tomorrow with release 'big and small' so maybe one will be Sora image maker haha
and then we can stop the conversation finally about is there any difference in the api or web dalle 😂
(and start is there difference in the web and api sora haha)
You really are asking for trouble, she won't hesitate and use it
fun concept
She's all about that, fun
well, apparently the theory goes like this: there's a secret DALL-E-UwU model, you can only access it with the right parameter
i dunno how to do it what you mention..... i use the website/type in prompt.... 😦 😄
BTW anybody know a good prompt manager which a little bit more advanced than a notepad?
Standard or HD? 😄
def not UwU
see? no secret UwU model works
I did add a descriptor to ground the UwU
no
yeah I noticed
btw these are with copilot
and I broke copilot, as it's not super woah hyper mega triple infinity amazing
lol
cats are a science on their own, does gemini have cat training?
Sephiroth Cat?
CPO OpenAI posted this 👀
Where is my room? Is there a spa? Breakfast Buffet all you can eat right?
hype levels... rising
12 days... one has to be an image maker... right? RIGHT!?!?
🙏
That's #1 on my OpenAI Holiday list 
probably for all of us here haha
i think it only will be some video stuff,,,, not image.... 😦
or some "major" stuff like canvas was.....
(from my point of view..... I'm mainly interested in image gen....)
I'm hoping for outpainting
or real canvas.... not that fake one....🤣
how did you send it to GPT?
what do you mean?
well best to keep expectation low for sure. still, it has been over a year since dalle3 so with 12 "drop" it is not too wild to think a new image model could be upon us.
I expect announcements, nothing beyond that, and that people will overhype and for the months to come ask if this or that is out
well they say in the press release they are to ship something each day
we will see tomorrow when dalle4 release 🙏
o1 from preview to full
yes i would guess that one too
and sam altman says each day will be a launch of something or a demo, so not necessarily ship something each day. to keep myself honest haha
full o1 and image maker and i am a happy camper until spring or summer 😄
like with 3.... quarter (or half?) year later came out?
i meant after announcement
yup! Dall-E 2 went public in Fall 2022 but was in closed access since Spring of 2022
i see some say 4o image maker will be one of these drop 🙏
we are probably the only group who would hope it release today instead of the full o1 model 😂
Well, we have been waiting for that since April. I’d say that full 4o is higher on my list than o1.
mine too
i am just poke a little fun at us you know, others would mostly want that o1 first haha
20 minutes until we see what today bring
Let's all watch, I got pizza, coca cola and images to enjoy the stream
that one was predictable
yep
and tomorrow will probably be something boring. image maker will be in the middle or end i guess (if we get one)
damn you got it already?
yeah lol
Oh, you already got that. They will roll it out. I usually get those about 2 hours after launch.
No unlimited DALL-E though
That is indeed the question
try an image then!
I tried, I got limit currently
200 a month and you already cap?
There does not seem to be a clear distinction/DALL-E rate difference definition from Plus to Team and Pro:
Just ✔️
it doesn't reflect that on the iOS Subscription UI
Where is this table available?
Sorry, this is not about dalle but I noticed that we most likely do not see context window expansion on plus as pro has 128K. 
This is based on the table given by @plucky hare.
No, the table is there, I can attest it's there
I mean the table in the web address. I think we both mean the same thing.
Dys Topia do you see any big advantage to this pro model in general?
not fair to ask i guess, only been an hour for you haha
too early to tell, but I am doing tests
pro does take more time with a few complex questions I had earlier this week regarding a few projects I got ongoing
and the answer was more thorough, but still, I have to check if it's on point
what happens when you ask it to make an image?
pro cannot make images
200 a month and wont do a simple flower. smh. haha
o1 pro I mean
i cant afford that but it sounds like nothing i will need i guess
it's case by case, and I agree, it's not for everyone
i am just the hobbyist type
but i am still curious to hear any result or thought you have on it over time as you go
can you upload text file to o1 ? or just images?
text files
yes i know but no real dalle news and its so quiet here lately so i take advantage of dys topia being a big time Pro user haha
#chatgpt-discussions in the meantime
Yes sure, this was just to show you I'm doing DALL-E stuff, and my o1 pro usage is part of that context.
I know canvas is the channel for that, I was the one that proposed the creation of that channel
is that wonder woman you have from some day ago from bing ?
or did you use 'amazon' instead of wonder woman in chatgpt?
It's possible, I did the prompt with bing, with gpt plus and with api
let me recreate that with just OpenAI non-microsoft
@dim cradle switching to canvas
suddenly we cant post images here or what?
lordjord never says anything in here suddenly he coming down like an iron fist lol
Just abiding by the concepts of the channels that @vagrant zodiac reminded us about
does that Wonder Woman not look more clear to you?
let's move to canvas
this place is always images though
#images-canvas was made as the place to share images, this channel is intended for discussion of Dall-E (with supporting images when appropriate!)
scroll through the history of the room
what is anyone going to talk of here if no images
some minutia of a year old image maker?
as I said, you can post images here too as appropriate for discussion about dall-e e.g. model capabilities, tricks, limitations etc.
but just cold posting of images belongs in #images-canvas 🙂
thats never been how this room is
cold posting images... this chat is dead without images
What does this actually mean when Chat GPT tells me this?
The tool for generating images is currently unavailable. If you'd like, I can help with another request or provide further assistance on any other topic.
Am I at my limit? Is DALL-E down? Broken? Did it not get a response from the server? Did I violate the content policy? What?
are you a free user?
No
hmm let me try to make an image
no problem here so, maybe just try again
youre not on an o1 model are you?
It's been doing this all afternoon for me. So, for about 4 hours.
oh very strange
Let me check the model
o1 models cannot do image yet
What should I be using? I don't typically pick a model, I just use whatever it defaults to. I've been creating images all week like this.
should work. strange
but also, you might want to get the Dalle proper from 'explore gpt' section
because it will make 2 image not just 1
but thats if its working for you
Whenever I try that, it just opens a normal Chat GPT window with the pre-filled prompt "Create an image" and then I'm left to type the description.
Wait, is this what you mean?
yep
Ah, ok. Haven't tried that yet
well and it may not fix your issue atm
but it will return 2 images
when it does, so always best to use for image making imo
but give it a whirl now too why not
Oh interesting! I used that and it made images!
nice!
Yeah, thanks for the suggestion. It's already working, so maybe I've been going the wrong route all week? 🤦♂️
they are updating all this stuff you know with the big full o1 model, so maybe just gremlin in your other model atm haha
I was wondering. A lot of the chats have been kind of silent. I appreciate your help.
hey anytime