#How to make DALL-E 3 use text in images more consistently?


hybrid plaza

I'm using GPT-4 to come up with prompts for somewhat complicated images to generate with DALL-E. I like that it can generate large amounts of text within the image itself (I know it makes frequent typos, but that doesn't really matter to me).

The problem is that the more complicated the image prompt becomes, the less likely it is to even attempt to generate words in it at all. I would like to be able to guarantee that I'll have words in my image without having to retry or change the desired output.

Does anybody have any prompt engineering tips to convince DALL-E to generate images with words?

Here's an example of a simple prompt you can put in ChatGPT with DALL-E enabled that sometimes fails in this manner:
"Come up with a complex scene for a humorous image with words in it, then draw it."

sturdy harbor
# hybrid plaza I'm using GPT4 to come up with prompts for somewhat complicated images to genera...

It's useful that you shared the sometimes failing prompt.

Do you know how often it fails?

I modified it to make it easier to test, and hopefully more likely to succeed, but I don't know how rare failure was before.

But here's the 'small modification':

We need 5 images, made one at a time in the same output, with wide riffs on this theme: Come up with a complex scene for a humorous image with words in it, then draw it. Be sure to tell Dall-E in quotes the text to include in the image.

4 of 5 succeeded, and the prompt that failed was:

A humorous scene in a fantasy forest. A group of animals, including a deer wearing a monocle, a bear in a bowler hat, and a fox with a waistcoat, are having a sophisticated tea party. They are sitting around a large mushroom table with tiny cups and teapots. A squirrel is serving as the waiter, wearing a small apron and offering pastries. Above them, a banner humorously reads: "Critter's High Tea: Elegance in the Wilderness.". The background features an enchanted forest with magical plants and fairy lights.
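For checking generated prompts like this one, here is a minimal Python sketch that pulls out the first double-quoted span and reports how far into the prompt it starts. The function name `quoted_text_position` is made up for illustration and is not part of any DALL-E or ChatGPT tooling. It assumes straight double quotes, and it only inspects the prompt, so it says nothing about whether the rendered image actually contains the words.

```python
import re


def quoted_text_position(prompt: str):
    """Return (quoted_text, relative_position) for the first double-quoted span.

    relative_position is 0.0 when the quote opens at the very start of the
    prompt and approaches 1.0 as it moves toward the end.
    Returns None if the prompt contains no double-quoted text.
    """
    match = re.search(r'"([^"]+)"', prompt)
    if match is None:
        return None
    return match.group(1), match.start() / len(prompt)


# Hypothetical usage with a shortened version of a scene prompt.
example = 'A tea party in a forest. Above them, a banner reads: "High Tea".'
print(quoted_text_position(example))
```

In the failed prompt above, the banner text only shows up after several sentences of scene description, so a check like this would report a position well past the midpoint.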

hybrid plaza
sturdy harbor
# hybrid plaza 4/5 times is about how often it succeeds no matter what prompt engineering techn...

Okay. A possibly simple workaround, then: how does this prompt rewording work for you?

We need 5 images, made one at a time in the same output, with wide riffs on this theme: Come up with a complex scene for a humorous image with words in it. When you make the prompt, describe the text in quotes first.

This also appears to improve the spelling, which you mentioned you don't really care about.
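If you are assembling image prompts yourself rather than letting ChatGPT write them, the same "quoted text first" idea can be sketched in Python. The helper name `text_first_prompt` and the lead sentence "The image includes the text" are my own assumptions, not wording tested in this thread, so treat this as a starting point rather than a known-good template.

```python
def text_first_prompt(text: str, scene: str) -> str:
    """Build an image prompt that states the in-image text, in quotes, before the scene.

    Surrounding whitespace and any double quotes already wrapped around
    `text` are stripped so the result has exactly one pair of quotes.
    """
    text = text.strip().strip('"')
    scene = scene.strip()
    return f'The image includes the text "{text}". {scene}'


# Hypothetical usage, reusing the banner text from the failed prompt.
print(text_first_prompt(
    "Critter's High Tea: Elegance in the Wilderness",
    "A humorous scene in a fantasy forest where animals hold a tea party.",
))
```

Putting the quoted text in the first sentence mirrors the rewording above; whether it beats the roughly 4 out of 5 success rate would still need to be measured over a batch of generations.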
