#AI images and text

wintry herald

Why is AI terrible at putting text in images?

tame kernelBOT

Hi @wintry herald. I'm kapa.ai, your friendly support bot. I will try to answer your question by searching through the knowledge base. :wave:

AI image generators such as OpenAI's DALL-E, including when it is used through ChatGPT, are unreliable at rendering text inside images. The limitation comes mainly from how these models are trained and how their architecture represents the prompt.

DALL-E, for instance, is designed to generate creative imagery from textual prompts, but any text it renders inside those images is unreliable. When the model encounters text in a prompt, it sees tokens that represent whole words (or word fragments) rather than individual letters, and it must map those tokens to letter shapes in the image. That mapping is error-prone, so rendered words often have missing or extra characters. OpenAI has acknowledged this limitation and mentioned that future work may explore conditioning on character-level language models to improve this behavior [source](https://community.openai.com/t/703020).
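As a rough illustration of the token-versus-letter gap described above, here is a minimal Python sketch. It is a toy, not DALL-E's actual tokenizer: the vocabulary `TOY_VOCAB`, its IDs, and the helper `toy_tokenize` are all hypothetical and exist only to show that a prompt arrives as a handful of word-level IDs with no explicit spelling attached.

```python
# Toy illustration only: this vocabulary and its IDs are made up
# and are not taken from any real model's tokenizer.
TOY_VOCAB = {"grand": 104, "open": 101, "ing": 102, "soon": 103, " ": 105}


def toy_tokenize(text, vocab=TOY_VOCAB):
    """Greedy longest-match tokenization over a hypothetical vocabulary."""
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest vocabulary entry that matches at position i.
        for j in range(len(text), i, -1):
            piece = text[i:j]
            if piece in vocab:
                tokens.append(vocab[piece])
                i = j
                break
        else:
            raise ValueError(f"no token covers {text[i]!r}")
    return tokens


prompt = "grand opening soon"
token_ids = toy_tokenize(prompt)       # what a word-level model "sees"
char_ids = [ord(c) for c in prompt]    # what a character-level model would see

print(len(token_ids), "tokens vs", len(char_ids), "characters")
```

In this sketch the 18-character prompt collapses into 6 opaque IDs, so nothing in the input tells the model that "opening" contains the letters o-p-e-n-i-n-g in that order. A character-level representation keeps one ID per letter, which is the kind of signal the character-level conditioning mentioned above would supply.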

Similarly, ChatGPT can sometimes generate images with correctly spelled text, but this should not be expected. According to the linked thread, the chance of correct spelling is highest when you leave the rest of the image up to the model rather than specifying it tightly [source](https://community.openai.com/t/647000).

In summary, generating legible, correctly spelled text within images is a task that current models still struggle with, mainly because they work with word-level tokens rather than individual letters. This is an active area of research, and improvements are expected as the technology evolves.

Note: you can ask me a follow-up question by @-mentioning me again :speech_balloon: