#Let the user specify his priorities.


oblique summitBOT
#

Reported by @manic perch

Bug Report: Let the user specify his priorities.
`Steps to Reproduce`

This resulted from a discussion with ChatGPT on how Dall-E interprets a prompt and prioritises its elements. Its choices are not always what the user expects, hence the need for an explicit way to specify which elements are important, or to override the cultural biases of the drawing AI.

`Expected Result`

In the first instance, I needed six Aztec-style pyramids.
In another example, I wanted a building with a convex curved roof that is taller than it is wide, like a Gothic arch.

`Actual Result`

In the first example, even when provided with rough top, side and perspective views, Dall-E stubbornly drew a Maya pyramid, even when I specified "Aztec-style pyramid like the Pyramid of the Sun in Teotihuacan". I had to specify an "Earth mound" to get what I wanted.

In the second instance, it kept drawing concave and flattened roofs, as if to do the opposite of what I asked. Only at the 12th try or so did it start to make high roofs, but they were still concave. I had to finish manually.

`Environment`

Windows 11, Brave browser


manic perch
# oblique summit

After text descriptions failed to produce the requested geometry, I entered this image (one of the previous ones, modified by hand). It continued to flatten the roofs like in a Disney movie.
This may look like a detail, but it fails completely when the purpose is to illustrate a tale with detailed descriptions of these houses, how they are built, and why they evolved that way from the materials available on site.
"Not Disney" should be enough to remove all the stereotypes of this kind.

#

This image was the first result of the following prompt:
"In a landscape similar to the previous ones, but arranged differently, appears Likpatitlan, the strange capital of the priests of the Great Continent. There are eight Aztec-style pyramids, gently sloping and stepped, each of a different size. They are light grey-green, made of stones covered in lichen. They are aligned along the camera’s line of sight. Each has a small building at its summit, made of light ochre stones of varying sizes and shapes. They are aligned along the camera’s line of sight. To the left lies a vast grassy esplanade, with a long path paved with light ochre stones, and small buildings around the perimeter."
Despite the accurate descriptions "gentle slope" and "Aztec style", and further refinements like "5° slope", Dall-E stubbornly kept drawing Maya pyramids. ChatGPT explained that Dall-E wrongly prioritises common cultural models over even accurate or insistent prompts. Hence the title of this bug report and request.

manic perch
#

(Image from above deleted by mistake.)

#

I finally got my scene after much fighting and cunning against Dall-E. But I wanted another view, from a different viewpoint. The prompt started simple, but grew complicated as I tried to circumvent all the possible hallucinations:
"This image shows Likpatitlan. It is not a mesoamerican site, but a fictional site on another planet. It is a single continuous site composed of: some houses in the foreground, a row of mounds, a stone runway, and in the back a row of earth mounds, each different. All these elements form one rigid structure with fixed relative positions and perspective. Your work is to rotate the entire Likpatitlan structure by 80 degrees clockwise, (or change viewpoint), in a single piece, so that the runway is in the line of view. Then move the whole thing much further away, near the horizon. Do this while preserving the exact geometry, distances, and alignment between all elements of Likpatitlan. Do not redesign or reinterpret any part. do not change number, shape, or spacing do not move elements independently preserve perspective consistency no artistic reinterpretation The light source is still at left. Photorealistic rendering, vivid colors 2048px x1536px “Treat this as a 3D scene being rotated, not as an illustration to redraw.

The result is the second image: Dall-E ignored the request to move all elements together, and moved or rotated each one differently, at random.
"Failure to preserve rigid spatial structure under requested global transformation" (as ChatGPT put it).
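
To make the "rigid structure" requirement concrete, here is a minimal Python sketch of what I mean by rotating the whole site in a single piece. It says nothing about how Dall-E works internally. The coordinates and the function names (`rotate_about_vertical`, `is_rigid`) are made up for illustration, and the layout is a hypothetical ground plan, not the real scene. The idea is simply that one rotation applied to every element leaves all the distances between elements unchanged, while moving one element on its own does not.

```python
import math
from itertools import combinations


def rotate_about_vertical(points, degrees_clockwise):
    """Rotate every (x, y) ground-plan point by the same angle about the origin."""
    # Clockwise rotation is a negative angle in the usual mathematical convention.
    theta = -math.radians(degrees_clockwise)
    c, s = math.cos(theta), math.sin(theta)
    return [(c * x - s * y, s * x + c * y) for x, y in points]


def pairwise_distances(points):
    """Distances between every pair of points, in a fixed order."""
    return [math.dist(a, b) for a, b in combinations(points, 2)]


def is_rigid(before, after, tol=1e-9):
    """True if all distances between elements are the same before and after."""
    return all(
        abs(d0 - d1) <= tol
        for d0, d1 in zip(pairwise_distances(before), pairwise_distances(after))
    )


# Hypothetical ground plan: three mounds in a row and one house in front.
site = [(0.0, 0.0), (10.0, 0.0), (20.0, 0.0), (5.0, -8.0)]

# What I asked for: the whole site rotated by 80 degrees clockwise, in one piece.
rotated = rotate_about_vertical(site, 80)

# What I got, roughly: one element moved independently of the others.
broken = list(rotated)
broken[3] = (broken[3][0] + 3.0, broken[3][1])

print(is_rigid(site, rotated))
print(is_rigid(site, broken))
```

The first check passes and the second fails, which is the difference between "rotate the scene" and "redraw the scene" that the prompt was trying to express.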