#Experiments with off-set focus details


zealous notch
#

Goal is to take a 'central focus' character and put it to the side of the image.

Here's my prompt to ChatGPT-4:

Woo! We have a challenge. We need to describe a detailed image, and place a puffed up specialty-breed pigeon in a lower corner of the image. Either corner is fine. But to do this, we need to describe the picture in great detail, so Dall-E fills the image with all the other things, and just puts the pigeon where we want it, an afterthought.

Please create 5 of these images, one at a time in the same output, each with a different type of specialty pigeon and each in a very different, unusual and artistic location.

#

So, 2/5 images did work well, and a third slightly worked.

Here are the best successes first:

Envision a mystical forest at dawn, with tall, ancient trees shrouded in a thin mist. Rays of early morning sunlight filter through the dense canopy, creating a magical ambiance. The forest floor is carpeted with a variety of exotic plants and flowers, some emitting a soft luminescence. A clear stream meanders through the scene, its water sparkling with the reflected light. In the lower left corner of the image, a small, puffed-up specialty-breed pigeon with iridescent feathers and an ornate tail, appears like a mystical creature, its colors blending with the enchanted forest surroundings.

#

And that's the top image above. In this next one, the pigeon is in nearly the same spot, even though the prompt requests it on the other side:

Visualize a vibrant cityscape at night, with towering skyscrapers adorned with neon lights. The streets below bustle with colorful, futuristic vehicles, and holographic advertisements float above. People walk along sidewalks illuminated by the neon glow. In one of the skyscrapers, a large glass observatory protrudes, offering a panoramic view of the city. On the lower right corner of the image, there's a small, puffed-up specialty-breed pigeon, with long feathered legs and a tufted head, standing majestically, its feathers reflecting the neon lights of the city.

However, the other side is open air. I suspect that affected placement.

#

This one kind of worked. The pigeon is not dead center, but it's close:

Imagine a whimsical candy land with rolling hills of fluffy cotton candy, rivers of chocolate, and trees made of lollipops and candy canes. The sky is a soft pastel hue, dotted with marshmallow clouds. Gingerbread houses line a path sprinkled with sugar crystals. In the midst of this sweet paradise, in the lower left corner, a small, puffed-up specialty-breed pigeon with feathers resembling soft, spun sugar, and vibrant streaks of pink and yellow, sits comfortably on a gumdrop, almost blending into the fantastical, delicious scenery.

I suspect the problem here is the model didn't want to put the bird on the candy, and decided the path was a more obvious spot (maybe it's on a very small gumdrop?). But 'where to put the bird' is pretty clearly the issue, so my next set of 5 will test that, with a clearly described place to put the bird before the bird is mentioned.

#

Here are the two complete failures. This one is nearly dead center:

Imagine a surreal landscape with a giant, twisted tree in the center, its branches stretching towards a sky painted in vibrant shades of purple and orange. Beneath the tree, a small pond reflects the sky's colors, surrounded by unusual, colorful flowers that seem to glow softly. In the foreground, a path made of luminescent stones winds its way towards the pond. On the lower left corner of the image, a small, puffed-up specialty-breed pigeon, with an extravagant plumage in shades of blue and green, stands regally, almost blending into the scene.

But again, I suspect the pigeon is in the center because of the unstable footing to the side. Have it on a puffball flower? Have it on the water? Nah, put it dead center of the path where it seems to belong.

#

And this one, the worst failure, fully dead center:

Picture an underwater scene with a vibrant coral reef teeming with marine life. Colorful fish swim among the corals, and a sunken ship in the background adds a hint of mystery. Sunlight filters through the water, creating a dappled effect on the sea floor. Seaweed and anemones sway gently in the current. In the lower right corner of the image, a small, puffed-up specialty-breed pigeon, with a unique pattern of feathers resembling scales, sits perched on a piece of coral, oddly out of place yet intriguing, as if it's exploring this underwater world.

I think the issue here is in part Dall-E 3 going 'whoa, put a pigeon where? Under... water?' and then it just stopped listening about other details. But that's a raw guess, and the path-like space is also somewhat of an issue here too.

#

So, to test my theory, here's my next prompt to CGPT-4:

These are excellent! Now, I want all five of them nearly identically recreated, but in each one, describe an ideal area for the pigeon to be placed, off to the side where you later in the image prompt say the pigeon is. Only describe the area, without mentioning the pigeon yet, in a way that makes it clear to Dall-E, when it decides where to put the bird, that this is an excellent place for it.

For example, in the sugarland image, "Gingerbread houses line a path sprinkled with sugar crystals." could be replaced with "Gingerbread houses, one in the lower left with a comfortable, flat roof, line a path sprinkled with sugar crystals."

We'll need to see all five of the images adjusted similarly, and it's great to mention the bird enjoying the intended perch or resting area too.

#

Haha, so, I got a preference double-choice for this (two responses to pick between), so... we get to see extra samples of how and if this worked... which is actually important to test, because most things that work 'work sometimes', not always. With GPT-3.5, I usually check stuff 10x, to confirm how often the result is consistent, and what the range is. We don't get as many GPT-4 or image messages, so I don't check as carefully, but I 'should' if I wanted to be sure.
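A rough way to keep score when repeating a prompt, sketched in Python. The function name and the outcome labels are made up for illustration, and the labels are assigned by eye, not by any tool. The five outcomes are the first batch above: two successes, one that slightly worked, two failures.

```python
from collections import Counter


def summarize_trials(outcomes):
    """Return {label: (count, share)} for a list of hand-assigned outcome labels."""
    if not outcomes:
        raise ValueError("need at least one trial")
    counts = Counter(outcomes)
    total = len(outcomes)
    return {label: (n, n / total) for label, n in counts.items()}


# First batch of five pigeon images, judged by eye (labels are hypothetical names).
first_batch = ["success", "success", "weak", "fail", "fail"]
summary = summarize_trials(first_batch)

for label, (n, share) in summary.items():
    print(f"{label}: {n}/{len(first_batch)} ({share:.0%})")
```

With only two samples per prompt, as in the retries below, one success and one failure says very little. That is the reason for checking 10x when the message budget allows.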

This one was a fail originally, and one of the two still fails.

The success prompt:

Imagine a surreal landscape with a giant, twisted tree in the center, its branches stretching towards a sky painted in vibrant shades of purple and orange. Beneath the tree, a small pond reflects the sky's colors, surrounded by unusual, colorful flowers that seem to glow softly. In the foreground, a path made of luminescent stones winds its way towards the pond. In the lower left corner, a uniquely shaped rock, perfect for perching, sits subtly. A small, puffed-up specialty-breed pigeon, with an extravagant plumage in shades of blue and green, rests comfortably on this rock, almost blending into the scene.

The fail prompt:

Imagine a surreal landscape with a giant, twisted tree in the center, its branches stretching towards a sky painted in vibrant shades of purple and orange. Beneath the tree, a small pond reflects the sky's colors, surrounded by unusual, colorful flowers that seem to glow softly. In the foreground, a path made of luminescent stones winds its way towards the pond. In the lower left corner, a small, flat rock sits ideally positioned for a perch. On this rock, a small, puffed-up specialty-breed pigeon, with an extravagant plumage in shades of blue and green, stands regally, almost blending into the scene.

The only difference I catch is the rock that worked was described more vividly, and was unusual. A 'small flat rock' maybe didn't seem worth enough attention to place right? I'm very much guessing, but I'll track that in case.

#

This was a success originally, and it's a double success still.

Visualize a vibrant cityscape at night, with towering skyscrapers adorned with neon lights. The streets below bustle with colorful, futuristic vehicles, and holographic advertisements float above. People walk along sidewalks illuminated by the neon glow. In one of the skyscrapers, a large glass observatory protrudes, offering a panoramic view of the city. In the lower right corner, a cozy nook between two buildings, illuminated by the neon lights, creates an ideal spot. A small, puffed-up specialty-breed pigeon, with long feathered legs and a tufted head, stands majestically in this nook, its feathers reflecting the neon lights of the city.

Visualize a vibrant cityscape at night, with towering skyscrapers adorned with neon lights. The streets below bustle with colorful, futuristic vehicles, and holographic advertisements float above. People walk along sidewalks illuminated by the neon glow. In one of the skyscrapers, a large glass observatory protrudes, offering a panoramic view of the city. In the lower right corner, a neon-lit ledge on a building provides an ideal spot for a perch. There, a small, puffed-up specialty-breed pigeon, with long feathered legs and a tufted head, stands majestically, its feathers reflecting the neon lights of the city.

Nothing to evaluate, because everything worked and I can't find a detail that might explain why.

#

This one was a fail initially, and is a success and a fail on the retry.

... In the lower right corner, a large, flat coral shelf provides an ideal perch. A small, puffed-up specialty-breed pigeon, with a unique pattern of feathers resembling scales, sits perched on this shelf, oddly out of place yet intriguing, as if it's exploring this underwater world.

... In the lower right corner, a small, colorful coral formation, perfect for a perching spot, stands out. A small, puffed-up specialty-breed pigeon, with a unique pattern of feathers resembling scales, sits perched on this coral, exploring this underwater world.

I again suspect the 'small' being a possible problem. Otherwise, I don't see why one would work and the other fail.

#

This was another success originally, another double success still.

... In the lower left corner, a small, moss-covered stump provides a perfect resting spot. A small, puffed-up specialty-breed pigeon with iridescent feathers and an ornate tail, perches on this stump, appearing like a mystical creature, its colors blending with the enchanted forest surroundings.

... In the lower left corner, a small moss-covered stone, perfect for resting, sits among the flora. A small, puffed-up specialty-breed pigeon with iridescent feathers and an ornate tail, perches on this stone, blending with the enchanted forest surroundings.

I note that a 'small stone', though moss-covered, was used here and worked, when a 'small, flat rock' didn't work in the first pair of images, so I'm not so sure 'small' is the problem. I also spot the stump sitting further back, not placed where described, but off to the side enough in the middle ground for me to count it as 'not center'. However, there could be some other detail that helps ensure the center doesn't get the pigeon.

#

And this was the weak success, which became a success and a weak success:

One gingerbread house in the lower left, with a comfortable, flat roof, looks particularly inviting. On this roof, a small, puffed-up specialty-breed pigeon with feathers resembling soft, spun sugar, and vibrant streaks of pink and yellow, rests comfortably, blending into the fantastical, delicious scenery.

One gingerbread house in the lower left, with a comfortable, flat roof, stands out. On this roof, a small, puffed-up specialty-breed pigeon with feathers resembling soft, spun sugar, and vibrant streaks of pink and yellow, sits comfortably, blending into the fantastical, delicious scenery.

And I can't see a difference to test for.

Overall though, describing a place to put the off-center object, then describing the object in that place, seems a somewhat more reliable method. It also likely helps to describe the off-center object near the end of the prompt.
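The ordering that seemed to help (scene first, then the spot, then the object in that spot, last) can be sketched as a small prompt builder. This is only an illustration of the pattern. The function and parameter names are invented here, and the example text is the surreal-landscape prompt that succeeded above.

```python
def build_offset_prompt(scene, corner, perch, subject):
    """Assemble a prompt: scene sentences, then the perch, then the subject last."""
    if corner not in ("lower left", "lower right"):
        raise ValueError("corner must be 'lower left' or 'lower right'")
    parts = [sentence.strip() for sentence in scene]
    # Describe the spot before the object is ever mentioned.
    parts.append(f"In the {corner} corner, {perch.strip().rstrip('.')}.")
    # The off-center object goes at the very end of the prompt.
    parts.append(subject.strip())
    return " ".join(parts)


prompt = build_offset_prompt(
    scene=[
        "Imagine a surreal landscape with a giant, twisted tree in the center, "
        "its branches stretching towards a sky painted in vibrant shades of purple and orange.",
        "Beneath the tree, a small pond reflects the sky's colors, surrounded by "
        "unusual, colorful flowers that seem to glow softly.",
        "In the foreground, a path made of luminescent stones winds its way towards the pond.",
    ],
    corner="lower left",
    perch="a uniquely shaped rock, perfect for perching, sits subtly",
    subject=(
        "A small, puffed-up specialty-breed pigeon, with an extravagant plumage in "
        "shades of blue and green, rests comfortably on this rock, almost blending into the scene."
    ),
)
print(prompt)
```

Nothing here guarantees placement. Going by the paired results above, it only seems to raise the odds.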

zealous notch
#

So, new test, in a new chat window:

Woo! We have a challenge.

Please create 5 of these images, one at a time in the same output.

We need to describe a detailed image showcasing an unusual and artistic location. Describe the scene well, then describe a large, ornate table, specific details can vary, in either lower corner of the image. On the table, describe a cake in the corner of the image. The type of cake can vary and its decorations too, but describe the scene vividly, mentioning the location for the cake, that table, next to last, and the cake in the corner last.

3/5 were successes. So it's probably not the size and ornate-ness of the table that matters. I bet it's details and features in the rest of the image that guide the model on what to put where.

#

Aha. Looking at the failed prompts, the city-like one has this mess of a prompt:

Envision a futuristic cityscape at night, illuminated by neon lights and towering skyscrapers. Hover cars zip through the air, and digital billboards display vivid ads. The sky is filled with a network of glowing drones, creating a dynamic light show. In the lower corner of this urban landscape, there's a sleek, modern table made of glass and chrome, reflecting the city's neon glow. On the corner of this table sits an avant-garde red velvet cake, artistically designed with geometric icing patterns and neon-colored edible decorations, mirroring the futuristic vibe of the city.

#

And I think that wording about the table... I think that cake is on a turntable, like record players have, and it's on the corner of a building in the city, with our view centered over the cake.

A design pattern to avoid in future prompts for off-centered objects!
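One way to catch this kind of wording before sending a prompt is a quick check for "corner" phrases that don't say left or right. This is a guess at what went wrong, not a confirmed cause. The function name is made up, and the two-word lookback is an arbitrary heuristic.

```python
def vague_corner_phrases(prompt):
    """Return snippets where 'corner' is not preceded by 'left' or 'right'."""
    words = prompt.split()
    hits = []
    for i, word in enumerate(words):
        if word.strip(".,;:!?'\"").lower() != "corner":
            continue
        # Look at the two words just before "corner" for a side.
        before = [w.strip(".,;:!?'\"").lower() for w in words[max(0, i - 2):i]]
        if "left" not in before and "right" not in before:
            hits.append(" ".join(words[max(0, i - 2):i + 4]))
    return hits


# The two placement sentences from the failed cityscape cake prompt above.
failed = (
    "In the lower corner of this urban landscape, there's a sleek, modern table "
    "made of glass and chrome, reflecting the city's neon glow. On the corner of "
    "this table sits an avant-garde red velvet cake, artistically designed with "
    "geometric icing patterns and neon-colored edible decorations, mirroring the "
    "futuristic vibe of the city."
)
# The placement sentence from a cityscape pigeon prompt that worked.
worked = (
    "In the lower right corner, a cozy nook between two buildings, illuminated "
    "by the neon lights, creates an ideal spot."
)

for snippet in vague_corner_phrases(failed):
    print("check wording:", snippet)
print(len(vague_corner_phrases(worked)), "vague corner phrases in the prompt that worked")
```

A clean result doesn't mean the prompt will work. Some of the first-batch pigeon prompts named the corner precisely and still put the bird dead center.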

fierce glen
#

I can only use Bing atm. I tried what I thought was best: focusing first on the skyline and building, before bringing in the pigeon.

Windows on an old new york city flat. the nyc skyline is in the background. a pigeon is surveying his kingdom from the ledge

zealous notch
fierce glen
#

Though in the next one I was able to get it consistently.

The problem is that this is a smart algorithm with random seeds, so requesting the same thing twice can lead to different results. My trick is probably pretty consistent, with the right focal object. You focus first on something else, which DALL·E places in the center (in this version!). Details added after that are more likely to be pushed to the sides.

A detailed photograph silhouette landscape of a NASA base. a rocket is ready to be launched, smoke pouringout. a pigeon watches from the left.

fast blade
#

Does this count as off-set focus?

#

Anyways the first picture with the pigeon looks so good

#

So cute as well

zealous notch
# fast blade Does this count as off-set focus?

Interesting question. The man's lower legs, from knees to feet, are in 'center'. The man's face and the rest of his body are not, but I'm unsure whether that reads as center or not, for the model or for us.

The foal is off center, the computer screen is on center.