#Game Sprite Generation

hazy bobcat
#

I am trying my darndest to create video game sprites, but I cannot get DALL·E (through ChatGPT with GPT-4o) to generate what I'm looking for. I've reset the chat multiple times and tried many different prompts. I'll have it generate a high-level view of a game, but when I then ask it to create the sprites, I get something completely different. Here is just one of the many prompts:
"32 x 32 pixel hungry hippo sprite sheet (photorealistic illustration avatar) (gamedev) (Unreal Engine 5) (white background) (game ready assets)".
I've also attached sprite images to give it an example and used this prompt:
Draw a sprite sheet, similar to what I've attached, with a solid background, consisting of 16 frames in a 4 high by 4 wide grid. These are of 4 fat hippo sitting and facing different from each other. 2 are facing away from the camera, one pointed left, the other pointed right. Each hippo should stretch forward with no hands to eat a marble from the ground with its mouth. frames 1-4 are hippo one on the left upper hand side of the board. Frames 5-8 are hippo 2 on the upper right hand corner. Frames 9-12 are hippo 3 on the bottom left side of the board, back facing the camera. Frames 13-16 are hippo 4 back facing the camera on the bottom right hand side.

I've also used much more detailed prompts, but they just won't produce sprites, nor will they get the subjects to face the direction I ask, etc.
Anyone able to assist me?

stark bronze
# hazy bobcat I am trying my darndest to create videogame sprites, but I cannot get Dall-E (th...

Hey! This kind of consistency/specificity is a challenge for the model currently. It seems like it may improve as more of the multimodality of GPT-4o goes live (see examples here: https://openai.com/index/hello-gpt-4o/), but for now I find it helpful to keep in mind that DALL·E isn't really an "instruction follower" like how a graphic designer can get a set of instructions and create a result that follows them.

Instead, DALL·E is "just" a text-to-image generator. These are definitely similar, and DALL·E is extremely capable, but for the most part it means that DALL·E is starting "from scratch" with every image it generates. Like your example of high level game view transitioning into sprite creation: DALL·E doesn't "see" or "keep in mind" the game view to "know" what the sprites in said view should look like. Instead, it simply gets a new text input to use for generating a new image output.

There is some flexibility offered by inpainting on ChatGPT to edit certain portions of DALL·E images while keeping the rest unchanged, but this still isn't the same as the consistency/specificity you're describing. You might also search for the word "genID" (or gen ID, gen_ID, etc.) in the #images-discussions and/or #1163443000060420206 channels -- this is a kind of "hidden parameter" that some users have reported helps improve consistency of certain elements in successive generations.

Finally, as for the instructions about getting characters to face in certain directions: this is another limitation from how the model was trained. DALL·E learned by basically reading a bunch of synthetic captions that described a bunch of images. However, the synthetic captioner was not perfect at things that involve spatial awareness (above, below, to the left of, etc.).

Because of this, DALL·E is also not perfect at these kinds of instructions -- basically, it doesn't reliably "know" what directions are, so it is prone to getting them wrong. If you're interested, you can read more in the DALL·E 3 research paper: https://cdn.openai.com/papers/dall-e-3.pdf
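
For reference, the 4 x 4 layout described in the original prompt can be written down as plain arithmetic, which may be handy for slicing or assembling frames by hand when the model won't place them reliably. This is only a minimal sketch: it assumes 32 x 32 px cells (taken from the first prompt), frames numbered 1-16 left to right and top to bottom, and four frames per hippo. The names `frame_rect` and `hippo_for_frame` are hypothetical, not part of any tool mentioned here.

```python
# Sketch of the sprite-sheet layout from the prompt. Assumptions: 32 px cells,
# a 4 x 4 grid, and frames numbered row by row starting at 1.
CELL = 32   # pixels per cell side (from the "32 x 32" prompt)
COLS = 4    # frames per row
ROWS = 4    # rows in the sheet


def frame_rect(frame, cols=COLS, rows=ROWS, cell=CELL):
    """Return the (left, top, right, bottom) pixel box of a 1-based frame."""
    if not 1 <= frame <= cols * rows:
        raise ValueError(f"frame must be between 1 and {cols * rows}")
    row, col = divmod(frame - 1, cols)  # zero-based grid position
    return (col * cell, row * cell, (col + 1) * cell, (row + 1) * cell)


def hippo_for_frame(frame, frames_per_hippo=4):
    """Return which hippo (1-4) a frame belongs to: frames 1-4 -> 1, 5-8 -> 2, ..."""
    return (frame - 1) // frames_per_hippo + 1


if __name__ == "__main__":
    for frame in (1, 5, 9, 16):
        print(frame, hippo_for_frame(frame), frame_rect(frame))
```

The box uses the same (left, upper, right, lower) ordering that Pillow's `Image.crop` expects, so each frame could be cut out of a finished 128 x 128 sheet, or pasted into a blank one, one cell at a time.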

still tusk
#

If you have a more recent NVIDIA graphics card (newer than a GTX 1080), you can try Fooocus, which uses the open-source Stable Diffusion XL model. I am getting better results with that than with DALL·E 3.

hazy bobcat
#

I did try the inpainting, and it worked a little bit. For instance, it helps me add small features to an image, but nothing major. The training and programming you wrote about makes perfect sense. It's probably why almost every model does the same thing, including Adobe's.

trim bear
#

How long until we see SS13 server sprites lol

flint smelt
#

For example, I challenge the AI. This forces it to do as I want.