#Gpt image prompt
Hi @solid hare. I'm kapa.ai, your friendly support bot. I will try to answer your question by searching through the knowledge base. :wave:
To have GPT Image generate a realistic photo of your 3D character in a custom pose, using both a character reference sheet and a skeleton pose image, you should:
- Upload both images as references:
  - The character reference sheet (for appearance, style, and details).
  - The skeleton pose image (for the desired pose).
- Write a clear, detailed prompt:
  - Specify that the character in the reference sheet should be recreated in the pose shown in the skeleton image.
  - State that you want a realistic photo style.
Prompt Example:
Create a realistic photo of the character shown in the first reference image, posed exactly as in the second reference image (the skeleton pose). Match the character’s appearance, clothing, and style from the reference sheet, but use the body position and gesture from the skeleton pose image.
How to structure your API call:
- Use the Image API or Responses API to provide both images as input references (either as base64 or file IDs).
- Include your prompt as shown above.
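For the Image API route, a minimal sketch could look like the following. It assumes the official `openai` Python package, whose `images.edit` method accepts a list of image files for `gpt-image-1`; the function name `generate_posed_character` and the file paths are hypothetical, chosen only for illustration.

```python
import base64


def generate_posed_character(client, character_path, pose_path, prompt):
    """Send both reference images to the Image API and return the image bytes.

    `client` is assumed to be an `openai.OpenAI` instance. The character sheet
    is passed first and the pose second, so the prompt can refer to them as
    the "first" and "second" reference image.
    """
    with open(character_path, "rb") as character, open(pose_path, "rb") as pose:
        result = client.images.edit(
            model="gpt-image-1",
            image=[character, pose],  # order matches the wording of the prompt
            prompt=prompt,
        )
    # gpt-image-1 returns the generated image as base64-encoded data.
    return base64.b64decode(result.data[0].b64_json)
```

You would call it as `generate_posed_character(OpenAI(), "character_sheet.png", "skeleton_pose.png", prompt)` after `from openai import OpenAI`, then write the returned bytes to a file.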
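Python Example (using the Responses API):
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.responses.create(
    # The Responses API expects a text-capable model here (gpt-4.1 is one option);
    # the image_generation tool below is what renders the image with GPT Image.
    model="gpt-4.1",
    input=[
        {
            "role": "user",
            "content": [
                {"type": "input_text", "text": "Create a realistic photo of the character shown in the first reference image, posed exactly as in the second reference image (the skeleton pose). Match the character’s appearance, clothing, and style from the reference sheet, but use the body position and gesture from the skeleton pose image."},
                # First image: character reference sheet.
                {"type": "input_image", "image_url": "data:image/png;base64,<character_reference_base64>"},
                # Second image: skeleton pose. The data URL separator is a comma, not a slash.
                {"type": "input_image", "image_url": "data:image/png;base64,<skeleton_pose_base64>"},
            ],
        }
    ],
    tools=[{"type": "image_generation"}],
)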
Replace <character_reference_base64> and <skeleton_pose_base64> with your actual base64-encoded images.
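If you are unsure how to produce those base64 strings, here is a small sketch using only the Python standard library. The helper name `to_data_url` is hypothetical; it reads an image file and returns a complete data URL that can be used as the `image_url` value.

```python
import base64
import mimetypes
from pathlib import Path


def to_data_url(path):
    """Return a data URL (data:<mime>;base64,<payload>) for an image file."""
    mime, _ = mimetypes.guess_type(str(path))
    if mime is None or not mime.startswith("image/"):
        raise ValueError(f"Not a recognized image file: {path}")
    encoded = base64.b64encode(Path(path).read_bytes()).decode("ascii")
    # Note the comma between "base64" and the payload.
    return f"data:{mime};base64,{encoded}"
```

For example, `to_data_url("character_sheet.png")` yields a string starting with `data:image/png;base64,`, which replaces the whole placeholder URL for that image.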
Key tips:
- Be explicit in your prompt about which image provides the character and which provides the pose.
- The model uses the images as anchors for appearance and pose, and the text prompt to clarify your intent.
- This approach is supported and recommended for fine-grained control over composition and style, as described in the OpenAI documentation and Sora prompting guide.
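The Responses API example above stops at the request. As a rough sketch of reading the result, the generated image comes back as base64 data on output items whose type is `image_generation_call`. The helper name `save_generated_images`, the output file names, and the assumption that the image is a PNG (the default output format, as far as I know) are illustrative choices, not part of the API.

```python
import base64
from pathlib import Path


def save_generated_images(response, out_dir="."):
    """Decode every generated image in a Responses API result and save it."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    saved = []
    for item in response.output:
        # Skip text messages and other tool calls.
        if getattr(item, "type", None) != "image_generation_call":
            continue
        if not getattr(item, "result", None):
            continue  # the call produced no image data
        # Assumed file name and extension, for illustration only.
        path = out_dir / f"posed_character_{len(saved)}.png"
        path.write_bytes(base64.b64decode(item.result))
        saved.append(path)
    return saved
```

Calling `save_generated_images(response)` after the request returns the list of files written, which is empty if the model replied with text only.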