Understanding how gpt-image models on edits see "mask" and transparency

OpenAI Developer Community May 25, 2026

Reminder: a transparent background cannot be requested on gpt-image-2. The API rejects that parameter, so it is not a variable in this analysis.

How does mask work, or transparent input?

I thought I would just construct an image and ask.

The back portion of the car in the image was yellow. I made it fully transparent, with white as the hidden RGB behind the transparency, i.e. (100%, 100%, 100%, 0), a color that is normally never seen. I also drew a mask over the car’s grille and the large logo on the side.

Because the spec is ambiguous, the application defines its own convention for what it sends:

  • The base image[] is the current canvas as RGBA. Image/key transparency stays transparent. Outfill transparency around a loaded image is also transparent, but its hidden RGB is changed to a checkerboard as a hint.
  • The mask is a separate RGBA PNG. Its alpha is transparent where the model should edit: user-painted mask areas plus outfill areas. Its RGB is sepia/grayscale context, with user-painted regions shown in gray.
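
The convention in the two bullets above can be sketched in plain Python. This is a minimal illustration, not the application's actual code: images are lists of rows of (R, G, B, A) tuples, the function names are hypothetical, the checkerboard cell size and gray levels are assumed values, and plain grayscale stands in for the sepia tint.

```python
# Pixels are (R, G, B, A) tuples with 0-255 channels; an image is a list of rows.

def luminance(r, g, b):
    """Integer Rec. 601 luma, used as the grayscale context value."""
    return (299 * r + 587 * g + 114 * b) // 1000

def hint_outfill(base, outfill, cell=8, light=255, dark=204):
    """Keep outfill pixels transparent but set their hidden RGB to a checkerboard.

    `cell`, `light` and `dark` are assumed values, not taken from the post.
    """
    hinted = []
    for y, row in enumerate(base):
        new_row = []
        for x, pixel in enumerate(row):
            if outfill[y][x]:
                v = light if (x // cell + y // cell) % 2 == 0 else dark
                pixel = (v, v, v, 0)
            new_row.append(pixel)
        hinted.append(new_row)
    return hinted

def build_mask(base, painted, outfill, paint_gray=128):
    """Alpha 0 where the model may edit; grayscale context in RGB everywhere."""
    mask = []
    for y, row in enumerate(base):
        new_row = []
        for x, (r, g, b, _a) in enumerate(row):
            editable = painted[y][x] or outfill[y][x]
            gray = paint_gray if painted[y][x] else luminance(r, g, b)
            new_row.append((gray, gray, gray, 0 if editable else 255))
        mask.append(new_row)
    return mask
```

Here `painted` and `outfill` are 2D lists of booleans the same size as the canvas. Only the mask's alpha says where editing is allowed; its RGB is context for whoever, or whatever, looks at it.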

gpt-image-2

The model apparently receives no input transparency: the back of the car comes through as the underlying RGB white, with no hint that an alpha channel is perceived or understood. The mask is mapped onto the image correctly. However, the AI believes the masked region is transparent. It might not receive the masked contents at all?
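
Why the hidden RGB matters can be shown with a small sketch. How the API actually converts an RGBA input is not documented, so both functions below are illustrations of two possible treatments, with hypothetical names: discarding the alpha channel exposes whatever RGB was stored behind the transparency, while compositing over a background replaces it.

```python
def drop_alpha(pixel):
    """Discard the alpha channel; the hidden RGB becomes the visible color."""
    r, g, b, _a = pixel
    return (r, g, b)

def composite_over(pixel, background=(0, 0, 0)):
    """Standard 'over' blend of one RGBA pixel onto an opaque background."""
    r, g, b, a = pixel
    return tuple(
        (c * a + bg * (255 - a) + 127) // 255
        for c, bg in zip((r, g, b), background)
    )

# The transparent back of the car: white RGB hidden behind alpha 0.
hidden_white = (255, 255, 255, 0)

as_seen_without_alpha = drop_alpha(hidden_white)    # white shows through
as_seen_over_black = composite_over(hidden_white)   # background wins
```

A receiver that simply drops alpha would see the white back of the car described above; one that flattens onto a fixed background would see that background color instead.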

Try 2 - to have the AI describe masked contents

The model fabricates that the background is transparent; no, it is just white. A new grille for the car was drawn in the masked grille area, with no annotation. The AI cannot report on the text originally on the side of the car, and does not adequately describe the input mask color as gray plus transparent.

gpt-image-2 Conclusion

The API uses the mask to “damage” the input image. The result is the same as with DALL-E 2, which had no idea what was masked out even when the transparency was supplied only in a second mask image.

It seems OpenAI sends the mask to the model as transparency, and transparency in your own input image does not get through.
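
One way to picture this hypothesis is a merge step that punches the mask's holes into the image's own alpha channel. This is purely an assumption about what might happen server-side, not documented behavior, and the function names are hypothetical. The sketch shows the consequence: once merged, a pixel the user made transparent and a pixel the user masked can no longer be told apart by alpha.

```python
def punch_mask(image, mask):
    """Hypothetical merge: each pixel keeps the lower of its own and the mask's alpha."""
    return [
        [(r, g, b, min(a, m[3])) for (r, g, b, a), m in zip(img_row, mask_row)]
        for img_row, mask_row in zip(image, mask)
    ]

def editable_by_alpha(merged):
    """What a receiver could infer from alpha alone: every alpha-0 pixel looks editable."""
    return [[a == 0 for (_r, _g, _b, a) in row] for row in merged]

# Three pixels: masked logo (opaque red), transparent back (hidden white), untouched body.
image = [[(200, 30, 30, 255), (255, 255, 255, 0), (255, 220, 0, 255)]]
mask = [[(128, 128, 128, 0), (255, 255, 255, 255), (230, 230, 230, 255)]]

merged = punch_mask(image, mask)
```

In `merged`, the masked logo pixel and the transparent back pixel both end up with alpha 0, which would be consistent with the model treating masked content as "transparent" and being unable to describe what was under the mask.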

gpt-image-1.5

Reminder: input_fidelity is forced on as mandatory; the API setting is not obeyed, at 4k or 6k additional.

Also hallucinations and re-creations.

The grille mask was not followed precisely, and the area was simply made black instead of receiving a new infill. A new side logo was made and reported on (likely because the text generated lower down is informed by the context seen above). The transparent back was embellished, as one might expect when the “opaque” background setting leaves no way to send back transparency.

gpt-image-1.5 with background transparency enabled

The AI did NOT make any transparency; it drew a checkerboard as the background. Again, it seems to perceive the side text only as a mask whose original contents cannot be described.
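
A drawn checkerboard and real transparency look alike in a viewer, so it helps to check the alpha channel of the returned image directly. A minimal sketch, assuming the output has already been decoded into rows of (R, G, B, A) tuples; the function name is hypothetical:

```python
def alpha_summary(image):
    """Return the fraction of fully transparent, partially transparent, and opaque pixels."""
    alphas = [a for row in image for (_r, _g, _b, a) in row]
    total = len(alphas)
    clear = sum(1 for a in alphas if a == 0)
    solid = sum(1 for a in alphas if a == 255)
    return {
        "transparent": clear / total,
        "partial": (total - clear - solid) / total,
        "opaque": solid / total,
    }

# A painted checkerboard: gray and white squares, every pixel fully opaque.
drawn_checkerboard = [
    [(255, 255, 255, 255), (204, 204, 204, 255)],
    [(204, 204, 204, 255), (255, 255, 255, 255)],
]
summary = alpha_summary(drawn_checkerboard)
```

An output that is 100% opaque, as in this toy example, contains no transparency at all, whatever pattern its pixels depict.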

gpt-image-1

This is even more confused, but the model is not known for writing text well.

Doing work on gpt-image-2

This time the expectation is that the masked areas are understood and obeyed as the only writable areas (instead of prompting the model to ignore the mask rules, as before):

Give the drag car sticker an aggressive grille. Create a name for the car.

The model exceeded the masked area, and logic, in making a car name. The transparent back was made black instead of white, so it remains ambiguous how input transparency is transmitted when transparency cannot be output.
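
Whether an edit stayed inside the mask can be measured rather than judged by eye. The sketch below is one possible check, with a hypothetical function name, assuming the input, output, and mask are the same size and already decoded into rows of (R, G, B, A) tuples. It lists every protected pixel (mask alpha 255) whose color changed by more than a tolerance.

```python
def changed_outside_mask(before, after, mask, tolerance=0):
    """Return (x, y) of protected pixels whose RGB moved by more than `tolerance`."""
    changed = []
    for y, (row_b, row_a, row_m) in enumerate(zip(before, after, mask)):
        for x, (pb, pa, pm) in enumerate(zip(row_b, row_a, row_m)):
            if pm[3] != 255:
                continue  # editable pixel: any change is allowed
            if max(abs(cb - ca) for cb, ca in zip(pb[:3], pa[:3])) > tolerance:
                changed.append((x, y))
    return changed

before = [[(255, 220, 0, 255), (200, 30, 30, 255), (255, 220, 0, 255)]]
after = [[(255, 221, 0, 255), (20, 20, 20, 255), (10, 10, 10, 255)]]
mask = [[(230, 230, 230, 255), (128, 128, 128, 0), (230, 230, 230, 255)]]

strict = changed_outside_mask(before, after, mask)
lenient = changed_outside_mask(before, after, mask, tolerance=4)
```

Because these models tend to re-render the whole image, a strict comparison will flag small shifts almost everywhere; a nonzero tolerance separates that noise from real out-of-mask edits such as a car name written on an unmasked area.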


Request to OpenAI

Have the image team clearly document how images and masks are placed into context: the details at the color space level, what input is expected, and how the model perceives them and was trained on them, so that high-quality applications can be developed.
