Raw Record Source

{
  "$type": "site.standard.document",
  "bskyPostRef": {
    "cid": "bafyreidawqa2evfxtgssghqiy253yfg53axknr2yiesavetyq5u3l7uc6e",
    "uri": "at://did:plc:lk3jfj3zq4k4wxnk474axylu/app.bsky.feed.post/3mmpcppzmw332"
  },
  "path": "/t/understanding-how-gpt-image-models-on-edits-see-mask-and-transparency/1381752#post_1",
  "publishedAt": "2026-05-25T20:03:54.000Z",
  "site": "https://community.openai.com",
  "tags": [
    "(click for more details)"
  ],
  "textContent": "Reminder: a transparent background cannot be requested on “2” (gpt-image-2) - it is a denied API parameter and so is not a variable in the analysis here.\n\n## How does mask work, or transparent input?\n\nI thought I would just construct an image and ask.\n\nThe back portion of the car in the image was yellow, and I made it transparent, with a value of (100%,100%,100%,0), so white sits behind the transparency, a color usually not seen.\nI drew a mask over the car’s grille and the large logo on the side.\n\nBecause the spec is ambiguous, the application builds its two inputs as follows:\n\n  * The **base `image[]`** is the current canvas as RGBA. Image/key transparency stays transparent. Loaded-image outfill transparency is also transparent, but its hidden RGB is changed to a checkerboard as a hint.\n  * The **`mask`** is a separate RGBA PNG. Its alpha is transparent where the model should edit: user-painted mask areas plus outfill areas. Its RGB is sepia/grayscale context, with user-painted regions shown in gray.\n\n\n\n## gpt-image-2\n\nThe model apparently receives no input transparency - the back of the car comes through as the underlying RGB white, with no hint that an alpha channel is perceived or understood.\nThe mask is mapped onto the image correctly. However, the AI thinks the masked area is transparent. It might not receive the masked contents at all?\n\n## Try 2 - to have the AI describe masked contents\n\nThe model fabricates that the background is transparent - no, it is just white. A new grille for the car was drawn in the masked grille area, without being noted in the description. 
The AI cannot report on the text originally on the side of the car, and doesn’t adequately describe the input mask color as gray + transparent.\n\n## gpt-image-2 Conclusion\n\nThe mask is used in the API to “damage” the input image, in the same way that DALL-E 2, with transparency only in a second mask image, had no idea what was masked out.\n\n_It seems OpenAI is sending the model the mask as transparency, and your own transparent input doesn’t work._\n\n## gpt-image-1.5\n\nReminder: input_fidelity is forced on as mandatory and API settings are not obeyed - 4k or 6k additional\n\nThere are also hallucinations and re-creations.\n\nThe grille mask was not followed precisely, and the area was simply made black instead of receiving a new infill. A new side logo was made and reported on (likely because the text generated lower down is informed by the context seen above it). The transparent back was embellished, as one might expect when “opaque” leaves no way to send transparency back.\n\n### gpt-image-1.5 with background transparency enabled\n\nThe AI did NOT make any transparency; it drew a checkerboard background. Again, it seems to see the side text only as a mask whose original contents are not describable.\n\n## gpt-image-1\n\nThis is even more confused, but the model is not known for writing text well.\n\n## Doing work on gpt-image-2\n\nThis time, expecting the masked areas to be understood and obeyed as the only writable area (instead of prompting to ignore the mask rules as before):\n\n> Give the drag car sticker an aggressive grille. Create a name for the car.\n\nThe model exceeded the masked area - and logic - in making a car name. The transparent back was made black instead of white, so there is still ambiguity about how input transparency is transmitted when transparency can’t be output.\n\n# Request to OpenAI\n\nHave the image team clearly document how images and the mask are placed into context, at the color-space level, along with the expected behavior and how the model was trained to perceive them, so that high-quality applications can be developed.",
  "title": "Understanding how gpt-image models on edits see \"mask\" and transparency"
}