Livestream starting now: ChatGPT Images 2.0
API notes
Quality: low = less expensive; quality: medium or high = more expensive.
2.0 allows much more arbitrary image sizing, as long as each dimension is specified as a multiple of 16px.
The aspect ratio is capped at 3:1. Noteworthy for sharers: an image made at the maximum 1:1 native Discourse forum size, 690x500 (1.38 ratio), is too small and not quantized correctly; you’d need 53% more area to meet the minimum. Use 1376x992 or 1376x1008 for a doubling, or 1024x752.
That suggests tooling worth offering to end users or to an AI: accept either one dimension plus a ratio, or a “desired” resolution that is mapped to an “output” resolution by stepping it to the 16px grid and clipping it to the allowed limits.
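A minimal sketch of that kind of helper, in Python. The function names `snap_size` and `size_from_ratio` are made up for illustration, and the only rules encoded are the two stated above (16px steps, 3:1 maximum ratio). Minimum and maximum area or edge limits are not handled, since their exact values aren't quoted here.

```python
import math

STEP = 16        # sizes must land on 16px intervals (from the notes above)
MAX_RATIO = 3.0  # longest side may be at most 3x the shortest


def _snap(value, step=STEP):
    """Round to the nearest multiple of step (halves round up), never below one step."""
    return max(step, math.floor(value / step + 0.5) * step)


def snap_size(width, height, step=STEP, max_ratio=MAX_RATIO):
    """Map a desired size to an output size on the step grid, clipped to max_ratio."""
    w, h = _snap(width, step), _snap(height, step)
    # If the ratio is too extreme, grow the short side up to the next valid step.
    if w > h * max_ratio:
        h = math.ceil(w / max_ratio / step) * step
    elif h > w * max_ratio:
        w = math.ceil(h / max_ratio / step) * step
    return w, h


def size_from_ratio(width, ratio, step=STEP, max_ratio=MAX_RATIO):
    """Alternate entry point: one dimension plus a width/height ratio."""
    return snap_size(width, width / ratio, step, max_ratio)


if __name__ == "__main__":
    print(snap_size(690, 500))     # (688, 496)
    print(snap_size(1380, 1000))   # (1376, 1008)
    print(snap_size(4000, 1000))   # (4000, 1344)
```

Doubling the 690x500 forum size gives 1380x1000, which this rounding rule steps to 1376x1008; a round-down rule would give 1376x992 instead, so the rounding direction is a design choice worth exposing.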
No direct formula is provided for how passes, upscaling, and dimensional expansion are billed. For example, if we treat images as 16px tiles or patches, 1024x1024 would be 64x64 = 4096 patches, yet that bill is 7024 tokens at high quality. Time to dig into the doc site’s JavaScript and retrieve what should have been delivered in documentation.
https://developers.openai.com/api/docs/guides/image-generation#size-and-quality-options
Weirdness with the billing calculator that must be explained formulaically:
1024x1024 = 7024 tokens
1536x1024 = 5488 tokens
1792x1024 = 5063 tokens (also DALL-E 3’s wide size)
Does quality suffer at a larger image size despite the lower billing? Or is there an error to be discovered (as with almost every prior image product so far…)?
Double both dimensions of that 1792x1024 for 4x the area? 3584x2048 = 12329 tokens
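To make the oddity concrete, here is a small Python sketch that divides each quoted token count by the number of 16px patches in the image. The 16px patch size is only the working assumption from these notes, not a documented billing unit, and the token figures are the calculator values quoted above, not a formula.

```python
# Token figures as read from the billing calculator (quoted in this post).
BILLED_TOKENS = {
    (1024, 1024): 7024,
    (1536, 1024): 5488,
    (1792, 1024): 5063,
    (3584, 2048): 12329,
}

PATCH = 16  # assumed patch edge in pixels; not confirmed by documentation


def patches(width, height, patch=PATCH):
    """Number of patch-sized tiles covering a width x height image."""
    return (width // patch) * (height // patch)


def tokens_per_patch(width, height, tokens, patch=PATCH):
    """Billed tokens divided by the patch count."""
    return tokens / patches(width, height, patch)


if __name__ == "__main__":
    for (w, h), tokens in BILLED_TOKENS.items():
        n = patches(w, h)
        rate = tokens_per_patch(w, h, tokens)
        print(f"{w}x{h}: {n} patches, {tokens} tokens, {rate:.3f} tokens/patch")
    # 1024x1024: 4096 patches, 7024 tokens, 1.715 tokens/patch
    # 1536x1024: 6144 patches, 5488 tokens, 0.893 tokens/patch
    # 1792x1024: 7168 patches, 5063 tokens, 0.706 tokens/patch
    # 3584x2048: 28672 patches, 12329 tokens, 0.430 tokens/patch
```

The per-patch rate falls steadily as the image grows, so whatever the real formula is, it is not a flat per-patch charge.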