gpt-image-2 output pricing calculator: online, and in your code!
OpenAI made this opaque. I’m pleased to present:
hotnova.com
gpt-image-2 Price Explorer
An interactive image output token and cost calculator, much easier to use than OpenAI’s own calculator (if you can find it), which likes to say “invalid” after your input.
It’s based on the algorithm behind cost calculations for gpt-image-2. If you like JavaScript, steal away there.
Simple code
Bringing this forum around to “stuff for developers” instead of “complaints from developers”, here’s the money shot: Python code to compute output tokens from quality and size as inputs (non-validating).
IMAGE_MODEL_SPECS = {
    "gpt-image-2": {
        "size_limits": {
            "step_px": 16,
            "min_pixels": 655_360,
            "max_pixels": 8_294_400,
            "max_dimension_px": 3_840,
            "max_aspect_ratio": 3.0,
        },
        "quality_axis_factors": {"low": 16, "medium": 48, "high": 96},
        "token_area_offset_pixels": 2_000_000,
        "token_area_scale_denominator": 4_000_000,
        "image_output_price_per_million_tokens": 30.00,
    },
}


def calculate_image_tokens(quality, width, height, model="gpt-image-2"):
    spec = IMAGE_MODEL_SPECS[model]
    quality_axis_factor = spec["quality_axis_factors"].get(quality)
    if quality_axis_factor is None:
        allowed = "', '".join(spec["quality_axis_factors"])
        raise ValueError(f"quality must be one of '{allowed}'; got {quality!r}")
    long_edge = max(width, height)
    short_edge = min(width, height)
    # Aspect band: quality factor scaled by short/long, rounded half up.
    short_axis_factor = (
        2 * quality_axis_factor * short_edge + long_edge
    ) // (2 * long_edge)
    # Ceiling division of the area-scaled token grid.
    return (
        quality_axis_factor
        * short_axis_factor
        * (spec["token_area_offset_pixels"] + width * height)
        + spec["token_area_scale_denominator"]
        - 1
    ) // spec["token_area_scale_denominator"]
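As a quick usage sketch, here is the same arithmetic condensed into a standalone snippet and run for a 1024x1024 image at each quality. The short names (`image_tokens`, `image_cost`) and the module-level constants are just for this illustration; the numbers are whatever the formula above yields, not figures confirmed by OpenAI.

```python
# Constants copied from the spec table above.
QUALITY_AXIS_FACTORS = {"low": 16, "medium": 48, "high": 96}
AREA_OFFSET_PIXELS = 2_000_000
AREA_SCALE_DENOMINATOR = 4_000_000
PRICE_PER_MILLION_TOKENS = 30.00


def image_tokens(quality, width, height):
    """Condensed form of calculate_image_tokens (no size validation)."""
    q = QUALITY_AXIS_FACTORS[quality]
    long_edge, short_edge = max(width, height), min(width, height)
    band = (2 * q * short_edge + long_edge) // (2 * long_edge)
    numerator = q * band * (AREA_OFFSET_PIXELS + width * height)
    return (numerator + AREA_SCALE_DENOMINATOR - 1) // AREA_SCALE_DENOMINATOR


def image_cost(quality, width, height):
    """Dollar cost of the output tokens at the listed per-million price."""
    return image_tokens(quality, width, height) * PRICE_PER_MILLION_TOKENS / 1_000_000


for quality in QUALITY_AXIS_FACTORS:
    tokens = image_tokens(quality, 1024, 1024)
    print(f"{quality}: {tokens:,} tokens, ${image_cost(quality, 1024, 1024):.6f}")
# low: 196 tokens, $0.005880
# medium: 1,756 tokens, $0.052680
# high: 7,024 tokens, $0.210720
```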
Code with utilities included
- Validate image size : returns a list of error messages naming each input-limit “rule” violated; an empty list means success
- Normalize : re-shapes a desired resolution into one that works
- Recommend cheaper : because the aspect factor moves in step bands, there are cheaper resolutions with a more rectangular aspect ratio to be found - we find them and recommend the size string!
- Tokens to dollars : because math is hard; this reads the price from the model’s spec entry (and will work for possible future models added there)
Includes a demo so you can exercise these all:
GPT image size helpers demo
Enter sizes like 1200x1600. Blank input exits.
Size: *333x355*
333x355 is invalid for gpt-image-2:
- Width and height must both be divisible by 16.
- Pixel budget must be at least 655,360 pixels, inclusive.
We can fix that in code, though!
Normalized: 333x355 -> 784x848
-- costs for 784x848 --
low: 160 tokens, $0.004800
cheaper larger is 784x880: 151 tokens, $0.004530
medium: 1,408 tokens, $0.042240
cheaper larger is 784x880: 1,388 tokens, $0.041640
high: 5,693 tokens, $0.170790
cheaper larger is 784x864: 5,591 tokens, $0.167730
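Why can a larger image cost fewer tokens? A minimal sketch, assuming the same formula as the simple code: hold the short edge at 784 and grow the long edge one 16 px step at a time. The aspect band is an integer, so when it drops by one the loss outweighs the few extra pixels. The helper names here are only for this illustration.

```python
AREA_OFFSET_PIXELS = 2_000_000
AREA_SCALE_DENOMINATOR = 4_000_000
LOW_QUALITY_AXIS_FACTOR = 16


def aspect_band(q, width, height):
    """Quality factor scaled by short/long edge, rounded half up."""
    long_edge, short_edge = max(width, height), min(width, height)
    return (2 * q * short_edge + long_edge) // (2 * long_edge)


def image_tokens(q, width, height):
    """Token count for a quality axis factor q (no size validation)."""
    numerator = q * aspect_band(q, width, height) * (AREA_OFFSET_PIXELS + width * height)
    return (numerator + AREA_SCALE_DENOMINATOR - 1) // AREA_SCALE_DENOMINATOR


short_edge = 784
for long_edge in range(848, 897, 16):
    band = aspect_band(LOW_QUALITY_AXIS_FACTOR, short_edge, long_edge)
    tokens = image_tokens(LOW_QUALITY_AXIS_FACTOR, short_edge, long_edge)
    print(f"{short_edge}x{long_edge}: band {band}, {tokens} tokens")
# 784x848: band 15, 160 tokens
# 784x864: band 15, 161 tokens
# 784x880: band 14, 151 tokens
# 784x896: band 14, 152 tokens
```

Within a band the count creeps up with area; at 784x880 the band falls from 15 to 14 and the total drops below the 784x848 price, which is the size the demo recommends.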
Python helper utilities
from math import ceil, floor

IMAGE_MODEL_SPECS = {
    "gpt-image-2": {
        "size_limits": {
            "step_px": 16,
            "min_pixels": 655_360,
            "max_pixels": 8_294_400,
            "max_dimension_px": 3_840,
            "max_aspect_ratio": 3.0,
        },
        "quality_axis_factors": {"low": 16, "medium": 48, "high": 96},
        "token_area_offset_pixels": 2_000_000,
        "token_area_scale_denominator": 4_000_000,
        "image_output_price_per_million_tokens": 30.00,
    },
}


def calculate_image_tokens(quality, width, height, model="gpt-image-2"):
    spec = IMAGE_MODEL_SPECS[model]
    quality_axis_factor = spec["quality_axis_factors"].get(quality)
    if quality_axis_factor is None:
        allowed = "', '".join(spec["quality_axis_factors"])
        raise ValueError(f"quality must be one of '{allowed}'; got {quality!r}")
    long_edge = max(width, height)
    short_edge = min(width, height)
    short_axis_factor = (
        2 * quality_axis_factor * short_edge + long_edge
    ) // (2 * long_edge)
    return (
        quality_axis_factor
        * short_axis_factor
        * (spec["token_area_offset_pixels"] + width * height)
        + spec["token_area_scale_denominator"]
        - 1
    ) // spec["token_area_scale_denominator"]


def validate_image_size(width, height, model="gpt-image-2"):
    limits = IMAGE_MODEL_SPECS[model]["size_limits"]
    step = limits["step_px"]
    min_pixels = limits["min_pixels"]
    max_pixels = limits["max_pixels"]
    max_dimension = limits["max_dimension_px"]
    max_ratio = limits["max_aspect_ratio"]
    if type(width) is not int or type(height) is not int or width <= 0 or height <= 0:
        return ["Enter whole-number width and height values greater than 0."]
    pixels = width * height
    long_edge = max(width, height)
    short_edge = min(width, height)
    errors = []
    if width % step != 0 or height % step != 0:
        errors.append(f"Width and height must both be divisible by {step}.")
    if pixels > max_pixels:
        errors.append(
            f"Pixel budget must be no greater than {max_pixels:,} pixels, inclusive."
        )
    if pixels < min_pixels:
        errors.append(
            f"Pixel budget must be at least {min_pixels:,} pixels, inclusive."
        )
    if long_edge > max_dimension:
        errors.append(
            f"Maximum edge length must be less than or equal to {max_dimension:,}px."
        )
    if long_edge > max_ratio * short_edge:
        errors.append(f"Aspect ratio must be no greater than {max_ratio:g}:1.")
    return errors


def normalize(width, height, model="gpt-image-2"):
    limits = IMAGE_MODEL_SPECS[model]["size_limits"]
    step = limits["step_px"]
    # Work in units of step x step cells rather than raw pixels.
    min_area = ceil(limits["min_pixels"] / (step * step))
    max_area = floor(limits["max_pixels"] / (step * step))
    max_side = floor(limits["max_dimension_px"] / step)
    max_ratio = float(limits["max_aspect_ratio"])
    width = max(1, int(width))
    height = max(1, int(height))
    # Clamp the requested aspect ratio into the allowed range.
    ratio = max(1.0 / max_ratio, min(max_ratio, width / height))
    if ratio >= 1.0:
        max_area = min(max_area, max_side * max(1, floor(max_side / ratio)))
    else:
        max_area = min(max_area, max_side * max(1, floor(max_side * ratio)))
    pixels = width * height
    if pixels < min_area * step * step:
        area = min_area
    elif pixels > max_area * step * step:
        area = max_area
    else:
        area = pixels / (step * step)
    target_w = (area * ratio) ** 0.5
    target_h = (area / ratio) ** 0.5
    # Collect lattice candidates near the ideal width and height.
    choices = []
    for h in {floor(target_h) - 1, floor(target_h), ceil(target_h), ceil(target_h) + 1}:
        if 1 <= h <= max_side:
            lo = max(1, ceil(min_area / h), ceil(h / max_ratio))
            hi = min(max_side, floor(max_area / h), floor(h * max_ratio))
            if lo <= hi:
                w = min(hi, max(lo, round(ratio * h)))
                choices.append((w, h))
    for w in {floor(target_w) - 1, floor(target_w), ceil(target_w), ceil(target_w) + 1}:
        if 1 <= w <= max_side:
            lo = max(1, ceil(min_area / w), ceil(w / max_ratio))
            hi = min(max_side, floor(max_area / w), floor(w * max_ratio))
            if lo <= hi:
                h = min(hi, max(lo, round(w / ratio)))
                choices.append((w, h))
    # Prefer the candidate closest to the target shape, then closest in area.
    best = min(
        choices,
        key=lambda size: (
            ((size[0] - target_w) / target_w) ** 2
            + ((size[1] - target_h) / target_h) ** 2,
            abs(size[0] * size[1] - area),
        ),
    )
    return best[0] * step, best[1] * step


def recommend_cheaper_larger_size(model, size, quality):
    if isinstance(size, str):
        width, height = map(int, size.lower().split()[0].split("x"))
    else:
        width, height = size
    if validate_image_size(width, height, model):
        return None
    spec = IMAGE_MODEL_SPECS[model]
    q = spec["quality_axis_factors"].get(quality)
    if q is None:
        allowed = "', '".join(spec["quality_axis_factors"])
        raise ValueError(f"quality must be one of '{allowed}'; got {quality!r}")
    limits = spec["size_limits"]
    step = limits["step_px"]
    max_dimension = (limits["max_dimension_px"] // step) * step
    long_side = max(width, height)
    short_side = min(width, height)
    grow_width = width >= height
    tokens = calculate_image_tokens(quality, width, height, model)
    if long_side > short_side:
        prev_long = long_side - step
        prev_size = (prev_long, short_side) if grow_width else (short_side, prev_long)
        if not validate_image_size(prev_size[0], prev_size[1], model):
            if calculate_image_tokens(quality, prev_size[0], prev_size[1], model) > tokens:
                return None
    max_long = min(
        max_dimension,
        (int(limits["max_aspect_ratio"] * short_side) // step) * step,
        (limits["max_pixels"] // short_side // step) * step,
    )
    band = (2 * q * short_side + long_side) // (2 * long_side)
    # Walk down the aspect bands, testing the first long edge inside each one.
    for next_band in range(band - 1, 0, -1):
        threshold = (2 * q * short_side) // (2 * next_band + 1) + 1
        candidate_long = max(long_side + step, ((threshold + step - 1) // step) * step)
        if candidate_long > max_long:
            return None
        candidate_size = (
            (candidate_long, short_side)
            if grow_width
            else (short_side, candidate_long)
        )
        if calculate_image_tokens(
            quality,
            candidate_size[0],
            candidate_size[1],
            model,
        ) < tokens:
            return f"{candidate_size[0]}x{candidate_size[1]}"
    return None


def image_tokens_to_dollars(tokens, model="gpt-image-2"):
    price_per_million = IMAGE_MODEL_SPECS[model]["image_output_price_per_million_tokens"]
    return tokens * price_per_million / 1_000_000


# Variant demo: choose one quality first, then price sizes one at a time.
# Named separately so it is not shadowed by the demo() defined after it.
def demo_single_quality():
    model = "gpt-image-2"
    qualities = ["low", "medium", "high"]
    print("GPT image size helper demo")
    print("Enter image sizes like 1200x1600.")
    print("Press Enter at the size prompt to choose another quality.")
    print("Press Enter at the quality prompt, or enter an invalid quality choice, to exit.")
    while True:
        print()
        print("Quality choices:")
        for index, quality in enumerate(qualities, start=1):
            print(f" {index}. {quality}")
        try:
            quality_choice = input("Choose quality 1, 2, or 3: ").strip()
        except EOFError:
            print()
            return
        if not quality_choice:
            return
        if quality_choice not in {"1", "2", "3"}:
            return
        quality = qualities[int(quality_choice) - 1]
        print()
        print(f"Using quality: {quality}")
        while True:
            try:
                size_text = input("Size: ").strip()
            except EOFError:
                print()
                return
            if not size_text:
                break
            try:
                clean_size_text = size_text.lower().replace(" ", "")
                if "x" in clean_size_text:
                    parts = clean_size_text.split("x")
                    if len(parts) != 2:
                        raise ValueError
                    width = int(parts[0])
                    height = int(parts[1])
                else:
                    width = int(clean_size_text)
                    height_text = input("Height: ").strip()
                    if not height_text:
                        break
                    height = int(height_text)
            except ValueError:
                print("Enter a size like 1200x1600, or enter a whole-number width.")
                continue
            original_width = width
            original_height = height
            errors = validate_image_size(width, height, model)
            if errors:
                print()
                print(f"{original_width}x{original_height} is not a valid request size:")
                for error in errors:
                    print(f" - {error}")
                width, height = normalize(width, height, model)
                print(f"Normalized size: {original_width}x{original_height} -> {width}x{height}")
            else:
                print()
                print(f"{width}x{height} is valid.")
                print("No normalization needed.")
            tokens = calculate_image_tokens(quality, width, height, model)
            cheaper_size = recommend_cheaper_larger_size(model, (width, height), quality)
            print(f"Output tokens: {tokens:,}")
            print("Image output cost: "
                  f"${image_tokens_to_dollars(tokens, model):.6f}")
            if cheaper_size:
                cheaper_tokens = calculate_image_tokens(
                    quality,
                    *map(int, cheaper_size.split("x")),
                    model,
                )
                print(
                    f"Cheaper larger size: {cheaper_size} "
                    f"({cheaper_tokens:,} output tokens)"
                )
            else:
                print("Cheaper larger size: none found")
            print()


def demo():
    model = "gpt-image-2"
    qualities = ["low", "medium", "high"]
    print("GPT image size helpers demo")
    print("Enter sizes like 1200x1600. Blank input exits.")
    while True:
        text = input("\nSize: ").strip().lower().replace(" ", "")
        if not text:
            return
        try:
            if "x" in text:
                width, height = map(int, text.split("x"))
            else:
                width = int(text)
                height = int(input("Height: ").strip())
        except ValueError:
            print("Enter a size like 1200x1600, or a whole-number width.")
            continue
        errors = validate_image_size(width, height, model)
        if errors:
            old_size = f"{width}x{height}"
            print(f"{old_size} is invalid for {model}:")
            for error in errors:
                print(f"- {error}")
            print("We can fix that in code, though!")
            width, height = normalize(width, height, model)
            print(f"Normalized: {old_size} -> {width}x{height}")
        else:
            print(f"{width}x{height} is valid.")
        print(f"-- costs for {width}x{height} --")
        for quality in qualities:
            tokens = calculate_image_tokens(quality, width, height, model)
            cost = image_tokens_to_dollars(tokens, model)
            cheaper_size = recommend_cheaper_larger_size(model, (width, height), quality)
            print(f"{quality}: {tokens:,} tokens, ${cost:.6f}")
            if cheaper_size:
                cheaper_width, cheaper_height = map(int, cheaper_size.split("x"))
                cheaper_tokens = calculate_image_tokens(
                    quality,
                    cheaper_width,
                    cheaper_height,
                    model,
                )
                cheaper_cost = image_tokens_to_dollars(cheaper_tokens, model)
                print(
                    f" cheaper larger is {cheaper_size}: "
                    f"{cheaper_tokens:,} tokens, ${cheaper_cost:.6f}"
                )
            else:
                print(" cheaper larger: none")


if __name__ == "__main__":
    demo()
Python with a verbose breakdown of what’s being computed, reverse-engineered. Input validation and error messages included.
from typing import Final, Literal

## Validation constants

# Size validation happens before token calculation. The accepted size space is a
# 16 px lattice, so a dimension that is only 1 px away from a valid value can
# still have no token price.
SIZE_GRANULARITY_PX: Final[int] = 16

# The pixel budget is inclusive at both ends. These limits are checked against
# width * height, independent of the aspect-ratio band used later.
MIN_PIXEL_BUDGET: Final[int] = 655_360
MAX_PIXEL_BUDGET: Final[int] = 8_294_400

# The maximum edge rule is separate from total pixel budget. An image can be
# under the pixel budget and still be invalid if either side is too long.
MAX_EDGE_LENGTH_PX: Final[int] = 3_840

# Aspect ratio is checked as long_edge / short_edge <= 3. The exact 3:1 case is
# valid; only ratios greater than 3:1 are rejected.
MAX_ASPECT_RATIO: Final[int] = 3


def validate_image_size(width: int, height: int) -> list[str]:
    """
    Return validation messages for dimensions that cannot receive a token price.

    An empty list means the dimensions are eligible for token calculation.
    """
    if type(width) is not int or type(height) is not int or width <= 0 or height <= 0:
        return ["Enter whole-number width and height values greater than 0."]
    errors: list[str] = []
    if width % SIZE_GRANULARITY_PX != 0 or height % SIZE_GRANULARITY_PX != 0:
        errors.append("Width and height must both be divisible by 16.")
    pixel_budget = width * height
    if pixel_budget > MAX_PIXEL_BUDGET:
        errors.append(
            f"Pixel budget must be no greater than "
            f"{MAX_PIXEL_BUDGET:,} pixels, inclusive."
        )
    if pixel_budget < MIN_PIXEL_BUDGET:
        errors.append(
            f"Pixel budget must be at least "
            f"{MIN_PIXEL_BUDGET:,} pixels, inclusive."
        )
    long_edge = max(width, height)
    short_edge = min(width, height)
    if long_edge > MAX_EDGE_LENGTH_PX:
        errors.append(
            f"Maximum edge length must be less than or equal to "
            f"{MAX_EDGE_LENGTH_PX:,}px."
        )
    if long_edge > MAX_ASPECT_RATIO * short_edge:
        errors.append("Aspect ratio must be no greater than 3:1.")
    return errors


## Algorithm constants

Quality = Literal["low", "medium", "high"]

# The quality setting enters the calculation as an integer axis factor.
# For square images, the pre-area token grid is this value squared:
# low=16*16, medium=48*48, high=96*96.
QUALITY_AXIS_FACTORS: Final[dict[Quality, int]] = {
    "low": 16,
    "medium": 48,
    "high": 96,
}

# The final area multiplier is:
#
#     (AREA_OFFSET_PIXELS + width * height) / AREA_SCALE_DENOMINATOR
#
# The positive offset means the token count is not proportional to image area
# alone. At the minimum valid pixel budget, the offset is larger than the image
# itself, so smaller valid images still carry a substantial fixed area term.
AREA_OFFSET_PIXELS: Final[int] = 2_000_000
AREA_SCALE_DENOMINATOR: Final[int] = 4_000_000


def _round_half_up_ratio(numerator: int, denominator: int) -> int:
    """
    Round numerator / denominator to the nearest integer, with exact halves up.

    The aspect component falls into integer bands rather than staying continuous.
    Using integer arithmetic keeps boundary cases deterministic and avoids
    moving a half-step threshold because of binary floating-point representation.
    """
    return (2 * numerator + denominator) // (2 * denominator)


def _ceil_div(numerator: int, denominator: int) -> int:
    """
    Return ceil(numerator / denominator) for positive integers.

    Token totals are whole numbers. Any non-zero fractional remainder in the
    scaled calculation increases the reported total to the next integer token.
    """
    return (numerator + denominator - 1) // denominator


def calculate_image_tokens(quality: Quality, width: int, height: int) -> int:
    """
    Return the output token count for an image setting.

    The calculation is symmetric in width and height. Rotating an image does not
    change the result because only the long edge, short edge, and total area are
    used.

    Raises:
        ValueError: If quality is not one of "low", "medium", or "high", or if
            the dimensions violate the size rules.
    """
    if quality not in QUALITY_AXIS_FACTORS:
        allowed = ", ".join(repr(value) for value in QUALITY_AXIS_FACTORS)
        raise ValueError(f"quality must be one of {allowed}; got {quality!r}")
    errors = validate_image_size(width, height)
    if errors:
        raise ValueError("Invalid image size:\n- " + "\n- ".join(errors))
    long_edge = max(width, height)
    short_edge = min(width, height)
    quality_axis_factor = QUALITY_AXIS_FACTORS[quality]
    # The longer side keeps the full quality axis factor. The shorter side is
    # reduced according to short_edge / long_edge and then rounded into an
    # integer band.
    #
    # This rounded band is the source of the visible downward jumps in resolution
    # tables: as one edge grows, area rises gradually, but the aspect band can
    # drop by 1 at a threshold. That one-band drop can outweigh the added pixels.
    short_axis_factor = _round_half_up_ratio(
        quality_axis_factor * short_edge,
        long_edge,
    )
    # The calculation behaves like a rectangular token grid:
    #
    #     full quality axis factor * aspect-adjusted short-axis factor
    #
    # For square images, short_axis_factor equals quality_axis_factor, so the
    # grid is exactly quality_axis_factor squared.
    token_grid = quality_axis_factor * short_axis_factor
    pixel_budget = width * height
    # Area then scales the token grid with a fixed positive offset. Keeping this
    # as integer arithmetic gives an exact final ceiling instead of depending on
    # floating-point rounding near token boundaries.
    scaled_token_numerator = token_grid * (AREA_OFFSET_PIXELS + pixel_budget)
    return _ceil_div(scaled_token_numerator, AREA_SCALE_DENOMINATOR)


def main() -> None:
    quality: Quality = "medium"
    width = 1200
    height = 1472
    try:
        tokens = calculate_image_tokens(quality, width, height)
        print(f"quality={quality}, width={width}, height={height} -> {tokens} tokens")
    except ValueError as v:
        print(f"-- ValueError --\n{v}")


if __name__ == "__main__":
    main()
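To see why the half-up rounding is done in integers, here is a small sketch comparing it with Python's built-in `round()`, which sends exact ties to the nearest even number. The 928x1024 size is picked only because it lands exactly on a half-step at low quality; the sketch shows that the two roundings diverge there, nothing more.

```python
def round_half_up_ratio(numerator: int, denominator: int) -> int:
    """Nearest integer to numerator / denominator, exact halves rounded up."""
    return (2 * numerator + denominator) // (2 * denominator)


# Low quality (axis factor 16) at 928x1024: 16 * 928 / 1024 is exactly 14.5.
quality_axis_factor, short_edge, long_edge = 16, 928, 1024

integer_band = round_half_up_ratio(quality_axis_factor * short_edge, long_edge)
float_band = round(quality_axis_factor * short_edge / long_edge)  # ties go to even

print(integer_band)  # 15
print(float_band)    # 14
```

A one-band difference like this changes the token grid from 16 * 15 to 16 * 14, so the choice of rounding is visible in the final price at such sizes.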
Documentation
If OpenAI were to explain “how to calculate costs” in natural language, here’s what it might look like.
Result
Look forward to your favorite image creation applications telling you not just your input cost in language tokens (plus the “vision” image input component), but also your output and final cost, before you push a “send” button!