
API Error Codes: A Test Suite Pattern I Stole from Stripe

DEV Community [Unofficial] July 1, 2026

Read Stripe's API reference for an hour and you'll notice every endpoint has a complete enumerated list of error codes with example payloads. Then look at your own API.

That comparison can be a little uncomfortable.

Stripe doesn't treat errors as an afterthought. Their documentation gives error responses nearly as much attention as successful ones. Each endpoint's documentation explains not only what can go right, but exactly what can go wrong, including structured error codes, HTTP status codes, descriptions, and example payloads.

Most APIs aren't like that.

They document 200 OK responses in detail while reducing failures to a short table:

  • 400 Bad Request
  • 401 Unauthorized
  • 404 Not Found
  • 500 Internal Server Error

The business errors—the ones your users actually encounter—are often hidden inside controller code, scattered across wiki pages, or not documented at all.

The same imbalance usually exists in the test suite.

Hundreds of happy-path tests.

Very few negative tests.

A few years ago, I borrowed a simple idea from Stripe and turned it into one of the most valuable testing patterns we've adopted.

Instead of writing negative tests one by one, we created a centralized error-code catalog and generated much of the negative test suite directly from it.

The result wasn't just better API error code testing.

It also improved documentation, reduced maintenance, and made breaking changes much harder to introduce accidentally.

Here's how the pattern works.

Why Error Responses Are Part of Your API Contract

Many teams unconsciously treat successful responses as the "real" API.

Everything else is considered an exception.

That mindset creates fragile systems.

Imagine a payment API.

A successful request returns:

{
  "paymentId": "PAY-1001",
  "status": "Succeeded"
}

Easy enough.

Now consider all the legitimate failure cases:

  • Card expired
  • Insufficient funds
  • Duplicate transaction
  • Currency not supported
  • Merchant suspended
  • Fraud detected
  • Payment amount exceeds limit

Those aren't bugs.

They're expected business outcomes.

If your consumers must handle them, then those responses are every bit as much a part of the API contract as the successful response.

Once you accept that idea, testing strategy changes dramatically.

The Error-Code Catalog as a Test Input

The foundation of this approach is maintaining a single catalog of every business error the API intentionally exposes.

For example:

errors:

  USER_NOT_FOUND:
    httpStatus: 404
    message: User not found

  EMAIL_ALREADY_EXISTS:
    httpStatus: 409
    message: Email already exists

  INVALID_TOKEN:
    httpStatus: 401
    message: Invalid authentication token

  PAYMENT_DECLINED:
    httpStatus: 402
    message: Payment declined

  ORDER_ALREADY_SHIPPED:
    httpStatus: 409
    message: Order cannot be modified

Notice what's happening here.

We're no longer documenting HTTP status codes alone.

We're documenting business outcomes.

Every new error introduced by engineering must first appear inside this catalog.

That simple rule creates a surprising amount of consistency.
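One way to enforce that rule in code is to mirror the catalog as a typed constant, so a response can only be built from a code the catalog already knows. This is a minimal sketch, not our exact implementation: `CatalogEntry`, `errorCatalog`, and `buildErrorResponse` are hypothetical names, and the `requestId` field is assumed from the response shape shown later in this post.

```typescript
// Hypothetical typed mirror of the YAML catalog above.
interface CatalogEntry {
  httpStatus: number;
  message: string;
}

const errorCatalog = {
  USER_NOT_FOUND: { httpStatus: 404, message: "User not found" },
  EMAIL_ALREADY_EXISTS: { httpStatus: 409, message: "Email already exists" },
  INVALID_TOKEN: { httpStatus: 401, message: "Invalid authentication token" },
  PAYMENT_DECLINED: { httpStatus: 402, message: "Payment declined" },
  ORDER_ALREADY_SHIPPED: { httpStatus: 409, message: "Order cannot be modified" },
} as const;

// Only keys that exist in the catalog are valid error codes.
type CatalogCode = keyof typeof errorCatalog;

// Builds the HTTP status and body for a catalogued error.
// Passing a code that is not in the catalog is a compile-time error.
function buildErrorResponse(code: CatalogCode, requestId: string) {
  const entry: CatalogEntry = errorCatalog[code];
  return {
    status: entry.httpStatus,
    body: { code, message: entry.message, requestId },
  };
}
```

With this in place, a call such as `buildErrorResponse("EMAIL_TAKEN", "req-1")` fails type checking until `EMAIL_TAKEN` has been added to the catalog.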

Why a Catalog Matters

Without one:

  • Developers invent new response formats.
  • Documentation drifts.
  • QA forgets edge cases.
  • Frontend developers discover failures during integration.

With one:

  • Documentation stays centralized.
  • Consumers understand every supported failure.
  • Every error becomes automatically testable.

The catalog becomes an executable specification.

One Test Per Error Code, Generated from the Catalog

Once every supported error exists in one place, generating baseline negative tests becomes straightforward.

Instead of manually writing dozens of nearly identical scenarios, a generator simply walks through the catalog.

Conceptually:

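// The catalog is a map keyed by error code, so iterate over its entries.
for (const [code, entry] of Object.entries(errorCatalog)) {
    generateNegativeTest(code, entry);
}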

Each generated test validates:

  • Expected HTTP status
  • Business error code
  • Error message
  • Response schema
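The four checks above can be sketched without any test framework. This is an illustration under assumptions: `generateNegativeTests`, `GeneratedTest`, and `ApiResponse` are hypothetical names, the catalog is a two-entry subset, and the request that triggers each error is deliberately left out, because the catalog alone doesn't say how to provoke a given failure.

```typescript
interface CatalogEntry {
  httpStatus: number;
  message: string;
}

interface ApiResponse {
  status: number;
  body: unknown;
}

interface GeneratedTest {
  name: string;
  // Returns a list of failures; an empty list means the response conforms.
  check: (response: ApiResponse) => string[];
}

// Hypothetical subset of the catalog shown earlier.
const errorCatalog: Record<string, CatalogEntry> = {
  USER_NOT_FOUND: { httpStatus: 404, message: "User not found" },
  EMAIL_ALREADY_EXISTS: { httpStatus: 409, message: "Email already exists" },
};

function isRecord(value: unknown): value is Record<string, unknown> {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

// Produces one baseline check per catalog entry.
function generateNegativeTests(catalog: Record<string, CatalogEntry>): GeneratedTest[] {
  return Object.entries(catalog).map(([code, entry]) => ({
    name: `returns ${entry.httpStatus} with code ${code}`,
    check: (response: ApiResponse): string[] => {
      const failures: string[] = [];
      if (response.status !== entry.httpStatus) {
        failures.push(`expected status ${entry.httpStatus}, got ${response.status}`);
      }
      const body = response.body;
      if (!isRecord(body)) {
        failures.push("body is not a JSON object");
        return failures;
      }
      if (body["code"] !== code) failures.push(`expected code ${code}`);
      if (body["message"] !== entry.message) failures.push("unexpected message");
      // requestId is assumed to be part of the error schema.
      if (typeof body["requestId"] !== "string") failures.push("requestId missing");
      return failures;
    },
  }));
}
```

In a real suite, each generated check would be wrapped in the test runner's own `test(...)` call and paired with a per-code setup step that provokes the error.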

For example, suppose the catalog contains:

EMAIL_ALREADY_EXISTS

The generated scenario becomes:

  1. Create a customer.
  2. Attempt to create the same customer again.
  3. Verify that the response status is 409 and the body contains:
{
  "code": "EMAIL_ALREADY_EXISTS",
  "message": "Email already exists"
}
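To make the three steps concrete, here is a self-contained sketch that runs them against an in-memory stand-in for the customer endpoint. `FakeCustomerApi` and `runDuplicateEmailScenario` are hypothetical; a real generated test would call the deployed API over HTTP instead.

```typescript
interface ApiResponse {
  status: number;
  body: Record<string, unknown>;
}

// Hypothetical in-memory stand-in for the customer endpoint.
class FakeCustomerApi {
  private readonly emails = new Set<string>();
  private nextRequest = 1;

  createCustomer(email: string): ApiResponse {
    const requestId = `req-${this.nextRequest++}`;
    if (this.emails.has(email)) {
      return {
        status: 409,
        body: { code: "EMAIL_ALREADY_EXISTS", message: "Email already exists", requestId },
      };
    }
    this.emails.add(email);
    return { status: 201, body: { email } };
  }
}

// Runs the scenario and returns a list of failures (empty means it passed).
function runDuplicateEmailScenario(api: FakeCustomerApi): string[] {
  api.createCustomer("ada@example.com"); // 1. Create a customer.
  const second = api.createCustomer("ada@example.com"); // 2. Create it again.

  // 3. Verify the status and the business error fields.
  const failures: string[] = [];
  if (second.status !== 409) failures.push(`expected 409, got ${second.status}`);
  if (second.body["code"] !== "EMAIL_ALREADY_EXISTS") failures.push("unexpected code");
  if (second.body["message"] !== "Email already exists") failures.push("unexpected message");
  return failures;
}
```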

No engineer needed to remember to write that negative test.

Adding a new business error automatically creates a new baseline test.

Why This Scales

Imagine:

  • 180 endpoints
  • 95 business error codes

Without automation, every additional error code means another hand-written test to create and maintain.

With generation:

  • Documentation grows.
  • Test coverage grows.
  • Maintenance barely changes.

Engineers can spend their time writing meaningful business scenarios instead of repetitive validation tests.

The Shape Assertion That Prevents Silent Error Drift

Checking only the HTTP status code is one of the biggest mistakes in error response testing.

Consider this response:

{
  "code": "USER_NOT_FOUND",
  "message": "User not found",
  "requestId": "abc123"
}

Months later, someone simplifies the global exception handler.

Now the API returns:

{
  "error": "User not found"
}

The endpoint still returns:

404

Most tests still pass.

Meanwhile:

  • Mobile applications break.
  • Frontend parsing fails.
  • Monitoring dashboards stop correlating request IDs.

Nobody notices until production.

Shape Assertions

To prevent this, every generated negative test also validates the structure of the response.

For example:

expect(response.body).toEqual({
    code: expect.any(String),
    message: expect.any(String),
    requestId: expect.any(String)
});

Notice we're validating more than values.

We're validating the response contract itself.

That single assertion catches:

  • Missing fields
  • Renamed properties
  • Structural changes
  • Serialization mistakes

before consumers experience them.
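If you want to see the idea without a test framework, a dependency-free stand-in for the matcher might look like the sketch below. `shapeViolations` and `ERROR_FIELDS` are hypothetical names. Like the exact-match assertion, it flags unexpected fields as well as missing ones.

```typescript
// Fields every error body is assumed to carry, per the example response.
const ERROR_FIELDS = ["code", "message", "requestId"] as const;

// Returns a list of shape violations; an empty list means the body conforms.
function shapeViolations(body: unknown): string[] {
  if (typeof body !== "object" || body === null || Array.isArray(body)) {
    return ["body is not a JSON object"];
  }
  const record = body as Record<string, unknown>;
  const violations: string[] = [];

  // Every required field must be present and be a string.
  for (const field of ERROR_FIELDS) {
    if (typeof record[field] !== "string") {
      violations.push(`missing or non-string field: ${field}`);
    }
  }

  // Anything outside the agreed shape is reported too.
  for (const key of Object.keys(record)) {
    if ((ERROR_FIELDS as readonly string[]).indexOf(key) === -1) {
      violations.push(`unexpected field: ${key}`);
    }
  }
  return violations;
}

// The original response conforms; the "simplified" one does not,
// even though both would arrive with a 404 status.
const original = { code: "USER_NOT_FOUND", message: "User not found", requestId: "abc123" };
const simplified = { error: "User not found" };
```

Running `shapeViolations(simplified)` reports the three missing fields plus the unexpected `error` field, which is exactly the drift a status-only check lets through.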

Why It Matters

Many API consumers depend on fields such as:

  • Error code
  • Message
  • Correlation ID
  • Documentation URL
  • Retry hint

Silently changing any of those is a breaking API change.

Shape assertions make those changes hard to miss.

Keeping the Catalog in Sync With the Code (Code Generation)

The obvious concern is maintenance.

Nobody wants to update:

  • Source code
  • Documentation
  • Tests
  • Error catalog

manually every time a new error appears.

Fortunately, most modern applications already define business errors centrally.

Example:

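// String values keep the generated docs, SDK constants, and wire format
// identical to the codes in the catalog (a plain numeric enum would emit 0, 1, 2...).
export enum ErrorCode {
    USER_NOT_FOUND = "USER_NOT_FOUND",
    PAYMENT_DECLINED = "PAYMENT_DECLINED",
    INVALID_TOKEN = "INVALID_TOKEN",
    EMAIL_ALREADY_EXISTS = "EMAIL_ALREADY_EXISTS"
}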

From this single definition, it's possible to generate:

  • Markdown documentation
  • OpenAPI components
  • Error catalogs
  • Client SDK constants
  • Negative tests

Everything derives from one source.

That dramatically reduces maintenance.
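As one example of that derivation, here is a rough sketch of the Markdown step. It assumes the central definition carries an HTTP status and message for each code, which a bare enum does not. `errorDefinitions` and `renderErrorTable` are hypothetical names.

```typescript
interface ErrorDefinition {
  httpStatus: number;
  message: string;
}

// Hypothetical single source of truth: code, status, and message together.
const errorDefinitions: Record<string, ErrorDefinition> = {
  USER_NOT_FOUND: { httpStatus: 404, message: "User not found" },
  PAYMENT_DECLINED: { httpStatus: 402, message: "Payment declined" },
  INVALID_TOKEN: { httpStatus: 401, message: "Invalid authentication token" },
  EMAIL_ALREADY_EXISTS: { httpStatus: 409, message: "Email already exists" },
};

// Renders the definitions as a Markdown table, sorted by code so the
// generated file produces stable diffs in pull requests.
function renderErrorTable(definitions: Record<string, ErrorDefinition>): string {
  const header = ["| Code | HTTP status | Message |", "| --- | --- | --- |"];
  const rows = Object.entries(definitions)
    .sort(([a], [b]) => a.localeCompare(b))
    .map(([code, def]) => `| ${code} | ${def.httpStatus} | ${def.message} |`);
  return [...header, ...rows].join("\n");
}
```

The same loop could emit OpenAPI components or SDK constants instead of table rows; only the row template changes.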

Additional Benefits

Once generation is introduced:

Documentation Never Falls Behind

The documentation updates whenever the enum changes.

Generated Tests Stay Current

Every new business error immediately receives baseline coverage.

SDKs Stay Consistent

Frontend applications can reference generated constants rather than string literals.

Reviews Improve

Adding a new business error becomes visible during pull request review.

Instead of hiding inside controller code, it's now part of the API contract.

The Two Error Codes We Deliberately Don't Test

Although the catalog covers almost every business failure, there are two categories we intentionally exclude from the catalog-generated tests.

1. Generic Internal Server Errors

Example:

500 Internal Server Error

These represent unexpected failures.

They're not business behavior.

Instead of attempting to trigger every possible server crash, we simply verify:

  • Sensitive stack traces aren't exposed.
  • Generic messages are returned.
  • Request IDs are included.
  • Logging works correctly.

Testing every internal failure path produces little value.

Testing the response contract provides much more.
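The first three checks in that list can be sketched as a single function. Verifying that logging works needs access to the log sink, so it is left out here. Everything named below is an assumption: `unsafe500Findings`, the `GENERIC_MESSAGE` text, and the stack-trace heuristic, which simply looks for a `stack` field or a frame-like `at ... :line:column` pattern and will not catch every leak.

```typescript
interface ApiResponse {
  status: number;
  body: unknown;
}

// Assumed generic message; substitute whatever your API actually returns.
const GENERIC_MESSAGE = "Internal server error";

// Returns a list of findings; an empty list means the 500 response looks safe.
function unsafe500Findings(response: ApiResponse): string[] {
  const findings: string[] = [];
  if (response.status !== 500) {
    findings.push(`expected status 500, got ${response.status}`);
  }
  const body = response.body;
  if (typeof body !== "object" || body === null || Array.isArray(body)) {
    findings.push("body is not a JSON object");
    return findings;
  }
  const record = body as Record<string, unknown>;

  // A request ID must be present so support can correlate the failure.
  const requestId = record["requestId"];
  if (typeof requestId !== "string" || requestId === "") {
    findings.push("requestId missing");
  }

  // The message must be the generic one, never the raw exception text.
  if (record["message"] !== GENERIC_MESSAGE) {
    findings.push("message is not the generic one");
  }

  // Heuristic only: a "stack" field, or text shaped like a stack frame.
  const serialized = JSON.stringify(record);
  if ("stack" in record || /\bat .+:\d+:\d+/.test(serialized)) {
    findings.push("possible stack trace exposed");
  }
  return findings;
}
```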

2. Infrastructure Failures

Examples include:

  • Database unavailable
  • Kafka offline
  • Redis unreachable
  • DNS failure
  • Cloud storage outage

These aren't business errors.

They're infrastructure events.

We test them separately using:

  • Chaos engineering
  • Fault injection
  • Resilience testing
  • Disaster recovery exercises

Keeping them outside the regular API negative tests avoids unnecessary instability in CI pipelines.

Unexpected Benefits

After adopting this approach, several improvements appeared that we hadn't anticipated.

Better Documentation

Engineers could browse every supported business error in one place.

Cleaner APIs

Every endpoint returned a consistent error structure.

Faster Reviews

New business errors became obvious during pull requests.

Happier Frontend Teams

Consumers no longer guessed which failures might occur.

Stronger Regression Protection

Structural changes to error responses surfaced immediately.

Final Thoughts

Most API teams invest enormous effort in testing successful requests while giving comparatively little attention to failures.

Stripe's documentation demonstrates a different philosophy.

Errors are part of the public API contract.

They deserve documentation.

They deserve consistency.

And they deserve automated tests.

By maintaining an error-code catalog, generating one baseline test per error, validating response shapes, and deriving documentation from code, you can significantly reduce maintenance while increasing confidence that your API behaves consistently—even as it evolves.

The best part is that this approach scales naturally.

As new business errors appear, your documentation and test suite grow automatically rather than relying on engineers to remember yet another negative test.

If you're looking to automate this style of contract-driven testing, you can spin up a free trial to try this catalog pattern and explore how generated negative tests, schema validation, and API contracts can work together to keep your error handling consistent over time.
