Persistent 0% prompt cache hits on GPT-5.5 with Auckland NZ Cloudflare 520s complicating every workaround
slackermanz:
your cache prefix
There’s not really a “cache prefix”. As the cache hit process and routing is described, the first 256 tokens is hashed, and the hash determines same-server routing. The prompt cache key you can send is also included in this hashing, so it can serve to break the cache hit - a feature only useful after significant parallel usage.
The database lookup method of 24hr storage is not described other than it persists longer, but no other strategy or minimum length requirement beyond 1024 tokens (actually needing and delivering on closer to 1200) has been documented or expected. So ensure there is no changing of the bulk of the start of context (which OpenAI does on you anyway when they change an injected date).
BTW, right now: [usage] in/cached: 135229/133376 | out/reasoning: 433/154
I had a second turn with no cache hit until it started iterating also.
Discussion in the ATmosphere