Raw Record Source

{
  "$type": "site.standard.document",
  "bskyPostRef": {
    "cid": "bafyreieacedidsrl54dkpgx35k2p4pffh3tbmxmwmiwukz4a6zjigk7tl4",
    "uri": "at://did:plc:lk3jfj3zq4k4wxnk474axylu/app.bsky.feed.post/3mfwi2fiwgdi2"
  },
  "path": "/t/why-are-completion-tokens-so-high/1375358#post_2",
  "publishedAt": "2026-02-28T14:16:48.000Z",
  "site": "https://community.openai.com",
  "textContent": "  1. gpt-5 models are reasoning models. You pay for their internal thinking as output.\n  2. gpt-5-nano thinks excessively long for poor results. Better to just use mini.\n  3. use the API parameter “reasoning_effort”, and set it to “low”. That will indicate to the model how much to think (the parameter for Chat Completions).\n  4. Or simply use gpt-4.1, which goes right to producing output without first deliberating about and valuing which “code” to generate. Use a “top_p”: 0.01 if you want consistent answers instead of random ones.\n\n",
  "title": "Why are completion tokens so high?"
}