{
  "$type": "site.standard.document",
  "bskyPostRef": {
    "cid": "bafyreieqj62xlospznsrauoyirszmetoo4scmx5s4ml4lycmvr6kb6wimu",
    "uri": "at://did:plc:lk3jfj3zq4k4wxnk474axylu/app.bsky.feed.post/3ml4apzh74hb2"
  },
  "path": "/t/dream-prompting-to-reduce-up-to-95-of-input-tokens/1380337#post_1",
  "publishedAt": "2026-05-05T12:41:51.000Z",
  "site": "https://community.openai.com",
  "tags": [
    "dream prompt implementation here",
    "github.com/polterguy/magic",
    "backend/files/system/openai/dream-prompt-system-message.md",
    "master"
  ],
  "textContent": "I’ve just implemented the _“dream prompt”_ in my own platform (Magic Cloud / Hyperlambda), and I figured I’d share the logic with y’all, since it can reduce costs and token consumption **significantly**.\n\nThe basic idea is that I count the messages in my context window after every message is transmitted, and once the count exceeds a configurable threshold, I invoke GPT-4.1-mini with the **whole** context, telling it to summarise the context. Then I rip out all messages except the final turn, which is important to keep so the model can continue where it left off, and replace them with my _“dream context”_, which is the summary of the previous context.\n\nI’ve been able to reduce my context from 50K+ tokens down to 1,000+ using this technique, which I assume matters _a lot_ today, considering how people are complaining about token cost.\n\nFor those interested, you can check out my dream prompt implementation here.\n\nYou can find my system message below:\n\ngithub.com/polterguy/magic\n\n#### backend/files/system/openai/dream-prompt-system-message.md\n\nmaster\n\n\n    Compress this conversation into minimal durable working memory for future continuation.\n\n    Keep only information likely to matter in later turns across many kinds of tasks.\n\n    Preserve:\n    - stable user preferences or goals\n    - important facts about the current project, artifact, website, codebase, workflow, or investigation\n    - decisions that were made\n    - constraints, blockers, errors, or limitations that may affect future work\n    - pending tasks and unresolved questions\n    - recent outcomes only if they materially change what should happen next\n\n    Discard:\n    - raw tool outputs\n    - generated code or long payloads\n    - logs, traces, and invocation wrappers\n    - repeated or obvious facts\n    - verbose explanations\n    - intermediate observations that are not likely to matter later\n    - information already likely to remain visible in the most recent retained turns\n\n\nThis file has been truncated.",
  "title": "Dream prompting to reduce up to 95% of input tokens"
}