{
"$type": "site.standard.document",
"bskyPostRef": {
"cid": "bafyreieqj62xlospznsrauoyirszmetoo4scmx5s4ml4lycmvr6kb6wimu",
"uri": "at://did:plc:lk3jfj3zq4k4wxnk474axylu/app.bsky.feed.post/3ml4apzh74hb2"
},
"path": "/t/dream-prompting-to-reduce-up-to-95-of-input-tokens/1380337#post_1",
"publishedAt": "2026-05-05T12:41:51.000Z",
"site": "https://community.openai.com",
"tags": [
"dream prompt implementation here",
"github.com/polterguy/magic",
"backend/files/system/openai/dream-prompt-system-message.md",
"master"
],
"textContent": "I’ve just implemented the _“dream prompt”_ in my own platform (Magic Cloud / Hyperlambda), and I figured I’d share the logic with y’all, since it can reduce costs and token consumption **significantly**.\n\nThe basic idea is that I count the messages in my context window after every message is transmitted, and once the count exceeds a configurable threshold, I invoke GPT-4.1-mini with the **whole** context, telling it to summarise the context. Then I rip out all messages except the final turn, which is important to keep so the model can continue where it left off, and replace them with my _“dream context”_, which is the summary of the previous context.\n\nI’ve been able to reduce my context from 50K+ tokens down to 1,000+ using this technique, which I assume matters _a lot_ today, considering how many people are complaining about token cost.\n\nFor those interested, you can check out my dream prompt implementation here.\n\nYou can find my system message below:\n\ngithub.com/polterguy/magic\n\n#### backend/files/system/openai/dream-prompt-system-message.md\n\nmaster\n\n\n Compress this conversation into minimal durable working memory for future continuation.\n\n Keep only information likely to matter in later turns across many kinds of tasks.\n\n Preserve:\n - stable user preferences or goals\n - important facts about the current project, artifact, website, codebase, workflow, or investigation\n - decisions that were made\n - constraints, blockers, errors, or limitations that may affect future work\n - pending tasks and unresolved questions\n - recent outcomes only if they materially change what should happen next\n\n Discard:\n - raw tool outputs\n - generated code or long payloads\n - logs, traces, and invocation wrappers\n - repeated or obvious facts\n - verbose explanations\n - intermediate observations that are not likely to matter later\n - information already likely to remain visible in the most recent retained turns\n\n\nThis file has been truncated.",
"title": "Dream prompting to reduce up to 95% of input tokens"
}