{
"$type": "site.standard.document",
"bskyPostRef": {
"cid": "bafyreigrmwxob3na2sbo7t6v7dqoi4mj7eklzj6mfeclnsvdmaxlrwhq2e",
"uri": "at://did:plc:haakkg7y3xdghcdmprxeexso/app.bsky.feed.post/3mj523ebj4w72"
},
"path": "/t/any-good-openrouter-interface-which-is-private-and-secure/36916#post_12",
"publishedAt": "2026-04-10T05:52:24.000Z",
"site": "https://discuss.privacyguides.net",
"tags": [
"docs.openwebui.com",
"❓ FAQ / Open WebUI",
"Performance Tips Guide",
"Memory & Personalization / Open WebUI"
],
"textContent": "Dellsam1:\n\n> the only problem now is I can’t seem to find a way to limit my messages sent to api\n\nHere you go:\n\ndocs.openwebui.com\n\n### ❓ FAQ / Open WebUI\n\nQ: How can I get support or ask for help?\n\n> ### Q: Why am I seeing multiple API requests when I only send one message? Why is my token usage higher than expected?\n>\n> **A:** Open WebUI uses **Task Models** to power background features that enhance your chat experience. When you send a single message, additional API calls may be made for:\n>\n> * **Title Generation** : Automatically generating a title for new chats\n> * **Tag Generation** : Auto-tagging chats for organization\n> * **Query Generation** : Creating optimized search queries for RAG (when you attach files or knowledge)\n> * **Web Search Queries** : Generating search terms when web search is enabled\n> * **Autocomplete Suggestions** : If enabled\n>\n\n>\n> By default, these tasks use the **same model** you’re chatting with. If you’re using an expensive API model (like GPT-4 or Claude), this can significantly increase your costs.\n>\n> **To reduce API costs:**\n>\n> 1. Go to **Admin Panel > Settings > Interface** (for title/tag generation settings)\n> 2. Configure a **Task Model** under **Admin Panel > Settings > Models** to use a smaller, cheaper model (like GPT-4o-mini) or a local model for background tasks\n> 3. Disable features you don’t need (auto-title, auto-tags, etc.)\n>\n\n>\n>> Cost-Saving Recommendation\n>>\n>> Set your Task Model to a fast, inexpensive model (or a local model via Ollama) while keeping your primary chat model as a more capable one. This gives you the best of both worlds: smart responses for your conversations, cheap/free processing for background tasks.\n>\n> For more optimization tips, see the **Performance Tips Guide**.\n\nDellsam1:\n\n> I was also hoping for a chatgpt style memory\n\nMemory is currently beta/experimental:\n\ndocs.openwebui.com\n\n### Memory & Personalization / Open WebUI\n\nThe Memory system is currently in Beta/Experimental stage. You may encounter inconsistencies in how models store or retrieve information, and storage formats may change in future updates.",
"title": "Any good openrouter interface which is private and secure?"
}