Raw Record Source

{
  "$type": "site.standard.document",
  "bskyPostRef": {
    "cid": "bafyreid73pzkpfewif533mrfr2sailqx2nro34mamxm3eojxczmcfu3gze",
    "uri": "at://did:plc:lk3jfj3zq4k4wxnk474axylu/app.bsky.feed.post/3mln2abgkxgb2"
  },
  "path": "/t/having-issues-with-building-a-customer-support-ai-with-openai-need-help-to-deploy-production/1377925#post_5",
  "publishedAt": "2026-05-12T04:44:02.000Z",
  "site": "https://community.openai.com",
  "textContent": "I completely understand your frustration. A lot of teams assume building a production-ready customer support AI with the current OpenAI stack will be straightforward, but in reality, it still requires strong orchestration, retrieval tuning, guardrails, and evaluation layers outside the model itself.\n\nThe good news is that reliable systems are definitely possible in production — but most successful implementations are not relying on the model alone. They usually combine:\n\n  * structured RAG pipelines instead of raw file search\n\n  * reranking + chunk optimization\n\n  * strict grounding prompts\n\n  * conversation state management\n\n  * fallback/handoff logic\n\n  * deterministic workflows for critical actions\n\n  * evaluation pipelines to measure hallucinations and retrieval quality continuously\n\n\n\n\nIn our experience, the biggest mistake is expecting Assistants API or Agents SDK alone to behave like a complete customer support platform. They are powerful building blocks, but production systems typically need custom orchestration around them.\n\nFor support use cases specifically:\n\n  * retrieval quality matters more than model size\n\n  * smaller focused context windows often outperform huge document dumps\n\n  * hybrid search (vector + keyword/BM25) improves reliability significantly\n\n  * tool execution should be constrained and explicit\n\n  * hallucination reduction usually comes from better retrieval architecture, not only prompt engineering\n\n\n\n\nAlso, model inconsistency is still real across repeated runs. Most teams solve this with:\n\n  * confidence scoring\n\n  * answer verification layers\n\n  * retrieval validation\n\n  * deterministic templates for policy-related answers\n\n  * human escalation paths\n\n\n\n\nYou are definitely not alone here. Many teams go through this exact phase before stabilizing their architecture. 
Don’t treat this as a failure of your implementation — customer support AI at production scale is genuinely an engineering problem, not just an API integration problem.\n\nI would suggest simplifying the stack first:\n\n  1. Build a highly reliable retrieval pipeline\n\n  2. Add strict grounding and citations\n\n  3. Introduce tools/actions only after retrieval becomes stable\n\n  4. Create evaluation datasets from real support tickets\n\n  5. Measure failures systematically instead of relying on ad-hoc testing\n\n\n\n\nOnce that foundation is stable, the Agents SDK becomes much more useful.\n\nWishing you the best — sounds like you are already doing the hard work most teams underestimate.",
  "title": "Having issues with building a customer support AI with OpenAI . Need help to deploy production?"
}
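
The post's recommendation of hybrid search (vector + keyword/BM25) could be sketched roughly as below. This is a minimal illustration, not the poster's implementation: it assumes the two retrievers have already returned ranked lists of document ids, and it merges them with reciprocal rank fusion, a fusion method chosen here for simplicity and not named in the post. The function name, the document ids, and the constant `k=60` are all hypothetical.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Merge several ranked lists of document ids into one ranking.

    Each document earns 1 / (k + rank) from every list it appears in,
    so documents ranked well by both retrievers rise to the top.
    The constant k is an assumed default, not a value from the post.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by id so output is deterministic.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical results from a vector retriever and a keyword/BM25 retriever.
vector_hits = ["refund-policy", "shipping-times", "warranty"]
keyword_hits = ["warranty", "refund-policy", "returns-form"]

fused = reciprocal_rank_fusion([vector_hits, keyword_hits])
print(fused)
```

Fusing on ranks instead of raw scores avoids having to normalize cosine similarities against BM25 scores, which live on different scales. A reranker, as the post suggests, could then be applied to the top of the fused list.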