Raw Record Source

{
  "$type": "site.standard.document",
  "bskyPostRef": {
    "cid": "bafyreid7k7cpj7vrr5ei6evv5hwcdhqcyuclfzo4ctsopd73e4kmqm4bpu",
    "uri": "at://did:plc:25rdn5elo5izoxrmtis34zuk/app.bsky.feed.post/3mpeqtf6xnpd2"
  },
  "coverImage": {
    "$type": "blob",
    "ref": {
      "$link": "bafkreih6lzfmufduij3j3zx5iqt7dj7q73eccb2nbvmhvoeebhc26ogoi4"
    },
    "mimeType": "image/webp",
    "size": 83882
  },
  "path": "/abdullahmubin/how-i-built-a-persona-chat-agent-and-fought-hallucination-a-rag-story-22hf",
  "publishedAt": "2026-06-28T19:24:49.000Z",
  "site": "https://dev.to",
  "tags": [
    "aiops",
    "llm",
    "rag",
    "agents"
  ],
  "textContent": "I wasn't building another AI chatbot.\n\nI was building an AI persona.\n\nSomeone users could actually have conversations with.\n\nA fictional person with:\n\n  * a backstory\n  * opinions\n  * psychology\n  * memories\n  * a unique voice\n\n\n\nImagine chatting with a 46-year-old small business owner named Jane.\n\n  * You ask Jane about a business strategy article.\n  * She shouldn't answer like a generic AI assistant.\n  * She should answer like **Jane**.\n  * That was the goal.\n\n\n\nAnd for a while...\n\nIt worked.\n\n> **Then the hallucinations started.**\n\n##  Index\n\n  1. The Goal\n  2. The Architecture\n  3. The First Hallucination\n  4. Bug #1 — Retrieval Wasn't Working\n  5. Bug #2 — Retrieved Context Was Ignored\n  6. Bug #3 — Grounded Facts Mixed With Fiction\n  7. Bug #4 — The Right Facts, The Wrong Answer\n  8. Bug #5 — Over-Grounding Broke Personal Conversations\n  9. The Final Architecture\n  10. Lessons Learned\n  11. Final Thoughts\n\n\n\n##  1. The Goal\n\nThe idea was simple.\n\nA persona shouldn't magically know everything.\n\nInstead, it should discuss content that exists inside a project.\n\n\n\n    Business Article\n            ↓\n    Knowledge Retrieval\n            ↓\n    Persona Reads Context\n            ↓\n    Persona Replies In Character\n\n\nThe response should be:\n\n  * grounded in the article\n  * consistent with the persona\n  * natural to read\n\n\n\nSounds straightforward. It wasn't.\n\n##  2. The Architecture\n\nThe system had two major parts.\n\n###  A preprocessing pipeline\n\nBefore the persona ever saw a user message, several lightweight agents prepared it.\n\nEach stage had exactly one responsibility:\n\n  * understand the message\n  * check for manipulation\n  * extract objectives\n  * enrich with context\n  * prepare a structured packet\n  * validate the result\n\n\n\nOnly after that did the persona generate a response.\n\n###  Two separate memories\n\nThe system searched two different kinds of knowledge.\n\nThe first contained shared project content:\n\n  * articles\n  * documentation\n  * reports\n  * blog posts\n\n\n\nThe second stored the persona's own long-term memories and previous conversations.\n\nBoth were searched in parallel before every reply.\n\n##  3. The First Hallucination\n\nI asked:\n\n> Tell me something about **Actionable Steps for Business Leaders.**\n\nThe article clearly listed five recommendations.\n\nThe persona replied:\n\n> \"Honestly, I think business leaders should focus on hard work, honesty, and setting clear goals...\"\n\nIt sounded convincing.\n\nThere was only one problem.\n\nNone of that existed in the article.\n\nThe entire answer was improvised.\n\n##  4. Bug #1 — Retrieval Wasn't Working\n\nThe first question was obvious.\n\nDid the model actually receive the article?\n\nFortunately, every request logged how many knowledge chunks were retrieved.\n\nOne request showed:\n\n\n\n    Project Knowledge Retrieved\n\n    0 chunks\n\n\nThat immediately explained everything.\n\nThe article wasn't reaching the model.\n\n###  The Cause\n\nThe editor correctly saved every article. But only the primary database was updated. The retrieval index never received those changes. The article existed.\n\n> The retrieval system simply couldn't see it.\n\n###  The Fix\n\nEvery content update now performs two operations:\n\n\n\n    Save Content\n          ↓\n    Update Retrieval Index\n\n\nI also ran a one-time synchronization job for older articles.\n\nAfter that:\n\n\n\n    Project Knowledge Retrieved\n\n    4 chunks\n\n\nRetrieval finally worked.\n\nOr so I thought.\n\n##  5. Bug #2 — Retrieved Context Was Ignored\n\nNow the system successfully retrieved relevant context.\n\nYet the persona still answered:\n\n> \"Honestly, I'm not really sure about that.\"\n\nThe information was there. The model simply ignored it.\n\n###  The Cause\n\nEarlier in the system prompt was a rule that essentially said:\n\n> If you don't know something, admit it honestly.\n\n  * The retrieved knowledge appeared much later in the prompt.\n  * The model treated the earlier instruction as more important.\n  * So it honestly believed it didn't know.\n\n\n\n###  The Fix\n\nInstead of simply appending retrieved context, I explicitly overrode the earlier rule.\n\nSomething like:\n\n> **The following information has been shared with you. You have read it. Treat it as knowledge you genuinely possess.**\n\nThat tiny prompt change completely changed the model's behavior.\n\n##  6. Bug #3 — Grounded Facts Mixed With Fiction\n\nNow the persona finally referenced the article.\n\nBut then it added:\n\n> \"I've been applying these strategies in my own business for years.\"\n\nThe article never said that.\n\nNeither did the persona profile.\n\nThe model invented a believable personal experience.\n\nIronically...\n\nThis was the hardest hallucination to catch.\n\nBecause it sounded perfectly reasonable.\n\n###  My First Attempt\n\nI created another stage after generation.\n\nIts job was simple:\n\n\n\n    Raw Response\n          ↓\n    Remove Ungrounded Sentences\n          ↓\n    Final Response\n\n\nIt failed. Completely.\n\n> The model removed correct information and kept the **hallucination**.\n\n###  The Better Solution\n\nInstead of asking the model to judge itself...\n\nI rebuilt the response from scratch.\n\nEvery retrieved article was split into individual facts.\n\nLike this:\n\n\n\n    1. Conduct market research.\n\n    2. Define your value proposition.\n\n    3. Invest in innovation.\n\n    4. Build a flexible business plan.\n\n\nThe grounding stage received:\n\n  * the numbered facts\n  * the user's question\n  * the persona's original response (only for tone)\n\n\n\nIts instructions became:\n\n> Rewrite the answer using **only** these numbered facts.\n\n  * Not summarize.\n  * Not invent.\n  * Not elaborate.\n  * Rewrite.\n\n\n\nThat single architectural change removed nearly every invented personal story.\n\n##  7. Bug #4 — The Right Facts, The Wrong Answer\n\nNext question:\n\n> What percentage of businesses with a formal plan achieve higher revenue growth?\n\nThe article contained two different percentages. The persona picked the wrong one. Not because it hallucinated. Because both numbers existed in the source.\n\n###  The Cause\n\nThe grounding stage knew the available facts.\n\nIt didn't know which fact the user actually wanted.\n\n###  The Fix\n\nTwo improvements solved it.\n\n###  Pass the user's question\n\nInstead of only seeing the generated response...\n\nthe grounding stage also receives the original question.\n\nNow it can match facts against the user's intent.\n\n###  Retrieve more context\n\nThe retrieval step originally returned too few knowledge chunks.\n\nIncreasing the retrieval depth ensured the relevant statistic was almost always available.\n\n##  8. Bug #5 — Over-Grounding Broke Personal Conversations\n\nThen I asked:\n\n> Have you personally conducted market research?\n\nThe persona suddenly replied:\n\n> I don't have information about that.\n\nTechnically...\n\nThe grounding stage was correct.\n\nThe article didn't mention Jane's personal life.\n\nBut that wasn't the question. I wasn't asking about the article. I was asking Jane.\n\n###  The Fix\n\nBefore grounding begins, the system now classifies the question.\n\n\n\n    Personal Question?\n            │\n       Yes ─────► Leave response unchanged\n            │\n            No\n            ▼\n    Ground response from retrieved facts\n\n\nExamples:\n\n\n\n    Have you ever done market research?\n\n    → Personal\n\n\n\n    What percentage of businesses achieve higher growth?\n\n    → Factual\n\n\nThis tiny classifier completely changed the behavior.\n\n##  9. The Final Architecture\n\nToday the pipeline looks like this.\n\n\n\n    User Message\n          │\n          ▼\n\n    Message Processing\n\n          ▼\n\n    Knowledge Retrieval\n\n          ▼\n\n    Persona Generates Reply\n\n          ▼\n\n    Grounding Stage\n\n    • Personal Question?\n          │\n          ├── Yes → Keep response\n\n          └── No → Rebuild from source facts\n\n          ▼\n\n    Final Response\n\n          ▼\n\n    Conversation Saved\n\n\nThe result is a persona that can:\n\n  * accurately discuss project content\n  * quote facts correctly\n  * maintain its personality\n  * avoid inventing experiences\n  * switch naturally between factual and personal conversations\n\n\n\n##  10. Lessons Learned\n\n###  1. Log Your Retrieval\n\nThe most useful debugging signal wasn't inside the model. It was a simple retrieval count. Without that log, I would have blamed the AI instead of my own architecture.\n\n###  2. Prompt Order Matters\n\nEarlier instructions often dominate later ones. If one instruction should replace another... say so explicitly.\n\n###  3. Models Are Bad At Reviewing Their Own Work\n\n  * Asking a model to detect its own hallucinations isn't very reliable.\n  * Rebuilding from constrained facts worked far better than asking it to critique itself.\n\n\n\n###  4. Grounding Starts With Classification\n\n  * Not every question should be grounded.\n  * Some questions belong to the knowledge base.\n  * Others belong to the persona.\n  * Treating them the same breaks both.\n\n\n\n###  5. The Most Dangerous Hallucinations Are The Believable Ones\n\n  * Generic hallucinations are easy to spot.\n  * Profile-consistent hallucinations aren't.\n\n\n\nWhen Jane said:\n\n> \"I've been using these techniques in my own business...\"\n\nEveryone believed her. Including me.\n\n> Those are the hallucinations worth worrying about.\n\n##  11. Final Thoughts\n\nBefore this project, I thought hallucinations were mostly a prompt engineering problem.\n\nI was wrong.\n\nThey turned out to be:\n\n  * retrieval problems\n  * synchronization problems\n  * prompt precedence problems\n  * architecture problems\n  * validation problems\n  * classification problems\n\n\n\nEvery fix uncovered another hidden weakness.\n\nBut that's exactly how production systems improve.\n\nToday the persona can accurately discuss project content, answer factual questions without inventing details, and still hold natural conversations as a believable character.\n\nThe biggest lesson wasn't learning how to write a better prompt.\n\nIt was learning that **reducing hallucinations is an architectural problem, not just a prompting problem.**",
  "title": "How I Built a Persona Chat Agent and Fought Hallucination — A RAG Story"
}
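
The grounding stage that the record's `textContent` describes in sections 6 to 8 (classify the question, then either keep the persona's reply or rebuild it from numbered facts) might look roughly like the sketch below. The post does not show its code, so every name here is hypothetical: `is_personal_question`, `split_into_facts`, `build_grounding_prompt`, `ground_response`, and the `rewrite` callable that stands in for whatever model call the real system makes. The regex classifier is a stand-in assumption; the post does not say how its classifier works.

```python
import re
from typing import Callable

# Hypothetical cue list: a crude stand-in for the post's question classifier.
PERSONAL_CUES = re.compile(
    r"\b(have you|do you|did you|are you|were you|your)\b", re.IGNORECASE
)


def is_personal_question(question: str) -> bool:
    """Guess whether the user is asking the persona about itself."""
    return bool(PERSONAL_CUES.search(question))


def split_into_facts(article_text: str) -> list[str]:
    """Turn each non-empty line into one fact, dropping list markers."""
    facts = []
    for line in article_text.splitlines():
        cleaned = re.sub(r"^\s*(?:[-*•]|\d+[.)])\s*", "", line).strip()
        if cleaned:
            facts.append(cleaned)
    return facts


def build_grounding_prompt(facts: list[str], question: str, draft: str) -> str:
    """Assemble the rewrite instruction from facts, question, and draft."""
    numbered = "\n".join(f"{i}. {fact}" for i, fact in enumerate(facts, start=1))
    return (
        "Rewrite the answer using only these numbered facts.\n"
        "Do not summarize, invent, or elaborate.\n\n"
        f"Facts:\n{numbered}\n\n"
        f"Question: {question}\n\n"
        f"Draft (use for tone only): {draft}\n"
    )


def ground_response(
    question: str,
    draft: str,
    facts: list[str],
    rewrite: Callable[[str], str],
) -> str:
    """Keep personal replies as-is; rebuild factual ones from the facts."""
    if is_personal_question(question):
        return draft
    return rewrite(build_grounding_prompt(facts, question, draft))
```

Passing the model call in as `rewrite` keeps the routing logic testable without a model. A keyword heuristic like this one will misroute some questions (for example, a factual question that happens to contain "your"), so a real system would likely need a stronger classifier.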