Raw Record Source

{
  "$type": "site.standard.document",
  "bskyPostRef": {
    "cid": "bafyreihx6atzmnu6sd772uzcvc45r4bpsgxk2gqwr5orlgijvh6ewih66i",
    "uri": "at://did:plc:25rdn5elo5izoxrmtis34zuk/app.bsky.feed.post/3mol5xxn37ib2"
  },
  "coverImage": {
    "$type": "blob",
    "ref": {
      "$link": "bafkreiajwg6ycxqsk4ywxv26aqub5i5j7yyfs26knlqe6egx62o5fkxlti"
    },
    "mimeType": "image/webp",
    "size": 75494
  },
  "path": "/lukaswalter/agent-framework-rag-for-agents-giving-your-agent-the-right-context-1n15",
  "publishedAt": "2026-06-18T15:30:00.000Z",
  "site": "https://dev.to",
  "tags": [
    "ai",
    "dotnet",
    "csharp",
    "tutorial",
    "lukaswalter.dev",
    "previous article",
    "AI context provider",
    "PostgreSQL with pgvector",
    "Agent Framework RAG",
    "TextSearchProvider API reference",
    "Vector databases for .NET AI apps",
    "The Microsoft.Extensions.VectorData library",
    "Process custom data for AI applications",
    "RAG Is a Data Problem Before It's a Prompt Problem",
    "Indirect Prompt Injection Is a Trust Boundary Problem"
  ],
  "textContent": "> _This is Part 13 of my series on the Microsoft Agent Framework. You can read the original post over on lukaswalter.dev._\n\nIn the previous article, we looked at workflows.\nWorkflows make sense when the process itself needs structure: state, checkpoints, events, human approvals, and resumable execution.\n\nThis post is the bridge from Agent Framework into RAG.\nI plan on doing a full RAG deep dive sometime later. The practical question for now is smaller:\n\nHow do I connect an Agent Framework agent to private application knowledge without stuffing every document into the prompt?\n\nFor agents, RAG is less about adding more text and more about giving the agent a controlled retrieval path.\nThe agent should fetch the right context at the point where it needs it.\n\n##  Agents do not know your private data\n\nYour company documents, product catalog, tickets, rules, policies, runbooks, and internal knowledge base live outside the model.\nThe model has generic knowledge. 
Your application has private knowledge.\nTreat those as separate systems.\n\nYou can paste some private data into the prompt, and for a demo that may be enough.\nBut this falls apart quickly:\n\n  * full documents are expensive to send repeatedly\n  * long prompts are fragile\n  * stale documents may sit next to current ones\n  * users may not be allowed to see every source\n  * long context still needs selection\n\n\n\nThe last point is easy to underestimate.\nA larger context window lets you send more text.\nIt does not decide which text is correct, current, relevant, or permitted.\n\nDo not give the agent all knowledge.\nGive it the right context at the moment it needs it.\n\nRetrieval owns that job.\n\n##  The minimal RAG shape\n\nThe basic RAG loop is small:\n\n\n\n    user question\n    -> retrieve relevant chunks\n    -> pass chunks to the agent\n    -> agent answers using that context\n\n\nFor documents, the longer pipeline usually looks like this:\n\n\n\n    documents\n    -> chunks\n    -> embeddings\n    -> vector store\n    -> search\n    -> retrieved context\n    -> agent response\n\n\nDocuments are split into smaller chunks.\nThose chunks are embedded into vectors.\nThe vectors and source metadata are stored.\nWhen a user asks a question, the question is embedded too.\nThe search layer finds nearby chunks and returns only those chunks to the agent.\n\nStop there for now.\n\nThere are some hard parts here:\nchunk boundaries, embedding model choice, hybrid search, reranking, freshness, access control, observability, and evals.\nThey are just not the point yet.\n\nFor now, keep the boundary clear:\n\nRAG is the retrieval layer around the agent.\nThe agent is not the retrieval layer.\n\n##  Agent Framework is not the RAG engine\n\nMicrosoft Agent Framework gives you the agent runtime.\nIt does not give you a finished ingestion pipeline, chunking strategy, embedding setup, vector store, ranking model, permission model, freshness process, or retrieval eval 
suite.\n\nAgent Framework helps you decide how the agent receives and uses context:\n\n  * you can retrieve context before calling the agent\n  * you can inject retrieved context through an AI context provider\n  * you can expose retrieval as a function tool\n  * you can make retrieval one step in a workflow\n\n\n\nThe retrieval system still belongs to your application architecture.\n\nIt might use Azure AI Search, PostgreSQL with pgvector, SQL Server vector search, Cosmos DB, Qdrant, Redis, a normal search index, or an internal HTTP API.\nThe agent does not need to care.\n\nThe agent needs a focused capability.\nNot direct database access.\n\n##  Retrieval as an agent tool\n\nFor many agent apps, I would start by exposing retrieval as a tool.\n\nThe tool is narrow:\n\n\n\n    SearchKnowledgeAsync(\n        string query,\n        string? category,\n        int limit)\n\n\nThe agent can call it when the answer depends on private knowledge.\nYour application decides what the tool is allowed to search.\n\nThis matches the tool-design rule from earlier in the series:\n\nTools should expose controlled capabilities, not raw infrastructure.\n\nA small version looks like this:\n\n\n\n    using System.ComponentModel;\n    using Microsoft.Agents.AI;\n    using Microsoft.Extensions.AI;\n    using Microsoft.Extensions.DependencyInjection;\n\n    public sealed record KnowledgeSearchResult(\n        string Title,\n        string Source,\n        string Snippet,\n        double Score);\n\n    public interface IKnowledgeSearch\n    {\n        Task<IReadOnlyList<KnowledgeSearchResult>> SearchAsync(\n            string query,\n            string? category,\n            int limit,\n            CancellationToken cancellationToken);\n    }\n\n    [Description(\"Searches approved internal knowledge articles, policies, and runbooks.\")]\n    public static Task<IReadOnlyList<KnowledgeSearchResult>> SearchKnowledgeAsync(\n        [Description(\"Focused search query. 
Rewrite the user's message into search terms.\")]\n        string query,\n        [Description(\"Optional source category such as policy, runbook, product, support, or architecture.\")]\n        string? category,\n        [Description(\"Maximum number of results to return. Use 3 to 5 for normal questions.\")]\n        int limit,\n        IServiceProvider services,\n        CancellationToken cancellationToken)\n    {\n        var search = services.GetRequiredService<IKnowledgeSearch>();\n\n        return search.SearchAsync(\n            query,\n            category,\n            Math.Clamp(limit, 1, 5),\n            cancellationToken);\n    }\n\n\nThe model supplies `query`, `category`, and `limit`.\nThe application supplies `IKnowledgeSearch`.\n\nKeep that split.\n\nThe model can ask for a search.\nIt does not get a connection string, a database client, or permission to browse every source.\n\nThen attach the tool to the agent:\n\n\n\n    AIAgent supportAgent = chatClient.AsAIAgent(\n        instructions: \"\"\"\n        You answer questions about the internal engineering platform.\n\n        Use SearchKnowledgeAsync when the answer depends on private company\n        documentation, runbooks, policies, known issues, or product rules.\n\n        If the search results do not contain enough evidence, say that the indexed\n        sources do not answer the question. Do not invent policy details, limits,\n        prices, permissions, or operational steps.\n        \"\"\",\n        tools: [AIFunctionFactory.Create(SearchKnowledgeAsync)],\n        services: app.Services);\n\n\nThe agent-side RAG flow is:\n\n  1. The user asks a question.\n  2. The agent decides it needs private knowledge.\n  3. The agent calls the retrieval tool with a focused query.\n  4. The application searches the allowed sources.\n  5. 
The agent receives a few results and answers from them.\n\n\n\nAt that point, retrieval is just another tool.\nThe pattern fits Agent Framework because tools already give you that controlled application boundary.\n\n##  The user message is not always the search query\n\nUsers ask messy questions.\n\nFor example:\n\n\n\n    What were the most important changes in our cancellation policy last year?\n\n\nA better retrieval query might be:\n\n\n\n    cancellation policy changes last year\n\n\nOr, if you expose metadata filters:\n\n\n\n    await SearchKnowledgeAsync(\n        query: \"cancellation policy changes last year\",\n        category: \"policy\",\n        limit: 5,\n        services,\n        cancellationToken);\n\n\nThe agent can help here.\nIt can translate a conversational request into a smaller retrieval query.\n\nBut do not overcomplicate this too early.\nStart by logging the generated tool query and checking whether it actually finds better results than the raw user message.\n\nBad query rewriting is worse than no query rewriting.\nIt can remove the term that mattered.\n\n##  Metadata filters keep retrieval inside the boundary\n\nVector similarity finds related text.\nIt does not know whether that text belongs to the right tenant, product, language, version, source system, or user permission scope.\n\nYou often need filters.\n\nCommon filters include:\n\n  * tenant\n  * user permissions\n  * document type\n  * product\n  * category\n  * language\n  * date\n  * version\n  * source system\n\n\n\nSome filters can be model supplied.\n`category` is a reasonable example because the model can often infer whether a question is about a policy, runbook, product, or support article.\n\nSome filters should not be model supplied.\n\nTenant, user ID, role, entitlement, and document permissions should come from your authenticated application context.\nThe model should not be allowed to say:\n\n\n\n    Search tenant = admin\n\n\nand suddenly see admin-only 
documents.\n\nA better application boundary looks like this:\n\n\n\n    public interface IKnowledgeSearch\n    {\n        Task<IReadOnlyList<KnowledgeSearchResult>> SearchAsync(\n            string query,\n            string? category,\n            int limit,\n            UserKnowledgeScope scope,\n            CancellationToken cancellationToken);\n    }\n\n\nThe tool can accept the search query and category.\nYour application adds `UserKnowledgeScope` from the current user.\n\nSimilarity search finds related text.\nMetadata filters keep the search inside the right boundary.\n\n##  Manual retrieval is still valid\n\nExposing retrieval as a tool is not the only option.\n\nFor a pure documentation assistant, you may not want the model to decide whether to search.\nYou may want retrieval on every request.\n\nPlain application code is enough:\n\n\n\n    IReadOnlyList<KnowledgeSearchResult> results =\n        await knowledgeSearch.SearchAsync(\n            query: userQuestion,\n            category: null,\n            limit: 5,\n            cancellationToken);\n\n    string context = string.Join(\n        \"\\n\\n\",\n        results.Select(result => $\"\"\"\n        Source: {result.Title}\n        {result.Snippet}\n        \"\"\"));\n\n    AgentResponse response = await supportAgent.RunAsync($\"\"\"\n        Answer the user's question using the retrieved context.\n        If the context is not enough, say so.\n\n        Retrieved context:\n        {context}\n\n        User question:\n        {userQuestion}\n        \"\"\",\n        cancellationToken: cancellationToken);\n\n\nYou can also use Agent Framework context providers, such as `TextSearchProvider`, when that fits your setup.\nThe tradeoff is the same either way:\n\n  * automatic retrieval is predictable\n  * retrieval as a tool is more selective\n\n\n\nIf almost every request needs private knowledge, retrieve before the agent call.\nIf retrieval is one capability among several, expose it as a tool.\n\n##  When 
RAG is the wrong tool\n\nRAG is for finding relevant context.\nCode is for exact operations.\n\nIf the user asks:\n\n\n\n    What are the top 5 products by revenue?\n\n\nthat should probably be SQL or an analytics API, not vector search.\n\nThe same applies to:\n\n  * exact lookups\n  * IDs\n  * prices\n  * current status\n  * rankings\n  * totals\n  * permissions\n  * deterministic business rules\n\n\n\nVector search is good at finding related text.\nIt is not a calculator, database constraint, authorization system, or reporting engine.\n\nIf the answer must be exact, use normal code behind a tool.\n\nFor example:\n\n\n\n    [Description(\"Returns the top products by revenue for an authorized reporting period.\")]\n    public static Task<IReadOnlyList<ProductRevenue>> GetTopProductsByRevenueAsync(\n        DateOnly from,\n        DateOnly to,\n        int limit,\n        IServiceProvider services,\n        CancellationToken cancellationToken)\n    {\n        var reporting = services.GetRequiredService<IRevenueReporting>();\n\n        return reporting.GetTopProductsByRevenueAsync(\n            from,\n            to,\n            Math.Clamp(limit, 1, 20),\n            cancellationToken);\n    }\n\n\nThis still gives the agent a tool.\nIt is just not RAG.\n\n##  When I would use this\n\nUse retrieval with an Agent Framework agent when:\n\n  * the answer depends on private documents or records\n  * the model's generic knowledge is not enough\n  * stuffing the prompt would be expensive or noisy\n  * the agent can benefit from searching only when needed\n  * the application can enforce source permissions behind the tool\n\n\n\nStart with a narrow search tool.\nLog the query the agent sends.\nLog the sources returned.\nCheck whether the answer actually used those sources.\n\nThat gives you enough signal to see where the retrieval design is weak.\n\n##  When I would not use this\n\nDo not use RAG when the task needs deterministic data access or computation.\n\nUse normal 
code for current state, totals, rankings, exact IDs, prices, permissions, and business rules.\n\nDo not use RAG as a way to bypass application boundaries.\nIf a user cannot access a document in the product, the retrieval tool should not return it to the agent.\n\nAlso avoid building the full ingestion and retrieval platform before you have a real use case.\nStart with one domain, a small corpus, and a handful of questions you can verify.\n\n##  Conclusion\n\nAgent Framework gives you a clean place to put retrieval into the agent loop.\nIt does not make RAG automatic.\n\nThe design I would carry forward is simple:\n\n  * keep private knowledge outside the base prompt\n  * expose retrieval as a focused capability\n  * let the application enforce permissions and filters\n  * give the agent only the context it needs\n  * use code, not RAG, for exact operations\n\n\n\nAs I said at the start, the full RAG deep dive will come later. The next Agent Framework post moves on to multimodal agents: images, PDFs, and provider differences.\nThe agent boundary gets messy there in a different way.\nSome providers can work with images or document inputs natively, some need different message formats, and some scenarios are still better handled by manual preprocessing before the agent sees anything.\n\n##  Further reading\n\n  * Agent Framework RAG\n  * TextSearchProvider API reference\n  * Vector databases for .NET AI apps\n  * The Microsoft.Extensions.VectorData library\n  * Process custom data for AI applications\n  * RAG Is a Data Problem Before It's a Prompt Problem\n  * Indirect Prompt Injection Is a Trust Boundary Problem\n\n",
  "title": "Agent Framework RAG for Agents: Giving Your Agent the Right Context"
}