Multi-turn RAG for Technical Documentation: Using Context-Aware Query Rewriting + Semantic Caching — Is This a Sound Approach?
Hugging Face Forums [Unofficial]
May 26, 2026
This sounds to me like two issues:
1. A search issue
2. A fetch issue
I'm moving to make conversational anchor documents part of my conversational flow. Every so often I have the AI create an anchor document of the conversation, as well as a primer that summarizes it. For both the anchor document and the primer, I have worked out formats that spell out what needs to be transcribed.
The reason for this approach is simple: the lost-in-the-middle problem.
Pulling from context gets harder and harder for the AI as the conversation extends. One of the most notable patterns is lost in the middle: information from the beginning of a conversation and from the most recent turns is more retrievable than anything in the middle, because content at the beginning and in the recent turns tends to carry more weight in the model's attention.
Anchor blocks mitigate this by ‘refreshing’ the conversation. A lot of people know about anchor blocks, but I don't know how many know that you can configure both anchor block timing and content constraints, meaning you can write instructions for what the anchor block should include.
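To make the timing and content constraints concrete, here is a minimal sketch. The names (AnchorPolicy, should_anchor, build_anchor_prompt), the default interval, and the section titles are all made up for illustration; they are not from any library or from my actual setup.

```python
from dataclasses import dataclass, field


@dataclass
class AnchorPolicy:
    # Timing: emit an anchor block every N turns (hypothetical default).
    every_n_turns: int = 10
    # Content constraints: sections the anchor block must contain (hypothetical).
    required_sections: list[str] = field(
        default_factory=lambda: ["Goals", "Decisions", "Open questions"]
    )
    max_words: int = 300


def should_anchor(turn_index: int, policy: AnchorPolicy) -> bool:
    """Return True when the 1-based turn index lands on an anchor boundary."""
    return turn_index > 0 and turn_index % policy.every_n_turns == 0


def build_anchor_prompt(policy: AnchorPolicy) -> str:
    """Build the instruction that tells the model what the anchor must include."""
    sections = "\n".join(f"- {name}" for name in policy.required_sections)
    return (
        "Write an anchor block for the conversation so far.\n"
        f"Keep it under {policy.max_words} words and include these sections:\n"
        f"{sections}"
    )
```

In a chat loop you would call should_anchor on each turn and, when it returns True, send build_anchor_prompt(policy) to the model and store the reply.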
When you use these as physical documents, you can work them into your RAG system.
That's part of the fetch issue. Once you resolve that, ‘fetch’ also looks for physical files instead of digging through the ‘murky swamp’ of the context, and then you can look at search.
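A rough sketch of what "physical" anchor documents could look like on disk, assuming one folder per conversation and numbered markdown files. The layout and the function names (save_anchor, load_anchors) are hypothetical, just one way to do it.

```python
import tempfile
from pathlib import Path


def save_anchor(root: Path, conversation_id: str, seq: int, text: str) -> Path:
    """Write one anchor document to <root>/<conversation_id>/anchor_NNNN.md."""
    folder = root / conversation_id
    folder.mkdir(parents=True, exist_ok=True)
    path = folder / f"anchor_{seq:04d}.md"
    path.write_text(text, encoding="utf-8")
    return path


def load_anchors(root: Path, conversation_id: str, latest: int = 3) -> list[str]:
    """Fetch the most recent anchor documents, oldest first."""
    folder = root / conversation_id
    if not folder.is_dir():
        return []
    # Zero-padded sequence numbers make lexical order match creation order.
    paths = sorted(folder.glob("anchor_*.md"))
    return [p.read_text(encoding="utf-8") for p in paths[-latest:]]


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        save_anchor(root, "conv-1", 1, "First anchor")
        save_anchor(root, "conv-1", 2, "Second anchor")
        print(load_anchors(root, "conv-1"))
```

The point is that fetch becomes a file lookup with a predictable path, so the retrieval step no longer depends on what the model can still recall from a long context.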
I've been working on a ‘fuzzy search’ system. In this system you set some hard parameters, but you write the search instruction to be ‘vague’, so that it not only looks for the hard-coded tags and such but can also pull related topics, in case you have 900 things it should latch on to but can only remember 3.
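Here is a small sketch of the hard-parameter-plus-vague-matching idea. To be clear about the swap: in my description the "vague" part is a search instruction, while this example stands in plain string similarity from Python's difflib for it. The Doc fields, the project filter, and the 0.6 threshold are assumptions for illustration.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher


@dataclass
class Doc:
    name: str
    project: str            # hard parameter: must match exactly
    tags: tuple[str, ...]   # matched loosely against the wanted tags


def similarity(a: str, b: str) -> float:
    """Character-level similarity between two strings, from 0.0 to 1.0."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def fuzzy_search(docs, project, wanted_tags, threshold=0.6):
    """Filter on the hard parameter, then rank by loose tag similarity."""
    results = []
    for doc in docs:
        if doc.project != project:
            continue  # hard constraint, no fuzziness here
        # For each wanted tag, take its best match among the doc's tags.
        score = sum(
            max((similarity(w, t) for t in doc.tags), default=0.0)
            for w in wanted_tags
        ) / max(len(wanted_tags), 1)
        if score >= threshold:
            results.append((score, doc.name))
    # Highest score first; file name breaks ties for stable output.
    results.sort(key=lambda item: (-item[0], item[1]))
    return [name for _, name in results]
```

So a search for the tag "caching" still picks up a file tagged "cache", while the project filter stays strict. Swapping similarity for an embedding comparison would catch related topics that do not share spelling.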
Also as part of search, I'm working on a unified convention for file names and internal layouts, which makes search even more effective.
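As an illustration of why a naming convention helps, here is a sketch where the file name alone carries the searchable fields. The pattern (project__doctype__date__sequence.md) is a made-up example, not the convention I am actually settling on.

```python
import re
from typing import NamedTuple, Optional


class FileInfo(NamedTuple):
    project: str
    doc_type: str   # e.g. "anchor" or "primer"
    date: str       # YYYYMMDD
    seq: int


# Hypothetical convention: <project>__<doctype>__<YYYYMMDD>__<NNN>.md
NAME_PATTERN = re.compile(
    r"^(?P<project>[a-z0-9-]+)__(?P<doc_type>[a-z]+)"
    r"__(?P<date>\d{8})__(?P<seq>\d{3})\.md$"
)


def build_name(info: FileInfo) -> str:
    """Render a file name that follows the convention."""
    return f"{info.project}__{info.doc_type}__{info.date}__{info.seq:03d}.md"


def parse_name(name: str) -> Optional[FileInfo]:
    """Recover the fields from a file name, or None if it does not conform."""
    match = NAME_PATTERN.match(name)
    if match is None:
        return None
    return FileInfo(
        project=match["project"],
        doc_type=match["doc_type"],
        date=match["date"],
        seq=int(match["seq"]),
    )
```

With something like this, search can filter by project, document type, or date from a directory listing before it opens a single file.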
These methods are simple housekeeping techniques.
They may help in your case.
I think they are related because I too am building a type of 3-phase RAG system that works off of physical file storage, and this is part of the approach I am taking.