Best AI Architecture for Processing and Querying Large PDFs (4000+ Pages) with Fast Response Time
Hi everyone,
I’m building an AI-powered educational assistant that must answer questions grounded in very large PDFs (up to ~4000 pages per document). I’m currently using a RAG-based setup, but I’m facing serious production issues and would appreciate architectural guidance.
Current Problems:
Very high latency: responses take 25+ seconds, and sometimes longer for complex queries.
Missing information in responses: the system retrieves only partial sections and ignores important parts of the document, so answers feel incomplete or fragmented.
Requirements:
Fast response time (<3 seconds ideally)
High-quality, well-structured answers
Accurate grounding with page references
Ability to handle 4000+ pages reliably
Production-ready and scalable
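To make the page-reference requirement concrete, here is a minimal sketch of the behaviour I'm after. It is not my production code. It assumes the PDF has already been extracted into one text string per page, and every name in it (`Chunk`, `chunk_pages`, `retrieve`) is hypothetical. The scoring is a simple TF-IDF-style word overlap that stands in for whatever vector or hybrid search a real pipeline would use. The point is only that each chunk keeps its page number from ingestion through to the final answer.

```python
import math
import re
from collections import Counter
from dataclasses import dataclass


@dataclass
class Chunk:
    page: int  # 1-based page number the text was extracted from
    text: str


def tokenize(text: str) -> list[str]:
    """Lowercase and split into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def chunk_pages(pages: list[str], max_words: int = 120) -> list[Chunk]:
    """Split each page into word windows, keeping the page number on every chunk."""
    chunks = []
    for page_no, page_text in enumerate(pages, start=1):
        words = page_text.split()
        for start in range(0, len(words), max_words):
            chunks.append(Chunk(page_no, " ".join(words[start:start + max_words])))
    return chunks


def retrieve(query: str, chunks: list[Chunk], k: int = 3) -> list[tuple[float, Chunk]]:
    """Rank chunks by a TF-IDF-style overlap score.

    Assumption: this lexical score is only a placeholder for a real
    embedding or hybrid search; it is here so the sketch runs on its own.
    """
    doc_tokens = [Counter(tokenize(c.text)) for c in chunks]
    n = len(chunks)
    # Document frequency: in how many chunks each token appears.
    df = Counter(t for toks in doc_tokens for t in toks)
    query_terms = set(tokenize(query))
    scored = []
    for chunk, toks in zip(chunks, doc_tokens):
        score = sum(toks[t] * math.log(1 + n / df[t]) for t in query_terms if t in toks)
        if score > 0:
            scored.append((score, chunk))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


if __name__ == "__main__":
    # Made-up sample pages, for illustration only.
    pages = [
        "Photosynthesis converts light energy into chemical energy in plants.",
        "Cellular respiration releases energy from glucose in mitochondria.",
        "The French Revolution began in 1789.",
    ]
    for score, chunk in retrieve("How does photosynthesis work in plants?", chunk_pages(pages)):
        print(f"p.{chunk.page}: {chunk.text}")
```

The retrieved chunks, with their page numbers, are what would be passed to the LLM as context, so the answer can cite pages directly instead of guessing them.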
Does anyone have a better AI pipeline, or know how to implement this properly?