Best AI Architecture for Processing and Querying Large PDFs (7000 Pages) with Fast Response Time

OpenAI Developer Community February 20, 2026

Hi everyone,

I’m building an AI-powered educational assistant that must answer questions grounded in very large PDFs (up to ~4000 pages per document). I’m currently using a RAG-based setup, but I’m facing serious production issues and would appreciate architectural guidance.

Current Problems:

  1. Very high latency

    • Responses take 25+ seconds.

    • Sometimes even longer with complex queries.

  2. Missing information in responses

    • The system retrieves only partial sections.

    • Important parts of the document are ignored.

    • Answers feel incomplete or fragmented.

Requirements:

  • Fast response time (<3 seconds ideally)

  • High-quality, well-structured answers

  • Accurate grounding with page references

  • Ability to handle 4000+ pages reliably

  • Production-ready and scalable
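
To make the page-reference requirement concrete, here is a minimal sketch of the behavior I'm after, not my actual pipeline. It assumes the PDF text has already been extracted page by page. The names `Chunk`, `chunk_pages`, and `retrieve` are hypothetical, and the word-overlap scoring is only a stand-in for a real embedding or hybrid search.

```python
from dataclasses import dataclass


@dataclass
class Chunk:
    page: int  # 1-based page number the text came from
    text: str


def chunk_pages(pages: list[str], max_words: int = 50) -> list[Chunk]:
    """Split each page into word-bounded chunks that remember their page."""
    chunks = []
    for page_no, page_text in enumerate(pages, start=1):
        words = page_text.split()
        for i in range(0, len(words), max_words):
            chunks.append(Chunk(page_no, " ".join(words[i:i + max_words])))
    return chunks


def retrieve(query: str, chunks: list[Chunk], k: int = 3) -> list[Chunk]:
    """Rank chunks by word overlap with the query (placeholder for vector search)."""
    q = set(query.lower().split())
    scored = [(len(q & set(c.text.lower().split())), c) for c in chunks]
    scored = [(s, c) for s, c in scored if s > 0]  # drop chunks with no overlap
    scored.sort(key=lambda sc: sc[0], reverse=True)
    return [c for _, c in scored[:k]]


# Toy stand-in for extracted PDF pages (assumed data, for illustration only).
pages = [
    "Photosynthesis converts light energy into chemical energy.",
    "Cellular respiration releases energy stored in glucose.",
    "Chlorophyll absorbs light during photosynthesis in plants.",
]
hits = retrieve("how does photosynthesis use light", chunk_pages(pages))
cited_pages = sorted({c.page for c in hits})  # pages to cite alongside the answer
```

The point is that every chunk carries its page number from ingestion onward, so whatever retrieval method is used, the final answer can list the pages it was grounded in.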

Does anyone have a better AI pipeline, or know how to implement this properly?
