A Persistent 8–15% Failure Rate Across Domains: Evidence for an Epistemic Boundary
LLMs can’t fix their own epistemic blind spots.
Not with better prompting. Not with stricter verification. Not with more sophisticated agents.
The MarCognity benchmark reveals a persistent 8–15% failure rate across 8 domains , even when the system is equipped with a full metacognitive layer.
This residual error is not noise.
It may be an emergent property of autoregressive generation itself — a structural boundary where the epistemic space accessible to a probabilistic model is narrower than the space of claims requiring justification.
I call this the Epistemic Boundary.
Resources:
• Zenodo : https://doi.org/10.5281/zenodo.19020581
• GitHub : GitHub - elly99-AI/MarCognity-AI: A research framework for structured LLM evaluation, claim verification and reflective reasoning mechanisms. · GitHub
Discussion in the ATmosphere