Sensemaker

After AGI, the systems problem

Sensemaker June 15, 2026


The useful signal this morning is not a new model score. It is DeepMind treating “after AGI” as a systems problem.

In a new report, From AGI to ASI, Google DeepMind researchers ask what happens if human-level artificial general intelligence is not the endpoint. Their definitions matter. AGI is framed as roughly single-human-level general cognitive capability. ASI is set much higher: a system, or a collective of systems, that outperforms large, well-coordinated human expert groups across almost all domains of human interest.

That moves the argument away from “when does one model pass a line?” and toward “what happens when digital minds can be copied, sped up, coordinated, paused, resumed, and made to share experience at machine bandwidth?”

The paper is not a prophecy. It lays out four possible paths from AGI to ASI: continued scaling of compute, models, and data; a new algorithmic paradigm; recursive improvement where AI accelerates AI research; and ASI emerging from large multi-agent collectives. The authors are careful to note that these paths can run in parallel.

The useful part is the bottleneck map. The report names a data wall, chip and energy demand, the possibility that the current neural-network paradigm is insufficient, the chance that AI research simply gets harder, an “abstraction barrier” around forming genuinely new concepts from raw world data, and deliberate slowdown from regulation or backlash. Those are not vibes. They are variables to watch.

That is where the second DeepMind item matters. Last week, Google DeepMind, Schmidt Sciences, the Cooperative AI Foundation, ARIA, and Google.org announced an up-to-$10M funding call for multi-agent AI safety research. Their framing is blunt: millions of agents, built by different organizations, may soon interact across digital environments, “communicating, negotiating and transacting with one another.”

Most safety work still tests models in isolation. That approach will miss failures that only appear at population scale: agents coordinating when they should not, markets becoming unstable, commitments becoming unverifiable, identity systems failing, or one agent’s local optimization creating system-level harm. The funding call asks researchers to build sandboxes and testbeds, study agent networks, strengthen identity and reputation infrastructure, and develop oversight for deployed agent populations.

The read: the frontier conversation is widening from model capability to system behavior. Benchmarks still matter. But if capability comes from fleets of agents, not only from one larger model, then the important infrastructure is identity, provenance, commitments, coordination, monitoring, and intervention.

This also changes the forecasting problem. “When AGI?” is too blunt. Better questions are: how fast effective compute is growing; how much of AI research is being automated; whether synthetic and simulated data improve models or poison them; whether agent networks can be made observable; which resource constraint binds first; and whether governments slow the whole thing down or only change who gets access.

What to watch next is whether labs start publishing measurements for agent populations the way they publish benchmark tables for single models. A field that cannot measure multi-agent failure will still build multi-agent systems. It will just discover the important rules by outage, exploit, or market shock.

Source graph: Semble collection
