External Publication

“It’s the Architecture, Stupid” — Why Prompt Engineering Won’t Fix Agents

Hugging Face Forums [Unofficial] April 14, 2026

To borrow from the classic “it’s the economy, stupid”: the same applies here. We’re blaming prompts for what is fundamentally an architectural problem.

Paper: Beyond Prompting: Decoupling Cognition from Execution in LLM-based Agents through the ORCA Framework

Code: gfernandf/agent-skills on GitHub (“Agents should execute whenever possible”: runtime for composable AI agent skills)

We keep pretending that better prompts will fix LLM agents.

They won’t.

We’ve built an entire ecosystem of tooling, courses, and “best practices” around prompt engineering — as if the problem were linguistic.

It’s not.

It’s architectural.

The uncomfortable truth

Let’s be honest about what most agent systems are doing today:

Take a task
Generate a prompt
Call the model
Hope it “reasons” correctly
Repeat
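To make that loop concrete, here is a minimal sketch of it in Python. Everything in it is an assumption for illustration: `run_agent` and `fake_model` are hypothetical names, and `fake_model` is a stub standing in for a real LLM call so the snippet runs offline.

```python
from typing import Callable

def run_agent(task: str, call_model: Callable[[str], str], max_steps: int = 3) -> list[str]:
    """Naive agent loop: every step rebuilds the full prompt from scratch."""
    history: list[str] = []
    for _ in range(max_steps):
        # The whole history is re-serialized into the prompt on every step.
        prompt = f"Task: {task}\n" + "\n".join(history) + "\nNext step:"
        reply = call_model(prompt)
        history.append(reply)
        if reply.startswith("DONE"):
            break
    return history

# Hypothetical stub in place of a real model call; it only records prompts.
calls: list[str] = []

def fake_model(prompt: str) -> str:
    calls.append(prompt)
    return "DONE" if len(calls) % 3 == 0 else f"step {len(calls)}"

first = run_agent("summarize report", fake_model)
second = run_agent("summarize report", fake_model)

# Same task, run twice: nothing from the first run is reused.
print(len(calls))
```

Running the identical task a second time costs exactly as many model calls as the first run, because the loop keeps nothing between runs.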

This is not a system.

This is recomputation disguised as intelligence.

We are replaying cognition, not building it

Every time your agent runs, it:

Reconstructs context
Rebuilds reasoning
Re-derives intermediate steps

There is no reuse of cognition.

No structure. No persistence. No abstraction layer.

Just prompts.

We are not building systems. We are replaying thoughts.

Why prompt engineering feels like it works (until it doesn’t)

Prompt engineering gives the illusion of control:

Add more instructions
Add more examples
Add more constraints

And yes — performance improves.

Until it plateaus.

Because everything still lives inside a single model call and its context window:

no memory of reasoning
no composability
no reuse

It’s like trying to fix software architecture by writing better comments.

The real problem is architectural

The core issue is simple:

We are using LLMs as stateless reasoning engines.

And then compensating for that with increasingly complex prompts.

Instead of:

modeling cognition
structuring reasoning
reusing intermediate steps

We regenerate everything every time.

That doesn’t scale.

Not in cost. Not in latency. Not in reliability.
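One way to picture "reusing intermediate steps" is a cache keyed by step name and input. The sketch below is an assumption of mine, not something taken from the paper or the repo: `StepCache` and `extract_keywords` are hypothetical names, and the "expensive" step is a plain function standing in for a model call.

```python
import hashlib
import json
from typing import Any, Callable

class StepCache:
    """Stores the result of a step, keyed by the step's name and its input."""

    def __init__(self) -> None:
        self._store: dict[str, Any] = {}
        self.misses = 0  # how many times a step actually had to run

    def run(self, step_name: str, fn: Callable[[Any], Any], payload: Any) -> Any:
        # The payload must be JSON-serializable so it can be hashed into a key.
        raw = json.dumps([step_name, payload], sort_keys=True)
        key = hashlib.sha256(raw.encode("utf-8")).hexdigest()
        if key not in self._store:
            self.misses += 1
            self._store[key] = fn(payload)
        return self._store[key]

def extract_keywords(text: str) -> list[str]:
    # Hypothetical stand-in for an expensive model call.
    return sorted({w.lower() for w in text.split() if len(w) > 4})

cache = StepCache()
first = cache.run("extract_keywords", extract_keywords, "Agents replay reasoning every run")
second = cache.run("extract_keywords", extract_keywords, "Agents replay reasoning every run")

print(cache.misses)  # the step only ran once
```

This only makes sense when a step is treated as deterministic for a given input. Whether a sampled model output should be frozen that way is a design decision, not a given.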

What’s actually missing

What’s missing is not a better prompt.

It’s a runtime layer that:

encodes reusable cognitive steps
separates reasoning into structured components
allows composition instead of regeneration

In other words:

a system that reuses cognition instead of recomputing it.
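As a rough illustration of what such a layer could look like, here is a small registry that stores named steps and composes them into a pipeline. This is a sketch under my own assumptions, not ORCA's actual API: `Runtime`, `register`, and `compose` are hypothetical names, and the two steps are trivial placeholders.

```python
from typing import Callable

Step = Callable[[dict], dict]

class Runtime:
    """Holds named, reusable steps and composes them into pipelines."""

    def __init__(self) -> None:
        self._steps: dict[str, Step] = {}

    def register(self, name: str) -> Callable[[Step], Step]:
        def decorator(fn: Step) -> Step:
            self._steps[name] = fn
            return fn
        return decorator

    def compose(self, *names: str) -> Step:
        # Look the steps up once; an unknown name raises KeyError here.
        steps = [self._steps[n] for n in names]

        def pipeline(state: dict) -> dict:
            for step in steps:
                # Each step reads the shared state and returns only what it adds.
                state = {**state, **step(state)}
            return state

        return pipeline

runtime = Runtime()

@runtime.register("normalize")
def normalize(state: dict) -> dict:
    return {"text": state["text"].strip().lower()}

@runtime.register("count_words")
def count_words(state: dict) -> dict:
    return {"word_count": len(state["text"].split())}

analyze = runtime.compose("normalize", "count_words")
result = analyze({"text": "  Hello Agent World  "})
print(result)
```

The point of the sketch is the shape: steps are defined once, addressed by name, and combined by the runtime rather than re-described in a prompt each time.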

From prompts to skills (and where ORCA fits)

Instead of:

→ Prompt → Model → Output

You need:

→ Skill → Execution → Structured Output

Not conceptually. Operationally.

This is exactly what ORCA implements: a runtime layer where “skills” are reusable cognitive units — not prompts.

defined inputs
structured outputs
explicit execution

No recomputation. No guesswork.
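For readers who want something tangible, here is a hedged sketch of a "skill" with those three properties. It is not ORCA's API: `Skill`, `TicketInput`, `TicketOutput`, and `classify` are hypothetical names I made up for illustration, and the classification logic is a placeholder where a real skill might call a model.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass(frozen=True)
class TicketInput:
    text: str

@dataclass(frozen=True)
class TicketOutput:
    category: str
    urgent: bool

@dataclass(frozen=True)
class Skill:
    name: str
    input_type: type    # defined input
    output_type: type   # structured output
    run: Callable       # the function that does the work

    def execute(self, payload):
        # Explicit execution: check the contract on the way in and out.
        if not isinstance(payload, self.input_type):
            raise TypeError(f"{self.name}: expected {self.input_type.__name__} input")
        result = self.run(payload)
        if not isinstance(result, self.output_type):
            raise TypeError(f"{self.name}: expected {self.output_type.__name__} output")
        return result

def classify(ticket: TicketInput) -> TicketOutput:
    # Placeholder logic; a real skill might delegate to a model here.
    urgent = "outage" in ticket.text.lower()
    return TicketOutput(category="incident" if urgent else "question", urgent=urgent)

classify_ticket = Skill("classify_ticket", TicketInput, TicketOutput, classify)
result = classify_ticket.execute(TicketInput(text="Production outage in EU region"))
print(result)
```

Passing a raw string instead of a `TicketInput` fails immediately with a `TypeError`, which is the difference from a prompt: the contract is checked by the runtime, not implied by wording.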

Why most agent frameworks hit a wall

Most “agent frameworks” today are:

prompt orchestration layers
tool wrappers
retry loops with better formatting

They don’t model cognition.

They orchestrate prompts.

That’s not a runtime.

The shift we actually need

The shift is not better prompting.

It’s architectural.

From:

stateless generation

To:

structured, reusable cognition

That’s the gap ORCA is designed to close.

Final thought

Prompt engineering isn’t useless.

It’s just solving the wrong problem.

We’ve been optimizing the interface instead of the system.

And it shows.

If you’ve pushed prompt engineering far enough, you’ve seen the limit.

The question is:

are you ready to try what replaces it?