What does the emerging AI agent stack actually look like?

Over the past year we’ve seen a lot of progress in agent frameworks and runtimes.

But something I’ve been trying to understand is what the emerging “agent stack” actually looks like.

Right now most discussions focus on frameworks:

LangChain
CrewAI
AutoGen

But when building real systems it feels like several layers are emerging:

Application layer
Agent framework (task orchestration, planning)
Identity / persona layer
Execution layer (tool calls, runtime environment)
Verification / governance
Infrastructure

Different projects seem to focus on different parts of this stack:

  • frameworks handle orchestration
  • guardrails handle policy enforcement
  • tracing systems capture execution history

One layer that still feels underdefined is the identity/persona layer — how agent roles and behavioral constraints are represented and reused.

In many systems persona is still implemented as prompts and configuration.

I’m curious how others see the agent stack evolving. Are these layers converging toward something stable, or are we still in the early experimentation phase?

hi @joy7758

I really appreciate your questions and concerns :slight_smile: I also constantly ask myself many questions to understand things that interest me, and that I sometimes need to understand for work :slight_smile:

I would answer “these layers are converging faster than most people realize”. IMHO the stack you’ve outlined maps surprisingly well to what’s actually being built in production today.
The patterns are solidifying, even if they aren’t always talked about as a unified stack.

I think the most mature layers are arguably the agent framework/orchestration layer and the execution layer. LangGraph has become the dominant orchestration primitive over the past year.

Regarding identity/persona layer - this is where I’d push back a bit on the “underdefined” characterization.
You are right that persona is still implemented via prompts and configuration, but the architecture around it has gotten significantly more sophisticated.

The key innovation is the “middleware stack pattern”.
In deepagents, identity isn’t a single system prompt - it’s a composable pipeline of middleware, each of which contributes to the agent’s persona and capabilities.

Each middleware dynamically injects context into the system prompt at runtime. SkillsMiddleware is one example.

This means persona is:

  • composable - different middleware contribute different aspects of identity
  • layered - base behaviors can be overridden by more specific ones
  • dynamic - middleware can modify the prompt based on runtime state
  • reusable - the same skill definitions work across different agents
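
To make the composition idea concrete, here is a minimal plain-Python sketch. It is not the deepagents API: `Middleware`, `base_persona`, `skills_middleware`, and `build_system_prompt` are hypothetical names I made up for illustration. Each middleware is just a function that receives the prompt built so far plus the runtime state and returns an extended prompt.

```python
from typing import Any, Callable, Dict, List

# Hypothetical types for illustration only; not taken from any framework.
State = Dict[str, Any]
Middleware = Callable[[str, State], str]


def base_persona(prompt: str, state: State) -> str:
    # Base behavior that later middleware can extend.
    return prompt + "You are a careful research assistant.\n"


def skills_middleware(prompt: str, state: State) -> str:
    # Inject whatever skills are available in the current runtime state.
    names = state.get("skills", [])
    if not names:
        return prompt
    return prompt + "Available skills: " + ", ".join(names) + "\n"


def build_system_prompt(pipeline: List[Middleware], state: State) -> str:
    # Run the middleware in order; each one contributes part of the persona.
    prompt = ""
    for middleware in pipeline:
        prompt = middleware(prompt, state)
    return prompt


if __name__ == "__main__":
    state = {"skills": ["search", "summarize"]}
    print(build_system_prompt([base_persona, skills_middleware], state))
```

The same `skills_middleware` function can be dropped into a different agent's pipeline unchanged, which is the "reusable" property in the list above.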

Similarly, MemoryMiddleware loads persistent context from structured files (like AGENTS.md), so an agent’s accumulated knowledge becomes part of its identity across sessions.
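
A rough sketch of the file-backed memory idea, again in plain Python rather than the real MemoryMiddleware: the function name `load_memory` and the "Persistent context" heading are my own assumptions, and only the AGENTS.md file name comes from the description above.

```python
from pathlib import Path


def load_memory(prompt: str, memory_file: Path) -> str:
    """Append the contents of a persistent notes file to the prompt, if present."""
    if not memory_file.is_file():
        # No memory yet: leave the prompt unchanged.
        return prompt
    notes = memory_file.read_text(encoding="utf-8").strip()
    if not notes:
        return prompt
    # The section heading is an arbitrary choice for this sketch.
    return f"{prompt}\n\n## Persistent context\n{notes}"


if __name__ == "__main__":
    print(load_memory("You are a careful research assistant.", Path("AGENTS.md")))
```

Because the file outlives any single run, whatever the agent writes there is carried into the next session's prompt.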

Is this a dedicated identity object you can serialize and port between runtimes? Not yet. But the middleware composition pattern is a real architectural layer, not just “prompts and configuration”. It’s closer to dependency injection for agent behavior.

As for the “verification/governance layer”, I believe this is the layer that’s maturing fastest right now, driven by enterprise demand. Three mechanisms are converging:

  • checkpointing (audit trail)
  • human-in-the-loop (runtime governance)
  • observability (LangSmith)
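
Here is a small sketch of how the first two mechanisms can fit together. It is a framework-agnostic toy, not LangGraph or LangSmith code: `Checkpoint`, `GovernedRunner`, and the `approve` callback are hypothetical names. Every step is recorded (the audit trail), and steps flagged as risky only run if the approval callback says yes (the human-in-the-loop gate).

```python
from dataclasses import dataclass, field
from typing import Callable, List, Set


@dataclass
class Checkpoint:
    # One audit-trail entry per attempted step.
    step: int
    action: str
    approved: bool


@dataclass
class GovernedRunner:
    approve: Callable[[str], bool]  # stand-in for a human reviewer
    risky: Set[str]                 # actions that require approval
    history: List[Checkpoint] = field(default_factory=list)

    def run(self, actions: List[str]) -> List[str]:
        executed = []
        for step, action in enumerate(actions):
            # Safe actions pass automatically; risky ones need approval.
            ok = action not in self.risky or self.approve(action)
            self.history.append(Checkpoint(step, action, ok))
            if ok:
                executed.append(action)
        return executed


if __name__ == "__main__":
    runner = GovernedRunner(approve=lambda a: False, risky={"delete_db"})
    print(runner.run(["search", "delete_db", "summarize"]))
    print(runner.history)
```

In a real system the `approve` callback would pause the run and wait for a person, and the history would go to durable storage, but the control flow is the same.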

So, converging or experimental?

Converging. That’s my personal answer.

  • Orchestration - settled
  • Execution - stable
  • Governance - rapidly maturing
  • Identity/persona - it’s there, though not yet portable; the architectural patterns are real and reusable
  • Infrastructure - it’s there

And again, thanks for posting this - it really resonates with me :smiley: :heart_hands:

That’s a really interesting perspective.

The middleware composition pattern you describe makes a lot of sense — especially the idea of persona emerging from a pipeline rather than a single prompt.

What I’m still wondering about is the portability aspect.

Right now middleware gives us composability inside a framework, but it still feels tightly coupled to that runtime (LangGraph, DeepAgents, etc.).

Do you think the ecosystem will eventually move toward something like a portable identity or persona representation that can travel across runtimes?

Or do you think middleware composition will remain the dominant pattern?