Handling multi-turn conversations with sub-agents in a deep agent architecture

Hello team,

We currently have a global chat page where users interact with an agent. We’ve architected this as a deep agent with sub-agents, each responsible for a slightly different part of the workflow. For example, user onboarding is one sub-agent, performing user actions is another, and there’s one more sub-agent beyond that. Looking closely at this architecture, there’s one problem.

User onboarding is a multi-turn conversation, and the same is true for our other sub-agents: they’re all multi-turn by nature.

So with our current setup, the orchestrator (the deep agent) has to re-delegate to the relevant sub-agent on every single turn, even when that turn is part of the same ongoing flow.

How should we handle scenarios like this? I haven’t found much in the docs about managing multi-turn conversations within this kind of sub-agent architecture.

Hi @SujayPrabhu,

The built-in synchronous task sub-agent in Deep Agents is stateless and ephemeral by design: each delegation gets a fresh sub-agent whose entire memory is the description string you pass it. It runs autonomously, returns one final message, and is discarded. So with task sub-agents, re-delegating on every turn is the only option, and even then the sub-agent starts cold each time.

You can see it in the source: on every task call, the sub-agent’s history is overwritten with a single message:

# deepagents/middleware/subagents.py → _validate_and_prepare_state
subagent_state["messages"] = [HumanMessage(content=description)]

and only its last message comes back as a ToolMessage. The docs say the same: “Subagents are stateless by design… subagents start fresh each time… repeats the full flow” (multi-agent docs), and in their capability table subagents score nothing for “Direct user interaction.”
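To make the consequence concrete, here is a toy model of that behavior in plain Python. It does not import Deep Agents; `run_task_subagent` and the dict-based messages are made up for illustration. Each call rebuilds the history from the description alone, so nothing from an earlier turn is visible unless the orchestrator copies it into the description.

```python
def run_task_subagent(description: str) -> dict:
    """Toy stand-in for a stateless task sub-agent (not the real deepagents API)."""
    # The history is rebuilt on every call: only the description survives.
    history = [{"role": "user", "content": description}]
    started_with = len(history)
    # Pretend the sub-agent worked autonomously and produced one final answer.
    history.append({"role": "assistant", "content": f"Handled: {description}"})
    # Only the last message goes back to the orchestrator; the rest is dropped.
    return {"started_with": started_with, "final": history[-1]["content"]}


turn1 = run_task_subagent("Onboard the user: ask for their name")
turn2 = run_task_subagent("Onboard the user: ask for their email")
```

Both calls start with a one-message history, and `turn2` carries no trace of what happened in `turn1`.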

So for onboarding/actions - flows where the agent talks to the user across turns - task sub-agents are the wrong tool. Instead, make the active flow an explicit piece of graph state owned by one stateful agent, backed by a checkpointer and a stable thread_id. This is the Handoffs pattern: a tool flips a state variable, middleware swaps the prompt + tools, and the flow stays active across turns with no re-routing.

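from typing import Literal
from typing_extensions import NotRequired

# Assumed import paths; verify against your installed langchain / langgraph / deepagents versions.
from deepagents import create_deep_agent
from deepagents.state import DeepAgentState  # location may differ by deepagents version
from langchain.agents.middleware import wrap_model_call
from langchain.messages import ToolMessage
from langchain.tools import tool, ToolRuntime
from langgraph.checkpoint.memory import InMemorySaver
from langgraph.types import Command

# save_field, finish_onboarding and perform_action are your own tools, defined elsewhere.
Flow = Literal["router", "onboarding", "actions"]

class AppState(DeepAgentState):
    active_flow: NotRequired[Flow]

@tool
def set_active_flow(flow: Flow, runtime: ToolRuntime[None, AppState]) -> Command:
    """Switch the current conversation flow."""
    return Command(update={
        "active_flow": flow,
        "messages": [ToolMessage(f"Active flow set to {flow}", tool_call_id=runtime.tool_call_id)],
    })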

@wrap_model_call
def apply_flow_config(request, handler):
    flow = request.state.get("active_flow") or "router"
    cfg = {
        "router":     {"prompt": "Greet and route. Call set_active_flow('onboarding') for new users.", "tools": [set_active_flow]},
        "onboarding": {"prompt": "Run onboarding. Ask one question at a time - collect only missing fields.", "tools": [set_active_flow, save_field, finish_onboarding]},
        "actions":    {"prompt": "Help the user perform account actions.", "tools": [set_active_flow, perform_action]},
    }[flow]
    return handler(request.override(system_prompt=cfg["prompt"], tools=cfg["tools"]))

agent = create_deep_agent(
    model="anthropic:claude-sonnet-4-6",
    tools=[set_active_flow, save_field, finish_onboarding, perform_action],
    middleware=[apply_flow_config],
    state_schema=AppState,
    checkpointer=InMemorySaver(),  # use a DB-backed saver in prod
)

# every user turn - same thread_id restores active_flow:
agent.invoke({"messages": [{"role": "user", "content": user_msg}]},
             {"configurable": {"thread_id": conversation_id}})

Now turn 2 reads active_flow from the checkpoint and continues the flow directly - no re-delegation. The docs measure ~40–50% fewer model calls on repeat turns vs. subagents because the agent “is still active from turn 1 (state persists).”
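If it helps to see the persistence idea in isolation, here is a small plain-Python sketch. The `checkpoints` dict is a toy stand-in for a checkpointer, and `handle_turn` and the "sign up" trigger are invented for illustration; none of this is LangGraph API. The point is only that state keyed by thread_id lets a later turn resume in the same flow, while a different thread starts from the router.

```python
# thread_id -> saved state (toy stand-in for a checkpointer, not the LangGraph API)
checkpoints: dict = {}


def handle_turn(thread_id: str, user_msg: str) -> str:
    """Load the thread's state, maybe switch flow, and return the active flow."""
    state = checkpoints.setdefault(thread_id, {"active_flow": "router", "messages": []})
    state["messages"].append(user_msg)
    # Hypothetical routing rule: the router hands off once the user asks to sign up.
    if state["active_flow"] == "router" and "sign up" in user_msg.lower():
        state["active_flow"] = "onboarding"
    return state["active_flow"]


first = handle_turn("conv-1", "I want to sign up")   # router hands off to onboarding
second = handle_turn("conv-1", "My name is Ada")     # same thread: still onboarding
other = handle_turn("conv-2", "My name is Ada")      # new thread: starts at the router
```

The second turn never goes back through the routing rule, because the active flow was restored from the saved state for `conv-1`.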

When to use what:

  • user-driven multi-turn flow (onboarding/actions) → stateful parent + handoffs (above), or the Skills pattern
  • bounded autonomous chunk inside a flow → keep a sync task sub-agent - add interrupt_on if it must pause to ask the user within that one task
  • long-running background job you steer later → async sub-agents (these are stateful - own thread, task_id, and update_async_task sends follow-ups on the same thread)

Docs: Handoffs · Multi-agent · Async subagents · Persistence

The one design change that fixes your problem: make “which flow is active?” explicit, persisted state - not something the orchestrator re-infers from chat history every turn.

Using LangGraph

If the flows have hard sequencing or branching, model them as a LangGraph graph. The middleware approach above is the lightweight version of the same idea (one agent, state-driven config). When the routing logic gets non-trivial - strict step order, gated transitions, deterministic branches between flows, or fan-out - promote each flow to a node in a StateGraph and let LangGraph own the transitions. The docs call this the Custom workflow pattern: “Build bespoke execution flows with LangGraph, mixing deterministic logic and agentic behavior. Embed other patterns as nodes.”

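from langgraph.checkpoint.memory import InMemorySaver
from langgraph.graph import StateGraph, START, END

def pick_flow(state: AppState) -> str:
    # active_flow is NotRequired, so fall back to the router when it is unset
    return state.get("active_flow") or "router"

builder = StateGraph(AppState)
builder.add_node("router", router_agent)
builder.add_node("onboarding", onboarding_agent)   # each is itself a create_agent/create_deep_agent
builder.add_node("actions", actions_agent)

# enter at the persisted flow, so later turns skip the router
builder.add_conditional_edges(START, pick_flow,
                              {"router": "router", "onboarding": "onboarding", "actions": "actions"})
# after routing, continue into the chosen flow, or end the turn if none was chosen
builder.add_conditional_edges("router", pick_flow,
                              {"router": END, "onboarding": "onboarding", "actions": "actions"})
builder.add_edge("onboarding", END)
builder.add_edge("actions", END)
graph = builder.compile(checkpointer=InMemorySaver())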

You get the same persistence story (checkpointer + thread_id keep the active flow), but routing is now explicit and inspectable in LangSmith instead of living inside an LLM’s judgment. Flows can also hand control to each other via Command(goto=..., graph=Command.PARENT) - just be deliberate about which messages cross between nodes (context engineering). Rule of thumb: state-driven middleware for “same agent, different mode” - an explicit graph when the control flow itself is the thing you need to guarantee.
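As a rough illustration of what "gated transitions" can mean in practice, here is a plain-Python sketch of a deterministic transition check. The `ALLOWED` table, the `onboarded` flag, and `next_flow` are all hypothetical and not part of LangGraph; a function along these lines, adapted to read its inputs from graph state, is the kind of logic you would put behind a conditional edge.

```python
# Hypothetical transition table: which flows each flow may hand off to.
ALLOWED = {
    "router": {"onboarding", "actions"},
    "onboarding": {"actions", "router"},
    "actions": {"router"},
}


def next_flow(current: str, requested: str, onboarded: bool) -> str:
    """Return the flow to run next, enforcing the table and one gate."""
    # Reject transitions the table does not list: stay where we are.
    if requested not in ALLOWED.get(current, set()):
        return current
    # Gate: account actions are only reachable after onboarding is complete.
    if requested == "actions" and not onboarded:
        return "onboarding"
    return requested


gated = next_flow("router", "actions", onboarded=False)
allowed = next_flow("router", "actions", onboarded=True)
rejected = next_flow("actions", "onboarding", onboarded=True)
```

Because the decision is ordinary code rather than a model call, the same inputs always give the same route, which is the guarantee the explicit graph is meant to provide.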

Docs: Custom workflow / Multi-agent · LangGraph