hi @george32
The callbacks config in LangChain/LangGraph is an inheritable property: any callback handler added to a parent runnable automatically propagates to all child runnables via getChild(). When the parent and the subagent each add their own StreamMessagesHandler (because both call .stream() with "messages" mode), two handlers end up listening to the same token events, so every token comes out twice. Breaking the chain with callbacks: [] and restructuring the subagent as a subgraph node are both valid fixes.
Why does callbacks: [] fix the issue?
By passing callbacks: [] in the subagent’s .stream() config you break the inheritance chain: the empty array replaces the inherited callback list, so the subagent starts fresh with only the StreamMessagesHandler it creates internally. Result: each token is emitted exactly once.
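To make the mechanics concrete, here is a dependency-free TypeScript toy model of the behavior described above. None of these names are LangChain’s real internals; it just sketches how an inherited handler list produces duplicates and how an explicit empty list breaks the chain:

```typescript
// Toy model of callback inheritance -- illustrative only, not LangChain's API.
type TokenHandler = (token: string) => void;

interface RunConfig {
  callbacks?: TokenHandler[];
}

// Mimics getChild(): the child inherits the parent's handlers unless the
// caller explicitly passes its own callbacks array (even an empty one).
function childConfig(parent: RunConfig, override?: RunConfig): RunConfig {
  return { callbacks: override?.callbacks ?? [...(parent.callbacks ?? [])] };
}

// Mimics a child streaming in "messages" mode: it registers its own handler
// alongside whatever arrived through the config.
function streamTokens(config: RunConfig, ownHandler: TokenHandler, tokens: string[]): void {
  const handlers = [...(config.callbacks ?? []), ownHandler];
  for (const token of tokens) {
    for (const h of handlers) h(token);
  }
}

const emitted: string[] = [];
const parentHandler: TokenHandler = (t) => emitted.push(t);

// Parent streams in "messages" mode, so its handler sits in the config.
const parent: RunConfig = { callbacks: [parentHandler] };

// Subagent also streams: the inherited handler fires next to its own.
streamTokens(childConfig(parent), (t) => emitted.push(t), ["a", "b"]);
console.log(emitted.length); // 4 -- every token emitted twice

// Passing callbacks: [] replaces the inherited list, so only the
// subagent's own handler fires.
emitted.length = 0;
streamTokens(childConfig(parent, { callbacks: [] }), (t) => emitted.push(t), ["a", "b"]);
console.log(emitted.length); // 2 -- each token exactly once
```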
This is a well-known footgun in LangGraph JS when nesting agents or subgraphs that both stream in "messages" mode. The root cause is callback handler inheritance: the parent’s StreamMessagesHandler propagates into child runnables and listens to the same LLM token events as the child’s own handler, producing every token twice.
Solutions
1: Pass callbacks: [] on the subagent stream (your workaround)
2: Use .invoke() instead of .stream() on the subagent
If you only need the parent’s stream for the final UI and the subagent’s output can be returned as a complete message, call .invoke() on the subagent inside the tool. The parent’s StreamMessagesHandler still captures the subagent’s LLM tokens (since it is inherited by the child), giving you streaming in the parent’s output without any duplication:
import { z } from "zod";
import { tool } from "@langchain/core/tools";

// Named askSubagent so it doesn't shadow the imported tool() helper
// (const tool = tool(...) is a ReferenceError).
const askSubagent = tool(async (input) => {
  const subAgent = createAgent({ model, tools: subTools });
  const result = await subAgent.invoke({
    messages: [{ role: "user", content: input.query }],
  });
  return result.messages.at(-1)?.content ?? "";
}, { name: "ask_subagent", schema: z.object({ query: z.string() }) });
3: Model the subagent as a subgraph node (for complex setups)
Instead of spawning a subagent inside a tool, add it as a proper subgraph node:
import { StateGraph, START } from "@langchain/langgraph";

const subAgent = createAgent({ model, tools: subTools });

const parentGraph = new StateGraph(AgentState)
  .addNode("main_agent", mainAgentNode)
  .addNode("sub_agent", subAgent) // subgraph node
  .addEdge(START, "main_agent") // entry point, required before compile()
  .addEdge("main_agent", "sub_agent")
  .compile();
// Stream with subgraphs: true to get events from both agents.
// With an array of stream modes, each chunk is [namespace, mode, chunk].
for await (const [ns, mode, chunk] of await parentGraph.stream(input, {
  streamMode: ["messages"],
  subgraphs: true,
})) {
  // ns tells you which agent emitted the chunk
}
LangGraph handles callback scoping correctly for subgraph nodes, and the subgraphs: true flag gives you namespaced events so you can tell which agent produced each chunk.
4: Use the nostream tag to suppress streaming from internal LLM calls
If the subagent’s LLM is called inside a tool and you don’t want its tokens leaking into the parent stream at all, tag it:
const quietModel = model.withConfig({
  tags: ["langsmith:nostream"], // or just "nostream"
});
const subAgent = createAgent({ model: quietModel, tools: subTools });
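As a rough sketch of what the tag does downstream, here is a dependency-free toy filter (not LangGraph’s real handler code): a messages-mode consumer simply drops token events whose run tags include the nostream marker, so the quiet model’s tokens never reach the parent stream:

```typescript
// Toy sketch of tag-based suppression -- illustrative only.
interface TokenEvent {
  token: string;
  tags: string[];
}

const NOSTREAM_TAGS = ["langsmith:nostream", "nostream"];

// A messages-mode consumer would skip events from runs tagged nostream.
function visibleTokens(events: TokenEvent[]): string[] {
  return events
    .filter((e) => !e.tags.some((t) => NOSTREAM_TAGS.includes(t)))
    .map((e) => e.token);
}

const events: TokenEvent[] = [
  { token: "Hello", tags: [] },                      // main agent
  { token: "secret", tags: ["langsmith:nostream"] }, // quiet subagent model
  { token: "world", tags: [] },
];

console.log(visibleTokens(events)); // ["Hello", "world"]
```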