If I use subgraphs=True + stream_mode='messages' when calling stream(), the arguments of the tool call become incorrect

    from typing import Literal

    from langgraph.graph import StateGraph, MessagesState, START, END
    from langgraph.prebuilt import create_react_agent
    from langgraph.types import Command

    data_agent = create_react_agent(
        model=llm,
        tools=[get_cards_by_panel_id],
        prompt=template,
        name="data manager"
    )

    supervisor_agent = create_react_agent(
        model=supervisor_llm,
        tools=[],
        prompt=supervisor_template,
        name="business manager"
    )

    def call_data_agent(
        state: MessagesState,
    ) -> Command[Literal["supervisor_agent"]]:
        response = data_agent.invoke(state)
        update = {**response}
        return Command(update=update, goto="supervisor_agent")

    def manager_node(
        state: MessagesState, config
    ) -> Command[Literal["data_agent", END]]:
        response = supervisor_agent.invoke(state)

        content = response["messages"][-1].content
        if "next_agent:data_agent" in content or "next_agent: data_agent" in content:
            goto = "data_agent"
        else:
            goto = END
        update = {**response}

        return Command(
            update=update,
            goto=goto,
        )

    builder = StateGraph(MessagesState)
    builder.add_node("supervisor_agent", manager_node)
    builder.add_node("data_agent", call_data_agent)

    builder.add_edge(START, "supervisor_agent")
    builder.add_edge("supervisor_agent", END)
    graph = builder.compile()

and the tool `get_cards_by_panel_id` looks like this:

@tool
def get_cards_by_panel_id(id: int) -> str:
    ...

My question is something like “give me the information of panel which id is 35563”. If I call graph.stream() like this:

for event in graph.stream({"messages": [{"role": "user", "content": user_input}]}, config, context=ctx):
    for value in event.values():
        print("Assistant:", value["messages"][-1].content)

the tool call is OK (I printed the messages of data_agent, and the argument received by `get_cards_by_panel_id` is 35563). But if I call graph.stream() like this:

for namespace, data in graph.stream({"messages": [{"role": "user", "content": user_input}]}, config, context=ctx, subgraphs=True, stream_mode="messages"):
    if data[0].content:
        print(data[0].content, end="", flush=True)

the argument becomes 3563 (the tokens produced by the LLM are 35563, but the argument passed to the tool is 3563).

How can I resolve this problem?

Python version: 3.9.22

LangGraph version: 0.6.7

What happens if you don’t use subgraphs=True but keep messages as the stream mode? Does the behavior only happen when both are set?

If I just set stream_mode (without subgraphs=True), the LLM won’t produce the message token by token, but the tool argument is right. So I think it has some association with streaming?

I also tried a single agent: if I set stream_mode='messages', the argument of the tool call is incorrect too. That makes me guess it’s caused by token streaming, so that LangGraph cannot get the complete content?

Could you provide more detail on how you’re providing the tool result? I’d be surprised if streaming missed tokens generally, since we would expect to see similar bugs in a lot of places.

Hi @mogitozhang

I figured this out. Let me know if it works.

You’re reading partial LLM token chunks. In stream_mode="messages", the stream yields incremental message pieces (tokens), not finalized structured tool calls. When you also enable subgraphs=True, the chunk boundaries and namespacing can make this more visible. The tool itself still receives the correct argument (e.g., 35563), but mid-stream chunks can show transient, incomplete JSON like 3563.
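You can see this for yourself with a minimal sketch (assuming `llm` is the same chat model from your snippet, bound to the tool) that prints the raw `tool_call_chunks` fragments as they stream:

llm_with_tools = llm.bind_tools([get_cards_by_panel_id])

for chunk in llm_with_tools.stream("give me the information of panel which id is 35563"):
    # Each AIMessageChunk carries only a fragment of the tool-call JSON;
    # "args" here is a partial string, e.g. '{"id"', ': 355', '63}'
    for tc in chunk.tool_call_chunks:
        print(repr(tc["args"]))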

Instead (pick one):

  • Use a state stream for finalized data: rely on updates (or values) to read the finalized tool call and arguments rather than token chunks.

for ns, update in graph.stream(
    {"messages": [{"role": "user", "content": user_input}]},
    config,
    subgraphs=True,
    stream_mode="updates",
):
    # update is a dict of node_name -> state_delta
    node_update = next(iter(update.values()))
    if node_update and "messages" in node_update:
        last = node_update["messages"][-1]
        # When the AI requests a tool, the arguments here are finalized;
        # LangChain tool_calls entries are {"name", "args", "id", ...}
        tool_calls = getattr(last, "tool_calls", None)
        if tool_calls:
            print("Final tool args:", tool_calls[0]["args"])
  • Keep token streaming, but stream the real args from the tool: emit a custom event from inside the tool to expose the exact parsed arguments you’re executing with.

from langgraph.config import get_stream_writer
from langchain_core.tools import tool

@tool
def get_cards_by_panel_id(id: int) -> str:
    writer = get_stream_writer()
    writer({"tool": "get_cards_by_panel_id", "args": {"id": id}})
    # ... run the tool
    return "ok"

# With subgraphs=True and multiple stream modes, stream() yields
# (namespace, mode, chunk) triples
for ns, mode, chunk in graph.stream(
    {"messages": [{"role": "user", "content": user_input}]},
    config,
    subgraphs=True,
    stream_mode=["messages", "custom"],
):
    if mode == "messages":
        msg, meta = chunk
        if msg.content:
            print(msg.content, end="", flush=True)
    elif mode == "custom":
        print("\nTool args:", chunk)
  • Disable token streaming on the specific model: if you don’t need tokens from that agent/model, initialize it with disable_streaming=True so only finalized updates come through (see the sketch below).
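A minimal sketch of that third option, assuming Qwen_32B is served through an OpenAI-compatible endpoint (the model name and base_url are placeholders):

from langchain_openai import ChatOpenAI
from langgraph.prebuilt import create_react_agent

data_llm = ChatOpenAI(
    model="qwen-32b",        # placeholder model name
    base_url="...",          # your endpoint
    disable_streaming=True,  # emit only finalized messages, no token chunks
)

data_agent = create_react_agent(
    model=data_llm,
    tools=[get_cards_by_panel_id],
    prompt=template,
    name="data manager",
)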

Why this happens: per the docs, messages mode streams LLM outputs token by token; tool-call JSON (including arguments) is constructed incrementally and only becomes reliable once the message is complete. See the LangGraph docs on streaming LLM tokens and streaming subgraph outputs.
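As an illustration of “only becomes reliable once complete”: AIMessageChunk supports +, so summing every chunk of one LLM call reconstructs the full message, whose tool_calls then contain the parsed, finalized arguments. This sketch reuses the llm_with_tools from the earlier example:

full = None
for chunk in llm_with_tools.stream("give me the information of panel which id is 35563"):
    # Adding chunks merges content and tool_call_chunks into one message
    full = chunk if full is None else full + chunk

# Only on the completed message are the tool-call args valid parsed JSON
print(full.tool_calls)  # e.g. [{"name": "get_cards_by_panel_id", "args": {"id": 35563}, ...}]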

Optional: upgrade LangGraph/LangChain to the latest versions; streaming + tools has seen fixes and polish over time. But the core guidance above (don’t parse tool-call args from token chunks) remains the recommended approach.

I tried other custom LLM models and found that only Qwen_32B (developed by Alibaba) has this problem. After replacing it with DeepSeek-V3, the id passed to the tool became correct with streaming enabled!

If I still use Qwen_32B, then as @pawel-twardziak suggested, I can only set disable_streaming=True for the LLM of `data_agent`.

Hi @pawel-twardziak ! Thanks so much for your reply!

The first and third options work (the second one still returns 3563)! The third one may be the best choice for me: because I want to stream as many tokens as possible, I just set disable_streaming on the LLM of data_agent.

But after some tests, I found that the problem only occurs when I use Qwen_32B (developed by Alibaba) as the model. If I use DeepSeek instead, the arguments are correct!

hi @mogitozhang yeah, I faced that before, some OSS models still struggle with tool calling :worried:

Gotcha, that’s a really interesting insight. Will note with the team that Qwen_32B has this issue. Glad to hear you found a workaround!