Agent Chat UI stream reasoning and tool calls?

Can the Agent Chat UI stream reasoning summaries and tool calls (e.g. from the OpenAI built-in web search tool and custom tools)? I don't see it streaming the reasoning summaries or the OpenAI built-in web search tool calls; I only see it returning the custom tool calls and regular output text messages. For concreteness, here is minimal example code for an agent that does reasoning, built-in tool calls, and custom tool calls:

```python
import hashlib

from langchain.agents import create_agent
from langchain.tools import tool
from langchain_openai import ChatOpenAI


@tool
def sha256(text: str) -> str:
    """Return the SHA-256 hex digest of `text` (exact, deterministic)."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


# OpenAI built-in / server-side tool:
openai_web_search = {"type": "web_search_preview"}

llm = ChatOpenAI(
    model="gpt-5.2",
    use_responses_api=True,  # optional, but makes intent explicit
    temperature=0,
)

agent = create_agent(
    llm,
    tools=[openai_web_search, sha256],
    system_prompt=(
        "Use web search for up-to-date facts. "
        "Use the sha256 tool for any hashing."
    ),
)

prompt = """
Do these steps in order:
1) Use web_search_preview to find the latest stable Python release version number from python.org.
2) Call sha256 on EXACTLY that version string (e.g. '3.13.1' — just the version number).
Return a JSON object with keys: version, sha256.
"""
```

Hello, @bradyneal

TL;DR: Yes, reasoning summaries and built-in tool calls do show up when streaming, but they behave differently from regular text. You need stream_mode="messages" and output_version="responses/v1".

The reason you're only seeing custom tool calls and final text is likely that you're using stream_mode="updates" (the default). In that mode, LangGraph gives you the full model output only after the entire call finishes, so everything arrives at once at the end instead of streaming.

What you want is stream_mode="messages", which yields chunks as they come in.

The other piece is output_version="responses/v1" on ChatOpenAI. Without it, reasoning summaries and built-in tool call data get buried in additional_kwargs instead of showing up as structured content blocks.
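To make the additional_kwargs vs. content-blocks difference concrete, here is an illustrative sketch. The dict shapes below are simplified stand-ins for message chunks, not verbatim API output, and the exact keys can vary by library version:

```python
# Without output_version="responses/v1": reasoning hides in additional_kwargs
# and content is a plain string (illustrative shape, not verbatim API output).
legacy_chunk = {
    "content": "",
    "additional_kwargs": {
        "reasoning": {"summary": [{"type": "summary_text", "text": "Checking python.org..."}]},
    },
}

# With output_version="responses/v1": content is a list of typed blocks
# you can dispatch on directly.
v1_chunk = {
    "content": [
        {"type": "reasoning", "summary": [{"type": "summary_text", "text": "Checking python.org..."}]},
        {"type": "text", "text": "The latest stable release is ..."},
    ],
}


def typed_blocks(chunk):
    """Return the content-block types available for dispatch."""
    content = chunk["content"]
    if not isinstance(content, list):
        return []  # legacy string content: nothing structured to dispatch on
    return [b["type"] for b in content if isinstance(b, dict)]


print(typed_blocks(legacy_chunk))  # []
print(typed_blocks(v1_chunk))      # ['reasoning', 'text']
```

This is why a renderer that only looks at structured content blocks sees nothing for reasoning or built-in tool calls unless responses/v1 is enabled.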

Important caveat about built-in tools: OpenAI’s built-in server-side tools like web_search_preview do NOT stream incrementally. They execute on OpenAI’s servers and arrive as a single complete block once the search is done — that’s an OpenAI API limitation, not a LangChain one.

Here’s how each event type actually behaves during streaming:

  • Text output — streams token-by-token via deltas

  • Reasoning summaries — stream incrementally via text deltas

  • Custom function calls (like your sha256) — stream name first, then argument deltas

  • web_search_call — arrives as one complete block when the search finishes (no deltas)
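Because custom function calls stream the tool name first and the arguments as deltas, the argument JSON is only parseable once the last delta has arrived. A minimal sketch of the reassembly, with illustrative delta dicts standing in for real chunk objects:

```python
import json

# Illustrative argument deltas for a sha256 tool call: the first chunk
# carries the name, later chunks carry fragments of the argument JSON.
deltas = [
    {"name": "sha256", "args": ""},
    {"name": None, "args": '{"text": '},
    {"name": None, "args": '"3.13.1"}'},
]

name = None
args_buf = ""
for d in deltas:
    if d["name"]:
        name = d["name"]
    args_buf += d["args"]  # concatenate fragments in arrival order

args = json.loads(args_buf)  # only valid after the final delta
print(name, args)  # sha256 {'text': '3.13.1'}
```

LangChain does this accumulation for you when you add message chunks together; the sketch just shows why a partial chunk's arguments are not yet valid JSON.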

Here’s roughly what you’d want:

```python
import asyncio
import hashlib

from langchain.agents import create_agent
from langchain.tools import tool
from langchain_openai import ChatOpenAI


@tool
def sha256(text: str) -> str:
    """Return the SHA-256 hex digest of `text` (exact, deterministic)."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


llm = ChatOpenAI(
    model="gpt-5.2",
    use_responses_api=True,
    output_version="responses/v1",
    reasoning={"effort": "medium", "summary": "auto"},
    temperature=0,
)

agent = create_agent(
    llm,
    tools=[
        {"type": "web_search_preview"},
        sha256,
    ],
    system_prompt=(
        "Use web search for up-to-date facts. "
        "Use the sha256 tool for any hashing."
    ),
)


async def main():
    async for msg_chunk, metadata in agent.astream(
        {
            "messages": [
                {
                    "role": "user",
                    "content": "Do a web search for the latest news on the stock market.",
                }
            ]
        },
        stream_mode="messages",
    ):
        for block in msg_chunk.content:
            if isinstance(block, dict):
                if block["type"] == "reasoning":
                    print("REASONING:", block.get("summary", ""))
                elif block["type"] == "web_search_call":
                    print("WEB SEARCH:", block)
                elif block["type"] == "text":
                    print(block.get("text", ""), end="")


if __name__ == "__main__":
    asyncio.run(main())
```


Quick recap of the settings:

  • output_version="responses/v1" — puts reasoning, web_search_call, etc. into structured content blocks

  • stream_mode="messages" — streams individual chunks as they arrive from the model

  • reasoning={"summary": "auto"} — tells OpenAI to include reasoning summaries; without this you won't get them at all



Hope that helps!