Hello, @bradyneal
TL;DR: Yes, reasoning summaries and built-in tool calls do show up when streaming, but they behave differently from regular text. You need `stream_mode="messages"` and `output_version="responses/v1"`.

The reason you're only seeing custom tool calls and final text is most likely that you're using `stream_mode="updates"` (the default). In that mode, LangGraph emits the full model output only after the entire call finishes, so everything arrives at once at the end rather than streaming.

What you want is `stream_mode="messages"`, which gives you chunks as they arrive.

The other piece is `output_version="responses/v1"` on `ChatOpenAI`. Without it, reasoning summaries and built-in tool-call data get buried in `additional_kwargs` instead of showing up as structured content blocks.
Important caveat about built-in tools: OpenAI's built-in server-side tools like `web_search_preview` do NOT stream incrementally. They execute on OpenAI's servers and arrive as a single complete block once the search is done. That's an OpenAI API limitation, not a LangChain one.
Here's how each event type actually behaves during streaming:

- Text output: streams token-by-token via deltas
- Reasoning summaries: stream incrementally via text deltas
- Custom function calls (like your `sha256`): the tool name arrives first, then argument deltas
- `web_search_call`: arrives as one complete block when the search finishes (no deltas)
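To make the custom-function-call case concrete, here's a minimal self-contained sketch of what "name first, then argument deltas" means in practice. No API call is made; the chunks are hard-coded to mimic the shape of a streamed tool call, and the point is that the JSON arguments are only parseable once all deltas have arrived:

```python
import json

# Simulated stream: the tool name is complete in the first chunk,
# then the JSON arguments trickle in as string fragments (deltas).
chunks = [
    {"name": "sha256", "args": ""},
    {"name": None, "args": '{"text": '},
    {"name": None, "args": '"hello'},
    {"name": None, "args": ' world"}'},
]

tool_name = None
args_buf = ""
for chunk in chunks:
    if chunk["name"] is not None:
        tool_name = chunk["name"]  # arrives once, up front
    args_buf += chunk["args"]      # arguments accumulate as deltas

# Only valid JSON after the final delta has been appended.
args = json.loads(args_buf)
print(tool_name, args)  # sha256 {'text': 'hello world'}
```

This is why, mid-stream, you can know *which* tool is being called before you know *what* it's being called with.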
Here's roughly what you'd want:

```python
import asyncio
import hashlib

from langchain.agents import create_agent
from langchain.tools import tool
from langchain_openai import ChatOpenAI


@tool
def sha256(text: str) -> str:
    """Return the SHA-256 hex digest of text (exact, deterministic)."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


llm = ChatOpenAI(
    model="gpt-5.2",
    use_responses_api=True,
    output_version="responses/v1",
    reasoning={"effort": "medium", "summary": "auto"},
    temperature=0,
)

agent = create_agent(
    llm,
    tools=[
        {"type": "web_search_preview"},  # OpenAI built-in server-side tool
        sha256,                          # custom function tool
    ],
    system_prompt=(
        "Use web search for up-to-date facts. "
        "Use the sha256 tool for any hashing."
    ),
)


async def main():
    async for msg_chunk, metadata in agent.astream(
        {
            "messages": [
                {
                    "role": "user",
                    "content": "Do a web search for the latest news on the stock market.",
                }
            ]
        },
        stream_mode="messages",
    ):
        for block in msg_chunk.content:
            if isinstance(block, dict):
                if block["type"] == "reasoning":
                    print("REASONING:", block.get("summary", ""))
                elif block["type"] == "web_search_call":
                    print("WEB SEARCH:", block)
                elif block["type"] == "text":
                    print(block.get("text", ""), end="")


if __name__ == "__main__":
    asyncio.run(main())
```
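As a quick sanity check on the `sha256` tool itself, independent of the agent and the API: the function body is plain stdlib `hashlib`, so you can verify it locally and confirm it's deterministic before wiring it into the agent:

```python
import hashlib


def sha256(text: str) -> str:
    """Return the SHA-256 hex digest of text (exact, deterministic)."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


print(sha256("hello"))
# → 2cf24dba5fb0a30e26e83b2ac5b9e29e1b161e5c1fa7425e73043362938b9824
```

If the model routes hashing through this tool (rather than trying to "compute" the digest itself), you'll get exact answers every time.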
Quick recap of the settings:

- `output_version="responses/v1"`: puts reasoning, `web_search_call`, etc. into structured content blocks
- `stream_mode="messages"`: streams individual chunks as they arrive from the model
- `reasoning={"summary": "auto"}`: tells OpenAI to include reasoning summaries; without this you won't get them at all
Reference docs
LangChain / LangGraph:
OpenAI:
Hope that helps!