Hi team! I’m building a streaming agent UI with LangChain and Azure OpenAI (GPT-5.2 with reasoning mode). I’m using `astream_events(version="v2")` to capture token-by-token streaming.
Setup:
- `langchain-openai` with `AzureChatOpenAI`, `reasoning={"effort": "high", "summary": "detailed"}`, and `output_version="responses/v1"`
- Using `astream_events` to capture `on_chat_model_stream` events
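For reference, a minimal sketch of the model configuration described above. The deployment name is a placeholder, and the endpoint/key are assumed to come from the standard Azure environment variables:

```python
from langchain_openai import AzureChatOpenAI

# Placeholder deployment name; AZURE_OPENAI_ENDPOINT / AZURE_OPENAI_API_KEY
# are picked up from the environment.
model = AzureChatOpenAI(
    azure_deployment="my-gpt-deployment",  # placeholder
    reasoning={"effort": "high", "summary": "detailed"},
    output_version="responses/v1",
)
```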
What works:
- Initial reasoning tokens stream correctly via `content_blocks` with `type="reasoning"`
- Tool calls and results are captured via `on_tool_start`/`on_tool_end`
- Final reasoning after all tools complete streams correctly
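The event handling behind these three behaviors can be sketched as a small dispatcher. Real `astream_events(version="v2")` events carry message chunk objects; below they are stood in for by plain dicts of the same shape (the `{"type": "reasoning", "reasoning": ...}` block layout is my assumption based on LangChain's standard content blocks), so the routing logic itself is runnable:

```python
def route_event(event: dict) -> list[tuple[str, str]]:
    """Classify a streamed event into (kind, payload) pairs,
    mirroring the three cases above."""
    out: list[tuple[str, str]] = []
    kind = event.get("event")
    if kind == "on_chat_model_stream":
        chunk = event["data"]["chunk"]
        # Assumed block shape: {"type": "reasoning", "reasoning": "..."}
        for block in chunk.get("content_blocks", []):
            if block.get("type") == "reasoning":
                out.append(("reasoning", block.get("reasoning", "")))
            elif block.get("type") == "text":
                out.append(("text", block.get("text", "")))
    elif kind == "on_tool_start":
        out.append(("tool_start", event.get("name", "")))
    elif kind == "on_tool_end":
        out.append(("tool_end", event.get("name", "")))
    return out

# Simulated events in the shapes described above (values illustrative).
events = [
    {"event": "on_chat_model_stream",
     "data": {"chunk": {"content_blocks": [
         {"type": "reasoning", "reasoning": "Plan: "}]}}},
    {"event": "on_tool_start", "name": "search", "data": {}},
    {"event": "on_tool_end", "name": "search", "data": {}},
]

stream = [item for e in events for item in route_event(e)]
print(stream)  # → [('reasoning', 'Plan: '), ('tool_start', 'search'), ('tool_end', 'search')]
```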
Issue:
After a tool result returns, the model often goes directly to the next tool call (`tool_args` tokens) without emitting visible reasoning text. I see this pattern:
reasoning tokens → tool_call → tool_result → tool_args (next tool) → tool_result → ... → reasoning tokens (final)
I expected intermediate reasoning between each tool call, similar to how ChatGPT/Claude show “thinking” text between tool invocations.
Questions:
- Is there a way to force/encourage the model to emit reasoning summaries between tool calls, not just at the start and end?
- Are there different `astream_events` event types I should be listening for to capture this intermediate content?
- Is this a model behavior limitation with the Responses API, or is there a configuration I’m missing?
Any guidance appreciated!