I keep running into various issues when calling via this method. Is this because the feature hasn’t been fully optimized yet? If the problem stems from my own incorrect usage, could you provide sample code showing the correct usage?
Yes - version="v3" is the recommended, generally-available streaming API, not an experimental one. It’s the typed-projection Event Streaming API, shipped in LangChain v1.3 (LangGraph v1.2, Deep Agents v0.6). The docs recommend it for “most application and frontend use cases.” So your “various issues” are almost certainly a usage detail, not an unfinished feature.
Canonical correct usage:
stream = agent.stream_events(
    {"messages": [{"role": "user", "content": "What is the weather in SF?"}]},
    version="v3",
)
for message in stream.messages:  # one item per LLM call
    for delta in message.text:  # token deltas
        print(delta, end="", flush=True)
final_state = stream.output
The usual culprits behind “it doesn’t work”:
- Forgetting version="v3" → you get old stream-mode tuples, and .messages/.text/.output won’t exist
- Mixing sync/async - stream_events → for; await astream_events → async for
- Consuming a projection twice (each iterator is single-use; use interleave()/asyncio.gather)
- Old version - needs LangChain ≥ 1.3 (pip install -U langchain langgraph)
- Multimodal arrives as content blocks, not plain strings - handle block types, don’t assume str
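Two of the points above (single-use iterators and content blocks) can be illustrated without any network call. The following is only a sketch in plain Python: `fake_text_projection` and `text_from_content` are hypothetical names, the generator is a stand-in rather than the real v3 projection object, and the image block shape is made up for illustration.

```python
from typing import Iterator, Union


def fake_text_projection() -> Iterator[str]:
    """Stand-in for a streamed text projection; NOT the LangChain API."""
    yield from ["It's ", "always ", "sunny"]


def text_from_content(content: Union[str, list]) -> str:
    """Join the text of a message whose content is a str or a list of blocks."""
    if isinstance(content, str):
        return content
    parts = []
    for block in content:
        if isinstance(block, str):
            parts.append(block)
        elif isinstance(block, dict) and block.get("type") == "text":
            parts.append(block.get("text", ""))
        # Non-text blocks (images, audio, ...) are deliberately skipped here.
    return "".join(parts)


# A generator can be consumed only once: the second pass sees nothing.
projection = fake_text_projection()
first_pass = list(projection)
second_pass = list(projection)

# Hypothetical mixed content: one text block and one non-text block.
blocks = [{"type": "text", "text": "Sunny"}, {"type": "image", "data": "<omitted>"}]
joined = text_from_content(blocks)
```

If a second pass over the same data is needed, collecting the first pass into a list (as `first_pass` does) is the simplest workaround in this sketch.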
Share your exact *_events(...) call, pip show langchain langgraph, and the actual traceback.
Grounded in the official docs (event-streaming, streaming, deepagents event-streaming) and the Token Streams to Agent Streams blog.
Could you please help me review the code below? I’ve enabled streaming mode, but the responses during tool calling are behaving strangely. Here’s my sample code:
import os

import dotenv
from langchain.agents import create_agent
from langchain_openai import ChatOpenAI

dotenv.load_dotenv()

def get_weather(city: str) -> str:
    """Get weather for a city."""
    return f"It's always sunny in {city}!"

model = ChatOpenAI(
    model="deepseek-v4-flash",
    base_url="https://dashscope.aliyuncs.com/compatible-mode/v1",
    api_key=os.getenv("API_KEY"),
)
agent = create_agent(
    model=model,
    tools=[get_weather],
)
stream = agent.stream_events({
    "messages": [{"role": "user", "content": "What is the weather in SF?"}],
}, version="v3")
for message in stream.messages:
    for delta in message.text:
        print(delta, end="", flush=True)
final_state = stream.output
This is the output I’m getting:
D:\Python\Lib\site-packages\langgraph\pregel\main.py:3723: LangChainBetaWarning: The v3 streaming protocol on Pregel is experimental.
return self._pregel_stream_v3(
D:\Python\Lib\site-packages\langgraph\pregel\main.py:3573: LangChainBetaWarning: The v3 streaming protocol on Pregel is experimental.
return GraphRunStream(graph_iter, mux)
Let me try that againLet me call the correct functionLet me properly call the weather function:Let me use the correct tool name:
Let me check the weather in San Francisco for you!
Process finished with exit code 0
Is this caused by the large language model?
Yes, this is the model, not v3 streaming. Two separate things are happening:
- The warning is harmless. "The v3 streaming protocol on Pregel is experimental" comes from the low-level LangGraph/Pregel runtime, not from the LangChain stream_events(version="v3") API you’re calling. That API is the recommended one for normal use (docs). The warning doesn’t affect tool calling.
- The real issue: your model isn’t emitting a structured tool call. The monologue ("Let me call the correct function…", "Let me use the correct tool name…") means the model is describing a tool call in plain text instead of returning an OpenAI-format tool_calls payload. create_agent can only execute get_weather when the model returns a proper tool_calls field - here it never does, so nothing runs.
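To make the difference concrete, here is a rough sketch of the two response shapes at the wire level. The `structured` dict follows the OpenAI chat-completions layout for an assistant message with `tool_calls`. The id `call_1` is made up, and `parse_tool_calls` is a hypothetical helper, not part of LangChain or the OpenAI SDK.

```python
import json

# An OpenAI-format assistant message that carries a structured tool call.
structured = {
    "role": "assistant",
    "content": None,
    "tool_calls": [
        {
            "id": "call_1",  # made-up id, for illustration only
            "type": "function",
            "function": {"name": "get_weather", "arguments": '{"city": "SF"}'},
        }
    ],
}

# What the thread's output suggests the model returned: text only.
monologue = {"role": "assistant", "content": "Let me call the correct function"}


def parse_tool_calls(message: dict) -> list:
    """Return (tool name, parsed arguments) pairs; empty if there are none."""
    calls = message.get("tool_calls") or []
    return [
        (call["function"]["name"], json.loads(call["function"]["arguments"]))
        for call in calls
    ]


structured_calls = parse_tool_calls(structured)
monologue_calls = parse_tool_calls(monologue)
```

Only the first shape gives an agent loop something it can execute; the second is just prose, however much it talks about calling a function.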
Confirm it without streaming:
result = agent.invoke({"messages": [{"role": "user", "content": "What is the weather in SF?"}]})
for m in result["messages"]:
    print(type(m).__name__, getattr(m, "tool_calls", None))
    print(m.content)
If the AIMessage has tool_calls == [] while the content is that monologue, it’s confirmed.
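The same check can be wrapped in a small helper that returns a verdict instead of printing. This is a sketch under assumptions: `diagnose` is a hypothetical name, and the two dataclasses are local stand-ins that only mimic the class names and the `tool_calls` attribute so the example runs offline.

```python
from dataclasses import dataclass, field


@dataclass
class AIMessage:  # local stand-in, not the LangChain class
    content: str = ""
    tool_calls: list = field(default_factory=list)


@dataclass
class ToolMessage:  # local stand-in, not the LangChain class
    content: str = ""


def diagnose(messages: list) -> str:
    """Classify a finished run using only class names and tool_calls."""
    requested = any(getattr(m, "tool_calls", None) for m in messages)
    executed = any(type(m).__name__ == "ToolMessage" for m in messages)
    if executed:
        return "tool ran"
    if requested:
        return "tool requested but no result"
    return "no structured tool call"


# Mirrors the failing run in this thread: text only, empty tool_calls.
monologue_run = [AIMessage(content="Let me call the correct function")]

# What a healthy run would roughly look like.
working_run = [
    AIMessage(tool_calls=[{"name": "get_weather", "args": {"city": "SF"}, "id": "call_1"}]),
    ToolMessage(content="It's always sunny in SF!"),
    AIMessage(content="It's always sunny in SF!"),
]

monologue_verdict = diagnose(monologue_run)
working_verdict = diagnose(working_run)
```

Because the helper is duck-typed, passing it `result["messages"]` from a real run should behave the same way, assuming the real message classes keep these names and attributes.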
Likely cause: deepseek-v4-flash through the DashScope OpenAI-compatible endpoint isn’t returning native function calls. Check that (a) the model id is correct and (b) that model actually supports tool/function calling on that endpoint. Swap in a known tool-capable model (e.g. gpt-4o-mini or a Qwen/DeepSeek variant with function-calling) using the exact same code - if the tool fires, the model/endpoint was the variable.
One more note: tool calls don’t appear in stream.messages → .text (text deltas only). Stream them via the .tool_calls projection instead:
for tc in stream.tool_calls:
    print(tc)
Thanks for your help! I switched to qwen3.7-max and it works now. It seems DeepSeek models don’t support tool calling after all; their compatibility really needs some improvement, haha.
In theory DeepSeek supports tool calling; it might just not be compatible with v3 streaming.
It might be the case. Would be good to observe/debug it more.