I am encountering a recurring JSONDecodeError: Invalid control character when streaming responses from Anthropic via LangGraph. The error originates from the Anthropic SDK's streaming parser (anthropic._streaming.SSE.json) but surfaces within langchain_core and langgraph when using agent.astream(...) with stream_mode="messages".
Issue Details:
- Setup: I am using LangGraph with an Anthropic agent (Claude 3.5 Sonnet) created via create_agent.
- Error: When calling agent.astream({"messages": [...]}, stream_mode="messages"), the stream occasionally fails with: json.decoder.JSONDecodeError: Invalid control character at: line 1 column 112 (char 111).
- Root Cause: The stack trace points to the json() method in anthropic._streaming.py, which calls json.loads(self.data) in strict mode.
- Current Workaround: I have monkey-patched SSE.json to use json.loads(…, strict=False) at startup. This prevents the crash, but it does not feel like a sustainable solution.
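For context, the patch boils down to relaxing `json.loads` strictness inside the SDK's SSE `json()` method. A minimal stdlib-only demonstration of the behavior `strict=False` changes:

```python
import json

# json.loads in strict mode (the default) rejects raw control characters
# (bytes below 0x20) inside string values; strict=False accepts them.
payload = '{"text": "line one\x0bline two"}'  # \x0b is a raw control character

try:
    json.loads(payload)  # raises JSONDecodeError in strict mode
    parsed_strict = True
except json.JSONDecodeError:
    parsed_strict = False

# With strict=False parsing succeeds and the control character is preserved.
parsed = json.loads(payload, strict=False)
```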
Questions:
1. Is there a recommended way to handle this within LangGraph or LangChain when using Anthropic streaming?
2. Should the Anthropic client or the LangChain integration be more resilient to control characters in streamed JSON chunks?
3. Are there known configuration options or model parameters that could help avoid this?
I can provide a minimal reproduction snippet, exact package versions, and a full sanitized stack trace if needed. Please advise on the best way to resolve this or if this is a known issue.
(1) Recommended ways to handle this in LangGraph/LangChain today
I think there isn’t a built-in LangGraph/LangChain knob that can “fix” malformed SSE JSON once it reaches the Anthropic SDK parser. Practical options that are supported:
Upgrade: make sure you’re on the latest compatible anthropic + langchain-anthropic + langgraph. This is the first thing maintainers will ask for, and it sometimes resolves edge-case streaming bugs.
Turn off provider streaming, keep graph streaming: set LangChain’s disable_streaming=True on the model instance. Then stream_mode="messages" will still emit the final AIMessage (but you won’t get token-by-token updates), and you avoid the SDK SSE parser entirely.
LangChain core documents disable_streaming as a supported escape hatch for models whose streaming is unreliable (see BaseChatModel field docs in LangChain core).
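A minimal configuration sketch, assuming `langchain-anthropic` is installed; the model name here is illustrative, so adjust it and the `create_agent` wiring to match your actual setup:

```python
from langchain_anthropic import ChatAnthropic

# disable_streaming=True makes LangChain use the non-streaming Messages API
# even when the graph itself is streamed, bypassing the SDK's SSE parser.
model = ChatAnthropic(
    model="claude-3-5-sonnet-latest",  # assumed model name; use your own
    disable_streaming=True,
)

# Pass `model` into create_agent as before; stream_mode="messages" will then
# emit the final AIMessage in one chunk rather than token-by-token.
```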
Retry at the graph-call level: wrap your agent.astream(...) call (or the model runnable) with a retry that treats JSONDecodeError as retriable. This won’t “resume” a broken stream, but it will recover by restarting the call.
In LangChain, BaseChatModel exposes with_retry() (conceptually; see LangChain runnable docs). Just be aware you may need to dedupe partial UI output when retrying a stream.
If you must keep token streaming: keep your monkeypatch (or a forked pinned SDK) temporarily, but also capture and report the raw offending SSE data to upstream. Without seeing the exact bytes, it’s hard to make a principled fix.
(2) Should Anthropic SDK or LangChain be more resilient here?
The primary fix belongs in the Anthropic SDK, because it owns the SSE framing and JSON parsing. LangChain/LangGraph receive already-parsed events; they cannot realistically correct malformed JSON without re-implementing the transport.
(3) Configuration/model parameters to avoid it
This error is almost never caused by model parameters like temperature. It’s typically transport / SSE framing / decoding. Things that do help:
Eliminate proxies / middleware between you and api.anthropic.com:
If you’re using HTTP_PROXY / HTTPS_PROXY, try bypassing (e.g. NO_PROXY=api.anthropic.com) and see if the issue disappears.
If you’re streaming through a reverse proxy (NGINX, Cloudflare, etc.), ensure it’s configured for SSE passthrough (no buffering, no transformations).
Disable streaming at the model layer (disable_streaming=True) if reliability matters more than token UX.
What to share for a high-quality bug report
If you post an issue upstream (Anthropic SDK) or to LangChain:
Exact versions: anthropic, langchain-core, langchain-anthropic, langgraph, Python version
Whether any proxy is in play
The raw SSE event payload that fails JSON parsing (even if sanitized). The critical bit is the literal character at the reported char offset
To confirm this is an SDK-level issue, could you test a direct connection via the Anthropic SDK (no LangChain/LangGraph in the loop) with similar prompts, e.g. 100 or 1000 attempts if automated?
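A harness along these lines could drive that test. In the real run, `call_once` would consume a streamed SDK response (e.g. `client.messages.stream(...)` with `text_stream`, per the SDK's streaming helpers); a hypothetical stub stands in for it here so the counting logic is self-contained:

```python
import json


def stress_test(call_once, attempts=100):
    """Run `call_once` repeatedly, collecting JSONDecodeError failures
    so a failure rate can be reported against the raw SDK."""
    failures = []
    for i in range(attempts):
        try:
            call_once()
        except json.JSONDecodeError as exc:
            failures.append((i, str(exc)))
    return failures


# Stub for illustration only: every 7th call hits a control-character error,
# mimicking the intermittent failure described above.
_counter = {"n": 0}

def fake_call():
    _counter["n"] += 1
    if _counter["n"] % 7 == 0:
        json.loads('{"bad": "\x01"}')  # raises JSONDecodeError


failures = stress_test(fake_call, attempts=20)
```

Logging the `(attempt index, error message)` pairs, plus the raw SSE payloads if you can capture them, would make the upstream report much more actionable.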