Best Practices for Streaming in Agentic Systems with Non-Streaming Proxies

I’m building an agentic system with LangGraph in an enterprise environment where all outbound traffic must pass through a corporate proxy. A common issue in such setups is that the proxy does not support streaming: it buffers the entire response from the LLM before forwarding it.
This effectively negates real-time, token-by-token streaming even when the underlying LLM API supports it. The application receives the full response in a single chunk only after the LLM has finished generating.
While LangGraph’s ability to stream state updates via `stream_mode="updates"` is incredibly valuable for showing node-to-node progress, losing the LLM’s “typing effect” noticeably degrades the user experience.
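For context, node-level progress survives a buffering proxy because each update is emitted after a node finishes, not token by token. A minimal sketch of the consumption pattern follows; `fake_updates` and the node names "retrieve" and "generate" are illustrative stand-ins for iterating `graph.stream(inputs, stream_mode="updates")` on a real compiled graph:

```python
# Stand-in for graph.stream(inputs, stream_mode="updates"):
# each yielded item is a dict keyed by the node that just finished.
def fake_updates():
    yield {"retrieve": {"docs": ["..."]}}
    yield {"generate": {"answer": "full response arrives here in one piece"}}

progress = []
for update in fake_updates():
    for node_name, state_delta in update.items():
        progress.append(node_name)          # drive a per-node progress UI
        print(f"[{node_name}] done")
```

Even with the proxy buffering each individual LLM call, the UI still gets one event per node, which is often enough for a meaningful progress indicator.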
My question to the community is:
How are people managing streaming for agentic systems when faced with a non-streaming proxy?
I’m particularly interested in any established best practices or architectural patterns. For instance:

  1. Graceful Fallback: Are there recommended ways to configure a LangGraph application to detect a non-streaming connection and fall back gracefully, perhaps by managing user expectations in the UI?
  2. Communicating with IT: For those who have successfully addressed this at the infrastructure level, what arguments or technical details were most effective in convincing IT/security teams to enable streaming for specific LLM endpoints on proxies (e.g., Zscaler, Palo Alto, etc.)?
  3. Alternative Patterns: Are there any alternative design patterns within LangGraph to simulate a more responsive experience in this scenario? For example, breaking down large LLM calls into smaller, sequential calls to provide more frequent updates, even if they aren’t token-level.
Any insights, shared experiences, or recommended workarounds would be greatly appreciated. This is a common real-world constraint, and hearing how others tackle it would help many developers in the community.
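To make point 1 concrete, here is one heuristic I’ve been considering (purely a sketch, not an established pattern): wrap the chunk iterator and count chunks, since a long response arriving as a single chunk is the signature of a buffering proxy. `StreamProbe` is a hypothetical name:

```python
class StreamProbe:
    """Wraps a chunk iterator and counts chunks so the UI can decide,
    after consumption, whether the proxy buffered the response."""

    def __init__(self, chunks):
        self._chunks = chunks
        self.count = 0

    def __iter__(self):
        for chunk in self._chunks:
            self.count += 1
            yield chunk

    @property
    def buffered(self):
        # A single chunk for a long response almost always means buffering.
        return self.count <= 1

probe = StreamProbe(iter(["the entire response arrives in one piece"]))
text = "".join(probe)
if probe.buffered:
    print("falling back to non-streaming UI")
```

On the first buffered response, the app could remember the result and switch subsequent requests to a spinner/progress view instead of a token stream, which at least sets user expectations honestly.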
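On point 3, one workaround I’ve seen discussed (cosmetic only, and the names and knobs below are illustrative) is to re-emit the fully buffered response in small slices so the UI can still render a typing effect:

```python
import time

def simulated_typing(full_text, chunk_size=8, delay=0.02):
    """Re-chunk an already-complete response for a client-side typing
    effect. chunk_size and delay are purely cosmetic tuning knobs."""
    for i in range(0, len(full_text), chunk_size):
        yield full_text[i:i + chunk_size]
        time.sleep(delay)  # pace the UI; no effect on actual latency

pieces = list(simulated_typing("Hello from behind a buffering proxy.", delay=0))
```

This doesn’t reduce time-to-first-token at all; it only smooths the reveal once the response has landed, so it pairs best with a visible “thinking” indicator during the wait.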