Issue Summary
We have been experiencing persistent connection stability issues on our LangGraph Platform production deployment for 6+ days.
Technical Details
Stream Connections (/runs/{id}/stream)
- net::ERR_CONNECTION_CLOSED during active streaming
- Retry pattern: ECONNREFUSED → nginx 404 → success
- Multiple drops per session
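
For anyone hitting the same retry pattern, here is a minimal sketch of the kind of client-side retry with exponential backoff that can paper over the ECONNREFUSED → 404 → success sequence. It is a workaround, not a fix. The names call_with_backoff and connect are hypothetical and not part of any LangGraph SDK, and the sketch assumes the connect callable raises a ConnectionError (or TimeoutError) on failure, so an nginx 404 would have to be turned into such an exception by the caller.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_backoff(
    connect: Callable[[], T],
    max_attempts: int = 5,
    base_delay: float = 0.5,
    max_delay: float = 8.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call connect() until it succeeds or max_attempts is exhausted.

    `connect` is a hypothetical callable that opens the stream and raises
    ConnectionError / TimeoutError on failure (ConnectionRefusedError and
    ConnectionResetError are both subclasses of ConnectionError).
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        try:
            return connect()
        except (ConnectionError, TimeoutError):
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error
            # Exponential backoff capped at max_delay, plus up to 50% jitter
            delay = min(max_delay, base_delay * 2 ** attempt)
            sleep(delay + random.uniform(0, delay / 2))
    raise RuntimeError("unreachable")
```

The sleep parameter is injectable only so the delays can be inspected or skipped in tests. The default values are arbitrary starting points, not recommendations from the platform.
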
WebSocket Connections
- ECONNREFUSED and ECONNRESET errors during active connections
- Tool calls time out and require a page refresh to recover
Key Observations
- Works perfectly: Local development (langgraph dev) and self-hosted
- Fails consistently: LangGraph Platform only
- WebSocket disconnects every 5-10 minutes on platform
- nginx 404s suggest ingress-level issues
- Both connection types affected simultaneously
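
To make the "every 5-10 minutes" observation easier to compare across reports, a small sketch like the following can summarize the gaps between logged disconnects. The function names are hypothetical and the timestamps are assumed to come from your own client logs. Tightly clustered intervals may point toward a fixed idle timeout somewhere in the path, while a wide spread would suggest something else, but that interpretation is a guess and not something the numbers prove.

```python
from datetime import datetime
from statistics import mean, pstdev
from typing import Dict, List, Optional


def disconnect_intervals(timestamps: List[datetime]) -> List[float]:
    """Return the gaps, in seconds, between consecutive disconnects."""
    ordered = sorted(timestamps)
    return [(b - a).total_seconds() for a, b in zip(ordered, ordered[1:])]


def summarize(intervals: List[float]) -> Optional[Dict[str, float]]:
    """Basic statistics over the gaps; None if there are no gaps to measure."""
    if not intervals:
        return None
    return {
        "count": len(intervals),
        "min_s": min(intervals),
        "max_s": max(intervals),
        "mean_s": mean(intervals),
        "stdev_s": pstdev(intervals),  # population standard deviation
    }
```

Feeding this the disconnect times from a single session and posting the summary alongside the error codes would give support something concrete to correlate with ingress logs.
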
Troubleshooting Attempted
- Client-side pinging every 30s with ws.ping() (still disconnecting)
- Confirmed no deployments during disconnection times
- No custom timeout configurations
Additional Issue
After package upgrades, the disable_streaming=True setting is ignored on the platform but still respected locally, which causes duplicate message chunks.
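
As a stopgap on the client side, duplicates of this kind can sometimes be filtered out. The sketch below assumes, without confirmation from the platform, that the duplication takes the form of a final event repeating the full text already streamed under the same message id. The (message_id, text) event shape and the function name are hypothetical simplifications, not the actual stream payload.

```python
from typing import Dict, Iterable, Iterator, Tuple


def drop_replayed_text(
    events: Iterable[Tuple[str, str]],
) -> Iterator[Tuple[str, str]]:
    """Yield (message_id, text) events, skipping full-text replays.

    Assumption: a duplicate shows up as one event whose text equals
    everything already accumulated for that message id.
    """
    accumulated: Dict[str, str] = {}
    for message_id, text in events:
        seen = accumulated.get(message_id, "")
        if seen and text == seen:
            # This event repeats everything already streamed for this id.
            continue
        accumulated[message_id] = seen + text
        yield message_id, text
```

This heuristic has a known blind spot: a message whose second chunk happens to equal its first (for example "ha" followed by "ha") would lose the second chunk. If the duplicates look different in your payloads, the comparison would need to change accordingly.
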
Questions
- Known production infrastructure stability issues?
- Expected connection stability SLAs?
- Recommended WebSocket timeout configurations?
Has anyone experienced similar platform-specific connectivity issues? Any workarounds or configuration recommendations?