Hey,
We are observing intermittent issues with our LangSmith Hosted Cloud Deployment.
- LangGraph API Version: 0.7.90
- Deployment Mode: LangSmith Cloud
- Deployment Type: Development
- Runtime: Node 20
The stack traces below, taken from the server logs, suggest underlying infrastructure errors bubbling up.
Is this a known issue, and is there any mitigation or resolution available?
Best
30/03/2026, 17:45:10 Closing gRPC client pool (5 clients)
30/03/2026, 17:45:10 Terminating JS graphs process
30/03/2026, 17:45:10 Shutting down remote graphs
30/03/2026, 17:45:10 Lifespan failed
Traceback (most recent call last):
File "/usr/lib/python3.12/site-packages/httpx/_transports/default.py", line 101, in map_httpcore_exceptions
yield
File "/usr/lib/python3.12/site-packages/httpx/_transports/default.py", line 394, in handle_async_request
resp = await self._pool.handle_async_request(req)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/lib/python3.12/site-packages/httpcore/_async/connection_pool.py", line 256, in handle_async_request
raise exc from None
File "/usr/lib/python3.12/site-packages/httpcore/_async/connection_pool.py", line 236, in handle_async_request
response = await connection.handle_async_request(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/lib/python3.12/site-packages/httpcore/_async/connection.py", line 101, in handle_async_request
raise exc
File "/usr/lib/python3.12/site-packages/httpcore/_async/connection.py", line 78, in handle_async_request
stream = await self._connect(request)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/lib/python3.12/site-packages/httpcore/_async/connection.py", line 124, in _connect
stream = await self._network_backend.connect_tcp(**kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/lib/python3.12/site-packages/httpcore/_backends/auto.py", line 31, in connect_tcp
return await self._backend.connect_tcp(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/lib/python3.12/site-packages/httpcore/_backends/anyio.py", line 113, in connect_tcp
with map_exceptions(exc_map):
^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/lib/python3.12/contextlib.py", line 158, in __exit__
self.gen.throw(value)
File "/usr/lib/python3.12/site-packages/httpcore/_exceptions.py", line 14, in map_exceptions
raise to_exc(exc) from exc
httpcore.ConnectError: All connection attempts failed
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/usr/lib/python3.12/site-packages/langgraph_runtime_postgres/lifespan.py", line 178, in lifespan
await graph.collect_graphs_from_env(True)
File "/api/langgraph_api/graph.py", line 483, in collect_graphs_from_env
File "/api/langgraph_api/js/remote.py", line 828, in wait_until_js_ready
File "/usr/lib/python3.12/site-packages/httpx/_client.py", line 1768, in get
return await self.request(
^^^^^^^^^^^^^^^^^^^
File "/usr/lib/python3.12/site-packages/httpx/_client.py", line 1540, in request
return await self.send(request, auth=auth, follow_redirects=follow_redirects)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/lib/python3.12/site-packages/httpx/_client.py", line 1629, in send
response = await self._send_handling_auth(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/lib/python3.12/site-packages/httpx/_client.py", line 1657, in _send_handling_auth
response = await self._send_handling_redirects(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/lib/python3.12/site-packages/httpx/_client.py", line 1694, in _send_handling_redirects
response = await self._send_single_request(request)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/lib/python3.12/site-packages/httpx/_client.py", line 1730, in _send_single_request
response = await transport.handle_async_request(request)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/lib/python3.12/site-packages/httpx/_transports/default.py", line 393, in handle_async_request
with map_httpcore_exceptions():
^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/lib/python3.12/contextlib.py", line 158, in __exit__
self.gen.throw(value)
File "/usr/lib/python3.12/site-packages/httpx/_transports/default.py", line 118, in map_httpcore_exceptions
raise mapped_exc(message) from exc
httpx.ConnectError: All connection attempts failed
30/03/2026, 17:44:32 Resolving graph my-workflow-v2
30/03/2026, 17:44:32 Resolving graph my-workflow
30/03/2026, 17:44:32 Starting graph loop
30/03/2026, 17:44:02 Successfully submitted metadata to LangSmith instance
--- (other instances of errors we also observed)
30/03/2026, 15:29:04 Closing gRPC client pool (5 clients)
30/03/2026, 15:29:04 Checkpointer ingestion task cancelled. Draining queue.
30/03/2026, 15:29:04 Shutting down remote graphs
30/03/2026, 15:29:04 Terminating JS graphs process
30/03/2026, 15:29:04 Received SIGTERM. Exiting..
30/03/2026, 15:29:04 Finished server process [1]
30/03/2026, 15:29:04 Application shutdown complete.
30/03/2026, 15:29:04 Waiting for application shutdown.
30/03/2026, 15:29:04 Shutting down
30/03/2026, 15:28:58 Received SIGTERM. Exiting..
30/03/2026, 15:28:58 asyncio.exceptions.CancelledError
30/03/2026, 15:28:58 await getter
30/03/2026, 15:28:58 File "/usr/lib/python3.12/asyncio/queues.py", line 158, in get
30/03/2026, 15:28:58 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
30/03/2026, 15:28:58 Traceback (most recent call last):
File "/usr/lib/python3.12/site-packages/starlette/routing.py", line 645, in lifespan
await receive()
File "/usr/lib/python3.12/site-packages/uvicorn/lifespan/on.py", line 137, in receive
return await self.receive_queue.get()
30/03/2026, 15:28:58 During handling of the above exception, another exception occurred:
30/03/2026, 15:28:58 SystemExit: 1
30/03/2026, 15:28:58 File "/api/langgraph_api/graph.py", line 512, in _handle_exception
30/03/2026, 15:28:58 File "uvloop/cbhandles.pyx", line 63, in uvloop.loop.Handle._run
30/03/2026, 15:28:58 File "uvloop/cbhandles.pyx", line 83, in uvloop.loop.Handle._run
30/03/2026, 15:28:58 File "uvloop/loop.pyx", line 476, in uvloop.loop.Loop._on_idle
30/03/2026, 15:28:58 File "uvloop/loop.pyx", line 557, in uvloop.loop.Loop._run
30/03/2026, 15:28:58 File "uvloop/loop.pyx", line 1379, in uvloop.loop.Loop.run_forever
30/03/2026, 15:28:58 File "uvloop/loop.pyx", line 1505, in uvloop.loop.Loop.run_until_complete
30/03/2026, 15:28:58 File "uvloop/loop.pyx", line 1512, in uvloop.loop.Loop.run_until_complete
30/03/2026, 15:28:58 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
30/03/2026, 15:28:58 return self._loop.run_until_complete(task)
30/03/2026, 15:28:58 File "/usr/lib/python3.12/asyncio/runners.py", line 118, in run
30/03/2026, 15:28:58 ^^^^^^^^^^^^^^^^
30/03/2026, 15:28:58 return runner.run(main)
30/03/2026, 15:28:58 File "/usr/lib/python3.12/asyncio/runners.py", line 195, in run
30/03/2026, 15:28:58 ERROR: Traceback (most recent call last):
30/03/2026, 15:28:58 Entrypoint task finished
30/03/2026, 15:28:58 Checkpointer ingestion task cancelled. Draining queue.
30/03/2026, 15:28:58 Shutting down health and metrics server
30/03/2026, 15:28:58 asyncio.exceptions.CancelledError
30/03/2026, 15:28:58 await getter
30/03/2026, 15:28:58 File "/usr/lib/python3.12/asyncio/queues.py", line 158, in get
30/03/2026, 15:28:58 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
30/03/2026, 15:28:58 Traceback (most recent call last):
File "/usr/lib/python3.12/site-packages/starlette/routing.py", line 645, in lifespan
await receive()
File "/usr/lib/python3.12/site-packages/uvicorn/lifespan/on.py", line 137, in receive
return await self.receive_queue.get()
30/03/2026, 15:28:58 During handling of the above exception, another exception occurred:
30/03/2026, 15:28:58 SystemExit: 1
30/03/2026, 15:28:58 File "/api/langgraph_api/graph.py", line 512, in _handle_exception
30/03/2026, 15:28:58 File "uvloop/cbhandles.pyx", line 63, in uvloop.loop.Handle._run
30/03/2026, 15:28:58 File "uvloop/cbhandles.pyx", line 83, in uvloop.loop.Handle._run
30/03/2026, 15:28:58 File "uvloop/loop.pyx", line 476, in uvloop.loop.Loop._on_idle
30/03/2026, 15:28:58 File "uvloop/loop.pyx", line 557, in uvloop.loop.Loop._run
30/03/2026, 15:28:58 File "uvloop/loop.pyx", line 1379, in uvloop.loop.Loop.run_forever
30/03/2026, 15:28:58 File "uvloop/loop.pyx", line 1505, in uvloop.loop.Loop.run_until_complete
30/03/2026, 15:28:58 File "uvloop/loop.pyx", line 1512, in uvloop.loop.Loop.run_until_complete
30/03/2026, 15:28:58 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
30/03/2026, 15:28:58 return self._loop.run_until_complete(task)
30/03/2026, 15:28:58 File "/usr/lib/python3.12/asyncio/runners.py", line 118, in run
30/03/2026, 15:28:58 ^^^^^^^^^^^^^^^^
30/03/2026, 15:28:58 return runner.run(main)
30/03/2026, 15:28:58 File "/usr/lib/python3.12/asyncio/runners.py", line 195, in run
30/03/2026, 15:28:58 Closing gRPC client pool (5 clients)
30/03/2026, 15:28:58 Terminating JS graphs process
30/03/2026, 15:28:58 Shutting down remote graphs
30/03/2026, 15:28:58 ERROR: Traceback (most recent call last):
30/03/2026, 15:28:58 Successfully shutdown queue
30/03/2026, 15:28:58 Workers finished.
30/03/2026, 15:28:58 Queue task cancelled. Shutting down workers. Will terminate after 180s
30/03/2026, 15:28:58 Shutting down queue...
30/03/2026, 15:24:49 Successfully submitted metadata to LangSmith instance
30/03/2026, 15:24:49 HTTP Request: POST https://eu.api.smith.langchain.com/v1/metadata/submit "HTTP/1.1 204 No Content"
30/03/2026, 15:24:49 Successfully submitted metadata to LangSmith instance
30/03/2026, 15:24:49 HTTP Request: POST https://eu.api.smith.langchain.com/v1/metadata/submit "HTTP/1.1 204 No Content"
30/03/2026, 15:22:56 redis: 2026/03/30 14:22:56 pool.go:617: redis: connection pool: failed to dial after 3 attempts: dial tcp 192.168.112.124:6379: i/o timeout
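
The excerpts above appear newest first, which makes the sequence of events harder to follow. In case it helps with triage, here is a minimal sketch for reordering such an excerpt oldest first. It assumes the `DD/MM/YYYY, HH:MM:SS` prefix seen in these logs and treats any line without that prefix as a continuation of the preceding timestamped line; the function name `chronological` and the sample lines are only for illustration.

```python
import re
from datetime import datetime

# Prefix format assumed from the excerpts above: "30/03/2026, 17:45:10 message"
STAMP = re.compile(r"^(\d{2}/\d{2}/\d{4}, \d{2}:\d{2}:\d{2}) (.*)$")


def chronological(lines):
    """Group untimestamped continuation lines with the preceding
    timestamped line, then return the entries oldest first."""
    entries = []  # list of (datetime, [message, continuation, ...])
    for line in lines:
        match = STAMP.match(line)
        if match:
            ts = datetime.strptime(match.group(1), "%d/%m/%Y, %H:%M:%S")
            entries.append((ts, [match.group(2)]))
        elif entries:
            entries[-1][1].append(line)
    # The input is newest first, so reverse it before the stable sort;
    # entries sharing a second then keep their likely emission order.
    return sorted(reversed(entries), key=lambda entry: entry[0])


sample = [
    "30/03/2026, 17:45:10 Lifespan failed",
    "httpx.ConnectError: All connection attempts failed",
    "30/03/2026, 17:44:32 Starting graph loop",
    "30/03/2026, 17:44:02 Successfully submitted metadata to LangSmith instance",
]

ordered = chronological(sample)
for ts, body in ordered:
    print(ts.isoformat(), " | ".join(body))
```

This is only a reading aid. Where traceback lines were themselves emitted in reverse order with individual timestamps, as in the 15:28:58 block, the grouping will be approximate.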