Stream reconnection and cancellation endpoints fail in production environment while working correctly in local development

keenanberry · July 14, 2025, 6:00pm

Stream Reconnection and Cancellation Not Working in Production

Environment

Framework: Next.js (App Router)
LangGraph SDK: @langchain/langgraph-sdk/react
Hosting: Vercel
Browser: Chrome 138

Issue Description

In my Next.js application using the LangGraph React SDK, stream reconnection and cancellation work perfectly in local development but fail silently in the production environment.

Specifically:

Working locally:
- Stream reconnection (via reconnectOnMount: true) successfully rejoins existing streams when reloading a tab
- Cancellation (via the stop() function) successfully terminates running threads

Not working in production:
- No API calls are made to the reconnection endpoint (/threads/{id}/runs/{id}/stream)
- No API calls are made to the cancellation endpoint (/threads/{id}/runs/{id}/cancel)
- No errors appear in the console

What I’ve Tried

- Verified environment variables are correctly set in production
- Confirmed authentication is working (can create new threads and interact normally)
- Added detailed logging to the custom fetch implementation
- Confirmed the stop() function exists and is being called

Code Example

const { isLoading, messages, values, submit, stop } = useStream<AgentGraphState>({
  apiUrl: LANGGRAPH_DEPLOYMENT_URL,
  assistantId: AGENT_NAME,
  messagesKey: 'messages',
  threadId: tab.threadId,
  onThreadId: (threadId) => onThreadIdChange(tab.id, threadId),
  callerOptions: {
    fetch: customFetch  // Custom fetch implementation with auth headers
  },
  reconnectOnMount: true,
});

// Stop handler
const handleStopBotResponse = useCallback(() => {
  stop();  // This works locally but not in production
  // Reset UI state...
}, [messages, stop]);

Any insights on why these specific endpoints might fail in production while other LangGraph API calls work correctly would be greatly appreciated!

keenanberry · July 21, 2025, 8:33pm

The issue seems to be related to this section of the useStream code: langgraphjs/libs/sdk/src/react/stream.tsx at main · langchain-ai/langgraphjs · GitHub. The runMetadataStorage was always null in my Vercel-deployed environments. Unfortunately, I never figured out why, but I assumed it had something to do with Vercel’s SSR.
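
One way to narrow this down is to log what browser storage looks like in the deployed build. The sketch below assumes the hook relies on something like window.sessionStorage when reconnectOnMount is true, which this thread does not confirm. The names StorageScope and probeSessionStorage are hypothetical and not part of the SDK.

```typescript
// Hypothetical diagnostic helper; not part of @langchain/langgraph-sdk.

/** Anything that may expose a sessionStorage-like object (e.g. `window`). */
interface StorageScope {
  sessionStorage?: {
    setItem(key: string, value: string): void;
    removeItem(key: string): void;
  };
}

type StorageProbe = "no-window" | "unavailable" | "ok";

/**
 * Reports whether session storage is usable in the given scope.
 * Pass `undefined` when there is no window (e.g. during server rendering).
 */
function probeSessionStorage(scope: StorageScope | undefined): StorageProbe {
  if (scope === undefined) return "no-window";
  try {
    const storage = scope.sessionStorage;
    if (!storage) return "unavailable";
    // A write/remove round trip catches storage that exists but throws on use.
    storage.setItem("__storage_probe__", "1");
    storage.removeItem("__storage_probe__");
    return "ok";
  } catch {
    return "unavailable";
  }
}
```

In a client component this could be logged once on mount with `probeSessionStorage(typeof window === "undefined" ? undefined : window)` and the result compared between local and deployed builds.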

Anyway, I figured out a workaround: I handle localStorage / sessionStorage in my own component and explicitly call joinStream to rejoin the stream when the component mounts.

  const handleMetadataRunEvent = useCallback((run: { run_id: string }) => {
    if (typeof window !== 'undefined') {
      // Use tab.id (chat_id) as the localStorage key since it's unique per ChatWindow instance
      // NOTE: I use tab.id because it's specific to my implementation, but thread_id would make sense too
      const storageKey = `chat:${tab.id}:run`;
      window.localStorage.setItem(storageKey, run.run_id);
    }
  }, [tab.id]);

  const { isLoading, messages, values, submit, stop, getMessagesMetadata, joinStream } = useStream<AgentGraphState>({
    apiUrl: LANGGRAPH_DEPLOYMENT_URL,
    assistantId: AGENT_NAME,
    messagesKey: 'messages',
    threadId: tab.threadId,
    onThreadId: (threadId: string) => {
      onThreadIdChange(tab.id, threadId);
    },
    onError(error: unknown) {
      console.error('Error in useStream:', error instanceof Error ? error.message : error);
      dispatch({ type: 'SET_ERROR_SNACKBAR', payload: true });
    },
    onMetadataEvent: (metadata) => {
      if (metadata.run_id) {
        handleMetadataRunEvent({ run_id: metadata.run_id });
      }
    },
    onFinish: (result) => {
      handleResponseFinish(result);
    },
    callerOptions: {
      fetch: customFetch
    },
  });
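
The snippet above stores the run ID but does not show the rejoin step on mount. A minimal sketch of that step follows, assuming joinStream accepts a run ID and returns a promise. RunStorage, runStorageKey, and rejoinStoredRun are hypothetical names; storage is passed in so the logic can run outside the browser.

```typescript
// Hypothetical helpers; only the `chat:${id}:run` key format and the
// joinStream function name come from the post.

/** Minimal subset of the Web Storage API used here (window.localStorage satisfies it). */
interface RunStorage {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
  removeItem(key: string): void;
}

/** Same key format as in the workaround above. */
function runStorageKey(chatId: string): string {
  return `chat:${chatId}:run`;
}

/**
 * Rejoins a previously stored run, if there is one.
 * Resolves to the rejoined run ID, or to null when nothing was stored
 * or the rejoin failed (in which case the stale entry is removed).
 */
async function rejoinStoredRun(
  storage: RunStorage,
  chatId: string,
  joinStream: (runId: string) => Promise<void>, // assumed signature
): Promise<string | null> {
  const key = runStorageKey(chatId);
  const runId = storage.getItem(key);
  if (!runId) return null;
  try {
    await joinStream(runId);
    return runId;
  } catch {
    // The run may no longer exist; drop the entry so it is not retried forever.
    storage.removeItem(key);
    return null;
  }
}
```

In the component this could be called from an effect that runs on mount, for example `useEffect(() => { void rejoinStoredRun(window.localStorage, tab.id, joinStream); }, [tab.id]);`, with the stored key also removed once the run finishes.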

Since the stop() callback also relies on this runMetadataStorage object to hit the threads/{thread_id}/runs/{run_id}/cancel endpoint, I ended up calling stop() to abort the stream and then making the client.runs.cancel(threadId, runId) call in a server action.

  const handleStopBotResponse = useCallback(() => {
    // Bug: stop() doesn't call langgraph /cancel endpoint, but it does abort the stream
    stop();
    // Handle thread run cancel
    const runId = window.localStorage.getItem(`chat:${tab.id}:run`);
    if (tab.threadId && runId) {
      cancelLanggraphRun(tab.threadId, runId);
    }
    window.localStorage.removeItem(`chat:${tab.id}:run`);
    // rest of callback
  }, [messages, stop, tab.threadId, tab.id]);
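
The cancelLanggraphRun server action itself is not shown in the post. A minimal sketch of its core follows, assuming it wraps the client.runs.cancel(threadId, runId) call mentioned above. RunCanceller and cancelRun are hypothetical names; the client is passed in, so any object with a matching runs.cancel method works.

```typescript
// Hypothetical sketch; the author's actual server action may differ.

/**
 * Structural type for the one SDK call used here. The `Client` from
 * @langchain/langgraph-sdk is assumed to satisfy it.
 */
interface RunCanceller {
  runs: {
    cancel(threadId: string, runId: string): Promise<unknown>;
  };
}

/** Serializable result, so it can be returned from a server action. */
interface CancelResult {
  ok: boolean;
  error?: string;
}

/** Cancels a run and reports failure as data instead of throwing. */
async function cancelRun(
  client: RunCanceller,
  threadId: string,
  runId: string,
): Promise<CancelResult> {
  try {
    await client.runs.cancel(threadId, runId);
    return { ok: true };
  } catch (err) {
    return { ok: false, error: err instanceof Error ? err.message : String(err) };
  }
}
```

In a file marked "use server", cancelLanggraphRun(threadId, runId) could build the SDK client with server-side credentials and delegate to cancelRun, which keeps the API key out of the browser bundle.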

Figured I’d share in case this helps someone else. Also open to better ideas here if anyone’s got any!
Thanks
