Display branch switcher button without fetching the full history of messages

Hi,

I’m using LangGraph with useStream and I have a question about branching + branch switcher rendering in long conversations.

I’m implementing a chat UI with:

  • edit / regenerate using checkpoints (stream.submit(..., { checkpoint }))
  • branching support via getStateHistory / fetchStateHistory
  • branch navigation UI (switching between alternative generations)

When a user interacts with a message (edit or regenerate), I can:

  • retrieve metadata.firstSeenState.parent_checkpoint
  • compute branch information
  • correctly create new branches

For messages that were not interacted with, I don’t have branch metadata available.

So in a long conversation:

  • I only get branchOptions / checkpoint metadata lazily (when the user clicks or edits)
  • I cannot reliably render a full branch switcher for all messages
  • To fully compute branch availability, I would need to fetch and scan the entire checkpoint history

This creates a dilemma:

  • Either I eagerly fetch large parts of history (expensive for long-lived agents)
  • Or I compute metadata on demand (but then the UI is incomplete until interaction)

What is the recommended production pattern for handling this in LangGraph?

Related to: https://forum.langchain.com/t/edit-regenerate-with-usestream-adds-message-at-end-instead-of-replacing/

Hello @Jourdelune

The branch switcher in useStream is not lazy per-message. It is computed from a single bounded history fetch that covers all messages at once.

The core pipeline

fetchStateHistory (POST /threads/{id}/history, limit N)
        ↓
getBranchSequence()  ← builds parent→child tree from parent_checkpoint.checkpoint_id
        ↓
getBranchView()      ← flattens tree for active branch, produces branchByCheckpoint map
        ↓
getMessagesMetadataMap() ← maps each message to its firstSeenState + branchOptions

All of this runs client-side on the returned ThreadState[] array. branchOptions for every visible message is computed in one pass over that array; there is no per-message lazy fetch.

// Excerpt 1: getBranchSequence groups checkpoints by their parent id.
export function getBranchSequence<StateType extends Record<string, unknown>>(
  history: ThreadState<StateType>[]
) {
  // ...
  history.forEach((state) => {
    const checkpointId = state.parent_checkpoint?.checkpoint_id ?? "$";
    childrenMap[checkpointId] ??= [];
    childrenMap[checkpointId].push(state);
    // ...
  });
  // ...
}

// Excerpt 2 (from getMessagesMetadataMap): each message is matched to its firstSeenState.
  return options.getMessages(currentValues).map((message, idx) => {
    const messageId = message.id ?? idx;
    const firstSeenState = findLast(
      options.history ?? [],
      (state) =>
        state.values != null &&
        options.getMessages(state.values).map((m, idx) => m.id ?? idx).includes(messageId)
    );
    // branchOptions derived from branchByCheckpoint built above
    // ...
  });
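
To make the tree-building step concrete, here is a minimal standalone sketch of the parent→child grouping. It is not the SDK code: CheckpointState, buildChildrenMap and findForkPoints are hypothetical names, and the interface only keeps the two fields this step needs.

```typescript
// Minimal stand-in for the SDK's ThreadState; only the fields used here.
interface CheckpointState {
  checkpoint: { checkpoint_id: string };
  parent_checkpoint?: { checkpoint_id: string } | null;
}

const ROOT = "$"; // same sentinel the excerpt uses for "no parent"

// Group checkpoint ids by parent id (hypothetical helper, not an SDK export).
function buildChildrenMap(history: CheckpointState[]): Record<string, string[]> {
  const childrenMap: Record<string, string[]> = {};
  history.forEach((state) => {
    const parentId = state.parent_checkpoint?.checkpoint_id ?? ROOT;
    const siblings = childrenMap[parentId] ?? [];
    siblings.push(state.checkpoint.checkpoint_id);
    childrenMap[parentId] = siblings;
  });
  return childrenMap;
}

// A parent with more than one child is a fork, i.e. a place where a
// branch switcher can appear.
function findForkPoints(history: CheckpointState[]): string[] {
  const childrenMap = buildChildrenMap(history);
  return Object.keys(childrenMap).filter(
    (id) => (childrenMap[id] ?? []).length > 1
  );
}

const sampleHistory: CheckpointState[] = [
  { checkpoint: { checkpoint_id: "c1" }, parent_checkpoint: null },
  { checkpoint: { checkpoint_id: "c2" }, parent_checkpoint: { checkpoint_id: "c1" } },
  { checkpoint: { checkpoint_id: "c3a" }, parent_checkpoint: { checkpoint_id: "c2" } },
  { checkpoint: { checkpoint_id: "c3b" }, parent_checkpoint: { checkpoint_id: "c2" } },
];

console.log(findForkPoints(sampleHistory)); // the only fork point is "c2"
```

The point of the sketch is that fork detection needs nothing beyond the ids already present in the fetched array, which is why no extra request per message is required.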

The historyLimit knob

useStream accepts fetchStateHistory as a boolean (omitting it defaults to false) or an object { limit: number }. This sets the window for the single history fetch:

  const historyLimit =
    typeof options.fetchStateHistory === "object" &&
    options.fetchStateHistory != null
      ? (options.fetchStateHistory.limit ?? false)
      : (options.fetchStateHistory ?? false);

  const builtInHistory = useThreadHistory<StateType>(
    client,
    threadId,
    historyLimit,
    // ...
  );

Critical detail: the three possible values behave very differently:

// fetchStateHistory not set / false → calls getState, returns a single-element array.
// Only one checkpoint is available; no branch tree can be built, no branchOptions anywhere.
fetchStateHistory: false  // default

// fetchStateHistory: true → calls getHistory with limit hardcoded to 10
// Only the 10 most recent checkpoints are fetched
fetchStateHistory: true

// The correct production form — explicit limit
fetchStateHistory: { limit: 50 }

This is directly in the fetchHistory function:

function fetchHistory<StateType extends Record<string, unknown>>(
  client: Client,
  threadId: string,
  options?: { limit?: boolean | number }
) {
  if (options?.limit === false) {
    return client.threads.getState<StateType>(threadId).then((state) => {
      if (state.checkpoint == null) return [];
      return [state];
    });
  }

  const limit = typeof options?.limit === "number" ? options.limit : 10;
  return client.threads.getHistory<StateType>(threadId, { limit });
}

The underlying API call is POST /threads/{threadId}/history with { limit, before, metadata, checkpoint } as the body.
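
As a quick way to reason about the three cases, the two excerpts above can be combined into one pure function. This is only a sketch: planHistoryFetch, FetchStateHistoryOption and HistoryPlan are hypothetical names, and the function just reports which call the excerpts would make instead of performing it.

```typescript
// Hypothetical types mirroring the option shape described above.
type FetchStateHistoryOption = boolean | { limit?: number } | undefined;

type HistoryPlan =
  | { call: "getState" }
  | { call: "getHistory"; limit: number };

// Reports which client call the quoted logic would end up making.
function planHistoryFetch(option: FetchStateHistoryOption): HistoryPlan {
  // Step 1: same normalisation as the historyLimit expression.
  const historyLimit: boolean | number =
    typeof option === "boolean"
      ? option
      : option === undefined
        ? false
        : (option.limit ?? false);

  // Step 2: same branching as fetchHistory.
  if (historyLimit === false) return { call: "getState" };
  return {
    call: "getHistory",
    limit: typeof historyLimit === "number" ? historyLimit : 10,
  };
}

console.log(planHistoryFetch(undefined));     // getState
console.log(planHistoryFetch(true));          // getHistory with limit 10
console.log(planHistoryFetch({ limit: 50 })); // getHistory with limit 50
```

Read literally, the excerpt also implies that an object without a limit falls back to the getState path, so it is worth always spelling the number out.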

What branchOptions actually means

branchOptions is not set for every message; it is only populated for the first message at each fork point. Messages on a linear (non-forked) path will have branchOptions: undefined, which is correct behaviour because there is nothing to switch between.

      branchByCheckpoint[checkpointId] = {
        branch: item.path.join(PATH_SEP),
        branchOptions: (item.path.length > 0
          ? (pathMap[item.path.at(-2) ?? ROOT_ID] ?? [])
          : []
        ).map((p) => p.join(PATH_SEP)),
      };

Messages on the trunk (item.path.length === 0) get an empty branchOptions array and an empty string branch. Then in getMessagesMetadataMap, if branch.branch is empty the entire branch object is discarded (branch = undefined). There is also a deduplication pass: if the same set of branch options was already emitted for a prior message, the current message’s switcher is suppressed. This ensures the branch switcher only appears once per fork point, at the earliest message in that fork.
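
The two filtering rules described above (drop trunk entries, then suppress repeated option sets) can be sketched as follows. This is an illustration of the described behaviour, not the SDK implementation; BranchInfo and dedupeSwitchers are hypothetical names.

```typescript
// Hypothetical shape of the per-message branch info before filtering.
interface BranchInfo {
  branch: string;
  branchOptions: string[];
}

// Keeps a switcher only for the first message that carries a given
// set of branch options; trunk messages (empty branch) get none.
function dedupeSwitchers(
  perMessage: Array<BranchInfo | undefined>
): Array<BranchInfo | undefined> {
  const emitted: Record<string, boolean> = {};
  return perMessage.map((info) => {
    if (info === undefined || info.branch === "") return undefined; // trunk
    const key = JSON.stringify(info.branchOptions);
    if (emitted[key] === true) return undefined; // already shown for this fork
    emitted[key] = true;
    return info;
  });
}

const visible = dedupeSwitchers([
  { branch: "", branchOptions: [] },             // trunk message
  { branch: "b1", branchOptions: ["b1", "b2"] }, // first message of the fork
  { branch: "b1", branchOptions: ["b1", "b2"] }, // later message, same fork
]);

// Only the second message keeps its switcher.
console.log(visible.map((info) => info !== undefined));
```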

Partial history is handled gracefully

getBranchSequence explicitly handles the case where the fetched history is a window (not the full tree). Among all orphaned roots (checkpoints whose parent is not present in the batch), it selects the one whose subtree contains the most recent (latest) descendant checkpoint and treats it as the virtual root "$":

  // If dealing with partial history, take the branch
  // with the latest checkpoint and mark it as the root.
  const lastOrphanedNode =
    childrenMap.$ == null
      ? Object.keys(childrenMap)
          .filter((parentId) => !nodeIds.has(parentId))
          .map((parentId) => { /* traverse to find latest descendant */ })
          .sort((a, b) => a.lastId.localeCompare(b.lastId))
          .at(-1)?.parentId
      : undefined;

  if (lastOrphanedNode != null) childrenMap.$ = childrenMap[lastOrphanedNode];

The .sort().at(-1) picks the maximum lastId, i.e. the most recently-active branch. This is intentional: when history is a partial window you want to render the live conversation thread, not a stale old branch. This is why partial history still produces correct branch rendering within the fetched window.
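
Here is a small self-contained sketch of that root-selection rule, with the elided traversal filled in by a simple breadth-first walk of my own. WindowedCheckpoint and pickVirtualRoot are hypothetical names, and the sketch assumes (as the sort on lastId in the excerpt suggests) that checkpoint ids order lexicographically by creation time.

```typescript
// Minimal stand-in for the SDK's ThreadState; only the ids matter here.
interface WindowedCheckpoint {
  checkpoint: { checkpoint_id: string };
  parent_checkpoint?: { checkpoint_id: string } | null;
}

// Returns the parent id to promote to the virtual root "$", or
// undefined when the window already contains the real root.
function pickVirtualRoot(history: WindowedCheckpoint[]): string | undefined {
  const nodeIds: Record<string, boolean> = {};
  const childrenMap: Record<string, string[]> = {};
  history.forEach((state) => {
    const id = state.checkpoint.checkpoint_id;
    const parentId = state.parent_checkpoint?.checkpoint_id ?? "$";
    nodeIds[id] = true;
    const siblings = childrenMap[parentId] ?? [];
    siblings.push(id);
    childrenMap[parentId] = siblings;
  });
  if (childrenMap["$"] !== undefined) return undefined;

  // Walk one subtree and return its greatest (assumed latest) id.
  const latestDescendant = (parentId: string): string => {
    let latest = "";
    const queue = (childrenMap[parentId] ?? []).slice();
    while (queue.length > 0) {
      const id = queue.shift() as string;
      if (id > latest) latest = id;
      (childrenMap[id] ?? []).forEach((child) => queue.push(child));
    }
    return latest;
  };

  // Orphaned parents are referenced as a parent but absent from the window.
  const ranked = Object.keys(childrenMap)
    .filter((parentId) => nodeIds[parentId] !== true)
    .map((parentId) => ({ parentId, lastId: latestDescendant(parentId) }))
    .sort((a, b) => (a.lastId < b.lastId ? -1 : a.lastId > b.lastId ? 1 : 0));
  const winner = ranked[ranked.length - 1];
  return winner === undefined ? undefined : winner.parentId;
}

// Two orphaned subtrees: p1 -> c03 (stale) and p2 -> c07 -> c08 (live).
const windowed: WindowedCheckpoint[] = [
  { checkpoint: { checkpoint_id: "c08" }, parent_checkpoint: { checkpoint_id: "c07" } },
  { checkpoint: { checkpoint_id: "c07" }, parent_checkpoint: { checkpoint_id: "p2" } },
  { checkpoint: { checkpoint_id: "c03" }, parent_checkpoint: { checkpoint_id: "p1" } },
];

console.log(pickVirtualRoot(windowed)); // "p2", the parent of the live branch
```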


Recommended Production Pattern

The framework already chose a clear design: one bounded upfront fetch, all branch metadata computed from that window. There is no architectural dilemma — only the right knobs to tune.

1. Always use fetchStateHistory with an explicit limit

const stream = useStream({
  apiUrl: "...",
  assistantId: "...",
  fetchStateHistory: { limit: 50 },
  // ...
});

Do not use fetchStateHistory: true in production. It silently caps at 10 checkpoints, which is almost always too few once a conversation grows past a few turns. Always pass { limit: N }.

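For most production conversations, a limit of 50–100 checkpoints is usually enough to cover everything displayed in the viewport. Keep in mind that the limit counts checkpoints, not messages, and a single turn normally writes several checkpoints (one per graph step). This is the cost/completeness tradeoff the framework exposes.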

2. Understand what you get within the window

Within the fetched window, branchOptions is computed automatically at every fork point, whether or not the user has interacted with the messages there. If metadata only shows up after a click, the most likely cause is that fetchStateHistory was left at false, or set to true and therefore capped at 10 checkpoints.

const meta = stream.getMessagesMetadata(message);
meta.branchOptions;    // string[] | undefined — set only for fork-point messages
meta.firstSeenState;   // ThreadState with parent_checkpoint available for all messages in window
meta.branch;           // current branch identifier
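
For rendering, the switcher state can be derived from those two fields alone. The helper below is a minimal sketch with hypothetical names (branchNavigation, BranchNav); it assumes the current branch id appears in branchOptions, which is how the metadata above is described.

```typescript
// Hypothetical view-model for a "< 2 / 3 >" style branch switcher.
interface BranchNav {
  position: number; // 1-based index of the active branch
  total: number;
  previous: string | undefined;
  next: string | undefined;
}

// Returns undefined when there is nothing to switch between.
function branchNavigation(
  branch: string | undefined,
  branchOptions: string[] | undefined
): BranchNav | undefined {
  if (branch === undefined || branchOptions === undefined) return undefined;
  if (branchOptions.length < 2) return undefined;
  const index = branchOptions.indexOf(branch);
  if (index === -1) return undefined;
  return {
    position: index + 1,
    total: branchOptions.length,
    previous: index > 0 ? branchOptions[index - 1] : undefined,
    next: index < branchOptions.length - 1 ? branchOptions[index + 1] : undefined,
  };
}

const nav = branchNavigation("b2", ["b1", "b2", "b3"]);
console.log(nav); // position 2 of 3, previous "b1", next "b3"
```

In a component, previous and next would typically be handed to stream.setBranch when the user clicks an arrow; check that this matches the SDK version you are on.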

3. For pagination in long conversations

The client.threads.getHistory API accepts a before parameter (a Config object containing checkpoint info). For very long threads, implement scroll-to-load-older-history:

const olderHistory = await client.threads.getHistory(threadId, {
  limit: 50,
  before: configOfOldestCheckpointInCurrentWindow,
});
// Merge into your local history array and re-run getBranchContext

This pattern matches the SDK’s design intent — the before parameter exists precisely for this cursor-based pagination.
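
The merge step itself can stay very small. The sketch below uses hypothetical names (PagedCheckpoint, mergeHistoryPages) and assumes pages arrive newest-first. It de-duplicates by checkpoint_id because I am not certain whether the before cursor is inclusive, so the boundary checkpoint may show up in both pages.

```typescript
// Minimal stand-in for the SDK's ThreadState; only the ids matter here.
interface PagedCheckpoint {
  checkpoint: { checkpoint_id: string };
  parent_checkpoint?: { checkpoint_id: string } | null;
}

// Appends an older page to the current window, skipping checkpoints
// that are already present. Order is preserved: current first, then older.
function mergeHistoryPages<T extends PagedCheckpoint>(current: T[], older: T[]): T[] {
  const seen: Record<string, boolean> = {};
  const merged: T[] = [];
  current.concat(older).forEach((state) => {
    const id = state.checkpoint.checkpoint_id;
    if (seen[id] === true) return;
    seen[id] = true;
    merged.push(state);
  });
  return merged;
}

// Tiny factory for the example data below.
const page = (ids: string[]): PagedCheckpoint[] =>
  ids.map((id) => ({ checkpoint: { checkpoint_id: id } }));

const mergedWindow = mergeHistoryPages(page(["c5", "c4", "c3"]), page(["c3", "c2", "c1"]));
console.log(mergedWindow.map((s) => s.checkpoint.checkpoint_id).join(","));
```

The merged array can then be fed through the same branch computation as the initial window; orphaned parents simply move further back as more pages are loaded.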

4. For edit/regenerate: use the already-available firstSeenState

When the user edits a message, you already have what you need without any additional fetch:

const meta = stream.getMessagesMetadata(message);
const parentCheckpoint = meta.firstSeenState?.parent_checkpoint;

await stream.submit(newInput, {
  checkpoint: parentCheckpoint,  // branch from this point
});

firstSeenState.parent_checkpoint is available for all messages present in the history window; this is the intended branching anchor.

5. For very long-lived agents: push metadata at write time

If your threads grow to thousands of checkpoints and even a limit-50 fetch is incomplete, consider storing branch-relevant metadata (e.g. message_id → checkpoint_id) in your own database when each run completes. You can use on_chain_end / stream checkpoint events for this. This avoids ever needing a full history scan for rendering.
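
As a rough illustration of such a write-time index, here is an in-memory sketch. Everything in it is hypothetical (BranchAnchorRecord, BranchAnchorIndex, the record shape); a real version would persist to your database and would be fed from whatever run or checkpoint event you choose to hook into.

```typescript
// Hypothetical record: where a message was first seen and its branching anchor.
interface BranchAnchorRecord {
  messageId: string;
  checkpointId: string;
  parentCheckpointId: string | null;
}

// In-memory stand-in for a database table keyed by message id.
class BranchAnchorIndex {
  private records: Record<string, BranchAnchorRecord> = {};

  // Call once per checkpoint, in chronological order. Only the first
  // sighting of a message is stored, mirroring firstSeenState.
  recordCheckpoint(
    checkpointId: string,
    parentCheckpointId: string | null,
    messageIds: string[]
  ): void {
    messageIds.forEach((messageId) => {
      if (this.records[messageId] !== undefined) return;
      this.records[messageId] = { messageId, checkpointId, parentCheckpointId };
    });
  }

  // Returns the stored anchor, or undefined for unknown messages.
  lookup(messageId: string): BranchAnchorRecord | undefined {
    return this.records[messageId];
  }
}

const anchorIndex = new BranchAnchorIndex();
anchorIndex.recordCheckpoint("c1", null, ["m1"]);
anchorIndex.recordCheckpoint("c2", "c1", ["m1", "m2"]);

// m2 first appeared in c2, so an edit or regenerate would branch from c1.
console.log(anchorIndex.lookup("m2"));
```

With an index like this, the edit/regenerate anchor for an old message is a single key lookup instead of a history scan.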


TL;DR

| Concern | Answer |
| --- | --- |
| “branchOptions only available after interaction” | Enable fetchStateHistory: { limit: N }; branchOptions is then computed upfront for all fork-point messages in the window |
| fetchStateHistory: true vs { limit: 50 } | Not equivalent. true silently caps at 10 checkpoints; always use { limit: N } in production |
| branchOptions on every message | Only at fork points; linear messages get undefined, which is correct behaviour |
| Branch switcher appearing on multiple messages | Deduplicated by the SDK; the same fork only emits a switcher once |
| Cost for long conversations | Tune limit to cover the visible viewport; paginate with the before cursor on scroll |
| firstSeenState.parent_checkpoint not available | Available for any message present in the fetched history window |
| Full history eagerly | Not recommended; the framework is explicitly designed around bounded, partial-history-aware tree building |

The getBranchSequence partial-history handling (selecting the most-recently-active orphaned branch as root) and the silent limit: 10 default when passing fetchStateHistory: true are the two most important implementation details to understand.