How to time travel to before interrupt and resume with a different value?

Hello,

Goal:

Time-travel to before the graph interruption and provide a different resume value.

Current behaviour:

I have this graph:

    const graph = new StateGraph(MessagesAnnotation)
      .addNode("model_node", async (state) => {
        const model = this.getModel();

        const result = await model.invoke(
          [new SystemMessage("You are an AI assistant."), ...state.messages],
          { runId: uuidv4() },
        );

        return {
          messages: [result],
        };
      })
      .addNode("human_node", async (state, config) => {
        const nextHumanMessage = await interrupt({
          type: "waiting_for_input",
        });

        return {
          messages: [nextHumanMessage],
        };
      })
      .addEdge(START, "model_node")
      .addEdge("model_node", "human_node")
      .addEdge("human_node", "model_node")
      .compile({
        checkpointer: this.checkpointer,
      });
  1. I invoke the graph like this:

    graph.streamEvents({ messages: [new HumanMessage("Hello")] })

  2. Graph runs, the model replies (e.g., “Hello, how are you?”), and the graph interrupts on the human node.
  3. I use graph.getStateHistory() to get the latest checkpoint of the human_node and hold onto it.
  4. The human sends the next message, with which I resume the graph using: `new Command({resume: new HumanMessage("What is 1 + 1")})`
  5. Graph runs, the agent replies “1 + 1 = 2”.
  6. Now I want to time-travel and modify that “What is 1 + 1” message to “What is 2 + 2”. I do that using:

    graph.streamEvents(new Command({ resume: new HumanMessage("What is 2 + 2") }), {
      version: "v2",
      configurable: {
        thread_id: threadId,
        ...checkpoint?.configurable, // the checkpoint I held onto from earlier
      },
      signal: abortController.signal,
    });

  7. Graph runs. However, the agent still replies with “1 + 1 = 2”. In LangSmith, I can see the old resume value being used in the model node.

Here is a trace: LangSmith (slightly different example)

The only way I have found to get around this is with the following hack:

    .addNode("human_node", async (state, config) => {
      // @ts-ignore
      config.configurable["__pregel_scratchpad"].resume = []; // HACK

      const nextHumanMessage = await interrupt({
        type: "waiting_for_input",
      });

      return {
        messages: [nextHumanMessage],
      };
    })

This hack works for now, but I think it will break things in the future as I add sub-graphs, parallel interrupts, etc…

hi @KDKHD

do you want a “dirty” solution or as idiomatic as possible? :slight_smile:

hey @pawel-twardziak

idiomatic as possible pls :slight_smile:

hi @KDKHD sorry for the late response, was overwhelmed with work. Will answer this weekend


Hey @pawel-twardziak,

Do you know if it is actually possible to time-travel to before an interruption in an idiomatic way?

Hi @KDKHD
I have some findings already. I need to clarify them and answer, probably today.

My findings:

Why the scratchpad hack “works” but is unsafe

It works because it clears the in‑memory resume list before interrupt() looks at it, forcing the runtime to fall back to the “null resume” coming from the current Command. But __pregel_scratchpad is an internal implementation detail used to coordinate per‑task resumes, nested graphs, and subgraph resume mapping (PregelScratchpad in libs/langgraph-core/src/pregel/types.ts and _scratchpad in pregel/algo.ts). Mutating it:

  • can desynchronize the scratchpad from the checkpoint’s pending writes and __pregel_resume_map, especially if there are multiple interrupts in the same node or subgraph
  • will almost certainly break if the internal layout changes in a future LangGraph release
  • is not needed once you use either resume maps or updateState/forking

Using the documented resume map (Command({ resume: { interruptId: value } })) or fork + updateState patterns keeps you on the supported surface area and composes cleanly with more complex graphs (parallel interrupts, subgraphs, etc.), while giving you the “time travel and change the human input” behavior you’re after.

So, instead of mutating __pregel_scratchpad, you should (a) time‑travel to the checkpoint you saved at the interrupt, and (b) resume using a resume map keyed by the interrupt ID, or (for this simple graph) fork the state with updateState and change the messages channel. Both approaches are supported by LangGraph and won’t break with subgraphs or parallel interrupts.

This comes from how interrupts and resumes are implemented internally in LangGraph JS:

  • The interrupt helper reads its resume values from an internal scratchpad (__pregel_scratchpad) that is derived from checkpoint pending writes and an optional __pregel_resume_map in the config. See the implementation in libs/langgraph-core/src/interrupt.ts and how _scratchpad is built in libs/langgraph-core/src/pregel/algo.ts (where resume and nullResume are computed from checkpoint writes and resumeMap) for details.

  • When you first call new Command({ resume: new HumanMessage("What is 1+1") }), the runtime writes that resume value into the checkpoint’s pending writes and propagates it into the scratchpad on the next run (PregelLoop._first in libs/langgraph-core/src/pregel/loop.ts).

  • When you later time‑travel using the same checkpoint config, those stored resume writes are still associated with that checkpoint and are read before the new Command.resume you pass, so interrupt continues to return the old value ("What is 1+1"). Your scratchpad hack works because it force‑clears the resume array right before interrupt reads it, but that’s relying on private internals.
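
To make this concrete, here is a toy model of that lookup order. This is plain TypeScript and deliberately *not* LangGraph’s real code — just an illustration of why the stored resume write wins over the value on the new Command:

```typescript
// Toy model only: pending resume writes stored on a checkpoint are replayed
// before the resume value carried by the incoming Command.
interface ToyCheckpoint {
  pendingResumeWrites: string[];
}

function resolveResume(checkpoint: ToyCheckpoint, commandResume: string): string {
  if (checkpoint.pendingResumeWrites.length > 0) {
    // A resume recorded by an earlier Command({ resume }) takes precedence...
    return checkpoint.pendingResumeWrites[0];
  }
  // ...and only a checkpoint with no stored writes falls back to the new value.
  return commandResume;
}

const cp: ToyCheckpoint = { pendingResumeWrites: ["What is 1 + 1"] };
console.log(resolveResume(cp, "What is 2 + 2")); // still "What is 1 + 1"

// The scratchpad hack effectively does this right before interrupt() reads:
cp.pendingResumeWrites = [];
console.log(resolveResume(cp, "What is 2 + 2")); // now "What is 2 + 2"
```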

Pattern 1: use resume maps keyed by interrupt IDs

LangGraph supports resuming specific interrupts using a resume map: a mapping from interrupt IDs to resume values. This is the pattern shown in the human‑in‑the‑loop docs, where they build a resume_map from state.interrupts and pass it to Command(resume=...) in both Python and JS:

// JS-style example from the docs (simplified)
const state = await parentGraph.getState(threadConfig);
const resumeMap = Object.fromEntries(
  state.interrupts.map((i) => [
    i.interruptId,
    `human input for prompt ${i.value}`,
  ])
);

await parentGraph.invoke(new Command({ resume: resumeMap }), threadConfig);

In your case you can adapt this pattern as follows:

  1. After the first interrupt, capture the checkpoint and interrupt ID.

    const threadConfig = { configurable: { thread_id: threadId } };
    
    // Run until the interrupt on human_node
    for await (const ev of graph.streamEvents(
      { messages: [new HumanMessage("Hello")] },
      { ...threadConfig, version: "v2" },
    )) {
      // ...stop when you see the interrupt event...
    }
    
    // Get the checkpoints for this thread and pick the one at the human_node interrupt
    const history: StateSnapshot[] = [];
    for await (const s of graph.getStateHistory(threadConfig)) {
      history.push(s);
    }
    const interruptCheckpoint = history[0]; // or filter by metadata / node if needed
    
    // Extract the interrupt id from that checkpoint
    const interrupts = interruptCheckpoint.tasks
      .flatMap((t) => t.interrupts ?? []);
    const interruptId = interrupts[0]?.id;
    
  2. Normal resume (first time) using a resume map instead of a bare value.

    for await (const ev of graph.streamEvents(
      new Command({
        resume: { [interruptId]: new HumanMessage("What is 1+1") },
      }),
      { ...interruptCheckpoint.config, version: "v2" }, // note: use the checkpoint's config
    )) {
      // ...consume events until the next interrupt...
    }
    
  3. Later, time‑travel and change that same interrupt’s value.

    To “go back” to just before the interrupt and pretend the user had asked "What is 2+2" instead, reuse the same checkpoint and interruptId, but pass a different resume map:

    for await (const ev of graph.streamEvents(
      new Command({
        resume: { [interruptId]: new HumanMessage("What is 2+2") },
      }),
      {
        ...interruptCheckpoint.config, // already includes thread_id & checkpoint_id
        version: "v2",
        signal: abortController.signal,
      },
    )) {
      // ...consume events...
    }
    

Because this uses the documented resumeMap mechanism (__pregel_resume_map internally), it doesn’t rely on or overwrite __pregel_scratchpad directly. The mapping from interruptId → resume value is stable across time‑travel, nested graphs, and parallel interrupts, and the runtime computes the correct per‑task resume sequence from it (see _scratchpad and interrupt() in libs/langgraph-core/src/pregel/algo.ts and libs/langgraph-core/src/interrupt.ts).

Pattern 2: fork and directly edit the human message in state

For your particular graph—where the human node just returns { messages: [nextHumanMessage] }—another straightforward option is to fork the state at the checkpoint and modify the messages channel, as shown in the time‑travel docs:

  1. From the checkpoint you saved at the interrupt, call updateState to write a new human message into the state as if human_node had returned it:

    // Fork the checkpoint by updating messages as if human_node had run with "What is 2+2"
    const forkConfig = await graph.updateState(
      interruptCheckpoint.config,
      { messages: [new HumanMessage("What is 2+2")] },
      "human_node", // attribute the update to human_node
    );
    
  2. Then continue running from that forked checkpoint:

    for await (const ev of graph.streamEvents(
      null,
      { ...forkConfig, version: "v2" },
    )) {
      // model_node now sees the updated messages and should answer "4"
    }
    

This pattern is closer to the “forking” example in the JS time‑travel docs (time-travel) where updateState is used to edit a specific checkpoint and then the graph is resumed from the resulting forked checkpoint.


Feel free to try those patterns. I believe there is a more idiomatic way to solve your issue :slight_smile:

Thank you @pawel-twardziak for this write-up! This is really interesting. Pattern 2 is very creative! I’ll implement them ASAP and see how they work in practice. resumeMap is something new I learned today.


Sure :slight_smile: Let’s investigate more if needed!

Hey @pawel-twardziak ,

I don’t think the resumeMap works in JS. In the following example, the final log still contains “What is 1 + 1”. Full code is here: GitHub - KDKHD/interrupt_test

import { StateGraph, END, START, interrupt, messagesStateReducer, StateSnapshot, Command, Annotation } from "@langchain/langgraph";
import { MemorySaver } from "@langchain/langgraph";
import { BaseMessage, HumanMessage, AIMessage } from "@langchain/core/messages";

// Define the state interface
const StateAnnotation = Annotation.Root({
    messages: Annotation({
        reducer: messagesStateReducer,
        default: () => []
    }),
    turnCount: Annotation({
        reducer: (x: number, y: number) => y ?? x,
        default: () => 0,
    })
});

type GraphState = typeof StateAnnotation['State'];

// Create the memory checkpointer
const checkpointer = new MemorySaver();

// Define the agent node
async function agentNode(state: GraphState): Promise<Partial<GraphState>> {
    const { messages, turnCount } = state;
    const lastMessage = messages[messages.length - 1];

    console.log(`\n🤖 Agent (Turn ${turnCount}): Processing...`);

    // Simple echo agent - in a real scenario, this would call an LLM
    const response = `I received: "${lastMessage.content}". This is turn ${turnCount}.`;

    return {
        messages: [new AIMessage(response)],
        turnCount: turnCount + 1,
    };
}

// Define the human node (interrupt)
async function humanNode(state: GraphState): Promise<Partial<GraphState>> {
    console.log("\n👤 Human interrupt node activated!");
    console.log("Processing human input...");

    // Call interrupt to pause execution and wait for human input
    const humanMessage = interrupt("Waiting for human input...");

    // Simulate human input after interrupt is handled

    return {
        messages: [humanMessage],
    };
}

// Build the graph
const workflow = new StateGraph(StateAnnotation)
    .addNode("agent", agentNode)
    .addNode("human", humanNode)
    .addEdge(START, "agent")
    .addEdge("agent", "human")
    .addEdge("human", "agent");

// Compile the graph with checkpointer
const app = workflow.compile({ checkpointer });

// Main execution function
async function main() {
    console.log("🚀 Starting LangGraph with Memory Checkpointer");
    console.log("==============================================\n");

    const config = {
        configurable: {
            thread_id: "thread-1",
        },
    };

    // Initial state
    const initialState: Partial<GraphState> = {
        messages: [new HumanMessage("Hello! Let's start the conversation.")],
        turnCount: 1,
    };

    // Run the graph - it will handle interrupts automatically
    // In LangGraph v1.x, invoke accepts partial state updates
    const result = await app.invoke(initialState, config);

    // Print all messages
    console.log("\n📝 Conversation History:");
    console.log("========================");
    const messages = result.messages as BaseMessage[] | undefined;
    const turnCount = result.turnCount as number | undefined;

    if (messages && Array.isArray(messages)) {
        messages.forEach((msg: BaseMessage, idx: number) => {
            const type = msg.constructor.name.replace("Message", "");
            console.log(`\n[${idx + 1}] ${type}: ${msg.content}`);
        });
    }

    console.log("\n✅ Graph execution completed!");
    console.log(`Total turns: ${turnCount ?? 0}`);

    console.log("========================");

    const history: StateSnapshot[] = [];
    for await (const s of app.getStateHistory(config)) {
        history.push(s);
    }
    console.log(history);

    const statesWithInterrupts = history.filter(s => {
        const hasInterrupt = s.tasks.some(t => t.interrupts.length > 0);
        return hasInterrupt;
    });

    const lastInterruptedState = statesWithInterrupts[statesWithInterrupts.length - 1];
    const interruptId = lastInterruptedState.tasks[0].interrupts[0].id!;

    const result2 = await app.invoke(new Command({
        resume: {
            [interruptId]: new HumanMessage("What is 1 + 1")
        }
    }
    ), {
        configurable: {
            ...lastInterruptedState.config.configurable
        }
    });

    const result3 = await app.invoke(new Command({
        resume: {
            [interruptId]: new HumanMessage("What is 2 + 2")
        }
    }
    ), {
        configurable: {
            ...lastInterruptedState.config.configurable
        }
    });

    console.log(result3.messages);

}

// Run the main function
main().catch(console.error);


Also, if the interrupt happens in a subgraph, how does one retrieve a checkpoint from that subgraph interrupt? app.getStateHistory(config) only returns data from the parent graph. This means that if I want to provide a checkpoint ID while resuming, I can only resume to the specific node in the parent graph, and the subgraph starts from the beginning.

(If I don’t provide a checkpoint ID while resuming and just include the thread ID, then both the parent and child graphs resume directly to the subgraph interrupt, which is as expected.)
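
For context on what I mean: my understanding (possibly wrong!) is that subgraph checkpoints live under the same thread_id but carry a non-empty checkpoint_ns in configurable, while parent-graph checkpoints have an empty one. Here is a plain-TypeScript sketch of the kind of filtering I’d hope to do — the snapshot shape below is a simplification I made up, not the real StateSnapshot type:

```typescript
// Simplified stand-in for a snapshot's config; NOT the real StateSnapshot type.
interface ToySnapshot {
  configurable: {
    thread_id: string;
    checkpoint_ns?: string; // "" or absent for the parent graph, non-empty in a subgraph
    checkpoint_id?: string;
  };
}

// Keep only snapshots that belong to a subgraph (non-empty namespace).
function subgraphSnapshots(history: ToySnapshot[]): ToySnapshot[] {
  return history.filter((s) => (s.configurable.checkpoint_ns ?? "") !== "");
}

const toyHistory: ToySnapshot[] = [
  { configurable: { thread_id: "t1", checkpoint_ns: "", checkpoint_id: "parent-1" } },
  { configurable: { thread_id: "t1", checkpoint_ns: "human_node:abc", checkpoint_id: "sub-1" } },
];

console.log(subgraphSnapshots(toyHistory).map((s) => s.configurable.checkpoint_id)); // ["sub-1"]
```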

hi @KDKHD

thanks for the feedback - I’ll look at that again