Agent executes twice when using interrupt

The Problem:

I'm performing an action using a tool plus an interrupt. Once the tool executes a certain process, the graph should interrupt. Because that process produces a different result on every execution, I can't place the interrupt inside the tool, so I use a flag to redirect to an interrupt node, where I do the preprocessing and postprocessing for the interrupt.

When resuming the graph, it should start from the interrupt_node, perform the postprocessing (updating the tool message), and provide that new tool message to the agent, which should produce one AI message. But in my case, it produces 2 AI messages.

Understanding I got from Debugging:

After receiving a signal from the tool_node, I proceed to the interrupt_node. I use the interrupt keyword to pause the graph. When it resumes, it first goes to the agent_node. At the agent_node, it reaches the line where the llm chain is invoked. Instead of executing this line, it continues from the interrupt_node. While at the interrupt_node, I update the tool message (since I didn’t use the interrupt within the tool_node, as I needed to perform some operations first, which conflicted with the interrupt requirements). After updating the tool message, I update the state to reflect this specific tool message. Finally, the flow returns to the agent_node as initially outlined in the graph’s setup.

Now, it resumes its execution from the llm chain invocation line, thus generating a message.
After that, the agent_node is executed a second time, from its first line, generating a second message.

If anyone can explain this behaviour, then it would be quite helpful.

Approaches I have used:

I've tried using Command(goto="agent_node", update=...), returning the updated state, and defining the edge during initialization as well. Nothing worked. I've also asked Chat LangChain, but it gets confused while helping me and returns the same result.

Graph Initialization

builder = StateGraph(states.GraphState, states.ContextSchema)

builder.add_node("agent_node", nodes.agent_node)
builder.add_node("tool_node", ToolNode(list(tools.tools_map.values())))
builder.add_node("interrupt_node", nodes.interrupt_node)

builder.add_edge(START, "agent_node")
builder.add_conditional_edges("tool_node", nodes.conditional_edge_for_tool_node_to_interrupt_node, {
    "interrupt_node": "interrupt_node",
    "agent_node": "agent_node"
})
# Explicit edge from interrupt_node to agent_node to prevent double routing
builder.add_edge("interrupt_node", "agent_node")
graph = builder.compile(checkpointer=checkpointer, store=memory_store)
  • The above snippet is how I create my graph.

Remaining Nodes:

async def agent_node(state: GraphState, config, runtime: Runtime, store: BaseStore):

    student_profile = ... # user profile fetched from the store

    grade_subject_id = state.get("grade_subject_id")
    grade_subject_name = "Subject not selected yet."
    if grade_subject_id:
        grade_subject_name = state.get("grade_subject_response").get("grade_subjects").get(grade_subject_id).get("grade_subject_name")

    input_dict = {
        "messages": state["messages"],
        "current_time": datetime.now(ZoneInfo("Asia/Kolkata")).strftime("%Y-%m-%d %H:%M:%S"),
        "student_profile": student_profile,
        "available_skills": SKILLS_PROMPT,
        "selected_subject": grade_subject_name
    }
    res = await tuvaed_runnable.ainvoke(input_dict)
    logger.info(res.pretty_repr())

    if res.tool_calls:
        writer = get_stream_writer()

        writer({"is_tool_call": True, "tool_calls": res.tool_calls})

        return Command(
            goto="tool_node",
            update={
                "messages": [res],
            }
        )
    
    return Command(
        goto=END,
        update={
            "messages": [res],
        }
    )



def process_interrupt_response(interrupt_response: InterruptResponse):
    # Using this, I'm preparing the feedback for the tool message and updated values for the state keys.
    interrupt_reason_res = interrupt_response.reason.value

    feedback_dict = interrupt_response.feedback or {}
    state_update_dict = {
        "is_flow_interrupted": False,
        "interrupt_reason": None,
        "interrupt_tool_call_id": None,
    }

    tool_call_feedback = ""

    if interrupt_reason_res == InterruptionReason.ASSESSMENT_CREATED.value:
        assessment_feedback = "some feedback string"
        state_update_dict["assessment_meta"] = None
        tool_call_feedback = assessment_feedback
    else:
        raise ValueError(f"Invalid reason passed. Received: `{interrupt_reason_res}`")

    print("--------------------------------")
    print(f"tool_call_feedback: {tool_call_feedback}, state_update_dict: {state_update_dict}")
    print("--------------------------------")

    return tool_call_feedback, state_update_dict

def prepare_interrupt_payload(state: GraphState):
    # Using this function, I'm doing the preprocessing for the interrupt response which will be consumed by the FE
    final = {
        "interrupt_tool_call_id": state.get("interrupt_tool_call_id"),
        "is_flow_interrupted": state.get("is_flow_interrupted"),
        "interrupt_reason": state.get("interrupt_reason"),
    }

    if not all(final.values()):
        raise ValueError(f"Missing required fields in final payload, values: {final}")


    interrupt_reason = final["interrupt_reason"]

    if interrupt_reason == InterruptionReason.ASSESSMENT_CREATED.value:
        assessment_meta = state.get("assessment_meta")
        if not assessment_meta:
            raise ValueError("Assessment meta not found")
        final["data_payload"] = assessment_meta
    
    return final

async def interrupt_node(state: GraphState, config, runtime: Runtime, store: BaseStore):
    """
    Interrupts the flow and waits for the feedback from the user.
    When resuming, this node processes the resume response and continues to agent_node.
    """

    final = prepare_interrupt_payload(state)

    logger.info(f"***Final payload before interrupting***: {final}")
    interrupt_response = interrupt(final)
    logger.info(f"***Interrupt response***: {interrupt_response}")

    interrupt_response = InterruptResponse(**interrupt_response)
    tool_call_feedback, state_update_dict = process_interrupt_response(interrupt_response)

    messages = state["messages"]
    existing_tool_message = next(
        (
            m for m in reversed(messages)
            if isinstance(m, ToolMessage) and m.tool_call_id == interrupt_response.tool_call_id
        ),
        None,
    )
    
    if not existing_tool_message:
        logger.error(f"Interrupt tool call id {interrupt_response.tool_call_id} not found in messages")
        return state_update_dict

    existing_tool_message.content += tool_call_feedback
    
    return {
        "messages": [existing_tool_message], 
        **state_update_dict
    }

async def conditional_edge_for_tool_node_to_interrupt_node(state: GraphState, config, runtime: Runtime, store: BaseStore):
    """
    Conditional edge for tool node to interrupt node.
    This should only be evaluated when coming FROM tool_node, not when coming from interrupt_node.
    """
    is_flow_interrupted = state.get("is_flow_interrupted")
    logger.info(f"Conditional edge evaluation: is_flow_interrupted={is_flow_interrupted}")
    
    if is_flow_interrupted:
        logger.info("Interrupting the flow - routing to interrupt_node")
        return "interrupt_node"
    
    logger.info("Flow not interrupted - routing to agent_node")
    return "agent_node"

It would be helpful if anyone could explain the problem and a possible solution.

hi @neel

In LangGraph, interrupt(value) does not pause at that line and then continue from the next line later.

Instead:

  • The first call raises a GraphInterrupt and stops execution.
  • When you resume with Command(resume=...), the value you pass becomes the return value of interrupt(), and the node function restarts from the beginning (so any code before interrupt() runs again). This is documented explicitly in the LangGraph Interrupts docs.
    Source: LangGraph Interrupts

That ‘node restarts from the beginning’ part is usually where unexpected duplicates come from if you do side effects before the interrupt() call.
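That restart behaviour can be illustrated with a small stand-in. This is a simplified model, not the real LangGraph machinery: `GraphInterrupt`, `interrupt()`, and the runner below are toy stand-ins, written only to show why code placed before `interrupt()` executes twice:

```python
# Toy model of LangGraph's interrupt/resume semantics -- NOT the real
# machinery. GraphInterrupt, interrupt() and the runner are simplified
# stand-ins to show why a node body re-runs on resume.

class GraphInterrupt(Exception):
    """Raised on the first call to interrupt(); unwinds the node."""

_resume_value = None  # set by the runner before the node is re-executed

def interrupt(payload):
    if _resume_value is None:
        raise GraphInterrupt(payload)   # first pass: pause the graph
    return _resume_value                # second pass: return the resume payload

calls_before_interrupt = []

def interrupt_node(state):
    # Everything up to interrupt() runs on BOTH passes.
    calls_before_interrupt.append("preprocessing")
    answer = interrupt({"question": "approve?"})
    return {"feedback": answer}

def run_node_with_resume(node, state, resume):
    global _resume_value
    try:
        return node(state)              # first pass raises GraphInterrupt
    except GraphInterrupt:
        _resume_value = resume
        return node(state)              # resume: node restarts from the top

result = run_node_with_resume(interrupt_node, {}, resume="approved")
print(result)                           # {'feedback': 'approved'}
print(len(calls_before_interrupt))      # 2 -- preprocessing ran twice
```

This is why everything before `interrupt()` must be idempotent or side-effect free.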

How do you resume the graph after the interrupt?

Hi @pawel-twardziak ,

Thanks for the help.

Yeah, it is already mentioned in the docs, and that is exactly why I'm interrupting the graph in a node whose result is idempotent even if it is re-executed during the resume.

As for the second point you mentioned, my interrupt_node in which the interrupt is being called, only performs below operations.

  • prepare the final payload for the interrupt.
    • produces a dictionary containing the tool_call_id, flag for the interruption, and reason flag to execute different processes. And data_payload, required for different processes.
  • Interrupt.
    • Actual interrupt and receiving the response when the graph is resumed.
  • Process the interrupt response to resume
    • Postprocessing involves the resetting flags that were helpful during the interrupt, because the graph has been resumed.
    • Updating the tool message which will provide the detailed understanding of the user’s actions.

Because the interrupt_node will be re-executed while resuming the graph, the above process is idempotent.

As for resuming the graph, below is my code:

    async def stream_graph(
        self,
        user_id: int,
        session_id: str,
        user_query: str,
        resume_payload: dict[str, Any] | None,
        graph_config: dict[str, Any],
        graph_context: dict[str, Any],
    ):
        if not self.graph:
            raise ValueError("Please build the graph first")
        

        if resume_payload:
            logger.info(f"Resume payload: {resume_payload}")
            graph_state = Command(resume=resume_payload)
        else:
            user_message = HumanMessage(content=user_query)
            logger.debug(user_message.pretty_repr())
            graph_state = {
                "messages": [user_message],
                "is_subtopic_marked_completed": False
            }

        context_schema = states.ContextSchema(**graph_context)


        async def _stream(graph_input):
            try:
                logger.info(f"Creating graph.astream for user_id={user_id}, session_id={session_id}")
                streaming_obj = self.graph.astream(
                    graph_input, 
                    config=graph_config,
                    context=context_schema,
                    stream_mode=["messages", "updates", "custom"]
                )

                cumulated_tokens = ""
                cumulated_tokens_count = 0
                chunk_iteration_count = 0
                async for mode, chunk in streaming_obj:
                    chunk_iteration_count += 1
                    if chunk_iteration_count == 1:
                        logger.info(f"Received first chunk from graph.astream for user_id={user_id}, session_id={session_id}")
                    data = {}

                    if mode == "messages":

                        token, metadata = chunk
                        if metadata["langgraph_node"] == "agent_node":
                            if not isinstance(token, (HumanMessageChunk, HumanMessage)):
                                if token.content:
                                    cumulated_tokens += token.content
                                    cumulated_tokens_count += 1

                                if token.usage_metadata:
                                    data["end_message"] = True
                                    data["usage_metadata"] = token.usage_metadata
                                    
                                if cumulated_tokens_count == 5 or "end_message" in data:
                                    data["token"] = cumulated_tokens
                                    cumulated_tokens = ""
                                    cumulated_tokens_count = 0

                    elif mode == "custom":
                        if chunk:
                            data.update(chunk)

                    elif mode == "updates":
                        if chunk.get("__interrupt__"):
                            data.setdefault("updates", {}).update(chunk["__interrupt__"][0].value)
                        else:
                            tool_node_update = chunk.get("tool_node", {})
                            if tool_node_update and "is_subtopic_marked_completed" in tool_node_update:
                                data["updates"] = {"is_subtopic_marked_completed": tool_node_update["is_subtopic_marked_completed"]}

                    if data:
                        logger.debug(f"data: {data}")
                        # Write each data dict as a json line (jsonl), not as SSE ("data: ...\n\n")
                        yield json.dumps(data) + "\n"
                        await asyncio.sleep(0.05)

            except Exception as e:
                logger.exception(f"Error streaming the graph: {str(e)}")
                # Output a json line for error, not an SSE line
                yield json.dumps({'error': str(e)}) + "\n"
            
            logger.info(f"graph streaming completed successfully for user `{user_id}` with session_id `{session_id}`")
            yield json.dumps({'end': True}) + "\n"


        return _stream(graph_input=graph_state)

I'm using try/except here to provide the end flag in every case (graceful completion or an unexpected error). The docs say not to use interrupt inside try/except, but that is not the case here, right?

I think, ‘Understanding I got from Debugging’ section has caused some confusion, let me clarify.

The expected behaviour:

  1. The tool node generates the signal to interrupt and redirects to interrupt_node via the conditional edge.
  2. The interrupt node does the pre-processing for the interrupt and pauses the graph, returning the payload I created.
  3. The external system resumes the graph with the resume payload.
  4. Now the graph should resume its execution from interrupt_node, which does the preprocessing to create the payload (it won't interrupt this time) and then the post-processing mentioned above.
  5. After that, it should be redirected to the relevant node, in my case agent_node, as defined during graph initialization (I tried with Command as well).
  6. Now the graph should continue the remaining execution as if it had never been interrupted.

The behaviour I saw during the debugging:

Till point 3, it follows the expected behaviour.

  1. The graph starts from agent_node (agent_node is the first node in the graph). It reaches the line that invokes the llm chain. Before executing that line, the debugging pointer jumps to the start of interrupt_node.
  2. The interrupt_node is executed. It does the preprocessing to create the payload (it won't interrupt this time) and the post-processing mentioned above.
  3. After the execution of interrupt_node, the flow is redirected to agent_node. Now the debugging pointer is on the llm_chain invocation line. An AI message is created and the execution of agent_node completes.
  4. After that completion, the interrupt_node is re-executed. Because it redirects to agent_node, agent_node is called again, producing the message a second time.

I'm not sure whether this behaviour is related to the IDE's debugging feature, but in the end the agent produces the message twice, and that is the problem.

Let me know if this helps or if you need more details.

hi @neel

thanks for following up.

Two more questions:

  • Do any of your tools (inside ToolNode) ever return Command(goto=…)? (Even in some branches)
  • Besides START->agent_node and interrupt_node->agent_node, do you also have any static edges out of agent_node or tool_node (in addition to returning Command(goto=…))?

Hi @pawel-twardziak

  • All the tools are returning Command(goto="agent_node", update=...).
  • I had only 2 edges at first: (START → agent_node) and (tool_node → [agent_node, interrupt_node]). The current version has a 3rd edge (interrupt_node → agent_node), as suggested by the Chat LangChain bot. That's it, nothing else.

Hi @pawel-twardziak ,

In addition to this problem, I have faced another situation as well, not frequently but it had occurred sometimes.

So, I'm using the interrupt to process complex user actions, and the result of that action is appended to the original tool call. The agent is only supposed to see the rich/complete tool message in the sequence I have implemented (agent calls the tool → tool executes, returns a partial response, and generates the interrupt signal → graph interrupts → on resume, the relevant tool message is updated → flow returns to the agent node). However, I've sometimes seen that after the interrupt executes (while the tool message is still incomplete), the agent generates a response based on that incomplete tool message, and then follows the same process mentioned in this discussion, generating 2 AI messages.

Sharing this with you, hope it may help.

Hi @neel

thanks, I’ll go through this again today

Hi @neel ,

Short version: this is almost certainly a graph topology + Command(goto=…) interaction issue, not an IDE debugging artifact.


Root Cause (Most Likely)

You currently have:

  • Static edge: interrupt_node → agent_node

  • Tools returning: Command(goto="agent_node")

  • And earlier: tool_node → agent_node

This creates multiple valid paths back to agent_node, so after resume:

  1. interrupt_node runs (expected)

  2. It routes to agent_node

  3. agent_node finishes

  4. Another edge/goto sends it back again

  5. → duplicate AI message

That’s why you see the model response twice.

This is classic “double routing” in LangGraph.


Important Rule in LangGraph

You should choose one routing mechanism:

  • Either static edges

  • Or Command(goto=...)

Not both for the same transition.

Mixing them causes exactly this behavior.
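The effect can be reproduced with a toy dispatcher. This is a simplified model of a graph scheduler, not LangGraph itself; `run`, `nodes`, and `static_edges` are invented names used only to show how a goto plus a static edge schedules the same successor twice:

```python
# Toy dispatcher -- not LangGraph -- illustrating "double routing":
# a node that returns a goto AND has a static edge to the same target
# gets that successor scheduled twice.

def run(nodes, static_edges, start):
    """Run nodes until nothing is scheduled; record the visit order."""
    visits, queue = [], [start]
    while queue:
        name = queue.pop(0)
        visits.append(name)
        goto = nodes[name]()                      # Command(goto=...) stand-in
        if goto:
            queue.append(goto)
        queue.extend(static_edges.get(name, []))  # static edges ALSO fire
    return visits

nodes = {
    "interrupt_node": lambda: "agent_node",  # returns a goto target
    "agent_node": lambda: None,              # terminal in this toy model
}

# Mixed routing: goto AND a static edge both point back to agent_node.
double = run(nodes, {"interrupt_node": ["agent_node"]}, "interrupt_node")
print(double)   # ['interrupt_node', 'agent_node', 'agent_node']

# Single routing authority: remove the static edge, keep only the goto.
single = run(nodes, {}, "interrupt_node")
print(single)   # ['interrupt_node', 'agent_node']
```

In the mixed case, agent_node runs twice, which is exactly one duplicate AI message per cycle.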


What To Fix

If your tools already return:

Command(goto="agent_node")

Then remove:

interrupt_node → agent_node

static edge.

Let routing happen only via Command.


About the “agent responding before interrupt finishes”

This usually happens if:

  • The interrupt is not the last executed transition

  • Or there’s still a path allowing agent_node to execute before resume completes

Again — usually caused by multiple edges pointing back to agent_node.


Recommended Clean Architecture

  • START → agent_node (static)

  • tool_node returns Command(goto=…)

  • interrupt_node returns Command(goto=…)

No static edges out of interrupt_node or tool_node.

Single routing authority = Command.


About try/except

Your try/except in streaming is fine.
The warning in docs refers to wrapping the interrupt() call itself inside try/except inside a node — not your outer streaming layer.

You’re safe there.
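The distinction can be sketched with a stand-in exception. `GraphInterrupt` here is a simplified stand-in, not the real LangGraph class; the point is only that a broad except inside a node swallows the control-flow exception, while one around the stream loop never sees it:

```python
# Stand-in for LangGraph's interrupt mechanism -- simplified model, not
# the real classes -- showing why a broad try/except INSIDE a node is
# dangerous, while one around the outer streaming loop is fine.

class GraphInterrupt(Exception):
    """Control-flow exception raised by interrupt()."""

def interrupt(payload):
    raise GraphInterrupt(payload)

def bad_node(state):
    try:
        interrupt({"question": "approve?"})
    except Exception:
        # Swallows GraphInterrupt: the graph never pauses, and the node
        # silently falls through as if no interrupt happened.
        pass
    return {"paused": False}

def good_node(state):
    # No try/except around interrupt(): the exception propagates to the
    # runtime, which checkpoints state and pauses the graph.
    interrupt({"question": "approve?"})

print(bad_node({}))          # {'paused': False} -- the interrupt was lost
try:
    good_node({})
except GraphInterrupt:
    print("graph paused")    # the runtime catches this, not your code
```

Your outer try/except wraps the stream consumer, not the node body, so the interrupt has already been handled by the runtime before your handler could ever see it.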


Bottom Line

Your duplicate messages are almost certainly caused by double routing (static edges + Command).

Pick one routing strategy — preferably Command — and the issue should disappear.

Yeah, it might be possible. Will let you know once I fix them.
Thanks for the help, @Bitcot_Kaushal.

Hi @Bitcot_Kaushal & @pawel-twardziak

I’ve replaced other routing methods with Command, and fixed the issue of double routing as well. Now, I get the expected behaviour from the agent/graph.

Thanks for the help.


hi @neel

great to hear you managed that!

Wrapping up for the others:


The final issue was double routing back to agent_node, which caused duplicate AI outputs after resume.

In practice, routing was being mixed across mechanisms:

  • static edges (for example interrupt_node -> agent_node), and
  • Command(goto=...) returns from nodes/tools.

In LangGraph, Command does not override static edges - static edges still execute. So combining both can create multiple valid paths and trigger duplicate execution.

The fix was to replace other routing methods with Command and remove the double-routing paths, so there was a single routing authority. After that, the author confirmed the graph/agent behavior was correct.


Can you confirm this scenario @neel ? If so, mark the post as Solved - for the others too :slight_smile:

Hey @neel

That’s great to hear! :blush:

I’m really glad switching fully to Command-based routing resolved the issue and the graph is now behaving as expected. These routing conflicts can be subtle, but once cleaned up, things usually fall into place nicely.

Thanks for sharing the update — and well done sticking through the debugging process! :rocket:

@neel just remember to mark it as Solved please :slight_smile:

hi @pawel-twardziak ,

Sorry for the delay, as I was on PTO. Marked the post as Solved.


thanks @neel :slight_smile: