Edit/Regenerate with useStream adds message at end instead of replacing

When using the branching chat pattern with useStream (Vue), the edit and regenerate functionality does not work as described in the Branching chat guide (Docs by LangChain). Messages are appended at the end of the conversation instead of replacing the message at the edit point.

Expected Behavior

Following the exact code from the Branching chat documentation:

Edit:

stream.submit(
  { messages: [{ ...msg, content: newText }] },
  { checkpoint }
);

Regenerate:

stream.submit(undefined, { checkpoint });

This should replace/regenerate the message at the checkpoint, creating a new branch.

Actual Behavior

Both edit and regenerate operations add the new message at the end of the conversation instead of replacing the message at the checkpoint.

Example:

  • Before edit: [user: "Hello", ai: "Hi there!"]
  • After edit on “Hello”: [user: "Hello", ai: "Hi there!", user: "Modified Hello", ai: "New response!"]

Steps to Reproduce

  1. Set up useStream with fetchStateHistory: true (a minimal setup sketch follows this list)
  2. Send a message and get an AI response
  3. Click edit on the user message
  4. Modify the text and submit
  5. Observe: the new message is added at the end, not replacing the original
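
A minimal sketch of the step 1 setup; the import path and apiUrl are assumptions, adjust them to your project:

// Assumption: the import path depends on how your project exposes the Vue
// useStream composable for the LangGraph SDK.
import { useStream } from "@langchain/langgraph-sdk/vue";

const stream = useStream<{ messages: any[] }>({
  apiUrl: "http://localhost:2024", // placeholder for the langgraph dev URL
  assistantId: "generalist",
  fetchStateHistory: true,
});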

Code Snippet

function handleEdit(msg, metadata, text) {
  const checkpoint = metadata.firstSeenState?.parent_checkpoint;
  if (!checkpoint) return;

  // This adds message at end instead of replacing
  stream.submit(
    { messages: [{ ...msg, content: text }] },
    { checkpoint }
  );
}

function handleRegenerate(metadata) {
  const checkpoint = metadata.firstSeenState?.parent_checkpoint;
  if (!checkpoint) return;

  // This adds new response at end instead of replacing
  stream.submit(undefined, { checkpoint });
}
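
For context, the metadata argument passed into these handlers comes from stream.getMessagesMetadata; a rough wiring sketch, where editText and the onXxxClick names are illustrative and not from the docs:

import { ref } from "vue";

// Illustrative wiring: shows where `metadata` comes from before it reaches
// handleEdit / handleRegenerate above.
const editText = ref("");

function onEditClick(msg: any) {
  // getMessagesMetadata exposes firstSeenState, whose parent_checkpoint the
  // handlers read.
  const metadata = stream.getMessagesMetadata(msg);
  handleEdit(msg, metadata, editText.value);
}

function onRegenerateClick(msg: any) {
  handleRegenerate(stream.getMessagesMetadata(msg));
}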

hi @Jourdelune

What does your backend agent look like? Can you share some code?

My backend is a simple LangGraph server with a deep agent (langgraph dev).

More detail:

{
  "dependencies": ["."],
  "graphs": {
    "generalist": "./app/services/ai/generalist_agent/graph.py:make_graph"
  },
  "http": {
    "app": "./app/main.py:app",
    "cors": {
      "allow_origins": ["http://localhost:3000", "http://127.0.0.1:3000"],
      "allow_credentials": true,
      "allow_methods": ["*"],
      "allow_headers": ["*"]
    }
  },
  "auth": {
    "path": "./app/auth_handler.py:auth"
  }
}
"""LangGraph factory function for the Aegra Agent Server.

This module exposes ``make_graph`` — a factory function called by the Aegra
runtime for every run.  It reads ``config["configurable"]`` to determine the
active mode, knowledge bases, and agent settings, then delegates to
``construct_generalist_agent`` which returns a ``CompiledStateGraph``.
"""

from langchain_core.runnables import RunnableConfig

from services.ai.generalist_agent.agent import construct_generalist_agent


async def make_graph(config: RunnableConfig):
    """Build the generalist agent graph for the current run.

    Aegra calls this factory for every new run.  The ``config["configurable"]``
    dict is populated from the frontend's ``useStream.submit()`` call.

    Expected configurable keys:
        mode:                 Active mode (generalist, search, template, diagram, mindmap, review).
        mode_args:            Extra arguments for the mode (e.g. {"template_id": 42}).
        knowledges_bases:     List of knowledge base IDs.
        agent_id:             ID of the agent configuration in the Junco DB.
        custom_system_prompt: Agent-level custom system instructions.
        context:              Additional context injected into the system prompt.
        thread_id:            Aegra thread UUID, used as sandbox isolation key.
    """
    cfg = config.get("configurable", {})
    return construct_generalist_agent(
        knowledges_bases=cfg.get("knowledges_bases", []),
        mode=cfg.get("mode", "generalist"),
        mode_args=cfg.get("mode_args", {}),
        chat_id=cfg.get("chat_id", "default"),
        agent_id=cfg.get("agent_id"),
        custom_system_prompt=cfg.get("custom_system_prompt", ""),
        context=cfg.get("context", ""),
    )

and construct_generalist_agent returns a deep agent instance
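
For context, these configurable keys are meant to be supplied from the frontend when a run is submitted; a rough sketch of what that could look like (assuming useStream.submit forwards a config.configurable object; values are illustrative):

stream.submit(
  { messages: [{ type: "human", content: "Hello" }] },
  {
    // Assumption: submit options accept a run config whose `configurable`
    // dict is what make_graph reads via config["configurable"].
    config: {
      configurable: {
        mode: "generalist",
        mode_args: {},
        knowledges_bases: [],
        agent_id: 1, // illustrative ID
        custom_system_prompt: "",
        context: "",
        // thread_id is typically injected per thread by the server, not set here
      },
    },
  }
);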

Show me the deep agent code too, please.

I would like to, but it's private code for an internal test project at my company, so I can't really share it :/.
What would you like to see?

Whether you use a checkpointer when you run it outside of langgraph dev:

agent = create_deep_agent(
    tools=[...],
    interrupt_on={"write_file": True},
    checkpointer=checkpointer,  # Required for state modification
)

I use InMemorySaver:

from langgraph.checkpoint.memory import InMemorySaver

checkpointer = InMemorySaver()

def construct_generalist_agent(
    knowledges_bases: list[int],
    custom_system_prompt: str = "",
    context: str = "",
    chat_id: int | str = 0,
    agent_id: int | None = None,
    mode: str = "generalist",
    mode_args: dict | None = None,
) -> Any:
    ...

    backend = LLMSandboxBackend(chat_id, sandbox_pool, agent_id=agent_id)

    agent = create_deep_agent(
        model,
        tools=[
            list_files,
            build_download_files(chat_id, backend),
            build_fetch_assets(backend),
        ],
        subagents=subagents,
        backend=backend,
        skills=["/skills/"],
        system_prompt=system_prompt,
        middleware=middleware,
        checkpointer=checkpointer,
    )
    return agent

Ok. How do you serve your agent? Is it one instance persisting across all requests, or does every frontend request instantiate a new agent?

Every new frontend request creates a new instance of the agent. make_graph is called each time useStream needs to interact with the agent.

I tested with:

from langgraph.checkpoint.sqlite import SqliteSaver
import sqlite3

conn = sqlite3.connect("./checkpoints.db", check_same_thread=False)
checkpointer = SqliteSaver(conn=conn)


def construct_generalist_agent(...):
    agent = create_deep_agent(
        model,
        tools=[
            list_files,
            build_download_files(chat_id, backend),
            build_fetch_assets(backend),
        ],
        subagents=subagents,
        backend=backend,
        skills=["/skills/"],
        system_prompt=system_prompt,
        middleware=middleware,
        checkpointer=checkpointer,
    )
    return agent

same issue

Then the InMemory checkpointer doesn't make sense there. But SQLite should work, since its state persists across requests.

And if even SQLite doesn't work, then the problem is somewhere else.

Could you prepare this bug/error reproduction in a separate public repo and send me the link? I could debug more then

I’m doing that!

Done! GitHub - Jourdelune/test · GitHub

super! lemme check it now…

Here is the test to see the issue:

  1. Write “1”, wait for the reply
  2. Write “2”, wait for the reply
  3. Write “3”, wait for the reply
  4. Write “4”, wait for the reply
  5. Write “5”, wait for the reply
  6. Edit “3” to “3b”, and see the result: the last message is replaced by the new edit.

I did a new test where I show the checkpoint ID of each message (full component shared on Pastebin), and all the messages except the last 3 have the same checkpoint ID.

<div v-for="msg in streamedMessages" :key="msg.id" class="space-y-2">
  <p>
    (checkpoint {{ stream.getMessagesMetadata(msg)?.firstSeenState?.parent_checkpoint?.checkpoint_id }})
  </p>
  <div v-if="editingId === msg.id" class="flex justify-end">

So I found the issue, but I don't know how to resolve it.

Okay, I think I found the issue. fetchStateHistory in useStream fetches only the last 10 checkpoints. In my case with deep_agent, each message produces 6 checkpoints, so with two messages (my input and the reply) I have already exceeded the maximum number of fetched checkpoints. Editing a message that is older than the last 10 checkpoints then ends up editing the oldest checkpoint of those last 10 instead. Using fetchStateHistory: { limit: 100 } fixes the issue, but for longer conversations the problem will come back. I don't know the right way to handle this.

So a temporary fix is:

const stream = useStream<{ messages: any[] }>({
  apiUrl: rawUrl,
  assistantId: "generalist",
  fetchStateHistory: { limit: 100 },
  reconnectOnMount: true,
  onThreadId: (id: string) => {
    threadId.value = id;
    console.log("Thread ID:", id);
  },
});

Also, the maximum limit that fetchStateHistory supports is 1000, so in my case the issue will come back once a conversation exceeds roughly 166 messages.
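
One possible mitigation (not verified here) would be to page in older history through the SDK client on demand instead of relying on a single fetchStateHistory fetch; a rough sketch, assuming @langchain/langgraph-sdk's client.threads.getHistory accepts a limit option:

import { Client } from "@langchain/langgraph-sdk";

// Sketch only: fetch as much thread history as the server allows in one call,
// then feed the resulting checkpoints into whatever branching UI needs them.
async function loadFullHistory(apiUrl: string, threadId: string) {
  const client = new Client({ apiUrl });
  return client.threads.getHistory(threadId, { limit: 1000 });
}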