LangGraph thread copy can take 12+ minutes: recommended production pattern?

gouveags · May 19, 2026, 7:31pm

Hi LangGraph team,

I opened a GitHub issue here with the full details: Thread copy endpoint can take 12+ minutes with no async/progress or shallow-copy option · Issue #7859 · langchain-ai/langgraph · GitHub

We use LangGraph (on a LangSmith US deploy) Platform’s thread copy endpoint:

POST /threads/{thread_id}/copy

to power a user-facing “fork chat” feature. For large conversations, we observed this call taking over 12 minutes.

I understand that a full thread copy may need to copy checkpoint history and writes, so the duration itself may be expected. The production challenge is that the endpoint appears to be synchronous/blocking, and we couldn’t find a documented way to:

run the copy asynchronously and poll for completion
receive progress/status while copying
copy only the latest state/current checkpoint instead of the full checkpoint history
estimate whether a thread will be expensive to copy before starting

This is tricky behind serverless/proxy/browser constraints. In our case, AWS Lambda has a 15-minute hard timeout, so long copies can get very close to the runtime ceiling.

My questions:

Is /threads/{thread_id}/copy expected to copy the full checkpoint/write chain?
Is there a supported “shallow copy” pattern for creating a new thread from latest state only?
Is there a recommended production pattern for user-facing fork/copy flows when copies can take several minutes?
Would an async copy API or copy-depth option be feasible?

Thanks!

gouveags · May 19, 2026, 7:32pm

We adjusted our application to tolerate the current behavior: async backend processing, longer polling, explicit user messaging, and Lambda timeout budgeting. That helps operationally, but it does not solve the underlying uncertainty around a long synchronous copy call.

pawel-twardziak · May 19, 2026, 9:26pm

Hi @gouveags

Going through your four questions with the relevant source/docs pointers:

1. Does /threads/{thread_id}/copy copy the full checkpoint/write chain?

Yes, by contract. BaseCheckpointSaver.copy_thread is explicit that implementations must copy the complete parent chain (all ancestor checkpoints and their checkpoint_writes) - copying only the head would silently break DeltaChannel reconstruction. That’s why there’s no built-in “copy depth” knob. The platform handler (libs/langgraph-api/src/api/threads.mts → storage/ops.mts) is a thin wrapper that awaits checkpointer.copy(...) synchronously, and the REST schema confirms: no request body, no ?async, no progress endpoint.

One amplifier worth checking: if you’re on BYOC with a custom checkpointer that hasn’t implemented acopy_thread, the platform falls back to a slow path that “re-inserts checkpoints one by one” (custom-checkpointer docs). On a tool-heavy long thread that’s exactly the regime where you hit double-digit minutes.

2. Is there a supported shallow-copy pattern?

Yes - not via copy, but via threads.create(supersteps=...). The SDK docstring literally says “Used for copying a thread between deployments.” and the Prepopulated state doc covers the same shape. It’s O(1) in history size and avoids the DeltaChannel issue because channels get a fresh snapshot through __start__:

state = await client.threads.get_state(thread_id=source_thread_id)

forked = await client.threads.create(
    graph_id="agent",
    metadata={"forked_from": source_thread_id},
    supersteps=[{
        "updates": [{"values": state["values"], "as_node": "__start__"}]
    }],
)

Trade-off: you lose time-travel history and mid-interrupt context. For most “duplicate this conversation” UX that’s fine; for “rewind to any past turn” it isn’t.

3. Recommended production pattern

Two paths, picked by what the user actually needs:

Fork latest state (default, user-facing): threads.create(supersteps=...) as above. Single roundtrip, no Lambda timeout risk.
Fork with full history: move it off Lambda. Click → enqueue (SQS / Step Functions / Fargate) → return a copy_job_id immediately → worker calls threads.copy(...) with a long HTTP timeout → UI polls. A 12-min synchronous call inside a 15-min Lambda will eventually lose that race no matter how you budget it.

For an “estimate before firing” signal, threads.get_history(thread_id, limit=...) + the size of values is a decent heuristic - flip the UI to “Fork latest state” automatically past a checkpoint-count / payload-size threshold.

Also worth knowing: if you only want to branch from a past point on the same thread, threads.update_state(thread_id, values, checkpoint_id=...) does that without any copy.

4. Feasibility of async API / depth option

Both look feasible. An async mode is straightforward (enqueue + poll, like runs). A “latest only” depth flag would essentially need to go through the same write path as supersteps (write a fresh head, not slice the chain) to stay safe with DeltaChannel. Until that lands, supersteps ≈ mode=latest_state and a worker queue ≈ async=true, both supported and stable.

Hope this helps!

nyxwavea2lc · June 15, 2026, 1:34pm

Thanks for sharing this thread. I’m trying to tighten the production vocabulary around “safe copy” and wanted your correction from real workload: if LangGraph added a shallow-copy mode, what acceptance test would you require before exposing the fork to users? Would “latest-checkpoint state equivalence within a latency budget plus async history backfill” be sufficient, or would you still require full checkpoint/write-chain completion first? I’m trying to avoid ambiguous “copy succeeded” language in runbooks.

Topic		Replies	Views
How to clone a thread in LangGraph LangGraph python-help	3	216	January 20, 2026
Thread copy API fails with KeyError when visiting previously unvisited subgraph nodes LangGraph python-help	4	337	September 19, 2025
Persistent New Threads for Cron Jobs Deployment	3	278	October 7, 2025
Restoring to checkpoint doesn't resume from node of checkpoint LangGraph python-help	3	519	November 21, 2025
Europe Langgraph platform threads stuck Deployment cloud	4	578	December 31, 2025

LangGraph thread copy can take 12+ minutes: recommended production pattern?

Related topics