global task
async with AsyncSqliteSaver.from_conn_string("checkpoints.db") as memory:
    checkpoint_tuple = await memory.aget_tuple({"configurable": {"thread_id": task["id"]}})
    checkpoint = checkpoint_tuple.checkpoint if checkpoint_tuple else None
    if checkpoint:
        print(f"Restore checkpoint: {checkpoint['id']}")
    else:
        print("No checkpoint was found. The execution will start from the beginning.")
    config: RunnableConfig = {
        "configurable": {"thread_id": task["id"], "checkpoint_ns": "", "checkpoint_id": checkpoint["id"]},
        "callbacks": [langfuse_handler],
    }
    msg = builder_create_report.compile(
        name="report generation", checkpointer=memory
    ).astream(
        state, config, context=Context(model="Qwen3-Coder-480B-A35B-Instruct"),
        # stream_mode="messages"
    )
I want to recover from errors and continue the run, since the errors are sporadic.
My problem is that after the checkpointer has saved at point B and I run the graph again, it does not continue from point B or point C but starts over from point A. Is this how LangGraph is designed to behave, or is something wrong with my configuration? I am using LangGraph v0.6.5.
Thanks!
Additional note:
When I inspected the checkpoint at a breakpoint, the state was read correctly, up to the end of point B.
Hi! Why are you passing the checkpoint_id in your LangGraph invocation? If there is an existing mid-graph checkpoint for that thread, all you need to do is pass the thread_id and LangGraph will pick up where the last run left off.
I also tried passing only the thread_id; in fact, that is how I started. When I found that the run would not continue from point C, I tried passing the checkpoint_id as well, but it still started at point A.
You can resume from the last saved checkpoint and continue from the next node (e.g., resume at C after B) instead of restarting at A. To do so, invoke the graph with the same thread_id and pass None as the input (or a Command(resume=...) if using interrupts). Do not re‑pass the original state as input, and generally don’t set checkpoint_id unless you explicitly want to time‑travel to an older point.
Why you’re seeing a restart at A
In LangGraph, if you provide a non‑None input on the next run, LangGraph treats it as a new invocation and maps that input into the state, which will start the run from the entry node. This is by design. To resume, you should let the runtime load the last checkpoint and proceed without injecting new input.
In summary:
The core issue: Re-passing the prior state as input causes a new invocation from the entry node. To resume, pass None (or a Command(resume=…)) and keep the same thread_id.
Don’t set checkpoint_id unless you want to time-travel to a specific earlier checkpoint; omit it to continue from the latest checkpoint for the thread.
Correct ways to resume
Resume from the latest checkpoint in the thread (after B, continue to C):
async with AsyncSqliteSaver.from_conn_string("checkpoints.db") as memory:
    graph = builder_create_report.compile(checkpointer=memory)
    # Use the SAME thread_id as the prior run; do NOT pass state again
    config = {"configurable": {"thread_id": task["id"]}}
    # Resume execution: pass None input so LangGraph loads last checkpoint
    async for _ in graph.astream(None, config, context=Context(model="Qwen3-Coder-480B-A35B-Instruct")):
        pass
Time travel
# Run this inside the same `async with` block, where `graph` is compiled with the checkpointer.
# With an async checkpointer, use aget_state_history; it yields snapshots newest first.
to_replay = [s async for s in graph.aget_state_history({"configurable": {"thread_id": task["id"]}})]
# pick the checkpoint you want, e.g. the one whose `next` contains ("C",)
# Then resume from that exact moment by providing its config (with checkpoint_id)
target_config = to_replay[0].config  # or whichever you selected
async for _ in graph.astream(None, target_config, context=Context(model="Qwen3-Coder-480B-A35B-Instruct")):
    pass
Optional: handling sporadic failures
If errors in C are transient, consider attaching a retry policy to node C when you add it to the graph, so that C is retried within the same run before you have to resume manually. See RetryPolicy in langgraph.types and the related how‑tos.