Checkpoint cleanup

Hello there,
I'm trying to reduce the DB size by deleting some checkpoints and writes.
However, I'm not sure what needs to stay for HIL (human-in-the-loop) and to maintain state within a thread in an agent with several subagents. Should I keep the last checkpoint for a thread with an empty ns (namespace)?
Should I worry about namespaces?
Leaving the latest checkpoint seems to work, but I'm not really sure. I understand that I can configure this in langgraph.json, but I would like to understand how checkpoints work.
Thank you for your time and help.

@Samerlkhodary

Keeping the newest checkpoint per (thread_id, checkpoint_ns) is sufficient to restore the current running state for that conversation.

An empty namespace (as seen in streaming output) means the “main agent”, i.e. the parent graph, so keeping the latest checkpoint for the empty namespace of the thread is fine.
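To make the "newest checkpoint per (thread_id, checkpoint_ns)" rule concrete, here is a minimal sketch in plain Python. The row shape, the sample IDs, and the `latest_per_namespace` helper are hypothetical and only for illustration; it also assumes checkpoint IDs sort chronologically as strings, which you should verify for your own store.

```python
from typing import Dict, Iterable, Tuple

# Hypothetical row shape: (thread_id, checkpoint_ns, checkpoint_id).
Row = Tuple[str, str, str]


def latest_per_namespace(rows: Iterable[Row]) -> Dict[Tuple[str, str], str]:
    """Return the newest checkpoint_id for each (thread_id, checkpoint_ns)."""
    latest: Dict[Tuple[str, str], str] = {}
    for thread_id, ns, checkpoint_id in rows:
        key = (thread_id, ns)
        # Assumption: a lexicographically larger ID is a newer checkpoint.
        if key not in latest or checkpoint_id > latest[key]:
            latest[key] = checkpoint_id
    return latest


# Illustrative data: one thread with a main graph ("") and one subgraph namespace.
rows = [
    ("t1", "", "0001"),
    ("t1", "", "0003"),
    ("t1", "researcher:abc", "0002"),
    ("t1", "researcher:abc", "0004"),
    ("t2", "", "0005"),
]
keep = latest_per_namespace(rows)
```

Note that the empty namespace is just one more key here: the main graph and each subgraph namespace each keep their own newest checkpoint.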

Each subgraph invocation gets its own separate checkpoint chain under a distinct checkpoint_ns, so a single thread with subgraphs (subagents) produces checkpoints across multiple namespaces.

This means:

  1. If you delete subgraph checkpoints (non-empty namespace) but keep the parent’s latest checkpoint, the parent’s checkpoint_map (think of it as a linking dict) will reference subgraph checkpoint IDs that no longer exist. Resuming a subgraph (including HIL interrupts inside subgraphs) will break.

  2. If the graph has already completed (no pending interrupts, no pending tasks), then the subgraph checkpoints are no longer needed – they were only relevant for that execution. The next invocation on the same thread will create new subgraph checkpoints with new task IDs/namespaces.
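The "already completed" condition in point 2 can be sketched as a small check. This is a hedged illustration, not an official recipe: it assumes a snapshot object shaped like the one `graph.get_state(config)` returns in LangGraph, with a `next` tuple and `tasks` that may carry `interrupts`. The `is_safe_to_prune_subgraphs` helper and the stand-in objects are hypothetical.

```python
from types import SimpleNamespace


def is_safe_to_prune_subgraphs(snapshot) -> bool:
    """True when the run looks finished: nothing scheduled, no pending interrupts."""
    has_next = bool(snapshot.next)
    # Assumption: each task may expose an `interrupts` sequence.
    has_interrupts = any(getattr(task, "interrupts", ()) for task in snapshot.tasks)
    return not has_next and not has_interrupts


# Stand-in snapshots for illustration only (not real LangGraph objects).
finished = SimpleNamespace(next=(), tasks=())
paused = SimpleNamespace(
    next=("review",),
    tasks=(SimpleNamespace(interrupts=("approve?",)),),
)
```

With real snapshots you would run this per thread before touching any non-empty namespace, and skip every thread for which it returns False.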

The best way to understand the checkpoints is by following the docs: Persistence - Docs by LangChain

Also I found the following diagram to be really useful to understand Persistence:


Thank you very much for your reply!
When the ttl in langgraph.json is set to keep_latest, it keeps only the latest checkpoints, I assume?
Also, for the writes: should we keep all the checkpoint_writes entries related to the checkpoint IDs that we kept, or are those safe to clear?

  • "keep_latest": Retains the thread and its latest checkpoint but deletes older checkpoint data that subsequent runs won’t need, which reduces your DB usage automatically.
  • I would recommend not manually removing checkpoint write entries, as that might break time travel; rely on `keep_latest` alone.
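To illustrate how write entries relate to the checkpoints you keep, here is a read-only sketch that partitions write rows by whether their checkpoint is still present. The dict keys mirror the identifiers discussed in this thread, but the row shape and the `split_writes` helper are hypothetical. It only reports, it does not delete anything, in line with the recommendation above.

```python
from typing import Dict, List, Set, Tuple

Key = Tuple[str, str, str]  # (thread_id, checkpoint_ns, checkpoint_id)


def split_writes(writes: List[Dict[str, str]], kept: Set[Key]):
    """Partition write rows into those attached to a kept checkpoint and the rest."""
    attached, detached = [], []
    for w in writes:
        key = (w["thread_id"], w["checkpoint_ns"], w["checkpoint_id"])
        (attached if key in kept else detached).append(w)
    return attached, detached


# Illustrative data: only checkpoint "0003" of the main graph was kept.
kept_checkpoints = {("t1", "", "0003")}
write_rows = [
    {"thread_id": "t1", "checkpoint_ns": "", "checkpoint_id": "0001"},
    {"thread_id": "t1", "checkpoint_ns": "", "checkpoint_id": "0003"},
]
attached, detached = split_writes(write_rows, kept_checkpoints)
```

Rows in `attached` belong to checkpoints that still exist and should stay. Rows in `detached` are only candidates for review, not a list of entries that are safe to delete.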

Hi @Samerlkhodary ,

1. ttl: "keep_latest"

Yes — it keeps:

  • The thread

  • The latest checkpoint per namespace

And automatically removes older checkpoints that are no longer needed for future runs. So you don’t need to manually manage old checkpoints in most cases.


2. What about checkpoint_writes?

You should not manually delete checkpoint write entries.

Even if you’re keeping only the latest checkpoint:

  • Writes are part of the internal consistency model

  • They support replay / time-travel

  • Removing them manually can corrupt state recovery

If you’re using keep_latest, let LangGraph handle cleanup safely.


Important nuance (from earlier discussion)

If you manually delete:

  • Subgraph checkpoints (non-empty namespace)

  • But keep the parent

You risk breaking resume flows (especially HIL interrupts).

So best practice:

👉 Use keep_latest
👉 Don’t manually prune writes
👉 Let the runtime manage persistence integrity

That keeps your DB smaller without risking broken resumes 👍

@Samerlkhodary Is everything clear now?
