Hi, for our data analytics pipeline, I need to extract exactly which messages users have sent (and when) across threads. Right now I’m using a raw SQL query against the checkpoints table, for example:
SELECT
thread_id,
metadata -> 'writes' -> '__start__' -> 'messages' -> 0 -> 'kwargs' ->> 'content' AS user_msg,
(checkpoint->>'ts')::timestamptz AS sent_at
FROM
checkpoints
WHERE
metadata -> 'writes' -> '__start__' -> 'messages' -> 0 -> 'kwargs' ->> 'content' IS NOT NULL
AND checkpoint_ns = '';
However, it feels a little hacky. Is there an officially supported SDK method that returns the same user-sent content and timestamps without me having to decode the JSONB structure myself?
There isn’t a direct SDK method for extracting user messages from checkpoints, you’d need to use the checkpoint API to get thread history and then filter for user messages:
from langgraph.checkpoint.postgres import PostgresSaver
checkpointer = PostgresSaver(connection_string)
thread_history = checkpointer.list({"configurable": {"thread_id": thread_id}})
for checkpoint in thread_history:
messages = checkpoint.checkpoint.get("channel_values", {}).get("messages", [])
for msg in messages:
if msg.type == "human": # User messages
print(f"Content: {msg.content}, Time: {checkpoint.ts}")