I recently learned about vector databases. When I recall relevant information and feed it to the model for a response, I'm using LangGraph. Do I need to create a RAG node that fetches knowledge and appends it directly to the global state as historical memory? Is that common practice? Or would it be better to write the recalled information only into the system prompt of the response-model node when generating the reply, without adding it to the state context?
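To make the distinction concrete, here is a minimal sketch of the per-call injection idea in plain Python (no LangGraph imports; `retrieve`, `fake_llm`, and the message dicts are illustrative stand-ins, not a real API):

```python
# Sketch: inject recalled memory into the prompt for one call only,
# without appending it to the persistent message history.

def retrieve(query: str) -> list[str]:
    # placeholder for a real vector-store lookup
    return ["user prefers concise answers"]

def fake_llm(prompt: list[dict]) -> str:
    # stand-in for model.invoke(prompt)
    return f"(answered using {len(prompt)} prompt messages)"

def rag_respond_node(state: dict) -> dict:
    query = state["messages"][-1]["content"]
    recalled = "\n".join(retrieve(query))
    # Build the prompt for THIS call only: recalled facts go into the
    # system message, not into state["messages"].
    prompt = [{"role": "system", "content": f"Relevant memory:\n{recalled}"}]
    prompt += state["messages"]
    reply = fake_llm(prompt)
    # Only the user turn and the reply persist; the recalled text does not.
    return {"messages": state["messages"] + [{"role": "assistant", "content": reply}]}
```

The trade-off: the state stays small, but the recalled facts must be re-retrieved on every turn instead of riding along in history.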
If you’re writing memory information into the system prompt of a model node, I think you still need to integrate it yourself. I view the system prompt more as a direction corrector or role definer. You can check out “The persona selection model” article by Anthropic. I’d suggest avoiding putting anything into the model node’s system prompt—even memory keywords—as I believe it can cause task drift.
I’m relatively new to LangGraph, but I maintain context manually in my projects. For context compression or forced memory persistence, I only use RAG storage for non-structured data (for fuzzy matching or multimodal memories). For structured data, I use relational databases. In complex scenarios, I employ hybrid retrieval strategies with appropriate chunking.
I currently use Qdrant as my vector database. For multimodal inputs that need persistent memory, you can also use relational databases for mapping. However, Qdrant’s payload feature allows attaching structured conditions during vector search, enabling hybrid retrieval.
I think it’s reasonable to treat RAG as an external tool—placing retrieved content after the user’s input but before the model’s output. If you append it directly to the global state as historical memory, the context will bloat. A better approach might be to only store the user’s query, then use payload keywords for attribute pre-filtering during retrieval.
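A toy in-memory version of that filter-then-rank pattern (mimicking Qdrant's payload filtering, but with made-up field names and tiny 2-d vectors, no real client involved):

```python
import math

# In-memory mock of attribute pre-filtering followed by vector ranking.
points = [
    {"vector": [1.0, 0.0], "payload": {"topic": "billing", "text": "refund policy"}},
    {"vector": [0.9, 0.1], "payload": {"topic": "billing", "text": "invoice dates"}},
    {"vector": [0.0, 1.0], "payload": {"topic": "support", "text": "reset password"}},
]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def search(query_vec, must_match: dict, top_k: int = 2):
    # 1) cheap structured pre-filter on the payload
    candidates = [p for p in points
                  if all(p["payload"].get(k) == v for k, v in must_match.items())]
    # 2) similarity ranking only over the survivors
    candidates.sort(key=lambda p: cosine(query_vec, p["vector"]), reverse=True)
    return [p["payload"]["text"] for p in candidates[:top_k]]
```

In real Qdrant this corresponds to passing a `query_filter` alongside the query vector, so the expensive similarity search only runs over points whose payload matches.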
This is just my rough idea. LangGraph probably has more elegant solutions—I’m just not familiar enough with it yet. Would appreciate insights from more experienced folks.
Here’s a lesser-known fact: when storing data in a vector database, the payload must include the original text chunk. Embeddings cannot be reversed—there’s no way to reconstruct the original text from a vector. Current retrieval methods are all based on similarity matching, not one-to-one restoration. If you need to return the original content, you have to store it explicitly in the payload for mapping.
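A tiny demonstration of that point (the toy `embed` function and collection layout are invented for illustration; real embedding models are not invertible either):

```python
# The store only maps vector -> payload; nothing can turn a vector back
# into text, so the original chunk must live in the payload.
collection = []

def embed(text: str) -> list[float]:
    # toy deterministic "embedding" for demo purposes
    return [float(len(text)), float(sum(map(ord, text)) % 97)]

def upsert(text: str):
    collection.append({
        "vector": embed(text),
        "payload": {"text": text},   # keep the original chunk here
    })

def nearest_text(query: str) -> str:
    qv = embed(query)
    best = min(collection,
               key=lambda p: sum((a - b) ** 2 for a, b in zip(p["vector"], qv)))
    # the only way to return readable content is via the stored payload
    return best["payload"]["text"]
```

Drop the `"text"` key from the payload and the search still finds the right vector, but there is nothing human-readable left to return.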
First of all, thank you for your reply—I hope we can keep exchanging ideas. What I specifically mean is: in a model response node, if you pass the system prompt directly at invocation time, it only affects that particular call and is never added to the state messages. Wouldn’t that reduce the token count?
from langchain_core.messages import SystemMessage

async def model_messages_get_node(state: dict):
    model_list_get = await model_list()
    # Prepend the system prompt for this call only; it is never written to state.
    reply = await model_list_get[2].ainvoke(
        [SystemMessage(content=model_messages_get_prompt)] + state["messages"]
    )
    return {"messages": [reply]}
By the way, are you also a beginner with LangGraph? I hope we can interact and collaborate more.
Or it’s not a system prompt, but changed to another type of prompt. What I mean is to briefly inform the model in this way, rather than forming a permanent memory in the context.
Yes, I think it would reduce tokens. I see the conversation memory between user and model as a unique list. Whether it’s sub-agents, tool calls, or structured outputs—they all operate around this list (reading from it or writing to it). Not sure if this analogy is accurate, haha.
> Or it’s not a system prompt, but changed to another type of prompt. What I mean is to briefly inform the model in this way, rather than forming a permanent memory in the context.
Just as you said.
Sure, I think so too.
I’m also a beginner—I wasn’t even clear on how LangGraph’s state appending works. I actually had to ask an AI what the difference is between that and my own manual context management, just so I could reply to you naturally, haha.
May I ask where you are from? If there’s a chance, we could work on some small projects together to strengthen our knowledge.
I’m from China.
Then that makes things even easier, because I am too, bro.
hehe guys, speak English
Of course, haha, I’m really happy to see your comment.