hi @ketiko
@Bitcot_Kaushal is on the right track, but there are some important nuances worth unpacking here.
LangGraph’s node-level caching requires two components working together:
- **cache backend** (`cache`) - the storage layer (e.g., `InMemoryCache`, `SqliteCache`). This is passed to `graph.compile(cache=...)`.
- **cache policy** (`cache_policy`) - per-node rules that determine what gets cached and for how long (e.g., `CachePolicy(ttl=60)`). This is set via `graph.add_node("my_node", my_fn, cache_policy=CachePolicy())`.
Without both pieces, caching doesn't activate. A cache backend without any `cache_policy` is like having a filing cabinet with no instructions on what to file.
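To make the "both pieces" requirement concrete, here's a minimal pure-Python sketch of how a backend and a per-node policy interact. This is an illustration of the concept, not LangGraph's actual implementation -- all names here (`InMemoryBackend`, `run_node`, etc.) are made up for the example:

```python
import time


class InMemoryBackend:
    """Storage layer: knows how to get/set entries, nothing about policy."""

    def __init__(self):
        self._store = {}

    def get(self, key):
        return self._store.get(key)

    def set(self, key, value):
        self._store[key] = value


class CachePolicy:
    """Per-node rule: controls how long a cached entry stays valid."""

    def __init__(self, ttl=None):
        self.ttl = ttl  # seconds; None means the entry never expires


def run_node(node_fn, node_input, backend=None, policy=None):
    # Caching activates only when BOTH a backend and a policy exist.
    if backend is None or policy is None:
        return node_fn(node_input)
    key = repr(node_input)
    entry = backend.get(key)
    if entry is not None:
        value, stored_at = entry
        if policy.ttl is None or time.monotonic() - stored_at < policy.ttl:
            return value  # cache hit: skip re-running the node
    value = node_fn(node_input)
    backend.set(key, (value, time.monotonic()))
    return value
```

With a backend but no policy (or vice versa), `run_node` falls through to executing the node every time -- the same behavior you get from `create_agent` with only `cache=` set.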
**The `create_agent` situation**
Looking at the source code, `create_agent` does accept a `cache` parameter and passes it correctly to `.compile(cache=cache)`:
```python
# langchain/agents/factory.py (line 673, 1601-1608)
def create_agent(
    model,
    tools,
    *,
    # ... other params ...
    cache: BaseCache[Any] | None = None,
) -> CompiledStateGraph:
    # ...
    return graph.compile(
        # ...
        cache=cache,  # <-- passed through
    ).with_config(config)
```
However, `create_agent` does not set `cache_policy` on any of its internal nodes (`"model"`, `"tools"`, middleware nodes). All `graph.add_node(...)` calls inside the function omit `cache_policy`:
```python
# Internal node creation -- no cache_policy set
graph.add_node("model", RunnableCallable(model_node, amodel_node, trace=False))
graph.add_node("tools", tool_node)
```
So to answer your question directly: the `cache` parameter alone provides the storage backend, but without any `cache_policy` being set on nodes, caching will not actually activate for any node in the agent graph.
**Option 1: set `cache_policy` on the compiled graph directly**
The Pregel engine (which powers compiled graphs) has a graph-level `cache_policy` attribute that acts as a fallback for all nodes that don't have their own policy. You can set it after creating the agent:
```python
from langchain.agents import create_agent
from langgraph.cache.memory import InMemoryCache
from langgraph.types import CachePolicy

agent = create_agent(
    model="openai:gpt-4o",
    tools=[my_tool],
    cache=InMemoryCache(),
)

# Set a graph-level cache policy that applies to ALL nodes
agent.cache_policy = CachePolicy(ttl=120)  # cache for 2 minutes
```
This works because of how the execution engine resolves cache policies (source):
```python
# In the task preparation logic:
cache_policy = proc.cache_policy or cache_policy  # node-level or graph-level fallback
```
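To see what that one-line fallback means in practice, here's a tiny self-contained sketch of the resolution rule (plain Python, not LangGraph's code -- `resolve_policy` is a name invented for this example):

```python
class CachePolicy:
    """Stand-in for langgraph.types.CachePolicy, for illustration only."""

    def __init__(self, ttl=None):
        self.ttl = ttl


def resolve_policy(node_policy, graph_policy):
    # Node-level policy wins; the graph-level policy is only a fallback.
    return node_policy or graph_policy


graph_policy = CachePolicy(ttl=120)
node_policy = CachePolicy(ttl=30)

assert resolve_policy(node_policy, graph_policy).ttl == 30   # node wins
assert resolve_policy(None, graph_policy).ttl == 120         # graph fallback
assert resolve_policy(None, None) is None                    # caching off
```

This is why setting `agent.cache_policy` works: every node without its own policy picks up the graph-level one.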
**Caveat:** this applies caching to *every* node (model calls, tool execution, middleware hooks), which may or may not be what you want. For example, caching the model node means identical message sequences will return cached LLM responses -- that can save tokens, but it can also produce stale results if the underlying context has changed.
**Option 2: build a custom `StateGraph`**
For fine-grained control over which nodes get cached, build the graph manually:
```python
from langgraph.graph import StateGraph
from langgraph.prebuilt import ToolNode
from langgraph.cache.memory import InMemoryCache
from langgraph.types import CachePolicy

builder = StateGraph(AgentState)

# Only cache the model node, not the tools
builder.add_node("model", call_model, cache_policy=CachePolicy(ttl=60))
builder.add_node("tools", ToolNode(tools))  # no cache_policy -- tools run fresh

# ... add edges ...
graph = builder.compile(cache=InMemoryCache())
```
**Option 3: implement caching inside your tool functions**
As @Bitcot_Kaushal suggested, you can implement your own caching logic within tool functions using `functools.lru_cache`, Redis, or any other caching mechanism. This gives you full control but means you're managing the cache lifecycle yourself.
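As one hedged sketch of this approach, here's a stdlib-only example using `functools.lru_cache`. The `fetch_weather` function and its return value are hypothetical placeholders -- substitute your real data source, and wrap the function with your framework's tool decorator as needed:

```python
from functools import lru_cache


def _fetch_weather_uncached(city: str) -> str:
    # Placeholder for a slow/expensive call (API request, DB query, ...).
    return f"Weather data for {city}"


# lru_cache memoizes by argument value. Note it has no TTL: entries only
# leave the cache via LRU eviction, so add your own expiry if freshness
# matters (e.g., a timestamp check, or a Redis key with EXPIRE).
@lru_cache(maxsize=128)
def fetch_weather(city: str) -> str:
    return _fetch_weather_uncached(city)
```

One design note: because `lru_cache` keys on argument values, this only helps when the agent calls the tool with identical arguments; it also requires hashable arguments, so tools taking dicts or lists need a different strategy.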
Summary
| Component | `create_agent` | Manual `StateGraph` |
| --- | --- | --- |
| Cache backend | Supported via `cache=` param | Supported via `.compile(cache=...)` |
| Per-node `cache_policy` | Not exposed | Supported via `add_node(..., cache_policy=...)` |
| Graph-level `cache_policy` | Can be set post-creation (`agent.cache_policy = ...`) | Can be set post-compilation |
In short: the `cache` parameter in `create_agent` sets up the plumbing, but you additionally need a `cache_policy` (either graph-level or per-node) for caching to actually take effect. The easiest workaround is Option 1 above.
Sources