Feature Request: “GraphCache”, an Intelligent Semantic Cache for LangGraph Agents

Hi everyone,

I’ve been exploring LangGraph’s caching system and had an idea that might push it to the next level.

While LangGraph today supports Redis-based caching for deterministic calls (exact-prompt reuse), most agent graphs still end up re-reasoning semantically identical steps, costing both tokens and latency.
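For context, the existing node-level cache keys on the exact input, so even a trivial paraphrase misses. A minimal illustration of the baseline (imports are per my reading of the current docs, so treat them as approximate; the in-memory cache stands in for Redis):

```python
# Baseline: LangGraph's node cache reuses results only for identical inputs.
from typing_extensions import TypedDict

from langgraph.cache.memory import InMemoryCache
from langgraph.graph import START, StateGraph
from langgraph.types import CachePolicy

class State(TypedDict):
    query: str
    answer: str

def expensive_node(state: State) -> dict:
    # Imagine an LLM call here.
    return {"answer": f"reasoned about: {state['query']}"}

builder = StateGraph(State)
builder.add_node("reason", expensive_node, cache_policy=CachePolicy(ttl=120))
builder.add_edge(START, "reason")
graph = builder.compile(cache=InMemoryCache())

graph.invoke({"query": "summarize this week's Jira tickets"})  # computed, then cached
graph.invoke({"query": "get this week's Jira summary"})        # cache miss today
```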

Idea: GraphCache, a semantic, state-aware cache for LangGraph

The goal is to cache reasoning patterns, not just string inputs.

Key Features

  • Semantic caching:
    Uses embeddings to match equivalent prompts across runs (e.g., “summarize this week’s Jira tickets” ≈ “get this week’s Jira summary”).

  • State-aware keys:
    Cache key combines the graph node with a fingerprint of upstream state, so hits are context-safe (sketched together with semantic lookup right after this list).

  • Automatic cache warming:
    Precomputes results for frequently hit subgraphs.

  • Analytics layer:
    Tracks cost savings, latency reduction, and cache-hit ratios (Prometheus/Grafana compatible).
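To make the first two features concrete, here's a rough sketch of key construction and lookup. Everything here is hypothetical (GraphCache doesn't exist yet); it assumes sentence-transformers for embeddings and a FAISS inner-product index for similarity search, with the Redis layer omitted for brevity:

```python
# Hypothetical GraphCache internals: semantic lookup gated by a
# node + upstream-state fingerprint. All names here are proposals, not APIs.
import hashlib
import json

import faiss
import numpy as np
from sentence_transformers import SentenceTransformer

embedder = SentenceTransformer("all-MiniLM-L6-v2")
DIM = 384                        # embedding width of all-MiniLM-L6-v2
index = faiss.IndexFlatIP(DIM)   # cosine similarity via normalized inner product
entries: list[dict] = []         # cached outputs, parallel to index rows

def state_fingerprint(node: str, upstream_state: dict) -> str:
    """Hash the node name plus upstream state so hits are context-safe."""
    payload = json.dumps({"node": node, "state": upstream_state},
                         sort_keys=True, default=str)
    return hashlib.sha256(payload.encode()).hexdigest()

def lookup(node: str, upstream_state: dict, prompt: str,
           threshold: float = 0.92):
    """Return a cached output if a semantically similar prompt was stored
    under the same node + state fingerprint, else None."""
    if index.ntotal == 0:
        return None
    vec = embedder.encode([prompt], normalize_embeddings=True).astype(np.float32)
    scores, ids = index.search(vec, k=5)
    fp = state_fingerprint(node, upstream_state)
    for score, i in zip(scores[0], ids[0]):
        if i != -1 and score >= threshold and entries[i]["fp"] == fp:
            return entries[i]["output"]
    return None

def store(node: str, upstream_state: dict, prompt: str, output) -> None:
    """Embed the prompt and remember output + fingerprint for reuse."""
    vec = embedder.encode([prompt], normalize_embeddings=True).astype(np.float32)
    index.add(vec)
    entries.append({"fp": state_fingerprint(node, upstream_state),
                    "output": output})
```

Note the fingerprint deliberately covers only upstream state, not the prompt itself, so paraphrased prompts can still hit when the surrounding context is identical; the 0.92 threshold is a placeholder that would need tuning per embedding model.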

Why it matters

  • Agents often repeat much of their reasoning across users or sessions (in my experience, roughly 60–80%).
    Intelligent reuse could plausibly cut token cost by 40–70% while also reducing latency significantly.

Implementation direction

I’m considering building this as a plug-in layer:

  • before_node_run hook → check the semantic cache (Redis + FAISS backend); see the sketch after this list

  • after_node_run hook → store embeddings + output + state fingerprint

  • Optional Prometheus metrics + Grafana dashboard
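Putting the pieces together, here's how the hook layer might look as a node decorator, reusing the lookup/store sketch above. The decorator shape and hook names are proposals, not existing LangGraph extension points; the counters use the standard prometheus_client library:

```python
# Hypothetical plug-in layer: wrap a node callable with the proposed
# before_node_run / after_node_run behavior, plus hit/miss metrics.
import functools

from prometheus_client import Counter

CACHE_HITS = Counter("graphcache_hits_total", "Semantic cache hits", ["node"])
CACHE_MISSES = Counter("graphcache_misses_total", "Semantic cache misses", ["node"])

def cached_node(node_name: str, prompt_key: str = "query"):
    """Decorate a node so semantically repeated work is reused."""
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(state: dict) -> dict:
            prompt = str(state.get(prompt_key, ""))
            # Fingerprint upstream state *without* the prompt field, so
            # paraphrases of the prompt can still match semantically.
            upstream = {k: v for k, v in state.items() if k != prompt_key}
            # before_node_run: consult the semantic cache
            hit = lookup(node_name, upstream, prompt)
            if hit is not None:
                CACHE_HITS.labels(node=node_name).inc()
                return hit
            CACHE_MISSES.labels(node=node_name).inc()
            result = fn(state)
            # after_node_run: store embedding + output + state fingerprint
            store(node_name, upstream, prompt, result)
            return result
        return wrapper
    return decorator

@cached_node("summarize_tickets")
def summarize_tickets(state: dict) -> dict:
    # The actual LLM call would live here.
    return {"answer": "..."}
```

Whether this belongs in middleware or a decorator is exactly one of the open questions below; the decorator form just made for the shortest sketch.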

Would love to hear thoughts on:

  • Whether this aligns with LangGraph’s roadmap for caching/memory

  • Best way to expose hooks (middleware vs node-decorator)

  • Any prior work or open design discussions I can build on

If there’s interest and a green light, I’d be glad to contribute in whatever form makes sense.

Thanks,
Akilesh KR