Handling sensitive data in LangGraph's shared State (Found an interesting paper)

Hey everyone,

I’ve been building some complex multi-agent workflows with LangGraph lately and hit a bit of a conceptual roadblock regarding data privacy: specifically, how to handle personally identifiable information (PII) inside the shared State object.

While looking for best practices, I stumbled across a recent arXiv paper called AgentLeak. I thought it was super relevant to how we design our graphs.

The authors point out that standard LLM privacy audits usually only check the final output of the chain. But in multi-agent system (MAS) architectures, sensitive data placed in the shared state can easily leak into intermediate tool calls or be unnecessarily exposed to nodes that shouldn’t see it.

They built a benchmark to evaluate this exact issue, covering 7 internal communication channels and 32 classes of privacy attacks across regulated domains.

Links:

It really got me thinking about graph design. How do you all handle state sanitization in LangGraph? Do you build specific “filter nodes” just to scrub PII before passing the state to the next agent, or do you handle it at the tool level?

Would love to hear your architecture patterns!

Hey @yagobski :waving_hand: welcome — this is a very good question.

You’re absolutely right: most “LLM privacy” discussions focus on final output, but in LangGraph-style multi-agent systems, the real risk surface is the internal state and tool boundary.

The AgentLeak paper is pointing at something very real.


The Core Problem in LangGraph

In LangGraph:

  • State is shared (or partially shared)

  • Nodes can read/write keys

  • Tools often receive slices of state

  • Intermediate reasoning may serialize state implicitly

If you put raw PII into the shared State, you’ve already widened the blast radius.

So the design question becomes:

How do we reduce unnecessary data visibility across nodes?


Patterns That Actually Work in Practice

:one: State Segmentation (Most Important)

Instead of one big shared state:

from typing import TypedDict

class State(TypedDict):
    public_context: dict
    sensitive_context: dict
    internal_reasoning: dict

Then:

  • Only specific nodes receive sensitive_context

  • Other nodes operate only on public_context

  • Tools never receive full state unless explicitly required

Think “principle of least privilege,” but for graph nodes.
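A minimal sketch of what that least-privilege split looks like in practice. The node names and state shape here are illustrative, not a specific LangGraph API — LangGraph nodes are just functions that take state and return partial updates:

```python
from typing import TypedDict

class State(TypedDict):
    public_context: dict
    sensitive_context: dict
    internal_reasoning: dict

def summarizer_node(state: State) -> dict:
    # Operates only on the public slice; never reads sensitive_context.
    topic = state["public_context"].get("topic", "")
    return {"internal_reasoning": {"summary": f"notes on {topic}"}}

def kyc_node(state: State) -> dict:
    # The one explicitly privileged node allowed to touch sensitive_context.
    ssn = state["sensitive_context"].get("ssn")
    return {"public_context": {**state["public_context"], "kyc_passed": ssn is not None}}

state: State = {
    "public_context": {"topic": "loan application"},
    "sensitive_context": {"ssn": "123-45-6789"},
    "internal_reasoning": {},
}
print(summarizer_node(state))  # update contains no PII
```

The enforcement here is by convention (the summarizer simply never indexes into `sensitive_context`); the projection pattern below in :two: makes it structural.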


:two: Explicit State Projection Per Node

In LangGraph, don’t pass full state by default.

Instead:

  • Wrap nodes in adapter functions

  • Extract only required fields

  • Return only minimal updates

This avoids accidental propagation of sensitive fields.
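One way to sketch the adapter idea is a small decorator that projects state down to the named fields before the node ever sees it (the `project` helper and `planner` node are my own illustrative names, not LangGraph built-ins):

```python
from typing import Callable

def project(fields: list[str]) -> Callable:
    """Wrap a node so it receives only the named state keys;
    its return value is taken as the minimal state update."""
    def wrap(node: Callable[[dict], dict]) -> Callable[[dict], dict]:
        def adapter(state: dict) -> dict:
            visible = {k: state[k] for k in fields if k in state}
            return node(visible)  # the node never sees the full state
        return adapter
    return wrap

@project(["public_context"])
def planner(view: dict) -> dict:
    # 'sensitive_context' is structurally absent here, not just "off limits".
    assert "sensitive_context" not in view
    return {"internal_reasoning": {"plan": ["step 1", "step 2"]}}

full_state = {
    "public_context": {"goal": "summarize"},
    "sensitive_context": {"ssn": "123-45-6789"},
}
print(planner(full_state))
```

Because the projection happens in the adapter, a node can’t accidentally copy a sensitive key forward even if its prompt or logic goes wrong.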


:three: PII Tokenization Layer (Very Effective)

Before PII enters the graph:

  • Replace PII with internal IDs

  • Store actual values in secure store (DB, vault)

  • Agents operate on surrogate keys

Example:

User email → USER_123
SSN → TOKEN_456

Only specific “resolution nodes” rehydrate the data.

This dramatically reduces leakage risk.
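A toy version of the tokenization layer, assuming an in-memory store standing in for a real vault or database:

```python
import secrets

class PIIVault:
    """Toy in-memory vault; in production this would be a DB or secrets store."""
    def __init__(self):
        self._store: dict[str, str] = {}

    def tokenize(self, value: str, prefix: str = "TOKEN") -> str:
        token = f"{prefix}_{secrets.token_hex(4)}"
        self._store[token] = value
        return token

    def rehydrate(self, token: str) -> str:
        # Only designated "resolution nodes" should ever call this.
        return self._store[token]

vault = PIIVault()
email_token = vault.tokenize("alice@example.com", prefix="USER")
# Everything inside the graph operates on the surrogate key only:
print(email_token)
```

Even if a surrogate key leaks into a tool argument or a trace, it is meaningless without access to the vault.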


:four: Tool-Level Guards (Second Line of Defense)

Every tool wrapper should:

  • Validate allowed input schema

  • Reject unexpected state keys

  • Log access attempts

Treat tools as security boundaries, not just utilities.
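A minimal sketch of a tool guard as a decorator — the allowed-key set and `search` tool are hypothetical, and real logging should go to a proper audit sink rather than stdout:

```python
ALLOWED_KEYS = {"query", "user_token"}  # illustrative input schema

def guarded_tool(tool_fn):
    def wrapper(**kwargs):
        unexpected = set(kwargs) - ALLOWED_KEYS
        if unexpected:
            # Reject rather than silently forwarding state the tool shouldn't see.
            raise ValueError(f"Unexpected tool inputs: {sorted(unexpected)}")
        print(f"tool access: {sorted(kwargs)}")  # log key names only, never values
        return tool_fn(**kwargs)
    return wrapper

@guarded_tool
def search(query: str, user_token: str = "") -> str:
    return f"results for {query}"

print(search(query="privacy"))
# search(query="x", ssn="123-45-6789") would raise ValueError
```

Note the logging choice: recording key names but not values keeps the audit trail itself from becoming a leakage channel.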


:five: Separate Subgraphs for Sensitive Ops

For highly regulated workflows:

  • Put sensitive logic in a subgraph

  • Avoid exposing that branch to general reasoning agents

  • Return only sanitized outputs

This creates architectural containment.
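The containment idea can be sketched with plain functions standing in for the subgraph boundary (in real LangGraph you would compile the sensitive branch as its own subgraph; the names below are illustrative):

```python
def sensitive_subgraph(sensitive_input: dict) -> dict:
    """Runs in isolation; returns only sanitized, non-PII fields."""
    approved = len(sensitive_input.get("ssn", "")) == 11
    return {"approved": approved}  # no raw PII crosses this boundary

def main_graph(state: dict) -> dict:
    # Hand sensitive data to the contained branch, then drop it from state.
    verdict = sensitive_subgraph(state.pop("sensitive_context", {}))
    # General reasoning agents only ever see the sanitized verdict.
    return {**state, "public_context": {**state.get("public_context", {}), **verdict}}

print(main_graph({"public_context": {}, "sensitive_context": {"ssn": "123-45-6789"}}))
```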


What I Would Avoid

:cross_mark: Putting full user profile in shared state
:cross_mark: Letting planning agents see raw PII
:cross_mark: Allowing tool calls to serialize entire state blobs
:cross_mark: Logging raw state without redaction


Where AgentLeak Is Especially Relevant

Their point about “intermediate channels” is key:

Leak vectors include:

  • Tool arguments

  • Memory buffers

  • Debug logs

  • Sub-agent communication

  • Planner-to-worker messaging

LangGraph’s flexibility increases power — but also responsibility.


My Mental Model

Treat LangGraph state like:

A distributed in-memory database with untrusted compute nodes.

Design like you would for:

  • Microservices

  • Zero-trust architecture

  • RBAC systems

Not like a simple Python dict.


If I Had to Pick a Default Pattern

For serious systems:

  1. Tokenize PII at ingestion

  2. Segmented state keys

  3. Explicit projections per node

  4. Tool input validation

  5. Sanitized logging

That combination gives strong practical protection.


Really glad you brought this up — this is exactly the kind of architectural discussion multi-agent systems need more of.


Hi @Bitcot_Kaushal,

Glad to see it sparked a discussion here.

You’ve hit on exactly why we built this benchmark. In LangGraph, the State object is powerful but it’s essentially a ‘shared memory’ that every node can potentially read from or write to. Our research showed that even if an agent isn’t supposed to look at certain PII, the way Multi-Agent Systems (MAS) are currently structured makes it very easy for that data to leak into tool calls, logs, or intermediate steps.

Regarding your question on architecture patterns:

  1. Filter Nodes vs. Scoped States: While filter nodes (sanitizers) are a good first step, we found they are often bypassed by sophisticated ‘jailbreak’ style prompts or indirect leakage. A more robust pattern in LangGraph might be using nested sub-graphs with strictly defined schemas that only expose a subset of the state to specific agents.

  2. The 7 Leakage Channels: In the paper, we identify 7 specific ways data escapes. In a LangGraph context, we should be particularly worried about Tool Arguments leakage (where the agent passes PII to an external API unknowingly) and Observability leakage (PII ending up in LangSmith or traces).

I’d be curious to hear: have you tried implementing a ‘Privilege Separation’ pattern in your graphs yet, or are you mostly relying on system prompts to keep agents from touching sensitive keys in the state?

Happy to dive deeper into the technicalities if anyone is interested!

Hi @yagobski

Right now, I’m mostly using system prompts to keep agents from getting to sensitive keys in the shared state.

I haven’t used a full privilege separation or scoped subgraph pattern yet, but I agree that prompt-level controls alone aren’t strong enough to keep things separate and can be bypassed.