Best way to handle very large tool outputs in LangGraph (avoid LLM + LangSmith overload)

Hi all — I’m building an agentic workflow (LangGraph + Gemini) where one of my tools returns a very large JSON output. The full data is required for downstream processing, but:

  • I must not send the raw JSON to the LLM

  • I must avoid huge payloads appearing in LangSmith traces

  • I’m running locally with Streamlit, no external DB

  • The agent may generate multiple large artifacts per run

I tested three approaches:


1. response_format="content_and_artifact"

Store the large JSON in artifact, return only a summary in content.
With a small patch, LangSmith won’t upload the full artifact.

Pros: agent-friendly, clean, scalable
Cons: still need disk storage if I want persistence


2. Write large JSON to local temp files

Tool writes to disk and returns only a file key/path.

Pros: safe, no risk of LLM exposure or LangSmith overload
Cons: manual lifecycle, less native to LangGraph


3. Global variables

Simple dict holding all results.

Pros: easy to prototype
Cons: breaks on Streamlit reruns, not persistent, not agent-friendly
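For completeness, approach 3 is essentially this. If the dict lives in the Streamlit script itself, every rerun re-executes the top-level code and rebuilds it empty, which is why it breaks:

```python
import uuid

# Module-level registry: rebuilt empty on every Streamlit rerun of this
# script, and lost entirely on process restart.
RESULTS: dict[str, dict] = {}


def stash(data: dict) -> str:
    key = uuid.uuid4().hex
    RESULTS[key] = data
    return key


key = stash({"rows": list(range(1000))})
print(len(RESULTS[key]["rows"]))
```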


Question:

What is the recommended pattern for handling large tool outputs that must remain accessible across an agent’s workflow while keeping them out of the LLM context and LangSmith traces?

Should I primarily rely on content_and_artifact + disk-backed storage, or is there a more idiomatic approach?

Thanks!