Hello friends, I am working on a multi-agent system and wanted to ask how you would approach a challenge I’m facing.
I’m building a ReAct agent with a web search tool, and a single research topic can require around 40 tool calls. After each call, the agent appends the tool result to the message history and passes the full history back to the LLM, so context grows with every step. I need a way to handle this accumulation efficiently to keep latency and token usage down.
One idea I had is to compress the web search results after each tool call. However, LangChain seems to require that every AIMessage containing a tool call be immediately followed in the message history by a ToolMessage with the matching tool_call_id, generated in the tool_node. That seems to rule out inserting a separate compression node that emits a compressed ToolMessage in between.
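For context, here is roughly the shape of what I mean. This is a hedged, plain-Python sketch (not real LangChain/LangGraph classes; `compress`, `web_search`, and the dict-based messages are all stand-ins I made up): the idea is to compress the raw result *inside* the tool node, before it ever becomes a ToolMessage, so the AIMessage → ToolMessage pairing and the tool_call_id are never broken.

```python
# Hypothetical sketch, NOT real LangChain API: messages are plain dicts,
# and compress() stands in for an LLM summarization call.

def compress(text: str, max_chars: int = 200) -> str:
    """Stand-in for summarization; here we simply truncate."""
    return text if len(text) <= max_chars else text[:max_chars] + "...[truncated]"

def web_search(query: str) -> str:
    """Stand-in for the real web search tool."""
    return "lots of raw search result text about " + query + " " * 500

def tool_node(ai_message: dict) -> list[dict]:
    """Emit one ToolMessage per tool call, with already-compressed content."""
    tool_messages = []
    for call in ai_message["tool_calls"]:
        raw = web_search(call["args"]["query"])
        tool_messages.append({
            "role": "tool",
            # Reusing the same id keeps the AIMessage/ToolMessage pairing valid.
            "tool_call_id": call["id"],
            "content": compress(raw),
        })
    return tool_messages

msgs = tool_node({"tool_calls": [{"id": "call_1", "args": {"query": "x"}}]})
```

This avoids a separate compression node entirely, at the cost of compressing eagerly even when the raw result might have been useful later. I’m not sure whether that trade-off is the right one, which is part of my question.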
How would you handle this situation? Are there any resources or design patterns you’d recommend for managing this kind of context growth in a ReAct workflow?