Hi,
I’m working on a project (for educational purposes) and trying to create a chatbot where I can upload files and ask questions about each file.
When selecting a file and querying “summarize the content of the file”, the chatbot correctly returns a response.
When uploading another file with the same query, the chatbot repeats the previous response.
When disabling short-term memory, the issue is resolved.
I’m using a supervisor agent with a sub-agent that performs RAG on a file. Since the LLM I’ve selected for the supervisor agent doesn’t support tool-calling, I’m using a middleware with the @dynamic_prompt decorator hook. The middleware passes the query and file to the sub-agent, and the sub-agent’s response is returned through the middleware to the supervisor agent.
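For context, here is a minimal framework-free sketch of that flow. All names here (`sub_agent_rag`, `build_prompt`) are illustrative stand-ins for my setup, not actual LangChain APIs:

```python
# Sketch of the supervisor/sub-agent flow: the middleware calls the
# sub-agent and injects its answer into the supervisor's system prompt.
# All names are illustrative, not real LangChain APIs.

def sub_agent_rag(query: str, file_id: str) -> str:
    """Hypothetical sub-agent: retrieves chunks from one file and answers."""
    # A real implementation would embed, retrieve, and generate here.
    return f"[RAG answer for {query!r} over {file_id}]"

def build_prompt(query: str, file_id: str) -> str:
    """@dynamic_prompt-style hook: call the sub-agent, then inject its
    answer into the supervisor's system prompt text."""
    context = sub_agent_rag(query, file_id)
    return (
        "You are a supervisor agent. Use the context below to answer.\n"
        f"Context:\n{context}"
    )

print(build_prompt("summarize the content of the file", "file_2"))
```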
Have you looked at the state transitions in your agent? The fact that disabling short-term memory “fixes” it means your agent’s state is getting polluted between queries.
You can use the before_model and after_model decorators to print the states. You’ll likely see the context from the first file sticking around and confusing the supervisor on the second query.
I would use the before_model decorator to invoke the sub-agent and update the state with the subagent’s response. That way the supervisor has the necessary context to synthesize an answer.
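Here’s a toy illustration of the idea in plain Python (not the actual middleware API): if the injected context accumulates in the persisted state, the model keeps seeing the first file’s context; overwriting it on every turn, before the model call, avoids that.

```python
# Toy model of short-term memory pollution: checkpointed state keeps the
# context injected for file 1, so a query about file 2 still "sees" it
# unless the hook overwrites it each turn.
state = {"messages": [], "rag_context": None}

def before_model_hook(state: dict, query: str, file_id: str) -> None:
    """before_model-style hook (illustrative signature): refresh the
    sub-agent context for the CURRENT file instead of letting the
    previous file's context linger."""
    state["rag_context"] = f"[RAG answer over {file_id}]"  # overwrite, don't append
    state["messages"].append({"role": "user", "content": query})

before_model_hook(state, "summarize the content of the file", "file_1")
before_model_hook(state, "summarize the content of the file", "file_2")
print(state["rag_context"])  # now reflects file_2, not file_1
```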
Using @dynamic_prompt with short-term memory does mess up the supervisor agent, since I’ve injected the sub-agent’s response into the system prompt text.
I’ve also tested using the @before_model hook. In the hook function, I’ve tried injecting the sub-agent response once as a HumanMessage and once as an AIMessage. I’m seeing better results with the AIMessage.
Now, I’m still wondering what I should do. Is this the correct approach? Or should I inject the sub-agent response as a content block in the user’s query HumanMessage?
UPDATE:
After running the code again, injecting context as an AIMessage causes problems and confuses the supervisor. I’ve now switched to @wrap_model_call so the sub-agent response isn’t kept in chat history, and I inject the sub-agent response as a HumanMessage instead.
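For anyone hitting the same issue, the key property of @wrap_model_call here is that you can rewrite the message list for a single model invocation without persisting the injection. A rough framework-free sketch of that idea (the hook signature and `call_model` are illustrative, not the exact LangChain ones):

```python
# Sketch: inject the sub-agent answer as a transient HumanMessage for one
# model call only, leaving the persisted chat history untouched.

history = [{"role": "user", "content": "summarize the content of the file"}]

def call_model(messages: list) -> str:
    """Stand-in for the actual model handler."""
    return f"answer based on {len(messages)} messages"

def wrap_model_call_hook(history: list, rag_context: str) -> str:
    """wrap_model_call-style hook: copy the messages, append the sub-agent
    context as a user-role message, and call the model. The original
    history list is never mutated, so nothing leaks into the next turn."""
    transient = list(history)  # shallow copy; injection is per-call only
    transient.append({"role": "user", "content": f"Context:\n{rag_context}"})
    return call_model(transient)

response = wrap_model_call_hook(history, "[RAG answer over file_2]")
print(len(history))  # still 1 -- the injected context was not persisted
```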