Hey, guys:
I am using LangChain v1 to develop a multi-agent system, following the new tool-calling architecture in the v1 docs. In this architecture, the supervisor agent makes the final response. Is there a way for a sub-agent to respond to the user's question directly? I am asking because, for longer responses, having the supervisor re-generate what a sub-agent already produced is going to be pretty slow.
Any suggestions are appreciated!
Hi @DataNoob0723
Yes. You don’t have to make the supervisor re-generate what a worker produced. In LangGraph v1 you can either stream the sub-agent’s tokens directly to the UI, let a worker be a terminal node (pass-through final), or use an interrupt to surface the worker’s message to the user. All three avoid the supervisor re-writing long outputs.
- Stream the worker's tokens directly (recommended for latency):
  - Use the graph's event stream and forward `on_chat_model_stream` events from the worker node to the client while it's running. The supervisor can still own control flow, but you don't re-generate text; you simply surface the worker's stream to the user.
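To make the filtering step concrete, here is a minimal sketch. The event dicts below are hand-written stand-ins shaped like what `graph.astream_events(..., version="v2")` yields; the node name `"worker"` and the simulated chunks are assumptions for illustration.

```python
# Sketch: forward only the worker node's model tokens to the UI.
# The event shape mirrors astream_events v2 output; we simulate a few
# events with plain dicts so the filtering logic is clear.

def worker_tokens(events, node_name="worker"):
    """Yield only token chunks produced by the given node's chat model."""
    for event in events:
        if (event["event"] == "on_chat_model_stream"
                and event.get("metadata", {}).get("langgraph_node") == node_name):
            yield event["data"]["chunk"]

# Simulated event stream: supervisor tokens are skipped, worker tokens surface.
fake_events = [
    {"event": "on_chat_model_stream",
     "metadata": {"langgraph_node": "supervisor"},
     "data": {"chunk": "routing..."}},
    {"event": "on_chat_model_stream",
     "metadata": {"langgraph_node": "worker"},
     "data": {"chunk": "Hello"}},
    {"event": "on_chat_model_stream",
     "metadata": {"langgraph_node": "worker"},
     "data": {"chunk": " world"}},
]

print("".join(worker_tokens(fake_events)))  # -> "Hello world"
```

In a real app you would apply the same filter to the async iterator returned by the compiled graph and write each chunk to your response stream.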
- Pass through a "final answer" from a worker (no re-generation):
  - Define a convention/tool like `final_answer(text)` that a worker can call when it has the user-facing answer. Your graph logic detects this and ends the graph (or returns immediately) with that text. The supervisor then acts as a router/arbiter only and doesn't paraphrase.
  - Practically, add an edge from that worker to `END`, or have a small function node that copies the worker's content to the final return value without invoking another LLM.
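Here is a minimal sketch of that detection logic, using plain dicts for messages. The tool name `final_answer` and the message shape are illustrative assumptions; in a real graph you would wire a function like this in via `add_conditional_edges("worker", route_after_worker)`.

```python
# Sketch: route to END when the worker signalled a final answer, otherwise
# hand control back to the supervisor. Messages are plain dicts for clarity.

END = "__end__"  # stand-in for LangGraph's END sentinel

def route_after_worker(state):
    """Short-circuit to END if the last message carries a final_answer call."""
    last = state["messages"][-1]
    for call in last.get("tool_calls", []):
        if call["name"] == "final_answer":
            return END
    return "supervisor"

# Worker produced the user-facing answer: short-circuit to END.
done = {"messages": [{"role": "ai",
                      "tool_calls": [{"name": "final_answer",
                                      "args": {"text": "Long answer..."}}]}]}
print(route_after_worker(done))      # -> "__end__"

# Worker only produced an intermediate step: go back to the supervisor.
partial = {"messages": [{"role": "ai", "tool_calls": []}]}
print(route_after_worker(partial))   # -> "supervisor"
```

The key point is that no model is invoked on the END path: the worker's text is surfaced as-is.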
- Use an interrupt to surface a worker response to the UI (human-in-the-loop style):
  - A worker can call `interrupt(...)` to yield control and surface content to the application. The app shows that message to the user, optionally collects input, and then resumes the graph. This is useful if you want an explicit "sub-agent speaks now" step.
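The interrupt pattern boils down to: pause mid-run, hand a payload to the app, then resume with whatever the user supplied. This plain-Python generator models that pause/resume handshake; in LangGraph the same roles are played by `interrupt(...)` inside the node and `Command(resume=...)` when re-invoking the graph. The draft text and feedback strings are made up for illustration.

```python
# Sketch: a generator modelling the interrupt/resume handshake.

def worker_node():
    # Pause and surface the draft to the user; execution stops here until
    # the app sends a reply back in.
    feedback = yield {"draft": "Here is my long answer..."}
    yield {"final": f"Noted the feedback: {feedback}"}

run = worker_node()
shown = next(run)                # graph "interrupts": app displays shown["draft"]
result = run.send("looks good")  # app "resumes" with the user's input
print(result["final"])           # -> "Noted the feedback: looks good"
```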
- Topology tweak: make the worker a terminal node:
  - In a supervisor pattern, you don't always have to return to the supervisor. You can route `worker -> END` so the worker is the final responder when appropriate. The supervisor remains a router, not a re-writer.
Notes:
- You can combine streaming with any of the above so the user sees tokens as they are generated by the worker.
- If you still want a single "final message" object at the end, keep the supervisor node but only pass through the worker's content (no model call), or use the `END` edge from the worker.
Thanks a lot for the insights!
I wonder: what if the supervisor needs more than one sub-agent to achieve the task? Would a conditional node that decides whether to go back to the supervisor be better, or is there a better way for agents to collaborate in a supervisor topology? Should it be a supervisor or a swarm in such a case?