The referenced link contains an example snippet with a subgraph being invoked in the parent node.
I was wondering if it’s advisable to compile the subgraph dynamically for each parent graph invocation?
```python
from langgraph.graph import StateGraph, START

def create_and_call_subgraph(state: State):
    # Create a new subgraph based on the state
    subgraph_builder = StateGraph(SubgraphState)
    for agent in state.get("selected_agents"):
        subgraph_builder.add_node(agent)
    subgraph = subgraph_builder.compile()
    # Transform the parent state to the subgraph state
    subgraph_output = subgraph.invoke({"bar": state["foo"]})
    # Transform the response back to the parent state
    return {"foo": subgraph_output["bar"]}

builder = StateGraph(State)
builder.add_node("router", my_router)
builder.add_node("subgraph", create_and_call_subgraph)
builder.add_edge(START, "router")
builder.add_edge("router", "subgraph")
graph = builder.compile()
```
The use case I’m trying to solve involves a supervisor router that manages multiple sub-agents. To reduce the context sent to the LLM, I want to create a subgraph that includes only the agents relevant to the current user query, instead of sending the entire list of agents to the LLM. Any suggestions or advice?
I reckon you can build and compile a subgraph inside a node on every invocation, but it’s usually not the pattern LangGraph is optimized for. For your goal (only exposing a subset of agents to the LLM), it’s typically better to keep the graph topology static and make the router’s prompt depend on selected_agents rather than recompiling a new subgraph each time.
This follows the way the official subgraphs examples build graphs once and reuse them, instead of compiling dynamically in the hot path.
1. Is compiling a subgraph per invocation “wrong”?
No, it’s allowed and will work, especially if:
- the graphs are small, and
- throughput is low (e.g., a few requests per minute, not hundreds per second).
StateGraph(...).compile() does more than just wrap functions:
Validates and freezes the topology
Wires in the checkpointer and config handling
Prepares the execution planner
Doing that on every node invocation is extra overhead that you normally avoid by compiling at startup and reusing the compiled graph.
You also lose some benefits of treating it as a first-class subgraph:
If you wire a compiled subgraph into the parent before calling compile, LangGraph can automatically share the checkpointer and make streaming/debugging across parent + subgraph much nicer.
If you instead build and invoke a fresh StateGraph inside a node, that inner execution looks more like an opaque function call from the parent graph’s point of view.
So it’s acceptable, but not the idiomatic or most efficient pattern unless you really need per-request topology changes.
2. A simpler pattern for your use case
Your use case is:
Supervisor router manages many sub‑agents.
For a given query you want to restrict which agents are offered to the LLM, to keep the prompt small.
You don’t actually need a dynamically rebuilt subgraph for that. Instead:
Keep a static set of agent nodes in the graph (or a static supervisor graph pattern like in the multi‑agent / supervisor examples).
Add to the state something like selected_agents: list[str] or rich metadata about the “active” subset.
In the router node, construct the system prompt / tool list from selected_agents only.
Conceptually:
```python
from typing import TypedDict

class State(TypedDict):
    user_query: str
    selected_agents: list[AgentConfig]  # computed earlier in the graph

def router(state: State):
    # Build a prompt that only mentions the selected agents
    instructions = """You are a supervisor. You may call ONLY these agents:
{descriptions}
""".format(
        descriptions="\n".join(a.description for a in state["selected_agents"]),
    )
    # Call your LLM with just those options
    # model = ...
    # response = model.invoke(...)
    ...
```
The underlying graph can still have all agents as nodes; the LLM just never hears about the irrelevant ones for a given query, so you’ve reduced context without touching the graph topology.
This is very close to the official “supervisor + workers” / multi‑agent workflow patterns, where the supervisor chooses among a fixed set of workers but you control what gets surfaced in the prompt.
3. When dynamic per‑query graphs do make sense
Dynamic graph compilation is useful in more advanced setups, for example:
A planner builds a DAG (directed acyclic graph) at runtime from the user query (e.g., a plan of tools/agents to run with dependencies), and you then turn that plan into a StateGraph and compile it once for that request.
This is exactly the kind of “build a graph at runtime, then compile and execute it” pattern described in the community example “Building Dynamic Agentic Workflows at Runtime”.
In that pattern, you still avoid recompiling in a tight loop: you compile once per planned workflow, not once per node call inside a long‑lived parent graph.
If your agent selection logic is that complex, a good architecture is:
Top‑level entrypoint (or small parent graph) runs a planner / router.
The planner returns a plan or a list of agents.
Your application code builds a dedicated StateGraph for that plan, calls .compile() once, and then runs that compiled graph (graph.invoke / graph.stream).
4. Practical recommendation
Putting it together:
If you mainly care about shrinking the LLM context:
Keep a static graph.
Use selected_agents (or similar) in state to dynamically generate the router prompt/tool list.
If you truly need different topologies per query:
Build and compile a per‑query graph in a planner step, but don’t re‑compile inside every node.
Only compile per‑invocation inside a node if your workflows are small, low‑traffic, and you don’t mind the extra overhead and reduced observability.
That way you stay close to the idiomatic patterns shown in the subgraph and multi‑agent docs, but still get the context savings you’re after.
Official LangGraph Python docs on subgraphs, including examples of compiling once and calling from a node: LangGraph Python docs – Subgraphs.
LangGraph graphs reference, which documents the StateGraph / CompiledGraph model and the compile step: LangGraph graphs reference.
Thank you for the detailed response, @pawel-twardziak
Thank you as well for sharing your perspective, @wfh
That’s very helpful advice, @pawel-twardziak , about reducing the context sent to the LLM while keeping the subgraph static. In our setup, the subgraph uses an internal library owned by another team, which is why we wanted to experiment with a dynamic subgraph. We will explore whether our internal library can be adjusted or configured to send a reduced context when making LLM calls so that we can keep the graph static.
I also wanted to clarify something we noticed while working on this POC. As far as I know, a node is supposed to take a state and return an updated state. In one of our POCs, however, we return the subgraph (a CompiledStateGraph) itself instead of an updated state, and it still appears to work:
In this setup, LangGraph seems to call the returned subgraph without any explicit invoke from our side, which is different from the documented pattern where you call subgraph.invoke inside a node and return updated state.
Is this behavior safe to rely on, or is it more of an implementation detail that could break with future versions? In particular:
Is returning a compiled subgraph from a node an officially supported pattern, or should nodes always return a state-like mapping?
If it is supported, does LangGraph automatically handle synchronous vs asynchronous execution and streaming?
As far as I can tell from reading the source code:
A node function is always wrapped in a RunnableCallable
RunnableCallable has a built‑in rule: if your function returns another Runnable, it immediately runs that runnable with the same input and returns its output
Since CompiledStateGraph is a Runnable, returning it from the node causes the compiled subgraph to be invoked automatically, and the node’s effective output becomes the subgraph’s final state dict
The state‑update machinery then sees a normal dict and updates the parent graph’s state, which is why your example “works” even though the function literally return subgraph
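To make that recursion rule concrete without depending on LangGraph internals, here is a deliberately simplified pure-Python mimic. This is NOT the real RunnableCallable, just an illustration of the "a returned runnable is invoked immediately with the same input" behavior:

```python
class MiniRunnable:
    """Toy stand-in for the Runnable wrapper described above."""

    def __init__(self, func):
        self.func = func

    def invoke(self, value):
        result = self.func(value)
        # The recursion rule: if the function returns another runnable,
        # invoke it with the same input; ITS output becomes the node's output.
        if isinstance(result, MiniRunnable):
            return result.invoke(value)
        return result

# "Subgraph" that computes a state update
subgraph = MiniRunnable(lambda state: {"bar": state["foo"] * 2})

# Node that literally returns the subgraph instead of a state dict
node = MiniRunnable(lambda state: subgraph)

print(node.invoke({"foo": 3}))  # → {'bar': 6}; the subgraph ran automatically
```

The node's effective output is the subgraph's result dict, which is why the parent graph's state-update machinery accepts it.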
Trying to answer your questions:
Is returning a compiled subgraph from a node an officially supported pattern, or should nodes always return a state-like mapping?
No - it’s not an officially supported or documented pattern.
The contract for a stateful node in StateGraph/CompiledStateGraph is:
Return a state‑like mapping (dict of updates), or
A Command (or list/tuple of Commands), or
A typed/annotated state object that can be converted into updates
If a node returns some arbitrary object, the core CompiledStateGraph.attach_node logic is written to treat that as an error (INVALID_GRAPH_NODE_RETURN_VALUE). The reason your example works is that the node function is wrapped in a RunnableCallable with recurse=True, and that wrapper happens to auto‑invoke any Runnable it gets back, turning your returned CompiledStateGraph into a second‑stage call whose result is a dict, which is then accepted.
That’s a side effect of the generic LangChain Runnable wrapper, not an intentional, stable LangGraph API. So for code you care about, you should still write nodes to return state updates / Commands, and use the documented subgraph patterns:
Graph‑as‑node: builder.add_node("subgraph", compiled_subgraph); or
Explicit call: out = subgraph.invoke(...); return {...state updates...}.
If it is supported, does LangGraph automatically handle synchronous vs asynchronous execution and streaming?
For the documented subgraph patterns, yes:
A CompiledStateGraph is a Runnable, so when you add it as a node (builder.add_node("subgraph", subgraph)), the PregelNode.bound runnable’s invoke/ainvoke/stream/astream are used automatically, and with subgraphs=True you get proper subgraph‑scoped streaming/events.
If you call subgraph.invoke / subgraph.ainvoke / subgraph.stream / subgraph.astream inside a node, the usual LangChain/LangGraph sync, async, and streaming semantics apply.
For the “return a graph from the node” trick specifically:
Today, RunnableCallable.invoke and ainvoke do automatically call ret.invoke(…) / ret.ainvoke(…) when your node returns a Runnable, so sync vs async is honored and any streaming the inner graph does is consumed and then forwarded through the outer node’s runnable sequence.
But this is an implementation detail of the RunnableCallable(recurse=True) wrapper, not a guaranteed LangGraph node contract. There’s no promise that this auto‑recursion (or its exact streaming behavior) will remain stable, and it doesn’t integrate with the explicit subgraphs=True streaming/inspection machinery.
So: supported and documented for “graph‑as‑node” and “node explicitly invokes subgraph”; not officially supported (just incidentally working) for “have a node return a compiled subgraph and rely on the wrapper to auto‑invoke it.”
Anything you can do by “returning a subgraph so it auto‑invokes” you can already do with the two official approaches (graph‑as‑node, or explicit subgraph.invoke/astream inside the node), but those approaches are clearer, typeable, and easier to integrate with checkpointers, streaming, and tooling.
Ya, we designed the Runnable API to recurse in that way. It predates LangGraph as a library and is stable, but it’s less legible from a software design perspective, so I don’t see any purpose in using it in the context of StateGraph.