How should I deploy a self-hosted multi-agent system?

I would like to use the LangGraph CLI to serve and deploy LangChain agents on a corporate network. The goal is to have a main agent that delegates tasks to remote subagents. A user could interact with the main agent but should also be able to interact directly with a remote agent. Something like the diagram shown below, where Agent A would be the main agent while Agents B and C are the subagents. I’m aware that the LangGraph CLI tool can be used to serve and deploy a single agent, but how would I use the tool for a multi-agent system? Would I serve/deploy each agent separately in its own Docker environment, or create a single Docker environment and run all the agents in it? Would I deploy all the agents on a single server or give each agent its own server? Any advice on this would be very helpful.

Hello @wigging, let’s start with the more involved Split Deployment approach, then cover the simpler alternative at the end.

Step 1: Build and Deploy Agents B and C Independently

Agent B
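
Before building, the agent directory needs its own langgraph.json so the CLI knows which graph to package. A minimal sketch, assuming Agent B exposes its compiled graph as graph in src/agent.py (the same convention used for Agent A's config below):

// agent_b/langgraph.json
{
  "dependencies": ["."],
  "graphs": {
    "agent_b": "./src/agent.py:graph"
  },
  "env": ".env"
}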

cd agent_b/

langgraph build -t my-company/agent-b:latest

docker run -p 8001:8000 --env-file .env \
  -e REDIS_URI=redis://shared-redis:6379/1 \
  -e DATABASE_URI=postgres://...@shared-pg/agent_b_db \
  my-company/agent-b:latest

Agent C

cd agent_c/

langgraph build -t my-company/agent-c:latest

docker run -p 8002:8000 --env-file .env \
  -e REDIS_URI=redis://shared-redis:6379/2 \
  -e DATABASE_URI=postgres://...@shared-pg/agent_c_db \
  my-company/agent-c:latest

Note: Multiple deployments can share the same Redis and Postgres instances — just use different database numbers/names:

  • Redis: redis://shared-redis:6379/1 for Agent B, .../2 for Agent C
  • Postgres: postgres://host/agent_b_db vs postgres://host/agent_c_db

LANGSMITH_API_KEY and LANGGRAPH_CLOUD_LICENSE_KEY are required for the server to start. Include them in the .env file passed via --env-file.
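
For reference, a minimal .env might look like the following (placeholder values; add whichever model-provider keys your agents actually use):

LANGSMITH_API_KEY=<your LangSmith API key>
LANGGRAPH_CLOUD_LICENSE_KEY=<your license key>
ANTHROPIC_API_KEY=<model provider key, if your agents use Anthropic models>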


Choose either Step 2a or Step 2b depending on your needs.


Step 2a: Agent A Uses RemoteGraph to Call Agents B and C

(Use this if you are designing Agent A’s orchestration flow manually with LangGraph)

from langgraph.pregel.remote import RemoteGraph
from langgraph.graph import StateGraph, MessagesState, START
import os

# Point to the deployed servers
agent_b = RemoteGraph("agent_b", url=os.environ["AGENT_B_URL"])
agent_c = RemoteGraph("agent_c", url=os.environ["AGENT_C_URL"])

# Agent A's graph embeds B and C as subgraph nodes
builder = StateGraph(MessagesState)
builder.add_node("agent_b", agent_b)
builder.add_node("agent_c", agent_c)
builder.add_edge(START, "agent_b")
# ... routing logic ...
graph = builder.compile()

RemoteGraph has full API parity with a local CompiledGraph: .invoke(), .stream(), .get_state(), and .update_state() all work identically.
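
That parity means users and tests can also call a deployed subagent directly through the same object, without going through Agent A. A minimal sketch, assuming Agent B uses a message-based state:

# Direct call to the deployed Agent B, bypassing Agent A
result = agent_b.invoke(
    {"messages": [{"role": "user", "content": "Handle this domain-B task"}]}
)

# Streaming works the same way as on a local CompiledGraph
for chunk in agent_b.stream(
    {"messages": [{"role": "user", "content": "Handle this domain-B task"}]}
):
    print(chunk)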


Step 2b: Agent A as a Deep Agent with Agents B and C as Async Remote Subagents

(Use this if you want Agent A to be a batteries-included orchestrator using the deepagents package)

# agent_a/src/agent.py

from deepagents import AsyncSubAgent, create_deep_agent
import os

async_subagents = [
    AsyncSubAgent(
        name="agent_b",
        description="Specialist for [domain B]. Use for tasks requiring ...",
        graph_id="agent_b",
        url=os.environ["AGENT_B_URL"],   # e.g. http://agent-b-server:8001
    ),
    AsyncSubAgent(
        name="agent_c",
        description="Specialist for [domain C]. Use for tasks requiring ...",
        graph_id="agent_c",
        url=os.environ["AGENT_C_URL"],   # e.g. http://agent-c-server:8002
    ),
]

graph = create_deep_agent(
    model="anthropic:claude-sonnet-4-6",
    system_prompt="You are the main orchestrator. Delegate specialized work to your subagents.",
    subagents=async_subagents,
)
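
Because the subagents are called asynchronously, a quick local smoke test would go through the async entry point. A sketch, assuming AGENT_B_URL and AGENT_C_URL are set and the subagent servers are reachable:

import asyncio

# Illustrative one-off test; the request content is hypothetical
result = asyncio.run(
    graph.ainvoke(
        {"messages": [{"role": "user", "content": "Delegate a research task"}]}
    )
)
print(result["messages"][-1].content)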

Step 3: Deploy Agent A

// agent_a/langgraph.json
{
  "dependencies": ["."],
  "graphs": {
    "agent_a": "./src/agent.py:graph"
  },
  "env": ".env"
}
cd agent_a/

langgraph build -t my-company/agent-a:latest

docker run -p 8000:8000 --env-file .env \
  -e REDIS_URI=redis://shared-redis:6379/0 \
  -e DATABASE_URI=postgres://...@shared-pg/agent_a_db \
  -e AGENT_B_URL=http://agent-b-server:8001 \
  -e AGENT_C_URL=http://agent-c-server:8002 \
  my-company/agent-a:latest
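
Once all three containers are running, you can verify each server from any machine on the network with the langgraph_sdk client. A minimal sketch, assuming the hostnames above resolve on your corporate network:

from langgraph_sdk import get_sync_client

# Every deployment exposes the same LangGraph Server API
for url in (
    "http://agent-a-server:8000",
    "http://agent-b-server:8001",
    "http://agent-c-server:8002",
):
    client = get_sync_client(url=url)
    print(url, [a["graph_id"] for a in client.assistants.search()])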

Single Deployment (Simpler Alternative)

If you prefer, you can also use a Single Deployment with one Docker image. In this case, RemoteGraph is not needed. All agents are registered in a single langgraph.json and share the same server. As with the split deployment, Agent A’s orchestration logic can be either a custom LangGraph flow or a deepagents deep agent:

{
  "dependencies": ["."],
  "graphs": {
    "agent_a": "./agents/agent_a.py:graph",
    "agent_b": "./agents/agent_b.py:graph",
    "agent_c": "./agents/agent_c.py:graph"
  },
  "env": ".env"
}

Each graph gets its own default assistant and can be invoked independently by passing its graph name as the assistant_id in the request, so users can still interact directly with Agent B or Agent C without going through Agent A.
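
For example, a user could talk to Agent B directly through the SDK. A minimal sketch, assuming the single deployment listens on localhost:8000:

from langgraph_sdk import get_sync_client

client = get_sync_client(url="http://localhost:8000")

# The graph name from langgraph.json doubles as the assistant_id
result = client.runs.wait(
    None,  # stateless run; pass a thread_id instead to keep conversation state
    "agent_b",
    input={"messages": [{"role": "user", "content": "Hello, Agent B"}]},
)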


Which Deployment Should You Choose?

  • Start with Single Deployment to validate your system and assess whether it can handle your expected traffic; it is simpler to operate and requires no inter-service networking.
  • Switch to Split Deployment if your single deployment cannot handle the load, or if you need independent scaling per agent.
  • Use Split Deployment if Agents B and C require different compute profiles (e.g., GPU vs. CPU, different memory limits).
  • Use Split Deployment if you want faster, non-blocking parallel execution of subagents.
  • For a corporate network specifically, Split Deployment is generally recommended, since it gives you better fault isolation and clearer traceability per agent.

Thank you for the reply. I have some questions regarding the self-hosted infrastructure where the agents would be deployed to.

Based on your reply, if agents B and C share the same Redis and Postgres database then I guess the remote agents would need to run on the same server; is that correct? My diagram only shows two remote agents but I imagine that more agents would be developed and deployed. So would all the remote agents be running on the same server or would they need their own individual server? And what kind of hardware requirements such as memory (RAM) and CPU cores would you suggest? Is there a set of minimum requirements for an agent?

The purpose of agent A is to orchestrate tasks to other remote subagents. But where should this agent live? Should I just host it on the same server along with the other agents and the user would access it remotely? Or should it run on the user’s local machine and the agent itself would access the subagents remotely?

Lastly, what kind of Git repository structure would you recommend for such a project? I was thinking each agent would have its own repository so tests, documentation, and code development can be done independently of the other agents. But if we develop a lot of agents then that would create a lot of separate repos. So maybe the code for all the agents should go in one repository?

No, they don’t need to run on the same server; they only need network access to the shared Redis and Postgres instances, each agent with its own dedicated database.

You can run all of them on the same server, but if the compute profiles differ or you need more parallelization, you can run them on individual servers.

The LangGraph docs do not officially specify a minimum. From practical experience, 2 CPU cores and 4 GB RAM is a reasonable starting point; scale up as needed. For production workloads with concurrent runs, 4 CPU cores and 8 GB RAM tends to work well.

Agent A should be deployed remotely on a server, just like Agents B and C. Users can access any agent directly, including Agents B and C, without going through Agent A. Each deployed agent exposes the same LangGraph Server API, so any agent can be a direct entry point.

I would suggest having a monorepo, structured something like this:

my-agents/
├── apps/
│   ├── agent_a/                  # Main orchestrator
│   │   ├── src/
│   │   │   └── agent_a/
│   │   │       ├── __init__.py
│   │   │       ├── graph.py
│   │   │       ├── nodes.py
│   │   │       └── tools.py
│   │   ├── tests/
│   │   ├── langgraph.json
│   │   ├── pyproject.toml
│   │   └── .env.example
│   ├── agent_b/                  # Subagent B
│   │   ├── src/
│   │   │   └── agent_b/
│   │   │       ├── __init__.py
│   │   │       ├── graph.py
│   │   │       └── tools.py
│   │   ├── tests/
│   │   ├── langgraph.json
│   │   ├── pyproject.toml
│   │   └── .env.example
│   └── agent_c/                  # Subagent C
│       ├── src/
│       │   └── agent_c/
│       │       ├── __init__.py
│       │       ├── graph.py
│       │       └── tools.py
│       ├── tests/
│       ├── langgraph.json
│       ├── pyproject.toml
│       └── .env.example
├── libs/
│   └── shared/                   # Common state schemas, tools, utilities
│       ├── src/
│       │   └── shared/
│       │       ├── __init__.py
│       │       ├── state.py
│       │       └── tools.py
│       └── pyproject.toml
├── infra/
│   ├── docker-compose.yml        # Local dev: Redis + Postgres + all agents
│   └── k8s/                      # Helm charts for production K8s
│       ├── agent_a/
│       ├── agent_b/
│       └── agent_c/
├── docs/
├── .github/
│   └── workflows/
│       ├── agent_a.yml           # Independent CI/CD per agent
│       ├── agent_b.yml
│       └── agent_c.yml
├── pyproject.toml                # Root workspace (uv workspaces)
└── README.md
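
With this layout, libs/shared is the natural home for a state schema that all agents import, so Agent A and the subagents agree on message structure. A hypothetical sketch of libs/shared/src/shared/state.py:

from langgraph.graph import MessagesState


class SharedState(MessagesState):
    """Message-based state shared across agents."""

    # Hypothetical extra field for cross-agent task tracking
    task_id: str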

In your note you mentioned that LANGSMITH_API_KEY and LANGGRAPH_CLOUD_LICENSE_KEY are required for the server to start. But I want to run this on a local network, not in the LangSmith cloud. So why do I need these keys?

Yes, you still need both of them even for a self-hosted deployment. Please check Self-host standalone servers - Docs by LangChain. LANGGRAPH_CLOUD_LICENSE_KEY is needed to “authenticate ONCE at server start up”.