LangChain deep_agent is very slow

xiaoyang · January 16, 2026, 7:50am

The LangChain deep_agent calls its sub-agent very slowly. Could you please tell me how to solve this problem? I have only loaded one sub-agent.

simon.budziak · January 16, 2026, 10:58am

Hello @xiaoyang , could you please share some code?

xiaoyang · January 19, 2026, 1:40am

OK, here is some code:

agent = create_deep_agent(
    model=model,
    tools=[],
    subagents=SUBAGENTS,
    system_prompt=MAIN_AGENT_PROMPT,
    backend=FilesystemBackend(root_dir=r"\Parse\src\api", virtual_mode=True),
    interrupt_on={},
)

simon.budziak · January 20, 2026, 7:24am

Hi @xiaoyang, what do you mean by “is calling the sub-agent very slowly”? What model are you using? Are you using thinking mode? Because you are using create_deep_agent, it runs slower than create_agent by design.

Could you try a different or faster model and observe its behaviour?
Do you have any logging? You could add some output logging or middleware to see exactly where the time is being spent.
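To illustrate the logging suggestion, here is a minimal, standard-library-only sketch of per-call timing. The names `timed`, `call_log`, and `fake_model_call` are hypothetical and not part of LangChain or deepagents; in a real setup you would wrap whatever function actually sends the request to your model, or use the framework's own callback or middleware hooks instead.

```python
import time
from functools import wraps


def timed(label, log):
    """Decorator factory: append (label, seconds) to `log` for every call."""
    def decorator(fn):
        @wraps(fn)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            try:
                return fn(*args, **kwargs)
            finally:
                # Record the duration even if the wrapped call raises.
                log.append((label, time.perf_counter() - start))
        return wrapper
    return decorator


call_log = []


@timed("model_call", call_log)
def fake_model_call(prompt):
    # Hypothetical stand-in for a real model request.
    time.sleep(0.01)
    return f"echo: {prompt}"


fake_model_call("plan")
fake_model_call("call sub-agent")

for label, seconds in call_log:
    print(f"{label}: {seconds:.3f}s")
print(f"total calls: {len(call_log)}")
```

Counting the entries in the log shows how many model calls a single user request triggers, and the per-call durations show whether one step dominates or every call is uniformly slow.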

xiaoyang · January 23, 2026, 8:52am

It feels like the thinking is very slow. Under the hood, it seems to be thinking over and over again.

xiaoyang · January 23, 2026, 8:56am

I am using the deepseek-chat model, and it has only one tool.

keenborder786 · January 24, 2026, 1:14pm

@xiaoyang is this self-hosted?

xiaoyang · January 26, 2026, 1:54am

Yes, the model I use with create_deep_agent is self-hosted.

keenborder786 · January 29, 2026, 2:14pm

I think that is the reason your execution is so slow. Can you send a single simple request and check the inference speed of your model?

xiaoyang · January 30, 2026, 5:24am

Yes, I think deep_agent sends multiple requests to the API, so it was very slow.

keenborder786 · February 3, 2026, 1:33pm

So that is the problem. I would recommend hosting your model on better hardware or using a smaller version of the model.

keenborder786 · February 21, 2026, 2:01am

@xiaoyang if you were able to verify this, can you mark the last answer as the solution?

Bitcot_Kaushal · February 24, 2026, 5:11pm

Hi @xiaoyang ,

The slowness isn’t coming from LangChain itself — it’s because:

You’re self-hosting the model.
create_deep_agent (Deep Agent) makes multiple LLM calls internally.
Your model (e.g., deepseek-chat) likely has slow inference speed on your current hardware.

Deep agents typically:

Plan
Reason
Possibly re-evaluate
Call tools
Reflect

That means multiple API/model calls per user request. If your model is self-hosted and slow per request, total latency increases quickly.

Why it feels like “thinking multiple times”

Because it actually is.

Deep agents often:

Generate plan
Evaluate tool choice
Possibly re-plan
Continue reasoning

Each step = another model call.

What you can do

Test raw inference speed with a single simple prompt
Check tokens/sec throughput
Upgrade hardware (GPU, VRAM, CPU)
Use a smaller model variant
Reduce reasoning depth if configurable
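As a rough illustration of the first two items, the sketch below times one request and derives tokens per second. It assumes you can obtain the number of completion tokens for a response; `measure_throughput` and `fake_generate` are hypothetical names, and `fake_generate` only simulates a model so the example runs on its own.

```python
import time


def measure_throughput(generate, prompt):
    """Time one call to `generate` and return (seconds, tokens, tokens_per_second).

    `generate` is any callable that sends `prompt` to the model and returns
    the number of completion tokens it produced.
    """
    start = time.perf_counter()
    completion_tokens = generate(prompt)
    elapsed = time.perf_counter() - start
    rate = completion_tokens / elapsed if elapsed > 0 else float("inf")
    return elapsed, completion_tokens, rate


def fake_generate(prompt):
    # Hypothetical stand-in: pretend the model took 20 ms to emit 40 tokens.
    time.sleep(0.02)
    return 40


elapsed, tokens, rate = measure_throughput(fake_generate, "Say hello.")
print(f"{tokens} tokens in {elapsed:.3f}s ({rate:.1f} tokens/s)")
```

Running the same measurement against the self-hosted endpoint and against a hosted one gives a direct comparison that is independent of the agent framework.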

Bottom line

Slow execution =
(multiple agent calls) × (slow self-hosted model inference)

So yes — your observation is correct.
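The bottom-line relationship can be written as a back-of-the-envelope estimate. All numbers here are illustrative assumptions, not measurements from this thread, and `estimate_latency` is a hypothetical helper.

```python
def estimate_latency(num_model_calls, seconds_per_call):
    """Approximate end-to-end latency when model calls run one after another."""
    return num_model_calls * seconds_per_call


# Assumed: one user request triggers 6 sequential model calls.
fast_backend = estimate_latency(6, 2.0)    # assumed 2 s per call
slow_backend = estimate_latency(6, 15.0)   # assumed 15 s per call

print(f"fast backend: {fast_backend:.0f}s, slow backend: {slow_backend:.0f}s")
```

The estimate ignores tool execution time and network overhead, but it shows why a modest per-call slowdown becomes very noticeable once the agent makes several calls in sequence.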
