Building an AI networking agent: LangChain vs LangGraph vs Deep Agents?

I’m new to building AI agents, but after a lot of research I’ve decided to go all-in on the LangChain ecosystem.

My goal is to build a serious AI networking agent focused on troubleshooting and analysis (think real-world network ops, not demos).

Some context:

  • Most integrations will be via MCP servers (network devices, APIs, etc.)

  • I don’t see much need for filesystem-based tools

  • The agent should be able to:

    • Run structured troubleshooting workflows

    • Potentially spawn sub-agents for parallel analysis (e.g., per device or hypothesis)

    • Reuse modular “skills” over time
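To make that concrete, here’s a rough, framework-agnostic sketch of the shape I have in mind: a small skill registry plus one sub-agent per device, fanned out in parallel. Everything in it is a placeholder I made up for illustration (the `skill` decorator, `check_interfaces`, `check_bgp`, `device_subagent`, the device names), and the skill bodies only return canned strings where a real version would call an MCP tool. It uses plain `asyncio`, not any LangChain or LangGraph API.

```python
import asyncio
from dataclasses import dataclass
from typing import Awaitable, Callable, Dict, List

# Hypothetical skill registry: skill name -> async function taking a device name.
Skill = Callable[[str], Awaitable[str]]
SKILLS: Dict[str, Skill] = {}


def skill(name: str):
    """Register a reusable skill under a name (illustrative helper)."""
    def register(fn: Skill) -> Skill:
        SKILLS[name] = fn
        return fn
    return register


@skill("check_interfaces")
async def check_interfaces(device: str) -> str:
    # Placeholder: a real version would call an MCP tool here.
    await asyncio.sleep(0)
    return f"{device}: interfaces ok"


@skill("check_bgp")
async def check_bgp(device: str) -> str:
    # Placeholder result, not real device output.
    await asyncio.sleep(0)
    return f"{device}: bgp sessions established"


@dataclass
class Finding:
    device: str
    results: List[str]


async def device_subagent(device: str, skill_names: List[str]) -> Finding:
    # One "sub-agent" per device runs its skills in order.
    results = [await SKILLS[name](device) for name in skill_names]
    return Finding(device, results)


async def troubleshoot(devices: List[str], skill_names: List[str]) -> List[Finding]:
    # Fan out one sub-agent per device; gather returns results in input order.
    tasks = (device_subagent(d, skill_names) for d in devices)
    return list(await asyncio.gather(*tasks))


findings = asyncio.run(
    troubleshoot(["r1", "r2"], ["check_interfaces", "check_bgp"])
)
for f in findings:
    print(f.device, f.results)
```

The part I care about is the separation: skills are registered once and reused, and the fan-out does not need to know what any individual skill does.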

Where I’m stuck is how to actually structure this within the ecosystem:

  • Should I start simple with LangChain and evolve later?

  • Jump straight into LangGraph for more control?

  • Or design around Deep Agents from day one?

I also can’t tell whether I’m overengineering this or missing something fundamental.

Additional questions:

  • Are there recommended architectural patterns for multi-agent troubleshooting systems?

  • Is a hybrid approach (structured workflow + agent decision-making) common in practice?

  • Any real-world examples of similar systems (infra/networking/observability agents)?
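By “hybrid” I mean roughly the following: the workflow fixes which steps exist and in what order, and the agent only chooses among an allowed set of branches. This is a minimal sketch under my own assumptions, not a LangGraph example. The step names, the branch table, and `rule_based_decider` (a stand-in for an LLM call) are all hypothetical, and the symptom data is hard-coded.

```python
from typing import Any, Callable, Dict

State = Dict[str, Any]


def collect_symptoms(state: State) -> State:
    # Fixed step: always runs first. Hard-coded data, not a real device query.
    return {**state, "symptoms": ["packet loss on uplink"]}


def check_layer1(state: State) -> State:
    return {**state, "diagnosis": "inspect optics and cabling"}


def check_routing(state: State) -> State:
    return {**state, "diagnosis": "inspect routing adjacencies"}


# The workflow author decides which branches exist; the decider only picks one.
BRANCHES: Dict[str, Callable[[State], State]] = {
    "layer1": check_layer1,
    "routing": check_routing,
}


def rule_based_decider(state: State) -> str:
    # Stand-in for an LLM call that returns a branch name.
    symptoms = " ".join(state["symptoms"])
    return "layer1" if "packet loss" in symptoms else "routing"


def run_workflow(decide: Callable[[State], str]) -> State:
    state = collect_symptoms({})
    choice = decide(state)
    if choice not in BRANCHES:
        # Guardrail: an unknown answer falls back to a default branch.
        choice = "routing"
    state = BRANCHES[choice](state)
    return {**state, "branch": choice}


result = run_workflow(rule_based_decider)
print(result["branch"], "->", result["diagnosis"])
```

Swapping `rule_based_decider` for a model call would leave the rest untouched, which is the property I’m hoping one of these frameworks gives me out of the box.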

Would really appreciate guidance from folks who’ve built non-trivial agent systems. I’m optimizing for something that can scale beyond a prototype.