I built a multi-agent orchestration system using LangGraph that dynamically scales parallelism based on the variance of semantic drift across workers. Full disclosure: I can't code. This was built through AI collaboration (Grok), so I'm looking for feedback from people who actually know what they're doing.
The Problem
When running parallel agents, they can drift from the original objective in different directions. Most systems either use fixed worker counts or scale based on queue depth, but neither approach accounts for alignment quality.
The Solution
LangBaby measures how far each worker's results drift from the objective (using embedding similarity), computes the variance of that drift across all workers, smooths the variance over a rolling window, and dynamically adjusts parallelism (1-6 workers):
- High variance (workers diverging) → scale down
- Low sustained variance (workers aligned) → scale up
Essentially treating parallelism as a control theory problem.
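To make the control loop concrete, here is a minimal sketch of the drift-variance scaler described above. Everything in it is an assumption for illustration: the class name `DriftScaler`, the threshold values, and the window size are mine, not the repo's API, and real embeddings would come from an embedding model rather than raw lists.

```python
# Hypothetical sketch of a drift-variance parallelism controller.
# Names, thresholds, and window size are illustrative assumptions.
import math
from collections import deque

def cosine_distance(a, b):
    """Drift metric: 1 - cosine similarity between two embeddings."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return 1.0 - dot / (na * nb)

class DriftScaler:
    def __init__(self, min_workers=1, max_workers=6,
                 window=5, high_var=0.02, low_var=0.005):
        self.min_workers = min_workers
        self.max_workers = max_workers
        self.window = deque(maxlen=window)  # rolling window of variances
        self.high_var = high_var            # diverging -> scale down
        self.low_var = low_var              # aligned   -> scale up
        self.workers = min_workers

    def update(self, objective_emb, result_embs):
        # 1. Drift of each worker's result from the objective.
        drifts = [cosine_distance(objective_emb, e) for e in result_embs]
        # 2. Variance of drift across workers.
        mean = sum(drifts) / len(drifts)
        variance = sum((d - mean) ** 2 for d in drifts) / len(drifts)
        # 3. Smooth over the rolling window.
        self.window.append(variance)
        smoothed = sum(self.window) / len(self.window)
        # 4. Adjust parallelism, clamped to [min_workers, max_workers].
        if smoothed > self.high_var:
            self.workers = max(self.min_workers, self.workers - 1)
        elif smoothed < self.low_var and len(self.window) == self.window.maxlen:
            # "Sustained" low variance: only scale up once the window is full.
            self.workers = min(self.max_workers, self.workers + 1)
        return self.workers
```

One design note: gating the scale-up on a full window is one way to encode "sustained" alignment, and the asymmetry (scale down immediately, scale up slowly) acts like hysteresis to avoid oscillation, which is the usual failure mode of threshold controllers like this.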
Current Status
Work-in-progress research prototype. The architecture is conceptually complete but has placeholder functions for LLM integration, task execution, and critic evaluation. It needs real-world implementation and testing.
GitHub: https://github.com/robotransit/langbaby
What I’m Looking For
Since I built this through adversarial iteration with AI rather than from coding experience:
- Is the approach fundamentally sound? Any obvious flaws in the drift-variance scaling logic?
- Edge cases I'm missing? What breaks this in practice?
- Better approaches? Is there a simpler or more robust way to measure alignment quality for scaling decisions?
- Implementation gotchas? What will I hit when actually wiring up the placeholder functions?
Any feedback from people building production agent systems would be hugely valuable. I have domain understanding but not implementation experience, so I don’t know what I don’t know.