Integrating Celery + Redis with LangGraph for heavy RAG indexing (chunking, embeddings) — best practices?

Hi everyone 👋,

I’m currently building a LangGraph-based RAG application and I’m looking for best practices / real-world advice on integrating Celery + Redis to offload heavy background work from LangGraph.

Current architecture (naive, probably wrong)

Right now, everything lives inside LangGraph:

  • LangGraph is used as the orchestrator

  • But it also performs heavy work:

    • PDF parsing (PyMuPDF)

    • chunking (Neural/Semantic chunkers)

    • embeddings (external APIs)

    • vector storage (Qdrant)

  • All of this runs inside LangGraph nodes (async Python)

This worked fine at first, but we’ve clearly made a naive architectural mistake.

The problem

LangGraph (especially in managed cloud environments) has limited CPU resources, and when:

  • 10–100 users upload documents at the same time

  • each upload triggers indexing + chunking + embeddings

we run into:

  • CPU contention

  • long blocking workflows

  • poor throughput

  • no real backpressure or queueing

In short: LangGraph is doing work it probably shouldn’t be doing.

What we want to achieve

We want to move heavy indexing work out of LangGraph and treat LangGraph as a control plane only.

Target direction:

  • LangGraph:

    • orchestration

    • state machine

    • decision-making (does this document need indexing?)

  • Celery + Redis:

    • background indexing

    • chunking

    • embeddings

    • vector DB writes

  • LangGraph should enqueue jobs and continue without blocking

Questions

  1. Is Celery + Redis a reasonable choice for offloading heavy RAG indexing from LangGraph?

  2. What is the recommended integration pattern?

    • Should LangGraph enqueue Celery jobs directly?

    • Should LangGraph poll for completion, or resume via callbacks/webhooks?

  3. How do you model state cleanly?

    • Keep all state in LangGraph store / DB?

    • Pass only IDs to Celery tasks?

  4. Are there any anti-patterns to avoid when combining:

    • LangGraph (workflow orchestration)

    • Celery (task execution)

    • Redis (queue / coordination)

  5. Has anyone implemented a similar setup for:

    • document indexing

    • RAG pipelines

    • multi-user concurrent uploads?
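To make questions 2 and 3 concrete, this is the kind of resume pattern we're weighing — a pure sketch with made-up node and status names. The status lookup is injected so it could be Celery's result backend (e.g. `AsyncResult(task_id).state`) or a row in our own jobs table; the graph itself only ever holds IDs and a status string:

```python
def check_indexing(state: dict, get_status) -> dict:
    # get_status is any lookup keyed by the stored task ID, e.g.
    # Celery's AsyncResult(task_id).state or a jobs-table query.
    return {**state, "indexing_status": get_status(state["celery_task_id"])}

def route_after_check(state: dict) -> str:
    # Conditional-edge style router: map worker status to the next node.
    # Node names here ("mark_indexed" etc.) are illustrative only.
    return {
        "SUCCESS": "mark_indexed",
        "FAILURE": "report_error",
    }.get(state.get("indexing_status"), "wait_and_recheck")
```

With this shape, polling is just re-entering `check_indexing` on a timer, while a callback/webhook variant would write the status into the store directly and resume the graph — the router stays the same either way.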

Summary

We’ve learned the hard way that:

Putting CPU-heavy indexing directly inside LangGraph does not scale well.

We’re now looking to decouple orchestration from execution, and any advice, examples, or architecture pointers would be hugely appreciated 🙏

Thanks in advance!