Add a rate_limiter API to embeddings

Hey langchain! I noticed that embeddings don’t have a rate_limiter API like chat models do, even though many cloud providers impose rate limits on embedding endpoints in the same way. I see two options: 1) add a rate limiter only to init_embeddings, or 2) add a rate limiter directly to the base Embeddings class. I’m not knowledgeable enough to be certain, but the latter option seems like it could break more things.

Rate limiter support would help anyone using embedding models, just as it helps anyone using chat models. Perhaps even more, since many applications generate embeddings in bulk as background jobs.

When I was developing a knowledge base bot, I had to attach my own rate limiter to my embedding model.
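For context, here is roughly what that workaround looks like: a wrapper that acquires a limiter before delegating each call. This is a hedged sketch, not LangChain's actual API; the `TokenBucketLimiter` below is a hypothetical stand-in for something like `langchain_core.rate_limiters.InMemoryRateLimiter`, and `RateLimitedEmbeddings` is an illustrative wrapper I wrote, not a library class.

```python
import time


class TokenBucketLimiter:
    """Minimal token-bucket limiter (hypothetical stand-in for a
    langchain_core-style rate limiter)."""

    def __init__(self, requests_per_second: float, max_bucket_size: float = 1.0):
        self.rate = requests_per_second
        self.max_bucket_size = max_bucket_size
        self.tokens = max_bucket_size
        self.last = time.monotonic()

    def acquire(self) -> None:
        # Block until a token is available, then consume it.
        while True:
            now = time.monotonic()
            self.tokens = min(
                self.max_bucket_size,
                self.tokens + (now - self.last) * self.rate,
            )
            self.last = now
            if self.tokens >= 1:
                self.tokens -= 1
                return
            time.sleep((1 - self.tokens) / self.rate)


class RateLimitedEmbeddings:
    """Illustrative wrapper: acquires the limiter before each embedding call."""

    def __init__(self, inner, limiter: TokenBucketLimiter):
        self.inner = inner
        self.limiter = limiter

    def embed_documents(self, texts):
        self.limiter.acquire()
        return self.inner.embed_documents(texts)

    def embed_query(self, text):
        self.limiter.acquire()
        return self.inner.embed_query(text)
```

A built-in `rate_limiter` parameter on the base Embeddings class (or on init_embeddings) would make this kind of hand-rolled wrapper unnecessary, the same way it already does for chat models.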

I would be very interested in working on this feature; what do you think?