The current Embeddings API (e.g., OpenAIEmbeddings, AzureOpenAIEmbeddings, BedrockEmbeddings) only returns raw vectors from embed_documents/embed_query, without metadata like token usage from providers. This limits cost tracking and debugging, especially for batched/chunked calls—unlike LLMs with LLMResult.
You’re absolutely right that the current Embeddings API limitation makes cost tracking and debugging difficult compared to LLMs, which return rich metadata through LLMResult. This is a known gap in the LangChain ecosystem that affects many developers working on embedding-heavy applications. The main workarounds are: using custom callback handlers to capture usage data during embedding calls; calling provider-specific SDKs (like the OpenAI client or boto3 for Bedrock) directly alongside LangChain embeddings to get the metadata; or creating wrapper classes that extend the base embedding classes with usage-tracking functionality. While these approaches work, they add complexity compared to the straightforward metadata access available with LLM calls. This is a frequently requested feature in the LangChain community, so it’s worth checking the GitHub issues or filing a feature request for native embedding metadata support that would bring embeddings up to parity with LLM result handling.
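To make the wrapper-class workaround concrete, here is a minimal sketch. It is not a real LangChain class: `UsageTrackingEmbeddings`, `embed_fn`, and the response shape are assumptions, modeled on the OpenAI embeddings payload (a `data` list of embeddings plus a `usage` dict with token counts). A real implementation would subclass `langchain_core.embeddings.Embeddings` and delegate to the provider SDK instead of a plain callable.

```python
from typing import Callable, List


class UsageTrackingEmbeddings:
    """Hypothetical wrapper that records token usage alongside embeddings.

    `embed_fn` is assumed to accept a list of texts and return a dict shaped
    like the OpenAI embeddings response:
      {"data": [{"embedding": [...]}, ...],
       "usage": {"prompt_tokens": int, "total_tokens": int}}
    """

    def __init__(self, embed_fn: Callable[[List[str]], dict]):
        self._embed_fn = embed_fn
        self.total_tokens = 0  # running usage total across all calls

    def embed_documents(self, texts: List[str]) -> List[List[float]]:
        response = self._embed_fn(texts)
        # Capture the usage metadata the base Embeddings API would discard,
        # then return only the raw vectors, matching the standard interface.
        self.total_tokens += response["usage"]["total_tokens"]
        return [item["embedding"] for item in response["data"]]


# Demo with a stubbed provider so the sketch runs without an API key.
def fake_provider(texts: List[str]) -> dict:
    return {
        "data": [{"embedding": [0.0, 1.0]} for _ in texts],
        "usage": {"prompt_tokens": 5, "total_tokens": 5},
    }


embedder = UsageTrackingEmbeddings(fake_provider)
vectors = embedder.embed_documents(["chunk one", "chunk two"])
```

Because the usage total accumulates on the wrapper instance, batched or chunked calls can be cost-tracked after the fact by reading `embedder.total_tokens`, which is exactly the metadata the stock `embed_documents` throws away.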