LiteLLM Router in LangChain: Missing Model Name and Cost in LangSmith Traces

I’m using a LiteLLM router with LangChain to switch between multiple models, but I’m having an issue with observability in LangSmith. Here’s my setup:

from langchain_litellm import ChatLiteLLMRouter
from litellm.router import Router

# `settings` is my app config object that stores the provider API keys
model_list = [
    {
        "model_name": "medium",
        "litellm_params": {
            "model": "openai/gpt-4.1-mini",
            "api_key": settings.openai_api_key.get_secret_value(),
        },
    },
    {
        "model_name": "medium",
        "litellm_params": {
            "model": "anthropic/claude-haiku-4-5",
            "api_key": settings.anthropic_api_key.get_secret_value(),
        },
    },
]

router = Router(
    model_list=model_list,
    routing_strategy="simple-shuffle",
    num_retries=2,
)

chat = ChatLiteLLMRouter(
    router=router,
    model="medium",
    temperature=0,
    streaming=True,
    stream_options={"include_usage": True},
)

In LangSmith I only see token usage (input/output tokens), but not the actual model that was used for each call, nor the cost per LLM invocation (cost is the most important metric for me). Is there a way to enable this via configuration, or do I need to implement a custom handler? It feels a bit surprising that this isn’t supported out of the box in LangChain/LangSmith.

Thanks in advance!

For the first question: the invoked model can already be tracked via the ls_model_name key in the default metadata that LangChain passes to LangSmith.

For cost tracking, LangSmith unfortunately only calculates costs automatically for the OpenAI and Anthropic wrappers. For other integrations, such as ChatLiteLLM in your case, you need to supply the well-defined fields that LangSmith recognizes in order for cost to be displayed, as explained here. To add this support, a feature request should be filed on the langchain-litellm GitHub repo.
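As a rough sketch of what that could look like: the run's usage_metadata would need to carry both token counts and costs. Note that the cost key names below (input_cost / output_cost / total_cost) are an assumption based on LangSmith's usage-metadata conventions and may differ in your LangSmith version:

```python
# Hypothetical sketch: build the usage_metadata dict that LangSmith
# would need on each LLM run in order to display per-call cost.
# The cost key names are an assumption, not a confirmed LangSmith API.

def build_usage_metadata(input_tokens: int, output_tokens: int,
                         input_cost: float, output_cost: float) -> dict:
    return {
        "input_tokens": input_tokens,
        "output_tokens": output_tokens,
        "total_tokens": input_tokens + output_tokens,
        "input_cost": input_cost,
        "output_cost": output_cost,
        "total_cost": input_cost + output_cost,
    }

meta = build_usage_metadata(9, 9, 0.0000225, 0.000090)
print(meta["total_tokens"])  # 18
```

The idea is that an integration (here, langchain-litellm) would attach this dict to the AIMessage so LangSmith can pick it up per run.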

LiteLLM actually has a built-in litellm.completion_cost() function that already supports cost calculation for many providers. ChatLiteLLMRouter could call it and populate usage_metadata with those well-defined keys so the cost is sent along, but again, this should be opened as a feature request on the langchain-litellm repo.
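To illustrate the idea without pulling in the library, here is a minimal sketch of the per-call cost arithmetic that litellm.completion_cost() performs from its bundled pricing map. The per-token prices below are made-up placeholders, not real provider pricing:

```python
# Minimal sketch of per-call cost calculation. PRICES holds placeholder
# numbers (USD per 1M tokens as (input, output)), NOT real pricing;
# litellm.completion_cost() looks these up from its own pricing map.
PRICES = {
    "gpt-4.1-mini": (0.40, 1.60),
    "claude-haiku-4-5": (1.00, 5.00),
}

def completion_cost_sketch(model: str, prompt_tokens: int,
                           completion_tokens: int) -> float:
    in_price, out_price = PRICES[model]
    return (prompt_tokens * in_price + completion_tokens * out_price) / 1_000_000

cost = completion_cost_sketch("gpt-4.1-mini", 9, 9)
```

In the real integration, the token counts would come from the Usage object on LiteLLM's response, and the result would go into the usage_metadata cost fields.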

Thanks for the answer!

The only issue is that ls_model_name currently shows just the model group, not the actual model used (I expected to see litellm_params.model). In my case, instead of something like gpt-4.1-mini, I only see "heavy", which is a group of multiple models the router can switch between. I’d like to track the exact model, not just the group.

Also, thanks for pointing out usage_metadata; I’ll take a look at it.

Hello,
Okay, I think this will also be an additional feature request:

ModelResponseStream(id='chatcmpl-DUeBZwWhLNTHEO4TNebwK8aeE6H1Q', created=1776196685, model='gpt-4o', object='chat.completion.chunk', system_fingerprint='fp_d737a10594', choices=[StreamingChoices(finish_reason='stop', index=0, delta=Delta(provider_specific_fields=None, content=None, role=None, function_call=None, tool_calls=None, audio=None), logprobs=None)], provider_specific_fields=None, service_tier='default', obfuscation='JbBmyv8jDU') <--
ModelResponseStream(id='chatcmpl-DUeBZwWhLNTHEO4TNebwK8aeE6H1Q', created=1776196685, model='gpt-4o', object='chat.completion.chunk', system_fingerprint='fp_d737a10594', choices=[StreamingChoices(finish_reason=None, index=0, delta=Delta(provider_specific_fields=None, content=None, role=None, function_call=None, tool_calls=None, audio=None), logprobs=None)], provider_specific_fields=None, usage=Usage(completion_tokens=9, prompt_tokens=9, total_tokens=18, completion_tokens_details=CompletionTokensDetailsWrapper(accepted_prediction_tokens=0, audio_tokens=0, reasoning_tokens=0, rejected_prediction_tokens=0, text_tokens=None, image_tokens=None, video_tokens=None), prompt_tokens_details=PromptTokensDetailsWrapper(audio_tokens=0, cached_tokens=0, text_tokens=None, image_tokens=None, video_tokens=None)))

Above are the stream chunks we get back from LiteLLM. The model field in each chunk gives the actual model used, which could be read to update the response_metadata of the final AIMessage returned by ChatLiteLLMRouter.
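A rough sketch of that merging step, using plain dicts to stand in for the ModelResponseStream chunks and for the final message's response_metadata (so field names beyond model/usage are simplified assumptions):

```python
# Sketch: fold the actual model name and usage from LiteLLM stream
# chunks into the metadata of the final message. Plain dicts stand in
# for ModelResponseStream and the AIMessage's response_metadata.
def merge_stream_metadata(chunks: list[dict]) -> dict:
    response_metadata: dict = {}
    for chunk in chunks:
        if chunk.get("model"):   # actual deployment model, e.g. "gpt-4o"
            response_metadata["model_name"] = chunk["model"]
        if chunk.get("usage"):   # usage typically arrives on the final chunk
            response_metadata["usage"] = chunk["usage"]
    return response_metadata

chunks = [
    {"model": "gpt-4o", "usage": None},
    {"model": "gpt-4o",
     "usage": {"prompt_tokens": 9, "completion_tokens": 9, "total_tokens": 18}},
]
meta = merge_stream_metadata(chunks)
# meta["model_name"] == "gpt-4o"; meta["usage"]["total_tokens"] == 18
```

With the actual model name in response_metadata, LangSmith could show the resolved model per call instead of the router group.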
