The Qwen3.6b model in Fireworks through init_chat_model reporting hugely inflated tokens

JoAmps · May 28, 2026, 10:40am

I am tracking token usage across all models. I have been using Fireworks via init_chat_model.

Every model reports the correct token usage except accounts/fireworks/models/qwen3p6-plus.

I was able to fix it by setting model_kwargs={"reasoning_effort": "none"}.
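A minimal sketch of what that workaround could look like in code, assuming the `langchain` and `langchain-fireworks` packages are installed and `FIREWORKS_API_KEY` is set. The helper names `build_model` and `token_usage` are hypothetical and only for illustration; the model id and the `model_kwargs` value are the ones from this post.

```python
# Model id and workaround setting are taken from the post above.
MODEL_ID = "accounts/fireworks/models/qwen3p6-plus"
MODEL_KWARGS = {"reasoning_effort": "none"}


def build_model():
    """Create the chat model (hypothetical helper).

    Needs langchain, langchain-fireworks and FIREWORKS_API_KEY.
    """
    # Imported lazily so this file loads even without langchain installed.
    from langchain.chat_models import init_chat_model

    return init_chat_model(
        MODEL_ID,
        model_provider="fireworks",
        model_kwargs=MODEL_KWARGS,
    )


def token_usage(model, prompt: str):
    """Invoke the model once and return the usage reported on the reply."""
    reply = model.invoke(prompt)
    # usage_metadata carries input_tokens, output_tokens and total_tokens
    # when the provider reports usage; it can be None otherwise.
    return reply.usage_metadata
```

Calling `token_usage(build_model(), "hello")` returns the usage that LangChain attached to the response, which is the value being compared against the actual count here.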

It was working fine, but yesterday it suddenly started reporting hugely bloated token counts again.

This is what I mean: actual tokens used were 3800, but reported tokens were 1229292929.
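One way to catch numbers like these before they reach a usage tracker is a plain sanity check on the usage dictionary. The sketch below assumes the `input_tokens` / `output_tokens` / `total_tokens` keys that LangChain's `usage_metadata` uses. The function name `check_usage` and the 1,000,000-token ceiling are hypothetical choices, not values from the library or from Fireworks, and the input/output split in the example is illustrative (only the totals come from this post).

```python
def check_usage(usage: dict, max_plausible_tokens: int = 1_000_000) -> list[str]:
    """Return a list of problems found in a usage dict (empty if none)."""
    problems = []
    inp = usage.get("input_tokens", 0)
    out = usage.get("output_tokens", 0)
    total = usage.get("total_tokens", 0)

    # The total is expected to equal input plus output.
    if total != inp + out:
        problems.append(f"total_tokens={total} != input+output={inp + out}")

    # A total above the ceiling is almost certainly a reporting error.
    if total > max_plausible_tokens:
        problems.append(f"total_tokens={total} exceeds {max_plausible_tokens}")

    return problems


# Illustrative split: only the totals (3800 and 1229292929) are from the post.
ok = {"input_tokens": 3000, "output_tokens": 800, "total_tokens": 3800}
bad = {"input_tokens": 3000, "output_tokens": 1229289929,
       "total_tokens": 1229292929}

print(check_usage(ok))
print(check_usage(bad))
```

A check like this does not fix the reported value, but it lets the tracker skip or flag a suspicious record instead of storing it.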

How do I fix this?

mdrxy · May 28, 2026, 5:32pm

Hi @JoAmps, would you please open an issue on the LangChain repo with an MRE? That way we can investigate further. Thank you!
