Hello Forum!
When using ChatHuggingFace with HuggingFacePipeline, the returned AIMessage never populates AIMessage.tool_calls (it is always empty), even when the model output contains well-formatted JSON for a tool call.
Expected outcome:
AIMessage.tool_calls containing a list of ToolCall objects (as when using OpenAI's chat model) whenever the LLM response contains a JSON object for a tool call.
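For reference, ToolCall in langchain_core is a TypedDict with name, args, and id keys (plus a "tool_call" type tag), so a populated result would look roughly like this (plain dicts are used so the snippet stands alone; the values are just illustrative):

```python
# Shape of the expected AIMessage.tool_calls payload.
# langchain_core.messages.ToolCall is a TypedDict with these keys.
expected_tool_calls = [
    {
        "name": "multiply",        # tool to invoke
        "args": {"a": 3, "b": 4},  # parsed arguments
        "id": "multiply_call",     # call id (illustrative value)
        "type": "tool_call",
    }
]
print(expected_tool_calls[0]["name"])  # → multiply
```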
Diving into the source code, I can see that this is expected, since there is no code for parsing the results.
When the llm is an instance of HuggingFacePipeline, the ChatHuggingFace._generate(...) method calls:
1. self._to_chat_prompt(..)
2. llm._generate(…)
3. self._to_chat_result(…)
The llm._generate method returns an LLMResult (which does not contain any field for tool calls), so my guess is that ChatHuggingFace should parse the tool calls and create the appropriate ToolCall objects.
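As a stopgap I'm post-processing the raw text myself. This is only a sketch (the helper name parse_tool_calls is mine, and it assumes the model emits its tool call as a JSON blob containing name and args keys), but something along these lines could be applied to the generated text before building the chat result:

```python
import json
import re
import uuid


def parse_tool_calls(text: str) -> list[dict]:
    """Best-effort extraction of tool-call JSON blobs from raw model output.

    Returns dicts shaped like langchain_core.messages.ToolCall
    (keys: name, args, id, type).
    """
    tool_calls = []
    # Greedily grab candidate JSON objects from the generated text.
    for match in re.finditer(r"\{.*\}", text, re.DOTALL):
        try:
            blob = json.loads(match.group(0))
        except json.JSONDecodeError:
            continue
        if isinstance(blob, dict) and "name" in blob and "args" in blob:
            tool_calls.append({
                "name": blob["name"],
                "args": blob["args"],
                # Fall back to a generated id when the model omits one.
                "id": blob.get("id") or str(uuid.uuid4()),
                "type": "tool_call",
            })
    return tool_calls


raw = '{"name": "multiply", "args": {"a": 3, "b": 4}, "id": "multiply_call", "type": "tool_call"}'
print(parse_tool_calls(raw))
```

A real fix would of course live inside langchain-huggingface itself (and handle multiple calls, malformed JSON, and the model's chat template), but this illustrates the kind of parsing that seems to be missing.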
I have some bandwidth to work on this, and it would be great if we could align on possible solutions and how to implement them. If you think this change is interesting enough to be incorporated, I can think about possible ways to implement it and post them here for a more focused/precise discussion.
Looking forward to hearing your thoughts!
I’m using the following versions:
- langchain==1.0.3
- langchain_core==1.0.3
- langchain-huggingface==1.0.1
I'm posting some example code to reproduce the issue, just in case I am doing something wrong.
import torch
from langchain_core.messages import SystemMessage, HumanMessage
from langchain_huggingface.llms import HuggingFacePipeline
from langchain_huggingface import ChatHuggingFace
from langchain.tools import tool
@tool
def multiply(a: int, b: int) -> int:
    """Multiply a and b.

    Args:
        a: first int
        b: second int
    """
    return a * b
model_name = 'Qwen/Qwen3-4B-Thinking-2507'
llm = HuggingFacePipeline.from_model_id(
    model_id=model_name,
    task='text-generation',
    device=0,
    batch_size=1,
    model_kwargs={'torch_dtype': torch.float16},
    # generation parameters belong in pipeline_kwargs, not model_kwargs
    pipeline_kwargs={'temperature': 0.1,
                     'max_length': 8192},
)
sys_msg = SystemMessage(content="You are a helpful assistant tasked with performing arithmetic on a set of inputs.")
prompt = '''
Multiply 3 by 4.
You must use a tool named multiply, which receives the two numbers to be multiplied as parameters.
Respond only using a JSON blob with the following format:
{
    "name": "multiply",
    "args": { "a": "3", "b": "4" },
    "id": "multiply_call",
    "type": "tool_call"
}
'''
human_msg = HumanMessage(content=prompt)
chat_model = ChatHuggingFace(llm=llm)
llm_with_tools = chat_model.bind_tools([multiply])
llm_output = llm_with_tools.invoke([sys_msg, human_msg])
print(llm_output.tool_calls)
Diego