Tool/Function Calling with Llama-3.2-3B-Instruct model (local)

Hi, I am using meta-llama/Llama-3.1-8B-Instruct and have successfully parsed the function call and executed it through prompting. However, I am unsure how to feed the tool result back to the model. It seems ToolMessage does not apply to Hugging Face models, as it requires a tool_call_id that does not exist in the previous AIMessage. The chat template of Llama 3.1 does show signs of receiving tool results back through a role called ‘ipython’, but I am unsure how to implement this with the LangChain messages framework. Would appreciate any help!
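For reference, this is the kind of round-trip I am trying to build manually. A minimal sketch: the tool_call_id value ("call_1") is an arbitrary id I assign myself, since the local model does not return one.

from langchain_core.messages import AIMessage, HumanMessage, ToolMessage

# Manually feeding a tool result back: assign your own id on the AIMessage's
# tool call and reuse it in the ToolMessage so the two are linked.
messages = [
    HumanMessage(content="What's the weather in Singapore?"),
    AIMessage(
        content="",
        tool_calls=[{"name": "get_weather", "args": {"location": "Singapore"}, "id": "call_1"}],
    ),
    ToolMessage(content="Weather in Singapore: Sunny, 32°C", tool_call_id="call_1"),
]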

@pawel-twardziak

@TZengCoder, you can do something like the following:

from langchain_huggingface import HuggingFaceEndpoint, ChatHuggingFace
from langchain.agents import create_agent

model = HuggingFaceEndpoint(repo_id="meta-llama/Llama-3.1-8B-Instruct", task="text-generation")
chat = ChatHuggingFace(llm=model, verbose=True)
agent = create_agent(
    model=chat,
    tools=[],  # your tools
    system_prompt="You are a LangChain Expert",  # your main prompt
)
# create_agent uses langgraph to build the ReAct loop you are referring to:
# the agent executes a tool, sees its output, sends it back to the LLM,
# and generates a final message.
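For illustration, invoking such an agent returns the full message history of that loop. A rough sketch of what to expect (exact contents vary by model):

result = agent.invoke({"messages": [{"role": "user", "content": "Hi"}]})
for msg in result["messages"]:
    # Typical sequence: HumanMessage, AIMessage(tool_calls), ToolMessage, AIMessage
    print(type(msg).__name__, getattr(msg, "content", ""))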

Hi, thanks for the reply! I was wondering if this works with local models, like:

from langchain_huggingface import HuggingFacePipeline
from langchain.agents import create_agent
from langchain.tools import tool
llm = HuggingFacePipeline.from_model_id(
    model_id="Qwen/Qwen2.5-7B-Instruct",
    task="text-generation"
)
@tool
def get_weather(location: str) -> str:
    """Get weather information for a location."""
    return f"Weather in {location}: Sunny, 72°F"

agent = create_agent(chat_model, tools=[get_weather])
result = agent.invoke(
    {"messages": [{"role": "user", "content": "What's the weather in San Francisco?"}]}
)

I tried running this, but the result outputs: “However, I’m a large language model, I don’t have real-time access to current weather conditions. But I can suggest some ways for you to find out…” (which does not utilize the tool). I am unsure if this is because local models cannot return tool calls in the response the way OpenAI models do; I only noticed that HuggingFaceEndpoint supports tool calls in its response, but not the local models.
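For anyone debugging the same thing: a quick way to check whether the chat model is actually emitting structured tool calls is to bind the tool and inspect the tool_calls attribute on the raw response, bypassing the agent loop. A minimal sketch, assuming the pipeline is wrapped in ChatHuggingFace first:

from langchain_huggingface import ChatHuggingFace

chat_model = ChatHuggingFace(llm=llm)  # wrap the pipeline as a chat model

# Call the model directly with the tool bound; no agent loop involved.
bound = chat_model.bind_tools([get_weather])
response = bound.invoke("What's the weather in San Francisco?")

# A non-empty list means the model emitted a structured tool call;
# an empty list means it answered in plain text instead.
print(response.tool_calls)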

Please do the following:

from langchain_huggingface import HuggingFacePipeline, ChatHuggingFace
from langchain.agents import create_agent
from langchain.tools import tool

llm = HuggingFacePipeline.from_model_id(
    model_id="Qwen/Qwen2.5-7B-Instruct",
    task="text-generation"
)

chat_model = ChatHuggingFace(llm=llm)

@tool
def get_weather(location: str) -> str:
    """Get weather information for a location."""
    return f"Weather in {location}: Sunny, 72°F"

agent = create_agent(chat_model, tools=[get_weather])

result = agent.invoke(
    {"messages": [{"role": "user", "content": "What's the weather in San Francisco?"}]}
)

My apologies, I accidentally left out the line when copying over from my notebook:

chat_model = ChatHuggingFace(llm=llm)

So the output is still the same; the agent is unable to use the tool :smiling_face_with_tear:

Please try the following:

from langchain_huggingface import ChatHuggingFace, HuggingFaceEndpoint
from langchain.agents import create_agent
from langchain.tools import tool

llm = HuggingFaceEndpoint(repo_id="Qwen/Qwen2.5-7B-Instruct", task="text-generation")
chat_model = ChatHuggingFace(llm=llm)

@tool
def get_weather(location: str) -> str:
    """Get weather information for a location."""
    return f"Weather in {location}: Sunny, 72°F"

agent = create_agent(chat_model, tools=[get_weather])

result = agent.invoke(
    {"messages": [{"role": "user", "content": "What's the weather in San Francisco?"}]}
)
print(result)
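If you only want the final answer rather than the whole state dict, the last entry in the messages list is the final AIMessage. A small sketch:

# "messages" holds the whole conversation; the last item is the
# model's final answer after the tool round-trip.
final_message = result["messages"][-1]
print(final_message.content)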

Sorry, but is there a possible solution without the use of an endpoint model, as I do wish to host the model myself?

Have you solved this issue? I am facing the same situation.

You can pass a localhost URL to `HuggingFaceEndpoint` as well:

model = HuggingFaceEndpoint(
    endpoint_url="http://localhost:8010/",
    max_new_tokens=512,
    top_k=10,
    top_p=0.95,
    typical_p=0.95,
    temperature=0.01,
    repetition_penalty=1.03,
    huggingfacehub_api_token="my-api-key",
)
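This assumes an inference server (for example text-generation-inference) is already serving the model at that URL. From there, the wiring is the same as above. A sketch, assuming get_weather is the tool defined earlier:

from langchain_huggingface import ChatHuggingFace
from langchain.agents import create_agent

# Wrap the locally hosted endpoint as a chat model and build the agent as before.
chat_model = ChatHuggingFace(llm=model)
agent = create_agent(chat_model, tools=[get_weather])

result = agent.invoke(
    {"messages": [{"role": "user", "content": "What's the weather in San Francisco?"}]}
)
print(result["messages"][-1].content)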