Why are all the messages still being passed to the LLM when it is called? trim_messages isn't working. Here is my code:

from langchain.messages import RemoveMessage

from langgraph.graph.message import REMOVE_ALL_MESSAGES

from langgraph.checkpoint.memory import InMemorySaver

from langchain.agents import create_agent, AgentState

from langchain.agents.middleware import before_model

from langgraph.runtime import Runtime

from langchain_core.runnables import RunnableConfig

from typing import Any

from localLLMConfiger import office_chat_model as model

import fiddlerConfiger

from langchain_core.messages.utils import trim_messages

from langchain.messages import SystemMessage,HumanMessage,AIMessage

# Sample conversation history used to demonstrate trim_messages.
# NOTE: the "\]" list terminator and doubled "\\n" escapes in the original
# post were markdown paste artifacts, not valid Python — fixed here.
messages = [
    SystemMessage("you're a good assistant, you always respond with a joke."),
    HumanMessage("i wonder why it's called langchain"),
    AIMessage(
        'Well, I guess they thought "WordRope" and "SentenceString" just '
        "didn't have the same ring to it!"
    ),
    HumanMessage("and who is harrison chasing anyways"),
    AIMessage(
        "Hmmm let me think.\n\nWhy, he's probably chasing after the last "
        "cup of coffee in the office!"
    ),
    HumanMessage("what's my name"),
]

# Demo: trim to at most 3 "tokens", where token_counter=len counts whole
# messages — keeps the system message plus the most recent human-led turns.
trimmed_preview = trim_messages(
    messages,
    max_tokens=3,
    token_counter=len,
    strategy="last",
    start_on="human",
    include_system=True,
)
print(trimmed_preview)

@before_model
def __trim_messages(state: AgentState, runtime: Runtime) -> dict[str, Any] | None:
    """Keep only the last few messages so the prompt fits the context window.

    Why the original version "didn't work": returning ``{"messages": msg}``
    from a ``before_model`` hook goes through LangGraph's ``add_messages``
    reducer, which APPENDS/merges by message id — it does not replace the
    stored history, so the full conversation was still sent to the model.
    To actually replace the history, prepend a ``RemoveMessage`` carrying
    the ``REMOVE_ALL_MESSAGES`` sentinel id before the trimmed messages.

    Returns:
        A state update that clears the stored message history and replaces
        it with the trimmed window.
    """
    trimmed = trim_messages(
        state["messages"],
        max_tokens=3,           # with token_counter=len, "tokens" are whole messages
        token_counter=len,      # count messages, not real tokens
        strategy="last",        # keep the most recent messages
        start_on="human",       # trimmed window must start on a HumanMessage
        include_system=True,    # always retain the leading SystemMessage
    )
    return {
        # RemoveMessage(REMOVE_ALL_MESSAGES) wipes the checkpointed history
        # first, so `trimmed` becomes the entire message list.
        "messages": [RemoveMessage(id=REMOVE_ALL_MESSAGES)] + trimmed,
    }

# Build the agent with the trimming middleware attached.
# NOTE: the backslash-escaped "\[\__trim_messages\]" in the post was a
# markdown paste artifact; the middleware list takes the plain function.
agent = create_agent(
    model,
    middleware=[__trim_messages],
    checkpointer=InMemorySaver(),
)

# Straight ASCII quotes — the curly "smart quotes" in the original post are
# a copy/paste artifact and raise a SyntaxError in Python.
config: RunnableConfig = {"configurable": {"thread_id": "1765"}}

final_response = agent.invoke({"messages": messages}, config)
final_response["messages"][-1].pretty_print()

Hi @hopegood

could you format your post again, please? Right now I’m having a hard time trying to read and understand it :cry:

And it would be helpful if you shared more code on how the graph is built :slight_smile:

Sorry, the formatting was incorrect when I copied the code. I have now resubmitted the code.

1 Like

Alternatively, in a `@wrap_model_call` middleware you can trim the request messages just before the model call (note: straight quotes, and no stray trailing backtick):

```python
request.messages = trim_messages(
    request.messages,
    max_tokens=3,
    token_counter=len,
    strategy="last",
    start_on="human",
    include_system=True,
)
return handler(request)
```

1 Like