I have the following example that uses SummarizationMiddleware with two different models (both served locally through the same provider), and with a custom summary_prompt for one of them.
Qwen3 Coder 30b works well: I get the summary I expect (see the first output). However, I’m confused about why three messages plus the summary are kept when I’ve set messages_to_keep=2.
gpt-oss-20b produces a strange summary (see the second output). It also keeps three messages plus the summary, so I assume this is the intended behavior and I just don’t understand why it works that way.
Qwen3 Coder 30b with summary_prompt="Summarize the messages" results in the model not knowing which messages to summarize (see the third output). Do I need to pass the messages into the prompt explicitly?
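My working guess for the third case, which I have not confirmed against the middleware source, is that the summary prompt is treated as a template and the messages to summarize are substituted into a `{messages}` placeholder. If so, a prompt without that placeholder would reach the model with no conversation attached. The sketch below uses plain `str.format` to illustrate the idea; `render_summary_prompt` is a hypothetical helper, not a LangChain function, and the placeholder name is an assumption.

```python
def render_summary_prompt(template: str, messages: list[str]) -> str:
    """Hypothetical stand-in for how a summary prompt might be filled in.

    Assumption: the template is formatted with a `messages` keyword,
    so only a `{messages}` placeholder makes the conversation visible.
    """
    return template.format(messages="\n".join(messages))


history = ["human: Tell me about Langchain"]

# No placeholder: the formatted prompt contains none of the conversation.
bare = render_summary_prompt("Summarize the messages", history)

# With a placeholder: the conversation is embedded in the prompt text.
filled = render_summary_prompt("Summarize the messages:\n{messages}", history)

print(repr(bare))
print(repr(filled))
```

If that guess holds, a prompt such as "Summarize the messages:\n{messages}" would be the first thing to try in place of the bare instruction.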
import os

from dotenv import load_dotenv
from langchain.chat_models import init_chat_model
from langchain.agents import create_agent
from langchain.agents.middleware import SummarizationMiddleware

load_dotenv()

model = init_chat_model(
    #model='openai/gpt-oss-20b', # see second output
    model='qwen/qwen3-coder-30b',
    model_provider='openai',
    base_url=os.getenv("OPENAI_API_BASE_URL")
)

agent = create_agent(
    model=model,
    tools=[],
    middleware=[
        SummarizationMiddleware(
            model=model,
            max_tokens_before_summary=200,
            messages_to_keep=2,
            #summary_prompt="Summarize the messages", # see third output
        ),
    ]
)

messages = agent.invoke(
    {"messages": 'Tell me about Langchain'}
)
messages['messages'].extend(['Now tell me about React agents'])
messages = agent.invoke(messages)

for message in messages['messages']:
    print(message.content_blocks)

### OUTPUT for Qwen3 Coder
# [{'type': 'text', 'text': 'Here is a summary of the conversation to date:\n\nLangchain is a framework...'}] # This is a correct summary
# [{'type': 'text', 'text': 'LangChain is a popular Python library...'}] # Should this message be kept?
# [{'type': 'text', 'text': 'Now tell me about React agents'}] # I expect this to be here
# [{'type': 'text', 'text': 'React agents are...'}] # Expected answer
### OUTPUT for gpt-oss
# [{'type': 'text', 'text': "Here is a summary of the conversation to date:\n\nHumanMessage(content='Tell me about Langchain', additional_kwargs={}, response_metadata={}, id='46a657c8-3c2d-4fd1-aee0-c49a8fc4262a')"}] # Unexpected summary
# [{'type': 'text', 'text': '**LangChain** is an open‑source framework...'}] # Why kept?
# [{'type': 'text', 'text': 'Now tell me about React agents'}] # Expected
# [{'type': 'text', 'text': '## “React Agents” – Building LLM‑Powered Agent...'}] # Expected
### OUTPUT for Qwen3 Coder with summary_prompt
# [{'type': 'text', 'text': "Here is a summary of the conversation to date:\n\nI don't see any messages provided for me to summarize..."}] # Do we need to pass messages properly in prompt?
# [{'type': 'text', 'text': "LangChain is a popular Python library..."}] # Why kept?
# [{'type': 'text', 'text': 'Now tell me about React agents'}] # Expected
# [{'type': 'text', 'text': 'React agents are...'}] # Expected
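
One reading that would explain the message count in all three outputs is that summarization runs before the model is called on the second invoke, when the history holds only three messages, and the new answer is appended afterwards. The simulation below is plain Python with no LangChain calls; `summarize_then_respond` and its message strings are hypothetical, and the real middleware triggers on token counts and may choose its cutoff differently.

```python
def summarize_then_respond(history: list[str], keep: int, reply: str) -> list[str]:
    """Hypothetical model of one agent turn with summarization.

    Assumption: older messages are summarized *before* the model call,
    and the model's reply is appended after trimming has happened.
    """
    if len(history) > keep:
        older, recent = history[:-keep], history[-keep:]
        history = [f"summary of {len(older)} message(s)"] + recent
    return history + [reply]


# State at the start of the second invoke: first question, first answer,
# and the new question. The second answer does not exist yet.
before = [
    "human: Tell me about Langchain",
    "ai: LangChain is ...",
    "human: Now tell me about React agents",
]

after = summarize_then_respond(before, keep=2, reply="ai: React agents are ...")
for message in after:
    print(message)
```

Under this model only the first human message gets summarized, which would also fit the gpt-oss summary containing just that one HumanMessage, and the final state is the summary, the two kept messages, and the fresh answer.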