Questions about SummarizationMiddleware outputs for different models and using summary_prompt

I have the following example that uses SummarizationMiddleware with two different models (both served locally through the same provider), and, for one of the models, a custom summary_prompt.

Using Qwen3 Coder 30b seems to work great (see the first output) and I get the summary as expected. However, I'm confused about why three messages plus the summary are kept when I've set messages_to_keep=2.

Using gpt-oss-20b gives a strange summary (see the second output): it just echoes the repr of the first HumanMessage instead of summarizing the conversation. It also keeps three messages plus the summary, so I'm starting to think that part is how it's supposed to work and I just don't understand why.

Using Qwen3 Coder 30b with summary_prompt="Summarize the messages" results in the summarization model not knowing which messages to summarize (see the third output). Do I need to pass the conversation content into the prompt myself? I've sketched my guess after the third output.

import os
from dotenv import load_dotenv
from langchain.chat_models import init_chat_model
from langchain.agents import create_agent
from langchain.agents.middleware import SummarizationMiddleware

load_dotenv()

model = init_chat_model(
    #model='openai/gpt-oss-20b', # see second output
    model='qwen/qwen3-coder-30b',
    model_provider='openai',
    base_url=os.getenv("OPENAI_API_BASE_URL")
)

agent = create_agent(
    model=model,
    tools=[],
    middleware=[
        SummarizationMiddleware(
            model=model,
            max_tokens_before_summary=200,  # summarize once the history exceeds ~200 tokens
            messages_to_keep=2,             # I expected only 2 messages to survive alongside the summary
            #summary_prompt="Summarize the messages", # see third output
        ),
    ]
)

# First turn.
messages = agent.invoke(
    {"messages": 'Tell me about Langchain'}
)
# Append the follow-up as a plain string and run a second turn, which pushes
# the history past max_tokens_before_summary and triggers the middleware.
messages['messages'].append('Now tell me about React agents')
messages = agent.invoke(messages)
for message in messages['messages']:
    print(message.content_blocks)
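
(Side note: as far as I can tell the raw string gets coerced to a HumanMessage, since the outputs below show it as a normal user turn. An equivalent variant with an explicit message object, in case the coercion matters:)

from langchain_core.messages import HumanMessage

# Same second turn, but appending an explicit message object instead of a raw string.
messages['messages'].append(HumanMessage('Now tell me about React agents'))
messages = agent.invoke(messages)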


### OUTPUT for Qwen3 Coder
#[{'type': 'text', 'text': 'Here is a summary of the conversation to date:\n\nLangchain is a framework...'}] # This is a correct summary
#[{'type': 'text', 'text': 'LangChain is a popular Python library...'}] # Should this message be kept?
#[{'type': 'text', 'text': 'Now tell me about React agents'}] # I expect this to be here
#[{'type': 'text', 'text': 'React agents are...'}] # Expected answer


### OUTPUT for gpt-oss
# [{'type': 'text', 'text': "Here is a summary of the conversation to date:\n\nHumanMessage(content='Tell me about Langchain', additional_kwargs={}, response_metadata={}, id='46a657c8-3c2d-4fd1-aee0-c49a8fc4262a')"}] # Unexpected summary
# [{'type': 'text', 'text': '**LangChain** is an open‑source framework...'}] # Why kept?
# [{'type': 'text', 'text': 'Now tell me about React agents'}] # Expected
# [{'type': 'text', 'text': '## “React Agents” – Building LLM‑Powered Agent...'}] # Expected


### OUTPUT for Qwen3 Coder with summary_prompt
#[{'type': 'text', 'text': "Here is a summary of the conversation to date:\n\nI don't see any messages provided for me to summarize..."}] # Do we need to pass messages properly in prompt?
#[{'type': 'text', 'text': "LangChain is a popular Python library..."}] # Why kept?
#[{'type': 'text', 'text': 'Now tell me about React agents'}] # Expected
#[{'type': 'text', 'text': 'React agents are...'}] # Expected
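
Regarding the third output: my guess (unverified, just from reading the behavior) is that summary_prompt is treated as a template that the middleware fills with the conversation it is about to compress, so a custom prompt probably needs a {messages} placeholder. Something like the sketch below; is that the intended usage?

SummarizationMiddleware(
    model=model,
    max_tokens_before_summary=200,
    messages_to_keep=2,
    # Assumption: the middleware interpolates the conversation it is about to
    # compress into a {messages} placeholder. Without it, the summarizer never
    # sees the messages, which would explain the "I don't see any messages"
    # reply above.
    summary_prompt=(
        "Summarize the conversation below, keeping key facts and decisions:\n\n"
        "{messages}"
    ),
)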

I remember this from the Community Feedback Session last night! Thank you for posting it! I'll flag it for our OSS team.


Thanks for your help! 🙂
