The docs describe how to trim messages for short-term memory so you don't exceed the model's token window. The problem is that the documentation is outdated: the `preModelHook` option is no longer available. My question is: how can I do the same with the current version of LangChain?
Hi @Younes,
Have you tried middleware? See Overview - Docs by LangChain.
Use agent middleware (the beforeModel hook) to trim or summarize messages before the model call, or use the built-in summarizationMiddleware.
The built-in one:
```ts
import { createAgent, summarizationMiddleware } from "langchain";
import { ChatOpenAI } from "@langchain/openai";

const agent = createAgent({
  model: new ChatOpenAI({ model: "gpt-4o" }),
  tools: [],
  middleware: [
    summarizationMiddleware({
      model: new ChatOpenAI({ model: "gpt-4o" }),
      trigger: { tokens: 4000, messages: 10 }, // when to summarize
      keep: { messages: 20 }, // how much to retain afterward
    }),
  ],
});
```
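For intuition, here is a rough sketch of what the two options mean: `trigger` decides when summarization kicks in, and `keep` decides how many recent messages survive verbatim. This is my own simplification (chars/4 as an approximate token count), not the middleware's actual implementation, which also calls the model to produce the summary message:

```typescript
type Msg = { content: string };

// Rough model of the trigger/keep bookkeeping only.
function shouldSummarize(
  msgs: Msg[],
  trigger: { tokens: number; messages: number }
): boolean {
  // chars/4 as a crude token estimate
  const approxTokens = msgs.reduce((n, m) => n + m.content.length / 4, 0);
  return approxTokens >= trigger.tokens || msgs.length >= trigger.messages;
}

function split(msgs: Msg[], keep: { messages: number }) {
  return {
    toSummarize: msgs.slice(0, -keep.messages), // folded into a summary message
    kept: msgs.slice(-keep.messages),           // retained verbatim
  };
}

const history: Msg[] = Array.from({ length: 12 }, (_, i) => ({ content: `msg ${i}` }));
console.log(shouldSummarize(history, { tokens: 4000, messages: 10 })); // true (12 >= 10 messages)
console.log(split(history, { messages: 20 }).kept.length); // 12: fewer than 20, all kept
```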
Or write your own middleware:
```ts
import { createAgent, createMiddleware, trimMessages } from "langchain";
import { ChatOpenAI } from "@langchain/openai";
import { RemoveMessage } from "@langchain/core/messages";
import { REMOVE_ALL_MESSAGES } from "@langchain/langgraph";

const trimHistory = createMiddleware({
  name: "TrimHistory",
  beforeModel: async (state) => {
    const trimmed = await trimMessages(state.messages, {
      strategy: "last",
      maxTokens: 4000,
      allowPartial: true,
      includeSystem: true,
      // Simple approximate counter; replace with your tokenizer if needed
      tokenCounter: async (msgs) =>
        msgs.reduce(
          (sum, m) => sum + (typeof m.content === "string" ? m.content.length / 4 : 0),
          0
        ),
    });
    // The messages channel is merged with an appending reducer, so returning
    // the trimmed list alone would be *added* to the existing history.
    // Prepend a RemoveMessage with REMOVE_ALL_MESSAGES to replace it instead.
    return {
      messages: [new RemoveMessage({ id: REMOVE_ALL_MESSAGES }), ...trimmed],
    };
  },
});

const agent = createAgent({
  model: new ChatOpenAI({ model: "gpt-4o" }),
  tools: [],
  middleware: [trimHistory],
});
```
I am trying to create a middleware that trims the messages, but it doesn’t seem to be working. With my code snippet below I would expect the agent to not have any memory, since I am removing all previous messages. But it still has memory.
```ts
trimHistory = createMiddleware({
  name: 'TrimHistory',
  beforeModel: async (state) => ({
    messages: [],
  }),
});
```
```ts
private readonly agent = createAgent({
  model: new ChatOpenAI({ model: 'gpt-4o-mini' }),
  tools: [retrieveDocuments],
  contextSchema: z.object({ organisationId: z.number() }),
  checkpointer: this.checkpointer,
  middleware: [this.trimHistory],
});
```
The checkpointer (checkpointer: this.checkpointer) is the memory.
If the middleware does not work, there must be something wrong with its implementation.
This is the implementation:
```ts
this.checkpointer = PostgresSaver.fromConnString(
  process.env.DATABASE_URL as string,
);
```
I meant the middleware implementation.
Could you share your entire code so that I can assess and test whether it is an implementation issue?
I’ll share all the parts that belong to the agent:
```ts
const retrieveDocuments = tool(
  async ({ query }, config) => {
    const { organisationId } = config.context;
    const vectorStore = await PGVectorStore.initialize(embeddingsModel, {
      tableName: 'embeddings',
      postgresConnectionOptions: {
        connectionString: process.env.DATABASE_URL,
      },
    });
    const limit = 20;
    const retrievedDocs = await vectorStore.similaritySearch(query, limit, {
      organisationId,
    });
    const serialized = retrievedDocs
      .map(
        (doc) =>
          `Source: ${doc.metadata.source}\n${doc.metadata?.ogImage ? `ogImage: ${doc.metadata.ogImage}\n` : ''}Content: ${doc.pageContent}`,
      )
      .join('\n');
    await vectorStore.end();
    return [serialized, retrievedDocs];
  },
  {
    name: 'retrieve',
    description: 'Retrieve information related to a query.',
    schema: z.object({ query: z.string() }),
    responseFormat: 'content_and_artifact',
  },
);
```
```ts
const checkpointer = PostgresSaver.fromConnString(
  process.env.DATABASE_URL as string,
);

const agent = createAgent({
  model: new ChatOpenAI({ model: 'gpt-4o-mini' }),
  tools: [retrieveDocuments],
  contextSchema: z.object({ organisationId: z.number() }),
  responseFormat: z.object({
    text: z.string().describe('A markdown response to the user query'),
    sources: z
      .array(z.object({ url: z.string(), ogImage: z.string() }))
      .describe('the sources used to come to an answer'),
  }),
  checkpointer: checkpointer,
});
```
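A note on the `messages: []` experiment earlier in the thread: in LangGraph-backed agents the messages channel is merged through an appending reducer, so a middleware that returns an empty array adds nothing rather than clearing anything. Deletion has to be expressed explicitly; in the real API that means returning a RemoveMessage whose id is the REMOVE_ALL_MESSAGES sentinel from @langchain/langgraph (treat that export as my assumption about the current version). The sketch below models just that reducer behavior in plain TypeScript, with a stand-in sentinel:

```typescript
type Msg = { id: string; content: string };

// Stand-in sentinel, analogous to LangGraph's REMOVE_ALL_MESSAGES.
const REMOVE_ALL = "__remove_all__";

// Simplified model of the messages reducer: updates are appended, not
// substituted, unless a remove-all marker is present. (The real reducer
// also merges by id and handles single-message removals.)
function reduceMessages(existing: Msg[], update: Msg[]): Msg[] {
  const marker = update.findIndex((m) => m.id === REMOVE_ALL);
  if (marker !== -1) {
    // Prior history is dropped; only messages after the marker remain.
    return update.slice(marker + 1);
  }
  return [...existing, ...update];
}

const history: Msg[] = [
  { id: "1", content: "hi" },
  { id: "2", content: "hello!" },
];

// Returning [] from beforeModel: the reducer appends nothing; history survives.
console.log(reduceMessages(history, []).length); // 2

// Returning a remove-all marker first actually clears the channel.
console.log(reduceMessages(history, [{ id: REMOVE_ALL, content: "" }]).length); // 0
```

So to wipe or trim history from beforeModel, return something like [new RemoveMessage({ id: REMOVE_ALL_MESSAGES }), ...messagesToKeep] rather than a bare array.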