I want to use ChatAnthropic to stream JSON output, but it doesn't seem to be supported. Here is my sample code: the JSON is not streamed, it arrives all at once at the end.
import { ChatAnthropic } from "@langchain/anthropic";
import { z } from "zod";

const model = new ChatAnthropic({
  model: "claude-sonnet-4-20250514",
  anthropicApiKey: process.env.ANTHROPIC_API_KEY,
  anthropicApiUrl: process.env.ANTHROPIC_BASE_URL,
});

async function main() {
  const response = await model
    .withStructuredOutput(
      z.object({
        title: z.string(),
        content: z.string(),
      })
    )
    .stream("Hello, how are you?");
  for await (const chunk of response) {
    console.log(chunk);
  }
}

main();
Hi @daniel-style
AFAIK, you can stream text deltas, but you can’t stream a validated JSON object produced by withStructuredOutput() token-by-token. The structured result is emitted only after the model finishes and the parser validates the full response. That’s why the JSON appears last in your loop.
What works today:
- Stream text/events, then parse when complete (or try tolerant incremental parsing yourself).
- If you need a single validated object, call invoke() (non-streaming) on the structured runnable.
Examples
- Stream text deltas with ChatAnthropic
import { ChatAnthropic } from "@langchain/anthropic";

const model = new ChatAnthropic({
  model: "claude-sonnet-4-20250514",
  anthropicApiKey: process.env.ANTHROPIC_API_KEY!,
  anthropicApiUrl: process.env.ANTHROPIC_BASE_URL,
});

const stream = await model.stream("Hello, how are you?");
for await (const chunk of stream) {
  // Anthropic chunks may carry content as a string or an array of blocks.
  const content = (chunk as any).content;
  const text = Array.isArray(content)
    ? content
        .filter((c: any) => c?.type === "text")
        .map((c: any) => c.text ?? "")
        .join("")
    : String(content ?? "");
  process.stdout.write(text);
}
- Get a structured object (non-streaming)
import { z } from "zod";

const schema = z.object({
  title: z.string(),
  content: z.string(),
});

const result = await model
  .withStructuredOutput(schema)
  .invoke("Hello, how are you?");
console.log(result); // { title: string, content: string }
- Stream events and buffer yourself (optional)
If you must approximate JSON streaming, stream tokens/events and build up a buffer, then attempt partial parses (e.g., with a tolerant parser) until the final validated object arrives at the end:
import { jsonrepair } from "jsonrepair"; // optional, helps with partial JSON

const runnable = model; // or a prompt → model chain
const events = await runnable.streamEvents(
  "Emit a JSON object with title and content.",
  { version: "v2" }
);

let buf = "";
for await (const ev of events) {
  if (ev.event === "on_chat_model_stream") {
    // Content may be a string or an array of blocks, depending on the provider.
    const content = (ev as any)?.data?.chunk?.content;
    const text = Array.isArray(content)
      ? content
          .filter((c: any) => c?.type === "text")
          .map((c: any) => c.text ?? "")
          .join("")
      : String(content ?? "");
    buf += text;
    // Attempt a partial parse (best-effort). The final validated object still comes at the end.
    try {
      const repaired = jsonrepair(buf);
      JSON.parse(repaired); // if this succeeds, you can surface partials
    } catch {
      // buffer is not yet repairable; keep accumulating
    }
  }
}
Notes
- withStructuredOutput() uses a parser that validates the full message; it does not yield partial objects.
- This behavior is model/provider-agnostic in LangChain today: you’ll see the structured value only after completion.
- If truly streaming structured data is a hard requirement, consider emitting line-delimited JSON fragments from the model and parsing line-by-line, or using a provider/mode with first-class JSON-mode streaming supported by your stack. Just be aware you’ll be trading strict validation for incremental UX.
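For what it's worth, the line-delimited approach can be sketched without a model in the loop. This is a minimal, self-contained example: the chunks array is a stand-in for an incoming token stream, and each completed line is parsed as soon as its newline arrives, even when a JSON fragment is split across chunks:

```typescript
// Sketch: parse line-delimited JSON (NDJSON) incrementally as chunks arrive.
// `chunks` is a hard-coded stand-in for a model's token stream.
const chunks = ['{"title":"Hi"}\n{"con', 'tent":"Hello"}\n'];

let buffer = "";
const objects: unknown[] = [];

for (const chunk of chunks) {
  buffer += chunk;
  let newlineIndex: number;
  // Every complete line is a complete JSON fragment we can parse immediately.
  while ((newlineIndex = buffer.indexOf("\n")) !== -1) {
    const line = buffer.slice(0, newlineIndex).trim();
    buffer = buffer.slice(newlineIndex + 1);
    if (line) objects.push(JSON.parse(line));
  }
}

console.log(objects); // [{ title: "Hi" }, { content: "Hello" }]
```

You get incremental objects, but each line must be a complete, self-contained JSON value; anything left in the buffer when the stream ends never parsed.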
Thanks! I think I need to figure out another way to handle this, maybe using streamObject from Vercel’s AI SDK.
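For anyone landing here later, a rough sketch of that alternative (assuming the `ai` and `@ai-sdk/anthropic` packages and an `ANTHROPIC_API_KEY` in the environment; not tested here):

```typescript
// Sketch: Vercel AI SDK's streamObject yields progressively more complete
// partial objects as the model emits JSON, which is the UX asked about above.
import { streamObject } from "ai";
import { anthropic } from "@ai-sdk/anthropic";
import { z } from "zod";

const { partialObjectStream } = streamObject({
  model: anthropic("claude-sonnet-4-20250514"),
  schema: z.object({ title: z.string(), content: z.string() }),
  prompt: "Hello, how are you?",
});

// Each chunk is a partial object, e.g. { title: "He" } → { title: "Hello", content: "..." }.
// Full schema validation only applies to the final object.
for await (const partialObject of partialObjectStream) {
  console.log(partialObject);
}
```

Note the same trade-off mentioned above: the partials are best-effort and unvalidated until the stream completes.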