Nothing is wrong. In LangChain/LangGraph, a message's content can be either a plain string or a list of structured "content blocks" (dicts). When your model or tooling returns multimodal or tool-call data, content will be a list of typed blocks (e.g., {"type": "text", "text": ...}, {"type": "tool_call", ...}) rather than a single string. To consistently get just the human-readable text, take the last assistant message (AIMessage) and read its .text property (a .text() method on older langchain-core releases), which safely extracts the textual parts.
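For instance, after a tool-calling turn the assistant's content might look roughly like this (an illustrative sketch only; the exact keys vary by provider and LangChain version):

# Hypothetical content list -- block shapes vary by provider/version
content = [
    {"type": "text", "text": "Let me check the weather for you."},
    {"type": "tool_call", "name": "get_weather", "args": {"city": "Paris"}, "id": "call_123"},
]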
Why you’re seeing dicts
Message content is intentionally polymorphic. The content field is defined as str | list[str | dict] in LangChain core. This lets messages carry multimodal data and tool calls in a standardized way.
Providers and tools add structured blocks. Tool calls, reasoning blocks, images/files, etc. are represented as typed dicts inside the list of content blocks.
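If you ever need to flatten such a list yourself (say, on a version where .text is not available), a minimal helper could look like the sketch below; extract_text is a hypothetical name, not a library function:

def extract_text(content):
    # content is str | list[str | dict], per the definition above
    if isinstance(content, str):
        return content
    parts = []
    for block in content:
        if isinstance(block, str):
            parts.append(block)
        elif isinstance(block, dict) and block.get("type") == "text":
            parts.append(block.get("text", ""))
    return "".join(parts)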
How to reliably get the final text
Filter for the last assistant message (AIMessage) and read its .text:
from langchain_core.messages import AIMessage

# result is the output of graph.invoke(...)
messages = result["messages"]

# Find the last assistant message (the graph may otherwise end on a ToolMessage)
last_ai = next((m for m in reversed(messages) if isinstance(m, AIMessage)), None)

# Safely extract the human-readable text
final_text = last_ai.text if last_ai is not None else ""
Notes
If your graph finishes on a tool node, the last message may be a ToolMessage. In that case you still want the last AIMessage before it.
If you need OpenAI-like formatting, you can have LangGraph normalize messages by configuring your state reducer with add_messages(format="langchain-openai"); this keeps tool calls and multimodal blocks standardized when you inspect them (see the sketch below).
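A minimal sketch of that reducer configuration (the State class and field names here are placeholders; add_messages(format="langchain-openai") is the LangGraph option named above):

from typing import Annotated
from typing_extensions import TypedDict
from langgraph.graph.message import add_messages

class State(TypedDict):
    # Appends incoming messages and normalizes their content blocks
    # to the "langchain-openai" format as they enter the state
    messages: Annotated[list, add_messages(format="langchain-openai")]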