Handling Direct Tool Responses and Output Extraction with MCP Tools in LangGraph

Hey all!
I have a question regarding the use of LangGraph with MCP tools.

Previously, when using standard LangGraph tools, I was able to return a tool’s response directly to the user by using the return_direct=True flag, effectively bypassing the agent step. This approach worked well, especially when handling large tool outputs.

After migrating to MCP tools, I’ve noticed that this behavior is no longer available—the execution flow always returns to the agent after a tool call.
Is there a recommended or supported approach in LangGraph/MCP to short-circuit the agent and send the tool output directly to the user?

Additionally, is there a way to capture what an MCP tool returns and reliably extract that information from the graph state or message history?

Any guidance, best practices, or suggested patterns for handling this scenario would be greatly appreciated.
Thank you!

hi @Bitcot_Kaushal

You could add a conditional edge after your tool node that ends if the last tool response is from a “terminal” tool:

from langchain_core.messages import ToolMessage
from langgraph.graph import END

TERMINAL_TOOLS = {"big_export", "download_report"}

def route_after_tools(state):
    # Look at trailing ToolMessage(s) produced by the tool node
    for m in reversed(state["messages"]):
        if not isinstance(m, ToolMessage):
            break
        if m.name in TERMINAL_TOOLS:
            return END
    return "agent"  # or wherever your loop continues

Regarding extracting from message history

Tool results live in the graph state as ToolMessages. Scan backward and collect the trailing tool messages (handles parallel tool calls too):

from langchain_core.messages import ToolMessage

def last_tool_messages(state):
    out = []
    for m in reversed(state["messages"]):
        if isinstance(m, ToolMessage):
            out.append(m)
        else:
            break
    return list(reversed(out))

For MCP structured output: use ToolMessage.artifact


btw are you using create_agent or custom graph?

Thanks @pawel-twardziak for your suggestion. BTW I am using create_agent.

hi @Bitcot_Kaushal

create_agent already supports the “stop after tool” behavior, but it only triggers when the tool objects passed into create_agent have return_direct=True.

Ensure the MCP tools you pass into create_agent are LangChain BaseTools with return_direct=True (set it if the object allows it, or wrap the MCP tool with a local @tool(return_direct=True) delegator).

Gotcha: if the model makes multiple tool calls in one step and only some are return-direct, create_agent will keep looping (it exits only when the executed calls in that step are all return-direct).

Extracting MCP tool output with create_agent - nothing changes here: read the resulting ToolMessage(s) from state["messages"].

Does it make sense for you, @Bitcot_Kaushal ?

But I am not sure wrapping an MCP tool inside a local tool is the right approach.

Why? What would you consider “right approach”?

What I meant is: if I have to wrap an MCP tool inside a local tool anyway, then what’s the real benefit of using MCP? At that point, I could just put the same logic directly into the local tool itself.

You asked for @tool(return_direct=True) - that is the benefit. It's just a thin wrapper: the tool logic still lives on the MCP server, and only the return_direct flag is set locally.

here is the example with langchain-mcp-adapters:

"""
Demo: MCP tools + langchain-mcp-adapters + create_agent + return_direct + PostgresSaver

What this script shows:
- How to expose local Python functions as MCP tools (via FastMCP) over stdio.
- How to load those MCP tools into LangChain/LangGraph using the official adapter:
  https://github.com/langchain-ai/langchain-mcp-adapters
- How to *modify the loaded MCP tools* to behave like return-direct tools by setting
  `tool.return_direct = True` (LangChain uses `return_direct`, not `return_directly`).
- How to run `create_agent` with an OpenAI model and a PostgreSQL checkpointer
  (AsyncPostgresSaver) so runs are persisted by `thread_id`.

Prereqs (typical):
  pip install -U python-dotenv mcp langchain-mcp-adapters langgraph-checkpoint-postgres "langchain[openai]"

Environment (.env):
  OPENAI_API_KEY=...
  POSTGRES_URI=postgresql://postgres:postgres@localhost:5432/postgres?sslmode=disable

Run:
  python src/mcp_tools_return_direct_create_agent_openai_postgres_demo.py

Run only the MCP server (stdio):
  python src/mcp_tools_return_direct_create_agent_openai_postgres_demo.py --mcp-server
"""

from __future__ import annotations

import argparse
import asyncio
import os
import sys
import uuid
from typing import Sequence

from dotenv import load_dotenv


def _parse_args() -> argparse.Namespace:
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "--mcp-server",
        action="store_true",
        help="Run the local FastMCP server over stdio (used by the client demo).",
    )
    parser.add_argument(
        "--rows",
        type=int,
        default=50,
        help="How many rows to request from the 'big_report' tool in the client demo.",
    )
    return parser.parse_args()


def _set_return_direct(tools: Sequence[object], tool_names: set[str]) -> list[object]:
    """
    Mutate/copy tools to set `return_direct=True` for selected tool names.

    `create_agent` checks `tool.return_direct` when deciding whether it can route
    tool execution directly to END (skipping the next model step).
    """

    updated: list[object] = []
    for t in tools:
        name = getattr(t, "name", None)
        if isinstance(name, str) and name in tool_names:
            # Prefer in-place assignment (often works for BaseTool).
            try:
                setattr(t, "return_direct", True)
                updated.append(t)
                continue
            except Exception:
                pass

            # Fallback for pydantic-ish tools that are “frozen”:
            # BaseTool is typically a pydantic model and supports model_copy(update=...).
            model_copy = getattr(t, "model_copy", None)
            if callable(model_copy):
                updated.append(model_copy(update={"return_direct": True}))
                continue

            # Last resort: keep it unchanged (and rely on routing/interrupts instead).
            updated.append(t)
        else:
            updated.append(t)
    return updated


async def run_demo_client(*, rows: int) -> None:
    """
    Client-side demo:
    - starts a local MCP server (this same file) via stdio
    - loads MCP tools via MultiServerMCPClient/get_tools()
    - sets return_direct=True on a chosen tool
    - runs create_agent with AsyncPostgresSaver
    """

    load_dotenv()

    POSTGRES_URI = os.environ.get("POSTGRES_URI")
    if not POSTGRES_URI:
        raise RuntimeError(
            "Missing POSTGRES_URI. Add it to your .env (see file header)."
        )

    # --- MCP tools via official adapter (Python) ------------------------------
    # Source: https://github.com/langchain-ai/langchain-mcp-adapters (README)
    from langchain_mcp_adapters.client import MultiServerMCPClient

    servers: dict[str, dict] = {
        "demo": {
            "transport": "stdio",
            "command": sys.executable,
            "args": [os.path.abspath(__file__), "--mcp-server"],
        }
    }

    client = MultiServerMCPClient(servers)
    tools = await client.get_tools()

    print("Loaded MCP tools:")
    for t in tools:
        print(f"- {getattr(t, 'name', '<unknown>')} (return_direct={getattr(t, 'return_direct', False)})")

    # --- Modify MCP tools: set return_direct=True ----------------------------
    # IMPORTANT: LangChain's tool flag is `return_direct` (not return_directly).
    # With create_agent, tools marked return_direct can short-circuit to END.
    tools = _set_return_direct(tools, {"big_report"})

    print("\nAfter setting return_direct on selected tools:")
    for t in tools:
        if getattr(t, "name", None) == "big_report":
            print(f"- {t.name}: return_direct={getattr(t, 'return_direct', False)}")

    # --- Checkpointer (Postgres) --------------------------------------------
    # Docs: https://docs.langchain.com/oss/python/langgraph/add-memory
    from langgraph.checkpoint.postgres.aio import AsyncPostgresSaver

    async with AsyncPostgresSaver.from_conn_string(POSTGRES_URI) as checkpointer:
        # Create the schema on first run (safe to call repeatedly in most setups).
        await checkpointer.setup()

        # --- Agent -----------------------------------------------------------
        from langchain.agents import create_agent

        system_prompt = (
            "You are a helpful assistant.\n"
            "When the user asks for a report, call the 'big_report' tool.\n"
            "The 'big_report' tool is configured to return_direct, so after the tool executes, "
            "the run should end immediately (no extra commentary).\n"
        )

        agent = create_agent(
            model="openai:gpt-4.1",
            tools=tools,
            system_prompt=system_prompt,
            checkpointer=checkpointer,
        )

        thread_id = str(uuid.uuid4())
        config = {"configurable": {"thread_id": thread_id}}

        user_msg = (
            f"Generate a report with {rows} rows. "
            "Return the report contents as-is."
        )

        result = await agent.ainvoke(
            {"messages": [{"role": "user", "content": user_msg}]},
            config=config,
        )

    # --- What you get back ---------------------------------------------------
    # create_agent returns a dict with a 'messages' list.
    messages = result["messages"]
    print("\nFinal messages (last 3):")
    for m in messages[-3:]:
        # pretty_print() exists on LangChain messages; fall back to repr.
        pretty = getattr(m, "pretty_print", None)
        if callable(pretty):
            pretty()
        else:
            print(repr(m))

    # If return_direct worked, the last message is commonly a ToolMessage from big_report.
    try:
        from langchain_core.messages import ToolMessage
    except Exception:
        ToolMessage = None  # type: ignore

    if ToolMessage is not None:
        tool_messages = [m for m in messages if isinstance(m, ToolMessage)]
        if tool_messages:
            last_tool = tool_messages[-1]
            print("\nLast ToolMessage summary:")
            print(f"- name: {last_tool.name}")
            print(f"- content length: {len(str(last_tool.content))}")
            if getattr(last_tool, "artifact", None):
                print(f"- artifact keys: {list(last_tool.artifact.keys())}")


def run_demo_mcp_server() -> None:
    """
    A minimal MCP server exposing two tools:
    - big_report: returns a “large-ish” text payload
    - add: simple arithmetic tool
    """

    from mcp.server.fastmcp import FastMCP

    mcp = FastMCP("DemoMCP")

    @mcp.tool()
    def add(a: int, b: int) -> int:
        """Add two integers."""
        return a + b

    @mcp.tool()
    def big_report(rows: int = 50) -> str:
        """Return a multi-line report. This simulates a large tool output."""
        rows = max(1, min(rows, 500))
        lines = ["id,value"]
        for i in range(1, rows + 1):
            lines.append(f"{i},{i*i}")
        return "\n".join(lines)

    mcp.run(transport="stdio")


def main() -> None:
    args = _parse_args()
    # IMPORTANT:
    # `FastMCP.run()` spins up its own event loop (via AnyIO). If we call it from inside
    # `asyncio.run(...)`, we get:
    #   RuntimeError: Already running asyncio in this thread
    # So we keep the server path fully synchronous.
    if args.mcp_server:
        run_demo_mcp_server()
        return

    asyncio.run(run_demo_client(rows=args.rows))


if __name__ == "__main__":
    main()