Inter-agent communication using create_agent as a supervisor and as tools

Hi @razaullah, thanks for the code. I reproduced it locally, and the behavior you're seeing is expected: thread_id + checkpoint_ns only scopes state inside each agent graph; it doesn't automatically share memory across different agents. Because patient_agent, appt_agent, and book_appointment are separate graphs, the patient ID will not be shared automatically. You need to explicitly carry it across tools.

The simplest reliable fix is to store the patient ID in the supervisor tool layer, then inject it into later tool calls when missing.

Below are the changes and exactly where to add them, to make it work in your case:

  1. Add a small session store + ID extractor

Why: we need a place to remember the patient ID once it’s found so later tools can reuse it.
Where: same file as build_supervisor (top‑level).

import re

PATIENT_ID_RE = re.compile(r"\bID\b[^A-Za-z0-9]{0,5}([A-Za-z0-9_-]+)", re.IGNORECASE)
SESSION_PATIENT_IDS: dict[str, str] = {}

def extract_patient_id(text: str) -> str | None:
    match = PATIENT_ID_RE.search(text or "")
    return match.group(1) if match else None
  2. Store the patient ID inside your existing fetch_patient_information_tool

Why: this tool is where the ID first appears, so we capture it there and save into the session store.
Where: inside build_supervisor in fetch_patient_information_tool.

@tool
async def fetch_patient_information_tool(request: str, runtime: ToolRuntime) -> str:
    root_thread_id = runtime.config["configurable"]["thread_id"]

    result = await patient_agent.ainvoke(
        {"messages": [{"role": "user", "content": request}]},
        config=shared_config(runtime)
    )

    response = extract_last_ai_message(result)
    patient_id = extract_patient_id(response)
    if patient_id:
        SESSION_PATIENT_IDS[root_thread_id] = patient_id
        logger.info("Stored Patient ID %s for thread %s", patient_id, root_thread_id)

    return response
  3. Inject the stored ID before calling appointment tools

Why: when the user says “book an appointment” without repeating the ID, we attach the stored ID so appt_agent can proceed.
Where: in your appointment tool wrapper (the tool that calls appt_agent.ainvoke(…)). If your code currently uses appt_agent inside a tool, apply this there; do the same for your booking tool if it also needs the ID. Example:

@tool
async def manage_appointment_tool(request: str, runtime: ToolRuntime) -> str:
    root_thread_id = runtime.config["configurable"]["thread_id"]

    if not extract_patient_id(request):
        stored_id = SESSION_PATIENT_IDS.get(root_thread_id)
        if stored_id:
            request = f"{request} (Patient ID {stored_id})"
            logger.info("Injected Patient ID %s into appointment request", stored_id)

    result = await appt_agent.ainvoke(
        {"messages": [{"role": "user", "content": request}]},
        config=shared_config(runtime)
    )
    return extract_last_ai_message(result)
  4. Remove return_direct=True from each tool

Why: return_direct=True makes the agent return the tool's output to the user immediately, so the supervisor never gets a chance to read the result and chain into the next tool.

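For step 4 this is just a decorator change on each tool (a sketch, mirroring the fetch_patient_information_tool signature from your code):

```
# Before: the run ends as soon as this tool returns.
@tool(return_direct=True)
async def fetch_patient_information_tool(request: str, runtime: ToolRuntime) -> str:
    ...

# After: the supervisor sees the result and can decide to call the next tool.
@tool
async def fetch_patient_information_tool(request: str, runtime: ToolRuntime) -> str:
    ...
```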
So now:

  • Each sub‑agent is isolated; the checkpointer doesn’t merge their states.
  • This approach explicitly carries the patient ID across tools, so the appointment agent doesn’t re-ask for it.
  • It works regardless of whether you use in‑memory or DB-backed checkpointing.

If you want the same behavior in your app, the next step is to replace the in‑memory SESSION_PATIENT_IDS with your real persistence layer (DB/Redis) while keeping the same extract+inject flow.

Let me know if this fixes your issue. I've also put together a Python script locally, so if you run into further problems I can share it with you to inspect.

@simon.budziak Thank you for the suggestion, but I cannot use a manual method: my agent will be used for different purposes, and I may fetch information based on the patient ID, mobile number, or name and date of birth, which I cannot control. I used the following code to set checkpoint_ns for the sub-agents, and now they store the same thread_id and checkpoint_ns, but the supervisor still does not store checkpoint_ns. The code is as follows.

def extend_runtime_config(runtime: ToolRuntime, *, checkpoint_ns: str):
    base = runtime.config or {}
    base_conf = dict(base.get("configurable", {}) or {})
    base_conf["checkpoint_ns"] = checkpoint_ns  # force shared ns
    # keep existing thread_id/checkpoint_id/etc if present
    return {**base, "configurable": base_conf}
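To make the merge behavior concrete, here is a quick standalone check. A SimpleNamespace stands in for ToolRuntime, since only the .config attribute is read, and the type annotation is dropped for the same reason:

```python
from types import SimpleNamespace

def extend_runtime_config(runtime, *, checkpoint_ns: str):
    base = runtime.config or {}
    base_conf = dict(base.get("configurable", {}) or {})
    base_conf["checkpoint_ns"] = checkpoint_ns  # force shared ns
    # keep existing thread_id/checkpoint_id/etc if present
    return {**base, "configurable": base_conf}

# Stand-in for ToolRuntime: only .config is accessed by the function
runtime = SimpleNamespace(config={"configurable": {"thread_id": "t-1"}})
cfg = extend_runtime_config(runtime, checkpoint_ns="shared")
print(cfg)  # {'configurable': {'thread_id': 't-1', 'checkpoint_ns': 'shared'}}
```

The existing thread_id survives the merge and only checkpoint_ns is overwritten, which is what makes the sub-agents end up in the same namespace.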

And I used it in the sub-agents as follows:

@tool(return_direct=True)
async def fetch_patient_information_tool(request: str, runtime: ToolRuntime) -> str:
    """Use this tool to retrieve patient information such as name, mobile number, date of birth, or address. 
    This tool handles ONLY patient personal information queries. Call this when the user asks about their 
    personal details or provides a patient ID to retrieve information.
    
    Args:
        request: The user's request for patient information
    """
    root_thread_id = runtime.config["configurable"]["thread_id"]
    logger.info(f"[patient_agent] 🚀 Invoking with thread_id={root_thread_id}, request: {request}")
    
    # Build config with shared checkpoint namespace for memory continuity
    sub_config = extend_runtime_config(runtime, checkpoint_ns=SHARED_NS)
    
    # 💾 Log checkpoint state BEFORE invoke to see what memory is loaded
    await log_checkpoint_state(patient_agent, "patient_agent", sub_config)

    # Note: checkpointer is already bound to patient_agent at creation time
    result = await patient_agent.ainvoke(
        {"messages": [{"role": "user", "content": request}]},
        config=sub_config
    )
    
    # Log sub-agent's internal tool calls and messages AFTER invoke
    log_subagent_result("patient_agent", result)
    
    response = extract_last_ai_message(result)
    logger.info(f"[patient_agent] ✅ Final response: {response[:200]}...")
    return response

Hi @razaullah

I’ve done some investigation in another thread that might help ground your statement and the working code above: Subgraph checkpointing - #8 by pawel-twardziak. Let me know whether this makes sense. Cheers!