Preventing post-processing of sub-agent structured output in a deep agent

I am using a deep agent as a coordinator that manages several sub-agents. The sub-agents produce markdown tables and SQL snippets that are validated via Pydantic and OpenAI’s Structured Outputs (strict: true).

The issue is that the deep agent often “post-processes” these results before presenting them to the end-user. My goal is that the deep agent handles the conversation but passes the sub-agents’ output to the chat verbatim (1:1).

Is there a strategy to stop the main agent from processing the structured output?

hi @SeccoMaracuja

thanks for creating a new post for your issue <3

how have you implemented your subagents? As subagents, tools or maybe skills?

if you could share your code, it would be easier :slight_smile:

I implemented them as subagents (compiled):

model = init_chat_model("openai:gpt-5.2")

agent = create_deep_agent(
    model=model,
    subagents=subagents,
    system_prompt=get_prompt("cordinator-agent.prompt.md"),
    tools=[]
)



mapping_runnable = create_agent(
    model=model,
    tools=get_toolbox_tools(),
    system_prompt=get_prompt("mapping_spec_subagent.prompt.md"),
    response_format=ProviderStrategy(MappingSpecification, strict=True)
)



mapping_spec_subagent = CompiledSubAgent(
    name="mapping_spec_subagent",
    description="Creates the data mapping specification in phase A using database introspection. Returns a structured MappingSpecification object.",
    runnable=mapping_runnable
)



class MappingEntry(BaseModel):
    """Represents a single column mapping rule."""
    model_config = ConfigDict(extra='forbid')

    target_column: str = Field(..., alias="Zielspalte", description="The name of the column in the target database table.")
    source_type: Literal['ODX XPath', 'Datenbankspalte', 'Abgeleitete Logik', 'Hartcodiert'] = Field(..., alias="Quelltyp", description="The type of source (e.g., 'ODX XPath', 'Raw DB Column', 'Derived Logic', 'Hardcoded').")
    source_location_logic: str = Field(..., alias="Logik", description="The specific XPath, Column Name, or Logic used to derive the value.")


class MappingSpecification(BaseModel):
    """The complete data mapping specification for a target table."""
    model_config = ConfigDict(extra='forbid')

    target_table: str = Field(..., alias="Zieltabelle", description="The name of the destination table.")
    mappings: List[MappingEntry] = Field(..., alias="Mappings", description="List of all column mappings as a table.")
    business_logic_notes: Optional[str] = Field(..., alias="Geschäftslogik Notizen", description="Any specific business logic or constraints applied.")

hi @SeccoMaracuja

I think you would have to go for structured_response and return_direct=True. Something like this:

from langchain.tools import tool

def spec_to_markdown(spec: MappingSpecification) -> str:
    lines = ["| Zielspalte | Quelltyp | Logik |", "|---|---|---|"]
    for m in spec.mappings:
        lines.append(f"| {m.target_column} | {m.source_type} | {m.source_location_logic} |")
    if spec.business_logic_notes:
        lines += ["", "### Geschäftslogik Notizen", spec.business_logic_notes]
    return "\n".join(lines)

@tool("generate_mapping_spec", return_direct=True)
def generate_mapping_spec(user_request: str) -> str:
    result = mapping_runnable.invoke({
        "messages": [{"role": "user", "content": user_request}]
    })
    spec: MappingSpecification = result["structured_response"]
    return spec_to_markdown(spec)

agent = create_deep_agent(
    model=model,
    subagents=subagents,
    system_prompt=get_prompt("cordinator-agent.prompt.md"),
    tools=[generate_mapping_spec],
)

And then remove mapping_spec_subagent from subagents list.

This makes sense, thank you. I thought subagents would help delegate specialized tasks, but in my case, a single deep agent with structured output and a shared context seems better.

My second subagent handles writing, executing, and verifying SQL functions by the mapping specification. Should I replace it with a tool? Generally: when is it better to use subagents vs tools?

It’s not easy to answer your question without more context: what should your deep agent do within one loop run?

Please tell me the context and I’ll try to design a dedicated deep agent for you (if one is necessary), with the full explanation of “why” :slight_smile:

TIP: Tools can be configured to return directly (return_direct); the agent loop then stops after tool execution.


This is the sequence diagram; if it doesn’t help, let me know.
The agent should implement a SQL-based ETL pipeline that transforms raw ODX diagnostic data from a raw data schema into structured relational formats within the target schema.
The agent has two sub-agents: one creates a mapping specification with source and target table columns, and the other generates the SQL script.

I saw an example in the documentation suggesting a deep agent should fit, but I’m not sure whether I should use LangGraph instead. Sometimes the SQL generator’s code can’t find the data in the source tables because the mapping specification carries too little context.


IMHO you have a fixed, well-defined workflow with a verification loop (deploy → run → count → diagnose → fix; max ~3 attempts) and a human approval gate between Phase A and Phase B. That’s much closer to a state machine / workflow graph than an open-ended “delegate and summarize” assistant.

AFAIK from the source code, deep agents’ subagents are built for context quarantine. In the deepagents source, the task tool intentionally does not pass structured_response between the main agent and the subagent:

# State keys that are excluded when passing state to subagents and when returning
# updates from subagents.
# When returning updates:
# 1. The messages key is handled explicitly to ensure only the final message is included
# 2. The todos and structured_response keys are excluded as they do not have a defined reducer
#    and no clear meaning for returning them from a subagent to the main agent.
_EXCLUDED_STATE_KEYS = {"messages", "todos", "structured_response"}

And for a CompiledSubAgent, the main agent gets only the final message back (wrapped as a ToolMessage), not the full internal state:

class CompiledSubAgent(TypedDict):
    """A pre-compiled agent spec.

    !!! note

        The runnable's state schema must include a 'messages' key.

        This is required for the subagent to communicate results back to the main agent.

    When the subagent completes, the final message in the 'messages' list will be
    extracted and returned as a `ToolMessage` to the parent agent.
    """

So if your mapping spec is produced as a validated structured object (e.g., ProviderStrategy(…, strict=True)), that structured object won’t automatically propagate to the supervisor via subagents unless you explicitly serialize/persist it (e.g., write it to a file and have the main agent read it).

Therefore, my recommendation: dedicated LangGraph workflow (graph) with explicit state keys

Use a graph that keeps shared state across phases (e.g., params, mapping_spec, schema_metadata, sql_script, attempt, verification_results). This directly addresses “SQL generator can’t find data” because Phase B always has the full Phase A outputs + any runtime observations.
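A minimal sketch of such a state schema (the key names follow the list above; the concrete value types are my assumptions, not from your code):

```python
from typing import TypedDict

class ETLState(TypedDict, total=False):
    """Shared state carried across all phases of the ETL graph."""
    params: dict                # validated (German) user inputs
    schema_metadata: dict       # introspected columns/constraints per table
    mapping_spec: dict          # serialized MappingSpecification from Phase A
    sql_script: str             # generated SQL for Phase B
    attempt: int                # retry counter for the fix loop
    verification_results: dict  # row counts, constraint-check outcomes

# Every node reads from and writes into this one state, so the SQL
# generator always sees the full Phase A output plus runtime observations.
state: ETLState = {"attempt": 0, "mapping_spec": {"Zieltabelle": "demo", "Mappings": []}}
```

In LangGraph you would pass this schema to StateGraph(ETLState), so every node receives and updates the same keys.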

Quick draft: a shape that in my opinion would match your diagram:

  • Node: parameter_collection → validates German inputs
  • Node: schema_introspection (tools) → fetch table columns/constraints + optional row-count sanity checks
  • Node: create_mapping_spec (LLM structured output, strict) → writes mapping_spec into graph state
  • Interrupt / HITL gate → human corrections/approval
  • Node: generate_sql (LLM) → uses mapping_spec + schema_metadata (+ any sampled evidence)
  • Node: deploy_and_run (tools) → execute DDL, run staging function, collect errors
  • Node: verify (tools) → row count checks, constraints checks
  • Conditional routing:
      • success → finalize/save
      • failure and attempt < 3 → diagnose (tools) → generate_sql
      • failure and attempt == 3 → stop with report
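The conditional routing above can be sketched as a plain routing function (framework-free; the node names like diagnose and finalize are illustrative, and in LangGraph this is the kind of callable you would hand to add_conditional_edges):

```python
MAX_ATTEMPTS = 3

def route_after_verify(state: dict) -> str:
    """Pick the next node after the verify step (node names are illustrative)."""
    results = state.get("verification_results", {})
    if results.get("success"):
        return "finalize"          # success → finalize/save
    if state.get("attempt", 0) < MAX_ATTEMPTS:
        return "diagnose"          # diagnose then loops back to generate_sql
    return "stop_with_report"      # give up after the third attempt
```

Because the attempt counter lives in graph state, the retry cap is enforced deterministically rather than left to the LLM’s judgment.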

This is exactly what LangGraph is good at: deterministic routing + loops + interrupts + persisted state.

When a Deep Agent still makes sense

Keep a Deep Agent when the work is not a predictable workflow (researchy, exploratory, lots of “figure it out”), or when you truly want context quarantine. But your ETL pipeline is mostly “known steps + tooling + verification”, so a graph is the more reliable base.

If you want to stay on Deep Agents anyway

You can still make it work, but you must explicitly persist canonical artifacts between subagents:

  • Mapping spec subagent must write the spec to the filesystem (e.g., /artifacts/mapping_spec.json) and the SQL generator must read that file, because structured_response won’t be returned through task.
  • Alternatively, don’t make mapping spec a subagent: run it in the main agent (or as a tool) so the structured result is in the main run’s state.
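A minimal sketch of that persistence handshake (the path and helper names are just examples, not deepagents API):

```python
import json
from pathlib import Path

def save_mapping_spec(spec_dict: dict, path: Path) -> None:
    """Mapping-spec subagent: persist the validated spec, e.g. the result
    of spec.model_dump(by_alias=True) on the MappingSpecification."""
    path.write_text(json.dumps(spec_dict, ensure_ascii=False, indent=2), encoding="utf-8")

def load_mapping_spec(path: Path) -> dict:
    """SQL-generator subagent: read back the exact spec before writing SQL."""
    return json.loads(path.read_text(encoding="utf-8"))
```

The file becomes the canonical artifact: the SQL generator works from the exact spec that was approved, not from the supervisor’s paraphrase of it.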

Does this make sense for you, @SeccoMaracuja?


Thank you very much, this makes a lot of sense! I’ll try out the LangGraph approach.


Good luck @SeccoMaracuja ! If you have any further concerns, feel free to post a new thread :slight_smile:

Hi Pawel,

quick question: for an upcoming pitch I’m looking for a fast refactor without rebuilding the whole agent setup.
My idea is to model the current sub-agents as Skills, mainly to give the main agent the right context, without changing the overall architecture too much.

What do you think about using Skills this way? Do you see any major downsides?

Thanks a lot!

Hi @SeccoMaracuja

this is an interesting topic. Since the main topic in this post is solved, could you please create another post and tag me? That keeps topics organized and makes them easier for others to find.
Once you tag me, I’ll answer there soon :slight_smile: