With too many fields, how should deepagents handle this properly?

keenborder786 · March 23, 2026, 1:46pm

Using a single query parameter that can cover several different fields at once, much like a SQL query, seems more viable than hardcoding every field individually:

async def query_field(company_name: str, query: str) -> str:
    """
    Second interface — keyword-based field matching.
    """
    fields = match_fields(query)

    if not fields:
        return (
            f"No fields related to '{query}' were matched. "
            "Please try a more specific description."
        )

    result = await api.kyc(company_name)
    data = result["result"]["Data"]

    lines = []

    for field in fields:
        value = deep_get(data, field)

        if value == "unknown":
            lines.append(f"\n【{field}】No relevant data found")
        else:
            lines.append(f"\n【{field}】")
            lines.append(format_value(value))

    return "\n".join(lines)
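
The snippet above calls `match_fields` and `deep_get` without showing them. Here is a minimal sketch of what they might look like. The keyword table and the dotted field paths are made up for illustration, and I am assuming `deep_get` returns the string "unknown" for missing data, since that is what the code above checks for:

```python
from typing import Any

# Hypothetical keyword table: lowercase keyword -> dotted field paths.
# The real table would cover every field you want to expose.
KEYWORD_TO_FIELDS: dict[str, list[str]] = {
    "capital": ["financials.registered_capital", "financials.paid_in_capital"],
    "lawsuit": ["legal.litigation_records"],
    "penalt": ["legal.administrative_penalties"],
    "status": ["registration.operating_status"],
}


def match_fields(query: str) -> list[str]:
    """Return field paths whose keyword occurs in the query (ordered, no duplicates)."""
    lowered = query.lower()
    matched: list[str] = []
    for keyword, paths in KEYWORD_TO_FIELDS.items():
        if keyword in lowered:
            for path in paths:
                if path not in matched:
                    matched.append(path)
    return matched


def deep_get(data: dict[str, Any], path: str, default: Any = "unknown") -> Any:
    """Walk a dotted path through nested dicts; return `default` if a key is missing."""
    current: Any = data
    for key in path.split("."):
        if not isinstance(current, dict) or key not in current:
            return default
        current = current[key]
    return current
```

Substring matching like this is brittle (synonyms and other languages will miss), which is part of why the tool description and the matching step both matter.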

That said, I would recommend improving the tool description by listing in it the fields that a query can cover. Including a few examples of valid queries in the description could also help performance. The following is just an example:

from langchain_core.tools import tool

@tool
async def query_company_fields(company_name: str, query: str) -> str:
    """
    Query specific fields from a company's KYC profile.

    Available fields:
        Registration:
            registered_name, incorporation_date, registered_address,
            operating_status, business_scope, company_type

        Financials:
            registered_capital, paid_in_capital, annual_revenue,
            credit_rating, tax_id

        Legal:
            licenses, litigation_records, administrative_penalties,
            bankruptcy_status

        Personnel:
            legal_representative, shareholders, beneficial_owners,
            board_members

    Example queries and the fields they map to:
        "Is the company still active?"
            → operating_status

        "Who founded the company?"
            → legal_representative, shareholders

        "Any lawsuits or penalties?"
            → litigation_records, administrative_penalties

        "What is the registered capital?"
            → registered_capital, paid_in_capital

        "Tell me about the beneficial owners"
            → beneficial_owners, shareholders

    Args:
        company_name: The company to look up.
        query: Natural language description of what information is needed.

    Returns:
        Formatted field values from the company's KYC record.
    """
    return await query_field(company_name, query)
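
With this many fields, a hand-written description can drift out of sync with the data. One option is to generate the description text from a single field catalog. This is only a sketch: `FIELD_CATALOG`, `EXAMPLE_QUERIES`, and `build_tool_description` are names I made up, and the catalog is shortened for brevity:

```python
# Hypothetical catalog; group and field names mirror the description above.
FIELD_CATALOG: dict[str, list[str]] = {
    "Registration": ["registered_name", "incorporation_date", "operating_status"],
    "Financials": ["registered_capital", "paid_in_capital"],
    "Legal": ["litigation_records", "administrative_penalties"],
}

# Hypothetical example queries, each mapped to the fields it should select.
EXAMPLE_QUERIES: dict[str, list[str]] = {
    "Is the company still active?": ["operating_status"],
    "Any lawsuits or penalties?": ["litigation_records", "administrative_penalties"],
}


def build_tool_description(
    catalog: dict[str, list[str]], examples: dict[str, list[str]]
) -> str:
    """Render the description text, rejecting examples that name unknown fields."""
    known = {name for names in catalog.values() for name in names}
    lines = [
        "Query specific fields from a company's KYC profile.",
        "",
        "Available fields:",
    ]
    for group, names in catalog.items():
        lines.append(f"    {group}: {', '.join(names)}")
    lines += ["", "Example queries and the fields they map to:"]
    for question, fields in examples.items():
        unknown = [f for f in fields if f not in known]
        if unknown:
            raise ValueError(f"Example references unknown fields: {unknown}")
        lines.append(f'    "{question}" -> {", ".join(fields)}')
    return "\n".join(lines)


DESCRIPTION = build_tool_description(FIELD_CATALOG, EXAMPLE_QUERIES)
```

The resulting string could then be assigned to the function's `__doc__` before the function is decorated, so the model always sees the current field list.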

Additional Suggestion for Improving `query_field`

I have another suggestion for improving `query_field`. Rather than the deterministic matching you are doing now, you could use a separate LLM call to select the fields. The code below is just an example. Because this is a single LLM call with no message history, the context window should be able to handle a large schema:

FIELD_SCHEMA = """
field1: Legal registered name of the company
field2: Date of incorporation
field3: Registered address (full)
field4: Operating status (active/inactive/dissolved)
...
field255: ...
"""

FIELD_SELECTOR_PROMPT = """
You are a data field selector. Given a user question, return ONLY the relevant field names from the schema below as a JSON list.

Schema:
{schema}

User question: {query}

Return only a JSON array of field names, e.g. ["field1", "field42"]. No explanation.
"""

from langchain_core.output_parsers import JsonOutputParser
from langchain_core.prompts import ChatPromptTemplate

async def select_fields_with_llm(query: str) -> list[str]:
    """Use LLM to select relevant fields from schema based on user query."""
    # `llm` (a chat model) and ALL_FIELD_NAMES are assumed to be defined elsewhere
    prompt = ChatPromptTemplate.from_template(FIELD_SELECTOR_PROMPT)
    chain = prompt | llm | JsonOutputParser()

    fields = await chain.ainvoke({
        "schema": FIELD_SCHEMA,
        "query": query
    })

    # Validate: the model may return a non-list or invented names, so keep
    # only strings that actually exist in the schema
    if not isinstance(fields, list):
        return []
    valid = set(ALL_FIELD_NAMES)
    return [f for f in fields if isinstance(f, str) and f in valid]

async def query_field(company_name: str, query: str) -> str:
    fields = await select_fields_with_llm(query)

    if not fields:
        return f"No relevant fields found for: '{query}'"

    result = await api.kyc(company_name)
    data = result["result"]["Data"]

    lines = []

    for field in fields:
        value = deep_get(data, field)

        if value == "unknown":
            lines.append(f"【{field}】No data available")
        else:
            lines.append(f"【{field}】")
            lines.append(format_value(value))

    return "\n".join(lines)
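
One more thought on the validation step: `FIELD_SCHEMA` and `ALL_FIELD_NAMES` are easier to keep consistent if both come from the same dict. Below is a rough sketch of that, plus a parser for the case where you read the model's raw text yourself instead of using `JsonOutputParser`. The field names in `FIELD_DESCRIPTIONS` and the helper `parse_field_selection` are hypothetical:

```python
import json

# Hypothetical descriptions; the real dict would list every field.
FIELD_DESCRIPTIONS: dict[str, str] = {
    "registered_name": "Legal registered name of the company",
    "incorporation_date": "Date of incorporation",
    "operating_status": "Operating status (active/inactive/dissolved)",
}

# Both values are derived from the same dict, so they cannot drift apart.
FIELD_SCHEMA = "\n".join(
    f"{name}: {desc}" for name, desc in FIELD_DESCRIPTIONS.items()
)
ALL_FIELD_NAMES = list(FIELD_DESCRIPTIONS)


def parse_field_selection(raw: str, valid_names: list[str]) -> list[str]:
    """Parse the model's reply into a validated, de-duplicated list of field names."""
    try:
        parsed = json.loads(raw)
    except json.JSONDecodeError:
        return []  # reply was not valid JSON
    if not isinstance(parsed, list):
        return []  # reply was JSON, but not an array
    valid = set(valid_names)
    selected: list[str] = []
    for item in parsed:
        if isinstance(item, str) and item in valid and item not in selected:
            selected.append(item)
    return selected
```

Returning an empty list on bad output lets `query_field` fall through to its existing "no relevant fields" message instead of raising.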

I hope this helps.
