Groq Model Response

I am using Groq as the model provider, and the model is openai/gpt-oss-120b. Sometimes I get a degraded response from the model, as shown below. I am using a multi-agent flow.

The response from the agent is:
The appointment 33854985 at Takhassusi Hospital, Dental Clinic with Reem Alghamdi on thirty‑thirty‑thirty? (Oops! I made … … … … … …

I’m sorry— … … … hall…

Oops! … … … …

Apologies…

… Sorry…

We…

Okay…

[dozens of further lines of repeated ellipses and similar fragments omitted]

…The appointment for the patient Raza Ullah Ullah is Cancelled at Takhassusi Hospital, Dental Clinic with Reem Alghamdi on thirty‑first August two thousand twenty‑six at five p.m.

What could be the reason for this type of response? Any advice would be appreciated.

Hi @razaullah

What does your graph definition look like? Could you share as much code as possible?

I am using a multi-agent flow: create_agent builds a supervisor, and the other sub-agents are exposed to it as tools. The supervisor agent code is as follows:

supervisor = create_agent(
    model=model,
    system_prompt=supervisor_prompt,
    tools=[fetch_patient_information_tool, manage_appointment_tool, book_appointment_tool],
    checkpointer=checkpointer,
    name="supervisor_agent",
)

One of the tools is defined as follows:

@tool(return_direct=True)
async def fetch_patient_information_tool(request: str, runtime: ToolRuntime) -> str:
    """Use this tool to retrieve patient information such as name, mobile number, date of birth, or address. 
    This tool handles ONLY patient personal information queries. Call this when the user asks about their 
    personal details or provides a patient ID to retrieve information.
    
    Args:
        request: The user's request for patient information
    """
    root_thread_id = runtime.config["configurable"]["thread_id"]
    logger.info(f"[patient_agent] 🚀 Invoking with thread_id={root_thread_id}, request: {request}")
    
    # Build config with shared checkpoint namespace for memory continuity
    sub_config = extend_runtime_config(runtime, checkpoint_ns=SHARED_NS)
    
    # 💾 Log checkpoint state BEFORE invoke to see what memory is loaded
    await log_checkpoint_state(patient_agent, "patient_agent", sub_config)

    # Note: checkpointer is already bound to patient_agent at creation time
    result = await patient_agent.ainvoke(
        {"messages": [{"role": "user", "content": request}]},
        config=sub_config
    )
    
    # Convert ToolMessages to TOON format
    result = convert_tool_messages_to_toon_format(result, tool_names=["mssql_get_patient_info", "mssql_get_patient_by_name_dob"])
    
    # Update the checkpoint with TOON formatted messages
    await patient_agent.aupdate_state(sub_config, result)
    logger.info(f"[patient_agent] 💾 Updated checkpoint with TOON formatted messages")
    
    logger.info(f"[patient_agent] 🛠️ Sub-agent invocation complete.")
    # Log sub-agent's internal tool calls and messages AFTER invoke
    log_subagent_result("patient_agent", result)
    
    response = extract_last_ai_message(result)
    logger.info(f"[patient_agent] ✅ Final response: {response[:200]}...")
    response = strip_markdown(response)
    
    return response.lower()

The sub-agent is defined as follows:

async def patient_information_retrieval_agent(tools, language_id, checkpointer=None):
    """
    Create a patient information retrieval agent.
    """
    logger.info(f"Creating patient information retrieval agent with checkpointer: {checkpointer}")
    prompt_folder = 'english_prompts' if language_id == 2 else 'arabic_prompts'
    prompt_path = os.path.join(os.path.dirname(__file__), '..', 'prompts', prompt_folder, 'patient_information_retrieval_prompt.yml')
    logger.info(f"Loading patient information retrieval prompt with language id: {language_id}")
    
    with open(prompt_path, 'r', encoding='utf-8') as f:
        prompts = yaml.safe_load(f)

    return create_agent(
        model=model,
        tools=tools,
        system_prompt=prompts['patient_information_retrieval_prompt'].format(LANGUAGE_ID=language_id),
        name="patient_information_retrieval_agent", 
        checkpointer=checkpointer,
        middleware=[
            # SummarizationMiddleware(
            #     model = model,
            #     trigger=("tokens", 1000),
            #     keep=("messages", 5),
            # ),
            ]
    )

The other tools and sub_agents are defined in the same manner.

@pawel-twardziak If you need any other information, let me know.


Thanks, I’m pondering… :face_with_monocle:

Could you remove all the TOON conversion functionality? I’m pretty sure that, together with the aupdate_state usage inside the tools, it is corrupting the message state.
Remove it and please tell me whether the issue is still there.
There is no real need to convert to TOON: it does not provide any significant token reduction and, crucially, the framework is not ready for it.
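For illustration, a stripped-down version of the tool without the TOON conversion and aupdate_state steps might look roughly like this. This is only a sketch: StubAgent and the message shapes here are assumptions standing in for the real patient_agent and framework types, so the flow can be shown self-contained.

```python
import asyncio

class StubAgent:
    """Stand-in for the real patient_agent (assumption for illustration)."""
    async def ainvoke(self, payload, config=None):
        # Echo back a history plus a final AI answer, mimicking an agent result.
        return {"messages": [
            {"role": "user", "content": payload["messages"][0]["content"]},
            {"role": "ai", "content": "Patient found: Raza Ullah"},
        ]}

patient_agent = StubAgent()

def extract_last_ai_message(result):
    # Return the content of the last AI message in the agent result.
    for msg in reversed(result["messages"]):
        if msg["role"] == "ai":
            return msg["content"]
    return ""

async def fetch_patient_information_tool(request: str, config=None) -> str:
    # Invoke the sub-agent directly; no TOON conversion, no manual aupdate_state.
    result = await patient_agent.ainvoke(
        {"messages": [{"role": "user", "content": request}]},
        config=config,
    )
    return extract_last_ai_message(result)

print(asyncio.run(fetch_patient_information_tool("patient id 33854985")))
```

The point is that the sub-agent's checkpointer already persists its state after ainvoke, so rewriting that state afterwards is both unnecessary and risky.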

@pawel-twardziak I removed the TOON conversion function and am now analyzing the results. I will let you know what I observe in this case.

Thanks for your feedback.


@pawel-twardziak When I removed the TOON format conversion function, it started working. However, since I have a multi-agent architecture, I also need the agents to share context. Now, when I interact with the agent to book an appointment, it consumes almost 35k tokens. I tried to exclude tools using the built-in middleware, but sometimes it excludes a tool that I need to fulfill the user's request, and then I get an error.
Another approach would be to exclude tools that were already used earlier in the conversation, but I may still need information that was extracted in an earlier tool call. Also, SummarizationMiddleware will rewrite the full context.
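For context, the kind of history trimming I have in mind could be sketched like this. trim_history is a hypothetical helper of my own (not an existing middleware): it keeps the system message plus only the most recent messages, similar in spirit to a keep=("messages", 5) setting.

```python
def trim_history(messages, keep_last=5):
    """Keep the system message (if any) plus the last `keep_last` messages.

    `messages` is a list of {"role": ..., "content": ...} dicts.
    """
    system = [m for m in messages if m["role"] == "system"][:1]
    rest = [m for m in messages if m["role"] != "system"]
    return system + rest[-keep_last:]

history = [{"role": "system", "content": "You are a booking assistant."}]
history += [{"role": "user", "content": f"turn {i}"} for i in range(10)]

trimmed = trim_history(history, keep_last=5)
print(len(trimmed))  # system message + 5 most recent turns
```

The downside, as noted above, is that facts extracted in early tool calls fall out of the window unless they are summarized or stored elsewhere.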

Can you guide me on this? Which approach is best for keeping the full context needed to fulfill user requests while also minimizing token consumption?

Looking forward to your guidance.


Hi @razaullah

OK, great that it works again. I’ll follow up today or tomorrow.

Hi @razaullah, I have a few conclusions, but I need to put them together. I’ll get back to you tomorrow.

Sorry, I was occupied yesterday; I will do my best today.