How to stream an LLM from a tool?

Hi, I’m quite new to LangGraph and have no idea how to properly handle the following:

I have an agent built with create_react_agent and a few tools:

from typing import Annotated

from langchain_core.tools import tool
from langchain_openai import ChatOpenAI
from langgraph.config import get_stream_writer
from langgraph.prebuilt import InjectedState, create_react_agent

@tool
async def my_proposed_tool(state: Annotated[UserContext, InjectedState], prompt: str):
    """Call an inner LLM and stream its post-processed structured output."""
    llm = ChatOpenAI(
        model="gpt-4.1-mini",
        streaming=True,
    )
    writer = get_stream_writer()
    messages = [...]
    structured_llm = llm.with_structured_output(schema=SomeModel.model_json_schema())
    async for chunk in structured_llm.astream(messages):
        # chunk is the partial structured output parsed so far
        post_processed_data = post_process(chunk)
        writer(post_processed_data)  # emit my own custom event instead of print
    return post_processed_data


agent = create_react_agent(
    model="openai:gpt-4.1-mini",
    tools=[
        # other tools...
        my_proposed_tool,
    ],
    # other props
)

# stream agent

Inside the agent, I want a function/tool (e.g. my_proposed_tool) that, when invoked, calls an LLM that streams partial JSON (structured output). I need to intercept each partial, run custom post-processing, and only then stream the processed chunks.

My current idea is to add a tool that calls the LLM internally and to use astream/astream_events plus get_stream_writer to emit my own events. BUT when I do this, the compiled agent also streams the inner LLM’s raw tokens alongside my custom events. Is there a way to suppress the inner LLM’s streaming so that only my processed output goes out? Or is it bad practice to put an LLM inside a tool in the first place? But then how would I manage this flow…
Maybe somehow with subgraphs?