Proper testing of a tool?

eric-burel · March 18, 2026, 3:17pm

Hey folks, I’d be eager to better unit test tools but the current experience is not super great:

How to get typing when importing a tool? The “invoke” function is untyped, making it hard to discover the expected args for `invoke`.
Is it possible to tie some metadata to a tool directly? This would also be useful for other use cases. For instance, I’d like to treat some tools as “final”, meaning that once they are called, the agent should not call another tool afterwards. More broadly, it could be interesting to tag them, or even attach a nice “print” function to a tool to optimize how it displays its results.

I’d like to avoid separately exporting the unwrapped function, as I think a tool should be tested as a tool.

Example code I’d like to improve:

class TestDocumentsTools(unittest.TestCase):
    def test_read_document_file_text_content(self):
        filepath = path.join(assets_dir, "./foo.odt")
        # could be easier to call the wrapped function rather than the tool here
        content = read_document_file_text_content.invoke(filepath)
        self.assertTrue(content.find("foo") > -1)
        self.assertFalse(content.find("<") > -1)

    def test_copyfile_dst_does_not_exist(self):
        not_exist_dir = path.join(assets_dir, "does_not_exist")
        # not typed: flaky and hard to figure
        # I would often pass args in order rather than using a dict
        copy_res = copy_file.invoke(
            {"filepath": "./foo.py", "new_directory_or_filepath": not_exist_dir})
        self.assertEqual(copy_res, "Destination does not exist")

keenborder786 · March 18, 2026, 3:42pm

Hello @eric-burel

TL;DR

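Typed invoke: Not possible today; the decorator erases types. Use tool.func(arg1, arg2) instead to call the original function with positional args. Its runtime signature is intact, but .func is annotated Callable[..., Any] | None, so a static checker will not see the parameter types.
Metadata: Already built in as BaseTool fields: return_direct=True for “final” tools, tags=[...] for tagging, metadata={...} for arbitrary data. return_direct is settable via @tool(return_direct=True); check whether your version of @tool also accepts tags and metadata, and if it does not, pass them to StructuredTool.from_function(...) instead.
Test without exporting the raw function: Use tool.func(...) directly in tests. It’s the original unwrapped callable, no dict needed.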

1. Typing for `invoke`

This is a fundamental limitation of the @tool decorator’s current design. The decorator’s overloads all return BaseTool (an erased type), and BaseTool.invoke is:

    def invoke(
        self,
        input: str | dict | ToolCall,
        config: RunnableConfig | None = None,
        **kwargs: Any,
    ) -> Any:
        tool_input, kwargs = _prep_run_args(input, config, **kwargs)
        return self.run(tool_input, **kwargs)

Neither the input: str | dict | ToolCall parameter nor the -> Any return type carries anything over from the original function signature. There is no generic StructuredTool[InputModel, ReturnType] that would flow the original types through.

Your practical options today:

Option A: Use tool.func directly in tests. StructuredTool stores the original callable as .func:

    func: Callable[..., Any] | None = None
    """The function to run when the tool is called."""

So copy_file.func("./foo.py", not_exist_dir) lets you pass positional args, because you’re calling the original function. Note that the annotation above is Callable[..., Any] | None, so a static type checker sees neither the parameter types nor a guaranteed non-None value; the original signature is only recoverable at runtime (for example with inspect.signature). The test suite itself does this (line 2239 in test_tools.py). The downside is that it bypasses tool-layer machinery (callbacks, error handling), which may or may not matter in unit tests.
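
A minimal sketch of Option A as a reusable test helper, assuming only what the field declaration above shows: a tool object with a `.func` attribute that may be `None`. The names `HasFunc`, `unwrap`, and `FakeTool` are hypothetical and not part of langchain_core; `FakeTool` stands in for a real `StructuredTool` so the snippet runs on its own.

```python
import inspect
from dataclasses import dataclass
from typing import Any, Callable, Optional, Protocol


class HasFunc(Protocol):
    """Hypothetical protocol: anything exposing an optional `.func`."""

    func: Optional[Callable[..., Any]]


def unwrap(tool: HasFunc) -> Callable[..., Any]:
    """Return the wrapped callable, failing loudly if the tool has none."""
    if tool.func is None:
        raise ValueError("tool has no synchronous .func to call")
    return tool.func


def copy_file(filepath: str, new_directory_or_filepath: str) -> str:
    """Stand-in for the real tool body; always reports a missing destination."""
    return "Destination does not exist"


@dataclass
class FakeTool:
    """Stand-in for a StructuredTool in this sketch."""

    func: Optional[Callable[..., Any]] = None


raw = unwrap(FakeTool(func=copy_file))

# Positional call on the original function, no dict needed.
result = raw("./foo.py", "/tmp/does_not_exist")

# The runtime signature is still the original one, which helps discover args.
param_names = list(inspect.signature(raw).parameters)
```

With a real tool, `unwrap(copy_file)` would take the place of `FakeTool`. This buys a None check and positional calls; a static checker still sees `Callable[..., Any]`.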

Option B: Cast at definition. The decorator is annotated as returning BaseTool, so a cast to StructuredTool at least lets your IDE and type checker see StructuredTool-specific attributes such as .func:

from typing import cast
from langchain_core.tools import StructuredTool, tool

@tool
def copy_file(filepath: str, new_directory_or_filepath: str) -> str:
    """..."""
    ...

copy_file = cast(StructuredTool, copy_file)
# Now at least your IDE knows it's a StructuredTool and you can inspect .func

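Option C: Use args_schema explicitly with a typed Pydantic model, then build the input from that model and pass it to invoke as a dict (for example with model_dump()), since invoke accepts str | dict | ToolCall. The result is still typed Any, but the schema and field names are fully discoverable.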

The “proper” fix would require the @tool decorator to return a generic StructuredTool[ArgsModel, ReturnType] that propagates through invoke. That’s a deeper type system change not currently present in the codebase.

2. Metadata / tagging tools

Good news: BaseTool already has most of what you want:

    return_direct: bool = False
    """Whether to return the tool's output directly.

    Setting this to `True` means that after the tool is called, the `AgentExecutor` will
    stop looping.
    """
    ...
    tags: list[str] | None = None
    """Optional list of tags associated with the tool.
    ...
    """

    metadata: dict[str, Any] | None = None
    """Optional metadata associated with the tool.
    ...
    """
    ...
    extras: dict[str, Any] | None = None
    """Optional provider-specific extra fields for the tool.
    ...
    """

“final” / stop-after-call: return_direct=True is exactly this. You can set it via the decorator: @tool(return_direct=True).
Tags: tags=["final", "display:table"] works out of the box.
Arbitrary metadata: metadata={"is_final": True, "display_fn": my_fn} also works, though storing callables in metadata is unconventional.
Custom print/display: There’s no first-class display_fn field, but you could store it in metadata or, more idiomatically, subclass StructuredTool to add typed fields.

3. Testing the tool as a tool (without exporting the raw function)

The cleanest way to invoke with positional arguments while staying on the tool is to call .func on the StructuredTool, which gives you the original function without a separate export:

class TestDocumentsTools(unittest.TestCase):
    def test_read_document_file_text_content(self):
        filepath = path.join(assets_dir, "./foo.odt")
        # Call the underlying function with positional args
        content = read_document_file_text_content.func(filepath)
        self.assertTrue(content.find("foo") > -1)

    def test_copyfile_dst_does_not_exist(self):
        not_exist_dir = path.join(assets_dir, "does_not_exist")
        # Positional, no dict needed
        copy_res = copy_file.func("./foo.py", not_exist_dir)
        self.assertEqual(copy_res, "Destination does not exist")

If you want to test the full tool path (callbacks, error handling, validation), keep using invoke with a dict — that’s the intended interface. For pure logic tests, .func(...) is cleaner and the pattern the project itself uses.

Summary table:

| Goal | Current solution |
| --- | --- |
| Positional args on the original function | `tool.func(arg1, arg2)` |
| Keyword args via dict | `tool.invoke({"key": val})` |
| “Final” / stop-after | `@tool(return_direct=True)` |
| Tagging | `tags=[...]` on `BaseTool` |
| Arbitrary metadata | `metadata={...}` on `BaseTool` |
| Typed `invoke` return | Not available; requires generic `StructuredTool[I, O]` |

eric-burel · March 18, 2026, 5:58pm

Wow, thanks for the detailed answer! Somehow I couldn’t find these in either the docs or chat.langchain. This totally solves my questions and will vastly improve my tools setup!!
`return_direct` will cover some use cases, but others are more subtle. For those I would add a “maybe_return_direct” tag, for instance, and have the LLM decide whether it should process the ToolMessage or “just” send its result as is. Thanks again!
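
One way the “maybe_return_direct” idea could be sketched, independent of any agent framework. `ToolInfo` and `return_policy` are hypothetical names; `ToolInfo` only mirrors the `return_direct` and `tags` fields discussed above, and what the agent loop does with a "maybe" result is left open.

```python
from dataclasses import dataclass
from typing import Literal, Optional


@dataclass
class ToolInfo:
    """Hypothetical stand-in exposing the two tool fields used here."""

    name: str
    return_direct: bool = False
    tags: Optional[list[str]] = None


def return_policy(tool: ToolInfo) -> Literal["always", "maybe", "never"]:
    """Decide how the agent loop should treat this tool's output."""
    if tool.return_direct:
        return "always"  # stop after the call, send the output as is
    if "maybe_return_direct" in (tool.tags or []):
        return "maybe"  # let the LLM choose between processing and passing through
    return "never"  # regular tool, the LLM always processes the result


policies = {
    t.name: return_policy(t)
    for t in (
        ToolInfo("final_answer", return_direct=True),
        ToolInfo("read_document", tags=["maybe_return_direct"]),
        ToolInfo("copy_file"),
    )
}
```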
