Send PDF File as a ToolMessage

I need to create a tool that retrieves a PDF and returns it to the LLM.

Configuration

  • LLM Provider: ChatBedrock
  • Model: Anthropic Claude 3.5 Sonnet v2
  • Agent Type: React Agent

Currently, my tool looks like this:

import base64
from langchain_core.tools import tool


@tool
def get_pdf():
    """
    Search on PDF information
    """
    pdf_path = "/sample/sample"

    with open(pdf_path, "rb") as pdf_file:
        pdf_data = base64.b64encode(pdf_file.read()).decode("utf-8")

    return [
        {
            "type": "file",
            "source_type": "base64",
            "data": pdf_data,
            "mime_type": "application/pdf",
        },
    ]

Problem

When running this, I get the following error:

{"error":"An error occurred (ValidationException) when calling the InvokeModel operation: Input is too long for requested model."}

This makes me think that the ToolMessage is serializing the base64 PDF as plain text instead of sending it as a proper file attachment.
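
For reference, here is a quick sanity check (using the same path as above) of how much the inlined payload grows. Base64 alone inflates the file by about a third, and every character counts against the model's input limit if it is sent as plain text:

import base64
from pathlib import Path

# How big does the inlined payload actually get?
pdf_bytes = Path("/sample/sample").read_bytes()
b64 = base64.b64encode(pdf_bytes).decode("utf-8")
print(f"raw: {len(pdf_bytes):,} bytes, base64: {len(b64):,} chars")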

I am also having this issue. Any insights from LangChain’s team?

Hi @rst.monzani

It looks like the provider is counting the inlined base64 payload against the model’s input limit, so a file that large can’t be sent this way. Instead of inlining the bytes, prefer a URL or a provider-managed file ID:

import base64
from pathlib import Path

from langchain_core.tools import tool


@tool
def list_pdfs() -> list[dict]:
    """
    Return PDFs as valid ToolMessage content blocks.
    Prefer a URL or a provider file ID to avoid large payloads.
    """
    # Example 1: URL-based file (best for large files).
    # The provider fetches the PDF itself, so it never inflates your prompt.
    pdf_url_block = {
        "type": "file",
        "source_type": "url",
        "url": "https://example.com/sample.pdf",
    }

    # Example 2: Provider-managed file (Anthropic/OpenAI file ID).
    # Only use this if you've uploaded the file via the provider's Files API.
    provider_file_block = {
        "type": "file",
        "source_type": "id",
        "id": "file_abc123",  # e.g., an Anthropic/OpenAI file ID
    }

    # Example 3: Small local file inlined as base64 (avoid for big PDFs).
    # NOTE: base64 inflates the payload by ~33%, and every character can
    # count toward the model's input limit, which is what you're hitting.
    local_path = Path("/path/to/small.pdf")
    b64 = base64.b64encode(local_path.read_bytes()).decode("utf-8")
    base64_file_block = {
        "type": "file",
        "source_type": "base64",
        "data": b64,
        "mime_type": "application/pdf",
    }

    return [pdf_url_block, provider_file_block, base64_file_block]
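
For completeness, here’s a minimal sketch of wiring the tool into a React agent. The Bedrock model ID is illustrative, and I’m using ChatBedrockConverse rather than the legacy ChatBedrock, since the Converse API has native support for document content:

from langchain_aws import ChatBedrockConverse
from langgraph.prebuilt import create_react_agent

# Illustrative model ID; use whichever one your Bedrock account exposes.
llm = ChatBedrockConverse(model="anthropic.claude-3-5-sonnet-20241022-v2:0")
agent = create_react_agent(llm, [list_pdfs])

result = agent.invoke({"messages": [("user", "Summarize the PDF")]})
print(result["messages"][-1].content)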

If the PDF is still too large to send directly, split it into chunks, embed them, and use vector search so that the tool returns only the chunks relevant to the query (see the sketch below).
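
Here’s a minimal sketch of that approach, assuming langchain-aws, langchain-community, and pypdf are installed; the embedding model ID and PDF path are placeholders:

from langchain_aws import BedrockEmbeddings
from langchain_community.document_loaders import PyPDFLoader
from langchain_core.tools import tool
from langchain_core.vectorstores import InMemoryVectorStore
from langchain_text_splitters import RecursiveCharacterTextSplitter

# Placeholder embedding model ID; adjust to your setup.
embeddings = BedrockEmbeddings(model_id="amazon.titan-embed-text-v2:0")

# Load the PDF and split it into overlapping text chunks.
docs = PyPDFLoader("/sample/sample").load()
splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=100)
chunks = splitter.split_documents(docs)

# Index the chunks once at startup.
vector_store = InMemoryVectorStore.from_documents(chunks, embeddings)


@tool
def search_pdf(query: str) -> str:
    """
    Search the PDF and return only the chunks relevant to the query.
    """
    results = vector_store.similarity_search(query, k=4)
    return "\n\n".join(doc.page_content for doc in results)

Because the tool now returns a few short text chunks instead of the whole file, the payload stays well under the model’s input limit.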