Model-agnostic multimodal LLM call

Hi all,

I’m experimenting with LangGraph for a simple prototype where the graph has just one node that calls an LLM. The wrinkle is that the LLM call is not a trivial text prompt:

  • The input includes a PDF file that must be uploaded (and referenced) in the model call.

Right now I’m handling this by calling the OpenAI Responses API directly inside the node. This works fine, but it makes my graph tightly coupled to OpenAI. I would like to make the workflow model-agnostic so that later I can swap in Anthropic, Gemini, or other providers without rewriting the graph logic.

My questions:

  1. Does LangGraph provide any built-in functions or abstractions to handle multimodal message objects (text + image/file), or is the expectation that developers wrap each provider’s SDK/API in their own adapter nodes?

  2. If the latter, is the recommended pattern to:

    • Define a neutral internal schema for messages (e.g., {type: "text" | "image" | "file", content: ...})

    • Then write per-provider adapters that map this neutral schema to the provider’s required structure? (There’s a rough sketch of what I mean after this list.)

  3. Are there any examples of LangGraph projects where multimodal input is handled in a provider-agnostic way, so that graph nodes remain portable across OpenAI, Anthropic, Gemini, etc.?
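To make question 2 concrete, this is roughly the kind of adapter layer I have in mind. Everything here is illustrative: the `Part` schema and `to_openai_responses` are my own invented names, only the OpenAI mapping is sketched, and the Responses API field names are written from memory and may need checking.

```python
from dataclasses import dataclass
from typing import Literal, Optional

# Hypothetical neutral message part -- not a LangGraph or LangChain type.
@dataclass
class Part:
    type: Literal["text", "image", "file"]
    content: str                 # plain text, or base64 data for image/file
    mime_type: Optional[str] = None

def to_openai_responses(parts: list[Part]) -> list[dict]:
    """Map the neutral schema to OpenAI Responses API input items (illustrative)."""
    items = []
    for p in parts:
        if p.type == "text":
            items.append({"type": "input_text", "text": p.content})
        elif p.type == "image":
            items.append({
                "type": "input_image",
                "image_url": f"data:{p.mime_type};base64,{p.content}",
            })
        elif p.type == "file":
            items.append({
                "type": "input_file",
                "filename": "document.pdf",
                "file_data": f"data:{p.mime_type};base64,{p.content}",
            })
    return items

# A second adapter (to_anthropic, to_gemini, ...) would map the same Parts
# to that provider's structure, so the graph node itself stays unchanged.
```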

Thanks a lot for any guidance!

Hello, thanks for this question.

If you’re using LangChain chat models inside your LangGraph nodes, LangChain will take care of this for you: chat models accept a provider-agnostic multimodal content-block format and translate it to each provider’s native API. See this guide.
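Roughly, the pattern looks like this. This is a minimal sketch, not a drop-in solution: the model string and file path are placeholders, and the exact content-block keys are the ones documented in the guide.

```python
import base64

from langchain.chat_models import init_chat_model

# Placeholder model; any chat model that supports PDF input can be swapped in
# (e.g. an Anthropic or Gemini model) without changing the graph node.
llm = init_chat_model("openai:gpt-4o")

with open("report.pdf", "rb") as f:
    pdf_b64 = base64.b64encode(f.read()).decode()

message = {
    "role": "user",
    "content": [
        {"type": "text", "text": "Summarize this document."},
        {
            # Provider-agnostic file content block; LangChain converts it to
            # the provider's native format under the hood.
            "type": "file",
            "source_type": "base64",
            "data": pdf_b64,
            "mime_type": "application/pdf",
        },
    ],
}

response = llm.invoke([message])
print(response.content)
```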

Note that a minor change to this format is planned for the upcoming v1.0 release (although the v0.3.x format will continue to work in 1.0). You can reference the associated v1.0 docs here. 1.0 will be released this month, and alpha releases are available now.

Thanks!

I also realized that when I upload the PDF, only its text layer appears to be considered; the visual content of the pages is not. Do you have more info about this? Do I need to render the PDF pages to images and provide both the PDF and the images?
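To clarify what I mean by that last part, here is roughly what I would try. It's just a sketch under my own assumptions: I'm using pdf2image (which needs Poppler installed) purely as an example rasterizer, and the image block keys follow the guide linked above.

```python
import base64
import io

from pdf2image import convert_from_path  # requires Poppler to be installed

# Rasterize each PDF page so the model also sees the visual layout,
# not just the extracted text layer.
pages = convert_from_path("report.pdf", dpi=150)

image_blocks = []
for page in pages:
    buf = io.BytesIO()
    page.save(buf, format="PNG")
    image_blocks.append({
        "type": "image",
        "source_type": "base64",
        "data": base64.b64encode(buf.getvalue()).decode(),
        "mime_type": "image/png",
    })

# These image blocks would go into the same message content list,
# alongside (or instead of) the PDF file block.
```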