Inconsistent image_url Formats in ChatOpenAI vs. ChatGoogleGenerativeAI

I’ve noticed that ChatOpenAI() only accepts an object for image_url input, whereas ChatGoogleGenerativeAI() also accepts a plain string. I’d like to clarify which payload shape is intended to be “standard” and whether this difference should be raised as a bug report or a feature request.

ChatOpenAI (Object-Only)

// Accepted by ChatOpenAI; the same URL passed as a plain string fails:
{"type": "image_url", "image_url": { "url": "https://example.com/foo.png" }}

ChatGoogleGenerativeAI (String OK)

// This works for Google but not OpenAI:
{"type": "image_url", "image_url": "https://example.com/foo.png"}

I’d welcome any insight into whether this divergence is by design or an oversight. If it’s better handled as a bug report or a spec enhancement, please let me know. Thanks in advance for your guidance!

Hi @p-karunya, thanks for posting this.

The first format is OpenAI’s chat completions format and is accepted by all models that support image inputs.

It looks like ChatGoogleGenerativeAI implemented support for image_url passed as a string as a convenience, but this is not standard.
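
If you want to keep one payload shape across providers in the meantime, a small normalization step before building the message may help. This is only a sketch and not part of LangChain: normalize_image_block is a hypothetical helper name, and it assumes content blocks are plain dicts shaped like the two examples above.

```python
from typing import Any


def normalize_image_block(block: dict[str, Any]) -> dict[str, Any]:
    """Return the block with image_url in the object form {"url": ...}.

    Hypothetical helper, not a LangChain API. Blocks that are not of type
    "image_url", or that already use the object form, are returned unchanged.
    """
    if block.get("type") != "image_url":
        return block
    image_url = block.get("image_url")
    if isinstance(image_url, str):
        # String shorthand: wrap the URL in the object form.
        return {**block, "image_url": {"url": image_url}}
    return block


# Example content list mixing a text block and the string shorthand.
content = [
    {"type": "text", "text": "Describe this image."},
    {"type": "image_url", "image_url": "https://example.com/foo.png"},
]
normalized = [normalize_image_block(b) for b in content]
```

After this step every image_url block uses the object form from the first example, so the same content list can be handed to either model class.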

See this guide for more on working with multi-modal inputs in LangChain.