Hey!
I’m building a conversational AI agent where users can attach files to their messages. Right now I’m using HumanMessage with a content array (type: "text" for regular messages and type: "file" for attachments).
The issue I’m running into is that OpenAI’s Chat Completions API seems to accept only application/pdf in the "file" content block - anything else (.docx, .xlsx, .txt, etc.) returns an error like this:
Invalid file data: 'messages[1].content[1].file.file_data'. Expected a base64-encoded data URL with an application/pdf MIME type, but got unsupported MIME type 'text/plain'.
I’m curious how others handle non-PDF files in practice. Do you typically convert them to PDF, or just extract the text? If you extract the text, how do you handle it on the UI side so it’s clear what came from the user versus what came from a file - do you rely on metadata, or some other approach?
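To make the "extract the text" option concrete, this is roughly what I'm picturing. It's only a sketch under some assumptions: attachment_to_block and build_content are names I made up, the <attachment> wrapper is just one possible way to mark file text, and only PDFs and text-like files are handled (.docx and .xlsx would need a real extractor, which I've left out).

```python
import base64
import mimetypes

PDF_MIME = "application/pdf"


def attachment_to_block(filename: str, data: bytes) -> dict:
    """Turn one attachment into a content block (hypothetical helper)."""
    mime, _ = mimetypes.guess_type(filename)

    if mime == PDF_MIME:
        # PDFs can stay as a "file" block with a base64 data URL,
        # which is the shape the error message above asks for.
        encoded = base64.b64encode(data).decode("ascii")
        return {
            "type": "file",
            "file": {
                "filename": filename,
                "file_data": f"data:{PDF_MIME};base64,{encoded}",
            },
        }

    if mime is not None and mime.startswith("text/"):
        # Text-like files are inlined as a "text" block. The wrapper tag is
        # an assumed convention so the model can tell file text from user text.
        text = data.decode("utf-8", errors="replace")
        return {
            "type": "text",
            "text": f'<attachment filename="{filename}">\n{text}\n</attachment>',
        }

    # Anything else (.docx, .xlsx, ...) needs a dedicated extractor first.
    raise ValueError(f"no extractor for {filename!r} (guessed MIME type: {mime})")


def build_content(user_text: str, attachments: list[tuple[str, bytes]]) -> list[dict]:
    """Build the content array: the user's own text first, then one block per file."""
    blocks = [{"type": "text", "text": user_text}]
    blocks.extend(attachment_to_block(name, data) for name, data in attachments)
    return blocks
```

The list returned by build_content is what I'd pass as the content of the HumanMessage. Since the user's own text is always the first block and file text is always wrapped, the UI could show only the first block plus a list of filenames, but I'm not sure whether that's better than keeping the filenames in separate metadata.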
Thanks in advance!