Problem:
- The output dimensionality cannot be specified through the wrapper. We must hard-code DB schemas (e.g., pgvector column dimensions) around the provider's default and hope it doesn't change (see the sketch after this list).
- Risk of silent breaking changes. If Google increases the default dimensionality, our pipelines fail because the DB rejects vectors of the wrong length.
- Cost & performance. Many tasks don't need 3072-dim vectors; smaller dimensions reduce storage and retrieval latency with minimal quality loss (per Google's docs).
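
To make the schema coupling concrete, here is a minimal sketch of the failure mode (assuming a reachable Postgres instance with the pgvector extension and the `pg` client; the `docs` table and the `DATABASE_URL` env var are made up for illustration):

```typescript
import { Pool } from "pg";

const pool = new Pool({ connectionString: process.env.DATABASE_URL });

async function demo() {
  // The vector column dimension must be hard-coded at table-creation time.
  await pool.query("CREATE EXTENSION IF NOT EXISTS vector");
  await pool.query(
    "CREATE TABLE IF NOT EXISTS docs (id serial PRIMARY KEY, embedding vector(768))"
  );

  // Inserting a 3072-dim vector (the current Gemini default) is rejected
  // with an error like: "expected 768 dimensions, not 3072"
  const wrongSize = Array.from({ length: 3072 }, () => 0);
  await pool.query("INSERT INTO docs (embedding) VALUES ($1)", [
    `[${wrongSize.join(",")}]`,
  ]);
}

demo().catch(console.error).finally(() => pool.end());
```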
Today’s behavior:
```typescript
import { GoogleGenerativeAIEmbeddings } from "@langchain/google-genai";

const emb = new GoogleGenerativeAIEmbeddings({
  model: "models/gemini-embedding-001",
  // no way to set outputDimensionality
});

const v = await emb.embedQuery("What is the meaning of life?");
// Returns the provider default size (currently 3072)
```
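
For comparison, what this issue is asking for might look like the following (note: `outputDimensionality` as a constructor option is a proposed/hypothetical parameter, not something the wrapper currently accepts):

```typescript
import { GoogleGenerativeAIEmbeddings } from "@langchain/google-genai";

// Proposed usage: `outputDimensionality` is hypothetical / not yet supported.
const emb = new GoogleGenerativeAIEmbeddings({
  model: "models/gemini-embedding-001",
  outputDimensionality: 768, // would be forwarded to the embedContent request
});

const v = await emb.embedQuery("What is the meaning of life?");
console.log(v.length); // expected: 768
```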
Behavior from the official Google docs (https://ai.google.dev/gemini-api/docs/embeddings#javascript_3):
```javascript
import { GoogleGenAI } from "@google/genai";

async function main() {
  const ai = new GoogleGenAI({});

  const response = await ai.models.embedContent({
    model: "gemini-embedding-001",
    contents: "What is the meaning of life?",
    config: { outputDimensionality: 768 },
  });

  const embeddingLength = response.embeddings[0].values.length;
  console.log(`Length of embedding: ${embeddingLength}`);
}

main();
```
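
Until the wrapper supports this, a possible interim workaround is to call `@google/genai` directly and expose it through LangChain's `Embeddings` interface. This is only a sketch under assumptions: the `GeminiEmbeddingsWithDims` class name is made up, and it assumes a `GOOGLE_API_KEY` env var plus the `Embeddings` base class from `@langchain/core/embeddings`:

```typescript
import { GoogleGenAI } from "@google/genai";
import { Embeddings } from "@langchain/core/embeddings";

// Hypothetical wrapper: forwards outputDimensionality to the Gemini API.
class GeminiEmbeddingsWithDims extends Embeddings {
  private ai = new GoogleGenAI({ apiKey: process.env.GOOGLE_API_KEY });

  constructor(private model: string, private outputDimensionality: number) {
    super({});
  }

  async embedQuery(text: string): Promise<number[]> {
    const res = await this.ai.models.embedContent({
      model: this.model,
      contents: text,
      config: { outputDimensionality: this.outputDimensionality },
    });
    return res.embeddings?.[0]?.values ?? [];
  }

  async embedDocuments(texts: string[]): Promise<number[][]> {
    const res = await this.ai.models.embedContent({
      model: this.model,
      contents: texts,
      config: { outputDimensionality: this.outputDimensionality },
    });
    return (res.embeddings ?? []).map((e) => e.values ?? []);
  }
}

const emb = new GeminiEmbeddingsWithDims("gemini-embedding-001", 768);
const v = await emb.embedQuery("What is the meaning of life?");
console.log(v.length); // 768
```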