Adding live voice (Google GenAI Live API) to a LangChain JS + MongoDB Atlas Vector Search RAG app
Hello everyone!
I have built a RAG application using LangChain JS + MongoDB Atlas Vector Search, and everything works perfectly for text chat.
Now I want to add real-time voice (live audio) conversation using the Google GenAI Live WebSocket API, but I am unsure how to integrate it with LangChain's RAG retriever.
Below is my full implementation and what I want to achieve.
My Current Setup (Working)
Technologies:
- LangChain JS
- MongoDB Atlas Vector Search
- Google Generative AI (embeddings + chat model)
- Express backend
- SSE streaming for chat responses
- PDF ingestion & chunking
My Full Code (Current Working RAG Setup)
Server Setup
import express from "express";
import cors from "cors";
import dotenv from "dotenv";
import multer from "multer";
import path from "path";
import fs from "fs";
import { v4 as uuidv4 } from "uuid";
import { MongoClient } from "mongodb";
import { RecursiveCharacterTextSplitter } from "@langchain/textsplitters";
import {
  GoogleGenerativeAIEmbeddings,
  ChatGoogleGenerativeAI,
} from "@langchain/google-genai";
import { MongoDBAtlasVectorSearch } from "@langchain/mongodb";
import { PDFLoader } from "@langchain/community/document_loaders/fs/pdf";
import { WebSocketServer } from "ws";
import { GoogleGenAI, Modality } from "@google/genai";
dotenv.config();
const app = express();
const PORT = process.env.PORT || 8000;
// ==========================
// Middleware
// ==========================
app.use(cors());
app.use(express.json());
app.use(express.urlencoded({ extended: true }));
const upload = multer({ dest: "uploads/" });
// ==========================
// MongoDB & Model Setup
// ==========================
const MONGO_URI = process.env.MONGO_URI;
const DB_NAME = process.env.MONGO_DB_NAME || "rag_demo";
const COLLECTION_NAME = process.env.MONGO_COLLECTION || "knowledge_base";
const embeddings = new GoogleGenerativeAIEmbeddings({
  model: "text-embedding-004",
  apiKey: process.env.GOOGLE_API_KEY,
});
const chatModel = new ChatGoogleGenerativeAI({
  model: "gemini-2.0-flash",
  apiKey: process.env.GOOGLE_API_KEY,
  streaming: true,
});
Upload API (PDF → Chunks → Embeddings → MongoDB)
app.post("/api/upload-file", upload.single("file"), async (req, res) => {
  const { file } = req;
  const { name } = req.body;
  if (!file) return res.status(400).json({ error: "File is required" });
  if (!name) return res.status(400).json({ error: "Name is required" });
  const ext = path.extname(file.originalname).toLowerCase();
  const fileId = uuidv4();
  try {
    if (ext !== ".pdf") {
      return res.status(400).json({ error: "Only PDF files are supported" });
    }
    const loader = new PDFLoader(file.path);
    const rawDocs = await loader.load();
    const splitter = new RecursiveCharacterTextSplitter({
      chunkSize: 1500,
      chunkOverlap: 200,
    });
    const splitDocs = await splitter.splitDocuments(rawDocs);
    const documents = splitDocs.map((doc, index) => ({
      ...doc,
      metadata: {
        fileId,
        name,
        fileName: file.originalname,
        chunkIndex: index,
        uploadedAt: new Date(),
      },
    }));
    const client = new MongoClient(MONGO_URI);
    await client.connect();
    const db = client.db(DB_NAME);
    const collection = db.collection(COLLECTION_NAME);
    const vectorStore = new MongoDBAtlasVectorSearch(embeddings, {
      collection,
      indexName: "vector_index",
      textKey: "pageContent",
      embeddingKey: "embedding",
    });
    await vectorStore.addDocuments(documents);
    await client.close();
    fs.unlinkSync(file.path);
    res.status(201).json({
      message: "File uploaded & vectors stored in MongoDB Atlas",
      fileId,
      totalChunks: documents.length,
    });
  } catch (err) {
    console.error("Upload error:", err);
    res.status(500).json({ error: err.message });
  }
});
Chat API (RAG + SSE Streaming)
app.post("/api/chat-stream", async (req, res) => {
  const { question } = req.body;
  if (!question) return res.status(400).json({ error: "Question is required" });
  // Create the client outside try so it can always be closed in finally
  // (previously it leaked on errors because close() was only on the happy path).
  const client = new MongoClient(MONGO_URI);
  try {
    res.setHeader("Content-Type", "text/event-stream");
    res.setHeader("Cache-Control", "no-cache");
    res.setHeader("Connection", "keep-alive");
    await client.connect();
    const db = client.db(DB_NAME);
    const collection = db.collection(COLLECTION_NAME);
    const vectorStore = new MongoDBAtlasVectorSearch(embeddings, {
      collection,
      indexName: "vector_index",
      textKey: "pageContent",
      embeddingKey: "embedding",
    });
    const results = await vectorStore.similaritySearch(question, 5);
    const context = results.map((r) => r.pageContent).join("\n---\n");
    const prompt = `
You are a helpful AI assistant. Use the context below to answer the user's question.
If the answer is not in the context, say "I don't know".
Context:
${context}
Question:
${question}
`;
    const stream = await chatModel.stream(prompt);
    for await (const chunk of stream) {
      const token = chunk?.content ?? "";
      if (token.length > 0) {
        res.write(`data: ${JSON.stringify({ type: "text", data: token })}\n\n`);
      }
    }
    res.write(`data: ${JSON.stringify({ type: "complete" })}\n\n`);
    res.end();
  } catch (err) {
    console.error("Chat streaming error:", err);
    res.write(`data: ${JSON.stringify({ type: "error", data: err.message })}\n\n`);
    res.end();
  } finally {
    await client.close();
  }
});
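For completeness, this is roughly how I consume the stream on the client side. The parser below is a minimal sketch (the function name is my own, not from any library); it just splits SSE frames and extracts the JSON payloads written by the endpoint above:

```javascript
// Minimal SSE frame parser (my own sketch). Takes a raw chunk of the
// event stream and returns the parsed JSON payloads of each "data:" frame.
function parseSseChunk(raw) {
  return raw
    .split("\n\n")                       // SSE events are separated by a blank line
    .map((frame) => frame.trim())
    .filter((frame) => frame.startsWith("data: "))
    .map((frame) => JSON.parse(frame.slice("data: ".length)));
}

// Example: two frames exactly as written by /api/chat-stream.
const events = parseSseChunk(
  'data: {"type":"text","data":"Hello"}\n\ndata: {"type":"complete"}\n\n'
);
// events[0].data === "Hello", events[1].type === "complete"
```

In a real client this would feed from `fetch` + `ReadableStream`, buffering partial frames between reads.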
Now My Actual Question: How Do I Do RAG in a Live Voice Session?
Google GenAI provides real-time bidirectional audio streaming using WebSockets:
wss://generativelanguage.googleapis.com/ws/google.ai.generativelanguage.v1beta.GenerativeService.BidiGenerateContent
Using @google/genai, we can do:
const session = await client.live.connect({
  model: "gemini-2.0-flash-live",
});
This gives:
- Real-time microphone streaming
- Real-time transcript
- Real-time audio response
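What I am imagining is: take the user's transcript from the live session, run the same MongoDB vector search I already use for chat, and push the retrieved context back into the session as a text turn (I believe `session.sendClientContent()` is the right call in `@google/genai`, but I have not verified this). A rough sketch — `buildGroundedTurn` is my own helper, and the session wiring in the comments is untested:

```javascript
// My own helper (not from LangChain): turn retrieved docs plus the spoken
// question into a single grounding turn for the live session.
function buildGroundedTurn(docs, transcript) {
  const context = docs.map((d) => d.pageContent).join("\n---\n");
  return [
    "Use the context below to answer the user's question.",
    'If the answer is not in the context, say "I don\'t know".',
    "Context:",
    context,
    "Question:",
    transcript,
  ].join("\n");
}

// Untested wiring I have in mind (assumes `vectorStore` and `session` from above):
//   const docs = await vectorStore.similaritySearch(transcript, 5);
//   session.sendClientContent({
//     turns: [{ role: "user", parts: [{ text: buildGroundedTurn(docs, transcript) }] }],
//     turnComplete: true,
//   });

const turn = buildGroundedTurn(
  [{ pageContent: "Paris is the capital of France." }],
  "What is the capital of France?"
);
// `turn` contains both the retrieved context chunk and the question
```

Is something like this the intended pattern, or does LangChain JS offer a retriever hook for live sessions?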
My Question to the LangChain Community
Does LangChain JS support RAG inside Live Voice (WebSocket) sessions?
Specifically: