Checked
- I searched existing ideas and did not find a similar one
- I added a very descriptive title
- I’ve clearly described the feature request and motivation for it
Feature request
Add a WhatsApp Chat Loader to LangChain.js, matching the functionality available in LangChain Python.
Motivation
WhatsApp is one of the most widely used messaging platforms globally (2B+ users), and exported chat histories are a common data source for conversational AI and RAG applications. The Python implementation exists, but the JS/TS version doesn't, leaving a gap for developers who need this functionality.
Why this matters:
- Feature parity with Python implementation - ensures consistency across the LangChain ecosystem
- High demand - WhatsApp chat analysis is a frequent use case for conversation history, customer support analysis, and knowledge extraction
- Simple implementation - Text file parsing only, no external API dependencies or authentication required
- Low maintenance burden - Standard .txt export format with well-defined structure
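For context, a hypothetical "export chat (without media)" file looks roughly like the sample below. This is illustrative only; the exact date/time formatting varies by device locale and 12/24-hour settings.

```typescript
// Illustrative sample of a WhatsApp text export (not real data).
// Depending on locale, timestamps may instead look like "15.01.24, 21:02"
// (24-hour, no AM/PM). Note the multi-line second message and the
// "<Media omitted>" placeholder WhatsApp inserts for attachments.
const sampleExport = `
1/15/24, 9:02 AM - Alice: Hey, are we still on for today?
1/15/24, 9:03 AM - Bob: Yes! 3 pm works for me.
It might run a bit long, though.
1/15/24, 9:10 AM - Alice: <Media omitted>
`;
```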
Reference:
- Python implementation: WhatsAppChatLoader (LangChain Python documentation)
Proposal (If applicable)
Important discovery: This is actually a Chat Loader, not a Document Loader. It returns ChatSession objects with HumanMessage instances, preserving the conversational structure.
Implementation approach:
- Type: Chat Loader (not a Document Loader)
- Location: libs/langchain-community/src/chat_loaders/whatsapp.ts
- Entrypoint: @langchain/community/chat_loaders/whatsapp
- Input: Exported WhatsApp chat files (.txt format, without media)
- Output: ChatSession with HumanMessage objects containing sender and timestamp metadata
- Features:
- Multi-line message support
- System message filtering (deleted messages, media omitted, etc.)
- Timestamp and sender extraction
- Support for both 12-hour (AM/PM) and 24-hour formats
- Dependencies: None (pure text parsing with regex; see the sketch after this list)
- Tests: Integration tests with sample chat exports
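A minimal sketch of what such a loader could look like, assuming a line-oriented regex parse. Everything here is an assumption rather than a settled API: the regex, the system-message patterns, the metadata keys (sender and timestamp in additional_kwargs), and the locally defined ChatSession interface are placeholders; a real contribution would reuse or extend whatever chat loader abstractions already exist in @langchain/community.

```typescript
import * as fs from "node:fs/promises";
import { HumanMessage } from "@langchain/core/messages";

// Local stand-in for the session type; a real contribution would reuse the
// shared ChatSession type where langchainjs defines one.
interface ChatSession {
  messages: HumanMessage[];
  metadata?: Record<string, unknown>;
}

// Start of a message, e.g. "1/15/24, 9:02 AM - Alice: Hey" (12-hour) or
// "15.01.24, 21:02 - Alice: Hey" (24-hour). Illustrative, not exhaustive.
const MESSAGE_START =
  /^(\d{1,4}[./-]\d{1,2}[./-]\d{1,4},?\s+\d{1,2}:\d{2}(?::\d{2})?(?:\s?[AaPp]\.?[Mm]\.?)?)\s-\s([^:]+):\s?(.*)$/;

// Placeholder lines WhatsApp itself inserts rather than a participant.
const SYSTEM_PATTERNS = [/<Media omitted>/i, /this message was deleted/i];

export class WhatsAppChatLoader {
  constructor(private readonly filePath: string) {}

  async load(): Promise<ChatSession[]> {
    const raw = await fs.readFile(this.filePath, "utf8");
    const messages: HumanMessage[] = [];

    for (const line of raw.split(/\r?\n/)) {
      const match = line.match(MESSAGE_START);
      if (match) {
        const [, timestamp, sender, text] = match;
        if (SYSTEM_PATTERNS.some((p) => p.test(text))) continue;
        messages.push(
          new HumanMessage({
            content: text,
            // Assumed metadata layout; the Python loader stores similar
            // information in additional_kwargs.
            additional_kwargs: { sender, timestamp },
          })
        );
      } else if (line.trim() && messages.length > 0) {
        // A line with no timestamp prefix continues the previous message.
        const last = messages[messages.length - 1];
        last.content = `${last.content}\n${line}`;
      }
    }

    // One exported file is one conversation, hence a single ChatSession.
    return [{ messages }];
  }
}
```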
Example usage:
```typescript
import { WhatsAppChatLoader } from "@langchain/community/chat_loaders/whatsapp";

const loader = new WhatsAppChatLoader("path/to/chat.txt");
const chatSessions = await loader.load();
// Returns ChatSession with HumanMessage objects
```
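Assuming the metadata layout from the sketch above (sender and timestamp in additional_kwargs; the exact keys are still open), consuming the result might look like:

```typescript
for (const session of chatSessions) {
  for (const message of session.messages) {
    // Hypothetical metadata keys, matching the sketch above.
    const { sender, timestamp } = message.additional_kwargs;
    console.log(`[${timestamp}] ${sender}: ${message.content}`);
  }
}
```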
Question: Given the simplicity of this loader and low maintenance requirements, would it be acceptable to contribute this directly to @langchain/community, or should this be published as a standalone NPM package per the current integration guidelines? I’m ready to implement this with full tests and documentation if this is welcome in the main repository.