[Question] Best practices for using a local embedding model (e.g., Hugging Face) in LangChain?

Hi everyone,

I am currently building a RAG application with LangChain. Because of data-privacy and cost constraints, I need to use a locally hosted embedding model instead of relying on external APIs such as OpenAI's.

I have read through the official documentation and searched this forum, but I couldn't find a clear guide or discussion on how to properly configure a local model for this purpose.

My Goal: I want to load a model (e.g., Qwen3-Embedding-0.6B or a custom local path) to generate embeddings for my documents.
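To make the goal concrete, here is the shape of the interface I expect to end up with. This is a minimal sketch using a hypothetical stand-in class (not a real model) purely to illustrate the embed_documents/embed_query contract I want a local model to satisfy:

```python
# Hypothetical stand-in for a local embedding model, just to show
# the contract I'm after: one fixed-size vector per input text.
class FakeLocalEmbeddings:
    def __init__(self, dim: int = 8):
        self.dim = dim

    def embed_documents(self, texts):
        # Returns one dim-sized vector per document (dummy values here).
        return [[float(len(t) % 7)] * self.dim for t in texts]

    def embed_query(self, text):
        # A query is embedded the same way as a single document.
        return self.embed_documents([text])[0]

emb = FakeLocalEmbeddings()
vecs = emb.embed_documents(["hello", "world, longer text"])
print(len(vecs), len(vecs[0]))  # prints: 2 8
```

Whatever real wrapper class I use, this is the behavior my retrieval pipeline depends on.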

The Problem: I am unsure which class is best suited for this or how to point LangChain to my local model weights.

Current Approach (Pseudocode):

Python

# I am trying to achieve something like this:
from langchain_community.embeddings import HuggingFaceEmbeddings

# Is this the correct way to point to a local path?
embeddings = HuggingFaceEmbeddings(
    model_name="/path/to/my/local/model",
    model_kwargs={'device': 'cpu'}
)
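Once I do get vectors back, my plan is to sanity-check them with a plain cosine similarity before wiring anything into the vector store. A minimal dependency-free sketch of that check:

```python
import math

def cosine_similarity(a, b):
    # Standard cosine similarity between two embedding vectors.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Identical vectors should score 1.0, orthogonal vectors 0.0.
print(cosine_similarity([1.0, 0.0], [1.0, 0.0]))  # prints: 1.0
print(cosine_similarity([1.0, 0.0], [0.0, 1.0]))  # prints: 0.0
```

If similar sentences don't score noticeably higher than unrelated ones under this check, I'll know the model loading (rather than the retrieval code) is the problem.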

Could anyone share a working code snippet, or point me to the correct wrapper class for running local embeddings efficiently?

Any help would be greatly appreciated!

Thanks.