[Question] Best practices for using a local embedding model (e.g., Hugging Face) in LangChain?

Hi everyone,

I am currently building a RAG application with LangChain. Because of data-privacy and cost constraints, I need to use a locally hosted embedding model instead of relying on external APIs such as OpenAI's.

I have read through the official documentation and searched this forum, but I couldn't find a clear guide or discussion on how to properly configure a local model for this purpose.

My Goal: I want to load a model (e.g., Qwen3-Embedding-0.6B or a custom local path) to generate embeddings for my documents.
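To make the goal concrete, here is the shape of the interface I expect to end up with. This is a minimal sketch using a hypothetical stand-in class (not a real model) purely to illustrate the embed_documents/embed_query contract I want a local model to satisfy:

```python
# Hypothetical stand-in for a local embedding model, just to show
# the contract I'm after: one fixed-size vector per input text.
class FakeLocalEmbeddings:
    def __init__(self, dim: int = 8):
        self.dim = dim

    def embed_documents(self, texts):
        # Returns one dim-sized vector per document (dummy values here).
        return [[float(len(t) % 7)] * self.dim for t in texts]

    def embed_query(self, text):
        # A query is embedded the same way as a single document.
        return self.embed_documents([text])[0]

emb = FakeLocalEmbeddings()
vecs = emb.embed_documents(["hello", "world, longer text"])
print(len(vecs), len(vecs[0]))  # prints: 2 8
```

Whatever real wrapper class I use, this is the behavior my retrieval pipeline depends on.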

The Problem: I am unsure which class is best suited for this or how to point LangChain to my local model weights.

Current Approach (Pseudocode):

Python

# I am trying to achieve something like this:
from langchain_community.embeddings import HuggingFaceEmbeddings

# Is this the correct way to point to a local path?
embeddings = HuggingFaceEmbeddings(
    model_name="/path/to/my/local/model",
    model_kwargs={'device': 'cpu'}
)
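Once I do get vectors back, my plan is to sanity-check them with a plain cosine similarity before wiring anything into the vector store. A minimal dependency-free sketch of that check:

```python
import math

def cosine_similarity(a, b):
    # Standard cosine similarity between two embedding vectors.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Identical vectors should score 1.0, orthogonal vectors 0.0.
print(cosine_similarity([1.0, 0.0], [1.0, 0.0]))  # prints: 1.0
print(cosine_similarity([1.0, 0.0], [0.0, 1.0]))  # prints: 0.0
```

If similar sentences don't score noticeably higher than unrelated ones under this check, I'll know the model loading (rather than the retrieval code) is the problem.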

Could anyone share a working code snippet, or point me to the correct wrapper class for running local embeddings efficiently?

Any help would be greatly appreciated!

Thanks.