How to Fix 'embedding dimension mismatch in production' in LangChain (Python)
If you’re seeing `ValueError: embedding dimension mismatch` in a LangChain Python app, it means the vector length produced by your embedding model does not match the vector length expected by your vector store index. This usually shows up in production after a model swap, an index migration, or when one service is still using old embeddings while another has moved on.
The failure is almost always deterministic: your documents were indexed with one embedding dimension, and your query path is now using another.
The Most Common Cause
The #1 cause is mixing embedding models with different output dimensions against the same vector store or index.
A common pattern is: you indexed documents with text-embedding-ada-002-style embeddings, then later switched to a newer model or a local embedding model without rebuilding the index. LangChain will happily call both models, but your backend will reject the mismatch.
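You can confirm the dimensions disagree by embedding the same string with both models. A quick check, assuming OpenAI API access (the lengths shown are the published defaults for these models):

```python
from langchain_openai import OpenAIEmbeddings

small = OpenAIEmbeddings(model="text-embedding-3-small")
large = OpenAIEmbeddings(model="text-embedding-3-large")

# The same text produces vectors of different lengths, so the two models
# cannot share one index.
print(len(small.embed_query("hello")))  # 1536
print(len(large.embed_query("hello")))  # 3072
```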
**Broken code**

```python
from langchain_openai import OpenAIEmbeddings
from langchain_chroma import Chroma

# Index was created earlier with a different embedding model
embeddings = OpenAIEmbeddings(model="text-embedding-3-large")  # 3072 dims

db = Chroma(
    collection_name="support_docs",
    persist_directory="./chroma_db",
    embedding_function=embeddings,
)

results = db.similarity_search("How do I reset my password?", k=4)
```

**Fixed code**

```python
from langchain_openai import OpenAIEmbeddings
from langchain_chroma import Chroma

# Use the same embedding model that was used to build the index
embeddings = OpenAIEmbeddings(model="text-embedding-3-small")  # example: 1536 dims

db = Chroma(
    collection_name="support_docs",
    persist_directory="./chroma_db",
    embedding_function=embeddings,
)

results = db.similarity_search("How do I reset my password?", k=4)
```
If you changed models intentionally, rebuild the index instead of reusing old vectors.
```python
# Rebuild flow: re-embed everything with the new model.
# `loader` and `text_splitter` are whatever document loader and splitter
# your ingestion pipeline already uses.
docs = loader.load()
splits = text_splitter.split_documents(docs)

db = Chroma.from_documents(
    documents=splits,
    embedding=embeddings,
    collection_name="support_docs",
    persist_directory="./chroma_db",
)
```
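If the old vectors live under the same collection name, drop the stale collection before running the rebuild above. A minimal sketch using Chroma's `delete_collection()`, reusing the collection settings from the earlier snippets:

```python
from langchain_chroma import Chroma

# Remove the collection that still holds old-dimension vectors so the
# rebuild starts from an empty index.
stale = Chroma(
    collection_name="support_docs",
    persist_directory="./chroma_db",
)
stale.delete_collection()
```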
The key rule: the same collection must use one embedding dimension for all stored vectors and queries.
Other Possible Causes
1) Query embeddings and document embeddings come from different classes
This happens when ingestion uses one class and retrieval uses another.
```python
# Broken: ingest with OpenAI embeddings, query with HuggingFace embeddings
from langchain_openai import OpenAIEmbeddings
from langchain_huggingface import HuggingFaceEmbeddings

doc_embeddings = OpenAIEmbeddings(model="text-embedding-3-small")
query_embeddings = HuggingFaceEmbeddings(model_name="sentence-transformers/all-MiniLM-L6-v2")

# Same index, different dimensions -> mismatch
```
Fix: use one embedding provider per index.
```python
embeddings = OpenAIEmbeddings(model="text-embedding-3-small")
# Use `embeddings` everywhere for this collection
```
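One way to enforce this is to build the embedding object in a single module and import it from both the ingestion job and the query service. A sketch; the module name `embedding_config.py` and the factory function are hypothetical, not part of LangChain:

```python
# embedding_config.py (hypothetical module): single source of truth for the
# embedding model used with the "support_docs" collection.
from langchain_openai import OpenAIEmbeddings

EMBEDDING_MODEL = "text-embedding-3-small"


def get_embeddings() -> OpenAIEmbeddings:
    # Ingestion and retrieval both call this, so they cannot drift apart.
    return OpenAIEmbeddings(model=EMBEDDING_MODEL)
```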
2) You pointed at an existing vector store built with old data
This is common in production when persist_directory, Pinecone namespace, Qdrant collection, or Weaviate class already contains vectors from a previous deployment.
```python
db = Chroma(
    collection_name="support_docs",
    persist_directory="./chroma_db",  # old vectors may already exist here
    embedding_function=embeddings,
)
```
Fix options:
- delete and rebuild the collection
- version the collection name
- version the namespace
```python
db = Chroma(
    collection_name="support_docs_v2",
    persist_directory="./chroma_db",
    embedding_function=embeddings,
)
```
3) Your vector store schema was created with a fixed dimension
Some stores enforce dimension at creation time. If you change models later, inserts fail immediately or searches fail later depending on backend behavior.
Conceptually: a collection created for 1536-dimensional vectors cannot accept 3072-dimensional vectors.
Fix: recreate the index with the new dimension. In Pinecone/Qdrant/Weaviate-like systems, this often means dropping and recreating the collection/index.
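For example, with a Qdrant backend (an assumption; Pinecone, Weaviate, and pgvector need the equivalent step in their own APIs), recreating the collection for a 3072-dimension model looks roughly like this:

```python
from qdrant_client import QdrantClient
from qdrant_client.models import Distance, VectorParams

client = QdrantClient(url="http://localhost:6333")

# Drop the collection built for the old dimension, then recreate it with
# the dimension of the new embedding model (3072 for text-embedding-3-large).
client.delete_collection(collection_name="support_docs")
client.create_collection(
    collection_name="support_docs",
    vectors_config=VectorParams(size=3072, distance=Distance.COSINE),
)
```

Then re-ingest all documents with the new model; the old vectors cannot be reused.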
4) Mixed environments are writing different embeddings to the same backend
This happens when staging and prod share infrastructure, or two workers run different app versions during rollout.
```yaml
# broken deployment pattern
service-a:
  EMBEDDING_MODEL: text-embedding-3-small
service-b:
  EMBEDDING_MODEL: all-MiniLM-L6-v2
```
Fix: lock model version in config and deploy atomically. Don’t let two services write to the same collection unless they share the exact same embedding config.
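A minimal way to honor that in application code is to resolve the model name from a single environment variable, so a misconfigured worker fails loudly instead of silently writing vectors of another dimension. A sketch, assuming the `EMBEDDING_MODEL` variable from the config above:

```python
import os

from langchain_openai import OpenAIEmbeddings

# Read the model name from one place; a KeyError at startup beats a
# dimension mismatch at query time.
embeddings = OpenAIEmbeddings(model=os.environ["EMBEDDING_MODEL"])
```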
How to Debug It
- Print the actual vector lengths: `vec = embeddings.embed_query("test"); print(len(vec))`. Compare this with what your vector store expects. If they differ, you found the bug (see the sketch below for a fuller check).
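For persistent stores you can also compare against a vector that is already in the index. A sketch for a Chroma-backed store like the one above, assuming the collection already holds at least one vector:

```python
from langchain_chroma import Chroma
from langchain_openai import OpenAIEmbeddings

embeddings = OpenAIEmbeddings(model="text-embedding-3-small")
db = Chroma(
    collection_name="support_docs",
    persist_directory="./chroma_db",
    embedding_function=embeddings,
)

# Dimension of a vector already stored in the collection.
stored = db.get(include=["embeddings"], limit=1)
stored_dim = len(stored["embeddings"][0])

# Dimension the current query path produces.
query_dim = len(embeddings.embed_query("test"))

print(f"stored: {stored_dim}, query: {query_dim}")  # these must be equal
```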
- Check what model built the existing index
  - Inspect deployment logs from ingestion jobs.
  - Check saved metadata if your backend stores it (see the sketch after this list).
  - Look for old `collection_name`, `namespace`, or `persist_directory` values.
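If you are on Chroma, collection metadata is one place that information may live, assuming your ingestion job recorded it there (for example via `collection_metadata={"embedding_model": ...}` at creation time; that key name is a hypothetical convention, not something LangChain writes for you):

```python
import chromadb

# Read collection metadata straight from the persistence directory.
client = chromadb.PersistentClient(path="./chroma_db")
collection = client.get_collection("support_docs")

print(collection.metadata)            # e.g. {"embedding_model": "text-embedding-3-small"}
print(collection.count(), "vectors")  # sanity check that the collection is populated
```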
- Verify ingestion and retrieval use the same class. Search your codebase for:
  - `OpenAIEmbeddings`
  - `HuggingFaceEmbeddings`
  - `OllamaEmbeddings`
  - `AzureOpenAIEmbeddings`

  If more than one class touches the same index, assume mismatch until proven otherwise.
- Reproduce against a clean index. Create a fresh test collection and insert one document, as sketched below. If that works, your production data/index is stale. If it still fails, your embedding configuration is inconsistent before storage even happens.
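A throwaway version of that test, using an in-memory Chroma collection (the collection name and sample document are arbitrary):

```python
from langchain_chroma import Chroma
from langchain_core.documents import Document
from langchain_openai import OpenAIEmbeddings

embeddings = OpenAIEmbeddings(model="text-embedding-3-small")

# No persist_directory, so nothing can be stale: every run starts empty.
test_db = Chroma.from_documents(
    documents=[Document(page_content="password reset test doc")],
    embedding=embeddings,
    collection_name="dimension_smoke_test",
)

print(test_db.similarity_search("password reset", k=1))
```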
Typical backend errors you may see include:
- `ValueError: Embedding dimension mismatch`
- `InvalidArgumentException: vector dimension does not match index dimension`
- `400 Bad Request: expected dimension X but got Y`
Prevention
- Version your indexes
  - Use names like `customer_docs_v1`, `customer_docs_v2`
  - Rebuild on every embedding model change
- Pin embedding models in config: `EMBEDDING_MODEL=text-embedding-3-small`, `VECTOR_COLLECTION=support_docs_v2`. Don’t hardcode this in multiple places.
- Add a startup check (see the sketch below). Validate that:
  - query embedding length matches the stored index dimension
  - ingestion and retrieval use the same provider/model combo
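A sketch of such a check, run once at service startup; the `VECTOR_INDEX_DIM` variable is a hypothetical config key for the dimension your index was built with:

```python
import os

from langchain_openai import OpenAIEmbeddings

# Fail fast if the configured model's output dimension does not match the
# dimension the index was built with.
expected_dim = int(os.environ["VECTOR_INDEX_DIM"])  # hypothetical config key
embeddings = OpenAIEmbeddings(model=os.environ["EMBEDDING_MODEL"])

actual_dim = len(embeddings.embed_query("startup dimension probe"))
if actual_dim != expected_dim:
    raise RuntimeError(
        f"Embedding dimension mismatch: model produces {actual_dim}, "
        f"index expects {expected_dim}. Rebuild the index or fix the config."
    )
```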
If you want one rule to keep in mind: an index is tied to an embedding dimension, not just to “embeddings” in general. Once that changes, treat it as a new index.
Keep learning
- The complete AI Agents Roadmap — my full 8-step breakdown
- Free: The AI Agent Starter Kit — PDF checklist + starter code
- Work with me — I build AI for banks and insurance companies
By Cyprian Aarons, AI Consultant at Topiax.