How to Fix 'embedding dimension mismatch in production' in LlamaIndex (Python)

By Cyprian Aarons · Updated 2026-04-22
Tags: embedding-dimension-mismatch-in-production, llamaindex, python

When you see ValueError: embedding dimension mismatch, LlamaIndex is telling you that the vectors already stored in your index were created with a different embedding size than the model you’re using now. This usually shows up in production after an embedding model swap, a redeploy, or when a persisted index gets loaded with a new Settings.embed_model.

The failure is not random. Your vector store expects one fixed dimension, and LlamaIndex is trying to insert or query with another.

The Most Common Cause

The #1 cause is this: you built the index with one embedding model, then later queried or added documents with a different one.

A common example is creating the index with text-embedding-ada-002 and later switching to text-embedding-3-large, or moving from one local model to another without rebuilding the persisted store.

Broken pattern: persist the index with one embedder, then load it with another.
Fixed pattern: keep the same embedder for that index, or rebuild the index.
# BROKEN
from llama_index.core import StorageContext, load_index_from_storage, Settings
from llama_index.embeddings.openai import OpenAIEmbedding

# Index was originally built with ada-002 (1536 dims)
Settings.embed_model = OpenAIEmbedding(model="text-embedding-3-large")  # 3072 dims

storage_context = StorageContext.from_defaults(persist_dir="./storage")
index = load_index_from_storage(storage_context)

query_engine = index.as_query_engine()
response = query_engine.query("What is our claims policy?")
# FIXED
from llama_index.core import StorageContext, load_index_from_storage, Settings
from llama_index.embeddings.openai import OpenAIEmbedding

# Use the same embedding model that was used when the index was created
Settings.embed_model = OpenAIEmbedding(model="text-embedding-ada-002")  # 1536 dims

storage_context = StorageContext.from_defaults(persist_dir="./storage")
index = load_index_from_storage(storage_context)

query_engine = index.as_query_engine()
response = query_engine.query("What is our claims policy?")

If you want to change embedding models, do not reuse the old persisted vector store. Rebuild the index from source documents so every vector has the same dimension.
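A minimal rebuild sketch, assuming your source documents live in a local ./source_docs folder (both directory paths here are placeholders for your own):

from llama_index.core import VectorStoreIndex, SimpleDirectoryReader, Settings
from llama_index.embeddings.openai import OpenAIEmbedding

# Switch models and rebuild from source in one step
Settings.embed_model = OpenAIEmbedding(model="text-embedding-3-large")  # 3072 dims

documents = SimpleDirectoryReader("./source_docs").load_data()
index = VectorStoreIndex.from_documents(documents)

# Persist to a fresh directory so old 1536-dim vectors are never mixed in
index.storage_context.persist(persist_dir="./storage_v2")

Once the rebuild completes, point load_index_from_storage at ./storage_v2 instead of the old directory.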

Other Possible Causes

1) You changed Settings.embed_model after building part of the index

This happens when app startup code sets a default embedder, but some ingestion job overrides it later.

from llama_index.core import Settings
from llama_index.embeddings.openai import OpenAIEmbedding

Settings.embed_model = OpenAIEmbedding(model="text-embedding-ada-002")  # 1536 dims
# build some nodes here

Settings.embed_model = OpenAIEmbedding(model="text-embedding-3-large")  # 3072 dims
# add more nodes here -> mismatch risk

Keep embedding configuration immutable per index lifecycle.
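One way to enforce that, sketched below, is to bind the embedder to the index at construction instead of relying on the mutable global; LlamaIndex accepts an embed_model argument for this, so later changes to Settings cannot affect the index (the ./docs path is a placeholder):

from llama_index.core import VectorStoreIndex, SimpleDirectoryReader
from llama_index.embeddings.openai import OpenAIEmbedding

# One embedder, fixed for this index's whole lifecycle
embed_model = OpenAIEmbedding(model="text-embedding-ada-002")

documents = SimpleDirectoryReader("./docs").load_data()
index = VectorStoreIndex.from_documents(documents, embed_model=embed_model)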

2) Your vector database already contains old embeddings

If your Pinecone, Qdrant, Weaviate, or Chroma collection was created earlier with a different dimension, LlamaIndex will fail when inserting new vectors.

# Example: the existing collection was created with dim=1536,
# but the new embedding model outputs dim=1024
from llama_index.core import StorageContext
from llama_index.vector_stores.pinecone import PineconeVectorStore

vector_store = PineconeVectorStore(index_name="support-index")
storage_context = StorageContext.from_defaults(vector_store=vector_store)

Fix by deleting and recreating the collection, or by using a separate namespace/index per embedding model version.
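Here is what the recreate path can look like with the v3+ pinecone client, assuming a serverless index; the index name, cloud, and region are placeholders:

from pinecone import Pinecone, ServerlessSpec

pc = Pinecone(api_key="...")  # load the key from your secret manager

# Size the new index for the new model instead of reusing the old one
pc.create_index(
    name="support-index-v2",
    dimension=1024,  # must equal the new embedder's output size
    metric="cosine",
    spec=ServerlessSpec(cloud="aws", region="us-east-1"),
)

Reindex into support-index-v2, then point PineconeVectorStore(index_name=...) at it.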

3) You mixed document embeddings and query embeddings from different models

This can happen if ingestion uses one model and retrieval uses another. The stored vectors may be valid, but query-time similarity search breaks because query vectors must match the stored dimension.

# Ingestion: OpenAIEmbedding(model="text-embedding-ada-002")
# Query-time: HuggingFaceEmbedding(model_name="BAAI/bge-small-en-v1.5")

Use one embedding provider/model pair end-to-end for a given vector store.
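A cheap way to enforce this is a startup guard; the sketch assumes you know the store's dimension (1536 here) from its metadata:

from llama_index.core import Settings

EXPECTED_DIM = 1536  # dimension the vector store was created with

# Embed a probe string and fail fast if the runtime embedder disagrees
probe = Settings.embed_model.get_text_embedding("dimension probe")
if len(probe) != EXPECTED_DIM:
    raise RuntimeError(
        f"Embedder outputs {len(probe)} dims, store expects {EXPECTED_DIM}"
    )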

4) You restored serialized data from an older deployment

A classic production issue: old pods write embeddings into persistent storage, then a new release comes up with a different default model.

env:
  - name: EMBED_MODEL
    value: text-embedding-3-large

If the persisted data came from text-embedding-ada-002, this deployment will break until you migrate or rebuild the store.
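One hedge, sketched below, is to drive the embedder from that same env var and refuse to start if it disagrees with a marker file written at index build time; the EMBED_MODEL variable and the marker path are this article's assumptions, not LlamaIndex conventions:

import os
from pathlib import Path

from llama_index.core import Settings
from llama_index.embeddings.openai import OpenAIEmbedding

model_name = os.environ["EMBED_MODEL"]

# Hypothetical marker dropped next to the persisted index at build time
built_with = Path("./storage/embed_model.txt").read_text().strip()
if built_with != model_name:
    raise RuntimeError(
        f"Index built with {built_with}, but this pod is configured for {model_name}"
    )

Settings.embed_model = OpenAIEmbedding(model=model_name)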

How to Debug It

  1. Check the exact error message

    • Typical messages include:
      • ValueError: Embedding dimension mismatch
      • Expected embedding dimension 1536 but got 3072
    • If you see it during insert_nodes() or a query_engine.query() call, it’s almost always a stored-vs-current model mismatch.
  2. Print the active embed model

    from llama_index.core import Settings
    print(Settings.embed_model)
    

    Confirm which class is actually running in production. Don’t trust what you think is configured; inspect runtime state.

  3. Inspect your vector store dimensions

    • For Pinecone/Qdrant/Weaviate/Chroma, check collection/index metadata (a Qdrant sketch follows this list).
    • Compare that number to your current embedder output size.
    • If they differ, you found the cause.
  4. Test embedding output directly

    emb = Settings.embed_model.get_text_embedding("hello world")
    print(len(emb))
    

    If this returns 3072 but your vector DB was created for 1536, you need to rebuild or migrate.
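For step 3, here is what the metadata check can look like against Qdrant; the URL and collection name are placeholders, and other stores expose the same number under different fields:

from qdrant_client import QdrantClient

client = QdrantClient(url="http://localhost:6333")
info = client.get_collection("support-index")

# For a single unnamed vector config, this is the stored dimension
print(info.config.params.vectors.size)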

Prevention

  • Version your embedding model alongside your index.

    • Treat embed_model as schema, not config.
    • If it changes, create a new collection or rebuild the old one.
  • Lock ingestion and retrieval to the same settings object.

    from llama_index.core import Settings
    Settings.embed_model = ...
    

    Set it once at process startup and don’t mutate it mid-run.

  • Store metadata about embeddings with every deployment, as sketched below.

    • Model name
    • Dimension count
    • Vector store name/namespace
    • Index build timestamp
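A minimal way to capture that, assuming a JSON manifest written next to the persisted index (the filename and fields are this article's convention, not LlamaIndex's):

import json
from datetime import datetime, timezone
from pathlib import Path

# Record the embedding "schema" at index build time
manifest = {
    "embed_model": "text-embedding-ada-002",
    "dimension": 1536,
    "vector_store": "support-index",
    "built_at": datetime.now(timezone.utc).isoformat(),
}
Path("./storage/embedding_manifest.json").write_text(json.dumps(manifest, indent=2))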

If you’re already in production and need zero-downtime recovery, build a new vector store with the new model and cut traffic over after reindexing. Mixing dimensions inside one store is not something LlamaIndex can safely paper over.
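A sketch of that cutover, assuming reads are routed through an env var while the new store backfills (ACTIVE_STORE and the directory names are placeholders):

import os

from llama_index.core import StorageContext, load_index_from_storage

# v1 keeps serving while v2 is rebuilt with the new model;
# flip ACTIVE_STORE only after the v2 reindex completes
active_dir = os.environ.get("ACTIVE_STORE", "./storage_v1")

storage_context = StorageContext.from_defaults(persist_dir=active_dir)
index = load_index_from_storage(storage_context)
query_engine = index.as_query_engine()

Because the flip is a config change, rollback is the same variable pointed back at the v1 directory.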



By Cyprian Aarons, AI Consultant at Topiax.
