How to Fix 'embedding dimension mismatch in production' in LlamaIndex (Python)
When you see ValueError: embedding dimension mismatch, LlamaIndex is telling you that the vectors already stored in your index were created with a different embedding size than the model you’re using now. This usually shows up in production after an embedding model swap, a redeploy, or when a persisted index gets loaded with a new Settings.embed_model.
The failure is not random. Your vector store expects one fixed dimension, and LlamaIndex is trying to insert or query with another.
The Most Common Cause
The #1 cause is this: you built the index with one embedding model, then later queried or added documents with a different one.
A common example is creating the index with text-embedding-ada-002 and later switching to text-embedding-3-large, or moving from one local model to another without rebuilding the persisted store.
| Broken pattern | Fixed pattern |
|---|---|
| Persist index with one embedder, load it with another | Keep the same embedder for that index, or rebuild the index |
```python
# BROKEN
from llama_index.core import StorageContext, load_index_from_storage, Settings
from llama_index.embeddings.openai import OpenAIEmbedding

# Index was originally built with text-embedding-ada-002 (1536 dims)
Settings.embed_model = OpenAIEmbedding(model="text-embedding-3-large")  # 3072 dims

storage_context = StorageContext.from_defaults(persist_dir="./storage")
index = load_index_from_storage(storage_context)

query_engine = index.as_query_engine()
response = query_engine.query("What is our claims policy?")  # dimension mismatch here
```
```python
# FIXED
from llama_index.core import StorageContext, load_index_from_storage, Settings
from llama_index.embeddings.openai import OpenAIEmbedding

# Use the same embedding model that was used when the index was created
Settings.embed_model = OpenAIEmbedding(model="text-embedding-ada-002")  # 1536 dims

storage_context = StorageContext.from_defaults(persist_dir="./storage")
index = load_index_from_storage(storage_context)

query_engine = index.as_query_engine()
response = query_engine.query("What is our claims policy?")
```
If you want to change embedding models, do not reuse the old persisted vector store. Rebuild the index from source documents so every vector has the same dimension.
Other Possible Causes
1) You changed Settings.embed_model after building part of the index
This happens when app startup code sets a default embedder, but some ingestion job overrides it later.
```python
from llama_index.core import Settings
from llama_index.embeddings.openai import OpenAIEmbedding

Settings.embed_model = OpenAIEmbedding(model="text-embedding-ada-002")  # 1536 dims
# ... build some nodes here ...

Settings.embed_model = OpenAIEmbedding(model="text-embedding-3-large")  # 3072 dims
# ... add more nodes here -> dimension mismatch
```
Keep embedding configuration immutable per index lifecycle.
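One way to enforce that immutability is to pin the embedding choice in a frozen config object created once at startup, so later ingestion code cannot silently swap models mid-run. This is a minimal sketch; the `EmbeddingConfig` class and the `KNOWN_DIMS` table are illustrative helpers, not LlamaIndex APIs:

```python
from dataclasses import dataclass

# Known output dimensions for a few common models (extend as needed).
KNOWN_DIMS = {
    "text-embedding-ada-002": 1536,
    "text-embedding-3-small": 1536,
    "text-embedding-3-large": 3072,
}

@dataclass(frozen=True)  # frozen: reassigning a field raises FrozenInstanceError
class EmbeddingConfig:
    model_name: str

    @property
    def dim(self) -> int:
        return KNOWN_DIMS[self.model_name]

# Set once at process startup; treat this as the index's schema.
EMBED_CONFIG = EmbeddingConfig(model_name="text-embedding-ada-002")
```

Any job that wants a different model then has to construct a new config, and with it a new index, rather than mutating the shared one in place.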
2) Your vector database already contains old embeddings
If your Pinecone, Qdrant, Weaviate, or Chroma collection was created earlier with a different dimension, LlamaIndex will fail when inserting new vectors.
```python
# Example: existing Pinecone index was created with dim=1536,
# but the current embedding model outputs dim=1024
from llama_index.core import StorageContext
from llama_index.vector_stores.pinecone import PineconeVectorStore

vector_store = PineconeVectorStore(index_name="support-index")
storage_context = StorageContext.from_defaults(vector_store=vector_store)
# inserts will fail: stored dim (1536) != embedding dim (1024)
```
Fix by deleting and recreating the collection, or by using a separate namespace/index per embedding model version.
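Before wiring a store into a `StorageContext`, it can pay to compare the collection's configured dimension against the current model's output size and fail fast with a clear message. A minimal sketch of that guard; the function name is an assumption, and real code would read `store_dim` from your vector DB's collection/index metadata rather than hard-coding it:

```python
def assert_dims_match(store_dim: int, model_dim: int, collection: str) -> None:
    """Fail fast, with a clear message, before any insert or query runs."""
    if store_dim != model_dim:
        raise ValueError(
            f"Collection '{collection}' stores {store_dim}-dim vectors, "
            f"but the active embed model outputs {model_dim} dims. "
            "Rebuild the collection or switch back to the original model."
        )

# Example: collection created for 1536 dims, current model also emits 1536
assert_dims_match(1536, 1536, "support-index")  # OK, no exception
```

Running this once at startup turns a confusing mid-request failure into an immediate, explicit deploy-time error.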
3) You mixed document embeddings and query embeddings from different models
This can happen if ingestion uses one model and retrieval uses another. The stored vectors may be valid, but query-time similarity search breaks because query vectors must match the stored dimension.
```python
# Ingestion used:  OpenAIEmbedding(model="text-embedding-ada-002")            # 1536 dims
# Query-time uses: HuggingFaceEmbedding(model_name="BAAI/bge-small-en-v1.5")  # 384 dims
```
Use one embedding provider/model pair end-to-end for a given vector store.
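A cheap safeguard is to probe both the ingestion-side and the query-side embedder with the same test string at startup and compare output lengths. The sketch below takes the two embed functions as plain callables so it stays provider-neutral; with LlamaIndex embedders you would pass their `get_text_embedding` methods. The `check_embedders_agree` name is an assumption:

```python
from typing import Callable

def check_embedders_agree(
    ingest_embed: Callable[[str], list[float]],
    query_embed: Callable[[str], list[float]],
    probe: str = "dimension probe",
) -> int:
    """Embed one probe string through both paths; raise if dims differ."""
    ingest_dim = len(ingest_embed(probe))
    query_dim = len(query_embed(probe))
    if ingest_dim != query_dim:
        raise ValueError(
            f"Ingestion embedder outputs {ingest_dim} dims but query "
            f"embedder outputs {query_dim} dims; use one model end-to-end."
        )
    return ingest_dim
```

This costs one embedding call per process start and catches the mismatch before any user traffic does.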
4) You restored serialized data from an older deployment
A classic production issue: old pods write embeddings into persistent storage, then a new release comes up with a different default model.
```yaml
env:
  - name: EMBED_MODEL
    value: text-embedding-3-large
```
If the persisted data came from text-embedding-ada-002, this deployment will break until you migrate or rebuild the store.
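The migration itself is conceptually simple: read the source texts back, re-embed them in batches with the new model, and write into a fresh store. A minimal sketch of that loop, with `embed_batch` injected as a parameter so the example stays provider-neutral; the `reembed` function is illustrative, not a LlamaIndex API:

```python
from typing import Callable, Iterable

def reembed(
    texts: Iterable[str],
    embed_batch: Callable[[list[str]], list[list[float]]],
    batch_size: int = 64,
) -> list[tuple[str, list[float]]]:
    """Re-embed all texts in batches; return (text, vector) pairs
    ready to upsert into a freshly created collection."""
    out: list[tuple[str, list[float]]] = []
    batch: list[str] = []
    for text in texts:
        batch.append(text)
        if len(batch) == batch_size:
            out.extend(zip(batch, embed_batch(batch)))
            batch = []
    if batch:  # flush the final partial batch
        out.extend(zip(batch, embed_batch(batch)))
    return out

# With a real provider, embed_batch would call the embeddings API;
# a stub here just shows the shape of the contract.
pairs = reembed(
    ["doc one", "doc two", "doc three"],
    lambda batch: [[0.0] * 3072 for _ in batch],
    batch_size=2,
)
```

Batching matters in practice: re-embedding a production corpus one document at a time is the difference between a minutes-long migration and an hours-long one.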
How to Debug It

1) Check the exact error message

Typical messages include:

```
ValueError: Embedding dimension mismatch
Expected embedding dimension 1536 but got 3072
```

If you see this during `insert_nodes` or `as_query_engine()`, it's almost always a stored-vs-current model mismatch.

2) Print the active embed model

```python
from llama_index.core import Settings

print(Settings.embed_model)
```

Confirm which class is actually running in production. Don't trust what you think is configured; inspect runtime state.

3) Inspect your vector store dimensions

For Pinecone/Qdrant/Weaviate/Chroma, check the collection/index metadata and compare that dimension to your current embedder's output size. If they differ, you have found the cause.

4) Test embedding output directly

```python
from llama_index.core import Settings

emb = Settings.embed_model.get_text_embedding("hello world")
print(len(emb))
```

If this returns `3072` but your vector DB was created for `1536`, you need to rebuild or migrate.
Prevention

1) Version your embedding model alongside your index

Treat `embed_model` as schema, not config. If it changes, create a new collection or rebuild the old one.

2) Lock ingestion and retrieval to the same settings

```python
from llama_index.core import Settings

Settings.embed_model = ...
```

Set it once at process startup and don't mutate it mid-run.

3) Store metadata about embeddings with every deployment

- Model name
- Dimension count
- Vector store name/namespace
- Index build timestamp
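Those four fields are easy to persist as a small manifest next to the index and check at startup. A sketch using only the standard library; the file name and schema are assumptions, not a LlamaIndex feature:

```python
import json
import time
from pathlib import Path

def write_manifest(path: str, model: str, dim: int, namespace: str) -> dict:
    """Record which embedder produced the vectors in this store."""
    manifest = {
        "embed_model": model,
        "dim": dim,
        "namespace": namespace,
        "built_at": time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime()),
    }
    Path(path).write_text(json.dumps(manifest, indent=2))
    return manifest

def read_manifest(path: str) -> dict:
    """Load the manifest so startup code can verify the active model against it."""
    return json.loads(Path(path).read_text())
```

On boot, compare `read_manifest(...)["embed_model"]` against the model your deployment is configured to use, and refuse to serve traffic if they disagree.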
If you’re already in production and need zero-downtime recovery, build a new vector store with the new model and cut traffic over after reindexing. Mixing dimensions inside one store is not something LlamaIndex can safely paper over.
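A simple way to make that cutover safe is to derive the collection name from the model and its dimension, so two embedding versions can never collide in one store. The naming convention below is an assumption for illustration:

```python
def collection_name(base: str, model: str, dim: int) -> str:
    """Versioned collection name: one collection per (model, dim) pair."""
    slug = model.replace("/", "-").replace("_", "-")
    return f"{base}--{slug}--{dim}"

old = collection_name("support-index", "text-embedding-ada-002", 1536)
new = collection_name("support-index", "text-embedding-3-large", 3072)
# Reindex into `new`, verify retrieval quality, then flip the app's
# read path from `old` to `new` and delete `old` after cutover.
```

Because the old collection stays live until the new one is verified, reads never hit a half-migrated store.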
Keep learning
- The complete AI Agents Roadmap — my full 8-step breakdown
- Free: The AI Agent Starter Kit — PDF checklist + starter code
- Work with me — I build AI for banks and insurance companies
By Cyprian Aarons, AI Consultant at Topiax.