How to Fix 'embedding dimension mismatch' in LlamaIndex (Python)
If you’re seeing `ValueError: embedding dimension mismatch`, LlamaIndex is telling you that the vector store and the embedding model do not agree on vector size. This usually shows up when you switch embedding models, reuse an old index, or load persisted data that was built with a different embedding dimension.
In practice, the failure happens during indexing or retrieval, often inside classes like `VectorStoreIndex`, `StorageContext`, `OpenAIEmbedding`, or your vector DB adapter such as `PineconeVectorStore`, `ChromaVectorStore`, or `QdrantVectorStore`.
The Most Common Cause
The #1 cause is simple: you created the index with one embedding model, then later queried or appended with another model that produces a different vector size.
For example, OpenAI’s `text-embedding-3-small` (1536 dimensions by default) and `text-embedding-3-large` (3072 by default) produce vectors of different sizes. If your vector store was created for 1536-dim embeddings and you later send 3072-dim vectors, LlamaIndex will fail.
Broken vs fixed pattern
| Broken | Fixed |
|---|---|
| Build index with one embedding model, query with another | Use the same embedding model for indexing and querying |
| Reuse persisted vectors after changing models | Rebuild the index or migrate the stored vectors |
```python
# BROKEN
from llama_index.core import VectorStoreIndex, Settings
from llama_index.embeddings.openai import OpenAIEmbedding

# Index built with one model
Settings.embed_model = OpenAIEmbedding(model="text-embedding-3-small")
index = VectorStoreIndex.from_documents(docs)

# Later in another process / notebook cell
Settings.embed_model = OpenAIEmbedding(model="text-embedding-3-large")
query_engine = index.as_query_engine()
response = query_engine.query("What is the policy term?")
```
```python
# FIXED
from llama_index.core import VectorStoreIndex, Settings
from llama_index.embeddings.openai import OpenAIEmbedding

# Use the same model object for both indexing and querying
embed_model = OpenAIEmbedding(model="text-embedding-3-small")
Settings.embed_model = embed_model

index = VectorStoreIndex.from_documents(docs)
query_engine = index.as_query_engine()
response = query_engine.query("What is the policy term?")
```
If you already persisted data with the wrong dimension, don’t just change code and hope it works. Delete and rebuild the index, or re-embed everything into a fresh collection with the new model.
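Wiping the old storage can be scripted. A minimal sketch, assuming the default `./storage` persist directory (the rebuild steps in the comments are illustrative, not called here):

```python
import shutil
from pathlib import Path

def reset_persist_dir(persist_dir: str) -> None:
    """Delete a persisted LlamaIndex storage directory so the next
    build starts from a clean slate with the new embedding model."""
    path = Path(persist_dir)
    if path.exists():
        shutil.rmtree(path)

# After wiping, rebuild and persist with the *current* embed model, e.g.:
#   index = VectorStoreIndex.from_documents(docs)
#   index.storage_context.persist(persist_dir="./storage")
```

Deleting the whole directory (rather than individual files) avoids leaving stale docstore or vector-store JSON behind.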
Other Possible Causes
1. Persisted vector store was created with a different model
This is common when you restart an app and load old storage from disk.
```python
from llama_index.core import StorageContext, load_index_from_storage

storage_context = StorageContext.from_defaults(persist_dir="./storage")
index = load_index_from_storage(storage_context)
```
If `./storage` contains vectors from a previous embedding model, loading succeeds but retrieval fails later with dimension errors. The fix is to wipe the persisted store or migrate it.
2. Your vector database collection has a fixed dimension
Some backends enforce dimension at collection creation time.
```python
# Example: Qdrant / Pinecone style setup
# Collection created for 1536 dimensions earlier
# Now you're sending 3072-dimensional embeddings
```
If the collection already exists, check its configured vector size. For Pinecone or Qdrant, create a new collection/index matching the current embedding model.
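You can guard against this generically before the first upsert. A minimal sketch; `collection_dim` is a placeholder for the vector size you read from your backend's collection metadata:

```python
def assert_dim_matches(collection_dim: int, sample_embedding: list[float]) -> None:
    """Raise early if the embedding size doesn't match the collection's
    configured vector size, instead of failing deep inside an upsert."""
    actual = len(sample_embedding)
    if actual != collection_dim:
        raise ValueError(
            f"Collection expects {collection_dim}-dim vectors, but the "
            f"embedding model produced {actual} dims. Create a new "
            "collection sized for the current model."
        )

# A 1536-dim collection receiving a 3072-dim vector now fails fast.
assert_dim_matches(1536, [0.0] * 1536)  # passes silently
```

Run this once at startup with a real embedding from your current model, so the mismatch surfaces before any data is written.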
3. You changed embedding providers between environments
A dev machine might use one provider and production another.
```python
# local.py
from llama_index.embeddings.openai import OpenAIEmbedding
Settings.embed_model = OpenAIEmbedding(model="text-embedding-3-small")

# prod.py
from llama_index.embeddings.huggingface import HuggingFaceEmbedding
Settings.embed_model = HuggingFaceEmbedding(model_name="BAAI/bge-large-en-v1.5")
```
Those models produce different dimensions. If both environments point to the same vector DB namespace or collection, you’ll get mismatches immediately.
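One way to make this collision impossible is to derive the collection name from the embedder itself, so two environments with different models can never write into the same namespace. A sketch; the naming scheme is my own, not a LlamaIndex convention:

```python
def collection_name_for(provider: str, model: str, dim: int) -> str:
    """Encode provider, model, and dimension into the collection name so
    environments with different embedders never share vectors."""
    safe_model = model.replace("/", "_").replace(".", "_")
    return f"docs__{provider}__{safe_model}__{dim}"

print(collection_name_for("openai", "text-embedding-3-small", 1536))
# → docs__openai__text-embedding-3-small__1536
```

Pass the result wherever your vector store adapter takes a collection or index name; switching models then creates a fresh, correctly-sized collection instead of corrupting an old one.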
4. You are mixing nodes from different pipelines
This happens when documents were ingested through two separate jobs using different embed settings.
```python
# Job A uses one embedder
Settings.embed_model = OpenAIEmbedding(model="text-embedding-3-small")

# Job B uses another embedder later
Settings.embed_model = HuggingFaceEmbedding(model_name="BAAI/bge-base-en-v1.5")
```
The resulting index looks valid until retrieval hits mixed vectors in the same store. Keep one embedding pipeline per collection.
How to Debug It
- **Print the active embedding model.** Verify what LlamaIndex is actually using at runtime: check `Settings.embed_model` before indexing and before querying.
- **Check the embedding dimension directly.** Generate one test embedding and inspect its length:

  ```python
  emb = Settings.embed_model.get_text_embedding("test")
  print(len(emb))
  ```

- **Inspect your vector store schema.** Confirm what dimension the backend expects. For Pinecone, Qdrant, or Chroma, look at the collection or index metadata.
- **Compare against persisted data.** If you reused storage, delete it and rebuild once. If rebuilding fixes it, your old persisted vectors were created with a different model.
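The dimension check doesn't need a live index; it works against any object that exposes `get_text_embedding`. A sketch with a stub embedder, which is a stand-in I wrote for `Settings.embed_model` so the check runs offline:

```python
class StubEmbedding:
    """Stand-in for Settings.embed_model with a fixed output size."""

    def __init__(self, dim: int):
        self.dim = dim

    def get_text_embedding(self, text: str) -> list[float]:
        return [0.0] * self.dim

# Swap in Settings.embed_model for real runs.
embed_model = StubEmbedding(dim=1536)
emb = embed_model.get_text_embedding("test")
print(len(emb))  # → 1536
```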
Prevention
- Use one embedding model per vector collection.
- Version your ingestion pipeline so model changes force a full reindex.
- Store metadata alongside each index:
  - embedding provider
  - model name
  - expected dimension
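That metadata can live in a small JSON manifest next to the persisted index. A sketch; the `embedding_manifest.json` filename and field names are my own choices, not a LlamaIndex feature:

```python
import json
from pathlib import Path

def write_manifest(persist_dir: str, provider: str, model: str, dim: int) -> None:
    """Record which embedder built this index, next to the index itself."""
    manifest = {"provider": provider, "model": model, "dim": dim}
    Path(persist_dir, "embedding_manifest.json").write_text(json.dumps(manifest))

def read_manifest(persist_dir: str) -> dict:
    """Load the recorded embedder info; compare it to the live model on startup."""
    return json.loads(Path(persist_dir, "embedding_manifest.json").read_text())
```

On startup, read the manifest and refuse to serve queries if the live model's name or dimension disagrees with what was recorded at build time.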
A practical pattern is to fail fast on startup:
```python
expected_dim = 1536
actual_dim = len(Settings.embed_model.get_text_embedding("dimension check"))
if actual_dim != expected_dim:
    raise ValueError(f"Embedding dim mismatch: expected {expected_dim}, got {actual_dim}")
```
That saves you from discovering the problem only after users start querying production data.
Keep learning
- The complete AI Agents Roadmap: my full 8-step breakdown
- Free: The AI Agent Starter Kit (PDF checklist + starter code)
- Work with me: I build AI for banks and insurance companies
By Cyprian Aarons, AI Consultant at Topiax.