# How to Fix "Embedding Dimension Mismatch" During Development in LlamaIndex (Python)
When you see `ValueError: Embedding dimension mismatch` in LlamaIndex, it means the vectors stored in your index do not match the vector size produced by the embedding model you’re using now. This usually shows up during development when you switch models, rebuild partial indexes, or load an old persisted index with a new embedding config.
The pattern is predictable: you indexed documents with one embedding model, then queried or appended with another. LlamaIndex is strict here because vector similarity search only works when every embedding has the same dimension.
## The Most Common Cause
The #1 cause is changing embedding models between index creation and query time: you build the index with one model, then later load or query it with a different model that returns a different vector size.
| Broken pattern | Fixed pattern |
|---|---|
| Build index with one embedder, query with another | Use the same embedder for indexing and querying |
| Persist index, then change `Settings.embed_model` | Rehydrate with the original embed model or rebuild the index |
```python
# BROKEN
from llama_index.core import VectorStoreIndex, StorageContext, load_index_from_storage
from llama_index.embeddings.openai import OpenAIEmbedding

# During indexing
embed_model_v1 = OpenAIEmbedding(model="text-embedding-3-small")  # 1536 dims
index = VectorStoreIndex.from_documents(docs, embed_model=embed_model_v1)
index.storage_context.persist("./storage")

# Later, at query time
embed_model_v2 = OpenAIEmbedding(model="text-embedding-3-large")  # 3072 dims
storage_context = StorageContext.from_defaults(persist_dir="./storage")
loaded_index = load_index_from_storage(storage_context)
query_engine = loaded_index.as_query_engine(embed_model=embed_model_v2)
response = query_engine.query("What does this policy say?")  # dimension mismatch
```
```python
# FIXED
from llama_index.core import VectorStoreIndex, StorageContext, load_index_from_storage
from llama_index.embeddings.openai import OpenAIEmbedding

embed_model = OpenAIEmbedding(model="text-embedding-3-small")

# Indexing
index = VectorStoreIndex.from_documents(docs, embed_model=embed_model)
index.storage_context.persist("./storage")

# Querying
storage_context = StorageContext.from_defaults(persist_dir="./storage")
loaded_index = load_index_from_storage(storage_context)
query_engine = loaded_index.as_query_engine(embed_model=embed_model)
response = query_engine.query("What does this policy say?")
```
If you want to use a new embedding model, rebuild the index from scratch. Persisted vectors are not portable across embedding dimensions.
## Other Possible Causes
### 1. Mixing local and remote embeddings
You might create embeddings locally with `HuggingFaceEmbedding` and later query with an OpenAI model.
```python
from llama_index.embeddings.huggingface import HuggingFaceEmbedding
from llama_index.embeddings.openai import OpenAIEmbedding

# Index built with local model
index_embed = HuggingFaceEmbedding(model_name="BAAI/bge-small-en-v1.5")  # 384 dims

# Query uses OpenAI model
query_embed = OpenAIEmbedding(model="text-embedding-3-large")  # 3072 dims
```
Keep one embedding provider per index unless you intentionally rebuilt everything.
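You can catch this class of mismatch before any query touches the index by embedding one probe string with both models and comparing vector lengths. The sketch below uses stand-in embed functions (the 384 and 3072 sizes mirror bge-small vs. text-embedding-3-large); in real code, `embed_fn` would be something like a model's `get_text_embedding` call.

```python
def assert_same_dim(index_embed_fn, query_embed_fn, probe="dimension probe"):
    """Embed one probe string with both models and compare vector lengths."""
    index_dim = len(index_embed_fn(probe))
    query_dim = len(query_embed_fn(probe))
    if index_dim != query_dim:
        raise ValueError(
            f"Embedding dimension mismatch: index={index_dim}, query={query_dim}"
        )
    return index_dim

# Stand-ins for a 384-dim local model and a 3072-dim remote model
local_embed = lambda text: [0.0] * 384
remote_embed = lambda text: [0.0] * 3072

try:
    assert_same_dim(local_embed, remote_embed)
except ValueError as e:
    print(e)  # caught before any query hits the index
```

A check like this at startup turns a confusing runtime failure into an explicit, early error.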
### 2. Reusing a persisted vector store after changing models
This happens a lot in dev when `./storage` survives between test runs.
```python
from llama_index.core import StorageContext, load_index_from_storage

# Old run used 384-dim embeddings; new run uses 1536-dim embeddings
storage_context = StorageContext.from_defaults(persist_dir="./storage")
index = load_index_from_storage(storage_context)
```
If your embedder changed, delete the persisted storage directory and rebuild:
```shell
rm -rf ./storage
```
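If you'd rather automate the cleanup from Python (for example, at the top of a test run), a small stdlib helper does the same thing. The function name here is a hypothetical convenience, not a LlamaIndex API:

```python
import shutil
from pathlib import Path

def reset_persisted_index(persist_dir="./storage"):
    """Delete a stale persisted index so the next run rebuilds from scratch."""
    path = Path(persist_dir)
    if path.exists():
        shutil.rmtree(path)  # remove the whole persisted storage tree
        return True          # stale storage removed
    return False             # nothing to clean up
```

Calling this whenever you change embedding models keeps leftover vectors from poisoning the next run.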
### 3. Swapping models behind `Settings.embed_model`
LlamaIndex often reads from global settings. If one part of your app changes it, another part may silently use the wrong embedder.
```python
from llama_index.core import Settings
from llama_index.embeddings.huggingface import HuggingFaceEmbedding
from llama_index.embeddings.openai import OpenAIEmbedding

Settings.embed_model = HuggingFaceEmbedding(model_name="BAAI/bge-small-en-v1.5")

# ...later, in another module...
Settings.embed_model = OpenAIEmbedding(model="text-embedding-3-small")
```
Set it once at process startup and treat it as immutable for that app run.
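One way to enforce "set once, then immutable" is a small set-once wrapper around your app's embed-model choice. This is a hypothetical pattern for your own config layer, not a LlamaIndex feature:

```python
class FrozenSetting:
    """Set-once wrapper: the first assignment wins; later reassignments
    raise instead of silently swapping models mid-run."""

    def __init__(self):
        self._value = None

    def set(self, value):
        if self._value is not None:
            raise RuntimeError(
                f"embed model already set to {self._value!r}; "
                "rebuild the index instead of swapping models mid-run"
            )
        self._value = value

    def get(self):
        return self._value

embed_model_setting = FrozenSetting()
embed_model_setting.set("text-embedding-3-small")  # startup: allowed
```

At startup you would call `set()` once with your chosen model, then have every module read through `get()`; any later attempt to reassign fails loudly instead of corrupting the index.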
### 4. Using a vector store that already contains incompatible vectors
If you point LlamaIndex at an existing Pinecone, Qdrant, Weaviate, or Chroma collection created with another dimension, you’ll get dimension errors as soon as inserts or queries happen.
```python
from llama_index.core import VectorStoreIndex
from llama_index.vector_stores.pinecone import PineconeVectorStore

# Existing collection was created with dim=384
vector_store = PineconeVectorStore(index_name="support-kb")
index = VectorStoreIndex.from_vector_store(vector_store)
```
The fix is usually to create a new namespace/collection/index name per embedding configuration.
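A cheap way to make that convention stick is to derive the collection name from the embedding configuration, so two configs can never collide. This helper is an illustrative convention, not a library API:

```python
def collection_name(app: str, model: str, dim: int) -> str:
    """Version a collection by embedding configuration, not just app name."""
    slug = model.replace("/", "-").replace(".", "-")  # sanitize model ids
    return f"{app}__{slug}__{dim}d"

print(collection_name("support-kb", "text-embedding-3-small", 1536))
# support-kb__text-embedding-3-small__1536d
```

Switching models then produces a different collection name, and the old incompatible vectors are never touched.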
## How to Debug It
1. **Print the current embedding model.** Check what your code is actually using at runtime, for example `print(Settings.embed_model)`. If different modules use different models, that is your first red flag.
2. **Inspect the stored vector dimension.** Check your vector database collection metadata or persisted files, and compare it against the output size of your current embedding model. Typical mismatches: stored 384 vs. current 1536, or stored 1536 vs. current 3072.
3. **Search for multiple embedder initializations.** Grep for `OpenAIEmbedding`, `HuggingFaceEmbedding`, `CohereEmbedding`, and `AzureOpenAIEmbedding`. If more than one appears in the same app path, confirm they are not being mixed on one index.
4. **Rebuild from scratch and retest.** Delete persisted storage and recreate the index. If the error disappears after rebuilding, your old vectors were incompatible. This is often the fastest way to confirm whether persistence is the problem.
## Prevention
- Pin one embedding model per index and document it in code comments or config.
- Version your vector collections by embedding configuration, not just by app name.
- Add a startup check that compares expected dimension against stored collection metadata before serving traffic.
- When upgrading models, rebuild indexes instead of trying to reuse old persisted vectors.
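The startup check above can be sketched in a few lines. Here `collection_dim` stands in for whatever your vector store's metadata reports (most stores expose the collection's configured dimension), and `embed_fn` for the live model's embedding call; both names are assumptions for illustration:

```python
def startup_dimension_check(collection_dim, embed_fn, probe="startup probe"):
    """Fail fast before serving traffic if the live model's output size
    doesn't match the dimension the collection was created with."""
    model_dim = len(embed_fn(probe))
    if model_dim != collection_dim:
        raise RuntimeError(
            f"refusing to serve: collection dim={collection_dim}, "
            f"model dim={model_dim}; rebuild the index"
        )
    return model_dim

# Example: a stand-in 1536-dim embedder against a 1536-dim collection
fake_embed = lambda text: [0.0] * 1536
startup_dimension_check(1536, fake_embed)  # passes silently
```

Run this once at process startup, before the first request: a crash at boot with a clear message is far cheaper than intermittent mismatch errors in production.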
If you want a simple rule: one index, one embedding model, one dimension. Most “embedding dimension mismatch” errors in LlamaIndex come from violating that contract during development.
By Cyprian Aarons, AI Consultant at Topiax.