How to Fix 'embedding dimension mismatch when scaling' in LlamaIndex (Python)

By Cyprian Aarons · Updated 2026-04-22
Tags: embedding-dimension-mismatch-when-scaling, llamaindex, python

If you see ValueError: embedding dimension mismatch when scaling, it usually means LlamaIndex is trying to compare or store vectors that were created with different embedding models. This shows up most often when you switch models, reuse an old vector index, or mix embeddings from different providers in the same storage.

The key detail: your index, vector store, and query-time embedder must all agree on vector size. If one side produces 1536-dimension vectors and the other expects 3072, LlamaIndex will fail during insert, retrieval, or scaling operations.

The Most Common Cause

The #1 cause is reusing a persisted index after changing the embedding model.

A common pattern is:

  • build the index with one model
  • later load the same persisted storage
  • query or insert with a different model

That gives you a mismatch between stored vectors and newly generated vectors.

Broken pattern                              | Fixed pattern
--------------------------------------------|---------------------------------------------------------------
Build with one embedder, load with another  | Use the same embedder for build + query, or rebuild the index
Persist old vectors and change models later | Clear storage before switching embedding dimensions

# BROKEN: index was built with text-embedding-3-small (1536 dims)
# then queried later with text-embedding-3-large (3072 dims)

from llama_index.core import StorageContext, load_index_from_storage
from llama_index.embeddings.openai import OpenAIEmbedding

storage_context = StorageContext.from_defaults(persist_dir="./storage")

# Query-time embedder changed
embed_model = OpenAIEmbedding(model="text-embedding-3-large")

index = load_index_from_storage(storage_context)
query_engine = index.as_query_engine(embed_model=embed_model)

response = query_engine.query("What is our claims process?")
# fails: the stored vectors are 1536-dim, but the query embedding is 3072-dim

# FIXED: use the same embedding model that was used to build the index
# or rebuild the index if you want to switch models

from llama_index.core import StorageContext, load_index_from_storage
from llama_index.embeddings.openai import OpenAIEmbedding

embed_model = OpenAIEmbedding(model="text-embedding-3-small")
storage_context = StorageContext.from_defaults(persist_dir="./storage")

index = load_index_from_storage(storage_context, embed_model=embed_model)
query_engine = index.as_query_engine(embed_model=embed_model)

response = query_engine.query("What is our claims process?")

If you want to change models, do not keep the old persisted data around. Delete the storage directory and rebuild:

import shutil

shutil.rmtree("./storage", ignore_errors=True)
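Then rebuild from your source documents with the new model. A minimal sketch, assuming your files live under ./data; adjust the path and model to your setup:

from llama_index.core import SimpleDirectoryReader, VectorStoreIndex
from llama_index.embeddings.openai import OpenAIEmbedding

# Embed everything with the new model in a single pass
embed_model = OpenAIEmbedding(model="text-embedding-3-large")
documents = SimpleDirectoryReader("./data").load_data()  # assumed source directory

index = VectorStoreIndex.from_documents(documents, embed_model=embed_model)
index.storage_context.persist(persist_dir="./storage")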

Other Possible Causes

1) Mixing embedding providers in one pipeline

This happens when ingestion uses one provider and querying uses another.

from llama_index.embeddings.openai import OpenAIEmbedding
from llama_index.embeddings.huggingface import HuggingFaceEmbedding

ingest_embed_model = OpenAIEmbedding(model="text-embedding-3-small")            # 1536 dims
query_embed_model = HuggingFaceEmbedding(model_name="BAAI/bge-small-en-v1.5")   # 384 dims

These models produce different dimensions (1536 vs. 384), so retrieval would compare incompatible vectors. Keep ingestion and query on the same embedding family unless you know exactly what your vector store supports.
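A cheap guard is to embed a probe string with both models and compare lengths before wiring the pipeline together. Continuing the snippet above:

# For these two models the assertion fails: 1536 vs 384
ingest_dim = len(ingest_embed_model.get_text_embedding("probe"))
query_dim = len(query_embed_model.get_text_embedding("probe"))
assert ingest_dim == query_dim, f"embedding dimension mismatch: {ingest_dim} vs {query_dim}"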

2) Vector store schema fixed to an older dimension

Some stores enforce a fixed vector size at collection creation time. If your first insert created a 1536-dim collection, later inserts of 3072-dim vectors will fail.

# Example: the existing collection was already created for 1536-dim vectors;
# the dimension is baked into the collection itself, not into this config
vector_store_config = {
    "collection_name": "support_docs"
}

Fix:

  • drop and recreate the collection
  • or create a new collection name per embedding model (see the sketch below)
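For example, with Qdrant (a store that fixes the vector size when the collection is created), naming collections per model avoids the clash entirely. A sketch, assuming a local instance and the qdrant-client package:

from qdrant_client import QdrantClient
from qdrant_client.models import Distance, VectorParams

client = QdrantClient(url="http://localhost:6333")  # assumed local instance

# The dimension is fixed here, at creation time; naming the collection after
# the model and dimension keeps incompatible vectors from ever mixing
client.create_collection(
    collection_name="support_docs_openai_1536",
    vectors_config=VectorParams(size=1536, distance=Distance.COSINE),
)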

3) Cached nodes built from stale embeddings

If you persist nodes or chunks and later re-run only part of the pipeline, some nodes may still carry old embeddings.

# Stale cache example
index.storage_context.persist(persist_dir="./storage")
# later: documents changed but cached embeddings were reused

Fix:

  • clear cached embeddings
  • re-run ingestion end-to-end after changing chunking or model settings (see the pipeline sketch below)
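One way to guarantee every node comes from the same run is a single IngestionPipeline pass that chunks and embeds together. A sketch, assuming documents have already been loaded (e.g. with SimpleDirectoryReader):

from llama_index.core import VectorStoreIndex
from llama_index.core.ingestion import IngestionPipeline
from llama_index.core.node_parser import SentenceSplitter
from llama_index.embeddings.openai import OpenAIEmbedding

embed_model = OpenAIEmbedding(model="text-embedding-3-small")

# Chunk and embed in one pass so no node carries an embedding from an older run
pipeline = IngestionPipeline(
    transformations=[SentenceSplitter(chunk_size=512), embed_model]
)
nodes = pipeline.run(documents=documents)  # `documents` loaded earlier

index = VectorStoreIndex(nodes, embed_model=embed_model)
index.storage_context.persist(persist_dir="./storage")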

4) Chunking changes without rebuilding embeddings

Chunk size changes do not directly change embedding dimension, but they often trigger partial rebuilds where old and new nodes coexist in storage.

from llama_index.core.node_parser import SentenceSplitter

splitter = SentenceSplitter(chunk_size=512)  # changed from previous run

If you changed chunking logic, rebuild the entire index so every node carries embeddings from the same pipeline run; the IngestionPipeline sketch above does this in one pass.

How to Debug It

  1. Print the embedding model name at ingest and query time
    • Verify both sides use the same class and model.
    • Check for OpenAIEmbedding, HuggingFaceEmbedding, OllamaEmbedding, etc.
  2. Inspect vector dimensions
    • Log a sample embedding length from both pipelines, for example:
      emb = embed_model.get_text_embedding("test")
      print(len(emb))
    • If the lengths differ, that is your root cause.
  3. Check persisted storage
    • If you are using StorageContext.from_defaults(persist_dir=...), assume old vectors may be present.
    • Delete the directory and rebuild if you recently changed models.
  4. Verify vector store constraints
    • For Pinecone, Qdrant, Weaviate, Chroma, or pgvector-backed setups, confirm the collection/index dimension matches your current embedder.
    • If needed, create a fresh namespace or collection; a Qdrant-based check is sketched below.
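For stores that expose collection metadata, you can compare both sides directly. A sketch for Qdrant, reusing the per-model collection name from earlier; the attribute path assumes a single unnamed vector config:

from qdrant_client import QdrantClient
from llama_index.embeddings.openai import OpenAIEmbedding

client = QdrantClient(url="http://localhost:6333")  # assumed local instance
embed_model = OpenAIEmbedding(model="text-embedding-3-large")

# Compare the live embedder's output size against the stored collection size
query_dim = len(embed_model.get_text_embedding("probe"))
info = client.get_collection("support_docs_openai_1536")
stored_dim = info.config.params.vectors.size  # single unnamed vector config

print(f"query embedder: {query_dim} dims, collection: {stored_dim} dims")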

Prevention

  • Keep embedding config in one place (a sketch follows after this list):
    • model name
    • provider class
    • normalization settings
    • persist directory / collection name
  • Version your indexes by embedding model:
    • support_docs_openai_1536
    • support_docs_bge_384
    • never reuse a collection across incompatible dimensions
  • Rebuild after any embedding-related change:
    • model swap
    • provider swap
    • vector DB migration
    • major chunking pipeline changes
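One lightweight way to enforce all three habits is a single config object that derives the collection name from the model. A minimal sketch with hypothetical names; adapt the fields to your stack:

from dataclasses import dataclass

@dataclass(frozen=True)
class EmbeddingConfig:
    provider: str               # e.g. "openai"
    model: str                  # e.g. "text-embedding-3-small"
    dimensions: int             # e.g. 1536
    persist_dir: str = "./storage"

    @property
    def collection_name(self) -> str:
        # Version the collection by provider and dimension so vectors of
        # incompatible sizes can never share a namespace
        return f"support_docs_{self.provider}_{self.dimensions}"

config = EmbeddingConfig(provider="openai", model="text-embedding-3-small", dimensions=1536)
print(config.collection_name)  # support_docs_openai_1536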


By Cyprian Aarons, AI Consultant at Topiax.
