How to Fix 'embedding dimension mismatch in production' in CrewAI (TypeScript)

By Cyprian Aarons · Updated 2026-04-22
Tags: embedding-dimension-mismatch-in-production, crewai, typescript

When CrewAI throws an embedding dimension mismatch in production, it means the vector you’re trying to store or query does not match the dimension of the embedding model backing your vector store. In practice, this usually shows up when you change embedding providers, swap models between environments, or reuse a persisted index built with a different model.

The error often appears after deployment, not locally. That’s because local dev and production rarely use the same .env, the same vector DB snapshot, or the same default embedding model.

The Most Common Cause

The #1 cause is mixing embedding models with different output sizes. For example, you indexed documents with text-embedding-3-small in one environment and queried them later with a different model, or you let CrewAI default to one embedding provider locally and another in production.

Here’s the broken pattern:

import { Agent } from "crewai";

const agent = new Agent({
  role: "Support Analyst",
  goal: "Answer customer questions",
  backstory: "You help support teams.",
  // Wrong: production uses a different embedding config than the stored index
  embedder: {
    provider: "openai",
    config: {
      model: process.env.EMBEDDING_MODEL || "text-embedding-3-large",
    },
  },
});

And here is the fixed pattern:

import { Agent } from "crewai";

const EMBEDDING_MODEL = "text-embedding-3-small"; // keep this stable across envs

const agent = new Agent({
  role: "Support Analyst",
  goal: "Answer customer questions",
  backstory: "You help support teams.",
  embedder: {
    provider: "openai",
    config: {
      model: EMBEDDING_MODEL,
    },
  },
});

The key is consistency. If your vector store was built with one dimension, every producer and consumer must use the same embedding model until you reindex.

A real-world error usually looks like this:

Error: embedding dimension mismatch in production
Expected dimension 1536 but received 3072
at VectorStore.addDocuments (...)
at KnowledgeStorage.upsert (...)
at Agent.executeTask (...)

If you see Expected dimension X but received Y, you are almost certainly mixing models or indexes.

Other Possible Causes

1) Reusing an old persisted vector index

If you changed models but kept the same Pinecone, Qdrant, Chroma, or pgvector collection, the old vectors are still there.

// Broken: old collection reused after switching embeddings
const collectionName = "customer-support-kb";

Fix by versioning the collection:

// Fixed: new collection for new embedding dimensions
const collectionName = "customer-support-kb-v2";
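One way to make the versioning automatic is to derive the collection name from the model's output dimension, so switching models can never silently reuse an incompatible index. This is a sketch: the `MODEL_DIMENSIONS` table and `collectionFor` helper are hypothetical names, and the listed dimensions assume the models' defaults.

```typescript
// Hypothetical helper: encode the embedding dimension into the collection
// name so a model change always lands in a fresh collection.
const MODEL_DIMENSIONS: Record<string, number> = {
  "text-embedding-3-small": 1536, // default output size
  "text-embedding-3-large": 3072, // default output size
};

function collectionFor(base: string, model: string): string {
  const dim = MODEL_DIMENSIONS[model];
  if (dim === undefined) {
    throw new Error(`Unknown embedding model: ${model}`);
  }
  // e.g. "customer-support-kb-1536"
  return `${base}-${dim}`;
}
```

With this in place, pointing the app at a new model produces a new, empty collection instead of a dimension mismatch against stale vectors.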

2) Different embeddings in ingest vs query path

Sometimes ingestion uses one config and retrieval uses another.

// Broken ingest path
const ingestEmbedder = { provider: "openai", config: { model: "text-embedding-3-small" } };

// Broken query path
const queryEmbedder = { provider: "openai", config: { model: "text-embedding-3-large" } };

Use one shared config object:

const embedderConfig = {
  provider: "openai" as const,
  config: { model: "text-embedding-3-small" },
};

const ingestAgent = new Agent({ role: "Ingestor", goal: "...", backstory: "...", embedder: embedderConfig });
const queryAgent = new Agent({ role: "Retriever", goal: "...", backstory: "...", embedder: embedderConfig });

3) Hidden environment drift

Your .env.local and production secrets may point to different providers.

# local
EMBEDDING_PROVIDER=openai
EMBEDDING_MODEL=text-embedding-3-small

# production
EMBEDDING_PROVIDER=azure_openai
EMBEDDING_MODEL=text-embedding-3-large

Even if both are “OpenAI-like”, dimensions can differ. Log resolved values at startup.
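A minimal startup check can surface that drift before any vectors are written. The sketch below assumes both values come from the environment; `resolveEmbedderConfig` is a hypothetical helper name, not a CrewAI API.

```typescript
// Hypothetical startup check: resolve and log the embedding settings once,
// and refuse to boot if either value is missing (no silent defaults).
interface ResolvedEmbedderConfig {
  provider: string;
  model: string;
}

function resolveEmbedderConfig(
  env: Record<string, string | undefined>
): ResolvedEmbedderConfig {
  const provider = env.EMBEDDING_PROVIDER;
  const model = env.EMBEDDING_MODEL;
  if (!provider || !model) {
    throw new Error("EMBEDDING_PROVIDER and EMBEDDING_MODEL must be set explicitly");
  }
  // One log line per boot makes environment drift visible in production logs.
  console.log(`[startup] embedder: provider=${provider} model=${model}`);
  return { provider, model };
}
```

Comparing this one log line between local and production logs is often enough to spot the mismatch.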

4) Mixed documents in the same namespace/collection

If multiple services write into the same namespace, one service can poison another with incompatible vectors.

// Broken shared namespace
namespace = "prod"

Use service-specific namespaces:

namespace = `prod:${process.env.SERVICE_NAME}:v1`
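A small helper can enforce that naming convention instead of hand-building strings in each service. This is a sketch; `buildNamespace` is a hypothetical name.

```typescript
// Hypothetical namespace builder: scope writes per environment, service,
// and index version so one service cannot poison another's vectors.
function buildNamespace(env: string, serviceName: string | undefined, version: number): string {
  if (!serviceName) {
    throw new Error("SERVICE_NAME must be set");
  }
  return `${env}:${serviceName}:v${version}`;
}
```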

How to Debug It

  1. Print the resolved embedding model in both paths

    • Log ingestion and retrieval configs at startup.
    • Confirm they match exactly.
    • If they differ, stop there.
  2. Check the stored vector dimension

    • Inspect one record in your vector DB.
    • Compare it to your current model’s output.
    • For OpenAI-style embeddings:
      • text-embedding-3-small → usually 1536
      • text-embedding-3-large → usually 3072
  3. Verify collection or namespace history

    • Ask whether this index existed before a model change.
    • If yes, assume stale vectors until proven otherwise.
    • Rebuild into a new collection name instead of patching in place.
  4. Add a startup guard

    • Fail fast if the configured dimension does not match what your app expects.
const EXPECTED_DIMENSION = 1536;

function assertEmbeddingDimension(actualDimension: number) {
  if (actualDimension !== EXPECTED_DIMENSION) {
    throw new Error(
      `embedding dimension mismatch in production. Expected ${EXPECTED_DIMENSION}, got ${actualDimension}`
    );
  }
}
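To wire that guard in, pull one vector from the store at boot and check its length before serving traffic. The sketch below is self-contained (the guard is repeated so the snippet runs on its own); `fetchSampleVector` is a hypothetical stand-in for a real vector DB read, such as fetching a single point from Qdrant or Pinecone.

```typescript
const EXPECTED_DIMENSION = 1536;

function assertEmbeddingDimension(actualDimension: number): void {
  if (actualDimension !== EXPECTED_DIMENSION) {
    throw new Error(
      `embedding dimension mismatch. Expected ${EXPECTED_DIMENSION}, got ${actualDimension}`
    );
  }
}

// Usage sketch: `fetchSampleVector` is hypothetical — substitute your
// vector DB client's read call for one stored record.
async function verifyStoreOnBoot(
  fetchSampleVector: () => Promise<number[]>
): Promise<void> {
  const sample = await fetchSampleVector();
  assertEmbeddingDimension(sample.length); // fail fast before serving traffic
}
```

Failing at startup turns a confusing runtime error deep inside a query path into an obvious deploy-time failure.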

Prevention

  • Pin one embedding model per knowledge base

    • Don’t let defaults drift between environments.
    • Treat embedding changes like schema migrations.
  • Version your vector collections

    • Use names like kb-v1, kb-v2.
    • Reindex when changing models instead of reusing stale data.
  • Centralize embedder configuration

    • Export one shared config module for ingestion, retrieval, and agents.
    • Never duplicate embedding settings across files.
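Put together, the centralized module might look like this. Everything here is a sketch with assumed names (`EMBEDDER_CONFIG`, `EMBEDDING_DIMENSION`, `KB_COLLECTION`); adapt the values to your codebase.

```typescript
// embedder-config.ts — hypothetical single source of truth, imported by
// ingestion jobs, retrieval services, and agent definitions alike.
export const EMBEDDER_CONFIG = {
  provider: "openai",
  config: { model: "text-embedding-3-small" },
} as const;

// Keep the dimension next to the model so they can never drift apart.
export const EMBEDDING_DIMENSION = 1536; // matches text-embedding-3-small

// Bump the version suffix whenever the model (and dimension) changes.
export const KB_COLLECTION = "customer-support-kb-v2";
```

Every producer and consumer importing from one module makes an embedding change a single, reviewable diff.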

If you’re seeing CrewAI errors like Error: embedding dimension mismatch in production or a downstream failure inside VectorStore.addDocuments, fix it by aligning the model, rebuilding the index, and making the configuration explicit everywhere. That’s usually all it takes.



By Cyprian Aarons, AI Consultant at Topiax.
