How to Fix 'vector search returning irrelevant results' in CrewAI (Python)

By Cyprian Aarons · Updated 2026-04-22

What This Error Usually Means

If your CrewAI agent is returning irrelevant vector search results, the retriever itself is usually working; the problem lies in your embeddings or chunking. In practice, this shows up as the agent pulling semantically distant documents even though the query looks correct.

You’ll usually see this after wiring VectorStoreRetrieverTool, Knowledge, or a custom RAG pipeline into a CrewAI agent and then asking for something specific like policy terms, claim rules, or account limits.

The Most Common Cause

The #1 cause is bad chunking plus mismatched embedding settings. You indexed one way, queried another way, and the vector store is doing exactly what you asked it to do.

The classic mistake is splitting documents too aggressively or using different embedding models for indexing and retrieval.

Broken vs fixed pattern

Broken pattern | Fixed pattern
--- | ---
Chunk size too small, no overlap | Chunk size large enough to preserve context
Index with one embedding model | Query with the same embedding model
Store raw PDFs as-is | Clean text before embedding
# BROKEN
from crewai import Agent, Task, Crew
from crewai_tools import VectorStoreTool
from langchain_openai import OpenAIEmbeddings
from langchain_community.vectorstores import Chroma
from langchain_text_splitters import RecursiveCharacterTextSplitter

docs = load_policy_docs()  # returns raw PDF text with headers/footers mixed in

splitter = RecursiveCharacterTextSplitter(
    chunk_size=200,
    chunk_overlap=0,
)

chunks = splitter.split_documents(docs)

# Index with one embedding model
index_embeddings = OpenAIEmbeddings(model="text-embedding-3-small")
vectordb = Chroma.from_documents(chunks, index_embeddings)

# Query with a different model later in another process:
query_embeddings = OpenAIEmbeddings(model="text-embedding-ada-002")  # mismatch with the index model

tool = VectorStoreTool(vectorstore=vectordb)

# FIXED
from crewai import Agent, Task, Crew
from crewai_tools import VectorStoreTool
from langchain_openai import OpenAIEmbeddings
from langchain_community.vectorstores import Chroma
from langchain_text_splitters import RecursiveCharacterTextSplitter

docs = load_policy_docs_cleaned()  # stripped headers/footers, normalized whitespace

splitter = RecursiveCharacterTextSplitter(
    chunk_size=800,
    chunk_overlap=120,
)

chunks = splitter.split_documents(docs)

embeddings = OpenAIEmbeddings(model="text-embedding-3-small")

vectordb = Chroma.from_documents(
    documents=chunks,
    embedding=embeddings,
    collection_name="policy_knowledge",
)

tool = VectorStoreTool(vectorstore=vectordb)

Why this matters:

  • Small chunks often lose the sentence that gives the answer meaning.
  • Zero overlap cuts off context at boundaries.
  • Different embedding models produce incompatible vector spaces.
  • Garbage-in text from PDFs creates noisy nearest neighbors.

Other Possible Causes

1) You indexed low-quality text

If your source has repeated headers, footers, page numbers, OCR noise, or tables flattened into nonsense, similarity search gets polluted.

# Bad: raw OCR/PDF extraction
text = "CONFIDENTIAL\nPage 12\nClaim payout limit ...\nCONFIDENTIAL"

# Better: clean before chunking
text = clean_text(text)
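`clean_text` above is a stand-in. Here is a minimal sketch of what it might do for PDF extractions (the boilerplate patterns are examples; tune them to your corpus):

```python
import re

def clean_text(text: str) -> str:
    """Strip common PDF boilerplate before chunking (patterns are examples)."""
    cleaned = []
    for line in text.splitlines():
        stripped = line.strip()
        # Drop repeated watermarks/headers and page numbers.
        if stripped == "CONFIDENTIAL":
            continue
        if re.fullmatch(r"Page \d+", stripped):
            continue
        cleaned.append(stripped)
    # Normalize whitespace so chunk boundaries are stable.
    return re.sub(r"\s+", " ", " ".join(cleaned)).strip()
```

Run it on the bad example above and only the substantive line survives.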

2) Your retriever is using the wrong search type

If you need broader recall but use strict cosine top-k only, you may miss the right chunk and get nearby junk instead.

retriever = vectordb.as_retriever(
    search_type="mmr",   # better recall/diversity for many corpora
    search_kwargs={"k": 5, "fetch_k": 20},
)

If your corpus is small and highly structured, similarity may be fine. For messy enterprise docs, mmr often behaves better.
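For intuition about what `mmr` actually changes, here is a toy, pure-Python version of maximal marginal relevance. This is a simplification with 2-D vectors, not LangChain's implementation:

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b)))

def mmr(query, docs, k=2, lambda_mult=0.5):
    """Greedy maximal marginal relevance: balance relevance to the query
    against similarity to already-selected docs."""
    selected, candidates = [], list(range(len(docs)))
    while candidates and len(selected) < k:
        def score(i):
            relevance = cosine(query, docs[i])
            redundancy = max((cosine(docs[i], docs[j]) for j in selected), default=0.0)
            return lambda_mult * relevance - (1 - lambda_mult) * redundancy
        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)
    return selected

# Two near-duplicate docs plus one distinct doc, all equally query-relevant.
query = [1.0, 0.0]
docs = [[0.8, 0.6], [0.8, 0.6], [0.8, -0.6]]
# Plain similarity top-2 returns the duplicate pair (indices 0 and 1);
# MMR keeps one duplicate and swaps in the distinct doc (indices 0 and 2).
```

That diversity effect is why `mmr` tends to surface the right chunk in corpora full of near-duplicate boilerplate.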

3) Metadata filtering is too broad or too narrow

A bad filter can silently remove the correct documents before ranking happens.

# Too broad: returns unrelated docs from every business unit
retriever.search_kwargs = {
    "filter": {"tenant_id": "acme"}
}

# Better: add domain-specific filters
retriever.search_kwargs = {
    "filter": {"tenant_id": "acme", "doc_type": "claims_policy"}
}
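Before trusting a filter, count how many indexed chunks actually match it. Here is a pure-Python sketch assuming exact-match filter semantics and that you can dump your store's metadata (adapt the dump step to your vector store's API):

```python
def matches(metadata: dict, flt: dict) -> bool:
    """Exact-match semantics, like a simple vector-store equality filter."""
    return all(metadata.get(key) == value for key, value in flt.items())

def filter_coverage(all_metadata: list, flt: dict) -> int:
    """How many indexed chunks survive the filter before ranking happens?"""
    return sum(1 for md in all_metadata if matches(md, flt))

all_metadata = [
    {"tenant_id": "acme", "doc_type": "claims_policy"},
    {"tenant_id": "acme", "doc_type": "hr_handbook"},
    {"tenant_id": "globex", "doc_type": "claims_policy"},
]

broad = filter_coverage(all_metadata, {"tenant_id": "acme"})
narrow = filter_coverage(all_metadata, {"tenant_id": "acme", "doc_type": "claims_policy"})
typo = filter_coverage(all_metadata, {"doc_type": "claims-policy"})  # hyphen typo: 0 matches
```

A coverage of zero means the filter silently removed everything, and the retriever will return whatever junk passes, or nothing.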

4) Your query rewrite step is distorting the user question

CrewAI agents often rephrase queries before retrieval. If the rewrite becomes vague, your vector search follows that vagueness.

# Risky prompt behavior:
"Rewrite the user's question to be shorter."

# Better:
"Rewrite only to fill in missing context. Preserve domain terms like policy number,
claim type, coverage name, and dates verbatim."
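A cheap guard is to check that the rewrite kept the domain terms and fall back to the original question when it did not. A sketch; the term list is an example you would replace with your own vocabulary:

```python
DOMAIN_TERMS = ["policy", "claim", "coverage", "payout", "deductible"]  # example list

def safe_query(original: str, rewritten: str, terms=DOMAIN_TERMS) -> str:
    """Use the rewritten query only if it preserves every domain term
    that appeared in the original question."""
    original_lower = original.lower()
    rewritten_lower = rewritten.lower()
    for term in terms:
        if term in original_lower and term not in rewritten_lower:
            return original  # rewrite dropped a domain term, fall back
    return rewritten
```

So a rewrite that turns "What is my claim payout limit?" into "What is my limit?" gets rejected, and retrieval runs on the original question instead.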

How to Debug It

  1. Inspect the top-k raw matches
    • Print retrieved chunks before they reach the agent.
    • If the top results are obviously wrong, this is an indexing/retrieval issue, not an agent issue.
results = retriever.get_relevant_documents("What is my claim payout limit?")  # or retriever.invoke(...) on newer LangChain
for i, doc in enumerate(results[:5]):
    print(i, doc.page_content[:300], doc.metadata)
  2. Check whether embeddings match

    • Confirm the same embedding class and model are used for indexing and querying.
    • Inconsistent models are a common source of irrelevant results.
  3. Test with a known exact query

    • Use a phrase you know exists in one document.
    • If retrieval still fails on an exact match-like query, chunking or cleaning is broken.
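A related check: confirm the exact phrase even survived splitting and cleaning. If no chunk contains it, retrieval cannot find it no matter how the query is embedded. A sketch with hypothetical chunks:

```python
def chunks_containing(chunks: list, phrase: str) -> list:
    """Indices of chunks whose text contains the exact phrase (case-insensitive)."""
    needle = phrase.lower()
    return [i for i, chunk in enumerate(chunks) if needle in chunk.lower()]

chunks = [
    "Section 4: Claims. The claim payout limit is USD 50,000 per incident.",
    "Section 5: Exclusions. Flood damage is not covered.",
]
hits = chunks_containing(chunks, "claim payout limit")
# If hits is empty, the phrase was cut at a chunk boundary or removed
# during cleaning; fix chunking before blaming the retriever.
```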
  4. Temporarily remove filters and query rewriting

    • Disable metadata filters.
    • Bypass any LLM-based query transformation.
    • If results improve immediately, the bug is in those layers.

Prevention

  • Use one embedding model per collection and lock it in config.
  • Standardize document cleaning before indexing: remove boilerplate, normalize whitespace, strip OCR noise.
  • Tune chunking per document type:
    • policies: larger chunks with overlap
    • FAQs: smaller chunks are acceptable
    • tables/forms: extract structure before embedding

A good production pattern is to version your index config alongside code:

INDEX_CONFIG = {
    "embedding_model": "text-embedding-3-small",
    "chunk_size": 800,
    "chunk_overlap": 120,
    "search_type": "mmr",
}

If you keep seeing irrelevant hits in CrewAI Python projects, start with chunking and embeddings first. That’s where most RAG failures actually live.



By Cyprian Aarons, AI Consultant at Topiax.
