# How to Fix 'vector search returning irrelevant results during development' in LangChain (Python)
If your LangChain vector search is returning irrelevant results during development, the retriever is usually doing exactly what you told it to do — just not what you intended. In practice, this shows up when chunking, embeddings, or metadata filters are misconfigured, and the top-k matches look semantically close but useless.
The most common symptom: you ask a question, and LangChain returns documents that share a few keywords with it but not the actual answer. You’ll often see this with a VectorStoreRetriever backed by FAISS, Chroma, or Pinecone.
## The Most Common Cause
The #1 cause is bad chunking plus weak embedding input. If you split documents too aggressively, strip structure, or embed noisy text, the vector store has nothing meaningful to work with.
A classic broken pattern is embedding raw text chunks without preserving context:
| Broken | Fixed |
|---|---|
| Split on arbitrary character counts | Split on semantic boundaries with overlap |
| Embed tiny fragments like headers alone | Keep enough surrounding context in each chunk |
| Query against low-signal chunks | Query against chunks that contain full meaning |
```python
# BROKEN
from langchain_text_splitters import CharacterTextSplitter
from langchain_openai import OpenAIEmbeddings
from langchain_community.vectorstores import FAISS

text_splitter = CharacterTextSplitter(chunk_size=200, chunk_overlap=0)
chunks = text_splitter.split_text(raw_policy_text)

embeddings = OpenAIEmbeddings()
vectorstore = FAISS.from_texts(chunks, embeddings)
retriever = vectorstore.as_retriever(search_kwargs={"k": 4})
docs = retriever.get_relevant_documents("What is the claims waiting period?")
```
```python
# FIXED
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_openai import OpenAIEmbeddings
from langchain_community.vectorstores import FAISS

text_splitter = RecursiveCharacterTextSplitter(
    chunk_size=800,
    chunk_overlap=150,
    separators=["\n\n", "\n", ". ", " ", ""],
)
chunks = text_splitter.split_text(raw_policy_text)

embeddings = OpenAIEmbeddings(model="text-embedding-3-small")
vectorstore = FAISS.from_texts(chunks, embeddings)
retriever = vectorstore.as_retriever(search_kwargs={"k": 6})
# invoke() replaces the deprecated get_relevant_documents()
docs = retriever.invoke("What is the claims waiting period?")
```
Why this matters:

- Too-small chunks lose context.
- Zero overlap breaks references across chunk boundaries.
- Plain character splitting often cuts tables, clauses, and definitions in half.
- If you’re using `RetrievalQA` or `create_retrieval_chain`, garbage chunks still flow downstream and look like a retrieval failure.
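The overlap point is easy to see without LangChain at all. The sketch below uses a naive character splitter (a stand-in for `CharacterTextSplitter`, not its actual implementation) on a made-up policy sentence: with zero overlap the key clause is cut across a chunk boundary, so no single chunk contains it; with overlap, at least one chunk keeps it whole.

```python
def split_chars(text: str, size: int, overlap: int) -> list[str]:
    """Naive character splitter with optional overlap (illustrative only)."""
    step = size - overlap
    return [text[i:i + size] for i in range(0, len(text), step)]

text = (
    "Section 4.2: The claims waiting period is 30 days from the policy "
    "start date, except for accidental damage claims."
)
phrase = "waiting period is 30 days"

# Zero overlap: the clause straddles a chunk boundary, so no chunk has it.
print(any(phrase in c for c in split_chars(text, size=40, overlap=0)))   # -> False
# With overlap: one chunk contains the whole clause.
print(any(phrase in c for c in split_chars(text, size=40, overlap=20)))  # -> True
```

An embedding of a chunk that only contains half the clause will never score well against the query "What is the claims waiting period?" — which is exactly how zero-overlap splitting turns into "irrelevant results."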
## Other Possible Causes
1) Wrong embedding model for your corpus
If you indexed with one embedding model and queried with another, similarity search becomes meaningless. This happens when developers rebuild part of the pipeline and forget to reindex.
```python
# BAD: indexed with one model, queried with another
index_embeddings = OpenAIEmbeddings(model="text-embedding-3-small")
query_embeddings = OpenAIEmbeddings(model="text-embedding-3-large")
```

Fix:

```python
# Use the same embeddings object/config across the whole retrieval lifecycle
embeddings = OpenAIEmbeddings(model="text-embedding-3-small")
vectorstore = FAISS.from_texts(chunks, embeddings)
```
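One way to make this mistake impossible is to record the embedding model next to the index and fail fast on mismatch. This is a sketch, not a LangChain feature — `save_index_manifest`, `check_index_manifest`, and the `manifest.json` filename are all hypothetical conventions you'd pick yourself.

```python
import json
from pathlib import Path

def save_index_manifest(index_dir: str, embedding_model: str) -> None:
    """Record which embedding model built this index (hypothetical helper)."""
    Path(index_dir).mkdir(parents=True, exist_ok=True)
    manifest = {"embedding_model": embedding_model}
    (Path(index_dir) / "manifest.json").write_text(json.dumps(manifest))

def check_index_manifest(index_dir: str, embedding_model: str) -> None:
    """Raise if the query-time model differs from the indexing model."""
    manifest = json.loads((Path(index_dir) / "manifest.json").read_text())
    if manifest["embedding_model"] != embedding_model:
        raise ValueError(
            f"Index built with {manifest['embedding_model']!r}, "
            f"queried with {embedding_model!r} -- rebuild the index."
        )

save_index_manifest("faiss_index", "text-embedding-3-small")
check_index_manifest("faiss_index", "text-embedding-3-small")  # OK
# check_index_manifest("faiss_index", "text-embedding-3-large") would raise
```

Call the check once at startup, before the first query, and silent model drift becomes a loud error.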
2) You forgot to normalize metadata filters
A bad filter can silently exclude the right documents and leave only irrelevant ones. This shows up a lot with Chroma.as_retriever(search_kwargs={"filter": ...}).
```python
# BAD: filter key/value mismatch
retriever = vectorstore.as_retriever(
    search_kwargs={"k": 5, "filter": {"tenantId": "123"}}
)

# FIXED: match the stored metadata schema exactly
retriever = vectorstore.as_retriever(
    search_kwargs={"k": 5, "filter": {"tenant_id": "123"}}
)
```
If your metadata schema is inconsistent, retrieval will look random even though the index is fine.
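A cheap defense is to normalize metadata once, at index time and at query time, so both sides agree by construction. A minimal sketch, assuming you want snake_case keys and string values (`normalize_metadata` is an illustrative helper, not a LangChain API):

```python
import re

def normalize_metadata(meta: dict) -> dict:
    """Normalize keys to snake_case and values to strings so index-time
    and query-time filters always agree (illustrative helper)."""
    out = {}
    for key, value in meta.items():
        snake = re.sub(r"(?<!^)(?=[A-Z])", "_", key).lower()
        out[snake] = str(value)
    return out

print(normalize_metadata({"tenantId": 123, "DocType": "policy"}))
# -> {'tenant_id': '123', 'doc_type': 'policy'}
```

Run every document's metadata and every filter dict through the same function, and "filter silently excludes the right documents" stops being a failure mode.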
3) Your documents are not deduplicated
Duplicate chunks dominate nearest-neighbor results. You’ll think retrieval is “irrelevant,” but it’s actually returning near-identical copies of boilerplate.
```python
# Example: exact-match dedupe before indexing
seen = set()
unique_chunks = []
for chunk in chunks:
    key = chunk.strip()
    if key not in seen:
        seen.add(key)
        unique_chunks.append(chunk)
```
4) k is too small or score thresholding is too aggressive
If you only retrieve 2 docs from a noisy index, you may miss the right one entirely. The same happens when using similarity score thresholds that are too strict.
```python
# Too strict: a high threshold can filter out every relevant document
retriever = vectorstore.as_retriever(
    search_type="similarity_score_threshold",
    search_kwargs={"score_threshold": 0.85},
)
```
Try this first:
```python
retriever = vectorstore.as_retriever(search_kwargs={"k": 8})
```
Then tune thresholds after inspecting actual scores.
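"Inspecting actual scores" can be as simple as running a few queries you know the answers to, noting the scores of the correct hits (for example via `vectorstore.similarity_search_with_score`), and setting the threshold just below the weakest of them. One caveat: some stores return distances where lower is better, so check which direction your scores run before thresholding. The `pick_threshold` helper below is a hypothetical sketch for the higher-is-better case:

```python
def pick_threshold(correct_hit_scores: list[float], margin: float = 0.05) -> float:
    """Return a threshold slightly below the weakest known-correct score,
    assuming higher scores mean more similar (hypothetical helper)."""
    return min(correct_hit_scores) - margin

# Scores you observed for documents verified to be correct answers:
observed = [0.78, 0.83, 0.71, 0.80]
print(round(pick_threshold(observed), 2))  # -> 0.66
```

A threshold derived from measured scores beats a number copied from a tutorial, because score distributions vary by embedding model and corpus.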
## How to Debug It
1) Inspect raw retrieved chunks

- Print the top 5 docs before they go into your chain.
- Check whether the issue is retrieval or generation.

```python
docs = retriever.invoke("What is the claims waiting period?")
for i, doc in enumerate(docs):
    print(i, doc.page_content[:300], doc.metadata)
```

2) Test with a known-answer query

- Use a question whose answer exists in one exact document.
- If retrieval still fails, your index setup is wrong.

3) Check embedding consistency

- Confirm the same embedding model was used for indexing and querying.
- Rebuild the index after changing models.

4) Verify chunk size and overlap

- Print chunk lengths.
- Look for tiny fragments or chopped sentences.

```python
print(min(len(c) for c in chunks), max(len(c) for c in chunks))
print(chunks[:3])
```
If you see lots of sub-200 character chunks from policy PDFs or legal docs, that’s usually your problem.
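When you do find those fragments, you can repair them before indexing instead of re-tuning the splitter blindly. A minimal sketch of one approach, merging undersized chunks forward into their neighbors (`merge_small_chunks` and the 200-character floor are illustrative choices, not LangChain defaults):

```python
def merge_small_chunks(chunks: list[str], min_len: int = 200) -> list[str]:
    """Merge chunks shorter than min_len into the following chunk so tiny
    fragments (lone headers, chopped sentences) never get embedded alone."""
    merged: list[str] = []
    carry = ""
    for chunk in chunks:
        candidate = (carry + "\n" + chunk).strip() if carry else chunk
        if len(candidate) < min_len:
            carry = candidate                      # still too small, keep accumulating
        else:
            merged.append(candidate)
            carry = ""
    if carry:                                      # trailing fragment: attach to last chunk
        if merged:
            merged[-1] = merged[-1] + "\n" + carry
        else:
            merged.append(carry)
    return merged

chunks = ["SECTION 3: CLAIMS", "The claims waiting period is 30 days. " * 6]
print(len(merge_small_chunks(chunks)))  # -> 1 (header merged into its section body)
```

Merging a lone header into its section body also means the header's keywords now travel with the text they label, which helps the embedding, not just the chunk count.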
## Prevention
- Use `RecursiveCharacterTextSplitter` for most production text corpora.
- Rebuild the entire vector index whenever embeddings change.
- Add retrieval tests with fixed queries and expected source documents.
- Log retrieved document IDs, scores, and metadata during development so bad config shows up immediately.
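A retrieval test can be one small assertion helper run against fixed queries. The sketch below keeps it self-contained with a stub retriever so the shape is clear; in a real suite you'd pass your actual retriever, and names like `assert_retrieves` and the `source` metadata key are illustrative conventions:

```python
def assert_retrieves(retriever, query: str, expected_source: str, k: int = 8) -> None:
    """Fail if none of the top-k retrieved docs comes from the expected source."""
    docs = retriever.invoke(query)[:k]
    sources = [d.metadata.get("source") for d in docs]
    assert expected_source in sources, f"{expected_source!r} not in {sources!r}"

# Stub stand-ins so this sketch runs without an index:
class StubDoc:
    def __init__(self, content: str, source: str):
        self.page_content = content
        self.metadata = {"source": source}

class StubRetriever:
    def invoke(self, query: str):
        return [StubDoc("The claims waiting period is 30 days.", "policy.pdf")]

assert_retrieves(StubRetriever(), "What is the claims waiting period?", "policy.pdf")
print("retrieval test passed")
```

Run a handful of these with known-answer queries on every index rebuild, and chunking or embedding regressions fail CI instead of reaching users.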
If you want reliable LangChain retrieval, treat indexing as part of your application logic, not a preprocessing script you run once and forget. Most “irrelevant results” bugs are really data-shaping bugs hiding behind a vector database.
## Keep learning

- The complete AI Agents Roadmap — my full 8-step breakdown
- Free: The AI Agent Starter Kit — PDF checklist + starter code
- Work with me — I build AI for banks and insurance companies
By Cyprian Aarons, AI Consultant at Topiax.