How to Integrate Haystack for lending with Elasticsearch for RAG

By Cyprian Aarons | Updated 2026-04-21

Haystack for lending plus Elasticsearch gives you a clean RAG stack for credit, underwriting, and loan servicing workflows. Haystack handles retrieval and orchestration; Elasticsearch gives you fast full-text search, filtering, and vector retrieval over borrower docs, policy docs, and product knowledge.

For lending agents, this combo is useful when the answer needs to come from internal policy, not a generic model. Think loan eligibility checks, document Q&A, adverse action explanations, and servicing support where traceability matters.

Prerequisites

•Python 3.10+
•An Elasticsearch cluster running locally or in your VPC
•An Elasticsearch user with index read/write permissions
•Haystack installed with Elasticsearch integration support
•Access to an embedding model for vector indexing
•Lending documents ready as plain text, JSON, or extracted PDF text

Install the packages:

pip install haystack-ai elasticsearch-haystack elasticsearch sentence-transformers

If you are using a managed Elasticsearch service, make sure you have:

•ELASTICSEARCH_URL
•username/password or API key
•TLS settings configured correctly
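
One way to keep those settings out of the code is to read them from the environment. The sketch below is an assumption about how you might structure this, not something Haystack or Elasticsearch requires. Only ELASTICSEARCH_URL comes from the list above; the other variable names and the helper `es_client_kwargs` are hypothetical.

```python
import os


def es_client_kwargs(env=None):
    """Build keyword arguments for elasticsearch.Elasticsearch from environment variables."""
    env = os.environ if env is None else env
    kwargs = {"hosts": env["ELASTICSEARCH_URL"], "verify_certs": True}

    api_key = env.get("ELASTICSEARCH_API_KEY")  # hypothetical variable name
    if api_key:
        kwargs["api_key"] = api_key
    else:
        # Fall back to username/password (hypothetical variable names).
        kwargs["basic_auth"] = (
            env["ELASTICSEARCH_USERNAME"],
            env["ELASTICSEARCH_PASSWORD"],
        )

    ca_certs = env.get("ELASTICSEARCH_CA_CERTS")  # hypothetical variable name
    if ca_certs:
        # Path to the CA bundle used to verify the cluster's TLS certificate.
        kwargs["ca_certs"] = ca_certs
    return kwargs
```

With the variables set, the client can then be created as `Elasticsearch(**es_client_kwargs())`.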

Integration Steps

•Create an Elasticsearch connection and index your lending corpus.

Use Haystack’s document abstraction and Elasticsearch’s client to prepare the store. For lending use cases, keep metadata like product_type, jurisdiction, policy_version, and doc_type.

from elasticsearch import Elasticsearch
from haystack import Document

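# Local development settings only. In production, use real credentials
# and keep certificate verification enabled.
es = Elasticsearch(
    "https://localhost:9200",
    basic_auth=("elastic", "changeme"),
    verify_certs=False,
)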

docs = [
    Document(
        content="Borrowers must provide two months of bank statements for unsecured personal loans.",
        meta={"doc_type": "policy", "product_type": "personal_loan", "jurisdiction": "US"}
    ),
    Document(
        content="Debt-to-income ratio above 45% requires manual review.",
        meta={"doc_type": "underwriting_rule", "product_type": "mortgage", "jurisdiction": "US"}
    ),
]

•Build an embedding pipeline and write documents into Elasticsearch.

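Haystack pipelines let you preprocess documents before storage. To keep this example short, the code here embeds each document directly with sentence-transformers and writes it through the Elasticsearch client. In production, chunk large policy PDFs before indexing.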

from sentence_transformers import SentenceTransformer

model = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")

for doc in docs:
    doc.embedding = model.encode(doc.content).tolist()

index_name = "lending-rag"

if not es.indices.exists(index=index_name):
    es.indices.create(
        index=index_name,
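        mappings={
            "properties": {
                "content": {"type": "text"},
                # all-MiniLM-L6-v2 produces 384-dimensional vectors. The field
                # must be indexed for kNN search (set explicitly for older 8.x).
                "embedding": {
                    "type": "dense_vector",
                    "dims": 384,
                    "index": True,
                    "similarity": "cosine",
                },
                "meta.doc_type": {"type": "keyword"},
                "meta.product_type": {"type": "keyword"},
                "meta.jurisdiction": {"type": "keyword"},
            }
        },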
    )

for i, doc in enumerate(docs):
    es.index(
        index=index_name,
        id=str(i),
        document={
            "content": doc.content,
            "embedding": doc.embedding,
            **{"meta." + k: v for k, v in doc.meta.items()},
        },
    )

es.indices.refresh(index=index_name)
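
For the chunking step, a minimal sketch might split extracted policy text into overlapping word windows and copy the policy metadata onto every chunk. The helper names `chunk_text` and `chunk_policy`, the window sizes, and the `chunk_index` field are all assumptions for illustration. Tune the sizes against your embedding model's input limit.

```python
def chunk_text(text, max_words=120, overlap=20):
    """Split text into overlapping windows of at most max_words words."""
    if overlap >= max_words:
        raise ValueError("overlap must be smaller than max_words")
    words = text.split()
    step = max_words - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_words]))
        if start + max_words >= len(words):
            break  # the last window already reached the end of the text
    return chunks


def chunk_policy(text, meta, max_words=120, overlap=20):
    """Return one dict per chunk, each carrying the policy metadata."""
    return [
        {"content": chunk, "meta": {**meta, "chunk_index": i}}
        for i, chunk in enumerate(chunk_text(text, max_words, overlap))
    ]
```

Each returned dict can be turned into a `Document(content=..., meta=...)` and embedded the same way as the two sample documents.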

•Wire Haystack retrieval to Elasticsearch.

Use Haystack’s retriever layer to query Elasticsearch by embedding similarity and metadata filters. This is where lending-specific routing starts to matter.

from haystack import Pipeline
from haystack.components.embedders import SentenceTransformersTextEmbedder
# Requires the Elasticsearch integration package: pip install elasticsearch-haystack
from haystack_integrations.components.retrievers.elasticsearch import ElasticsearchEmbeddingRetriever
from haystack_integrations.document_stores.elasticsearch import ElasticsearchDocumentStore

query_embedder = SentenceTransformersTextEmbedder(model="sentence-transformers/all-MiniLM-L6-v2")

# Point the document store at the index created above. Extra keyword
# arguments are passed through to the Elasticsearch client.
# The exact import paths depend on your Haystack version/package layout.
document_store = ElasticsearchDocumentStore(
    hosts="https://localhost:9200",
    index=index_name,
    basic_auth=("elastic", "changeme"),
    verify_certs=False,
)

pipeline = Pipeline()
pipeline.add_component("embedder", query_embedder)
pipeline.add_component("retriever", ElasticsearchEmbeddingRetriever(document_store=document_store, top_k=3))

pipeline.connect("embedder.embedding", "retriever.query_embedding")

•Add an LLM generator for grounded answers.

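Once retrieval returns relevant policy chunks, build a prompt from them and pass it to a generator component. The snippet here wires the prompt builder. Connect its prompt output to whichever generator component your LLM provider uses. Keep the prompt strict: answer only from retrieved context, and say so explicitly when the context does not contain the answer.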

from haystack.components.builders import PromptBuilder

template = """
You are a lending operations assistant.
Answer only from the provided context.
If the context does not contain the answer, say that the information is not available.

Context:
{% for doc in documents %}
- {{ doc.content }}
{% endfor %}

Question: {{ question }}
Answer:
"""

prompt_builder = PromptBuilder(template=template)

pipeline.add_component("prompt_builder", prompt_builder)
pipeline.connect("retriever.documents", "prompt_builder.documents")
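
Because traceability matters in lending, it can help to number the retrieved chunks so the model can cite them. The following is a plain-Python sketch of that idea, independent of PromptBuilder. The function names and the `source` key are hypothetical, and the wording of the instructions is only a starting point.

```python
def format_context(chunks):
    """Render retrieved chunks as a numbered list so answers can cite [n]."""
    lines = []
    for i, chunk in enumerate(chunks, start=1):
        source = chunk.get("source", "unknown source")  # hypothetical key
        lines.append(f"[{i}] ({source}) {chunk['content']}")
    return "\n".join(lines)


def build_prompt(question, chunks):
    """Build a strict prompt that asks for citations and flags missing context."""
    context = format_context(chunks) or "(no context retrieved)"
    return (
        "You are a lending operations assistant.\n"
        "Answer only from the numbered context and cite the entries you used, e.g. [1].\n"
        "If the context does not contain the answer, say the information is not available.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\n"
        "Answer:"
    )
```

The same numbering can be reproduced in the Jinja template with `loop.index` if you prefer to stay inside PromptBuilder.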

•Execute the RAG flow with a lending question.

This is the part your agent will call at runtime. The query should include enough context to route correctly by product or jurisdiction if needed.

question = "What documents are required for an unsecured personal loan?"

# The retriever's query_embedding is supplied by the connected embedder,
# so only the unconnected inputs are passed here.
result = pipeline.run({
    "embedder": {"text": question},
    "prompt_builder": {"question": question},
})

print(result)
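
If the agent needs to route by product before retrieval, one simple option is a keyword lookup on the question. This is a deliberately naive sketch. The `ROUTES` table and `route_product_type` are hypothetical, and a real deployment might use a classifier or structured input from the calling application instead.

```python
# Hypothetical mapping from phrases in the question to product_type values.
ROUTES = {
    "personal loan": "personal_loan",
    "mortgage": "mortgage",
}


def route_product_type(question):
    """Return the product_type implied by the question, or None if unclear."""
    lowered = question.lower()
    for phrase, product_type in ROUTES.items():
        if phrase in lowered:
            return product_type
    return None
```

The returned value can then be used as a metadata filter on `product_type` when querying the index. When it is None, fall back to an unfiltered search.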

Testing the Integration

Run a direct Elasticsearch check first, then verify retrieval returns the right policy text.

query_vector = model.encode("What documents are required for an unsecured personal loan?").tolist()

resp = es.search(
    index=index_name,
    knn={
        "field": "embedding",
        "query_vector": query_vector,
        "k": 3,
        "num_candidates": 10,
    },
)

for hit in resp["hits"]["hits"]:
    print(hit["_source"]["content"])

Expected output (most similar hit first; both sample documents are returned because k exceeds the corpus size):

Borrowers must provide two months of bank statements for unsecured personal loans.
Debt-to-income ratio above 45% requires manual review.

If you get those results back, your retrieval layer is working and Haystack can now ground responses on top of it.
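
To check metadata routing as well, the same direct query can be restricted with a filter on the keyword fields defined in the mapping. The helper below is a sketch that only builds the `knn` clause. `build_knn_clause` is a hypothetical name, and it assumes the `meta.`-prefixed field names used when the documents were indexed above.

```python
def build_knn_clause(query_vector, k=3, num_candidates=10, **meta_filters):
    """Build a kNN search clause, optionally restricted by metadata term filters."""
    clause = {
        "field": "embedding",
        "query_vector": list(query_vector),
        "k": k,
        "num_candidates": num_candidates,
    }
    # One term filter per metadata field; None values are skipped.
    terms = [
        {"term": {f"meta.{key}": value}}
        for key, value in meta_filters.items()
        if value is not None
    ]
    if terms:
        clause["filter"] = {"bool": {"filter": terms}}
    return clause
```

For example, `es.search(index=index_name, knn=build_knn_clause(query_vector, product_type="personal_loan"))` should return only the personal loan policy from the two sample documents.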

Real-World Use Cases

•Loan policy assistant that answers questions about income verification, collateral rules, exceptions, and required disclosures.
•Underwriting copilot that retrieves borrower-specific guidance from policy docs and routes edge cases for manual review.
•Servicing agent that explains payment deferrals, late fee rules, hardship programs, and account maintenance steps with citations.

By Cyprian Aarons, AI Consultant at Topiax.
