How to Integrate Haystack for Retail Banking with Elasticsearch for RAG

By Cyprian Aarons · Updated 2026-04-21
Tags: haystack-for-retail-banking, elasticsearch, rag

Combining Haystack for retail banking with Elasticsearch gives you a clean pattern for retrieval-augmented generation in regulated banking workflows. Haystack handles the agent and pipeline orchestration, while Elasticsearch gives you fast full-text search, filters, and scalable document retrieval over policies, product docs, FAQs, and customer support content.

This setup is useful when your agent needs to answer questions about loan eligibility, card disputes, fee schedules, or KYC requirements with grounded answers pulled from approved bank content.

Prerequisites

  • Python 3.10+
  • An Elasticsearch cluster running locally or in your VPC
  • A Haystack-based retail banking project installed
  • Access to an embedding model
  • Bank-approved documents loaded as text chunks
  • Environment variables set for connection details (a quick sanity check follows this list):
    • ELASTICSEARCH_URL
    • ELASTICSEARCH_INDEX
    • OPENAI_API_KEY or another embedding provider key if you use one
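
A quick sanity check that these settings are visible to your process. This is a convenience sketch, not part of the pipeline; it reports presence only, never secret values:

import os

# Confirm each required variable is set without echoing its value.
for var in ("ELASTICSEARCH_URL", "ELASTICSEARCH_INDEX", "OPENAI_API_KEY"):
    print(var, "is set" if os.environ.get(var) else "is NOT set")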

Integration Steps

  1. Install the required packages

    Start with the core Haystack and Elasticsearch dependencies. The elasticsearch-haystack package provides the Elasticsearch document store and retriever imported in later steps.

    pip install haystack-ai elasticsearch elasticsearch-haystack sentence-transformers
    

    If your retail banking stack uses a specific Haystack integration package or internal wrapper, install that too. The important part is that your pipeline can create embeddings, write documents to Elasticsearch, and query them back.

  2. Connect to Elasticsearch and create an index

    Use the official Elasticsearch Python client to verify connectivity and create a dedicated index for banking content.

    import os
    from elasticsearch import Elasticsearch
    
    es = Elasticsearch(os.environ["ELASTICSEARCH_URL"])
    
    index_name = os.environ.get("ELASTICSEARCH_INDEX", "banking-rag")
    
    if not es.indices.exists(index=index_name):
        es.indices.create(
            index=index_name,
            mappings={
                "properties": {
                    "content": {"type": "text"},
                    "title": {"type": "text"},
                    "source": {"type": "keyword"},
                    "embedding": {"type": "dense_vector", "dims": 384}
                }
            }
        )
    
    print(es.info())
    

    Keep the schema simple. For RAG, you need searchable text fields plus a vector field for semantic retrieval.
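
    The source keyword field also supports exact-match filtering, which matters when answers must come only from approved content. A sketch using the client from above (the query text and source value are illustrative):

    # Lexical search restricted to a single approved source.
    resp = es.search(
        index=index_name,
        query={
            "bool": {
                "must": {"match": {"content": "card replacement"}},
                "filter": {"term": {"source": "support-center"}},
            }
        },
    )
    print(resp["hits"]["total"])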

  3. Load banking documents into Haystack documents

    In Haystack, represent each policy or FAQ as a Document. For retail banking, this usually means product terms, interest rules, fee tables, dispute handling docs, and compliance-approved help center articles.

    from haystack import Document
    
    docs = [
        Document(
            content="Debit card replacement takes 5 to 7 business days. Expedited delivery is available for premium accounts.",
            meta={"title": "Debit Card Replacement", "source": "support-center"}
        ),
        Document(
            content="Personal loan applicants must be at least 18 years old and have verifiable income.",
            meta={"title": "Personal Loan Eligibility", "source": "lending-policy"}
        ),
    ]
    
    print(len(docs))
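
    Real policy documents are much longer than these samples, so split them into chunks before indexing. A minimal sketch with Haystack's DocumentSplitter (the split sizes are illustrative starting points):

    from haystack.components.preprocessors import DocumentSplitter

    splitter = DocumentSplitter(split_by="word", split_length=200, split_overlap=20)
    chunks = splitter.run(documents=docs)["documents"]
    print(len(chunks))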
    
  4. Embed and write documents to Elasticsearch

    Use Haystack components to generate embeddings, then store both text and vectors in Elasticsearch. In production, you would typically run this in an indexing job.

    import os
    from haystack.components.embedders import SentenceTransformersDocumentEmbedder
    from haystack.components.writers import DocumentWriter
    from haystack_integrations.document_stores.elasticsearch import ElasticsearchDocumentStore
    
    # Reuses the index from step 2 if it already exists; the store creates it otherwise.
    document_store = ElasticsearchDocumentStore(
        hosts=os.environ["ELASTICSEARCH_URL"],
        index=os.environ.get("ELASTICSEARCH_INDEX", "banking-rag"),
        embedding_similarity_function="cosine",
    )
    
    embedder = SentenceTransformersDocumentEmbedder(model="sentence-transformers/all-MiniLM-L6-v2")
    embedder.warm_up()
    
    embedded_docs = embedder.run(docs)["documents"]
    
    writer = DocumentWriter(document_store=document_store)
    writer.run(embedded_docs)
    
    print("Indexed documents:", len(embedded_docs))
    
  5. Build the RAG query pipeline

    Wire up retrieval from Elasticsearch with generation in Haystack. The retriever pulls the top matching chunks; the prompt builder feeds those chunks into your LLM.

    from haystack import Pipeline
    from haystack.components.builders import PromptBuilder
    from haystack.components.generators import OpenAIGenerator
    from haystack.components.embedders import SentenceTransformersTextEmbedder
    from haystack_integrations.components.retrievers.elasticsearch import (
        ElasticsearchEmbeddingRetriever,
    )

    template = """
    Answer the question using only the provided context.

    Context:
    {% for doc in documents %}
    - {{ doc.content }}
    {% endfor %}

    Question: {{ question }}
    Answer:
    """

    query_embedder = SentenceTransformersTextEmbedder(model="sentence-transformers/all-MiniLM-L6-v2")
    retriever = ElasticsearchEmbeddingRetriever(document_store=document_store)
    prompt_builder = PromptBuilder(template=template)
    generator = OpenAIGenerator(model="gpt-4o-mini")

    rag_pipeline = Pipeline()
    rag_pipeline.add_component("query_embedder", query_embedder)
    rag_pipeline.add_component("retriever", retriever)
    rag_pipeline.add_component("prompt_builder", prompt_builder)
    rag_pipeline.add_component("generator", generator)

    rag_pipeline.connect("query_embedder.embedding", "retriever.query_embedding")
    rag_pipeline.connect("retriever.documents", "prompt_builder.documents")
    rag_pipeline.connect("prompt_builder.prompt", "generator.prompt")

    # The question is passed twice: once to embed for retrieval,
    # once to fill the {{ question }} slot in the prompt.
    result = rag_pipeline.run(
        {
            "query_embedder": {"text": "How long does debit card replacement take?"},
            "prompt_builder": {"question": "How long does debit card replacement take?"},
        }
    )

    print(result["generator"]["replies"][0])
    

Testing the Integration

Run a direct retrieval test first so you know Elasticsearch is returning the right chunk before you involve generation.
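
A minimal retrieval-only check, assuming the query_embedder and retriever components from step 5 are in scope:

query = "What are the requirements for a personal loan?"

query_embedder.warm_up()  # loads the embedding model if the pipeline has not run yet
embedding = query_embedder.run(text=query)["embedding"]
hits = retriever.run(query_embedding=embedding, top_k=3)["documents"]

for doc in hits:
    print(doc.meta.get("title"), "->", doc.content[:80])

Once the expected document comes back, run the full pipeline end to end: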

query = "What are the requirements for a personal loan?"
result = rag_pipeline.run(
    {
        "query_embedder": {"text": query},
        "prompt_builder": {"question": query},
    }
)

print(result["generator"]["replies"][0])

Expected output:

Applicants must be at least 18 years old and have verifiable income.

If you get an unrelated answer, check these first:

  • The document chunks are too large or too small
  • The embedding model used at query time is not the same model used at indexing time
  • Your Elasticsearch vector field dimensions do not match the embedder output (a quick check follows this list)
  • The prompt does not restrict answers to the retrieved context
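
A quick way to rule out the dimension mismatch, assuming es, index_name, and query_embedder from the earlier steps are in scope and the index uses the mapping from step 2:

embedding = query_embedder.run(text="dimension check")["embedding"]
mapping = es.indices.get_mapping(index=index_name)

# Compare the embedder output size with the dense_vector mapping.
dims = mapping[index_name]["mappings"]["properties"]["embedding"]["dims"]
print(len(embedding), dims)  # both should be 384 for all-MiniLM-L6-v2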

Real-World Use Cases

  • Retail banking support agent
    Answer customer questions about fees, card delivery times, account opening rules, overdraft policies, and loan eligibility using approved internal content only.

  • Branch staff assistant
    Help frontline staff retrieve product terms quickly during customer conversations without searching multiple portals.

  • Compliance-grounded FAQ bot
    Serve policy-safe responses for KYC, disputes, chargebacks, and disclosures by retrieving from controlled bank documentation indexed in Elasticsearch (a filtered-retrieval sketch follows this list).
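
For the last case, the retriever accepts Haystack metadata filters at query time, so answers can be hard-restricted to controlled sources. A sketch reusing the step 5 components (the source value is illustrative):

embedding = query_embedder.run(text="Who qualifies for a personal loan?")["embedding"]
policy_hits = retriever.run(
    query_embedding=embedding,
    filters={"field": "meta.source", "operator": "==", "value": "lending-policy"},
)["documents"]

print([doc.meta.get("title") for doc in policy_hits])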


Keep learning

By Cyprian Aarons, AI Consultant at Topiax.

Want the complete 8-step roadmap?

Grab the free AI Agent Starter Kit — architecture templates, compliance checklists, and a 7-email deep-dive course.

Get the Starter Kit
