How to Integrate Haystack for banking with Elasticsearch for multi-agent systems

By Cyprian Aarons. Updated 2026-04-21

Combining Haystack for banking with Elasticsearch gives you a practical retrieval layer for multi-agent systems that need to answer questions over policies, product docs, transaction notes, and compliance artifacts. The pattern is simple: Haystack handles agent orchestration and retrieval logic, while Elasticsearch gives you fast, filterable search over large banking corpora.

For banking teams, this is useful when one agent needs to fetch KYC policy clauses, another needs product eligibility rules, and a third needs audit evidence from indexed documents. Instead of hardcoding lookups or scanning PDFs manually, you wire both systems together and let each agent query the same searchable knowledge base.

Prerequisites

  • Python 3.10+
  • An Elasticsearch cluster running locally or in Elastic Cloud
  • Access to your Haystack for banking package and its document store components
  • API credentials for Elasticsearch if you are not using local development mode
  • A corpus of banking documents to index:
    • policy PDFs
    • product FAQs
    • compliance notes
    • internal runbooks
  • pip installed

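Install the main dependencies. The elasticsearch-haystack package provides the haystack_integrations modules imported in the steps below, and sentence-transformers backs the embedding components:

pip install haystack-ai elasticsearch elasticsearch-haystack sentence-transformers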

If your Haystack for banking distribution ships as a separate package or internal wheel, install that too:

pip install haystack-for-banking

Integration Steps

  1. Set up Elasticsearch and create a dedicated index.

For banking workloads, keep search isolated by domain and environment. Use one index per document class or business unit so agents can filter precisely.

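from elasticsearch import Elasticsearch

# Local development credentials only; load real credentials from a secrets manager.
es = Elasticsearch(
    "http://localhost:9200",
    basic_auth=("elastic", "changeme"),
)

index_name = "banking_knowledge_base"

# Create the index once, with explicit mappings for the fields agents filter on.
if not es.indices.exists(index=index_name):
    es.indices.create(
        index=index_name,
        mappings={
            "properties": {
                "content": {"type": "text"},
                "source": {"type": "keyword"},
                "doc_type": {"type": "keyword"},
                "customer_segment": {"type": "keyword"},
                # Vector field for semantic search; 384 dims matches all-MiniLM-L6-v2.
                "embedding": {
                    "type": "dense_vector",
                    "dims": 384,
                    "index": True,
                    "similarity": "cosine",
                },
            }
        },
    )

# Confirm the cluster is reachable.
print(es.info())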
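
The advice above about one index per document class or business unit can be made concrete with a small naming helper. This is a minimal sketch: the function name build_index_name and the environment-unit-class convention are illustrative assumptions, not part of Haystack or Elasticsearch. The only hard rule it relies on is that Elasticsearch index names must be lowercase.

```python
import re


# Hypothetical naming convention: <environment>-<business_unit>-<doc_class>.
def build_index_name(environment: str, business_unit: str, doc_class: str) -> str:
    """Build a lowercase Elasticsearch index name from three labels."""
    parts = []
    for label in (environment, business_unit, doc_class):
        # Lowercase the label and replace anything outside a-z/0-9 with "_".
        cleaned = re.sub(r"[^a-z0-9]+", "_", label.strip().lower()).strip("_")
        if not cleaned:
            raise ValueError(f"empty label after cleaning: {label!r}")
        parts.append(cleaned)
    return "-".join(parts)


index_names = [
    build_index_name("dev", "Retail Banking", "policy"),
    build_index_name("dev", "Retail Banking", "compliance"),
]
print(index_names)
```

Generating names from one function keeps agents from drifting onto slightly different index names across environments.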

  2. Connect Haystack to Elasticsearch using an Elasticsearch-backed document store.

Haystack’s Elasticsearch integration lets you write documents once and retrieve them from multiple agents later. Use the document store as the shared memory layer for your system.

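from haystack import Document
from haystack.components.embedders import SentenceTransformersDocumentEmbedder
from haystack.document_stores.types import DuplicatePolicy
from haystack_integrations.document_stores.elasticsearch import ElasticsearchDocumentStore

document_store = ElasticsearchDocumentStore(
    hosts="http://localhost:9200",
    basic_auth=("elastic", "changeme"),
    index="banking_knowledge_base",
)

docs = [
    Document(
        content="Retail customers can open a savings account with a minimum balance of $100.",
        meta={"source": "product_policy.pdf", "doc_type": "policy", "customer_segment": "retail"},
    ),
    Document(
        content="KYC verification requires government ID and proof of address.",
        meta={"source": "kyc_standard.pdf", "doc_type": "compliance", "customer_segment": "all"},
    ),
]

# Embed documents before writing; an embedding retriever cannot match documents without vectors.
doc_embedder = SentenceTransformersDocumentEmbedder(model="sentence-transformers/all-MiniLM-L6-v2")
doc_embedder.warm_up()
docs_with_embeddings = doc_embedder.run(documents=docs)["documents"]

# OVERWRITE keeps the script re-runnable instead of failing on duplicate document IDs.
document_store.write_documents(docs_with_embeddings, policy=DuplicatePolicy.OVERWRITE)
print(document_store.count_documents())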
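
The meta fields written above (source, doc_type, customer_segment) are what lets each agent narrow its search. The sketch below builds a filter dictionary in the shape that Haystack 2.x documents for metadata filters (an operator plus a list of field conditions). The helper name build_meta_filter is a hypothetical convenience, and you should confirm the filter syntax against the Haystack version you run.

```python
from typing import Any


def build_meta_filter(**criteria: Any) -> dict:
    """Combine metadata criteria into one AND filter (hypothetical helper)."""
    conditions = []
    for field, value in criteria.items():
        if isinstance(value, (list, tuple)):
            # A list of allowed values becomes an "in" condition.
            conditions.append({"field": f"meta.{field}", "operator": "in", "value": list(value)})
        else:
            # A single value becomes an equality condition.
            conditions.append({"field": f"meta.{field}", "operator": "==", "value": value})
    return {"operator": "AND", "conditions": conditions}


retail_policy_filter = build_meta_filter(
    doc_type="policy",
    customer_segment=["retail", "all"],
)
print(retail_policy_filter)
```

A dictionary like this would typically be passed as the filters argument when running a retriever, so a compliance agent and a support agent can query the same index with different scopes.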

  3. Add an embedding/retrieval pipeline for semantic search.

In multi-agent systems, exact keyword matching is not enough. One agent may ask “what documents do we need to onboard a small business customer,” while another asks “which ID checks are mandatory.” Semantic retrieval closes that gap.

from haystack import Pipeline
from haystack.components.embedders import SentenceTransformersTextEmbedder
from haystack_integrations.components.retrievers.elasticsearch import ElasticsearchEmbeddingRetriever

# The query embedder must use the same model that embedded the stored documents.
query_embedder = SentenceTransformersTextEmbedder(model="sentence-transformers/all-MiniLM-L6-v2")

# InMemoryEmbeddingRetriever only works with InMemoryDocumentStore, so use the
# Elasticsearch-backed retriever with the ElasticsearchDocumentStore created earlier.
# It assumes the indexed documents were written with embeddings.
retriever = ElasticsearchEmbeddingRetriever(document_store=document_store)

rag_pipeline = Pipeline()
rag_pipeline.add_component("embedder", query_embedder)
rag_pipeline.add_component("retriever", retriever)

# Feed the query vector from the embedder into the retriever.
rag_pipeline.connect("embedder.embedding", "retriever.query_embedding")
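
To show why embedding retrieval can match a question to a document that shares few keywords with it, here is a toy ranking by cosine similarity. The three-dimensional vectors are made-up numbers for illustration, not real model embeddings, and the function names are hypothetical. A real deployment delegates this scoring to Elasticsearch.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_by_similarity(query_vec, doc_vecs, top_k=2):
    """Return (name, score) pairs sorted from most to least similar."""
    scored = [(name, cosine_similarity(query_vec, vec)) for name, vec in doc_vecs.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Toy vectors standing in for document embeddings.
toy_docs = {
    "kyc_standard.pdf": [0.9, 0.1, 0.0],
    "product_policy.pdf": [0.1, 0.9, 0.2],
}
# Toy vector standing in for the embedded question "which ID checks are mandatory".
toy_query = [0.8, 0.2, 0.1]

ranking = rank_by_similarity(toy_query, toy_docs)
print(ranking[0][0])  # kyc_standard.pdf
```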

  4. Wire the retrieval layer into an agent-friendly function.

This is the part your agents actually call. Keep it small and deterministic: input query in, ranked evidence out.

def retrieve_banking_context(query: str, top_k: int = 3):
    result = rag_pipeline.run(
        {
            "embedder": {"text": query},
            "retriever": {"top_k": top_k},
        }
    )
    return result["retriever"]["documents"]

hits = retrieve_banking_context("What documents are required for KYC onboarding?")
for doc in hits:
    print(doc.content)
    print(doc.meta)
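
Agents usually work better with compact, cited evidence than with raw document objects. The sketch below converts retrieved documents into small dictionaries. The helper name to_evidence and the output keys are assumptions for illustration, and SimpleNamespace stands in for Haystack Document objects so the example runs without a cluster.

```python
from types import SimpleNamespace


def to_evidence(docs, max_chars=200):
    """Turn retrieved documents into ranked, source-cited snippets."""
    evidence = []
    for rank, doc in enumerate(docs, start=1):
        meta = doc.meta or {}
        evidence.append(
            {
                "rank": rank,
                "source": meta.get("source", "unknown"),
                "doc_type": meta.get("doc_type", "unknown"),
                # Truncate long passages so prompts stay small.
                "snippet": (doc.content or "")[:max_chars],
            }
        )
    return evidence


# Stand-in for a retrieved document; it only needs .content and .meta.
sample_hits = [
    SimpleNamespace(
        content="KYC verification requires government ID and proof of address.",
        meta={"source": "kyc_standard.pdf", "doc_type": "compliance"},
    )
]

evidence = to_evidence(sample_hits)
print(evidence)
```

Keeping the source on every snippet lets a downstream agent cite the document that supports its answer.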

  5. Expose the retriever as a shared tool for multiple agents.

In a multi-agent setup, each agent should reuse the same retrieval function but apply different prompts or workflows. That keeps search consistent across compliance, support, and operations agents.

class BankingRetrievalTool:
    def __init__(self):
        self.document_store = document_store

    def search(self, query: str):
        return retrieve_banking_context(query=query, top_k=5)

tool = BankingRetrievalTool()

compliance_docs = tool.search("List all required onboarding checks for retail customers")
for d in compliance_docs:
    print(d.content)
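
One way to picture "same retrieval function, different prompts" is a small agent wrapper that receives the search callable as a dependency. This is a sketch under stated assumptions: RetrievalAgent, its fields, and fake_search are hypothetical names, and fake_search replaces the real retrieval call so the example runs without Elasticsearch.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class RetrievalAgent:
    """Hypothetical agent that pairs a shared search function with its own prompt."""

    name: str
    prompt_template: str
    search: Callable[[str], List[str]]

    def build_prompt(self, question: str) -> str:
        # Every agent calls the same search function; only the template differs.
        evidence_block = "\n".join(f"- {line}" for line in self.search(question))
        return self.prompt_template.format(question=question, evidence=evidence_block)


def fake_search(query: str) -> List[str]:
    """Stand-in for the shared retrieval call; returns fixed evidence."""
    return ["KYC verification requires government ID and proof of address."]


compliance_agent = RetrievalAgent(
    name="compliance",
    prompt_template="Cite the policy clauses that answer: {question}\nEvidence:\n{evidence}",
    search=fake_search,
)
support_agent = RetrievalAgent(
    name="support",
    prompt_template="Answer the customer in plain language: {question}\nEvidence:\n{evidence}",
    search=fake_search,
)

compliance_prompt = compliance_agent.build_prompt("Which ID checks are mandatory?")
support_prompt = support_agent.build_prompt("Which ID checks are mandatory?")
print(compliance_prompt)
```

Injecting the search function also makes each agent easy to test in isolation, since a stub can replace the live index.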

Testing the Integration

Run a direct search against Elasticsearch first, then verify Haystack returns relevant documents through the pipeline.

query = {
    "match": {
        "content": "minimum balance savings account"
    }
}

# Keyword search straight against Elasticsearch (query= replaces the deprecated body=).
raw_response = es.search(index=index_name, query=query)
print(raw_response["hits"]["hits"][0]["_source"]["content"])

# Semantic search through the Haystack pipeline.
haystack_docs = retrieve_banking_context("minimum balance savings account")
print(haystack_docs[0].content)

Expected output:

Retail customers can open a savings account with a minimum balance of $100.
Retail customers can open a savings account with a minimum balance of $100.

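If both lines show the savings-account policy, or passages that are semantically close to it, the integration is working. If Elasticsearch returns results but Haystack does not, first check that the documents were written with embeddings, that the query embedder uses the same model as the document embedder, and that the document store points at the same index.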
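
A quick way to check the embedding side is to scan documents for missing or wrongly sized vectors. The helper below is a hypothetical diagnostic, not a Haystack API. It assumes each document exposes .meta and .embedding, and that the expected dimension is 384, the output size of all-MiniLM-L6-v2. SimpleNamespace objects stand in for stored documents.

```python
from types import SimpleNamespace


def find_docs_missing_embeddings(docs, expected_dim=384):
    """Return (source, problem) pairs for documents an embedding retriever cannot match."""
    problems = []
    for doc in docs:
        embedding = getattr(doc, "embedding", None)
        source = (doc.meta or {}).get("source", "unknown")
        if embedding is None:
            problems.append((source, "no embedding"))
        elif len(embedding) != expected_dim:
            problems.append((source, f"dimension {len(embedding)} != {expected_dim}"))
    return problems


# Stand-ins for stored documents: one embedded correctly, one written without a vector.
sample_docs = [
    SimpleNamespace(meta={"source": "kyc_standard.pdf"}, embedding=[0.0] * 384),
    SimpleNamespace(meta={"source": "product_policy.pdf"}, embedding=None),
]

problems = find_docs_missing_embeddings(sample_docs)
print(problems)  # [('product_policy.pdf', 'no embedding')]
```

Against a live store, the input could be the documents returned by the document store's filter_documents method, assuming your version returns stored embeddings with each document.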

Real-World Use Cases

  • Compliance assistant

    • One agent retrieves AML/KYC policy clauses from Elasticsearch.
    • Another agent drafts responses for analysts using the retrieved evidence.
  • Customer support triage

    • A support agent searches product terms and fee schedules.
    • A resolution agent uses the same indexed corpus to generate approved answers.
  • Operations knowledge base

    • An ops agent queries incident runbooks and escalation procedures.
    • A second agent cross-checks audit references before creating tickets or summaries.


By Cyprian Aarons, AI Consultant at Topiax.
