How to Integrate Haystack for pension funds with Elasticsearch for multi-agent systems

By Cyprian Aarons
Updated 2026-04-21

Combining Haystack for pension funds with Elasticsearch gives you a practical retrieval layer for multi-agent systems that need to answer policy, compliance, and member-service questions with low latency. Haystack handles the orchestration of document pipelines and agent workflows, while Elasticsearch gives you fast indexed retrieval over fund documents, investment policies, actuarial reports, and member correspondence.

Prerequisites

  • Python 3.10+
  • An Elasticsearch cluster running locally or in the cloud
  • A Haystack-compatible environment with the required packages installed
  • Access to the pension fund documents you want to index
  • API keys or credentials for your Elasticsearch deployment
  • Basic familiarity with Python async code if your agents run concurrently

Install the core dependencies:

pip install haystack-ai elasticsearch-haystack sentence-transformers

If you're using an embedding model or document store backend that needs extra packages, install those too. Keep versions pinned in production so your agent behavior stays stable across deploys.
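
One way to notice version drift early is to compare installed versions against your pins at startup. The sketch below uses only the standard library; the `PINNED` values are placeholders, not recommendations, and `find_version_drift` is a hypothetical helper name.

```python
from importlib.metadata import PackageNotFoundError, version

# Placeholder pins: replace with the versions you have actually tested.
PINNED = {"haystack-ai": "2.0.0", "elasticsearch": "8.13.0"}


def find_version_drift(pins, get_version=version):
    """Return {package: (expected, installed_or_None)} for every mismatch."""
    drift = {}
    for package, expected in pins.items():
        try:
            installed = get_version(package)
        except PackageNotFoundError:
            installed = None  # package is not installed at all
        if installed != expected:
            drift[package] = (expected, installed)
    return drift


if __name__ == "__main__":
    # An empty dict means the environment matches the pins.
    print(find_version_drift(PINNED))
```

Failing fast on a non-empty result at agent startup is cheaper than debugging a retrieval change after a deploy.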

Integration Steps

  1. Set up the Elasticsearch connection and create a document store.

Use Elasticsearch as the persistence and retrieval layer. In Haystack, the usual pattern is to connect a DocumentStore to your cluster, then let pipelines and agents query it.

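from haystack import Document
from haystack.document_stores.types import DuplicatePolicy
from haystack_integrations.document_stores.elasticsearch import ElasticsearchDocumentStore

# Extra keyword arguments such as basic_auth are passed to the Elasticsearch client.
document_store = ElasticsearchDocumentStore(
    hosts="http://localhost:9200",
    index="pension-fund-documents",
    basic_auth=("elastic", "changeme"),  # placeholder credentials; never hard-code real ones
)

docs = [
    Document(content="The pension fund requires quarterly liquidity stress testing.", meta={"source": "policy_2024.pdf"}),
    Document(content="Members can request benefit statements through the secure portal.", meta={"source": "member_services.pdf"}),
]

# OVERWRITE keeps the script safe to re-run; the default policy raises on duplicate IDs.
document_store.write_documents(docs, policy=DuplicatePolicy.OVERWRITE)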
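
The prerequisites mention credentials for your deployment, and in practice you would not keep them as literals in the script. A minimal sketch, assuming hypothetical environment variable names (`ES_HOST`, `ES_INDEX`, `ES_USER`, `ES_PASSWORD`), builds the connection settings from the environment:

```python
import os


def elasticsearch_settings(env=os.environ):
    """Build document store keyword arguments from environment variables."""
    settings = {
        "hosts": env.get("ES_HOST", "http://localhost:9200"),
        "index": env.get("ES_INDEX", "pension-fund-documents"),
    }
    user, password = env.get("ES_USER"), env.get("ES_PASSWORD")
    if user and password:
        # Only add basic auth when both halves are present.
        settings["basic_auth"] = (user, password)
    return settings


# Intended use (not executed here): ElasticsearchDocumentStore(**elasticsearch_settings())
print(elasticsearch_settings({"ES_USER": "elastic", "ES_PASSWORD": "example"}))
```

The defaults mirror the host and index used in this guide, so local runs work without any variables set.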

  2. Add an embedding retriever for semantic search.

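For multi-agent systems, exact keyword matching is usually not enough: a member asking how to "get my statement" should still reach the document about benefit statements. Use an embedding retriever so agents can pull relevant pension fund context before generating answers or taking actions. Embedding retrieval only works when the stored documents carry vectors produced by the same model that embeds the query.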

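from haystack import Pipeline
from haystack.components.embedders import SentenceTransformersDocumentEmbedder, SentenceTransformersTextEmbedder
from haystack.document_stores.types import DuplicatePolicy
from haystack_integrations.components.retrievers.elasticsearch import ElasticsearchEmbeddingRetriever

MODEL = "sentence-transformers/all-MiniLM-L6-v2"

# Embed the documents with the same model used for queries, then write them back.
doc_embedder = SentenceTransformersDocumentEmbedder(model=MODEL)
doc_embedder.warm_up()
docs = doc_embedder.run(documents=docs)["documents"]
document_store.write_documents(docs, policy=DuplicatePolicy.OVERWRITE)

text_embedder = SentenceTransformersTextEmbedder(model=MODEL)

# InMemoryEmbeddingRetriever only accepts an InMemoryDocumentStore, so use the
# retriever that ships with the Elasticsearch integration.
retriever = ElasticsearchEmbeddingRetriever(document_store=document_store)

query_pipeline = Pipeline()
query_pipeline.add_component("embedder", text_embedder)
query_pipeline.add_component("retriever", retriever)

query_pipeline.connect("embedder.embedding", "retriever.query_embedding")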
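
To make the idea of semantic ranking concrete, here is a toy sketch of similarity-based retrieval in plain Python. It assumes cosine similarity as the scoring function, and the three-dimensional vectors are made up for illustration; a real embedding model produces much longer vectors.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def rank_by_similarity(query_vec, doc_vecs, top_k=2):
    """Return document names ordered from most to least similar to the query."""
    scored = [(cosine_similarity(query_vec, vec), name) for name, vec in doc_vecs.items()]
    scored.sort(reverse=True)
    return [name for _, name in scored[:top_k]]


# Made-up vectors: the second axis loosely stands for "member services" topics.
toy_docs = {
    "policy_2024.pdf": [0.9, 0.1, 0.0],
    "member_services.pdf": [0.1, 0.9, 0.2],
}
toy_query = [0.2, 0.8, 0.1]

print(rank_by_similarity(toy_query, toy_docs))
```

The query vector points in nearly the same direction as the member services vector, so that document ranks first even though no keywords were compared.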

  3. Build a query pipeline that agents can call.

Your agents should not query Elasticsearch directly unless they need custom scoring logic. Wrap retrieval in a Haystack pipeline so every agent uses the same retrieval contract.

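# Only the embedder needs an input. The retriever gets the query embedding through
# the pipeline connection and reads documents from the document store.
result = query_pipeline.run(
    {"embedder": {"text": "What is the liquidity stress testing requirement?"}}
)

for doc in result["retriever"]["documents"]:
    print(doc.content)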

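Whichever retriever sits behind the pipeline, keep the agent-facing pattern the same: agents pass a question and read back documents. Because the documents here live in Elasticsearch, the retriever should come from the Elasticsearch integration package (for example ElasticsearchEmbeddingRetriever) rather than an in-memory retriever, which only works with an in-memory document store.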

  4. Expose the pipeline as a tool for multiple agents.

In a multi-agent system, one agent may handle compliance checks while another handles member support. Both should share the same retrieval tool so answers stay consistent.

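from haystack import SuperComponent
from haystack.tools import ComponentTool

# ComponentTool wraps a single component, not a Pipeline, so expose the pipeline
# as one component first. SuperComponent requires a recent Haystack 2.x release.
pension_search = SuperComponent(
    pipeline=query_pipeline,
    input_mapping={"query": ["embedder.text"]},
    output_mapping={"retriever.documents": "documents"},
)

pension_search_tool = ComponentTool(
    component=pension_search,
    name="pension_fund_search",
    description="Search pension fund policies, procedures, and member documentation.",
)

# Example call pattern from an agent runtime
tool_result = pension_search_tool.invoke(query="How do members request benefit statements?")

print(tool_result)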
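
If your agents run concurrently, the shared tool will be called from several tasks at once. The following sketch shows one possible pattern with `asyncio`, assuming the retrieval call is blocking; `blocking_search` is a hypothetical stand-in for the real tool call, and the agent names are illustrative.

```python
import asyncio


def blocking_search(question: str) -> list[str]:
    # Stand-in for the shared retrieval tool; a real call would hit Elasticsearch.
    return [f"result for: {question}"]


async def agent(name: str, question: str, search=blocking_search):
    # to_thread keeps the blocking retrieval call off the event loop.
    hits = await asyncio.to_thread(search, question)
    return name, hits


async def run_agents():
    # gather returns results in the order the agents were passed in.
    return await asyncio.gather(
        agent("compliance", "What is the liquidity stress testing requirement?"),
        agent("member_support", "How do members request benefit statements?"),
    )


results = asyncio.run(run_agents())
for name, hits in results:
    print(name, hits)
```

Both agents go through the same search function, so they see the same index and the same retrieval behavior.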

  5. Add direct Elasticsearch validation for operational checks.

Haystack handles retrieval flow, but you still want direct Elasticsearch checks for indexing health, document counts, and debugging.

from elasticsearch import Elasticsearch

es = Elasticsearch(
    "http://localhost:9200",
    basic_auth=("elastic", "changeme"),
)

health = es.cluster.health()
count = es.count(index="pension-fund-documents")

print("cluster_status:", health["status"])
print("document_count:", count["count"])
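
Those two responses can be turned into a simple readiness gate before agents start serving traffic. This is a sketch under the assumption that a yellow status is acceptable (it is common on single-node clusters, where replica shards cannot be assigned); `index_is_ready` is a hypothetical helper that works on the response bodies shown above.

```python
def index_is_ready(health, count, min_docs=1):
    """Return (ok, reason) from cluster health and count responses."""
    status = health["status"]
    if status not in ("green", "yellow"):
        return False, f"cluster status is {status!r}"
    found = count["count"]
    if found < min_docs:
        return False, f"expected at least {min_docs} documents, found {found}"
    return True, f"status={status}, documents={found}"


# Plain dicts stand in for the Elasticsearch responses in this example.
print(index_is_ready({"status": "yellow"}, {"count": 2}, min_docs=2))
```

In a deployment you would pass `es.cluster.health()` and `es.count(index=...)` instead of the literal dictionaries.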

Testing the Integration

Run a simple end-to-end test: write documents, query them through Haystack, then verify Elasticsearch sees them.

test_query = "Who can request benefit statements?"

# The retriever takes no direct input; it reads from the document store.
response = query_pipeline.run({"embedder": {"text": test_query}})

print("Retrieved:")
for doc in response["retriever"]["documents"]:
    print("-", doc.content)

print("\nElasticsearch count:", es.count(index="pension-fund-documents")["count"])

Expected output:

Retrieved:
- Members can request benefit statements through the secure portal.
- The pension fund requires quarterly liquidity stress testing.

Elasticsearch count: 2

If you get zero documents back, check these first:

  • The index name matches exactly
  • Your documents were actually written before querying
  • Your embedding model is the same for indexing and retrieval, and the documents were written with embeddings
  • Authentication and network access to Elasticsearch are correct
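
The embedding check in that list can be automated with a small comparison of vector dimensions. This is a rough sketch, not part of Haystack: `embedding_problems` is a hypothetical helper, and the stored embeddings are passed in as a plain dictionary keyed by document ID.

```python
def embedding_problems(query_dim, stored_embeddings):
    """Map document IDs to a description of what is wrong with their embedding."""
    problems = {}
    for doc_id, embedding in stored_embeddings.items():
        if embedding is None:
            problems[doc_id] = "no embedding stored"
        elif len(embedding) != query_dim:
            problems[doc_id] = f"dimension {len(embedding)} != query dimension {query_dim}"
    return problems


# all-MiniLM-L6-v2 produces 384-dimensional vectors; the stored values here are dummies.
stored = {"doc-a": [0.0] * 384, "doc-b": None, "doc-c": [0.0] * 768}
print(embedding_problems(384, stored))
```

A missing embedding usually means the document was written before it was embedded, while a dimension mismatch points to different models at indexing and query time.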

Real-World Use Cases

  • Compliance copilot: One agent retrieves pension policy clauses from Elasticsearch while another checks responses against regulatory language before replying.
  • Member service assistant: A support agent answers questions about withdrawals, statements, and contribution rules using indexed fund documentation.
  • Operations triage bot: An internal agent monitors policy updates and surfaces relevant changes to actuarial, legal, and finance teams from a shared search index.

By Cyprian Aarons, AI Consultant at Topiax.
