How to Integrate Haystack for wealth management with Elasticsearch for startups

By Cyprian Aarons · Updated 2026-04-21

Combining Haystack for wealth management with Elasticsearch gives you a practical retrieval layer for financial documents, client notes, portfolio reports, and policy data. For a startup building AI agents, this means you can answer advisor questions with grounded context instead of relying on the model’s memory.

Prerequisites

  • Python 3.10+
  • An Elasticsearch cluster running locally or in the cloud
  • API credentials for your Haystack for wealth management service
  • pip installed and working
  • A .env file or secret manager for:
    • ELASTICSEARCH_URL
    • ELASTICSEARCH_API_KEY
    • HAYSTACK_API_KEY
    • HAYSTACK_BASE_URL

Install the Python packages:

pip install elasticsearch haystack-ai elasticsearch-haystack python-dotenv
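
The settings listed in the prerequisites can be read once at startup instead of being hardcoded. The sketch below shows one way to do that with the standard library. `Settings` and `load_settings` are hypothetical names, not part of Haystack or Elasticsearch, and the code assumes the variables are already in the environment (for example after calling `load_dotenv()` from python-dotenv).

```python
import os
from dataclasses import dataclass

# Variable names taken from the prerequisites list.
REQUIRED_VARS = (
    "ELASTICSEARCH_URL",
    "ELASTICSEARCH_API_KEY",
    "HAYSTACK_API_KEY",
    "HAYSTACK_BASE_URL",
)


@dataclass(frozen=True)
class Settings:
    elasticsearch_url: str
    elasticsearch_api_key: str
    haystack_api_key: str
    haystack_base_url: str


def load_settings(env=None):
    """Build Settings from a mapping (os.environ by default), failing fast on gaps."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError("Missing required settings: " + ", ".join(missing))
    # Field names are the lowercase form of the variable names.
    return Settings(**{name.lower(): env[name] for name in REQUIRED_VARS})
```

Failing at startup with the list of missing names tends to be easier to debug than an authentication error on the first query.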

Integration Steps

  1. Connect to Elasticsearch and create a financial index

Start by creating an index that can store wealth management documents like meeting notes, product sheets, KYC summaries, and research snippets.

from elasticsearch import Elasticsearch

es = Elasticsearch(
    "https://localhost:9200",
    api_key="YOUR_ELASTICSEARCH_API_KEY",
    # Disabling certificate verification is only acceptable for local development.
    verify_certs=False,
)

index_name = "wealth_docs"

if not es.indices.exists(index=index_name):
    es.indices.create(
        index=index_name,
        mappings={
            "properties": {
                "doc_id": {"type": "keyword"},
                "client_id": {"type": "keyword"},
                "title": {"type": "text"},
                "content": {"type": "text"},
                "doc_type": {"type": "keyword"},
                "created_at": {"type": "date"}
            }
        }
    )

  2. Write documents from your wealth management system into Elasticsearch

In a startup setup, you usually ingest notes from CRM exports, advisor transcripts, or portfolio PDFs after parsing them upstream. Store the normalized text in Elasticsearch so Haystack can retrieve it later.

from datetime import datetime, timezone

# datetime.utcnow() is deprecated since Python 3.12; use a timezone-aware UTC timestamp.
now = datetime.now(timezone.utc).isoformat()

documents = [
    {
        "_index": index_name,
        "_id": "doc-001",
        "_source": {
            "doc_id": "doc-001",
            "client_id": "client-1001",
            "title": "Quarterly review notes",
            "content": "Client prefers moderate risk exposure and wants ESG-aligned funds.",
            "doc_type": "meeting_note",
            "created_at": now
        }
    },
    {
        "_index": index_name,
        "_id": "doc-002",
        "_source": {
            "doc_id": "doc-002",
            "client_id": "client-1001",
            "title": "Portfolio allocation summary",
            "content": "Current allocation: 55% equities, 35% fixed income, 10% cash.",
            "doc_type": "portfolio_report",
            "created_at": now
        }
    }
]

for doc in documents:
    es.index(index=doc["_index"], id=doc["_id"], document=doc["_source"])

es.indices.refresh(index=index_name)
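
Indexing one document per request is fine for a handful of notes, but larger exports are usually sent in batches. As a sketch, the generator below turns raw records into the same `_index`/`_id`/`_source` action shape used above, which `elasticsearch.helpers.bulk` accepts. The name `to_bulk_actions` and the record fields are assumptions for illustration, and records without usable text are skipped.

```python
from datetime import datetime, timezone


def to_bulk_actions(records, index_name):
    """Yield bulk-style action dicts for records that contain non-empty text."""
    for record in records:
        content = (record.get("content") or "").strip()
        if not content:
            continue  # nothing to retrieve later, so do not index it
        yield {
            "_index": index_name,
            "_id": record["doc_id"],
            "_source": {
                "doc_id": record["doc_id"],
                "client_id": record["client_id"],
                "title": record.get("title", ""),
                "content": content,
                "doc_type": record.get("doc_type", "unknown"),
                "created_at": datetime.now(timezone.utc).isoformat(),
            },
        }


# With a live cluster (not executed here):
#   from elasticsearch.helpers import bulk
#   bulk(es, to_bulk_actions(records, "wealth_docs"))
```

Because the function is a generator, a large CRM export can be streamed into the bulk helper without holding every action in memory.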

  3. Create a Haystack pipeline that uses Elasticsearch as the retriever backend

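Haystack’s retriever components can query Elasticsearch directly. In Haystack 2.x (the haystack-ai package), the Elasticsearch components ship in the separate elasticsearch-haystack integration package: use its ElasticsearchDocumentStore together with ElasticsearchBM25Retriever. InMemoryBM25Retriever only works with InMemoryDocumentStore, so it cannot read from Elasticsearch.

# Requires: pip install elasticsearch-haystack
from haystack import Document
from haystack_integrations.components.retrievers.elasticsearch import ElasticsearchBM25Retriever
from haystack_integrations.document_stores.elasticsearch import ElasticsearchDocumentStore

# Extra keyword arguments such as basic_auth and verify_certs are forwarded to the Elasticsearch client.
document_store = ElasticsearchDocumentStore(
    hosts=["https://localhost:9200"],
    basic_auth=("elastic", "YOUR_ELASTICSEARCH_PASSWORD"),
    verify_certs=False,  # local development only
    index=index_name,
)

# Write Haystack documents into the store if you're managing them through Haystack.
# Skip this if step 2 already indexed the same notes, otherwise each note is stored twice.
docs = [
    Document(content="Client prefers moderate risk exposure and wants ESG-aligned funds.", meta={"client_id": "client-1001", "doc_type": "meeting_note"}),
    Document(content="Current allocation: 55% equities, 35% fixed income, 10% cash.", meta={"client_id": "client-1001", "doc_type": "portfolio_report"}),
]
document_store.write_documents(docs)

retriever = ElasticsearchBM25Retriever(document_store=document_store)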

Import paths differ across Haystack versions, so match them to the release you have installed. The pattern is the same either way: Elasticsearch stores the corpus, and Haystack retrieves the relevant chunks for the agent.

  4. Add a query layer for advisor-style questions

Now wire up retrieval so your agent can ask questions like “What is this client’s risk preference?” and get grounded results.

results = retriever.run(query="What is the client's risk preference?")
for doc in results["documents"]:
    print(doc.content)
    print(doc.meta)

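A more production-ready version restricts retrieval to a single client, so one client's notes never end up in another client's answer. With Haystack 2.x filter syntax:

results = retriever.run(
    query="What is the client's risk preference?",
    filters={"field": "meta.client_id", "operator": "==", "value": "client-1001"},
)
for doc in results["documents"]:
    print(doc.content)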
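
Haystack 2.x expresses metadata filters as nested dictionaries of comparison and logic conditions. A small helper can keep the client restriction in one place so it is harder to forget at a call site. This is a sketch: `build_client_filter` is a hypothetical helper, and it assumes documents carry `client_id` and `doc_type` in their metadata, as in the examples above.

```python
def build_client_filter(client_id, doc_types=None):
    """Return a Haystack 2.x style filter limited to one client and optional doc types."""
    if not client_id:
        raise ValueError("client_id is required so results are never cross-client")
    conditions = [
        {"field": "meta.client_id", "operator": "==", "value": client_id},
    ]
    if doc_types:
        # Restrict further to the given document types, e.g. only meeting notes.
        conditions.append(
            {"field": "meta.doc_type", "operator": "in", "value": list(doc_types)}
        )
    return {"operator": "AND", "conditions": conditions}
```

The returned dictionary can be passed as the `filters` argument of a retriever's `run` method, for example `retriever.run(query="...", filters=build_client_filter("client-1001", ["meeting_note"]))`.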

  5. Wrap retrieval in an AI agent call

This is where Haystack becomes useful inside an agent system. Retrieve context from Elasticsearch first, then pass that context into your LLM prompt.

from haystack.components.builders import PromptBuilder

template = """
You are an assistant for a wealth management team.
Use only the retrieved context to answer.

Context:
{% for doc in documents %}
- {{ doc.content }}
{% endfor %}

Question: {{ question }}
Answer:
"""

prompt_builder = PromptBuilder(template=template)

retrieval_output = retriever.run(query="What investment profile does client-1001 prefer?")
prompt_output = prompt_builder.run(
    documents=retrieval_output["documents"],
    question="What investment profile does client-1001 prefer?"
)

print(prompt_output["prompt"])
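
To turn the prompt into an answer, the retrieve, build, and generate steps can be wrapped in a single function that the agent calls. The sketch below keeps each step as an injected callable so it can be exercised without a cluster or a model. `answer_question` and its parameters are hypothetical, and `generate` stands in for whichever LLM client you use.

```python
def answer_question(question, retrieve, build_prompt, generate):
    """Retrieve context, build a grounded prompt, and return the model's answer.

    retrieve(question) -> list of retrieved documents
    build_prompt(documents, question) -> prompt string
    generate(prompt) -> answer string
    """
    documents = retrieve(question)
    if not documents:
        # Refuse rather than let the model answer from memory.
        return "No supporting documents were found for this question."
    prompt = build_prompt(documents, question)
    return generate(prompt)
```

With the objects defined above, `retrieve` could be `lambda q: retriever.run(query=q)["documents"]` and `build_prompt` could be `lambda docs, q: prompt_builder.run(documents=docs, question=q)["prompt"]`. Returning early on an empty result keeps the "use only the retrieved context" rule enforceable in code rather than only in the prompt.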

Testing the Integration

Run a quick check: write one document to Elasticsearch, query the index, and confirm the content comes back. Running the same query through the Haystack retriever should return the same document.

test_doc = {
    "_index": index_name,
    "_id": "doc-test-001",
    "_source": {
        "doc_id": "doc-test-001",
        "client_id": "client-2002",
        "title": "Risk questionnaire summary",
        "content": "Client has low risk tolerance and prefers capital preservation.",
        "doc_type": "risk_profile"
    }
}

es.index(index=test_doc["_index"], id=test_doc["_id"], document=test_doc["_source"])
es.indices.refresh(index=index_name)

response = es.search(
    index=index_name,
    query={"match": {"content": {"query": "low risk tolerance"}}},
)

print(response["hits"]["hits"][0]["_source"]["content"])

Expected output:

Client has low risk tolerance and prefers capital preservation.
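
The check above raises an `IndexError` when the search returns nothing, which makes a failing test harder to read. As a sketch, a small helper can pull out the top hit and return `None` on an empty result. `first_hit_content` is a hypothetical name, and the dictionary shape follows the `hits` → `hits` → `_source` structure used in the search response above.

```python
def first_hit_content(response):
    """Return the content of the top hit, or None if nothing matched."""
    hits = response["hits"]["hits"]
    if not hits:
        return None
    return hits[0]["_source"].get("content")
```

In a test this becomes a single readable assertion, for example `assert first_hit_content(response) == "Client has low risk tolerance and prefers capital preservation."`.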

Real-World Use Cases

  • Advisor copilot
    • Retrieve client meeting notes, portfolio summaries, and compliance docs to draft response suggestions during calls.
  • Client onboarding assistant
    • Search KYC forms, suitability questionnaires, and internal policy docs to help ops teams complete onboarding faster.
  • Portfolio research Q&A
    • Let analysts query stored research memos and market commentary without digging through shared drives or PDFs.

By Cyprian Aarons, AI Consultant at Topiax.
