How to Build a Compliance-Checking Agent Using LlamaIndex in Python for Wealth Management

By Cyprian Aarons · Updated 2026-04-21

A compliance checking agent in wealth management reviews client communications, proposals, and portfolio actions against internal policy and regulatory rules before anything is sent or executed. It matters because a single bad recommendation, missing disclosure, or unsuitable trade can create regulatory exposure, client harm, and audit failures.

Architecture

  • Document ingestion layer

    • Pulls in compliance policies, suitability rules, product restrictions, fee schedules, and approved disclosures.
    • Typical sources: PDF policy manuals, SharePoint exports, CRM notes, and ticketing logs.
  • Knowledge index

    • Stores policy text in a retrievable format using VectorStoreIndex.
    • Lets the agent ground decisions in firm-approved documents instead of free-form model memory.
  • Rule evaluation layer

    • Applies deterministic checks for hard constraints like restricted securities, jurisdiction blocks, KYC status, and concentration limits.
    • This should not be delegated to the LLM.
  • LLM reasoning layer

    • Uses LlamaIndex query engines to explain whether a draft communication or action violates policy.
    • Produces a structured rationale for reviewers and auditors.
  • Audit logging layer

    • Records inputs, retrieved policy passages, model output, timestamp, user identity, and final disposition.
    • Needed for supervision review and regulatory evidence.
  • Human review workflow

    • Escalates ambiguous cases to compliance officers.
    • The agent should recommend; it should not auto-approve high-risk actions.
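The audit logging layer deserves a concrete shape early, since every other layer feeds it. Below is a minimal sketch of an append-only audit record; the `AuditRecord` fields and `log_decision` helper are illustrative names, not part of LlamaIndex.

```python
import json
from dataclasses import dataclass, field, asdict
from datetime import datetime, timezone

@dataclass
class AuditRecord:
    """One reviewable decision. Field names are illustrative."""
    user_id: str
    request_text: str
    retrieved_passages: list[str]
    model_output: str
    disposition: str  # e.g. "clear", "flagged", "escalate"
    timestamp: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )

def log_decision(record: AuditRecord, path: str = "audit_log.jsonl") -> None:
    # Append-only JSON lines keep each decision reconstructable for audit
    with open(path, "a", encoding="utf-8") as f:
        f.write(json.dumps(asdict(record)) + "\n")
```

Writing one line per decision to an append-only file (or, in production, a write-once store) is what lets supervision reconstruct exactly what the agent saw and said.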

Implementation

1. Load compliance policies into a LlamaIndex knowledge base

Use SimpleDirectoryReader for local documents and build a VectorStoreIndex. In production you would swap the storage backend for something with access controls and residency guarantees.

from llama_index.core import Settings, SimpleDirectoryReader, VectorStoreIndex
from llama_index.llms.openai import OpenAI

# Set your LLM once for the app
Settings.llm = OpenAI(model="gpt-4o-mini", temperature=0)

# Load policy documents from a controlled folder
documents = SimpleDirectoryReader(
    input_dir="./compliance_docs",
    recursive=True
).load_data()

# Build the searchable index
index = VectorStoreIndex.from_documents(documents)

# Create a query engine for policy lookup
query_engine = index.as_query_engine(similarity_top_k=3)

2. Add deterministic checks before calling the model

Wealth management has rules that do not need interpretation. Check those first: restricted products, missing risk profile, sanctions flags, or jurisdiction constraints.

def deterministic_compliance_checks(request: dict) -> list[str]:
    issues = []

    if request.get("client_jurisdiction") in {"US", "EU"} and request.get("product") == "unapproved_private_placement":
        issues.append("Product is not approved for this jurisdiction.")

    if request.get("risk_profile") == "conservative" and request.get("recommended_asset") == "high_yield_bonds":
        issues.append("Recommendation appears unsuitable for conservative profile.")

    if request.get("kyc_status") != "verified":
        issues.append("KYC is not verified.")

    if request.get("restricted_security") is True:
        issues.append("Requested security is on the restricted list.")

    return issues
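The architecture section also lists concentration limits as a hard constraint. That check is pure arithmetic and belongs in this deterministic layer too. A sketch, where the helper name and the 10% single-asset cap are illustrative assumptions:

```python
def concentration_check(
    positions: dict[str, float],  # current market value per asset
    asset: str,
    trade_amount: float,
    max_weight: float = 0.10,  # illustrative 10% single-asset cap
) -> list[str]:
    """Flag trades that would push one asset past a portfolio weight cap."""
    issues = []
    total_after = sum(positions.values()) + trade_amount
    asset_after = positions.get(asset, 0.0) + trade_amount
    if total_after > 0 and asset_after / total_after > max_weight:
        issues.append(
            f"Trade would put {asset} at {asset_after / total_after:.0%} "
            f"of the portfolio, above the {max_weight:.0%} cap."
        )
    return issues
```

Because the weight is computed post-trade, this catches the common case where the position is compliant today but the proposed trade would breach the limit.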

3. Use the query engine to ground the compliance decision

The agent should retrieve the relevant policy sections and produce an answer that cites them. For richer workflows, wrap this with a structured response schema or a tool-calling orchestrator later.

from dataclasses import dataclass

@dataclass
class ComplianceResult:
    status: str
    rationale: str
    policy_excerpt: str

def check_compliance(request_text: str) -> ComplianceResult:
    # Retrieve supporting policy text
    response = query_engine.query(
        f"Review this wealth management request for compliance risk:\n{request_text}\n"
        f"Return the relevant policy basis and whether it is compliant."
    )

    answer = str(response)

    # Simple classification logic; replace with structured output if needed
    lowered = answer.lower()
    if "non-compliant" in lowered or "violation" in lowered or "not allowed" in lowered:
        status = "flagged"
    else:
        status = "clear"

    # Cite the actual retrieved policy text, not the model's paraphrase
    excerpt = (
        response.source_nodes[0].node.get_content()[:800]
        if response.source_nodes
        else answer[:800]
    )

    return ComplianceResult(
        status=status,
        rationale=answer,
        policy_excerpt=excerpt,
    )

request_text = """
Client wants to move 40% of retirement assets into an illiquid private placement.
Client has conservative risk tolerance. KYC is verified.
"""

result = check_compliance(request_text)
print(result.status)
print(result.rationale)
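The keyword matching above is brittle: a phrasing shift in the model's answer silently changes the verdict. One upgrade path is to prompt for a JSON verdict and parse it defensively, failing closed to "flagged". A sketch, where `parse_verdict` is a hypothetical helper and the prompt is assumed to request `{"status": ..., "reason": ...}`:

```python
import json

def parse_verdict(answer: str) -> str:
    """Extract a JSON verdict like {"status": "clear"} from a model answer.

    Fails closed: anything unparseable or unexpected is treated as
    "flagged" so a malformed answer never slips through as compliant.
    """
    try:
        start = answer.index("{")
        end = answer.rindex("}") + 1
        verdict = json.loads(answer[start:end])
        status = str(verdict.get("status", "")).lower()
        return status if status in {"clear", "flagged"} else "flagged"
    except (ValueError, json.JSONDecodeError):
        return "flagged"
```

Failing closed matters here: in a compliance context, a parsing bug should create extra review work, never an unreviewed approval.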

4. Orchestrate both layers into one agent-facing function

This is the pattern you actually want in production: deterministic checks first, retrieval-backed reasoning second, then human escalation when needed.

def review_request(request: dict) -> dict:
    issues = deterministic_compliance_checks(request)

    if issues:
        return {
            "decision": "escalate",
            "reason": "Deterministic rule violation",
            "issues": issues,
        }

    request_text = f"""
    Client jurisdiction: {request.get('client_jurisdiction')}
    Risk profile: {request.get('risk_profile')}
    Product: {request.get('product')}
    Recommended asset: {request.get('recommended_asset')}
    Notes: {request.get('notes')}
    """

    llm_result = check_compliance(request_text)

    if llm_result.status == "flagged":
        return {
            "decision": "escalate",
            "reason": llm_result.rationale,
            "policy_excerpt": llm_result.policy_excerpt,
        }

    return {
        "decision": "approve_for_review",
        "reason": llm_result.rationale,
        "policy_excerpt": llm_result.policy_excerpt,
    }
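The "escalate" decisions above need somewhere to land. A minimal sketch of the human review workflow as a pair of queues (the queue names and `triage` helper are illustrative; in production this would feed a case-management or ticketing system):

```python
def triage(decisions: list[dict]) -> dict[str, list[dict]]:
    """Partition reviewed requests into queues; nothing is auto-approved."""
    queues: dict[str, list[dict]] = {"officer_review": [], "fast_track": []}
    for decision in decisions:
        if decision.get("decision") == "escalate":
            queues["officer_review"].append(decision)
        else:
            # "approve_for_review" still gets human sign-off downstream
            queues["fast_track"].append(decision)
    return queues
```

Even the fast-track queue ends at a human: the agent's output only changes who looks at the request and how urgently, never whether anyone does.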

Production Considerations

  • Data residency

    • Keep client data and indexed policies inside approved regions.
    • If your firm operates across jurisdictions, partition indexes by region so EU client data does not cross into US-hosted stores without a legal basis.
  • Auditability

    • Log every decision with retrieved document IDs, timestamps, user IDs, and model version.
    • Store both the raw request and the exact policy snippets used so compliance can reconstruct the decision later.
  • Guardrails

    • Never let the LLM approve trades on its own.
    • Use hard-coded thresholds for suitability, concentration limits, restricted lists, and sanction screening. Let LlamaIndex explain context; do not let it invent rules.
  • Monitoring

    • Track false positives, false negatives, escalation rate, and average review latency.
    • Sample outputs weekly with compliance officers to catch drift when policies change.
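Those monitoring metrics fall straight out of the audit log, provided each entry records both the agent's disposition and the officer's final call. A sketch with illustrative field names (`agent_decision`, `final_call`):

```python
def monitoring_metrics(log: list[dict]) -> dict[str, float]:
    """Escalation rate plus false-positive/negative rates, judged
    against the compliance officer's final call."""
    total = len(log)
    escalated = sum(1 for e in log if e["agent_decision"] == "escalate")
    # False positive: agent escalated, officer approved anyway
    fp = sum(
        1 for e in log
        if e["agent_decision"] == "escalate" and e["final_call"] == "approved"
    )
    # False negative: agent passed it through, officer rejected it
    fn = sum(
        1 for e in log
        if e["agent_decision"] != "escalate" and e["final_call"] == "rejected"
    )
    return {
        "escalation_rate": escalated / total if total else 0.0,
        "false_positive_rate": fp / escalated if escalated else 0.0,
        "false_negative_rate": fn / (total - escalated) if total > escalated else 0.0,
    }
```

A rising false-negative rate after a policy update is the drift signal the weekly sampling sessions are meant to catch.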

Common Pitfalls

  1. Using only semantic search for hard rules

    • Retrieval is good for finding policy text.
    • It is bad at enforcing exact thresholds like “max 10% exposure” or “no private placements for retail clients.” Implement those as code.
  2. Letting the model summarize without citations

    • A vague answer like “this looks fine” is useless in an audit.
    • Always return retrieved excerpts or document references alongside the decision.
  3. Mixing client data across environments

    • Wealth management data often includes PII, account numbers, tax status, and investment objectives.
    • Separate dev/test/prod indexes and encrypt storage at rest; do not build one shared sandbox with real client records.
  4. Treating compliance docs as static

    • Policies change after regulatory updates or internal committee decisions.
    • Rebuild or refresh your VectorStoreIndex on a controlled schedule and version your source documents.
