How to Build a Compliance-Checking Agent Using LlamaIndex in Python for Fintech

By Cyprian Aarons · Updated 2026-04-21
Tags: compliance-checking, llamaindex, python, fintech

A compliance-checking agent for fintech reviews customer communications, transaction narratives, policy drafts, or support tickets against internal controls and regulatory rules. It matters because the cost of missing a restricted phrase, an unsuitable recommendation, or a jurisdiction-specific policy violation is not just rework — it can become a reportable incident, a fine, or a blocked release.

Architecture

  • Document ingestion layer

    • Pulls policies, procedures, regulatory guidance, and product constraints from PDFs, DOCX files, or internal wiki exports.
    • In production, this should be versioned by jurisdiction and product line.
  • Indexing layer

    • Uses SimpleDirectoryReader and VectorStoreIndex to turn compliance documents into searchable chunks.
    • Keep separate indexes for AML, KYC, marketing review, and disclosures.
  • Retrieval layer

    • Uses VectorIndexRetriever or query engines built on top of the index to fetch the most relevant policy passages.
    • Retrieval should return citations so reviewers can trace every decision.
  • Compliance reasoning layer

    • Uses an LLM-backed query engine with a strict prompt that asks for pass/fail plus rationale.
    • The output should map to specific policy references, not free-form advice.
  • Audit logging layer

    • Stores the input text, retrieved policy snippets, model output, and final decision.
    • This is non-negotiable in fintech because you need evidence for reviews and regulators.
  • Guardrail layer

    • Applies PII redaction, jurisdiction checks, confidence thresholds, and human escalation rules (a minimal redaction sketch follows this list).
    • If the agent cannot justify a decision with retrieved evidence, it should escalate instead of guessing.
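
As a concrete starting point for the redaction rule above, here is a minimal sketch. The patterns are illustrative and nowhere near exhaustive; production systems typically pair this with a dedicated PII detection service.

import re

# Illustrative patterns only; real deployments need much broader PII coverage.
EMAIL_RE = re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b")
US_SSN_RE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")

def redact_pii(text: str) -> str:
    text = EMAIL_RE.sub("[REDACTED_EMAIL]", text)
    return US_SSN_RE.sub("[REDACTED_SSN]", text)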

Implementation

1) Install dependencies and load your compliance corpus

Use LlamaIndex to ingest your policy docs from disk. Keep documents separated by domain so you can enforce different rules per workflow.

pip install llama-index llama-index-llms-openai llama-index-embeddings-openai

from llama_index.core import SimpleDirectoryReader

# Example structure:
# ./compliance_docs/aml/
# ./compliance_docs/marketing/
# ./compliance_docs/disclosures/

docs = SimpleDirectoryReader(
    input_dir="./compliance_docs",
    recursive=True,
).load_data()

print(f"Loaded {len(docs)} documents")

2) Build a vector index over the policies

This is the core retrieval layer. For fintech use cases, you want deterministic retrieval over approved policy content before any model judgment happens.

import os
from llama_index.core import VectorStoreIndex
from llama_index.embeddings.openai import OpenAIEmbedding
from llama_index.llms.openai import OpenAI

os.environ["OPENAI_API_KEY"] = "your-api-key"  # prefer loading the key from your environment or a secrets manager

embed_model = OpenAIEmbedding(model="text-embedding-3-small")
llm = OpenAI(model="gpt-4o-mini", temperature=0)

index = VectorStoreIndex.from_documents(
    docs,
    embed_model=embed_model,
)

query_engine = index.as_query_engine(
    llm=llm,
    similarity_top_k=4,
)
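
Rebuilding embeddings on every run is slow and makes versioning fuzzy. LlamaIndex can persist the index to disk and reload it later; the versioned persist_dir name below is an assumption, not a library convention:

from llama_index.core import StorageContext, load_index_from_storage

# Persist the built index under an explicit policy version marker.
index.storage_context.persist(persist_dir="./storage/policies_v1")

# On later runs, reload instead of re-embedding the whole corpus.
storage_context = StorageContext.from_defaults(persist_dir="./storage/policies_v1")
index = load_index_from_storage(storage_context, embed_model=embed_model)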

3) Define a compliance check function with citations

The pattern here is simple: send the candidate text to the query engine along with a strict instruction. The response should include whether the text passes review and which policy excerpts support that conclusion.

def check_compliance(candidate_text: str) -> str:
    prompt = f"""
You are a compliance reviewer for a fintech company.

Task:
1. Determine whether the text violates any policy in the indexed compliance documents.
2. Return one of: PASS, FAIL, ESCALATE.
3. Cite the exact policy basis from retrieved documents.
4. If evidence is insufficient or ambiguous, return ESCALATE.

Text to review:
{candidate_text}
"""
    response = query_engine.query(prompt)
    return str(response)

sample_text = """
Open an account instantly with no identity verification required.
Guaranteed approval for all users in under 60 seconds.
"""

result = check_compliance(sample_text)
print(result)
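
Because the response is free text, downstream systems should not branch on it directly. A minimal parsing sketch that defaults to ESCALATE whenever no clear verdict is found (parse_decision is a hypothetical helper, not part of LlamaIndex):

import re

def parse_decision(raw_output: str) -> str:
    """Extract the first PASS/FAIL/ESCALATE token; default to ESCALATE."""
    match = re.search(r"\b(PASS|FAIL|ESCALATE)\b", raw_output)
    return match.group(1) if match else "ESCALATE"

print(parse_decision(result))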

4) Add structured output and audit logging

For production systems, don’t rely on raw text alone. Log inputs and outputs in a format your risk team can inspect later. LlamaIndex gives you access to source nodes through response objects when you need traceability.

import json
from datetime import datetime, timezone

def audited_check(candidate_text: str):
    prompt = f"""
Review this text for fintech compliance issues.
Return PASS/FAIL/ESCALATE with concise rationale and cite sources.

Text:
{candidate_text}
"""
    response = query_engine.query(prompt)

    audit_record = {
        "timestamp": datetime.utcnow().isoformat(),
        "input_text": candidate_text,
        "decision_output": str(response),
        "sources": [
            {
                "text": node.node.get_content(),
                "score": node.score,
            }
            for node in getattr(response, "source_nodes", []) or []
        ],
    }

    print(json.dumps(audit_record, indent=2))
    return audit_record
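
Printing works for a demo, but each record should land in durable, append-only storage, and a hash of the input gives auditors a tamper-evident fingerprint of what was reviewed. A minimal JSONL sketch; the file path is an assumption:

import hashlib
import json
from pathlib import Path

AUDIT_LOG = Path("./audit_log.jsonl")  # swap for your risk team's actual store

def persist_audit_record(record: dict) -> None:
    # Fingerprint the input so the log entry can be tied to the exact text reviewed.
    record["input_sha256"] = hashlib.sha256(record["input_text"].encode("utf-8")).hexdigest()
    with AUDIT_LOG.open("a", encoding="utf-8") as f:
        f.write(json.dumps(record) + "\n")

persist_audit_record(audited_check(sample_text))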

If you want tighter control over retrieval before generation, use VectorIndexRetriever directly:

from llama_index.core.retrievers import VectorIndexRetriever

retriever = VectorIndexRetriever(index=index, similarity_top_k=4)
nodes = retriever.retrieve("Does this marketing copy make misleading claims?")
for node in nodes:
    print(node.score)
    print(node.node.get_content())
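
The scores also work as a pre-generation guardrail: if nothing sufficiently relevant comes back, escalate before spending an LLM call on a verdict you cannot support. The threshold is an assumption to calibrate on your own corpus:

MIN_RETRIEVAL_SCORE = 0.75  # assumed cutoff; calibrate against labeled examples

def retrieve_or_escalate(question: str):
    nodes = retriever.retrieve(question)
    best = max((n.score or 0.0) for n in nodes) if nodes else 0.0
    # Weak or empty retrieval means the policies may not cover this case.
    if best < MIN_RETRIEVAL_SCORE:
        return "ESCALATE", []
    return "OK", nodes

status, evidence = retrieve_or_escalate("Does this marketing copy make misleading claims?")
print(status, len(evidence))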

Production Considerations

  • Enforce data residency

    • Keep EU customer content in EU-hosted infrastructure if your regulatory posture requires it.
    • Separate indexes by region so queries never cross residency boundaries (a routing sketch follows this list).
  • Log every decision path

    • Store input text hashes, retrieved nodes, model version, prompt version, and final disposition.
    • Auditors care about reproducibility more than clever prompts.
  • Add hard guardrails

    • Block outputs that do not cite retrieved policy content.
    • Escalate when retrieval confidence is low or when the text touches high-risk categories like sanctions, lending decisions, or suitability claims.
  • Monitor drift by policy version

    • When regulations change or internal policies are updated, rebuild indexes and track which version was used for each decision.
    • A stale index is how compliant systems become non-compliant quietly.
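
For the residency point above, one pattern is one index per region with explicit routing. The region subdirectories below are an assumed layout, and the code reuses the embed_model and llm defined earlier:

from llama_index.core import SimpleDirectoryReader, VectorStoreIndex

# Assumed layout: ./compliance_docs/us and ./compliance_docs/eu hold region-specific policies.
region_indexes = {
    region: VectorStoreIndex.from_documents(
        SimpleDirectoryReader(input_dir=f"./compliance_docs/{region}").load_data(),
        embed_model=embed_model,
    )
    for region in ("us", "eu")
}

def query_for_region(region: str, question: str):
    # A KeyError here is deliberate: an unknown region should fail loudly, not fall back.
    engine = region_indexes[region].as_query_engine(llm=llm, similarity_top_k=4)
    return engine.query(question)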

Common Pitfalls

  1. Using the agent as a legal oracle

    • Mistake: letting the model decide based on memory alone.
    • Avoid it by forcing retrieval from approved documents and requiring citations in every answer (a minimal citation gate is sketched below the list).
  2. Mixing jurisdictions in one undifferentiated index

    • Mistake: combining US marketing rules with EU disclosure rules in one blob of context.
    • Avoid it by partitioning indexes by region, product type, and control family.
  3. Skipping escalation logic

    • Mistake: treating low-confidence results as PASS because the model sounded confident.
    • Avoid it by defining explicit ESCALATE thresholds for ambiguous cases and routing them to human compliance review.
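
The citation gate from pitfall 1 can be as small as the sketch below: refuse any verdict that arrives without retrieval evidence attached (require_citations is a hypothetical helper, not a LlamaIndex API):

def require_citations(response) -> str:
    """Return the model's verdict only when retrieval evidence exists; else ESCALATE."""
    if not (getattr(response, "source_nodes", []) or []):
        return "ESCALATE"
    return str(response)

print(require_citations(query_engine.query(sample_text)))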

A fintech compliance agent is only useful if it is traceable. Build it so every answer can be explained back to a policy document, every decision can be audited later, and every uncertain case gets handed off instead of guessed.

