How to Build a Claims Processing Agent Using LlamaIndex in Python for Retail Banking
A claims processing agent for retail banking takes an incoming customer claim, classifies it, pulls the right policy and account context, checks required documents, and routes the case with a clear decision trail. It matters because retail banks need faster turnaround without losing control over compliance, auditability, and customer data handling.
Architecture
- Claim intake layer
  - Receives claim text from chat, email, or a case form.
  - Normalizes fields like customer ID, product type, incident date, and claim category.
- Document retrieval layer
  - Pulls policy docs, product T&Cs, claims playbooks, and prior case notes.
  - Uses `VectorStoreIndex` with metadata filters for product line, jurisdiction, and version.
- Reasoning and extraction layer
  - Uses an LLM to extract structured claim facts.
  - Produces JSON-like outputs for downstream workflow systems.
- Decision support layer
  - Maps the extracted facts to eligibility checks and next actions.
  - Flags missing documents, suspicious patterns, or escalation triggers.
- Audit and compliance layer
  - Logs retrieved sources, model output, timestamps, and human overrides.
  - Supports internal audit and regulatory review.
- Case management integration
  - Writes the final result into CRM or claims workflow tools.
  - Hands off uncertain cases to a human reviewer.
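The layers above can be sketched as one pipeline function. This is a minimal, illustrative skeleton with stubbed values; none of the names below are LlamaIndex APIs:

```python
from dataclasses import dataclass, field

@dataclass
class ClaimContext:
    """Carries a claim through the layers; all fields are illustrative."""
    raw_message: str
    fields: dict = field(default_factory=dict)         # intake layer output
    policy_chunks: list = field(default_factory=list)  # retrieval layer output
    decision: str = ""                                 # decision support output
    audit_log: list = field(default_factory=list)      # audit/compliance trail

def process_claim(message: str) -> ClaimContext:
    ctx = ClaimContext(raw_message=message)
    # 1) Intake: normalize fields (stubbed).
    ctx.fields = {"claim_type": "card_fraud", "jurisdiction": "UK"}
    # 2) Retrieval: pull approved policy context (stubbed).
    ctx.policy_chunks = ["UK card fraud policy v3, section 2.1"]
    # 3-4) Reasoning + decision rules (stubbed: always escalate).
    ctx.decision = "human_review"
    # 5) Audit: record what was decided and why.
    ctx.audit_log.append(("decision", ctx.decision))
    # 6) Case management integration would happen after this returns.
    return ctx
```

Each stub becomes a real component in the steps that follow; the value of the skeleton is that every layer reads from and writes to one explicit context object.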
Implementation
1) Build a retriever over claims policy documents
Start by loading your internal documents into a vector index. For retail banking, keep policy version and jurisdiction as metadata so you can filter by region and product line later.
```python
from llama_index.core import VectorStoreIndex, SimpleDirectoryReader
from llama_index.core.schema import MetadataMode

# Load policy docs from a controlled internal folder
documents = SimpleDirectoryReader("./claims_policies").load_data()

# Build the index
index = VectorStoreIndex.from_documents(documents)

# Create a retriever
retriever = index.as_retriever(similarity_top_k=3)

query = "What documents are required for a card fraud claim in the UK?"
nodes = retriever.retrieve(query)

for node in nodes:
    print(node.score)
    print(node.node.get_content(metadata_mode=MetadataMode.LLM))
    print("---")
```
This gives you grounded context before the model makes any recommendation. In banking, that grounding is what keeps the agent tied to approved policy instead of hallucinating process steps.
2) Define a structured output schema for claim extraction
You want deterministic fields out of unstructured customer messages. Use `PydanticProgramExtractor` or a structured LLM response path; here I'll show a simple Pydantic schema plus an OpenAI completion call through LlamaIndex.
```python
from pydantic import BaseModel, Field
from typing import Optional

class ClaimCase(BaseModel):
    customer_id: str = Field(..., description="Retail banking customer identifier")
    claim_type: str = Field(..., description="e.g. card_fraud, chargeback, account_takeover")
    jurisdiction: str = Field(..., description="Country or region")
    incident_date: str = Field(..., description="ISO date string")
    amount: Optional[float] = Field(None, description="Claim amount if known")
    missing_documents: list[str] = Field(default_factory=list)
```
Now wire the LLM with an extraction prompt. In production you would use your approved model endpoint and enforce output validation before any workflow action.
```python
from llama_index.llms.openai import OpenAI
from llama_index.core.prompts import PromptTemplate

llm = OpenAI(model="gpt-4o-mini", temperature=0)

prompt = PromptTemplate(
    """
You are a retail banking claims intake assistant.
Extract the following fields from the customer message:
{schema}
Customer message:
{message}
Return only valid JSON matching the schema.
"""
)

message = """
My debit card was used for three transactions I did not make yesterday in London.
I have already blocked the card. My customer ID is RB12345.
"""

response = llm.complete(
    prompt.format(schema=ClaimCase.model_json_schema(), message=message)
)
print(response.text)
```
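Before that output touches any workflow, validate it against the schema. A minimal sketch using Pydantic v2's `model_validate_json`; the return-`None`-on-failure convention is an assumption, and `ClaimCase` is repeated here so the snippet runs standalone:

```python
from typing import Optional
from pydantic import BaseModel, Field, ValidationError

# ClaimCase repeated from the schema above so this snippet is self-contained.
class ClaimCase(BaseModel):
    customer_id: str
    claim_type: str
    jurisdiction: str
    incident_date: str
    amount: Optional[float] = None
    missing_documents: list[str] = Field(default_factory=list)

def parse_claim(raw_json: str) -> Optional[ClaimCase]:
    """Return a validated ClaimCase, or None to route to manual handling."""
    try:
        return ClaimCase.model_validate_json(raw_json)
    except ValidationError:
        return None

good = (
    '{"customer_id": "RB12345", "claim_type": "card_fraud", '
    '"jurisdiction": "UK", "incident_date": "2026-04-20"}'
)
bad = '{"claim_type": "card_fraud"}'  # missing mandatory fields

print(parse_claim(good))  # validated ClaimCase instance
print(parse_claim(bad))   # None -> fall back to human review
```

Rejecting malformed output at this boundary means the downstream routing logic only ever sees well-typed claim data.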
3) Combine retrieval with reasoning using RetrieverQueryEngine
This is the core pattern: retrieve policy context first, then ask the LLM to decide next steps using only that context. That keeps answers auditable.
```python
from llama_index.core import Settings
from llama_index.core.query_engine import RetrieverQueryEngine
from llama_index.core.response_synthesizers import get_response_synthesizer

Settings.llm = llm

response_synthesizer = get_response_synthesizer(response_mode="compact")

query_engine = RetrieverQueryEngine(
    retriever=retriever,
    response_synthesizer=response_synthesizer,
)

bank_query = """
Customer reports debit card fraud in the UK.
State required documents, immediate next steps, and whether this should be escalated to manual review.
"""

result = query_engine.query(bank_query)
print(result.response)
```
In practice you should wrap this in a service that also stores:
- retrieved node IDs
- source document versions
- model name and prompt version
- the final human decision, if one exists
That audit trail is non-negotiable for regulated workflows.
4) Add guardrails before routing to operations
Do not let the agent auto-close or auto-pay claims without controls. Use business rules outside the model for thresholds like amount limits or sanctions-related flags.
A simple pattern is:
- If mandatory fields are missing → request more info
- If the amount exceeds a threshold → escalate to a human reviewer
- If policy retrieval confidence is low → stop and escalate
```python
def route_claim(claim: ClaimCase, retrieval_score: float) -> str:
    if not claim.customer_id or not claim.claim_type:
        return "request_more_info"
    if claim.amount is not None and claim.amount > 500:
        return "human_review"
    if retrieval_score < 0.75:
        return "human_review"
    return "proceed"

# Example usage after extraction + retrieval scoring logic
print(route_claim(
    ClaimCase(
        customer_id="RB12345",
        claim_type="card_fraud",
        jurisdiction="UK",
        incident_date="2026-04-20",
        amount=120.0,
        missing_documents=["signed_declaration"],
    ),
    retrieval_score=0.82,
))
```
Production Considerations
- Data residency
  - Keep indexing and inference inside approved regions.
  - For UK/EU retail banking claims data, avoid sending PII to non-compliant endpoints.
- Auditability
  - Persist prompt versions, retrieved chunks, final outputs, and user overrides.
  - Store enough evidence so compliance teams can reconstruct every decision.
- Monitoring
  - Track retrieval hit rate, escalation rate, hallucination reports, and average time-to-resolution.
  - Alert on spikes in manual review or low-confidence classifications.
- Guardrails
  - Enforce redaction for PANs, account numbers, addresses, and national IDs before prompts leave your app boundary.
  - Keep hard business rules outside the LLM; use the model for classification and summarization only.
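The redaction point above can be sketched with simple regex rules. The patterns and placeholder tokens here are illustrative assumptions, not a complete PII solution; production redaction needs broader coverage (names, addresses, national IDs) and testing against real message formats:

```python
import re

# 13-19 digit card numbers, optionally separated by spaces or hyphens.
PAN_RE = re.compile(r"\b\d(?:[ -]?\d){12,18}\b")
# UK sort code + account number pairs, e.g. "12-34-56 12345678".
SORT_ACC_RE = re.compile(r"\b\d{2}-\d{2}-\d{2}\s+\d{8}\b")

def redact(text: str) -> str:
    # Redact the more specific account pattern first so its digits are not
    # partially consumed by the broader card-number pattern.
    text = SORT_ACC_RE.sub("[ACCOUNT_REDACTED]", text)
    text = PAN_RE.sub("[PAN_REDACTED]", text)
    return text

msg = "My card 4111 1111 1111 1111 was used; account 12-34-56 12345678."
print(redact(msg))  # both identifiers replaced with placeholder tokens
```

Running this before any prompt is assembled means raw PANs and account numbers never cross your app boundary, regardless of what the model or logging layer does downstream.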
Common Pitfalls
- Using the model as the decision engine
  - Mistake: letting the LLM approve or deny claims directly.
  - Fix: use deterministic rules plus human review for anything financial or exception-based.
- Skipping metadata on policy documents
  - Mistake: indexing all docs without versioning or jurisdiction tags.
  - Fix: attach metadata like `product`, `country`, `effective_date`, and `policy_version`, then filter at retrieval time.
- No validation on extracted fields
  - Mistake: trusting free-form model output as workflow input.
  - Fix: validate against a Pydantic schema, reject malformed outputs, and fall back to manual handling when fields are missing.
- Ignoring compliance logging
  - Mistake: only storing final answers.
  - Fix: log the sources used by `RetrieverQueryEngine`, user inputs after redaction, and every handoff to operations.
A good retail banking claims agent does less than people expect at first glance. It extracts clean case data, retrieves approved policy context via LlamaIndex components like `VectorStoreIndex` and `RetrieverQueryEngine`, then routes decisions through strict controls that satisfy audit and compliance requirements.
Keep learning
- The complete AI Agents Roadmap — my full 8-step breakdown
- Free: The AI Agent Starter Kit — PDF checklist + starter code
- Work with me — I build AI for banks and insurance companies
By Cyprian Aarons, AI Consultant at Topiax.