How to Build a Claims Processing Agent Using AutoGen in Python for Retail Banking
A claims processing agent in retail banking takes a customer claim, gathers the missing facts, checks policy and account context, routes the case to the right workflow, and produces an auditable recommendation or next action. It matters because claims are high-volume, time-sensitive, and compliance-heavy; if you automate them poorly, you create rework, regulatory risk, and bad customer outcomes.
Architecture
- Customer intake layer
  - Accepts claim details from branch systems, mobile apps, or back-office queues.
  - Normalizes fields like claim type, account number, transaction date, amount, and narrative.
- Policy and rules retrieval
  - Pulls product rules, dispute windows, KYC status, AML flags, and claim eligibility criteria.
  - Keeps the agent grounded in bank-approved sources instead of free-form reasoning.
- Multi-agent orchestration
  - Uses separate agents for triage, evidence review, compliance validation, and final recommendation.
  - Keeps responsibilities narrow so each step is easier to audit.
- Case memory and audit trail
  - Stores every message, tool call, and decision rationale.
  - Supports internal audit, dispute resolution, and regulator requests.
- Human approval gate
  - Escalates edge cases: fraud suspicion, high-value claims, residency restrictions, or policy exceptions.
  - Prevents the agent from making final decisions where policy requires human sign-off.
- Secure integration layer
  - Connects to core banking APIs, document stores, and case management systems.
  - Enforces data minimization and residency constraints before any external LLM call.
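The intake layer's normalized output can be sketched as a typed record. The field names and the `ClaimRecord` class below are illustrative assumptions, not a bank schema:

```python
from dataclasses import dataclass, field
from datetime import date

@dataclass
class ClaimRecord:
    """Normalized claim produced by the intake layer (illustrative fields)."""
    claim_id: str
    claim_type: str          # e.g. "card_dispute", "fee_reversal"
    account_number: str      # masked before any external LLM call
    transaction_date: date
    amount: float
    narrative: str
    channel: str = "mobile_app"
    missing_fields: list[str] = field(default_factory=list)

claim = ClaimRecord(
    claim_id="CLM-10482",
    claim_type="card_dispute",
    account_number="****4821",
    transaction_date=date(2026, 3, 18),
    amount=184.22,
    narrative="Customer says merchant charged twice",
)
print(claim.claim_type)  # card_dispute
```

Downstream agents then consume one predictable shape instead of re-parsing free text from each channel.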
Implementation
1) Install AutoGen and define your agents
For this pattern, use AutoGen with a small group chat. One agent triages the claim, another acts as compliance reviewer, and a third summarizes the final case note.
```python
import os

from autogen import AssistantAgent, GroupChat, GroupChatManager, UserProxyAgent

# AutoGen expects model credentials in a config_list.
llm_config = {
    "config_list": [
        {"model": "gpt-4o-mini", "api_key": os.environ["OPENAI_API_KEY"]}
    ],
    "temperature": 0,
}

triage_agent = AssistantAgent(
    name="triage_agent",
    llm_config=llm_config,
    system_message=(
        "You triage retail banking claims. "
        "Extract claim type, severity, missing information, and next action. "
        "Never invent facts. If data is missing, ask for it."
    ),
)

compliance_agent = AssistantAgent(
    name="compliance_agent",
    llm_config=llm_config,
    system_message=(
        "You validate retail banking claims against policy. "
        "Check KYC status, dispute window assumptions, auditability, and escalation triggers. "
        "Flag any compliance or residency issues."
    ),
)

case_writer = AssistantAgent(
    name="case_writer",
    llm_config=llm_config,
    system_message=(
        "You write concise case notes for bank operations. "
        "Return a structured summary with decision rationale and follow-up steps."
    ),
)

user_proxy = UserProxyAgent(
    name="ops_system",
    human_input_mode="NEVER",
    code_execution_config=False,  # this proxy only runs registered tools, never arbitrary code
)
```
2) Create a controlled group chat workflow
The important part is not just calling an LLM. It is constraining the conversation so each agent does one job and the output can be reviewed later.
```python
groupchat = GroupChat(
    agents=[user_proxy, triage_agent, compliance_agent, case_writer],
    messages=[],
    max_round=4,
)
manager = GroupChatManager(groupchat=groupchat, llm_config=llm_config)

claim_payload = """
Claim ID: CLM-10482
Customer: Retail checking account holder
Issue: Card transaction dispute for $184.22
Transaction date: 2026-03-18
Reported date: 2026-03-20
Channel: Mobile app
Notes: Customer says merchant charged twice
"""

user_proxy.initiate_chat(
    manager,
    message=(
        "Process this retail banking claim. "
        "1) Triage it. "
        "2) Check compliance risks. "
        "3) Produce a case note with next steps.\n\n"
        f"{claim_payload}"
    ),
)
```
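If you want the speaking order itself to be deterministic rather than chosen by the manager's LLM, `GroupChat` accepts a `speaker_selection_method`. A round-robin sketch, reusing the agents defined in step 1:

```python
# Fixed turn order: ops_system -> triage -> compliance -> case_writer.
# No LLM-based speaker choice, which makes the transcript easier to audit.
groupchat = GroupChat(
    agents=[user_proxy, triage_agent, compliance_agent, case_writer],
    messages=[],
    max_round=4,
    speaker_selection_method="round_robin",
)
manager = GroupChatManager(groupchat=groupchat, llm_config=llm_config)
```

Round-robin trades flexibility for predictability, which is usually the right trade in a compliance-heavy workflow.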
3) Add tool functions for bank systems
AutoGen works well when you attach tools for deterministic checks. Use actual Python functions for policy lookup or account verification before the model makes recommendations.
```python
from datetime import datetime, timezone

def check_dispute_window(transaction_date_str: str) -> str:
    """Deterministic policy check: is the dispute within the 60-day window?"""
    txn_date = datetime.strptime(transaction_date_str, "%Y-%m-%d").replace(tzinfo=timezone.utc)
    days_since = (datetime.now(timezone.utc) - txn_date).days
    return "within_window" if days_since <= 60 else f"outside_window:{days_since}"

def get_kyc_status(customer_id: str) -> str:
    # Replace with a real core banking / KYC service call
    return "verified"

# Register each tool twice: for the calling agent's LLM, and for execution by the proxy.
triage_agent.register_for_llm(name="check_dispute_window", description="Check whether a card dispute is within the policy window")(check_dispute_window)
compliance_agent.register_for_llm(name="get_kyc_status", description="Fetch KYC verification status")(get_kyc_status)
user_proxy.register_for_execution(name="check_dispute_window")(check_dispute_window)
user_proxy.register_for_execution(name="get_kyc_status")(get_kyc_status)
```
Use these tools to force objective checks into the flow. For retail banking claims this matters because eligibility windows and identity status should come from systems of record, not model inference.
4) Structure outputs for downstream case management
Your final output should be machine-readable enough to persist into a case system. Keep it short and explicit.
```python
final_case_note = {
    "claim_id": "CLM-10482",
    "status": "needs_review",
    "reason": [
        "Transaction appears to be within dispute window",
        "Customer narrative suggests duplicate merchant charge",
        "KYC status must be confirmed before closure",
    ],
    "next_steps": [
        "Request merchant receipt or app screenshot if available",
        "Open payment network dispute workflow",
        "Route to human reviewer if amount exceeds threshold",
    ],
}
print(final_case_note)
```
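Before persisting, it helps to validate the note against a minimal schema so malformed agent output never reaches the case system. The required keys and allowed status values below are illustrative assumptions:

```python
REQUIRED_KEYS = {"claim_id", "status", "reason", "next_steps"}
ALLOWED_STATUSES = {"approved", "declined", "needs_review", "escalated"}

def validate_case_note(note: dict) -> list[str]:
    """Return a list of validation errors; an empty list means the note is valid."""
    errors = []
    missing = REQUIRED_KEYS - note.keys()
    if missing:
        errors.append(f"missing keys: {sorted(missing)}")
    if note.get("status") not in ALLOWED_STATUSES:
        errors.append(f"invalid status: {note.get('status')!r}")
    if not note.get("next_steps"):
        errors.append("next_steps must not be empty")
    return errors

note = {
    "claim_id": "CLM-10482",
    "status": "needs_review",
    "reason": ["Transaction appears to be within dispute window"],
    "next_steps": ["Open payment network dispute workflow"],
}
print(validate_case_note(note))  # []
```

A note that fails validation should be retried or routed to a human, never written to the system of record.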
Production Considerations
- Data residency
  - Keep PII inside approved regions.
  - If your LLM endpoint is outside your regulated geography, redact customer identifiers before sending prompts.
- Auditability
  - Persist every prompt, response, tool result, and routing decision.
  - Store immutable logs with claim ID correlation so compliance teams can reconstruct the full path later.
- Guardrails
  - Block final decisions on fraud suspicion thresholds unless a human approves.
  - Add deterministic checks for dispute windows, sanctions hits, account status changes, and policy exclusions before any LLM recommendation.
- Monitoring
  - Track escalation rate by claim type.
  - Watch for hallucinated fields in outputs, like invented transaction references or unsupported eligibility claims.
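A minimal redaction pass for the residency point above might look like this. The regex patterns are illustrative only; a real deployment would use a vetted PII-detection service, not three regexes:

```python
import re

# Illustrative patterns only: long digit runs, IBAN-shaped strings, emails.
PATTERNS = [
    (re.compile(r"\b\d{10,16}\b"), "[ACCOUNT]"),
    (re.compile(r"\b[A-Z]{2}\d{2}[A-Z0-9]{11,30}\b"), "[IBAN]"),
    (re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"), "[EMAIL]"),
]

def redact(text: str) -> str:
    """Replace obvious identifiers with placeholder tokens before prompt submission."""
    for pattern, token in PATTERNS:
        text = pattern.sub(token, text)
    return text

print(redact("Refund to account 4111222233334444, contact jo@example.com"))
# Refund to account [ACCOUNT], contact [EMAIL]
```

Run redaction as the last step before any external LLM call, so nothing upstream can accidentally bypass it.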
Common Pitfalls
- Letting the model decide eligibility from raw narrative
  - Avoid this by pushing eligibility logic into tools like `check_dispute_window()` and core banking lookups.
  - The model should explain results from systems of record, not replace them.
- Mixing customer support chat with operational decisioning
  - Keep conversational intake separate from adjudication.
  - Claims processing needs structured outputs; free-form chat creates bad downstream automation.
- Ignoring audit requirements
  - If you do not store intermediate messages and tool calls from the `GroupChatManager`, you will not have a defensible trail.
  - In retail banking that becomes a problem fast during disputes or regulator review.
- Sending unredacted PII to external models
  - Mask account numbers, national IDs, addresses, and sensitive complaint text before prompt submission.
  - Apply residency-aware routing so regulated data never leaves approved infrastructure.
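One way to capture that trail is to append every group-chat message to an append-only log keyed by claim ID. The `persist_audit_trail` helper is a hypothetical sketch, and a JSON-lines file stands in for a real immutable store; in AutoGen, `groupchat.messages` holds the conversation after a run:

```python
import json
from datetime import datetime, timezone

def persist_audit_trail(claim_id: str, messages: list, path: str) -> int:
    """Append each chat message as one JSON line, correlated by claim ID."""
    with open(path, "a", encoding="utf-8") as f:
        for i, msg in enumerate(messages):
            f.write(json.dumps({
                "claim_id": claim_id,
                "sequence": i,
                "logged_at": datetime.now(timezone.utc).isoformat(),
                "message": msg,
            }) + "\n")
    return len(messages)

# After a run: persist_audit_trail("CLM-10482", groupchat.messages, "audit.jsonl")
demo = [{"name": "triage_agent", "content": "Claim is a card dispute."}]
print(persist_audit_trail("CLM-10482", demo, "audit.jsonl"))  # 1
```

Writing the log synchronously, before returning any recommendation, ensures no decision exists without its trail.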
Keep learning
- The complete AI Agents Roadmap — my full 8-step breakdown
- Free: The AI Agent Starter Kit — PDF checklist + starter code
- Work with me — I build AI for banks and insurance companies
By Cyprian Aarons, AI Consultant at Topiax.
Want the complete 8-step roadmap?
Grab the free AI Agent Starter Kit — architecture templates, compliance checklists, and a 7-email deep-dive course.