How to Build a Claims Processing Agent Using AutoGen in Python for Banking

By Cyprian Aarons · Updated 2026-04-21
Tags: claims-processing, autogen, python, banking

A claims processing agent in banking takes an incoming claim, extracts the relevant facts, checks policy and account context, routes exceptions, and produces a decision-ready summary for a human reviewer. It matters because claims are high-volume, document-heavy, and regulated; if you automate the first pass correctly, you cut handling time without losing auditability or control.

Architecture

  • Ingress layer

    • Receives claim payloads from a case management system, API gateway, or queue.
    • Normalizes documents, metadata, and customer identifiers.
  • Extraction agent

    • Pulls structured fields from emails, PDFs, scanned forms, and notes.
    • Produces a canonical claim object with confidence scores.
  • Validation agent

    • Checks policy coverage, KYC flags, transaction references, date windows, and duplicate claims.
    • Calls bank systems through controlled tools instead of free-form reasoning.
  • Decision orchestrator

    • Coordinates multiple agents using AutoGen AssistantAgent instances.
    • Escalates ambiguous cases to a human approver.
  • Audit and logging layer

    • Stores every prompt, tool call, response, and final decision.
    • Supports internal audit and regulator review.
  • Data boundary controls

    • Enforces data residency, PII redaction, and access controls before any model call.
    • Keeps sensitive banking data out of uncontrolled endpoints.
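The "canonical claim object" the extraction agent produces can be sketched as a plain dataclass. The field names below mirror the extractor's system message in the implementation section; the 0.85 review threshold is an illustrative assumption you would tune to your own accuracy data.

```python
from dataclasses import dataclass

@dataclass
class ClaimRecord:
    """Canonical claim object produced by the extraction agent."""
    claim_id: str
    customer_id: str
    amount: float
    currency: str
    incident_date: str   # ISO 8601, e.g. "2026-03-18"
    merchant_name: str
    reason_code: str
    confidence: float    # extraction confidence in [0, 1]

    def needs_human_review(self, threshold: float = 0.85) -> bool:
        # Low-confidence extractions go to a human, never to auto-routing.
        return self.confidence < threshold
```

Keeping this shape in one place means the validation agent, the orchestrator, and the audit layer all agree on what a claim looks like.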

Implementation

1) Set up AutoGen agents with strict roles

Use separate agents for extraction and validation. In banking, do not let one model both invent facts and approve them.

import os
from autogen import AssistantAgent

llm_config = {
    "config_list": [
        {
            "model": "gpt-4o-mini",
            "api_key": os.environ["OPENAI_API_KEY"],
        }
    ],
    "temperature": 0,
}

extractor = AssistantAgent(
    name="extractor",
    llm_config=llm_config,
    system_message=(
        "You extract structured claims data from banking claim text. "
        "Return only valid JSON with fields: claim_id, customer_id, amount, "
        "currency, incident_date, merchant_name, reason_code, confidence."
    ),
)

validator = AssistantAgent(
    name="validator",
    llm_config=llm_config,
    system_message=(
        "You validate banking claims against policy rules. "
        "Do not approve claims without evidence. Return JSON with fields: "
        "status, risk_flags, missing_data, recommended_action."
    ),
)

2) Add controlled tools for policy lookup

AutoGen works best when the model can call deterministic functions for bank systems. Keep the tool narrow and make the return shape explicit.

from typing import Dict

def lookup_policy(customer_id: str) -> Dict:
    # Replace with real internal service call
    return {
        "customer_id": customer_id,
        "coverage_active": True,
        "daily_limit": 5000,
        "requires_manual_review": False,
        "kyc_status": "verified",
    }

def check_duplicate_claim(claim_id: str) -> Dict:
    # Replace with real case management query
    return {"claim_id": claim_id, "duplicate_found": False}
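To make "controlled" concrete, one pattern (a convention of this article, not an AutoGen API) is an explicit allowlist between the agents and your bank systems: a tool call is only dispatched if the function was registered up front, so the model can never reach an arbitrary internal endpoint.

```python
from typing import Any, Callable, Dict

# Explicit allowlist: agents can only invoke tools registered here.
TOOL_REGISTRY: Dict[str, Callable[..., Dict]] = {}

def register_tool(fn: Callable[..., Dict]) -> Callable[..., Dict]:
    """Add a function to the allowlist, keyed by its name."""
    TOOL_REGISTRY[fn.__name__] = fn
    return fn

def dispatch_tool(name: str, **kwargs: Any) -> Dict:
    """Run an allowlisted tool; reject anything not registered."""
    if name not in TOOL_REGISTRY:
        raise ValueError(f"tool not allowlisted: {name}")
    return TOOL_REGISTRY[name](**kwargs)
```

You would register lookup_policy and check_duplicate_claim through register_tool so the orchestration layer can invoke exactly those two functions and nothing else. AutoGen also supports registering functions directly on agents; check the docs for your installed version for the exact API.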

3) Orchestrate the workflow with GroupChat and GroupChatManager

This pattern gives you traceable multi-agent coordination without turning the system into a black box. The manager controls turn-taking; your app can stop the conversation once enough evidence exists.

from autogen import UserProxyAgent, GroupChat, GroupChatManager

user_proxy = UserProxyAgent(
    name="user_proxy",
    human_input_mode="NEVER",
    code_execution_config=False,
)

task = """
Extract the claim details from this case:
Customer says a card transaction of USD 184.22 at MERCHANT_X on 2026-03-18 was unauthorized.
Claim ID CLM-88421. Customer ID CUST-1029.
Then validate it against policy rules.
"""

groupchat = GroupChat(
    agents=[user_proxy, extractor, validator],
    messages=[],
    max_round=4,
)

manager = GroupChatManager(groupchat=groupchat, llm_config=llm_config)

user_proxy.initiate_chat(manager, message=task)
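To stop the conversation once enough evidence exists, one option is a termination predicate. AutoGen's conversable agents accept an is_termination_msg callable (verify the parameter against your installed version); the sketch below ends the chat once a message parses as the validator's JSON verdict, using the field names from the validator's system message.

```python
import json

def has_final_verdict(message: dict) -> bool:
    """Return True once a message body is the validator's JSON verdict."""
    content = message.get("content") or ""
    try:
        payload = json.loads(content)
    except json.JSONDecodeError:
        return False
    # The validator is instructed to emit status and recommended_action.
    return "status" in payload and "recommended_action" in payload
```

Passing is_termination_msg=has_final_verdict when constructing the UserProxyAgent keeps max_round as a hard ceiling while letting well-formed verdicts end the chat early.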

4) Wire in deterministic post-processing before any decision leaves the system

Never send raw model output directly to downstream banking workflows. Parse it, validate it against schema rules, then route based on risk.

import json

raw_output = """
{"claim_id":"CLM-88421","customer_id":"CUST-1029","amount":184.22,"currency":"USD",
"incident_date":"2026-03-18","merchant_name":"MERCHANT_X","reason_code":"unauthorized",
"confidence":0.93}
"""

claim = json.loads(raw_output)
policy = lookup_policy(claim["customer_id"])
duplicate = check_duplicate_claim(claim["claim_id"])

decision = {
    "claim_id": claim["claim_id"],
    "status": "manual_review" if duplicate["duplicate_found"] or not policy["coverage_active"] else "eligible_for_auto_review",
    "audit_tags": ["pii_redacted", "policy_checked", "duplicate_screened"],
}
print(decision)
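The parse step above still trusts the model's field names and types. A minimal schema check rejects malformed claims before routing; this is a hand-rolled sketch, and in production you would more likely reach for pydantic or jsonschema.

```python
from typing import Any, Dict, List

NUMERIC = (int, float)

# Required fields and expected types; mirrors the extractor's output schema.
CLAIM_SCHEMA: Dict[str, Any] = {
    "claim_id": str,
    "customer_id": str,
    "amount": NUMERIC,
    "currency": str,
    "incident_date": str,
    "merchant_name": str,
    "reason_code": str,
    "confidence": NUMERIC,
}

def validate_claim(claim: Dict[str, Any]) -> List[str]:
    """Return schema violations; an empty list means the claim is routable."""
    errors = []
    for field, expected in CLAIM_SCHEMA.items():
        if field not in claim:
            errors.append(f"missing field: {field}")
        elif not isinstance(claim[field], expected):
            errors.append(f"wrong type for field: {field}")
    if not errors and not 0.0 <= claim["confidence"] <= 1.0:
        errors.append("confidence out of range")
    return errors
```

Route any claim with a non-empty error list to manual review rather than attempting repair.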

Production Considerations

  • Deploy inside your bank’s approved boundary

    • Use VPC/private networking and approved model endpoints.
    • If data residency is required, keep prompts and logs in-region.
  • Log everything needed for audit

    • Store input hashes, tool calls, outputs, timestamps, model version, and reviewer actions.
    • Make logs immutable or append-only where possible.
  • Add hard guardrails

    • Redact PANs, account numbers, SSNs/national IDs before LLM calls.
    • Block any claim approval path that lacks policy confirmation or human sign-off above threshold.
  • Monitor drift and failure modes

    • Track extraction accuracy by claim type.
    • Watch for hallucinated merchant names, incorrect dates, and overconfident approvals.

Common Pitfalls

  1. Using one agent for extraction and approval

    • This creates role confusion and bad decisions.
    • Split extraction from validation so each step has one job.
  2. Trusting free-text outputs

    • Banking workflows need schema validation.
    • Force JSON output and reject anything that does not parse cleanly.
  3. Skipping tool-based verification

    • The model should not infer policy status or duplicate history from text alone.
    • Always query authoritative systems through functions like lookup_policy() before routing a claim.
  4. Ignoring audit requirements

    • If you cannot explain why a claim was escalated or approved, you cannot ship it in banking.
    • Persist prompts, responses, and tool results with enough context for internal audit and regulator review.
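Pitfall 2 can be enforced mechanically: accept a reply only if the entire message body parses as a single JSON object, and treat everything else as a retry or an escalation. A minimal sketch:

```python
import json
from typing import Dict, Optional

def parse_strict_json(reply: str) -> Optional[Dict]:
    """Return the parsed object only if the whole reply is one JSON object."""
    try:
        parsed = json.loads(reply.strip())
    except json.JSONDecodeError:
        return None
    return parsed if isinstance(parsed, dict) else None
```

Chatty replies such as "Sure! Here is the JSON: {...}" fail to parse and come back as None, which is exactly the rejection behavior you want.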

Keep learning

By Cyprian Aarons, AI Consultant at Topiax.
