How to Build a Claims Processing Agent Using AutoGen in Python for Fintech
A claims processing agent in fintech takes incoming claim requests, extracts the facts, checks them against policy rules, requests missing evidence, and drafts a decision for human review or auto-approval. It matters because claims ops is where cost leaks happen: slow handling, inconsistent decisions, and weak audit trails all turn into compliance risk and customer churn.
Architecture
- **Claim intake layer**
  - Accepts structured JSON from API gateways, web forms, or internal case systems.
  - Normalizes fields like claimant ID, transaction ID, amount, timestamp, merchant, and supporting documents.
- **Policy retrieval layer**
  - Pulls product rules from a controlled source such as a database, vector store, or policy service.
  - Keeps the agent grounded in current fintech policy instead of model memory.
- **AutoGen orchestration layer**
  - Uses `AssistantAgent` for reasoning and drafting.
  - Uses `UserProxyAgent` to execute deterministic checks, call internal services, and enforce human approval when needed.
- **Validation and fraud checks**
  - Verifies required fields, duplicate claims, velocity rules, chargeback status, KYC/AML flags, and transaction ownership.
  - Keeps hard business logic outside the LLM.
- **Decision and audit layer**
  - Produces a structured outcome: approve, reject, escalate, or request more info.
  - Logs inputs, tool calls, intermediate reasoning summaries, and the final decision for auditability.
Implementation
1. Install AutoGen and define the claim schema
For production work, keep the claim payload strict. Fintech workflows break when the agent receives free-form text with missing identifiers.
```python
from typing import Optional

from pydantic import BaseModel


class Claim(BaseModel):
    claim_id: str
    customer_id: str
    transaction_id: str
    amount: float
    currency: str
    merchant: str
    reason: str
    evidence_url: Optional[str] = None
```
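With the schema in place, malformed payloads fail fast at intake instead of reaching the agent. A quick sketch, assuming Pydantic v2 (`model_validate`); the class is repeated so the snippet runs standalone, and the sample IDs are illustrative:

```python
from typing import Optional

from pydantic import BaseModel, ValidationError


class Claim(BaseModel):
    claim_id: str
    customer_id: str
    transaction_id: str
    amount: float
    currency: str
    merchant: str
    reason: str
    evidence_url: Optional[str] = None


# A well-formed payload parses into a typed object.
good = Claim.model_validate({
    "claim_id": "CLM-1001",
    "customer_id": "CUS-42",
    "transaction_id": "TXN-9",
    "amount": 120.50,
    "currency": "USD",
    "merchant": "Acme Store",
    "reason": "duplicate charge",
})

# A payload with missing identifiers is rejected before any LLM call.
try:
    Claim.model_validate({"claim_id": "CLM-1002"})
except ValidationError as exc:
    print(f"rejected at intake: {exc.error_count()} field errors")
```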
2. Create an assistant agent and a user proxy agent
The AssistantAgent handles reasoning. The UserProxyAgent is where you run deterministic code and gate external actions. In claims processing, that means the model suggests; your code decides.
```python
import os

from autogen import AssistantAgent, UserProxyAgent

llm_config = {
    "model": "gpt-4o-mini",
    "api_key": os.environ["OPENAI_API_KEY"],
    "temperature": 0,
}

assistant = AssistantAgent(
    name="claims_assistant",
    llm_config=llm_config,
    system_message=(
        "You are a claims processing assistant for fintech. "
        "Classify claims using policy rules. "
        "Never approve without evidence of policy fit. "
        "Return concise decisions with reasons."
    ),
)

user_proxy = UserProxyAgent(
    name="claims_executor",
    human_input_mode="NEVER",
    max_consecutive_auto_reply=3,
    # This agent runs deterministic checks you register; it should never
    # execute arbitrary model-written code.
    code_execution_config=False,
)
```
3. Add deterministic validation and a simple policy check
This is the part most teams get wrong: they let the LLM infer eligibility from prose alone. Don’t do that. Put hard checks in Python and expose them as callable functions.
```python
def validate_claim(claim: dict) -> tuple[bool, str]:
    required = [
        "claim_id", "customer_id", "transaction_id",
        "amount", "currency", "merchant", "reason",
    ]
    missing = [k for k in required if k not in claim or claim[k] in (None, "", [])]
    if missing:
        return False, f"Missing required fields: {', '.join(missing)}"
    if claim["amount"] <= 0:
        return False, "Amount must be greater than zero"
    if claim["currency"] not in {"USD", "EUR", "GBP"}:
        return False, f"Unsupported currency: {claim['currency']}"
    return True, "ok"


def policy_check(claim: dict) -> dict:
    # Example fintech rule set
    if claim["amount"] > 5000:
        return {"decision": "escalate", "reason": "High-value claim requires manual review"}
    if "unauthorized" in claim["reason"].lower():
        return {"decision": "review", "reason": "Potential fraud keyword detected"}
    return {"decision": "approve_candidate", "reason": "Meets baseline policy rules"}
```
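Because these rules live in plain Python, every branch is unit-testable before it ever touches an agent. A minimal harness, repeating `policy_check` so the snippet runs standalone (the sample claims are illustrative):

```python
def policy_check(claim: dict) -> dict:
    # Same example rule set as in step 3, repeated so this snippet is self-contained.
    if claim["amount"] > 5000:
        return {"decision": "escalate", "reason": "High-value claim requires manual review"}
    if "unauthorized" in claim["reason"].lower():
        return {"decision": "review", "reason": "Potential fraud keyword detected"}
    return {"decision": "approve_candidate", "reason": "Meets baseline policy rules"}


# One deterministic, repeatable assertion per branch of the rule set.
assert policy_check({"amount": 9000, "reason": "lost card"})["decision"] == "escalate"
assert policy_check({"amount": 50, "reason": "Unauthorized charge"})["decision"] == "review"
assert policy_check({"amount": 50, "reason": "duplicate charge"})["decision"] == "approve_candidate"
print("policy rules behave as expected")
```

Checks like these run in CI on every rule change, which is exactly what you cannot do when eligibility logic lives only in a prompt.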
4. Orchestrate the conversation and force a structured decision
Use initiate_chat() to start the workflow. The assistant receives validated inputs plus rule outputs and returns a decision draft that your service can persist after final checks.
```python
def process_claim(claim: dict):
    valid, message = validate_claim(claim)
    if not valid:
        return {"status": "rejected", "reason": message}

    rule_result = policy_check(claim)

    prompt = f"""
Claim data:
{claim}

Policy result:
{rule_result}

Task:
Return JSON with keys:
decision (approve|reject|escalate|request_info),
reason,
required_next_action.
Keep it short.
"""

    chat_result = user_proxy.initiate_chat(
        assistant,
        message=prompt,
        max_turns=2,
        clear_history=True,
    )

    return {
        "status": "processed",
        "policy_result": rule_result,
        "agent_response": chat_result.chat_history[-1]["content"],
    }
```
If you want this to run as an API endpoint later, wrap process_claim() inside FastAPI and persist both the raw input and final output to an immutable audit store.
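The audit store can start as simple as an append-only JSONL file before you move to a WORM bucket or append-only table. A minimal sketch: the file name, record fields, and `model_version` default are illustrative assumptions, not part of AutoGen:

```python
import json
import time
from pathlib import Path

AUDIT_PATH = Path("audit_log.jsonl")  # in production: an append-only / WORM store


def audit_record(claim: dict, result: dict, model_version: str = "gpt-4o-mini") -> dict:
    """Capture both the raw input and the final output for traceability."""
    return {
        "logged_at": time.time(),
        "claim": claim,              # raw payload, exactly as received
        "result": result,            # decision draft returned by process_claim()
        "model_version": model_version,
    }


def append_audit(record: dict, path: Path = AUDIT_PATH) -> None:
    # Append-only: never rewrite or truncate existing records.
    with path.open("a", encoding="utf-8") as f:
        f.write(json.dumps(record, default=str) + "\n")


append_audit(audit_record(
    {"claim_id": "CLM-1001", "amount": 120.5},
    {"status": "processed", "policy_result": {"decision": "approve_candidate"}},
))
```

Each claim appends one immutable line, so an auditor can replay the exact input, rule outcome, and model version behind any disposition.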
Production Considerations
- **Audit logging**
  - Store every claim payload version, validation result, policy outcome, model response, and final human decision.
  - For fintech audits, you need traceability from input to disposition.
- **Data residency**
  - Keep PII inside approved regions.
  - If your deployment spans jurisdictions, route EU customer claims to EU-hosted inference endpoints only.
- **Guardrails**
  - Never let the model directly mutate account state or issue payouts.
  - Use allowlisted tools only; all transfers should require deterministic backend approval.
- **Monitoring**
  - Track approval rate by product line, escalation rate, false rejects, latency per claim type, and model drift.
  - Alert on spikes in escalations or repeated "request_info" outcomes; those usually indicate broken upstream data or prompt regressions.
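An escalation-spike alert can be a small rolling-window counter before you reach for a full metrics stack. A sketch; the window size, 30% threshold, and minimum sample count are illustrative assumptions to tune per product line:

```python
from collections import Counter, deque


class EscalationMonitor:
    """Alert when the escalation rate in a rolling window exceeds a threshold."""

    def __init__(self, window: int = 100, threshold: float = 0.30):
        self.window = deque(maxlen=window)
        self.threshold = threshold

    def record(self, decision: str) -> bool:
        """Record one claim outcome; return True if the alert should fire."""
        self.window.append(decision)
        rate = Counter(self.window)["escalate"] / len(self.window)
        # Require a minimum sample before alerting to avoid cold-start noise.
        return len(self.window) >= 10 and rate > self.threshold


monitor = EscalationMonitor(window=50, threshold=0.30)
outcomes = ["approve_candidate"] * 20 + ["escalate"] * 15
alerts = [monitor.record(d) for d in outcomes]
# The run of escalations pushes the window rate past 30% and trips the alert.
```

The same pattern extends to "request_info" spikes or per-claim-type latency; wire the `True` branch to your paging or dashboard system.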
Common Pitfalls
- **Using the LLM as the source of truth**
  - Mistake: asking the model to decide eligibility from raw text alone.
  - Fix: enforce schema validation and hard policy checks in Python before any agent response.
- **Skipping audit-grade logging**
  - Mistake: storing only the final answer.
  - Fix: log inputs, rule outputs, tool calls, timestamps, model version, and reviewer actions in an append-only store.
- **Letting prompts handle compliance**
  - Mistake: putting KYC/AML thresholds or regional restrictions only in system prompts.
  - Fix: encode compliance rules in code and configuration so they are testable and versioned like any other control.
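To make that last fix concrete, here is a sketch of compliance thresholds living in versioned configuration rather than a prompt. The rule names, limits, and the `"XX"` region code are illustrative assumptions:

```python
# Thresholds live in versioned config (here a dict; in practice a reviewed file),
# so changes go through code review and tests like any other control.
COMPLIANCE_CONFIG = {
    "version": "2025-01-15",
    "manual_review_threshold": 5000,            # escalate above this amount
    "allowed_currencies": ["USD", "EUR", "GBP"],
    "restricted_regions": ["XX"],               # illustrative placeholder code
}


def compliance_gate(claim: dict, config: dict = COMPLIANCE_CONFIG) -> tuple[bool, str]:
    """Deterministic compliance checks; the model never interprets thresholds."""
    if claim["currency"] not in config["allowed_currencies"]:
        return False, "currency_not_allowed"
    if claim.get("region") in config["restricted_regions"]:
        return False, "region_restricted"
    if claim["amount"] > config["manual_review_threshold"]:
        return False, "manual_review_required"
    return True, "ok"


assert compliance_gate({"currency": "USD", "amount": 100})[0] is True
assert compliance_gate({"currency": "USD", "amount": 100, "region": "XX"}) == (False, "region_restricted")
```

Because the gate is pure code plus config, a regulator-facing change is a diff with a version bump, not a prompt edit nobody can test.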
Keep learning
- The complete AI Agents Roadmap — my full 8-step breakdown
- Free: The AI Agent Starter Kit — PDF checklist + starter code
- Work with me — I build AI for banks and insurance companies
By Cyprian Aarons, AI Consultant at Topiax.
Want the complete 8-step roadmap?
Grab the free AI Agent Starter Kit — architecture templates, compliance checklists, and a 7-email deep-dive course.
Get the Starter Kit