How to Build a Claims Processing Agent Using AutoGen in Python for Investment Banking
A claims processing agent in investment banking takes incoming claim packets, extracts the relevant facts, checks them against policy and trade records, routes exceptions to the right desk, and produces an auditable decision trail. It matters because claims in this context are not just customer service events; they can trigger financial exposure, regulatory reporting, legal review, and downstream reconciliation across operations, compliance, and finance.
Architecture
- **Ingress layer**
  - Accepts claim documents from email, SFTP, case management systems, or internal APIs.
  - Normalizes payloads into a single claim schema before any LLM call.
- **Triage agent**
  - Classifies the claim type: trade dispute, fee reversal, settlement mismatch, margin issue, or operational error.
  - Decides whether the claim can be auto-processed or needs human review.
- **Evidence extraction agent**
  - Pulls structured fields from PDFs, emails, and ticket notes.
  - Extracts trade IDs, timestamps, counterparties, amounts, account references, and supporting evidence.
- **Policy and compliance agent**
  - Checks the claim against internal policy rules, retention requirements, KYC/AML flags, and escalation thresholds.
  - Enforces data residency constraints before any external model call.
- **Decision agent**
  - Produces an outcome: approve, reject, request more information, or escalate.
  - Generates a concise rationale with citations to source artifacts.
- **Audit logger**
  - Persists every prompt, tool call, model response, and final decision.
  - Stores immutable records for compliance review and post-trade investigations.
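Before wiring in any LLM, it helps to see the layers as a plain pipeline. The sketch below is illustrative only: `normalize_payload`, `log_step`, and the threshold-based triage stand in for the real ingress, audit, and triage agents described above, and none of the names are AutoGen APIs.

```python
from dataclasses import dataclass, field

@dataclass
class ClaimRecord:
    claim_id: str
    payload: dict
    audit_trail: list = field(default_factory=list)

def log_step(record: ClaimRecord, step: str, detail: str) -> None:
    # Audit logger layer: every stage appends an entry to the trail.
    record.audit_trail.append({"step": step, "detail": detail})

def normalize_payload(raw: dict) -> dict:
    # Ingress layer: coerce heterogeneous inputs into one claim schema.
    return {
        "claim_id": raw.get("id") or raw.get("claim_id"),
        "amount": float(raw.get("amount", 0)),
    }

def process(raw: dict) -> ClaimRecord:
    payload = normalize_payload(raw)
    record = ClaimRecord(claim_id=payload["claim_id"], payload=payload)
    log_step(record, "ingress", "normalized")
    # Triage layer: a deterministic placeholder for the LLM triage agent.
    route = "human_review" if payload["amount"] >= 1_000_000 else "auto"
    log_step(record, "triage", route)
    return record
```

The point of the shape, not the stubs: every layer writes to the audit trail before handing off, so the decision trail exists even when a stage fails.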
Implementation
1) Install AutoGen and define the claim schema
For Python projects using AutoGen’s current API surface, start with pyautogen and use AssistantAgent, UserProxyAgent, and GroupChatManager. Keep the claim payload explicit so the model does not invent fields.
```bash
pip install pyautogen pydantic
```

```python
from pydantic import BaseModel
from typing import Literal, Optional

class Claim(BaseModel):
    claim_id: str
    claimant_name: str
    account_id: str
    claim_type: Literal["trade_dispute", "fee_reversal", "settlement_mismatch", "margin_issue", "ops_error"]
    amount: float
    currency: str
    trade_id: Optional[str] = None
    description: str
    jurisdiction: str
    residency_region: str
```
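The payoff of an explicit schema is that malformed input fails loudly before any model call. The snippet below repeats the `Claim` model so it runs standalone, then shows Pydantic rejecting a claim type outside the `Literal` set:

```python
from pydantic import BaseModel, ValidationError
from typing import Literal, Optional

class Claim(BaseModel):
    claim_id: str
    claimant_name: str
    account_id: str
    claim_type: Literal["trade_dispute", "fee_reversal", "settlement_mismatch", "margin_issue", "ops_error"]
    amount: float
    currency: str
    trade_id: Optional[str] = None
    description: str
    jurisdiction: str
    residency_region: str

try:
    # "chargeback" is not a permitted claim_type, so validation fails
    Claim(claim_id="CLM-1", claimant_name="X", account_id="A",
          claim_type="chargeback", amount=10.0, currency="USD",
          description="d", jurisdiction="NY", residency_region="US")
except ValidationError as exc:
    print(f"rejected: {len(exc.errors())} error(s)")
```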
2) Build the AutoGen agents
Use one agent for analysis and one for execution. In investment banking you want a hard boundary between reasoning and action so you can inspect outputs before anything touches downstream systems.
```python
import os
from autogen import AssistantAgent, UserProxyAgent

llm_config = {
    "config_list": [
        {
            "model": "gpt-4o-mini",
            "api_key": os.environ["OPENAI_API_KEY"],
        }
    ],
    "temperature": 0,
}

triage_agent = AssistantAgent(
    name="triage_agent",
    llm_config=llm_config,
    system_message=(
        "You triage investment banking claims. "
        "Return only valid JSON with keys: decision, risk_level, reason, next_action. "
        "Never approve claims missing trade_id when claim_type is trade_dispute."
    ),
)

executor = UserProxyAgent(
    name="executor",
    human_input_mode="NEVER",
    max_consecutive_auto_reply=1,
    code_execution_config=False,  # keep the executor from running model-generated code
)
```
3) Orchestrate the workflow with a group chat
This pattern lets you add compliance or legal agents later without rewriting the control flow. The manager coordinates the conversation while your code keeps ownership of validation and persistence.
```python
from autogen import GroupChat, GroupChatManager

def process_claim(claim: Claim):
    prompt = f"""
Claim:
{claim.model_dump_json(indent=2)}

Rules:
- If residency_region is not 'EU' or 'US', escalate for data residency review.
- If amount >= 1000000 or claim_type == 'trade_dispute', mark as high risk.
- Do not recommend approval if trade_id is missing for trade disputes.
- Output JSON only.
"""
    groupchat = GroupChat(
        agents=[executor, triage_agent],  # include the initiating agent so the manager can route replies
        messages=[],
        max_round=2,
    )
    manager = GroupChatManager(groupchat=groupchat, llm_config=llm_config)
    # initiate_chat returns a ChatResult whose transcript you can inspect and persist
    return executor.initiate_chat(manager, message=prompt)
```
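Even with an "output JSON only" system message, the model can return malformed or off-schema text, so validate whatever comes back before acting on it. A minimal stdlib sketch, independent of AutoGen (`parse_triage_output` and the allowed-decision set are illustrative, not part of any library):

```python
import json

REQUIRED_KEYS = {"decision", "risk_level", "reason", "next_action"}
ALLOWED_DECISIONS = {"approve", "reject", "request_info", "escalate"}

def parse_triage_output(raw: str) -> dict:
    """Parse and validate the triage agent's reply; escalate on any defect."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return {"decision": "escalate", "reason": "Non-JSON model output"}
    if not isinstance(data, dict):
        return {"decision": "escalate", "reason": "Non-object model output"}
    if not REQUIRED_KEYS.issubset(data) or data["decision"] not in ALLOWED_DECISIONS:
        return {"decision": "escalate", "reason": "Schema violation in model output"}
    return data
```

Failing closed to "escalate" means a parsing bug can never silently approve a claim.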
4) Add deterministic guardrails before execution
Do not let the model be the last gate. Use Python validation for policy checks and only send eligible cases to AutoGen for reasoning.
```python
def precheck(claim: Claim):
    if claim.claim_type == "trade_dispute" and not claim.trade_id:
        return {"decision": "escalate", "reason": "Missing trade_id"}
    if claim.amount >= 1_000_000:
        return {"decision": "escalate", "reason": "High-value claim"}
    if claim.residency_region not in {"EU", "US"}:
        return {"decision": "escalate", "reason": "Data residency review required"}
    return {"decision": "review", "reason": "Eligible for LLM triage"}

claim = Claim(
    claim_id="CLM-10021",
    claimant_name="ABC Capital",
    account_id="ACC-77881",
    claim_type="trade_dispute",
    amount=250000.0,
    currency="USD",
    trade_id="TRD-44551",
    description="Mismatch between executed price and confirmed ticket.",
    jurisdiction="NY",
    residency_region="US",  # required by the Claim schema
)
print(precheck(claim))
```
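Putting the two halves together: only claims the deterministic gate marks as eligible should ever reach AutoGen. A dict-based sketch of that routing (`route_claim` and `send_to_llm` are illustrative names; in practice `send_to_llm` would wrap `process_claim`):

```python
def precheck(claim: dict) -> dict:
    # Deterministic policy gate mirroring the rules above.
    if claim["claim_type"] == "trade_dispute" and not claim.get("trade_id"):
        return {"decision": "escalate", "reason": "Missing trade_id"}
    if claim["amount"] >= 1_000_000:
        return {"decision": "escalate", "reason": "High-value claim"}
    if claim["residency_region"] not in {"EU", "US"}:
        return {"decision": "escalate", "reason": "Data residency review required"}
    return {"decision": "review", "reason": "Eligible for LLM triage"}

def route_claim(claim: dict, send_to_llm) -> dict:
    # Claims that fail any deterministic rule never touch a model endpoint.
    gate = precheck(claim)
    if gate["decision"] != "review":
        return gate
    return send_to_llm(claim)
```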
Production Considerations
- **Data residency**
  - Route EU claims to an EU-hosted inference path if your policy requires it.
  - Never send restricted client data to a model endpoint outside approved regions.
- **Auditability**
  - Persist raw input documents, normalized payloads, prompts, responses, tool calls, and final decisions.
  - Use immutable storage with retention aligned to regulatory recordkeeping rules.
- **Guardrails**
  - Require structured output with JSON schema validation before any downstream action.
  - Add deterministic policy checks for thresholds like amount limits, product type exclusions, and missing identifiers.
- **Monitoring**
  - Track the auto-approved rate, escalation rate by desk/product/jurisdiction, hallucination incidents, and average time to resolution.
  - Alert on spikes in "missing evidence" outcomes, because that usually means upstream ingestion broke.
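One way to make the audit trail tamper-evident without special infrastructure is to chain each record to its predecessor with a hash. A minimal sketch under that assumption (class and field names are illustrative; production systems would persist to immutable storage rather than a list):

```python
import hashlib
import json

class AuditLog:
    """Append-only log where each entry embeds the hash of the previous one."""

    def __init__(self):
        self.entries = []
        self._prev_hash = "0" * 64  # genesis value

    def append(self, event: dict) -> str:
        record = {"event": event, "prev_hash": self._prev_hash}
        digest = hashlib.sha256(
            json.dumps(record, sort_keys=True).encode()
        ).hexdigest()
        record["hash"] = digest
        self.entries.append(record)
        self._prev_hash = digest
        return digest

    def verify(self) -> bool:
        # Recompute the chain; editing any entry breaks every later hash.
        prev = "0" * 64
        for rec in self.entries:
            body = {"event": rec["event"], "prev_hash": rec["prev_hash"]}
            expected = hashlib.sha256(
                json.dumps(body, sort_keys=True).encode()
            ).hexdigest()
            if rec["prev_hash"] != prev or rec["hash"] != expected:
                return False
            prev = rec["hash"]
        return True
```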
Common Pitfalls
- **Letting the LLM decide everything**
  - Bad pattern: asking the model to both interpret policy and execute actions.
  - Fix it by splitting deterministic policy checks from LLM reasoning. The model should recommend; your code should enforce.
- **Skipping schema enforcement**
  - Bad pattern: parsing free-form text from the assistant directly into downstream systems.
  - Fix it with strict Pydantic models or JSON schema validation before storage or routing.
- **Ignoring jurisdictional constraints**
  - Bad pattern: sending all claims through one shared model endpoint regardless of client location.
  - Fix it by tagging each claim with residency metadata at ingress and routing based on approved regions only.
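The residency fix in the last pitfall can be as simple as a routing table consulted at ingress. The endpoint URLs and function name below are hypothetical; the design choice that matters is failing closed when a region has no approved endpoint:

```python
# Hypothetical mapping from residency region to an approved model endpoint.
APPROVED_ENDPOINTS = {
    "EU": "https://eu.inference.internal/v1",
    "US": "https://us.inference.internal/v1",
}

def select_endpoint(residency_region: str) -> str:
    # Fail closed: an unknown region gets no endpoint rather than a default one.
    endpoint = APPROVED_ENDPOINTS.get(residency_region)
    if endpoint is None:
        raise ValueError(f"No approved endpoint for region {residency_region!r}")
    return endpoint
```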
Keep learning
- The complete AI Agents Roadmap — my full 8-step breakdown
- Free: The AI Agent Starter Kit — PDF checklist + starter code
- Work with me — I build AI for banks and insurance companies
By Cyprian Aarons, AI Consultant at Topiax.