How to Build a Policy Q&A Agent for Retail Banking Using AutoGen in Python
A policy Q&A agent in retail banking answers staff or customer-facing questions against approved policy documents: card disputes, fee waivers, KYC rules, complaint handling, overdraft limits, and escalation paths. It matters because the wrong answer creates compliance risk, inconsistent customer treatment, and audit gaps. The agent has to be accurate, cite sources, and refuse to guess when policy is missing.
Architecture
- Policy knowledge source
  - Approved PDFs, internal wiki pages, SOPs, and regulatory extracts.
  - Store only sanctioned content; do not let the model browse random web pages.
- Retriever layer
  - Chunk policies and retrieve the top relevant passages for each question.
  - In banking, retrieval quality matters more than model size because policy language is narrow and exact.
- Assistant agent
  - Uses an LLM to answer only from retrieved policy text.
  - Must produce concise answers with citations and escalation guidance when needed.
- Reviewer / guardrail agent
  - Checks whether the answer is grounded in policy and whether it contains risky advice.
  - Blocks unsupported claims, especially around eligibility, exceptions, fees, disputes, AML/KYC, and complaints.
- Orchestrator
  - Routes the user question through retrieval, answer generation, and review.
  - Keeps an audit trail of inputs, retrieved passages, model output, and final response.
- Audit store
  - Persists conversation IDs, source document IDs, timestamps, and final answers.
  - Required for compliance review and incident investigation.
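The audit store can be as simple as an append-only JSONL file per environment. The sketch below records one immutable line per interaction; the file path and field names are illustrative, not from a specific banking standard.

```python
import json
import time
import uuid
from pathlib import Path

AUDIT_LOG = Path("audit/policy_qa.jsonl")  # hypothetical append-only log location

def write_audit_record(question, retrieved_ids, draft, review, final_answer):
    """Append one immutable record per interaction; never rewrite prior lines."""
    AUDIT_LOG.parent.mkdir(parents=True, exist_ok=True)
    record = {
        "conversation_id": str(uuid.uuid4()),
        "timestamp": time.time(),
        "question": question,
        "retrieved_doc_ids": retrieved_ids,
        "draft_answer": draft,
        "review": review,
        "final_answer": final_answer,
    }
    with AUDIT_LOG.open("a", encoding="utf-8") as f:
        f.write(json.dumps(record) + "\n")
    return record["conversation_id"]
```

In production you would point this at an append-only object store or WORM-compliant log service rather than a local file, but the record shape stays the same.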
Implementation
1) Install AutoGen and prepare your policy docs
Use AutoGen’s agent APIs directly. For a retail banking Q&A bot, I prefer a small set of curated documents over a large noisy corpus.
```shell
pip install pyautogen
```

```python
from pathlib import Path

policy_docs = [
    Path("policies/card_disputes.txt").read_text(),
    Path("policies/fee_waivers.txt").read_text(),
    Path("policies/complaints_handling.txt").read_text(),
]
```
2) Build a retrieval function over approved policy text
This example uses simple keyword scoring to keep the pattern readable. In production you would swap this for a vector index or enterprise search backend.
```python
import re
from collections import Counter

def chunk_text(text: str, chunk_size: int = 900):
    """Split a document into fixed-size character chunks."""
    return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]

chunks = []
for doc_id, doc in enumerate(policy_docs):
    for idx, chunk in enumerate(chunk_text(doc)):
        chunks.append({"doc_id": doc_id, "chunk_id": idx, "text": chunk})

def retrieve_policy(question: str, top_k: int = 3):
    """Score chunks by keyword overlap and return the top_k matches."""
    q_terms = Counter(re.findall(r"\w+", question.lower()))
    scored = []
    for item in chunks:
        text_terms = Counter(re.findall(r"\w+", item["text"].lower()))
        score = sum(q_terms[t] * text_terms[t] for t in q_terms)
        if score > 0:
            scored.append((score, item))
    scored.sort(key=lambda x: x[0], reverse=True)
    return [item for _, item in scored[:top_k]]
```
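When you do swap in a vector index, the retrieval contract stays the same: embed, score by similarity, take the top results. As a dependency-free stand-in for an embedding backend, here is a minimal TF-IDF cosine-similarity retriever; the function names are illustrative, and a real deployment would use an embedding model plus a vector store instead.

```python
import math
import re
from collections import Counter

def tfidf_vectors(texts):
    """Build simple TF-IDF weight dicts for a small corpus (stand-in for embeddings)."""
    tokenized = [re.findall(r"\w+", t.lower()) for t in texts]
    df = Counter()
    for toks in tokenized:
        df.update(set(toks))
    n = len(texts)
    vectors = []
    for toks in tokenized:
        tf = Counter(toks)
        vectors.append({t: tf[t] * math.log((n + 1) / (df[t] + 1)) for t in tf})
    return vectors

def cosine(a, b):
    """Cosine similarity between two sparse weight dicts."""
    dot = sum(a[t] * b.get(t, 0.0) for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve_cosine(question, texts, top_k=3):
    """Return indices of the top_k texts ranked by cosine similarity to the question."""
    vectors = tfidf_vectors(texts)
    q_vec = tfidf_vectors(texts + [question])[-1]  # weight query with corpus stats
    order = sorted(range(len(texts)), key=lambda i: -cosine(q_vec, vectors[i]))
    return order[:top_k]
```

The point of the sketch is the interface, not the scoring: whatever backend you choose, it should take a question and return ranked chunk references that the rest of the pipeline treats identically.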
3) Create AutoGen agents with explicit roles
AssistantAgent generates the answer. UserProxyAgent can be used as the orchestrator entry point for execution control. The important part is that the assistant receives only retrieved policy context.
```python
import os

from autogen import AssistantAgent, UserProxyAgent

llm_config = {
    "model": "gpt-4o-mini",
    "api_key": os.environ["OPENAI_API_KEY"],
}

answer_agent = AssistantAgent(
    name="policy_answer_agent",
    llm_config=llm_config,
    system_message=(
        "You are a retail banking policy assistant. "
        "Answer only from provided policy excerpts. "
        "If the excerpts do not contain enough information, say so clearly. "
        "Always mention relevant escalation or compliance caveats."
    ),
)

review_agent = AssistantAgent(
    name="policy_review_agent",
    llm_config=llm_config,
    system_message=(
        "You review banking policy answers for grounding and risk. "
        "Reject unsupported claims. "
        "Check for compliance risk, missing citations, or advice that should be escalated."
    ),
)

orchestrator = UserProxyAgent(
    name="policy_orchestrator",
    human_input_mode="NEVER",
    code_execution_config=False,  # Q&A routing only; no code execution needed
)
```
4) Run the retrieval → answer → review loop
This is the core pattern. The final response should be based on reviewed output only.
```python
def build_context(question: str) -> str:
    """Format retrieved passages as labeled excerpts for the prompt."""
    passages = retrieve_policy(question)
    if not passages:
        return "No relevant approved policy passages found."
    lines = ["Approved policy excerpts:"]
    for p in passages:
        lines.append(f"\n[doc={p['doc_id']} chunk={p['chunk_id']}]\n{p['text']}")
    return "\n".join(lines)

def ask_policy_agent(question: str):
    context = build_context(question)
    answer_prompt = f"""
Question:
{question}

{context}

Write a direct answer using only these excerpts.
Include:
- short answer
- any exceptions
- escalation path if applicable
- source markers like [doc=0 chunk=1]
"""
    draft = answer_agent.generate_reply(
        messages=[{"role": "user", "content": answer_prompt}]
    )
    review_prompt = f"""
Question:
{question}

Draft answer:
{draft}

Approve only if grounded in the provided excerpts and safe for retail banking use.
If not approved, explain what must be removed or corrected.
"""
    review = review_agent.generate_reply(
        messages=[{"role": "user", "content": review_prompt}]
    )
    return {"question": question, "draft_answer": draft, "review": review}

result = ask_policy_agent("Can we waive a late fee for a first-time cardholder complaint?")
print(result["draft_answer"])
print(result["review"])
```
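One detail worth making explicit: the reviewer's verdict should gate what the caller actually sees. A minimal sketch, assuming you extend the review prompt so that an approved verdict starts with the literal token APPROVED (that convention is an assumption, not part of the prompts above):

```python
REFUSAL = (
    "I can't confirm this against approved policy. "
    "Please escalate to your team lead or the policy desk."
)

def finalize_response(result):
    """Return the draft only when the reviewer explicitly approves it.

    Assumes the review agent is prompted to begin an approved verdict with
    the token 'APPROVED'; adjust the check to match your reviewer prompt.
    """
    review_text = result["review"]
    if isinstance(review_text, dict):  # generate_reply may return a message dict
        review_text = review_text.get("content", "")
    if review_text.strip().upper().startswith("APPROVED"):
        return result["draft_answer"]
    return REFUSAL
```

Failing closed like this means a flaky reviewer produces refusals, not unreviewed answers, which is the right failure mode in a bank.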
Production Considerations
- Data residency
  - Keep prompts, retrieved snippets, logs, and embeddings in-region where your bank operates.
  - If you serve multiple jurisdictions, partition storage by country or legal entity.
- Compliance controls
  - Add deterministic refusal rules for topics like AML alerts, sanctions screening overrides, account opening exceptions, and regulatory interpretations.
  - Require citations from approved sources before returning an answer.
- Auditability
  - Log question text, retrieved chunks with document IDs/version numbers, model output, reviewer output, and final response.
  - Make logs immutable or append-only so compliance teams can reconstruct decisions later.
- Monitoring
  - Track grounded-answer rate, refusal rate, escalation rate, and hallucination reports from frontline staff.
  - Alert on spikes in "no relevant excerpt found" responses; that usually means your policy corpus is stale or incomplete.
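Deterministic refusal rules belong in front of the LLM, so blocked topics never reach a model at all. A sketch with hypothetical patterns; your compliance team owns the real list:

```python
import re

# Hypothetical blocked-topic patterns; tune these with your compliance team.
BLOCKED_PATTERNS = {
    "aml_alert": re.compile(r"\b(aml|money laundering|suspicious activity)\b", re.I),
    "sanctions_override": re.compile(r"\bsanctions?\b.*\b(override|bypass|skip)\b", re.I),
    "regulatory_interpretation": re.compile(r"\b(interpret|legal opinion)\b.*\bregulation", re.I),
}

def deterministic_refusal(question: str):
    """Return a refusal message before any LLM call, or None if the question may proceed."""
    for topic, pattern in BLOCKED_PATTERNS.items():
        if pattern.search(question):
            return f"Refused: '{topic}' questions must go to compliance, not the Q&A agent."
    return None
```

Run this check first in the request path; only questions that return `None` continue to retrieval and the agents.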
Common Pitfalls
- Letting the model answer without retrieval
  - This turns the bot into a generic chatbot with banking vocabulary.
  - Fix it by hard-wiring retrieval into every request path and rejecting empty-context answers.
- Using broad prompts instead of bank-specific guardrails
  - "Be helpful" is not enough when fee waivers or complaint handling are involved.
  - Fix it with explicit instructions: cite sources, refuse unsupported claims, escalate when policy is ambiguous.
- Skipping document versioning
  - Retail banking policies change often after product updates or regulatory changes.
  - Fix it by storing `document_version`, `effective_date`, and `jurisdiction` with every retrieved passage.
- Ignoring operational boundaries
  - A UK complaint rule should not bleed into a US servicing workflow.
  - Fix it by routing questions through jurisdiction-aware indexes and restricting responses to the correct region.
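The last two fixes combine naturally: carry version and jurisdiction metadata on every chunk, and filter by jurisdiction before retrieval ever runs. A sketch with illustrative field names:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class PolicyChunk:
    """Chunk metadata needed for versioning and jurisdiction routing (illustrative)."""
    doc_id: int
    chunk_id: int
    text: str
    document_version: str
    effective_date: str   # ISO date the policy took effect
    jurisdiction: str     # e.g. "UK", "US"

def filter_by_jurisdiction(chunks, jurisdiction):
    """Pass only same-jurisdiction chunks to retrieval so rules never bleed across regions."""
    return [c for c in chunks if c.jurisdiction == jurisdiction]
```

Because the filter runs before scoring, a UK question can never retrieve a US passage, and the version and effective-date fields let auditors reconstruct exactly which policy text produced an answer.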
Keep learning
- The complete AI Agents Roadmap — my full 8-step breakdown
- Free: The AI Agent Starter Kit — PDF checklist + starter code
- Work with me — I build AI for banks and insurance companies
By Cyprian Aarons, AI Consultant at Topiax.