How to Build a Policy Q&A Agent Using AutoGen in Python for Investment Banking

By Cyprian Aarons · Updated 2026-04-21
Tags: policy-qa, autogen, python, investment-banking

A policy Q&A agent for investment banking answers questions about internal policies, procedures, and controls without forcing bankers to dig through SharePoint, PDFs, and compliance portals. It matters because the wrong answer can create regulatory exposure, breach data handling rules, or slow down deal execution when people need a fast, auditable response.

Architecture

  • User interface layer

    • Chat UI in Slack, Teams, or a web app.
    • Sends the user question plus metadata like business unit, region, and request timestamp.
  • Policy retrieval layer

    • Pulls from approved sources only: policy PDFs, control manuals, SOPs, and compliance FAQs.
    • Uses a retrieval index or search tool with document versioning.
  • AutoGen agent layer

    • A primary AssistantAgent that answers using retrieved context.
    • A UserProxyAgent that executes tool calls and controls the interaction loop.
  • Guardrail and validation layer

    • Checks for disallowed content such as legal advice, M&A deal specifics, or MNPI handling instructions outside policy scope.
    • Forces citations to source documents and rejects unsupported claims.
  • Audit logging layer

    • Stores question, retrieved documents, final answer, model version, and timestamp.
    • Needed for compliance review and post-incident reconstruction.
  • Deployment boundary

    • Runs in a region approved for bank data residency.
    • Keeps prompts, embeddings, and logs inside the controlled environment.
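The metadata the UI layer sends can be made concrete with a small request envelope. The field names below are illustrative assumptions for this article's architecture, not part of AutoGen:

```python
from dataclasses import dataclass
from datetime import datetime, timezone

# Hypothetical request envelope for the UI layer; field names are
# illustrative, not an AutoGen type.
@dataclass(frozen=True)
class PolicyQuestion:
    question: str
    business_unit: str  # e.g. "ECM" or "M&A Advisory"
    region: str         # drives corpus partitioning and data residency
    asked_at: str       # ISO 8601 timestamp for the audit trail

def new_question(question: str, business_unit: str, region: str) -> PolicyQuestion:
    return PolicyQuestion(
        question=question,
        business_unit=business_unit,
        region=region,
        asked_at=datetime.now(timezone.utc).isoformat(),
    )
```

Carrying business unit and region on every request is what later lets the retrieval layer partition corpora by jurisdiction.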

Implementation

  1. Install AutoGen and set up your model client

This example uses the classic AutoGen API (the pyautogen package) with AssistantAgent and UserProxyAgent. For a production bank workflow, keep the model config explicit so you can pin model versions and route traffic through approved endpoints.

import os
from autogen import AssistantAgent, UserProxyAgent

llm_config = {
    "config_list": [
        {
            "model": "gpt-4o-mini",
            "api_key": os.environ["OPENAI_API_KEY"],
        }
    ],
    "temperature": 0,
}

policy_agent = AssistantAgent(
    name="policy_agent",
    llm_config=llm_config,
    system_message=(
        "You answer investment banking policy questions using only the provided context. "
        "If the answer is not in context, say you do not know. "
        "Always cite policy section names when available."
    ),
)

user_proxy = UserProxyAgent(
    name="user_proxy",
    human_input_mode="NEVER",
    max_consecutive_auto_reply=2,
    code_execution_config=False,  # Q&A only: never execute model-generated code
)
  2. Add a retrieval function for approved policy text

In banking, do not let the agent freewheel across arbitrary filesystems or public web search. Keep retrieval constrained to an approved corpus with document IDs and versions so every answer can be traced back to source material.

POLICY_CORPUS = {
    "KYC_001_v4": """
    Know Your Customer checks must be completed before account opening.
    Enhanced due diligence is required for high-risk jurisdictions.
    """,
    "MNPI_014_v2": """
    Material non-public information must not be discussed in unsecured channels.
    If MNPI is received accidentally, escalate to Compliance immediately.
    """,
}

def retrieve_policy_context(question: str) -> str:
    q = question.lower()
    hits = []
    if any(term in q for term in ["kyc", "onboarding", "account opening"]):
        hits.append("KYC_001_v4")
    if any(term in q for term in ["mnpi", "inside information", "material non-public"]):
        hits.append("MNPI_014_v2")

    if not hits:
        return ""

    return "\n\n".join(f"[{doc_id}]\n{POLICY_CORPUS[doc_id].strip()}" for doc_id in hits)
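Because every answer traces back to versioned IDs like KYC_001_v4, it can also help to parse those IDs explicitly so the index can enforce "latest approved version only." This helper is an illustrative sketch, not part of the retrieval function above; the ID format it assumes is the one used in this article:

```python
import re

# Assumed ID convention from this article: BASE_NNN_vN, e.g. "KYC_001_v4".
DOC_ID_RE = re.compile(r"^(?P<base>[A-Z]+_\d+)_v(?P<version>\d+)$")

def parse_doc_id(doc_id: str) -> tuple[str, int]:
    """Split a versioned document ID into (base_id, version)."""
    m = DOC_ID_RE.match(doc_id)
    if m is None:
        raise ValueError(f"Unrecognized document ID: {doc_id}")
    return m.group("base"), int(m.group("version"))

def latest_versions(doc_ids: list[str]) -> dict[str, int]:
    """Return the highest version seen for each base document ID."""
    latest: dict[str, int] = {}
    for doc_id in doc_ids:
        base, version = parse_doc_id(doc_id)
        latest[base] = max(latest.get(base, 0), version)
    return latest
```

At index-rebuild time, anything below the latest approved version can be dropped so stale policy text never reaches the agent.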
  3. Wire the AutoGen conversation with context injection

The pattern here is simple: retrieve context first, then pass it into the agent as grounded evidence. The assistant should refuse to answer when no relevant policy is found.

def ask_policy_question(question: str) -> str:
    context = retrieve_policy_context(question)

    if not context:
        return (
            "I could not find an approved policy source for this question. "
            "Please check the policy library or escalate to Compliance."
        )

    prompt = f"""
Question:
{question}

Approved policy context:
{context}

Instructions:
- Answer only from the approved policy context.
- Include document IDs in your response.
- If there is ambiguity or missing detail, say so explicitly.
"""

    result = user_proxy.initiate_chat(
        policy_agent,
        message=prompt,
        clear_history=True,
        silent=True,
    )

    return result.summary if hasattr(result, "summary") else str(result)


if __name__ == "__main__":
    print(ask_policy_question("What are the rules for MNPI in chat tools?"))
  4. Add a lightweight validation gate before returning answers

For investment banking use cases, I like a deterministic post-check before anything reaches end users. This keeps the model honest on citations and blocks unsupported claims that could become audit issues later.

def validate_answer(answer: str) -> bool:
    # Require at least one citation to a known, versioned document ID.
    required_markers = ["[MNPI_014_v2]", "[KYC_001_v4]"]
    has_citation = any(marker in answer for marker in required_markers)
    # Reject hedging language that signals an unsupported claim.
    no_hallucination_language = all(
        phrase not in answer.lower()
        for phrase in ["i think", "probably", "best practice is"]
    )
    return has_citation and no_hallucination_language
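One way to wire the gate so an unvalidated answer can never reach the user is a small wrapper that takes the answer function and validator as callables. This is a hedged sketch: the function name and escalation wording are illustrative, and in this article's setup the callables would be ask_policy_question and validate_answer:

```python
from typing import Callable

ESCALATION_MESSAGE = (
    "This answer failed citation checks. "
    "Please consult the policy library or escalate to Compliance."
)

def answer_with_gate(
    question: str,
    answer_fn: Callable[[str], str],
    validator: Callable[[str], bool],
) -> str:
    """Run the answer pipeline, then the deterministic validator.

    An answer is only returned if it passes validation; otherwise the
    user gets a fixed escalation message instead of an unchecked claim.
    """
    answer = answer_fn(question)
    return answer if validator(answer) else ESCALATION_MESSAGE
```

Keeping the gate outside the agent means a model regression can degrade answer quality but cannot bypass the citation requirement.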

Production Considerations

  • Data residency

    • Keep embeddings, logs, prompts, and vector stores inside the bank-approved region.
    • If legal entities span jurisdictions, partition corpora by region rather than mixing them.
  • Auditability

    • Log question, retrieved document IDs, prompt hash, model name, response hash, and user identity.
    • Store immutable records so Compliance can reconstruct exactly what was answered.
  • Guardrails

    • Block requests that ask for trading advice, legal interpretation beyond policy text, or guidance on hiding information flows.
    • Force escalation paths when the agent sees ambiguous regulatory language or missing source coverage.
  • Operational monitoring

    • Track retrieval hit rate, refusal rate, citation coverage, and escalation volume.
    • Spikes in “I don’t know” responses usually mean stale policies or broken indexing; spikes in confident answers with weak citations mean your guardrails are failing.
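The audit fields above can be sketched as an append-only JSON-line record with content hashes. The record shape and field names here are illustrative assumptions following the logging list, not a bank standard:

```python
import hashlib
import json
from dataclasses import asdict, dataclass

def sha256_hex(text: str) -> str:
    """Content hash for prompts and responses, stored instead of raw text where policy requires."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

# Hypothetical audit record; fields mirror the logging list above.
@dataclass(frozen=True)
class AuditRecord:
    user_id: str
    question: str
    retrieved_doc_ids: tuple
    prompt_hash: str
    response_hash: str
    model_name: str
    timestamp: str

def build_audit_record(user_id, question, doc_ids, prompt, response, model_name, timestamp) -> str:
    record = AuditRecord(
        user_id=user_id,
        question=question,
        retrieved_doc_ids=tuple(doc_ids),
        prompt_hash=sha256_hex(prompt),
        response_hash=sha256_hex(response),
        model_name=model_name,
        timestamp=timestamp,
    )
    # One JSON line per answer, suitable for append-only, immutable storage.
    return json.dumps(asdict(record), sort_keys=True)
```

Hashing the prompt and response lets Compliance verify that a stored answer is the one that was served, without the log itself becoming a second copy of every policy excerpt.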

Common Pitfalls

  1. Letting the agent answer without grounded context

    • This is how hallucinations enter compliance workflows.
    • Fix it by requiring retrieved document IDs before every response and refusing when retrieval returns nothing relevant.
  2. Mixing policies across business lines or regions

    • A London banking desk may have different handling rules than a U.S. coverage team.
    • Fix it by tagging documents with jurisdiction and business unit metadata before retrieval.
  3. Skipping version control on policy sources

    • If someone updates an SOP but your index still serves v3 while Compliance expects v4, your answers become unreliable fast.
    • Fix it by storing versioned document IDs like KYC_001_v4 and rebuilding indexes on every approved policy release.
  4. Treating logs as harmless telemetry

    • Policy Q&A often includes sensitive operational details even when users think they are asking something simple.
    • Fix it by classifying logs as controlled records with access controls aligned to bank retention and privacy rules.
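The jurisdiction-tagging fix from pitfall 2 can be sketched as metadata filtering applied before retrieval. The corpus entries, metadata keys, and policy text below are invented for illustration:

```python
# Hypothetical jurisdiction-tagged corpus; metadata keys are illustrative.
TAGGED_CORPUS = {
    "KYC_001_v4": {
        "region": "US",
        "business_unit": "coverage",
        "text": "KYC checks must be completed before account opening.",
    },
    "KYC_009_v1": {
        "region": "UK",
        "business_unit": "banking",
        "text": "UK desks apply locally approved onboarding checks.",
    },
}

def retrieve_for_region(doc_ids: list[str], region: str) -> list[str]:
    # Filter before retrieval so a London desk never sees US-only policy text.
    return [d for d in doc_ids if TAGGED_CORPUS[d]["region"] == region]
```

Filtering on metadata before the similarity search (rather than after) keeps out-of-jurisdiction text from ever entering the prompt.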

Keep learning

By Cyprian Aarons, AI Consultant at Topiax.
