How to Build a Policy Q&A Agent Using CrewAI in Python for Banking

By Cyprian Aarons · Updated 2026-04-21

A policy Q&A agent for banking answers employee and customer-facing questions from approved policy documents, procedures, and regulatory guidance. The point is not just convenience; it is consistency, auditability, and reducing the risk of staff improvising on compliance-sensitive answers.

Architecture

  • Policy corpus loader

    • Ingests PDFs, DOCX, HTML, and internal knowledge base exports.
    • Normalizes content into chunks with source metadata: document name, version, effective date, jurisdiction.
  • Retrieval layer

    • Pulls only relevant policy passages for a question.
    • Should support vector search plus keyword fallback for exact terms like “AML”, “KYC”, “CDD”, or “wire transfer limits”.
  • CrewAI agent

    • Uses a constrained system prompt.
    • Answers only from retrieved policy context and refuses unsupported claims.
  • Compliance guardrails

    • Blocks PII leakage, disallowed advice, and hallucinated policy references.
    • Enforces escalation when the answer depends on legal interpretation or missing context.
  • Audit logging

    • Stores question, retrieved sources, final answer, model version, timestamps, and reviewer actions.
    • Required for internal controls and regulator review.
  • Human escalation path

    • Routes ambiguous or high-risk questions to compliance or operations.
    • Useful for sanctions, complaints handling, lending decisions, and account closure rules.
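The loader and retrieval bullets above hinge on two details: every chunk carries its source metadata, and exact terms like "AML" get a keyword fallback because embeddings can blur short acronyms. A minimal sketch of both, with illustrative field names (not a CrewAI API):

```python
from dataclasses import dataclass

@dataclass
class PolicyChunk:
    text: str
    document: str        # e.g. "Retail Banking Policy v4.2"
    version: str
    effective_date: str  # ISO date the policy took effect
    jurisdiction: str

def keyword_fallback(chunks, query):
    """Exact-term fallback: match query terms verbatim, case-insensitively."""
    terms = {t.upper() for t in query.split()}
    return [
        c for c in chunks
        if terms & {w.upper().strip(".,") for w in c.text.split()}
    ]

chunks = [
    PolicyChunk("AML escalation requires a SAR review.",
                "AML Procedures", "3.1", "2025-01-01", "US"),
    PolicyChunk("Branch opening hours are 9 to 5.",
                "Branch Ops", "1.0", "2024-06-01", "US"),
]
hits = keyword_fallback(chunks, "AML")
```

In production the fallback would run against your search index, not an in-memory list, and merge with vector results; the point is that both paths return chunks that still carry document, version, and effective date.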

Implementation

1) Install CrewAI and define your policy tools

CrewAI’s Agent works well when you pair it with explicit tools. For banking, keep the tool surface narrow: one retrieval tool and one escalation tool are usually enough.

from crewai import Agent, Task, Crew
from crewai.tools import BaseTool
from pydantic import BaseModel, Field

class PolicySearchInput(BaseModel):
    query: str = Field(..., description="User question about bank policy")

class PolicySearchTool(BaseTool):
    name: str = "policy_search"
    description: str = "Search approved banking policy documents for relevant passages"
    args_schema: type[BaseModel] = PolicySearchInput

    def _run(self, query: str) -> str:
        # Replace with your vector DB / search service call
        results = [
            {
                "source": "Retail Banking Policy v4.2",
                "section": "Cash Withdrawal Limits",
                "text": "Branch cash withdrawals above $10,000 require enhanced verification."
            },
            {
                "source": "AML Procedures v3.1",
                "section": "Suspicious Activity Escalation",
                "text": "Any suspected structuring must be escalated to Financial Crime within 1 business day."
            }
        ]
        return "\n".join([f"{r['source']} | {r['section']} | {r['text']}" for r in results])

class EscalateInput(BaseModel):
    reason: str = Field(..., description="Why the question needs human review")

class EscalateTool(BaseTool):
    name: str = "escalate_case"
    description: str = "Create a compliance escalation ticket"
    args_schema: type[BaseModel] = EscalateInput

    def _run(self, reason: str) -> str:
        return f"Escalation created: {reason}"

2) Create a banking-specific agent prompt

The prompt matters more than people think. In banking, you want the model to answer conservatively and cite sources from the retrieved context instead of inventing policy language.

policy_agent = Agent(
    role="Bank Policy Q&A Specialist",
    goal="Answer banking policy questions using only approved internal sources.",
    backstory=(
        "You support bank employees with policy questions. "
        "You never guess. If the source material is insufficient or the question "
        "requires legal/compliance judgment, you escalate."
    ),
    tools=[PolicySearchTool(), EscalateTool()],
    verbose=True,
    allow_delegation=False,
)

3) Define a task that forces grounded answers

Use a task that requires citations in plain text. Keep the output format strict so downstream systems can validate it before display.

question = """
Can a branch manager approve a cash withdrawal above $10,000 without extra verification?
"""

task = Task(
    description=(
        f"Answer this banking policy question using only retrieved policy content:\n\n{question}\n\n"
        "Requirements:\n"
        "- State the answer clearly.\n"
        "- Cite the source document names and sections.\n"
        "- If the answer is not fully supported by policy text, say 'Escalate to Compliance'.\n"
        "- Do not mention anything not present in the provided context."
    ),
    expected_output=(
        "A concise answer with citations in the format: Document | Section | Summary."
    ),
    agent=policy_agent,
)
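Because the task demands citations in a fixed `Document | Section | Summary` format, downstream systems can validate the answer before display. A hedged sketch of such a validator, assuming that exact format (the regex and function name are illustrative):

```python
import re

# Matches one "Document | Section | Summary" line anywhere in the answer.
CITATION_RE = re.compile(r"^.+ \| .+ \| .+$", re.MULTILINE)

def has_valid_citation(answer: str) -> bool:
    # Accept either a properly cited answer or an explicit escalation.
    return bool(CITATION_RE.search(answer)) or "Escalate to Compliance" in answer

ok = has_valid_citation(
    "Withdrawals above $10,000 need enhanced verification.\n"
    "Retail Banking Policy v4.2 | Cash Withdrawal Limits | Enhanced verification required."
)
```

Answers that fail the check can be suppressed and routed to escalation instead of shown to the employee.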

4) Run the crew and capture outputs for audit

For production banking workflows, log both the input question and the exact response. You should also persist retrieved snippets separately so auditors can reconstruct why the agent answered the way it did.

crew = Crew(
    agents=[policy_agent],
    tasks=[task],
    verbose=True,
)

result = crew.kickoff()

print(result)

A practical pattern is to wrap kickoff() in your own service layer:

  • store request ID
  • store user identity and role
  • store retrieved sources
  • store final answer
  • store escalation outcome if triggered

That gives you traceability when someone asks why an employee received a specific policy interpretation.
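One way to sketch that service layer is a single audit record per request, serialized with a content hash for tamper evidence. Field names here are assumptions for illustration, not a CrewAI API:

```python
import json
import hashlib
from dataclasses import dataclass, asdict
from datetime import datetime, timezone

@dataclass
class AuditRecord:
    request_id: str
    user_id: str
    user_role: str
    question: str
    retrieved_sources: list
    final_answer: str
    model_version: str
    escalated: bool
    timestamp: str

def audit_line(record: AuditRecord) -> str:
    # Hash the canonical payload so later tampering is detectable.
    payload = json.dumps(asdict(record), sort_keys=True)
    digest = hashlib.sha256(payload.encode()).hexdigest()
    return json.dumps({"record": asdict(record), "sha256": digest})

rec = AuditRecord(
    "req-001", "emp-42", "teller",
    "Limit above $10k?",
    ["Retail Banking Policy v4.2 | Cash Withdrawal Limits"],
    "Enhanced verification required.",
    "model-2026-04", False,
    datetime.now(timezone.utc).isoformat(),
)
line = audit_line(rec)
```

Each line can go to append-only storage; replaying the stored question, sources, and model version is what makes an answer reproducible for an auditor.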

Production Considerations

  • Deployment

    • Host retrieval services in-region if your bank has data residency requirements.
    • Keep model endpoints and vector stores inside approved cloud regions or on-prem boundaries.
    • Separate dev/test/prod corpora so draft policies never contaminate live answers.
  • Monitoring

    • Track refusal rate, escalation rate, retrieval hit rate, and unanswered questions.
    • Alert on answers without citations or answers generated from too little context.
    • Log model version changes because response behavior will shift after upgrades.
  • Guardrails

    • Block PII in prompts unless there is a legitimate business need and approved handling path.
    • Add deny rules for account-specific advice, sanctions screening outcomes, credit decisions, and legal interpretation.
    • Require human review for high-risk topics like fraud investigations or regulatory complaints.
  • Auditability

    • Preserve original documents with version hashes.
    • Record which chunks were retrieved for each response.
    • Make every answer reproducible from stored inputs and outputs.
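The version-hash idea above is small enough to show directly: fingerprint each approved document's bytes at ingestion time, store the digest alongside the index, and re-check it on every re-ingest. A minimal sketch using the standard library:

```python
import hashlib

def document_fingerprint(raw_bytes: bytes) -> str:
    # SHA-256 of the exact bytes that were indexed.
    return hashlib.sha256(raw_bytes).hexdigest()

policy_v42 = b"Branch cash withdrawals above $10,000 require enhanced verification."
fp = document_fingerprint(policy_v42)

# Identical bytes reproduce the fingerprint; any edit changes it,
# so a drafted or silently amended policy cannot masquerade as v4.2.
same = document_fingerprint(policy_v42) == fp
changed = document_fingerprint(policy_v42 + b" (draft)") != fp
```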

Common Pitfalls

  • Using broad retrieval over uncontrolled documents

    • If your search index includes old memos or draft policies, the agent will quote stale guidance.
    • Fix it by indexing only approved sources with versioning and effective dates.
  • Letting the agent answer without citations

    • This turns a Q&A assistant into a guess engine.
    • Fix it by enforcing structured output that includes document names and sections every time.
  • Ignoring escalation thresholds

    • Not every question should be answered automatically.
    • Fix it by routing ambiguous requests to compliance when confidence is low or when topics touch AML/KYC, sanctions, lending exceptions, complaints handling, or legal interpretation.
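The escalation-threshold pitfall can be made concrete with a small router: escalate when the question touches a high-risk topic or retrieval confidence falls below a threshold. The topic list, threshold, and naive substring matching are all illustrative assumptions; a real deployment would use a classifier or curated taxonomy:

```python
HIGH_RISK_TOPICS = {"aml", "kyc", "sanctions", "lending exception",
                    "complaint", "legal"}

def route(question: str, retrieval_score: float, threshold: float = 0.6) -> str:
    """Return 'escalate' or 'auto_answer'. Substring matching is deliberately
    crude here; e.g. 'aml' would also fire inside unrelated words."""
    q = question.lower()
    if any(topic in q for topic in HIGH_RISK_TOPICS) or retrieval_score < threshold:
        return "escalate"
    return "auto_answer"

r1 = route("Is this transfer a sanctions breach?", 0.9)
r2 = route("What are branch opening hours?", 0.85)
```

Tuning the threshold against your refusal-rate and escalation-rate metrics closes the loop with the monitoring section above.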

A solid banking policy Q&A agent is less about clever prompting and more about control points. CrewAI gives you the orchestration layer; your job is to make sure every answer is grounded, logged, reviewable, and safe enough to sit inside a regulated workflow.


By Cyprian Aarons, AI Consultant at Topiax.