How to Build a Policy Q&A Agent for Payments Using LangGraph in Python
A policy Q&A agent for payments answers questions like “Can we refund this transaction?” or “Is this merchant category allowed under our policy?” It matters because payment teams need fast, consistent answers that are grounded in policy, not guesswork, and every answer needs an audit trail.
Architecture
- Policy source loader
  - Pulls policy documents from a controlled source: database, object storage, or a versioned document store.
  - Keeps policy text and metadata like version, jurisdiction, and effective date.
- Retriever
  - Finds the most relevant policy chunks for a user question.
  - In payments, this should filter by product line, region, and policy scope before retrieval.
- LLM answer node
  - Generates a concise answer using only retrieved policy context.
  - Should refuse when the evidence is missing or ambiguous.
- Audit logger
  - Records the question, retrieved passages, model answer, policy version, and decision path.
  - This is non-negotiable for compliance reviews and incident investigations.
- Guardrail / validation node
  - Checks for prohibited outputs like unsupported approvals, missing citations, or leakage of sensitive data.
  - Can block or downgrade responses before they reach the user (a minimal denylist sketch follows this list).
- LangGraph workflow
  - Orchestrates the retrieval → reasoning → validation flow.
  - Gives you deterministic control over branching and state updates.
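The guardrail node is the component you will customize most, so here is a minimal, standalone sketch of a denylist-style check before we get to the graph itself. The phrases and the function name are illustrative assumptions, not a fixed spec; swap in your own prohibited-output rules.

```python
# Minimal guardrail sketch: flag draft answers that assert approvals or touch
# prohibited topics without policy support. Phrases here are illustrative only.
DENYLIST_PHRASES = [
    "bypass aml",               # anti-money-laundering controls are never optional
    "skip kyc",
    "approve without review",
]

def violates_guardrails(answer: str, cited_docs: list) -> bool:
    """Return True when a draft answer should be blocked or downgraded."""
    text = answer.lower()
    if any(phrase in text for phrase in DENYLIST_PHRASES):
        return True
    # An approval with no supporting policy excerpt is also a prohibited output.
    if "allowed" in text and not cited_docs:
        return True
    return False
```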
Implementation
1) Install the right packages
Use LangGraph with a standard retriever stack. For production you'll likely swap the toy embedding model and in-memory index below for your internal embedding service and your own vector store.
pip install langgraph langchain langchain-community langchain-openai faiss-cpu pydantic
2) Define state and build the graph
The key pattern is to keep all request context in a typed state object. That makes it easy to log decisions and enforce policy scope across nodes.
from typing import TypedDict, List

from langgraph.graph import StateGraph, START, END
from langchain_core.documents import Document
from langchain_openai import ChatOpenAI, OpenAIEmbeddings
from langchain_community.vectorstores import FAISS


class AgentState(TypedDict):
    question: str
    jurisdiction: str
    policy_version: str
    docs: List[Document]
    answer: str
    approved: bool


embeddings = OpenAIEmbeddings()
llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)

policy_docs = [
    Document(
        page_content="Refunds are allowed within 30 days for card payments if goods were not delivered.",
        metadata={"jurisdiction": "US", "version": "2024.10"},
    ),
    Document(
        page_content="Chargebacks must be escalated to disputes team within 24 hours.",
        metadata={"jurisdiction": "US", "version": "2024.10"},
    ),
]

vectorstore = FAISS.from_documents(policy_docs, embeddings)
retriever = vectorstore.as_retriever(search_kwargs={"k": 3})
3) Add retrieval, answering, and validation nodes
This is the actual LangGraph pattern: each node accepts state and returns partial state updates. Keep retrieval scoped by jurisdiction and version so you don’t mix policies across regions.
def retrieve_policy(state: AgentState):
    # Scope the query by jurisdiction and version so regional policies don't mix.
    query = f"{state['question']} jurisdiction:{state['jurisdiction']} version:{state['policy_version']}"
    docs = retriever.invoke(query)
    return {"docs": docs}


def answer_question(state: AgentState):
    context = "\n\n".join(
        f"[{i+1}] {doc.page_content}" for i, doc in enumerate(state["docs"])
    )
    prompt = f"""
You are a payments policy assistant.
Answer only from the provided policy excerpts.
If the excerpts do not support an answer, say you cannot confirm it.

Question: {state['question']}

Policy excerpts:
{context}
"""
    response = llm.invoke(prompt)
    return {"answer": response.content}


def validate_answer(state: AgentState):
    # Simple heuristic: approve only when the answer uses the expected phrasing
    # and at least one policy excerpt was retrieved to support it.
    answer = state["answer"].lower()
    approved = (
        "cannot confirm" in answer
        or "allowed" in answer  # also matches "not allowed"
    ) and len(state["docs"]) > 0
    return {"approved": approved}
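Appending jurisdiction and version to the query string only nudges similarity search; it does not guarantee scoping. If your vector store supports metadata filters (the LangChain FAISS wrapper accepts a `filter` entry in `search_kwargs`), a stricter variant builds a per-request retriever. A sketch building on the `vectorstore` and `AgentState` from step 2, assuming the document metadata shown there:

```python
def retrieve_policy_filtered(state: AgentState):
    # Hard-scope retrieval with a metadata filter instead of query-string hints.
    scoped_retriever = vectorstore.as_retriever(
        search_kwargs={
            "k": 3,
            "filter": {
                "jurisdiction": state["jurisdiction"],
                "version": state["policy_version"],
            },
        }
    )
    docs = scoped_retriever.invoke(state["question"])
    return {"docs": docs}
```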
4) Wire the graph and invoke it
Use conditional routing so unapproved answers stop before they reach a caller. In production you can route rejected outputs to a human review queue instead of ending immediately.
def route_after_validation(state: AgentState):
    # Approved answers finish; rejected answers are handled before anything is returned.
    return END if state["approved"] else "reject_answer"


def reject_answer(state: AgentState):
    # Replace the draft with a safe refusal; in production, enqueue for human review instead.
    return {"answer": "Cannot confirm from the available policy excerpts. Escalating for review."}


graph = StateGraph(AgentState)
graph.add_node("retrieve_policy", retrieve_policy)
graph.add_node("answer_question", answer_question)
graph.add_node("validate_answer", validate_answer)
graph.add_node("reject_answer", reject_answer)

graph.add_edge(START, "retrieve_policy")
graph.add_edge("retrieve_policy", "answer_question")
graph.add_edge("answer_question", "validate_answer")
graph.add_conditional_edges("validate_answer", route_after_validation)
graph.add_edge("reject_answer", END)

app = graph.compile()

result = app.invoke({
    "question": "Can we refund a card payment after 45 days?",
    "jurisdiction": "US",
    "policy_version": "2024.10",
    "docs": [],
    "answer": "",
    "approved": False,
})

print(result["answer"])
print(result["approved"])
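If you would rather hand rejected answers to a reviewer than end with a refusal, swap `reject_answer` for a handoff node. The in-process queue below is a stand-in assumption for whatever internal system you actually use (ticketing, case management, a message bus); only the shape of the handoff matters.

```python
import queue

# Stand-in for an internal review system; replace with your real escalation path.
review_queue: "queue.Queue[dict]" = queue.Queue()

def send_to_human_review(state: AgentState):
    # Give reviewers the full context: question, policy scope, and the draft answer.
    review_queue.put({
        "question": state["question"],
        "jurisdiction": state["jurisdiction"],
        "policy_version": state["policy_version"],
        "draft_answer": state["answer"],
    })
    return {"answer": "Escalated to human review; no automated answer provided."}
```

Register it with `graph.add_node("send_to_human_review", send_to_human_review)` and point the rejected branch at it instead of `reject_answer`.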
Production Considerations
- Audit everything
  - Persist the question, policy_version, retrieved document IDs, model output, validator result, and final disposition.
  - In payments disputes or compliance reviews you need to reconstruct exactly why the agent answered a certain way (a sketch of a logging node follows this list).
- Enforce data residency
  - Keep retrieval indexes and model calls in-region when policies contain regulated customer data or internal controls.
  - If your payment operations span EU/US/APAC, route requests to region-specific graphs and stores.
- Add hard guardrails
  - Reject prompts asking for illegal actions like bypassing AML checks or approving restricted merchants.
  - Use a denylist plus structured validation; don't rely on prompt wording alone.
- Monitor drift
  - Track refusal rate, escalation rate, stale-policy hits, and citation coverage.
  - If policy versions change weekly but your index refreshes daily, you'll ship wrong answers.
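One way to make the audit trail concrete is a logging node that runs after validation, before anything leaves the graph. The JSON-lines file below is a placeholder assumption; in practice you would write to an append-only store your compliance team already trusts. It builds on the `AgentState` from step 2.

```python
import json
import time

def audit_log(state: AgentState):
    # Append one record per request; the fields mirror the checklist above.
    record = {
        "timestamp": time.time(),
        "question": state["question"],
        "jurisdiction": state["jurisdiction"],
        "policy_version": state["policy_version"],
        "retrieved_docs": [doc.metadata for doc in state["docs"]],
        "answer": state["answer"],
        "approved": state["approved"],
    }
    with open("audit_log.jsonl", "a", encoding="utf-8") as f:
        f.write(json.dumps(record) + "\n")
    return {}  # logging only; no state changes
```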
Common Pitfalls
- Mixing policy scopes
  - Don't retrieve across jurisdictions or products without filtering first.
  - A refund rule for EU SEPA transfers is not automatically valid for US card transactions.
- Letting the LLM freewheel
  - Don't ask for an "expert opinion" without grounding it in retrieved text.
  - Force the model to cite excerpts or explicitly say it cannot confirm (see the citation-check sketch after this list).
- Skipping version control
  - Don't index policies without effective dates and version tags.
  - Payments teams need to know whether an answer reflects current rules or retired guidance.
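A cheap way to enforce grounding is to tell the model in the prompt to cite the excerpt number (e.g. [1]) after every claim, then check for those markers in the validator. A sketch; the exact wording and regex are assumptions you should tune against your own answer format:

```python
import re

CITATION_PATTERN = re.compile(r"\[\d+\]")  # matches markers like [1], [2]

def has_citation_or_refusal(answer: str) -> bool:
    """Accept answers that either cite an excerpt or explicitly decline to confirm."""
    if "cannot confirm" in answer.lower():
        return True
    return bool(CITATION_PATTERN.search(answer))
```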
Keep learning
- The complete AI Agents Roadmap — my full 8-step breakdown
- Free: The AI Agent Starter Kit — PDF checklist + starter code
- Work with me — I build AI for banks and insurance companies
By Cyprian Aarons, AI Consultant at Topiax.
Want the complete 8-step roadmap?
Grab the free AI Agent Starter Kit — architecture templates, compliance checklists, and a 7-email deep-dive course.