How to Build an Underwriting Agent Using LlamaIndex in Python for Wealth Management
An underwriting agent for wealth management reads client profiles, product documents, risk policies, and supporting evidence, then produces a recommendation with traceable reasoning. It matters because advisors and operations teams need faster suitability checks without losing compliance, auditability, or control over sensitive client data.
Architecture
- **Document ingestion layer**
  - Pulls in policy PDFs, suitability rules, product sheets, KYC notes, and investment mandate docs.
  - Uses LlamaIndex loaders to normalize content into `Document` objects.
- **Indexed knowledge base**
  - Stores internal underwriting rules and product constraints.
  - Usually backed by a vector index for semantic retrieval plus metadata filters for jurisdiction or product line.
- **Retrieval + synthesis layer**
  - Retrieves only the relevant policy sections for a given case.
  - Uses an LLM to synthesize a recommendation grounded in retrieved sources.
- **Decision schema**
  - Forces the agent to return structured output like `approve`, `reject`, or `needs_review`.
  - Captures rationale, cited sources, and missing information.
- **Audit logging layer**
  - Persists inputs, retrieved chunks, model output, timestamps, and versioned prompts.
  - Required for compliance review and pre-trade or post-trade traceability.
- **Guardrails layer**
  - Blocks unsupported advice, PII leakage, and out-of-policy recommendations.
  - Routes ambiguous cases to a human reviewer.
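The guardrails layer can start as a plain pre-decision check that runs before any LLM call. This is a minimal sketch: the required field names and the approved-product list are illustrative assumptions, not LlamaIndex APIs.

```python
from typing import Optional

# Illustrative policy constants; a real deployment loads these from config.
REQUIRED_KYC_FIELDS = {"risk_profile", "liquidity_horizon", "net_worth_band"}
APPROVED_PRODUCTS = {"balanced_fund", "gov_bond_ladder", "private_credit"}

def guardrail_check(case: dict) -> Optional[str]:
    """Return a 'needs_review' reason, or None if the case may proceed."""
    missing = REQUIRED_KYC_FIELDS - case.keys()
    if missing:
        return f"needs_review: missing KYC fields {sorted(missing)}"
    if case.get("product") not in APPROVED_PRODUCTS:
        return "needs_review: product outside approved policy scope"
    return None
```

Running this before retrieval keeps incomplete or out-of-scope cases away from the model entirely and routes them straight to a human reviewer.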
Implementation
1) Load underwriting policy documents into LlamaIndex
Start with your internal underwriting docs. For wealth management, keep client PII out of the index unless you have explicit retention and residency approval.
```python
from llama_index.core import VectorStoreIndex, SimpleDirectoryReader

# Load policy docs from a controlled internal directory
documents = SimpleDirectoryReader(
    input_dir="./underwriting_policies",
    recursive=True,
).load_data()

print(f"Loaded {len(documents)} documents")
```
If you need metadata filters later, attach jurisdiction or product tags at ingestion time. That makes it easier to separate US advisory rules from EU MiFID-related materials or region-specific suitability policies.
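One way to attach those tags is to derive metadata from the directory layout at ingestion time. The helper below is a sketch under the assumption that policies live in `<jurisdiction>/<product_line>/` subfolders; `SimpleDirectoryReader` accepts a `file_metadata` callable that applies such a function to every file it loads.

```python
from pathlib import Path

def policy_metadata(file_path: str) -> dict:
    """Derive jurisdiction/product tags from a path such as
    ./underwriting_policies/us/private_credit/suitability.pdf.
    The directory convention is an assumption for this sketch."""
    parts = Path(file_path).parts
    meta = {"source_path": file_path}
    if "underwriting_policies" in parts:
        i = parts.index("underwriting_policies")
        if len(parts) > i + 1:
            meta["jurisdiction"] = parts[i + 1]
        if len(parts) > i + 2:
            meta["product_line"] = parts[i + 2]
    return meta

# Pass as SimpleDirectoryReader(..., file_metadata=policy_metadata)
# so every loaded Document carries these tags for later filtering.
```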
2) Build an index and retrieval engine
For this use case, VectorStoreIndex is enough to get started. In production, pair it with a persistent vector store so the index survives restarts and can be audited.
```python
from llama_index.core import VectorStoreIndex

index = VectorStoreIndex.from_documents(documents)

query_engine = index.as_query_engine(
    similarity_top_k=5,
    response_mode="compact",
)

response = query_engine.query(
    "What are the suitability constraints for a conservative client "
    "requesting private credit exposure?"
)
print(response)
```
This gives you grounded retrieval over policy text. For wealth management underwriting, that grounding is the difference between a useful assistant and an uncontrolled chatbot.
3) Force structured underwriting decisions
You want the model to return a decision object that downstream systems can consume. LlamaIndex supports typed outputs through PydanticOutputParser, which is a clean pattern for approval workflows.
```python
from typing import List

from pydantic import BaseModel, Field
from llama_index.core.output_parsers import PydanticOutputParser
from llama_index.llms.openai import OpenAI

class UnderwritingDecision(BaseModel):
    decision: str = Field(description="approve, reject, or needs_review")
    rationale: str = Field(description="Short explanation tied to policy")
    missing_information: List[str] = Field(default_factory=list)
    cited_sources: List[str] = Field(default_factory=list)

parser = PydanticOutputParser(output_cls=UnderwritingDecision)
llm = OpenAI(model="gpt-4o-mini", temperature=0)

# parser.format_string holds the JSON schema instructions; parser.format
# is a method, so interpolating it would embed a bound-method repr
# instead of the schema.
prompt = f"""
You are an underwriting assistant for wealth management.
Use only the provided policy context.
Return JSON matching this schema:
{parser.format_string}
"""

context = response.response if hasattr(response, "response") else str(response)
raw = llm.complete(prompt + "\n\nPolicy context:\n" + context)

decision: UnderwritingDecision = parser.parse(raw.text)
print(decision.model_dump())
```
This pattern is useful because your application can route needs_review cases to an advisor desk instead of auto-approving them. It also creates predictable outputs for audit trails.
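Routing on the parsed decision can then be a small dispatch over the `decision` field. The queue names below are hypothetical placeholders for whatever your workflow system uses:

```python
def route_decision(decision: str) -> str:
    """Map a structured underwriting decision to a work queue.
    Queue names are hypothetical examples."""
    queues = {
        "approve": "ops_execution_queue",
        "reject": "advisor_notification_queue",
    }
    # needs_review, and anything unexpected from the model,
    # defaults to a human reviewer rather than auto-finalizing.
    return queues.get(decision, "advisor_review_queue")
```

Defaulting unknown values to the human queue means a malformed model output fails safe instead of slipping through as an approval.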
4) Wrap it in an agentic workflow with human review
For regulated workflows, don’t let the model act alone. Use the retrieval step as evidence gathering and keep final execution behind a human approval gate when required by policy.
```python
def underwrite_case(client_profile: str) -> UnderwritingDecision:
    # Evidence gathering: retrieve the relevant policy sections first
    query = f"""
    Assess this wealth management case against our underwriting policy:
    {client_profile}
    """
    evidence = query_engine.query(query)

    prompt = f"""
    Client case:
    {client_profile}

    Policy evidence:
    {evidence}

    Return a structured decision using this schema:
    {parser.format_string}
    """
    raw_result = llm.complete(prompt)
    return parser.parse(raw_result.text)
```
```python
case = """
Client age 62. Conservative risk profile.
Requests allocation into illiquid private credit fund.
Liquidity horizon: less than 12 months.
"""

result = underwrite_case(case)
print(result.decision)
print(result.rationale)
```
That workflow is simple enough to maintain and strict enough to survive compliance review. In practice you’d add persistence for requests and responses plus model/version tracking.
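That persistence can start as an append-only JSON-lines log. This is a sketch: the field names and file layout are assumptions, and in production you would write to a database with access controls instead of a local file.

```python
import json
from datetime import datetime, timezone

def log_underwriting_event(path: str, case: str, decision: dict,
                           model: str, prompt_version: str) -> None:
    """Append one auditable record per underwriting call (JSON lines)."""
    record = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "model": model,
        "prompt_version": prompt_version,
        "case": case,
        "decision": decision,
    }
    with open(path, "a", encoding="utf-8") as f:
        f.write(json.dumps(record) + "\n")
```

Capturing the model name and prompt version alongside each decision is what lets a compliance reviewer reproduce exactly what the agent saw months later.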
Production Considerations
- **Auditability**
  - Store every request, retrieved passage, final decision, model name, prompt version, and timestamp.
  - Compliance teams will ask why the agent approved or rejected a case; give them exact source citations.
- **Data residency**
  - Keep client data in-region if your book of business spans multiple jurisdictions.
  - Use separate indexes per region or per business unit when regulatory boundaries matter.
- **Guardrails**
  - Block generation of personalized investment advice outside approved policy scope.
  - If the client profile includes incomplete KYC fields or ambiguous liquidity needs, force `needs_review`.
- **Monitoring**
  - Track rejection rates, human override rates, retrieval hit quality, and hallucination reports.
  - Watch for drift when policies change; stale embeddings can produce bad recommendations even if the model is stable.
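The override-rate signal is cheap to compute once agent and final (human-confirmed) decisions are logged side by side. A minimal sketch:

```python
def human_override_rate(agent_decisions: list, final_decisions: list) -> float:
    """Share of cases where the human reviewer changed the agent's call.
    A rising rate is an early warning of policy drift or stale embeddings."""
    if not agent_decisions:
        return 0.0
    overrides = sum(a != f for a, f in zip(agent_decisions, final_decisions))
    return overrides / len(agent_decisions)
```

Tracking this per product line and per jurisdiction (using the same metadata tags as the index) makes it easier to see which policy areas are drifting.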
Common Pitfalls
- **Putting raw client PII into the index**
  - Avoid indexing names, account numbers, tax IDs, or unredacted statements unless you have explicit controls.
  - Keep sensitive attributes in transactional systems and pass only what's needed at runtime.
- **Using generic prompts without policy grounding**
  - A generic "decide if this is suitable" prompt will produce confident nonsense.
  - Always retrieve policy text first and force the model to cite it in the response.
- **Skipping human review on borderline cases**
  - Wealth management has too many edge cases: complex trusts, concentrated positions, illiquid assets, cross-border clients.
  - Route uncertain decisions to an advisor or compliance officer instead of auto-finalizing them.
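For the PII pitfall, a lightweight redaction pass before text reaches the index or the prompt can be sketched with regular expressions. The patterns below are illustrative only; a real deployment needs a vetted PII detection service, not two regexes.

```python
import re

# Illustrative patterns only, not a complete PII detector.
PII_PATTERNS = [
    (re.compile(r"\b\d{3}-\d{2}-\d{4}\b"), "[REDACTED_SSN]"),   # US SSN shape
    (re.compile(r"\b\d{8,17}\b"), "[REDACTED_ACCOUNT]"),        # bare account numbers
]

def redact(text: str) -> str:
    """Replace recognizable identifier shapes with placeholder tokens."""
    for pattern, token in PII_PATTERNS:
        text = pattern.sub(token, text)
    return text
```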
Keep learning
- The complete AI Agents Roadmap — my full 8-step breakdown
- Free: The AI Agent Starter Kit — PDF checklist + starter code
- Work with me — I build AI for banks and insurance companies
By Cyprian Aarons, AI Consultant at Topiax.