How to Integrate Anthropic for wealth management with Cloudflare Workers for RAG

By Cyprian Aarons · Updated 2026-04-21
Tags: anthropic-for-wealth-management, cloudflare-workers, rag

When you pair Anthropic for wealth management with Cloudflare Workers, you get a clean pattern for RAG-backed advisor workflows: fast edge retrieval, low-latency policy checks, and a model that can turn retrieved context into compliant client responses. This is a strong fit for portfolio summaries, suitability checks, and advisor copilots that need to answer from firm-approved knowledge instead of freewheeling on the open web.

Prerequisites

  • Python 3.10+
  • An Anthropic API key
  • A Cloudflare account with:
    • a Worker deployed
    • either Workers KV, D1, or an HTTP endpoint for retrieval
  • pip install anthropic requests python-dotenv
  • Environment variables set:
    • ANTHROPIC_API_KEY
    • CLOUDFLARE_WORKER_URL
    • optional CLOUDFLARE_WORKER_TOKEN
  • A RAG corpus already indexed in your Worker backend:
    • policy docs
    • product sheets
    • approved market commentary
    • client communication templates
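With the prerequisites in place, a minimal .env sketch might look like the following. Every value here is a placeholder; substitute your own credentials and Worker URL, and never commit real keys.

```shell
# Example .env -- all values are placeholders
ANTHROPIC_API_KEY=sk-ant-your-key-here
CLOUDFLARE_WORKER_URL=https://rag.example.workers.dev
CLOUDFLARE_WORKER_TOKEN=optional-bearer-token
```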

Integration Steps

  1. Set up your Python client and config

    Keep credentials out of code. The pattern here is simple: Python orchestrates the request, Cloudflare Workers handles retrieval, and Anthropic generates the final answer from retrieved context.

    import os
    import requests
    from dotenv import load_dotenv
    from anthropic import Anthropic
    
    load_dotenv()
    
    anthropic_client = Anthropic(api_key=os.environ["ANTHROPIC_API_KEY"])
    worker_url = os.environ["CLOUDFLARE_WORKER_URL"]
    worker_token = os.getenv("CLOUDFLARE_WORKER_TOKEN")
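If you want configuration errors to surface at startup rather than on the first request, a small fail-fast helper is one option. This is a sketch, not part of the integration above; the variable names match the environment variables used in this guide.

```python
import os


def require_env(*names: str) -> None:
    # Fail fast at startup if required configuration is absent, so a
    # missing key does not surface as a confusing mid-request error.
    missing = [n for n in names if not os.getenv(n)]
    if missing:
        raise RuntimeError(f"Missing required environment variables: {missing}")


# At startup:
# require_env("ANTHROPIC_API_KEY", "CLOUDFLARE_WORKER_URL")
```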
    
  2. Query Cloudflare Worker for RAG context

    Your Worker should expose a retrieval endpoint like /retrieve. Send the user question and get back top-k chunks plus metadata. This keeps retrieval close to the edge and avoids stuffing your app with vector logic.

    def retrieve_context(query: str) -> dict:
        headers = {"Content-Type": "application/json"}
        if worker_token:
            headers["Authorization"] = f"Bearer {worker_token}"
    
        payload = {
            "query": query,
            "top_k": 5,
            "filters": {
                "doc_type": ["policy", "product", "advisor_note"]
            }
        }
    
        response = requests.post(
            f"{worker_url}/retrieve",
            json=payload,
            headers=headers,
            timeout=15,
        )
        response.raise_for_status()
        return response.json()
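The client above assumes a particular response shape from the Worker's /retrieve endpoint. A hypothetical example is shown below; the field names (id, title, source, text, score) are assumptions, so align them with whatever your Worker actually returns. The prompt-building code in the next step only relies on title, source, and text.

```python
# Hypothetical shape of the JSON returned by the Worker's /retrieve
# endpoint. Field names here are assumptions, not a fixed contract.
sample_response = {
    "chunks": [
        {
            "id": "policy-001",
            "title": "Suitability Policy",
            "source": "cf-worker-kv://policy/suitability",
            "text": "Risk tolerance must be documented using approved suitability language.",
            "score": 0.91,
        },
    ],
}
```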
    
  3. Format retrieved chunks into an Anthropic-ready prompt

    For wealth management use cases, you want the model grounded in approved text and instructed not to invent facts. Use the Messages API with a strict system message and attach citations from the Worker payload.

    def build_context_block(results: dict) -> str:
        chunks = results.get("chunks", [])
        lines = []
    
        for i, chunk in enumerate(chunks, start=1):
            source = chunk.get("source", "unknown")
            title = chunk.get("title", "untitled")
            text = chunk.get("text", "")
            lines.append(f"[{i}] {title} | {source}\n{text}")
    
        return "\n\n".join(lines)
    
    def answer_with_anthropic(query: str, rag_results: dict | None = None) -> str:
        # Accept precomputed retrieval results so callers can reuse a single
        # Worker round trip; fall back to a fresh lookup when none are passed.
        if rag_results is None:
            rag_results = retrieve_context(query)
        context_block = build_context_block(rag_results)

        message = anthropic_client.messages.create(
            model="claude-3-5-sonnet-latest",
            max_tokens=800,
            temperature=0.2,
            system=(
                "You are a wealth management assistant. "
                "Answer only using the provided context. "
                "If the context is insufficient, say what is missing. "
                "Do not provide personalized investment advice unless explicitly supported by policy text."
            ),
            messages=[
                {
                    "role": "user",
                    "content": (
                        f"Question: {query}\n\n"
                        f"Retrieved context:\n{context_block}\n\n"
                        "Return a concise answer with bullet points and cite source numbers."
                    ),
                }
            ],
        )

        return message.content[0].text

  4. Add a production-safe Worker call pattern

    In practice, you will want retries and explicit error handling around your Worker call. This matters because RAG systems fail in boring ways: timeouts, empty results, malformed JSON, or auth issues.

    def safe_retrieve_context(query: str, retries: int = 2) -> dict:
        # Retry transient network failures, then degrade to a structured
        # error dict so callers never have to handle raw exceptions.
        for attempt in range(retries + 1):
            try:
                return retrieve_context(query)
            except requests.HTTPError as e:
                return {"chunks": [], "error": f"Worker HTTP error: {e.response.status_code}"}
            except requests.RequestException as e:
                if attempt < retries:
                    continue
                return {"chunks": [], "error": f"Network error: {e}"}
            except ValueError as e:
                return {"chunks": [], "error": f"Invalid JSON from Worker: {e}"}

  5. Wire the whole flow together

    This is the orchestration layer your agent can call from an API route or background job. The retrieval results from safe_retrieve_context are passed straight into answer_with_anthropic, so the Worker is queried exactly once per question.

    def rag_wealth_assistant(question: str) -> dict:
        results = safe_retrieve_context(question)

        if results.get("error"):
            return {
                "answer": None,
                "error": results["error"],
                "sources": [],
            }

        # Reuse the retrieved chunks instead of querying the Worker again.
        answer = answer_with_anthropic(question, results)
        sources = [
            {
                "title": c.get("title"),
                "source": c.get("source"),
                "id": c.get("id"),
            }
            for c in results.get("chunks", [])
        ]

        return {
            "answer": answer,
            "sources": sources,
        }
    

Testing the Integration

Run a direct smoke test with a query that should hit your knowledge base.

if __name__ == "__main__":
    question = (
        "What is our approved process for explaining risk tolerance to a high-net-worth client?"
    )

    result = rag_wealth_assistant(question)
    print("ANSWER:\n", result["answer"])
    print("\nSOURCES:\n", result["sources"])

Expected output:

ANSWER:
- Risk tolerance should be explained using approved suitability language.
- The advisor should document objectives, time horizon, liquidity needs, and loss capacity.
- If the client profile is incomplete, request additional information before making recommendations.
- Sources [1] and [2] support this process.

SOURCES:
[
  {'title': 'Suitability Policy', 'source': 'cf-worker-kv://policy/suitability', 'id': 'policy-001'},
  {'title': 'Client Discovery Guide', 'source': 'cf-worker-kv://advisor/discovery', 'id': 'advisor-014'}
]

Real-World Use Cases

  • Advisor copilot

    • Answer product questions from firm-approved documents.
    • Draft compliant client responses grounded in policy text.
  • Portfolio service assistant

    • Summarize holdings, risk flags, and recent commentary using RAG.
    • Route ambiguous cases to human advisors with source citations.
  • Compliance-first knowledge search

    • Let internal teams ask natural-language questions over policies.
    • Keep retrieval at the edge with Cloudflare Workers and generation centralized in Anthropic.
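The routing idea in the portfolio service use case can be sketched as a simple gate. This helper is hypothetical and not part of the integration above; it only assumes the result dict shape returned by rag_wealth_assistant.

```python
# Hypothetical escalation gate for the "route ambiguous cases" pattern:
# responses with a retrieval error or no supporting sources go to a human
# advisor instead of straight to the client.
def route_response(result: dict) -> str:
    if result.get("error") or not result.get("sources"):
        return "escalate_to_advisor"
    return "send_to_client"


print(route_response({"answer": "ok", "sources": [{"id": "policy-001"}]}))
# -> send_to_client
```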

Keep learning

By Cyprian Aarons, AI Consultant at Topiax.

Want the complete 8-step roadmap?

Grab the free AI Agent Starter Kit — architecture templates, compliance checklists, and a 7-email deep-dive course.
