How to Integrate Anthropic for banking with Cloudflare Workers for RAG
Combining Anthropic for banking with Cloudflare Workers gives you a practical RAG stack for regulated environments: fast edge retrieval, controlled data access, and an LLM that can answer with context instead of guessing. The pattern is simple: Cloudflare Workers handles low-latency document retrieval at the edge, and Anthropic turns those retrieved chunks into compliant, grounded responses for banking workflows.
Prerequisites
- Python 3.10+
- An Anthropic API key
- A Cloudflare account with:
  - Workers enabled
  - A deployed Worker endpoint for retrieval
  - Optional: Vectorize or KV for document storage
- These Python packages (installed via pip): anthropic, requests, python-dotenv
- Basic familiarity with RAG: chunking, embedding/retrieval, prompt construction
Install dependencies:
pip install anthropic requests python-dotenv
Set environment variables:
export ANTHROPIC_API_KEY="your_anthropic_key"
export CLOUDFLARE_WORKER_URL="https://your-worker.your-subdomain.workers.dev"
Integration Steps
1) Build the retrieval contract on Cloudflare Workers
Your Worker should expose a simple HTTP endpoint that accepts a query and returns the top matching chunks. Keep the response shape stable so your Python agent can consume it without custom parsing.
Example Worker response contract:
{
  "query": "What is the mortgage prepayment policy?",
  "results": [
    {
      "id": "doc_123",
      "text": "Mortgage prepayment penalties apply only in the first 36 months...",
      "source": "policy.pdf",
      "score": 0.92
    }
  ]
}
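Keeping that contract stable is easier if you also enforce it on the Python side. Here is a minimal validator sketch; the required field names simply mirror the example contract above:

```python
def validate_retrieval_response(payload: dict) -> list[dict]:
    """Check that a Worker response matches the expected contract
    and return the validated results list."""
    if not isinstance(payload.get("results"), list):
        raise ValueError("Worker response missing 'results' list")
    for item in payload["results"]:
        # Every chunk must carry the fields the agent relies on downstream.
        missing = {"id", "text", "source", "score"} - item.keys()
        if missing:
            raise ValueError(f"Result missing fields: {sorted(missing)}")
    return payload["results"]
```

Failing fast here is cheaper than debugging a model answer built on malformed context.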
If you already have a Worker deployed, just make sure it supports POST requests with JSON input. From Python, you’ll call it like this:
import os

import requests

WORKER_URL = os.environ["CLOUDFLARE_WORKER_URL"]

def retrieve_context(query: str) -> list[dict]:
    resp = requests.post(
        f"{WORKER_URL}/retrieve",
        json={"query": query, "top_k": 5},
        timeout=10,
    )
    resp.raise_for_status()
    return resp.json()["results"]
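Edge calls can fail transiently, so it helps to wrap retrieval in a light retry. A stdlib-only sketch (the retry count and backoff values are illustrative; note that requests raises its own exception types such as requests.exceptions.ConnectionError, so widen the except tuple to match what your client actually raises):

```python
import time

def with_retries(fn, retries: int = 2, backoff: float = 0.5):
    """Call a zero-argument callable, retrying transient failures
    with exponential backoff. Exception types are illustrative."""
    for attempt in range(retries + 1):
        try:
            return fn()
        except (ConnectionError, TimeoutError):
            if attempt == retries:
                raise  # out of retries; surface the error
            time.sleep(backoff * (2 ** attempt))
```

Usage would look like `results = with_retries(lambda: retrieve_context(question))`.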
2) Format retrieved chunks into a banking-safe context block
For banking use cases, don’t dump raw retrieval output into the model. Normalize it first. Strip irrelevant metadata, preserve source names, and keep the context bounded so you don’t blow up token usage.
def build_context_block(results: list[dict]) -> str:
    blocks = []
    for i, item in enumerate(results, start=1):
        blocks.append(
            f"[{i}] Source: {item.get('source', 'unknown')}\n"
            f"Score: {item.get('score', 0):.2f}\n"
            f"Text: {item['text']}"
        )
    return "\n\n".join(blocks)
This gives Anthropic a clean evidence block to reason over. In banking systems, that matters more than clever prompting.
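Bounding the context is worth making explicit. One sketch: keep the highest-scoring chunks until a rough character budget is hit, where the character limit is a stand-in for real token counting (roughly 4 characters per token is a common approximation):

```python
def bound_context(results: list[dict], max_chars: int = 6000) -> list[dict]:
    """Keep the highest-scoring chunks within a character budget.
    max_chars is an illustrative proxy for a token budget."""
    kept, used = [], 0
    for item in sorted(results, key=lambda r: r.get("score", 0), reverse=True):
        if used + len(item["text"]) > max_chars:
            break  # budget exhausted; drop the remaining chunks
        kept.append(item)
        used += len(item["text"])
    return kept
```

You would call this before build_context_block so the evidence block never exceeds your budget.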
3) Call Anthropic Messages API with the retrieved evidence
Use the Anthropic Messages API to generate answers grounded in the retrieved context. The key method is client.messages.create(...).
import os

from anthropic import Anthropic

client = Anthropic(api_key=os.environ["ANTHROPIC_API_KEY"])

def answer_with_rag(question: str, context_block: str) -> str:
    prompt = f"""
You are a banking assistant.
Answer only using the provided context.
If the answer is not in the context, say you don't have enough information.

Question:
{question}

Context:
{context_block}
""".strip()
    message = client.messages.create(
        model="claude-3-5-sonnet-latest",
        max_tokens=400,
        temperature=0,
        messages=[
            {"role": "user", "content": prompt}
        ],
    )
    return message.content[0].text
Use temperature=0 for deterministic behavior in regulated workflows. That’s usually what you want when answering policy or product questions.
4) Wire retrieval and generation together in one agent function
Now connect both sides into a single RAG flow. This is what your application will call from an internal tool, chatbot backend, or workflow engine.
def rag_answer(question: str) -> dict:
    results = retrieve_context(question)
    context_block = build_context_block(results)
    answer = answer_with_rag(question, context_block)
    return {
        "question": question,
        "retrieved_sources": [r.get("source") for r in results],
        "answer": answer,
    }

if __name__ == "__main__":
    q = "Can customers prepay their mortgage without penalty?"
    result = rag_answer(q)
    print(result["answer"])
This pattern keeps retrieval outside the model and makes it easier to audit which documents influenced each answer.
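If auditability matters, persist that provenance rather than just returning it. A minimal sketch that appends each answer to a JSON-lines log; the file name and fields are illustrative, not a fixed schema:

```python
import json
import time
from pathlib import Path

def log_rag_audit(record: dict, path: str = "rag_audit.jsonl") -> None:
    """Append one answer's provenance to a JSON-lines audit log."""
    entry = {
        "timestamp": time.time(),  # when the answer was produced
        "question": record["question"],
        "retrieved_sources": record["retrieved_sources"],
        "answer": record["answer"],
    }
    with Path(path).open("a", encoding="utf-8") as f:
        f.write(json.dumps(entry) + "\n")
```

Calling log_rag_audit(result) after each rag_answer call gives you a replayable trail of which documents backed which answers.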
5) Add a fallback path when retrieval returns nothing useful
Banking agents need deterministic fallback behavior. If Cloudflare returns no relevant chunks or low scores, do not let the model invent policy details.
def rag_answer_with_fallback(question: str) -> dict:
    results = retrieve_context(question)
    if not results or max(r.get("score", 0) for r in results) < 0.75:
        return {
            "question": question,
            "answer": (
                "I don't have enough verified information in the knowledge base "
                "to answer this confidently."
            ),
            "retrieved_sources": [],
        }
    context_block = build_context_block(results)
    answer = answer_with_rag(question, context_block)
    return {
        "question": question,
        "answer": answer,
        "retrieved_sources": [r.get("source") for r in results],
    }
That threshold should be tuned against your own corpus and evaluation set. Don’t guess it.
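One way to tune it: label a set of evaluation queries as answerable or not from your corpus, record the top retrieval score for each, and sweep candidate thresholds. A sketch (the labels and threshold grid here are hypothetical; build them from your own data):

```python
def sweep_thresholds(samples: list[tuple[float, bool]], thresholds=None) -> dict:
    """Given (top_retrieval_score, was_answerable) pairs from a labeled
    eval set, compute the accuracy of the fallback rule at each threshold."""
    if thresholds is None:
        thresholds = [round(0.05 * i, 2) for i in range(21)]  # 0.0 .. 1.0
    accuracy = {}
    for t in thresholds:
        correct = sum(
            1 for score, answerable in samples
            if (score >= t) == answerable  # answer iff score clears threshold
        )
        accuracy[t] = correct / len(samples)
    return accuracy
```

Pick the threshold with the best accuracy (or the best precision, if wrongly answering is costlier than wrongly refusing) and revisit it whenever the corpus changes.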
Testing the Integration
Run a simple end-to-end test from Python to verify both services are wired correctly.
def test_integration():
    question = "What documents are required to open a business account?"
    result = rag_answer_with_fallback(question)
    print("Question:", result["question"])
    print("Sources:", result["retrieved_sources"])
    print("Answer:", result["answer"])

if __name__ == "__main__":
    test_integration()
Expected output:
Question: What documents are required to open a business account?
Sources: ['account-opening-policy.pdf', 'kyc-checklist.md']
Answer: Based on the provided context, customers must submit...
If you get an empty source list or a generic hallucinated answer, check three things first:
- Worker endpoint path and payload shape
- Anthropic API key permissions and model name
- Retrieval score threshold and chunk quality
Real-World Use Cases
- Policy Q&A assistant: answer questions about mortgage terms, overdraft rules, fee schedules, or KYC requirements using approved internal docs only.
- Advisor copilot: let relationship managers ask natural-language questions over product manuals, suitability notes, and onboarding playbooks at the edge.
- Claims or disputes helper: retrieve claim-handling procedures or dispute resolution rules from Cloudflare storage and generate grounded responses through Anthropic.
Keep learning
- The complete AI Agents Roadmap: my full 8-step breakdown
- Free: The AI Agent Starter Kit (PDF checklist + starter code)
- Work with me: I build AI for banks and insurance companies
By Cyprian Aarons, AI Consultant at Topiax.