How to Integrate Anthropic for investment banking with Cloudflare Workers for RAG

By Cyprian Aarons. Updated 2026-04-21
Tags: anthropic-for-investment-banking, cloudflare-workers, rag

Why this integration matters

If you’re building RAG for investment banking, the hard part is not generating text. It’s controlling where the model gets context, keeping latency low, and making sure sensitive documents stay behind the right boundary.

Anthropic gives you strong long-context reasoning for deal docs, CIMs, pitch books, and internal research. Cloudflare Workers gives you a thin edge layer to fetch, filter, and normalize retrieval results before they ever hit the model.

Prerequisites

  • Python 3.10+
  • An Anthropic API key
  • A Cloudflare account with:
    • Workers enabled
    • wrangler installed and authenticated
    • A deployed Worker endpoint for retrieval
  • A vector store or document index behind the Worker:
    • Cloudflare Vectorize, or
    • your own retrieval service exposed through the Worker
  • pip packages:
    • anthropic
    • requests
  • Basic understanding of:
    • prompt assembly
    • chunking and embeddings
    • secure handling of banking documents

Integration Steps

  1. Install the Python dependencies

    Start with the client libraries you’ll use from your orchestration service.

    import os
    import requests
    from anthropic import Anthropic
    
    # pip install anthropic requests
    

    Export your secrets as environment variables, then load them at startup:

    import os
    
    ANTHROPIC_API_KEY = os.environ["ANTHROPIC_API_KEY"]
    WORKER_URL = os.environ["WORKER_URL"]
    
  2. Create a Cloudflare Worker retrieval endpoint

    The Worker should accept a query, run retrieval against your document store, and return top chunks with metadata.

    Here’s a minimal Python client that calls that Worker from your agent service:

    import requests
    
    def retrieve_context(query: str) -> dict:
        resp = requests.post(
            f"{WORKER_URL}/rag/search",
            json={
                "query": query,
                "top_k": 5,
                "doc_types": ["cim", "research", "credit_memo"]
            },
            timeout=15,
        )
        resp.raise_for_status()
        return resp.json()
    

    A typical response should look like this:

    {
      "results": [
        {
          "chunk_id": "cim_0142",
          "source": "Q4_2024_CIM.pdf",
          "score": 0.91,
          "text": "Revenue grew 18% YoY driven by..."
        }
      ]
    }
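
    Before you build prompts from this payload, it's worth validating the shape once. The helper below is illustrative, not part of any SDK; the field names (`chunk_id`, `source`, `score`, `text`) simply mirror the sample response above, so adjust them if your Worker returns a different contract.

```python
# Validate the Worker's response shape before building prompts.
# Field names mirror the sample response above; adjust if your
# Worker uses a different contract.
REQUIRED_FIELDS = {"chunk_id", "source", "score", "text"}

def validate_results(payload: dict) -> list[dict]:
    """Return the results list, or raise if the shape is wrong."""
    results = payload.get("results")
    if not isinstance(results, list):
        raise ValueError("Worker response missing 'results' list")
    for r in results:
        missing = REQUIRED_FIELDS - r.keys()
        if missing:
            raise ValueError(f"Chunk missing fields: {sorted(missing)}")
    return results
```

    In the pipeline, call it as `results = validate_results(retrieve_context(query))` so a schema drift in the Worker fails loudly instead of producing half-formed prompts.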
    
  3. Assemble the RAG prompt for Anthropic

    Keep the retrieved context compact and explicit. For investment banking use cases, include source names so analysts can trace answers back to documents.

    from anthropic import Anthropic
    
    client = Anthropic(api_key=ANTHROPIC_API_KEY)
    
    def build_context_block(results: list[dict]) -> str:
        lines = []
        for r in results:
            lines.append(
                f"[Source: {r['source']} | Chunk: {r['chunk_id']} | Score: {r['score']:.2f}]\n"
                f"{r['text']}"
            )
        return "\n\n".join(lines)
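
    To sanity-check the formatting offline, run the helper on a stub result (values mirror the sample response in step 2; the function is repeated here so the snippet runs standalone):

```python
def build_context_block(results: list[dict]) -> str:
    # Same helper as above, repeated so this snippet runs standalone.
    lines = []
    for r in results:
        lines.append(
            f"[Source: {r['source']} | Chunk: {r['chunk_id']} | Score: {r['score']:.2f}]\n"
            f"{r['text']}"
        )
    return "\n\n".join(lines)

sample = [{
    "chunk_id": "cim_0142",
    "source": "Q4_2024_CIM.pdf",
    "score": 0.91,
    "text": "Revenue grew 18% YoY driven by...",
}]
print(build_context_block(sample))
# [Source: Q4_2024_CIM.pdf | Chunk: cim_0142 | Score: 0.91]
# Revenue grew 18% YoY driven by...
```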
    
  4. Call Anthropic with the retrieved context

    Use the Messages API and pass only the chunks that matter. For banking workflows, keep temperature low and force grounded answers.

    def answer_question(query: str) -> str:
        retrieval = retrieve_context(query)
        context_block = build_context_block(retrieval["results"])

        message = client.messages.create(
            model="claude-3-5-sonnet-latest",
            max_tokens=800,
            temperature=0,
            system=(
                "You are an investment banking assistant. "
                "Answer only using the provided context. "
                "If the context is insufficient, say what is missing."
            ),
            messages=[
                {
                    "role": "user",
                    "content": (
                        f"Question: {query}\n\n"
                        f"Retrieved context:\n{context_block}\n\n"
                        "Return a concise answer with source references."
                    ),
                }
            ],
        )

        return message.content[0].text
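
    A lightweight post-check can flag answers that cite none of the retrieved sources. `answer_is_grounded` below is a hypothetical helper using simple substring matching; a production citation audit would want stricter parsing.

```python
def answer_is_grounded(answer: str, results: list[dict]) -> bool:
    """True if the answer mentions at least one retrieved source or chunk id.

    A cheap substring check, sketched for illustration; tighten this
    before relying on it for compliance purposes.
    """
    markers = {r["source"] for r in results} | {r["chunk_id"] for r in results}
    return any(m in answer for m in markers)
```

    Run it on the model output and log or reject answers that fail, so ungrounded responses never reach an analyst unflagged.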
    
  5. Add a guardrail layer before generation

    In practice, you don’t want every query to hit every document class. Filter by deal team, region, or document type in the Worker before returning chunks.

    def answer_deal_question(query: str, region: str = "US") -> str:
        resp = requests.post(
            f"{WORKER_URL}/rag/search",
            json={
                "query": query,
                "top_k": 3,
                "filters": {
                    "region": region,
                    "access_level": "banking_internal"
                }
            },
            timeout=15,
        )
        resp.raise_for_status()
        results = resp.json()["results"]
    
        prompt_context = build_context_block(results)
    
        msg = client.messages.create(
            model="claude-3-5-sonnet-latest",
            max_tokens=500,
            temperature=0,
            messages=[
                {
                    "role": "user",
                    "content": f"Use this context only:\n{prompt_context}\n\nQuestion: {query}",
                }
            ],
        )
        return msg.content[0].text
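
    The `filters` object above only works if the Worker actually enforces it server-side. The enforcement logic might look like the sketch below; it is written in Python purely for illustration (a real Worker would implement it in the Worker runtime, ideally pushed down into the index query), and it assumes each chunk carries a `meta` dict with keys like `region` and `access_level`.

```python
def apply_access_filters(chunks: list[dict], filters: dict) -> list[dict]:
    """Keep only chunks whose metadata matches every requested filter.

    Illustrative sketch: assumes each chunk has a 'meta' dict with
    keys such as 'region' and 'access_level'. Chunks missing a
    filtered key are dropped (fail closed).
    """
    def allowed(chunk: dict) -> bool:
        meta = chunk.get("meta", {})
        return all(meta.get(k) == v for k, v in filters.items())

    return [c for c in chunks if allowed(c)]
```

    Failing closed on missing metadata is deliberate: for banking documents, an untagged chunk should never leak through an access filter by default.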
    

Testing the Integration

Run a simple end-to-end test with a question that should be answered from your indexed materials.

if __name__ == "__main__":
    question = "What was revenue growth in Q4 and what drove it?"
    answer = answer_question(question)
    print(answer)

Expected output:

Revenue grew 18% YoY in Q4.
The primary drivers were enterprise customer expansion and higher average contract value.
Sources: Q4_2024_CIM.pdf (chunk cim_0142), Management_Deck.pdf (chunk md_0081)

If you get an empty answer or hallucinated details, check these first:

  • The Worker is returning relevant chunks
  • Chunk text is short enough to fit in context cleanly
  • Your prompt says “answer only using provided context”
  • Temperature is set to 0
  • Source metadata is preserved end to end
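
For the second point, a simple character budget keeps retrieved chunks from blowing past the context you intend to spend. `trim_chunks` is a hypothetical helper; character counts are only a rough proxy for tokens, so swap in a real tokenizer for precise budgeting.

```python
def trim_chunks(results: list[dict], max_chars: int = 8000) -> list[dict]:
    """Keep chunks in descending score order until the budget is spent.

    Characters approximate tokens only roughly (~4 chars/token for
    English prose); use a tokenizer if you need exact budgets.
    """
    kept, used = [], 0
    for r in sorted(results, key=lambda r: r["score"], reverse=True):
        n = len(r["text"])
        if used + n > max_chars:
            break
        kept.append(r)
        used += n
    return kept
```

Apply it between retrieval and prompt assembly, e.g. `build_context_block(trim_chunks(retrieval["results"]))`, so the highest-scoring chunks always survive the cut.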

Real-World Use Cases

  • Deal diligence assistant

    • Ask questions across CIMs, QoE reports, legal summaries, and management decks.
    • The Worker handles retrieval boundaries; Anthropic handles synthesis.
  • Equity research copilot

    • Pull earnings call excerpts, prior notes, and market data commentary.
    • Generate grounded summaries with citations analysts can verify fast.
  • Credit memo drafting

    • Retrieve borrower history, covenant language, and sector research.
    • Draft first-pass memos that stay tied to approved internal sources.

Keep learning

By Cyprian Aarons, AI Consultant at Topiax.

Want the complete 8-step roadmap?

Grab the free AI Agent Starter Kit — architecture templates, compliance checklists, and a 7-email deep-dive course.

Get the Starter Kit
