How to Integrate OpenAI for insurance with AWS Lambda for RAG

By Cyprian Aarons · Updated 2026-04-21
openai-for-insurance · aws-lambda · rag

If you’re building an insurance agent that answers policy questions, triages claims, or pulls evidence from documents, this combo is the cleanest path: use AWS Lambda as the orchestration layer and OpenAI for insurance as the reasoning and generation layer. Lambda handles the event-driven glue, while OpenAI handles retrieval-aware responses over policy docs, claim notes, and underwriting guidelines.

Prerequisites

  • Python 3.10+
  • AWS account with:
    • Lambda enabled
    • IAM role for Lambda execution
    • S3 bucket for document storage
  • AWS CLI configured locally:
    • aws configure
  • OpenAI API key with access to your insurance use case
  • Python packages:
    • openai
    • boto3
    • requests if you call external endpoints
  • A document corpus for RAG:
    • policy PDFs
    • claims manuals
    • underwriting rules
  • Basic understanding of:
    • AWS Lambda handler structure
    • vector search or embeddings workflow

Integration Steps

1) Set up environment variables

Keep secrets out of code. Store your OpenAI key and AWS config as Lambda environment variables, then read them at runtime:

import os

OPENAI_API_KEY = os.environ["OPENAI_API_KEY"]
S3_BUCKET = os.environ["S3_BUCKET"]
AWS_REGION = os.environ.get("AWS_REGION", "us-east-1")

For production, set these in the Lambda console or via IaC. Don’t hardcode credentials in your handler.
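For example, with the AWS CLI (the function name `insurance-rag-handler` is a placeholder; substitute your own, and prefer sourcing the key from a secrets manager rather than typing it into shell history):

```shell
aws lambda update-function-configuration \
  --function-name insurance-rag-handler \
  --environment "Variables={OPENAI_API_KEY=$OPENAI_API_KEY,S3_BUCKET=insurance-rag-docs,AWS_REGION=us-east-1}"
```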

2) Build a document retrieval function in Lambda

For RAG, your Lambda should fetch the relevant policy text before calling OpenAI. This example pulls a document from S3.

import os

import boto3

s3 = boto3.client("s3", region_name=os.environ.get("AWS_REGION", "us-east-1"))

def fetch_policy_text(bucket: str, key: str) -> str:
    obj = s3.get_object(Bucket=bucket, Key=key)
    return obj["Body"].read().decode("utf-8")

In a real setup, you’d usually retrieve top-k chunks from a vector store like OpenSearch, Pinecone, or Aurora pgvector. The pattern stays the same: retrieve context first, then send it to OpenAI.
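As a minimal stand-in for that top-k step, here is a sketch that ranks chunks by naive keyword overlap instead of embeddings (the function and scoring are illustrative only; a real deployment would query OpenSearch, Pinecone, or pgvector with embedding vectors):

```python
def top_k_chunks(question: str, chunks: list[str], k: int = 3) -> list[str]:
    """Rank chunks by word overlap with the question (embedding-search stand-in)."""
    q_words = set(question.lower().split())
    scored = sorted(
        chunks,
        key=lambda c: len(q_words & set(c.lower().split())),
        reverse=True,
    )
    return scored[:k]
```

The retrieved chunks would then be joined into the `context` string passed to the model in the next step.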

3) Call OpenAI for insurance with retrieved context

Use the Responses API to generate an answer grounded in the retrieved policy text. This is where the insurance-specific prompt discipline matters: constrain the model to answer only from provided context.

from openai import OpenAI

client = OpenAI()

def answer_question(question: str, context: str) -> str:
    response = client.responses.create(
        model="gpt-4.1",
        input=[
            {
                "role": "system",
                "content": (
                    "You are an insurance assistant. "
                    "Answer only using the provided policy context. "
                    "If the answer is missing, say you don't have enough information."
                ),
            },
            {
                "role": "user",
                "content": f"Policy context:\n{context}\n\nQuestion:\n{question}",
            },
        ],
    )
    return response.output_text

This is the core RAG loop: retrieve relevant evidence, then generate a grounded response.

4) Wire everything into an AWS Lambda handler

Your Lambda receives an event from API Gateway, Step Functions, or another agent component. It loads the right document chunk and returns a JSON response.

import json
import os

def lambda_handler(event, context):
    question = event.get("question", "")
    doc_key = event.get("doc_key", "policy.txt")

    bucket = os.environ["S3_BUCKET"]
    policy_text = fetch_policy_text(bucket=bucket, key=doc_key)

    answer = answer_question(question=question, context=policy_text)

    return {
        "statusCode": 200,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps({
            "question": question,
            "answer": answer,
            "source_document": doc_key,
        }),
    }

This works well when your agent system needs a thin serverless layer between user events and model calls.
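One wrinkle worth handling: with API Gateway's proxy integration, the payload arrives as a JSON string under `body` rather than as top-level keys. A small normalizer (a hypothetical helper, assuming the same `question`/`doc_key` fields) keeps the handler trigger-agnostic:

```python
import json


def extract_payload(event: dict) -> dict:
    """Return the request fields whether invoked directly or via API Gateway proxy."""
    if isinstance(event.get("body"), str):
        # API Gateway proxy integration wraps the JSON payload in a string body.
        return json.loads(event["body"])
    return event
```

Calling `extract_payload(event)` at the top of the handler lets the same code serve direct invocations, Step Functions, and API Gateway.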

5) Add basic guardrails before production

Insurance workflows need traceability and failure handling. Add checks for empty inputs, oversized documents, and fallback behavior when retrieval fails.

def safe_lambda_handler(event, context):
    question = (event.get("question") or "").strip()
    doc_key = (event.get("doc_key") or "").strip()

    if not question:
        return {"statusCode": 400, "body": json.dumps({"error": "question is required"})}

    if not doc_key:
        return {"statusCode": 400, "body": json.dumps({"error": "doc_key is required"})}

    try:
        bucket = os.environ["S3_BUCKET"]
        policy_text = fetch_policy_text(bucket=bucket, key=doc_key)

        if not policy_text.strip():
            return {"statusCode": 404, "body": json.dumps({"error": "document is empty"})}

        answer = answer_question(question=question, context=policy_text)

        return {
            "statusCode": 200,
            "body": json.dumps({"answer": answer}),
        }
    except Exception as e:
        # Log the full error for traceability, but don't leak internals to callers.
        print(f"handler error: {e}")
        return {"statusCode": 500, "body": json.dumps({"error": "internal error"})}
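The oversized-document check mentioned above can be as simple as clamping the context before the model call. A sketch (the 40,000-character cap is an illustrative budget, not an API limit; tune it to your model's context window):

```python
MAX_CONTEXT_CHARS = 40_000  # illustrative budget; tune to the model's context window


def clamp_context(text: str, limit: int = MAX_CONTEXT_CHARS) -> str:
    """Truncate retrieved text so the prompt stays within a predictable size."""
    if len(text) <= limit:
        return text
    # Cut at the last newline before the limit to avoid splitting mid-sentence.
    cut = text.rfind("\n", 0, limit)
    return text[: cut if cut > 0 else limit]
```

Wrapping the retrieved text with `clamp_context(policy_text)` before calling `answer_question` keeps a single oversized PDF from blowing the token budget.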

Testing the Integration

Invoke the Lambda locally or through AWS with a sample event. This verifies that retrieval and generation are wired correctly.

test_event = {
    "question": "Does this policy cover rental car reimbursement after an accident?",
    "doc_key": "auto_policy.txt"
}

result = lambda_handler(test_event, None)
print(result["statusCode"])
print(result["body"])

Expected output (the body is a compact JSON string; pretty-printed here for readability):

200
{
  "question": "...",
  "answer": "...grounded response based on policy text...",
  "source_document": "auto_policy.txt"
}

If retrieval is working but the answer looks generic, your context is too broad or your prompt is too loose. Tighten chunking and force citation-style grounding in the system message.
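Tightening chunking can start with something this simple: fixed-size character windows with overlap, so each indexed chunk stays focused but no sentence is stranded at a boundary (the sizes are illustrative and worth tuning against your documents):

```python
def chunk_text(text: str, size: int = 800, overlap: int = 100) -> list[str]:
    """Split text into overlapping fixed-size windows for retrieval indexing."""
    if size <= overlap:
        raise ValueError("size must exceed overlap")
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start : start + size])
        start += size - overlap
    return chunks
```

Smaller, overlapping chunks tend to retrieve more precisely than whole documents, which in turn makes grounded answers less generic.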

Real-World Use Cases

  • Claims intake assistant
    • A Lambda-triggered agent reads claim notes and policy docs, then returns coverage guidance and next actions.
  • Policy Q&A bot
    • Users ask “Is this procedure covered?” and get answers grounded in plan documents stored in S3 or a vector index.
  • Underwriting support workflow
    • An internal agent summarizes applicant documents against underwriting rules and flags missing evidence for review.

Keep learning

By Cyprian Aarons, AI Consultant at Topiax.

Want the complete 8-step roadmap?

Grab the free AI Agent Starter Kit — architecture templates, compliance checklists, and a 7-email deep-dive course.

