How to Integrate OpenAI for pension funds with AWS Lambda for RAG

By Cyprian Aarons · Updated 2026-04-21
openai-for-pension-funds · aws-lambda · rag

OpenAI gives you the model layer for retrieval, summarization, and response generation. AWS Lambda gives you the serverless execution layer to run ingestion, retrieval, and orchestration without managing servers.

For a RAG system in a pension-fund environment, this combo is useful when you need controlled document access, low-ops deployment, and fast responses over policy docs, investment memos, actuarial reports, and member communications.

Prerequisites

  • Python 3.10+
  • AWS account with:
    • Lambda enabled
    • IAM role for Lambda execution
    • CloudWatch Logs access
  • AWS CLI configured locally:
    • aws configure
  • An OpenAI API key
  • Python packages:
    • openai
    • boto3
    • requests or httpx
  • A document store or vector store for your RAG corpus:
    • Amazon S3 for raw documents
    • Optional: OpenSearch Serverless, Pinecone, or pgvector
  • Environment variables ready (a quick validation sketch follows this list):
    • OPENAI_API_KEY
    • AWS_REGION
    • DOC_BUCKET
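
To catch configuration gaps before the first invocation, here is a minimal sketch that fails fast if any of the variables above are missing:

import os

REQUIRED_VARS = ["OPENAI_API_KEY", "AWS_REGION", "DOC_BUCKET"]

# Fail immediately at import time if any required variable is unset.
missing = [name for name in REQUIRED_VARS if not os.environ.get(name)]
if missing:
    raise RuntimeError(f"Missing environment variables: {', '.join(missing)}")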

Integration Steps

  1. Set up the Lambda handler to receive a query

Your Lambda function should accept a user question and fetch relevant context from your document layer. Keep the handler thin; do retrieval in one place and generation in another.

import json
import os
import boto3

s3 = boto3.client("s3")
DOC_BUCKET = os.environ["DOC_BUCKET"]

def lambda_handler(event, context):
    question = event.get("question", "")
    if not question:
        return {"statusCode": 400, "body": json.dumps({"error": "question is required"})}

    # Placeholder retrieval step: load a known doc chunk from S3
    obj = s3.get_object(Bucket=DOC_BUCKET, Key="rag/context/pension-policy.txt")
    context_text = obj["Body"].read().decode("utf-8")

    return {
        "statusCode": 200,
        "body": json.dumps({
            "question": question,
            "context": context_text[:4000]
        })
    }
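
Before adding generation, you can smoke-test this handler locally with a plain dict event. This sketch assumes your AWS credentials, DOC_BUCKET, and the placeholder S3 object are already in place, and that it runs in the same module as the handler:

import json

# Invoke the handler directly with a sample event; context can be None locally.
result = lambda_handler({"question": "What is the vesting schedule?"}, None)
print(result["statusCode"])
print(json.loads(result["body"])["context"][:200])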
  2. Call OpenAI from Lambda using the Responses API

Use the OpenAI Python SDK inside Lambda to generate an answer grounded in the retrieved context. The current SDK pattern is client.responses.create(...).

import json
import os
from openai import OpenAI

client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])

def generate_answer(question: str, context: str) -> str:
    prompt = f"""
You are an assistant for a pension fund operations team.
Answer only using the provided context.
If the answer is not in the context, say you don't know.

Context:
{context}

Question:
{question}
"""
    response = client.responses.create(
        model="gpt-4.1-mini",
        input=prompt
    )
    return response.output_text

def lambda_handler(event, context):
    # Support both direct invocation and API Gateway proxy events,
    # where the payload arrives as a JSON string under "body".
    payload = json.loads(event["body"]) if isinstance(event.get("body"), str) else event
    question = payload["question"]
    rag_context = payload["context"]

    answer = generate_answer(question, rag_context)

    return {
        "statusCode": 200,
        "body": json.dumps({"answer": answer})
    }
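
OpenAI calls can fail transiently on rate limits or timeouts, and an unhandled exception surfaces as a Lambda invocation error. Here is a minimal retry sketch with exponential backoff; the attempt count and delays are illustrative, not recommendations:

import os
import time
from openai import OpenAI, APIError, RateLimitError

client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])

def generate_with_retries(prompt: str, attempts: int = 3) -> str:
    # Retry transient failures with exponential backoff; re-raise on the last attempt.
    for attempt in range(attempts):
        try:
            response = client.responses.create(model="gpt-4.1-mini", input=prompt)
            return response.output_text
        except (RateLimitError, APIError):
            if attempt == attempts - 1:
                raise
            time.sleep(2 ** attempt)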
  3. Wire retrieval and generation into one Lambda flow

In production RAG, you usually retrieve chunks first, then pass them to OpenAI. Here’s a compact version that uses S3 as the document source and returns a grounded answer.

import json
import os
import boto3
from openai import OpenAI

s3 = boto3.client("s3")
client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])
DOC_BUCKET = os.environ["DOC_BUCKET"]

def retrieve_context(query: str) -> str:
    # Replace this with vector search against OpenSearch/Pinecone/pgvector.
    obj = s3.get_object(Bucket=DOC_BUCKET, Key="rag/context/pension-policy.txt")
    return obj["Body"].read().decode("utf-8")

def answer_with_rag(question: str) -> str:
    context = retrieve_context(question)

    response = client.responses.create(
        model="gpt-4.1-mini",
        input=[
            {
                "role": "system",
                "content": "You answer questions for pension fund staff using only supplied context."
            },
            {
                "role": "user",
                "content": f"Context:\n{context}\n\nQuestion:\n{question}"
            }
        ]
    )
    return response.output_text

def lambda_handler(event, context):
    # Support both direct invocation and API Gateway proxy events.
    body = json.loads(event["body"]) if isinstance(event.get("body"), str) else event
    question = body.get("question", "")
    if not question:
        return {"statusCode": 400, "body": json.dumps({"error": "question is required"})}

    result = answer_with_rag(question)
    return {
        "statusCode": 200,
        "body": json.dumps({"answer": result})
    }
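
When you outgrow the single-document placeholder, retrieve_context is where vector search plugs in. Here is a minimal in-memory sketch using OpenAI embeddings and cosine similarity; the chunk texts are hypothetical, and in production the embeddings would be precomputed at ingestion time and stored in OpenSearch, Pinecone, or pgvector:

import math
import os
from openai import OpenAI

client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])

# Hypothetical corpus; real chunks come from your ingestion pipeline.
CHUNK_TEXTS = [
    "Employer contributions are locked until retirement except under approved hardship rules.",
    "Members may update beneficiary designations at any time via the member portal.",
]

# Embed once at cold start; in production, embed at ingestion time instead.
chunk_embeddings = [
    item.embedding
    for item in client.embeddings.create(
        model="text-embedding-3-small", input=CHUNK_TEXTS
    ).data
]

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def retrieve_context(query: str, top_k: int = 2) -> str:
    # Embed the query and return the highest-scoring chunks joined as context.
    query_emb = client.embeddings.create(
        model="text-embedding-3-small", input=query
    ).data[0].embedding
    ranked = sorted(
        zip(CHUNK_TEXTS, chunk_embeddings),
        key=lambda pair: cosine(query_emb, pair[1]),
        reverse=True,
    )
    return "\n\n".join(text for text, _ in ranked[:top_k])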
  4. Deploy the function with environment variables and IAM permissions

Your Lambda role needs permission to read documents from S3 and write logs. If you move retrieval into a vector database later, add that permission too.

import boto3

lambda_client = boto3.client("lambda", region_name="us-east-1")

with open("lambda_function.zip", "rb") as f:
    zipped_code = f.read()

response = lambda_client.update_function_code(
    FunctionName="pension-rag-agent",
    ZipFile=zipped_code,
)

# Wait for the code update to finish before touching configuration;
# otherwise Lambda can raise ResourceConflictException.
lambda_client.get_waiter("function_updated").wait(FunctionName="pension-rag-agent")

lambda_client.update_function_configuration(
    FunctionName="pension-rag-agent",
    Environment={
        "Variables": {
            # In production, prefer AWS Secrets Manager or SSM Parameter Store
            # to a plaintext environment variable for the API key.
            "OPENAI_API_KEY": "your-openai-key",
            "DOC_BUCKET": "your-doc-bucket"
        }
    }
)

print(response["FunctionArn"])
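
If the execution role doesn't yet have the S3 read and CloudWatch Logs permissions described above, here is a minimal sketch that attaches an inline policy. The role name is an assumption, and the resource ARNs should be scoped to your actual bucket and log group:

import json
import boto3

iam = boto3.client("iam")

policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": ["s3:GetObject"],
            "Resource": "arn:aws:s3:::your-doc-bucket/*",
        },
        {
            "Effect": "Allow",
            "Action": [
                "logs:CreateLogGroup",
                "logs:CreateLogStream",
                "logs:PutLogEvents",
            ],
            "Resource": "*",
        },
    ],
}

# Hypothetical role name; use the execution role your function is configured with.
iam.put_role_policy(
    RoleName="pension-rag-agent-role",
    PolicyName="pension-rag-agent-access",
    PolicyDocument=json.dumps(policy),
)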
  5. Add structured output for downstream systems

If your agent needs to route answers into case management or compliance workflows, ask OpenAI for JSON output. That makes it easier to consume in Step Functions or another service.

from openai import OpenAI
import os

client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])

response = client.responses.create(
    model="gpt-4.1-mini",
    input="""
Return JSON only with keys: answer, confidence, escalation_required.
Question: Can members withdraw employer contributions before retirement?
Context: Employer contributions are locked until retirement except under approved hardship rules.
"""
)

print(response.output_text)
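
Models occasionally wrap JSON in extra prose, so parse defensively before handing the result to downstream systems. A minimal sketch continuing from the block above; the fallback shape is an assumption to adapt to your workflow:

import json

def parse_structured(output_text: str) -> dict:
    # Fall back to an escalation record if the model's output isn't valid JSON.
    try:
        return json.loads(output_text)
    except json.JSONDecodeError:
        return {"answer": output_text, "confidence": 0.0, "escalation_required": True}

result = parse_structured(response.output_text)
print(result["escalation_required"])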

Testing the Integration

Invoke your Lambda with a sample question and verify it returns an answer grounded in your stored policy text.

import json
import boto3

lambda_client = boto3.client("lambda", region_name="us-east-1")

payload = {
    "question": "Can a member access employer contributions before retirement?"
}

response = lambda_client.invoke(
    FunctionName="pension-rag-agent",
    InvocationType="RequestResponse",
    Payload=json.dumps(payload).encode("utf-8")
)

result = json.loads(response["Payload"].read())
print(result["statusCode"])
print(result["body"])

Expected output:

{
  "statusCode": 200,
  "body": "{\"answer\":\"Employer contributions are generally locked until retirement unless an approved hardship rule applies.\"}"
}
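
It's also worth exercising the error path: the handler returns a 400 when the question is missing. A quick check, reusing the lambda_client from the test above:

import json

response = lambda_client.invoke(
    FunctionName="pension-rag-agent",
    InvocationType="RequestResponse",
    Payload=json.dumps({}).encode("utf-8")
)

result = json.loads(response["Payload"].read())
assert result["statusCode"] == 400  # a missing "question" should be rejected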

Real-World Use Cases

  • Member services assistant
    • Answer policy questions from staff using fund rules, contribution policies, and benefit guides.
  • Compliance copilot
    • Retrieve regulatory guidance and produce draft responses for review before sending to stakeholders.
  • Operations triage
    • Classify incoming requests like withdrawals, beneficiary updates, and pension estimates into workflow queues.

The pattern here is simple: Lambda handles orchestration and retrieval boundaries, while OpenAI handles reasoning over retrieved context. Keep documents out of prompts unless they were fetched by your own retrieval layer; that’s how you keep RAG controlled enough for pension workloads.


Keep learning

By Cyprian Aarons, AI Consultant at Topiax.

Want the complete 8-step roadmap?

Grab the free AI Agent Starter Kit — architecture templates, compliance checklists, and a 7-email deep-dive course.
