How to Integrate OpenAI for lending with AWS Lambda for RAG

By Cyprian Aarons · Updated 2026-04-21
Tags: openai-for-lending, aws-lambda, rag

Combining OpenAI for lending with AWS Lambda gives you a clean way to run retrieval-augmented generation inside a serverless workflow. The pattern is simple: Lambda handles orchestration, document retrieval, and policy checks; OpenAI handles language understanding, summarization, and answer generation for lending workflows.

This is the right fit when you need an AI agent that can answer loan-policy questions, summarize borrower documents, or draft underwriting notes from internal knowledge bases without running a permanent service.

Prerequisites

  • Python 3.10+
  • AWS account with:
    • Lambda enabled
    • IAM role for Lambda execution
    • CloudWatch Logs permissions
  • AWS CLI configured locally:
    • aws configure
  • An OpenAI API key with access to the lending-capable model or endpoint you plan to use
  • boto3 installed for AWS SDK calls
  • openai Python package installed
  • A document store for RAG:
    • S3, DynamoDB, OpenSearch, or a vector database
  • Basic familiarity with:
    • AWS Lambda handler functions
    • JSON event payloads
    • Python packaging for Lambda deployment

Install the dependencies:

pip install openai boto3

Integration Steps

1) Set up your Lambda environment and credentials

Keep secrets out of code. Use environment variables in Lambda for the OpenAI key and any retrieval config.

import os

OPENAI_API_KEY = os.environ["OPENAI_API_KEY"]
KB_BUCKET = os.environ["KB_BUCKET"]

In AWS Lambda, add these environment variables:

  • OPENAI_API_KEY
  • KB_BUCKET

Note that AWS_REGION is a reserved key that the Lambda runtime sets automatically, so you do not add it in the function configuration.

For local testing, export them in your shell:

export OPENAI_API_KEY="your-key"
export KB_BUCKET="your-doc-bucket"
export AWS_REGION="us-east-1"

2) Pull lending documents from S3 inside Lambda

For RAG, your Lambda function needs to fetch the relevant policy or borrower documents before calling OpenAI. This example loads a text file from S3.

import boto3

s3 = boto3.client("s3")

def load_document(bucket: str, key: str) -> str:
    response = s3.get_object(Bucket=bucket, Key=key)
    return response["Body"].read().decode("utf-8")

A typical event payload can include the document key and the user question:

{
  "document_key": "policies/loan-underwriting.md",
  "question": "Can we approve a borrower with two recent late payments?"
}
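Since the handler trusts these two fields, it is worth validating them up front so bad payloads fail fast with a clear message. A minimal sketch, using a hypothetical validate_event helper (not part of any SDK):

```python
def validate_event(event: dict) -> tuple[str, str]:
    # Hypothetical helper: check the event before doing any retrieval.
    # Returns (document_key, question) or raises ValueError naming what's missing.
    missing = [field for field in ("document_key", "question") if not event.get(field)]
    if missing:
        raise ValueError(f"Missing required event fields: {', '.join(missing)}")
    return event["document_key"], event["question"]
```

Calling this at the top of the handler turns a confusing KeyError deep in the retrieval path into a clear validation error at the boundary.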

3) Build the RAG prompt and call OpenAI from Lambda

Use the retrieved document as context. For lending use cases, keep the prompt constrained to policy-based answers and require citations from the supplied context.

from openai import OpenAI

client = OpenAI(api_key=OPENAI_API_KEY)

def generate_answer(context: str, question: str) -> str:
    prompt = f"""
You are a lending assistant.
Answer only using the provided context.
If the context is insufficient, say so clearly.

Context:
{context}

Question:
{question}
"""
    response = client.responses.create(
        model="gpt-4.1-mini",
        input=prompt,
    )
    return response.output_text

This uses the OpenAI Python SDK’s responses.create() method. If your lending setup uses a managed enterprise endpoint or a custom model alias, keep the same pattern and swap the model name.

4) Wire everything into an AWS Lambda handler

Now connect retrieval and generation in one handler. This is the unit that API Gateway, Step Functions, or EventBridge will invoke.

import json
import os

def lambda_handler(event, context):
    bucket = os.environ["KB_BUCKET"]
    document_key = event["document_key"]
    question = event["question"]

    doc_text = load_document(bucket=bucket, key=document_key)
    answer = generate_answer(context=doc_text, question=question)

    return {
        "statusCode": 200,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps({
            "document_key": document_key,
            "question": question,
            "answer": answer,
        }),
    }

If you want this to behave like production RAG, keep documents chunked upstream instead of loading entire files. In practice:

  • Store embeddings separately
  • Retrieve top-k chunks first
  • Pass only those chunks into generate_answer()
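The top-k step above can be sketched with plain cosine similarity. In a real setup the vectors would come from an embeddings endpoint (for example OpenAI's text-embedding-3-small) computed upstream at indexing time; the placeholder vectors in the test below are illustrative only:

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    # Standard cosine similarity; returns 0.0 for zero-norm vectors.
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def top_k_chunks(query_vec: list[float],
                 chunk_vecs: list[list[float]],
                 chunks: list[str],
                 k: int = 3) -> list[str]:
    # Rank stored chunks by similarity to the query embedding, keep the top k.
    ranked = sorted(zip(chunks, chunk_vecs),
                    key=lambda pair: cosine(query_vec, pair[1]),
                    reverse=True)
    return [chunk for chunk, _ in ranked[:k]]
```

For small corpora this linear scan is fine inside Lambda; beyond a few thousand chunks you would move the ranking into OpenSearch or a vector database instead.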

5) Add a retrieval layer before generation

A real lending agent should not dump whole policies into the model. Fetch only relevant chunks using semantic search or keyword filtering before calling OpenAI.

Here’s a simple pattern using DynamoDB metadata plus S3 content:

import boto3

ddb = boto3.resource("dynamodb")
table = ddb.Table("lending_chunks")

def get_relevant_chunks(query: str):
    # Replace this with vector search in production.
    # This is a placeholder query path using metadata filters.
    resp = table.scan(Limit=5)
    return [item["chunk_text"] for item in resp["Items"]]

def build_context(query: str) -> str:
    chunks = get_relevant_chunks(query)
    return "\n\n---\n\n".join(chunks)

Then call it in your handler:

def lambda_handler(event, context):
    question = event["question"]
    rag_context = build_context(question)
    answer = generate_answer(rag_context, question)

    return {
        "statusCode": 200,
        "body": json.dumps({"answer": answer})
    }
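In production you will also want the handler to fail gracefully when retrieval or the OpenAI call throws, rather than letting Lambda surface a raw stack trace. A minimal sketch, using a hypothetical safe_handler wrapper where answer_fn stands in for your build_context-plus-generate_answer pipeline:

```python
import json

def safe_handler(event, context, answer_fn):
    # answer_fn is any callable(question) -> str; this wrapper is a sketch,
    # not part of the OpenAI SDK or boto3.
    try:
        question = event["question"]
        answer = answer_fn(question)
        return {"statusCode": 200, "body": json.dumps({"answer": answer})}
    except KeyError as exc:
        # Caller sent a malformed event.
        return {"statusCode": 400, "body": json.dumps({"error": f"missing field: {exc}"})}
    except Exception as exc:
        # Upstream failure (S3, DynamoDB, or the OpenAI call) surfaces as a 502.
        return {"statusCode": 502, "body": json.dumps({"error": str(exc)})}
```

Mapping failures to explicit status codes keeps API Gateway responses predictable and makes CloudWatch alarms easier to write.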

Testing the Integration

Test locally first with a simple event payload. If you’re using the Lambda Runtime Interface Emulator or the SAM CLI, invoke the handler with mock data.

if __name__ == "__main__":
    test_event = {
        "document_key": "policies/loan-underwriting.md",
        "question": "What is our rule for late payments in the last 12 months?"
    }

    result = lambda_handler(test_event, None)
    print(result["body"])

Expected output looks like this:

{
  "document_key": "policies/loan-underwriting.md",
  "question": "What is our rule for late payments in the last 12 months?",
  "answer": "Based on the provided policy context..."
}

If it fails, check these first:

  • Lambda has permission to read from S3/DynamoDB
  • OPENAI_API_KEY is present in environment variables
  • Your model name matches what your account can access
  • The retrieved context actually contains relevant loan policy text
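The first two checks can be partially automated with a small preflight helper. A sketch, assuming the environment-variable names from earlier; REQUIRED_VARS and preflight_check are hypothetical names, not AWS or OpenAI APIs:

```python
import os

REQUIRED_VARS = ("OPENAI_API_KEY", "KB_BUCKET")

def preflight_check(env=os.environ) -> list[str]:
    # Return a list of configuration problems; an empty list means
    # the basic environment is in place.
    return [f"missing environment variable: {name}"
            for name in REQUIRED_VARS
            if not env.get(name)]
```

Running this at cold start and logging the result gives you an immediate CloudWatch signal when a deployment forgot an environment variable.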

Real-World Use Cases

  • Loan policy assistant

    • Answer questions from underwriting manuals, credit policy docs, and exception matrices.
    • Useful for internal ops teams that need consistent policy answers.
  • Borrower document summarization

    • Summarize bank statements, income docs, or KYC packets before review.
    • Good fit for underwriting queues and analyst copilots.
  • Compliance-aware decision support

    • Combine retrieval from approved policy sources with OpenAI-generated explanations.
    • Helps produce audit-friendly notes for lending decisions.

Keep learning

By Cyprian Aarons, AI Consultant at Topiax.
