LlamaIndex Tutorial (TypeScript): deploying to AWS Lambda for intermediate developers

By Cyprian Aarons · Updated 2026-04-21

This tutorial shows how to build a small LlamaIndex TypeScript app that runs inside AWS Lambda and answers questions against your data. You need this when you want serverless inference for low-traffic workloads, event-driven document Q&A, or an API that scales to zero without managing servers.

What You'll Need

  • Node.js 18+ and npm
  • An AWS account with permission to create:
    • Lambda functions
    • IAM roles
    • CloudWatch logs
  • An OpenAI API key exported as OPENAI_API_KEY
  • A local TypeScript project
  • These packages:
    • llamaindex
    • @aws-sdk/client-s3 if you plan to load documents from S3 later
    • esbuild for bundling the Lambda artifact
  • Basic familiarity with:
    • async/await
    • AWS Lambda handler signatures
    • environment variables

Step-by-Step

  1. Create a new TypeScript project and install dependencies. Keep the runtime small and bundle everything into one file, because Lambda cold starts get worse when you ship a large dependency tree.
mkdir llamaindex-lambda && cd llamaindex-lambda
npm init -y
npm install llamaindex
npm install -D typescript @types/node esbuild
npx tsc --init --rootDir src --outDir dist --module commonjs --target es2020 --esModuleInterop true
mkdir src
  2. Add a minimal LlamaIndex query handler. This example builds an index from in-memory text so it works end-to-end without external storage. In production, you would usually swap the source for S3, DynamoDB, or a vector store.
// src/index.ts
import { Document, VectorStoreIndex } from "llamaindex";

const docs = [
  new Document({ text: "AWS Lambda is a serverless compute service." }),
  new Document({ text: "LlamaIndex helps structure and query data for LLM applications." }),
];

let cachedIndex: VectorStoreIndex | null = null;

async function getIndex() {
  if (!cachedIndex) {
    cachedIndex = await VectorStoreIndex.fromDocuments(docs);
  }
  return cachedIndex;
}

export const handler = async (event: { question?: string }) => {
  const question = event.question ?? "What is Lambda?";
  const index = await getIndex();
  const engine = index.asQueryEngine();
  const response = await engine.query({ query: question });

  return {
    statusCode: 200,
    body: JSON.stringify({ question, answer: response.toString() }),
  };
};
  3. Add an environment-aware OpenAI configuration and keep the index warm across invocations. Lambda may reuse the same container, so module-level caching reduces repeated initialization work.
// src/index.ts
import { Document, VectorStoreIndex, Settings, OpenAI, OpenAIEmbedding } from "llamaindex";

// Note: in newer llamaindex releases these classes may live in @llamaindex/openai instead.
Settings.llm = new OpenAI({ model: "gpt-4o-mini" });
Settings.embedModel = new OpenAIEmbedding({ model: "text-embedding-3-small" });

const docs = [
  new Document({ text: "AWS Lambda is a serverless compute service." }),
  new Document({ text: "LlamaIndex helps structure and query data for LLM applications." }),
];

let cachedIndex: VectorStoreIndex | null = null;

async function getIndex() {
  if (!cachedIndex) cachedIndex = await VectorStoreIndex.fromDocuments(docs);
  return cachedIndex;
}
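
By default, the OpenAI integration reads OPENAI_API_KEY from the environment, so the snippet above needs no extra key handling. If you would rather fail fast at cold start when the key is missing, a hedged alternative to the two Settings assignments (assuming the constructors accept an apiKey option, as recent llamaindex releases do) looks like this:

// Optional: explicit key handling instead of the two Settings assignments above.
const apiKey = process.env.OPENAI_API_KEY;
if (!apiKey) throw new Error("OPENAI_API_KEY is not set on this function");

Settings.llm = new OpenAI({ model: "gpt-4o-mini", apiKey });
Settings.embedModel = new OpenAIEmbedding({ model: "text-embedding-3-small", apiKey });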
  4. Make the handler compatible with API Gateway or direct Lambda invocation. This version accepts either { question } or an HTTP event body, which makes local testing and API Gateway integration easier.
// src/index.ts
export const handler = async (event: any) => {
  const body =
    typeof event?.body === "string" ? JSON.parse(event.body) : event ?? {};
  const question = body.question ?? "What is Lambda?";

  const index = await getIndex();
  const engine = index.asQueryEngine();
  const response = await engine.query({ query: question });

  return {
    statusCode: 200,
    headers: { "content-type": "application/json" },
    body: JSON.stringify({ question, answer: response.toString() }),
  };
};
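
To sanity-check both paths before deploying, you can call the handler directly with each event shape. The script below is a hypothetical helper (scripts/local-check.ts is not part of the deployed bundle); it makes real OpenAI calls, so it needs OPENAI_API_KEY in your shell, and you can run it with a TypeScript runner such as npx tsx or after compiling:

// scripts/local-check.ts (hypothetical helper, not part of the deployed bundle)
import { handler } from "../src/index";

async function main() {
  // Direct-invocation shape
  console.log(await handler({ question: "What is Lambda?" }));

  // API Gateway proxy shape: the body arrives as a JSON string
  console.log(await handler({ body: JSON.stringify({ question: "What is Lambda?" }) }));
}

main().catch(console.error);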
  5. Bundle for Lambda and run a local smoke test. Use esbuild so the deployed artifact contains the compiled code and its dependencies in one file, and add build and test scripts to package.json:
{
  "name": "llamaindex-lambda",
  "version": "1.0.0",
  "main": "dist/index.js",
  "scripts": {
    "build": "esbuild src/index.ts --bundle --platform=node --target=node18 --outfile=dist/index.js",
    "test": "node -e \"require('./dist/index').handler({question:'What is LlamaIndex?' }).then(console.log)\""
  }
}
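
With OPENAI_API_KEY exported in your shell, the smoke test exercises the full path (embedding, retrieval, and the LLM call), so expect a few seconds of latency and a small API cost. Roughly:

export OPENAI_API_KEY=your-key-here
npm run build
npm test
# Expected result shape (answer text will vary):
# { statusCode: 200, headers: {...}, body: '{"question":"What is LlamaIndex?","answer":"..."}' }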
  6. Deploy the bundle to AWS Lambda and set the API key as an environment variable. If you use an HTTP trigger, attach API Gateway; if you use direct invocation, call the function with JSON payloads from your backend.
npm run build

cd dist && zip -r ../function.zip . && cd ..

aws lambda create-function \
  --function-name llamaindex-ts-demo \
  --runtime nodejs18.x \
  --handler index.handler \
  --role arn:aws:iam::123456789012:role/lambda-exec-role \
  --zip-file fileb://function.zip

For subsequent code changes, rebuild, re-zip, and update the function in place; set the API key once as an environment variable:

npm run build
cd dist && zip -r ../function.zip . && cd ..
aws lambda update-function-code \
  --function-name llamaindex-ts-demo \
  --zip-file fileb://function.zip

aws lambda update-function-configuration \
  --function-name llamaindex-ts-demo \
  --environment Variables="{OPENAI_API_KEY=your-key-here}"
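
Lambda's defaults (a 3-second timeout and 128 MB of memory) are usually too tight for a cold start plus an OpenAI round trip, so consider raising both:

aws lambda update-function-configuration \
  --function-name llamaindex-ts-demo \
  --timeout 30 \
  --memory-size 512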

Testing It

Invoke the function with a simple payload like {"question":"What does LlamaIndex do?"} and confirm you get a JSON response with an answer field. Check CloudWatch logs if the function times out or fails during model initialization.
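
With the AWS CLI v2, a direct invocation looks like the following; the --cli-binary-format flag keeps the CLI from treating the JSON payload as base64:

aws lambda invoke \
  --function-name llamaindex-ts-demo \
  --cli-binary-format raw-in-base64-out \
  --payload '{"question":"What does LlamaIndex do?"}' \
  response.json

cat response.json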

If you see import errors, your bundle is probably wrong; rebuild with esbuild and make sure node_modules is not being required at runtime. If responses are slow on the first request but fast after that, that’s expected cold-start behavior plus cached initialization.

For API Gateway deployments, send a POST request to the endpoint and verify that both direct JSON bodies and proxy events are handled correctly. If the model returns empty or irrelevant answers, confirm your OPENAI_API_KEY is set in Lambda and that your documents actually contain the information you’re asking for.
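
For example, with a placeholder endpoint URL (substitute your own API ID, region, stage, and route):

curl -s -X POST \
  -H "content-type: application/json" \
  -d '{"question":"What does LlamaIndex do?"}' \
  https://your-api-id.execute-api.us-east-1.amazonaws.com/prod/query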

Next Steps

  • Replace in-memory documents with S3-loaded files using @aws-sdk/client-s3 (see the sketch after this list)
  • Add a persistent vector store like Pinecone or OpenSearch instead of rebuilding on every cold start
  • Wrap this handler in API Gateway + Cognito if you need authenticated access
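
For the first item, a minimal sketch of the S3 swap might look like the following, assuming a single plain-text object; the bucket and key names are placeholders:

// Sketch: load one plain-text object from S3 into a Document (bucket/key are placeholders).
import { GetObjectCommand, S3Client } from "@aws-sdk/client-s3";
import { Document } from "llamaindex";

const s3 = new S3Client({});

async function loadDocsFromS3(): Promise<Document[]> {
  const res = await s3.send(
    new GetObjectCommand({ Bucket: "my-docs-bucket", Key: "notes.txt" })
  );
  const text = await res.Body!.transformToString();
  return [new Document({ text })];
}

You would then call loadDocsFromS3() inside getIndex() instead of using the in-memory docs array, and grant the Lambda execution role s3:GetObject on the bucket.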

By Cyprian Aarons, AI Consultant at Topiax.