LlamaIndex Tutorial (TypeScript): deploying to AWS Lambda for intermediate developers
This tutorial shows how to build a small LlamaIndex TypeScript app that runs inside AWS Lambda and answers questions against your data. You need this when you want serverless inference for low-traffic workloads, event-driven document Q&A, or an API that scales to zero without managing servers.
What You'll Need
- Node.js 18+ and npm
- An AWS account with permission to create:
  - Lambda functions
  - IAM roles
  - CloudWatch logs
- An OpenAI API key exported as `OPENAI_API_KEY`
- A local TypeScript project
- These packages:
  - `llamaindex`
  - `@aws-sdk/client-s3` if you plan to load documents from S3 later
  - `esbuild` for bundling the Lambda artifact
- Basic familiarity with:
  - async/await
  - AWS Lambda handler signatures
  - environment variables
Step-by-Step
1. Create a new TypeScript project and install dependencies. Keep the runtime small and bundle everything into one file, because Lambda cold starts get worse as the dependency tree you ship grows.

```bash
mkdir llamaindex-lambda && cd llamaindex-lambda
npm init -y
npm install llamaindex
npm install -D typescript @types/node esbuild
npx tsc --init --rootDir src --outDir dist --module commonjs --target es2020 --esModuleInterop true
mkdir src
```
2. Add a minimal LlamaIndex query handler. This example builds an index from in-memory text so it works end-to-end without external storage. In production you would usually swap the source for S3, DynamoDB, or a vector store.

```typescript
// src/index.ts
import { Document, VectorStoreIndex } from "llamaindex";

const docs = [
  new Document({ text: "AWS Lambda is a serverless compute service." }),
  new Document({ text: "LlamaIndex helps structure and query data for LLM applications." }),
];

// Cache the index at module scope so warm invocations skip the rebuild.
let cachedIndex: VectorStoreIndex | null = null;

async function getIndex() {
  if (!cachedIndex) {
    cachedIndex = await VectorStoreIndex.fromDocuments(docs);
  }
  return cachedIndex;
}

export const handler = async (event: { question?: string }) => {
  const question = event.question ?? "What is Lambda?";
  const index = await getIndex();
  const engine = index.asQueryEngine();
  const response = await engine.query({ query: question });
  return {
    statusCode: 200,
    body: JSON.stringify({ question, answer: response.toString() }),
  };
};
```
3. Add an environment-aware OpenAI configuration and keep the index warm across invocations. Lambda may reuse the same container, so module-level caching reduces repeated initialization work. The OpenAI client reads `OPENAI_API_KEY` from the environment automatically. Note that models are switched by assigning fresh instances to `Settings`; mutating `Settings.llm.model` in place is not the supported pattern.

```typescript
// src/index.ts
import {
  Document,
  OpenAI,
  OpenAIEmbedding,
  Settings,
  VectorStoreIndex,
} from "llamaindex";
// In recent modular releases, OpenAI and OpenAIEmbedding are imported
// from "@llamaindex/openai" instead.

Settings.llm = new OpenAI({ model: "gpt-4o-mini" });
Settings.embedModel = new OpenAIEmbedding({ model: "text-embedding-3-small" });

const docs = [
  new Document({ text: "AWS Lambda is a serverless compute service." }),
  new Document({ text: "LlamaIndex helps structure and query data for LLM applications." }),
];

let cachedIndex: VectorStoreIndex | null = null;

async function getIndex() {
  if (!cachedIndex) cachedIndex = await VectorStoreIndex.fromDocuments(docs);
  return cachedIndex;
}
```
4. Make the handler compatible with API Gateway or direct Lambda invocation. This version accepts either a direct `{ question }` payload or an HTTP event body, which makes local testing and API Gateway integration easier.

```typescript
// src/index.ts
export const handler = async (event: any) => {
  // API Gateway proxy events carry a JSON string in event.body;
  // direct invocations pass the payload object itself.
  const body =
    typeof event?.body === "string" ? JSON.parse(event.body) : event ?? {};
  const question = body.question ?? "What is Lambda?";
  const index = await getIndex();
  const engine = index.asQueryEngine();
  const response = await engine.query({ query: question });
  return {
    statusCode: 200,
    headers: { "content-type": "application/json" },
    body: JSON.stringify({ question, answer: response.toString() }),
  };
};
```
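To sanity-check the event normalization without deploying, the parsing logic can be pulled into a small helper and exercised against both event shapes. `extractQuestion` is an illustrative name for this sketch, not part of the handler itself:

```typescript
// Sketch: the same body-normalization logic as the handler, isolated so it
// can be run locally against both event shapes.
function extractQuestion(event: any): string {
  const body =
    typeof event?.body === "string" ? JSON.parse(event.body) : event ?? {};
  return body.question ?? "What is Lambda?";
}

// Direct Lambda invocation payload:
console.log(extractQuestion({ question: "hi" })); // "hi"
// API Gateway proxy event with a JSON string body:
console.log(extractQuestion({ body: '{"question":"hello"}' })); // "hello"
// Missing question falls back to the default:
console.log(extractQuestion(undefined)); // "What is Lambda?"
```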
5. Bundle for Lambda and run a local smoke test. Use esbuild so the deployed artifact contains the compiled code and its dependencies in one file.

```json
{
  "name": "llamaindex-lambda",
  "version": "1.0.0",
  "main": "dist/index.js",
  "scripts": {
    "build": "esbuild src/index.ts --bundle --platform=node --target=node18 --outfile=dist/index.js",
    "test": "node -e \"require('./dist/index').handler({ question: 'What is LlamaIndex?' }).then(console.log)\""
  }
}
```
6. Deploy the bundle to AWS Lambda and set the API key as an environment variable. If you use an HTTP trigger, attach API Gateway; if you use direct invocation, call the function with JSON payloads from your backend.

```bash
npm run build
cd dist && zip -r ../function.zip . && cd ..

aws lambda create-function \
  --function-name llamaindex-ts-demo \
  --runtime nodejs18.x \
  --handler index.handler \
  --role arn:aws:iam::123456789012:role/lambda-exec-role \
  --zip-file fileb://function.zip
```

For subsequent deploys, update the existing function instead of recreating it:

```bash
cd dist && zip -r ../function.zip . && cd ..
aws lambda update-function-code \
  --function-name llamaindex-ts-demo \
  --zip-file fileb://function.zip

aws lambda update-function-configuration \
  --function-name llamaindex-ts-demo \
  --environment Variables="{OPENAI_API_KEY=your-key-here}"
```
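Index construction and model calls routinely exceed Lambda's 3-second default timeout, so raising the timeout and memory is usually necessary for this workload. The values below are starting points, not tuned recommendations:

```shell
# Give the function enough time and memory for embedding + LLM calls.
aws lambda update-function-configuration \
  --function-name llamaindex-ts-demo \
  --timeout 30 \
  --memory-size 1024
```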
Testing It
Invoke the function with a simple payload like `{"question":"What does LlamaIndex do?"}` and confirm you get a JSON response with an `answer` field. Check CloudWatch logs if the function times out or fails during model initialization.

If you see import errors, your bundle is probably wrong; rebuild with esbuild and make sure `node_modules` is not being required at runtime. If the first request is slow but subsequent ones are fast, that's expected cold-start latency plus cached initialization.

For API Gateway deployments, send a POST request to the endpoint and verify that both direct JSON bodies and proxy events are handled correctly. If the model returns empty or irrelevant answers, confirm `OPENAI_API_KEY` is set in Lambda and that your documents actually contain the information you're asking about.
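These checks can be run from the CLI; the API Gateway URL below is a placeholder for your own endpoint:

```shell
# Direct invocation (AWS CLI v2 needs raw-in-base64-out for JSON payloads).
aws lambda invoke \
  --function-name llamaindex-ts-demo \
  --cli-binary-format raw-in-base64-out \
  --payload '{"question":"What does LlamaIndex do?"}' \
  response.json
cat response.json

# Through API Gateway (replace with your deployed endpoint URL).
curl -s -X POST "https://<api-id>.execute-api.<region>.amazonaws.com/prod" \
  -H "content-type: application/json" \
  -d '{"question":"What does LlamaIndex do?"}'
```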
Next Steps
- Replace in-memory documents with S3-loaded files using `@aws-sdk/client-s3`
- Add a persistent vector store like Pinecone or OpenSearch instead of rebuilding on every cold start
- Wrap this handler in API Gateway + Cognito if you need authenticated access
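As a sketch of the first next step, documents could be pulled from S3 at cold start. The bucket and key names below are placeholders, and the sketch assumes AWS SDK v3's `transformToString` body helper:

```typescript
// Sketch: build the index from S3 objects instead of in-memory text.
// Bucket/keys are hypothetical; the Lambda role needs s3:GetObject on them.
import { GetObjectCommand, S3Client } from "@aws-sdk/client-s3";
import { Document, VectorStoreIndex } from "llamaindex";

const s3 = new S3Client({});

async function loadDocumentFromS3(bucket: string, key: string): Promise<Document> {
  const res = await s3.send(new GetObjectCommand({ Bucket: bucket, Key: key }));
  const text = await res.Body!.transformToString(); // SDK v3 streaming-body helper
  return new Document({ text, id_: key });
}

async function buildIndexFromS3(): Promise<VectorStoreIndex> {
  const keys = ["docs/lambda.txt", "docs/llamaindex.txt"];
  const docs = await Promise.all(
    keys.map((key) => loadDocumentFromS3("my-docs-bucket", key))
  );
  return VectorStoreIndex.fromDocuments(docs);
}
```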
By Cyprian Aarons, AI Consultant at Topiax.