Haystack Tutorial (Python): deploying to AWS Lambda for intermediate developers

By Cyprian Aarons · Updated 2026-04-21

This tutorial shows how to package a Haystack pipeline into an AWS Lambda handler that can answer a question from a small document set. You’d use this when you want a lightweight retrieval app behind API Gateway without running a server 24/7.

What You'll Need

  • Python 3.10 or 3.11
  • An AWS account with permission to create:
    • Lambda
    • IAM roles
    • CloudWatch Logs
  • AWS CLI configured locally
  • pip and venv
  • Haystack 2.x with OpenAI support:
    • haystack-ai (the OpenAI generator components ship with the core package)
  • An OpenAI API key set as an environment variable
  • A deployment package or container image for Lambda

Step-by-Step

  1. Start by building the Haystack pipeline locally. Keep the document set tiny and deterministic so you can validate the Lambda behavior before introducing external storage or vector databases.
import os
from haystack import Document, Pipeline
from haystack.components.builders import PromptBuilder
from haystack.components.generators import OpenAIGenerator
from haystack.components.retrievers.in_memory import InMemoryBM25Retriever
from haystack.document_stores.in_memory import InMemoryDocumentStore

template = """
Answer the question using only the documents below.

Documents:
{% for doc in documents %}
- {{ doc.content }}
{% endfor %}

Question: {{ question }}
"""

document_store = InMemoryDocumentStore()
documents = [
    Document(content="AWS Lambda runs code without provisioning servers."),
    Document(content="Haystack is a framework for building LLM applications in Python."),
]
document_store.write_documents(documents)

pipeline = Pipeline()
pipeline.add_component("retriever", InMemoryBM25Retriever(document_store=document_store))
pipeline.add_component("prompt_builder", PromptBuilder(template=template))
# OpenAIGenerator expects a Secret, not a raw string; by default it
# reads the OPENAI_API_KEY environment variable itself.
pipeline.add_component("llm", OpenAIGenerator())

pipeline.connect("retriever.documents", "prompt_builder.documents")
pipeline.connect("prompt_builder.prompt", "llm.prompt")

result = pipeline.run(
    {
        "retriever": {"query": "What is AWS Lambda?"},
        "prompt_builder": {"question": "What is AWS Lambda?"},
    }
)
print(result["llm"]["replies"][0])
  2. Move that logic into a Lambda handler. The key point is to initialize the pipeline outside the handler so warm invocations reuse it instead of rebuilding everything on every request.
import json
import os
from haystack import Document, Pipeline
from haystack.components.builders import PromptBuilder
from haystack.components.generators import OpenAIGenerator
from haystack.components.retrievers.in_memory import InMemoryBM25Retriever
from haystack.document_stores.in_memory import InMemoryDocumentStore

template = """
Answer using only the documents below.

Documents:
{% for doc in documents %}
- {{ doc.content }}
{% endfor %}

Question: {{ question }}
"""

document_store = InMemoryDocumentStore()
document_store.write_documents([
    Document(content="AWS Lambda runs code without provisioning servers."),
    Document(content="Haystack builds search and LLM pipelines in Python."),
])

pipeline = Pipeline()
pipeline.add_component("retriever", InMemoryBM25Retriever(document_store=document_store))
pipeline.add_component("prompt_builder", PromptBuilder(template=template))
# As above: let OpenAIGenerator read OPENAI_API_KEY from the environment
pipeline.add_component("llm", OpenAIGenerator())
pipeline.connect("retriever.documents", "prompt_builder.documents")
pipeline.connect("prompt_builder.prompt", "llm.prompt")

def lambda_handler(event, context):
    body = json.loads(event.get("body") or "{}")
    question = body.get("question", "What is AWS Lambda?")
    result = pipeline.run({
        "retriever": {"query": question},
        "prompt_builder": {"question": question},
    })
    reply = result["llm"]["replies"][0]
    return {
        "statusCode": 200,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps({"answer": reply}),
    }
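Before deploying, you can exercise the handler's request and response plumbing locally without an OpenAI key by swapping in a stub pipeline. This is a sketch; `StubPipeline` is a hypothetical stand-in for testing, not a Haystack class:

```python
import json

class StubPipeline:
    """Hypothetical stand-in that mimics Pipeline.run()'s output shape,
    so the handler logic can be tested without Haystack or network calls."""
    def run(self, inputs):
        question = inputs["retriever"]["query"]
        return {"llm": {"replies": [f"stub answer to: {question}"]}}

pipeline = StubPipeline()

def lambda_handler(event, context):
    body = json.loads(event.get("body") or "{}")
    question = body.get("question", "What is AWS Lambda?")
    result = pipeline.run({
        "retriever": {"query": question},
        "prompt_builder": {"question": question},
    })
    return {
        "statusCode": 200,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps({"answer": result["llm"]["replies"][0]}),
    }

# Simulate an API Gateway proxy event
event = {"body": json.dumps({"question": "What is AWS Lambda?"})}
print(lambda_handler(event, None)["statusCode"])
```

If this round-trips a 200 with an answer that echoes your question, the event parsing and response shaping are correct, and any later failures are in packaging or the real pipeline.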
  3. Package the function with dependencies. For Lambda, avoid relying on your local site-packages; build a clean deployment directory so you know exactly what gets uploaded.
mkdir -p lambda_app
cd lambda_app

cat > app.py <<'PY'
# paste the lambda_handler code here as app.py
PY

python -m venv .venv
source .venv/bin/activate
pip install --upgrade pip

mkdir -p package
cp app.py package/
# Install dependencies directly into the package directory. On macOS or
# ARM hosts, pin the target platform so wheels match Lambda's x86_64
# Linux runtime (if a dependency ships no such wheel, build on Linux or
# in Docker instead):
pip install -t package \
  --platform manylinux2014_x86_64 --only-binary=:all: \
  haystack-ai openai

# app.py was already copied into package/, so a single zip pass suffices
cd package && zip -r ../haystack-lambda.zip . && cd ..
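Lambda expects app.py at the zip root, not nested inside a subdirectory, so it is worth confirming the archive layout before uploading. A small check with Python's standard zipfile module, assuming the zip was built as above:

```python
import zipfile

def check_bundle(path):
    """Return a list of problems with the deployment zip (empty = OK)."""
    names = zipfile.ZipFile(path).namelist()
    problems = []
    if "app.py" not in names:
        problems.append("app.py missing from zip root")
    if not any(n.startswith("haystack/") for n in names):
        problems.append("haystack package not bundled")
    return problems

# check_bundle("haystack-lambda.zip") should return [] after a good build
```

A non-empty result here is the usual root cause of Runtime.ImportModuleError later on.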
  4. Create the Lambda function and configure its environment. Use an IAM role with basic execution permissions, then inject your OpenAI key as an environment variable for the first version of this setup.
aws iam create-role \
  --role-name haystack-lambda-role \
  --assume-role-policy-document '{
    "Version":"2012-10-17",
    "Statement":[{"Effect":"Allow","Principal":{"Service":"lambda.amazonaws.com"},"Action":"sts:AssumeRole"}]
  }'

aws iam attach-role-policy \
  --role-name haystack-lambda-role \
  --policy-arn arn:aws:iam::aws:policy/service-role/AWSLambdaBasicExecutionRole

ROLE_ARN=$(aws iam get-role --role-name haystack-lambda-role --query 'Role.Arn' --output text)

aws lambda create-function \
  --function-name haystack-question-answering \
  --runtime python3.11 \
  --handler app.lambda_handler \
  --zip-file fileb://haystack-lambda.zip \
  --role "$ROLE_ARN" \
  --timeout 30 \
  --memory-size 512 \
  --environment Variables="{OPENAI_API_KEY=YOUR_OPENAI_KEY}"
  5. Invoke it with a JSON payload. This gives you a fast sanity check before wiring API Gateway or an ALB in front of it.
cat > event.json <<'JSON'
{
  "body": "{\"question\":\"What is AWS Lambda?\"}"
}
JSON

aws lambda invoke \
  --function-name haystack-question-answering \
  --payload fileb://event.json \
  response.json

cat response.json

Testing It

First verify that CloudWatch logs show the function cold-starting successfully and that no import errors appear for Haystack or OpenAI. Then confirm the response body contains an answer field with text grounded in your sample documents.

If you get ImportModuleError, your deployment zip usually missed dependencies or was built from the wrong directory. If you get timeouts, increase memory first; Lambda CPU scales with memory, and Haystack startup plus model calls can be slower than expected on small allocations.

For production, test both cold and warm invocations because initialization cost matters here. Also test failure cases like missing question, invalid JSON, and expired API keys so your API returns predictable errors instead of crashing.
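For the invalid-JSON and missing-question cases, one approach is a small validation gate that runs before the pipeline and returns a 400 immediately. This is a sketch; `validate_request` is a hypothetical helper, not part of Haystack or the AWS runtime:

```python
import json

def validate_request(event):
    """Return an error response for bad input, or None if the request
    is valid and the handler should proceed to run the pipeline."""
    try:
        body = json.loads(event.get("body") or "{}")
    except json.JSONDecodeError:
        return {"statusCode": 400,
                "body": json.dumps({"error": "body is not valid JSON"})}
    question = body.get("question")
    if not isinstance(question, str) or not question.strip():
        return {"statusCode": 400,
                "body": json.dumps({"error": "'question' is required"})}
    return None
```

Call it first inside lambda_handler and return its result whenever it is not None; the pipeline then only ever sees well-formed questions.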

Next Steps

  • Replace InMemoryDocumentStore with a real backend like OpenSearch or PostgreSQL once your document set grows.
  • Move secrets out of plain environment variables and into AWS Secrets Manager.
  • Add API Gateway and structured request validation so this can serve as a real HTTP endpoint.
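For the Secrets Manager bullet, the usual pattern is to fetch the key once per cold start and reuse it across warm invocations. A sketch with the client injected so it can be tested offline; the secret name and JSON layout are assumptions for illustration:

```python
import json

def load_openai_key(client, secret_id="haystack/openai"):
    # Fetch once at module import time and cache the result, rather
    # than calling Secrets Manager on every invocation (latency + cost).
    resp = client.get_secret_value(SecretId=secret_id)
    return json.loads(resp["SecretString"])["OPENAI_API_KEY"]

# In Lambda you would call: load_openai_key(boto3.client("secretsmanager"))
```

The execution role then needs secretsmanager:GetSecretValue on that secret in addition to the basic execution policy.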

Keep learning

By Cyprian Aarons, AI Consultant at Topiax.
