LlamaIndex Tutorial (Python): adding authentication for beginners

By Cyprian Aarons. Updated 2026-04-21

This tutorial shows how to add authentication to a Python app that uses LlamaIndex, so only approved users can access your index-backed chat or query endpoint. You need this when your LLM app is exposed over HTTP and you want to stop anonymous access before it reaches your data layer.

What You'll Need

•Python 3.10+
•llama-index
•fastapi
•uvicorn
•python-dotenv
•An OpenAI API key in OPENAI_API_KEY
•A shared bearer token for your app, stored as APP_API_KEY
•Basic familiarity with LlamaIndex indexes and query engines

Install the packages:

pip install llama-index fastapi uvicorn python-dotenv

Step-by-Step

•Create a .env file for secrets.

Keep credentials out of code, and keep .env out of version control. For a beginner setup, a single shared bearer token is enough to protect your endpoint.

OPENAI_API_KEY=sk-your-openai-key
APP_API_KEY=super-secret-demo-token
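
A guessable value like super-secret-demo-token is fine for following along, but anything reachable over a network should use a random token. A minimal sketch using Python's standard secrets module follows; the helper name generate_api_token is made up for this example and is not part of LlamaIndex or FastAPI.

```python
import secrets


def generate_api_token(num_bytes: int = 32) -> str:
    """Return a URL-safe random string to paste into .env as APP_API_KEY."""
    # token_urlsafe draws from the OS cryptographic random source
    return secrets.token_urlsafe(num_bytes)


if __name__ == "__main__":
    # Prints a ready-to-paste line for the .env file
    print(f"APP_API_KEY={generate_api_token()}")
```

Run it once, copy the printed line into .env, and use that same value in the Authorization header when testing.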

•Build a small LlamaIndex app.

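This example loads a couple of documents into a vector index and exposes a query engine. The important part is that the index exists behind the API, not in the client. Save this code, together with the snippets from the next two steps, in a single file named app.py so that the uvicorn app:app command used later can find it.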

from dotenv import load_dotenv
from llama_index.core import Document, VectorStoreIndex

# Read OPENAI_API_KEY and APP_API_KEY from .env into the process environment
load_dotenv()

docs = [
    Document(text="LlamaIndex helps connect LLMs to private data."),
    Document(text="Authentication should happen before any query reaches the index."),
]

# Building the index embeds the documents, which calls OpenAI with the default settings
index = VectorStoreIndex.from_documents(docs)
query_engine = index.as_query_engine()
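
Because the index is built as soon as the module is imported, a missing key tends to show up as an error from the OpenAI client rather than a clear message about configuration. One way to fail earlier is a small check like the sketch below. The helper name require_env is an assumption for illustration, not something LlamaIndex or python-dotenv provides.

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable, or raise if it is unset or blank."""
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(f"{name} is not set; add it to .env before starting the app")
    return value
```

Calling require_env("OPENAI_API_KEY") and require_env("APP_API_KEY") right after load_dotenv() would stop the app at startup with a readable error if either value is missing.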

•Add bearer-token authentication to FastAPI.

This dependency checks the Authorization header before the endpoint function runs. A missing or malformed header, or a token that does not match APP_API_KEY, is rejected with HTTP 401.

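import os
import secrets

from fastapi import FastAPI, Depends, HTTPException, status
from fastapi.security import HTTPBearer, HTTPAuthorizationCredentials

app = FastAPI()
# auto_error=False lets require_auth raise its own 401 instead of HTTPBearer's built-in error
security = HTTPBearer(auto_error=False)

def require_auth(
    credentials: HTTPAuthorizationCredentials | None = Depends(security),
) -> None:
    expected_token = os.getenv("APP_API_KEY")
    if not expected_token:
        # Server misconfiguration, not a client error
        raise HTTPException(status_code=500, detail="APP_API_KEY is not set")

    # credentials is None when the header is missing or does not use the Bearer scheme
    if credentials is None or credentials.scheme.lower() != "bearer":
        raise HTTPException(
            status_code=status.HTTP_401_UNAUTHORIZED,
            detail="Missing bearer token",
            headers={"WWW-Authenticate": "Bearer"},
        )

    # Constant-time comparison avoids leaking how much of the token matched
    if not secrets.compare_digest(
        credentials.credentials.encode(), expected_token.encode()
    ):
        raise HTTPException(
            status_code=status.HTTP_401_UNAUTHORIZED,
            detail="Invalid token",
            headers={"WWW-Authenticate": "Bearer"},
        )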
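
To see what the dependency is doing without any framework involved, the check can be reduced to two plain functions: one that pulls the token out of an Authorization header value, and one that compares it to the expected secret. This is only a sketch for understanding. The names parse_bearer and token_matches are made up here, and in the app itself HTTPBearer does the parsing for you.

```python
import hmac
from typing import Optional


def parse_bearer(header_value: Optional[str]) -> Optional[str]:
    """Extract the token from an 'Authorization: Bearer <token>' header value."""
    if not header_value:
        return None
    # Split on the first space: scheme on the left, credentials on the right
    scheme, _, token = header_value.partition(" ")
    token = token.strip()
    if scheme.lower() != "bearer" or not token:
        return None
    return token


def token_matches(presented: Optional[str], expected: str) -> bool:
    """Compare tokens in constant time so timing does not reveal partial matches."""
    if presented is None:
        return False
    return hmac.compare_digest(presented.encode("utf-8"), expected.encode("utf-8"))
```

A header such as "Basic abc123" or a bare "Bearer" yields None from parse_bearer, which corresponds to the "Missing bearer token" case. A well-formed header with the wrong value fails token_matches, which corresponds to "Invalid token".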

•Protect the query endpoint and return LlamaIndex output.

The endpoint now requires valid authentication before it can call query_engine.query(). That keeps unauthenticated users from hitting your model or your data source.

from pydantic import BaseModel

class QueryRequest(BaseModel):
    question: str

@app.post("/query")
def query_index(payload: QueryRequest, _: None = Depends(require_auth)):
    response = query_engine.query(payload.question)
    return {"answer": str(response)}

•Run the server and test both failure and success cases.

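Start the server in one terminal, then use curl from a second terminal to confirm the endpoint rejects a missing or wrong token and accepts the correct one. This is the fastest way to verify auth is actually enforced.

uvicorn app:app --reload

# No Authorization header: expect 401 "Missing bearer token"
curl -X POST "http://127.0.0.1:8000/query" \
  -H "Content-Type: application/json" \
  -d '{"question":"What does LlamaIndex do?"}'

# Wrong token: expect 401 "Invalid token"
curl -X POST "http://127.0.0.1:8000/query" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer wrong-token" \
  -d '{"question":"What does LlamaIndex do?"}'

# Correct token: expect 200 with an answer
curl -X POST "http://127.0.0.1:8000/query" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer super-secret-demo-token" \
  -d '{"question":"What does LlamaIndex do?"}'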
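
If you would rather script these checks, the same requests can be sent from Python's standard library. This is a minimal sketch, assuming the server from the previous steps is listening on 127.0.0.1:8000. The helper names build_query_request and send are invented for this example.

```python
import json
import urllib.error
import urllib.request
from typing import Optional, Tuple

BASE_URL = "http://127.0.0.1:8000"  # assumed local uvicorn address


def build_query_request(
    question: str, token: Optional[str] = None, base_url: str = BASE_URL
) -> urllib.request.Request:
    """Build a POST /query request, adding a bearer token only when one is given."""
    headers = {"Content-Type": "application/json"}
    if token is not None:
        headers["Authorization"] = f"Bearer {token}"
    body = json.dumps({"question": question}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/query", data=body, headers=headers, method="POST"
    )


def send(request: urllib.request.Request) -> Tuple[int, dict]:
    """Send the request and return (status code, parsed JSON body)."""
    try:
        with urllib.request.urlopen(request, timeout=30) as resp:
            return resp.status, json.loads(resp.read())
    except urllib.error.HTTPError as err:
        # urlopen raises on 4xx/5xx; the error object still carries the JSON body
        return err.code, json.loads(err.read())


# Example usage with the server running:
#   send(build_query_request("What does LlamaIndex do?"))
#   send(build_query_request("What does LlamaIndex do?", token="super-secret-demo-token"))
```

The first call should come back with status 401 and the second with status 200, matching the curl results.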

Testing It

First, send a request without the Authorization header. You should get a 401 Unauthorized response with "Missing bearer token". Then send a request with the wrong token and confirm you get "Invalid token".

After that, repeat the request with the correct bearer token from .env. If everything is wired correctly, FastAPI will pass control to the endpoint and you should get back an answer from LlamaIndex.

If you want one more check, add a temporary print() call inside require_auth() and inside the /query handler. The auth function should run on every request, and the handler should only run after auth passes.

Next Steps

•Replace the shared bearer token with real user auth using OAuth2 or JWTs.
•Add role-based access control so different users can query different indexes.
•Move secrets into a proper secret manager like AWS Secrets Manager or Azure Key Vault.
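
As a rough idea of where the role-based step could lead, the sketch below maps per-user tokens to roles and roles to the indexes they may query. Everything in it is hypothetical: the tokens, the role names, and the index names "policies" and "claims" are placeholders, and a real system would look these up from an identity provider or database rather than from dictionaries in code.

```python
import hmac
from typing import Optional

# Hypothetical per-user tokens and roles, for illustration only
TOKEN_ROLES = {
    "alice-demo-token": "admin",
    "bob-demo-token": "support",
}

# Hypothetical mapping from role to the index names that role may query
ROLE_INDEXES = {
    "admin": {"policies", "claims"},
    "support": {"policies"},
}


def role_for_token(presented: str) -> Optional[str]:
    """Return the role for a known token, or None if the token is not recognized."""
    matched = None
    for known, role in TOKEN_ROLES.items():
        # Check every entry with a constant-time comparison instead of a dict lookup
        if hmac.compare_digest(presented.encode("utf-8"), known.encode("utf-8")):
            matched = role
    return matched


def can_query(presented: str, index_name: str) -> bool:
    """True only when the token is known and its role is allowed to use the index."""
    role = role_for_token(presented)
    return role is not None and index_name in ROLE_INDEXES.get(role, set())
```

In the app, a check like can_query would sit inside the auth dependency or the endpoint, returning 403 when a valid user asks for an index their role does not cover.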

By Cyprian Aarons, AI Consultant at Topiax.
