LlamaIndex Tutorial (Python): adding authentication for intermediate developers

By Cyprian Aarons, updated 2026-04-21

This tutorial shows you how to put a simple authentication layer in front of a LlamaIndex-backed Python app, so only valid users can query your index. You need this when you expose an internal knowledge base, support assistant, or document search API and don’t want anonymous access hitting your data or burning your model budget.

What You'll Need

  • Python 3.10+
  • llama-index
  • fastapi
  • uvicorn
  • python-dotenv
  • An LLM API key for your chosen provider, such as:
    • OPENAI_API_KEY
  • A small document set to index, such as local .txt files
  • Basic familiarity with running a FastAPI app

Install the packages:

pip install llama-index fastapi uvicorn python-dotenv

Step-by-Step

  1. Start by loading your documents and building a small LlamaIndex index. This example uses local text files so you can run it without extra infrastructure.
from pathlib import Path

from llama_index.core import SimpleDirectoryReader, VectorStoreIndex

docs_dir = Path("data")
docs_dir.mkdir(exist_ok=True)

(docs_dir / "policy.txt").write_text(
    "Employees must use MFA for all internal systems.\n"
    "Sensitive customer data must not be copied into chat tools."
)

documents = SimpleDirectoryReader(input_dir=str(docs_dir)).load_data()
index = VectorStoreIndex.from_documents(documents)
query_engine = index.as_query_engine()
  2. Add a tiny authentication module that validates an API key from the request header. For production, swap this out for real identity checks against your SSO, JWT issuer, or API gateway.
import os
import secrets

from fastapi import Header, HTTPException

API_TOKEN = os.getenv("APP_API_TOKEN", "dev-token")

def require_auth(x_api_token: str | None = Header(default=None)) -> None:
    if x_api_token is None:
        raise HTTPException(status_code=401, detail="Missing X-API-Token header")

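    # Compare as bytes: compare_digest raises TypeError on non-ASCII str input,
    # which would turn a malformed header into a 500 instead of a 403.
    if not secrets.compare_digest(x_api_token.encode("utf-8"), API_TOKEN.encode("utf-8")):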
        raise HTTPException(status_code=403, detail="Invalid API token")
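The "dev-token" fallback above is convenient locally, but it also means a deployment that forgets to set APP_API_TOKEN silently accepts a well-known token. A minimal sketch of a fail-closed loader follows. The function name load_api_token and the MIN_TOKEN_LENGTH policy are hypothetical and not part of the original module.

```python
import os
from typing import Mapping

# Hypothetical policy for this sketch: reject very short tokens.
MIN_TOKEN_LENGTH = 16


def load_api_token(env: Mapping[str, str] = os.environ) -> str:
    """Return the configured API token, or raise instead of using a default."""
    token = env.get("APP_API_TOKEN", "").strip()
    if not token:
        raise RuntimeError("APP_API_TOKEN is not set; refusing to start")
    if len(token) < MIN_TOKEN_LENGTH:
        raise RuntimeError(
            f"APP_API_TOKEN must be at least {MIN_TOKEN_LENGTH} characters"
        )
    return token
```

Calling this once at import time, in place of the os.getenv call, would make a misconfigured service fail at startup rather than at the first request.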
  3. Expose the index through FastAPI and protect the query endpoint with that auth check. The key point is that authentication happens before any LlamaIndex call, so unauthorized requests never reach your retrieval layer.
from fastapi import Depends, FastAPI
from pydantic import BaseModel

app = FastAPI()

class QueryRequest(BaseModel):
    question: str

@app.post("/query")
def query_index(payload: QueryRequest, _: None = Depends(require_auth)):
    response = query_engine.query(payload.question)
    return {"answer": str(response)}
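The ordering claim (auth first, retrieval second) can be checked without FastAPI or a model key by swapping in a stub engine and counting calls. The sketch below is framework-free and illustrative only: AuthError, StubQueryEngine, and guarded_query are hypothetical names that mirror the 401/403 behavior of require_auth, not part of LlamaIndex or FastAPI.

```python
import secrets
from typing import List, Optional


class AuthError(Exception):
    """Hypothetical stand-in for HTTPException in this framework-free sketch."""

    def __init__(self, status_code: int, detail: str) -> None:
        super().__init__(detail)
        self.status_code = status_code
        self.detail = detail


class StubQueryEngine:
    """Records every question so a test can assert the engine was never hit."""

    def __init__(self) -> None:
        self.calls: List[str] = []

    def query(self, question: str) -> str:
        self.calls.append(question)
        return f"stub answer for: {question}"


def guarded_query(engine, question: str, presented: Optional[str], expected: str) -> dict:
    # The auth checks run first; engine.query is only reached on success.
    if presented is None:
        raise AuthError(401, "Missing X-API-Token header")
    if not secrets.compare_digest(presented.encode("utf-8"), expected.encode("utf-8")):
        raise AuthError(403, "Invalid API token")
    return {"answer": str(engine.query(question))}
```

If engine.calls stays empty after a rejected request, no retrieval or model work was done for that caller.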
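  4. Save the code from the previous steps as app.py, then run the service and test both the authorized and unauthorized paths. This gives you a quick sanity check that the auth gate is working before you wire in a real identity provider.
export APP_API_TOKEN="super-secret-token"
uvicorn app:app --reload
# In a second terminal: authorized request
curl -X POST "http://127.0.0.1:8000/query" \
  -H "Content-Type: application/json" \
  -H "X-API-Token: super-secret-token" \
  -d '{"question":"What does the policy say about MFA?"}'
# Unauthorized request (no token header); expect a 401
curl -i -X POST "http://127.0.0.1:8000/query" \
  -H "Content-Type: application/json" \
  -d '{"question":"What does the policy say about MFA?"}'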
  5. To keep auth separate from your business logic, attach the dependency to an APIRouter so it applies to every route registered on that router. That keeps your ingestion, retrieval, and admin routes consistent.
from fastapi import APIRouter

router = APIRouter(dependencies=[Depends(require_auth)])

@router.get("/health")
def health():
    return {"status": "ok"}

@router.post("/ask")
def ask(payload: QueryRequest):
    response = query_engine.query(payload.question)
    return {"answer": str(response)}

app.include_router(router)

Testing It

First, send a request without the X-API-Token header and confirm you get 401 Missing X-API-Token header. Then send one with the wrong token and confirm you get 403 Invalid API token.

After that, send a valid token and verify the response contains an answer generated from your indexed documents. If you see a normal answer but unauthorized requests still succeed, your dependency is not attached to the route you think it is.
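The three checks described above can also be scripted with the standard library. This is a rough sketch, assuming the service is running at the uvicorn default address used earlier. The helper names build_query_request, status_of, and run_smoke_test are made up for this example.

```python
import json
import urllib.error
import urllib.request
from typing import Dict, Optional

# Assumption: the uvicorn default host and port from the run step.
BASE_URL = "http://127.0.0.1:8000"


def build_query_request(
    question: str, token: Optional[str] = None, base_url: str = BASE_URL
) -> urllib.request.Request:
    """Build a POST /query request, adding X-API-Token only when a token is given."""
    headers = {"Content-Type": "application/json"}
    if token is not None:
        headers["X-API-Token"] = token
    body = json.dumps({"question": question}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/query", data=body, headers=headers, method="POST"
    )


def status_of(request: urllib.request.Request) -> int:
    """Send the request and return the HTTP status, including 4xx/5xx codes."""
    try:
        with urllib.request.urlopen(request, timeout=30) as response:
            return response.status
    except urllib.error.HTTPError as exc:
        return exc.code


def run_smoke_test(valid_token: str) -> Dict[str, int]:
    """Return the status code for the missing, wrong, and valid token cases."""
    question = "What does the policy say about MFA?"
    return {
        "missing": status_of(build_query_request(question)),
        "wrong": status_of(build_query_request(question, token="not-the-token")),
        "valid": status_of(build_query_request(question, token=valid_token)),
    }
```

With the service from this tutorial running, run_smoke_test("super-secret-token") should report 401 for the missing case and 403 for the wrong one. The valid case should report 200, provided the model call itself succeeds.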

For a production check, log failed auth attempts and rate-limit repeated failures at the edge or gateway layer.
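As a rough illustration of that idea, the sketch below logs each failure and blocks a client after too many failures inside a sliding window. It is in-memory and single-process, so it is a stand-in for a real gateway or shared store rather than a replacement. FailedAuthTracker and its parameters are hypothetical names.

```python
import logging
import time
from collections import defaultdict, deque
from typing import Callable, Deque, Dict

logger = logging.getLogger("auth")


class FailedAuthTracker:
    """Sliding-window counter of failed auth attempts per client identifier."""

    def __init__(
        self,
        max_failures: int = 5,
        window_seconds: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.max_failures = max_failures
        self.window_seconds = window_seconds
        self._clock = clock  # injectable so tests can control time
        self._failures: Dict[str, Deque[float]] = defaultdict(deque)

    def _recent(self, client_id: str) -> Deque[float]:
        # Drop timestamps that have aged out of the window.
        now = self._clock()
        timestamps = self._failures[client_id]
        while timestamps and now - timestamps[0] >= self.window_seconds:
            timestamps.popleft()
        return timestamps

    def record_failure(self, client_id: str) -> None:
        self._recent(client_id).append(self._clock())
        logger.warning("auth failure from %s", client_id)

    def is_blocked(self, client_id: str) -> bool:
        return len(self._recent(client_id)) >= self.max_failures
```

In the FastAPI app, the client identifier might be the caller's IP address, with record_failure called just before raising the 401 or 403. Those wiring details are assumptions and are not shown here.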

Next Steps

  • Replace the static header token with JWT validation against your identity provider.
  • Add role-based access control so different users can query different indexes or namespaces.
  • Put authentication in front of streaming endpoints too, if you stream query engine output over server-sent events or WebSockets.
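For the role-based access item, one possible shape is a plain mapping from roles to the index names they may query, checked before choosing a query engine. The role names, index names, and helper functions below are all hypothetical. In a real app the roles would come from verified claims, not from the request body.

```python
from typing import Dict, FrozenSet, Iterable

# Hypothetical role-to-index policy for illustration only.
ROLE_INDEXES: Dict[str, FrozenSet[str]] = {
    "support": frozenset({"faq"}),
    "hr": frozenset({"faq", "policies"}),
    "admin": frozenset({"faq", "policies", "finance"}),
}


def allowed_indexes(roles: Iterable[str]) -> FrozenSet[str]:
    """Union of index names granted by the given roles; unknown roles grant nothing."""
    granted: FrozenSet[str] = frozenset()
    for role in roles:
        granted = granted | ROLE_INDEXES.get(role, frozenset())
    return granted


def can_query(roles: Iterable[str], index_name: str) -> bool:
    """True only if at least one of the caller's roles grants this index."""
    return index_name in allowed_indexes(roles)
```

An endpoint could call can_query before looking up the matching query engine and return a 403 when it is False, so denied callers never trigger retrieval.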

By Cyprian Aarons, AI Consultant at Topiax.
