LlamaIndex Tutorial (Python): adding authentication for advanced developers

By Cyprian Aarons
Updated 2026-04-21

This tutorial shows how to add authentication to a Python app built with LlamaIndex so only approved users can query your index and tool layer. You need this when your agent is exposed through an API, sits behind a web app, or handles internal data that should not be accessible to everyone.

What You'll Need

  • Python 3.10+
  • llama-index
  • fastapi
  • uvicorn
  • python-jose[cryptography]
  • passlib[bcrypt]
  • An OpenAI API key in OPENAI_API_KEY
  • A working LlamaIndex knowledge source, such as local documents or a vector store
  • Basic familiarity with FastAPI request handling and dependency injection

Step-by-Step

  1. Start by installing the packages and setting up your environment variables. The example below uses JWT-based auth, a common default for an API-backed agent. The extras are quoted so that shells such as zsh do not treat the square brackets as glob patterns.
pip install llama-index fastapi uvicorn "python-jose[cryptography]" "passlib[bcrypt]"
export OPENAI_API_KEY="your-openai-key"
export JWT_SECRET_KEY="change-this-to-a-long-random-secret"
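The JWT_SECRET_KEY placeholder above should be replaced with a real random value. One way to produce it is sketched here with Python's standard secrets module. The function name generate_jwt_secret and the 64-byte size are choices made for this example, not requirements of python-jose.

```python
import secrets


def generate_jwt_secret(num_bytes: int = 64) -> str:
    """Return a URL-safe random string built from num_bytes of OS randomness."""
    return secrets.token_urlsafe(num_bytes)


if __name__ == "__main__":
    # Print the value once, then store it in your secret manager or shell profile.
    print(generate_jwt_secret())
```

A one-line equivalent for the shell would be export JWT_SECRET_KEY="$(python -c 'import secrets; print(secrets.token_urlsafe(64))')".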
  2. Build a small index from local documents. This keeps the tutorial self-contained and gives you something real to protect behind authentication.
from pathlib import Path

from llama_index.core import SimpleDirectoryReader, VectorStoreIndex

docs_dir = Path("data")
docs_dir.mkdir(exist_ok=True)
(docs_dir / "policy.txt").write_text(
    "Only employees with valid access can view customer policy details."
)

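# Load every file under data/, then build the index. With default settings the
# index build calls OpenAI for embeddings, so OPENAI_API_KEY must already be set.
documents = SimpleDirectoryReader(input_dir="data").load_data()
index = VectorStoreIndex.from_documents(documents)
query_engine = index.as_query_engine()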
  3. Add user authentication with hashed passwords and signed access tokens. In production, swap the in-memory user store for your identity provider or database, but keep the same pattern.
import os
from datetime import datetime, timedelta, timezone

from jose import jwt
from passlib.context import CryptContext

SECRET_KEY = os.environ["JWT_SECRET_KEY"]
ALGORITHM = "HS256"
pwd_context = CryptContext(schemes=["bcrypt"], deprecated="auto")

fake_users_db = {
    "alice": {
        "username": "alice",
        "hashed_password": pwd_context.hash("wonderland123"),
        "role": "analyst",
    }
}

def verify_password(plain_password: str, hashed_password: str) -> bool:
    return pwd_context.verify(plain_password, hashed_password)

def authenticate_user(username: str, password: str):
    user = fake_users_db.get(username)
    if not user or not verify_password(password, user["hashed_password"]):
        return None
    return user

def create_access_token(data: dict, expires_minutes: int = 30) -> str:
    payload = data.copy()
    payload["exp"] = datetime.now(timezone.utc) + timedelta(minutes=expires_minutes)
    return jwt.encode(payload, SECRET_KEY, algorithm=ALGORITHM)
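To make "signed access tokens" concrete, here is a minimal stdlib-only sketch of what HS256 signing and verification amount to. It is an illustration, not a replacement for python-jose: the helper names sign_hs256 and verify_hs256 are made up for this example, and it skips checks a real library performs, such as validating the header's alg field.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional


def _b64url(data: bytes) -> str:
    # JWTs use URL-safe base64 with the trailing "=" padding removed.
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    # Restore the padding that _b64url stripped before decoding.
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def _hmac_sha256(secret: str, message: str) -> bytes:
    return hmac.new(secret.encode("utf-8"), message.encode("utf-8"), hashlib.sha256).digest()


def sign_hs256(payload: dict, secret: str) -> str:
    """Build header.payload.signature, signed with HMAC-SHA256."""
    header = {"alg": "HS256", "typ": "JWT"}
    signing_input = ".".join(
        _b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, payload)
    )
    return f"{signing_input}.{_b64url(_hmac_sha256(secret, signing_input))}"


def verify_hs256(token: str, secret: str, now: Optional[float] = None) -> dict:
    """Return the payload if the signature matches and exp is still in the future."""
    try:
        header_b64, payload_b64, signature_b64 = token.split(".")
    except ValueError:
        raise ValueError("malformed token") from None
    expected = _hmac_sha256(secret, f"{header_b64}.{payload_b64}")
    # compare_digest runs in constant time, so it does not leak how many bytes matched.
    if not hmac.compare_digest(expected, _b64url_decode(signature_b64)):
        raise ValueError("bad signature")
    payload = json.loads(_b64url_decode(payload_b64))
    current = time.time() if now is None else now
    if "exp" in payload and payload["exp"] <= current:
        raise ValueError("token expired")
    return payload


if __name__ == "__main__":
    token = sign_hs256({"sub": "alice", "exp": int(time.time()) + 1800}, "demo-secret")
    print(verify_hs256(token, "demo-secret")["sub"])  # prints alice
```

The point to take away is that the payload is only encoded, not encrypted: anyone holding the token can read the claims, but only a holder of the secret can produce a signature that verifies.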
  4. Expose the LlamaIndex query engine through FastAPI and protect it with a bearer token dependency. The important part is that the query path never touches the index unless the token has been validated first.
from fastapi import Depends, FastAPI, HTTPException
from fastapi.security import HTTPAuthorizationCredentials, HTTPBearer
from jose import JWTError

app = FastAPI()
bearer_scheme = HTTPBearer()

def get_current_user(credentials: HTTPAuthorizationCredentials = Depends(bearer_scheme)):
    # Verify the signature and expiry; any decoding failure maps to a 401.
    try:
        payload = jwt.decode(credentials.credentials, SECRET_KEY, algorithms=[ALGORITHM])
    except JWTError:
        raise HTTPException(status_code=401, detail="Invalid token") from None
    # The "sub" claim must name a user that still exists in the store.
    username = payload.get("sub")
    if not username or username not in fake_users_db:
        raise HTTPException(status_code=401, detail="Invalid token")
    return fake_users_db[username]

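# Demo only: plain str parameters are read from the query string, so the password
# ends up in the URL and may be written to access logs. In production, accept the
# credentials in the request body instead.
@app.post("/token")
def login(username: str, password: str):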
    user = authenticate_user(username, password)
    if not user:
        raise HTTPException(status_code=401, detail="Bad credentials")
    access_token = create_access_token({"sub": user["username"], "role": user["role"]})
    return {"access_token": access_token, "token_type": "bearer"}

@app.get("/query")
def query(q: str, current_user=Depends(get_current_user)):
    response = query_engine.query(q)
    return {"user": current_user["username"], "answer": str(response)}
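  5. Run the API and test both the authenticated and unauthenticated paths. The uvicorn command below expects the code from steps 2 through 4 to live in a single file named app.py. Use one request to get a token, then reuse that token against the protected query endpoint.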
uvicorn app:app --reload
curl -X POST \
  'http://127.0.0.1:8000/token?username=alice&password=wonderland123'
curl -H "Authorization: Bearer <PASTE_TOKEN_HERE>" \
  'http://127.0.0.1:8000/query?q=Who%20can%20view%20customer%20policy%20details%3F'

Testing It

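First hit /query without an Authorization header. FastAPI’s bearer security layer rejects that request before get_current_user runs, with a 403 or a 401 depending on the FastAPI version you have installed. A request that does carry a bearer token but fails validation (malformed, expired, or wrongly signed) gets the 401 from your token validation path. Then call /token with valid credentials and confirm you receive a signed JWT.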

Use that JWT on /query and verify that the response includes both the authenticated username and an answer generated from your LlamaIndex data. If you want to harden this further, try an expired token or a wrong signature and confirm the request is rejected before any index lookup happens.
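One way to check the "rejected before any index lookup" property is to swap in a stub engine that counts calls. The sketch below is framework-free and every name in it (CountingEngine, handle_query, reject_all) is hypothetical. In the real app you would more likely replace query_engine with such a stub and drive the endpoint through FastAPI's test client.

```python
class CountingEngine:
    """Stand-in for the query engine that records how often it is called."""

    def __init__(self) -> None:
        self.calls = 0

    def query(self, q: str) -> str:
        self.calls += 1
        return f"stub answer for: {q}"


def handle_query(token: str, q: str, validate, engine) -> dict:
    """Mirror the /query flow: validate the token first, then touch the engine."""
    user = validate(token)  # raises on a bad token, so the next line never runs
    return {"user": user["username"], "answer": engine.query(q)}


def reject_all(token: str) -> dict:
    # Hypothetical validator that treats every token as invalid.
    raise PermissionError("invalid token")


if __name__ == "__main__":
    engine = CountingEngine()
    try:
        handle_query("expired-or-forged", "Who can view policies?", reject_all, engine)
    except PermissionError:
        pass
    print(engine.calls)  # prints 0: the engine was never reached
```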

Next Steps

  • Replace the fake user store with OAuth2/OIDC against your organization’s identity provider
  • Add role-based authorization so different users can query different indexes or tools
  • Move from simple bearer auth to mTLS or signed service-to-service tokens for internal agent traffic
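
As a starting point for the role-based authorization item, here is a minimal sketch of a role check that works on the same user dictionaries as fake_users_db. The ROLE_INDEXES mapping, the index names "policies" and "claims", and the Forbidden exception are all assumptions made for this example.

```python
# Hypothetical mapping from a role to the index names that role may query.
ROLE_INDEXES = {
    "analyst": {"policies"},
    "admin": {"policies", "claims"},
}


class Forbidden(Exception):
    """Raised when an authenticated user lacks access to a resource."""


def authorize_index(user: dict, index_name: str) -> None:
    """Raise Forbidden unless the user's role is allowed to query index_name."""
    role = user.get("role")
    # Unknown or missing roles fall back to an empty set, so access is denied by default.
    allowed = ROLE_INDEXES.get(role, set())
    if index_name not in allowed:
        raise Forbidden(f"role {role!r} cannot query {index_name!r}")
```

Inside the /query handler you could call authorize_index(current_user, ...) before touching the query engine and translate Forbidden into an HTTPException with status code 403, which keeps authentication failures (401) distinct from authorization failures (403).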

By Cyprian Aarons, AI Consultant at Topiax.
