LlamaIndex Tutorial (Python): implementing guardrails for beginners
This tutorial shows you how to add guardrails around a LlamaIndex-powered Python agent so it only answers within allowed topics, refuses risky prompts, and returns a safe fallback when the input is out of bounds. You need this when you’re building assistants for banking, insurance, or any internal workflow where “just let the model answer” is not acceptable.
What You'll Need
- •Python 3.10+
- •An OpenAI API key set as OPENAI_API_KEY
- •llama-index installed
- •llama-index-llms-openai installed
- •Basic familiarity with QueryEngine, ChatEngine, or agents in LlamaIndex
- •A clear policy for what should be allowed vs blocked
Install the packages:
pip install llama-index llama-index-llms-openai
Step-by-Step
- •Start with a small policy layer. For beginners, the simplest guardrail is a deterministic classifier that checks whether a user request matches allowed business topics before it reaches the LLM.
from dataclasses import dataclass

# Keyword allowlist for supported business topics. Singular forms are used so
# substring matching also covers plurals (e.g. "claim" matches "claims").
ALLOWED_TOPICS = {
    "claim",
    "policy",
    "premium",
    "coverage",
    "billing",
    "account",
}

@dataclass
class GuardrailResult:
    allowed: bool
    reason: str

def basic_topic_guardrail(user_input: str) -> GuardrailResult:
    """Allow a request only if it mentions at least one supported topic."""
    text = user_input.lower()
    if any(topic in text for topic in ALLOWED_TOPICS):
        return GuardrailResult(True, "Topic allowed")
    return GuardrailResult(False, "Outside supported scope")
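You can sanity-check the guardrail on its own before wiring it to anything else; the expected reprs follow directly from the dataclass above:

print(basic_topic_guardrail("How do I file a claim?"))
# GuardrailResult(allowed=True, reason='Topic allowed')
print(basic_topic_guardrail("Write me a poem"))
# GuardrailResult(allowed=False, reason='Outside supported scope')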
- •Build your LlamaIndex query engine as usual. The key is that the guardrail sits in front of the engine, not inside it, so you can block bad requests before they consume tokens or produce unsafe output.
from llama_index.core import VectorStoreIndex, Document
from llama_index.llms.openai import OpenAI

# The OpenAI client reads OPENAI_API_KEY from the environment.
docs = [
    Document(text="Policy renewals are processed 30 days before expiration."),
    Document(text="Claims require a claim number and incident date."),
    Document(text="Billing questions are handled by the finance team."),
]

llm = OpenAI(model="gpt-4o-mini", temperature=0)
index = VectorStoreIndex.from_documents(docs)
query_engine = index.as_query_engine(llm=llm)
- •Wrap the query engine with a guard function. This pattern gives you one place to enforce policy and return a safe response when the request fails validation.
def guarded_query(user_input: str) -> str:
    check = basic_topic_guardrail(user_input)
    if not check.allowed:
        return (
            "I can only help with claims, policy, premium, coverage, billing, "
            "and account questions."
        )
    response = query_engine.query(user_input)
    return str(response)
print(guarded_query("How do I file a claim?"))
print(guarded_query("Help me write malware"))
- •Add a second guardrail for output validation. Input filtering is not enough; you also want to make sure the model doesn’t drift into unsupported advice or expose content you don’t want surfaced.
BLOCKED_OUTPUT_PHRASES = [
    "guaranteed approval",
    "ignore previous instructions",
    "send me your password",
]

def validate_output(text: str) -> bool:
    lowered = text.lower()
    return not any(phrase in lowered for phrase in BLOCKED_OUTPUT_PHRASES)

def guarded_query_with_output_check(user_input: str) -> str:
    check = basic_topic_guardrail(user_input)
    if not check.allowed:
        return "Request blocked by input policy."
    response_text = str(query_engine.query(user_input))
    if not validate_output(response_text):
        return "The generated answer did not pass safety checks."
    return response_text
- •Make the fallback behavior explicit and predictable. In production systems, your guardrails should fail closed: if anything looks suspicious or malformed, return a controlled message instead of trying to be clever.
def safe_answer(user_input: str) -> dict:
    try:
        result = basic_topic_guardrail(user_input)
        if not result.allowed:
            return {
                "allowed": False,
                "answer": None,
                "reason": result.reason,
            }
        answer = str(query_engine.query(user_input))
        if not validate_output(answer):
            return {
                "allowed": False,
                "answer": None,
                "reason": "Output validation failed",
            }
        return {
            "allowed": True,
            "answer": answer,
            "reason": result.reason,
        }
    except Exception as e:
        return {
            "allowed": False,
            "answer": None,
            "reason": f"Guarded failure: {e}",
        }
Testing It
Run three types of prompts: one clearly allowed, one clearly disallowed, and one ambiguous. For example, test "What does my policy cover?", "How do I exploit an account?", and "Tell me about insurance". With the keyword guardrail above, the first passes (it contains "policy"), but the second also slips through the input check because it contains "account", and the third is blocked even though it is arguably in scope. That is exactly what this kind of test surfaces: substring matching is only a starting point, and the LLM-based moderation step in Next Steps exists to close these gaps.
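A minimal sketch of that check, reusing safe_answer from above (the comments describe what the keyword set implies for each prompt; the allowed ones will actually call the LLM):

test_prompts = [
    "What does my policy cover?",    # in scope, matches "policy"
    "How do I exploit an account?",  # passes the keyword check via "account"
    "Tell me about insurance",       # blocked: no keyword match
]

for prompt in test_prompts:
    result = safe_answer(prompt)
    print(prompt, "->", result["allowed"], "|", result["reason"])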
Check that blocked inputs never reach query_engine.query(). If you want to verify that directly, add logging around the guard function and confirm that disallowed requests stop before LLM invocation.
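One way to verify it is a thin logging wrapper around the guard; this is a sketch using the standard logging module, and the wrapper name is illustrative:

import logging

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("guardrails")

def guarded_query_with_logging(user_input: str) -> str:
    check = basic_topic_guardrail(user_input)
    if not check.allowed:
        # The blocked path returns before the query engine (and the LLM) is touched.
        logger.info("BLOCKED before LLM call: %r (%s)", user_input, check.reason)
        return "Request blocked by input policy."
    logger.info("ALLOWED, forwarding to query engine: %r", user_input)
    return str(query_engine.query(user_input))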
Also inspect the returned shape from safe_answer(). In production, consistent response objects make it easier for downstream services to handle accepted vs rejected requests without special-case parsing.
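For instance, a downstream handler only ever needs to branch on the allowed flag. The handler below is a hypothetical example of a consumer, not part of LlamaIndex:

def handle_request(user_input: str) -> str:
    result = safe_answer(user_input)
    if result["allowed"]:
        return result["answer"]
    # Every rejected request looks the same to the caller: no answer, plus a reason.
    return f"Sorry, I can't help with that ({result['reason']})."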
Next Steps
- •Add an LLM-based moderation step using structured outputs for more flexible classification (see the sketch after this list).
- •Replace keyword matching with a retrieval-backed policy document and evaluate against real examples.
- •Add observability: log blocked prompts, reasons, and false positives so you can tune the guardrails over time.
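For the first item, one possible shape is an LLM-based input classifier that returns a structured verdict instead of a keyword match. This is a sketch, assuming a recent llama-index release that exposes structured_predict on the LLM class (check your version's docs) and reusing the llm object from earlier; ModerationVerdict and the prompt wording are illustrative, not a LlamaIndex API:

from pydantic import BaseModel
from llama_index.core import PromptTemplate

class ModerationVerdict(BaseModel):
    allowed: bool
    reason: str

MODERATION_PROMPT = PromptTemplate(
    "You are a guardrail for an insurance assistant. Decide whether the request "
    "below is about claims, policies, premiums, coverage, billing, or accounts, "
    "and is safe to answer.\n\nRequest: {user_input}"
)

def llm_moderation_guardrail(user_input: str) -> ModerationVerdict:
    # structured_predict asks the model to fill the Pydantic schema directly.
    return llm.structured_predict(
        ModerationVerdict,
        MODERATION_PROMPT,
        user_input=user_input,
    )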
Keep learning
- •The complete AI Agents Roadmap — my full 8-step breakdown
- •Free: The AI Agent Starter Kit — PDF checklist + starter code
- •Work with me — I build AI for banks and insurance companies
By Cyprian Aarons, AI Consultant at Topiax.
Want the complete 8-step roadmap?
Grab the free AI Agent Starter Kit — architecture templates, compliance checklists, and a 7-email deep-dive course.
Get the Starter Kit