LlamaIndex Tutorial (Python): building prompt templates for beginners

By Cyprian Aarons · Updated 2026-04-21

This tutorial shows you how to build and use prompt templates in LlamaIndex with Python, from a basic string template to a reusable chat prompt. You need this when you want consistent LLM behavior, cleaner prompts, and a single place to manage wording across your app.

What You'll Need

  • Python 3.10+
  • llama-index installed
  • An OpenAI API key
  • Basic familiarity with Settings, VectorStoreIndex, and QueryEngine
  • A .env file or shell environment variable for OPENAI_API_KEY

Install the package:

pip install llama-index

Set your API key:

export OPENAI_API_KEY="your-api-key-here"
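A missing key usually surfaces later as an opaque authentication error, so it can help to fail fast at startup. A minimal stdlib-only check (the require_api_key helper is illustrative, not part of LlamaIndex):

```python
import os

def require_api_key(name: str = "OPENAI_API_KEY") -> str:
    """Return the API key from the environment, or raise a clear error."""
    key = os.environ.get(name)
    if not key:
        raise RuntimeError(
            f"{name} is not set. Export it in the same shell that runs Python."
        )
    return key
```

Call require_api_key() once at the top of your script before building the index.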

Step-by-Step

  1. Start with a simple document index so you have something to query.
    We’ll use a tiny in-memory dataset because the point here is the prompt template, not retrieval plumbing.
from llama_index.core import VectorStoreIndex, Document, Settings
from llama_index.llms.openai import OpenAI

Settings.llm = OpenAI(model="gpt-4o-mini")

documents = [
    Document(text="LlamaIndex helps connect data sources to LLMs."),
    Document(text="Prompt templates make output format consistent."),
]

index = VectorStoreIndex.from_documents(documents)
query_engine = index.as_query_engine()
  2. Build a basic prompt template for the query engine.
    PromptTemplate is the simplest way to control how the retrieved context and question are passed into the model.
from llama_index.core.prompts import PromptTemplate

custom_prompt = PromptTemplate(
    "You are a helpful assistant.\n"
    "Use only the context below.\n\n"
    "Context:\n{context_str}\n\n"
    "Question: {query_str}\n"
    "Answer in one short paragraph."
)
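PromptTemplate placeholders follow Python's str.format conventions, so you can preview how the variables get filled without making an LLM call. A plain-string sketch of the same substitution (sample values are made up for illustration):

```python
# The same template text as above, as a plain Python string.
template_text = (
    "You are a helpful assistant.\n"
    "Use only the context below.\n\n"
    "Context:\n{context_str}\n\n"
    "Question: {query_str}\n"
    "Answer in one short paragraph."
)

# Fill the placeholders exactly the way the query engine will.
filled = template_text.format(
    context_str="LlamaIndex helps connect data sources to LLMs.",
    query_str="What does LlamaIndex help with?",
)
print(filled)
```

Printing the filled prompt is a quick sanity check that both placeholders are spelled correctly before you wire the template in.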
  3. Attach the template to the query engine and run a test query.
    This is where the template becomes real: LlamaIndex will inject retrieved context into {context_str} and your user question into {query_str}.
query_engine.update_prompts(
    {"response_synthesizer:text_qa_template": custom_prompt}
)

response = query_engine.query("What does LlamaIndex help with?")
print(response)
  4. Use a richer template when you need structured output.
    For beginner-friendly apps, it’s common to ask for bullet points or JSON-like formatting so downstream code can parse results more reliably.
structured_prompt = PromptTemplate(
    "You are an assistant for beginners.\n"
    "Explain the answer using exactly 3 bullet points.\n"
    "Each bullet should be short and clear.\n\n"
    "Context:\n{context_str}\n\n"
    "Question: {query_str}"
)

query_engine.update_prompts(
    {"response_synthesizer:text_qa_template": structured_prompt}
)

response = query_engine.query("What are prompt templates used for?")
print(response)
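If downstream code needs those bullets, a small parser is usually enough. This parse_bullets helper is an illustrative sketch, not a LlamaIndex API; it assumes the model prefixes each bullet with "-", "•", or "*":

```python
def parse_bullets(text: str) -> list[str]:
    """Extract bullet lines from model output, stripping the leading marker."""
    bullets = []
    for line in text.splitlines():
        stripped = line.strip()
        if stripped.startswith(("-", "•", "*")):
            bullets.append(stripped.lstrip("-•* ").strip())
    return bullets

sample = "- Keep prompts in one place\n- Standardize output\n- Ease testing"
print(parse_bullets(sample))
# → ['Keep prompts in one place', 'Standardize output', 'Ease testing']
```

Since the model may still deviate from the requested format, validate the parsed list (for example, check that it has exactly 3 items) before relying on it.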
  5. Create a chat-style prompt template for conversational workflows.
    If you’re building an agent or multi-turn assistant, ChatPromptTemplate gives you better control over system messages and role separation.
from llama_index.core.llms import ChatMessage, MessageRole
from llama_index.core.prompts import ChatPromptTemplate

chat_template = ChatPromptTemplate(
    message_templates=[
        ChatMessage(role=MessageRole.SYSTEM, content=(
            "You are a patient Python tutor."
        )),
        ChatMessage(role=MessageRole.USER, content=(
            "Use this context:\n{context_str}\n\n"
            "Answer this question:\n{query_str}"
        )),
    ]
)

print(chat_template.format_messages(
    context_str="Prompt templates standardize model input.",
    query_str="Why should I use them?"
))

Testing It

Run each code block in order and confirm that response prints an answer grounded in your sample documents. If you get an API or authentication error, check that OPENAI_API_KEY is set in the same shell session where Python runs.

To verify the prompt is actually being used, change the instruction text from “one short paragraph” to “exactly 3 bullet points” and compare outputs. You should see formatting changes immediately if the custom template is wired correctly.

If you want deeper validation, print the formatted messages from ChatPromptTemplate before sending them to an LLM. That lets you inspect whether variables like {context_str} and {query_str} are being filled correctly.

Next Steps

  • Learn how to build separate templates for retrieval, synthesis, and routing in LlamaIndex
  • Add prompt variables for tone, audience level, and output format
  • Move your templates into versioned constants so product teams can update wording without touching business logic
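The "prompt variables" idea from the list above can be sketched with plain format-style placeholders plus a defaults dict; LlamaIndex's PromptTemplate offers a similar partial_format mechanism for pre-filling variables. The TEMPLATE, DEFAULTS, and render names below are illustrative, not library APIs:

```python
# A template with extra knobs beyond context and question.
TEMPLATE = (
    "Answer for a {audience} audience in a {tone} tone.\n\n"
    "Context:\n{context_str}\n\n"
    "Question: {query_str}"
)

DEFAULTS = {"audience": "beginner", "tone": "friendly"}

def render(query_str: str, context_str: str, **overrides) -> str:
    """Fill the template, letting callers override tone/audience per call."""
    variables = {**DEFAULTS, "context_str": context_str,
                 "query_str": query_str, **overrides}
    return TEMPLATE.format(**variables)

print(render("Why use templates?", "Templates standardize input.",
             audience="expert"))
```

Keeping the defaults in one dict gives product teams a single place to tune tone and audience without touching the call sites.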

By Cyprian Aarons, AI Consultant at Topiax.
