CrewAI Tutorial (Python): building custom tools for advanced developers
This tutorial shows you how to build a custom CrewAI tool in Python, register it with an agent, and use it in a real crew workflow. You need this when the built-in tools are not enough and you want your agent to call internal APIs, validate business rules, or interact with systems your team actually uses.
What You'll Need
- Python 3.10+
- `crewai`
- `crewai-tools`
- `pydantic`
- An OpenAI API key set as `OPENAI_API_KEY`
- A working internet connection for the model call
- Basic familiarity with CrewAI agents, tasks, and crews
Install the packages:
```bash
pip install crewai crewai-tools pydantic
```
Set your API key:
```bash
export OPENAI_API_KEY="your-api-key"
```
Step-by-Step
1. Start by creating a custom tool class.

For advanced use cases, subclass `BaseTool` so you can control input validation and return structured output.
```python
from typing import Type

from pydantic import BaseModel, Field
from crewai_tools import BaseTool


class PolicyLookupInput(BaseModel):
    policy_id: str = Field(..., description="The insurance policy ID")


class PolicyLookupTool(BaseTool):
    name: str = "policy_lookup"
    description: str = "Look up policy status by policy ID."
    args_schema: Type[BaseModel] = PolicyLookupInput

    def _run(self, policy_id: str) -> str:
        mock_db = {
            "POL-1001": "Active",
            "POL-1002": "Lapsed",
            "POL-1003": "Pending underwriting",
        }
        return mock_db.get(policy_id, f"Policy {policy_id} not found")
```
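Before wiring this into an agent, you can sanity-check the logic directly. Since `_run` is the method you just implemented, calling it by hand exercises only your code, no LLM involved:

```python
# Quick manual check of the tool logic, independent of any model call.
tool = PolicyLookupTool()
print(tool._run(policy_id="POL-1001"))  # Active
print(tool._run(policy_id="POL-9999"))  # Policy POL-9999 not found
```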
2. Add a second tool that does something different.

In production, this could be a pricing calculator, claims validator, or internal CRM lookup. Here we keep it deterministic so you can test the wiring cleanly.
```python
class PremiumEstimateInput(BaseModel):
    age: int = Field(..., ge=18, le=100)
    smoker: bool = Field(...)


class PremiumEstimateTool(BaseTool):
    name: str = "premium_estimate"
    description: str = "Estimate monthly premium based on age and smoking status."
    args_schema: Type[BaseModel] = PremiumEstimateInput

    def _run(self, age: int, smoker: bool) -> str:
        base = 120 + (age - 18) * 2
        if smoker:
            base += 75
        return f"Estimated premium: ${base:.2f}/month"
```
3. Create an agent and attach both tools.

The key point is that the agent decides when to call each tool based on the task prompt and the tool descriptions.
```python
from crewai import Agent

policy_agent = Agent(
    role="Insurance Operations Analyst",
    goal="Answer policy and premium questions accurately using available tools.",
    backstory="You work inside an insurance operations team and must use tools for factual lookups.",
    tools=[PolicyLookupTool(), PremiumEstimateTool()],
    verbose=True,
)
```
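As written, the agent uses whatever default model your environment resolves to via `OPENAI_API_KEY`. If you want to pin the model for reproducibility, `Agent` also accepts an `llm` argument; the exact form varies by crewai version (recent releases accept a model string, older ones expect an LLM object), so treat this as a sketch:

```python
# Sketch: pinning the model explicitly. The model name is an assumption;
# use one available on your account, and check your crewai version's docs
# for whether `llm` takes a string or an LLM object.
pinned_agent = Agent(
    role="Insurance Operations Analyst",
    goal="Answer policy and premium questions accurately using available tools.",
    backstory="You work inside an insurance operations team and must use tools for factual lookups.",
    tools=[PolicyLookupTool(), PremiumEstimateTool()],
    llm="gpt-4o-mini",
    verbose=True,
)
```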
4. Define tasks that force tool usage.

Keep the instructions specific so the agent knows what data to fetch and how to format the response.
```python
from crewai import Task

task_1 = Task(
    description="Check the status of policy POL-1001 and explain it in one sentence.",
    expected_output="A short status summary for POL-1001.",
    agent=policy_agent,
)

task_2 = Task(
    description="Estimate the monthly premium for a 42-year-old smoker.",
    expected_output="A premium estimate with a brief explanation.",
    agent=policy_agent,
)
```
5. Put everything into a crew and run it.

This is the part most people skip when testing tools in isolation. The real value comes from seeing whether the agent chooses the right tool under orchestration.
```python
from crewai import Crew, Process

crew = Crew(
    agents=[policy_agent],
    tasks=[task_1, task_2],
    process=Process.sequential,
    verbose=True,
)

result = crew.kickoff()
print(result)
```
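With everything in one file, run it directly. The filename here is illustrative:

```bash
python crew_demo.py
```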
Testing It
Run the script and watch the verbose logs. You should see the agent decide to call `policy_lookup` for the first task and `premium_estimate` for the second.
If the output is vague or incorrect, tighten your tool descriptions and task wording. CrewAI is sensitive to how clearly you describe when a tool should be used.
A good test is to change POL-1001 to an unknown policy ID and confirm the tool returns "not found". That tells you your custom logic is actually executing rather than the model hallucinating an answer.
For production-style testing, cover each `_run` method with unit tests that assert exact outputs for known inputs, as in the sketch below.
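A minimal pytest sketch, assuming the two tool classes above live in a module named `tools.py` (the module name is an assumption for illustration):

```python
# test_tools.py - run with: pytest test_tools.py
from tools import PolicyLookupTool, PremiumEstimateTool  # hypothetical module name


def test_policy_lookup_known_id():
    assert PolicyLookupTool()._run(policy_id="POL-1002") == "Lapsed"


def test_policy_lookup_unknown_id():
    assert PolicyLookupTool()._run(policy_id="POL-0000") == "Policy POL-0000 not found"


def test_premium_estimate_smoker():
    # 120 + (42 - 18) * 2 + 75 = 243
    assert PremiumEstimateTool()._run(age=42, smoker=True) == "Estimated premium: $243.00/month"


def test_premium_estimate_non_smoker():
    # 120 + (30 - 18) * 2 = 144
    assert PremiumEstimateTool()._run(age=30, smoker=False) == "Estimated premium: $144.00/month"
```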
Next Steps
- Replace the mock dictionaries with real API calls or database queries.
- Add retry logic and timeout handling inside `_run` for external systems (see the sketch after this list).
- Build structured outputs with Pydantic models instead of plain strings when downstream systems need reliable parsing.
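As a starting point for that second bullet, here is a hedged sketch of a `_run` that calls an external HTTP API with a timeout and a simple retry loop. The endpoint URL, response shape, and class name are assumptions; adapt them to your system:

```python
import time
from typing import Type

import requests  # assumed extra dependency: pip install requests
from pydantic import BaseModel, Field
from crewai_tools import BaseTool


class PolicyLookupInput(BaseModel):
    policy_id: str = Field(..., description="The insurance policy ID")


class LivePolicyLookupTool(BaseTool):
    name: str = "policy_lookup"
    description: str = "Look up policy status by policy ID."
    args_schema: Type[BaseModel] = PolicyLookupInput

    def _run(self, policy_id: str) -> str:
        # Hypothetical internal endpoint; swap in your real service URL.
        url = f"https://internal.example.com/policies/{policy_id}"
        for attempt in range(3):
            try:
                resp = requests.get(url, timeout=5)  # fail fast on a slow backend
                resp.raise_for_status()
                return resp.json().get("status", f"Policy {policy_id} not found")
            except requests.RequestException:
                if attempt == 2:
                    return f"Lookup failed for {policy_id} after 3 attempts"
                time.sleep(2 ** attempt)  # 1s, then 2s backoff between retries
```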
Keep learning
- The complete AI Agents Roadmap — my full 8-step breakdown
- Free: The AI Agent Starter Kit — PDF checklist + starter code
- Work with me — I build AI for banks and insurance companies
By Cyprian Aarons, AI Consultant at Topiax.