What Are State Machines in AI Agents? A Guide for Engineering Managers in Banking
State machines are a way to model an AI agent as a set of defined states, where each state has clear rules for what it can do and what state it can move to next. In AI agents, a state machine controls the agent’s behavior by making it follow explicit transitions instead of guessing its next step.
How It Works
Think of a state machine like a bank’s loan approval workflow.
A loan application does not jump around randomly. It starts in Received, moves to KYC Check, then Risk Review, then either Approved, Rejected, or Escalated. Each step has conditions that determine the next step, and the process is easy to audit because every transition is explicit.
An AI agent works the same way.
Instead of letting the model respond freely at every turn, you constrain it to a current state and a limited set of valid next states. That means the agent can only take actions that make sense for the business process it is handling.
A simple banking example looks like this:
- Idle: waiting for a new customer request
- Collecting Info: asking for missing details
- Verifying Identity: checking KYC/AML data
- Assessing Risk: running policy or credit checks
- Escalating: handing off to a human reviewer
- Completed: closing the case
Each transition is driven by an event:
- Customer submits documents
- KYC check passes or fails
- Risk score crosses a threshold
- Human operator approves or rejects
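The states and events above can be sketched as a small finite state machine. This is a minimal illustration, not a fixed API; the state and event names simply mirror the lists:

```python
from enum import Enum, auto

class State(Enum):
    IDLE = auto()
    COLLECTING_INFO = auto()
    VERIFYING_IDENTITY = auto()
    ASSESSING_RISK = auto()
    ESCALATING = auto()
    COMPLETED = auto()

# Each (state, event) pair maps to exactly one next state.
# Any pair not listed here is an invalid transition.
TRANSITIONS = {
    (State.IDLE, "request_received"): State.COLLECTING_INFO,
    (State.COLLECTING_INFO, "documents_submitted"): State.VERIFYING_IDENTITY,
    (State.VERIFYING_IDENTITY, "kyc_passed"): State.ASSESSING_RISK,
    (State.VERIFYING_IDENTITY, "kyc_failed"): State.ESCALATING,
    (State.ASSESSING_RISK, "risk_below_threshold"): State.COMPLETED,
    (State.ASSESSING_RISK, "risk_above_threshold"): State.ESCALATING,
    (State.ESCALATING, "human_approved"): State.COMPLETED,
    (State.ESCALATING, "human_rejected"): State.COMPLETED,
}

def next_state(current: State, event: str) -> State:
    """Return the next state, or raise if the transition is not allowed."""
    try:
        return TRANSITIONS[(current, event)]
    except KeyError:
        raise ValueError(f"invalid transition: {event!r} from {current.name}") from None
```

Calling `next_state(State.IDLE, "request_received")` moves the agent forward, while `next_state(State.IDLE, "kyc_passed")` raises, because a KYC result makes no sense before documents exist.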
That structure matters because AI agents are otherwise prone to drift. If you let an LLM “just continue,” it may repeat itself, skip required compliance steps, or produce an action that is not allowed in the current workflow.
For engineering managers, the key idea is this:
a state machine turns an agent from a conversational system into an operational system.
Why It Matters
Engineering managers in banking should care because state machines solve problems that show up quickly in production:
- **Auditability**
  - Every action is tied to a known state and transition.
  - That makes it easier to explain agent behavior to risk, compliance, and internal audit teams.
- **Control over compliance**
  - You can force required steps like KYC, sanctions screening, or human approval before sensitive actions.
  - The agent cannot "skip ahead" because the workflow blocks invalid transitions.
- **Lower operational risk**
  - State boundaries reduce hallucinated behavior and accidental side effects.
  - This matters when an agent touches customer data, payment flows, or underwriting decisions.
- **Better incident handling**
  - When something fails, you know exactly where the process stopped.
  - Recovery becomes simpler because you resume from a state instead of replaying an entire conversation.
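Those last two points can be sketched in a few lines: an ordered workflow that cannot skip required steps, plus a checkpoint that lets recovery resume from a state rather than a conversation replay. The step names and JSON checkpoint are illustrative; production systems would persist to a database:

```python
import json

# Ordered compliance steps; the workflow can only advance one step at a time,
# so "approved" is unreachable without passing every earlier step.
STEPS = ["received", "kyc_check", "risk_review", "approved"]

def advance(current: str) -> str:
    """Move to the next required step; skipping ahead is impossible by construction."""
    i = STEPS.index(current)
    if i == len(STEPS) - 1:
        raise ValueError("workflow already complete")
    return STEPS[i + 1]

def save_checkpoint(case_id: str, state: str) -> str:
    # JSON keeps the sketch self-contained; swap in durable storage in production.
    return json.dumps({"case_id": case_id, "state": state})

def resume(checkpoint: str) -> str:
    """Recovery starts from the persisted state, not from a replayed conversation."""
    return json.loads(checkpoint)["state"]
```

If the process crashes during risk review, `resume(checkpoint)` returns `"risk_review"` and the case picks up exactly where it stopped.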
Here’s a practical comparison:
| Approach | Behavior | Banking impact |
|---|---|---|
| Free-form LLM chat | Flexible but unpredictable | Harder to govern |
| Workflow engine only | Predictable but rigid | Limited intelligence |
| AI agent + state machine | Controlled intelligence | Best fit for regulated processes |
Real Example
Consider an insurance claims assistant that helps customers file motor claims.
Without a state machine, the assistant might ask inconsistent questions, miss required fields, or try to estimate settlement before collecting evidence. With a state machine, you define the process clearly:
- **Claim Started**
  - Customer opens the claim through chat or web form
- **Data Collection**
  - Agent asks for policy number, date of incident, vehicle details, photos
- **Validation**
  - System checks policy active status, claim eligibility, and duplicate claims
- **Assessment**
  - Agent summarizes facts and runs triage rules
  - If damage is minor and within threshold, proceed automatically
  - If fraud signals appear, move to escalation
- **Human Review**
  - Claims adjuster reviews flagged cases
- **Resolution**
  - Claim approved, rejected, or pending more information
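The claims flow above, including the branch between automatic resolution and escalation, can be written down as an explicit transition table. State and event names here are illustrative:

```python
# (current state, event) -> next state, mirroring the claims workflow above.
CLAIM_TRANSITIONS = {
    ("claim_started", "claim_opened"): "data_collection",
    ("data_collection", "documents_complete"): "validation",
    ("validation", "checks_passed"): "assessment",
    ("validation", "duplicate_found"): "resolution",
    ("assessment", "minor_within_threshold"): "resolution",
    ("assessment", "fraud_signal"): "human_review",
    ("human_review", "adjuster_decided"): "resolution",
}

def claim_next(state: str, event: str) -> str:
    """Advance the claim, rejecting any event not valid in the current state."""
    if (state, event) not in CLAIM_TRANSITIONS:
        raise ValueError(f"{event!r} is not valid in state {state!r}")
    return CLAIM_TRANSITIONS[(state, event)]
```

Note that nothing maps `data_collection` straight to `resolution`: the assistant cannot estimate a settlement before evidence is collected, because no such transition exists.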
The AI agent sits inside those states and performs narrow tasks:
- extract information from customer messages
- classify whether documents are complete
- summarize evidence for adjusters
- decide whether the case can move forward
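One way to keep those tasks narrow is to whitelist, per state, which tools the model may call. The tool names below are hypothetical; the point is that the orchestrator checks the whitelist before executing anything the model requests:

```python
# Which tools the model may invoke in each state (hypothetical tool names).
ALLOWED_TOOLS = {
    "data_collection": {"extract_fields", "ask_followup"},
    "validation": {"check_policy_status", "detect_duplicates"},
    "assessment": {"summarize_evidence", "run_triage_rules"},
    "human_review": {"summarize_evidence"},  # read-only support for the adjuster
}

def tool_allowed(state: str, tool: str) -> bool:
    """Checked by the orchestrator before any model-requested tool call runs."""
    return tool in ALLOWED_TOOLS.get(state, set())
```

Even if the model asks to run triage rules while still collecting data, the call is simply refused.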
The value is not that the model becomes smarter. The value is that its intelligence gets constrained by business logic.
That gives you three things banks care about:
- consistent customer experience
- fewer compliance mistakes
- easier integration with existing claims or underwriting systems
A production pattern here is to keep the state machine outside the model and treat the LLM as one tool inside each state. The orchestrator decides what happens next; the model helps with interpretation and content generation.
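A sketch of that pattern: the orchestrator owns the loop and the transition logic, and calls the model only for interpretation inside a state. `classify_documents` is a hypothetical stand-in for an LLM call:

```python
def classify_documents(message: str) -> str:
    # Stand-in for an LLM call; in production this would prompt a model and
    # parse a constrained output such as "complete" or "incomplete".
    return "complete" if "photos" in message else "incomplete"

def orchestrate(state: str, message: str) -> str:
    """The orchestrator, not the model, decides which state comes next."""
    if state == "data_collection":
        event = ("documents_complete"
                 if classify_documents(message) == "complete"
                 else "documents_missing")
        return {"documents_complete": "validation",
                "documents_missing": "data_collection"}[event]
    # ...other states handled the same way, each with its own allowed events
    return state
```

The model's output is reduced to an event; the orchestrator maps that event through the transition table, so a confused or adversarial model response can never jump the workflow to an arbitrary state.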
Related Concepts
- **Workflow orchestration**: the broader system that manages tasks across services and humans.
- **Finite State Machines (FSMs)**: the classic software pattern behind most agent state models.
- **Event-driven architecture**: useful when transitions are triggered by messages from external systems.
- **Guardrails**: rules that prevent invalid outputs or unsafe actions in regulated environments.
- **Human-in-the-loop review**: a fallback path for cases that require manual approval or exception handling.
Keep learning
- The complete AI Agents Roadmap — my full 8-step breakdown
- Free: The AI Agent Starter Kit — PDF checklist + starter code
- Work with me — I build AI for banks and insurance companies
By Cyprian Aarons, AI Consultant at Topiax.