What Are State Machines in AI Agents? A Guide for Developers in Lending
State machines are a way to model an AI agent as a set of named states, where each state defines what the agent is allowed to do next. In AI agents, a state machine controls the agent’s behavior by moving it from one state to another based on events, rules, or tool results.
How It Works
Think of a state machine like a loan application workflow in your lending system.
An application starts in draft, moves to submitted, and then passes through states such as kyc_pending, under_review, and manual_review before it ends in approved or rejected. It cannot jump straight from draft to disbursed; it only gets there after the required checks have passed. That is the core idea: the agent is always in one known state, and every transition is explicit.
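A minimal sketch of that idea in Python, using the state names from the paragraph above. The specific edges in `ALLOWED` and the `transition` helper are assumptions made for illustration, not a real lending policy.

```python
from enum import Enum


class AppState(str, Enum):
    DRAFT = "draft"
    SUBMITTED = "submitted"
    KYC_PENDING = "kyc_pending"
    UNDER_REVIEW = "under_review"
    MANUAL_REVIEW = "manual_review"
    APPROVED = "approved"
    REJECTED = "rejected"
    DISBURSED = "disbursed"


# Assumed edges for illustration; a real policy would define its own.
ALLOWED = {
    AppState.DRAFT: {AppState.SUBMITTED},
    AppState.SUBMITTED: {AppState.KYC_PENDING},
    AppState.KYC_PENDING: {
        AppState.UNDER_REVIEW, AppState.MANUAL_REVIEW, AppState.REJECTED,
    },
    AppState.UNDER_REVIEW: {
        AppState.APPROVED, AppState.REJECTED, AppState.MANUAL_REVIEW,
    },
    AppState.MANUAL_REVIEW: {AppState.APPROVED, AppState.REJECTED},
    AppState.APPROVED: {AppState.DISBURSED},
    AppState.REJECTED: set(),   # terminal
    AppState.DISBURSED: set(),  # terminal
}


class IllegalTransition(Exception):
    """Raised when a move is not listed in ALLOWED."""


def transition(current: AppState, target: AppState) -> AppState:
    """Return the target state if the move is allowed, otherwise raise."""
    if target not in ALLOWED[current]:
        raise IllegalTransition(f"{current.value} -> {target.value} is not allowed")
    return target
```

With this in place, `transition(AppState.DRAFT, AppState.DISBURSED)` raises `IllegalTransition` instead of silently skipping the checks in between.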
For AI agents, this matters because agents are not just “chatbots that think.” They often need to:
- collect missing information
- call internal tools
- wait for external systems
- retry failed steps
- escalate edge cases
A state machine gives you control over that flow.
A simple analogy: it’s like a bank teller using a checklist, not improvising every step. If the customer has no ID, the teller does not continue with account opening. The process moves into an exception path until the missing document arrives. Same with an AI agent: if the credit bureau API fails, the agent should move into retry_bureau or manual_fallback, not hallucinate a score.
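The bureau failure path can be sketched as a small decision function. The retry budget `MAX_BUREAU_ATTEMPTS` and the helper name `after_bureau_call` are hypothetical; the state names `retry_bureau` and `manual_fallback` come from the paragraph above, and `risk_assessment` from the table that follows.

```python
MAX_BUREAU_ATTEMPTS = 3  # assumed retry budget, for illustration only


def after_bureau_call(succeeded: bool, attempts: int) -> str:
    """Pick the next state after one credit bureau call.

    `attempts` counts the calls made so far, including the one that just
    finished. No score is ever invented: a failed call leads either to
    another attempt or to a human.
    """
    if succeeded:
        return "risk_assessment"
    if attempts < MAX_BUREAU_ATTEMPTS:
        return "retry_bureau"
    return "manual_fallback"
```

The point of keeping this outside the prompt is that the retry limit becomes a reviewable constant rather than something the model may or may not respect.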
Here’s what that looks like in practice:
| State | What the agent is doing | Next possible transitions |
|---|---|---|
| start | Intake request | collect_docs, reject_incomplete |
| collect_docs | Ask for income proof, ID, bank statements | kyc_check, wait_customer |
| kyc_check | Call KYC/AML service | risk_assessment, manual_review, failed_kyc |
| risk_assessment | Pull bureau data and score risk | approve, decline, underwriter_review |
| approve | Generate decision and notify customer | End |
| manual_review | Hand off to human underwriter | approve, decline |
The important part is that each state has clear responsibilities. Your LLM can generate text, summarize documents, and explain decisions, but the state machine decides what happens next.
Why It Matters
- Prevents uncontrolled agent behavior
  Lending workflows have hard rules. A state machine stops an agent from skipping mandatory checks like KYC, affordability assessment, or fraud screening.
- Makes failures recoverable
  When bureau APIs time out or OCR fails on uploaded payslips, you need deterministic retry and fallback paths. State machines make those paths explicit.
- Improves auditability
  In lending, you need to explain why a decision was made. A state history gives you a trace: which checks ran, which tools were called, and where the process branched.
- Supports human-in-the-loop review
  Not every case should be fully automated. State machines let you route borderline applications into manual review without breaking the rest of the flow.
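The auditability point can be made concrete with a small state history. This is a sketch under assumptions: the `ApplicationTrace` and `TransitionRecord` names, the `reason` field, and the example reasons are all hypothetical; the state names come from the table earlier in this article.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class TransitionRecord:
    from_state: str
    to_state: str
    reason: str
    at: datetime


@dataclass
class ApplicationTrace:
    application_id: str
    state: str = "start"
    history: list = field(default_factory=list)

    def move(self, to_state: str, reason: str) -> None:
        """Record the transition first, then update the current state."""
        self.history.append(
            TransitionRecord(self.state, to_state, reason, datetime.now(timezone.utc))
        )
        self.state = to_state

    def path(self) -> list:
        """Return every state visited, in order."""
        if not self.history:
            return [self.state]
        return [self.history[0].from_state] + [r.to_state for r in self.history]


trace = ApplicationTrace("app-123")
trace.move("collect_docs", "intake complete")
trace.move("kyc_check", "all documents received")
trace.move("manual_review", "KYC flagged a name mismatch")
```

After these calls, `trace.path()` lists the four states visited, and each record in `trace.history` carries the reason and a UTC timestamp that a reviewer can inspect later.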
Real Example
Let’s say you are building an AI loan origination assistant for personal loans.
The user uploads an ID card, payslip, and bank statements. The agent’s job is not to “decide” immediately. It should orchestrate a sequence:
- validate document completeness
- extract key fields using OCR
- run KYC/AML checks
- fetch credit bureau data
- calculate affordability and debt burden
- decide whether to approve automatically or send to an underwriter
A good state machine for this might look like:
```mermaid
stateDiagram-v2
    [*] --> intake
    intake --> validate_docs
    validate_docs --> request_missing_docs: incomplete
    validate_docs --> extract_fields: complete
    extract_fields --> kyc_check
    kyc_check --> manual_review: failed_or_flagged
    kyc_check --> bureau_lookup: passed
    bureau_lookup --> risk_score
    risk_score --> approve: within_policy
    risk_score --> decline: below_threshold
    risk_score --> manual_review: borderline
    manual_review --> approve
    manual_review --> decline
    approve --> [*]
    decline --> [*]
```
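One way to carry this diagram into code is a lookup table keyed by (state, event). The sketch below is an assumption-laden illustration: the event names `received`, `done`, `approved`, and `declined` are invented for edges the diagram leaves unlabeled, and `request_missing_docs` has no outgoing edge here because the diagram does not show one.

```python
# (current state, event) -> next state, mirroring the diagram above.
TRANSITIONS = {
    ("intake", "received"): "validate_docs",
    ("validate_docs", "incomplete"): "request_missing_docs",
    ("validate_docs", "complete"): "extract_fields",
    ("extract_fields", "done"): "kyc_check",
    ("kyc_check", "failed_or_flagged"): "manual_review",
    ("kyc_check", "passed"): "bureau_lookup",
    ("bureau_lookup", "done"): "risk_score",
    ("risk_score", "within_policy"): "approve",
    ("risk_score", "below_threshold"): "decline",
    ("risk_score", "borderline"): "manual_review",
    ("manual_review", "approved"): "approve",
    ("manual_review", "declined"): "decline",
}

TERMINAL = {"approve", "decline"}


def next_state(state: str, event: str) -> str:
    """Look up the next state; anything not in the table is rejected."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(f"no transition from {state!r} on {event!r}") from None


def walk(events, state="intake"):
    """Apply events in order and return the list of states visited."""
    path = [state]
    for event in events:
        state = next_state(state, event)
        path.append(state)
    return path
```

A borderline application that an underwriter approves would be `walk(["received", "complete", "done", "passed", "done", "borderline", "approved"])`, which ends in `approve` after passing through `manual_review`.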
In implementation terms:
- The LLM handles unstructured tasks:
  - summarizing documents
  - explaining missing fields to customers
  - drafting underwriter notes
- The state machine handles control flow:
  - which tool runs next
  - whether retries are allowed
  - when to stop and escalate
That separation is what keeps production systems stable.
Without a state machine, teams often end up with “agent spaghetti”:
- prompts calling tools directly
- hidden branching inside prompt text
- unclear retry logic
- inconsistent outcomes across similar cases
With a state machine, every transition is testable.
For example:
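```python
def next_state(state, applicant_id, credit_bureau, min_score):
    """Return the next state name; only the bureau_lookup branch is shown."""
    if state == "bureau_lookup":
        result = credit_bureau.fetch(applicant_id)
        if result.timeout:
            return "bureau_retry"  # recoverable: retry, never guess a score
        if result.score < min_score:
            return "decline"
        if "fraud" in result.flags:
            return "manual_review"
        return "risk_score"
    raise ValueError(f"no handler for state {state!r}")
```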
This is boring code in the best possible way. In lending systems, boring usually means safe.
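To show what "testable" means here, the sketch below restates the bureau lookup branch as a standalone function and checks each outcome against a stub. `BureauResult`, `StubBureau`, `bureau_lookup_step`, and the `MIN_SCORE` value of 600 are all hypothetical stand-ins, not a real bureau client or a real policy threshold.

```python
from dataclasses import dataclass, field

MIN_SCORE = 600  # assumed threshold, for illustration only


@dataclass
class BureauResult:
    score: int = 0
    timeout: bool = False
    flags: set = field(default_factory=set)


class StubBureau:
    """Test double that returns a canned result instead of calling an API."""

    def __init__(self, result):
        self.result = result

    def fetch(self, applicant_id):
        return self.result


def bureau_lookup_step(credit_bureau, applicant_id):
    """Same branch order as the snippet above: timeout, score, fraud flag."""
    result = credit_bureau.fetch(applicant_id)
    if result.timeout:
        return "bureau_retry"
    if result.score < MIN_SCORE:
        return "decline"
    if "fraud" in result.flags:
        return "manual_review"
    return "risk_score"


def test_timeout_goes_to_retry():
    bureau = StubBureau(BureauResult(timeout=True))
    assert bureau_lookup_step(bureau, "a1") == "bureau_retry"


def test_low_score_declines():
    bureau = StubBureau(BureauResult(score=450))
    assert bureau_lookup_step(bureau, "a1") == "decline"


def test_fraud_flag_goes_to_manual_review():
    bureau = StubBureau(BureauResult(score=720, flags={"fraud"}))
    assert bureau_lookup_step(bureau, "a1") == "manual_review"


def test_clean_result_continues():
    bureau = StubBureau(BureauResult(score=720))
    assert bureau_lookup_step(bureau, "a1") == "risk_score"
```

These functions follow the naming convention that pytest collects, and they can also be called directly. No LLM is involved in any of them, which is why the control flow can be verified deterministically.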
Related Concepts
- Workflow orchestration
  Broader process management across services; state machines are often one layer inside it.
- Finite State Machines (FSMs)
  The classic computer science model behind explicit states and transitions.
- Agentic tool calling
  How LLMs invoke APIs, databases, or internal services during execution.
- Human-in-the-loop review
  Escalation paths where humans override or confirm AI decisions.
- Policy engines / decision engines
  Rule-based systems that enforce lending policy alongside or inside the agent flow.
By Cyprian Aarons, AI Consultant at Topiax.