What Is an AI Agent? An Independent Reference.
An independent reference for AI agents in the workplace.
An AI agent is a software system that uses an underlying AI model, typically a large language model, to pursue a goal by perceiving its environment, deciding what to do next, taking actions through tools, observing the result, and iterating until the goal is reached or it gives up.
The phrase "AI agent" covers a spectrum. At one end sits a single language-model call dressed up with a few utility functions. At the other sits an autonomous system that decomposes a goal, calls tools, remembers what it has done, and recovers from errors without human supervision. The useful definition holds the spectrum together by what is constant across it: a goal, perception, decision, action, and a loop.
This is a reference page. It is built to be read in a single session by someone who needs to brief their CEO, a peer, or a board. It carries citations inline, dates every claim, and excludes vendor pricing, rankings, and predictions on principle.
Read the methodology.
Every modern AI agent runs a variant of the same four-step cycle: sense, think, act, observe. The loop iterates until the goal is reached, the agent gives up, or a human stops it. Vendor architectures differ in how each step is implemented. The cycle itself is constant.
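The cycle can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not any vendor's architecture: `CounterEnv` is a toy environment invented for the example, and `think` is a hypothetical stand-in for the model call. The human stop condition is omitted for brevity.

```python
from dataclasses import dataclass


@dataclass
class CounterEnv:
    """Toy environment (hypothetical): a number the agent can increment."""
    value: int = 0

    def read(self) -> int:
        return self.value

    def apply(self, action: str) -> str:
        if action == "increment":
            self.value += 1
            return "ok"
        return f"unknown action: {action}"


def think(perceived: int, goal: int) -> str:
    """Hypothetical stand-in for the model call: pick the next action."""
    return "stop" if perceived >= goal else "increment"


def run_agent(env: CounterEnv, goal: int, max_steps: int = 10) -> str:
    """Sense, think, act, observe until the goal is reached or the step budget runs out."""
    for _ in range(max_steps):
        perceived = env.read()            # sense
        action = think(perceived, goal)   # think
        if action == "stop":
            return "goal_reached"
        result = env.apply(action)        # act
        if result != "ok":                # observe
            return "gave_up"
    return "gave_up"                      # step budget exhausted
```

Calling `run_agent(CounterEnv(), goal=3)` walks the loop four times: three increments, then a final pass in which the decision step sees the goal is met and stops.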
See the expanded loop with internal sub-components on how AI agents work, or watch the loop run end-to-end on agent in action.
A more careful definition
The plain-English definition above is the one a CHRO can use in a board update. The operational definition adds two distinguishing features. An AI agent is distinguished from a chatbot by autonomy: an agent decides its own next step, whereas a chatbot responds to the user's prompt. An agent is distinguished from a language model by tool use: an agent reads from and writes to systems outside the model's context, whereas a model alone produces text.
The technical definition narrows further. Anthropic, in Building effective agents (Schluntz, December 2024), draws the line between a workflow and an agent at decision authority: a workflow follows a predefined path; an agent decides the path at runtime. The distinction is consequential because it determines who is responsible when the system goes wrong. Workflows fail predictably. Agents fail in ways their designers did not anticipate.
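The workflow-versus-agent line can be illustrated with a small contrast. The sketch below is an assumption-laden toy, not code from the cited essay: `fetch`, `summarise`, and `choose_next` are hypothetical names, and `choose_next` stands in for a model deciding the next step.

```python
from typing import Callable, Optional


# Hypothetical steps for illustration; real steps would call external systems.
def fetch(state: dict) -> dict:
    return {**state, "fetched": True}


def summarise(state: dict) -> dict:
    return {**state, "summary": "done"}


STEPS: dict[str, Callable[[dict], dict]] = {"fetch": fetch, "summarise": summarise}


def run_workflow(state: dict) -> dict:
    """Workflow: the path is fixed in code before the run starts."""
    for name in ("fetch", "summarise"):
        state = STEPS[name](state)
    return state


def choose_next(state: dict) -> Optional[str]:
    """Stand-in for a model choosing the next step from the current state."""
    if not state.get("fetched"):
        return "fetch"
    if "summary" not in state:
        return "summarise"
    return None  # nothing left to do


def run_agent(state: dict, max_steps: int = 5) -> tuple[dict, list[str]]:
    """Agent: the path is chosen one step at a time, at runtime."""
    path: list[str] = []
    for _ in range(max_steps):
        name = choose_next(state)
        if name is None:
            break
        state = STEPS[name](state)
        path.append(name)
    return state, path
```

Given a state that already contains fetched data, `run_workflow` still runs both steps, while `run_agent` skips straight to `summarise`. The path is an output of the run rather than an input to it, which is why agent failures are harder to anticipate.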
The classical AI literature has its own definition. Russell and Norvig, in Artificial Intelligence: A Modern Approach (4th ed., 2021), define an agent as "anything that can be viewed as perceiving its environment through sensors and acting upon that environment through actuators." That definition is older than the LLM era, broad enough to include a thermostat, and remains the citation of record for academic work. The modern LLM-based agent is one specific implementation of that lineage. We walk the bridge from one to the other on types of AI agents.
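The thermostat case shows how little the classical definition requires. The sketch below is illustrative only; the function name, the 20 °C target, and the 0.5 °C dead band are assumptions made for the example.

```python
def thermostat_agent(temperature_c: float, target_c: float = 20.0, band_c: float = 0.5) -> str:
    """Map a sensor reading (percept) to an actuator command (action)."""
    if temperature_c < target_c - band_c:
        return "heat_on"
    if temperature_c > target_c + band_c:
        return "heat_off"
    return "hold"  # inside the dead band: leave the heater as it is


for reading in (18.0, 20.2, 22.0):
    print(reading, thermostat_agent(reading))
```

There is no model, no memory, and no goal decomposition here, yet the function perceives through a sensor value and acts through an actuator command, which is all the classical definition asks for.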
What an AI agent is not
The category is crowded. Six product categories use overlapping vocabulary to describe meaningfully different systems. A short tour of the differences is more useful than a longer essay on what an agent is.
- Not a chatbot: A chatbot generates a reply. An agent decomposes a goal and pursues it across multiple steps.
- Not just an LLM: An LLM is the brain. An agent is the body, brain, hands, memory, and goal.
- Not an AI assistant: An assistant works with the user, in a productivity surface. An agent works for the user, in the background.
- Not a copilot: A copilot suggests the next line of code or the next paragraph. An agent runs the whole task.
- Not RPA: RPA follows a recorded script. An agent decides what to do based on the situation.
- Not workflow automation: Workflow automation runs deterministic steps. An agent runs non-deterministic decisions.
Types of AI agents
Two taxonomies coexist. The classical Russell-and-Norvig typology covers the lineage. The modern LLM-era typology covers what vendors actually ship in 2026. The full mapping lives on types of AI agents.
- Stimulus-response only, with no internal state: the classical reflex agent and the simplest 2024-era one-shot LLM caller.
- Reason about the world before acting: goal-based and utility-based agents in classical terms; planner-executor in modern terms.
- Function calling, MCP, retrieval: the architectural substance of a modern agent. Read more on tool use.
- Multiple specialised agents collaborating, usually under an orchestrator: useful when work is genuinely parallel.
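Tool use is the easiest of these to show in code. In the sketch below the model is assumed to emit a JSON request naming a tool and its arguments, and host code looks the tool up and runs it. The `name`/`arguments` shape, the `TOOLS` registry, and both tools are assumptions made for illustration; this is not the wire format of MCP or of any vendor API.

```python
import json


def get_weather(city: str) -> str:
    """Hypothetical tool; a real one would call an external service."""
    return f"Weather for {city}: unavailable in this sketch"


def add(a: float, b: float) -> float:
    """Hypothetical tool: add two numbers."""
    return a + b


TOOLS = {"get_weather": get_weather, "add": add}


def dispatch(tool_call_json: str) -> dict:
    """Run the tool named in a model-produced request and return an observation."""
    try:
        call = json.loads(tool_call_json)
        name, args = call["name"], call.get("arguments", {})
    except (json.JSONDecodeError, KeyError, TypeError, AttributeError) as exc:
        return {"ok": False, "error": f"malformed tool call: {exc}"}
    tool = TOOLS.get(name)
    if tool is None:
        return {"ok": False, "error": f"unknown tool: {name}"}
    try:
        return {"ok": True, "result": tool(**args)}
    except TypeError as exc:  # wrong or missing arguments
        return {"ok": False, "error": str(exc)}
```

Errors are returned as observations rather than raised, so the agent can read them on the next pass and choose a different action.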
Examples by business function
Concrete agent use cases organised by where they live in an organisation. The pattern is constant across every function: read data from a system, decide based on a rubric or model judgement, write back or notify a human.
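The read, decide, write-back pattern can be sketched with a single hypothetical finance example. The `Invoice` record, the 500.0 approval limit, and the rule-based `decide` function are all assumptions for illustration; in a real agent a model judgement could replace the rubric.

```python
from dataclasses import dataclass


@dataclass
class Invoice:
    invoice_id: str
    amount: float
    has_purchase_order: bool


def decide(invoice: Invoice, auto_approve_limit: float = 500.0) -> str:
    """Rubric stand-in: approve small invoices that have a purchase order."""
    if not invoice.has_purchase_order:
        return "escalate"
    return "approve" if invoice.amount <= auto_approve_limit else "escalate"


def process(invoices: list[Invoice]) -> tuple[list[str], list[str]]:
    """Read records, decide on each, then write back or notify a human."""
    approved: list[str] = []
    escalated: list[str] = []
    for inv in invoices:                       # read
        if decide(inv) == "approve":           # decide
            approved.append(inv.invoice_id)    # write back
        else:
            escalated.append(inv.invoice_id)   # notify a human
    return approved, escalated
```

Swapping the record type and the rubric gives the same shape for a support ticket, a job application, or a contract review.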
See all examples.
How AI agents work
An AI agent receives input, decides what to do, performs an action, reads the result of that action, and repeats. The decision step is where the underlying language model lives. The action step is where tool use, function calling, and the Model Context Protocol enter. The observation step is where reflection and error recovery happen. Memory threads through all four steps.
Some agents use the loop only once: the model is called, a single tool fires, the result is returned. Most production agents iterate three to fifteen times before reaching a terminal state. A small subset run indefinitely, supervised by a human or a parent agent.
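Memory and error recovery can be shown with a bounded retry. The sketch below assumes a hypothetical tool, `flaky_lookup`, that times out on its first attempt. Every observation, including the failure, is appended to a memory list, and the loop ends in one of two terminal states.

```python
def flaky_lookup(query: str, attempt: int) -> str:
    """Hypothetical tool that fails on its first attempt."""
    if attempt == 0:
        raise TimeoutError("upstream timed out")
    return f"result for {query!r}"


def run_with_recovery(query: str, max_iterations: int = 5) -> dict:
    """Iterate until a terminal state, recording every step in memory."""
    memory: list[dict] = []  # carried across every iteration
    for attempt in range(max_iterations):
        try:
            observation = {"ok": True, "value": flaky_lookup(query, attempt)}  # act
        except TimeoutError as exc:
            observation = {"ok": False, "value": str(exc)}  # the error is an observation too
        memory.append({"iteration": attempt, **observation})  # observe and remember
        if observation["ok"]:
            return {"status": "done", "iterations": attempt + 1, "memory": memory}
    return {"status": "gave_up", "iterations": max_iterations, "memory": memory}
```

With the default budget the run finishes on its second iteration. With `max_iterations=1` it ends in `gave_up`, and the failed attempt is still on record for a human or parent agent to inspect.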
The full architectural breakdown, including the planner, executor, tool router, memory module, and reflection layer, is on how AI agents work.
How to evaluate
Evaluating an agent is harder than evaluating a model. Agents are stateful, non-deterministic, and use tools that change between runs. Reliability matters more than peak capability: an agent that succeeds 95 percent of the time on real workloads is more useful than one that scores 99 percent on a curated benchmark but fails silently in production.
The honest evaluation framework covers four dimensions: capability (does it succeed on representative tasks), reliability (does it succeed consistently), cost (per-completion economics including retries), and latency (end-to-end time). The full procurement-grade framework with checklist and failure-mode taxonomy is on how to evaluate an AI agent.
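One way to turn the four dimensions into numbers is to run each representative task several times and aggregate the results. The sketch below is one possible operationalisation, not the framework's official scoring: the `Run` record and the specific definitions (capability as "solved at least once", reliability as "solved on every attempt") are assumptions made for illustration.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Run:
    task: str
    succeeded: bool
    cost_usd: float    # assumed to include any retries inside the run
    latency_s: float   # end-to-end wall-clock time


def summarise(runs: list[Run]) -> dict:
    """Aggregate repeated runs into capability, reliability, cost, and latency."""
    by_task: dict[str, list[bool]] = {}
    for r in runs:
        by_task.setdefault(r.task, []).append(r.succeeded)
    successes = sum(1 for r in runs if r.succeeded)
    total_cost = sum(r.cost_usd for r in runs)
    return {
        # capability: share of tasks solved at least once
        "capability": sum(1 for v in by_task.values() if any(v)) / len(by_task),
        # reliability: share of tasks solved on every attempt
        "reliability": sum(1 for v in by_task.values() if all(v)) / len(by_task),
        # cost: total spend, failures included, per successful completion
        "cost_per_completion": total_cost / successes if successes else float("inf"),
        "mean_latency_s": mean([r.latency_s for r in runs]),
    }
```

Dividing total spend by successful completions means failed runs raise the cost figure, which is the point of counting retries.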
In this cluster
Three companion sites in the same operator-first reference family. Each takes a different angle on the agent question.
Will AI agents replace your job? A calculator, built on a defensible methodology, that scores tasks against the OECD AI Occupational Risk Index.
Where do AI agents sit in the org structure? A structural view of agent reporting lines and the functions that get redrawn.
How do AI agents fit into a workflow? A process view of swim-lane redesigns when an agent owns a step or assists a human.
Methodology, corrections
This site is built to be cited. Where a claim is editorial synthesis rather than a direct citation, the synthesis is noted. The full source list, what is intentionally excluded (vendor pricing, ranked recommendations, predictions), and the revision history are on methodology.
Corrections, criticisms, and citations to better sources are welcome via Digital Signet.