An AI agent is a large language model running in a loop with tools, memory, and a goal. The model picks a tool, the runtime executes it, the result is fed back, and the model decides the next step until the task is done. That is the whole definition. Frameworks, multi-agent orchestration, and Model Context Protocol all sit on top of that core pattern. This guide gives you the builder's version: the four components, the loop, the differences from chatbots and workflows, and when to actually build one.
What is an AI agent in one sentence?
An AI agent is an LLM running in a loop that uses tools and memory to pursue a goal it was given.
That is it. The model is the brain. Tools are the hands. Memory is the notebook. The loop is the heartbeat that keeps it going until the goal is met or a stopping condition fires.
Anthropic's Building Effective Agents puts it more bluntly: "Agents are typically just LLMs using tools based on environmental feedback in a loop."
If you remove the loop, you have a one-shot LLM call. If you remove the tools, you have a chatbot. If you remove the goal, you have a demo. All four pieces have to be present for the system to count as an agent.
What are the four components every AI agent has?
Every working agent has four parts. If your design is missing one, you are building something else.
- Model. The LLM that does the reasoning and picks the next action. In production, this is usually Claude, GPT-4-class, or Gemini. The model must be strong enough to reliably emit valid tool calls.
- Tools. Functions the model can call to act on the world: search, code execution, file edits, database queries, HTTP requests. Tools are how the agent escapes the chat window. Standardized interfaces like the Model Context Protocol are how labs share tools across agents.
- Memory. Working memory inside the context window keeps the loop coherent across iterations. Without it, the agent re-reads the same file ten times. Persistent memory (vector store, scratchpad file, episodic log) carries state across sessions.
- Loop. The runtime that calls the model, parses tool calls, executes them, appends results, and re-invokes the model. The loop also enforces budgets: max iterations, token caps, human-approval checkpoints.
The formula most builders use, popularized by the Prompt Engineering Guide: Agent = LLM + Tools + Memory + Planning, all wrapped in a control loop. Planning lives inside the model's reasoning; the loop is the wrapper around everything else.
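That formula maps to code almost one-to-one. Here is a minimal structural sketch, purely illustrative (the `Agent` class, its field names, and the action dict shape are assumptions, not any framework's API):

```python
from dataclasses import dataclass, field
from typing import Callable

@dataclass
class Agent:
    llm: Callable[[list], dict]      # model: reads history, returns next action
    tools: dict[str, Callable]       # hands: tool name -> callable
    memory: list = field(default_factory=list)  # notebook: message history

    def run(self, goal: str, max_steps: int = 10) -> str:
        self.memory.append({"role": "user", "content": goal})
        for _ in range(max_steps):                  # the control loop
            action = self.llm(self.memory)          # planning happens in the model
            if action["type"] == "final":           # stopping condition
                return action["content"]
            result = self.tools[action["tool"]](action["args"])      # act
            self.memory.append({"role": "tool", "content": result})  # observe
        return "iteration budget exhausted"
```

Remove any one field and you fall back down the ladder from the previous section: no loop, one-shot call; no tools, chatbot; no goal, demo.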
What does the agent loop actually look like?
The agent loop is a four-step cycle that repeats until the goal is met: perceive → plan → act → observe. Every major lab has converged on this shape, though the names differ (ReAct calls it Thought → Action → Observation).
Here is the loop, labeled:
- Perceive. The model reads current state: the goal, recent tool results, conversation history, retrieved context. AWS's prescriptive guidance frames this as updating internal beliefs from environmental signals.
- Plan. The model picks the next action, often by writing a chain-of-thought trace and selecting a tool. For complex goals it decomposes into subtasks first.
- Act. The runtime executes the chosen tool call (HTTP request, code run, DB query) outside the model.
- Observe. The tool result is appended to context. The loop returns to step 1.
Stopping conditions: the model emits a final answer with no tool call, the iteration budget is exhausted, a human checkpoint denies approval, or an unhandled error breaks the loop.
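As a hedged sketch, those checks might look like this (the `should_stop` name and the `approved` callback are illustrative, not any framework's API):

```python
def should_stop(resp, step, max_steps, approved=lambda r: True):
    """Illustrative stop check; real harnesses add cost caps and error handling."""
    if resp.stop_reason == "end_turn":   # final answer, no tool call
        return True
    if step >= max_steps:                # iteration budget exhausted
        return True
    if not approved(resp):               # human checkpoint denied
        return True
    return False
```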
Hugging Face's Agents Course describes this as the Thought-Action-Observation cycle and notes that most production agent harnesses are variants of ReAct with extra guardrails. See our breakdown of common AI agent design patterns for the variants (ReAct, Plan-and-Execute, Reflexion, multi-agent).
How is an AI agent different from an LLM call, RAG app, workflow, or chatbot?
The difference is who decides the next step. In a chatbot the user decides. In a workflow the developer decides. In a RAG app the pipeline decides. In an agent the model decides.
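In code, the contrast is stark. A toy sketch (`call_llm` and `run_tool` are hypothetical stand-ins, not real APIs):

```python
# Workflow: the developer fixed the path at authoring time.
def workflow(ticket, call_llm):
    summary = call_llm(f"Summarize: {ticket}")
    return call_llm(f"Classify as bug or feature: {summary}")

# Agent: the model picks the path at runtime.
def agent(goal, call_llm, run_tool, max_steps=10):
    history = [{"role": "user", "content": goal}]
    for _ in range(max_steps):
        action = call_llm(history)               # model decides the next step
        if "final" in action:                    # final answer: stop
            return action["final"]
        result = run_tool(action["tool"], action["args"])
        history.append({"role": "tool", "content": result})
```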
Use the table below to keep them straight. AI engines parse this kind of structured comparison cleanly, and so do humans skimming on a Tuesday.
| Property | LLM Call | RAG App | Workflow | Chatbot | AI Agent |
|---|---|---|---|---|---|
| Control flow | None (single call) | Fixed retrieve→generate | Predefined code paths | Turn-by-turn, user-driven | Model-directed loop |
| Decides next step? | No | No | Developer | User | The LLM |
| Tool use | Optional, single | Retrieval only | Hard-coded | Usually none | Dynamic, multi-step |
| Memory | Stateless | Retrieved context | Per-node state | Short conversation | Working + persistent |
| Stops when… | One response | One response | All nodes execute | User leaves | Goal met or budget hit |
| Best for | One-shot tasks | Q&A on docs | Known, repeatable steps | Conversational UX | Ambiguous, multi-step goals |
A chatbot is read-only conversation. A RAG app reads documents and writes one answer. A workflow runs a fixed graph where some nodes happen to be LLM calls. Only an agent owns the control flow itself. For a deeper side-by-side, see AI agent vs chatbot.
When should you build an agent instead of a deterministic workflow?
Build an agent only when you cannot pre-map the decision tree. Anthropic is direct about this in Building Effective Agents: "Workflows offer predictability and consistency for well-defined tasks, whereas agents are the better option when flexibility and model-driven decision-making are needed at scale."
The practical decision rule:
- Steps known and repeatable? Build a workflow. Cheaper, faster, debuggable.
- Path branches based on intermediate findings you cannot enumerate? Build an agent.
- Task worth more than roughly $0.10 in tokens? An agent's exploration is affordable. Below that, you almost always want a workflow.
- Latency matters in seconds, not minutes? Workflow. Agents loop, agents are slow.
- Need explainability or audit? Workflow nodes are easier to log than agent traces.
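If you want the rule as a single predicate, here is a toy encoding (the thresholds, like a 60-second latency budget, are assumptions layered on the list above, not from any source):

```python
def should_build_agent(steps_enumerable: bool, task_value_usd: float,
                       latency_budget_s: float, needs_audit: bool) -> bool:
    """Toy encoding of the decision rule above. False means: build a workflow."""
    if steps_enumerable or needs_audit:
        return False                  # workflow: cheaper, faster, debuggable
    if task_value_usd < 0.10:
        return False                  # exploration not worth the tokens
    if latency_budget_s < 60:         # assumed threshold: agents loop slowly
        return False
    return True                       # un-enumerable branching: agent territory
```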
Anthropic's first principle is to find the simplest solution possible, and only increase complexity when needed. This might mean not building agentic systems at all. That advice has aged well. Most teams who reached for an agent in 2024-25 should have built a chained LLM workflow and a retry policy.
What does a minimal agent loop look like in code?
A minimal agent is fewer than 30 lines of Python. No frameworks needed. The framework you eventually adopt (LangGraph, the Claude Agent SDK, OpenAI Agents SDK) just hardens this loop with retries, tracing, and parallel tool calls.
```python
from anthropic import Anthropic

client = Anthropic()

tools = [{
    "name": "search_web",
    "description": "Search the web and return top results.",
    "input_schema": {"type": "object",
                     "properties": {"query": {"type": "string"}},
                     "required": ["query"]},
}]

def run_tool(name, args):
    # Dispatcher: map the model's chosen tool to real code.
    if name == "search_web":
        return search_web(args["query"])  # your implementation

messages = [{"role": "user", "content": "Find the 2025 RAND AI failure rate."}]

for _ in range(10):  # iteration budget
    resp = client.messages.create(model="claude-sonnet-4-5", tools=tools,
                                  messages=messages, max_tokens=1024)
    messages.append({"role": "assistant", "content": resp.content})
    if resp.stop_reason == "end_turn":  # final answer, no tool call: done
        break
    tool_results = [  # execute each tool call, feed results back in
        {"type": "tool_result", "tool_use_id": b.id,
         "content": run_tool(b.name, b.input)}
        for b in resp.content if b.type == "tool_use"]
    messages.append({"role": "user", "content": tool_results})
```
That is the whole pattern. A loop, a model call, a tool dispatcher, a stop condition. Everything else is operational hardening: retries, parallelism, hierarchical agents, evaluation harnesses.
Which production AI agents actually work?
The clearest evidence the agent pattern works in production lives in developer tooling. Three names define the category in 2026:
- Claude Code. Terminal-native coding agent. SemiAnalysis estimated in February 2026 that Claude Code authors roughly 4% of all public GitHub commits, with a projection of 20%+ by end of 2026. That is an LLM in a loop, with tools (file edit, shell, search, test runner) and persistent project memory.
- Cursor. IDE-anchored coding agent. The Pragmatic Engineer's February 2026 survey of 906 software engineers found Cursor + Claude Code dominate the daily-driver slot, often used together: Cursor for autocomplete and inline edits, Claude Code for multi-file refactors.
- Devin (Cognition). The original autonomous task-runner. Long-running, lower interaction frequency, designed to take a Jira ticket and ship a PR.
What all three share: a strong model, a focused toolset, a tight loop, and aggressive memory management of the codebase. They are not magic. They are the four components from earlier, well-engineered, in a domain (code) where the world is observable and reversible.
Why do most AI agent projects fail in production?
80.3% of enterprise AI projects fail to deliver business value, according to RAND's 2025 meta-analysis of 65 initiatives. The breakdown:
- 33.8% abandoned before production
- 28.4% reach production but fail to deliver expected value
- 18.1% run, but never recoup costs
- 19.7% achieve or exceed business objectives
RAND identified three patterns behind nearly every failure: data quality, organizational maturity, and use-case drift. Notice what is not on that list: model capability. The model is rarely the bottleneck.
For agent projects specifically, three additional failure modes show up:
- Building an agent when a workflow would do. Adds cost, latency, and unpredictability for no upside.
- No iteration budget. Loops without hard caps run forever, burn tokens, and drift off-task.
- Tools designed for humans, not models. Underspecified schemas, ambiguous descriptions, and noisy outputs are the #1 reason agents misbehave. Anthropic's guidance is explicit: success depends critically on thoughtful toolset design and clear documentation.
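To make that last point concrete, compare an underspecified tool definition with a model-friendly one (a hypothetical `get_invoice` tool; the fields and wording are illustrative, not from Anthropic's docs):

```python
# Underspecified: the model must guess formats, units, and failure behavior.
bad_tool = {
    "name": "get_data",
    "description": "Gets data.",
    "input_schema": {"type": "object",
                     "properties": {"id": {"type": "string"}}},
}

# Model-friendly: types, value ranges, and failure modes are spelled out.
good_tool = {
    "name": "get_invoice",
    "description": ("Fetch one invoice by UUID. Returns JSON with "
                    "amount_cents (int), currency (ISO 4217 code), and "
                    "status (draft | sent | paid | void). Returns the "
                    "string 'NOT_FOUND' if no invoice has that ID."),
    "input_schema": {
        "type": "object",
        "properties": {
            "invoice_id": {
                "type": "string",
                "description": "Invoice UUID, e.g. from list_invoices.",
            },
        },
        "required": ["invoice_id"],
    },
}
```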
The builders shipping working agents in 2026 are not the ones with the smartest models. They are the ones with the cleanest tools and the tightest loops.