The Claude Agent SDK is the same agent loop that powers Claude Code, exposed as a Python and TypeScript library. You install it, set ANTHROPIC_API_KEY, call query(), and Claude handles tool execution, context management, and stopping for you. This tutorial walks through building a real research agent that takes a topic, runs WebSearch, fetches pages, and saves a markdown report -- in under 100 lines of Python, for about $0.04 per run on Claude Sonnet 4.5. Updated May 2026.

What is the Claude Agent SDK and what does it give you out of the box?

The Claude Agent SDK is a Python and TypeScript library that exposes the agent loop, built-in tools, and context management that power Claude Code. You write the prompt and the policies, the SDK runs the loop. It was renamed from the Claude Code SDK in late 2025 (see the official overview).

Out of the box you get:

  • Built-in tools: Read, Write, Edit, Bash, Glob, Grep, WebSearch, WebFetch, AskUserQuestion.
  • Agent loop: Claude calls tools, observes results, plans the next step, and stops on its own.
  • Hooks: PreToolUse, PostToolUse, Stop, SessionStart, SessionEnd -- callbacks for logging, auditing, or blocking.
  • Subagents: spawn specialised agents from a parent (AgentDefinition).
  • MCP: connect external tools via the Model Context Protocol, including in-process Python tools via create_sdk_mcp_server.
  • Sessions: resume or fork conversations by session_id.
  • Permissions: allowed_tools, disallowed_tools, and permission_mode for non-interactive runs.

The contrast with the Anthropic Client SDK is sharp: with the Client SDK you implement the while response.stop_reason == 'tool_use' loop. With the Agent SDK that loop is gone.
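To make the contrast concrete, here is a sketch of the loop the Client SDK leaves to you. A stub stands in for anthropic's messages.create so the control flow runs without an API key; the stub, its canned responses, and run_tool are illustrative, not the real anthropic API.

```python
# Sketch of the hand-rolled tool loop. StubClient stands in for the
# Anthropic Client SDK; only the while-loop shape is the point.

class StubResponse:
    def __init__(self, stop_reason, content):
        self.stop_reason = stop_reason
        self.content = content

class StubClient:
    def __init__(self):
        # First turn asks for a tool, second turn is the final answer.
        self._turns = [
            StubResponse("tool_use", [{"type": "tool_use", "name": "Bash", "input": {"cmd": "ls"}}]),
            StubResponse("end_turn", [{"type": "text", "text": "Three files: a.py, b.py, c.py"}]),
        ]

    def create(self, messages):
        return self._turns.pop(0)

def run_tool(block):
    # With the Client SDK, executing the tool is your job.
    return {"type": "tool_result", "content": f"ran {block['name']}"}

client = StubClient()
messages = [{"role": "user", "content": "What files are in this directory?"}]
response = client.create(messages=messages)
while response.stop_reason == "tool_use":
    results = [run_tool(b) for b in response.content if b["type"] == "tool_use"]
    messages.append({"role": "user", "content": results})
    response = client.create(messages=messages)
print(response.content[0]["text"])
```

Every line of that while loop is code you no longer own with the Agent SDK.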

How do you install and authenticate the Claude Agent SDK?

Install the package, export your API key, and you are done. No separate Claude Code install -- the CLI binary is bundled.

Python (3.10+):

pip install claude-agent-sdk
export ANTHROPIC_API_KEY=sk-ant-...

TypeScript / Node 18+:

npm install @anthropic-ai/claude-agent-sdk
export ANTHROPIC_API_KEY=sk-ant-...

Get a key from the Claude Console. The SDK also authenticates against Amazon Bedrock (CLAUDE_CODE_USE_BEDROCK=1), Google Vertex AI (CLAUDE_CODE_USE_VERTEX=1), and Microsoft Azure Foundry (CLAUDE_CODE_USE_FOUNDRY=1).

Note: per Anthropic's terms, you cannot resell claude.ai logins to your end users. Use API key auth in production agents.

If Opus 4.7 throws a thinking.type.enabled error, upgrade to SDK v0.2.111+ (called out in the overview docs).

What is the minimum code needed to run an agent loop?

Six lines. The query() async iterator yields messages until the agent decides to stop:

import asyncio
from claude_agent_sdk import query

async def main():
    async for message in query(prompt="What files are in this directory?"):
        print(message)

asyncio.run(main())

That is the entire control flow. Claude reads the prompt, decides it needs Bash or Glob, runs the tool, observes the output, and emits assistant + tool-result messages until it produces a final answer.

For anything beyond a hello-world, pass ClaudeAgentOptions:

from claude_agent_sdk import query, ClaudeAgentOptions

options = ClaudeAgentOptions(
    system_prompt="You are a senior code reviewer.",
    allowed_tools=["Read", "Glob", "Grep"],
    permission_mode="acceptEdits",
    max_turns=15,
)

The message stream contains four shapes you actually use: SystemMessage (init events with session_id), AssistantMessage (text + tool calls), UserMessage (tool results), and ResultMessage (the final summary with total_cost_usd, num_turns, stop_reason).

How do you build a real research agent in under 100 lines?

Below is the full source for research_agent.py. It takes a topic, runs WebSearch and WebFetch, writes a markdown report to ./reports/, and prints cost + turn count. 96 lines including imports and blank lines.

# research_agent.py -- Claude Agent SDK, Python 3.10+
import asyncio
import sys
from pathlib import Path

from claude_agent_sdk import (
    query,
    tool,
    create_sdk_mcp_server,
    ClaudeAgentOptions,
    AssistantMessage,
    TextBlock,
    ResultMessage,
    SystemMessage,
)

REPORTS_DIR = Path("./reports")
REPORTS_DIR.mkdir(exist_ok=True)


@tool(
    "save_report",
    "Save the final research report to disk as a markdown file.",
    {"topic": str, "markdown": str},
)
async def save_report(args):
    slug = args["topic"].lower().replace(" ", "-")[:60]
    path = REPORTS_DIR / f"{slug}.md"
    path.write_text(args["markdown"])
    return {"content": [{"type": "text", "text": f"Saved {path}"}]}


research_server = create_sdk_mcp_server(
    name="research", version="1.0.0", tools=[save_report]
)

SYSTEM = """You are a research agent. For the given topic:
1. Run WebSearch to find 5-7 high-quality, recent sources.
2. Use WebFetch on the 3-4 most relevant URLs.
3. Synthesise a 400-600 word markdown report with inline links.
4. Call save_report once. Do NOT call it twice. Then stop.
"""


async def research(topic: str) -> None:
    options = ClaudeAgentOptions(
        system_prompt=SYSTEM,
        allowed_tools=[
            "WebSearch",
            "WebFetch",
            "mcp__research__save_report",
        ],
        mcp_servers={"research": research_server},
        permission_mode="acceptEdits",
        max_turns=20,
    )

    async for message in query(prompt=f"Research: {topic}", options=options):
        if isinstance(message, SystemMessage) and message.subtype == "init":
            print(f"[session] {message.data['session_id']}")
        elif isinstance(message, AssistantMessage):
            for block in message.content:
                if isinstance(block, TextBlock):
                    print(block.text[:200])
        elif isinstance(message, ResultMessage):
            print(
                f"[done] cost=${message.total_cost_usd:.4f} "
                f"turns={message.num_turns} stop={message.stop_reason}"
            )


if __name__ == "__main__":
    topic = " ".join(sys.argv[1:]) or "how AI search engines pick citation sources"
    asyncio.run(research(topic))

Run it: python research_agent.py "how AI search engines pick citation sources". The agent does the rest.
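One hardening note: the slug in save_report only lowercases and swaps spaces, so punctuation like ? or / survives into the filename. A stricter stdlib-only variant (slugify is our own hypothetical helper, not an SDK export) might look like:

```python
import re

def slugify(topic: str, max_len: int = 60) -> str:
    # Collapse every run of non-alphanumerics to one hyphen, trim the
    # ends, and cap the length the way save_report does.
    slug = re.sub(r"[^a-z0-9]+", "-", topic.lower()).strip("-")
    return slug[:max_len].rstrip("-")

print(slugify("How AI search engines pick citation sources?"))
```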

How do you give your agent custom tools?

Use the @tool decorator and register the function with create_sdk_mcp_server. The SDK runs the tool in-process -- no subprocess, no IPC, no separate MCP server to deploy.

The four parts of a custom tool, per the custom-tools docs: name, description, input schema, handler. In Python:

from claude_agent_sdk import tool, create_sdk_mcp_server

@tool(
    "save_report",                            # 1. name
    "Save the report to disk as markdown.",   # 2. description
    {"topic": str, "markdown": str},          # 3. input schema
)
async def save_report(args):                  # 4. handler
    Path(f"./reports/{args['topic']}.md").write_text(args["markdown"])
    return {"content": [{"type": "text", "text": "Saved."}]}

server = create_sdk_mcp_server(
    name="research", version="1.0.0", tools=[save_report]
)

Then wire the server into options. The footgun: you must reference the tool by its MCP-prefixed name in allowed_tools, not by the function name:

options = ClaudeAgentOptions(
    mcp_servers={"research": server},
    allowed_tools=["mcp__research__save_report"],  # mcp__<server>__<tool>
)

Miss the mcp__research__ prefix and Claude never sees the tool. We hit this on attempt one.
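Since we hit it, a one-line helper (our own, not an SDK export) keeps the prefix out of hand-typed strings:

```python
def mcp_tool_name(server: str, tool: str) -> str:
    # In-process MCP tools surface to Claude as mcp__<server>__<tool>.
    return f"mcp__{server}__{tool}"

print(mcp_tool_name("research", "save_report"))
```

Then allowed_tools=[mcp_tool_name("research", "save_report")] cannot drift from the server name.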

How do you stop the agent and inspect its trace?

Two stop levers and one inspection point. Use max_turns for a hard cap, the Stop hook for conditional stops, and the ResultMessage for the trace summary.

1. Hard cap with max_turns. Without this, an agent can spin for tens of dollars on a single task:

ClaudeAgentOptions(max_turns=20)

2. Conditional stops with the Stop hook, useful for budget guards:

from claude_agent_sdk import HookMatcher

async def budget_guard(input_data, tool_use_id, context):
    if context.total_cost_usd > 0.50:
        return {"decision": "block", "reason": "Budget exceeded"}
    return {}

options = ClaudeAgentOptions(
    hooks={"Stop": [HookMatcher(matcher="*", hooks=[budget_guard])]}
)

3. Inspect the trace via ResultMessage. Per the cost-tracking docs, ResultMessage carries subtype, duration_ms, duration_api_ms, is_error, num_turns, session_id, total_cost_usd, usage, result, stop_reason, and model_usage. Log them:

[session] 8f3c2a1e-...
[done] cost=$0.0412 turns=11 stop=end_turn

For full step-by-step traces, capture every AssistantMessage (including ToolUseBlocks) and every UserMessage tool-result. That gives you the (thought -> tool_call -> observation) triplet for each turn, which is what you actually want when an agent goes off the rails.
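A sketch of that collection step, with plain dicts standing in for the SDK's message classes (the role and field names here are ours, chosen for the sketch, not the SDK's):

```python
# Fold a message stream into (thought, tool_call, observation) triplets,
# one per turn. Assistant messages carry the thought and optional tool
# call; tool_result messages carry the observation.

def build_trace(messages):
    trace, current = [], {}
    for m in messages:
        if m["role"] == "assistant":
            current = {
                "thought": m.get("text", ""),
                "tool_call": m.get("tool"),
                "observation": None,
            }
            if current["tool_call"] is None:
                trace.append(current)  # final answer: no tool this turn
        elif m["role"] == "tool_result":
            current["observation"] = m["output"]
            trace.append(current)
    return trace

msgs = [
    {"role": "assistant", "text": "Need sources first.", "tool": "WebSearch"},
    {"role": "tool_result", "output": "7 results"},
    {"role": "assistant", "text": "Here is the report."},
]
trace = build_trace(msgs)
print(len(trace))  # one tool turn + one final turn
```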

What does a Claude Agent SDK run actually cost?

$0.04 per run for our 11-turn research agent on Claude Sonnet 4.5. Per the Anthropic pricing page, Sonnet 4.5 is $3.00 per million input tokens, $15.00 per million output tokens, and SDK-triggered web search is $10 per 1,000 searches.

Measured breakdown for one run on the topic how AI search engines pick citation sources:

Item | Quantity | Cost
Input tokens (prompt + tool results) | ~4,000 | $0.012
Output tokens (reasoning + report) | ~1,600 | $0.024
WebSearch calls | 5 | $0.005
Total | -- | $0.041

Two cost levers worth knowing:

  • Prompt caching cuts repeated input cost by up to 90% (cache reads bill at 0.1x base input rate). If your system prompt is stable, caching pays for itself by run two.
  • Batch API discounts both input and output tokens by 50% for asynchronous workloads.

If you skip max_turns, a runaway loop on Sonnet 4.5 can hit $1-3 before you notice. Set the cap.
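The token math behind the breakdown, plus the caching lever, as a quick sanity check (rates from the pricing page, token counts from our measured run):

```python
INPUT_RATE = 3.00 / 1_000_000    # Sonnet 4.5, $ per input token
OUTPUT_RATE = 15.00 / 1_000_000  # Sonnet 4.5, $ per output token

input_cost = 4_000 * INPUT_RATE       # prompt + tool results
output_cost = 1_600 * OUTPUT_RATE     # reasoning + report
cached_input_cost = input_cost * 0.1  # cache reads bill at 0.1x base

print(f"tokens: ${input_cost + output_cost:.3f}, "
      f"input if fully cached: ${cached_input_cost:.4f}")
```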

[Chart] Cost per research agent run (Claude Sonnet 4.5): input tokens ($3/MTok) $0.012, output tokens ($15/MTok) $0.024, web searches ($10/1K) $0.005, total $0.041. Source: Anthropic Pricing + measured run, May 2026.

What broke when we first ran this agent?

Three things, all real, all from the first afternoon of building this. None are in the marketing pages. All are in the docs if you read carefully.

1. The MCP tool name prefix. We registered save_report and listed "save_report" in allowed_tools. Claude wrote a beautiful report, then printed "I would save this if I had a save tool." The tool name in allowed_tools must be mcp__<server>__<tool> -- in our case mcp__research__save_report. Fix: one string change.

2. The agent hung waiting for permission. Default permission_mode prompts for approval on Write/Edit/Bash. In a non-interactive script that means the process blocks on stdin forever. Fix: set permission_mode="acceptEdits" (or "bypassPermissions" if you really mean it). For production, gate with a PreToolUse hook instead.

3. The agent saved the report twice, then a third time. With no instruction to stop, it kept refining. Two fixes stacked: (a) the system prompt now says "Call save_report once. Do NOT call it twice. Then stop." and (b) max_turns=20 caps the loop hard. Belt and suspenders -- LLMs interpret "once" generously when they have tokens left.

General rule of thumb: every agent gets max_turns, an explicit stop instruction in the system prompt, and a budget hook before it sees production traffic.
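That rule can live in one place. Here is a sketch as a plain-dict merge in which the guardrail defaults always win (guardrailed is our own helper name; real code would unpack the result into ClaudeAgentOptions):

```python
GUARDRAIL_DEFAULTS = {
    "max_turns": 20,                   # hard cap, always present
    "permission_mode": "acceptEdits",  # never block on stdin in scripts
}

def guardrailed(**overrides):
    # Defaults are merged last, so a caller cannot accidentally drop
    # the cap. A budget Stop hook would be attached the same way.
    return {**overrides, **GUARDRAIL_DEFAULTS}

opts = guardrailed(system_prompt="You are a research agent.", max_turns=999)
print(opts["max_turns"])  # the cap survives the override attempt
```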

When should you reach for the Agent SDK vs the Client SDK or Managed Agents?

Pick by where the agent runs and who owns the loop.

Tool | You write the tool loop? | Runs in | Best for
Anthropic Client SDK | Yes | Your process | Fine-grained control, custom orchestration, non-Claude-Code patterns
Claude Agent SDK | No | Your process | Local prototypes, CI agents, anything that touches your filesystem
Managed Agents | No | Anthropic sandbox | Long-running production agents, no infra to operate

The Agent SDK overview recommends prototyping locally with the Agent SDK, then graduating to Managed Agents when you need hosted sessions, sandboxes, or async long-running runs.

If you are still hand-rolling a while stop_reason == 'tool_use' loop on the Client SDK and not benefiting from custom orchestration, you are paying an engineering tax for something the Agent SDK gives you for free.

Feature | Anthropic Client SDK | Claude Agent SDK | Managed Agents
Tool execution loop | You write it | Built in | Built in
Built-in tools | None | Read, Write, Bash, WebSearch, WebFetch, Glob, Grep, Edit | Same as Agent SDK
Runs in | Your process | Your process | Anthropic sandbox
Custom tools | Function calling, you execute | @tool decorator, in-process MCP | Triggered by Claude, executed by you
Session persistence | You implement | JSONL on disk | Anthropic-hosted event log
Best for | Custom orchestration | Local prototypes, CI, filesystem agents | Long-running production agents
Pricing | Token rates | Token rates (no SDK fee) | Token rates + sandbox