The Claude Agent SDK is Anthropic's official library for building agents on the same engine that powers Claude Code, available in TypeScript and Python. The SDK itself is free; you pay only for the Claude API tokens your agent consumes. It runs against the Anthropic API, Amazon Bedrock, Google Vertex AI, or Microsoft Foundry. This FAQ answers 18 of the most common questions engineers ask in the Anthropic Discord, on GitHub Issues, and on Stack Overflow, each in 80 to 120 words with citations to the official docs.

Is the Claude Agent SDK free?

Yes. The Claude Agent SDK is free and open source. You pay only for the Claude API tokens your agent consumes. There are no SDK license fees, no per-agent charges, and no SDK-side usage caps.

Token pricing in 2026, per Anthropic's pricing page, is:

  • Haiku 4.5: $1 / $5 per million input/output tokens
  • Sonnet 4.6: $3 / $15
  • Opus 4.6: $5 / $25

A typical agent run that reads 10 to 20 files and produces a report costs $0.05 to $0.50. Anthropic's Batch API processes async requests within 24 hours at a flat 50% discount on every token, which is a useful cost lever for high-volume offline agents.
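As a back-of-the-envelope check on those numbers, the arithmetic can be sketched in Python. Prices are as listed above; a real bill also reflects prompt caching and retries, so treat this as an estimate only.

```python
# Per-million-token prices (input, output) in USD, as listed above.
PRICES = {
    "haiku-4.5": (1.00, 5.00),
    "sonnet-4.6": (3.00, 15.00),
    "opus-4.6": (5.00, 25.00),
}

def estimate_cost(model: str, input_tokens: int, output_tokens: int, batch: bool = False) -> float:
    # batch=True applies the flat 50% Batch API discount to every token.
    price_in, price_out = PRICES[model]
    cost = input_tokens / 1_000_000 * price_in + output_tokens / 1_000_000 * price_out
    return cost / 2 if batch else cost

# A report-style run: ~60K tokens in, ~4K out on Sonnet.
print(round(estimate_cost("sonnet-4.6", 60_000, 4_000), 2))              # 0.24
print(round(estimate_cost("sonnet-4.6", 60_000, 4_000, batch=True), 2))  # 0.12
```

Running the same workload on Haiku instead of Sonnet cuts the estimate to about a third, which is why model routing (covered below) matters for cost.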

What's the difference between Claude Code and the Claude Agent SDK?

Claude Code is the finished CLI/IDE product. The Claude Agent SDK is the underlying engine, exposed as a library. Everything Claude Code does, the SDK can do, because Claude Code is built on top of the SDK.

Think of it like buying a car versus buying the engine and chassis:

|                | Claude Code                              | Claude Agent SDK                |
|----------------|------------------------------------------|---------------------------------|
| Interface      | Interactive CLI, IDE plugin, desktop app | query() async generator         |
| Slash commands | Yes (/init, /compact)                    | No                              |
| Use case       | Human-in-the-loop coding                 | Programmatic agents in your app |
| Setup          | Zero                                     | You assemble it                 |

Per the Agent SDK overview, the decision reduces to a single question: is a human driving the agent, or is your application?

What languages does the Claude Agent SDK support?

Officially TypeScript and Python only. Anthropic ships @anthropic-ai/claude-agent-sdk on npm and claude-agent-sdk on PyPI. There is an open GitHub feature request for an official Go SDK, plus community wrappers for Go and Elixir.

From any other language, you can shell out to the Claude CLI directly. Both official SDKs are thin wrappers around a subprocess that exchanges JSON-lines messages over stdio, documented in the hosting guide. If you spawn claude --output-format stream-json from Rust, Java, or Ruby, you get the same agent loop, hooks, and tool stream that the official SDKs use.
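A minimal Python consumer of that stream might look like the sketch below. The JSON-lines framing is the stable part; exact flags such as -p and --verbose vary by CLI version, so check claude --help on yours.

```python
import json
import subprocess

def parse_stream_line(line: str) -> dict:
    # Each stdout line is one complete JSON object with a "type" discriminator.
    return json.loads(line)

def run_agent(prompt: str):
    # Spawn the CLI directly; requires `claude` on PATH and an API key in the env.
    proc = subprocess.Popen(
        ["claude", "-p", prompt, "--output-format", "stream-json", "--verbose"],
        stdout=subprocess.PIPE,
        text=True,
    )
    for line in proc.stdout:
        if line.strip():
            yield parse_stream_line(line)

# Parsing a line shaped like one the CLI emits:
sample = '{"type": "assistant", "message": {"content": [{"type": "text", "text": "hi"}]}}'
print(parse_stream_line(sample)["type"])  # assistant
```

Any language with a subprocess API and a JSON parser can follow the same pattern.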

How do I install the Claude Agent SDK?

One package per language. The Claude Code CLI is bundled automatically, so no separate install is required.

Python (3.10+):

pip install claude-agent-sdk
export ANTHROPIC_API_KEY=sk-ant-...

TypeScript / Node 18+:

npm install @anthropic-ai/claude-agent-sdk
export ANTHROPIC_API_KEY=sk-ant-...

A minimal Python agent looks like this, per the Python reference:

import asyncio
from claude_agent_sdk import query

async def main():
    async for message in query(prompt="Summarize README.md"):
        print(message)

asyncio.run(main())

If you migrated from the older Claude Code SDK, swap @anthropic-ai/claude-code for @anthropic-ai/claude-agent-sdk -- the API is otherwise compatible.

Can the Claude Agent SDK run on AWS Bedrock or Vertex AI?

Yes. The SDK supports AWS Bedrock, Google Vertex AI, and Microsoft Foundry as model providers. Set one environment variable and configure the cloud credentials, and the same agent code runs against the new provider.

# AWS Bedrock
export CLAUDE_CODE_USE_BEDROCK=1
export AWS_REGION=us-west-2

# Google Vertex AI
export CLAUDE_CODE_USE_VERTEX=1
export CLOUD_ML_REGION=us-east5

# Microsoft Foundry
export CLAUDE_CODE_USE_FOUNDRY=1

Per the Bedrock integration docs, traffic stays inside your AWS security boundary with zero operator access. AWS also publishes a reference architecture for hosting the Agent SDK on Bedrock AgentCore for enterprise deployments.

Can I run the Claude Agent SDK in AWS Lambda?

Yes, but with caveats. The SDK spawns the Claude CLI as a subprocess and maintains state on disk, which clashes with Lambda's read-only filesystem and 15-minute execution cap.

Your options, per the hosting guide:

  1. Lambda with /tmp workspace -- works for short, single-shot agents. Bundle the CLI in a Lambda layer and point cwd at /tmp.
  2. Bedrock AgentCore Runtime -- managed serverless host built for the Agent SDK, no CLI bundling required.
  3. ECS Fargate or Cloud Run -- the recommended path for long-running agents, since you keep a persistent filesystem and can run beyond 15 minutes.

For agents that exceed the Lambda timeout, AgentCore or Fargate are safer choices.

Does the Claude Agent SDK support streaming?

Yes. Streaming is the default. query() returns an async iterator of typed messages, so tokens, tool calls, and tool results stream back as the agent runs.

Under the hood, the SDK and CLI communicate over a JSON-lines stdio stream. Each line is a complete JSON object: regular messages (assistant text, tool input, tool output, cost) and control messages (permission requests, hook callbacks). The TypeScript reference documents the full message taxonomy.

for await (const msg of query({ prompt: "..." })) {
  if (msg.type === "assistant") {
    for (const block of msg.message.content) {
      if (block.type === "text") process.stdout.write(block.text);
    }
  }
}

Use this stream to forward partial output to your UI in real time.

How do hooks work in the Claude Agent SDK?

Hooks are callbacks that fire on agent lifecycle events, letting you intercept and modify behavior without forking the SDK. Per the hooks docs, the events are PreToolUse, PostToolUse, SubagentStart, SubagentStop, Notification, and Stop.

A PreToolUse hook receives tool_name and tool_input and returns a decision:

async def block_rm_rf(input, tool_use_id, context):
    if input["tool_name"] == "Bash" and "rm -rf" in input["tool_input"]["command"]:
        return {"hookSpecificOutput": {
            "hookEventName": "PreToolUse",
            "permissionDecision": "deny",
            "permissionDecisionReason": "rm -rf is blocked"
        }}
    return {}

PostToolUse hooks can append additionalContext or replace updatedToolOutput before Claude sees it. Use hooks for redaction, audit logging, and policy enforcement.
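For instance, a PostToolUse hook can redact secrets before Claude ever sees them. A sketch following the hook return shape shown above; the tool_response input field name is illustrative, so verify it against the hooks docs for your SDK version.

```python
import re

# Matches Anthropic-style API keys; extend for your own secret formats.
SECRET = re.compile(r"sk-ant-[A-Za-z0-9_-]+")

async def redact_secrets(input, tool_use_id, context):
    # PostToolUse hook: mask API keys in the tool's output before the model reads it.
    output = str(input.get("tool_response", ""))
    if SECRET.search(output):
        return {"hookSpecificOutput": {
            "hookEventName": "PostToolUse",
            "updatedToolOutput": SECRET.sub("[REDACTED]", output),
        }}
    return {}
```

The same shape works for PII scrubbing: swap the regex and keep the return structure.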

What models can I use with the Claude Agent SDK?

Any Claude model exposed by your provider. In 2026, the models reference lists three production tiers:

  • Claude Haiku 4.5 -- fastest and cheapest, ideal for routing and high-volume tools
  • Claude Sonnet 4.6 -- the default agent model, balances cost and reasoning
  • Claude Opus 4.6 -- highest reasoning, used for complex planning and tool-heavy workflows

Select a model per query() call:

async for msg in query(prompt="...", options={"model": "claude-sonnet-4-6"}):
    ...

On Bedrock and Vertex, use the provider-prefixed model IDs (for example, anthropic.claude-sonnet-4-6-v1:0). Subagents can use a different model than the parent, which lets you route cheap delegations to Haiku while keeping Sonnet for orchestration.

What are the rate limits for the Claude Agent SDK?

Rate limits are inherited from your underlying API account. The SDK adds no extra limits, but the Messages API meters every call. Per the rate limits docs, the metric trio is:

  • RPM -- requests per minute
  • ITPM -- input tokens per minute
  • OTPM -- output tokens per minute

Limits use a token-bucket algorithm that refills continuously. Exceeding any one returns a 429 with a retry-after header. Tier 1 starts around 50 RPM and 50K ITPM; Tier 4 reaches 4,000 RPM. For higher limits, request Priority Tier on the Limits page. On Bedrock and Vertex, AWS or Google quotas apply instead of Anthropic's.
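Client-side, the standard response to a 429 is to wait out retry-after and try again. A generic Python sketch; the call argument stands in for whatever issues your request, and note that official API clients may already retry for you.

```python
import time

class RateLimited(Exception):
    """Raised by a request function on HTTP 429, carrying the retry-after value."""
    def __init__(self, retry_after: float):
        super().__init__(f"rate limited; retry after {retry_after}s")
        self.retry_after = retry_after

def with_retries(call, max_attempts=5, sleep=time.sleep):
    # Wait out retry-after on each 429; re-raise after max_attempts failures.
    for attempt in range(max_attempts):
        try:
            return call()
        except RateLimited as exc:
            if attempt == max_attempts - 1:
                raise
            sleep(exc.retry_after)
```

Injecting sleep as a parameter keeps the loop testable and lets you cap or jitter the wait.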

How do MCP servers work with the Claude Agent SDK?

MCP (Model Context Protocol) servers extend your agent with external tools and data sources. The SDK supports three transports per the MCP docs: stdio (local subprocess), HTTP/SSE (remote), and in-process (TypeScript or Python function inside your app).

import { query, createSdkMcpServer, tool } from "@anthropic-ai/claude-agent-sdk";

const server = createSdkMcpServer({
  name: "crm",
  tools: [tool("get_contact", "Look up a contact", schema, async (args) => { ... })]
});

await query({ prompt: "...", options: { mcpServers: { crm: server } } });

Tools become available as mcp__crm__get_contact. List that fully-qualified name in allowedTools to skip the permission prompt.
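The naming scheme is mechanical, so allowedTools entries can be generated rather than hand-typed. A trivial sketch (update_contact is a hypothetical second tool):

```python
def mcp_tool_name(server: str, tool: str) -> str:
    # MCP tools are exposed to the agent as mcp__<server>__<tool>.
    return f"mcp__{server}__{tool}"

allowed = [mcp_tool_name("crm", t) for t in ("get_contact", "update_contact")]
print(allowed)  # ['mcp__crm__get_contact', 'mcp__crm__update_contact']
```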

How do I add custom tools to the Claude Agent SDK?

Use the in-process MCP server. Per the custom tools docs, this is the simplest path because tools run inside your application, not as a separate subprocess.

from claude_agent_sdk import tool, create_sdk_mcp_server, query

@tool("get_weather", "Get current weather", {"city": str})
async def get_weather(args):
    return {"content": [{"type": "text", "text": f"Sunny in {args['city']}"}]}

server = create_sdk_mcp_server(name="weather", tools=[get_weather])

async for msg in query(
    prompt="What's the weather in Tokyo?",
    options={"mcp_servers": {"weather": server}, "allowed_tools": ["mcp__weather__get_weather"]}
):
    print(msg)

The tool runs in your process, so it can hit databases, internal APIs, or domain logic without an extra hop.

How do subagents work in the Claude Agent SDK?

Subagents are specialist agents with isolated context windows that the parent can delegate to. Per the subagents docs, each subagent has its own system prompt, tool list, and conversation. Only the final message returns to the parent, which keeps the parent's context clean.

Define subagents programmatically or in .claude/agents/*.md:

await query({
  prompt: "Review this PR",
  options: {
    agents: {
      "security-scanner": { description: "...", prompt: "...", tools: ["Read", "Grep"], model: "haiku" }
    }
  }
});

Multiple subagents can run concurrently. A code review that runs style-checker, security-scanner, and test-coverage in parallel turns minutes into seconds.
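Conceptually, this is a fan-out/fan-in: each specialist runs in its own context and only the final message returns. A toy asyncio sketch with stub subagents -- not the SDK API, which handles the concurrency inside query():

```python
import asyncio

async def subagent(name: str, delay: float) -> str:
    # Stand-in for a subagent run; only its final message comes back to the parent.
    await asyncio.sleep(delay)
    return f"{name}: done"

async def review_pr():
    # Three specialists run concurrently; wall time tracks the slowest, not the sum.
    return await asyncio.gather(
        subagent("style-checker", 0.01),
        subagent("security-scanner", 0.02),
        subagent("test-coverage", 0.01),
    )

print(asyncio.run(review_pr()))
```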

How do I configure permissions in the Claude Agent SDK?

Permissions decide which tools and Bash commands run without asking the user. The permissions docs define six modes: default, acceptEdits, plan, bypassPermissions, delegate, and dontAsk.

await query({
  prompt: "...",
  options: {
    permissionMode: "default",
    allowedTools: ["Read", "Grep", "Bash(npm test)"],
    disallowedTools: ["Bash(rm *)", "Bash(sudo *)"],
    canUseTool: async (name, input) => ({ behavior: "allow", updatedInput: input })
  }
});

Glob patterns let you allow Bash(npm *) while blocking Bash(sudo *). For unattended production runs, use bypassPermissions plus a tight allowedTools list, or supply a canUseTool callback that enforces your own policy.
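A canUseTool callback can enforce the same allow/deny logic in your own code. A Python sketch using fnmatch-style patterns; the pattern lists, the deny-side message field, and the matching semantics here are illustrative, not the SDK's exact matcher.

```python
from fnmatch import fnmatch

ALLOW = ["npm *", "git status", "git diff*"]  # hypothetical policy
DENY = ["sudo *", "rm *"]

def bash_allowed(command: str) -> bool:
    # Deny patterns win; otherwise the command must match an allow pattern.
    if any(fnmatch(command, pattern) for pattern in DENY):
        return False
    return any(fnmatch(command, pattern) for pattern in ALLOW)

async def can_use_tool(tool_name, tool_input):
    if tool_name == "Bash" and not bash_allowed(tool_input.get("command", "")):
        return {"behavior": "deny", "message": "command blocked by policy"}
    return {"behavior": "allow", "updatedInput": tool_input}
```

Keeping the policy in one pure function (bash_allowed) makes it easy to unit test apart from the agent.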

Can I resume a Claude Agent SDK session?

Yes. Sessions are persisted to disk and can be resumed by ID. Per the sessions docs, the SDK stores history under ~/.claude/projects/<encoded-cwd>/<session-id>.jsonl, where <encoded-cwd> is the working directory with non-alphanumeric characters replaced by -.

async for msg in query(prompt="continue", options={"resume": session_id}):
    print(msg)

The most common gotcha: if resume returns a fresh session, you launched the SDK from a different cwd than the original. Pin cwd explicitly in your options when resuming.

Both continue (resume the most recent session) and resume (a specific session ID) append to the existing history rather than starting over.
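Given the encoding rule above, you can compute where a session transcript lives on disk. A sketch assuming the simple non-alphanumeric-to-hyphen rule stated above:

```python
import re
from pathlib import Path

def session_path(cwd: str, session_id: str) -> Path:
    # Non-alphanumeric characters in the cwd become "-" in the directory name.
    encoded = re.sub(r"[^A-Za-z0-9]", "-", cwd)
    return Path.home() / ".claude" / "projects" / encoded / f"{session_id}.jsonl"

print(session_path("/home/me/app", "abc123"))
```

This is also a quick way to debug the resume-returns-a-fresh-session gotcha: compute the path for both working directories and see which one actually holds the transcript.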

How do I customize the system prompt?

Three options, per the modifying system prompts docs:

  1. Append to the Claude Code preset -- inherit Claude Code's full agent behavior and add your own instructions on top.
  2. Fully custom prompt -- replace it entirely (you lose the Claude Code agent loop instructions).
  3. Output styles -- file-based persistent style configs in .claude/output-styles/.

await query({
  prompt: "...",
  options: {
    systemPrompt: { type: "preset", preset: "claude_code", append: "You are a SQL specialist. Always EXPLAIN before running." }
  }
});

For production agents that don't need filesystem or shell capabilities, a fully custom prompt is leaner. For coding agents, always start from the claude_code preset -- it includes the agent loop, planning behavior, and tool-use guidance.

Does the Claude Agent SDK support OpenTelemetry?

Yes. The SDK exports OpenTelemetry traces, metrics, and events to any OTLP backend. Per the observability docs, telemetry is off by default; turn it on with environment variables:

export CLAUDE_CODE_ENABLE_TELEMETRY=1
export OTEL_EXPORTER_OTLP_ENDPOINT=https://otlp.your-collector.com
export OTEL_EXPORTER_OTLP_PROTOCOL=http/protobuf

The SDK propagates TRACEPARENT and TRACESTATE into the CLI subprocess, so the agent's claude_code.interaction span becomes a child of any active span in your app. Honeycomb, Datadog, Grafana, Langfuse, and SigNoz all accept the OTLP signals natively. Tool calls, token usage, latency, and cost are captured automatically.

How do I sandbox file system access in the Claude Agent SDK?

The SDK ships with OS-level sandboxing for the Bash tool. Per the secure deployment docs, it uses bubblewrap on Linux and sandbox-exec on macOS to enforce filesystem and network isolation.

Defaults:

  • Write access: current working directory and subdirectories only
  • Read access: entire machine, except explicitly denied paths
  • Network: routed through a built-in proxy or removed via network namespaces

For production, run the SDK inside an ephemeral container (Docker, Firecracker, or Bedrock AgentCore Runtime) so that even a sandbox escape lands in a disposable environment. Combine container isolation with a tight allowedTools list, and you have defense in depth.