
This skill should be used when the user asks to "design tools for an AI agent", "build agent tools", "create function calling tools", "design tool schemas for LLMs", "build tools for Claude", "design agent capabilities", "write tool definitions", "create agent tool interfaces", "design function schemas for agents", or any variation of designing and building tools that AI agents call in B2B SaaS GTM workflows.

Tool Design for Agents

A tool is a function an AI agent can call to interact with the outside world. Search a CRM, enrich a contact, send an email, query a database. The agent reads the tool's name, description, and parameters, then decides when to call it and what arguments to pass. Good tool design means the agent calls the right tool with the right arguments on the first try. Bad tool design means the agent calls the wrong tool, passes garbage arguments, or ignores the tool entirely.

The principle: design tools from the agent's perspective. The agent sees a name, a one-sentence description, and a parameter list. If those three things aren't crystal clear, the agent will guess. Agents that guess produce inconsistent results.

Tool Anatomy

The three elements an agent sees

| Element | What the agent uses it for | Design goal |
|---|---|---|
| Name | Deciding whether this tool might be relevant | Verb-noun. Unambiguous. Instantly clear what it does |
| Description | Deciding whether to call this tool vs another | One sentence. States what the tool does AND what it returns |
| Parameters | Filling in the arguments | Clear types, clear descriptions, minimal required fields |

Name rules

  • Verb-noun format. search_contacts, enrich_company, create_deal, send_email. The verb is the action. The noun is the object
  • No ambiguous verbs. handle_contact means nothing. process_data means nothing. Use specific verbs: search, create, update, delete, enrich, validate, send, get, list
  • No overlapping names. If the agent sees search_contacts, find_contacts, and lookup_contacts, it can't distinguish them. Pick one name per action. Delete the synonyms
  • Snake_case. search_contacts not searchContacts or SearchContacts. Consistency across all tools

Good names vs bad names:

| Good | Bad | Why bad is bad |
|---|---|---|
| search_hubspot_contacts | contact_tool | No verb. Could be read, write, delete, anything |
| enrich_company | company_enrichment | Noun phrase, not an action. The agent may not recognize it as callable |
| create_hubspot_deal | deal_handler | "Handler" is vague. Create, update, delete? |
| validate_email_address | email_check | "Check" could mean validate, look up, or send a test |
| get_linkedin_profile | linkedin | No verb, no noun. Complete mystery |

Description rules

  • One sentence. State what the tool does and what it returns. "Searches HubSpot contacts by company name or job title. Returns up to 10 matching contact records with name, title, email, and company."
  • Include the return value. The agent needs to know what it gets back to plan its next step. "Creates a deal in HubSpot" is incomplete. "Creates a deal in HubSpot. Returns the deal ID and creation timestamp" tells the agent it can use the deal ID downstream
  • State limitations. "Returns up to 10 results" or "Only works for US companies" or "Requires a valid email domain." Constraints prevent the agent from expecting behavior the tool doesn't support
  • No marketing language. "Powerful contact enrichment engine that leverages AI to provide deep insights" tells the agent nothing. "Enriches a contact record with company data, title, and LinkedIn URL from Apollo" is useful

Parameter rules

| Rule | Why |
|---|---|
| Required parameters are truly required | If the tool works without a parameter, make it optional. Agents hallucinate values to satisfy unnecessary required fields |
| Each parameter has a description | "query: string" tells the agent nothing. "query: string. The company name or domain to search for" tells the agent exactly what to pass |
| Use specific types | string is loose. string with enum: ["positive", "negative", "neutral"] is tight. The tighter the type, the fewer bad calls |
| Default values for optional parameters | limit: integer, default 10. The agent doesn't need to specify common defaults |
| Maximum 5-7 parameters | More than 7 parameters and the agent struggles. If a tool needs 12 parameters, it's doing too many things. Split it |
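The rules above combine into a definition like this minimal sketch. The schema shape follows the common JSON-Schema function-calling convention; the field names and the `sort_order` parameter are illustrative assumptions, not any specific vendor's API:

```python
# Illustrative tool definition applying the name, description, and
# parameter rules above. A sketch, not a specific vendor's schema.
search_hubspot_contacts = {
    "name": "search_hubspot_contacts",  # verb-noun, snake_case
    "description": (
        "Searches HubSpot contacts by company name or job title. "
        "Returns up to 10 matching contact records with name, title, "
        "email, and company."  # states the action, the return, the limit
    ),
    "parameters": {
        "type": "object",
        "properties": {
            "query": {
                "type": "string",
                "description": "The company name or job title to search for",
            },
            "sort_order": {
                "type": "string",
                "enum": ["recent", "alphabetical"],  # tight enum, not loose string
                "description": "How to order results",
                "default": "recent",
            },
            "limit": {
                "type": "integer",
                "description": "Maximum number of results to return",
                "default": 10,  # common default, so the agent can omit it
            },
        },
        "required": ["query"],  # only what the tool truly cannot work without
    },
}
```

Note that only `query` is required: the agent can make a correct call knowing nothing but the company name.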

Tool Categories for GTM

Standard GTM tool set

| Category | Tools | What they do |
|---|---|---|
| CRM read | search_contacts, get_contact, get_company, get_deal, list_activities | Read data from the CRM without modification |
| CRM write | create_contact, update_contact, create_deal, log_activity | Modify CRM data. Always requires a human approval gate |
| Enrichment | enrich_company, enrich_contact, find_email, verify_email | Pull data from enrichment providers |
| Research | search_web, get_linkedin_profile, get_company_news, get_job_postings | Gather external data for research |
| Email | draft_email, send_email, schedule_email, get_email_status | Email composition and sending |
| Internal | format_output, count_words, validate_rules, log_result | Agent-internal helpers |

Tool category rules

  • CRM write tools always gate on human approval. The agent proposes a CRM update. A human approves it. The tool executes it. Never auto-execute CRM writes. One bad batch update cascades through workflows, scoring, and reporting
  • Enrichment tools return structured data. enrich_company returns { name, domain, employee_count, industry, funding_stage, funding_amount }. Not a paragraph of text. Structured data is easier for the agent to use correctly
  • Research tools set limits. search_web returns top 5 results with title, URL, and snippet. Not the full page content of 50 results. The agent's context window is finite
  • Internal tools don't call external services. count_words and format_output are pure functions. No API calls, no side effects. These run instantly and never fail
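As a sketch of the last rule, an internal tool is just a pure function: no network, no side effects, deterministic output. Splitting on whitespace is a simplifying assumption for illustration:

```python
def count_words(text: str) -> dict:
    """Internal helper tool: counts words in a draft.

    Pure function -- no API calls, no side effects -- so it runs
    instantly and never fails. Whitespace splitting is a simplifying
    assumption for this sketch.
    """
    words = text.split()
    return {
        "success": True,
        "data": {"word_count": len(words)},
    }
```

For example, `count_words("Quick intro email draft")` returns `{"success": True, "data": {"word_count": 4}}`.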

Designing Tool Responses

Response structure

Every tool response should follow this pattern:

```json
{
  "success": true,
  "data": { ... },
  "metadata": {
    "source": "hubspot",
    "timestamp": "2025-01-15T10:30:00Z",
    "result_count": 3,
    "truncated": false
  }
}
```

For errors:

```json
{
  "success": false,
  "error": {
    "code": "NOT_FOUND",
    "message": "No contact found with email domain 'example.com'",
    "suggestion": "Try searching by company name instead"
  }
}
```

Response rules

  • Always return structured data. JSON with named fields. Never raw text, HTML, or unprocessed API responses. The agent needs to extract specific values from the response. Named fields make this deterministic
  • Include a success/error flag. The agent needs to know whether the tool call worked before planning its next step. A raw response that might be data or might be an error message forces the agent to guess
  • Limit response size. If a tool can return 500 contacts, default to 10 and let the agent request more. Large responses burn context window and degrade agent performance
  • Include metadata. Source, timestamp, result count, whether results were truncated. The agent can use this to decide whether to make another call
  • Error suggestions help the agent recover. "No results found" is useless. "No results found. Try searching by company name instead of domain" gives the agent a recovery path
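One way to enforce this pattern is a pair of small builder helpers that every tool calls before returning. The field names match the envelopes above; the implementation is a sketch:

```python
from datetime import datetime, timezone

def ok(data, source, result_count=None, truncated=False):
    """Wrap successful tool output in the standard response envelope."""
    return {
        "success": True,
        "data": data,
        "metadata": {
            "source": source,
            "timestamp": datetime.now(timezone.utc).isoformat(),
            "result_count": result_count,
            "truncated": truncated,  # tells the agent whether to ask for more
        },
    }

def err(code, message, suggestion=None):
    """Wrap a failure in a structured error the agent can recover from."""
    return {
        "success": False,
        "error": {"code": code, "message": message, "suggestion": suggestion},
    }
```

A search tool would then return `ok(results[:10], source="hubspot", result_count=10, truncated=len(results) > 10)` on success, and `err("NOT_FOUND", "No contact found", "Try searching by company name instead")` on failure.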

Tool Composition

How agents chain tool calls

Agent task: "Research Acme Corp and find the VP of Sales"

```
Step 1: search_hubspot_contacts(company_name="Acme Corp")
  → Returns 8 contacts

Step 2: Agent examines results. No VP of Sales found.

Step 3: enrich_company(domain="acme.com")
  → Returns company data including employee count

Step 4: get_linkedin_profile(company="Acme Corp", title="VP Sales")
  → Returns LinkedIn profile with name, title, current role

Step 5: create_contact(name="Jane Smith", title="VP Sales",
  company="Acme Corp", source="linkedin")
  → Returns new contact ID
```

Composition rules

  • Each tool does one thing. search_and_enrich_contact is two tools jammed together. What if the search succeeds but enrichment fails? Split them. The agent chains them when it needs both
  • Tool outputs are tool inputs. The contact ID from create_contact feeds into log_activity(contact_id=...). Design return values to be usable as inputs to other tools
  • No side effects in read tools. search_contacts should never create a log entry, trigger a webhook, or update a record. Read tools read. Write tools write. Mixing side effects makes behavior unpredictable
  • Idempotent where possible. Calling update_contact(id, title="VP Sales") twice should produce the same result. Idempotent tools are safe to retry on failure
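A sketch of the outputs-as-inputs rule, with both tools stubbed for illustration (a real version would call the CRM API; the ID format here is invented):

```python
def create_contact(name: str, company: str) -> dict:
    """Stubbed write tool. Returns the new contact ID so downstream
    tools can use it. A real version would call the CRM API."""
    contact_id = f"contact_{abs(hash((name, company))) % 10_000}"
    return {"success": True, "data": {"contact_id": contact_id}}

def log_activity(contact_id: str, note: str) -> dict:
    """Stubbed write tool that accepts the ID produced upstream."""
    return {"success": True, "data": {"contact_id": contact_id, "note": note}}

# The agent chains them: the output of one call is the input to the next.
created = create_contact("Jane Smith", "Acme Corp")
logged = log_activity(created["data"]["contact_id"], "Sourced from LinkedIn")
```

Because `create_contact` returns a bare `contact_id` field rather than a prose confirmation, the chaining step is a simple field lookup, not a parsing problem.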

Tool Access Control

Which agents can call which tools

| Agent type | Read tools | Write tools | Enrichment tools | Send tools |
|---|---|---|---|---|
| Research agent | All CRM read | None | All enrichment | None |
| Scoring agent | Contact + company read | None | ICP fit tools | None |
| Email writer agent | Contact read (for context) | None | None | Draft only (no send) |
| Orchestrator agent | All read | CRM write (with approval) | All | Send (with approval) |

Access rules

  • Principle of least privilege. Give each agent only the tools it needs. A research agent has no business sending emails. An email writer has no business updating CRM records
  • Write tools require approval gates. Any tool that modifies external state (CRM, email, database) must have a human approval step or an explicit automation rule
  • Separate draft from send. The email agent calls draft_email. A separate approval step calls send_email. The agent never sends directly
  • Log every tool call. Agent ID, tool name, parameters, response, timestamp. This is your audit trail when something goes wrong
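The access matrix above can be enforced with a simple allow-list check before any tool executes. The agent names and tool sets below are illustrative:

```python
# Allow-list per agent, mirroring the access matrix above (illustrative).
AGENT_TOOLS = {
    "research_agent": {"search_contacts", "get_company", "enrich_company", "search_web"},
    "email_writer_agent": {"get_contact", "draft_email"},  # draft only, never send
    "orchestrator_agent": {"search_contacts", "create_contact", "draft_email", "send_email"},
}

def can_call(agent_id: str, tool_name: str) -> bool:
    """Principle of least privilege: deny unless explicitly allowed."""
    return tool_name in AGENT_TOOLS.get(agent_id, set())
```

The default is deny: an unknown agent, or a known agent calling an unlisted tool, gets `False`. So `can_call("email_writer_agent", "send_email")` fails even though `send_email` exists.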

Testing Tools

What to test

| Test type | What it validates | How |
|---|---|---|
| Schema compliance | Does the tool accept valid parameters and reject invalid ones? | Pass valid and invalid parameter combinations |
| Response format | Does the tool return the documented response structure? | Check success responses and error responses |
| Agent usability | Does the agent call the tool correctly based on name + description alone? | Give the agent a task that requires the tool. Does it pick the right tool and pass correct args? |
| Error handling | Does the tool return useful errors? Does the agent recover? | Force errors (invalid IDs, network failures). Check agent behavior |
| Edge cases | Does the tool handle empty results, special characters, rate limits? | Pass empty queries, unicode, rapid sequential calls |

Testing rules

  • Test with the agent, not just in isolation. A tool that works perfectly when called directly but confuses the agent is a bad tool. The ultimate test is: does the agent use it correctly?
  • Test tool selection. Give the agent 5 tools and a task. Does it pick the right one? If it picks the wrong tool, the name or description needs improvement
  • Test parameter filling. Does the agent pass the right arguments? If it passes company_name where domain was expected, the parameter descriptions are unclear
  • Test error recovery. Tool returns an error. Does the agent try again? Try a different approach? Give up gracefully? Or hallucinate a result?
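A minimal schema-compliance test, using a hypothetical `validate_call` helper that checks arguments against a tool's required fields. This is a sketch of the first row in the table above; a real validator would also check types:

```python
def validate_call(schema: dict, args: dict) -> bool:
    """Sketch of a schema-compliance check: every required field present,
    no unknown fields passed. A real version would also check types."""
    props = schema["parameters"]["properties"]
    required = schema["parameters"].get("required", [])
    return all(r in args for r in required) and all(a in props for a in args)

schema = {
    "name": "search_contacts",
    "parameters": {
        "type": "object",
        "properties": {"query": {"type": "string"}, "limit": {"type": "integer"}},
        "required": ["query"],
    },
}

# Valid call passes; missing required or unknown parameters are rejected.
assert validate_call(schema, {"query": "Acme Corp"})
assert not validate_call(schema, {"limit": 5})  # missing required field
assert not validate_call(schema, {"query": "Acme", "domain": "acme.com"})  # unknown field
```

The agent-usability and error-recovery rows can't be tested this way: those require running the actual agent against the tool and inspecting which tool it picks and how it responds to failures.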

Pre-Build Checklist

Before building a tool for an agent:

  • [ ] Tool has a verb-noun name (e.g., search_contacts, not contact_tool)
  • [ ] Description is one sentence stating what it does AND what it returns
  • [ ] Parameters have clear descriptions and types
  • [ ] Required parameters are truly required (tool fails without them)
  • [ ] Optional parameters have defaults
  • [ ] No more than 7 parameters
  • [ ] Response is structured JSON with success/error flag
  • [ ] Error responses include actionable suggestions
  • [ ] Response size is bounded (pagination or limits)
  • [ ] Read tools have no side effects
  • [ ] Write tools have approval gates
  • [ ] Tool tested with the actual agent (not just in isolation)
  • [ ] Tool name doesn't overlap with other tool names

Anti-Pattern Check

  • Tool named handle_data or process_input. The agent has no idea what this tool does. Use specific verb-noun names: search_contacts, enrich_company, validate_email. Every tool name should be unambiguous
  • All parameters marked as required. The agent hallucinates a "company_size" value because the tool requires it but the agent doesn't have it. Only require parameters the tool truly can't function without
  • Tool returns raw API response. The agent gets a 200-line JSON blob from HubSpot's API. 190 lines are irrelevant. Process the response in the tool. Return only the fields the agent needs
  • No error handling. Tool throws an exception. Agent receives a stack trace. It tries to extract useful information from the error message and fails. Return structured error objects with codes and suggestions
  • One tool does five things. manage_contacts(action="search|create|update|delete"). The agent struggles to use multi-action tools correctly. One tool, one action. Split it
  • Tool sends emails without approval. The email agent calls send_email directly. No human review. One hallucinated claim reaches the prospect. Separate draft_email from send_email. Require approval between them
  • 20 tools on one agent. The agent has too many options. It calls the wrong tool 30% of the time. Keep it to 5-8 tools per agent. If more are needed, split into multiple agents with focused tool sets
  • No tool call logging. Something went wrong but you can't tell which tool call caused it. Log every call with agent ID, tool name, parameters, response, and timestamp
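A sketch of the audit-trail rule from the last bullet: one structured record per tool call. The field names are an assumption; returning JSON keeps the example self-contained, where a real version would append to a log store:

```python
import json
from datetime import datetime, timezone

def log_tool_call(agent_id: str, tool_name: str, params: dict, response: dict) -> str:
    """Build one structured audit record per tool call.

    Returning the JSON line keeps this sketch self-contained; a real
    version would write it to a log store instead.
    """
    record = {
        "agent_id": agent_id,
        "tool_name": tool_name,
        "params": params,
        "success": response.get("success"),
        "timestamp": datetime.now(timezone.utc).isoformat(),
    }
    return json.dumps(record)
```

With one record like this per call, answering "which tool call caused it?" becomes a log query instead of guesswork.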