Prompt Design for Agents
A system prompt is the operating manual for an agent. It determines output quality more than model selection, tool design, or orchestration architecture. A well-prompted Sonnet outperforms a poorly prompted Opus. Invest more time here than anywhere else in agent design.
The principle: write the prompt as if you're onboarding a smart new hire who has zero context on your company, your process, or your quality bar. Be explicit about what good looks like. Show, don't just tell.
Prompt Architecture
Every agent system prompt follows the same 7-section structure. Order matters. The model pays more attention to content earlier in the prompt.
1. Identity — Who the agent is and what it does (2-3 sentences)
2. Input spec — What the agent receives and in what format
3. Output spec — Exact format, schema, required fields
4. Process — Step-by-step instructions for how to get from input to output
5. Rules — Hard constraints the agent must never violate
6. Examples — 2-3 input/output pairs showing ideal behavior
7. Edge cases — What to do when data is missing, ambiguous, or unexpected
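To make the architecture concrete, here is a minimal Python sketch of a prompt assembled from the seven sections in order. The section bodies are placeholders, not real prompt content:

```python
# Minimal sketch: assemble the seven sections in order.
# Section bodies are hypothetical placeholders, not a real prompt.
SECTIONS = [
    ("Identity", "You are a B2B account research agent that ..."),
    ("Input", "You receive the following: ..."),
    ("Output", "Return a JSON object with the following structure: ..."),
    ("Process", "Follow these steps in order: ..."),
    ("Rules", "Follow these rules without exception: ..."),
    ("Examples", "Example 1: ..."),
    ("Edge cases", "If the company is not found: ..."),
]

def build_system_prompt(sections=SECTIONS) -> str:
    """Join the sections under markdown headers, preserving order."""
    return "\n\n".join(f"## {name}\n{body}" for name, body in sections)
```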
Why this order works
- Identity first because it frames everything that follows. The model interprets all subsequent instructions through the lens of "who am I"
- Input/output specs before process because the model needs to know what it's working with and what it's producing before it reads how
- Rules after process because rules are constraints on the process. They make more sense after the model understands what it's doing
- Examples near the end because they serve as calibration. The model has absorbed the instructions and now sees what "good" looks like concretely
- Edge cases last because they're exceptions to the normal flow. The model should understand the normal flow first
Section 1: Identity
Two to three sentences. Who the agent is, what it does, and for whom.
Template:
You are a [role] that [primary action] for [audience].
Your output is used by [downstream consumer] to [downstream purpose].
[One sentence on quality bar or operating philosophy.]
Good examples:
You are a B2B account research agent that produces structured account briefs
from company names. Your output is used by SDRs and ABM marketers to craft
personalized outbound campaigns. Accuracy matters more than completeness. Never
guess. If data is not found, say so.
You are a cold email writer that generates 3-email outbound sequences for B2B
SaaS prospects. Your output is reviewed by a human before sending. Write like
a peer, not a vendor. Every email must earn the next.
Bad examples:
You are a helpful AI assistant.
(Too generic. The model has no frame for what "helpful" means in this context.)
You are an advanced AI-powered go-to-market intelligence platform that leverages
cutting-edge natural language processing to synthesize multi-source data streams
into actionable strategic insights for revenue-generating teams.
(Marketing copy, not an operating manual. The model will mirror this style in its output.)
Identity rules:
- Name the specific job, not a general capability. "Account research agent" not "helpful assistant"
- Name the downstream consumer. The agent writes differently when it knows the output goes to an SDR vs a VP
- State the quality philosophy in one sentence. "Accuracy over completeness" or "Conciseness over comprehensiveness" sets the tone for all decisions the model makes
Section 2: Input Spec
Define exactly what the agent receives. Include the format, required fields, and optional fields.
Template:
## Input
You receive the following:
- **company_name** (required): The target company name
- **domain** (optional): The company's website domain
- **icp_criteria** (optional): ICP fit criteria to evaluate against
- **context** (optional): Additional context from the requesting user
Input format: JSON object or plain text, depending on source.
Input spec rules:
- Label every field as required or optional. The agent needs to know what it can always rely on vs what might be missing
- Specify the format. "JSON object with these keys" or "plain text, one company name per line." Ambiguous input specs cause parsing failures
- Include an example input. Even one example disambiguates more than a paragraph of description
- Note what the input does NOT include. "You do not receive the prospect's email address. Do not attempt to guess or construct email addresses" prevents hallucination on fields the agent doesn't have
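Enforcing the spec can also happen in code, upstream of the agent. A minimal sketch using the field names from the template above (the error wording is an assumption):

```python
# Sketch: check an input payload against the spec before the agent runs.
# Field names match the template above; error wording is an assumption.
REQUIRED = {"company_name"}
OPTIONAL = {"domain", "icp_criteria", "context"}

def validate_input(payload: dict) -> list[str]:
    """Return a list of problems; an empty list means the input is usable."""
    problems = [f"missing required field: {f}" for f in REQUIRED - payload.keys()]
    problems += [f"unexpected field: {f}" for f in payload.keys() - REQUIRED - OPTIONAL]
    return problems

# validate_input({"domain": "acme.io"}) -> ["missing required field: company_name"]
```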
Section 3: Output Spec
The most important section. Ambiguous output specs are the #1 cause of inconsistent agent behavior.
Template:
## Output
Return a JSON object with the following structure:
{
  "company_name": "string",
  "snapshot": {
    "founded": "year or 'Unknown'",
    "hq": "city, state/country",
    "employee_count": "number or range",
    "funding": "most recent round, amount, date",
    "industry": "string"
  },
  "signals": [
    {
      "signal": "description",
      "type": "funding | hiring | product | leadership | tech_stack",
      "date": "YYYY-MM-DD or approximate",
      "source": "where found",
      "strength": "tier_1 | tier_2 | tier_3"
    }
  ],
  "problem_hypothesis": "One paragraph connecting signals to a specific problem",
  "confidence": "high | medium | low",
  "missing_fields": ["list of fields that could not be populated"]
}
Output spec rules:
- Use a concrete schema, not prose descriptions. Show the exact JSON structure or markdown template. "Return a summary of the company" produces wildly inconsistent outputs. A schema produces consistent ones
- Specify what to do for missing data. Every field should have a fallback value: "Unknown", null, or "Not found (checked [sources])". This prevents hallucination
- Include missing_fields or confidence in the schema. The agent should communicate what it doesn't know, not fill gaps with guesses
- Specify length constraints per field. "problem_hypothesis: 2-4 sentences, under 100 words" prevents both terse and bloated outputs
- Show one complete example output. The model calibrates its output format to the example more reliably than to the schema description alone
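The missing-data fallbacks can also be enforced in post-processing rather than left entirely to the model. A sketch using the field names from the schema above (the helper name and the "Unknown" fallback are assumptions):

```python
# Sketch: enforce missing-data fallbacks after generation instead of
# trusting the model to apply them. Field names match the schema above;
# the "Unknown" fallback and function name are assumptions.
import json

SNAPSHOT_FIELDS = ["founded", "hq", "employee_count", "funding", "industry"]

def apply_fallbacks(raw: str) -> dict:
    """Fill absent snapshot fields with 'Unknown' and record them in
    missing_fields, so gaps are reported rather than papered over."""
    out = json.loads(raw)
    snapshot = out.setdefault("snapshot", {})
    missing = out.setdefault("missing_fields", [])
    for field in SNAPSHOT_FIELDS:
        if not snapshot.get(field):
            snapshot[field] = "Unknown"
            missing.append(field)
    return out
```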
Section 4: Process
Step-by-step instructions for transforming input into output. Think of this as the agent's standard operating procedure.
Template:
## Process
Follow these steps in order:
1. **Search for company information.** Use the web_search tool with the query
"[company_name] funding crunchbase". Extract founding year, HQ, employee
count, and most recent funding round.
2. **Identify recent signals.** Search for "[company_name] news" and
"[company_name] hiring". Look for events in the last 90 days: funding
announcements, leadership changes, product launches, job postings for
roles relevant to [product category].
3. **Assess tech stack.** Search for "[company_name] jobs" and look for
tools mentioned in job requirements. Cross-reference with BuiltWith
if the domain is provided.
4. **Formulate problem hypothesis.** Connect the strongest signal to a
specific problem the company likely faces. Ground the hypothesis in
evidence from steps 1-3. Do not speculate beyond what the data supports.
5. **Compile output.** Assemble all findings into the output schema.
Populate missing_fields with any fields that could not be found.
Set confidence based on data coverage.
Process rules:
- Number every step. The model follows numbered sequences more reliably than prose paragraphs
- Name the specific tool to use in each step. "Use the web_search tool with query X" is better than "search for information about the company"
- Include the search queries. Specifying exact queries ("[company_name] funding crunchbase") produces more consistent results than "search for funding data"
- Tell the agent what to extract, not just where to look. "Extract founding year, HQ, employee count, and most recent funding round" is actionable. "Look at the company profile" is not
- Keep it to 5-8 steps. Fewer than 5 usually means steps are too vague. More than 8 usually means the agent's scope is too broad. Split into multiple agents
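One way to keep queries consistent is to treat them as data rather than prose. A sketch using the query strings from the template above (the helper is hypothetical):

```python
# Sketch: pin the exact queries from the steps above as data so every run
# issues the same searches. The helper function is hypothetical.
QUERY_TEMPLATES = [
    "{company_name} funding crunchbase",  # step 1: firmographics and funding
    "{company_name} news",                # step 2: recent signals
    "{company_name} hiring",              # step 2: hiring signals
    "{company_name} jobs",                # step 3: tech stack via job posts
]

def queries_for(company_name: str) -> list[str]:
    return [q.format(company_name=company_name) for q in QUERY_TEMPLATES]
```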
Section 5: Rules
Hard constraints. Non-negotiable. The agent must follow these regardless of input, context, or how "natural" a violation might feel.
Template:
## Rules
Follow these rules without exception:
### Accuracy rules
- Never fabricate information. If data is not found, report it as missing.
Do not infer, guess, or synthesize from insufficient evidence.
- Never present estimates as facts. Label every estimate: "ARR: ~$15M
(estimated from headcount, low confidence)"
- Cite sources for every claim. "Funding: $45M Series B (Crunchbase, Oct 2025)"
### Format rules
- Every email must be under 80 words (Email 1), 90 words (Email 2),
or 30 words (Email 3)
- Subject lines: ≤ 5 words, lowercase, no emoji
- No em-dashes (—) in any output. Use periods or restructure
### Content rules
- Never use these phrases: "leveraging", "in today's fast-paced world",
"best-in-class", "holistic", "synergies", "unlock", "streamline"
- Never start an email with "I". Start with the signal or the prospect
- Never use "demo" — use "teardown", "walkthrough", or "quick look"
### Behavioral rules
- If a required tool returns an error, note the error and continue
with available data. Do not retry more than once
- If confidence is "low" on any critical field, flag it in the output.
Do not bury low-confidence data in otherwise confident-looking output
Rules formatting principles:
- Group rules by category. Accuracy rules, format rules, content rules, behavioral rules. Grouping makes them scannable and reduces missed rules
- Use "never" and "always" for absolute constraints. "Never fabricate" is clearer than "try to avoid fabricating"
- Pair each rule with the specific behavior. "No em-dashes" is a rule. "No em-dashes (—) in any output. Use periods or restructure" is a rule the model can follow
- Keep rules to 10-15 max. Beyond that, the model starts dropping rules. If you have 25 rules, some of them are redundant or should be in a reference file
- Put the most important rules first. The model is more likely to follow rules that appear earlier in the list
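Because format and content rules are binary, they can double as an automated post-check. A sketch using the limits and banned phrases from the template above (the function name and return format are assumptions):

```python
# Sketch: binary rules double as an automated post-check. Word limits and
# banned phrases are taken from the template above; the function name and
# return format are assumptions.
BANNED_PHRASES = [
    "leveraging", "in today's fast-paced world", "best-in-class",
    "holistic", "synergies", "unlock", "streamline",
]

def check_rules(body: str, subject: str, max_words: int) -> list[str]:
    """Return rule violations; an empty list means the draft passes."""
    violations = []
    if len(body.split()) > max_words:
        violations.append(f"body over {max_words} words")
    if "—" in body or "—" in subject:
        violations.append("em-dash in output")
    if len(subject.split()) > 5 or subject != subject.lower():
        violations.append("subject must be 5 words or fewer, lowercase")
    lowered = body.lower()
    violations += [f"banned phrase: {p}" for p in BANNED_PHRASES if p in lowered]
    return violations
```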
Section 6: Examples
Two to three input/output pairs showing exactly what ideal behavior looks like. Examples are the most powerful calibration tool available.
Example design rules:
- Show complete input and complete output. Partial examples create ambiguity
- Choose examples that demonstrate different scenarios. One straightforward case, one case with missing data, one edge case
- Annotate what makes the example good. After the output, add a brief note: "Note: this output correctly handles the missing funding data by reporting 'Not found' instead of guessing"
- Use realistic data. Fake data that's obviously fake ("Acme Corp, founded 2020, 50 employees") trains the model differently than realistic data. Use anonymized real examples when possible
- Match the output exactly to the output spec. If the spec says JSON, the example should be JSON. If the spec says markdown with headers, the example should be markdown with headers
Example count:
- 2 examples minimum. One example shows format. Two examples show range
- 3 examples ideal for complex agents. Straightforward case, partial data case, edge case
- More than 4 is diminishing returns and consumes context. If you need 5+ examples, the output spec is probably underspecified
Section 7: Edge Cases
Explicit instructions for scenarios that fall outside the normal process. The model handles edge cases well when told what to do. It handles them poorly when left to improvise.
Common edge cases for GTM agents:
| Edge case | What to do |
|---|---|
| Company not found (no search results) | Return output with all fields set to "Not found." Set confidence to "low." Do not construct a profile from partial data |
| Company is pre-revenue / stealth | Note "Pre-revenue / stealth mode" in snapshot. Signals and tech stack will be sparse. Set confidence accordingly |
| Multiple companies with the same name | Use the domain to disambiguate. If no domain provided, note the ambiguity and pick the most likely match based on ICP criteria. Flag in output |
| Company was recently acquired | Note the acquisition. Research the parent company if the original company no longer operates independently |
| Contact has left the company | Note "No longer at [company] as of [date]." Do not include in the committee map |
| Signal is ambiguous (could be positive or negative) | Report the signal with the ambiguity noted. Do not force-classify as positive or negative. Let the human reviewer interpret |
| Input is not a company name | Return an error: "Input does not appear to be a company name. Received: [input]." Do not attempt to process |
Edge case rules:
- List 5-8 edge cases. Cover the scenarios that would cause the agent to produce bad output if not explicitly handled
- For each edge case, give a specific instruction. "Handle gracefully" is not an instruction. "Return output with all fields set to 'Not found'" is
- Include the "I don't know" case. Every agent should have explicit permission and instructions to say "I couldn't find this" rather than fabricating
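The last edge case in the table can be caught before the agent runs at all. A rough pre-check sketch (the heuristic is deliberately crude and purely illustrative):

```python
# Sketch: catch the "input is not a company name" case before processing.
# The heuristic is deliberately crude and purely illustrative.
import re

def precheck(company_name: str) -> str | None:
    """Return an error message for obviously invalid input, else None."""
    stripped = company_name.strip()
    if not stripped:
        return "Input does not appear to be a company name. Received: (empty)"
    if re.fullmatch(r"[\d\W_]+", stripped):
        return f"Input does not appear to be a company name. Received: {stripped}"
    return None
```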
Prompt Anti-Patterns
1. The essay prompt
A 3,000-word prose prompt with no structure, no headers, no numbered steps. The model loses track of instructions buried in paragraphs.
Fix: Use the 7-section architecture. Headers, numbered lists, tables. Structure makes prompts scannable for the model just like it does for humans.
2. The vague output spec
"Return a helpful summary of the company." What format? How long? What fields? What's "helpful"?
Fix: Exact schema with field types, length constraints, and a complete example output.
3. Rules buried in process steps
"In step 3, make sure you don't use em-dashes and also keep it under 80 words and don't mention the competitor by name." Rules mixed into process instructions get missed.
Fix: Dedicated Rules section. All constraints in one scannable place.
4. No examples
Instructions without examples leave the model to interpret quality on its own. Its interpretation rarely matches yours.
Fix: Two to three examples minimum. Complete input-output pairs with annotations.
5. Contradictory instructions
"Be concise" in the identity section and "provide comprehensive detail" in the output spec. The model tries to satisfy both and fails at both.
Fix: Read the prompt end-to-end and check for contradictions. When two instructions conflict, delete one.
6. Persona bloat
"You are a world-class expert in B2B SaaS go-to-market strategy with deep expertise in..." This doesn't improve output. It wastes tokens and primes the model for verbose, self-important responses.
Fix: Two-sentence identity. Role + purpose + quality bar. No superlatives.
7. Over-constraining with soft rules
"Try to keep the output concise." "Consider mentioning the competitor if relevant." "You might want to include a proof point." Soft rules are effectively suggestions. The model follows them inconsistently.
Fix: Make every rule binary. Either it's a hard constraint ("under 80 words") or remove it. Soft rules create inconsistent output.
Prompt Iteration Process
Diagnose before changing
When agent output is bad, diagnose which section failed before editing.
| Symptom | Likely cause | Section to fix |
|---|---|---|
| Output format is wrong | Output spec is ambiguous | Section 3: Output spec |
| Agent skips steps | Process is unclear or too long | Section 4: Process |
| Agent violates a rule | Rule is buried or contradicted | Section 5: Rules |
| Output tone is off | Identity or examples set wrong tone | Section 1: Identity, Section 6: Examples |
| Agent hallucinates data | No explicit "don't guess" rule + no edge case handling | Section 5: Rules, Section 7: Edge cases |
| Output is inconsistent across runs | No examples or examples are too similar | Section 6: Examples |
Change one thing at a time
- Make one edit per iteration. If output is too long and factually inaccurate, fix length first, re-test, then fix accuracy
- Track every prompt version. Save each version with a timestamp and note what changed and why
- Re-run the same test inputs after each change. Compare outputs side-by-side to measure improvement
- After 3 iterations on the same section without improvement, the problem may be in a different section. Re-diagnose
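A minimal harness for the version-tracking and re-test discipline above (the file layout and the run_agent callable are assumptions):

```python
# Sketch: version every prompt and re-run the same test inputs after each
# change. The file layout and the run_agent callable are assumptions.
import json
import pathlib
import time

def save_version(prompt: str, note: str, directory: str = "prompt_versions") -> None:
    """Save a timestamped copy of the prompt plus a note on what changed and why."""
    path = pathlib.Path(directory)
    path.mkdir(exist_ok=True)
    stamp = time.strftime("%Y%m%d-%H%M%S")
    (path / f"{stamp}.md").write_text(prompt)
    (path / f"{stamp}.note.txt").write_text(note)

def regression_run(run_agent, test_inputs: list, out_file: str) -> None:
    """Run the unchanged test set and dump outputs for side-by-side comparison."""
    results = [{"input": t, "output": run_agent(t)} for t in test_inputs]
    pathlib.Path(out_file).write_text(json.dumps(results, indent=2))
```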
Prompt review checklist
Before deploying any agent prompt:
- [ ] Identity is 2-3 sentences. Names the role, action, audience, and quality bar
- [ ] Input spec lists every field with required/optional labels and format
- [ ] Output spec includes exact schema with field types and length constraints
- [ ] Output spec includes instructions for missing data (no guessing)
- [ ] Process has 5-8 numbered steps with specific tools and queries named
- [ ] Rules are grouped by category, use "never"/"always", and total ≤ 15
- [ ] 2-3 complete input/output examples are included
- [ ] Edge cases cover missing data, ambiguous input, and the "I don't know" case
- [ ] No contradictions between sections
- [ ] No soft rules ("try to", "consider", "you might want to")
- [ ] No persona bloat or marketing language in the identity
- [ ] Total prompt length is under 2,500 words (move reference material to separate files)
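Several checklist items are mechanically checkable before deploy. A sketch covering the length and structure checks (the header names assume the templates above):

```python
# Sketch: the length and structure items on the checklist are mechanically
# checkable. Header names assume the templates above; the 2,500-word
# threshold is from the checklist.
def lint_prompt(prompt: str) -> list[str]:
    issues = []
    if len(prompt.split()) > 2500:
        issues.append("prompt over 2,500 words; move reference material to files")
    for header in ("Input", "Output", "Process", "Rules"):
        if f"## {header}" not in prompt:
            issues.append(f"missing section: ## {header}")
    return issues
```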
Anti-Pattern Check
- Prompt is over 3,000 words with no reference files. Move detailed reference material (banned phrase lists, scoring rubrics, example databases) into separate files the agent loads as needed. The system prompt should be the operating manual, not the encyclopedia
- No examples in the prompt. Examples are the strongest calibration tool. Two examples outperform 500 words of instructions. Always include them
- Output spec says "return a summary." Summaries are subjective and inconsistent. Define the exact schema, fields, and constraints
- Rules say "try to avoid." Make it binary. "Never" or remove the rule. Soft constraints produce inconsistent output
- Process section has 12 steps. The agent's scope is too broad. Split into 2-3 agents with 4-6 steps each
- Changed 5 things in the prompt at once. Now output is different but you don't know which change helped or hurt. One edit per iteration
- Same prompt used across models. Different models respond differently to the same prompt. When switching from Sonnet to Opus or Haiku, re-test and adjust, especially the rules section, which smaller models follow less reliably