
icp-modeling-with-data

This skill should be used when the user asks to "model ICP from data", "use data to define ICP", "analyze closed-won deals for ICP", "build an ICP from CRM data", "use win/loss data for ICP", "score ICP fit from data", "build a predictive ICP model", "analyze customer data for ICP patterns", "quantify ICP from closed-won", or any variation of using closed-won, churn, and CRM data to quantitatively model the ideal customer profile for B2B SaaS.

ICP Modeling with Data

ICP modeling uses closed-won deal data, churn data, and customer health data to quantitatively identify which company attributes predict success. Instead of guessing "we think mid-market SaaS is our ICP," you analyze 50-100 closed-won deals and find that companies with 80-300 employees, Series A-B funding, and a sales-led motion close 3x faster and churn 60% less. The data tells you the ICP. Your job is to listen.

The principle: the ICP model is only as good as the data behind it. Minimum 50 closed-won deals for a directional model. Minimum 100 for a statistically meaningful one. If you have fewer than 50 wins, use the qualitative icp-definition-framework skill instead.

The Data You Need

Required datasets

| Dataset | Source | Minimum size | What it tells you |
| --- | --- | --- | --- |
| Closed-won deals | CRM (Opportunities, Closed Won) | 50+ deals | Which companies actually buy |
| Closed-lost deals | CRM (Opportunities, Closed Lost) | 50+ deals | Which companies evaluate but don't buy. The contrast with won reveals fit |
| Churned customers | CS platform or CRM | 20+ churned accounts | Which companies leave. Anti-ICP signal |
| Active healthy customers | CS platform or CRM | 30+ accounts | Which companies stay and grow. Strongest ICP signal |
| Enrichment data | Apollo, Clearbit, or CRM enrichment fields | All accounts enriched | Firmographic and technographic attributes for analysis |

Data fields to collect per deal/account

| Field | Source | Type | Why needed |
| --- | --- | --- | --- |
| Company name | CRM | Text | Identification |
| Domain | CRM or enrichment | Text | Dedup and enrichment key |
| Employee count (at time of deal) | Enrichment | Number | Size segmentation |
| Industry | Enrichment | Category | Vertical analysis |
| Funding stage | Crunchbase or enrichment | Category | Stage analysis |
| ARR / revenue range | Enrichment or estimate | Number | Revenue segmentation |
| Geography (HQ) | Enrichment | Category | Regional analysis |
| GTM motion | Manual or inferred from team composition | Category | Motion matching |
| Deal ACV | CRM | Number | Value segmentation |
| Sales cycle (days) | CRM (created date to closed date) | Number | Velocity analysis |
| Deal source | CRM | Category | Channel analysis |
| Champion title | CRM (primary contact) | Text | Persona analysis |
| Tech stack | Job postings or enrichment | List | Stack-based segmentation |
| Outcome | CRM | Won / Lost / Churned / Active | The dependent variable |
| Churn date (if applicable) | CS platform | Date | Retention analysis |
| Expansion revenue (if applicable) | Billing | Number | Growth analysis |
| NPS or health score (if available) | CS platform | Number | Satisfaction correlation |

The Analysis Process

Step 1: Build the analysis dataset

Merge all data into one spreadsheet or database with one row per deal/account.

Columns:
company | domain | employees | industry | funding | geography |
gtm_motion | acv | sales_cycle_days | source | champion_title |
tech_stack | outcome (won/lost/churned/active) | churn_date |
expansion_revenue | health_score

Dataset rules:

  • Every row must have the outcome field populated (won, lost, churned, active). Without the outcome, the row is unusable
  • Enrich missing fields before analysis. A dataset with 40% blank industry fields produces weak industry insights. Enrich from Apollo or Clearbit before analyzing
  • Separate "active" customers by tenure. A customer active for 6 months is different from one active for 3 years. Add a "months_active" column
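A minimal sketch of Step 1 in plain Python, assuming a hypothetical CRM export and an enrichment export keyed by domain (all field and company names are illustrative, not a real CRM schema):

```python
# Hypothetical CRM export: one dict per opportunity. Field names follow the
# column list above but are illustrative, not a real CRM schema.
crm_rows = [
    {"company": "Acme", "domain": "acme.com", "acv": 42000,
     "outcome": "won", "sales_cycle_days": 28},
    {"company": "Globex", "domain": "globex.com", "acv": 8000,
     "outcome": "churned", "sales_cycle_days": 45},
    {"company": "Initech", "domain": "initech.com", "acv": 0,
     "outcome": "", "sales_cycle_days": 0},  # no outcome -> unusable
]

# Hypothetical enrichment export keyed by domain (Apollo/Clearbit-style fields).
enrichment = {
    "acme.com": {"employees": 120, "industry": "B2B SaaS", "funding": "Series B"},
    "globex.com": {"employees": 12, "industry": "Retail", "funding": "Bootstrapped"},
}

def build_dataset(crm_rows, enrichment):
    """One row per deal/account: CRM fields merged with enrichment fields.

    Rows without a populated outcome are dropped per the dataset rules."""
    dataset = []
    for row in crm_rows:
        if not row.get("outcome"):
            continue
        dataset.append({**row, **enrichment.get(row["domain"], {})})
    return dataset

dataset = build_dataset(crm_rows, enrichment)  # 2 usable rows of 3
```

In practice the same merge happens in a spreadsheet VLOOKUP or a BI join; the point is one row per account with the outcome field as the anchor.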

Step 2: Segment and compare

For each attribute, compare win rates, churn rates, and value metrics across segments.

Analysis template (repeat for each attribute):

| Employee count | Won deals | Lost deals | Win rate | Avg ACV | Avg cycle (days) | Churn rate | Expansion rate |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 1-20 | 5 | 15 | 25% | $8K | 45 | 35% | 5% |
| 21-50 | 8 | 12 | 40% | $18K | 38 | 20% | 12% |
| 51-200 | 18 | 10 | 64% | $42K | 28 | 8% | 25% |
| 201-500 | 12 | 8 | 60% | $65K | 35 | 10% | 22% |
| 501-1000 | 5 | 12 | 29% | $85K | 72 | 5% | 30% |
| 1000+ | 2 | 18 | 10% | $120K | 120 | 3% | 35% |

What this table reveals: 51-200 employees is the sweet spot. Highest win rate (64%), fastest cycle (28 days), low churn (8%), strong expansion (25%). This is the ICP size band.
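The segment-and-compare step can be sketched as a small helper that produces one row of the table per band. This is a minimal illustration over toy rows, assuming the Step 1 dataset shape; band labels and predicates are whatever you choose:

```python
def segment_stats(rows, attribute, bands):
    """Win rate and average winning ACV per segment.

    `bands` maps a segment label to a predicate over the attribute value."""
    stats = {}
    for label, predicate in bands.items():
        seg = [r for r in rows if predicate(r[attribute])]
        won = [r for r in seg if r["outcome"] == "won"]
        lost = [r for r in seg if r["outcome"] == "lost"]
        closed = len(won) + len(lost)
        stats[label] = {
            "won": len(won),
            "lost": len(lost),
            "win_rate": len(won) / closed if closed else 0.0,
            "avg_acv": sum(r["acv"] for r in won) / len(won) if won else 0.0,
        }
    return stats

# Toy closed-deal rows; a real run would use the Step 1 dataset.
rows = [
    {"employees": 120, "outcome": "won", "acv": 42000},
    {"employees": 150, "outcome": "won", "acv": 40000},
    {"employees": 90, "outcome": "lost", "acv": 0},
    {"employees": 1500, "outcome": "lost", "acv": 0},
]
bands = {"51-200": lambda e: 51 <= e <= 200, "1000+": lambda e: e > 1000}
stats = segment_stats(rows, "employees", bands)
```

Repeat the same call for each attribute in the list below, swapping the bands.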

Attributes to segment by

| Attribute | Segments to compare | What to look for |
| --- | --- | --- |
| Employee count | 5-6 size bands | Which band has the highest win rate + lowest churn? |
| Industry | Top 5-8 industries in the dataset | Which verticals win and retain at the highest rates? |
| Funding stage | Seed, A, B, C, D+, Public, Bootstrapped | Which stages buy fastest and stay longest? |
| Geography | US regions, international | Where do you win most and support best? |
| GTM motion | Sales-led, PLG, Hybrid | Which motion is the best fit for your product? |
| ACV band | $0-10K, $10-30K, $30-100K, $100K+ | Which deal size has the best economics (win rate x retention x expansion)? |
| Champion title | VP, Director, Manager, IC | Which persona drives the most successful deals? |
| Source | Inbound, Outbound, Referral, Partner | Which channel produces the best customers (not just the most)? |
| Tech stack | Companies using [CRM], [sequencing tool], etc. | Does stack predict fit? Do Salesforce shops buy and retain better than HubSpot shops? |

Step 3: Score and rank

Assign a fit score based on how many ICP attributes a deal matches.

Scoring example:

| ICP attribute (value from Step 2) | Score if match |
| --- | --- |
| Employee count: 51-200 | +20 |
| Industry: B2B SaaS | +15 |
| Funding: Series A-B | +15 |
| Geography: US | +10 |
| GTM motion: Sales-led | +10 |
| Champion: VP or Director | +10 |
| Uses Salesforce or HubSpot | +10 |
| Total possible | 90 |

Then score every deal in the dataset and compare outcomes:

| ICP score band | Deals | Win rate | Avg ACV | Avg cycle | Churn rate |
| --- | --- | --- | --- | --- | --- |
| 70-90 (strong fit) | 25 | 72% | $55K | 25 days | 5% |
| 50-69 (moderate fit) | 40 | 48% | $35K | 42 days | 15% |
| 30-49 (weak fit) | 30 | 22% | $20K | 65 days | 30% |
| 0-29 (no fit) | 15 | 8% | $12K | 90 days | 45% |

What this proves: Strong-fit deals (70-90) win at 9x the rate of no-fit deals, close 3.6x faster, have 4.6x higher ACV, and churn at 1/9 the rate. The ICP model works.
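The additive scoring in Step 3 is a short function. This sketch mirrors the example table's rules and point values; the account field names are illustrative assumptions, not a fixed schema:

```python
# Additive fit scoring with the Step 2 attribute values. Rules and point
# values mirror the example table; account field names are illustrative.
SCORING = [
    (lambda a: 51 <= a.get("employees", 0) <= 200, 20),           # size band
    (lambda a: a.get("industry") == "B2B SaaS", 15),              # vertical
    (lambda a: a.get("funding") in {"Series A", "Series B"}, 15), # stage
    (lambda a: a.get("geography") == "US", 10),
    (lambda a: a.get("gtm_motion") == "sales-led", 10),
    (lambda a: a.get("champion_title") in {"VP", "Director"}, 10),
    (lambda a: a.get("crm") in {"Salesforce", "HubSpot"}, 10),
]

def icp_score(account):
    """0-90 fit score: sum the points for every matched attribute."""
    return sum(points for match, points in SCORING if match(account))

strong = {"employees": 120, "industry": "B2B SaaS", "funding": "Series B",
          "geography": "US", "gtm_motion": "sales-led",
          "champion_title": "VP", "crm": "Salesforce"}
weak = {"employees": 5, "industry": "Retail"}
```

Score every row in the dataset with this function, then bucket by score band to reproduce the comparison table above.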

Step 4: Validate with holdout data

Split your data: use 70% to build the model, 30% to validate.

Validation process:

  1. Build the scoring model from 70% of the data (the training set)
  2. Score the remaining 30% (the holdout set) using the model
  3. Compare: do high-fit deals in the holdout set actually win more, close faster, and churn less?
  4. If yes: the model is validated. Deploy it
  5. If no: the model is overfit to the training data. Simplify (fewer attributes, fewer segments)

Validation rules:

  • If holdout win rate for strong-fit deals is within 10% of training win rate, the model is robust
  • If holdout results are dramatically different (win rates don't correlate with fit score), the model is overfit. Remove the weakest attributes and re-test
  • Minimum 15 deals in the holdout set for each fit tier. Below that, the sample is too small to validate
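A minimal holdout sketch, assuming deals already carry a fit score (the seed, threshold, and toy data are illustrative):

```python
import random

def split_train_holdout(rows, holdout_frac=0.3, seed=7):
    """Shuffle once (seeded, so the split is reproducible) and cut 70/30."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * (1 - holdout_frac))
    return shuffled[:cut], shuffled[cut:]

def win_rates(rows, score_fn, threshold=70):
    """Win rate for strong-fit deals (score >= threshold) vs the rest."""
    def rate(seg):
        return sum(r["outcome"] == "won" for r in seg) / len(seg) if seg else 0.0
    strong = [r for r in rows if score_fn(r) >= threshold]
    rest = [r for r in rows if score_fn(r) < threshold]
    return rate(strong), rate(rest)

# Toy deals with a precomputed fit score.
deals = ([{"score": 80, "outcome": "won"}] * 7
         + [{"score": 80, "outcome": "lost"}] * 3
         + [{"score": 20, "outcome": "won"}] * 2
         + [{"score": 20, "outcome": "lost"}] * 8)
train, holdout = split_train_holdout(deals)
train_strong, _ = win_rates(train, lambda r: r["score"])
holdout_strong, _ = win_rates(holdout, lambda r: r["score"])
# Robust if strong-fit win rates agree closely across the split (rule above).
robust = abs(train_strong - holdout_strong) <= 0.10
```

In a real run, the scoring model itself is fit on the training set only, then applied unchanged to the holdout set.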

Advanced Modeling

Weighted attributes

Not all ICP attributes are equally predictive. Weight them by their correlation with the outcome.

How to weight:

For each attribute, calculate the win rate difference between the best and worst segments.

| Attribute | Best segment win rate | Worst segment win rate | Difference | Weight (normalized) |
| --- | --- | --- | --- | --- |
| Employee count | 64% (51-200) | 10% (1000+) | 54 pp | 30% |
| Industry | 68% (B2B SaaS) | 15% (Healthcare) | 53 pp | 29% |
| Funding stage | 62% (Series A-B) | 20% (Bootstrapped) | 42 pp | 23% |
| Geography | 55% (US) | 30% (APAC) | 25 pp | 14% |
| Tech stack | 50% (Salesforce) | 42% (No CRM) | 8 pp | 4% |

Weighting rules:

  • Attributes with > 40 pp win-rate difference between best and worst segments are strong predictors. Weight them highest
  • Attributes with < 10 pp difference are weak predictors. They don't discriminate between fit and non-fit. Consider removing them from the model
  • Normalize weights to sum to 100. This makes the final score interpretable (0-100 scale)
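The weighting rules above can be sketched as one function. Note this sketch applies the "< 10 pp: consider removing" rule and drops weak attributes before normalizing (the table above keeps tech stack at 4% instead); spreads are the percentage-point differences from the table:

```python
def attribute_weights(spreads_pp, floor_pp=10):
    """Turn best-vs-worst win-rate spreads (percentage points) into weights.

    Attributes under `floor_pp` don't discriminate and are dropped; the rest
    are normalized so the weights sum to ~100 (rounding can shift it by 1)."""
    kept = {attr: pp for attr, pp in spreads_pp.items() if pp >= floor_pp}
    total = sum(kept.values())
    return {attr: round(100 * pp / total) for attr, pp in kept.items()}

# Spreads from the table above, in percentage points.
spreads = {"employees": 54, "industry": 53, "funding": 42,
           "geography": 25, "tech_stack": 8}
weights = attribute_weights(spreads)
```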

Negative indicators (anti-ICP)

Some attributes are strong negative predictors. Include them as score penalties.

| Attribute | Why it's anti-ICP | Penalty |
| --- | --- | --- |
| Industry: government, education | These verticals have < 10% win rate and 60% churn | -30 |
| Employee count: < 10 | Too small. Can't afford. Churn at 50% | -20 |
| No CRM in place | Can't integrate. Can't track ROI | -15 |
| Bootstrapped with < $1M ARR | No budget for tools. Extremely price-sensitive | -15 |
| Competitor's employee | Not a real prospect | -100 (disqualify) |

Anti-ICP rules:

  • A strong negative indicator should be able to disqualify a deal regardless of positive scores. A competitor employee with a 90 fit score is still disqualified
  • Anti-ICP attributes should be based on loss AND churn data. An attribute that predicts losses AND churn is a double negative. Weight it heavily
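A sketch of penalties plus a hard disqualifier, assuming the same illustrative account fields as earlier (the checks and values mirror the table above):

```python
DISQUALIFIED = -10_000  # sentinel: hard disqualify regardless of positive fit

# (check, penalty) pairs; a penalty of None is a hard disqualifier.
# Checks and values mirror the table above; field names are illustrative.
ANTI_ICP = [
    (lambda a: a.get("industry") in {"government", "education"}, -30),
    (lambda a: a.get("employees", 0) < 10, -20),
    (lambda a: not a.get("crm"), -15),
    (lambda a: a.get("is_competitor_employee", False), None),
]

def apply_anti_icp(base_score, account):
    """Subtract penalties from the positive fit score; floor at zero.

    A hard disqualifier short-circuits, so a 90-point fit is still out."""
    score = base_score
    for check, penalty in ANTI_ICP:
        if check(account):
            if penalty is None:
                return DISQUALIFIED
            score += penalty
    return max(score, 0)
```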

Multi-outcome modeling

Instead of just modeling win/loss, model multiple outcomes.

| Outcome to model | Dataset | What it reveals |
| --- | --- | --- |
| Win vs loss | Closed-won vs closed-lost | Which companies buy |
| Retained vs churned | Active 12+ months vs churned | Which companies stay |
| Expanded vs flat | Customers with expansion vs no expansion | Which companies grow |
| Fast close vs slow close | Deals < 30 days vs > 90 days | Which companies buy quickly |

Multi-outcome rules:

  • The strongest ICP attributes are those that predict positive outcomes across ALL models. "51-200 employees" predicting both higher win rate AND lower churn AND higher expansion is a triple signal. That attribute belongs in the ICP with maximum weight
  • An attribute that predicts wins but also predicts churn is a trap. "Healthcare companies buy often but churn at 40%" means healthcare is not ICP despite a decent win rate. Look at the full customer lifecycle, not just the sale
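The cross-outcome check can be sketched as a single predicate test. This illustration covers the two outcomes most datasets have (win/loss and retained/churned); expansion can be added the same way. Toy rows below are illustrative:

```python
def predicts_across_outcomes(rows, predicate):
    """True if the attribute's segment both wins more AND churns less
    than everything outside the segment."""
    def rates(seg):
        closed = [r for r in seg if r["outcome"] in ("won", "lost")]
        custs = [r for r in seg if r["outcome"] in ("active", "churned")]
        win = sum(r["outcome"] == "won" for r in closed) / len(closed) if closed else 0.0
        churn = sum(r["outcome"] == "churned" for r in custs) / len(custs) if custs else 1.0
        return win, churn
    win_in, churn_in = rates([r for r in rows if predicate(r)])
    win_out, churn_out = rates([r for r in rows if not predicate(r)])
    return win_in > win_out and churn_in < churn_out

# Toy rows: the 51-200 band wins more and churns less than everyone else.
rows = ([{"employees": 120, "outcome": "won"}] * 3
        + [{"employees": 120, "outcome": "lost"}]
        + [{"employees": 120, "outcome": "churned"}]
        + [{"employees": 120, "outcome": "active"}] * 4
        + [{"employees": 2000, "outcome": "won"}]
        + [{"employees": 2000, "outcome": "lost"}] * 3
        + [{"employees": 2000, "outcome": "churned"}] * 3
        + [{"employees": 2000, "outcome": "active"}])
```

Attributes that pass this check across every outcome model are the ones that earn maximum weight.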

Implementing the ICP Score in CRM

CRM fields

| Field | Type | Object | How it's set |
| --- | --- | --- | --- |
| icp_score | Number (0-100) | Account/Company | Automated from enrichment data + scoring model |
| icp_tier | Picklist (Tier 1, 2, 3, Not ICP) | Account/Company | Derived from icp_score: 70+ = Tier 1, 50-69 = Tier 2, 30-49 = Tier 3, < 30 = Not ICP |
| icp_score_details | Long text | Account/Company | Breakdown: "Size: +20, Industry: +15, Stage: +15, Geo: +10 = 60" |
| icp_last_scored | Date | Account/Company | Timestamp of last score calculation |

Automation

Trigger: New account created OR enrichment data updated
  ↓
Action: Calculate ICP score from enrichment fields
  Employee count → size score
  Industry → industry score
  Funding stage → stage score
  Geography → geo score
  Tech stack → stack score
  Anti-ICP checks → penalties
  ↓
Set icp_score = sum of all dimension scores
Set icp_tier based on score thresholds
Set icp_last_scored = today
  ↓
If icp_tier = "Tier 1" or "Tier 2": flag for prioritization
If icp_tier = "Not ICP": flag for review or disqualification

Implementation rules

  • Score automatically on account creation. The ICP score should populate within seconds of an account entering the CRM, not after a human reviews it
  • Re-score when enrichment data changes. If a company raises a new round (funding stage changes) or grows (employee count changes), the ICP score should update
  • Make the score visible to reps. The ICP tier should appear on the account record, in list views, and in the lead routing logic. A score buried in a custom field nobody sees is useless
  • Don't hide the scoring logic. The icp_score_details field shows how the score was calculated. When a rep asks "why is this Tier 2 and not Tier 1?" they can see the breakdown
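The automation flow above boils down to one function that produces all three CRM fields, including the visible breakdown. This is a sketch with illustrative (label, check, points) rules, not a real CRM API:

```python
def score_account(account, scoring):
    """Compute icp_score, icp_tier, and the icp_score_details breakdown
    that the automation writes back to the CRM on create/enrichment-update.

    `scoring` is a list of illustrative (label, check, points) tuples."""
    parts, total = [], 0
    for label, check, points in scoring:
        if check(account):
            parts.append(f"{label}: +{points}")
            total += points
    tier = ("Tier 1" if total >= 70 else "Tier 2" if total >= 50
            else "Tier 3" if total >= 30 else "Not ICP")
    return {"icp_score": total,
            "icp_tier": tier,
            "icp_score_details": ", ".join(parts) + f" = {total}"}

SCORING = [
    ("Size", lambda a: 51 <= a.get("employees", 0) <= 200, 20),
    ("Industry", lambda a: a.get("industry") == "B2B SaaS", 15),
    ("Stage", lambda a: a.get("funding") in {"Series A", "Series B"}, 15),
    ("Geo", lambda a: a.get("geography") == "US", 10),
]

fields = score_account({"employees": 120, "industry": "B2B SaaS",
                        "funding": "Series B", "geography": "US"}, SCORING)
```

The details string gives reps the "why is this Tier 2?" answer directly on the record.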

Model Maintenance

Quarterly review process

| Step | What to do | Why |
| --- | --- | --- |
| 1. Pull new win/loss data from the last quarter | The model was built on historical data. New data validates or invalidates it | Fresh data catches model drift |
| 2. Re-run the segmentation analysis | Do the same attributes still predict wins? | Market changes. Product evolves. ICP may shift |
| 3. Check churn by ICP tier | Are Tier 1 accounts still retaining best? | If Tier 1 churn is rising, the model is no longer identifying sticky customers |
| 4. Check pipeline by ICP tier | Is > 70% of pipeline from Tier 1-2? | If the team is generating mostly non-ICP pipeline, targeting is misaligned |
| 5. Adjust weights or add/remove attributes | If an attribute stopped being predictive, remove it. If a new attribute emerged, add it | Model accuracy degrades over time without updates |
| 6. Re-validate on holdout data | Test the updated model on the last quarter's data | Confirms the model still works |

Maintenance rules

  • Quarterly is the minimum review cadence. Every quarter, re-run the analysis with fresh data. Annual reviews are too infrequent for a fast-moving SaaS company
  • Track model accuracy over time. "What % of Tier 1 deals closed this quarter?" and "What % of churned customers were Tier 1?" If accuracy is declining, the model needs recalibration
  • The model should get better over time, not stay static. Each quarter adds more data points. More data = better segmentation = more accurate predictions. The Q4 model should outperform the Q1 model

Tools for ICP Modeling

| Approach | Tool | Best for | Complexity |
| --- | --- | --- | --- |
| Spreadsheet analysis | Google Sheets, Excel | Teams with 50-200 deals. Manual segmentation and scoring | Low |
| BI tool analysis | Looker, Metabase, Mode | Teams with 200+ deals. Visual analysis. Shareable dashboards | Medium |
| CRM-native reporting | HubSpot reports, Salesforce reports | Basic segmentation within CRM. No export needed | Low-medium |
| Statistical modeling (regression) | Python (pandas, scikit-learn), R | Teams with 500+ deals. Predictive modeling. Feature importance | High |
| Predictive scoring vendors | MadKudu, Clearbit Reveal, 6sense | Automated ICP scoring with ML. Minimal manual analysis | Low (to implement), $$ (cost) |

Tool selection rules

  • Start with a spreadsheet. Export CRM data. Do the segmentation manually. You'll learn more from hands-on analysis than from any tool's automated output
  • Graduate to BI tools at 200+ deals. Spreadsheets get unwieldy above 200 rows. BI tools make segmentation visual and shareable
  • Statistical modeling at 500+ deals. Below 500 deals, regression models overfit. Above 500, logistic regression can identify non-obvious attribute interactions that manual analysis misses
  • Predictive scoring vendors at $10M+ ARR. Below $10M, the data volume doesn't justify the tool cost. Above $10M, automated ICP scoring saves RevOps 5-10 hours per week

Measurement

| Metric | Definition | Target | Frequency |
| --- | --- | --- | --- |
| Model accuracy: win rate by tier | Win rate for Tier 1 vs Tier 2 vs non-ICP | Tier 1 win rate > 2x non-ICP win rate | Quarterly |
| Model accuracy: churn by tier | Churn rate for Tier 1 vs non-ICP | Tier 1 churn < 50% of non-ICP churn | Quarterly |
| Pipeline concentration | % of pipeline from Tier 1-2 accounts | > 70% | Monthly |
| ICP coverage | % of accounts in CRM with an ICP score | > 90% | Monthly |
| Scoring freshness | % of scores updated in last 90 days | > 80% | Monthly |
| Tier 1 expansion rate | Expansion revenue from Tier 1 accounts | > average for all accounts | Quarterly |

Anti-Pattern Check

  • Building the model from 15 deals. The sample size is too small. Patterns from 15 deals are likely noise, not signal. Wait until you have 50+ wins and 50+ losses before modeling
  • Using only win data (no losses). The model identifies who buys. Without loss data, it can't identify who doesn't buy. The contrast between wins and losses reveals the discriminating attributes. Include both
  • Ignoring churn data. A company type that buys easily but churns at 40% is not ICP. It's a trap. Include churn data in the model. The best ICP attributes predict wins AND retention
  • Over-weighting one attribute. "100% of our wins are in the US" when 95% of prospects are also in the US. Geography isn't a discriminating factor in this case. Compare the win rate for the attribute vs the base rate
  • Building the model once and never updating. The model from 6 months ago was built on different data, a different product, and a different market. Refresh quarterly with new win/loss/churn data
  • Using the model to exclude leads instead of prioritize. The ICP model should determine routing priority and sales effort allocation. It should not be a hard gate that prevents non-ICP leads from ever being contacted. Tier 3 accounts can still be worked at lower priority
  • No anti-ICP in the model. The model only has positive scores. A government agency with 3 positive attributes scores 45/90 and enters the pipeline. Include negative indicators that disqualify regardless of positive fit
  • Scoring without enrichment. The model requires employee count, industry, and funding stage. If 40% of accounts are missing these fields, 40% of scores are wrong. Enrich before scoring