enrichment-stack-design

This skill should be used when the user asks to "design an enrichment stack", "build a data enrichment workflow", "choose enrichment vendors", "set up multi-provider enrichment", "design a data waterfall", "build an enrichment pipeline", "enrichment tool stack", "multi-vendor data enrichment", "stack enrichment providers", or any variation of designing a multi-vendor data enrichment stack for B2B SaaS.

Enrichment Stack Design

An enrichment stack is the combination of data providers, tools, and workflows that fill in missing data on your CRM records. One provider gives you company size. Another gives you tech stack. A third gives you direct dial phone numbers. No single provider has everything. The stack fills the gaps.

The principle: design the stack around the fields you need, not the vendors available. Start with "what data do I need for scoring, routing, and personalization?" Then find the cheapest, most accurate providers for each field.

Designing the Stack

Step 1: Define required fields

| Field | Why you need it | Used for |
| --- | --- | --- |
| Company size (employees) | ICP qualification | Lead scoring, routing |
| Industry | ICP qualification | Segmentation, personalization |
| Funding stage/amount | Signal detection, ICP fit | Outbound triggers, scoring |
| Job title (normalized) | Persona matching | Routing, personalization |
| Work email (verified) | Outreach | Cold email, sequences |
| Direct dial phone | Cold calling | Phone outreach |
| Tech stack | Competitive intelligence | Personalization, qualification |
| Company revenue | ICP qualification | Scoring, segmentation |
| LinkedIn URL | Research, personalization | Signal mining, outbound |
| Location (HQ) | Territory assignment | Routing, territory-based outreach |

Step 2: Map providers to fields

| Field | Provider 1 (primary) | Provider 2 (fallback) | Coverage |
| --- | --- | --- | --- |
| Company size | Clearbit | Apollo | 90-95% |
| Industry | Clearbit | Apollo | 90-95% |
| Funding | Clearbit | Crunchbase API | 80-90% |
| Job title | Apollo | LinkedIn (manual) | 85-92% |
| Work email | Apollo | Hunter.io | 80-95% (with waterfall) |
| Direct dial | Lusha | Cognism | 60-75% |
| Tech stack | Clearbit | BuiltWith | 75-85% |
| Revenue | Clearbit | Apollo | 65-80% |
| LinkedIn URL | Apollo | Google search | 90%+ |
| Location | Clearbit | Apollo | 95%+ |
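
The field-to-provider mapping can be kept as plain data so that the waterfall logic stays generic. A minimal sketch, assuming field keys such as `company_size` (hypothetical names chosen for illustration, not taken from any vendor API); the provider names come from the mapping in this step:

```python
# Field -> ordered provider list (primary first, fallback second).
# Field keys are hypothetical; provider names mirror the Step 2 mapping.
PROVIDER_MAP: dict[str, list[str]] = {
    "company_size": ["Clearbit", "Apollo"],
    "industry": ["Clearbit", "Apollo"],
    "funding": ["Clearbit", "Crunchbase API"],
    "job_title": ["Apollo", "LinkedIn (manual)"],
    "work_email": ["Apollo", "Hunter.io"],
    "direct_dial": ["Lusha", "Cognism"],
    "tech_stack": ["Clearbit", "BuiltWith"],
    "revenue": ["Clearbit", "Apollo"],
    "linkedin_url": ["Apollo", "Google search"],
    "location": ["Clearbit", "Apollo"],
}


def providers_for(field: str) -> list[str]:
    """Return the providers to try for a field, in waterfall order."""
    return PROVIDER_MAP.get(field, [])


def fields_by_primary() -> dict[str, list[str]]:
    """Group fields by their primary provider, e.g. to batch calls per vendor."""
    grouped: dict[str, list[str]] = {}
    for field, providers in PROVIDER_MAP.items():
        grouped.setdefault(providers[0], []).append(field)
    return grouped
```

Keeping the mapping in one place means swapping a primary provider is a one-line change rather than a workflow rewrite.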

Waterfall Architecture

How the waterfall works

For each contact/company record:
  1. Check CRM: is the field already populated?
     → Yes: skip (don't overwrite good data)
     → No: proceed to enrichment
  
  2. Provider 1 (primary): query the API
     → Found: populate the field, log source
     → Not found: proceed to Provider 2
  
  3. Provider 2 (fallback): query the API
     → Found: populate the field, log source
     → Not found: mark field as "not available"
  
  4. Log results:
     - Which provider filled which field
     - Timestamp of enrichment
     - Confidence level (if provider supplies it)
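
The steps above can be sketched in Python. This is an illustration under stated assumptions, not a vendor integration: each provider is modeled as a plain function `(record, field) -> value or None`, and the names `enrich_field` and `Provider` are hypothetical. Real API clients would need to be wrapped to fit that shape, and confidence levels are left out for brevity.

```python
from datetime import datetime, timezone
from typing import Callable, Optional

# Hypothetical provider signature: return a value, or None if not found.
Provider = Callable[[dict, str], Optional[object]]


def enrich_field(
    record: dict,
    field: str,
    providers: list[tuple[str, Provider]],
    log: list[dict],
) -> None:
    """Fill one field by trying providers in order; never overwrite existing data."""
    # Step 1: skip fields that are already populated.
    if record.get(field) not in (None, ""):
        return
    # Steps 2-3: try the primary, then the fallback.
    source = None
    for name, lookup in providers:
        value = lookup(record, field)
        if value is not None:
            record[field] = value
            source = name
            break
    # Step 4: log the outcome; source None means "not available".
    log.append({
        "field": field,
        "source": source,
        "enriched_at": datetime.now(timezone.utc).isoformat(),
    })
```

A field that no provider can fill stays empty in the record and is logged with `source` set to `None`, so a later run can try again.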

Waterfall rules

  • Don't overwrite existing data with enrichment. If a human entered the company size, don't replace it with enrichment data. Enrichment fills gaps; it does not overwrite. Only overwrite if the existing data is explicitly flagged as outdated
  • Log the source of every enriched field. "Company size: 150, source: Clearbit, enriched: 2025-03-15." This enables data quality auditing and provider evaluation
  • Primary provider handles 70-80% of records. The fallback provider handles the remaining 20-30%. If the primary handles less than 60%, it's not a good primary for your ICP. Evaluate alternatives
  • Set a cost ceiling per record. If enrichment costs $0.50 per record across all providers, your budget for 10,000 records is $5,000. Plan and cap accordingly
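
The cost ceiling and the primary-provider threshold reduce to simple arithmetic. A minimal sketch, assuming amounts are tracked in whole cents to avoid floating-point rounding; the function names are hypothetical:

```python
def batch_budget_cents(record_count: int, ceiling_cents: int) -> int:
    """Maximum spend for a batch: 10,000 records at $0.50 each is $5,000."""
    return record_count * ceiling_cents


def within_ceiling(spent_cents: int, next_call_cents: int, ceiling_cents: int) -> bool:
    """True if one more provider call keeps this record at or under its ceiling."""
    return spent_cents + next_call_cents <= ceiling_cents


def primary_is_viable(primary_hits: int, total_records: int, minimum_share: float = 0.60) -> bool:
    """Apply the rule above: a primary that fills under 60% of records is a poor fit."""
    if total_records == 0:
        return False
    return primary_hits / total_records >= minimum_share
```

Calling `within_ceiling` before each fallback lookup is one way to enforce the cap per record rather than discovering overspend at the end of the month.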

Stack Configurations by Budget

Budget: $200-500/month (startup)

Stack:
  - Apollo (primary: emails, titles, company data)
  - Hunter.io (fallback: email finding)
  - NeverBounce (verification)
  
Coverage: 75-85%
Cost per enriched record: $0.05-0.15
Best for: Early-stage teams, outbound-first motion

Budget: $500-2,000/month (growth)

Stack:
  - Clearbit (primary: company enrichment, reveal)
  - Apollo (secondary: contact finding, emails)
  - Hunter.io (email fallback)
  - NeverBounce (verification)
  
Coverage: 85-92%
Cost per enriched record: $0.10-0.30
Best for: Teams with inbound + outbound, need real-time enrichment

Budget: $2,000-5,000/month (scale)

Stack:
  - Clearbit (company enrichment, reveal, real-time)
  - Apollo (contact database, prospecting)
  - Lusha or Cognism (direct dials, EMEA data)
  - 6sense or Bombora (intent data)
  - NeverBounce (verification)
  
Coverage: 90-95%
Cost per enriched record: $0.20-0.50
Best for: Mature teams with multiple channels and ABM motion
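
As a rough sanity check, the budget and cost-per-record ranges above imply a monthly record volume for each tier. The sketch below assumes the whole monthly budget is spent on per-record enrichment, which is a simplification (real plans often bundle seats and credits); the `TIERS` structure and function name are hypothetical.

```python
# Ranges from the three configurations above, in whole cents.
# "budget" is (low, high) per month; "cost" is (low, high) per enriched record.
TIERS = {
    "startup": {"budget": (20_000, 50_000), "cost": (5, 15)},
    "growth": {"budget": (50_000, 200_000), "cost": (10, 30)},
    "scale": {"budget": (200_000, 500_000), "cost": (20, 50)},
}


def monthly_record_range(tier: str) -> tuple[int, int]:
    """Rough (min, max) records per month for a tier.

    Min pairs the smallest budget with the highest cost per record;
    max pairs the largest budget with the lowest cost per record.
    """
    budget_low, budget_high = TIERS[tier]["budget"]
    cost_low, cost_high = TIERS[tier]["cost"]
    return budget_low // cost_high, budget_high // cost_low
```

If your expected monthly volume falls well outside the range for a tier, that tier is probably the wrong starting point.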

Integration Points

When to trigger enrichment

| Trigger | Action | Tool |
| --- | --- | --- |
| Form submission | Enrich contact in real-time (< 5 seconds) | Clearbit or Apollo webhook |
| CRM record creation | Enrich on record create (contact or company) | CRM integration or workflow |
| List import | Bulk enrich after import | API batch call or CSV upload |
| Scheduled re-enrichment | Re-enrich records older than 90 days | Scheduled workflow or cron job |
| Manual request | Sales rep clicks "enrich" on a record | CRM button or integration |

Integration rules

  • Real-time enrichment on form submit. When a lead fills out a demo form, enrich within 5 seconds. This enables instant scoring, routing, and response. Clearbit is built for this. Apollo can do it via webhook
  • Batch enrichment on import. Don't import 2,000 records and then manually enrich them one by one. Use the API to batch-enrich after import
  • Re-enrich every 90 days. Job titles change. Companies grow. Funding rounds happen. Data older than 90 days degrades. Schedule automatic re-enrichment
  • Don't enrich records you'll never use. If a contact is disqualified (wrong ICP), don't spend enrichment credits on them. Enrich after initial qualification
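
Two of these rules (qualify before enriching, re-enrich after 90 days) can be combined into a single eligibility check that a scheduled job runs per record. A minimal sketch; the function name and the `qualified` flag are assumptions about how your CRM exposes this data:

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

REENRICH_AFTER = timedelta(days=90)


def needs_enrichment(
    qualified: bool,
    last_enriched: Optional[datetime],
    now: datetime,
) -> bool:
    """Decide whether a record should be sent to the enrichment waterfall."""
    if not qualified:
        return False  # don't spend credits on disqualified contacts
    if last_enriched is None:
        return True  # never enriched
    return now - last_enriched > REENRICH_AFTER  # stale after 90 days


now = datetime(2025, 6, 30, tzinfo=timezone.utc)
stale = needs_enrichment(True, datetime(2025, 3, 15, tzinfo=timezone.utc), now)
```

Passing `now` in explicitly keeps the check deterministic, which makes it easier to test than a function that reads the clock itself.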

Measuring Stack Performance

| Metric | Definition | Target | Frequency |
| --- | --- | --- | --- |
| Fill rate per field | % of records with each field populated | > 85% for critical fields | Monthly |
| Provider match rate | % of records each provider can enrich | Primary > 70%, fallback > 50% | Monthly |
| Cost per enriched record | Total enrichment spend / records enriched | $0.05-0.50 depending on stack | Monthly |
| Data accuracy | Spot-check enriched fields against LinkedIn/public sources | > 90% | Quarterly |
| Time to enrich | Seconds from trigger to field population | < 5 seconds (real-time), < 1 hour (batch) | Monthly |
| Re-enrichment coverage | % of records refreshed within 90 days | > 80% | Quarterly |
| Enrichment ROI | Pipeline influenced by enriched data / enrichment cost | > 10x | Quarterly |
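
Three of these metrics are simple ratios that can be computed from a CRM export. A minimal sketch, assuming records are dictionaries and that empty strings and `None` both count as unpopulated; the function names are hypothetical:

```python
def fill_rate(records: list[dict], field: str) -> float:
    """Share of records with the field populated (0.0 to 1.0)."""
    if not records:
        return 0.0
    filled = sum(1 for r in records if r.get(field) not in (None, ""))
    return filled / len(records)


def cost_per_enriched_record(total_spend: float, records_enriched: int) -> float:
    """Total enrichment spend divided by the number of records enriched."""
    if records_enriched == 0:
        return 0.0
    return total_spend / records_enriched


def enrichment_roi(pipeline_influenced: float, enrichment_cost: float) -> float:
    """Pipeline influenced by enriched data, as a multiple of enrichment cost."""
    if enrichment_cost == 0:
        return 0.0
    return pipeline_influenced / enrichment_cost
```

For example, $60,000 of influenced pipeline on $5,000 of enrichment spend gives a 12x ROI, above the 10x target.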

Pre-Design Checklist

  • [ ] Required fields listed with use case for each (scoring, routing, personalization)
  • [ ] Primary and fallback providers selected per field
  • [ ] Waterfall logic designed (provider priority, no-overwrite rules)
  • [ ] Budget calculated per month and per record
  • [ ] Integration triggers defined (form submit, record creation, import, scheduled)
  • [ ] Source logging configured (which provider filled which field)
  • [ ] Re-enrichment schedule set (every 90 days)
  • [ ] Email verification tool included in the stack
  • [ ] Accuracy tested on 100 records from each provider
  • [ ] Cost ceiling per record defined and enforced

Anti-Pattern Check

  • One provider for everything. Apollo for emails, company data, phone numbers, and tech stack. Coverage is 65-75% when a multi-provider approach would get 85-95%. Add a second provider for the gap
  • Enriching before qualifying. You enrich every form fill, including spam bots, students, and competitors. Enrichment costs $0.10-0.50 per record. Qualify first (ICP fit check), then enrich qualified leads only
  • No re-enrichment schedule. Data from 12 months ago is 25-35% degraded. Job titles changed. Companies grew. People moved. Re-enrich every 90 days. Old data is wrong data
  • Overwriting human-entered data with enrichment. A sales rep manually updated a contact's title after a call. The enrichment workflow runs overnight and overwrites it with old data. Never overwrite manually entered fields unless they are explicitly flagged as outdated
  • No source tracking. 50,000 records are enriched but nobody knows which provider supplied each field. When accuracy issues arise, you can't identify the weak provider. Log the source of every enriched field
  • Paying for enrichment you don't use. Clearbit enriches 100+ company attributes. You use 5 of them. You're paying for 95 or more fields you never use. Audit which fields you actually use in scoring, routing, and personalization. Cut what you don't need
  • No verification on enriched emails. The enrichment provider "found" the email. But did they verify it? Provider-found emails still bounce at 5-15%. Always verify through a dedicated email verification tool before sending