Pick your north star metric by scoring 3 candidates against 5 criteria: leading (predicts revenue 60-90 days out), measurable (one SQL query, one number), actionable (your team can move its inputs), value-aligned (rises only when customers get value), and single-number (not a ratio). Score each candidate 1-5 on each criterion. The highest total wins, but only if it survives a stress test against your sales comp plan and product roadmap. This guide walks through the test with two parallel worked examples and shows how to instrument the winner.
What is a north star metric, and why does picking the wrong one cost you a year?
A north star metric (NSM) is the single number that captures the core value your product delivers and predicts long-term revenue. Sean Ellis, who coined the term, defines it as "the single metric that best captures the core value that your product delivers to customers."
Picking the wrong one is expensive because the NSM dictates roadmap, hiring, and comp. A team that picks MAU as its NSM will build features that drive logins. A team that picks weekly active queries will build features that drive depth of use. Different metric, different product, different company 12 months later.
The failure pattern is well-documented. Reforge's Brian Balfour warns that "blindly buying into the concept of the one metric that matters is a fatal oversimplification." And John Cutler, author of Amplitude's North Star Playbook, notes: "If you can move your North Star directly, it's probably not a good North Star."
The job is not to pick a metric that sounds inspiring on a slide. The job is to pick a metric whose movement reliably predicts whether customers are getting value and whether your business will grow. That's what the 5-criterion test is for.
What are the 5 criteria for picking a north star metric?
A good NSM passes five tests. Score each candidate 1-5 on each. Anything below 4 on any criterion is a red flag.
- Leading. The metric moves before revenue moves. If it merely tracks revenue, which lags, it cannot guide decisions; you'll only see the result after the quarter is lost. Target: predicts revenue 60-90 days out.
- Measurable. One SQL query, one number, one dashboard. If it requires three teams to reconcile definitions, it's not measurable, it's a debate.
- Actionable. Your product team can move its inputs. Cutler's framework explicitly says you should not be able to move the NSM directly -- but you must be able to move the breadth, depth, frequency, or efficiency drivers underneath it.
- Value-aligned. It only goes up when customers get more value. A metric that rises while NPS falls fails this test. This is what blocks vanity metrics: total signups can grow while engaged users shrink.
- Single-number, non-ratio. Sean Ellis is explicit: "It should not be a ratio." Ratios let teams optimize the denominator (kick out low-engagement users) instead of the numerator (deliver more value).
A candidate that scores 23-25 of 25 is a real NSM. A candidate that scores 18-22 is workable but you should keep looking. Below 18, kill it.
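If you keep the scoring sheet in your warehouse, the tally itself is one query. A minimal sketch in Postgres-flavored SQL, using the scores from Worked Example 1 below; the `scores` sheet is a hypothetical table, not part of any standard schema:

```sql
-- Hypothetical scoring sheet: one row per (candidate, criterion, score).
with scores (candidate, criterion, score) as (
    values
        ('A: MAU', 'leading', 2), ('A: MAU', 'measurable', 5),
        ('A: MAU', 'actionable', 3), ('A: MAU', 'value_aligned', 2),
        ('A: MAU', 'non_ratio', 5),
        ('B: Weekly Active Querying Users', 'leading', 5),
        ('B: Weekly Active Querying Users', 'measurable', 4),
        ('B: Weekly Active Querying Users', 'actionable', 5),
        ('B: Weekly Active Querying Users', 'value_aligned', 5),
        ('B: Weekly Active Querying Users', 'non_ratio', 5),
        ('C: MRR', 'leading', 1), ('C: MRR', 'measurable', 5),
        ('C: MRR', 'actionable', 2), ('C: MRR', 'value_aligned', 3),
        ('C: MRR', 'non_ratio', 5)
)
select
    candidate,
    sum(score) as total,
    -- Thresholds from the rubric above: 23+ real, 18-22 workable, below 18 kill.
    case
        when sum(score) >= 23 then 'real NSM'
        when sum(score) >= 18 then 'workable, keep looking'
        else 'kill it'
    end as verdict
from scores
group by candidate
order by total desc;
```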
How do you score candidate metrics against the 5 criteria? (Worked example 1: PLG analytics tool)
Imagine a self-serve product analytics tool, similar to early Mixpanel or PostHog. The product is free up to 1M events/month, paid above that. The team brainstorms three candidate NSMs:
- Candidate A: Monthly Active Users (MAU)
- Candidate B: Weekly Active Querying Users -- accounts where >=3 unique users ran >=5 queries in the last 7 days
- Candidate C: MRR
Score each against the 5 criteria:
| Criterion | A: MAU | B: Weekly Active Querying Users | C: MRR |
|---|---|---|---|
| Leading (predicts revenue 60-90d) | 2 | 5 | 1 |
| Measurable (one query, one number) | 5 | 4 | 5 |
| Actionable (team can move inputs) | 3 | 5 | 2 |
| Value-aligned (rises only with value) | 2 | 5 | 3 |
| Single-number, non-ratio | 5 | 5 | 5 |
| Total | 17 | 24 | 16 |
Winner: Weekly Active Querying Users (24/25).
Why MAU loses: a logged-in user who never queries is not getting value, so the metric can inflate without product-market fit deepening. Why MRR loses: it's a lagging indicator of decisions made 60-90 days ago, and the product team cannot move it directly without sales involvement. Why the winner wins: querying is the product's "aha moment" -- it correlates tightly with conversion to paid, and the team can move its inputs (onboarding query templates, integration breadth, alert features).
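You can back the leading score with data rather than judgment by checking how well this week's metric predicts revenue roughly a quarter later. A sketch, assuming two hypothetical weekly rollups -- `nsm_weekly(week, qualifying_accounts)` and `mrr_weekly(week, total_mrr)` -- in Postgres-flavored SQL:

```sql
-- Correlate the NSM now with total MRR ~12 weeks (84 days) later.
-- nsm_weekly and mrr_weekly are assumed rollups, one row per week each.
select corr(n.qualifying_accounts::float8, m.total_mrr::float8)
    as lead_correlation_12w
from nsm_weekly n
join mrr_weekly m
  on m.week = n.week + interval '84 days';
```

A strong positive correlation at the 60-90 day lag supports a 5 on the leading criterion; a flat one means the candidate is concurrent, not leading.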
How does the test work for a sales-led product? (Worked example 2: HR platform)
Now apply the same test to a sales-led HR platform, similar to Lattice or 15Five. Average contract is $40K ARR, 12-month deals, sales-assisted onboarding. Three candidate NSMs:
- Candidate A: Seats Sold
- Candidate B: Performance Reviews Completed per Active Manager per Quarter
- Candidate C: Logo Retention %
Score them:
| Criterion | A: Seats Sold | B: Reviews Completed / Manager / Quarter | C: Logo Retention % |
|---|---|---|---|
| Leading (predicts revenue 60-90d) | 2 | 5 | 3 |
| Measurable (one query, one number) | 5 | 4 | 4 |
| Actionable (team can move inputs) | 3 | 5 | 3 |
| Value-aligned (rises only with value) | 2 | 5 | 4 |
| Single-number, non-ratio | 5 | 5 | 1 |
| Total | 17 | 24 | 15 |
Winner: Reviews Completed per Active Manager per Quarter (24/25).
Logo Retention loses on the ratio rule -- it's a percentage, so the team can optimize by churning small accounts faster (shrinking the denominator). Seats Sold loses because it's a leading indicator of contracts, not value: enterprises buy 500 seats and use 80, then churn at renewal. The winning metric -- a count of completed reviews per active manager -- only rises when the product is actually used for its core job. It maps directly to renewal probability, which Reforge's research shows is the strongest predictor of net revenue retention in B2B SaaS.
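The Example 2 winner is just as instrumentable. A sketch assuming a hypothetical `review_events` table (one row per completed review, with `account_id`, `manager_id`, `completed_at`) and defining "active manager" as one who completed at least one review that quarter -- your real definition may differ:

```sql
-- One row per (account, quarter): reviews completed per active manager.
-- review_events and the "active manager" definition are assumptions.
select
    account_id,
    date_trunc('quarter', completed_at) as quarter,
    count(*) as reviews_completed,
    count(distinct manager_id) as active_managers,
    round(count(*)::numeric / nullif(count(distinct manager_id), 0), 2)
        as reviews_per_active_manager
from review_events
group by 1, 2;
```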
Should a startup pick MRR or an engagement metric as its north star?
Pick the engagement metric, almost always. MRR fails the leading-indicator test (it lags real customer behavior by 30-90 days) and the actionable test (most product changes can't move it directly).
The exception is a pure transactional SaaS where MRR moves the same week the product is used -- think a usage-billed API where every successful call generates revenue. In that narrow case, MRR and engagement are nearly the same metric.
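To see why the exception holds: in a usage-billed API, the revenue query and the engagement query are the same query with one extra multiplication. A sketch, assuming a hypothetical `api_calls` table and a flat per-call price (both illustrative):

```sql
-- Weekly successful calls (engagement) and the revenue they generate.
-- api_calls(called_at, status) and the $0.0004/call price are assumptions.
select
    date_trunc('week', called_at) as week,
    count(*) as successful_calls,            -- the engagement metric
    count(*) * 0.0004 as usage_revenue_usd   -- the revenue metric
from api_calls
where status = 'success'
group by 1;
```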
For everyone else, Sean Ellis is explicit: "the north star metric should capture units of value being delivered to users -- not revenue." Revenue is the result of value delivered. Tracking the result instead of the cause means you only see problems after they hit the bank account.
The practical rule: if you're pre-Series B, pick a depth-of-engagement metric. After Series B, when you have enough data to map engagement-to-revenue conversion rates with confidence, you can layer revenue targets on top of the engagement NSM. But the NSM itself stays anchored to customer value.
How do you instrument your north star metric in a warehouse and product analytics tool?
Once you've picked the winner, instrument it in three places so every team sees the same number.
Step 1: Define the metric in your warehouse (single source of truth). Write a dbt model that produces one row per (account, week) with the NSM value. This is the canonical definition. Example for Worked Example 1:
```sql
-- models/marts/north_star_weekly_active_querying.sql
-- One row per (account, week) that clears the NSM threshold:
-- >=3 unique querying users and >=5 queries in the week.
select
    account_id,
    date_trunc('week', event_ts) as week,
    count(distinct user_id) as querying_users,
    count(*) as total_queries
from events
where event_name = 'query_run'
group by 1, 2
having count(distinct user_id) >= 3 and count(*) >= 5
```
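The mart returns one row per qualifying (account, week); the headline number leadership tracks is simply the weekly count of those rows, which downstream dashboards can query via dbt's `ref()`:

```sql
-- Weekly headline NSM: count of accounts clearing the threshold.
select week, count(*) as weekly_active_querying_accounts
from {{ ref('north_star_weekly_active_querying') }}
group by week
order by week;
```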
Step 2: Pipe events into a product analytics tool. Send the same events to Mixpanel, Amplitude, or PostHog so PMs can slice the NSM by feature, cohort, or segment without writing SQL. The Mixpanel x Census integration keeps both layers in sync.
Step 3: Build the input tree. John Cutler's framework puts 3-5 inputs underneath the NSM, usually mapped to breadth, depth, frequency, efficiency. For Weekly Active Querying Users, inputs are: new querying accounts/week (breadth), queries per active user (depth), days/week with at least one query (frequency), and time-to-first-query for new signups (efficiency). Teams own inputs; leadership owns the NSM.
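Three of those four inputs fall out of the same `events` table used in Step 1; efficiency (time-to-first-query) additionally needs a signups table and is omitted here. A Postgres-flavored sketch, with depth computed per account and then averaged, which approximates queries per active user:

```sql
-- Breadth, depth, and frequency inputs under the NSM, per week.
with weekly as (
    select
        account_id,
        date_trunc('week', event_ts) as week,
        count(*) as queries,
        count(distinct user_id) as users,
        count(distinct event_ts::date) as active_days
    from events
    where event_name = 'query_run'
    group by 1, 2
),
first_week as (
    select account_id, min(week) as cohort_week
    from weekly
    group by 1
)
select
    w.week,
    count(*) filter (where f.cohort_week = w.week)
        as new_querying_accounts,                                  -- breadth
    avg(w.queries::numeric / w.users) as queries_per_active_user,  -- depth
    avg(w.active_days) as active_days_per_account                  -- frequency
from weekly w
join first_week f using (account_id)
group by 1;
```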
How do you know if your north star metric is wrong?
Your NSM is wrong if any of these four signals show up. Even if it scored 25/25 on the criteria, an NSM that breaks the org is broken.
- It rises while NPS, retention, or expansion fall. This is the Jay Stansell pattern: "the North Star was shining while the product was quietly dying underneath it." Total signups grew, engaged users shrank, the company shipped fast and lost slowly.
- It conflicts with sales comp. This is the classic misalignment failure. A PLG company picks Weekly Active Querying Users as its NSM; the metric scores 24/25. But sales comp is 100% on logo ACV, so AEs chase enterprise logos that buy 200 seats and query 4 times a quarter. Product optimizes for query velocity, sales optimizes for big checks, and roadmap and quotas diverge. Within 6 months, exec meetings turn into hostage negotiations. Fix: align comp to the NSM, or accept that you have a leading and a lagging metric and weight them explicitly.
- You can move it directly. Cutler's red flag. If a campaign or pricing change moves the NSM by 20% in a week, you picked an output, not a true north star. True NSMs move via inputs, not levers.
- No team owns the inputs. If product, marketing, and CS each blame the others when the NSM dips, the input tree is incomplete. Map every input to one team.
Can you change your north star metric?
Yes, and you probably will. Most companies change NSMs every 18-36 months as the business model matures.
Amplitude itself changed its NSM as it grew, and the company that wrote the playbook explicitly recommends revisiting the metric annually. Stage transitions force the change: a pre-PMF startup tracks engagement depth, a Series B company adds activation breadth, a public company tracks expansion.
The rule for changing: do it deliberately, communicate it widely, and change the inputs and dashboards in the same week. Half-migrations, where some teams use the new NSM and others use the old one, cause more damage than the wrong metric. Pick a date, ship the new dbt model, retire the old dashboard, and re-run the 5-criterion test.
What NOT to do: change the NSM every quarter to chase whatever is up-and-to-the-right that week. That's not a north star, that's a weather vane.
For reference, here is the full 5-criterion test in one table:
| Criterion | What it tests | Pass threshold | Common failure |
|---|---|---|---|
| Leading | Does it predict revenue 60-90 days out? | Score 4-5 | Tracking MRR (lagging) instead of usage (leading) |
| Measurable | One SQL query, one number? | Score 4-5 | Definition requires three teams to reconcile |
| Actionable | Can your team move its inputs? | Score 4-5 | Picking a metric only Sales or Finance can move |
| Value-aligned | Does it only rise when customers get value? | Score 4-5 | MAU rising while engaged users shrink |
| Single-number | Is it a count, not a ratio? | Score 5 | Logo retention % -- optimize by churning small accounts |