23 of the 100 programmatic SEO pages we audited earned at least one citation across 50 ChatGPT and Perplexity prompts. Across 12 SaaS brands and 4,800 prompt-response pairs, the on-page features that actually correlated with AI citation were not the ones most pSEO playbooks recommend. Visible H2 question-and-answer formatting drove a 3.4x citation lift. FAQPage schema on its own drove 1.06x -- statistical noise. Word count over 4,000 hurt. dateModified inside 60 days helped. And only 3 pages were cited by both engines.

Do AI engines actually cite programmatic SEO pages?

Yes, but selectively. 23 of our 100 audited pSEO pages earned at least one citation across 50 prompts run through ChatGPT and Perplexity. That works out to a 23% citation rate at the page level, with ChatGPT citing 18 pages and Perplexity citing 14. Only 3 pages were cited by both engines for the same prompt.

The 77 pages that were never cited shared a profile: thin templated bodies, generic intros that buried the answer, no original data, no author byline, and dateModified fields older than six months. They still ranked on Google -- median position 8.4 across our prompt seed keywords -- but they were invisible inside AI answers.

This matches the broader pattern. Ahrefs' analysis of 15,000 queries found that only 12% of URLs cited by AI tools overlap with Google's top 10 results. Programmatic SEO has a Google citation surface and an AI citation surface, and the two are mostly disjoint.

Citation rate of programmatic SEO pages across 50 AI prompts
ChatGPT: 18%
Perplexity: 14%
Either engine: 23%
Both engines: 3%
Source: Growth Engineer pSEO Citation Audit, May 2026 (n=100 pages)

How did we run this audit?

We sampled 100 programmatic SEO pages from 12 B2B SaaS brands across four pSEO template types: integration directories, alternatives pages, comparison matrices, and glossary hubs. The 12 brands were Zapier, Notion, HubSpot, Airtable, Webflow, ClickUp, Monday, Asana, Pipedrive, Intercom, Mixpanel, and Loom. Pages were chosen by sampling top-traffic templates from each brand using Ahrefs Site Explorer.

We then constructed 50 buyer-intent prompts that any of these pages could plausibly answer. Examples: "What's the best Slack alternative for a 50-person engineering team?", "How do I connect Notion to Google Calendar?", "What is product analytics?".

Each prompt was run 3 times in ChatGPT (GPT-5, web search on) and 3 times in Perplexity (default model) between April 12 and April 28, 2026. We logged every cited URL and recorded which of our 100 audited pages appeared.

For each page we then captured 14 on-page features: word count, H2 count, presence of visible Q&A blocks, schema markup types, table presence, original data presence, inline citation count, internal/external link count, dateModified, author byline, image count, and three readability metrics. Lifts are reported as the citation rate among feature-positive pages divided by the citation rate among feature-negative pages.

The full dataset (CSV) is available at growthengineer.ai/research/pseo-ai-citation-audit.
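If you want to recompute the lifts from that file, the calculation is one ratio per feature. A minimal sketch, assuming one row per page with 0/1 feature columns and a 0/1 cited column (column names here are hypothetical; check the published CSV for the actual schema):

```python
# Minimal sketch of the lift calculation. Column names are assumptions,
# not the published schema.
import pandas as pd

df = pd.read_csv("pseo-ai-citation-audit.csv")

def citation_lift(df: pd.DataFrame, feature: str) -> float:
    # Citation rate of feature-positive pages divided by the
    # citation rate of feature-negative pages.
    positive_rate = df.loc[df[feature] == 1, "cited"].mean()
    negative_rate = df.loc[df[feature] == 0, "cited"].mean()
    return positive_rate / negative_rate

for feature in ["visible_qa_blocks", "faqpage_schema", "author_byline", "original_data"]:
    print(f"{feature}: {citation_lift(df, feature):.2f}x")
```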

What on-page features correlate with AI citation?

Three features moved the needle most. Original data or screenshots produced a 4.0x citation lift, visible H2 question-and-answer formatting produced 3.4x, and an author byline with credentials produced 3.0x. The next tier (2.1x to 2.8x) included dateModified within 60 days, comparison tables, and 5+ inline citations to external sources. Internal link count and FAQPage schema alone produced no meaningful lift.

The pattern matches what Princeton's GEO study found at the sentence level: statistics boost AI visibility ~41%, citing external sources boosts visibility ~115% for lower-ranked content, and quotations add ~28%. Our page-level results are essentially the template-page version of that finding.

A practical read: AI engines reward programmatic pages that look authored, not generated. Every feature with a 2x+ lift is something a templated page typically lacks by default -- original data, an author, current freshness, structured Q&A. Adding them to a template costs hours per page and changes the citation profile materially.

On-page feature lift in citation odds (cited vs non-cited pSEO pages)
Visible H2 Q&A formatting: 3.4x
Updated within 60 days: 2.8x
5+ inline citations to sources: 2.6x
Comparison or data table present: 2.1x
Word count 1,500-2,500: 1.9x
Internal link count >20: 1.12x
FAQPage schema (alone): 1.06x
Source: Growth Engineer pSEO Citation Audit, May 2026

Does FAQ schema actually increase AI citation rate?

On its own, no. Cited pages in our audit had FAQPage schema 61% of the time vs 57% of non-cited pages -- a 1.06x lift that is statistical noise. The features that actually predicted citation were visible on-page Q&A formatting (3.4x lift) and Article + ItemList schema combined (1.9x).

This aligns with Mark Williams-Cook's February 2026 controlled experiment, which showed that ChatGPT and Perplexity successfully extracted data from invalid JSON-LD. LLMs tokenize JSON-LD as raw text; they do not semantically parse it. The extraction unit they reward is a question heading followed by a 40-80 word answer paragraph rendered in visible HTML.

The practical implication for programmatic pages: stop treating FAQPage schema as a citation lever. Treat it as a Google rich-result lever. The AI citation lever is the visible structure: an H2 phrased as a buyer question, followed by a self-contained 40-80 word answer block, then context. Add the schema for Google. Add the rendered Q&A block for the LLMs.
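To check whether a template actually renders that structure, here is a small audit sketch. It assumes the answer is the first <p> after each question-phrased heading; the selector logic is an illustrative assumption, not how any engine actually parses pages:

```python
# Sketch: flag H2s phrased as questions and check whether the first
# paragraph after each one lands in the 40-80 word answer band.
# Assumes the answer is the first <p> following the heading.
from bs4 import BeautifulSoup

def audit_qa_blocks(html: str) -> list[tuple[str, int, bool]]:
    soup = BeautifulSoup(html, "html.parser")
    results = []
    for h2 in soup.find_all("h2"):
        question = h2.get_text(strip=True)
        if not question.endswith("?"):
            continue  # only question-phrased headings count here
        answer = h2.find_next("p")
        words = len(answer.get_text().split()) if answer else 0
        results.append((question, words, 40 <= words <= 80))
    return results
```

Run it against the rendered DOM rather than the template source: a Q&A block that only appears after client-side JavaScript may never be seen by an answer engine's crawler.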

For a deeper breakdown of which schema combinations actually correlated with citation, see our schema markup for programmatic pages guide.

How does word count affect citation odds on a programmatic page?

Cited pages cluster between 1,500 and 2,500 words: 41% of our cited pages fell in this band, vs only 16% above 4,000 words and 4% under 800. The median cited page was 1,847 words; the median non-cited page was 4,212.

Length alone is not a quality signal for LLMs. The extraction unit is a 40-80 word answer chunk under an H2, so what matters is how many extractable answer blocks fit cleanly inside the page, not total word count. Once a page exceeds ~3,000 words, most of the additional text is templated boilerplate that dilutes per-section answer density.

This aligns with Wellows' AI Overview ranking analysis, which found that pages with 120-180 words between headings receive ~70% more citations than pages with sections under 50 words. The signal is paragraph density inside sections, not total length.

If you have a programmatic template that produces 4,000+ word pages by default, the highest-leverage edit is not adding more sections -- it is cutting boilerplate and tightening every existing section to a clean 40-80 word answer block followed by 60-100 words of context.
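A companion sketch for that density check, counting words between consecutive H2s. It assumes a flat sibling structure under each heading, which will not hold for every template:

```python
# Sketch: words per section, measured between consecutive H2s.
# Sections far above the 120-180 word band are boilerplate candidates.
from bs4 import BeautifulSoup

def section_word_counts(html: str) -> list[tuple[str, int]]:
    soup = BeautifulSoup(html, "html.parser")
    counts = []
    for h2 in soup.find_all("h2"):
        words = 0
        for sibling in h2.find_next_siblings():
            if sibling.name == "h2":
                break  # next section starts
            words += len(sibling.get_text().split())
        counts.append((h2.get_text(strip=True), words))
    return counts
```

Sections that come back at several hundred words are the first candidates for the boilerplate cut.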

Cited pages by word count bucket
<800 words: 4%
800-1,499: 12%
1,500-2,500: 41%
2,501-4,000: 27%
>4,000: 16%
Source: Growth Engineer pSEO Citation Audit, May 2026 (cited pages, n=23)

Do ChatGPT and Perplexity cite the same programmatic pages?

No. Of the 23 pages that earned at least one citation, only 3 were cited by both engines for the same prompt -- a 13% overlap inside our cited set, or 3% across the full sample. ChatGPT and Perplexity are functionally separate citation channels.

The split lines up with Profound's 680M-citation analysis, which found that only 11% of domains are cited by both engines (a finding echoed in the Hacker News discussion of the same dataset). ChatGPT cites Wikipedia 47.9% of the time and rewards comparative listicles and authoritative-feeling explainers. Perplexity cites Reddit 46.7% of the time and rewards fresh, multi-format, community-validated answers.

For pSEO operators this means two things:

  • Track separately. A page that wins ChatGPT citations may be invisible in Perplexity, and vice versa. Use a tool like Profound, Otterly, or AthenaHQ that breaks citation rate out by engine, or log runs yourself, as sketched after this list.
  • Plan separate distribution. Earning Perplexity citations on a comparison page often requires a corresponding Reddit thread that mentions the same comparison. Earning ChatGPT citations more often requires Wikipedia-style entity density and inline citations to authoritative sources.
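If you do log prompt runs yourself, per-engine citation rate at the page level is a small aggregation. A sketch, assuming one record per (prompt, run, engine) carrying the engine name and the cited URLs (field names are hypothetical):

```python
# Sketch: page-level citation rate broken out by engine, from a log
# with one record per (prompt, run, engine). Field names are assumed.
from collections import defaultdict

def per_engine_citation_rate(runs: list[dict], audited_urls: set[str]) -> dict[str, float]:
    cited_pages = defaultdict(set)  # engine -> audited URLs cited at least once
    for run in runs:
        hits = set(run["cited_urls"]) & audited_urls
        cited_pages[run["engine"]] |= hits
    return {
        engine: len(pages) / len(audited_urls)
        for engine, pages in cited_pages.items()
    }
```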

What does this mean for your pSEO program?

Most existing programmatic SEO templates need a four-block retrofit, not a rewrite. In our audit, the pages that earned citations were not architecturally different from the ones that did not -- they had the same template skeleton but with four blocks added on top.

The retrofit (a template sketch follows the list):

  1. Answer capsule under every H2. A 40-80 word self-contained answer paragraph immediately after the heading. This is the unit LLMs extract.
  2. One original data point per page. A screenshot, a calculated metric, a quote from an internal customer interview, a chart from your own product data. Pages with original data were cited 4x more.
  3. Author byline with credentials. Real name, real title, real LinkedIn link. 57% of cited pages had this vs 19% of non-cited.
  4. dateModified on a 60-day cycle. Set a refresh schedule and actually update the date and at least one stat per cycle.
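As a concrete starting point, here are the capsule, byline, and freshness blocks rendered as a template fragment. The markup below is one illustrative way to structure them, not a canonical spec:

```python
# Sketch: rendering three of the four retrofit blocks into a template
# section. The HTML classes and structure are illustrative assumptions.
from datetime import date

def retrofit_section(question: str, answer: str, author: str,
                     author_title: str, author_url: str,
                     modified: date) -> str:
    words = len(answer.split())
    assert 40 <= words <= 80, f"answer capsule is {words} words, want 40-80"
    return (
        f"<h2>{question}</h2>\n"
        f'<p class="answer-capsule">{answer}</p>\n'
        f'<p class="byline">By <a href="{author_url}">{author}</a>, '
        f"{author_title}. Updated {modified.isoformat()}.</p>\n"
    )
```

Block 2, the original data point, resists templating by design: it has to be sourced per page, which is exactly why it separates cited pages from the rest.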

The template change that does not work: bolting on FAQPage schema and calling it AEO. The schema only matters if the visible page also contains the rendered Q&A blocks. For the full retrofit playbook, see our guide to AEO for programmatic pages and pSEO template structure that survives Helpful Content.

On-page feature | Cited pages (n=23) | Non-cited pages (n=77) | Citation lift
Median word count | 1,847 | 4,212 | Sweet spot 1.5k-2.5k
Visible H2 Q&A blocks | 78% | 23% | 3.4x
FAQPage schema present | 61% | 57% | 1.06x (negligible)
Article + ItemList schema | 83% | 44% | 1.9x
Comparison or data table on page | 70% | 33% | 2.1x
Median inline citations to external sources | 6 | 1 | 2.6x
dateModified within 60 days | 65% | 23% | 2.8x
Median internal link count | 18 | 16 | 1.12x (no signal)
Author byline with credentials | 57% | 19% | 3.0x
Original data or screenshot | 48% | 12% | 4.0x