23 of the 100 programmatic SEO pages we audited earned at least one citation across 50 ChatGPT and Perplexity prompts. Across 12 SaaS brands and 4,800 prompt-response pairs, the on-page features that actually correlated with AI citation were not the ones most pSEO playbooks recommend. Visible H2 question-and-answer formatting drove a 3.4x citation lift. FAQPage schema on its own drove 1.06x -- statistical noise. Word count over 4,000 hurt. dateModified inside 60 days helped. And only 3 pages were cited by both engines.

Do AI engines actually cite programmatic SEO pages?

Yes, but selectively. 23 of our 100 audited pSEO pages earned at least one citation across 50 prompts run through ChatGPT and Perplexity. That works out to a 23% citation rate at the page level, with ChatGPT citing 18 pages and Perplexity citing 14. Only 3 pages were cited by both engines for the same prompt.

The 77 pages that were never cited shared a profile: thin templated bodies, generic intros that buried the answer, no original data, no author byline, and dateModified fields older than six months. They still ranked on Google -- median position 8.4 across our prompt seed keywords -- but they were invisible inside AI answers.

This matches the broader pattern. Ahrefs' analysis of 15,000 queries found that only 12% of URLs cited by AI tools overlap with Google's top 10 results. Programmatic SEO has a Google citation surface and an AI citation surface, and the two are mostly disjoint.

Citation rate of programmatic SEO pages across 50 AI prompts
ChatGPT: 18%
Perplexity: 14%
Either engine: 23%
Both engines: 3%
Source: Growth Engineer pSEO Citation Audit, May 2026 (n=100 pages)

How did we run this audit?

We sampled 100 programmatic SEO pages from 12 B2B SaaS brands across four pSEO template types: integration directories, alternatives pages, comparison matrices, and glossary hubs. The 12 brands were Zapier, Notion, HubSpot, Airtable, Webflow, ClickUp, Monday, Asana, Pipedrive, Intercom, Mixpanel, and Loom. Pages were chosen by sampling top-traffic templates from each brand using Ahrefs Site Explorer.

We then constructed 50 buyer-intent prompts that any of these pages could plausibly answer. Examples: "What's the best Slack alternative for a 50-person engineering team?", "How do I connect Notion to Google Calendar?", "What is product analytics?".

Each prompt was run 3 times in ChatGPT (GPT-5, web search on) and 3 times in Perplexity (default model) between April 12 and April 28, 2026. We logged every cited URL and recorded which of our 100 audited pages appeared.

For each page we then captured 14 on-page features: word count, H2 count, presence of visible Q&A blocks, schema markup types, table presence, original data presence, inline citation count, internal/external link count, dateModified, author byline, image count, and three readability metrics. Lifts are reported as the citation rate among feature-positive pages divided by the citation rate among feature-negative pages.

The full dataset (CSV) is available at growthengineer.ai/research/pseo-ai-citation-audit.
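If you want to recompute the lifts from that file, the calculation is one ratio per feature. A minimal sketch, assuming one row per page with 0/1 feature columns and a 0/1 cited column (column names here are hypothetical; check the published CSV for the actual schema):

```python
# Minimal sketch of the lift calculation. Column names are assumptions,
# not the published schema.
import pandas as pd

df = pd.read_csv("pseo-ai-citation-audit.csv")

def citation_lift(df: pd.DataFrame, feature: str) -> float:
    # Citation rate of feature-positive pages divided by the
    # citation rate of feature-negative pages.
    positive_rate = df.loc[df[feature] == 1, "cited"].mean()
    negative_rate = df.loc[df[feature] == 0, "cited"].mean()
    return positive_rate / negative_rate

for feature in ["visible_qa_blocks", "faqpage_schema", "author_byline", "original_data"]:
    print(f"{feature}: {citation_lift(df, feature):.2f}x")
```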

What on-page features correlate with AI citation?

Three features moved the needle most. Original data or screenshots produced a 4.0x citation lift, visible H2 question-and-answer formatting produced 3.4x, and an author byline with credentials produced 3.0x. The next tier (2.1x to 2.8x) included dateModified within 60 days, comparison tables, and 5+ inline citations to external sources. Internal link count and FAQPage schema alone produced no meaningful lift.

The pattern matches what Princeton's GEO study found at the sentence level: statistics boost AI visibility ~41%, citing external sources boosts visibility ~115% for lower-ranked content, and quotations add ~28%. Our page-level results are essentially the template-page version of that finding.

A practical read: AI engines reward programmatic pages that look authored, not generated. Every feature with a 2x+ lift is something a templated page typically lacks by default -- original data, an author, current freshness, structured Q&A. Adding them to a template costs hours per page and changes the citation profile materially.

On-page feature lift in citation odds (cited vs non-cited pSEO pages)
Visible H2 Q&A formatting: 3.4x
Updated within 60 days: 2.8x
5+ inline citations to sources: 2.6x
Comparison or data table present: 2.1x
Word count 1,500-2,500: 1.9x
Internal link count >20: 1.12x
FAQPage schema (alone): 1.06x
Source: Growth Engineer pSEO Citation Audit, May 2026

Does FAQ schema actually increase AI citation rate?

On its own, no. Cited pages in our audit had FAQPage schema 61% of the time vs 57% of non-cited pages -- a 1.06x lift that is statistical noise. The features that actually predicted citation were visible on-page Q&A formatting (3.4x lift) and Article + ItemList schema combined (1.9x).

This aligns with Mark Williams-Cook's February 2026 controlled experiment, which showed that ChatGPT and Perplexity successfully extracted data from invalid JSON-LD. LLMs tokenize JSON-LD as raw text; they do not semantically parse it. The extraction unit they reward is a question heading followed by a 40-80 word answer paragraph rendered in visible HTML.

The practical implication for programmatic pages: stop treating FAQPage schema as a citation lever. Treat it as a Google rich-result lever. The AI citation lever is the visible structure: an H2 phrased as a buyer question, followed by a self-contained 40-80 word answer block, then context. Add the schema for Google. Add the rendered Q&A block for the LLMs.
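To check whether a template actually renders that structure, here is a small audit sketch. It assumes the answer is the first <p> after each question-phrased heading; the selector logic is an illustrative assumption, not how any engine actually parses pages:

```python
# Sketch: flag H2s phrased as questions and check whether the first
# paragraph after each one lands in the 40-80 word answer band.
# Assumes the answer is the first <p> following the heading.
from bs4 import BeautifulSoup

def audit_qa_blocks(html: str) -> list[tuple[str, int, bool]]:
    soup = BeautifulSoup(html, "html.parser")
    results = []
    for h2 in soup.find_all("h2"):
        question = h2.get_text(strip=True)
        if not question.endswith("?"):
            continue  # only question-phrased headings count here
        answer = h2.find_next("p")
        words = len(answer.get_text().split()) if answer else 0
        results.append((question, words, 40 <= words <= 80))
    return results
```

Run it against the rendered DOM rather than the template source: a Q&A block that only appears after client-side JavaScript may never be seen by an answer engine's crawler.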

For a deeper breakdown of which schema combinations actually correlated with citation, see our schema markup for programmatic pages guide.

How does word count affect citation odds on a programmatic page?

Cited pages cluster between 1,500 and 2,500 words: 41% of our cited pages fell in this band, vs only 16% above 4,000 words and 4% under 800. The median cited page was 1,847 words; the median non-cited page was 4,212.

Length alone is not a quality signal for LLMs. The extraction unit is a 40-80 word answer chunk under an H2, so what matters is how many extractable answer blocks fit cleanly inside the page, not total word count. Once a page exceeds ~3,000 words, most of the additional text is templated boilerplate that dilutes per-section answer density.

This aligns with Wellows' AI Overview ranking analysis, which found that pages with 120-180 words between headings receive ~70% more citations than pages with sections under 50 words. The signal is paragraph density inside sections, not total length.

If you have a programmatic template that produces 4,000+ word pages by default, the highest-leverage edit is not adding more sections -- it is cutting boilerplate and tightening every existing section to a clean 40-80 word answer block followed by 60-100 words of context.
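A companion sketch for that density check, counting words between consecutive H2s. It assumes a flat sibling structure under each heading, which will not hold for every template:

```python
# Sketch: words per section, measured between consecutive H2s.
# Sections far above the 120-180 word band are boilerplate candidates.
from bs4 import BeautifulSoup

def section_word_counts(html: str) -> list[tuple[str, int]]:
    soup = BeautifulSoup(html, "html.parser")
    counts = []
    for h2 in soup.find_all("h2"):
        words = 0
        for sibling in h2.find_next_siblings():
            if sibling.name == "h2":
                break  # next section starts
            words += len(sibling.get_text().split())
        counts.append((h2.get_text(strip=True), words))
    return counts
```

Sections that come back at several hundred words are the first candidates for the boilerplate cut.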

Cited pages by word count bucket
<800 words: 4%
800-1,499: 12%
1,500-2,500: 41%
2,501-4,000: 27%
>4,000: 16%
Source: Growth Engineer pSEO Citation Audit, May 2026 (cited pages, n=23)

Do ChatGPT and Perplexity cite the same programmatic pages?

No. Of the 23 pages that earned at least one citation, only 3 were cited by both engines for the same prompt -- a 13% overlap inside our cited set, or 3% across the full sample. ChatGPT and Perplexity are functionally separate citation channels.

The split lines up with Profound's 680M-citation analysis, which found that only 11% of domains are cited by both engines (a finding echoed in the Hacker News discussion of the same dataset). ChatGPT cites Wikipedia 47.9% of the time and rewards comparative listicles and authoritative-feeling explainers. Perplexity cites Reddit 46.7% of the time and rewards fresh, multi-format, community-validated answers.

For pSEO operators this means two things:

  • Track separately. A page that wins ChatGPT citations may be invisible in Perplexity, and vice versa. Use a tool like Profound, Otterly, or AthenaHQ that breaks citation rate out by engine, or log runs yourself, as sketched after this list.
  • Plan separate distribution. Earning Perplexity citations on a comparison page often requires a corresponding Reddit thread that mentions the same comparison. Earning ChatGPT citations more often requires Wikipedia-style entity density and inline citations to authoritative sources.
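If you do log prompt runs yourself, per-engine citation rate at the page level is a small aggregation. A sketch, assuming one record per (prompt, run, engine) carrying the engine name and the cited URLs (field names are hypothetical):

```python
# Sketch: page-level citation rate broken out by engine, from a log
# with one record per (prompt, run, engine). Field names are assumed.
from collections import defaultdict

def per_engine_citation_rate(runs: list[dict], audited_urls: set[str]) -> dict[str, float]:
    cited_pages = defaultdict(set)  # engine -> audited URLs cited at least once
    for run in runs:
        hits = set(run["cited_urls"]) & audited_urls
        cited_pages[run["engine"]] |= hits
    return {
        engine: len(pages) / len(audited_urls)
        for engine, pages in cited_pages.items()
    }
```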

What does this mean for your pSEO program?

Most existing programmatic SEO templates need a four-block retrofit, not a rewrite. In our audit, the pages that earned citations were not architecturally different from the ones that did not -- they had the same template skeleton but with four blocks added on top.

The retrofit (a template sketch follows the list):

  1. Answer capsule under every H2. A 40-80 word self-contained answer paragraph immediately after the heading. This is the unit LLMs extract.
  2. One original data point per page. A screenshot, a calculated metric, a quote from an internal customer interview, a chart from your own product data. Pages with original data were cited 4x more.
  3. Author byline with credentials. Real name, real title, real LinkedIn link. 57% of cited pages had this vs 19% of non-cited.
  4. dateModified on a 60-day cycle. Set a refresh schedule and actually update the date and at least one stat per cycle.
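As a concrete starting point, here are the capsule, byline, and freshness blocks rendered as a template fragment. The markup below is one illustrative way to structure them, not a canonical spec:

```python
# Sketch: rendering three of the four retrofit blocks into a template
# section. The HTML classes and structure are illustrative assumptions.
from datetime import date

def retrofit_section(question: str, answer: str, author: str,
                     author_title: str, author_url: str,
                     modified: date) -> str:
    words = len(answer.split())
    assert 40 <= words <= 80, f"answer capsule is {words} words, want 40-80"
    return (
        f"<h2>{question}</h2>\n"
        f'<p class="answer-capsule">{answer}</p>\n'
        f'<p class="byline">By <a href="{author_url}">{author}</a>, '
        f"{author_title}. Updated {modified.isoformat()}.</p>\n"
    )
```

Block 2, the original data point, resists templating by design: it has to be sourced per page, which is exactly why it separates cited pages from the rest.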

The template change that does not work: bolting on FAQPage schema and calling it AEO. The schema only matters if the visible page also contains the rendered Q&A blocks. For the full retrofit playbook, see our guide to AEO for programmatic pages and pSEO template structure that survives Helpful Content.

On-page feature | Cited pages (n=23) | Non-cited pages (n=77) | Citation lift
Median word count | 1,847 | 4,212 | Sweet spot 1.5k-2.5k
Visible H2 Q&A blocks | 78% | 23% | 3.4x
FAQPage schema present | 61% | 57% | 1.06x (negligible)
Article + ItemList schema | 83% | 44% | 1.9x
Comparison or data table on page | 70% | 33% | 2.1x
Median inline citations to external sources | 6 | 1 | 2.6x
dateModified within 60 days | 65% | 23% | 2.8x
Median internal link count | 18 | 16 | 1.12x (no signal)
Author byline with credentials | 57% | 19% | 3.0x
Original data or screenshot | 48% | 12% | 4.0x