An AEO audit is a pass/fail review of whether AI search engines can crawl your B2B site, extract clean answers from your pages, parse your schema, verify your authority, and find your brand mentioned off-site. This checklist gives you 30 specific checks across five categories, each with a 60-second test. According to Forrester's 2026 Buyer Insights, 94% of B2B buyers now use GenAI for self-guided research, so any check you fail is leaking pipeline to a competitor that passes it.
How do I audit my B2B site for AI search visibility?
Run this 30-point checklist top to bottom. Each item is binary: pass or fail, with a method to verify in 60 seconds or less.
The audit is organized into five categories that mirror how a citation actually happens:
- Crawlability -- can AI bots reach your pages at all?
- Extractability -- is your content shaped so a model can lift a clean answer?
- Schema -- are you sending machine-readable signals?
- Authority -- do AI engines trust the source?
- Distribution -- are you mentioned in the places AI engines weight heavily?
Tools you'll need: a browser, Google's Rich Results Test, one logged-out ChatGPT or Perplexity tab, and 90 minutes. No engineering help required for ~70% of the checks. Score each check 1 (pass) or 0 (fail), tally by category, and you have a heat map of where to spend the next sprint.
What does the 30-point AEO audit checklist cover?
The 30 checks map to the five citation factors that AI engines weight most heavily in 2026. Distribution by category:
| Category | # of checks | What it tests |
|---|---|---|
| Crawlability | 6 | Robots.txt, sitemaps, JS rendering, llms.txt, status codes, WAF rules |
| Extractability | 6 | TL;DR, question H2s, section length, tables, citations, answer-first structure |
| Schema | 6 | Article, FAQPage, Organization, HowTo, Product, validation |
| Authority | 6 | Wikidata, named authors, primary citations, brand consistency, co-mentions, press |
| Distribution | 6 | Reddit, LinkedIn, YouTube, directories, citation tracking, refresh cycle |
A score of 25+/30 means your site is AEO-ready. 18-24 means you have meaningful gaps. Below 18, your brand is effectively invisible to ChatGPT, Perplexity, and Google AI Overviews.
Crawlability: can AI crawlers actually read your site?
Crawlability is the first failure point. If GPTBot or PerplexityBot can't fetch your page, nothing else matters. Six checks:
1. AI crawlers allowed in robots.txt
- Pass: `GPTBot`, `OAI-SearchBot`, `ChatGPT-User`, `ClaudeBot`, `PerplexityBot`, and `Google-Extended` are not disallowed.
- 60-second check: visit `yourdomain.com/robots.txt` and search for each user-agent. No `Disallow: /` lines under those agents. (A scripted version of checks #1 and #6 appears after this list.)
2. XML sitemap is live and submitted
- Pass: sitemap returns 200, lists priority URLs, and is referenced in robots.txt.
- 60-second check: visit `yourdomain.com/sitemap.xml`. Submit it in Google Search Console.
3. Critical content renders without JavaScript
- Pass: main copy is in initial HTML, not loaded via client-side JS.
- 60-second check: right-click the page, View Source, search for a sentence from your H1. If it's there, you pass. AI training crawlers historically skip JS-rendered content.
4. Cloudflare / WAF rules aren't blocking AI bots
- Pass: AI bot user-agents are not on a 403 / challenge list.
- 60-second check: in Cloudflare > Security > Bots, confirm AI Crawl Control is set to allow (not block) the bots you care about.
5. llms.txt file exists at root
- Pass: `yourdomain.com/llms.txt` returns 200 with a markdown index of priority pages.
- 60-second check: curl the URL or visit it in-browser. Adoption is still emerging, but Anthropic, Stripe, and Cloudflare have shipped llms.txt files, and developer-tool LLMs already read them.
6. No 404 or redirect loops on priority URLs
- Pass: every URL in your sitemap returns 200, not 301-301-200 or 404.
- 60-second check: run the free version of Screaming Frog on your top 100 URLs. AI training data is a snapshot: a canonical that's broken at crawl time stays broken in the model.
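If you'd rather script checks #1 and #6 than eyeball them, the sketch below covers both using only the Python standard library. The domain, bot list, and priority URLs are placeholders -- swap in your own before running, and treat it as a starting point rather than a full crawler.

```python
# Minimal sketch for checks #1 and #6 (Python 3, stdlib only).
# SITE and PRIORITY_URLS are placeholders -- substitute your own.
import urllib.error
import urllib.request
import urllib.robotparser

SITE = "https://yourdomain.com"
AI_BOTS = ["GPTBot", "OAI-SearchBot", "ChatGPT-User",
           "ClaudeBot", "PerplexityBot", "Google-Extended"]
PRIORITY_URLS = [SITE + "/", SITE + "/pricing", SITE + "/blog"]

# Check #1: does robots.txt allow each AI crawler to fetch the homepage?
rp = urllib.robotparser.RobotFileParser(SITE + "/robots.txt")
rp.read()
for bot in AI_BOTS:
    verdict = "PASS" if rp.can_fetch(bot, SITE + "/") else "FAIL (disallowed)"
    print(f"{bot}: {verdict}")

# Check #6: do priority URLs return 200, without being redirected?
for url in PRIORITY_URLS:
    req = urllib.request.Request(url, headers={"User-Agent": "aeo-audit-script"})
    try:
        with urllib.request.urlopen(req) as resp:
            note = "" if resp.geturl() == url else f" (redirected to {resp.geturl()})"
            print(f"{url}: {resp.status}{note}")
    except urllib.error.HTTPError as e:
        print(f"{url}: {e.code} FAIL")
```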
Extractability: is your content shaped for AI extraction?
AI engines lift answers, not articles. If your page buries the answer or rambles, you don't get cited even if the bot reaches you. Six checks:
7. Direct answer in the first 50-100 words
- Pass: the H1's question is answered in the opening paragraph, no preamble.
- 60-second check: read the first paragraph. Does it state the answer? 90% of top-cited sources do.
8. TL;DR or summary box at the top
- Pass: a visible summary block with 3-5 bullets above the fold.
- 60-second check: page loads -- can you see a summary without scrolling? If not, fail.
9. Question-shaped H2s
- Pass: every H2 reads as a question a human would type into ChatGPT ("What is X?", "How does Y work?").
- 60-second check: outline the page. Count question-shaped H2s vs cute headings. Aim for 80%+ questions.
10. Sections are 120-180 words between headings
- Pass: most sections fall in the 120-180 word band.
- 60-second check: scan for walls of text or 30-word stubs between headings (or run the word-count sketch after this list). Per SE Ranking's 2025 study of 129,000 domains, pages with 120-180 word sections receive 70% more ChatGPT citations.
11. Tables for multi-attribute comparisons
- Pass: any "X vs Y" or feature comparison uses an HTML table.
- 60-second check: search the page source for `<table>` or markdown pipes. AI engines parse tables cleanly; prose comparisons get skipped.
12. Statistics are sourced inline with hyperlinks
- Pass: every number names the source, year, and links to the original.
- 60-second check: Cmd-F for "studies show" or "research shows". Each instance is a fail. Per the Princeton GEO study, adding cited statistics boosts AI visibility ~40%.
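Check #10 is the most tedious to eyeball, so here's a rough scripted version for a single page. It assumes the third-party requests and beautifulsoup4 packages, and the URL is a placeholder.

```python
# Rough sketch for check #10: word count between headings.
# pip install requests beautifulsoup4 -- the URL is a placeholder.
import requests
from bs4 import BeautifulSoup

url = "https://yourdomain.com/blog/example-post"
soup = BeautifulSoup(requests.get(url, timeout=10).text, "html.parser")

for h in soup.find_all(["h2", "h3"]):
    words = 0
    # Count paragraph words until the next heading.
    for sib in h.find_all_next(["h2", "h3", "p"]):
        if sib.name in ("h2", "h3"):
            break
        words += len(sib.get_text().split())
    verdict = "PASS" if 120 <= words <= 180 else "CHECK"
    print(f"{words:>4} words  {verdict}  {h.get_text(strip=True)}")
```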
Schema: are AI engines getting structured signals?
Schema markup is how a page tells an AI engine what it is. Done well, it lifts citation rates substantially. Done badly, it can hurt. Six checks:
13. Article schema with author, datePublished, dateModified
- Pass: every blog post has Article JSON-LD with all three properties.
- 60-second check: paste the URL into Google's Rich Results Test. Confirm Article appears with all dates.
14. FAQPage schema on FAQ blocks
- Pass: any FAQ section is wrapped in FAQPage JSON-LD.
- 60-second check: Rich Results Test. FAQ-schema pages are 60% more likely to feature in Google AI Overviews.
15. Organization schema sitewide
- Pass: homepage and footer carry Organization schema with `sameAs` links to LinkedIn, Crunchbase, G2, and Wikidata.
- 60-second check: View Source on the homepage and search for `"@type": "Organization"` (a scripted JSON-LD check appears after this list).
16. HowTo schema on process / step-by-step content
- Pass: any "how to do X" article uses HowTo with named steps.
- 60-second check: Rich Results Test on a how-to URL.
17. Product schema on product / pricing pages
- Pass: product pages carry Product schema with offers, reviews, and aggregateRating.
- 60-second check: Rich Results Test on `/product` or `/pricing`.
18. Schema validates with no errors
- Pass: zero errors, zero warnings on priority templates.
- 60-second check: Rich Results Test. Per a Growth Marshal study (n=730 citations), attribute-rich schema earns a 61.7% citation rate, but generic minimally-populated schema underperforms no schema at all (41.6% vs 59.8%). Quality matters more than presence.
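To audit checks #13-18 across a page without pasting URLs into the Rich Results Test one at a time, a short script can pull every JSON-LD block and report what it finds. The sketch below (Python, assuming requests and beautifulsoup4; placeholder URL) lists each @type and flags missing Article properties -- it doesn't replace validation, but it surfaces gaps fast.

```python
# Sketch for checks #13-18: list JSON-LD @type values on a page and flag
# missing Article properties. Placeholder URL; @graph wrappers not handled.
import json
import requests
from bs4 import BeautifulSoup

url = "https://yourdomain.com/blog/example-post"
soup = BeautifulSoup(requests.get(url, timeout=10).text, "html.parser")

for tag in soup.find_all("script", type="application/ld+json"):
    try:
        data = json.loads(tag.string or "")
    except json.JSONDecodeError:
        print("FAIL: JSON-LD block does not parse")
        continue
    for block in (data if isinstance(data, list) else [data]):
        schema_type = block.get("@type")
        print(f"Found schema type: {schema_type}")
        if schema_type == "Article":
            for prop in ("author", "datePublished", "dateModified"):
                print(f"  {prop}: {'ok' if prop in block else 'MISSING'}")
```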
Authority: do AI engines trust your brand?
AI engines weight source authority heavily. ChatGPT pulls 47.9% of top citations from Wikipedia. If your brand has no entity footprint, you're not in the model. Six checks:
19. Wikipedia or Wikidata entry exists
- Pass: a Wikidata Q-number exists for your company, with `instance of`, `industry`, `founded`, `headquarters`, and `official website` populated.
- 60-second check: search Wikidata for your brand (or use the API sketch after this list). No entry = highest-leverage fix on this list.
20. Named author with bio and credentials
- Pass: every article has an author byline linking to a bio page with credentials, LinkedIn, and Person schema.
- 60-second check: click an author name. Do you land on a real bio?
21. Primary sources cited inline with hyperlinks
- Pass: every claim links to a study, doc, or report -- not another marketing blog.
- 60-second check: open three articles, count outbound citations. Aim for 3-5 per piece. Per Princeton, adding authoritative external citations lifts visibility up to 115% for lower-ranked content.
22. Brand consistency across G2, Crunchbase, LinkedIn, Apollo
- Pass: company name, tagline, category, and founding year match across all major B2B databases.
- 60-second check: spot-check three platforms. Inconsistencies confuse entity resolution and tank citation rates.
23. 5+ third-party co-mentions per priority page
- Pass: each pillar page has 5 or more third-party pages mentioning your brand alongside the topic.
- 60-second check: Google `"yourbrand" "keyword"` minus your own domain. Count the results.
24. Recent third-party press or industry mention
- Pass: at least one mention from a credible publication or analyst report in the last 90 days.
- 60-second check: news search your brand name. Perplexity's freshness bias means recency is weighted heavily.
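Check #19 can also be scripted against Wikidata's public wbsearchentities API. A stdlib-only Python sketch, with the brand name as a placeholder:

```python
# Sketch for check #19: look up a brand on Wikidata (stdlib only).
import json
import urllib.parse
import urllib.request

brand = "Your Brand"  # placeholder -- use your exact company name
params = urllib.parse.urlencode({
    "action": "wbsearchentities",
    "search": brand,
    "language": "en",
    "type": "item",
    "format": "json",
})
req = urllib.request.Request(
    "https://www.wikidata.org/w/api.php?" + params,
    headers={"User-Agent": "aeo-audit-script"},
)
with urllib.request.urlopen(req) as resp:
    results = json.load(resp).get("search", [])

if not results:
    print("FAIL: no Wikidata entity found -- highest-leverage fix on this list")
for item in results:
    print(f"{item['id']}: {item.get('label')} -- {item.get('description', '')}")
```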
Distribution: are you visible where AI engines actually look?
AI engines pull from off-site sources at high rates. 5W's 2026 AI Platform Citation Source Index found Reddit accounts for roughly 40% of citations across LLMs and ~24% of Perplexity citations alone. Owned content is half the game. Six checks:
25. Active presence in 3+ relevant subreddits
- Pass: employees post substantive answers (not link drops) in subreddits where buyers ask category questions.
- 60-second check: Google `site:reddit.com "yourbrand"`. Count organic mentions.
26. LinkedIn distribution for every new article
- Pass: every article gets a thoughtful LinkedIn post within 24 hours of publish.
- 60-second check: check your last 10 articles vs your last 10 LinkedIn posts.
27. YouTube or podcast presence
- Pass: at least one YouTube channel or podcast appearance per quarter where your brand is named in transcripts.
- 60-second check: search YouTube for your brand. Per GEORaiser, YouTube overtook Reddit as the #1 AI citation source for some categories in 2026.
28. Listed in 5+ industry directories (G2, Capterra, TrustRadius, Gartner Peer Insights, Product Hunt)
- Pass: claimed and populated profiles with reviews on each.
- 60-second check: search each directory.
29. AI citation tracking installed
- Pass: a tool like Profound, Otterly, or Peec.ai is monitoring your brand citation rate weekly.
- 60-second check: if no one on the team can show this week's citation report, fail.
30. 13-week content refresh cycle in place
- Pass: priority pages have a `dateModified` newer than 90 days.
- 60-second check: sort your blog by last-modified date (or run the sitemap sketch after this list). Anything older than 13 weeks needs review.
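Check #30 scales badly by hand on a large blog. The sketch below reads lastmod dates straight from the sitemap and flags anything older than 90 days; it's stdlib-only Python, assumes a flat sitemap rather than a sitemap index, and the sitemap URL is a placeholder.

```python
# Sketch for check #30: flag sitemap URLs with <lastmod> older than 90 days.
import urllib.request
import xml.etree.ElementTree as ET
from datetime import datetime, timedelta, timezone

sitemap_url = "https://yourdomain.com/sitemap.xml"  # placeholder
ns = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}
cutoff = datetime.now(timezone.utc) - timedelta(days=90)

with urllib.request.urlopen(sitemap_url) as resp:
    root = ET.fromstring(resp.read())

for node in root.findall("sm:url", ns):
    loc = node.findtext("sm:loc", default="", namespaces=ns)
    lastmod = node.findtext("sm:lastmod", default="", namespaces=ns)
    if not lastmod:
        print(f"NO LASTMOD        {loc}")
        continue
    stamp = datetime.fromisoformat(lastmod.replace("Z", "+00:00"))
    if stamp.tzinfo is None:
        stamp = stamp.replace(tzinfo=timezone.utc)
    flag = "STALE (>90 days)" if stamp < cutoff else "fresh"
    print(f"{flag:<18}{loc}")
```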
Which AEO checks have the highest impact-to-effort ratio?
Five checks deliver disproportionate citation lift relative to the work involved. Hit these first:
- Allow AI crawlers in robots.txt (Check #1) -- 5 minutes of work, unlocks everything downstream. The most common silent killer.
- TL;DR + answer-first intro (Checks #7-8) -- 30 minutes per priority page; 90% of top-cited sources answer in the first 100 words.
- FAQPage schema on every FAQ block (Check #14) -- one template change, 60% more likely to feature in AI Overviews per Wellows.
- Wikidata entry (Check #19) -- 90 minutes once; permanent entity grounding.
- Reddit and YouTube distribution (Checks #25, #27) -- ongoing, but Reddit alone drives ~24% of Perplexity citations per 5W's 2026 Index.
A team with no engineering help can land all five in under a week. That alone moves most B2B sites from invisible to citable in ChatGPT and Perplexity within 3-5 publishing cycles.
How long does a full AEO audit take?
A solo marketer can complete the 30-point audit in 60 to 90 minutes if the site is under 500 pages. Breakdown:
- Crawlability (6 checks): 15 min -- robots.txt, sitemap, source view, WAF, llms.txt, status codes
- Extractability (6 checks): 20 min -- spot-check 5 priority URLs against checks 7-12
- Schema (6 checks): 15 min -- run priority templates through Rich Results Test
- Authority (6 checks): 20 min -- Wikidata, author bios, brand consistency spot checks
- Distribution (6 checks): 15 min -- Reddit, LinkedIn, YouTube, directory inventory
Fixing what the audit surfaces is the longer job. Per Fountain City's B2B GEO guide, full implementation typically runs 2-3 months: technical fundamentals in 1-2 weeks, content restructure in 2-4 weeks, and authority/entity work as a rolling effort. AI engines pull new content into citation pools within 3-5 business days, so quick wins compound fast.
What's the first thing to check if my brand isn't being cited by ChatGPT?
Check robots.txt first. It's the #1 silent killer of AI search visibility for B2B sites.
A surprising number of B2B marketing teams inherited a `Disallow: /` rule for GPTBot from a 2023 IP-protection panic, never reverted it, and now wonder why ChatGPT can't see them. Visit `yourdomain.com/robots.txt` right now. If you see `User-agent: GPTBot` followed by `Disallow: /`, that's your answer.
Second check: Wikidata. ChatGPT relies heavily on Wikipedia for entity grounding. No Wikidata entry, no clean entity association, lower citation probability across every prompt.
Third check: are your pages JS-rendered? If your homepage renders client-side via React/Vue with no SSR, AI training crawlers may have indexed an empty shell. View Source and search for a sentence from your H1 -- if it's not in the raw HTML, you have a rendering problem to fix before any other AEO work matters.
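A scripted version of that View Source test, if you want to run it across several pages: fetch the raw HTML with no JavaScript executed and search for the H1 sentence. Both the URL and the sentence below are placeholders.

```python
# Quick check: is the H1 text present in the raw, un-rendered HTML?
import urllib.request

url = "https://yourdomain.com"            # placeholder
sentence = "Your exact H1 sentence here"  # placeholder

req = urllib.request.Request(url, headers={"User-Agent": "aeo-audit-script"})
raw_html = urllib.request.urlopen(req).read().decode("utf-8", errors="replace")
print("PASS" if sentence in raw_html
      else "FAIL: H1 text not in raw HTML -- likely client-side rendering")
```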
Which AEO checks can a marketer do without engineering help?
21 of the 30 checks are doable by a marketer with no developer support. The 9 checks that typically need engineering are flagged below:
| Check | Who's needed |
|---|---|
| #1 robots.txt config | Engineering |
| #2 Sitemap | Engineering |
| #3 SSR rendering | Engineering |
| #4 WAF rules | Engineering |
| #5 llms.txt file | Engineering |
| #6 Status codes | Engineering (audit yes, fix no) |
| #7-12 Content/extractability | Marketer |
| #13-18 Schema | Engineering for install, Marketer for audit |
| #19-24 Authority | Marketer |
| #25-30 Distribution | Marketer |
A marketer can run the full audit solo, then ship a prioritized ticket to engineering for the technical fixes. Most B2B teams find that 60-70% of their citation lift comes from the marketer-doable checks: TL;DRs, question H2s, FAQ schema (via plugin), Wikidata, and Reddit distribution.
| # | Category | Check | 60-Second Method | Pass Criteria |
|---|---|---|---|---|
| 1 | Crawlability | AI crawlers allowed in robots.txt | Visit /robots.txt, search for GPTBot, ClaudeBot, PerplexityBot | No Disallow: / under any AI user-agent |
| 2 | Crawlability | XML sitemap is live and submitted | Visit /sitemap.xml, check Google Search Console | Returns 200, referenced in robots.txt |
| 3 | Crawlability | Content renders without JavaScript | View Source, search for H1 sentence | Found in raw HTML |
| 4 | Crawlability | WAF doesn't block AI bots | Cloudflare > Security > Bots | AI bots set to allow |
| 5 | Crawlability | llms.txt exists at root | Visit /llms.txt | Returns 200 with markdown index |
| 6 | Crawlability | No broken redirects on priority URLs | Run Screaming Frog on top 100 URLs | All return 200, no chains |
| 7 | Extractability | Direct answer in first 50-100 words | Read opening paragraph | Answer stated, no preamble |
| 8 | Extractability | TL;DR / summary block above the fold | Load page, look without scrolling | Visible summary block |
| 9 | Extractability | Question-shaped H2s | Outline the page | 80%+ H2s are questions |
| 10 | Extractability | Sections of 120-180 words | Spot-check section lengths | Most sections in 120-180 band |
| 11 | Extractability | Tables for multi-attribute comparisons | Search the source for `<table>` or markdown pipes | Comparisons use tables |
| 12 | Extractability | Statistics sourced inline with links | Cmd-F for 'studies show' | Zero matches; every stat linked |
| 13 | Schema | Article schema with dates + author | Rich Results Test | Article appears with author, datePublished, dateModified |
| 14 | Schema | FAQPage schema on FAQ blocks | Rich Results Test on FAQ page | FAQPage detected |
| 15 | Schema | Organization schema sitewide | View Source on homepage | @type:Organization with sameAs |
| 16 | Schema | HowTo schema on step-by-step content | Rich Results Test on how-to URL | HowTo detected |
| 17 | Schema | Product schema on product pages | Rich Results Test on /product | Product with offers + ratings |
| 18 | Schema | Schema validates with zero errors | Rich Results Test on priority templates | No errors, no warnings |
| 19 | Authority | Wikidata entry exists | Search wikidata.org for brand | Q-number with populated properties |
| 20 | Authority | Named author with bio + credentials | Click author byline | Real bio page with credentials |
| 21 | Authority | Primary sources cited with hyperlinks | Count outbound citations on 3 articles | 3-5 primary citations per piece |
| 22 | Authority | Brand consistency across databases | Spot-check G2, Crunchbase, LinkedIn | Name, category, year match |
| 23 | Authority | 5+ third-party co-mentions per pillar | Google '"brand" "keyword"' minus own domain | 5+ results |
| 24 | Authority | Recent press / analyst mention (90 days) | News search brand name | 1+ credible mention in 90 days |
| 25 | Distribution | Active in 3+ relevant subreddits | Search reddit.com for brand | Organic, substantive mentions |
| 26 | Distribution | LinkedIn distribution per article | Compare last 10 posts to last 10 articles | 1:1 ratio within 24 hours |
| 27 | Distribution | YouTube or podcast presence quarterly | Search YouTube for brand | 1+ named mention per quarter |
| 28 | Distribution | 5+ industry directory listings | Check G2, Capterra, TrustRadius, Gartner Peer Insights, Product Hunt | Claimed and populated on each |
| 29 | Distribution | AI citation tracking installed | Ask team for this week's citation report | Active monitoring tool in use |
| 30 | Distribution | 13-week content refresh cycle | Sort blog by dateModified | Priority pages updated within 90 days |