
Full GEO Audit

Quick facts

Difficulty: Advanced
Time: 1–2 days for a first full audit, ~half-day for a re-audit
Prerequisites: GEO Metrics, Generative Engine Optimization
What this is: A periodic physical for an entire surface: walks the GEO dependency stack in order, surfaces findings with severity, ships a report
Structure: A 6-layer dependency ladder, audited bottom-up: access → render → structure → content → authority → outcome
Scoring: Per-layer severity rubric is load-bearing; any 0–100 composite ships with its method or not at all

1. What a full GEO audit covers

A full GEO audit is a periodic physical for an entire surface — it walks every layer that decides whether AI engines can fetch, parse, ground, and cite your pages. Each layer is its own discipline with its own playbook: citability, crawler-access audit, schema implementation, llms.txt deployment, writing for citation, citation tracking. The audit walks them in dependency order, tags each finding with a severity, and ships a report.

The 6 layers, bottom up: access → render → structure → content → off-site authority → outcome. Order matters — a finding higher up the stack is uninterpretable if the layer beneath it failed (§3).

It sits between two adjacent workstreams:

  • the heartbeat: AI Citation Tracking produces a recurring snapshot of what engines actually do; the audit consumes that snapshot and diagnoses why;
  • the destination: GEO Maturity Model turns the diagnosis into a path forward.

One honest note on framing. “GEO audit” is a generic industry term; the specific 6-layer dependency ladder below is GEO Wiki’s organizing device, not an established standard. Use it because the ordering is load-bearing (§3), not because anyone ratified it.

2. Before you audit — scope, inputs, trigger

Four decisions fix what every later finding means. Get one wrong and the report is uninterpretable — the same decide-before-you-measure discipline the tracking playbook opens with.

Decision | Options | Rule of thumb
--- | --- | ---
Scope | Whole domain / a locale / a subfolder / key templates | Audit one coherent surface; a mixed-scope finding is not actionable
Engine set | Your audience’s actual engines, declared (not “AI”) | The engine set is a reported variable; name it in the report header
Competitor set | None / a named set as an optional overlay | Competitive findings are relative; never mix them with absolute ones. Definition discipline → GEO Metrics
Baseline | First audit / delta against a prior audit or tracking log | Without a baseline you have a snapshot, not a trend; say which you have

When to run. A quarterly cadence, plus trigger events: a site migration, a redesign or SSR/render change, a robots.txt edit, a major content launch, or a known engine model/retrieval update. The cadence catches drift; the triggers catch step-changes.
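
The cadence-plus-trigger rule above can be sketched as a small check. This is a minimal illustration, not a prescribed tool: the trigger labels, the `audit_due` name, and the 91-day stand-in for "quarterly" are all assumptions made for the example.

```python
from datetime import date, timedelta

# Trigger events named in the text; the string labels are illustrative.
TRIGGERS = {
    "site_migration",
    "redesign_or_render_change",
    "robots_txt_edit",
    "major_content_launch",
    "engine_update",
}


def audit_due(last_audit: date, today: date, events: set,
              cadence_days: int = 91) -> tuple:
    """Return (due, reason). Triggers catch step-changes; cadence catches drift."""
    hit = sorted(events & TRIGGERS)
    if hit:
        return True, "trigger: " + ", ".join(hit)
    if today - last_audit >= timedelta(days=cadence_days):
        return True, "cadence"
    return False, "not due"
```

A trigger wins over the calendar: a robots.txt edit one month after the last audit still returns `(True, "trigger: robots_txt_edit")`.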

Inputs on hand before you start: crawl access to production, the live robots.txt, the XML sitemap, a render-check tool (fetch-and-render, not just view-source), and the latest citation-tracking log if one exists. No tracking log is not a blocker — see §4.6.

3. The audit ladder — why order is load-bearing

The core mental model: GEO readiness is a dependency stack, not a checklist. A finding at an upper layer is uninterpretable if the layer beneath it failed.

  • Perfect Schema is worthless if AI crawlers are blocked at the door.
  • Great content is invisible if it does not survive fetch-time render.

So you audit bottom-up — start at the hardest gate — and you report and prioritize by severity, not by layer (§5–§6). The ladder is the page’s spine:

# | Layer | Question it answers | Gate behavior | Governing playbook
--- | --- | --- | --- | ---
1 | Access & crawlability | Can AI fetch you at all? | Fail → stop. Everything above is moot | AI Crawler Access Audit
2 | Render & delivery | Does the content survive the fetch? | Fail → upper layers measure an empty shell | SSR for AI Crawlers · Sitemap & IndexNow
3 | Structure & machine-readability | Can a machine parse what it fetched? | Weak → strong tax on extraction (not a hard block) | Schema Implementation · llms.txt Deployment
4 | Content & trust | Is it citable once read? | Weak → read but not cited | Citability · Writing for AI Citation
5 | Off-site authority | Is the entity corroborated elsewhere? | Thin → under-cited despite clean pages | Brand Mention Tracking
6 | Outcome reconciliation | What do engines actually do? | Does not gate; it closes the loop | AI Citation Tracking

Each rung’s findings stay chunk-friendly and one-screen so the final report is itself extractable.
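
To make the gate behavior concrete, here is a minimal sketch of walking the ladder bottom-up. It assumes that only Layers 1 and 2 act as hard gates, which is one reading of the table above; `LayerResult`, `HARD_GATES`, and `walk_ladder` are hypothetical names, not part of any published tool.

```python
from dataclasses import dataclass


@dataclass
class LayerResult:
    layer: int      # 1..6, position on the ladder
    name: str
    passed: bool


# Assumption: only access (1) and render (2) stop interpretation upward.
HARD_GATES = {1, 2}


def walk_ladder(results):
    """Return (interpretable, blocked_by).

    Layers are visited bottom-up. Once a hard gate fails, the layers
    above it are not returned, because their findings cannot be
    interpreted until the gate is fixed.
    """
    interpretable = []
    for result in sorted(results, key=lambda r: r.layer):
        interpretable.append(result)
        if result.layer in HARD_GATES and not result.passed:
            return interpretable, result
    return interpretable, None
```

A failed Layer 3 does not stop the walk in this sketch, matching the "strong tax, not a hard block" wording.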

4. Running the ladder

Audit the layers in order, 1 → 6. Each layer below uses the same four-row table: the question it answers, the gate signal that decides pass / block-upward, the 2–3 highest-signal checks, and a link to the playbook with the full how-to.

4.1 Layer 1 — Access & crawlability

Question | Can the major AI user-agents fetch your pages at all?
Gate | Blocked or hard rate-limited → stop. Layers 2–6 are moot. This is the hardest gate.
Top checks | (1) robots.txt rules for AI user-agents, including Google-Extended; (2) server/CDN/WAF user-agent blocking or bot-challenge interstitials; (3) soft-blocks: a 200 that serves a challenge page to non-browser clients
For the how | → AI Crawler Access Audit; user-agent reference → AI Crawlers

Authoritative UA list and robots.txt token behavior: Google’s common-crawlers doc. This layer is binary in spirit — you are either reachable or you are a Layer-1 blocker finding.
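
Check (1) can be sketched with Python's standard-library robots.txt parser. This is a minimal illustration under stated assumptions: the sample robots.txt and the `ai_agent_access` helper are made up for the example, and `GPTBot` is included only as an illustrative second token; take the authoritative token list from the crawler documentation, not from this snippet. It evaluates robots.txt rules only and says nothing about WAF blocks or soft-blocks (checks 2 and 3).

```python
from urllib.robotparser import RobotFileParser


def ai_agent_access(robots_txt: str, url: str, agents: list) -> dict:
    """Map each user-agent token to whether robots.txt allows fetching url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {agent: parser.can_fetch(agent, url) for agent in agents}


# Hypothetical robots.txt: blocks one AI token, allows everyone else.
SAMPLE_ROBOTS = """\
User-agent: Google-Extended
Disallow: /

User-agent: *
Allow: /
"""

access = ai_agent_access(
    SAMPLE_ROBOTS,
    "https://example.com/pricing",
    ["Google-Extended", "GPTBot"],
)
blocked = [agent for agent, allowed in access.items() if not allowed]
```

Any token that lands in `blocked` is a Layer-1 blocker finding for that engine, whatever the upper layers look like.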

4.2 Layer 2 — Render & delivery

Question | Does the primary content survive a fetch-time render?
Gate | Primary content needs client-side JS the crawler does not execute → it sees an empty shell.
Top checks | (1) SSR/SSG vs CSR for primary content (test fetched HTML, not the painted DOM); (2) sitemap coverage and lastmod freshness; (3) change-signaling via IndexNow/ping
For the how | → SSR for AI Crawlers · Sitemap & IndexNow

IndexNow protocol reference: indexnow.org/documentation. A page that 200s but renders blank to a non-JS fetch is a Layer-2 finding, not a Layer-4 content problem — classify it here or you will misdiagnose it upward.
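
A rough way to spot the "200 but blank" case is to count the visible words in the raw fetched HTML, with no JavaScript executed. The sketch below uses only the standard library; the 50-word threshold and the function names are arbitrary assumptions for illustration, and a real check would compare against the words expected on that template.

```python
from html.parser import HTMLParser


class _VisibleText(HTMLParser):
    """Collect words outside <script> and <style>, as a non-JS fetch sees them."""

    SKIP = {"script", "style"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.words = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.words.extend(data.split())


def visible_word_count(html: str) -> int:
    parser = _VisibleText()
    parser.feed(html)
    parser.close()
    return len(parser.words)


def looks_like_empty_shell(html: str, min_words: int = 50) -> bool:
    """True when the fetched HTML carries too little text to be the real page."""
    return visible_word_count(html) < min_words
```

Feed it the body returned by a plain HTTP GET, not the DOM copied from a browser, or the check measures the wrong thing.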

4.3 Layer 3 — Structure & machine-readability

Question | Can a machine reliably parse and attribute what it fetched?
Gate | Absent/invalid structured data → degraded entity resolution and answer extraction (a strong tax, not a hard block).
Top checks | (1) Schema.org coverage and validity on key templates; (2) llms.txt presence and accuracy; (3) semantic heading structure and chunk boundaries
For the how | → Schema Implementation · llms.txt Deployment; concept → Schema.org for AI

Validate against Google’s structured-data intro with the Rich Results Test and the schema.org Schema Markup Validator; llms.txt against the proposed spec. Note Google states AI features need no special schema — so weight structured data for entity clarity, not as an AI-eligibility lever (AI features and your site).
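
Before running the validators, a quick pre-check can confirm that a template ships JSON-LD at all and that each block parses. The sketch below is a syntax-and-presence check only, not a Schema.org validity check; the helper names are hypothetical, and it deliberately ignores `@graph` nesting to stay short.

```python
import json
from html.parser import HTMLParser


class _JsonLdCollector(HTMLParser):
    """Collect the raw text of every <script type="application/ld+json"> block."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buf = []
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        script_type = (dict(attrs).get("type") or "").lower()
        if tag == "script" and script_type == "application/ld+json":
            self._in_jsonld = True
            self._buf = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buf.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self.blocks.append("".join(self._buf))
            self._in_jsonld = False


def audit_jsonld(html: str) -> list:
    """Return one record per JSON-LD block: parse status and top-level @type values."""
    collector = _JsonLdCollector()
    collector.feed(html)
    collector.close()
    report = []
    for raw in collector.blocks:
        try:
            data = json.loads(raw)
        except json.JSONDecodeError as exc:
            report.append({"valid_json": False, "types": [], "error": str(exc)})
            continue
        items = data if isinstance(data, list) else [data]
        types = [i["@type"] for i in items if isinstance(i, dict) and "@type" in i]
        report.append({"valid_json": True, "types": types, "error": None})
    return report
```

An empty report on a key template is itself a Layer-3 finding; a block with `valid_json` false is a stronger one, because the markup exists but cannot be read.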

4.4 Layer 4 — Content & trust

Question | Once read, is the content citable and trustworthy?
Gate | Not chunk-extractable or thin on trust signals → read but not cited.
Top checks | (1) extractable, self-contained claims and chunking (citability); (2) E-E-A-T signals; (3) freshness/decay on time-sensitive pages
For the how | → Citability · Writing for AI Citation; concepts → E-E-A-T · Content Freshness

This is the layer with the most published causal evidence: content-level rewrites (add citations, statistics, quotations) lifted answer visibility by up to ~40% in Aggarwal et al. 2024. Treat that as direction, not a guaranteed number; see the single-actor caveat in §8.

4.5 Layer 5 — Off-site authority

Question | Is the entity corroborated outside your own domain?
Gate | Thin off-site presence → under-cited even with perfect on-page work.
Top checks | (1) brand-mention volume and sentiment in the kinds of sources engines cite; (2) knowledge-graph / entity presence and disambiguation
For the how | → Brand Mention Tracking; concepts → Brand Mentions · Knowledge Graph Presence · Entity Recognition

This is the layer practitioners most often skip and most often need: on-page perfection has a ceiling set by how well the entity is recognized and corroborated elsewhere.

4.6 Layer 6 — Outcome reconciliation

Question | Do engines actually cite you, and does that match Layers 1–5?
Gate | Does not gate upward. It closes the loop: “should be cited” (1–5) vs “is cited” (tracking).
Top checks | (1) latest citation-tracking snapshot per declared engine; (2) gaps where on-page is clean but citation is absent → an off-site or engine-behavior signal
For the how | → AI Citation Tracking (consume the latest snapshot)

The latest tracking snapshot is the input to Layer 6; running the tracking itself is the AI Citation Tracking playbook. If no tracking log exists, Layer 6 yields exactly one finding: “stand up tracking first.” Per-engine behavior differs — reconcile against Perplexity AI, ChatGPT Search, and Google AI Overviews; the last has no per-citation API, so its outcome layer is necessarily coarser.
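
The reconciliation rule can be sketched as a small function. This is a minimal illustration under assumptions: the snapshot is reduced to one boolean per declared engine ("cited at all in the latest log"), which is coarser than a real tracking log, and `reconcile` is a hypothetical name.

```python
from typing import Optional


def reconcile(layers_1_to_5_clean: bool, snapshot: Optional[dict]) -> list:
    """Compare 'should be cited' (Layers 1-5) with 'is cited' (tracking snapshot).

    snapshot maps a declared engine name to whether it cited the surface.
    A missing or empty snapshot yields exactly one finding.
    """
    if not snapshot:
        return ["stand up tracking first"]
    findings = []
    for engine, cited in sorted(snapshot.items()):
        if layers_1_to_5_clean and not cited:
            findings.append(
                f"{engine}: clean on-page but not cited -> "
                "off-site or engine-behavior signal"
            )
    return findings
```

An empty list with a clean stack means the outcome matches the diagnosis for every declared engine; it does not mean Layer 6 was skipped.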

5. Scoring — a defensible severity model

Every finding gets a severity, and severity is anchored to the layer it sits on. A Layer-1 access failure outranks any amount of Layer-4 polish, because the polish cannot be observed until access is fixed.

Severity | Meaning | Typical layer anchor
--- | --- | ---
Blocker | Suppresses citation outright; upper layers unmeasurable | Layer 1, Layer 2
Major | Materially depresses extraction/citation; measurable now | Layer 3, Layer 4
Minor | Real but bounded; optimization, not a gate | Layer 4 polish, Layer 5 long-tail

A composite 0–100 readiness score is optional and explicitly demoted. It is directional, not absolute — it ships only with its weighting method attached, the same provenance discipline GEO Metrics enforces for every reported number. A bare score with no method is a rumour, not a result.
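
One way to enforce "method attached or not at all" is to make the score impossible to construct without its weights. The sketch below assumes a plain weighted mean of per-layer 0–100 scores; the weighting scheme and the `CompositeScore` name are illustrative assumptions, not a recommended formula.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CompositeScore:
    layer_scores: dict   # layer number -> score in 0..100
    weights: dict        # layer number -> weight; this is the "method"

    def __post_init__(self):
        if not self.weights:
            raise ValueError("composite score requires its weighting method")
        if set(self.weights) != set(self.layer_scores):
            raise ValueError("weights and layer scores must cover the same layers")
        if sum(self.weights.values()) <= 0:
            raise ValueError("weights must sum to a positive number")

    @property
    def value(self) -> float:
        """Weighted mean of the layer scores."""
        total = sum(self.weights.values())
        return sum(self.layer_scores[k] * w for k, w in self.weights.items()) / total

    def report_line(self) -> str:
        """The only way the score is printed: always with its weights."""
        method = ", ".join(f"L{k}={w}" for k, w in sorted(self.weights.items()))
        return f"readiness {self.value:.0f}/100 (directional; weights: {method})"
```

Because `report_line` is the only formatter, a bare number never reaches the report by accident.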

6. Prioritize — from finding list to ranked action plan

Severity is not priority. A Major finding that is cheap and certain to fix outranks a Blocker that needs a six-month migration. Re-rank the severity-tagged findings by impact × confidence × ease (ICE) so the report ends with a sequenced action list, not a pile.

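Finding | Layer | Severity | Impact | Confidence | Ease | ICE | Rank
--- | --- | --- | --- | --- | --- | --- | ---
Google-Extended disallowed in robots.txt | 1 | Blocker | 9 | 9 | 9 | 729 | 1
Key templates ship no Schema | 3 | Major | 7 | 8 | 6 | 336 | 2
Pricing page is CSR-only | 2 | Blocker | 8 | 7 | 3 | 168 | 3
Thin author/E-E-A-T signals | 4 | Major | 6 | 5 | 5 | 150 | 4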

The ranked plan is the input to a roadmap, not the roadmap itself: where you are is this audit; the path forward is the GEO Maturity Model.
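
The ICE arithmetic in this section's table is a straight product, so the ranking is easy to reproduce. The sketch below reuses the four example findings and their scores from the table; the `Finding` class and `rank` helper are hypothetical names for illustration.

```python
from dataclasses import dataclass


@dataclass
class Finding:
    title: str
    layer: int
    severity: str
    impact: int       # each factor scored 1..10
    confidence: int
    ease: int

    @property
    def ice(self) -> int:
        """Impact x confidence x ease."""
        return self.impact * self.confidence * self.ease


FINDINGS = [
    Finding("Thin author/E-E-A-T signals", 4, "Major", 6, 5, 5),
    Finding("Pricing page is CSR-only", 2, "Blocker", 8, 7, 3),
    Finding("Google-Extended disallowed in robots.txt", 1, "Blocker", 9, 9, 9),
    Finding("Key templates ship no Schema", 3, "Major", 7, 8, 6),
]


def rank(findings):
    """Order findings by ICE, highest first; severity is kept but does not sort."""
    return sorted(findings, key=lambda f: f.ice, reverse=True)


for position, finding in enumerate(rank(FINDINGS), start=1):
    print(position, finding.ice, finding.severity, finding.title)
```

Note how the CSR-only pricing page, a Blocker, ranks below a Major finding: its low ease score (3) drags the product down, which is the "severity is not priority" point in practice.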

7. The audit report deliverable

What ships, every time:

  • Header / provenance — audit date, scope, declared engine set, and the tracking-log version consumed (§4.6). A report without provenance is not comparable to the next one.
  • Ladder results, bottom-up — Layer 1 → 6, each pass/fail with evidence.
  • Severity-ranked findings, then the ICE-prioritized action plan.
  • Baseline delta — if a prior audit exists, what moved and whether it moved because you changed something or the engine did.

Re-audit discipline. A re-audit trusts unchanged lower layers and re-runs what the trigger touched (§2). State explicitly what you re-ran versus carried forward — an unstated carry-forward is how a stale “pass” survives three audits.
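
One way to keep carry-forwards visible is to store, per layer, the date the evidence was actually gathered and print it next to the verdict. This is a minimal sketch; `LayerStatus`, `layer_lines`, and the output wording are assumptions for illustration.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class LayerStatus:
    layer: int
    passed: bool
    checked_on: date   # date the evidence was actually gathered


def layer_lines(statuses, audit_date: date) -> list:
    """Render one report line per layer, stating re-run versus carried forward."""
    lines = []
    for status in sorted(statuses, key=lambda s: s.layer):
        verdict = "pass" if status.passed else "fail"
        if status.checked_on == audit_date:
            origin = "re-run"
        else:
            origin = f"carried forward from {status.checked_on.isoformat()}"
        lines.append(f"Layer {status.layer}: {verdict} ({origin})")
    return lines
```

A "pass" that shows the same carried-forward date across several reports is the stale result the paragraph above warns about.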

8. Validity threats & pitfalls

Do not ship a report without clearing every item. Each line below is one checklist item; the linked source has the full treatment.

  • Top-down auditing — checking content first and missing a Layer-1/2 block beneath it; the most common and most expensive error.
  • Bare composite score — a 0–100 with no weighting method (GEO Metrics).
  • Stale tracking log treated as current — Layer 6 reconciled against last quarter’s reality (AI Citation Tracking).
  • Mixing competitor-relative and absolute findings — they are different constructs; never sum them (GEO Metrics).
  • One-locale audit generalized to all — Chinese and English answer surfaces are not interchangeable.
  • “Fixed it” with no re-audit delta — a change is not a result until a re-audit shows it.
  • Over-claiming visibility → revenue — the audit proves a visibility gap closed, not that revenue moved; the business bridge is a separate model (GEO ROI Models).
  • Over-extrapolating a single-actor lift — a gain measured alone may not survive once competitors optimize against the same engine; see the caveat in Aggarwal et al. 2024 §6.

9. Further reading

References

Academic:

  1. Aggarwal, P. et al. (2024). GEO: Generative Engine Optimization. KDD '24. arXiv:2311.09735 · ACM DL
  2. Liu, N., Zhang, T., Liang, P. (2023). Evaluating Verifiability in Generative Search Engines. Findings of EMNLP '23. arXiv:2304.09848

Platform & standards documentation (verified 2026-05):

Frequently asked questions

Isn't this just an SEO audit with AI keywords swapped in?
No, and the difference is structural, not cosmetic. A traditional SEO audit asks 'will this page rank in a list of links?'. A GEO audit asks 'will this page be pulled into a synthesized answer, and cited when it is?' — a different object with different failure modes. Ranked-link assumptions (position, CTR curves) do not transfer. See Generative Engine Optimization for why the metric changed, not just the tactics.
Why audit bottom-up instead of starting with the content, which is what clients care about?
Because GEO readiness is a dependency stack, not a checklist. A brilliant Layer-4 content finding is uninterpretable if Layer 1 blocks the AI crawler — you would be grading a page nothing can fetch. Audit access first, content later; report by severity, not by the order you happened to look.
Do I need citation tracking running before I can do the audit?
Layer 6 consumes the latest tracking snapshot — running the tracking itself is the AI Citation Tracking playbook. If no tracking exists, that is not a blocker; it is the audit's first finding: 'stand up tracking'. An audit with no outcome layer still diagnoses Layers 1–5; it just cannot yet reconcile 'should be cited' against 'is cited'.
Can I just report the composite 0–100 GEO score a tool gave me?
Only with its method attached. A bare score is a rumour — two tools compute it differently and neither is wrong, just incomparable. The load-bearing output of this audit is the per-layer severity rubric; the composite is an optional, directional summary. For definitions and provenance discipline, see GEO Metrics.
How often should a full audit run?
Quarterly as a cadence, plus trigger events: site migration, an SSR/render change, a robots.txt edit, a major content launch, or a known engine model/retrieval update. A re-audit trusts unchanged lower layers and re-runs what the trigger touched. 'We fixed it' without a re-audit delta is a claim, not a result.

Sources

Primary

  1. GEO: Generative Engine Optimization (Aggarwal et al., KDD 2024) · arXiv / KDD '24 · 2024-08-25
  2. GEO: Generative Engine Optimization (KDD '24 Proceedings) · ACM SIGKDD · 2024-08-25
  3. Overview of Google crawlers and fetchers (user agents) · Google Search Central
  4. List of Google's common crawlers (Googlebot, Google-Extended) · Google Search Central · 2026-04-23
  5. Google Search Central — AI features and your site · Google · 2025-12-10
  6. Introduction to structured data markup in Google Search · Google Search Central · 2025-12-10
  7. Rich Results Test · Google
  8. Schema Markup Validator · Schema.org
  9. IndexNow — Protocol Documentation · IndexNow.org
  10. The /llms.txt file — proposed standard · Answer.AI (Jeremy Howard) · 2024-09-03
  11. Perplexity API — Chat Completions Reference · Perplexity
  12. OpenAI — Web Search tool (Responses API) · OpenAI

Secondary

  1. Evaluating Verifiability in Generative Search Engines (Liu et al. 2023) · arXiv / EMNLP '23 Findings
Last updated: 2026-05-19
Authors: Ray Yang
Topic: Practice