GEO Metrics
Quick facts
- Number of core metrics: 10 (GEO Wiki synthesis, not an industry standard)
- Academic reference: Aggarwal et al. 2024 (defines 3 metrics, ≠ these 10)
- Most transparent vendor (formulas): Otterly.ai
- Most contested metric: Average Position (3 competing definitions)
- Vendors covered in this entry: Profound · Otterly · Ahrefs · BrightEdge · Similarweb
1. Why GEO needs a new set of metrics
Traditional SEO KPIs — CTR, average SERP position, impressions — are built on one assumption: users see a ranked list of links and decide whether to click. Generative answers break that assumption.
The new setting has three defining features:
- Multi-source aggregation: a single answer can cite 3–10 sources, so “what rank am I” loses meaning
- Zero-click: users get the answer without visiting your page, so there is often no click for CTR to measure
- Mention vs. citation decoupling: a brand can appear in an answer (mention) without any link (citation)
The job of measurement shifts — no longer “did the user reach my page” but “is the AI using my content to compose its answer”. See GEO concept entry and Zero-click Search for the broader context.
What you measure determines where you invest. The metric system is the foundation of any GEO strategy — see GEO ROI Models.
2. The ten metrics at a glance
The table below summarises the 10 core GEO KPIs for a 30-second view of the full landscape.
| # | Metric | Measures | Unit | SEO equivalent | Where vendors use it |
|---|---|---|---|---|---|
| 3.1 | Visibility Score | Headline appearance rate | % or 0–100 index | ≈ search-visibility index | Profound, Otterly, Semrush (headline KPI) |
| 3.2 | Citation Rate | Single-topic citation visibility | % | ≈ CTR | Otterly, Ahrefs |
| 3.3 | Citation Share | Competitive visibility | % | ≈ Backlink share | Ahrefs, Profound |
| 3.4 | Share of Voice | Combined presence | % | ≈ PR SOV | Otterly, Ahrefs, BrightEdge |
| 3.5 | Average Position | Authority signal | 1.0–N.0 | ≈ SERP position (weak) | Otterly |
| 3.6 | Mention Frequency | Trend tracking | count/month | ≈ Brand search volume | Most vendors |
| 3.7 | Answer Inclusion Rate | Query-level coverage | % | ≈ Keyword coverage | Otterly (Brand Coverage) |
| 3.8 | First-Cite Rate | Authority signal | % | ≈ #1 SERP rate | Some platform APIs |
| 3.9 | Brand Sentiment | Perception quality | −100…+100 | ≈ PR sentiment | Otterly, Profound |
| 3.10 | Source Diversity Score | Platform-risk hedging | count / % | (no equivalent) | Mostly bespoke |
Important note on naming and origin: there is no single authoritative “standard” set of GEO KPIs. This list is GEO Wiki’s synthesis of the vocabulary in active commercial use (Profound, Otterly, Ahrefs, BrightEdge, Similarweb). The academic literature (Aggarwal et al. 2024) defines only three metrics under different names (Word Count, Position-Adjusted Word Count, Subjective Impression) and is not the origin of this 10-metric taxonomy. Other published lists (e.g. Search Engine Land’s “8 GEO metrics”) pick a different set again. This entry prefers commercial names so it lines up with the tools you actually buy.
Visibility Score vs. Citation Rate vs. Answer Inclusion Rate — the three most-confused metrics. Visibility Score (§3.1) is the marketed headline number: “did the brand appear at all” (mention or citation), often a composite. Citation Rate (§3.2) counts only answers with an explicit citation of your domain. Answer Inclusion Rate (§3.7) is the un-marketed, query-level binary version of Visibility. Many vendors’ “Visibility Score” is literally Answer Inclusion Rate repackaged ± position weighting — always check the formula.
3. The ten core metrics in detail
3.1 Visibility Score
Definition: the headline, marketed metric for “is my brand present in AI answers at all”. Most commonly: the fraction of tracked prompts/answers where the brand appears as a mention or a citation. Some vendors report it as a composite index (appearance + position, sometimes + sentiment) rather than a raw percentage.
Formula (the common percentage form):
Visibility Score = prompts_where_brand_appears / total_tracked_prompts × 100%
Composite form (Otterly’s public “Brand Visibility Index”, KPI page):
Brand Visibility Index = 10 + ((5 − avgPosition) / 4) × 90
Unit: % (raw form) or 0–100 index (composite form)
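As a minimal sketch, the two forms above can be written as plain Python functions. The function names and sample numbers are illustrative assumptions, not vendor code; the composite function only transcribes the formula as quoted in this entry.

```python
def visibility_score(prompts_where_brand_appears: int, total_tracked_prompts: int) -> float:
    """Raw percentage form: share of tracked prompts where the brand appears."""
    if total_tracked_prompts <= 0:
        raise ValueError("total_tracked_prompts must be positive")
    return 100 * prompts_where_brand_appears / total_tracked_prompts


def brand_visibility_index(avg_position: float) -> float:
    """Composite form as quoted above: 10 + ((5 - avgPosition) / 4) * 90."""
    return 10 + ((5 - avg_position) / 4) * 90


print(visibility_score(15, 60))       # 25.0
print(brand_visibility_index(1.0))    # 100.0
print(brand_visibility_index(3.0))    # 55.0
print(brand_visibility_index(5.0))    # 10.0
```

Taken literally, the quoted composite returns 100 at an average position of 1 and 10 at position 5, and keeps falling for positions beyond 5 (position 7 gives −35). Whether the vendor clamps the value is not stated in this entry, so treat out-of-range inputs with care.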
Used for: the single number reported to leadership. Answers “across everything we track, how present is the brand in AI answers”.
Vendor variations:
- Profound (Answer Engine Insights) leads with “Visibility score and share of voice metrics” — Visibility Score is its headline KPI; the formula is not public
- Otterly publishes a composite “Brand Visibility Index”, described as built from Brand Coverage and Avg. Position; note that the formula as quoted above contains only the position term
- Semrush / Quattr / others ship a proprietary “AI Visibility Score” that normalises several factors; there is no universal formula (industry overview)
SEO equivalent: ≈ the single “search visibility” / visibility-index headline number that SEO suites (Sistrix, Semrush) report.
Key distinction: Visibility Score is mention-inclusive (appearance at all); Citation Rate (§3.2) requires an explicit citation. They are not interchangeable. A “Visibility Score” that is actually an appearance rate is the same construct as Answer Inclusion Rate (§3.7) under a marketing name. Always read the vendor’s formula before comparing.
Pitfalls:
- No standardized formula — raw % vs. composite index vs. position-weighted versions are not comparable
- Composite indices (Otterly’s) collapse their inputs into one number — decompose before acting on it
3.2 Citation Rate
Definition: the fraction of AI answers about a topic that cite your domain (explicit source attribution).
Formula:
Citation Rate = cited_answers / total_answers_about_topic × 100%
Unit: % (typically 0%–30%; over 30% usually means the topic is too narrow)
Used for: single-brand single-topic citation visibility baseline. Answers “what fraction of AI answers about X cite me”.
Vendor variations:
- Otterly measures this as “Domain Coverage”: prompts that cite my domain / all prompts in selected time window (KPI definitions page). Note Otterly’s separately-named “Domain Citation” is not this rate; it is an absolute count (see §3.6)
- Ahrefs Brand Radar defines “Citations” verbatim as “the AI results that cite the entity at least once as a source” (see Brand Radar help), which matches the Citation Rate definition above
- Profound’s headline “Visibility Score” is not this metric — it is mention-inclusive (see §3.1)
SEO equivalent: ≈ CTR, but the denominator is “answers about a topic” rather than “search impressions” — the analogy is rough.
Pitfalls:
- The denominator (topic / query set) selection is highly sensitive — the same brand can show 5–10× different Citation Rates on different query sets
- Sample source and time window must be public, otherwise the number is not reproducible
For the operational playbook see AI Citation Tracking; for the citation/mention distinction see Citation vs Mention.
3.3 Citation Share
Definition: your share of all citations vs. a fixed competitor set.
Formula:
Citation Share = your_citations / total_citations_in_competitor_set × 100%
Unit: %
Used for: competitive visibility. Answers “within this niche, what share of AI citations am I capturing”.
Vendor variations:
- Commercial naming is inconsistent — Ahrefs’ “AI Share of Voice” overlaps conceptually but uses a different weighting (see §3.4)
- Profound bundles this into its “Share of Voice” without separate Citation Share
SEO equivalent: ≈ backlink share (your share of backlinks within a keyword set) — both are relative-share measures.
Key distinction from Citation Rate: CR’s denominator is “all answers” (absolute visibility); CS’s denominator is “all citations” (relative competitive position). Citation Rate going up does not imply Citation Share going up — the entire category may be expanding.
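The distinction can be made concrete with a small worked example. The snapshot numbers below are invented for illustration; the two functions just restate the formulas from §3.2 and §3.3.

```python
def citation_rate(cited_answers: int, total_answers: int) -> float:
    """Absolute visibility: share of answers about the topic that cite you."""
    return 100 * cited_answers / total_answers


def citation_share(your_citations: int, total_citations_in_set: int) -> float:
    """Relative position: your share of all citations in the competitor set."""
    return 100 * your_citations / total_citations_in_set


# Hypothetical before/after snapshot: the category's total citations double.
before = {"cited": 20, "answers": 200, "mine": 20, "all": 100}
after = {"cited": 30, "answers": 200, "mine": 30, "all": 200}

for label, s in (("before", before), ("after", after)):
    print(label,
          citation_rate(s["cited"], s["answers"]),
          citation_share(s["mine"], s["all"]))
# before 10.0 20.0
# after 15.0 15.0
```

In this made-up case Citation Rate rises from 10% to 15% while Citation Share falls from 20% to 15%, because competitors gained citations faster.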
3.4 Share of Voice (SOV)
Definition: your share of brand appearances across the AI answers in a topic / competitor set, counting both citations and unlinked mentions.
Formula (Otterly’s public version, KPI definitions page):
SOV = number of my brand mentions / total number of all brand mentions × 100%
Unit: %
Used for: combined brand presence + mindshare measurement. Answers “when AI is talking about this category, what fraction of those conversations include me”.
Vendor variations (the most important section in this entry):
| Vendor | Name | Weighting | Public formula? |
|---|---|---|---|
| Otterly | Share of Voice | Raw mention count | ✓ Full formula |
| Ahrefs Brand Radar | AI Share of Voice | Weighted by Google search volume (impressions) | ✓ See Brand Radar methodology |
| Profound | Share of Voice | Not disclosed | ✗ Marketing copy only |
| BrightEdge | Share of Voice | Not disclosed (extends their SEO SOV patent to AI) | ✗ See SOV in 2026 |
| Similarweb | Brand Mention Share | Sample size and formula not public | ✗ See GenAI Intelligence |
Ahrefs’ impression weighting is a meaningful methodological choice: their SOV reflects potential exposure, not raw mention count. The same mention is weighted higher in a high-search-volume topic. This arguably makes it a closer proxy for the commercial value of AI visibility than a raw count.
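To see how much weighting can move the number, here is a rough sketch comparing a raw mention-count SOV with a volume-weighted one. The weighting scheme (multiply each topic's mentions by its search volume) is an assumption made for illustration and is not Ahrefs' published method; the topics and counts are invented.

```python
# Hypothetical sample: (topic, monthly search volume, my mentions, all brand mentions).
topics = [
    ("crm pricing", 10_000, 1, 10),
    ("crm for nonprofits", 500, 9, 10),
]


def raw_sov(rows):
    """Raw form: my mentions / all brand mentions, as a percentage."""
    mine = sum(my for _, _, my, _ in rows)
    total = sum(all_ for _, _, _, all_ in rows)
    return 100 * mine / total


def volume_weighted_sov(rows):
    """Illustrative weighting: each topic's mentions scaled by its search volume."""
    mine = sum(volume * my for _, volume, my, _ in rows)
    total = sum(volume * all_ for _, volume, _, all_ in rows)
    return 100 * mine / total


print(round(raw_sov(topics), 1))              # 50.0
print(round(volume_weighted_sov(topics), 1))  # 13.8
```

The same underlying mentions give 50% unweighted and roughly 14% weighted, which is why two vendors' “Share of Voice” figures cannot be compared without knowing the weighting.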
SEO equivalent: ≈ PR-industry Share of Voice, narrowed from all media to AI answers.
Key distinction from Citation Share: SOV counts mentions (including unlinked); CS counts only citations (with explicit attribution). See Citation vs Mention and Brand Mentions.
3.5 Average Position ⚠️ The most contested metric
This is the single most ambiguous KPI in the GEO toolkit. Cross-vendor numbers cannot be compared without further qualification. This section is the highest-value section in this entry.
Three competing definitions in active use:
| Tag | Definition | Data source | Where you’ll see it |
|---|---|---|---|
| A. Citation Order | The ranked position in the cited-sources list (e.g. Perplexity’s [1][2][3]) | Platform API’s citations array order | Citation-transparent engines (Perplexity, Metaso) |
| B. Mention Order | The order in which the brand appears in the answer text | NLP parsing of answer text | Tools that work from response-text scraping (e.g. Otterly) |
| C. List Position | Rank within a list-style response (“top 5…”) | Text parsing + list detection | ChatGPT list-style answers |
GEO Wiki recommendation: default to A. Citation Order because (1) it has the most stable semantics (matches platform-returned structure); (2) it is most cross-platform comparable (most major engines expose citation order); (3) it is the basis for First-Cite Rate (§3.8).
But: every use must explicitly state which definition (A/B/C). Otherwise cross-tool numbers are not comparable.
Formula (under recommended definition A):
Average Position = mean(citation_rank) for answers where the brand appears as a cited source
Unit: 1.0–N.0 (lower is better; N is typically 8–10)
Used for: authority comparison (the difference between rank #1 and rank #3 carries meaning).
Perplexity API ground truth: per Perplexity’s Chat Completions API, the citations field returns an array of URLs, and inline [1][2] annotations correspond to that array’s order. Whether that order is “ranked by relevance” is not explicitly stated in public docs — verify before using as a quality signal.
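The correspondence between inline markers and the citations array can be sketched as follows. The response dictionary is a hand-written sample, and its shape (a top-level `citations` list plus `choices[0].message.content`) is an assumption to check against the current API reference; no network call is made.

```python
import re

# Hypothetical response payload, shaped as assumed in the lead-in.
sample_response = {
    "citations": [
        "https://acme.example/guide",
        "https://globex.example/pricing",
        "https://initech.example/faq",
    ],
    "choices": [{"message": {"content":
        "Acme documents the setup [1]. Pricing varies [2][3], "
        "and the setup guide is cited again [1]."}}],
}


def inline_citation_urls(response):
    """Resolve each inline [n] marker to the n-th URL of the citations array."""
    text = response["choices"][0]["message"]["content"]
    citations = response["citations"]
    urls = []
    for marker in re.findall(r"\[(\d+)\]", text):
        index = int(marker) - 1  # markers are 1-based, the array is 0-based
        if 0 <= index < len(citations):
            urls.append(citations[index])
    return urls


for url in inline_citation_urls(sample_response):
    print(url)
```

This only recovers array order. It says nothing about whether that order reflects relevance, which is the open question noted above.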
Vendor evidence:
- Otterly’s “Avg. Brand Position” formula: sum of positions of the brand mentions across prompts / number of prompts where the brand appeared. Their docs do not state explicitly whether “position” means A, B, or C, but given their data source (response-text sampling), it is closer to B
- Profound / Ahrefs / BrightEdge do not publish a precise definition of Average Position
SEO equivalent: ≈ Average SERP position, but the GEO version is a much weaker signal — AI answers are not paginated rankings.
Pitfall: a “position 1” measured under definition A and a “position 1” measured under definition B refer to entirely different things; mixing them produces nonsense comparisons.
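A small sketch shows how definitions A and B can disagree on the same answers. The brands, domains, and answer texts are invented, and the mention detection is a naive case-insensitive substring match chosen for brevity; real tools would need proper entity matching.

```python
from statistics import mean
from urllib.parse import urlparse


def citation_rank(citations, domain):
    """Definition A: 1-based position of the first cited URL on `domain`, else None."""
    for rank, url in enumerate(citations, start=1):
        host = urlparse(url).hostname or ""
        if host == domain or host.endswith("." + domain):
            return rank
    return None


def mention_rank(text, brand, all_brands):
    """Definition B: 1-based rank of `brand` by first appearance in the answer text."""
    lowered = text.lower()
    found = sorted((lowered.find(b.lower()), b) for b in all_brands if b.lower() in lowered)
    order = [b for _, b in found]
    return order.index(brand) + 1 if brand in order else None


brands = ["Acme", "Globex", "Initech"]
answers = [  # hypothetical sampled answers
    {"text": "Acme and Globex both offer this; Initech is cheaper.",
     "citations": ["https://globex.example/a", "https://initech.example/b",
                   "https://www.acme.example/c"]},
    {"text": "Globex leads, followed by Acme.",
     "citations": ["https://acme.example/x", "https://globex.example/y"]},
]

a_ranks = [r for r in (citation_rank(a["citations"], "acme.example") for a in answers) if r]
b_ranks = [r for r in (mention_rank(a["text"], "Acme", brands) for a in answers) if r]

print("Definition A:", float(mean(a_ranks)))  # Definition A: 2.0
print("Definition B:", float(mean(b_ranks)))  # Definition B: 1.5
```

The same two answers yield an Average Position of 2.0 under definition A and 1.5 under definition B, so a reported figure is only interpretable alongside its definition tag.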
For the term-level definition (cross-link), see GEO Glossary.
3.6 Mention Frequency
Definition: total count of brand appearances in the AI answer sample over a time window (absolute, not normalised).
Formula:
Mention Frequency = count(mentions) / time_window
Unit: mentions per month (typical)
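A minimal sketch of the monthly form, assuming you already have one dated record per brand mention found in sampled answers. The dates are invented and the helper name is hypothetical.

```python
from collections import Counter
from datetime import date

# Hypothetical log: one entry per brand mention found in sampled AI answers.
mention_dates = [
    date(2026, 3, 2), date(2026, 3, 9), date(2026, 3, 30),
    date(2026, 4, 1), date(2026, 4, 14),
    date(2026, 5, 5), date(2026, 5, 6), date(2026, 5, 20), date(2026, 5, 28),
]


def mentions_per_month(dates):
    """Count mentions per calendar month, keyed as YYYY-MM and sorted by month."""
    counts = Counter(d.strftime("%Y-%m") for d in dates)
    return dict(sorted(counts.items()))


print(mentions_per_month(mention_dates))
# {'2026-03': 3, '2026-04': 2, '2026-05': 4}
```

The counts are only comparable month to month if the sampled query set and sampling frequency stay fixed.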
Used for: early-stage existence checks, plus long-term trend monitoring.
Vendor variations: nearly every GEO tool exposes this metric (under various names); the variance is in sampling — which queries are sampled, how often, which engines.
SEO equivalent: ≈ brand search volume (but the source differs — search box vs. AI answer).
Pitfalls:
- Absolute values are not comparable across brands — large brands accumulate mentions naturally
- Easily contaminated by query-set bias — 100 carefully chosen queries don’t represent a domain
Academic note: Aggarwal’s “Word Count” metric is the closest academic analogue but measures normalised word count of cited sentences rather than raw mentions.
3.7 Answer Inclusion Rate
Definition: the fraction of queries in a target query set where the AI answer mentions or cites your brand at least once.
Formula:
AIR = queries_where_brand_appears / total_queries × 100%
Unit: %
Used for: query-level coverage — a finer-grained view than SOV. Answers “across the N queries I care about, how many produce an AI answer that includes my brand”.
Vendor variations:
- Otterly calls this “Brand Coverage”: prompts that mention my brand / all prompts in selected time window (KPI page)
- Most other vendors don’t expose this separately; it is bundled into SOV
SEO equivalent: ≈ keyword coverage rate (number of keywords ranking in top N over total tracked keywords).
Key distinction from SOV: AIR is a binary “does it appear” measure; SOV is a “what’s the share when it appears” measure.
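The difference is easiest to see by computing both on one sample. The queries, brands, and counts below are invented; the SOV function follows the raw mention-count form from §3.4.

```python
from collections import Counter

# Hypothetical sample: brand mention counts per tracked query.
mentions_per_query = {
    "best crm for startups": Counter({"Acme": 1, "Globex": 3}),
    "crm with free tier": Counter({"Globex": 4}),
    "acme vs globex": Counter({"Acme": 4, "Globex": 2}),
    "crm for nonprofits": Counter({"Initech": 6}),
}


def answer_inclusion_rate(sample, brand):
    """Binary per query: does the brand appear at least once?"""
    hits = sum(1 for counts in sample.values() if counts[brand] > 0)
    return 100 * hits / len(sample)


def share_of_voice(sample, brand):
    """Share of all brand mentions across the sample (raw count form)."""
    total = sum(sum(counts.values()) for counts in sample.values())
    mine = sum(counts[brand] for counts in sample.values())
    return 100 * mine / total


print(answer_inclusion_rate(mentions_per_query, "Acme"))  # 50.0
print(share_of_voice(mentions_per_query, "Acme"))         # 25.0
```

Here the brand appears in half of the tracked queries but holds a quarter of all mentions, so the two numbers answer different questions.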
3.8 First-Cite Rate
Definition: among answers that cite you, the fraction in which you are the first-cited source.
Formula:
First-Cite Rate = first_cited_answers / cited_answers × 100%
Unit: %
Used for: authority signal — being first-cited implies the AI prefers your source.
Vendor variations:
- This metric requires the platform to expose citation order — Perplexity API exposes it directly (see §3.5); ChatGPT and Claude only expose order in some response modes
- Few commercial vendors expose this as a standalone KPI; it’s usually folded into Average Position (definition A)
SEO equivalent: ≈ #1 SERP rank rate, but in the AI-answer context the signal is weaker — the gap between rank 1 and rank 2 is smaller than in traditional search.
Pitfall: when the underlying citation count is small, this is noisy — “20% First-Cite Rate” jumping to 40% week-over-week may be statistically meaningless if you were only cited 5 times.
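One way to make the small-sample caveat visible is to attach an interval to the rate. The sketch below uses the Wilson score interval, a standard choice for proportions that this entry's sources do not prescribe; the 95% level (z = 1.96) is likewise an assumption.

```python
import math


def first_cite_rate(first_cited_answers: int, cited_answers: int) -> float:
    """First-Cite Rate as a percentage of answers that cite you."""
    return 100 * first_cited_answers / cited_answers


def wilson_interval(successes: int, n: int, z: float = 1.96):
    """Wilson score interval for a proportion, returned as percentages."""
    p = successes / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return 100 * (centre - half), 100 * (centre + half)


# Week 1: first-cited in 1 of 5 citing answers; week 2: 2 of 5.
for first, cited in ((1, 5), (2, 5)):
    low, high = wilson_interval(first, cited)
    print(f"{first_cite_rate(first, cited):.0f}% (interval {low:.0f}% to {high:.0f}%)")
```

With only five citing answers per week the two intervals overlap almost entirely, so the jump from 20% to 40% is not evidence of a real change.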
3.9 Brand Sentiment
Definition: the net emotional tone of how AI engines describe your brand when they mention it — typically the balance of positive vs. negative references.
Formula (Otterly’s public version, KPI definitions page):
Brand Sentiment = (positive_mentions − negative_mentions) / total_mentions × 100
Unit: −100 … +100 (or a label: positive / neutral / negative)
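A short sketch of the formula, applied to two invented label sets to show how prompt framing can flip the sign. The sentiment labels are assumed to come from whatever classifier or vendor export you use; that step is not shown.

```python
from collections import Counter


def brand_sentiment(labels):
    """(positive - negative) / total * 100 over per-mention sentiment labels."""
    if not labels:
        raise ValueError("no mentions to score")
    counts = Counter(labels)
    return 100 * (counts["positive"] - counts["negative"]) / len(labels)


# Hypothetical labels for the same brand under two prompt framings.
neutral_framing = ["positive"] * 6 + ["neutral"] * 3 + ["negative"] * 1
problem_framing = ["positive"] * 1 + ["neutral"] * 3 + ["negative"] * 6

print(brand_sentiment(neutral_framing))  # 50.0
print(brand_sentiment(problem_framing))  # -50.0
```

Neutral mentions count in the denominator only, so a brand with mostly neutral coverage scores near zero regardless of volume.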
Used for: perception quality, not just presence. A brand can have high Visibility but negative Sentiment (“X is overpriced”) — Sentiment catches what volume metrics miss.
Vendor variations:
- Otterly exposes “Brand Sentiment” with the formula above
- Profound (Answer Engine Insights) ships “Sentiment & Keyword Insights” (how AI describes the brand); the formula is not public
- Sentiment is model-generated, so it varies by engine and by how the prompt is framed — declare both
SEO equivalent: ≈ PR / social-listening brand sentiment, narrowed from all media to AI answers.
Pitfalls:
- Highly sensitive to prompt framing — a neutral query and a “problems with X” query produce opposite sentiment on the same brand
- Small mention counts make sentiment swings statistically meaningless (same caveat as First-Cite Rate)
3.10 Source Diversity Score
Definition: how many distinct AI engines have cited your content, vs. the total number of engines you’re tracking.
Formula:
Source Diversity Score = distinct_engines_citing_you / total_engines_tested × 100%
Unit: % or absolute count (depending on tracking scope)
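Both reporting forms (absolute count and percentage) can be computed in one pass. The engine list and citation counts below are invented; the function name is hypothetical.

```python
def source_diversity(citations_by_engine: dict) -> tuple:
    """Return (engines citing you, percentage of tracked engines citing you)."""
    if not citations_by_engine:
        raise ValueError("declare at least one tracked engine")
    engines_citing = sum(1 for n in citations_by_engine.values() if n > 0)
    return engines_citing, 100 * engines_citing / len(citations_by_engine)


# Hypothetical citation counts per tracked engine over one window.
tracked = {
    "ChatGPT": 0,
    "Perplexity": 14,
    "Claude": 0,
    "Gemini": 2,
    "Copilot": 1,
    "Google AIO": 0,
}

print(source_diversity(tracked))  # (3, 50.0)
```

Because the denominator is the tracked engine set, the same brand scores differently under a 6-engine and a 10-engine setup; report the set alongside the score.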
Used for: single-platform risk hedging — avoid being “only popular on Perplexity”.
Vendor variations:
- The set of engines tested differs by vendor: Profound covers 9+ engines (ChatGPT, Perplexity, Claude, Copilot, Google AIO, Gemini, Grok, Amazon Rufus, Meta AI, DeepSeek); Ahrefs, Otterly, and Similarweb each cover 6
- The engine set itself is a variable — must be explicitly declared
SEO equivalent: no direct analogue. Under SEO, Google was dominant and “engine diversity” was not a meaningful concept; under GEO the multi-platform reality makes this metric essential.
For Chinese vs. English engine coverage differences, see Multilingual GEO; for the engines themselves see Generative Engine.
4. Vendor definition matrix
The table below summarises how 5 major Western vendors handle the 10 metrics; the final row adds each vendor’s publicly stated sample size. Note: only Otterly and Ahrefs publish complete formulas; others provide marketing-grade descriptions only.
| Metric | Profound | Otterly | Ahrefs Brand Radar | BrightEdge | Similarweb |
|---|---|---|---|---|---|
| Visibility Score | “Visibility Score” (headline, no formula) | “Brand Visibility Index” + composite formula | — (uses AI SOV) | — | — |
| Citation Rate | — (folded into Visibility) | “Domain Coverage” + full formula | “Citations” + formula | — | — |
| Citation Share | bundled in SOV | bundled in SOV | bundled in AI SOV | — | — |
| Share of Voice | “Share of Voice” (no formula) | “Share of Voice” + full formula | “AI Share of Voice” (impression-weighted) | “Share of Voice” (no formula) | “Brand Mention Share” (no formula) |
| Average Position | — | “Avg. Brand Position” + formula (closest to definition B) | — | — | — |
| Mention Frequency | “Citations” (absolute count) | “Brand Mentions” / “Domain Citation” + formula | “Mentions” + formula | — | “Mention Share” |
| Answer Inclusion Rate | — | “Brand Coverage” + full formula | — | — | — |
| First-Cite Rate | — | implicit in Avg Position | — | — | — |
| Brand Sentiment | “Sentiment & Keyword Insights” (no formula) | “Brand Sentiment” + full formula | — | — | — |
| Source Diversity | 9+ engines covered (implicit) | 6 engines covered (implicit) | 6 engines covered (implicit) | Google AIO–centric | 6 engines covered (implicit) |
| Public sample size | “1.5B prompts” (marketing) | not disclosed | “320M+ prompts/month” | full Google AIO parsing | based on traffic panel |
Key observations:
- Formula transparency tier 1: Otterly (7+ metrics with full formulas, incl. Brand Visibility Index & Brand Sentiment), Ahrefs (4 metrics with formulas + methodology blog)
- Formula transparency tier 2: Profound, BrightEdge, Similarweb — marketing descriptions only, no public formulas
- Methodologically distinct: Similarweb is the only vendor measuring actual referral traffic from AI chatbots (panel-based), while others sample AI responses; Ahrefs is the only vendor with impression weighting; BrightEdge primarily covers Google AI Overviews
Specific tool selection is left to a future tools collection (planned for Stage-3 expansion); this entry stays vendor-neutral.
5. Mapping to traditional SEO metrics
The table below maps the GEO KPIs to their nearest SEO equivalents, which is useful for practitioners with an SEO background. Answer Inclusion Rate appears twice because it partially matches two SEO metrics.
| SEO metric | GEO equivalent | Key difference |
|---|---|---|
| Search-visibility index (Sistrix/Semrush) | Visibility Score | GEO version is appearance-in-answer, often a vendor composite |
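| Click-through Rate (CTR) | Citation Rate | Different denominator: SEO’s is impressions, GEO’s is answer count |
| SERP Average Position | Average Position | GEO has 3 competing definitions; weaker authority signal |
| #1 SERP rank rate | First-Cite Rate | Weaker signal in AI answers; the gap between rank 1 and rank 2 is smaller (§3.8) |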
| Impressions | Answer Inclusion Rate | GEO has no strict notion of impression |
| Share of Voice (PR sense) | Share of Voice | GEO version is narrower and quantifiable (limited to AI answers) |
| Backlinks count | Citation Share | Citation ≠ link, but the role (authority signal) is similar |
| Domain Authority | Source Diversity Score | Loose analogy only (§3.10 finds no direct SEO equivalent); authority inferred from breadth of citing engines |
| Keyword coverage | Answer Inclusion Rate | GEO is query-level coverage (query → answer), finer-grained |
| Brand search volume | Mention Frequency | Different source (search box vs. AI answer) |
| Brand sentiment (PR/social listening) | Brand Sentiment | GEO version is model-generated and prompt-sensitive |
See SEO vs GEO for the full comparison framework.
6. Choosing metrics by GEO maturity stage
Different GEO maturity stages call for different metric sets. The mapping below corresponds to the 5-level GEO Maturity Model.
| Stage | Recommended metrics | Why |
|---|---|---|
| L1 Starting (no baseline) | Visibility Score + Mention Frequency + Source Diversity Score | Prove “AI knows you exist” — qualitative first, no competitor set needed |
| L2 Lifting off (baseline established) | + Citation Rate + Average Position (def. A) | Now you have a baseline — measure improvement over time |
| L3 Accelerating (vs. competitors) | + Citation Share + Share of Voice | Introduce competitor set — absolute numbers no longer enough |
| L4 Optimising (fine-tuning) | + First-Cite Rate + Answer Inclusion Rate + Brand Sentiment | Pursue rank-1 citations, full coverage, and positive framing |
| L5 Leading (industry benchmark) | All 10 + custom composite metrics | Mature stage — you’re the one defining standards |
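If you track the stage mapping in code, the cumulative “+” notation in the table can be expressed as a small lookup. This is a sketch of one possible encoding; the dictionary and function names are hypothetical.

```python
# Metrics added at each stage, transcribed from the table above.
STAGE_ADDITIONS = {
    "L1": ["Visibility Score", "Mention Frequency", "Source Diversity Score"],
    "L2": ["Citation Rate", "Average Position (def. A)"],
    "L3": ["Citation Share", "Share of Voice"],
    "L4": ["First-Cite Rate", "Answer Inclusion Rate", "Brand Sentiment"],
    "L5": [],  # all 10 are already included; custom composites are organisation-specific
}


def metrics_for_stage(stage: str) -> list:
    """Return the cumulative metric list for a stage (each stage adds to the previous)."""
    if stage not in STAGE_ADDITIONS:
        raise KeyError(f"unknown stage: {stage}")
    metrics = []
    for name, additions in STAGE_ADDITIONS.items():
        metrics.extend(additions)
        if name == stage:
            break
    return metrics


print(len(metrics_for_stage("L2")))  # 5
print(len(metrics_for_stage("L4")))  # 10
```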
For the operational steps see the GEO Audit playbook section on metric snapshots.
7. Common pitfalls and confusions
Seven errors to check for before publishing a report or interpreting vendor data:
- “Citation rate” is overloaded in casual usage: sometimes it refers to Citation Rate (absolute), sometimes to Citation Share (relative). Always include the English term explicitly
- Average Position is incomparable across tools without definition tag (A/B/C) — this is the highest-frequency error
- Sample bias — query set choice changes results 5–10×; sampling methodology must be public
- Time-window bias — AI answers refresh fast; 30-day windows differ enormously from 7-day windows; always declare the window
- Multilingual slicing — Chinese vs. English query results cannot be summed; AI engines have radically different source pools per language
- SOV / CS confusion: including or excluding mentions makes them two different metrics; results can differ 2–3×. Otterly's SOV ≠ Ahrefs's AI SOV ≠ BrightEdge's SOV
- “Visibility Score” is not standardized: it is the highest-level number and the easiest to misread. Some vendors mean a raw appearance rate (= Answer Inclusion Rate), others a composite of appearance + position (Otterly’s Brand Visibility Index). It is not Citation Rate. Always verify whether a “Visibility” number counts mentions or only citations before comparing tools
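Several of these pitfalls come down to undeclared context. One way to guard against them is to store every reading together with its declarations and refuse comparisons when they differ. The record below is a hypothetical sketch, not a vendor schema; the field names are assumptions.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class MetricReading:
    metric: str
    value: float
    query_set_id: str          # which query sample produced the number
    window_days: int           # e.g. 7 or 30
    language: str              # do not sum across languages
    counts_mentions: bool      # True if unlinked mentions are included
    position_definition: Optional[str] = None  # "A", "B", or "C"

    def __post_init__(self):
        if self.metric == "Average Position" and self.position_definition not in ("A", "B", "C"):
            raise ValueError("Average Position needs a definition tag: A, B, or C")


def comparable(a: MetricReading, b: MetricReading) -> bool:
    """Two readings are comparable only if every declared context field matches."""
    keys = ("metric", "query_set_id", "window_days", "language",
            "counts_mentions", "position_definition")
    return all(getattr(a, k) == getattr(b, k) for k in keys)


tool_x = MetricReading("Average Position", 2.1, "crm-100", 30, "en", False, "A")
tool_y = MetricReading("Average Position", 1.4, "crm-100", 30, "en", True, "B")
print(comparable(tool_x, tool_y))  # False
```

The check is deliberately strict: two tools reporting “Average Position” under different definition tags are treated as different metrics.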
8. Further reading
- Concept companions: Citation vs Mention, Brand Mentions, GEO Glossary
- Operational companions: AI Citation Tracking (how to collect these metrics), GEO Audit (end-to-end audit flow), GEO Maturity Model (metric evolution by stage)
- Business framing: GEO ROI Models (how metrics map to business outcomes)
- Academic reference point (defines 3 metrics, not this taxonomy): Aggarwal et al. 2024, GEO: Generative Engine Optimization (KDD ’24 paper summary + key data)
References
Academic:
- Aggarwal, P. et al. (2024). GEO: Generative Engine Optimization. KDD ’24. arXiv:2311.09735 · ACM DL
Vendor documentation (as of 2026-05):
- Otterly.ai — Brand Report KPI Definitions
- Ahrefs — Brand Radar Help · Brand Radar Methodology
- Profound — Answer Engine Insights
- BrightEdge — Share of Voice in 2026
- Similarweb — GenAI Intelligence
API ground truth:
- Perplexity — Chat Completions API Reference
Industry overview:
- TigerTracks — The Definition of AI Visibility Score (2026)
Sources
Primary
- GEO: Generative Engine Optimization (Aggarwal et al., KDD 2024) · arXiv / KDD '24 · 2024-08-25
- GEO: Generative Engine Optimization (KDD '24 Proceedings) · ACM SIGKDD · 2024-08-25
- Otterly.ai — Brand Report KPI Definitions · Otterly.ai
- Ahrefs Brand Radar — What It Is & How to Use It · Ahrefs
- Ahrefs Brand Radar Methodology · Ahrefs
- Profound — Answer Engine Insights · Profound
- BrightEdge — What Share of Voice Really Means for Search in 2026 · BrightEdge
- Similarweb — GenAI Intelligence (AI Chatbot Traffic) · Similarweb
- Perplexity API — Chat Completions Reference · Perplexity