Schema.org for AI
Quick facts
- What it is: the AI-relevant subset of Schema.org, meaning the types and properties that change how an engine resolves your entity and parses your page. Not the spec; the subset that touches AI.
- Is it a ranking or citation signal? No. It gates eligibility for features and aids entity resolution and parsing. Google states that markup enables a feature; it does not rank you or guarantee the feature.
- Where it acts: the pre-retrieval parse and entity layer, not the grounding/selection gate that Citability and E-E-A-T govern. Markup makes an entity resolvable, not a passage liftable.
- Strongest evidence: index-integrated AI (Google AI Overviews, Bing Copilot) uses it via the search index. Live-fetch chatbots (ChatGPT, Perplexity) read JSON-LD as plain page text, not parsed structured data (searchVIU, 2025).
- Highest-leverage primitive: sameAs on Organization / Person, the entity-resolution join key into the knowledge graph. The one property worth getting right first.
1. What “Schema.org for AI” is
This entry is not a mirror of the Schema.org documentation. It is the AI-citation-relevant subset: the handful of types and properties that change how an AI engine resolves your entity and parses your page — and nothing more.
Definition (GEO Wiki working definition): Schema.org for AI is the subset of structured-data vocabulary whose presence changes how an AI engine disambiguates the entity behind a page and parses the page reliably — distinct from whether any passage on it is liftable.
2. Markup ≠ citation — why schema is infrastructure, not a signal
The load-bearing point, and this entry’s counterpart to E-E-A-T §1’s “not a score”: structured data is not a ranking factor and not a citation lever. Google states it plainly: “using structured data enables a feature to be present, it does not guarantee that it will be present”, and a structured-data manual action “doesn’t affect how the page ranks” (see General Structured Data Guidelines). Google’s 2025 AI-search guidance repeats it: markup “makes pages eligible for certain search features and rich results”, which is eligibility, not ranking (Succeeding in AI search).
What markup actually buys is exactly three things, all upstream of selection:
| What schema buys | What it does not buy |
|---|---|
| Reliable, unambiguous parse of the page’s facts | A ranking or citation boost |
| Entity disambiguation (who/what you are) via the knowledge graph | A passage becoming liftable |
| Eligibility for structured/rich surfaces (where they still exist) | A guarantee the surface appears |
Where it acts is the whole point. Schema operates at the pre-retrieval parse and entity layer — never at the grounding/selection gate that Citability and E-E-A-T govern in Answer Loop §3:
    page ──► [ PARSE + ENTITY LAYER ]          ◄── schema acts here
                  │ facts parsed cleanly
                  │ entity resolved (sameAs → KG)
                  ▼
    retrieval ──► candidate passages
                  ▼
    [ GROUNDING / SELECTION GATE ]             ◄── schema does NOT act here
      citability (shape) · E-E-A-T (trust)     ◄── Citability & E-E-A-T own this
                  ▼
    grounded answer ──► (maybe) citation
The orthogonality line, stated as the reciprocal of Citability §2’s: marking up an FAQ does not make its answers citable. Passage shape belongs to citability and is decided in the visible content. Markup only declares structure a parser could already extract.
3. The AI-relevant type subset — the load-bearing table
The canonical table the rest of the site quotes, and this entry’s counterpart to E-E-A-T §4. Each type is read for what it asserts to an engine and which proxy it feeds, not for spec completeness.
| Type | What it asserts to an AI | Proxy it feeds | Failure shape |
|---|---|---|---|
| Organization | This site/brand is this entity | Entity recognition · KG presence | No sameAs; entity stays ambiguous, never resolved |
| Person | This author/expert is this identity | Entity + the trust proxies E-E-A-T names | Anonymous byline; no resolvable identity |
| Article / NewsArticle | This page is an article, by X, dated Y | Type + authorship + freshness | Untyped page; author/date not machine-stated |
| WebSite | Site-level identity, search action | Site entity binding | Page-only signals, no site entity |
| BreadcrumbList | Where this sits in the site graph | Site architecture / context | Orphan page, no structural context |
| FAQPage | These Q&As exist on the page | Answer-shape declaration (see §2 + §6) | Treated as liftable; it is not, that is citability’s |
| HowTo | These ordered steps exist | Answer-shape declaration | Same, and its Google rich result was removed (§6) |
The single highest-leverage rows are Organization and Person, because they carry the property that actually feeds the layer AI consumes — covered next. FAQPage/HowTo are deliberately last: they describe shape a parser already sees and carry the §6 caveat.
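The "Failure shape" column above can be turned into a small lint. What follows is a minimal sketch in Python, not an official validator: the `REQUIRED` mapping and both function names are hypothetical, and the rules only mirror this table (they are not Google's eligibility rules). It inspects top-level and `@graph` nodes only, not nested ones.

```python
import json

# Hypothetical rules derived from the "Failure shape" column of the table.
# Assumption: a missing or empty property counts as the failure shape.
REQUIRED = {
    "Organization": ["sameAs"],
    "Person": ["sameAs"],
    "Article": ["author", "datePublished"],
    "NewsArticle": ["author", "datePublished"],
}


def iter_nodes(data):
    """Yield typed nodes from a JSON-LD document, including @graph members."""
    if isinstance(data, list):
        for item in data:
            yield from iter_nodes(item)
    elif isinstance(data, dict):
        if "@graph" in data:
            yield from iter_nodes(data["@graph"])
        if "@type" in data:
            yield data


def failure_shapes(jsonld_text):
    """Return (type, missing_property) pairs for the rules in REQUIRED."""
    problems = []
    for node in iter_nodes(json.loads(jsonld_text)):
        raw = node["@type"]
        types = raw if isinstance(raw, list) else [raw]
        for type_name in types:
            for prop in REQUIRED.get(type_name, []):
                if not node.get(prop):  # absent, empty string, or empty list
                    problems.append((type_name, prop))
    return problems


# Illustrative input: an Organization without sameAs, an Article without a date.
sample = json.dumps({
    "@context": "https://schema.org",
    "@graph": [
        {"@type": "Organization", "name": "Example Co"},
        {"@type": "Article", "headline": "Example",
         "author": {"@type": "Person", "name": "A. Author"}},
    ],
})
print(failure_shapes(sample))
```

A clean result here says only that the properties are present. It says nothing about whether the entity actually resolves, which is a knowledge-graph question rather than a markup one.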
4. The AI-relevant property subset — sameAs is the workhorse
Companion table, same reading. Properties, not types, are where the entity leverage concentrates.
| Property | What it asserts | Proxy it feeds | Failure shape |
|---|---|---|---|
| sameAs | “This entity is the one at these URLs” (Wikipedia, Wikidata, official, socials) | Entity recognition · KG presence | Entity never joined to the graph; stays ambiguous |
| mainEntity | The primary thing this page is about | Topic/entity binding | Page about everything, resolved as nothing |
| about / mentions | Entities this content concerns/cites | Topical + entity graph | No machine-stated topic anchors |
| author | The Person/Organization behind it | Authorship → trust proxies | Unattributed; trust proxy missing |
| knowsAbout / hasOccupation | An author’s domain and role | Expertise corroboration | Asserted expertise with nothing to resolve |
| speakable | Sections fit for text-to-speech | A beta, US/EN/news-only feature | Over-relied on; not a general surface (Google, beta) |
sameAs is the entity-resolution join key — the single property worth getting right before any other. It is the explicit edge from your markup to the knowledge graph the model already trusts. A minimal, illustrative block:
{
  "@context": "https://schema.org",
  "@type": "Organization",
  "name": "Example Co",
  "url": "https://example.com",
  "sameAs": [
    "https://en.wikipedia.org/wiki/Example_Co",
    "https://www.wikidata.org/wiki/Q000000",
    "https://www.linkedin.com/company/example-co"
  ]
}
This is one illustrative block, not a template. Syntax questions (JSON-LD vs Microdata vs RDFa, where the block goes, escaping) belong to the JSON-LD entry. Full per-type templates and validation belong to the Schema Implementation playbook. Which markup feeds the entity layer, and why, is covered below; the resolution mechanism is treated in Entity Recognition and Knowledge Graph Presence.
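For sites that generate markup rather than hand-write it, the same Organization block can be assembled in code. This is a rough Python sketch under stated assumptions: `organization_jsonld` and `as_script_tag` are hypothetical helper names, and the choice to keep only absolute http(s) URLs and drop duplicates is an illustrative hygiene rule, not a Schema.org requirement.

```python
import json
from urllib.parse import urlparse


def organization_jsonld(name, url, same_as):
    """Build an Organization node; sameAs keeps absolute http(s) URLs only."""
    links = []
    for link in same_as:
        link = link.strip()
        parts = urlparse(link)
        # Assumption: relative or scheme-less values are dropped, not repaired.
        if parts.scheme in ("http", "https") and parts.netloc and link not in links:
            links.append(link)
    return {
        "@context": "https://schema.org",
        "@type": "Organization",
        "name": name,
        "url": url,
        "sameAs": links,
    }


def as_script_tag(node):
    """Serialize a node into a JSON-LD script element."""
    payload = json.dumps(node, indent=2)
    # Escape "<" as a JSON unicode escape so the payload cannot close the element.
    payload = payload.replace("<", "\\u003c")
    return '<script type="application/ld+json">\n' + payload + "\n</script>"


org = organization_jsonld(
    "Example Co",
    "https://example.com",
    [
        "https://en.wikipedia.org/wiki/Example_Co",
        "https://www.wikidata.org/wiki/Q000000",
        "https://www.wikidata.org/wiki/Q000000",  # duplicate, dropped
        "example.com/about",                      # not absolute, dropped
    ],
)
print(as_script_tag(org))
```

The filter only guarantees well-formed links. Whether each URL really describes the same entity still has to be checked by a person, since that is the claim sameAs makes.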
5. How AI engines actually consume schema — the honest mechanism, per surface
The honesty section, mirror of E-E-A-T §5. The core: engines do not “read your schema and rank you.” How — and whether — markup is consumed at answer time splits hard by surface, and the evidence now exists to say so rather than hedge.
| Surface | How schema is consumed | Evidence strength |
|---|---|---|
| Google AI Overviews / AI Mode | Via Google’s existing index and structured-data systems — AI search “is still search”; same eligibility rules, no AI-specific markup required | Strongest — Google’s own docs (AI features) |
| Bing Copilot | Via the Bing index — Microsoft has confirmed structured data is used | Strong — vendor-confirmed |
| ChatGPT / Perplexity (live fetch) | The page is fetched and rendered to text; JSON-LD is read as plain text, not parsed as a graph | Strong (negative) — controlled test |
| Claude / Gemini (direct fetch) | Same: no evidence of dedicated JSON-LD parsing at answer time | Consistent with the above |
The negative result is well-supported, not speculative. A controlled December 2025 test placed a price only inside JSON-LD across five systems; none of the live-fetch chatbots extracted it (searchVIU). An independent observation found ChatGPT and Perplexity will surface values even from invalid, fabricated schema — they are reading the markup as text on the page, not as a parsed structure (Search Engine Roundtable, observation).
The seam, restated: the entity benefit still reaches these models, but through the model prior and the knowledge graph, not by parsing your JSON-LD during the fetch. This entry covers why each piece of markup is an entity proxy; Entity Recognition and Knowledge Graph Presence cover how the identity resolves across platforms.
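The gap between "text on the page" and "parsed structured data" can be made concrete with a toy page where a price exists only inside JSON-LD. This Python sketch uses the standard-library `html.parser`; the `PageReader` class and the sample page are hypothetical, and it does not claim to reproduce how any particular chatbot fetches or renders pages. It only shows that a reader which keeps visible text sees no price, while a reader that parses the JSON-LD recovers it.

```python
import json
from html.parser import HTMLParser


class PageReader(HTMLParser):
    """Collect visible text and parsed JSON-LD payloads separately."""

    def __init__(self):
        super().__init__()
        self.visible = []    # text outside script/style elements
        self.jsonld = []     # parsed JSON-LD documents
        self._skip = None    # name of the script/style tag being skipped
        self._buffer = None  # raw chunks of the current JSON-LD script

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip = tag
            is_jsonld = tag == "script" and dict(attrs).get("type") == "application/ld+json"
            self._buffer = [] if is_jsonld else None

    def handle_endtag(self, tag):
        if tag == self._skip:
            if self._buffer is not None:
                self.jsonld.append(json.loads("".join(self._buffer)))
            self._skip = None
            self._buffer = None

    def handle_data(self, data):
        if self._skip is None:
            if data.strip():
                self.visible.append(data.strip())
        elif self._buffer is not None:
            self._buffer.append(data)


# Hypothetical page: the price appears only in the markup, never in the body text.
HTML = """<html><body>
<h1>Example Widget</h1>
<p>A sturdy widget for everyday use.</p>
<script type="application/ld+json">
{"@context": "https://schema.org", "@type": "Product", "name": "Example Widget",
 "offers": {"@type": "Offer", "price": "19.99", "priceCurrency": "EUR"}}
</script>
</body></html>"""

reader = PageReader()
reader.feed(HTML)
reader.close()

visible_text = " ".join(reader.visible)
print("19.99" in visible_text)              # is the price in the visible text?
print(reader.jsonld[0]["offers"]["price"])  # the price as a parser recovers it
```

The practical reading matches the section: any fact that should reach an answer engine belongs in the visible content, with markup mirroring it rather than carrying it alone.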
6. What the evidence says — and what it does not
The bounded-reading section, same honesty discipline as E-E-A-T §6.
| What holds | The bounded reading |
|---|---|
| Index-integrated AI (Google, Bing) uses structured data | Through the index, as eligibility — Google states it is not a ranking boost |
| Valid markup that matches content reduces extraction ambiguity | It clarifies what is already there; it cannot manufacture trust or liftability |
| Schema coverage does not correlate with AI citation rates | A Dec-2024 study found no correlation; treat schema as hygiene, not a lever (Search Engine Land) |
| Rich-result surfaces can be revoked unilaterally | Google restricted FAQ rich results to gov/health and removed HowTo entirely in 2023 (Google) |
The FAQ/HowTo deprecation is the cleanest cautionary datum: a surface that schema “earned” was withdrawn by the vendor in one announcement. Markup is not a durable benefit you own.
One boundary on the GEO literature, stated explicitly: Aggarwal et al. measured rewrites of content substance and structure (citing sources, adding statistics, adding quotations) and did not test schema markup as a variable (KDD ’24, arXiv:2311.09735; paper summary). The headline GEO numbers therefore do not transfer to “add schema.” Borrowing them here would be the exact over-claim §7 warns against.
The position, the reciprocal of E-E-A-T §6’s “earned, not annotated”: schema is declared, not rewarded. It lets engines trust what is already on the page; it cannot create what is not.
7. Anti-patterns — schema spam and why it backfires
Mirror of E-E-A-T §7. Each pattern looks like the signal it imitates and fails on a trust or anti-abuse filter.
| Anti-pattern | Why it looks like it works | Why it actually fails |
|---|---|---|
| Markup not matching visible content | Looks like rich structure | Google manual action strips eligibility; text-reading AI sees the contradiction directly |
| FAQPage stuffing for SERP real estate | Looks like answer coverage | Rich result restricted to gov/health since 2023; no payoff, accuracy risk |
| Fabricated Organization / Person | Looks like a resolved entity | Fails sameAs / KG corroboration, the same failure as fake authorship in E-E-A-T §7 |
| Over-marking every element | Looks thorough | Noise, validation errors, mismatch risk; no upside |
| JSON-LD contradicting on-page text | Looks complete | Live-fetch AI reads both as text and trusts neither |
The load-bearing line: invalid or content-mismatched schema is worse than none. It trips AI anti-abuse the way fabricated authority trips trust filters — the over-claim pattern that AI Content Detection covers. Google’s standing position is that there are no special markup tricks; markup must mirror content that is already visible.
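The "markup must mirror visible content" rule lends itself to a simple automated check. The Python sketch below covers one narrow case, FAQPage questions that are declared in markup but absent from the page text. The function names are hypothetical, the matching is a plain normalized substring test (an assumption chosen for brevity), and it is not a substitute for a real structured-data validator.

```python
import json
import re


def normalize(text):
    """Lowercase and collapse whitespace so formatting differences are ignored."""
    return re.sub(r"\s+", " ", text).strip().lower()


def unmatched_faq_questions(jsonld_text, visible_text):
    """Return FAQPage questions present in markup but missing from visible text."""
    data = json.loads(jsonld_text)
    if data.get("@type") != "FAQPage":
        return []
    page = normalize(visible_text)
    entities = data.get("mainEntity", [])
    if isinstance(entities, dict):  # a single Question given without a list
        entities = [entities]
    return [
        question["name"]
        for question in entities
        if question.get("@type") == "Question"
        and normalize(question.get("name", "")) not in page
    ]


# Illustrative data: the second question exists only in the markup.
markup = json.dumps({
    "@context": "https://schema.org",
    "@type": "FAQPage",
    "mainEntity": [
        {"@type": "Question", "name": "Do you ship abroad?",
         "acceptedAnswer": {"@type": "Answer", "text": "Yes, to the EU."}},
        {"@type": "Question", "name": "Is there a lifetime warranty?",
         "acceptedAnswer": {"@type": "Answer", "text": "Yes."}},
    ],
})
visible = "Shipping\nDo you ship   abroad? Yes, to the EU."
print(unmatched_faq_questions(markup, visible))
```

Any question this returns is a candidate for the first anti-pattern in the table: either the content should be added to the page or the markup should be removed.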
8. Schema across SEO and GEO — invariant baseline vs what changes
Mirror of E-E-A-T §8; this restates SEO vs GEO’s shared-baseline contract rather than re-deriving it.
Invariant: valid markup that matches content is a shared SEO+GEO baseline — on the “never drop” list. It costs little, it cannot be the differentiator, and removing it degrades both blue links and machine parseability.
What changes is the consumer: from a rich-result renderer to an entity/parse layer feeding the model’s prior.
| Surface | Schema delta |
|---|---|
| Google AI Overviews | The native home — index-based; schema reused from Google’s existing systems, weighed as eligibility not rank |
| Live-fetch chatbots | Markup read as page text; the value is indirect, via entity presence in the prior/KG — not the JSON-LD on the page |
Two routed lines, not expanded: the trust-readability of non-text assets (ImageObject/VideoObject provenance) is covered in Multimodal Signals, and the format choice underneath all of this is covered in JSON-LD.
9. Why this matters for GEO + how to act
Schema is infrastructure that feeds the entity layer — not a lever on the grounding choke point Answer Loop §3 calls highest-leverage. Get it correct and out of the way; spend the real effort on citability and trust. This entry is the concept; the doing is the playbook.
| Your intent | First stop |
|---|---|
| Implement or fix markup correctly | Schema Implementation |
| Decide format / syntax | JSON-LD |
| Understand why markup feeds entity resolution | Entity Recognition · Knowledge Graph Presence |
| Audit schema as part of the whole site | Full GEO Audit |
| Make a passage actually liftable | Citability |
| See where this sits in the loop | Answer Loop |
| The method that ties it together | Generative Engine Optimization |
For the term itself and its neighbors, see the GEO glossary.
References
Official (Google):
- Google Search Central — General Structured Data Guidelines · Introduction to structured data markup
- Google Search Central — Changes to HowTo and FAQ rich results (2023-08-08)
- Google Search Central — AI features and your website · Top ways to ensure your content performs well in Google’s AI experiences (2025-05-21)
- Google Search Central — Speakable structured data (beta)
Vocabulary:
- Schema.org — Organization, Person, sameAs, FAQPage, HowTo, Article, speakable
Independent / industry:
- searchVIU — Schema Markup and AI in 2025: What ChatGPT, Claude, Perplexity & Gemini Really See (2025-12-02)
- Search Engine Land — How schema markup fits into AI search — without the hype (2026-03-25)
- Search Engine Roundtable — ChatGPT & Perplexity Treat Structured Data As Text On A Page [observation] (2026-02-03)
Academic (boundary reference — schema not a tested variable):
- Aggarwal, P., Murahari, V., Rajpurohit, T., Kalyan, A., Narasimhan, K. & Deshpande, A. (2024). GEO: Generative Engine Optimization. KDD ’24. arXiv:2311.09735 · paper summary