Concept · Infrastructure

Schema.org for AI

Quick facts

What it is
The AI-relevant subset of Schema.org — the types and properties that change how an engine resolves your entity and parses your page. Not the spec; the subset that touches AI
Is it a ranking or citation signal?
No. It gates eligibility for features and aids entity resolution and parsing. Google states that markup enables a feature; it does not rank you or guarantee that the feature appears
Where it acts
The pre-retrieval parse + entity layer — not the grounding/selection gate that Citability and E-E-A-T govern. Markup makes an entity resolvable, not a passage liftable
Strongest evidence
Index-integrated AI (Google AI Overviews, Bing Copilot) uses it via the search index. Live-fetch chatbots (ChatGPT, Perplexity) read JSON-LD as plain page text, not parsed structured data (searchVIU, 2025)
Highest-leverage primitive
sameAs on Organization / Person — the entity-resolution join key into the knowledge graph. The one property worth getting right first

1. What “Schema.org for AI” is

This entry is not a mirror of the Schema.org documentation. It is the AI-citation-relevant subset: the handful of types and properties that change how an AI engine resolves your entity and parses your page — and nothing more.

Definition (GEO Wiki working definition): Schema.org for AI is the subset of structured-data vocabulary whose presence changes how an AI engine disambiguates the entity behind a page and parses the page reliably — distinct from whether any passage on it is liftable.

2. Markup ≠ citation — why schema is infrastructure, not a signal

The load-bearing honesty, and this entry’s counterpart to E-E-A-T §1’s “not a score”: structured data is not a ranking factor and not a citation lever. Google states it plainly — “using structured data enables a feature to be present, it does not guarantee that it will be present”, and a structured-data manual action “doesn’t affect how the page ranks” (see General Structured Data Guidelines). Google’s 2025 AI-search guidance repeats it: markup “makes pages eligible for certain search features and rich results”, not ranking (Succeeding in AI search).

What markup actually buys is exactly three things, all upstream of selection:

| What schema buys | What it does not buy |
| --- | --- |
| Reliable, unambiguous parse of the page’s facts | A ranking or citation boost |
| Entity disambiguation (who/what you are) via the knowledge graph | A passage becoming liftable |
| Eligibility for structured/rich surfaces (where they still exist) | A guarantee the surface appears |

Where it acts is the whole point. Schema operates at the pre-retrieval parse and entity layer — never at the grounding/selection gate that Citability and E-E-A-T govern in Answer Loop §3:

  page ──► [ PARSE + ENTITY LAYER ]      ◄── schema acts here
              │  facts parsed cleanly
              │  entity resolved (sameAs → KG)

  retrieval ──► candidate passages

  [ GROUNDING / SELECTION GATE ]         ◄── schema does NOT act here
   citability (shape) · E-E-A-T (trust)      Citability & E-E-A-T own this

  grounded answer ──► (maybe) citation

The orthogonality line, stated as the reciprocal of Citability §2’s: marking up an FAQ does not make its answers citable. Passage shape is citability’s, decided in the visible content. Markup only declares structure a parser could already extract.

3. The AI-relevant type subset — the load-bearing table

The canonical table the rest of the site quotes, and this entry’s equivalent of E-E-A-T §4. Each type is read for what it asserts to an engine and which proxy it feeds, not for spec completeness.

| Type | What it asserts to an AI | Proxy it feeds | Failure shape |
| --- | --- | --- | --- |
| Organization | This site/brand is this entity | Entity recognition · KG presence | No sameAs; entity stays ambiguous, never resolved |
| Person | This author/expert is this identity | Entity + the trust proxies E-E-A-T names | Anonymous byline; no resolvable identity |
| Article / NewsArticle | This page is an article, by X, dated Y | Type + authorship + freshness | Untyped page; author/date not machine-stated |
| WebSite | Site-level identity, search action | Site entity binding | Page-only signals, no site entity |
| BreadcrumbList | Where this sits in the site graph | Site architecture / context | Orphan page, no structural context |
| FAQPage | These Q&As exist on the page | Answer-shape declaration (see §2 + §6) | Treated as liftable; it is not, that is citability’s |
| HowTo | These ordered steps exist | Answer-shape declaration | Same, and its Google rich result was removed (§6) |

The highest-leverage rows are Organization and Person, because they carry the property that actually feeds the layer AI consumes (covered next). FAQPage and HowTo are deliberately last: they describe shape a parser already sees and carry the §6 caveat.

4. The AI-relevant property subset — sameAs is the workhorse

Companion table, same reading. Properties, not types, are where the entity leverage concentrates.

| Property | What it asserts | Proxy it feeds | Failure shape |
| --- | --- | --- | --- |
| sameAs | “This entity is the one at these URLs” (Wikipedia, Wikidata, official, socials) | Entity recognition · KG presence | Entity never joined to the graph; stays ambiguous |
| mainEntity | The primary thing this page is about | Topic/entity binding | Page about everything, resolved as nothing |
| about / mentions | Entities this content concerns/cites | Topical + entity graph | No machine-stated topic anchors |
| author | The Person/Organization behind it | Authorship → trust proxies | Unattributed; trust proxy missing |
| knowsAbout / hasOccupation | An author’s domain and role | Expertise corroboration | Asserted expertise with nothing to resolve |
| speakable | Sections fit for text-to-speech | A beta, US/EN/news-only feature | Over-relied on; not a general surface (Google, beta) |

sameAs is the entity-resolution join key — the single property worth getting right before any other. It is the explicit edge from your markup to the knowledge graph the model already trusts. A minimal, illustrative block:

{
  "@context": "https://schema.org",
  "@type": "Organization",
  "name": "Example Co",
  "url": "https://example.com",
  "sameAs": [
    "https://en.wikipedia.org/wiki/Example_Co",
    "https://www.wikidata.org/wiki/Q000000",
    "https://www.linkedin.com/company/example-co"
  ]
}

This is one illustrative block, not a template. Syntax (JSON-LD vs Microdata vs RDFa, where it goes, escaping) is JSON-LD’s. Full per-type templates and validation are the Schema Implementation playbook’s. Which markup feeds the entity layer, and why, is covered below; the resolution mechanism is treated in Entity Recognition and Knowledge Graph Presence.
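To make the "get sameAs right first" advice checkable, here is a minimal Python sketch that flags Organization and Person nodes declaring no sameAs. It is an illustration under stated assumptions, not the playbook's validator: the names `JsonLdExtractor` and `entities_missing_same_as` are hypothetical, only the standard library is used, and only top-level nodes and `@graph` members are inspected (a Person nested under an Article's `author` is not).

```python
import json
from html.parser import HTMLParser


class JsonLdExtractor(HTMLParser):
    """Collects the raw text of each <script type="application/ld+json"> block."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        script_type = (dict(attrs).get("type") or "").lower()
        if tag == "script" and script_type == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self.blocks.append("".join(self._buffer))
            self._in_jsonld = False


def entities_missing_same_as(html):
    """Return the names of Organization/Person nodes that declare no sameAs."""
    extractor = JsonLdExtractor()
    extractor.feed(html)
    missing = []
    for raw in extractor.blocks:
        try:
            data = json.loads(raw)
        except json.JSONDecodeError:
            continue  # invalid JSON-LD: nothing here can be resolved
        if isinstance(data, dict):
            nodes = data.get("@graph", [data])
        elif isinstance(data, list):
            nodes = data
        else:
            nodes = []
        for node in nodes:
            if not isinstance(node, dict):
                continue
            types = node.get("@type", [])
            types = [types] if isinstance(types, str) else types
            is_entity = "Organization" in types or "Person" in types
            if is_entity and not node.get("sameAs"):
                missing.append(node.get("name", "(unnamed)"))
    return missing


# Hypothetical page: the Organization has sameAs, the Person does not.
PAGE = """<html><head>
<script type="application/ld+json">
{"@context": "https://schema.org", "@type": "Organization", "name": "Example Co",
 "sameAs": ["https://www.wikidata.org/wiki/Q000000"]}
</script>
<script type="application/ld+json">
{"@context": "https://schema.org", "@type": "Person", "name": "Jane Doe"}
</script>
</head><body></body></html>"""

print(entities_missing_same_as(PAGE))  # ['Jane Doe']
```

The check only confirms that a sameAs edge is declared. Whether the URLs point at the right entity still has to be verified by hand.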

5. How AI engines actually consume schema — the honest mechanism, per surface

The honesty section, mirror of E-E-A-T §5. The core: engines do not “read your schema and rank you.” How — and whether — markup is consumed at answer time splits hard by surface, and the evidence now exists to say so rather than hedge.

| Surface | How schema is consumed | Evidence strength |
| --- | --- | --- |
| Google AI Overviews / AI Mode | Via Google’s existing index and structured-data systems: AI search “is still search”; same eligibility rules, no AI-specific markup required | Strongest: Google’s own docs (AI features) |
| Bing Copilot | Via the Bing index; Microsoft has confirmed structured data is used | Strong: vendor-confirmed |
| ChatGPT / Perplexity (live fetch) | The page is fetched and rendered to text; JSON-LD is read as plain text, not parsed as a graph | Strong (negative): controlled test |
| Claude / Gemini (direct fetch) | Same: no evidence of dedicated JSON-LD parsing at answer time | Consistent with the above |

The negative result is well-supported, not speculative. A controlled December 2025 test placed a price only inside JSON-LD across five systems; none of the live-fetch chatbots extracted it (searchVIU). An independent observation found ChatGPT and Perplexity will surface values even from invalid, fabricated schema — they are reading the markup as text on the page, not as a parsed structure (Search Engine Roundtable, observation).
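The difference between "parsed as structure" and "read as text" can be shown in a few lines. The Python sketch below is an illustration of that distinction only; it does not model any engine's fetch pipeline, which the sources do not describe. The page fragment, the price, and the function names `structured_view` and `text_view_contains` are all hypothetical.

```python
import json

# Hypothetical page fragment: the JSON-LD is invalid (trailing comma) and the
# price is stated nowhere else on the page.
RAW_PAGE = """
<h1>Example Widget</h1>
<script type="application/ld+json">
{"@type": "Product", "name": "Example Widget", "offers": {"price": "499.00",}}
</script>
"""


def structured_view(jsonld_text):
    """What a strict structured-data parser recovers: a dict, or None if invalid."""
    try:
        return json.loads(jsonld_text)
    except json.JSONDecodeError:
        return None


def text_view_contains(page_text, value):
    """What a plain-text reader can do: match characters, with no structure."""
    return value in page_text


# Pull out the script body with plain string operations to keep the sketch short.
jsonld = RAW_PAGE.split('<script type="application/ld+json">')[1].split("</script>")[0]

print(structured_view(jsonld))                 # None: the block does not parse
print(text_view_contains(RAW_PAGE, "499.00"))  # True: the string is still on the page
```

A reader that surfaces a value from markup that cannot be parsed is, by construction, working from the text view. That is the inference the observation above rests on.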

The seam, restated: the entity benefit still reaches these models — but through the model prior and the knowledge graph, not by parsing your JSON-LD during the fetch. Why each markup is an entity proxy is this entry’s; how the identity resolves across platforms is Entity Recognition’s and Knowledge Graph Presence’s.

6. What the evidence says — and what it does not

The bounded-reading section, same honesty discipline as E-E-A-T §6.

| What holds | The bounded reading |
| --- | --- |
| Index-integrated AI (Google, Bing) uses structured data | Through the index, as eligibility; Google states it is not a ranking boost |
| Valid markup that matches content reduces extraction ambiguity | It clarifies what is already there; it cannot manufacture trust or liftability |
| Schema coverage does not correlate with AI citation rates | A Dec-2024 study found no correlation; treat schema as hygiene, not a lever (Search Engine Land) |
| Rich-result surfaces can be revoked unilaterally | Google restricted FAQ rich results to gov/health and removed HowTo entirely in 2023 (Google) |

The FAQ/HowTo deprecation is the cleanest cautionary datum: a surface that schema “earned” was withdrawn by the vendor in one announcement. Markup is not a durable benefit you own.

One boundary on the GEO literature, stated explicitly: Aggarwal et al. measured content substance and structure rewrites (cite sources, add statistics, quotations) and did not test schema markup as a variable (KDD ’24, arXiv:2311.09735; paper summary). The headline GEO numbers therefore do not transfer to “add schema.” Borrowing them here would be the exact over-claim §7 warns against.

The position, the reciprocal of E-E-A-T §6’s “earned, not annotated”: schema is declared, not rewarded. It lets engines trust what is already on the page; it cannot create what is not.

7. Anti-patterns — schema spam and why it backfires

Mirror of E-E-A-T §7. Each pattern looks like the signal it imitates and fails on a trust or anti-abuse filter.

| Anti-pattern | Why it looks like it works | Why it actually fails |
| --- | --- | --- |
| Markup not matching visible content | Looks like rich structure | Google manual action strips eligibility; text-reading AI sees the contradiction directly |
| FAQPage stuffing for SERP real estate | Looks like answer coverage | Rich result restricted to gov/health since 2023; no payoff, accuracy risk |
| Fabricated Organization / Person | Looks like a resolved entity | Fails sameAs / KG corroboration, the same failure as fake authorship in E-E-A-T §7 |
| Over-marking every element | Looks thorough | Noise, validation errors, mismatch risk; no upside |
| JSON-LD contradicting on-page text | Looks complete | Live-fetch AI reads both as text and trusts neither |

The load-bearing line: invalid or content-mismatched schema is worse than none. It trips AI anti-abuse the way fabricated authority trips trust filters — the over-claim pattern that AI Content Detection covers. Google’s standing position is that there are no special markup tricks; markup must mirror content that is already visible.
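The "markup must mirror visible content" rule lends itself to a rough automated check. The Python sketch below lists string values that appear in a page's JSON-LD but not in its visible text. It is a crude heuristic under stated assumptions, not Google's validation logic: `PageSplitter`, `string_leaves`, and `unmirrored_values` are hypothetical names, matching is a case-insensitive substring test, and URL values and `@`-keywords are skipped because they are not expected in body copy.

```python
import json
from html.parser import HTMLParser


class PageSplitter(HTMLParser):
    """Separates JSON-LD script bodies from the text a reader would see."""

    def __init__(self):
        super().__init__()
        self._mode = "visible"  # "visible", "jsonld", or "hidden"
        self.jsonld_blocks = []
        self.visible_parts = []

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            script_type = (dict(attrs).get("type") or "").lower()
            if script_type == "application/ld+json":
                self._mode = "jsonld"
                self.jsonld_blocks.append("")
            else:
                self._mode = "hidden"
        elif tag == "style":
            self._mode = "hidden"

    def handle_endtag(self, tag):
        if tag in ("script", "style"):
            self._mode = "visible"

    def handle_data(self, data):
        if self._mode == "jsonld":
            self.jsonld_blocks[-1] += data
        elif self._mode == "visible":
            self.visible_parts.append(data)


def string_leaves(node, key=""):
    """Yield (property, value) for each non-URL string, skipping @-keywords."""
    if isinstance(node, dict):
        for child_key, child in node.items():
            if not child_key.startswith("@"):
                yield from string_leaves(child, child_key)
    elif isinstance(node, list):
        for item in node:
            yield from string_leaves(item, key)
    elif isinstance(node, str) and not node.startswith(("http://", "https://")):
        yield key, node


def unmirrored_values(html):
    """Return (property, value) pairs stated in JSON-LD but absent from visible text."""
    splitter = PageSplitter()
    splitter.feed(html)
    visible = " ".join(" ".join(splitter.visible_parts).split()).lower()
    problems = []
    for raw in splitter.jsonld_blocks:
        try:
            data = json.loads(raw)
        except json.JSONDecodeError:
            problems.append(("(invalid JSON-LD)", raw.strip()[:40]))
            continue
        for prop, value in string_leaves(data):
            if " ".join(value.split()).lower() not in visible:
                problems.append((prop, value))
    return problems


# Hypothetical page: the name is visible, the price and currency are markup-only.
PAGE = """<html><body>
<h1>Example Widget</h1><p>A widget for examples.</p>
<script type="application/ld+json">
{"@context": "https://schema.org", "@type": "Product", "name": "Example Widget",
 "offers": {"@type": "Offer", "price": "499.00", "priceCurrency": "EUR"}}
</script></body></html>"""

print(unmirrored_values(PAGE))  # [('price', '499.00'), ('priceCurrency', 'EUR')]
```

Anything the check reports is either a fact to add to the visible page or markup to remove. Substring matching will miss paraphrases and can pass short values by accident, so treat the output as a list of candidates for review.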

8. Schema across SEO and GEO — invariant baseline vs what changes

Mirror of E-E-A-T §8; this restates SEO vs GEO’s shared-baseline contract rather than re-deriving it.

Invariant: valid markup that matches content is a shared SEO+GEO baseline — on the “never drop” list. It costs little, it cannot be the differentiator, and removing it degrades both blue links and machine parseability.

What changes is the consumer: from a rich-result renderer to an entity/parse layer feeding the model’s prior.

| Surface | Schema delta |
| --- | --- |
| Google AI Overviews | The native home: index-based; schema reused from Google’s existing systems, weighed as eligibility not rank |
| Live-fetch chatbots | Markup read as page text; the value is indirect, via entity presence in the prior/KG, not the JSON-LD on the page |

Two routed lines, not expanded: the trust-readability of non-text assets — ImageObject/VideoObject provenance — is Multimodal Signals’; and the format choice underneath all of this is JSON-LD’s.

9. Why this matters for GEO + how to act

Schema is infrastructure that feeds the entity layer — not a lever on the grounding choke point Answer Loop §3 calls highest-leverage. Get it correct and out of the way; spend the real effort on citability and trust. This entry is the concept; the doing is the playbook.

| Your intent | First stop |
| --- | --- |
| Implement or fix markup correctly | Schema Implementation |
| Decide format / syntax | JSON-LD |
| Understand why markup feeds entity resolution | Entity Recognition · Knowledge Graph Presence |
| Audit schema as part of the whole site | Full GEO Audit |
| Make a passage actually liftable | Citability |
| See where this sits in the loop | Answer Loop |
| The method that ties it together | Generative Engine Optimization |

For the term itself and its neighbors, see the GEO glossary.

References

Vocabulary:

  • Schema.org — Organization, Person, sameAs, FAQPage, HowTo, Article, speakable

Academic (boundary reference — schema not a tested variable):

  • Aggarwal, P., Murahari, V., Rajpurohit, T., Kalyan, A., Narasimhan, K. & Deshpande, A. (2024). GEO: Generative Engine Optimization. KDD ’24. arXiv:2311.09735 · paper summary

Frequently asked questions

Does Schema.org markup get my content cited by AI?
No — not directly. Markup is not a ranking or citation signal. It does two things: it makes a page reliably parseable, and it disambiguates the entity behind it (who/what you are) so the model can resolve you against its knowledge graph. Whether a passage is then lifted into an answer is decided by citability (its structure) and E-E-A-T (its source trust) at the grounding gate — schema does not act there. The honest model: markup makes an entity resolvable, not a passage liftable.
Do ChatGPT and Perplexity read my JSON-LD?
Not as structured data, at answer time. A controlled 2025 test (searchVIU) placed a price only inside JSON-LD and queried five systems; none of the live-fetch chatbots extracted it. Independent observation found ChatGPT and Perplexity will even surface values from invalid, made-up schema — meaning they read the markup as plain text on the page, not as a parsed graph. The entity benefit still reaches them, but through the model's prior and the knowledge graph, not by parsing the JSON-LD on your page during the fetch.
Which schema types matter most for AI?
Organization and Person — because they carry sameAs, the join key that resolves your entity into the knowledge graph, which is the part AI actually consumes. Article gives the page a clean type and authorship. FAQPage and HowTo declare answer shape a parser can already see, but they do not make those answers citable and their Google rich results were curtailed in 2023. Prioritise the entity primitives over the answer-shape ones.
Is FAQPage or HowTo schema still worth adding?
For AI, only marginally, and not for the rich result. Google restricted FAQ rich results to authoritative government and health sites and removed HowTo rich results entirely in 2023 — so the SERP payoff is largely gone. The markup still validly describes structure, but it does not make the underlying answers liftable; that is citability's job, done in the visible content. Add it if it is cheap and accurate; do not expect it to move AI citation on its own.
Can schema markup hurt me?
Yes. Markup that does not match the visible page is the main failure mode: Google issues structured-data manual actions that strip rich-result eligibility, and AI systems that read markup as page text will see the contradiction directly. Fabricated Organization or Person markup fails sameAs and knowledge-graph corroboration the same way fake authorship fails E-E-A-T. Invalid or content-mismatched schema is worse than none.

Sources

Primary

  1. General Structured Data Guidelines · Google Search Central · 2026-01-06
  2. Introduction to structured data markup in Google Search · Google Search Central · 2025-12-10
  3. Changes to HowTo and FAQ rich results · Google Search Central · 2023-08-08
  4. AI features and your website · Google Search Central · 2025-12-10
  5. Top ways to ensure your content performs well in Google's AI experiences on Search · Google Search Central · 2025-05-21
  6. Speakable structured data (beta) · Google Search Central · 2025-12-10
  7. Schema.org vocabulary (Organization, Person, sameAs, FAQPage, HowTo, Article, speakable) · Schema.org

Secondary

  1. Schema Markup and AI in 2025: What ChatGPT, Claude, Perplexity & Gemini Really See · searchVIU
  2. How schema markup fits into AI search — without the hype · Search Engine Land
  3. GEO: Generative Engine Optimization (Aggarwal et al., KDD '24) · arXiv

Tertiary [observation]

  1. ChatGPT & Perplexity Treat Structured Data As Text On A Page · Search Engine Roundtable
Last updated: 2026-05-18 · Authors: Ray Yang · Topic: Infrastructure