Standard · Infrastructure

Sitemap & IndexNow

Quick facts

What they are
Two URL-submission protocols — sitemap.xml (2005, pull-mode, every major search engine) and IndexNow (2021, push-mode, the Microsoft + Yandex ecosystem)
The load-bearing line
Both affect AI search visibility only through the host search index: Google AI Overviews inherits sitemap behavior from Google; Bing Copilot inherits both from Bing; standalone answer engines consume neither directly
IndexNow participants (2026-05)
Microsoft Bing · Yandex · Naver · Seznam.cz · Yep. Google has not adopted it since its 2021 testing announcement; no AI vendor's first-party crawler participates
The costliest mistake
Pushing IndexNow expecting ChatGPT / Perplexity / Claude to refresh — wrong protocol scope. IndexNow → Bing → Copilot is its only AI-side path; standalone engines crawl on their own schedules
Where it sits
Upstream of AI Crawlers in the answer loop — discovery, not citation. Necessary, not sufficient: being submitted does not mean being liftable

1. What sitemap.xml and IndexNow are

Sitemap.xml and IndexNow are two URL-submission protocols that solve adjacent but distinct problems. Sitemap.xml is the older one — published in 2005 as an open standard (sitemaps.org) — and works in pull mode: a publisher hosts a list of URLs at a known path, and search engines fetch the file on their own crawl schedule. Every major search crawler honors it, including Googlebot, Bingbot, and YandexBot.

IndexNow is the newer one, launched in 2021 by Microsoft and Yandex (indexnow.org), and works in push mode: a publisher notifies a participating endpoint the moment a URL changes, and the receiving engine can refresh its index within minutes rather than waiting for its next scheduled crawl. The participant list as of 2026-05 is Microsoft Bing, Yandex, Naver, Seznam.cz, and Yep (indexnow.org). Google has not adopted IndexNow since its 2021 testing announcement, and no AI vendor’s first-party crawler participates.

For GEO purposes, the load-bearing fact is downstream of either protocol’s mechanics: their AI-search effect transits only through the host search index. Google AI Overviews inherits Google’s sitemap behavior because AIO grounds in Google’s classic web index; Bing Copilot inherits Bing’s sitemap and IndexNow behavior for the same reason. Standalone answer engines — ChatGPT Search, Perplexity, Claude — run their own retrieval crawlers on their own crawl schedules, and their published crawler documentation does not mention sitemap.xml or IndexNow consumption.

In the answer-loop sequence, this layer sits upstream of AI Crawlers: discovery (you exist on someone’s URL list) comes before crawl (a bot fetches you), which comes before retrievability (the bot’s index includes you), which comes before citability (you are liftable when read). Submission is necessary, not sufficient — clearing it makes you a candidate, not a citation.

2. The mechanics — what each protocol actually does

The two protocols differ on six operational dimensions, summarized below.

| Aspect | sitemap.xml | IndexNow |
|---|---|---|
| Standard | sitemaps.org (2005, open protocol) | indexnow.org (2021, Microsoft + Yandex) |
| Mode | Pull: engine fetches on its own crawl schedule | Push: publisher notifies on URL change |
| Consumed by | Googlebot, Bingbot, YandexBot, and effectively every major search crawler | Microsoft Bing, Yandex, Naver, Seznam.cz, Yep; not Google, not any AI vendor’s first-party crawler |
| Latency | Hours to days, depending on the engine’s crawl schedule | Minutes, push-on-change |
| Declared at | `Sitemap:` directive in robots.txt + optional Search Console / Webmaster Tools submission | One HTTP request per URL change to any participating endpoint, with the URL plus a key |
| Scale limits | 50,000 URLs and 50 MB (uncompressed) per file; sitemap indexes for sites above that (sitemaps.org) | Up to 10,000 URLs per batched POST; arbitrary key tokens hosted at site root |
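
The sitemap.xml scale limit in the table can be illustrated with a short Python sketch. It splits a URL list into files of at most 50,000 entries and emits a sitemap index that points at them. The function names and the `sitemap-N.xml` file naming are hypothetical, and the sketch checks only the URL count, not the 50 MB size cap; a real generator would also write an XML declaration and usually `<lastmod>` values.

```python
from xml.etree import ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_URLS_PER_FILE = 50_000  # per-file URL cap quoted from sitemaps.org


def chunk_urls(urls, size=MAX_URLS_PER_FILE):
    """Split a URL list into consecutive chunks of at most `size` entries."""
    return [urls[i:i + size] for i in range(0, len(urls), size)]


def build_urlset(urls):
    """Serialize one sitemap file (a <urlset>) for the given URLs."""
    root = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        ET.SubElement(ET.SubElement(root, "url"), "loc").text = url
    return ET.tostring(root, encoding="unicode")


def build_index(sitemap_urls):
    """Serialize a sitemap index that points at each child sitemap file."""
    root = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for url in sitemap_urls:
        ET.SubElement(ET.SubElement(root, "sitemap"), "loc").text = url
    return ET.tostring(root, encoding="unicode")


# Usage: 120,000 URLs need three child files plus one index.
all_urls = [f"https://example.com/page-{i}" for i in range(120_000)]
chunks = chunk_urls(all_urls)
index_xml = build_index(
    [f"https://example.com/sitemap-{n}.xml" for n in range(1, len(chunks) + 1)]
)
```

The index file is the one a `Sitemap:` directive or a Search Console submission would point at; the child files are discovered through it.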

The publisher-side IndexNow surface is intentionally minimal — one HTTP request per URL change, one key file at the site root, no API tokens, no per-engine authentication. A complete minimal push looks like:

# IndexNow — the minimal push, the principle not a full client

1. Host <your-key>.txt at your site root, body containing <your-key>
   e.g. https://example.com/abc123.txt  →  body: abc123

2. On URL change, GET (single-URL form):
   https://api.indexnow.org/IndexNow?url=https://example.com/page&key=abc123

   Or POST (batch form, up to 10k URLs):
   POST https://<participating-engine>/indexnow
   Content-Type: application/json
   { "host": "example.com",
     "key": "abc123",
     "urlList": ["https://example.com/page1", "https://example.com/page2"] }

3. The receiving engine shares the submission with the other IndexNow
   participants (Bing → Yandex → Naver → Seznam → Yep) automatically.
   Submit to one, reach all five.
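
The two request forms above can be sketched in Python as follows. This is an illustration under stated assumptions, not a full client: it only builds the GET URL and the POST body and sends nothing over the network. The function names are hypothetical, the endpoint is the one shown in step 2, and percent-encoding the page URL in the query string is an assumption based on ordinary query-string practice, so check the official spec before relying on it.

```python
import json
from urllib.parse import urlencode, urlsplit

INDEXNOW_ENDPOINT = "https://api.indexnow.org/IndexNow"  # endpoint from step 2
MAX_BATCH = 10_000  # batch cap quoted in the scale-limits row


def single_url_request(page_url, key, endpoint=INDEXNOW_ENDPOINT):
    """Build the GET URL for a single-URL push (page URL percent-encoded)."""
    return f"{endpoint}?{urlencode({'url': page_url, 'key': key})}"


def batch_payload(urls, key):
    """Build the JSON body for a batched POST; all URLs must share one host."""
    urls = list(urls)
    if not urls:
        raise ValueError("urlList must not be empty")
    if len(urls) > MAX_BATCH:
        raise ValueError(f"at most {MAX_BATCH} URLs per batch")
    hosts = {urlsplit(u).netloc for u in urls}
    if len(hosts) != 1:
        raise ValueError("all URLs in one batch must belong to the same host")
    return json.dumps({"host": hosts.pop(), "key": key, "urlList": urls})


# Usage: the key must match the body of https://example.com/abc123.txt
get_url = single_url_request("https://example.com/page", "abc123")
body = batch_payload(
    ["https://example.com/page1", "https://example.com/page2"], "abc123"
)
```

Keeping request construction separate from sending makes the host and batch-size checks easy to test before any submission leaves the site.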

The full JSON shape and per-engine endpoint list are in the official spec; CDNs including Cloudflare publish IndexNow at the edge with a single toggle (Cloudflare, 2021), so many sites already emit IndexNow signals without site-side code.

A note on Google and IndexNow: Google announced in November 2021 that it would test the protocol for sustainability gains (Search Engine Land, 2021-11-09). The test did not move to adoption; as of 2026 independent coverage continues to treat Google’s absence as the status quo (ppc.land, 2024-12-30). The operational implication: a single IndexNow push reaches Bing, Yandex, Naver, Seznam, and Yep — and through them, on the AI side, only Bing Copilot.

3. Per-engine effect on AI visibility — the canonical table

The single most-quoted asset on this page. Sitemap.xml and IndexNow’s effect on an AI surface is determined by whether that surface reuses a host search index, and the engines split cleanly into three groups.

| AI surface | sitemap.xml | IndexNow | Why |
|---|---|---|---|
| Google AI Overviews | Used (via Google index) | Not used: Google has not adopted IndexNow | AIO grounds in Google’s classic web index; sitemap submission via Search Console accelerates Google discovery, which feeds AIO’s candidate pool |
| Bing Copilot | Used (via Bing index) | Used (via Bing index) | IndexNow is Microsoft’s protocol; Copilot inherits Bing’s index, so a push refreshes Bing in minutes and Copilot’s grounding pool with it |
| ChatGPT Search · Perplexity · Claude | Indirect: each runs its own retrieval crawler on its own crawl schedule; the `Sitemap:` directive in robots.txt may aid discovery but is not required | Not used: none of the three documents IndexNow or any submission protocol in its crawler docs | Each engine maintains an independent retrieval index; no public URL-submission channel exists for any of them |

The load-bearing read of this table: submission protocols affect an AI surface only where that surface inherits a search index that uses them. Google AI Overviews inherits once (sitemap), Bing Copilot inherits twice (sitemap + IndexNow), and standalone answer engines inherit nothing because they crawl on their own. “Push IndexNow to appear in ChatGPT” is therefore a category error (§6).
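
The inheritance rule can be written down as a small lookup, which is sometimes handy in an audit script. The sketch below transcribes the table as it stands on this page; the dictionary keys and the function name are hypothetical, and the standalone engines are recorded with no host index because their sitemap use is at most indirect.

```python
# Each AI surface, the host search index it reuses (if any), and which
# submission protocols that host index consumes. Transcribed from the table.
SURFACES = {
    "google_ai_overviews": {"host_index": "google", "sitemap": True, "indexnow": False},
    "bing_copilot": {"host_index": "bing", "sitemap": True, "indexnow": True},
    "chatgpt_search": {"host_index": None, "sitemap": False, "indexnow": False},
    "perplexity": {"host_index": None, "sitemap": False, "indexnow": False},
    "claude": {"host_index": None, "sitemap": False, "indexnow": False},
}


def reachable_surfaces(protocol):
    """Return the AI surfaces a protocol reaches through a host search index."""
    return sorted(
        name
        for name, row in SURFACES.items()
        if row["host_index"] is not None and row[protocol]
    )


indexnow_reach = reachable_surfaces("indexnow")  # only the Bing-backed surface
sitemap_reach = reachable_surfaces("sitemap")    # both host-index surfaces
```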

4. “Submit to the AI index” — what does and does not exist

The most common practitioner question on this layer is: how do I submit my URL to ChatGPT, Perplexity, or Claude? As of 2026-05, the answer is: directly, you can’t. No first-party URL-submission channel exists for any major standalone answer engine.

Submission channels that do exist — each reaches an AI surface transitively, via a host search index:

  • Google Search Console — sitemap submission and URL Inspection feed Google’s classic web index, which AI Overviews reuses (Build and submit a sitemap)
  • Bing Webmaster Tools — sitemap submission and URL Submit feed Bing’s index, which Bing Copilot reuses
  • IndexNow — push notification refreshes Bing’s index (and Yandex’s, Naver’s, Seznam’s, Yep’s) in minutes; reaches Bing Copilot transitively at the lowest available latency (How to add IndexNow)

Submission channels that do not exist:

  • A first-party OpenAI / Anthropic / Perplexity URL-submission API — their crawler docs cover robots.txt, IP-range allowlisting, and access policy but say nothing about how to submit URLs (OpenAI; Anthropic; Perplexity)
  • IndexNow consumption by Google — frequently asked, but Google has not adopted the protocol
  • Any single channel that reaches all AI surfaces at once

The honest framing: there is no “submit to AI” today — only “submit to a host index” and wait for the AI surface to inherit. When a client asks where to push their page so it shows up in ChatGPT or Perplexity, the answer is not a submission protocol. It is the work that makes the page liftable once a retrieval crawler reaches it (citability) plus making sure that crawler can reach the page at all (AI Crawlers).

5. Sitemap.xml ≠ llms.txt ≠ robots.txt — three files, three jobs

Three root-level files appear in any complete crawler-and-discovery setup, and they are routinely confused. Each one does exactly one job and refuses to do the others.

| File | What it does | What it does not do |
|---|---|---|
| robots.txt | Access: may a bot fetch this path at all | Does not list URLs, signal freshness, or claim completeness |
| sitemap.xml | Discovery + completeness: here is everything I want indexed | Does not curate, does not grant access, does not act as a quality signal |
| llms.txt | Curation + clean rendering: read these pages first, in clean markdown | Does not grant or deny access, does not claim completeness, does not signal indexing intent |

The load-bearing line: sitemap.xml is not a curation file, not an access rule, not a “best of” list. It is a completeness manifest. Trying to slim sitemap.xml down to “only the pages I want AI engines to see” misreads the protocol’s job — curation belongs to llms.txt; sitemap.xml must reflect everything you want indexed, or the engine treats the omission as a coverage gap. The same direction inverted gives the llms.txt §4 three-file table — read together they are the canonical sitemap-vs-llmstxt disambiguation, mirrored from each protocol’s own side.
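
The split between access and discovery can be seen by parsing a robots.txt file with Python's standard-library `urllib.robotparser`. In this sketch the robots.txt body, the `ExampleBot` user agent, and the URLs are made up for illustration; `site_maps()` requires Python 3.8 or later.

```python
from urllib.robotparser import RobotFileParser

# A made-up robots.txt: one access rule plus one sitemap declaration.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/

Sitemap: https://example.com/sitemap.xml
"""


def read_robots(text):
    """Parse robots.txt text without fetching anything over the network."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


parser = read_robots(ROBOTS_TXT)

# Access is answered per path and per user agent.
blocked = parser.can_fetch("ExampleBot", "https://example.com/private/report")
allowed = parser.can_fetch("ExampleBot", "https://example.com/guide")

# Discovery is a pointer to the sitemap; robots.txt itself lists no pages.
sitemaps = parser.site_maps()
```

The access answers come from the `Disallow` rule alone, and the sitemap pointer says nothing about whether any listed URL may be fetched, which is the separation of jobs the table describes.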

6. Anti-patterns — when submission backfires or wastes effort

Each anti-pattern below sounds right and fails because it confuses a protocol’s scope, a participating-engine list, or a file’s job.

| Anti-pattern | Why it sounds right | Why it actually fails |
|---|---|---|
| “Push IndexNow → appear in ChatGPT, Perplexity, or Claude” | IndexNow is an open standard, and some AI vendors say they “respect web standards” | The IndexNow participant list (indexnow.org) is Bing, Yandex, Naver, Seznam, and Yep; no AI vendor’s crawler is on it. IndexNow → Bing → Copilot is the only path that reaches any AI surface |
| “Curate sitemap.xml to send AI engines only the best pages” | Curation feels GEO-aligned: quality over quantity | Misreads sitemap.xml’s job. The protocol is completeness: every URL you want indexed. An engine treats a curated subset as a coverage signal, not a quality one. Curation is llms.txt’s job |
| “Bloat sitemap.xml with noindex or canonical-to-elsewhere URLs” | “More URLs = wider discovery” | Engines treat inconsistency between sitemap and on-page signals as quality-signal noise. Google’s sitemap doc names the URL classes that belong, and the ones that don’t (Build and submit a sitemap) |
| “Try IndexNow against Google anyway; it can’t hurt” | “Push is always better than pull; worst case, nothing happens” | Google has not adopted IndexNow since its 2021 testing announcement (Search Engine Land, 2021; status unchanged through 2024, per ppc.land). The submission goes nowhere, and the monitoring noise is real. For Google, use Search Console |
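
The bloat anti-pattern in the table lends itself to an automated check. The Python sketch below filters a page inventory down to URLs whose on-page signals agree with being listed. The `Page` record and the `sitemap_eligible` function are hypothetical, and the filter covers only the two URL classes named in the table (noindex and canonical-to-elsewhere), not the full guidance in Google's sitemap documentation.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Page:
    url: str
    noindex: bool = False            # page carries a noindex signal
    canonical: Optional[str] = None  # None means the page is self-canonical


def sitemap_eligible(pages):
    """Keep URLs that are indexable and canonical to themselves."""
    return [
        p.url
        for p in pages
        if not p.noindex and p.canonical in (None, p.url)
    ]


pages = [
    Page("https://example.com/guide"),
    Page("https://example.com/draft", noindex=True),
    Page("https://example.com/guide?ref=nav", canonical="https://example.com/guide"),
]
listed = sitemap_eligible(pages)
```

The filter removes inconsistency and does not curate: every indexable, self-canonical page stays in, which keeps the sitemap a completeness manifest.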

The line that closes the section: submission is a hygiene baseline, not a GEO lever. It earns you candidacy at the host index, which an AI surface may or may not reuse. The signals that decide whether you are actually picked once you are in the candidate pool live elsewhere — citability (chunk shape, quotable claims), E-E-A-T, entity recognition. Over-investing in the submission layer (curating sitemaps, pushing IndexNow at non-participating engines) does not move citation; under-investing forfeits candidacy. Get it correct and move on.

7. Why this matters for GEO + how to act

| Your intent | First stop |
|---|---|
| Get my site discovered by Google → AIO candidate eligibility | Google Search Console sitemap submission (Build and submit a sitemap) + `Sitemap:` directive in robots.txt |
| Get my site discovered by Bing → Copilot candidate eligibility | Bing Webmaster Tools sitemap + IndexNow integration (How to add IndexNow) |
| Push fresh URLs to Bing / Copilot in minutes | IndexNow per indexnow.org/documentation; see §2 for the minimal call |
| Get cited by ChatGPT, Perplexity, or Claude | No submission channel exists today. Invest in citability and confirm the retrieval crawler can reach you; see AI Crawlers |
| Audit index coverage and crawler access as part of a GEO sweep | GEO Audit: sitemap presence + crawler access are two checkpoints |
| Make sure a bot can parse the page once it gets there | SSR for AI Crawlers: a separate problem with a separate fix |
| Govern crawler access at the protocol level | AI Crawlers · robots.txt |
| Distinguish sitemap.xml from llms.txt and robots.txt | §5 above; the mirror table from llms.txt §4 |

The closing read, stated plainly: sitemap.xml is a hygiene baseline GEO never drops; IndexNow is a low-cost investment worth making for the share of your AI exposure that flows through Bing Copilot. Neither is a load-bearing GEO lever; those still live in citability, E-E-A-T, and entity recognition. Get the indexing layer correct, then spend marginal effort where citation is actually decided.

Frequently asked questions

Does submitting a sitemap or pushing IndexNow get my page cited by AI search engines?
Only via the host search index, and only on engines that reuse one. A sitemap helps Google index you, which feeds Google AI Overviews' candidate pool. A sitemap or IndexNow push helps Bing index you, which feeds Bing Copilot. ChatGPT Search, Perplexity, and Claude each run their own retrieval crawlers on their own schedules — their public crawler documentation does not mention sitemap.xml or IndexNow consumption ([OpenAI](https://platform.openai.com/docs/bots); [Anthropic](https://support.claude.com/en/articles/8896518-does-anthropic-crawl-data-from-the-web-and-how-can-site-owners-block-the-crawler); [Perplexity](https://docs.perplexity.ai/guides/bots)). Submission is necessary to be a candidate, not sufficient to be cited.
Does Google accept IndexNow pushes?
No, as of 2026-05. Google announced in November 2021 that it would test the protocol ([Search Engine Land](https://searchengineland.com/google-is-testing-the-indexnow-protocol-for-sustainability-375932)) but has not adopted it; independent coverage continues to treat the absence as the status quo ([ppc.land, 2024-12-30](https://ppc.land/googles-absence-from-indexnow-raises-questions-about-web-indexing-standards/)). For Google, the canonical submission channels remain the `Sitemap:` directive in robots.txt, Search Console sitemap submission, and URL Inspection — the limited Indexing API is restricted to job postings and livestream content.
Do ChatGPT Search, Perplexity, or Claude read sitemap.xml or IndexNow?
Their crawler docs do not document consumption of either. OpenAI's, Anthropic's, and Perplexity's published crawler pages discuss robots.txt, IP-range allowlisting, and access policy but say nothing about URL submission. The practical implication: their retrieval crawlers find your pages on their own crawl schedules, which means making sure the bot can reach the page ([AI Crawlers](/ai-crawlers)) and that the page is liftable once read ([Citability](/citability)) are the load-bearing investments — not submission protocols.
Should I curate sitemap.xml to send AI engines only my best pages?
No — that misreads the protocol. Sitemap.xml is a completeness signal: every URL you want indexed, in one canonical list. An engine treats a curated subset as a coverage gap, not a quality signal, and Google's own sitemap doc spells out which URLs belong ([Build and submit a sitemap](https://developers.google.com/search/docs/crawling-indexing/sitemaps/build-sitemap)). Curation is [llms.txt](/llms-txt)'s job; access is [robots.txt](/robots-txt)'s job; sitemap.xml is here-is-everything-I-want-indexed.
What's the right way to submit my URL to AI engines today?
There is no first-party 'submit to AI' channel for any major standalone answer engine. The practical chain is: submit to the host search index, and let the AI surface inherit. Google Search Console (for AIO) and Bing Webmaster Tools or IndexNow (for Copilot) are the only working channels. For ChatGPT, Perplexity, and Claude the work lies outside submission: keep your robots policy correct, return a clean SSR response when the retrieval crawler arrives ([SSR for AI Crawlers](/ssr-for-ai-crawlers)), and write the page so it is liftable when read ([Citability](/citability)).

Sources

Primary

  1. Sitemaps XML format — Protocol · sitemaps.org
  2. IndexNow — Documentation · indexnow.org
  3. IndexNow — homepage (participating engines) · indexnow.org
  4. How to add IndexNow to your website (Bing Webmaster Tools) · Microsoft Bing
  5. Build and submit a sitemap · Google Search Central · 2025-12-10
  6. Bing Webmaster Guidelines · Microsoft Bing
  7. Overview of OpenAI Crawlers (GPTBot / OAI-SearchBot / ChatGPT-User) · OpenAI
  8. Does Anthropic crawl data from the web, and how can site owners block the crawler? · Anthropic · 2026-04-07
  9. Perplexity Crawlers · Perplexity AI
  10. Google is testing the IndexNow protocol for sustainability · Search Engine Land · 2021-11-09

Secondary

  1. Google's absence from IndexNow raises questions about web indexing standards · ppc.land · 2024-12-30
  2. Cloudflare now supports IndexNow · Cloudflare · 2021
  3. IndexNow — new initiative by Microsoft and Yandex to push content to search engines · Search Engine Land
Last updated: 2026-05-23 Authors: Ray Yang Topic: Infrastructure