Technical GEO

Engineer — crawlers, llms.txt, schema, rendering, infrastructure.

For: Engineers who own site infrastructure (devops, platform, SEO eng).
You should already know: GEO 101 Step 1 (definition).
After this path: You can audit and harden a site's AI-crawler surface end to end.

  1. The AI crawler landscape

    Before tuning anything, know who's actually hitting your origin. AI crawlers split into three categories with very different access consequences — training, retrieval, and user-triggered fetches — and the access decision is per-category, not per-bot.

  2. Per-bot deep dive: GPTBot, ClaudeBot, PerplexityBot

    The three major AI crawlers worth profiling individually. Each has its own User-Agent pattern, crawl frequency, and opt-out signal. Start with whichever one shows up first in your traffic logs.

  3. robots.txt and access control

    The first lever you actually pull. Write the rule wrong and you're either invisible to AI or being scraped without anything in return — and the protocol's allow/disallow precedence has corners that bite at scale.

  4. llms.txt

    An emerging publishing convention for LLM consumption — cheaper than a sitemap, more semantic than robots.txt. A forward-compatible bet on adoption, not a confirmed citation channel.

  5. Schema.org and JSON-LD

    Schema.org isn't a ranking or citation signal; it's the infrastructure that makes you parseable and your entities resolvable. JSON-LD is the recommended serialization; the other two formats (Microdata and RDFa) are legacy.

  6. Rendering: SSR vs CSR for AI crawlers

    If your content needs JavaScript to render, the AI crawlers that don't execute JS will never see it. Core Web Vitals is a related-but-different concern: a direct lever for AI Overviews, mostly noise for ChatGPT and Perplexity.

  7. Sitemap & IndexNow

    Sitemap.xml and IndexNow don't reach AI engines directly. They only travel through host search indexes: AI Overviews (AIO) via Google, Copilot via Bing. ChatGPT, Perplexity, and Claude don't consume either file.

  8. Hands-on: a crawler-access audit

    Bring the previous seven steps together with a real audit on your own domain. The 6-layer dependency ladder is what turns scattered findings into a sprint-ready ranked action plan.
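
As a taste of what the audit in step 8 can start from, here is a minimal sketch of a robots.txt check for the three crawlers named in step 2. It uses Python's standard `urllib.robotparser`. The sample robots.txt, the helper name `audit_robots`, and the exact User-Agent tokens are illustrative assumptions; confirm the real tokens against each vendor's documentation. Note also that Python's parser applies the first matching rule in file order, which may differ from the longest-match precedence a given crawler follows, so treat the result as a first pass, not a verdict.

```python
from urllib.robotparser import RobotFileParser

# Assumed User-Agent tokens for the crawlers named in step 2;
# verify them against each vendor's published documentation.
AI_BOTS = ["GPTBot", "ClaudeBot", "PerplexityBot"]

# Hypothetical robots.txt used only for illustration.
SAMPLE_ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Disallow: /private/
"""


def audit_robots(robots_txt: str, path: str = "/", bots=AI_BOTS) -> dict:
    """Return {bot: True/False} for whether each bot may fetch `path`."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return {bot: parser.can_fetch(bot, path) for bot in bots}


if __name__ == "__main__":
    for bot, allowed in audit_robots(SAMPLE_ROBOTS_TXT, "/docs/").items():
        print(f"{bot}: {'allowed' if allowed else 'blocked'}")
```

To run the same check against a live site, fetch the robots.txt body yourself and pass the text to `audit_robots`; keeping the fetch separate makes the check easy to test offline.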