Technical GEO

Engineer — crawlers, llms.txt, schema, rendering, infrastructure.

For: Engineers who own site infrastructure (devops, platform, SEO eng).
You should already know: GEO 101 Step 1 (definition).
After this path: You can audit and harden a site's AI-crawler surface end to end.

The AI crawler landscape
Before tuning anything, know who's actually hitting your origin. AI crawlers split into three categories with very different access consequences — training, retrieval, and user-triggered fetches — and the access decision is per-category, not per-bot.
Per-bot deep dive: GPTBot, ClaudeBot, PerplexityBot
The three major AI crawlers worth profiling individually. Each has its own User-Agent pattern, crawl frequency, and opt-out signal; start with whichever one appears first in your traffic logs.
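One quick way to see which bot to profile first is to count hits per crawler token in an access log. The sketch below is illustrative only: the sample log lines and the simplified User-Agent strings are invented, and `count_ai_bot_hits` is a hypothetical helper, not part of any tool. It assumes a plain substring match on the User-Agent is good enough for a first pass.

```python
from collections import Counter

# Crawler tokens named in this step; matching is case-insensitive.
AI_BOT_TOKENS = ("GPTBot", "ClaudeBot", "PerplexityBot")

def count_ai_bot_hits(log_lines):
    """Count log lines per AI crawler token (first matching token wins)."""
    hits = Counter()
    for line in log_lines:
        lowered = line.lower()
        for token in AI_BOT_TOKENS:
            if token.lower() in lowered:
                hits[token] += 1
                break
    return hits

# Invented sample lines with simplified User-Agent strings (assumption).
SAMPLE_LOG = [
    '203.0.113.5 "GET /docs HTTP/1.1" 200 "ExampleUA (compatible; GPTBot)"',
    '203.0.113.9 "GET /blog HTTP/1.1" 200 "ExampleUA (compatible; ClaudeBot)"',
    '203.0.113.5 "GET /pricing HTTP/1.1" 200 "ExampleUA (compatible; GPTBot)"',
    '198.51.100.7 "GET / HTTP/1.1" 200 "ExampleBrowser/1.0"',
]

hits = count_ai_bot_hits(SAMPLE_LOG)
top_bot = hits.most_common(1)[0][0]
```

A User-Agent header can be spoofed, so treat these counts as a starting signal rather than proof of who is crawling.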
robots.txt and access control
The first lever you actually pull. Write the rule wrong and you're either invisible to AI or scraped with nothing in return, and the protocol's allow/disallow precedence has corner cases that bite at scale.
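Before shipping a rule, it can help to test it against the bots you care about. The sketch below uses Python's standard `urllib.robotparser` on an inline robots.txt. The policy shown is an invented example, not a recommendation, and `access_report` is a hypothetical helper. Note that Python's parser applies the first matching rule in file order, while RFC 9309 specifies the most specific (longest) match, so overlapping Allow/Disallow rules may evaluate differently for a given crawler.

```python
from urllib.robotparser import RobotFileParser

# Invented example policy: block one bot, allow another, default for the rest.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: PerplexityBot
Allow: /

User-agent: *
Disallow: /private/
"""

def access_report(robots_txt, agents, url):
    """Return {agent: allowed?} for one URL under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {agent: parser.can_fetch(agent, url) for agent in agents}

report = access_report(
    ROBOTS_TXT,
    ["GPTBot", "ClaudeBot", "PerplexityBot"],
    "https://example.com/docs/intro",
)
```

Here ClaudeBot has no group of its own, so it falls through to the `*` group, which is one of the easy places to get an unintended result.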
llms.txt
An emerging publishing convention for LLM consumption — cheaper than a sitemap, more semantic than robots.txt. A forward-compatible bet on adoption, not a confirmed citation channel.
Schema.org and JSON-LD
Schema.org isn't a ranking or citation signal; it's the infrastructure that makes you parseable and your entities resolvable. JSON-LD is the recommended serialization; the other two formats, Microdata and RDFa, are legacy.
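As a rough illustration of what JSON-LD markup looks like, the sketch below builds a minimal Schema.org `Organization` object and wraps it in a script element. The organization name, URLs, and the `organization_jsonld` helper are all hypothetical; a real page would use whichever Schema.org types and properties describe its actual entities.

```python
import json

def organization_jsonld(name, url, same_as):
    """Build a <script> element carrying a minimal Organization in JSON-LD."""
    data = {
        "@context": "https://schema.org",
        "@type": "Organization",
        "name": name,
        "url": url,
        "sameAs": list(same_as),  # other profiles identifying the same entity
    }
    payload = json.dumps(data, indent=2)
    # Escape "</" so a stray "</script>" in a value cannot close the element.
    payload = payload.replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{payload}\n</script>'

# Hypothetical values for illustration only.
snippet = organization_jsonld(
    "Example Co",
    "https://example.com",
    ["https://example.org/profiles/example-co"],
)
```

Generating the block from one data structure, rather than hand-editing it per template, keeps the entity description consistent across pages.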
Rendering: SSR vs CSR for AI crawlers
If your content needs JavaScript to render, the AI crawlers that don't execute JS will never see it. Core Web Vitals is a related-but-different concern: a direct lever for AI Overviews, mostly noise for ChatGPT and Perplexity.
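A rough way to approximate what a crawler that doesn't execute JS sees is to parse the raw HTML and ignore anything inside script or style elements. The sketch below does that with the standard `html.parser` module. The two sample pages are invented, and `visible_without_js` is a hypothetical helper; a real check would run against the HTML your origin actually returns.

```python
from html.parser import HTMLParser

class VisibleText(HTMLParser):
    """Collect text nodes, skipping the contents of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth and data.strip():
            self.chunks.append(data.strip())

def visible_without_js(raw_html, phrase):
    """True if the phrase appears in the markup's text without running JS."""
    parser = VisibleText()
    parser.feed(raw_html)
    parser.close()
    return phrase.lower() in " ".join(parser.chunks).lower()

# Invented sample pages: one server-rendered, one client-rendered shell.
SSR_HTML = "<html><body><h1>Pricing</h1><p>Plans start at $10</p></body></html>"
CSR_HTML = (
    '<html><body><div id="root"></div>'
    '<script>render("Plans start at $10")</script></body></html>'
)

ssr_ok = visible_without_js(SSR_HTML, "Plans start at $10")
csr_ok = visible_without_js(CSR_HTML, "Plans start at $10")
```

In the client-rendered sample the phrase exists only inside the script, so the check fails even though a browser would display it.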
Sitemap & IndexNow
Sitemap.xml and IndexNow don't reach AI engines directly. They travel only through host search indexes: AI Overviews via Google, Copilot via Bing. ChatGPT, Perplexity, and Claude consume neither.
Hands-on: a crawler-access audit
Bring the previous seven steps together with a real audit on your own domain. The 6-layer dependency ladder is what turns scattered findings into a sprint-ready ranked action plan.