Technical GEO
Engineer — crawlers, llms.txt, schema, rendering, infrastructure.
- For
- Engineers who own site infrastructure (devops, platform, SEO eng).
- You should already know
- GEO 101 Step 1 (definition).
- After this path
- You can audit and harden a site's AI-crawler surface end to end.
- The AI crawler landscape
Before tuning anything, know who's actually hitting your origin. AI crawlers split into three categories with very different access consequences — training, retrieval, and user-triggered fetches — and the access decision is per-category, not per-bot.
- Per-bot deep dive: GPTBot, ClaudeBot, PerplexityBot
The three major AI crawlers worth profiling individually. Each one has its own User-Agent pattern, crawl frequency, and opt-out signal, so start with whichever one shows up in your traffic logs.
- robots.txt and access control
The first lever you actually pull. Write the rule wrong and you're either invisible to AI or being scraped without anything in return — and the protocol's allow/disallow precedence has corners that bite at scale.
- llms.txt
An emerging publishing convention for LLM consumption — cheaper than a sitemap, more semantic than robots.txt. A forward-compatible bet on adoption, not a confirmed citation channel.
- Schema.org and JSON-LD
Schema.org isn't a ranking or citation signal; it's the infrastructure that makes your pages parseable and your entities resolvable. JSON-LD is the recommended serialization; the other two formats, Microdata and RDFa, are legacy.
- Rendering: SSR vs CSR for AI crawlers
If your content needs JavaScript to render, the AI crawlers that don't execute JS will never see it. Core Web Vitals is a related-but-different concern: a direct lever for AI Overviews, mostly noise for ChatGPT and Perplexity.
- Sitemap & IndexNow
Sitemap.xml and IndexNow don't reach AI engines directly; they only travel through host search indexes: AI Overviews via Google, Copilot via Bing. ChatGPT, Perplexity, and Claude don't consume either one.
- Hands-on: a crawler-access audit
Bring the previous seven steps together with a real audit of your own domain. The six-layer dependency ladder is what turns scattered findings into a sprint-ready, ranked action plan.
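
As a first pass at such an audit, the robots.txt layer can be checked offline. The sketch below is one possible approach, not a prescribed method: it uses Python's standard-library `urllib.robotparser` against a hypothetical robots.txt body and reports which of the three profiled crawlers may fetch which paths. `SAMPLE_ROBOTS_TXT`, the sample paths, and the `audit_access` helper are illustrative assumptions. This parser applies rules in file order, which can differ from the longest-match precedence some crawlers use, so treat its answers as an approximation.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt body, used only for illustration.
SAMPLE_ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: ClaudeBot
Disallow: /private/

User-agent: *
Allow: /
"""

# User-Agent tokens of the three crawlers profiled in this path.
AI_CRAWLERS = ["GPTBot", "ClaudeBot", "PerplexityBot"]


def audit_access(robots_txt: str, paths: list[str]) -> dict[str, dict[str, bool]]:
    """Return {crawler: {path: allowed}} for the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {
        bot: {path: parser.can_fetch(bot, path) for path in paths}
        for bot in AI_CRAWLERS
    }


report = audit_access(SAMPLE_ROBOTS_TXT, ["/", "/private/report"])
for bot, results in report.items():
    for path, allowed in results.items():
        print(f"{bot:15} {path:18} {'allow' if allowed else 'block'}")
```

To run this against a live domain instead of the sample text, `RobotFileParser.set_url()` followed by `read()` fetches the real file. The per-crawler table it prints is a starting point for the access layer only; it says nothing about rendering or schema.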