Writing for AI Citation
Quick facts
- Difficulty: Intermediate
- Time: ~2–6 hours per page, faster on re-writes
- Prerequisites: Citability, E-E-A-T
- What this is: Per-signal rewrite recipes (what to type, what to delete, what to leave alone) to make each passage liftable into an AI answer
- Recipe spine: Seven recipes mapped 1:1 to the seven citability signals; same five-element template per recipe (produces / mechanism / steps / before-after / pitfall)
- Output: A draft (or rewrite) that passes the chunk-extraction test on every sampled paragraph, plus a reusable MDX section template and pre-publish checklist
- Effort: ~2–6 hours per page on a fresh draft; faster on rewrites, where the audit playbook has already pre-localized the findings
- Necessary, not sufficient: Structure makes a passage liftable; trust decides whether the source is used at all. Pair every recipe with the E-E-A-T axis
1. What this playbook is
This playbook turns each of the seven citability signals into a writing recipe — what to type, what to delete, what to leave alone — so a draft can survive the chunk-extraction test the Citability audit runs on it. It is the rewrite the audit points to: where the audit returns a ⚠️ or ❌ on Signal N, the matching recipe in §4.N here is the move at the keyboard. Definitions of every signal live in Citability §4; this entry never re-derives them.
Microsoft puts the structural case plainly in February 2026: “Clear headings, tables, and FAQ sections help surface key information and make content easier for AI systems to reference accurately” (Bing AI Performance in Webmaster Tools). The empirical anchor sits in Aggarwal et al. 2024: content-substance rewrites — cite sources, add statistics, add quotations — measurably raised AI-answer visibility, while keyword stuffing did not. Take the direction as load-bearing; treat the magnitude as a bounded upper estimate, since competitive pressure shrinks single-actor lifts (C-SEO Bench).
The recipes assume you have already cleared two upstream gates: the page is retrievable at all (AI Crawlers; see Generative Engine Optimization for the full loop) and trustworthy enough to be selected (E-E-A-T). Structure without substance is detectable and penalized (AI Content Detection); a wall of citations on a thin domain still loses on trust. The seven recipes win on the third gate, grounding, and only there.
2. Before you write — four pre-draft decisions
Four decisions fix what every recipe in §4 means for the current page. Make them before the first sentence; this is the same decide-before-you-measure discipline that opens the Citability audit and AI Citation Tracking playbooks.
| Decision | Options | Rule of thumb |
|---|---|---|
| Surface | One page / one template / a content cluster / a locale | Write one coherent surface; mixed-surface scope produces text that no §4 recipe applies cleanly to |
| Query intent | Definitional / how-to / comparison / list / reference | Form follows intent; a how-to query demands a step list, a comparison demands a self-labeling table, a definitional query demands an inverted-pyramid lede |
| Draft mode | Fresh draft / rewrite of stale page / response to a citability-audit finding | Each mode has a different starting kernel; §5’s MDX skeleton assumes fresh-draft mode, but every recipe stands alone in audit-response mode |
| Audience surface | Practitioner-only / decision-maker / mixed | Affects paragraph density and the §4.7 quotable claim’s register — practitioner-only pages can carry tighter, jargon-dense claims |
Where to start. With an audit finding in hand, jump straight to the §4.N recipe matching the failed signal; walk §6’s pre-publish checklist on the rewrite and ship. On a fresh draft, walk §3 → §4 → §5 → §6 in order. On a rewrite triggered by stale content — a fact has drifted under you, an engine changed its surface, a competitor’s page now ranks above yours on the target query — treat the trigger as the surface-decision input and walk the fresh-draft path on the section that broke. The freshness cadence model is Content Freshness’s.
3. The seven-signal rewrite spine
One row per recipe — each is an H3 in §4. Failure shapes come from the Citability audit playbook’s per-signal micro-tables; definitions come from Citability §4.
| # | Recipe | What it produces | Failure shape it fixes | Definition |
|---|---|---|---|---|
| 1 | Self-contained chunks | Paragraphs that lift alone, subject and pronouns resolved within the paragraph | Pronoun chains, “as above” refs, dangling pointers to a prior diagram | Citability §4.1 |
| 2 | Inverted-pyramid sections | Claim in the first sentence of every H2/H3, justification after | Two or three paragraphs of preamble before the claim | Citability §4.2 |
| 3 | Question-shaped headings | H2/H3 matched to real user queries on Q&A-form pages | Topic headings no real user types; or invented FAQ no one asks | Citability §4.3 |
| 4 | Step lists | Numbered, imperative procedures, one action per step, lift-alone | Steps buried in prose (“first you should consider…”) | Citability §4.4 |
| 5 | Self-labeling tables | Captioned tables whose rows read alone, column headers full noun phrases | Tables whose rows mean nothing without the surrounding paragraph | Citability §4.5 |
| 6 | Heading discipline | One H1, clean H2 → H3 nesting, no skipped levels | Decorative headings, skipped levels (H2 → H4), duplicated H1s | Citability §4.6 |
| 7 | Quotable claims | One crisp standalone claim per H2 that survives extraction | Hedged, multi-clause filler sentences nothing can lift | Citability §4.7 |
Signals 1, 2, and 7 are the highest-leverage in practice; they govern whether any atom of the page is liftable, so every page wants them. Signals 3 and 4 are conditional on the page form — do not invent Q&A or steps where neither fits. Signal 5 scales with how tabular the content is. Signal 6 is cheap to audit and cheap to fix.
The recipes are surface-invariant — they win everywhere. What varies is which signal each engine weights highest, and that is §8’s surface-delta table. The Microsoft framing that anchors the whole spine: “The unit of value shifts from documents to groundable information — discrete, supportable facts with clear provenance” (Bing — Evolving role of the index, May 2026). The unit being written for is the passage, not the page.
4. The seven rewrite recipes
Each H3 below uses the same five-element template — what it produces / mechanism / recipe (numbered) / before-after pair / pitfall — so recipes stay comparable across signals.
4.1 Self-contained chunks
What it produces. Paragraphs that lift cleanly out of context — subject re-stated, pronouns resolved, no dangling pointers to neighbors.
Mechanism. The atomic unit of grounding is one liftable passage; engines retrieve and select at the chunk level, not the page level. A paragraph that needs the paragraph above it to make sense fails Citability §4.1 the moment it is lifted alone.
Recipe.
- Open every paragraph with its own subject — re-state the noun the section is about, do not chain a pronoun back to the previous paragraph.
- Resolve every pronoun within the same paragraph; if it refers to something three sentences back, swap it for the noun phrase.
- Replace “as above”, “as we saw”, “earlier” with the re-stated content itself, not a pointer to it.
- For cross-section references, add a parenthetical re-statement: “the precedence rule (longer path wins) applies”, not “the precedence rule from §2 applies”.
Before. “As above, it applies; but as we noted in §2, the precedence rules can override.”
After. “The robots.txt Disallow directive applies to every URL the user-agent block names. Where two rules conflict, the longer-path rule wins — see §2 for the full precedence walk-through.”
Pitfall. Over-chunking — splitting every paragraph into one-sentence atoms in pursuit of Signal 1. Google’s May-2026 stance is explicit: “There’s no requirement to break your content into tiny pieces for AI to better understand it” (AI Optimization Guide). Signal 1 measures coherent self-containment, not fragmentation; one-sentence paragraphs fail Signal 1 just as hard as multi-paragraph pronoun chains do. The §7 don’ts table catches the operational form of this anti-pattern, and AI Content Detection covers why over-chunked text reads as low-effort at scale.
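The self-containment recipe can be approximated with a small lint pass. The sketch below is a crude heuristic, not the audit's actual test: the function name `lint_self_containment`, the pointer-phrase list, and the pronoun list are all assumptions made for illustration, and a human read still decides whether a flagged paragraph really fails.

```python
import re

# Pointer phrases named in the recipe above; the list is an assumption, extend it as needed.
DANGLING_POINTERS = ("as above", "as we saw", "as we noted", "earlier")
# Pronouns that usually need an antecedent from a previous paragraph when they open one.
OPENING_PRONOUNS = {"it", "they", "them", "he", "she"}


def lint_self_containment(paragraph: str) -> list[str]:
    """Return human-readable findings for one paragraph (empty list means no findings)."""
    findings = []
    lowered = paragraph.lower()
    for phrase in DANGLING_POINTERS:
        # Word boundaries keep a phrase from matching inside a longer word.
        if re.search(rf"\b{re.escape(phrase)}\b", lowered):
            findings.append(f"dangling pointer: '{phrase}'")
    words = re.findall(r"[a-z']+", lowered)
    if words and words[0] in OPENING_PRONOUNS:
        findings.append(f"opens with pronoun: '{words[0]}'")
    return findings


if __name__ == "__main__":
    before = "As above, it applies; but as we noted in §2, the precedence rules can override."
    for finding in lint_self_containment(before):
        print(finding)
```

The check only looks for surface patterns, so it will miss a pronoun buried mid-paragraph whose antecedent lives elsewhere; treat it as a first filter before the manual lift-test.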
4.2 Inverted-pyramid sections
What it produces. Every H2 and H3 opens with the section’s claim; setup, justification, and qualifications come after.
Mechanism. Live-fetch engines — notably ChatGPT search — reward a direct answer near the top of the fetched page or section. Selection at the grounding step favors passages that answer immediately, because retrieval can only see the first window of the section before the model commits to a quote.
Recipe.
- Write the answer as the first sentence of every H2 and H3.
- Push setup and justification after the claim. If you cannot write the claim without first writing two paragraphs of build-up, the claim is not yet sharp enough.
- If your section opens with “Let’s consider…”, “There are several factors…”, or “First, it is important to understand…” — that is the rewrite trigger.
- The first sentence should be quotable on its own; §4.7 will reuse it as the section’s liftable claim.
Before. “There are several factors to consider when thinking about robots.txt. Many practitioners debate whether path-length or specificity should decide precedence. Eventually, after weighing both, the longer path wins.”
After. “In robots.txt, the longer-path rule wins when two Disallow directives conflict. The rest of this section walks the precedence model and the two edge cases where it does not.”
Pitfall. Writing the section twice — a “TL;DR sentence” that just restates the H2 title without making a claim, followed by the same content the section would have had anyway. The opening sentence must advance the page, not announce it. The page-level TL;DR sits in frontmatter and is rendered automatically by the layout; do not duplicate it as a body blockquote.
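The rewrite trigger in the recipe (a section that opens with setup instead of a claim) is easy to screen for mechanically. This is a minimal sketch under stated assumptions: the opener list is taken from the recipe, the sentence splitter is deliberately naive, and the function names are hypothetical.

```python
import re

# Openers quoted in the recipe as rewrite triggers.
PREAMBLE_OPENERS = (
    "let's consider",
    "there are several",
    "first, it is important to understand",
)


def first_sentence(section_body: str) -> str:
    """Return the text up to the first sentence-ending mark followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", section_body.strip(), maxsplit=1)
    return parts[0]


def opens_with_preamble(section_body: str) -> bool:
    """True when the section's first sentence starts with a known preamble opener."""
    # Normalize the typographic apostrophe so "Let’s" and "Let's" match the same opener.
    opener = first_sentence(section_body).lower().replace("\u2019", "'")
    return opener.startswith(PREAMBLE_OPENERS)


if __name__ == "__main__":
    before = ("There are several factors to consider when thinking about robots.txt. "
              "Many practitioners debate whether path-length or specificity should decide precedence.")
    print(opens_with_preamble(before))
```

A `False` result does not prove the first sentence makes a claim; it only rules out the three named openers, so the quotability read-back still applies.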
4.3 Question-shaped headings
What it produces. H2 and H3 phrased as questions on pages where the page form is genuinely Q&A — FAQ, troubleshooting, “when should I”, “why does X” — and matched to real queries.
Mechanism. Question-shaped headings line up with the sub-queries that query fan-out produces (see Answer Loop §3.1). Google AI Overviews and Gemini’s web-grounded mode especially reward heading-as-query matching, because the heading is one of the strongest retrieval signals on an index-based engine.
Recipe.
- Phrase H3s as questions only where the page form is genuinely Q&A. A definitional reference page does not become a Q&A page by re-titling each section; it just becomes harder to scan.
- Match real queries — pull from Search Console, support tickets, sales-call transcripts, or the engine’s “people also ask” panel. Invented questions are the §7 anti-pattern.
- Keep the answer’s first sentence (Signal 2) self-sufficient even if the heading is stripped — assume the engine quotes the answer without the heading.
- One question per heading; do not chain “X and Y — which?” into a single H3 unless the answer genuinely covers both with a single claim.
Before. “### Considerations regarding crawler precedence rules”
After. “### When two robots.txt rules conflict, which wins?”
Pitfall. FAQ-stuffing — twelve invented question headings at the bottom of every page in pursuit of Signal 3. Recognized as boilerplate, down-weighted as low-effort content, and the manufactured questions trip AI Content Detection at scale. The right count of Q-shape headings on a page is the number of real user queries the page answers — sometimes that is five, sometimes zero.
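One way to keep question headings honest is to score each one against the list of real queries you pulled from Search Console or support tickets. The sketch below uses plain token overlap (Jaccard similarity) as a stand-in; the function names, the sample queries, and any threshold you apply to the score are assumptions for illustration, and nothing here reflects how an engine actually matches headings.

```python
import re


def tokens(text: str) -> set[str]:
    """Lowercase alphanumeric tokens; 'robots.txt' becomes {'robots', 'txt'}."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a: set[str], b: set[str]) -> float:
    """Shared tokens divided by all tokens; 0.0 when either side is empty."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def best_query_match(heading: str, real_queries: list[str]) -> tuple:
    """Return (closest_query, score) for one heading; (None, 0.0) if there are no queries."""
    heading_tokens = tokens(heading)
    scored = [(query, jaccard(heading_tokens, tokens(query))) for query in real_queries]
    return max(scored, key=lambda pair: pair[1], default=(None, 0.0))


if __name__ == "__main__":
    # Hypothetical queries standing in for a Search Console export.
    queries = [
        "which robots.txt rule wins when two conflict",
        "how do I block a folder in robots.txt",
    ]
    print(best_query_match("When two robots.txt rules conflict, which wins?", queries))
    print(best_query_match("Considerations regarding crawler precedence rules", queries))
```

A heading that scores near zero against every real query is a candidate for either a topic-titled heading or deletion, not for rewording until it matches.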
4.4 Step lists
What it produces. Numbered, imperative procedures — one action per step, lift-alone, no surrounding prose required to interpret a step.
Mechanism. An ordered procedure lifts as one clean unit; engines preferentially quote numbered imperative lists when the user query is procedural (“how do I…”, “what are the steps to…”). Prose with embedded steps loses to lists with explicit steps on every engine.
Recipe.
- For any procedure, use a numbered list — never prose-with-embedded-steps.
- One action per step, imperative mood — “Run X”, not “You should run X” and not “X needs to be run”.
- Each step lifts alone — no “this depends on the previous step” without re-stating the prior state inline.
- If the procedure has branches, split into sub-procedures. Do not nest conditionals inside a single list; “if X then go to step 5” is the failure shape Signal 4 catches.
Before. “First you should consider whether the file already exists, and then it may be worth fetching the current version, after which you would edit the relevant directive, and finally re-deploy and verify.”
After.
1. Fetch the current robots.txt: `curl https://example.com/robots.txt`
2. Locate the `User-agent: *` block.
3. Add the line `Disallow: /private/` on its own line within that block.
4. Re-deploy the file at the site root.
5. Re-fetch the URL and confirm the new directive is present.
Pitfall. Numbering things that are not sequential — “1. Important context 2. Another consideration 3. A tip” — to inherit Signal 4’s lift-as-a-unit reward. The number implies execution order; using it for visual emphasis breaks the signal. If the items are not sequential, use a bulleted list or write prose.
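The three failure shapes the step-list recipe names (reader-addressed phrasing, passive phrasing, and in-list branching) can be flagged per step. This is a rough sketch: the function name `lint_step` and the exact patterns are assumptions, and the patterns cover only the phrasings quoted in the recipe.

```python
import re

# Openers that address the reader instead of naming the action (from the recipe's examples).
NON_IMPERATIVE_OPENERS = ("you should", "you would", "you might", "you may", "first you")


def lint_step(step: str) -> list[str]:
    """Return findings for a single step of a numbered procedure."""
    findings = []
    lowered = step.lower().strip()
    if lowered.startswith(NON_IMPERATIVE_OPENERS):
        findings.append("not imperative: open with the verb")
    if re.search(r"\bneeds? to be\b", lowered):
        findings.append("passive: name the action directly")
    if re.search(r"\bgo to step \d+\b", lowered):
        findings.append("branch inside the list: split into sub-procedures")
    return findings


if __name__ == "__main__":
    for step in ("Run X", "You should run X", "X needs to be run", "If X then go to step 5"):
        print(step, "->", lint_step(step))
```

The lint cannot tell whether a numbered list is genuinely sequential, which is the pitfall above; that judgment stays with the writer.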
4.5 Self-labeling tables
What it produces. Captioned tables whose rows are independently liftable — column headers are full noun phrases, cells are complete clauses, the caption above states the comparison axis.
Mechanism. Each table row is independently quotable when columns are self-labeling, because the engine can lift one row without the surrounding paragraph. Microsoft is explicit: “Clear headings, tables, and FAQ sections help surface key information and make content easier for AI systems to reference accurately” (Bing AI Performance).
Recipe.
- Every table gets a caption sentence above it stating the comparison axis (“When two robots.txt Disallow rules conflict, the longer path wins.”) — not a label, a full sentence.
- Column headers are full noun phrases (“Winning rule”, not “Wins?”), so each cell can be parsed without referring back to the column header from across the row.
- Each cell reads as a complete clause, not a one-word answer. “Yes” alone is not liftable; “Longer path wins per RFC 9309 §2.2.2” is.
- Lift-test one row: paste a single row into a fresh engine session and ask “what is this saying?”. If the engine cannot answer cleanly, the row depends on its neighbors — re-write.
Before.
| Rule | Wins? | Notes |
|---|---|---|
| Long | Yes | See above |
After.
When two robots.txt rules conflict, the longer matching path wins.
| Conflict scenario | Winning rule | Why |
|---|---|---|
| /private/ vs /private/docs/ for the same user-agent | /private/docs/ | Longest matching path wins per RFC 9309 §2.2.2 |
| An Allow rule and a Disallow rule of equal path length | The Allow rule | RFC 9309 §2.2.2 says the Allow rule should be used when the two rules are equivalent; declaration order does not decide |
Pitfall. Tables-as-decoration — using a two-column table where a sentence would do. Signal 5 measures liftable structured comparison, not visual layout. A two-row, two-column table that says “Option A: yes / Option B: no” is one English sentence; making it a table erodes Signal 1 without adding to Signal 5.
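The table recipe's checks (noun-phrase headers, no bare one-word answers, no pointers to surrounding text) can be run over a pipe table before the manual lift-test. The sketch below assumes a simple Markdown pipe table with a separator on its second line; the function names and the word lists are hypothetical.

```python
# Cells that are not liftable on their own, and pointer phrases that depend on neighbors.
BARE_ANSWERS = {"yes", "no", "n/a", "-", ""}
POINTERS = ("see above", "see below", "as above")


def parse_table(markdown: str) -> tuple[list[str], list[list[str]]]:
    """Split a pipe table into (header cells, body rows); line 2 is the |---| separator."""
    lines = [line.strip() for line in markdown.strip().splitlines() if line.strip()]

    def cells(line: str) -> list[str]:
        return [cell.strip() for cell in line.strip("|").split("|")]

    return cells(lines[0]), [cells(line) for line in lines[2:]]


def lint_table(markdown: str) -> list[str]:
    """Return findings for headers and cells that cannot be read alone."""
    header, rows = parse_table(markdown)
    findings = []
    for name in header:
        if name.endswith("?"):
            findings.append(f"header '{name}' is a question fragment, not a noun phrase")
    for number, row in enumerate(rows, start=1):
        for name, cell in zip(header, row):
            lowered = cell.lower()
            if lowered in BARE_ANSWERS:
                findings.append(f"row {number}, '{name}': bare answer '{cell}'")
            elif any(pointer in lowered for pointer in POINTERS):
                findings.append(f"row {number}, '{name}': points outside the row")
    return findings


if __name__ == "__main__":
    before = "| Rule | Wins? | Notes |\n|---|---|---|\n| Long | Yes | See above |"
    for finding in lint_table(before):
        print(finding)
```

Passing this lint does not replace pasting one row into a fresh engine session; it only removes the cheapest failures first.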
4.6 Heading discipline
What it produces. Clean H2 → H3 nesting, no skipped levels, no decorative headings, exactly one H1 per page (the frontmatter title).
Mechanism. Clean nesting makes passages addressable; engines retrieve and quote by section, and the heading is the strongest implicit caption a passage gets. Skipped levels and decorative headings break the retrievable section boundary that Signal 6 holds open.
Recipe.
- One H1 per page (the frontmatter `title` field; the layout renders it as the H1). Never repeat it in body.
- H2 → H3 only: no skipped levels (no H2 → H4 just to inherit smaller font size).
- Every heading names a real unit. If you cannot sentence-summarize what is under a heading, the heading is decorative and should be deleted or merged.
- Heading text is the section’s contract; the first sentence under it must deliver on that contract (Signal 2 + Signal 7 — usually the same sentence).
Before. H2 → H4 → H3 (skipped); decorative H3s like “### Diving in!”, “### Let’s explore”, “### A note”.
After. Clean H2 → H3 → H3 nesting throughout; topic-titled headings (“### When two rules conflict”, “### What the precedence model assumes”).
Pitfall. H3s used purely for visual rhythm — one-sentence sections under an H3 to “break up the page”. If the content under an H3 is one sentence, merge it back into the parent section. A heading with one sentence under it is decorative by construction.
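Skipped levels and stray body H1s are the most mechanical part of heading discipline, so they lend themselves to a script. The sketch below assumes ATX-style (`#`) headings in a Markdown or MDX body and assumes the layout supplies the H1 from frontmatter, as described above; the function names are hypothetical. Decorative headings still need a human read.

```python
import re

FENCE = "`" * 3  # marker that opens or closes a fenced code block


def heading_levels(markdown: str) -> list[tuple[int, str]]:
    """Return (level, text) for each ATX heading, ignoring lines inside fenced code."""
    headings = []
    in_fence = False
    for line in markdown.splitlines():
        if line.lstrip().startswith(FENCE):
            in_fence = not in_fence
            continue
        if in_fence:
            continue
        match = re.match(r"^(#{1,6})\s+(.*\S)\s*$", line)
        if match:
            headings.append((len(match.group(1)), match.group(2)))
    return headings


def lint_headings(markdown: str) -> list[str]:
    """Flag H1s in the body and any jump of more than one level downward."""
    findings = []
    previous = 1  # the frontmatter title is rendered as the page's only H1
    for level, text in heading_levels(markdown):
        if level == 1:
            findings.append(f"H1 in body: '{text}'")
            continue
        if level > previous + 1:
            findings.append(f"skipped level: H{previous} -> H{level} at '{text}'")
        previous = level
    return findings


if __name__ == "__main__":
    print(lint_headings("## A\n#### B\n### C"))
```

Running it on the H2 → H4 → H3 shape from the Before example reports the H2 to H4 jump and nothing else.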
4.7 Quotable claims
What it produces. One crisp standalone claim per H2 that survives extraction with attribution intact — usually the same sentence the §4.2 inverted-pyramid recipe puts at the top.
Mechanism. The smallest unit of citability is a single claim that survives lifting. Models preferentially quote crisp, standalone claims; the engine’s selection step penalizes hedged multi-clause sentences nothing can extract from.
Recipe.
- Write one liftable sentence per H2 — verb-first or subject-first, never bury the claim under a subordinate clause.
- Cut hedges. “It could perhaps be argued that, in some cases, X may not always Y” loses Signal 7 every time; “X does Y” wins it.
- Keep attribution inline — cite the source in the sentence itself (“per RFC 9309 §2.2.2”, “per Aggarwal et al.”), not in a footnote the engine will lose during retrieval.
- The claim should make sense even when the H2 heading is stripped; engines sometimes quote the sentence without its heading.
Before. “It could perhaps be argued that, in some cases, retrieval may not always lead to use.”
After. “Retrieval makes you a candidate; grounding decides if you are used.”
Pitfall. Manufactured statistics — “47% of marketers report…” with no source — to mimic a quotable claim. Unsourced numbers fail trust filtering on E-E-A-T and trip AI Content Detection; a quotable sentence with a fabricated number is worse than a hedged sentence, because the page now ships a citation-shaped trap. Every statistic must carry its source URL in the same sentence — that is what makes the claim quotable and trustworthy.
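Two of the quotable-claim rules are checkable by pattern: hedge phrases, and statistics that lack a source URL in the same sentence. The sketch below is an assumption-heavy first pass: the hedge list comes from the Before example, the percentage pattern only catches `N%` figures, the sentence splitter is naive, and the function names are hypothetical.

```python
import re

# Hedge phrases taken from the Before example above.
HEDGES = ("could perhaps", "in some cases", "may not always")
STATISTIC = re.compile(r"\d+(?:\.\d+)?\s?%")
URL = re.compile(r"https?://\S+")


def split_sentences(text: str) -> list[str]:
    """Naive split on ., ! or ? followed by whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def hedges_in(sentence: str) -> list[str]:
    """Return the hedge phrases present in one sentence."""
    lowered = sentence.lower()
    return [hedge for hedge in HEDGES if hedge in lowered]


def unsourced_statistics(text: str) -> list[str]:
    """Return sentences that contain a percentage but no URL."""
    return [s for s in split_sentences(text) if STATISTIC.search(s) and not URL.search(s)]


if __name__ == "__main__":
    # The first sentence is the deliberately unsourced example from the pitfall above.
    text = ("47% of marketers report higher reach. "
            "On average, 51.5% of generated sentences are fully supported by citations "
            "(https://arxiv.org/abs/2304.09848).")
    print(unsourced_statistics(text))
```

The URL check confirms only that a link is present, not that the link supports the number; verifying the source remains a manual step.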
5. A reusable section template
One annotated MDX skeleton — the canonical “good page section” the rest of the playbook leans on. Drop it into a new section, fill in the slots, and the section carries Signals 1, 2, 5, 6, and 7 simultaneously.
## H2 — topic-titled heading (Signal 6 + sets up Signal 3 if Q-shaped)
One-sentence quotable claim that resolves the H2's question or asserts the
section's load-bearing fact. (Signals 2 + 7 — usually the same sentence.)
Two or three sentences of substantive context that justify the claim,
each opening with its own subject and resolving its own pronouns within
the paragraph. (Signal 1.)
Caption sentence above the table — what is being compared, in one full
sentence (not a label).
| Column A (full noun phrase) | Column B (full noun phrase) | Column C (full noun phrase) |
|---|---|---|
| Self-contained cell clause | Self-contained cell clause | Self-contained cell clause |
| Self-contained cell clause | Self-contained cell clause | Self-contained cell clause |
(Signal 5.)
One closing sentence that *advances* — does not summarize — the section,
pointing to the next H2 or to an inline cross-reference where the reader
needs more depth. (Signal 1 + reader hand-off.)
Three rules to enforce when filling the template. The first sentence must be quotable in isolation — read it back to yourself with the heading stripped, and check that it still makes a claim. Every paragraph passes the lift-test — paste alone into ChatGPT search or Perplexity and ask “what is this passage saying?”. The table caption is a full sentence, not a label — “Comparison of robots.txt precedence rules” fails the caption rule; “When two robots.txt rules conflict, the longer matching path wins” passes it.
What the template is not. It is not a rigid form to copy mechanically; sections without comparisons should not invent a table just to fill the §5 slot. The signals are conditional on the page’s actual form — that is the §7 fake-fix warning, restated constructively. The CJK paragraph-length tolerance and section-opening conventions differ from English; the zh sibling of this playbook will localize the recipes idiomatically (see Multilingual GEO for the language axis).
6. The pre-publish checklist
Run this on the draft before pushing — it is the constructive complement to the Citability audit’s chunk-extraction test, run during writing rather than after publishing. Grouped by signal so a failure routes directly back to the matching §4 recipe.
SIGNAL 1 — Self-contained chunks
[ ] Each paragraph opens with its own subject (no pronoun without a same-paragraph antecedent).
[ ] Zero "as above / as we saw / see §X" without a re-stated noun phrase.
[ ] Three random paragraphs lifted alone pass the chunk-extraction test.
SIGNAL 2 — Inverted-pyramid sections
[ ] Every H2's first sentence states the section's claim.
[ ] Zero sections opening with "Let's consider..." / "There are several..."
SIGNAL 3 — Question-shaped headings (only where page form is Q&A)
[ ] Each question heading matches a real user query (Search Console / tickets / sales calls).
[ ] Zero invented "frequently asked" questions just to inflate Signal 3.
SIGNAL 4 — Step lists (only where the page contains a procedure)
[ ] Every procedure is a numbered list, imperative mood, one action per step.
[ ] Zero numbered lists for non-sequential content.
SIGNAL 5 — Self-labeling tables
[ ] Each table has a full-sentence caption above it.
[ ] Each column header is a full noun phrase; each cell is a complete clause.
SIGNAL 6 — Heading discipline
[ ] Clean H2 → H3 nesting; no skipped levels; no decorative headings.
[ ] One H1 per page (frontmatter title); no H1 in body.
SIGNAL 7 — Quotable claims
[ ] One crisp, hedge-free, attribution-intact claim per H2.
[ ] Every statistic has its source URL in the same sentence as the number.
TRUST (E-E-A-T overlap — not citability, but defeats a citable page)
[ ] Author identity and credentials visible on the page.
[ ] No manufactured statistics; every number is sourced or dropped.
A clean checklist pass is necessary, not sufficient. The E-E-A-T axis remains the trust half of grounding, and structure without substance is detectable (AI Content Detection). Walk the checklist top-down; the order is also the priority order — Signal 1 failures cost more than Signal 5 failures on most pages, and any TRUST-block failure overrides every signal above it.
7. The don’ts — patterns to avoid
The constructive complement to Citability §6 and Citability audit §7 — but framed at the writer’s keyboard, not at the auditor’s clipboard.
| Don’t | Looks like it fixes | Why it actually fails | Right move |
|---|---|---|---|
| Over-chunk every paragraph to one sentence | Signal 1 (self-contained) | Fragments lose meaning; nothing is a coherent liftable answer; Google says explicitly there is no requirement to do this | Aim for one coherent self-contained idea per paragraph — three to five sentences typical |
| Invent FAQ entries no real user asks | Signal 3 (Q&A) | Recognized as boilerplate, down-weighted as low-effort content, trips AI Content Detection at scale | Only Q-shape headings that match a real query from Search Console, tickets, or “people also ask” |
| Manufacture statistics (“47% of marketers…”) to look citable | Signal 7 (quotable claim) | Unsourced numbers fail trust filtering, and a citation-shaped trap is worse than a hedged sentence | Use a real source with the URL in the same sentence — or drop the claim |
| Copy boilerplate prose across sibling pages | Signal 1 at scale | Near-duplicate detection — siblings sharing prose lose ranking on both | Re-write per page; siblings share structure (the §5 template), never prose |
| Ship the LLM’s first draft verbatim | “Optimized” wording | Detectable as low-effort mass content; substance is missing even when shape looks right | Use the LLM to expand a human-written kernel, then trim by hand: human kernel → LLM expansion → human trim, never LLM-first |
| Add a “perfect” TL;DR that just restates the H2 title | Signal 2 (direct-answer block) | Does not advance the page; the engine sees redundancy and skips the section | The TL;DR is the claim, not the topic — see §4.2 before/after |
The AI-assisted drafting pattern, stated plainly. The mechanism Aggarwal et al. benchmarked is substance, not template-conformity; templates with empty content trip the very filter they tried to please. The safe direction is human-written kernel → LLM expansion → human trim, never LLM draft → light edit → ship. The LLM-first failure mode is also covered by AI Content Detection, and the bounded-magnitude reading of Aggarwal sits at the paper entry — take the direction, not the headline number.
Google’s May-2026 closing position: “There are no additional requirements to appear in AI Overviews or AI Mode, nor other special optimizations necessary” (AI Features and Your Website). The recipes in §4 are not special tricks — they are the structural form of clearly-written, well-sourced content, which is what every primary source on the topic ends up recommending.
8. How the recipes vary by surface
The signals are surface-invariant — they win everywhere — but each engine weights one or two highest. The writer’s question becomes: if I have to over-invest in one signal for my target engine, which one?
| Surface | Signal to over-invest in | Why | Recipe to prioritize |
|---|---|---|---|
| Perplexity | 1, 5, 7 | Citation-dense by design; rewards tight, liftable chunks and crisp quotable claims the most | §4.1, §4.5, §4.7 |
| ChatGPT search | 2 | Live fetch from the URL; rewards a direct-answer block near the top of the section | §4.2 |
| Google AI Overviews | 3, 6 | Index-based; rewards heading discipline and Q-shaped sub-queries that match fan-out | §4.3, §4.6 |
Language is a separate axis. Chunk and answer-block citability shifts in Chinese versus English — paragraph-length tolerance, punctuation rhythm, and section-opening conventions all differ. The zh sibling of this playbook localizes the recipes idiomatically; for the operational delta and the citability shifts that matter on Chinese-language engines, see Multilingual GEO.
Under competition, single-actor lifts shrink. The recipes win directionally on every surface, but the headline magnitudes from single-actor benchmarks are not promises — once competitors apply the same recipes, the average lift compresses (C-SEO Bench, NeurIPS ‘25 D&B). The right way to read a recipe is “will this make my page more liftable than the version without it”, not “will this produce a +40% lift in absolute terms”.
9. After you publish — close the loop
Three operational moves to run after the rewrite ships. None of them is a full H3; each one is a paragraph of instruction.
Re-run the chunk-extraction test on the live page. Pick three paragraphs at random — the TL;DR, one H2 first paragraph, one table row — lift each one verbatim, paste alone into a fresh ChatGPT search or Perplexity session, and ask “what is this passage saying?”. Every paragraph the engine hedges on, asks for context for, or completes incorrectly is a citability finding; route to the Citability audit playbook for the full diagnostic and back here for the signal-specific rewrite.
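Preparing the lift-test inputs can be scripted so the sampling is repeatable between runs. The sketch below only builds the prompts; it makes no call to any engine, and pasting each prompt into a fresh session stays manual. The function names, the blank-line paragraph split, and the `seed` parameter are assumptions for illustration.

```python
import random
from typing import Optional

PROMPT = "What is this passage saying?"


def sample_passages(page_text: str, k: int = 3, seed: Optional[int] = None) -> list[str]:
    """Pick up to k distinct paragraphs (split on blank lines) from the page text."""
    paragraphs = [p.strip() for p in page_text.split("\n\n") if p.strip()]
    rng = random.Random(seed)  # a fixed seed reproduces the same sample on re-runs
    return rng.sample(paragraphs, min(k, len(paragraphs)))


def build_lift_prompts(passages: list[str]) -> list[str]:
    """Pair each passage with the lift-test question, one prompt per fresh session."""
    return [f"{PROMPT}\n\n{passage}" for passage in passages]


if __name__ == "__main__":
    page = "\n\n".join(f"Paragraph {i} states one claim." for i in range(5))
    for prompt in build_lift_prompts(sample_passages(page, k=3, seed=7)):
        print(prompt)
        print("---")
```

Recording the seed alongside the findings lets a later re-run test the same passages after the rewrite.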
Instrument the page in your tracking set. Add the page’s target queries to the frozen prompt set defined in AI Citation Tracking §3, then wait two cadences before reading a trend — single-week deltas are noise. Watch Citation Rate and Average Position (defined in GEO Metrics), not vanity views. The first cadence after a rewrite often shows a transient dip while engines re-crawl and re-embed; the trend is what counts.
Distinguish citation from mention in the report. The page may be quoted with a link (citation) or paraphrased without one (mention); both matter, but they fix different breakages — see Citation vs Mention. A page that is mentioned but not cited usually means trust signals are passing while attribution is breaking, which is an E-E-A-T or attribution issue, not a citability one. The verifiability ceiling sits at “51.5% of generated sentences are fully supported by citations” (Liu et al., EMNLP ‘23) — that gap is the attribution problem this playbook does not solve.
When to re-write. The drift triggers are the same ones that opened §2 — a fact has drifted under you, an engine changed its surface, a competitor’s page now ranks above yours on the target query. The cadence model sits in Content Freshness; the operational loop is the §2 → §4 → §6 walk repeated on the section that broke.
10. Further reading
- Concept layer — Citability (the seven signals defined), E-E-A-T (the trust half — necessary if structure is to convert), Answer Loop (the four-step loop these recipes win step 3 of)
- Adjacent playbooks — Citability audit (the diagnostic this playbook remedies), Full GEO Audit (the Layer-4 hand-off), AI Citation Tracking (the measurement loop after publish)
- Per-engine surfaces — Perplexity, ChatGPT search
- Academic anchor — Aggarwal et al. 2024 — GEO: Generative Engine Optimization (the substance-over-tricks mechanism); bounded reading via C-SEO Bench; verifiability ceiling via Liu et al. 2023
- Trust / detection axis — AI Content Detection (why over-optimization fails detection), Multilingual GEO (language as a separate axis)
References
Academic:
- Aggarwal, P., Murahari, V., Rajpurohit, T., Kalyan, A., Narasimhan, K. & Deshpande, A. (2024). GEO: Generative Engine Optimization. KDD ‘24. arXiv:2311.09735 · ACM DL · in-project paper entry
- Puerto, H., Gubri, M., Green, T., Oh, S. J. & Yun, S. (2025). C-SEO Bench: Does Conversational SEO Work? NeurIPS ’25 Datasets & Benchmarks. arXiv:2506.11097
- Liu, N. F., Zhang, T. & Liang, P. (2023). Evaluating Verifiability in Generative Search Engines. Findings of EMNLP 2023. arXiv:2304.09848
Official platform documentation (verified 2026-05):
- Google Search Central — Google’s Guide to Optimizing for Generative AI Features on Google Search · A new resource for optimizing for generative AI in Google Search · AI Features and Your Website · Top ways to ensure your content performs well in Google’s AI experiences on Search
- Microsoft Bing — Evolving role of the index: From ranking pages to supporting answers · Introducing AI Performance in Bing Webmaster Tools (Public Preview)
- OpenAI — ChatGPT search Help Center
- Perplexity — What is an answer engine, and how does Perplexity work as one?
Sources
Primary
- GEO: Generative Engine Optimization (Aggarwal et al., KDD 2024) · arXiv / KDD '24 · 2024-08-25
- GEO: Generative Engine Optimization (KDD '24 Proceedings) · ACM SIGKDD · 2024-08-25
- Google's Guide to Optimizing for Generative AI Features on Google Search · Google Search Central · 2026-05-15
- A new resource for optimizing for generative AI in Google Search · Google Search Central · 2026-05-15
- AI Features and Your Website · Google Search Central · 2025-12-10
- Top ways to ensure your content performs well in Google's AI experiences on Search · Google Search Central · 2025-05-01
- Introducing AI Performance in Bing Webmaster Tools (Public Preview) · Microsoft Bing · 2026-02-10
- Evolving role of the index: From ranking pages to supporting answers · Microsoft Bing · 2026-05-06
- ChatGPT search — OpenAI Help Center · OpenAI
- What is an answer engine, and how does Perplexity work as one? · Perplexity AI
Secondary
- C-SEO Bench: Does Conversational SEO Work? (Puerto et al., NeurIPS '25 D&B) · arXiv / NeurIPS '25 D&B
- Evaluating Verifiability in Generative Search Engines (Liu et al., EMNLP '23 Findings) · arXiv / EMNLP '23 Findings