Most "AI SEO" advice you read this year is about how to get cited more often: structured data tricks, FAQ stuffing, RAG-friendly markup, schema clones of competitor pages, batch-generated city pages.
In China, the conversation has already moved past that. While Western tools race to dominate AI Overviews and ChatGPT citations, Chinese practitioners just published a 4,000-line internal manual treating Generative Engine Optimization as a compliance problem first, a growth problem second. The document, written by Yao Jingang (author of "AI Marketing: From SEO to GEO") and circulated as "the GEO Red Book", maps 55 specific behaviors that will get a brand fined, sued, or quietly de-ranked across Doubao, Kimi, Qwen, Yuanbao, and DeepSeek.
Western SEO teams are about 18 months behind on this conversation. The framework deserves attention, because the same regulatory pressure is already building under Google's spam policies, the EU AI Act, the FTC's AI guidance, and a dozen platform-level changes that nobody is reading carefully.
This is what the manual gets right, what it changes about how an AI SEO tool should operate, and the honest questions your vendor should be able to answer.
The framing shift: GEO is a knowledge-asset problem, not a ranking problem
The Red Book opens with a line worth quoting carefully: GEO is the practice of helping real, valuable information be understood and fairly cited by generative engines. Not the practice of "winning" AI search.
That distinction matters because it implies a completely different operating model. If your job is to win citations, your toolkit converges on volume, schema manipulation, and prompt-injectable content. If your job is to make true things legible to machines, your toolkit converges on fact registries, citation provenance, and content governance.
The Red Book then organizes risks into three layers, each with its own audit lens:
| Layer | What it asks | Where it shows up |
|---|---|---|
| Values & ethics | Is what we're trying to do legitimate? | Competitor disparagement, fake reviews, paid placements without disclosure |
| Information quality | Is what we're publishing true and traceable? | Hallucinated content, scraped citations, scaled programmatic pages, expired-domain hijacking |
| AI system safety | Are we exploiting model or retrieval weaknesses? | Prompt injection, RAG poisoning, vector manipulation, agent tool abuse |
Most Western "AI SEO" guides cover layer two halfway, ignore layer one, and have nothing useful to say about layer three. Most Chinese GEO vendors are now expected to cover all three.
The red-line formula
The Red Book proposes a simple judgment formula for any GEO action under consideration:
Red-line risk = illegitimate intent × factual distortion × technical manipulation × scope of impact × difficulty of reversal
Three things make this useful in practice:
- It is multiplicative. A high score on any single dimension drags the whole action toward the red zone, which kills the "but it's only a small lie" argument.
- It explicitly weights reversibility. If misinformation enters someone else's RAG knowledge base, gets indexed by three AI engines, and propagates across mirror sites, the cleanup cost is closer to permanent than to a takedown notice.
- It is intent-aware. The same action (publishing a comparison page) can score 2 or 5 depending on whether you supply verifiable evidence or fabricate it.
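The formula above can be sketched in a few lines of code. This is a minimal illustration, not the manual's own implementation: the 1-to-5 scales, the factor names, and the example scores are our assumptions.

```python
# Sketch of the Red Book's multiplicative red-line formula.
# Each factor is scored 1 (benign) to 5 (severe); the scales and
# example scores below are illustrative, not from the manual.
from math import prod

def red_line_risk(intent, distortion, manipulation, scope, irreversibility):
    """Product of the five factors; one high score inflates the total fast."""
    return prod([intent, distortion, manipulation, scope, irreversibility])

# A comparison page backed by verifiable public evidence:
honest = red_line_risk(intent=1, distortion=1, manipulation=1,
                       scope=3, irreversibility=2)
# The same page with fabricated benchmarks, syndicated widely:
fabricated = red_line_risk(intent=4, distortion=5, manipulation=2,
                           scope=4, irreversibility=4)

assert fabricated > honest  # "only a small lie" still multiplies through
```

Because the factors multiply rather than add, a single severe dimension (say, irreversibility once the claim enters third-party RAG indexes) dominates the score, which is exactly the point the manual is making.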
Western SEO frameworks usually stop at "is this against Google's spam policy?" That misses the more important question: if this gets quoted by an AI engine to a real buyer, will the resulting decision be one we are comfortable defending?
Nine risk categories worth memorizing
The 55 specific red lines collapse into nine categories. The original Chinese list maps cleanly onto Western policy and law (Google spam policy, FTC endorsement guides, EU AI Act, OWASP LLM Top 10). Treat this as an audit grid for your own content operation:
- Competitive ethics: anonymous attacks on competitors, fake reviews seeded across forums, biased "neutral" comparisons
- Trust fabrication: paid placements with no disclosure, fake awards, fabricated client logos
- Content quality: AI-hallucinated facts published as authoritative, batch low-value content, plagiarism with synonym swaps
- Search spam: keyword stuffing, hidden text, doorway pages, link farms, parasite SEO on rented sub-domains
- User safety: search poisoning, hidden redirects, malicious downloads
- LLM instruction risks: direct and indirect prompt injection, jailbreaks, multi-turn context hijacking
- RAG and retrieval risks: knowledge-base poisoning, vector manipulation, fake retrieval entries
- Agent and tool risks: tool-call manipulation, over-privileged agents, unsafe rendering of model output
- Data and supply chain risks: memory poisoning, training-data poisoning, model backdoors, sensitive-info leakage
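The nine categories work as a literal audit grid. Here is one hedged sketch of how a content operation might track findings against them; the category keys follow the list above, but the function and the example finding are hypothetical.

```python
# Hypothetical audit grid over the Red Book's nine risk categories.
# Category keys mirror the list above; everything else is illustrative.
RISK_CATEGORIES = [
    "competitive_ethics", "trust_fabrication", "content_quality",
    "search_spam", "user_safety", "llm_instruction",
    "rag_retrieval", "agent_tools", "data_supply_chain",
]

def audit_report(findings: dict[str, list[str]]) -> dict[str, list[str]]:
    """Return findings keyed by category, with every category present,
    so a clean category shows up explicitly as an empty list."""
    unknown = set(findings) - set(RISK_CATEGORIES)
    if unknown:
        raise ValueError(f"unrecognized categories: {unknown}")
    return {cat: findings.get(cat, []) for cat in RISK_CATEGORIES}

report = audit_report({"search_spam": ["doorway pages on /cities/*"]})
assert report["trust_fabrication"] == []  # clean categories stay visible
```

Forcing every category into the report, even when empty, is the design choice that matters: an audit that only lists what it found cannot prove what it checked.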
If your AI SEO vendor cannot describe how their product avoids each of these categories, you have a procurement problem, not a tooling preference.
What "white-hat GEO" actually requires
The Red Book proposes seven principles that are useful to copy verbatim into your own content policy. They are not aspirational. They are the floor.
- Truth first: every factual claim has verifiable evidence
- User benefit: optimization improves answer accuracy, not just brand visibility
- Source traceability: original source, publication date, update date, reviewer credentials
- Machine readability: structured data must match what is visible on the page
- Disclosure: paid relationships, affiliate links, sponsored comparisons are surfaced clearly
- Respect for competition: no claims about competitors that cannot be verified with public evidence
- Security by default: pages are treated as potential model input, with prompt-injection defenses and output sanitization
Principle four is the one most current AI SEO tools quietly violate. Schema that asserts a five-star rating, a FAQ that the page does not actually answer, a price that does not match the visible content. All of this is technically machine-readable. None of it is true.
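A check for principle four is straightforward to automate: whatever the structured data asserts must also appear in the visible page. This is a deliberately minimal sketch using plain strings; a real pipeline would render the page and use an HTML parser, and the example schema is invented for illustration.

```python
# Minimal consistency check for principle four: a rating asserted in
# JSON-LD must also appear in the page's visible text. Real pipelines
# would use an HTML parser; this sketch compares plain strings.
import json

def schema_matches_page(jsonld: str, visible_text: str) -> bool:
    data = json.loads(jsonld)
    rating = data.get("aggregateRating", {}).get("ratingValue")
    if rating is None:
        return True  # nothing asserted, nothing to contradict
    return str(rating) in visible_text

schema = '{"@type": "Product", "aggregateRating": {"ratingValue": "4.9"}}'
assert schema_matches_page(schema, "Rated 4.9 by 212 customers")
assert not schema_matches_page(schema, "Our customers love us")
```

The second assertion is the failure mode described above: machine-readable, confidently wrong.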
Why this matters for Western SEO teams in 2026
Three forces are converging:
- Google's spam policy enforcement is shifting from manual to automated. Scaled-content abuse, parasite SEO, and expired-domain hijacking are being penalized at site level, not page level. The Red Book's category four is already operational as a Google policy.
- The EU AI Act and FTC AI guidance are creating disclosure obligations. Synthetic content labeling, disclosure of automated decisioning, and algorithmic transparency requirements are no longer purely regional issues. If you sell into the EU, your AI-assisted content workflow now has a paper trail it did not have last year.
- AI search engines themselves are getting smarter about source quality. Perplexity, ChatGPT search, and Google AI Overviews are all moving toward source-level trust scoring. A site that gets its citations from D-tier sources (anonymous blogs, undisclosed affiliate roundups, expired-domain reanimations) is starting to look different from a site cited by A-tier sources, even when the page-level signals are similar.
The window for treating GEO as a pure visibility game is narrowing. Companies that get caught on the wrong side of any of the nine categories above are not going to get a polite warning. They are going to get a sudden ranking drop, a regulatory notice, or both.
Ten honest questions for your AI SEO vendor
Borrowed and adapted from the Red Book's vendor self-audit checklist. If your tool's answer to any of the first nine is yes, or it cannot give a clear account for the tenth, treat it as a hazard signal:
- Do you promise guaranteed AI search rankings or citation slots?
- Do you produce thousands of programmatic pages without per-page proof of real service or expertise?
- Do you build inbound links via private blog networks, expired domains, or rented sub-paths on high-authority sites?
- Do you collect customer data for "training" or "personalization" without a documented data-processing agreement?
- Do you produce comparison pages or rankings that include negative claims about competitors with no public evidence?
- Do you generate fake reviews, testimonials, awards, or media mentions?
- Do you embed instructions to AI crawlers in alt text, hidden HTML comments, or schema fields?
- Do you use bot-driven queries plus screenshots to manufacture proof of "AI visibility"?
- Are you unable to produce a full audit trail of fact-checking, editorial review, and post-publication monitoring?
- Have you had to take down published content because of platform or legal complaint, and if so, can you describe the process you used?
A vendor that answers all ten cleanly is rare. The point is to make the conversation explicit. Most procurement teams do not currently ask any of these questions of GEO vendors.
What we're doing differently
This is the part where most posts pivot to a product pitch. We're going to skip most of that, because the Red Book's framing is more useful than any feature list we could promote.
What we will say: at Auxora we treat the seven white-hat principles as product constraints, not marketing copy. Our GTM reports flag each finding against the nine risk categories, our content workflow surfaces source provenance for every recommendation, and we refuse certain categories of work outright (anonymous competitor attacks, schema that contradicts visible content, programmatic city pages without verified local presence). If you want to see how that shows up in a live audit, the methodology page walks through it. If you want to compare against the most popular alternatives, the vs-jasper comparison is a good place to start.
If you only take one thing from the Red Book: treat your GEO program like a knowledge-asset engineering project, not a ranking-trick project. Build a fact registry that other teams (legal, sales, customer success, security) can rely on. Then optimize the surface representation of those facts for AI engines. Skip the surface tricks.
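What a fact-registry entry might look like in practice: a sketch whose provenance fields (source, publication date, verification date, reviewer) mirror the traceability principle from earlier in the post. The field names, staleness window, and example record are our assumptions, not a prescribed schema.

```python
# Hypothetical fact-registry record; provenance fields mirror the
# source-traceability principle. Field names and the 180-day window
# are illustrative choices, not a standard.
from dataclasses import dataclass, field
from datetime import date

@dataclass
class Fact:
    claim: str
    source_url: str
    published: date
    last_verified: date
    reviewer: str
    evidence: list[str] = field(default_factory=list)

    def is_stale(self, today: date, max_age_days: int = 180) -> bool:
        """Flag claims that have not been re-verified recently."""
        return (today - self.last_verified).days > max_age_days

fact = Fact(
    claim="SOC 2 Type II audited",
    source_url="https://example.com/security",  # placeholder URL
    published=date(2025, 3, 1),
    last_verified=date(2025, 9, 1),
    reviewer="legal@example.com",               # placeholder reviewer
)
assert fact.is_stale(date(2026, 6, 1))  # 273 days since verification
```

The staleness check is what separates a registry from a wiki: facts that legal, sales, and security rely on need an expiry date, not just an author.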
Reading list
If you want the source material the Red Book draws from, this is the short version of the bibliography:
- GEO: Generative Engine Optimization (Aggarwal et al., KDD 2024) — the original academic paper on visibility mechanics in generative engines.
- Not what you've signed up for: Compromising Real-World LLM-Integrated Applications with Indirect Prompt Injection (arXiv:2302.12173) — the foundational indirect prompt injection paper.
- PoisonedRAG: Knowledge Corruption Attacks to Retrieval-Augmented Generation (USENIX Security 2025) — what RAG poisoning actually looks like in practice.
- Google Search Central: Spam policies for Google Web Search — the operational version of category four.
- OWASP Top 10 for LLM Applications 2025 — Western framework that overlaps significantly with the Red Book's third layer.
- MITRE ATLAS: Adversarial Threat Landscape for AI Systems — the threat-model lens for category six through nine.
The full Red Book itself is in Chinese and circulates inside Yao Jingang's wiki. If you read Chinese, the framework is worth the afternoon. If you don't, this post is the closest thing to a faithful summary I can offer without reproducing copyrighted material.
The shift it represents (from "rank in AI" to "be cited accurately by AI") is the part that matters. The categories and the formula are tools to get there.
