Reverse-Engineering Google AI Mode: What Discovery Engine Reveals About Google's AI Search Architecture

Let me be clear upfront: I’m not claiming Discovery Engine is exposing the full architecture of Google AI Mode & AIO. They’re different products serving different purposes.

But here’s what I am saying: When it comes to AI-powered search with a proper algorithm and tech stack, Google is clearly ahead. And they’re not hiding their technology. They’re selling it.

Google’s Discovery Engine is a product with a proper UI & API endpoints.

Click to see the Discovery Engine documentation here.

Click to see the Discovery Engine’s full aspects here. [HexDocs link is here, I know it sounds familiar :) ]

Before starting, let me show you a sneak peek at what’s inside.

    .... "description": "Optional. A unique identifier for tracking visitors. For example, this could be implemented with an HTTP cookie, which should be able to uniquely identify a visitor on a single device. This unique identifier should not change if the visitor logs in or out of the website. This field should NOT have a fixed value such as `unknown_visitor`. This should be the same identifier as UserEvent.user_pseudo_id and CompleteQueryRequest.user_pseudo_id The field must be a UTF-8 encoded string with a length limit of 128 characters. Otherwise, an `INVALID_ARGUMENT` error is returned.",
      "type": "string"
    },
    "contentSearchSpec": {
      "description": "A specification for configuring the behavior of content search.",
      "$ref": "GoogleCloudDiscoveryengineV1SearchRequestContentSearchSpec"
    },
    "rankingExpression": {
      "description": "Optional. The ranking expression controls the customized ranking on retrieval documents. This overrides ServingConfig.ranking_expression. The syntax and supported features depend on the `ranking_expression_backend` value. If `ranking_expression_backend` is not provided, it defaults to `RANK_BY_EMBEDDING`. If ranking_expression_backend is not provided or set to `RANK_BY_EMBEDDING`, it should be a single function or multiple functions that are joined by \"+\". * ranking_expression = function, { \" + \", function }; Supported functions: * double * relevance_score * double * dotProduct(embedding_field_path) Function variables: * `relevance_score`: pre-defined keywords, used for measure relevance between query and document. * `embedding_field_path`: the document embedding field used with query embedding vector. * `dotProduct`: embedding function between `embedding_field_path` and query embedding vector. Example ranking expression: If document has an embedding field doc_embedding, the ranking expression could be `0.5 * relevance_score + 0.3 * dotProduct(doc_embedding)`. If ranking_expression_backend is set to `RANK_BY_FORMULA`, the following expression types (and combinations of those chained using + or * operators) are supported: * `double` * `signal` * `log(signal)` * `exp(signal)` * `rr(signal, double \u003e 0)` -- reciprocal rank transformation with second argument being a denominator constant. * `is_nan(signal)` -- returns 0 if signal is NaN, 1 otherwise. * `fill_nan(signal1, signal2 | double)` -- if signal1 is NaN, returns signal2 | double, else returns signal1. Here are a few examples of ranking formulas that use the supported ranking expression types: - `0.2 * semantic_similarity_score + 0.8 * log(keyword_similarity_score)` -- mostly rank by the logarithm of `keyword_similarity_score` with slight `semantic_smilarity_score` adjustment. - `0.2 * exp(fill_nan(semantic_similarity_score, 0)) + 0.3 * is_nan(keyword_similarity_score)` -- rank by the exponent of `semantic_similarity_score` filling the value with 0 if it's NaN, also add constant 0.3 adjustment to the final score if `semantic_similarity_score` is NaN. - `0.2 * rr(semantic_similarity_score, 16) + 0.8 * rr(keyword_similarity_score, 16)` -- mostly rank by the reciprocal rank of `keyword_similarity_score` with slight adjustment of reciprocal rank of `semantic_smilarity_score`. The following signals are supported: * `semantic_similarity_score`: semantic similarity adjustment that is calculated using the embeddings generated by a proprietary Google model. This score determines how semantically similar a search query is to a document. * `keyword_similarity_score`: keyword match adjustment uses the Best Match 25 (BM25) ranking function. This score is calculated using a probabilistic model to estimate the probability that a document is relevant to a given query. * `relevance_score`: semantic relevance adjustment that uses a proprietary Google model to determine the meaning and intent behind a user's query in context with the content in the documents. * `pctr_rank`: predicted conversion rate adjustment as a rank use predicted Click-through rate (pCTR) to gauge the relevance and attractiveness of a search result from a user's perspective. A higher pCTR suggests that the result is more likely to satisfy the user's query and intent, making it a valuable signal for ranking. * `freshness_rank`: freshness adjustment as a rank * `document_age`: The time in hours elapsed since the document was last updated, a floating-point number (e.g., 0.25 means 15 minutes). * `topicality_rank`: topicality adjustment as a rank. Uses proprietary Google model to determine the keyword-based overlap between the query and the document. * `base_rank`: the default rank of the result",
      "type": "string"
    },
    "rankingExpressionBackend": {
      "description": "Optional. The backend to use for the ranking expression evaluation.",
      "type": "string",
      "enumDescriptions": [
        "Default option for unspecified/unknown values.",
        "Deprecated: Use `RANK_BY_EMBEDDING` instead. Ranking by custom embedding model, the default way to evaluate the ranking expression. Legacy enum option, `RANK_BY_EMBEDDING` should be used instead.",
        "Deprecated: Use `RANK_BY_FORMULA` instead. Ranking by custom formula. Legacy enum option, `RANK_BY_FORMULA` should be used instead.",
        "Ranking by custom embedding model, the default way to evaluate the ranking expression.",
        "Ranking by custom formula."
      ],
      "enumDeprecated": [
        false,
        true,
        true,
        false,
        false
      ],
      "enum": [
        "RANKING_EXPRESSION_BACKEND_UNSPECIFIED",
        "BYOE",
        "CLEARBOX",
        "RANK_BY_EMBEDDING",
        "RANK_BY_FORMULA"
      ]
    },
    "safeSearch": {
      "description": "Whether to turn on safe search. This is only supported for website search.",
      "type": "boolean"
    },
    "userLabels": {
      "description": "The user labels applied to a resource must meet the following requirements: * Each resource can have multiple labels, up to a maximum of 64. * Each label must be a key-value pair. * Keys have a minimum length of 1 character and a maximum length of 63 characters and cannot be empty. Values can be empty and have a maximum length of 63 characters. * Keys and values can contain only lowercase letters, numeric characters, underscores, and dashes. All characters must use UTF-8 encoding, and international characters are allowed. * The key portion of a label must be unique. However, you can use the same key with multiple resources. * Keys must start with a lowercase letter or international character. See [Google Cloud Document](https://cloud.google.com/resource-manager/docs/creating-managing-labels#requirements) for more details.",
      "type": "object",
      "additionalProperties": {
        "type": "string"
      }
    },
    "naturalLanguageQueryUnderstandingSpec": {
      "description": "Optional. Config for natural language query understanding capabilities, such as extracting structured field filters from the query. Refer to [this documentation](https://cloud.google.com/generative-ai-app-builder/docs/natural-language-queries) for more information. If `naturalLanguageQueryUnderstandingSpec` is not specified, no additional natural language query understanding will be done.",
      "$ref": "GoogleCloudDiscoveryengineV1SearchRequestNaturalLanguageQueryUnderstandingSpec"
    },
    "searchAsYouTypeSpec": {
      "description": "Search as you type configuration. Only supported for the IndustryVertical.MEDIA vertical.",
      "$ref": "GoogleCloudDiscoveryengineV1SearchRequestSearchAsYouTypeSpec"
    },
    "displaySpec": {
      "description": "Optional. Config for display feature, like match highlighting on search results.",
      "$ref": "GoogleCloudDiscoveryengineV1SearchRequestDisplaySpec"
    },
    "crowdingSpecs": {
      "description": "Optional. Crowding specifications for improving result diversity. If multiple CrowdingSpecs are specified, crowding will be evaluated on each unique combination of the `field` values, and max_count will be the maximum value of `max_count` across all CrowdingSpecs. For example, if the first CrowdingSpec has `field` = \"color\" and `max_count` = 3, and the second CrowdingSpec has `field` = \"size\" and `max_count` = 2, then after 3 documents that share the same color AND size have been returned, subsequent ones should be removed or demoted.",
      "type": "array",
      "items": {
        "$ref": "GoogleCloudDiscoveryengineV1SearchRequestCrowdingSpec"
      }
    },
    "session": {
      "description": "The session resource name. Optional. Session allows users to do multi-turn /search API calls or coordination between /search API calls and /answer API calls. Example #1 (multi-turn /search API calls): Call /search API with the session ID generated in the first call. Here, the previous search query gets considered in query standing. I.e., if the first query is \"How did Alphabet do in 2022?\" and the current query is \"How about 2023?\", the current query will be interpreted as \"How did Alphabet do in 2023?\". Example #2 (coordination between /search API calls and /answer API calls): Call /answer API with the session ID generated in the first call. Here, the answer generation happens in the context of the search results from the first search call. Multi-turn Search feature is currently at private GA stage. Please use v1alpha or v1beta version instead before we launch this feature to public GA. Or ask for allowlisting through Google Support team.",

Google Sells Its Search Technology

Google Cloud Discovery Engine (also called Vertex AI Search) is Google’s enterprise search product. It’s designed for companies that need AI-powered search for their internal data, documentation, product catalogs, or customer-facing applications.

This isn’t some obscure API buried in documentation. It’s available right now in Google Cloud Console. The first setup is time-consuming. You need to configure data stores, processing pipelines, and ranking controls but it’s all there, exposed in the UI.

And Google keeps updating it. New models. More signals. Better configuration options. They’re actively developing this product because enterprises pay for it.

Why This Matters for AI Search Optimization

Here’s my main point: We’re always looking for hard technical data about how AI search works.

We reverse-engineer ChatGPT’s citation system through browser DevTools. I analyze Perplexity’s ranking factors through systematic testing. We build hypotheses from fragments of information.

But with Google? The technical data lives in the Discovery Engine. The signal names. The ranking components. The chunking parameters. The model options. It’s documented, configurable, and visible.

I’m not saying AI Mode uses identical code. But Google built the Discovery Engine with Google’s engineering team, similar infrastructure, and a similar philosophy that powers their consumer products. When Discovery Engine exposes seven distinct ranking signals with names like “Gecko” and “Jetstream,” that tells us something about how Google thinks about AI search ranking.

We can learn from it.

Important Note: The Cloud Console mentions Gecko, but there is also an embedding model named Gecko and it’s deprecated now for API users.I will continue mentioning Gecko in this post. Since the Discovery Engine names it “Gecko score.”

I’ve been using the Discovery Engine tool for about a year. It keeps getting updated, and even though I use it only for my own website or my clients’ websites, I can say it is constantly evolving and improving. Many of the features I discuss in this content were added recently. I also had the opportunity to present a project I conducted through Discovery Engine, along with the insights I uncovered on the main stage at BrightonSEO in April 2025. In the conferences I’ll attend in 2026, I plan to dive into much more technical topics. This is only the visible side of the iceberg.

Time flies :)

The Four-Stage Pipeline: Prepare → Retrieve → Signal → Serve

After deep-diving into Discovery Engine’s configuration panels, I’ve mapped the four-stage pipeline Google uses, including the seven specific ranking signals exposed in the Signal Viewer.

These are actual configuration options, signal names, and architectural decisions that Google shows to enterprise customers.

Stage 1: Prepare, Query Understanding and Normalization

What happens here: The system normalizes the input query, picks up context, and routes it to the appropriate fulfillment pipeline.

Synonyms Configuration

Discovery Engine allows adding synonym control lists with time ranges. This is significant. It means Google can apply different synonym mappings for different time periods. Think elections, it’s adjustable.

Why does this matter? Query understanding isn’t static. The meaning of terms evolves, and Google can dynamically adjust how queries are expanded based on temporal context.

Autocomplete Configuration

The autocomplete settings reveal Google’s approach to query prediction:

Setting	Default/Options	Implication
Maximum suggestions	Configurable	Controls candidate query diversity
Minimum trigger length	Character-based	Determines when prediction kicks in
Matching order	”Suggestion start with the term” OR
”Suggestion can start from anywhere in the term”	Prefix matching vs. substring matching
Query suggestions model	Multiple sources (see below)	Different data sources for suggestions
Deny list	Configurable	Explicit blocking of unwanted suggestions

Query Suggestions Model Options:

Automatic: System decides the best source
Document: Suggestions derived from indexed document content
**Search History:**Suggestions based on past searches
**User Events:**Suggestions driven by user interaction data
Completable Fields: Suggestions from schema-defined completable attributes

The “matching order” setting is particularly interesting. Google offers two options:

Prefix matching: Suggestions must start with the typed term
Substring matching: Suggestions can match anywhere in the term

This toggle suggests Google’s query understanding can operate in both modes. For AI Mode, substring matching would allow more flexible query interpretation, connecting partial inputs to broader concept matches.

Schema Configuration: How Google Indexes Structured Data

This section will look familiar if you followed Mark Williams-Cook’s coverage of the Google ranking leak. Discovery Engine exposes a schema(boolean, array) & prediction configuration panel with field-level controls that mirror what we saw in those leaked documents.

In the Data Store → Website → Schema section, each field can be configured with these attributes:

Attribute	Function
Field Name	The identifier for this data field (e.g., `Product Name`, `Customer ID`, `Author`)
Type	Data type: text, numbers, dates, or boolean (yes/no)
Array	Whether the field holds multiple values (e.g., `["Waterproof", "Durable", "Lightweight"]`) or a single value
Searchable	Improves recall for this field in search queries. Only text fields can be searchable.
Indexable	Enables filtering, ordering, and faceting by this field. Object fields cannot be indexable.
Retrievable	Returns this field in search results

Why this matters:

Google explicitly separates Searchable, Indexable, and Retrievable as distinct properties. This means:

A field can be searchable but not retrievable: it influences ranking but isn’t shown to users
A field can be indexable but not searchable: it can be filtered/sorted but doesn’t contribute to text matching
A field can be retrievable but not indexable: it’s returned in results but can’t be used for filtering

This three-way distinction reveals how Google thinks about structured data processing. Your schema markup isn’t just “indexed or not”, different properties of your structured data serve different functions in the search pipeline.

Optimization Implications

This stage happens before your content is even considered. The query gets transformed, expanded, and contextualized through time-aware synonym mappings.

Actionable insight: Your content needs to cover the full semantic field around your target topics across time. Historical terminology and current terminology both matter; Google’s time-ranged synonyms mean they’re actively managing semantic drift.

Stage 2: Retrieve, Document Selection and Processing

What happens here: The system finds the most relevant documents from the data store based on the processed query and applies document-level processing.

Data Store Configuration

Discovery Engine accepts multiple data sources: Search Console verified websites, downloaded HTMLs, document datasets, and sitemaps.

Document Processing: The Chunking Pipeline

This is where it gets technical. Discovery Engine’s processing configuration gives some hints on how Google chunks documents for retrieval:

Parser Options:

Default document parser
Layout parser with table annotation and image annotation
Gemini enhancement (Preview) Using LLM to enhance document understanding (And just wow, for layout parsing, can AI help with it???)

Chunking Configuration:

Advanced chunking can be enabled
Chunk size limit: 500 tokens (maximum, not fixed)
Option to include ancestor headings in chunks

The 500-token chunk size limit tells us Google Discovery Engine caps retrieval segments at roughly 375 words maximum. Chunks can be smaller, but won’t exceed this limit. This suggests content should be structured so that key information is self-contained within ~500 token sections.

The “include ancestor headings in chunks” option is significant when enabled; heading hierarchy travels with each chunk. If a chunk comes from content under H1 > H2 > H3, those headings are preserved as context. This means your heading structure isn’t just for readers; it can provide topical context to each retrieved segment. (WOW!)

The Gemini enhancement option is a preview feature that uses LLM processing during indexing. This suggests Google is already using AI to improve document understanding at the retrieval stage, not just the serving stage.

Optimization Implications

Actionable insights:

Structure content in sections under ~500 tokens. Each major point should be self-contained within roughly 500 tokens (about 375 words) since that’s the maximum retrieval unit size.
Use clear heading hierarchies. When ancestor headings are included, chunks carry their H1/H2/H3 context. A well-structured heading hierarchy means your chunks are topically labeled.
Optimize tables and images. Table and image annotation means structured data in tables and descriptive image content are being parsed and indexed.
Submit comprehensive sitemaps. Google accepts sitemap data for the data store, confirming that sitemap signals influence retrieval.

Stage 3: Signal, The Seven Ranking Signals Revealed

What happens here: Retrieved documents are processed through multiple signal layers to produce a final ranked list. This is where the magic happens and Discovery Engine’s Signal Viewer exposes the exact signals.

The Seven Ranking Signals

Discovery Engine’s Signal Viewer shows a table with these columns for every search result:

Signal	Description	What It Measures
Base Ranking	Initial relevance score from the core ranking algorithm	Pre-adjustment relevance
Embedding Adjustment	Semantic similarity between query and document embeddings	Gecko score: Google’s embedding model
Semantic Relevance	Cross-attention model score	Jetstream: handles context and negation better than embeddings
Keyword Matching	Frequency and relevance of query keywords	BM25 or a similar algorithm
Predicted Conversion	Likelihood of user engagement	Three-tier system: Popularity → PCTR → Personalized PCTR
Freshness	Recency score	Time-sensitive query adjustment
Boost/Bury	Manual business rule adjustment	Explicit promotion/demotion

This is the actual ranking stack. Let me break down what each signal means for optimization:

Signal 1: Base Ranking

The initial relevance score before any adjustments. This is likely Google’s traditional ranking algorithm output, the foundation that everything else modifies.

Signal 2: Embedding Adjustment (Gecko Score)

Google’s Gecko embedding model measures semantic similarity between the query and document embeddings. This is the vector similarity score.

Optimization implication: Pure keyword matching isn’t enough. Your content needs to be semantically aligned with query intent at the embedding level.

Signal 3: Semantic Relevance (Jetstream)

This is huge. Google uses a cross-attention model called Jetstream that “better understands context and negation compared to embeddings.”

Good news: Jetstream is open-source. https://github.com/AI-Hypercomputer/JetStream

Interesting news: It’s designed for TPUs.

Cross-attention models process query and document together, allowing for nuanced understanding that pure embedding similarity misses. The explicit mention of negation handling suggests Jetstream can understand “not X” and “without Y” patterns that embeddings struggle with. (MUVERA???)

Optimization implication: Write content that explicitly addresses what something is and what it isn’t. Negation-aware ranking means clarifying distinctions matters.

Signal 4: Keyword Matching (BM25)

Classic keyword frequency and relevance scoring, likely BM25 or a variant. This confirms that traditional keyword optimization isn’t dead; it’s one of seven signals in the stack.

Optimization implication: Keywords still matter. They’re not the only signal, but they’re explicitly in the ranking stack.

Signal 5: Predicted Conversion (PCTR/PCVR)

It’s very interesting because Google uses the word “prediction” with a scoring in a public service.

This is the engagement signal. PCTR (Predicted Click-Through Rate) and PCVR (Predicted Conversion Rate) are calculated from historical user interaction data.

Discovery Engine’s Data Quality section reveals this isn’t a single signal It’s a three-tier system with quality thresholds:

TIER 1: Popularity Derived from user interactions across documents. The more users interact with a document, the stronger the boost. Key details:

Aggregated across all datastores with events in the Google Cloud project
If one datastore has sufficient events, the signal is enabled project-wide
BUT: Documents in datastores without events don’t get boosted, even if the project-level threshold is met

TIER 2: PCTR Model (Predicted Click-Through Rate) Predicts the probability of a user clicking on a document given the context. Google explicitly states this is “an important factor considered in ranking.”

App-specific (tied to a particular search application)
Only counts events from datastores linked to the current app
Has minimum and optimal thresholds for data quality

TIER 3: Personalized PCTR Model Incorporates user-specific signals: user metadata, individual search history, and behavioral patterns.

Only activates after 100,000+ queries have been served
User-specific, not just document-specific
Represents the highest tier of engagement signal sophistication

Each tier shows Optimal, Suboptimal, and Blocking status, meaning there are hard thresholds where the signal either works well, works partially, or is disabled entirely due to insufficient data.

Optimization implication: Engagement isn’t just one signal. It’s a layered system. Basic popularity comes first, then predicted CTR modeling, then personalized predictions. Sites with more user interaction data unlock higher tiers of ranking signals.

Signal 6: Freshness

Recency scoring is adjusted based on query type. Discovery Engine notes this is “especially important for time-sensitive queries.”

Optimization implication: Freshness is query-dependent. For time-sensitive topics, recent content gets boosted. For evergreen topics, freshness may matter less.

Signal 7: Boost/Bury

Manual adjustment from -1 to +1 based on business rules. The configuration interface allows:

Selecting a data store
Applying filters using AND/OR/NOT operators with grouping via parentheses
Exact phrase matching with quotes
Slider adjustment from Bury (-1) to Boost (+1)

Optimization implication: Google can (and does) apply manual ranking adjustments based on categorical rules. This is likely how E-E-A-T categories and source authority are implemented, as boost/bury rules on source attributes.

How These Signals Combine

The Signal Viewer shows all seven signals as separate columns, with Position as the final output. This means Google is using **multi-signal fusion,**likely a learned combination of all seven signals to produce the final ranking.

This is consistent with what I’ve found in other AI search systems. ChatGPT uses Reciprocal Rank Fusion (RRF). Perplexity uses 59+ factors in an XGBoost reranker. Google’s approach appears to be a seven-signal fusion model.

Stage 4: Serve, Answer Generation and Content Controls

What happens here: The system generates and formats the final answer, applying LLM synthesis and safety filtering.

Search Type Configuration

Discovery Engine offers three search types that map conceptually to Google’s consumer products:

Discovery Engine Setting	Likely Google Consumer Equivalent
Search - list with results	Traditional Google Search
Search with an answer	AI Overviews
Search with follow-ups	AI Mode

The naming and functionality alignment is too close to be coincidental. “Search with follow-ups” describes exactly what AI Mode does. A conversational search experience with multi-turn context.

LLM Configuration

The serving layer allows model selection and customization:

Model Selection: Gemini 2.5 Flash is the default, but any model can be selected. This means Google can swap underlying models without changing the pipeline.

Answer Customization:

Custom instructions for tone, style, and length
Language auto-detection or manual selection
Related questions toggle
Image source selection for answer images

The “custom instructions” field is significant. It suggests that AI Mode’s answer style is prompt-engineered, not hardcoded. Google can adjust how answers are generated through instruction tuning.

Safety and Quality Gates

Several filters act as final gates before serving:

Filter	Function
Ignore no answer summary	Skip “no summary available” messages
Ignore Adversarial Query	Block LLM responses on detected adversarial queries
Ignore low relevant content	Prevent LLM answers when content relevance is too low

The “low relevant content” filter is important. Even if content makes it through the Signal stage, it can be blocked at Serve if the LLM determines it’s not relevant enough to ground an answer.

Optimization Implications

Actionable insights:

Optimize for grounded synthesis. The LLM is constrained to ground answers in retrieved content. Your content needs clear, extractable statements.
Avoid adversarial patterns. Content that triggers adversarial detection gets filtered. Write straightforwardly without manipulation patterns.
Maintain high relevance density. Low relevance content gets filtered. Every section should contribute to query relevance.
Structure for multi-turn. AI Mode supports follow-ups. Content that answers related questions in the same topic cluster may have advantages.

The Complete Picture: What Discovery Engine Tells Us About AI Search

Here’s the flow based on Discovery Engine’s architecture and likely a close approximation of how Google approaches AI search more broadly:

Stage 1 - Prepare: User query → Time-aware synonym expansion → Autocomplete prediction model → Transformed query with context

Stage 2 - Retrieve: Transformed query → Data store lookup → Chunked retrieval (up to 500 tokens per chunk, optionally with ancestor headings) → Layout-parsed tables/images → Gemini-enhanced document understanding → Candidate set

Stage 3 - Signal: Candidate set → Seven-signal scoring:

Base Ranking (core algorithm)
Gecko embedding similarity
Jetstream cross-attention relevance
BM25 keyword matching
Engagement signals (Popularity → PCTR → Personalized PCTR tiers)
Freshness scoring
Boost/Bury business rules → Multi-signal fusion → Final ranked list

Stage 4 - Serve: Top N results → Gemini 2.5 Flash (or selected model) → Custom instruction application → Adversarial/low-relevance filtering → Grounded answer generation → Related questions → Rendered AI Mode response

What This Means for AI Search Optimization

Discovery Engine is Google’s enterprise search product built on the same infrastructure and engineering philosophy as their consumer products. The configuration options reveal how Google thinks about AI-powered ranking.

Is this exactly how AI Mode works? I can’t say for certain. But when the same company exposes seven named ranking signals, specific chunk sizes, and three search types that align with their consumer product tiers, that’s technical intelligence worth studying.

Key Takeaways

1. Seven explicit signals, not a black box

The Signal Viewer exposes seven distinct ranking signals. This isn’t a single neural network making opaque decisions. It’s a multi-signal fusion system with interpretable components.

2. Gecko + Jetstream = Semantic layer

Google uses both embedding similarity (Gecko) and cross-attention (Jetstream) for semantic understanding. Jetstream’s negation handling suggests nuanced query understanding beyond simple similarity.

3. PCTR is a three-tier system, not a single signal

Engagement signals operate in tiers: Popularity → PCTR → Personalized PCTR. Each tier has quality thresholds and unlocks with more user interaction data. Personalized ranking only activates after 100,000+ queries. (And of course, this is about the Discovery Engine)

4. 500-token chunk limit with optional heading context

The retrieval unit has a ~500 token maximum, with an option to include ancestor headings. Structure your content accordingly.

5. Three product tiers from one pipeline

Traditional search, AI Overviews, and AI Mode are all served from the same pipeline with different configurations. The architecture is unified.

6. Adversarial and relevance gates

Final safety filters can block content even after it ranks well. Write naturally and maintain high relevance throughout.

Next Steps

Discovery Engine won’t give you the exact weights Google uses in AI Mode. But it gives you something almost as valuable: a window into how Google engineers think about AI search.

The seven signals are named. The chunk size limit is defined. The pipeline stages are visible. That’s more technical intelligence than we get from any other source.

The question is whether you’re paying attention.

Have questions about AI Mode optimization? Follow for more reverse-engineering insights from Google’s own products and documentation. Connect with me on X and LinkedIn.

Google Sells Its Search Technology

Why This Matters for AI Search Optimization

The Four-Stage Pipeline: Prepare → Retrieve → Signal → Serve

Stage 1: Prepare, Query Understanding and Normalization

Synonyms Configuration

Autocomplete Configuration

Schema Configuration: How Google Indexes Structured Data

Optimization Implications

Stage 2: Retrieve, Document Selection and Processing

Data Store Configuration

Document Processing: The Chunking Pipeline

Optimization Implications

Stage 3: Signal, The Seven Ranking Signals Revealed

The Seven Ranking Signals

Signal 1: Base Ranking

Signal 2: Embedding Adjustment (Gecko Score)

Signal 3: Semantic Relevance (Jetstream)

Signal 4: Keyword Matching (BM25)

Signal 5: Predicted Conversion (PCTR/PCVR)

Signal 6: Freshness

Signal 7: Boost/Bury

How These Signals Combine

Stage 4: Serve, Answer Generation and Content Controls

Search Type Configuration

LLM Configuration

Safety and Quality Gates

Optimization Implications

The Complete Picture: What Discovery Engine Tells Us About AI Search

What This Means for AI Search Optimization

Key Takeaways

Next Steps

~$ shortcuts