Google Discover Glossary

Pipeline 17 terms

Discover pipelines

Discover is not a single algorithm but 20+ specialised pipelines, each with its own behaviour, reach and audience.

Pipeline 31% FR · 9.9% reach

Content

Discover's main pipeline by volume — the default entry highway.

If Discover is a giant newspaper with many distribution kiosks, Content is the main one. About one in three articles in France enters Discover through this pipeline. It's where most editorial articles land first before potentially being amplified by other pipelines.

Example

A general-news article on Le Figaro about a political update enters Content first. If readers click and read, it may then surface in Mustntmiss or Moonstone too.

Pipeline 19.3% reach FR

Moonstone

One of the highest-reach pipelines in France: an engagement-broadcast machine. Ouest-France dominates it.

Moonstone has one of the largest reaches of any pipeline in France (19.3%). Once an article is picked here, it's broadcast to a huge audience. The catch: it strongly favours sites with a long history of high engagement, which is why a single regional player (Ouest-France) captures most of its volume.

Example

An Ouest-France weekend article about a local festival can pull 200k+ visitors via Moonstone alone, while the same article on a younger site might never even enter the pipeline.

Pipeline 19.7% reach · 3.7 d lifespan

Shoppinginspiration

Product/shopping pipeline. Median lifespan 3.7 days — 8× longer than Content.

A pipeline focused on product reviews, deals, gift guides, and shopping inspiration. Articles here have unusually long shelf life — 3-4 days versus a few hours for breaking news — because shopping intent is sticky. Great pipeline for affiliate sites and e-commerce magazines.

Example

A 'Best mid-range smartphones for under €500' guide can keep generating Discover traffic for 4-5 days straight via Shoppinginspiration.

Pipeline 24% reach FR · 99.8% exclusive

Feedads

Ads-driven pipeline with 99.8% URL exclusivity. Doesn't share traffic with the rest.

Feedads is the ads/sponsored content pipeline. It's almost completely isolated from the editorial pipelines: 99.8% of the URLs that ride Feedads never appear elsewhere in Discover. Editorial publishers rarely interact with it directly.

Pipeline ~14% volume FR

Aura

Diversifying pipeline that rewards editorial depth.

Aura's job is to inject diversity into the feed — bringing in long-form, well-researched articles that complement the breaking-news firehose. It rewards depth, narrative, and research over breaking-news velocity.

Example

A 3000-word investigation into a healthcare scandal can be picked up by Aura even days after publication.

Pipeline 11.2% reach FR · ×2 boost

Mustntmiss

Applies a ×2 priority boost to articles flagged as essential.

Mustntmiss is Discover's 'don't miss this' pipeline — Google flags certain articles as essential reading on a topic and doubles their priority. Once an article enters Mustntmiss, its visibility roughly doubles compared to a standard Content placement.

Example

A unique angle on a major news event (e.g. a confirmed insider account during a crisis) often triggers Mustntmiss, multiplying its reach.

Pipeline Median age 2.2 h · 46% exclusive

Newsstoriesheadlines

Ultra-fresh breaking-news pipeline. 46% of URLs are exclusive.

The breaking-news pipeline. The median article on this pipeline is 2.2 hours old — by Discover standards, that's basically real-time. Almost half of its content never appears anywhere else, making it a pure breaking-news channel.

Example

A live update on an unfolding sports result or a sudden political resignation hits Newsstoriesheadlines within minutes — and disappears within 4-6 hours.

Pipeline ×33 in 3 months FR

Creatorcontent

Fast-growing social-feed pipeline. In France, 75% of its sources are x.com.

Creatorcontent surfaces content tied to social activity, primarily X/Twitter posts in France (75% of its sources). Its volume grew ×33 in 3 months, reflecting Google's bet on creator-driven content. A site with no active X presence forgoes an increasingly large share of Discover traffic.

Example

A journalist who posts the article URL on X with a strong hook can trigger Creatorcontent inclusion within an hour, often before the article even ranks in Search.

Pipeline Median delay 1.5 d

Astria

Local-authority and lifestyle pipeline with an unusual publication delay.

Astria has an unusual property: it doesn't surface articles immediately — the median delay between publication and surfacing is 1.5 days. This makes it a 'second-wind' pipeline. Articles that didn't catch fire on day 1 can find a second life via Astria on day 2-3.

Pipeline

Geotargetingstories

Local pipeline filtered by user geolocation.

Surfaces stories that are geographically relevant to where the user is right now. A traffic accident in Toulouse only surfaces to users in or near Toulouse via this pipeline.

Pipeline 1.8% reach · 67% exclusive

Webkicklocalstories

Regional press pipeline. 67% URL exclusivity.

The dedicated regional-press pipeline. Two-thirds of its content is unique to it. National media practically never appear here — it's the playground of regional newspapers (Ouest-France, La Voix du Nord, Sud Ouest, etc.).

Pipeline

Garamondrelatedarticlegrouping

Builds related-article groups (powers the Google Showcase surface).

Behind the 'Top stories' carousels and Showcase widgets, this pipeline groups together multiple articles that cover the same event. Being included in such a group massively amplifies traffic.

Pipeline

Relatedcontentruby

Click-triggered pipeline — suggests other articles from the same site or topic.

When a user has already clicked one of your articles, Relatedcontentruby kicks in to suggest more of yours on similar topics. It's how a single hit can multiply your visits — a strong reason to publish coherent topical clusters.

Example

A reader who clicked your '7 Discover hacks' article gets shown your '5 myths' article right after, via Relatedcontentruby.

Pipeline ×7 in 3 months

Paginationpanoptic

Triggered by deep scroll — surfaces more cards once a user is hooked.

The deeper a user scrolls in their Discover feed, the more this pipeline kicks in to fill the slate. Its volume has grown ×7 in 3 months, signalling that Google increasingly relies on engaged users to consume more content per session.

Pipeline 13% reach EN · ~0 FR

Neoncluster

Video-dominant pipeline. Significant in English, near-absent in French.

The pipeline dedicated to video content (mostly YouTube broadcasts). Massive in the English market (13% reach) but barely exists in France (~36 hits in 3 months = noise level). For French publishers, video is not a priority Discover lever.

Pipeline

Deeptrends / Deeptrendsfable

Sequential trend detectors — Fable detects the spike, Deeptrends persists it.

A two-stage detection system. Deeptrendsfable spots that something is starting to trend (sudden spike in queries, mentions, social shares). Deeptrends then takes over to keep amplifying it as long as the trend persists.

Pipeline

Beacon Push System

Proactive server-push for live sports scores and finance recaps.

A specialised architecture where Google's server actively pushes notifications without waiting for the user to open the Discover feed. Limited to live-sports scores and finance market recaps for now.

Pipeline step 8 terms

The 8-step processing pipeline

Every URL Google considers for Discover passes through these 8 stages, in order. Order matters: a domain blacklisted at step 4 never even reaches ranking at step 6.

Pipeline step Step 1/8

Step 1 — Content Ingestion

Googlebot crawls the page, parses HTML, extracts entities and assigns MIDs.

The very first step. Google's crawler fetches the URL, reads the HTML and extracts every entity it recognises (people, places, brands, events). Each entity gets matched to a Knowledge Graph identifier (MID). Sites in Google News are crawled within minutes; the rest can wait 1-24 hours.

Example

An article mentioning 'Apple' and 'Tim Cook' triggers entity extraction → MID for Apple Inc. + MID for Tim Cook. These tags travel with the URL through the rest of the pipeline.

Pipeline step Step 2/8

Step 2 — Structured Data Parsing

The Google App parser reads metadata. JSON-LD wins over Open Graph and Twitter Card.

The Google App SDK contains a class explicitly named SchemaOrg{parsedMetatags, jsonLdScripts}. Decompiling it revealed that when JSON-LD is present, it OVERRIDES whatever Open Graph or Twitter Card claim. Most SEO guides got this wrong for years: they advised optimising OG, but Discover prioritises JSON-LD.

Example

If your og:title says 'Cheap iPhones' but your JSON-LD headline says 'Why iPhones cost more in 2026', Discover shows the JSON-LD title.
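The precedence above can be sketched in a few lines. This is a minimal illustration, assuming the page's metadata has already been extracted into three dictionaries. The function name `pick_headline` and the dictionary layout are hypothetical and not taken from the Google App code. Only "JSON-LD first" comes from the text; ranking Open Graph ahead of Twitter Card is an assumption.

```python
from typing import Optional


def pick_headline(json_ld: dict, open_graph: dict, twitter_card: dict) -> Optional[str]:
    """Return the headline from the highest-priority source that provides one.

    Assumed order: JSON-LD, then Open Graph, then Twitter Card.
    The relative order of the last two is an assumption of this sketch.
    """
    candidates = (
        json_ld.get("headline"),
        open_graph.get("og:title"),
        twitter_card.get("twitter:title"),
    )
    for value in candidates:
        if value:  # skip missing or empty values
            return value.strip()
    return None


page_json_ld = {"@type": "NewsArticle", "headline": "Why iPhones cost more in 2026"}
page_og = {"og:title": "Cheap iPhones"}
page_twitter = {}

print(pick_headline(page_json_ld, page_og, page_twitter))
```

With the values from the example, the JSON-LD headline is returned; the og:title is only used when the JSON-LD headline is missing.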

Pipeline step Step 3/8 · 13 cluster types

Step 3 — Classification

Article assigned to one or more of 13 cluster types.

Google App's code confirms 13 cluster types. Your article isn't tagged 'tech news' — it's tagged with a combination of internal cluster IDs (moonstone-cluster, neoncluster, content-cluster, etc.). An article can ride multiple clusters at once. This is how the same article can be amplified by multiple pipelines.

Pipeline step Step 4/8 · brutal

Step 4 — Filtering

Two filter levels: collection (domain) and entity (URL). Blacklists kick in here.

The brutal step. Two filters run: 'collection' filters at the domain level (entire site blacklisted? out), 'entity' filters at the URL level (this specific article violates a policy? out). This step runs BEFORE ranking, so a blacklisted site never even gets evaluated for relevance — it's already excluded.

Example

A site flagged for spam policies sees ALL its articles blocked at step 4. Doesn't matter if a specific article is genuinely useful — collection filter wins.
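A rough sketch of why the order matters: candidates are filtered before they are scored, so a high relevance score cannot rescue a blocked domain. The blocklists, the scores and the function names are hypothetical; the real filters are not public.

```python
from typing import Dict, List
from urllib.parse import urlparse

# Hypothetical blocklists, for illustration only.
BLOCKED_DOMAINS = {"spammy.example"}
BLOCKED_URLS = {"https://news.example/policy-violation"}


def passes_filters(url: str) -> bool:
    """Apply the collection (domain) filter, then the entity (URL) filter."""
    domain = urlparse(url).hostname or ""
    if domain in BLOCKED_DOMAINS:
        return False  # collection filter: the whole site is excluded
    if url in BLOCKED_URLS:
        return False  # entity filter: this specific article is excluded
    return True


def rank(candidates: Dict[str, float]) -> List[str]:
    """Filter first, then sort the survivors by relevance (highest first)."""
    survivors = {u: s for u, s in candidates.items() if passes_filters(u)}
    return sorted(survivors, key=survivors.get, reverse=True)


candidates = {
    "https://spammy.example/genuinely-useful": 0.99,
    "https://news.example/policy-violation": 0.80,
    "https://news.example/good-article": 0.60,
}
print(rank(candidates))
```

Only the third URL survives, even though it has the lowest score of the three.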

Pipeline step Step 5/8 · 7 sub-types

Step 5 — Interest Matching

NAIADES system layered on top of NavBoost. 7 personalisation sub-types.

The personalisation layer. NAIADES looks at who the user is (their embeddings: Nephesh, Picasso, VanGogh), which topics they've engaged with (MID-based), what they've searched recently (Query-based) and what their current session is about. It then matches articles to the users they align with most closely.

Pipeline step Step 6/8 · opaque

Step 6 — Ranking

Server-side ranking — opaque, but heavily influenced by steps 1-5.

The actual scoring. Happens entirely on Google's servers and remains the most opaque step. What we know: it's strongly informed by behavioural signals from NavBoost (clicks, dwell time, pogo-sticking) plus quality signals from steps 1-2 (structured data quality, entity coherence) and personalisation match from step 5.

Pipeline step Step 7/8 · gRPC stream

Step 7 — Feed Assembly

Live delivery via gRPC streaming — the feed is a live stream, not a static snapshot.

Critical revelation from the SDK decompilation: when you scroll Discover, you're not browsing a pre-computed list of cards. The feed is a gRPC stream, continuously assembled as you scroll. Cards can be inserted, demoted or removed in real time based on what you're doing.

Pipeline step Step 8/8 · real-time

Step 8 — Feedback Loop

Tombstoning of dismissed content + continuous NavBoost feeding.

The closing loop. Every action you take (click, swipe-away, 'don't show me this again', share, long dwell) feeds back into the system. Dismissed content gets permanently 'tombstoned' for that user. NavBoost incorporates the new behavioural data and adjusts site-level scores. The loop runs in real time — within seconds.

Algorithm 8 terms

Algorithms & internal systems

The ranking and personalisation machinery underneath every Discover card.

Algorithm Source: Chrome telemetry

NavBoost

The principal Discover ranking algorithm — drives ranking from real-time user behaviour.

If you remember only one technical name from this glossary, it should be this one. NavBoost watches every click, dwell time, scroll depth, pogo-stick and share — primarily through Chrome — and continuously rescores sites and articles. On Discover (unlike classic SEO), NavBoost effects are instant: a bad article gets cut within hours.

Example

You publish a clickbait headline. CTR is great (10%), but pogo-stick rate is 60% and dwell time is 8 seconds. Within 2 hours, NavBoost has flagged the article as low-satisfaction and cut its distribution.

Algorithm 7 sub-types

NAIADES

Discover's personalisation system. Seven sub-types layered on top of NavBoost.

NAIADES is the layer that decides 'should THIS article be shown to THIS specific user'. NavBoost says how 'good' an article is in general; NAIADES says whether it's a match for you specifically. It uses 7 different sub-types — MID-based (entities you care about), Query-based (recent searches), WPAS (publisher status), RECALL_BOOST, AIM Thread, and others.

Algorithm

Knowledge Graph (KG)

Google's structured database of entities (people, places, brands, events).

Think of the Knowledge Graph as Google's giant index card system: every notable person, place, brand, event has its own card with a unique ID (MID). Discover uses this index to understand what your article is REALLY about — not just keywords. The more your article mentions properly-recognised entities, the better Discover can match it to the right audience.

Example

Two articles with similar text but one mentions 'Apple Inc.' (recognised entity → MID /m/0k8z) and the other writes 'la pomme américaine' — only the first benefits from Apple's Knowledge Graph signal.

Algorithm

MID (Machine ID)

Unique identifier for an entity in the Knowledge Graph. Format: /m/xxxxx or /g/xxxxx.

Every entity in the Knowledge Graph has a MID — its 'social security number' inside Google's systems. Apple Inc. = /m/0k8z. Tim Cook = /m/0gxhdx. When your article correctly resolves to entities with stable MIDs, Discover indexes those MIDs alongside your URL and routes traffic accordingly.

Example

Article tagged with /m/0gxhdx (Tim Cook) → matched to users who recently engaged with Tim Cook content.

Algorithm

Entity linking

The process of associating a textual mention with a Knowledge Graph entity.

When your article says 'Apple', Google has to decide: which Apple? The fruit, the company, the record label, a specific person? Entity linking is the algorithm that resolves this ambiguity using context, then attaches the right MID. Inconsistent naming hurts entity linking.

Example

Article writes 'Apple', 'apple', 'APPLE Inc.', 'la firme à la pomme' — entity linker may not consolidate them all to /m/0k8z, weakening the signal.
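To make the consolidation problem concrete, here is a toy linker built on an alias table. It is a deliberately naive sketch: a real entity linker resolves mentions from context, not from a lookup, and the `ALIASES` table and function names are hypothetical. The MID is the one quoted in this glossary.

```python
import unicodedata
from typing import Optional

# Hypothetical alias table; a real linker uses context, not a lookup.
ALIASES = {
    "apple": "/m/0k8z",
    "apple inc.": "/m/0k8z",
}


def normalise(mention: str) -> str:
    """Lower-case the mention and strip accents and surrounding whitespace."""
    decomposed = unicodedata.normalize("NFKD", mention)
    no_accents = "".join(c for c in decomposed if not unicodedata.combining(c))
    return no_accents.strip().lower()


def link(mention: str) -> Optional[str]:
    """Return the MID for a mention, or None if it cannot be resolved."""
    return ALIASES.get(normalise(mention))


mentions = ["Apple", "apple", "APPLE Inc.", "la firme à la pomme"]
resolved = {m: link(m) for m in mentions}
print(resolved)
```

Case variants resolve to the same MID after normalisation, while the periphrasis resolves to nothing, which is the signal loss the example describes.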

Algorithm

SAFT (Structured Annotation Framework)

Google's industrial-grade entity-and-relation extraction system.

SAFT is the Google internal system that runs entity extraction at scale across the whole web. It handles coreference resolution ('he' → 'Tim Cook'), relation extraction ('founded' between 'Steve Jobs' and 'Apple'), and feeds all of this into the Knowledge Graph.

Algorithm

Twiddler (re-ranker)

Late-stage re-ranking layer that nudges results just before delivery.

A 'twiddler' is a small re-ranking module that takes the already-scored result list and applies last-minute nudges (boost certain freshness, demote duplicates, enforce diversity quotas). Discover uses several twiddlers chained together. Their effect is small per twiddler but cumulative.
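A chain of twiddlers can be pictured as a list of small functions applied one after another to an already-scored list. The sketch below is an illustration only: the two twiddlers, their bonus and penalty values, and the card fields are all assumptions, not Google's actual modules.

```python
from typing import Any, Callable, Dict, List

Card = Dict[str, Any]
Twiddler = Callable[[List[Card]], List[Card]]


def boost_fresh(cards: List[Card]) -> List[Card]:
    """Add a small bonus to cards published less than 6 hours ago."""
    return [
        {**c, "score": c["score"] + (0.05 if c["age_hours"] < 6 else 0.0)}
        for c in cards
    ]


def demote_duplicates(cards: List[Card]) -> List[Card]:
    """Penalise every card after the best-scored one for a given story."""
    seen = set()
    out = []
    for c in sorted(cards, key=lambda c: c["score"], reverse=True):
        penalty = 0.2 if c["story"] in seen else 0.0
        seen.add(c["story"])
        out.append({**c, "score": c["score"] - penalty})
    return out


def apply_twiddlers(cards: List[Card], twiddlers: List[Twiddler]) -> List[Card]:
    """Run each twiddler in order, then sort by the adjusted score."""
    for twiddle in twiddlers:
        cards = twiddle(cards)
    return sorted(cards, key=lambda c: c["score"], reverse=True)


cards = [
    {"url": "a", "story": "s1", "score": 0.70, "age_hours": 30},
    {"url": "b", "story": "s1", "score": 0.68, "age_hours": 2},
    {"url": "c", "story": "s2", "score": 0.60, "age_hours": 1},
]
result = apply_twiddlers(cards, [boost_fresh, demote_duplicates])
print([c["url"] for c in result])
```

Each nudge is small, but together they reorder the list: the fresher duplicate overtakes the older one, which then drops below the unrelated story.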

Algorithm

Goldmine (title rewriter)

Internal Google system that rewrites titles before display when judged misaligned.

If Google judges that your headline doesn't match what the article actually delivers, Goldmine may rewrite it on-the-fly for display in search/Discover cards — using your H1, og:title, JSON-LD headline, or even synthesised text. A divergence between H1 and Title increases the chance Goldmine rewrites you (often badly).

Example

You wrote 'Why this Lidl product will change your life' but the article is mostly a price comparison. Goldmine may surface the more boring (but more accurate) JSON-LD headline instead.

Embedding 6 terms

User embeddings (3 layers)

Mathematical vector representations of users, operating at three different timescales. Together they drive every personalisation decision in Discover.

Embedding Multi-product · slow

Nephesh — foundational layer

Foundational user embedding traversing every Google product (Search, Discover, YouTube, Maps, Gmail).

Nephesh is the deepest, slowest-moving 'who you are' representation. It's built over months from your behaviour across every Google product and rarely shifts dramatically. When you log into a brand new device, Nephesh already 'knows' you — that's why your Discover feed is personalised on day one.

Example

Even after wiping your phone, your Discover feed comes back recommending tech reviews because Nephesh has 'tech enthusiast' baked in.

Embedding Batch · STAT + LTAT

Picasso — long-duration layer

Batch-computed user embedding capturing persistent preferences (STAT + LTAT).

Picasso runs in batch jobs (every few hours / days) and synthesises two complementary signals: STAT (Short-Term Affinity Track, last few weeks) and LTAT (Long-Term Affinity Track, last few months). It's faster-moving than Nephesh but still much slower than your current session.

Embedding On-device · seconds

VanGogh — instant layer

On-device user embedding that reacts to the very current session.

VanGogh runs ON YOUR PHONE in real time. As you open, scroll, click, swipe in the current session, VanGogh updates within seconds. This is what makes Discover feel responsive: dismiss two clickbait articles → the next slate already shifts away from clickbait.

Example

You spend 3 minutes on a long-form football tactics article. Your next Discover refresh contains noticeably more long-form sports analysis.

Embedding

Site2vec

Vector representation of a whole site in Google's embedding space.

Just as Nephesh embeds a user, Site2vec embeds a site. It's the mathematical 'fingerprint' of what your site is about as a whole. Discover compares Site2vec(your site) against user embeddings to decide site-level affinity, and against other sites' Site2vec to spot competitors and overlaps.

Embedding Higher = better

SiteFocusScore

Internal metric measuring a site's thematic coherence. High = good.

Computed from Site2vec, this score answers: 'how tightly clustered is this site's content around a coherent theme?' A site with a clear focus (e.g. cycling magazine) scores high. A generalist news site scores lower. High SiteFocusScore = Discover trusts the site as an authority on its niche.

Example

A 200-article site exclusively about coffee scores ~0.92. Same site adding 50 random political articles drops to ~0.74.

Embedding Lower = better

SiteRadius

Internal metric measuring a site's content dispersion in vector space. Low = good.

The flip side of SiteFocusScore. SiteRadius measures HOW FAR your articles drift from your site's centroid. A small radius means tight thematic discipline; a large radius means scattered content. Discover penalises high-radius sites because they look opportunistic.
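The exact formulas behind SiteFocusScore and SiteRadius are not given here, so the following is only one plausible reading: focus as the mean cosine similarity between each article vector and the site centroid, radius as the mean Euclidean distance to that centroid. Both definitions, the function names and the two-dimensional toy vectors are assumptions.

```python
import math
from typing import List, Sequence

Vector = Sequence[float]


def centroid(vectors: List[Vector]) -> List[float]:
    """Component-wise mean of the article vectors."""
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(vectors[0]))]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))


def focus_score(vectors: List[Vector]) -> float:
    """Assumed definition: mean cosine similarity to the centroid."""
    c = centroid(vectors)
    return sum(cosine(v, c) for v in vectors) / len(vectors)


def radius(vectors: List[Vector]) -> float:
    """Assumed definition: mean Euclidean distance to the centroid."""
    c = centroid(vectors)
    return sum(math.dist(v, c) for v in vectors) / len(vectors)


# Toy 2-D embeddings: axis 0 is "coffee", axis 1 is "politics".
coffee_site = [(1.0, 0.1), (0.9, 0.0), (1.0, 0.2)]
mixed_site = coffee_site + [(0.0, 1.0), (0.1, 0.9)]

print(focus_score(coffee_site), radius(coffee_site))
print(focus_score(mixed_site), radius(mixed_site))
```

Under these assumed definitions, adding off-topic articles lowers the focus score and raises the radius, which matches the direction of the coffee-site example.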

NAIADES 5 terms

NAIADES sub-types (the 7)

Specialised personalisation signals NAIADES layers on top of NavBoost. Each sub-type captures a different facet of why a user should see this article. Five of the seven are documented below.

NAIADES Sub-type 793

MID-based (sub-type 793)

Personalisation by Knowledge-Graph entities the user has engaged with.

If you've recently read articles tagged with /m/0k8z (Apple Inc.), MID-based scoring lifts other articles tagged with the same MID for you. This is why your Discover feed gets 'sticky' on entities you care about — once Apple is in your active MID set, Apple-related cards keep surfacing.
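As a rough illustration of the "active MID set" idea, the sketch below scores an article by summing the user's affinity for each MID it carries. The scoring rule, the affinity value and the function name are assumptions; the two MIDs are the ones quoted in this glossary.

```python
from typing import Dict, Set


def mid_overlap_score(article_mids: Set[str], user_affinity: Dict[str, float]) -> float:
    """Sum the user's affinity for every MID attached to the article."""
    return sum(user_affinity.get(mid, 0.0) for mid in article_mids)


# Hypothetical user who recently engaged with Apple Inc. content.
user_affinity = {"/m/0k8z": 0.8}

apple_article = {"/m/0k8z", "/m/0gxhdx"}  # Apple Inc. + Tim Cook
unrelated_article: Set[str] = set()       # no recognised entities

print(mid_overlap_score(apple_article, user_affinity))
print(mid_overlap_score(unrelated_article, user_affinity))
```

An article with no resolved MIDs scores zero for every user under this rule, which is one way to picture why entity recognition matters for personalisation.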

NAIADES Sub-type 792

Query-based (sub-type 792)

Personalisation based on the user's recent search queries.

Your Discover feed reacts to what you've Googled in Search. Searched 'best espresso machines under 500€' yesterday? Espresso content surfaces in Discover today. The window is short (last few days) but the effect is strong — 'follow-up surfacing' is one of Discover's most reliable mechanics.

NAIADES Publisher Center

WPAS (Web Publisher Articles Signal)

NAIADES sub-type tied to Google Publisher Center registration.

If your site is registered (and approved) in Google Publisher Center, WPAS gives every one of your articles a small but persistent boost in distribution. Free signup, ~10 minutes setup, materially raises your distribution ceiling. Why most non-news publishers ignore it is genuinely puzzling.

NAIADES

RECALL_BOOST

NAIADES sub-type that increases retrieval priority from the candidate pool.

At the moment Discover is assembling your slate, it pulls from a giant pool of candidate articles. RECALL_BOOST increases the chance that a specific article gets pulled into the candidate set in the first place. Earlier and broader exposure than what NavBoost alone would warrant.

NAIADES

AIM Thread

Cross-session topical thread the user is currently following.

AIM stands for Affinity-Interest Modelling. An AIM Thread is a topical 'thread' you've been pulling on — say, you've been following the World Cup across multiple sessions. NAIADES persists this thread across sessions and surfaces follow-up content even days later, before the topic naturally surfaces from your other signals.

Signal 9 terms

Behavioural signals

What NavBoost actually measures, and how it punishes or rewards.

Signal Target 5-10% FR

CTR (Click-Through Rate)

Clicks ÷ impressions. Discover's primary positive feedback signal.

The single most important early signal. Discover shows your card to a small audience first; if CTR is high (~7-12% in France), the article is amplified. Below ~3%, distribution gets cut quickly. The headline-image combination drives 80% of CTR, so get those right or nothing else matters.

Example

A 6.5% CTR on first 5000 impressions usually triggers Wave 2 amplification. A 1.8% CTR usually kills the article within 2 hours.

Signal Target 60-180 s

Dwell time

Time elapsed between click and return to feed. Long dwell = title kept its promise.

If a user stays on your article for 90+ seconds, NavBoost reads that as 'the title delivered'. If they bounce in 8 seconds, it reads as 'oversold/clickbait'. Dwell time is harder to fake than CTR — even bot traffic struggles to mimic genuine reading patterns.

Signal Severe penalty

Pogo-sticking

Quick return to the feed after clicking — one of the most negative NavBoost signals.

The user clicks your card, sees the page, immediately goes back to Discover. That's a pogo-stick. It's the worst kind of click — Google now KNOWS the title oversold and the user is annoyed. Even a single pogo-stick is bad; a high pogo-stick rate kills distribution within minutes.

Example

Headline 'You won't believe what this Lidl product does'. Article is a 6-paragraph price comparison. Pogo-stick rate hits 65%. Distribution cut within 90 minutes.

Signal

Long click vs short click

Internal Google distinction. A long click ends the user's task; a short click sends them back.

Google internally categorises clicks as 'long' (the user got their answer) or 'short' (the user came back to look for something better). Discover overlays this with dwell time and pogo-stick to score satisfaction. Sites with high long-click ratios are systematically promoted.

Signal

Bounce-back

The full event Discover instruments: tap card → arrive on article → press back → return to feed.

Discover instruments this event sequence very tightly. The latency between 'arrive on article' and 'press back' is one of the cleanest quality proxies available — much harder to fake than CTR or pageviews.
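The latency between arrival and back-press can be turned into a simple label. This is a sketch under stated assumptions: the 8-second and 90-second figures come from this glossary, but treating them as hard cut-offs, and the `Visit` structure itself, are inventions for illustration.

```python
from dataclasses import dataclass
from typing import List

# The 8 s and 90 s figures come from the glossary text; using them as
# hard cut-offs is an assumption of this sketch.
POGO_MAX_SECONDS = 8.0
LONG_DWELL_MIN_SECONDS = 90.0


@dataclass
class Visit:
    arrived_at: float  # timestamp (seconds) when the article opened
    back_at: float     # timestamp (seconds) when the user pressed back

    @property
    def dwell(self) -> float:
        """Seconds spent on the article before returning to the feed."""
        return self.back_at - self.arrived_at


def classify(visit: Visit) -> str:
    """Label a visit from its dwell time alone."""
    if visit.dwell <= POGO_MAX_SECONDS:
        return "pogo-stick"
    if visit.dwell >= LONG_DWELL_MIN_SECONDS:
        return "long dwell"
    return "neutral"


def pogo_rate(visits: List[Visit]) -> float:
    """Share of visits classified as pogo-sticks."""
    return sum(classify(v) == "pogo-stick" for v in visits) / len(visits)


visits = [Visit(0, 5), Visit(0, 130), Visit(0, 40), Visit(0, 7)]
print([classify(v) for v in visits])
print(pogo_rate(visits))
```

Two of the four sample visits end within 8 seconds, giving a pogo-stick rate of 0.5.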

Signal

Scroll depth

How far down the article a user scrolls. Tracked by Chrome.

If users consistently bail at the 30% mark, that's a sign the lede is wasting time before getting to the value. If they scroll to 90%+, you've nailed the structure. Long-form articles with low scroll depth are a Discover red flag.

Signal Per-user · permanent

Tombstoning

Permanent marking of content dismissed by a user. Never re-surfaces for them.

When a user clicks 'Don't show me this again' or 'Not interested in this site', Discover writes a permanent marker against that combination. The content (or the entire site) will never appear in that user's feed again. Per-user, but irreversible at the user level.
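Conceptually, a tombstone is a per-user marker that is checked before a card is shown. The class below is a minimal in-memory sketch; the `Tombstones` name, its methods and the example URLs are hypothetical and say nothing about how Google stores these markers.

```python
from typing import List, Set, Tuple
from urllib.parse import urlparse


class Tombstones:
    """Per-user record of dismissed URLs and dismissed sites."""

    def __init__(self) -> None:
        self._urls: Set[Tuple[str, str]] = set()
        self._sites: Set[Tuple[str, str]] = set()

    def dismiss_url(self, user_id: str, url: str) -> None:
        """'Don't show me this again': block one article for one user."""
        self._urls.add((user_id, url))

    def dismiss_site(self, user_id: str, url: str) -> None:
        """'Not interested in this site': block the whole host for one user."""
        self._sites.add((user_id, urlparse(url).hostname or ""))

    def visible(self, user_id: str, urls: List[str]) -> List[str]:
        """Return the URLs that carry no tombstone for this user."""
        return [
            u for u in urls
            if (user_id, u) not in self._urls
            and (user_id, urlparse(u).hostname or "") not in self._sites
        ]


stones = Tombstones()
stones.dismiss_url("alice", "https://news.example/a")
stones.dismiss_site("alice", "https://tabloid.example/x")

feed = [
    "https://news.example/a",
    "https://news.example/b",
    "https://tabloid.example/y",
]
print(stones.visible("alice", feed))
print(stones.visible("bob", feed))
```

The markers are keyed by user, so the second user's feed is unaffected by the first user's dismissals.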

Signal

Rug pull counter

Counter of articles pushed to the feed and then retroactively withdrawn.

If you publish an article, get distribution, then update or remove the article in a way that breaks the original promise (e.g. moving it behind a paywall after the fact, or redirecting to ad-heavy content), Discover counts that as a 'rug pull'. A high count signals editorial unreliability, and distribution gets clipped.

Signal 3 stages · ~5% reach Wave 3

Wave 1 / 2 / 3

Stages of progressive Discover exposure: restricted test → broader test → mass amplification.

Wave 1 (~1000-5000 impressions): the test phase, narrow audience, used to estimate CTR and dwell. Wave 2 (~50k-500k): broader test if signals are good. Wave 3 (1M+): mass amplification — only ~5% of articles ever reach this. This staged exposure is why hours 0-2 after publication are critical.

Example

An article hits 9% CTR + 130s dwell on Wave 1 → Wave 2 triggered → another 7% CTR → Wave 3. Total: 1.2M visits over 36 hours.
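The staged exposure can be pictured as a gate evaluated at the end of each wave. The sketch below uses CTR only and two thresholds: the 3% cut-off is quoted in this glossary, while the 5% promotion threshold is an assumption taken from the lower end of the stated CTR target. Function names are hypothetical and dwell time is ignored for brevity.

```python
# Cut-off quoted in the glossary ("Below ~3%, distribution gets cut").
CUT_BELOW = 0.03
# Assumed promotion threshold, taken from the low end of the 5-10% target.
PROMOTE_AT = 0.05


def ctr(clicks: int, impressions: int) -> float:
    """Clicks divided by impressions; 0.0 when there are no impressions."""
    return clicks / impressions if impressions else 0.0


def next_step(wave: int, clicks: int, impressions: int) -> str:
    """Decide what happens to an article at the end of a wave (1, 2 or 3)."""
    rate = ctr(clicks, impressions)
    if rate < CUT_BELOW:
        return "cut"
    if rate >= PROMOTE_AT and wave < 3:
        return f"promote to Wave {wave + 1}"
    return f"hold at Wave {wave}"


print(next_step(1, 325, 5000))  # 6.5% CTR on the first 5000 impressions
print(next_step(1, 90, 5000))   # 1.8% CTR on the first 5000 impressions
```

The two calls reproduce the CTR example given earlier in this section: 6.5% moves on to Wave 2, 1.8% is cut.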

Metric 6 terms

Performance metrics

The technical numbers Google measures on every visit. Below target = friction in distribution.

Metric LCP + CLS + INP

Core Web Vitals

Google's official UX performance metric set: LCP, CLS, INP.

The three vital signs Google checks on every page visit. Failing any of them is a hard ceiling on Discover distribution. The standard is 'pass-rate ≥ 75% of visits' — meaning at least 3 out of 4 of your visitors must hit the targets.

Metric Target < 2.5 s

LCP (Largest Contentful Paint)

Time to display the largest visible element. The user's perceived loading time.

LCP measures when the main image or the largest text block becomes visible. Below 2.5s = user perceives the page as 'fast'. Between 2.5-4s = mediocre. Above 4s = bad — users start bouncing. The hero image is usually the LCP element on a Discover article.

Metric Target < 0.1

CLS (Cumulative Layout Shift)

Visual stability metric. How much the layout jumps around during load.

The score of the worst burst of unexpected layout shifts during the page's lifetime (shifts are grouped into short session windows and the largest window counts). Aim for < 0.1. Common offenders: ads loading after the main content (everything jumps down), images without explicit width/height, fonts swapping. Mobile is especially punishing: a single ad insertion can blow your CLS.

Metric Target < 200 ms

INP (Interaction to Next Paint)

Reactivity after a click — how quickly the page responds to user input.

Replaced FID as a Core Web Vital in March 2024. Observes every interaction (clicks, taps, key presses) across the page lifetime and reports the slowest one, ignoring outliers on pages with many interactions. Below 200ms = responsive. Above 500ms = janky, JavaScript is blocking the main thread. Heavy ad scripts are the #1 offender.
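The "at least 3 out of 4 visitors" rule for the three Core Web Vitals amounts to checking the 75th percentile of each metric against its target. A minimal sketch, assuming per-visit samples are already collected; the sample values, the dictionary keys and the nearest-rank percentile method are choices made for this illustration.

```python
import math
from typing import Dict, List

# "Good" thresholds as stated in this glossary: LCP 2.5 s, CLS 0.1, INP 200 ms.
GOOD = {"lcp_s": 2.5, "cls": 0.1, "inp_ms": 200.0}


def p75(values: List[float]) -> float:
    """75th percentile using the nearest-rank method."""
    ordered = sorted(values)
    rank = math.ceil(0.75 * len(ordered))
    return ordered[rank - 1]


def passes_cwv(samples: Dict[str, List[float]]) -> Dict[str, bool]:
    """For each metric, check that the 75th-percentile visit meets the target."""
    return {metric: p75(samples[metric]) <= limit for metric, limit in GOOD.items()}


# Hypothetical field data from four visits.
samples = {
    "lcp_s": [1.8, 2.1, 2.4, 3.9],
    "cls": [0.02, 0.05, 0.30, 0.40],
    "inp_ms": [120, 150, 180, 600],
}
print(passes_cwv(samples))
```

In this sample, one slow visit does not fail LCP or INP, but two unstable visits out of four are enough to fail CLS.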

Metric Target < 500 ms

TTFB (Time to First Byte)

Server response speed. Caps the achievable LCP.

The time between the user's request and the very first byte of HTML arriving. Below 500ms = good. Above 800ms = your server is dragging the whole experience down. Cheap shared hosting often has TTFB > 1s — that alone makes Discover-grade performance impossible.

Metric Target > 85

PageSpeed mobile score

The aggregate Lighthouse score on a mobile profile. Aim > 85.

A composite Lighthouse score combining lab metrics: FCP, Speed Index, LCP, TBT and CLS. INP is a field metric and is not part of it; TBT is its closest lab proxy. Discover is mobile-first, so the desktop score doesn't matter; only the mobile (slow 4G, mid-range device) score does. Below 70 = your distribution is materially capped, even if individual CWV pass.

Quality 7 terms

Content quality & policy

How Google evaluates whether content deserves to be in Discover at all.

Quality

E-E-A-T

Experience, Expertise, Authoritativeness, Trustworthiness — the four-letter quality framework.

Google's quality raters (humans) evaluate content along these four axes. Experience: did the author actually live this? Expertise: are they a recognised expert? Authoritativeness: does the site have a track record on the topic? Trustworthiness: is it accurate and transparent? Strong E-E-A-T = sustained Discover distribution. Weak E-E-A-T = capped at Wave 1.

Example

A health article cited and signed by an MD with verified credentials beats an anonymous health article on identical content — every time.

Quality

YMYL — Your Money or Your Life

Topics that materially affect a reader's wellbeing: health, finance, legal, safety, parenting.

If your content can affect someone's health, money, safety or major life decisions, Google applies a much higher quality bar. YMYL articles need verified authors (Schema Person + bio + credentials), reputable sources, and clear disclaimers. Lazy YMYL content gets crushed by both quality raters and HCS.

Quality Site-level

Helpful Content System (HCS)

Google's automated demotion of content judged unhelpful or written for SEO rather than humans.

HCS rolled into the core algorithm in 2024. It evaluates whether content brings real, original value or just regurgitates what's everywhere else. SITE-LEVEL signal: a few thin pages can drag down your whole site's distribution, even on your good articles. Recovery requires removing or rewriting the existing thin content, not just publishing better new pages.

Quality Manual + algo

Site Reputation Abuse policy

Targets third-party content (often affiliate / coupon) hosted on a high-authority domain.

Active since May 2024, aggressively enforced. The classic case: a top news site rents out a /coupons subdirectory to an affiliate company. Google now penalises this — the parasitic content gets blocked AND the host site gets a reputation hit.

Quality

Cloaking

Showing different content to Googlebot than to human visitors. Hard violation.

The classic 'show clean text to Google, redirect humans to spam' trick. Modern cloaking is more subtle — JavaScript that swaps content based on user-agent fingerprinting. Discover detects it via Chrome telemetry (the rendered DOM in real users' Chrome doesn't match what Googlebot saw). Triggers manual action and full Discover suspension.

Quality

Quality raters

Humans hired by Google who score sites against the Search Quality Guidelines.

A few thousand contractors worldwide. They don't directly affect rankings; their evaluations train Google's ML models. The rater handbook (170+ pages, public) is the single best source on what Google considers 'quality'. Reading it is an unfair edge.

Quality 7-21 d duration

Halo effect

Site-level reputation boost that lifts every article on the same domain.

When one of your articles becomes a runaway hit on Discover, your entire site benefits temporarily — other articles see higher CTR, faster pickup, broader distribution. The halo lasts 7-21 days. This is why publishers chase 'one breakout per week' more than 'consistent baseline'.

Schema 9 terms

Schema & structured data

How Google parses what an article actually is. JSON-LD wins, hands down.

Schema Top priority

JSON-LD

Schema.org structured-data format. The TOP priority source for the Google App parser.

A small JSON block in your <head> that tells Google exactly what your page is about, in machine-readable form. The Google App SDK was decompiled and the parser explicitly prioritises JSON-LD over Open Graph and Twitter Card. If your JSON-LD says one thing and your og:title says another, Discover trusts the JSON-LD.

Schema

Schema.org

Standardised structured-data vocabulary. Provides NewsArticle, Person, Organization types.

A shared vocabulary maintained by Google, Microsoft, Yahoo and Yandex. Defines hundreds of types (Article, Person, Recipe, Event, Product, etc.) and properties (name, author, datePublished, etc.). When you use it correctly, all major search engines understand your content the same way.

Schema

NewsArticle

Schema.org type recommended for time-sensitive Discover articles, where it is preferred over generic Article.

A specialised sub-type of Article for time-sensitive news content. NewsArticle accepts dateline, printSection and other journalism-specific properties. The Google App parser handles NewsArticle differently from generic Article — preferring it for distribution. Misusing NewsArticle for evergreen guides can trigger demotions, though.

Example

An article on the latest iPhone launch → NewsArticle. A timeless guide on 'How to choose a smartphone' → Article.

Schema

Schema Person (author)

The author block. Critical for E-E-A-T and YMYL.

Inside your NewsArticle JSON-LD, the author should be a full Person object: name, url (to author page), sameAs (Wikipedia, X, LinkedIn), jobTitle, knowsAbout. Missing or shallow author data is the #1 reason YMYL articles fail to gain traction in Discover.
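A small Python sketch of the author block described above, with a helper that reports which of the five listed properties are missing. The author name, URLs and helper names are hypothetical.

```python
import json

def author_person(name, profile_url, same_as, job_title, knows_about):
    """Build a schema.org Person object for the article's author field."""
    return {
        "@type": "Person",
        "name": name,
        "url": profile_url,
        "sameAs": list(same_as),
        "jobTitle": job_title,
        "knowsAbout": list(knows_about),
    }

def missing_author_fields(author):
    """List the properties named in this entry that are absent or empty."""
    expected = ("name", "url", "sameAs", "jobTitle", "knowsAbout")
    return [key for key in expected if not author.get(key)]

# Hypothetical author, for illustration only
author = author_person(
    "Jane Doe",
    "https://example.com/authors/jane-doe",
    ["https://www.linkedin.com/in/jane-doe-example"],
    "Health editor",
    ["nutrition", "public health"],
)
print(json.dumps(author, indent=2))
print(missing_author_fields({"@type": "Person", "name": "Jane Doe"}))
```

Running the check in a publishing pipeline is one way to catch shallow author data before an article goes live.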

Schema

Schema Organization (publisher)

The publisher block. Identifies your site as an organisation in the Knowledge Graph.

Each NewsArticle's JSON-LD should reference your site as a publisher Organization: name, url, logo (with explicit width/height), sameAs (social profiles), foundingDate. This is what binds your articles to a recognised entity in the Knowledge Graph and unlocks WPAS-style boosts.

Schema

BreadcrumbList

Schema declaration of the page's hierarchical position.

Tells Google: this article lives at /tech/iphone/iphone-16-review. Powers the breadcrumb that sometimes replaces the URL line in cards. Helps Google understand your site's category structure — feeding into SiteFocusScore and SiteRadius.
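To make the hierarchy concrete, the sketch below derives a BreadcrumbList from an article URL. It assumes that URL path segments mirror the site's category structure and derives labels naively from slugs; a real site would more likely use its own category names.

```python
import json
from urllib.parse import urlsplit

def breadcrumb_list(article_url):
    """Build a schema.org BreadcrumbList from the path segments of a URL."""
    parts = urlsplit(article_url)
    segments = [s for s in parts.path.split("/") if s]
    origin = f"{parts.scheme}://{parts.netloc}"
    items = []
    for position, segment in enumerate(segments, start=1):
        items.append({
            "@type": "ListItem",
            "position": position,
            # Naive label derived from the slug (assumption for this sketch)
            "name": segment.replace("-", " ").title(),
            "item": origin + "/" + "/".join(segments[:position]),
        })
    return {
        "@context": "https://schema.org",
        "@type": "BreadcrumbList",
        "itemListElement": items,
    }

crumbs = breadcrumb_list("https://example.com/tech/iphone/iphone-16-review")
print(json.dumps(crumbs, indent=2))
```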

Schema

ImageObject

Schema sub-entity declaring an image with explicit width, height, url.

When you reference an image inside Article markup, wrap it as ImageObject with explicit dimensions. This lets Google pick the right hero crop for the Discover card and accelerates LCP measurement. Naked image URLs with no metadata leave Google guessing.
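A minimal Python sketch of the wrapping step, assuming the pixel dimensions are already known from your CMS or image pipeline. The helper name and example values are illustrative.

```python
import json

def image_object(url, width, height):
    """Wrap an image URL as a schema.org ImageObject with explicit pixel dimensions."""
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive pixel counts")
    return {"@type": "ImageObject", "url": url, "width": width, "height": height}

article = {
    "@context": "https://schema.org",
    "@type": "Article",
    "headline": "Example headline",
    # Declared as an object rather than a bare URL string
    "image": image_object("https://example.com/img/hero.jpg", 1600, 900),
}
print(json.dumps(article, indent=2))
```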

Schema

Wikidata linking

Linking your site/authors to Wikidata items via Schema sameAs property.

Wikidata is the structured-data backbone Google uses to disambiguate entities. Adding sameAs links from your Schema Person/Organization to Wikidata Q-IDs directly helps Google resolve your entities and assign you a stable MID. Free, public-domain, takes 15 minutes per entity.

Schema Required tag

max-image-preview:large

Robots-meta directive granting Google permission to display large image previews.

Without this directive, your hero image cannot be used as a Discover card thumbnail at full size — Discover falls back to a tiny 100x100 crop or no image at all. The single most-forgotten tag on Discover-aspiring sites. Add it in your <meta name="robots"> alongside index, follow.

Image 5terms

Hero image & visuals

92% of Discover cards display a hero image. The image often outweighs the headline for CTR.

Image ≥ 1200×800 px

Hero image

The article's lead visual. The asset Google selects (in 92% of cases) for the Discover card thumbnail.

The single highest-leverage visual on the page. Discover's 1200×800 minimum is non-negotiable — below that, the image gets pillar-boxed and CTR drops by 30-50%. The hero must be original, sharp, and emotionally engaging. Stock photos underperform original photography by 2-3× on CTR.

Image

Aspect ratio (16:9 / 4:3)

Discover crops cards to either 16:9 or 4:3.

The two ratios Discover uses. 16:9 for most cards, 4:3 for compact slots. If your hero image isn't already in one of these ratios, Google will crop it — often badly (cropping out the subject). Author your hero in 16:9 (1344×756 or larger) and you control what's visible.
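To preview what a crop to either ratio would keep, the sketch below computes the largest centred crop box for a given image size. Centring the crop is an assumption made for illustration; how Google actually chooses its crop is not documented here.

```python
from fractions import Fraction

def center_crop_box(width, height, ratio=Fraction(16, 9)):
    """Return (left, top, right, bottom) of the largest centred crop with the given ratio."""
    if Fraction(width, height) > ratio:
        # Image is too wide: keep the full height and trim the sides
        new_width = int(height * ratio)
        left = (width - new_width) // 2
        return (left, 0, left + new_width, height)
    # Image is too tall or already matches: keep the full width, trim top and bottom
    new_height = int(width / ratio)
    top = (height - new_height) // 2
    return (0, top, width, top + new_height)

print(center_crop_box(1200, 800))                  # crop to 16:9
print(center_crop_box(1200, 800, Fraction(4, 3)))  # crop to 4:3
```

If the subject of the photo falls outside the returned box, re-author the hero in the target ratio rather than leaving the crop to chance.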

Image

Image originality

Reverse-image search penalises stock and over-used visuals.

Google can spot a stock photo or a 100×-reposted image instantly via reverse-image hash. Original visuals (your own photography, custom illustrations, photographed product shots) typically outperform stock by 2-3× on CTR, and sometimes by an order of magnitude on competitive topics.

Image

AI image disclosure (C2PA / IPTC)

Metadata flagging an image as AI-generated. Discover doesn't penalise AI imagery per se but expects disclosure.

The C2PA standard adds a cryptographic 'origin' tag to images, indicating whether they came from a camera or an AI generator. IPTC fields can also carry this. Discover doesn't ban AI images, but undisclosed AI imagery presented as photographic evidence on YMYL topics gets flagged and demoted.

Image

Image alt text

The alt='' attribute. Used as tie-breaker when Discover picks the card thumbnail.

When a page has multiple candidate images, alt text helps Discover pick the most relevant one. Keep it descriptive (what's in the image) and topic-relevant (uses your target keyword naturally). Stuffing keywords backfires.

Metadata 6terms

Title & metadata

The text Discover surfaces on the card. Title-image combo drives 80% of CTR.

Metadata

H1

The single first-level heading on the page.

There must be exactly one H1 per article, and it should match the title users see in the Discover card. Mismatch between H1 and Title is one of the most common Goldmine triggers — Google rewrites your headline (badly) when it can't trust it.

Metadata Sweet spot 50-65 chars

Title tag

The <title> in <head>. Often shown verbatim in Discover.

The single highest-leverage piece of copy on the entire page. Discover often shows it verbatim, sometimes truncated to 70 characters. Sweet spot: 50-65 characters. The first 50 chars must contain the curiosity hook because that's what survives truncation on small-screen mobile.
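The length rules above are easy to check automatically. The sketch below encodes the 50-65 sweet spot, the 70-character truncation point and the 50-character hook window as default parameters. The function name and return shape are illustrative, and the check counts raw characters rather than rendered width.

```python
def check_title(title, hook_window=50, sweet_spot=(50, 65), hard_limit=70):
    """Report how a title fits the length guidance from this entry."""
    length = len(title)
    low, high = sweet_spot
    return {
        "length": length,
        "in_sweet_spot": low <= length <= high,
        "risks_truncation": length > hard_limit,
        # The part most likely to survive on small screens
        "hook": title[:hook_window],
    }

report = check_title("Why this mid-range phone beats flagships that cost twice as much")
print(report)
```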

Metadata

Meta description

The 150-160 character summary in <head>.

Discover only renders the meta description on a fraction of cards (mostly larger formats). It's not the primary lever, but it remains a tie-breaker for click-through prediction in some pipelines. Don't ignore it; don't obsess over it either.

Metadata

Open Graph

The og:* meta family. Lower priority than JSON-LD, but still consulted.

Originally built by Facebook for social sharing. Discover's parser uses og:* as a SECONDARY source, only when JSON-LD is missing or incomplete. og:image is consulted for the card image when no Schema ImageObject is present. Keep og:* aligned with JSON-LD to avoid Goldmine rewriting.
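One way to keep og:* aligned with JSON-LD is to compare the two on the rendered page. The sketch below uses Python's standard HTML parser to extract og:title and the JSON-LD headline and reports them when they differ. It assumes a single top-level JSON-LD object per script tag, and the class and function names are illustrative.

```python
import json
from html.parser import HTMLParser

class TitleSources(HTMLParser):
    """Collect og:title and the JSON-LD headline from a page."""

    def __init__(self):
        super().__init__()
        self.found = {}
        self._in_jsonld = False
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "meta" and attrs.get("property") == "og:title":
            self.found["og:title"] = attrs.get("content") or ""
        elif tag == "script" and attrs.get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self._in_jsonld = False
            try:
                data = json.loads("".join(self._buffer))
            except json.JSONDecodeError:
                return  # Ignore malformed JSON-LD in this sketch
            if isinstance(data, dict) and "headline" in data:
                self.found["json-ld headline"] = data["headline"]

def title_mismatches(html):
    """Return the collected titles if they disagree, else an empty dict."""
    parser = TitleSources()
    parser.feed(html)
    parser.close()
    return parser.found if len(set(parser.found.values())) > 1 else {}

html_doc = """<html><head>
<meta property="og:title" content="Old headline">
<script type="application/ld+json">{"@type": "NewsArticle", "headline": "New headline"}</script>
</head></html>"""
print(title_mismatches(html_doc))
```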

Metadata

Twitter Card

The twitter:* meta family. Lowest priority for Discover.

Used by X/Twitter (and now, indirectly, by the creatorcontent pipeline). For Discover ranking purposes, it's the LAST fallback after JSON-LD and Open Graph. Still worth setting correctly because creatorcontent specifically reads it when articles are shared on X.

Metadata

Canonical URL

The single authoritative URL for a piece of content.

Declared via <link rel='canonical'>. When the same content lives at multiple URLs (mobile/desktop, syndication, paginated), canonical tells Discover which one is THE one. Inconsistent canonicals (a page that self-canonicalises in one place and points elsewhere in another) are a known way to fragment your own NavBoost score.

Indexing 8terms

Discoverability & indexing

Getting Googlebot to your articles fast — before the Discover window closes.

Indexing Free · 10 min setup

Google Publisher Center

Google's dashboard at publishercenter.google.com. Unlocks WPAS + faster crawl.

Free signup, ~10 minutes setup. Approved Publisher Center sites get: (1) the WPAS personalisation boost, (2) sub-minute crawl latency vs 1-24h for unregistered sites, (3) eligibility for Google News inclusion. The single highest ROI action for any Discover-aspiring site.

Indexing

Indexing API

Google's push-indexing endpoint. Officially restricted to job postings + live streams.

A REST API where you POST a URL and Google crawls it within minutes. Officially limited to JobPosting and BroadcastEvent schema types — but news publishers commonly use it as an unofficial fast-lane for breaking content. Use sparingly: abuse triggers throttling.
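For reference, a publish call is a small JSON POST. The sketch below only builds the request object and does not send it. The access token is a placeholder: obtaining a real OAuth 2.0 token for a Google Cloud service account is outside the scope of this sketch, and the helper name is illustrative. Bear in mind the official restriction to JobPosting and BroadcastEvent pages described above.

```python
import json
import urllib.request

INDEXING_ENDPOINT = "https://indexing.googleapis.com/v3/urlNotifications:publish"

def build_publish_request(page_url, access_token, notification_type="URL_UPDATED"):
    """Build (but do not send) an Indexing API notification for one URL."""
    if notification_type not in ("URL_UPDATED", "URL_DELETED"):
        raise ValueError("notification_type must be URL_UPDATED or URL_DELETED")
    body = json.dumps({"url": page_url, "type": notification_type}).encode("utf-8")
    return urllib.request.Request(
        INDEXING_ENDPOINT,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {access_token}",
        },
    )

# PLACEHOLDER_TOKEN is not a real credential
req = build_publish_request("https://example.com/news/example", "PLACEHOLDER_TOKEN")
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
```

Sending would be a matter of passing the request to urllib.request.urlopen once a valid token is in place.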

Indexing

News sitemap (news.xml)

Dedicated XML sitemap with <news:publication> blocks for sub-48h articles.

A specialised sitemap that signals 'these URLs are time-sensitive news' to Googlebot. URLs in news.xml get crawled within minutes (vs hours for regular sitemap.xml). Maximum 1000 URLs, only articles less than 48h old. Pings Search Console on every update.
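A minimal Python sketch of generating such a file with the standard library, applying the 48-hour and 1000-URL limits mentioned above. The function name, the tuple layout of the input and the example publication are assumptions for illustration.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timedelta, timezone

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
NEWS_NS = "http://www.google.com/schemas/sitemap-news/0.9"
ET.register_namespace("", SITEMAP_NS)
ET.register_namespace("news", NEWS_NS)

def build_news_sitemap(articles, publication_name, language, now=None):
    """articles: list of (url, title, published) with timezone-aware datetimes."""
    now = now or datetime.now(timezone.utc)
    # Keep only articles published within the last 48 hours
    fresh = [a for a in articles if now - a[2] <= timedelta(hours=48)]
    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    for url, title, published in fresh[:1000]:  # cap at 1000 URLs
        url_el = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
        ET.SubElement(url_el, f"{{{SITEMAP_NS}}}loc").text = url
        news = ET.SubElement(url_el, f"{{{NEWS_NS}}}news")
        pub = ET.SubElement(news, f"{{{NEWS_NS}}}publication")
        ET.SubElement(pub, f"{{{NEWS_NS}}}name").text = publication_name
        ET.SubElement(pub, f"{{{NEWS_NS}}}language").text = language
        ET.SubElement(news, f"{{{NEWS_NS}}}publication_date").text = published.isoformat()
        ET.SubElement(news, f"{{{NEWS_NS}}}title").text = title
    header = '<?xml version="1.0" encoding="UTF-8"?>\n'
    return header + ET.tostring(urlset, encoding="unicode")

fixed_now = datetime(2025, 1, 15, 12, 0, tzinfo=timezone.utc)
sample = [
    ("https://example.com/news/fresh", "Fresh", fixed_now - timedelta(hours=3)),
    ("https://example.com/news/stale", "Stale", fixed_now - timedelta(hours=72)),
]
sitemap_xml = build_news_sitemap(sample, "Example Gazette", "fr", now=fixed_now)
print(sitemap_xml)
```

Only the fresh article ends up in the output; the 72-hour-old one is filtered out before serialisation.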

Indexing

IndexNow

Open protocol for pushing URL updates to Bing and Yandex, often via CDNs such as Cloudflare. Google does NOT support it.

Often presented, confusingly, as a 'faster indexing' silver bullet, but Google has explicitly stated it ignores IndexNow signals. Useful for Bing visibility, neutral for Discover. Many CDNs (Cloudflare) enable it by default; that's fine, just don't expect Discover effects.

Indexing

Robots meta directives

The combo Discover needs: index, follow, max-image-preview:large, max-snippet:-1, max-video-preview:-1. Anything stricter (e.g. noindex, max-snippet:50) silently caps your distribution. Single most common cause of 'why doesn't my site appear in Discover at all?'.
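A small Python sketch of a checker for the content attribute of a robots meta tag, flagging the restrictive values mentioned above. The rule set and messages are illustrative assumptions, not an exhaustive validator.

```python
RECOMMENDED = "index, follow, max-image-preview:large, max-snippet:-1, max-video-preview:-1"

def robots_issues(content):
    """Return a list of directives in a robots meta value that would limit Discover."""
    values = {}
    for raw in content.split(","):
        name, _, value = raw.strip().lower().partition(":")
        if name:
            values[name] = value.strip()
    issues = []
    # "none" is shorthand for "noindex, nofollow"
    if "noindex" in values or "none" in values:
        issues.append("noindex blocks the page entirely")
    if values.get("max-image-preview") != "large":
        issues.append("max-image-preview:large is missing")
    for name in ("max-snippet", "max-video-preview"):
        if name in values and values[name] != "-1":
            issues.append(f"{name}:{values[name]} caps the preview")
    return issues

print(robots_issues(RECOMMENDED))
print(robots_issues("noindex, max-snippet:50"))
```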

Indexing

Crawl budget

The number of pages Googlebot crawls on your site per unit of time.

Limited resource. If your site has 100k thin pages, Googlebot wastes its budget there instead of crawling your fresh articles. Trim the deadwood (for example, noindex thin tag pages and paginated archives) to focus the budget where it matters.

Indexing

Google Search Console

Free Google dashboard. The only authoritative source for Discover impressions / clicks / CTR.

Free, takes 5 minutes to verify. The Discover-specific tab only appears once your site has accumulated > 100 Discover impressions in 16 months — its mere existence is a signal that you're eligible. Tracks Discover separately from Search.

Indexing

Discover performance report

The Discover-only view in Search Console. Appears once the site passes 100 Discover impressions within the 16-month window.

Shows total Discover impressions, clicks, CTR, and per-article performance over the last 16 months. Doesn't expose pipelines or NAIADES sub-types — for that you need third-party tools like 1492.vision or DiscoReady.

Penalty 3terms

Penalties & demotions

What gets you de-prioritised or banned — and how to detect it.

Penalty

Manual action

Human-reviewed penalty visible in Search Console.

A human at Google has reviewed your site and applied a penalty for a clear policy violation (spam, hidden text, hacked content, site reputation abuse). Listed in Search Console > Manual actions. Requires a reconsideration request to lift, often after concrete remediation.

Penalty

Algorithmic suppression

Silent demotion not reported in Search Console.

The system itself decided to demote you, no human involvement, no notification. Detectable only by sudden, sustained drops in Discover impressions while Search remains stable. Recovery means fixing the underlying signal — most often Helpful Content System triggers.
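The detection pattern described above (Discover drops sharply while Search holds steady) can be sketched as a comparison of two consecutive windows of daily impressions exported from Search Console. The 7-day window and the 50% and 20% thresholds below are arbitrary assumptions for illustration, not documented values.

```python
from statistics import mean

def looks_suppressed(discover, search, window=7, drop=0.5, stable=0.2):
    """discover/search: daily impression counts, oldest first, at least 2*window long."""
    if min(len(discover), len(search)) < 2 * window:
        raise ValueError("need at least two full windows of data")

    def change(series):
        # Relative change between the previous window and the latest one
        before = mean(series[-2 * window:-window])
        after = mean(series[-window:])
        return (after - before) / before if before else 0.0

    return change(discover) <= -drop and abs(change(search)) <= stable

discover_daily = [1000] * 7 + [300] * 7   # Discover falls by 70%
search_daily = [500] * 14                 # Search stays flat
print(looks_suppressed(discover_daily, search_daily))
```

A positive result is only a prompt to investigate; seasonal dips and news-cycle effects can produce the same shape.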

Penalty

Core / Spam update

Periodic algorithm refreshes. Discover impact often more violent than Search.

Google ships 3-6 named updates per year. Core updates re-tune ranking; Spam updates target policy violators. Discover sites can lose 70%+ of distribution in a single update — much more volatile than Search. Recovery rarely happens before the next update.

Amplification 6terms

Amplification & timing

How to maximise the chance Wave 1 triggers Wave 2 and Wave 3.

Amplification Hour 1 critical

Direct-clicks hack (100-300 clicks)

Pre-Discover Wave 2 trigger: 100-300 organic-looking direct clicks within the first hour.

Controversial but documented hack: in the first hour after publication, drive 100-300 'real-looking' direct clicks (push notifications, social shares, internal site CTAs). NavBoost reads the elevated engagement in hour 1 and triggers Wave 2 amplification. Risk: Google's anti-fraud systems detect mechanical patterns. The cleaner the source (real users), the safer.

Amplification

Push notification sequence

Coordinated push delivery to your subscriber base, timed to amplify Wave 1 signals.

Build a push subscriber list (OneSignal, Outpush). Within 30 minutes of publishing, push the article to your warm audience. Their clicks count as 'direct' in NavBoost — same effect as the direct-clicks hack but fully legitimate. Sites with 100k+ push subs can self-trigger Wave 2 reliably.

Amplification

Cold-start signal

The first impression an article makes on Discover. Hour 0-2 is critical.

Discover's algorithm makes a 'gut decision' in hour 0-2 based on Wave 1 signals. If CTR + dwell + scroll-depth all look great, Wave 2 fires within 90 minutes. If they're mediocre, the article gets quietly capped. Hour 2 onwards, the trajectory is mostly locked in.

Amplification

Publication timing

The hour-of-day you publish materially affects Wave 1 outcomes.

Discover's user activity peaks differ by topic. Lifestyle / horoscope: 5h-7h French time. Tech / business: 8h-10h. Entertainment: 19h-22h. Publishing into the right window can change Wave 1 by 5-10× and trigger Wave 2 that wouldn't have fired otherwise. Underrated lever.

Example

A horoscope article published at 4h French time, just ahead of the 5h-7h peak, often hits 100k visits. The same article at 14h might cap at 8k.

Amplification

Content refresh

Updating an existing article to trigger a new Discover cycle.

Material content updates (new sections, refreshed numbers, updated dates) on a previously-distributed article can trigger a new Discover cycle, with the benefit of the article's existing NavBoost history. Cosmetic-only updates (changing dateModified without real content change) are detected and demoted.

Amplification

Ryan Hoods effect

Author-page reach multiplier — coherent author identity boosts every article they sign.

Named after a notable case study. When an author has a strong, well-built profile (Schema Person + bio + sameAs to X/Wikipedia + active publication history), every new article they sign benefits from the author's accumulated authority. Often a 1.5-3× lift on Wave 1 CTR.

Ecosystem 8terms

Profile pages & FR ecosystem

The under-exploited entity layer and the players powering the French Discover market.

Ecosystem Free Profiler tool

profile.google.com/cp/…

URL format of Google's public Web Profile pages — the entity layer for sites and creators.

Every site that's been recognised by Google as a publishing entity has a Google Web Profile at profile.google.com/cp/[id]. The presence of this URL is the cleanest proof that your domain is in Google's entity index. The DiscoReady Profiler tool surfaces it for any site in one second.

Ecosystem

DiscoReady

All-in-one platform to master Google Discover. Free Profiler tool included.

The platform you're reading this glossary on. Free tools (Profiler, Title Lab, Image Validator, Schema Auditor, 1-min Audit) plus the Premium Guide that this glossary is built from.

Ecosystem 75% of creatorcontent FR

X (x.com)

Powers Discover's creatorcontent pipeline — 75% of FR sources.

The dominant source feeding the creatorcontent pipeline in France. A site without an active X presence (a verified handle posting article URLs with hooks, accumulating engagement) forgoes a growing share of Discover traffic.

Ecosystem

BFM TV

Volume-first editorial model. Present across multiple pipelines without dominance.

Reference case study: high publication frequency, broad topic coverage, never a clear winner on any single pipeline. Volume strategy works but caps your per-article ceiling — opposite of Ouest-France's pipeline-domination model.

Ecosystem

Ouest-France

Multi-pipeline editorial model. Dominant on 8-12 pipelines simultaneously.

The reference success case in French Discover. Builds tight thematic verticals (regional news, sports, lifestyle), each strong enough to dominate its respective pipeline. Captures the largest share of Moonstone, Webkicklocalstories and Mustntmiss in France.

Ecosystem

HBAgency

French ad network specialised in publishing. RPM 2-3× AdSense.

Once your Discover traffic is meaningful (~100k visits/month), graduating from AdSense to a specialised network like HBAgency typically yields 2-3× the per-thousand-impressions revenue. The single biggest monetisation lever for FR Discover sites.

Ecosystem

Outpush

French push-notification platform — popular for Discover-style amplification.

A French SaaS for browser/web push notifications. Frequently used to drive the 100-300 first-hour clicks that trigger Wave 2 amplification. Cleaner than mechanical click-injection because the clicks come from real opted-in subscribers.

Ecosystem

1492.vision

Discover-specific analytics platform. Maps which pipelines your articles ride.

Third-party analytics tool built specifically for Discover. Reveals which of the 20+ pipelines each of your articles entered, what their pipeline-level performance looks like, and how that compares to competitors. Around €200-500/month; it pays for itself above 100k Discover visits/month.
