• GEO

What We Learned from 10,000 Generative Answers in 2025

  • Felix Rose-Collins
  • 5 min read

Intro

In 2025, generative search finally crossed a threshold. It stopped being an experiment and became the primary way hundreds of millions of people interact with information.

To understand how this shift changes discovery, we conducted one of the largest independent GEO research efforts to date:

10,000 generative answers, analyzed across 7 major engines, over 4 months, spanning 5 query categories and 100+ brands.

This article summarizes the most important insights — what generative engines do, how they choose sources, what patterns emerged, which brands win or lose, and what this means for the future of optimization.

This is the definitive “state of generative answers” report for 2025.

Part 1: The Project Overview — What We Tested

Across 10,000 generative answers, we tracked:

  • inclusion frequency

  • citation patterns

  • reasoning behavior

  • hallucination types

  • fact drift over time

  • generative bias

  • multi-modal influence

  • answer structures

  • entity classification

  • category-level dominance
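To make the first two metrics concrete, here is a minimal sketch of how inclusion frequency and citation counts can be computed from an answer log. The record schema and domain names are hypothetical, not the study's actual data model.

```python
from collections import Counter

# Each record: one generative answer and the domains it cited.
# The schema here is illustrative, not the study's actual pipeline.
answers = [
    {"engine": "perplexity", "cited_domains": ["a.com", "b.com"]},
    {"engine": "perplexity", "cited_domains": ["a.com"]},
    {"engine": "sge", "cited_domains": ["c.com", "a.com"]},
]

def inclusion_frequency(answers, domain):
    """Share of answers that cite `domain` at least once."""
    hits = sum(domain in a["cited_domains"] for a in answers)
    return hits / len(answers)

def citation_counts(answers):
    """How often each domain is cited across all answers."""
    counts = Counter()
    for a in answers:
        counts.update(a["cited_domains"])
    return counts

print(inclusion_frequency(answers, "a.com"))        # 1.0
print(citation_counts(answers).most_common(1))      # [('a.com', 3)]
```

Tracking these two numbers per engine and per query category is enough to reproduce the category-level dominance comparisons discussed below.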

Queries came from 5 groups:

1. Informational

Definitions, how-tos, explanations, facts.

2. Transactional

Comparisons, product choices, service providers.

3. Brand-Level

“What is X?”, “Who owns X?”, “X vs Y.”

4. Multi-Modal

Images, screenshots, charts, videos.

5. Agentic

Multi-step workflows, research instructions, tool-use queries.

Engines included:

  • Google SGE

  • Bing Copilot

  • ChatGPT Search

  • Perplexity

  • Claude Search

  • Brave Summaries

  • You.com

This dataset is the clearest snapshot yet of how AI answers are being constructed in the wild.

Part 2: The 10 Most Important Findings (Summary)

Here are the top takeaways before we dive deep:

1. Generative answers are written using very few sources — typically 3–10.

2. Entity clarity was the strongest predictor of inclusion.

3. Original data was cited far more often than any other content.

4. Outdated pages were excluded almost universally.

5. Canonical definitions shaped how brands were described.

6. Multi-modal assets influenced which brands were selected.

7. Hallucinations decreased, but misclassification increased.

8. Cross-web consistency strongly influenced trust scoring.

9. Agents modified answers based on multi-step reasoning.

10. SERP-first SEO factors barely predicted generative visibility.

Let’s break down the details.

Part 3: Finding #1 — Models Use Far Fewer Sources Than Expected

Despite retrieving dozens or hundreds of pages:

Generative answers are typically built from 3–10 selected sources.

This is consistent across:

  • short answers

  • long explanations

  • comparisons

  • multi-step reasoning

  • agentic workflows

If you aren’t one of the 3–10 sources that survive filtering, you are invisible.

Meet Ranktracker

The All-in-One Platform for Effective SEO

Behind every successful business is a strong SEO campaign. But with countless optimization tools and techniques out there to choose from, it can be hard to know where to start. Well, fear no more, because we've got just the thing to help: the Ranktracker all-in-one platform for effective SEO.

We have finally opened registration to Ranktracker absolutely free!

Create a free account

Or Sign in using your credentials

This is the biggest shift from the SERP era:

Visibility ≠ ranking. Visibility = inclusion.

Part 4: Finding #2 — Entity Clarity Was the Strongest Predictor of Visibility

The brands with the best visibility across engines shared one universal trait:

AI could answer “What is this?” with perfect confidence.

We observed three levels of entity clarity:

Level 1 — Crystal clear: consistent, unambiguous, canonical. These brands dominated generative visibility.

Level 2 — Partially clear: some inconsistencies. These brands appeared occasionally.

Level 3 — Ambiguous: conflicting descriptions. These brands were almost completely excluded.

Entity clarity beats:

  • backlinks

  • domain rating

  • content length

  • keyword density

  • domain age

It is the #1 GEO factor across our entire dataset.

Part 5: Finding #3 — Original Data Outperformed All Other Content Types

Generative engines overwhelmingly favored:

  • proprietary studies

  • statistics

  • benchmarks

  • whitepapers

  • research reports

  • survey findings

Any content that existed nowhere else.


Brands with original data had:

  • 3–4× higher inclusion rates

  • 5× more stable citations

  • near-zero hallucination risk

The engines want first-source evidence, not rewritten SEO content.

Part 6: Finding #4 — Recency Was More Important Than Authority

This was surprising even to us:

Engines consistently downranked outdated pages even if they came from high-authority domains.

Recency mattered enormously.

A page updated in the last 90 days outperformed:

  • higher DR competitors

  • longer content

  • more linked pages

  • older evergreen guides

Models interpret recency as credibility.

Part 7: Finding #5 — Canonical Definitions Shape How AI Describes You

We observed a direct relationship between:

  • the format of a brand’s canonical page

  • the wording used in generative summaries

Simple, structured definitions reliably showed up in answers verbatim.

This means:

You can shape how the generative web describes you — by shaping your canonical definitions.

This is the new “snippet optimization.”
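One practical way to reinforce a canonical definition is structured data on the brand's canonical page. The sketch below emits a minimal schema.org `Organization` block; the brand name, URLs, and description are placeholders, and the one-sentence, definition-first `description` is the wording engines are most likely to lift verbatim.

```python
import json

# Minimal schema.org JSON-LD for a brand's canonical definition.
# All names and URLs below are placeholders for illustration.
canonical = {
    "@context": "https://schema.org",
    "@type": "Organization",
    "name": "ExampleBrand",
    "url": "https://example.com",
    # Definition-first, one sentence: this is the phrasing a
    # generative engine is most likely to reuse word-for-word.
    "description": "ExampleBrand is a rank-tracking platform for SEO teams.",
    # sameAs links tie the entity to its cross-web profiles.
    "sameAs": [
        "https://www.linkedin.com/company/examplebrand",
        "https://www.wikidata.org/wiki/Q000000",
    ],
}

print(json.dumps(canonical, indent=2))
```

Embedding the resulting JSON in a `<script type="application/ld+json">` tag keeps the definition machine-readable alongside the human-readable page copy.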

Part 8: Finding #6 — Multi-Modal Assets Played an Unexpected Role

Generative engines increasingly used:

  • screenshots

  • UI examples

  • product images

  • diagrams

  • videos

as supporting evidence.

Brands with:

  • consistent design

  • well-lit images

  • annotated visuals

  • video demos

appeared more often and were described more accurately.

Visual clarity = generative clarity.

Part 9: Finding #7 — Hallucinations Are Down, But Misclassification Is Up

Hallucinations dropped significantly across engines.

But a new problem emerged:

Misclassification — AI placing brands in the wrong category.

Examples:

  • calling a SaaS platform a “tool” instead of a “suite”

  • misidentifying product tiers

  • mixing up competitors

  • merging two brands’ features

  • confusing the parent company with the product

These errors almost always traced back to:

  • weak canonical data

  • inconsistent product naming

  • outdated support pages

Brands that updated definitions monthly had significantly lower misclassification rates.

Part 10: Finding #8 — Cross-Web Consistency Weighed Heavily in Selection

Engines checked:

  • LinkedIn

  • Wikipedia

  • Wikidata

  • Crunchbase

  • G2

  • GitHub

  • social profiles

  • schema

  • third-party reviews

against each other.

If facts matched → trust increased. If facts conflicted → exclusion happened.

Cross-web consistency was a top-5 ranking factor.
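The kind of check engines appear to run can be sketched as a simple field-by-field comparison across profiles. The profile sources, field names, and values below are hypothetical; the point is that a single disagreeing field (here, founding year) is enough to surface a conflict.

```python
# Toy consistency check across third-party profiles.
# Sources, field names, and values are hypothetical.
profiles = {
    "website":    {"founded": "2014", "hq": "London", "category": "SEO software"},
    "linkedin":   {"founded": "2014", "hq": "London", "category": "SEO software"},
    "crunchbase": {"founded": "2015", "hq": "London", "category": "SEO software"},
}

def conflicting_facts(profiles):
    """Return fields whose values disagree across profiles."""
    conflicts = {}
    fields = {f for p in profiles.values() for f in p}
    for field in fields:
        values = {p.get(field) for p in profiles.values()}
        values.discard(None)  # ignore profiles missing the field
        if len(values) > 1:
            conflicts[field] = sorted(values)
    return conflicts

print(conflicting_facts(profiles))  # {'founded': ['2014', '2015']}
```

Running a check like this against your own listings is a cheap way to find the contradictions that, per this finding, trigger exclusion.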

Part 11: Finding #9 — Agentic Reasoning Boosted Some Brands and Hurt Others

Agentic queries are multi-step instructions:

“Research X, compare providers, summarize options, and recommend the best one.”

We observed:

Brands with strong structured comparisons were chosen more often.

Engines wanted:

  • pros & cons

  • transparent pricing

  • clear positioning

  • use-case lists

  • feature breakdowns

Brands that hid weaknesses or obscured features lost inclusion.

Part 12: Finding #10 — SEO Strength Did Not Predict Generative Visibility

This is the clearest finding of all:

High-ranking SEO brands often performed poorly in generative answers.

Why?

Because generative visibility depends on:

  • clarity

  • consistency

  • authority

  • recency

  • originality

  • trustworthiness

  • structured data

— not on keyword rankings.

We saw brands with:

  • DR 20 outperform DR 80

  • 100-page sites outperform 10,000-page sites

  • focused domains outperform broad ones

Generative engines reward coherence, not volume.

Part 13: Secondary Findings Worth Noting

Beyond the top 10 insights, we found several additional patterns:

1. Engines penalize ambiguous product ecosystems

If you have too many overlapping products, clarity collapses.

2. Long paragraphs performed poorly

Structured content was consistently preferred.

3. Models reward “definition-first” content

Start with the answer → then expand.

4. Models dislike outdated screenshots

Old UI confused multi-modal recognition.

5. Engines prefer distinct brands over brand families

Parent/child relationships often got blurred or merged.

6. Engines heavily downranked affiliate sites

Lack of originality = exclusion.

7. Domain authority only mattered for trust, not inclusion

It was one signal, not the determining one.

Part 14: Industry-Level Insights From 10,000 Answers

Strongest generative visibility

  • SaaS

  • finance

  • health information

  • cybersecurity

  • analytics

  • developer tools

These industries had clear definitions and structured documentation.

Weakest

  • hospitality

  • travel

  • home services

  • creative agencies

  • local service providers

These industries suffered from vagueness and inconsistent naming.

Part 15: What Brands Can Do With These Insights (Action-Oriented Summary)

1. Strengthen your canonical definitions

This shapes how AI describes you.

2. Publish original research

This multiplies generative visibility.

3. Maintain strict cross-web consistency

This boosts trust and inclusion.

4. Update core pages monthly

Recency is not optional.

5. Create comparison-friendly content

Agents love structured breakdowns.

6. Maintain multi-modal alignment

Your images, screenshots, and UI matter now.

7. Eliminate contradictions

AI punishes ambiguity more than search engines do.

8. Prioritize entity clarity above all

This is the foundation of GEO.

Conclusion: Generative Answers Reveal a New Information Economy

The data across 10,000 generative answers confirms one thing:


We are entering an answer economy — not a link economy.

Visibility no longer depends on:

  • rankings

  • backlinks

  • keyword volume

  • SERP surfaces

It depends on:

  • clarity

  • facts

  • structure

  • recency

  • originality

  • entity coherence

  • multi-modal understanding

  • consistent cross-web identities

Generative engines don’t reward the biggest sites. They reward the clearest, most trustworthy, and most structured.

What we learned from 10,000 generative answers in 2025 is simple:

If you want visibility in the age of AI, you must optimize for how AI thinks — not how humans used to click.

Felix Rose-Collins

Ranktracker's CEO/CMO & Co-founder

Felix Rose-Collins is the Co-founder and CEO/CMO of Ranktracker. With over 15 years of SEO experience, he has single-handedly scaled the Ranktracker site to over 500,000 monthly visits, with 390,000 of these stemming from organic searches each month.
