arxiv: 2606.20065 · v1 · pith:CN6XGJXKnew · submitted 2026-06-18 · 💻 cs.IR · cs.CL · cs.CY

Generative Engine Optimization at Scale: Measuring Brand Visibility Across AI Search Engines

Pith reviewed 2026-06-26 15:43 UTC · model grok-4.3

classification 💻 cs.IR · cs.CL · cs.CY
keywords generative engine optimization · AI search visibility · brand visibility · three-tier ladder · citation sources · listicle content · sentiment instability · AI answer engines

The pith

AI search engines display brands according to a three-tier visibility ladder based on stature.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper measures brand visibility across AI search engines using over 100,000 prompt responses from more than 100 brands. It finds that appearance rates form clear tiers: 73 percent for global household names, 44 percent for mid-market brands, and 11 percent for niche ones. Corporate websites dominate citations at 78 percent, with best-of listicles as the top content format. Sentiment around brands changes much more readily than whether they are mentioned at all. This baseline matters because it shows how visibility in AI answers varies strongly by brand maturity and provides protocols for testing improvements.

Core claim

First visibility runs form a clear three-tier brand-stature ladder: global household names appear in 73% of relevant AI answers, established mid-market brands in 44%, and niche brands in 11%. About 78% of cited sources are corporate websites, YouTube leads the non-corporate sources, and ranked best-of listicles account for about 21% of all citations. Sentiment framing flips about 6.7 times more often than mentions themselves.
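
The abstract describes the ladder as dropping "about 30 percentage points per step". A minimal Python sketch checks that arithmetic against the three reported rates; the variable names are illustrative and only the three percentages come from the paper.

```python
# Tier appearance rates as reported in the abstract (percent of relevant AI answers).
tier_rates = {"global": 73, "mid_market": 44, "niche": 11}

ordered = list(tier_rates.values())
# Drop in percentage points between each tier and the next one down.
step_gaps = [upper - lower for upper, lower in zip(ordered, ordered[1:])]
mean_gap = sum(step_gaps) / len(step_gaps)

print(step_gaps)  # [29, 33]
print(mean_gap)   # 31.0
```

The two steps are 29 and 33 points, so "about 30 per step" is a rounded average rather than a constant gap.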

What carries the argument

The three-tier brand-stature ladder measured from first visibility runs on 100K+ prompt responses.

If this is right

  • AI brand visibility differs by platform and brand maturity.
  • The highest-leverage content format is the ranked best-of listicle.
  • Sentiment is an unstable signal compared to mere mention.
  • Seven v1.1 protocols can test whether specific changes improve AI visibility.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Marketers for smaller brands could prioritize getting featured in listicles to boost visibility.
  • Different AI engines may require tailored strategies due to platform differences.
  • Tracking visibility separately from sentiment could give a more stable view of presence.
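
One way to track presence and sentiment separately is to compute a flip rate for each across consecutive runs. The Python sketch below is an illustration only: the record fields, the run history, and the choice to compare sentiment only between runs where the brand was mentioned both times are all assumptions, not the paper's documented method.

```python
def mention_flip_rate(runs):
    """Fraction of consecutive run pairs where mentioned/not-mentioned changed."""
    pairs = list(zip(runs, runs[1:]))
    flips = sum(a["mentioned"] != b["mentioned"] for a, b in pairs)
    return flips / len(pairs) if pairs else 0.0


def sentiment_flip_rate(runs):
    """Fraction of consecutive pairs, both mentioning the brand, where sentiment changed."""
    pairs = [(a, b) for a, b in zip(runs, runs[1:])
             if a["mentioned"] and b["mentioned"]]
    flips = sum(a["sentiment"] != b["sentiment"] for a, b in pairs)
    return flips / len(pairs) if pairs else 0.0


# Hypothetical run history for one brand and prompt (not data from the paper).
runs = [
    {"mentioned": True, "sentiment": "positive"},
    {"mentioned": True, "sentiment": "negative"},
    {"mentioned": True, "sentiment": "positive"},
    {"mentioned": False, "sentiment": None},
    {"mentioned": True, "sentiment": "positive"},
]

print(mention_flip_rate(runs), sentiment_flip_rate(runs))
```

Dividing the second rate by the first gives a ratio comparable in spirit to the paper's 6.7 figure, though the result depends heavily on how pairs are defined.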

Load-bearing premise

The prompts used in the analysis are representative of typical user queries to AI search engines and the brands tracked are a fair sample across tiers without selection bias.

What would settle it

The claim would be undermined if repeating the analysis with a fresh set of prompts, or with a broader, independently selected group of brands, produced materially different tier percentages or citation patterns.
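
A replication of this kind needs some rule for deciding when a rerun "differs". The Python sketch below uses Wilson score intervals and a simple overlap check as one conservative heuristic. The counts are hypothetical, since the paper reports rates without per-tier sample sizes, and the overlap rule is an assumption of this sketch rather than anything the authors propose.

```python
import math


def wilson_interval(successes, trials, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 is roughly 95%)."""
    if trials <= 0:
        raise ValueError("trials must be positive")
    p = successes / trials
    denom = 1 + z**2 / trials
    centre = (p + z**2 / (2 * trials)) / denom
    half = z * math.sqrt(p * (1 - p) / trials + z**2 / (4 * trials**2)) / denom
    return centre - half, centre + half


def intervals_overlap(a, b):
    """True if the two (low, high) intervals share any point."""
    return a[0] <= b[1] and b[0] <= a[1]


# Hypothetical counts: 11% appearance in the original run, 19% in an imagined rerun.
original = wilson_interval(110, 1000)
rerun = wilson_interval(190, 1000)

print(original, rerun, intervals_overlap(original, rerun))
```

Non-overlapping intervals suggest the rerun rate is genuinely different; overlapping intervals do not prove the rates are equal, so a formal two-proportion test would be the stricter follow-up.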

Original abstract

People increasingly get answers straight from AI search engines like ChatGPT, Claude, Perplexity, and Gemini rather than scrolling search results. Brands that once focused on search engine optimization (SEO) must now optimize for how these engines represent, cite, and recommend them -- a shift variously called Generative Engine Optimization (GEO), Answer Engine Optimization (AEO), and AI Search Visibility. We treat AEO and AI Visibility as part of GEO, and study how to measure brand visibility across AI engines: what they value when they cite a brand, which sources they rely on, and what content large language models surface. The hard case is everyone outside the already-authoritative top brands -- SMEs, D2C brands, creators, and early-stage startups. We analyze 100K+ prompt responses across 100+ brands tracked on Ranqo between March and May 2026. First visibility runs form a clear three-tier brand-stature ladder: global household names (e.g., Stripe, Nike) appear in 73% of relevant AI answers on their first run; established mid-market and regional brands (e.g., Olipop, Klaviyo) in 44%; niche and small brands in just 11% -- about 30 percentage points per step. When engines cite sources, about 78% go to corporate websites; among non-corporate sources YouTube leads, ahead of Reddit, editorial media, and Wikipedia. The highest-leverage page is the ranked "best-of" listicle, the most-cited content format at about 21% of all citations. Sentiment is the unstable signal: whether a brand is framed positively or negatively flips about 6.7 times more often than whether it is mentioned at all. These findings provide a first large-scale baseline for measuring GEO: AI brand visibility can be measured, differs by platform, and varies strongly by brand maturity. We close by proposing seven v1.1 protocols to test whether specific recommendations can causally improve AI visibility.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 0 minor

Summary. The paper presents an observational analysis of brand visibility across AI search engines (ChatGPT, Claude, Perplexity, Gemini) based on 100K+ prompt responses from 100+ brands tracked on the Ranqo platform between March and May 2026. It reports a three-tier visibility ladder on first runs (global household names at 73%, mid-market/regional at 44%, niche/small at 11%), with 78% of citations to corporate websites, listicles as the top-cited format (21%), and sentiment as an unstable signal (flipping 6.7 times more often than mention). The work positions these as a baseline for Generative Engine Optimization (GEO) and proposes seven v1.1 protocols for causal testing.

Significance. If the sampling and tiering are representative, the study supplies a valuable first large-scale empirical baseline for measuring AI brand visibility, quantifying stature-based gaps and highlighting citation patterns that could guide both research and SME strategies. The scale (100K+ responses) and forward-looking protocols add utility beyond pure description.

major comments (2)
  1. [Abstract / Methods] Abstract and (presumed) Methods: The reported 73/44/11 visibility ladder is computed from brands 'tracked on Ranqo' with no disclosed recruitment process, independent tier-assignment criteria, or verification that tier labels are exogenous to visibility outcomes. This selection mechanism is load-bearing for the central claim of a stature-driven gap.
  2. [Abstract / Methods] Abstract and (presumed) Methods: The 100K+ prompts lack any description of sampling frame, stratification, or validation against real user query distributions; if prompts are disproportionately brand-specific or visibility-seeking, the tier differences are confounded by query construction rather than engine behavior.
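
The confounding concern in comment 2 could be probed by stratifying appearance rates by prompt type and checking whether the tier gap persists among prompts that do not name a brand. The Python sketch below shows the grouping step only; the records, field names, and the "branded"/"generic" categories are hypothetical and do not come from the paper.

```python
from collections import defaultdict

# Hypothetical records; field names and prompt categories are assumptions.
records = [
    {"tier": "global", "prompt_type": "branded", "appeared": True},
    {"tier": "global", "prompt_type": "generic", "appeared": True},
    {"tier": "global", "prompt_type": "generic", "appeared": False},
    {"tier": "niche", "prompt_type": "branded", "appeared": True},
    {"tier": "niche", "prompt_type": "generic", "appeared": False},
    {"tier": "niche", "prompt_type": "generic", "appeared": False},
]


def appearance_rates(records):
    """Appearance rate for each (tier, prompt_type) stratum."""
    counts = defaultdict(lambda: [0, 0])  # key -> [appearances, total]
    for r in records:
        key = (r["tier"], r["prompt_type"])
        counts[key][0] += int(r["appeared"])
        counts[key][1] += 1
    return {key: hits / total for key, (hits, total) in counts.items()}


rates = appearance_rates(records)
for key in sorted(rates):
    print(key, rates[key])
```

If the gap between tiers shrinks or vanishes within the generic stratum, query construction would be a plausible driver of the headline ladder.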

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the careful reading and for identifying key areas where methodological transparency can be strengthened. We address each major comment below and will revise the manuscript accordingly.

Point-by-point responses
  1. Referee: [Abstract / Methods] Abstract and (presumed) Methods: The reported 73/44/11 visibility ladder is computed from brands 'tracked on Ranqo' with no disclosed recruitment process, independent tier-assignment criteria, or verification that tier labels are exogenous to visibility outcomes. This selection mechanism is load-bearing for the central claim of a stature-driven gap.

    Authors: We agree that the manuscript should have provided explicit details on these points. The tier labels were assigned using observable, pre-existing brand characteristics (global recognition, market presence, and revenue scale) drawn from public sources and intended to be independent of the AI visibility measurements. However, the current text does not document the exact assignment rules or recruitment process for the Ranqo-tracked brands. In revision we will add a Methods subsection that (a) states the tier criteria with examples of the public metrics used, (b) describes the platform recruitment process to the extent it is known, and (c) discusses the assumption of exogeneity together with any limitations. We view this as a necessary clarification rather than a change to the underlying data.

    Revision: yes

  2. Referee: [Abstract / Methods] Abstract and (presumed) Methods: The 100K+ prompts lack any description of sampling frame, stratification, or validation against real user query distributions; if prompts are disproportionately brand-specific or visibility-seeking, the tier differences are confounded by query construction rather than engine behavior.

    Authors: This concern is valid. The manuscript does not currently describe the prompt-generation process, sampling frame, or any validation against external query distributions. The prompts were constructed to be brand-relevant and representative of typical user questions, but without documented stratification or external benchmarking, confounding from query design cannot be ruled out. In the revision we will expand the Methods section to detail how prompts were generated, any steps taken to diversify them, and a limitations paragraph addressing potential selection effects. We will also note that the observed tier gaps are conditional on the prompt set used.

    Revision: yes

Circularity Check

0 steps flagged

No circularity: purely observational measurement of visibility counts

Full rationale

The paper performs direct empirical counting of brand mentions across AI responses to 100K+ prompts. No equations, fitted parameters, predictions, or derivations are present that could reduce to self-defined quantities or self-citations. The reported 73/44/11 tier ladder is computed from observed frequencies on the tracked brands; tier labels and visibility rates are independent of any internal model or ansatz. Selection-bias concerns (Ranqo sample) affect external validity but do not create circularity in the reported measurements themselves.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The central claim rests on the representativeness of the sampled prompts and brands as a domain assumption. No free parameters or invented entities.

axioms (1)
  • Domain assumption: The selected prompts and brands represent real-world AI search behavior.
    The analysis relies on this to generalize the three-tier ladder.

pith-pipeline@v0.9.1-grok · 5904 in / 1310 out tokens · 31707 ms · 2026-06-26T15:43:04.124758+00:00 · methodology

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. How Large Language Models Source Brand Reputation Across Languages and Markets

    cs.IR · 2026-06 · unverdicted · novelty 5.0

    LLMs cite third-party domains for 85.7% of brand attributions, with Wikipedia dominant in most languages, a long-tailed domain distribution, and market-specific shifts such as YouTube and HR sites in Poland.

Reference graph

Works this paper leans on

15 extracted references · 1 linked inside Pith · cited by 1 Pith paper

  1. P. Aggarwal, V. Murahari, T. Rajpurohit, A. Kalyan, K. Narasimhan, and A. Deshpande. GEO: Generative Engine Optimization. KDD 2024. arXiv:2311.09735

  2. H. Puerto, M. Gubri, T. Green, S. J. Oh, and S. Yun. C-SEO Bench: Does Conversational SEO Work? NeurIPS Datasets and Benchmarks 2025. arXiv:2506.11097

  3. A. Algaba, V. Holst, F. Tori, M. Mobini, B. Verbeken, S. Wenmackers, and V. Ginis. How Deep Do Large Language Models Internalize Scientific Literature and Citation Practices? arXiv:2504.02767, April 2025

  4. E. Kirsten, J. Grosse Perdekamp, M. Upadhyay, K. P. Gummadi, and M. B. Zafar. Characterizing Web Search in the Age of Generative AI. arXiv:2510.11560, October 2025

  5. K.-C. Yang. News Source Citing Patterns in AI Search Systems. arXiv:2507.05301, July 2025

  6. Ranqo. GEO vs AEO vs SEO: Three Measurement Views of the Same Work. April 2026. https://ranqo.ai/blog/geo-vs-aeo-vs-seo

  7. Ranqo. What AI Platforms Really Recommend When You Ask About CRM Software. February 2026. https://ranqo.ai/blog/ai-platforms-crm-recommendations-study

  8. Ranqo. What is Generative Engine Optimization (GEO)? The Complete 2026 Guide. April 2026. https://ranqo.ai/blog/what-is-generative-engine-optimization-geo-guide

  9. Ranqo. The 5 Factors That Determine Whether AI Cites Your Brand. April 2026. https://ranqo.ai/blog/5-factors-ai-cites-your-brand

  10. Ranqo. How to Get Cited by Perplexity: The Citation-Engine Playbook. April 2026. https://ranqo.ai/blog/how-to-get-cited-by-perplexity

  11. Ranqo. AI Visibility for SaaS: The Complete B2B Playbook. April 2026. https://ranqo.ai/blog/ai-visibility-for-saas-b2b-playbook

  12. Ranqo. AI Visibility for E-commerce & DTC Brands: It's Research-and-Handoff, Not Search. April 2026. https://ranqo.ai/blog/ai-visibility-for-ecommerce-dtc

  13. Ranqo. The E-E-A-T Playbook for AI Citations: Visible Authority Beats Markup Theatre. May 2026. https://ranqo.ai/blog/eeat-playbook-ai-citations

  14. Ranqo. Schema Markup for AI Citations: A Complete Guide. April 2026. https://ranqo.ai/blog/schema-markup-for-ai-citations

  15. Ranqo. How to Measure AI Share of Voice: The Three Decisions That Change the Number. June 2026. https://ranqo.ai/blog/how-to-measure-ai-share-of-voice