Measuring Google AI Overviews: Activation, Source Quality, Claim Fidelity, and Publisher Impact
Pith reviewed 2026-05-15 02:09 UTC · model grok-4.3
The pith
Google AI Overviews activate on 13.7% of searches; 11% of their claims are unsupported by the cited pages, and well over half of those pages carry display ads, so publishers lose ad revenue when AIOs suppress clicks.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Issuing 55,393 trending queries across 19 categories shows AIO activation at 13.7% overall and 64.7% for question-form queries, with lower rates on politically sensitive topics. AIO-cited domains are more credible than co-displayed results but nearly 30% do not appear in those results. Of 98,020 decomposed atomic claims, 11.0% are unsupported by the cited pages, with omission as the main failure mode, and fidelity is independent of source quality. Well over half of AIO-cited pages carry display advertising, so publishers lose revenue when AIOs suppress clicks while Google's sponsored ads remain.
What carries the argument
Large-scale longitudinal query measurement combined with atomic claim decomposition to quantify activation rates, source credibility differences, unsupported claims, and advertising presence on cited pages.
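The pipeline described here can be sketched end to end. The record fields, function names, and toy values below are illustrative assumptions, not the authors' code; they only show how activation and unsupported rates fall out of per-query records.

```python
from dataclasses import dataclass, field

@dataclass
class QueryResult:
    query: str
    aio_shown: bool                                  # did an AI Overview activate?
    cited_domains: list = field(default_factory=list)
    claims: list = field(default_factory=list)       # atomic claims decomposed from the AIO
    unsupported: list = field(default_factory=list)  # claims not backed by cited pages

def summarize(results):
    """Aggregate the headline measurements from per-query records."""
    activated = [r for r in results if r.aio_shown]
    total_claims = sum(len(r.claims) for r in activated)
    total_unsupported = sum(len(r.unsupported) for r in activated)
    return {
        "activation_rate": len(activated) / len(results),
        "unsupported_rate": total_unsupported / total_claims if total_claims else 0.0,
    }

# Toy records standing in for the 55,393-query corpus
results = [
    QueryResult("q1", True, ["nih.gov"], ["c1", "c2"], ["c2"]),
    QueryResult("q2", False),
    QueryResult("q3", True, ["cdc.gov"], ["c3"], []),
    QueryResult("q4", False),
]
print(summarize(results))  # activation 2/4, unsupported 1/3
```

The paper's 13.7% and 11.0% figures are tallies of exactly this shape, computed over the full corpus rather than four toy records.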
Load-bearing premise
That the 55,393 trending queries represent typical user searches, and that responses can be decomposed into atomic claims reliably, without systematic bias in measuring unsupported content.
What would settle it
Re-running the measurement with a different query sample, or having human raters verify the support status of a random subset of the 98,020 claims, would show whether the reported activation and unsupported rates hold.
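How large would that human-verified subset need to be? A Wilson score interval around the reported 11.0% gives a rough answer; the subsample sizes below are illustrative assumptions, not figures from the paper.

```python
import math

def wilson_interval(p_hat, n, z=1.96):
    """95% Wilson score interval for a proportion estimated from n trials."""
    denom = 1 + z**2 / n
    center = (p_hat + z**2 / (2 * n)) / denom
    half = (z / denom) * math.sqrt(p_hat * (1 - p_hat) / n + z**2 / (4 * n**2))
    return center - half, center + half

# If human raters checked n randomly sampled claims and also found
# 11.0% unsupported, how tight would the estimate be?
for n in (100, 500, 2000):
    lo, hi = wilson_interval(0.110, n)
    print(f"n={n:5d}: [{lo:.3f}, {hi:.3f}]")
```

A few hundred rated claims already brackets the 11.0% figure fairly tightly, which is why a modest validation subsample would be informative.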
Original abstract
Google AI Overviews (AIOs) are arguably the most widely encountered deployment of generative AI, reaching over 2 billion users who may not realize the answers they see are AI-generated. Where search engines have traditionally surfaced ranked sources and left users to evaluate them, AIOs synthesize and deliver a single answer - giving Google unprecedented editorial control over what users read and know. We present a large-scale longitudinal measurement study, issuing 55,393 trending queries across 19 topical categories over a 40-day window (March 13 - April 21, 2026). We report four main findings. First, overall AIO activation is 13.7%, rising to 64.7% for question-form queries, while politically sensitive topics see markedly lower rates. Second, AIO-cited domains are more credible than co-displayed first-page results, yet nearly 30% do not appear in those results at all, indicating a source selection mechanism distinct from Google's ranking algorithm. Third, decomposing responses into 98,020 atomic claims, 11.0% are unsupported by the cited pages - with omission the dominant failure mode - and source quality and claim fidelity are largely independent. Fourth, well over half of AIO-cited pages carry display advertising, meaning publishers lose revenue when AIOs suppress the click-through, even as Google's own sponsored ads continue to appear on the same page. Together, these findings document a rapid transformation of the online information ecosystem whose consequences for epistemic security remain poorly understood.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript reports results from a 40-day longitudinal measurement study issuing 55,393 trending queries across 19 topical categories to Google. It finds AIO activation at 13.7% overall (64.7% for question-form queries, lower for politically sensitive topics), AIO-cited domains more credible than co-displayed first-page results yet ~30% absent from those results, 11.0% of 98,020 atomic claims unsupported by cited pages (omission dominant) with source quality and fidelity largely independent, and >50% of AIO-cited pages carrying display advertising.
Significance. If the measurements are reliable, the study supplies large-scale empirical evidence on activation rates, source selection distinct from ranking, claim fidelity, and publisher revenue displacement in generative search summaries. These observations bear directly on epistemic security, information ecosystem shifts, and advertising economics, providing a useful baseline for future work.
major comments (1)
- [Abstract / claim-fidelity section] Abstract (third finding) and associated methods: the decomposition of responses into 98,020 atomic claims and the 11.0% unsupported rate are load-bearing for the fidelity and independence claims, yet the manuscript provides no description of atomic-claim extraction criteria, support-judgment rules, inter-rater reliability, or human-validation subsample. Without these details the reported percentage cannot be verified and may embed systematic bias.
minor comments (1)
- [Abstract] The date window March 13–April 21, 2026 appears to lie in the future relative to the arXiv posting; confirm the correct interval or clarify the study timeline.
Simulated Author's Rebuttal
We thank the referee for their careful review and constructive feedback on our manuscript. We address the single major comment below and will revise the manuscript to incorporate the requested methodological details.
Point-by-point responses
Referee: [Abstract / claim-fidelity section] Abstract (third finding) and associated methods: the decomposition of responses into 98,020 atomic claims and the 11.0% unsupported rate are load-bearing for the fidelity and independence claims, yet the manuscript provides no description of atomic-claim extraction criteria, support-judgment rules, inter-rater reliability, or human-validation subsample. Without these details the reported percentage cannot be verified and may embed systematic bias.
Authors: We agree that the current version of the manuscript does not include a sufficiently detailed description of the atomic-claim extraction process, the rules used to judge support or omission, inter-rater reliability statistics, or the human-validation subsample. This information is necessary for reproducibility and to allow readers to evaluate potential bias. In the revised manuscript we will add a dedicated subsection to the Methods section that specifies: (1) the annotation guidelines and decision rules for decomposing AIO responses into atomic claims, (2) the criteria for classifying a claim as supported (direct quotation, close paraphrase, or logical entailment from the cited page), (3) inter-rater agreement results (including Cohen's kappa) obtained from a pilot study on a random subsample of claims, and (4) the size, selection procedure, and validation protocol for the human-reviewed subsample. These additions will be placed immediately before the presentation of the 11.0% unsupported rate so that the fidelity and independence findings can be properly assessed. No changes to the reported percentages themselves are required.
revision: yes
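The inter-rater agreement statistic the authors promise is straightforward to compute. The two label sequences below are placeholders for a real pilot subsample; the label set (SUPPORTED / OMITTED / CONTRADICTED) is an assumption about the annotation scheme.

```python
from collections import Counter

def cohens_kappa(a, b):
    """Cohen's kappa for two annotators labeling the same items."""
    assert len(a) == len(b)
    n = len(a)
    observed = sum(x == y for x, y in zip(a, b)) / n          # raw agreement
    ca, cb = Counter(a), Counter(b)
    expected = sum(ca[k] * cb[k] for k in set(ca) | set(cb)) / n**2  # chance agreement
    return (observed - expected) / (1 - expected)

# Hypothetical pilot annotations from two raters
r1 = ["SUPPORTED", "SUPPORTED", "OMITTED", "SUPPORTED", "CONTRADICTED", "OMITTED"]
r2 = ["SUPPORTED", "OMITTED",  "OMITTED", "SUPPORTED", "CONTRADICTED", "OMITTED"]
print(round(cohens_kappa(r1, r2), 3))  # → 0.739
```

Reporting kappa on such a subsample, rather than raw percent agreement, is what makes the claimed reliability interpretable, since chance agreement is high when one label dominates.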
Circularity Check
No circularity: purely observational measurement study
Full rationale
The paper conducts a longitudinal measurement by issuing 55,393 queries, recording AIO activation rates, identifying cited domains, decomposing responses into 98,020 atomic claims, and counting unsupported claims plus advertising presence. No equations, fitted parameters, self-referential derivations, or load-bearing self-citations appear in the reported findings. All percentages (13.7% activation, 11.0% unsupported, etc.) are direct empirical tallies from the collected data, not reduced to prior quantities by construction. The study is self-contained against external benchmarks and contains no derivation chain that collapses to its inputs.
Axiom & Free-Parameter Ledger
axioms (2)
- domain assumption: Trending queries obtained from Google Trends are representative of typical user search behavior across the 19 categories.
- domain assumption: Atomic claims can be reliably extracted from AIO text and checked against cited pages without systematic omission bias.
Claim extraction and verification rules

Each atomic claim must be:
- Verifiable: it can in principle be checked true or false against evidence.
- Specific: it states a concrete fact, event, attribute, relationship, quantity, date, ranking, or action.
- Decontextualized: it is fully understandable on its own, and its meaning in isolation matches its meaning in the AI Overview.
- Entailed: if the AI Overview is true, the claim must also be true.

Extraction rules:
- Extract only claims that are explicitly supported by the AI Overview text. Do not use outside knowledge.
- Do not invent or normalize missing details. If the text is vague, keep the claim equally vague or omit it.
- If a statement is generic, normative, speculative, promotional, advisory, subjective, or otherwise not specifically verifiable, do not extract it.
- If a sentence contains both generic language and one buried specific fact, extract only the specific fact.
- If the text says that a person, organization, government body, report, court, source, or expert said, reported, announced, recommended, warned, found, highlighted, or did something, preserve that attribution when it is part of the meaning.
- Resolve references when the text clearly supports it: replace pronouns or shorthand with the fully specified referent when recoverable from nearby context; expand partial names only when the full name is present in the AI Overview; otherwise leave them unresolved only if the claim is still understandable and faithful.
- If a statement has multiple plausible interpretations and the AI Overview does not clearly resolve the ambiguity, do not extract a claim from that ambiguous part.
- Split multi-fact sentences into the simplest discrete factual claims that remain natural and useful for fact-checking.
- Do not extract duplicate claims or near-duplicates.
- Do not include citations, source names, bullet labels, headings, or formatting artifacts unless they are themselves part of a factual claim. What to omit: opinions, praise, hype, or value judgments; advice, instructions, or recommendations to the reader; vague trend language without a checkable proposition; rhetorical summaries; section head...

Verification rules:
- You MUST output exactly one entry per claim listed above.
- matched_references should list ALL reference IDs (R1, R2, etc.) that are relevant to the claim. Use an empty list [] for OMITTED.
- evidence should quote or closely paraphrase the specific text from references. Use "No relevant content found" for OMITTED.
- confidence reflects how clearly the references support your judgment (1.0 = unambiguous match or contradiction, 0.5 = borderline).
- CRITICAL: when a claim contains numbers, dates, or attributes paired with specific entities, verify that each value is assigned to the CORRECT entity. If the claim assigns value A to entity X and value B to entity Y, but the reference assigns value A to entity Y and value B to entity X, that is INCORRECT: the values are swapped. Do not label a claim CLEAR just because the ...
- Only answer with the specified JSON array, no other text.
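The output format these verification rules describe can be checked mechanically. The field names below follow the prompt text (matched_references, evidence, confidence, OMITTED); the exact label vocabulary and record shape are assumptions, since the full schema is not shown.

```python
import json

def validate_entry(entry):
    """Check one verification record against the prompt's stated constraints."""
    assert isinstance(entry["matched_references"], list)
    for rid in entry["matched_references"]:
        assert rid.startswith("R")  # reference IDs like R1, R2, ...
    if entry["label"] == "OMITTED":
        # OMITTED claims must carry no references and the fixed evidence string
        assert entry["matched_references"] == []
        assert entry["evidence"] == "No relevant content found"
    assert 0.0 <= entry["confidence"] <= 1.0
    return True

# A hypothetical model response: a JSON array with one entry per claim
record = json.loads("""[
  {"claim": "Example claim", "label": "OMITTED",
   "matched_references": [], "evidence": "No relevant content found",
   "confidence": 0.5}
]""")
assert all(validate_entry(e) for e in record)
print("ok")
```

A validator like this, run over all 98,020 claim judgments, would catch malformed model output before it could bias the unsupported-claim tally.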