pith. machine review for the scientific record.

arxiv: 2605.14021 · v1 · submitted 2026-05-13 · 💻 cs.CY · cs.AI

Recognition: no theorem link

Measuring Google AI Overviews: Activation, Source Quality, Claim Fidelity, and Publisher Impact

Authors on Pith: no claims yet

Pith reviewed 2026-05-15 02:09 UTC · model grok-4.3

classification 💻 cs.CY cs.AI
keywords Google AI Overviews · search measurement · generative AI · claim fidelity · source quality · publisher revenue · query activation · information ecosystem

The pith

Google AI Overviews activate on 13.7% of searches, 11% of their claims are unsupported by the pages they cite, and most of those pages carry display ads, so publishers lose ad revenue whenever the overview replaces the click-through.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper measures Google AI Overviews across 55,393 trending queries over 40 days, tracking how often they appear and how faithfully they summarize their sources. Activation reaches 13.7% overall but jumps to 64.7% for question-form queries while staying low on politically sensitive topics. Cited sources prove more credible than the co-displayed search results, yet nearly 30% are absent from those results, pointing to a separate selection process. Decomposing answers into 98,020 atomic claims reveals that 11% lack support from the cited pages, mostly through omission, and source quality does not predict claim fidelity. Over half of the cited pages carry display advertising, so publishers lose clicks and ad revenue when the AI summary replaces the original link.

Core claim

Issuing 55,393 trending queries across 19 categories shows AIO activation at 13.7% overall and 64.7% for question-form queries, with lower rates on politically sensitive topics. AIO-cited domains are more credible than co-displayed results but nearly 30% do not appear in those results. Of 98,020 decomposed atomic claims, 11.0% are unsupported by the cited pages, with omission as the main failure mode, and fidelity is independent of source quality. Well over half of AIO-cited pages carry display advertising, so publishers lose revenue when AIOs suppress clicks while Google's sponsored ads remain.
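The headline rates in this claim are plain proportions over query records. As a hedged illustration (the record fields and sample queries below are hypothetical, not the paper's actual schema), activation overall and for question-form queries reduces to:

```python
from dataclasses import dataclass

@dataclass
class QueryRecord:
    query: str
    is_question: bool  # question-form query ("how", "why", "what", ...)
    aio_shown: bool    # whether an AI Overview activated for this query

def activation_rate(records):
    """Fraction of queries for which an AI Overview appeared."""
    if not records:
        return 0.0
    return sum(r.aio_shown for r in records) / len(records)

# Toy sample mirroring the paper's pattern: question-form queries
# activate far more often than keyword queries.
records = [
    QueryRecord("how tall is the eiffel tower", True, True),
    QueryRecord("what causes tides", True, True),
    QueryRecord("eiffel tower", False, False),
    QueryRecord("weather paris", False, False),
    QueryRecord("why is the sky blue", True, False),
]

overall = activation_rate(records)                                    # 0.4
question_only = activation_rate([r for r in records if r.is_question])
```

The per-category and question-form breakdowns in the paper are the same tally restricted to subsets of records.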

What carries the argument

Large-scale longitudinal query measurement combined with atomic claim decomposition to quantify activation rates, source credibility differences, unsupported claims, and advertising presence on cited pages.
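The fidelity side of this machinery is a label tally over decomposed claims. A minimal sketch, assuming an illustrative three-label taxonomy (the review only states that omission dominates; the paper's exact label set is not reproduced here):

```python
from collections import Counter

# Illustrative verification labels for 100 atomic claims:
# "supported" (cited pages back the claim), "omission" (cited pages
# are silent), "contradiction" (cited pages disagree).
labels = ["supported"] * 89 + ["omission"] * 8 + ["contradiction"] * 3

counts = Counter(labels)
unsupported = counts["omission"] + counts["contradiction"]
unsupported_rate = unsupported / len(labels)        # 11 / 100 = 0.11

# "Omission as the main failure mode" means omissions make up most
# of the unsupported share.
omission_share = counts["omission"] / unsupported   # 8 / 11
```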

Load-bearing premise

That the 55,393 trending queries represent typical user searches, and that responses can be decomposed into atomic claims reliably, without systematic bias in the measurement of unsupported content.

What would settle it

Re-running the measurement with an independent query sample, or having human raters verify the support status of a random subset of the 98,020 claims, would show whether the reported activation and unsupported-claim rates hold.
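A back-of-the-envelope way to size that human check: a normal-approximation (Wald) interval around the observed unsupported rate shows how tightly a rated subset pins down the population value. The subset size and rate below are illustrative, not from the paper:

```python
import math

def wald_ci(p_hat, n, z=1.96):
    """Approximate 95% confidence interval for a proportion
    estimated from n independently rated claims."""
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - half_width, p_hat + half_width

# If raters confirmed ~11% unsupported in a 1,000-claim random subset
# of the 98,020 claims, the interval is roughly +/- 2 points wide,
# comfortably distinguishing 11% from, say, 5% or 20%.
low, high = wald_ci(0.11, 1000)
```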

Figures

Figures reproduced from arXiv: 2605.14021 by Haofei Xu, Jacob M. Montgomery, Umar Iqbal.

Figure 1. Our approach to characterizing the Google AIO ecosystem: (1) we begin by extracting top search queries …
Figure 2. Daily AIO activation rate (red line, 7-day moving average) …
Figure 3. Distribution of reference counts across 7,583 …
Figure 4. Median reference count by topical category.
Figure 5. Verification label distribution by topic category …
Original abstract

Google AI Overviews (AIOs) are arguably the most widely encountered deployment of generative AI, reaching over 2 billion users who may not realize the answers they see are AI-generated. Where search engines have traditionally surfaced ranked sources and left users to evaluate them, AIOs synthesize and deliver a single answer - giving Google unprecedented editorial control over what users read and know. We present a large-scale longitudinal measurement study, issuing 55,393 trending queries across 19 topical categories over a 40-day window (March 13 - April 21, 2026). We report four main findings. First, overall AIO activation is 13.7%, rising to 64.7% for question-form queries, while politically sensitive topics see markedly lower rates. Second, AIO-cited domains are more credible than co-displayed first-page results, yet nearly 30% do not appear in those results at all, indicating a source selection mechanism distinct from Google's ranking algorithm. Third, decomposing responses into 98,020 atomic claims, 11.0% are unsupported by the cited pages - with omission the dominant failure mode - and source quality and claim fidelity are largely independent. Fourth, well over half of AIO-cited pages carry display advertising, meaning publishers lose revenue when AIOs suppress the click-through, even as Google's own sponsored ads continue to appear on the same page. Together, these findings document a rapid transformation of the online information ecosystem whose consequences for epistemic security remain poorly understood.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

1 major / 1 minor

Summary. The manuscript reports results from a 40-day longitudinal measurement study issuing 55,393 trending queries across 19 topical categories to Google. It finds AIO activation at 13.7% overall (64.7% for question-form queries, lower for politically sensitive topics), AIO-cited domains more credible than co-displayed first-page results yet ~30% absent from those results, 11.0% of 98,020 atomic claims unsupported by cited pages (omission dominant) with source quality and fidelity largely independent, and >50% of AIO-cited pages carrying display advertising.

Significance. If the measurements are reliable, the study supplies large-scale empirical evidence on activation rates, source selection distinct from ranking, claim fidelity, and publisher revenue displacement in generative search summaries. These observations bear directly on epistemic security, information ecosystem shifts, and advertising economics, providing a useful baseline for future work.

major comments (1)
  1. [Abstract / claim-fidelity section] Abstract (third finding) and associated methods: the decomposition of responses into 98,020 atomic claims and the 11.0% unsupported rate are load-bearing for the fidelity and independence claims, yet the manuscript provides no description of atomic-claim extraction criteria, support-judgment rules, inter-rater reliability, or human-validation subsample. Without these details the reported percentage cannot be verified and may embed systematic bias.
minor comments (1)
  1. [Abstract] The date window March 13–April 21, 2026 appears to lie in the future relative to the arXiv posting; confirm the correct interval or clarify the study timeline.

Simulated Authors' Rebuttal

1 response · 0 unresolved

We thank the referee for their careful review and constructive feedback on our manuscript. We address the single major comment below and will revise the manuscript to incorporate the requested methodological details.

Point-by-point responses
  1. Referee: [Abstract / claim-fidelity section] Abstract (third finding) and associated methods: the decomposition of responses into 98,020 atomic claims and the 11.0% unsupported rate are load-bearing for the fidelity and independence claims, yet the manuscript provides no description of atomic-claim extraction criteria, support-judgment rules, inter-rater reliability, or human-validation subsample. Without these details the reported percentage cannot be verified and may embed systematic bias.

    Authors: We agree that the current version of the manuscript does not include a sufficiently detailed description of the atomic-claim extraction process, the rules used to judge support or omission, inter-rater reliability statistics, or the human-validation subsample. This information is necessary for reproducibility and to allow readers to evaluate potential bias. In the revised manuscript we will add a dedicated subsection to the Methods section that specifies: (1) the annotation guidelines and decision rules for decomposing AIO responses into atomic claims, (2) the criteria for classifying a claim as supported (direct quotation, close paraphrase, or logical entailment from the cited page), (3) inter-rater agreement results (including Cohen's kappa) obtained from a pilot study on a random subsample of claims, and (4) the size, selection procedure, and validation protocol for the human-reviewed subsample. These additions will be placed immediately before the presentation of the 11.0% unsupported rate so that the fidelity and independence findings can be properly assessed. No changes to the reported percentages themselves are required. revision: yes
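The inter-rater statistic the rebuttal commits to (Cohen's kappa) has a standard closed form: observed agreement corrected for the chance agreement implied by each rater's marginal label frequencies. A self-contained sketch on toy labels (the actual pilot annotations are not available here):

```python
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa between two raters labeling the same items."""
    assert len(rater_a) == len(rater_b) and rater_a
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: sum over labels of the product of each
    # rater's marginal frequency for that label.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    expected = sum(freq_a[lbl] * freq_b[lbl] for lbl in freq_a) / (n * n)
    return (observed - expected) / (1 - expected)

a = ["supported", "supported", "omission", "supported", "omission"]
b = ["supported", "omission",  "omission", "supported", "omission"]
kappa = cohens_kappa(a, b)  # observed 0.8, expected 0.48, kappa ~= 0.615
```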

Circularity Check

0 steps flagged

No circularity: purely observational measurement study

Full rationale

The paper conducts a longitudinal measurement by issuing 55,393 queries, recording AIO activation rates, identifying cited domains, decomposing responses into 98,020 atomic claims, and counting unsupported claims plus advertising presence. No equations, fitted parameters, self-referential derivations, or load-bearing self-citations appear in the reported findings. All percentages (13.7% activation, 11.0% unsupported, etc.) are direct empirical tallies from the collected data, not reduced to prior quantities by construction. The study is self-contained against external benchmarks and contains no derivation chain that collapses to its inputs.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The central claims rest on the assumption that trending queries capture representative search behavior and that manual or automated claim decomposition is unbiased; no free parameters or invented entities are introduced.

axioms (2)
  • domain assumption: Trending queries obtained from Google Trends are representative of typical user search behavior across the 19 categories.
    Used to select the 55,393 queries issued in the study.
  • domain assumption: Atomic claims can be reliably extracted from AIO text and checked against cited pages without systematic omission bias.
    Required for the 11.0% unsupported-claim statistic.

pith-pipeline@v0.9.0 · 5580 in / 1381 out tokens · 38775 ms · 2026-05-15T02:09:49.978034+00:00 · methodology


Reference graph

Works this paper leans on

77 extracted references · 77 canonical work pages · 3 internal anchors

  1. [1]

AdExchanger. 2026. The AI Search Reckoning Is Dismantling Open Web Traffic. https://www.adexchanger.com/publishers/the-ai-search-reckoning-is-dismantling-open-web-traffic-and-publishers-may-never-recover/

  2. [2]

Saharsh Agarwal and Ananya Sen. 2026. Google AI Overviews and Publisher Traffic: Evidence from a Field Experiment. SSRN Working Paper No. 6513059. https://doi.org/10.2139/ssrn.6513059

  3. [3]

Pranjal Aggarwal, Vishvak Murahari, Tanmay Rajpurohit, Ashwin Kalyan, Karthik Narasimhan, and Ameet Deshpande. 2024. GEO: Generative Engine Optimization. In Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (Barcelona, Spain) (KDD '24). Association for Computing Machinery, New York, NY, USA, 5–16. https://doi.org/10.1145...

  4. [4]

Alphabet Inc. 2026. Form 10-K: Annual Report for the Fiscal Year Ended December 31, 2025. U.S. Securities and Exchange Commission Filing. https://www.sec.gov/Archives/edgar/data/0001652044/000165204426000018/goog-20251231.htm Accessed: 2026-05-06

  5. [5]

Benjamin Andow, Samin Yaseer Mahmud, Justin Whitaker, William Enck, Bradley Reaves, Kapil Singh, and Serge Egelman. 2020. Actions Speak Louder than Words: Entity-Sensitive Privacy Policy and Data Flow Analysis with PoliCheck. In 29th USENIX Security Symposium (USENIX Security 20). USENIX Association, 985–1002. https://www.usenix.org/conference/usenixsecur...

  6. [6]

Sinan Aral, Haiwen Li, and Rui Zuo. 2026. The Rise of AI Search: Implications for Information Markets and Human Judgement at Scale. arXiv preprint arXiv:2602.13415 (2026)

  7. [7]

Yuri Baburov. 2025. readability-lxml: Fast HTML to Text Parser (Article Readability Tool). https://github.com/buriy/python-readability. Accessed: 2026-04-25

  8. [8]

    Tom B Brown, Benjamin Mann, Nick Ryder, Melanie Subbiah, Jared Kaplan, Prafulla Dhariwal, Arvind Neelakantan, Pranav Shyam, Girish Sastry, Amanda Askell, Sandhini Agarwal, Ariel Herbert-Voss, Gretchen Krueger, Tom Henighan, Rewon Child, Aditya Ramesh, Daniel M Ziegler, Jeffrey Wu, Clemens Winter, Christopher Hesse, Mark Chen, Eric Sigler, Mateusz Litwin, ...

  9. [9]

Competition and Markets Authority. 2026. CMA Proposes Package of Measures to Improve Google Search Services in the UK. https://www.gov.uk/government/news/cma-proposes-package-of-measures-to-improve-google-search-services-in-uk

  10. [10]

    Hao Cui, Rahmadi Trimananda, Athina Markopoulou, and Scott Jordan. 2023. PoliGraph: Automated Privacy Policy Analysis using Knowledge Graphs. In 32nd USENIX Security Symposium (USENIX Security 23). USENIX Association, Anaheim, CA, 1037–1054. https://www.usenix.org/conference/usenixsecurity23/presentation/cui

  12. [12]

Digital Content Next. 2025. Facts: Google's Push to AI Hurts Publisher Traffic. https://digitalcontentnext.org/blog/2025/08/14/facts-googles-push-to-ai-hurts-publisher-traffic/

  13. [13]

    EasyList Authors. [n. d.]. EasyList. https://easylist.to/. Accessed: 2026-03-24

  14. [14]

Robert Epstein and Ronald E Robertson. 2015. The search engine manipulation effect (SEME) and its possible impact on the outcomes of elections. Proceedings of the National Academy of Sciences 112, 33 (2015), E4512–E4521

  15. [15]

European Commission. 2025. Commission Opens Investigation into Possible Anticompetitive Conduct by Google in the Use of Online Content for AI Purposes. https://ec.europa.eu/commission/presscorner/detail/da/ip_25_2964

  16. [16]

Tianyu Gao, Adam Fisch, and Danqi Chen. 2021. Making Pre-trained Language Models Better Few-shot Learners. In Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers). 3816–3830

  17. [17]

Chang Ge, Justine Zhang, Haofei Xu, Yanna Krupnikov, Jenna Bednar, and Sabina Tomkins. 2025. What does the public want their local government to hear? A data-driven case study of public comments across the state of Michigan. Journal of Quantitative Description: Digital Media 5 (2025)

  18. [18]

    Google. 2025. AI Overviews and AI Mode in Search. Technical Report. Google. https://search.google/pdf/google-about-AI-overviews-AI-Mode.pdf

  19. [19]

Google. 2025. Puppeteer: Headless Chrome Node.js API. https://github.com/puppeteer/puppeteer. Accessed: 2026-04-25

  20. [20]

Google. 2026. Puppeteer API: page.goto waitUntil Options. https://pptr.dev/api/puppeteer.puppeteerlifecycleevent. Accessed: 2026-04-25

  21. [21]

    Hamza Harkous, Kassem Fawaz, Rémi Lebret, Florian Schaub, Kang G. Shin, and Karl Aberer. 2018. Polisis: Automated Analysis and Presentation of Privacy Policies Using Deep Learning. arXiv:1802.02561 [cs.CL]

  22. [22]

Desheng Hu, Joachim Baumann, Aleksandra Urman, Elsa Lichtenegger, Robin Forsberg, Aniko Hannak, and Christo Wilson. 2025. Auditing Google's AI Overviews and Featured Snippets: A Case Study on Baby Care and Pregnancy. arXiv preprint arXiv:2511.12920 (2025)

  23. [23]

Lei Huang, Weijiang Yu, Weitao Ma, Weihong Zhong, Zhangyin Feng, Haotian Wang, Qianglong Chen, Weihua Peng, Xiaocheng Feng, Bing Qin, and Ting Liu. 2025. A Survey on Hallucination in Large Language Models: Principles, Taxonomy, Challenges, and Open Questions. ACM Trans. Inf. Syst. 43, 2, Article 42 (Jan. 2025), 55 pages. https://doi.org/10.1145/3703155

  24. [24]

Umar Iqbal, Peter Snyder, Shitong Zhu, Benjamin Livshits, Zhiyun Qian, and Zubair Shafiq. 2020. AdGraph: A Graph-Based Approach to Ad and Tracker Blocking. In Proceedings of the 2020 IEEE Symposium on Security and Privacy (S&P). IEEE, 763–776. https://doi.org/10.1109/SP40000.2020.00005

  25. [25]

Klaudia Jaźwińska and Aisvarya Chandrasekar. 2024. How ChatGPT Search (Mis)represents Publisher Content. https://www.cjr.org/tow_center/how-chatgpt-misrepresents-publisher-content.php

  26. [26]

    Adam Tauman Kalai, Ofir Nachum, Santosh S. Vempala, and Edwin Zhang. 2025. Why Language Models Hallucinate. arXiv:2509.04664 [cs.CL] https://arxiv.org/abs/2509.04664

  27. [27]

Mehrzad Khosravi and Hema Yoganarasimhan. 2026. Impact of AI Search Summaries on Website Traffic: Evidence from Google AI Overviews and Wikipedia. arXiv:2602.18455 [cs.CY] https://arxiv.org/abs/2602.18455

  28. [28]

    Woosuk Kwon, Zhuohan Li, Siyuan Zhuang, Ying Sheng, Lianmin Zheng, Cody Hao Yu, Joseph Gonzalez, Hao Zhang, and Ion Stoica. 2023. Efficient Memory Management for Large Language Model Serving with PagedAttention. In Proceedings of the 29th Symposium on Operating Systems Principles. 611–626

  30. [30]

Hause Lin, Jana Lasser, Stephan Lewandowsky, Rocky Cole, Andrew Gully, David G Rand, and Gordon Pennycook. 2023. High level of correspondence across different news domain quality rating sets. PNAS Nexus 2, 9 (2023), pgad286

  31. [31]

Nelson Liu, Tianyi Zhang, and Percy Liang. 2023. Evaluating Verifiability in Generative Search Engines. In Findings of the Association for Computational Linguistics: EMNLP 2023, Houda Bouamor, Juan Pino, and Kalika Bali (Eds.). Association for Computational Linguistics, Singapore, 7001–7025. https://doi.org/10.18653/v1/2023.findings-emnlp.467

  32. [32]

Varun Magesh, Faiz Surani, Matthew Dahl, Mirac Suzgun, Christopher D Manning, and Daniel E Ho. 2025. Hallucination-free? Assessing the reliability of leading AI legal research tools. Journal of Empirical Legal Studies 22, 2 (2025), 216–242

  33. [33]

Dasha Metropolitansky and Jonathan Larson. 2025. Towards effective extraction and evaluation of factual claims. In Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). 6996–7045

  34. [34]

    Sewon Min, Kalpesh Krishna, Xinxi Lyu, Mike Lewis, Wen-tau Yih, Pang Wei Koh, Mohit Iyyer, Luke Zettlemoyer, and Hannaneh Hajishirzi. 2023. FActScore: Fine-grained Atomic Evaluation of Factual Precision in Long Form Text Generation. In Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, Houda Bouamor, Juan Pino, and Kalika Bali (Eds.). Association for Computational Linguistics, Singapore, 12076–12100. https://doi.org/10.18653/v1/2023.emnlp-main.741

  35. [35]

    Harnessing the power of large language models for empathetic response generation: Empirical investigations and improvements

  36. [36]

NPR. 2025. Online News Publishers Face 'Extinction-Level Event' from Google's AI-Powered Search. https://www.npr.org/2025/07/31/nx-s1-5484118/google-ai-overview-online-publishers

  37. [37]

Department of Justice. 2025. Department of Justice Prevails in Landmark Antitrust Case Against Google. https://www.justice.gov/opa/pr/department-justice-prevails-landmark-antitrust-case-against-google

  38. [38]

Pew Research Center. 2025. Google Users Are Less Likely to Click Links When AI Summaries Appear. https://www.pewresearch.org/short-reads/2025/07/22/google-users-are-less-likely-to-click-on-links-when-an-ai-summary-appears-in-the-results/

  39. [39]

Sundar Pichai. 2025. Q2 earnings call: CEO's remarks. https://blog.google/company-news/inside-google/message-ceo/alphabet-earnings-q2-2025/

  40. [40]

Leonard Richardson. 2025. Beautiful Soup. https://www.crummy.com/software/BeautifulSoup/. Accessed: 2026-04-25

  41. [41]

    Ronald E. Robertson, Shan Jiang, Kenneth Joseph, Lisa Friedland, David Lazer, and Christo Wilson. 2018. Auditing Partisan Audience Bias within Google Search. Proceedings of the ACM on Human-Computer Interaction 2, CSCW, Article 148 (Nov. 2018), 22 pages. https://doi.org/10.1145/3274417

  42. [42]

    The Poppler Developers. 2026. Poppler: A PDF Rendering Library. https://poppler.freedesktop.org/. Accessed: 2026-04-25

  43. [43]

    Yongqi Tong, Dawei Li, Sizhe Wang, Yujia Wang, Fei Teng, and Jingbo Shang. 2024. Can LLMs Learn from Previous Mistakes? Investigating LLMs' Errors to Boost for Reasoning. In Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), Lun-Wei Ku, Andre Martins, and Vivek Srikumar (Eds.). Association for Computational Linguistics. https://doi.org/10.18653/v1/2024.acl-long.169

  45. [45]

    Rahmadi Trimananda, Hieu Le, Hao Cui, Janice Tran Ho, Anastasia Shuba, and Athina Markopoulou. 2022. OVRseen: Auditing Network Traffic and Privacy Policies in Oculus VR. In 31st USENIX Security Symposium (USENIX Security 22). 3789–3806

  46. [46]

    Pranav Narayanan Venkit, Philippe Laban, Yilun Zhou, Yixin Mao, and Chien-Sheng Wu. 2024. Search Engines in an AI Era: The False Promise of Factual and Verifiable Source-Cited Responses. arXiv:2410.22349 [cs.IR] https://arxiv.org/abs/2410.22349

  47. [47]

Soroush Vosoughi, Deb Roy, and Sinan Aral. 2018. The spread of true and false news online. Science 359, 6380 (2018), 1146–1151

  48. [48]

Jason Wei, Xuezhi Wang, Dale Schuurmans, Maarten Bosma, Fei Xia, Ed Chi, Quoc V Le, Denny Zhou, et al. 2022. Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35 (2022), 24824–24837

  49. [49]

    Kevin Wu, Eric Wu, Kevin Wei, Angela Zhang, Allison Casasola, Teresa Nguyen, Sith Riantawan, Patricia Shi, Daniel Ho, and James Zou. 2025. An automated framework for assessing how well LLMs cite relevant medical references. Nature Communications 16, 1 (Apr 2025). https://doi.org/10.1038/s41467-025-58551-6

  51. [51]

Yuhao Wu, Evin Jaff, Ke Yang, Ning Zhang, and Umar Iqbal. 2025. An In-Depth Investigation of Data Collection in LLM App Ecosystems. In Proceedings of the 2025 ACM Internet Measurement Conference (USA) (IMC '25). Association for Computing Machinery, New York, NY, USA, 150–170. https://doi.org/10.1145/3730567.3732912

  52. [52]

xAI. 2025. Grok 4.1 Fast and Agent Tools API. https://x.ai/news/grok-4-1-fast. Accessed: 2026-04-25

  53. [53]

    Yiwei Xu, Saloni Dash, Sungha Kang, Wang Liao, and Emma S. Spiro. 2025. AI summaries in online search influence users' attitudes. arXiv:2511.22809 [cs.HC] https://arxiv.org/abs/2511.22809

  55. [55]

Yumo Xu, Peng Qi, Jifan Chen, Kunlun Liu, Rujun Han, Lan Liu, Bonan Min, Vittorio Castelli, Arshit Gupta, and Zhiguo Wang. 2025. CiteEval: Principle-Driven Citation Evaluation for Source Attribution. In Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), Wanxiang Che, Joyce Nabende, Ekaterina...

  56. [56]

Christina Yeung, Umar Iqbal, Yekaterina Tsipenyuk O'Neil, Tadayoshi Kohno, and Franziska Roesner. 2023. Online Advertising in Ukraine and Russia During the 2022 Russian Invasion. In Proceedings of the ACM Web Conference 2023 (WWW). ACM, Austin, TX, USA. https://doi.org/10.1145/3543507.3583484

  57. [57]

Eric Zeng, Tadayoshi Kohno, and Franziska Roesner. 2020. Bad news: Clickbait and deceptive ads on news and misinformation websites. In Workshop on Technology and Consumer Protection. IEEE Computer Society, 1–11

  Entries 58–77 (fragments of the paper's atomic-claim extraction and verification guidelines, not citations):

    Each extracted claim must be:
    • Verifiable: it can in principle be checked true or false against evidence
    • Specific: it states a concrete fact, event, attribute, relationship, quantity, date, ranking, or action
    • Decontextualized: it is fully understandable on its own, and its meaning in isolation matches its meaning in the AI Overview
    • Entailed: if the AI Overview is true, the claim must also be true

    Extraction rules:
    • Extract only claims that are explicitly supported by the AI Overview text. Do not use outside knowledge.
    • Do not invent or normalize missing details. If the text is vague, keep the claim equally vague or omit it.
    • If a statement is generic, normative, speculative, promotional, advisory, subjective, or otherwise not specifically verifiable, do not extract it.
    • If a sentence contains both generic language and one buried specific fact, extract only the specific fact.
    • If the text says that a person, organization, government body, report, court, source, or expert said, reported, announced, recommended, warned, found, highlighted, or did something, preserve that attribution when it is part of the meaning.
    • Resolve references when the text clearly supports it: replace pronouns or shorthand with the fully specified referent when recoverable from nearby context; expand partial names only when the full name is present in the AI Overview; otherwise leave them unresolved only if the claim is still understandable and faithful.
    • If a statement has multiple plausible interpretations and the AI Overview does not clearly resolve the ambiguity, do not extract a claim from that ambiguous part.
    • Split multi-fact sentences into the simplest discrete factual claims that remain natural and useful for fact-checking.
    • Do not extract duplicate claims or near-duplicates.
    • Do not include citations, source names, bullet labels, headings, or formatting artifacts unless they are themselves part of a factual claim. What to omit: opinions, praise, hype, or value judgments; advice, instructions, or recommendations to the reader; vague trend language without a checkable proposition; rhetorical summaries; section head...

    Verification rules:
    • You MUST output exactly one entry per claim above.
    • matched_references should list ALL reference IDs (R1, R2, etc.) that are relevant to the claim. Use an empty list [] for OMITTED.
    • evidence should quote or closely paraphrase the specific text from references. Use "No relevant content found" for OMITTED.
    • confidence reflects how clearly the references support your judgment (1.0 = unambiguous match/contradiction, 0.5 = borderline).
    • CRITICAL: when a claim contains numbers, dates, or attributes paired with specific entities, verify that each value is assigned to the CORRECT entity. If the claim assigns value A to entity X and value B to entity Y, but the reference assigns value A to entity Y and value B to entity X, that is INCORRECT: the values are swapped. Do not label a claim CLEAR just because the ...
    • Only answer with the specified JSON array, no other text.
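Reassembled from the recovered prompt fragments, one verification entry per claim would plausibly carry the fields named there (matched_references, evidence, confidence) plus a verdict such as CLEAR, OMITTED, or INCORRECT. The field name `label` and all values below are illustrative, not quoted from the paper:

```python
# Hypothetical verification entries in the shape the recovered prompt
# describes; the final output is one JSON array with one object per claim.
supported_entry = {
    "claim": "AIO activation is 13.7% across the sampled queries.",
    "label": "CLEAR",                    # vs. OMITTED / INCORRECT
    "matched_references": ["R1", "R3"],  # all relevant reference IDs
    "evidence": "Overall AIO activation is 13.7%.",
    "confidence": 1.0,                   # 1.0 unambiguous, 0.5 borderline
}

omitted_entry = {
    "claim": "A claim with no support anywhere in the cited pages.",
    "label": "OMITTED",
    "matched_references": [],                 # empty list for OMITTED
    "evidence": "No relevant content found",  # fixed string for OMITTED
    "confidence": 1.0,
}

entries = [supported_entry, omitted_entry]
```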