
arxiv: 2510.18774 · v4 · submitted 2025-10-21 · 💻 cs.CL

AI use in American newspapers is widespread, uneven, and rarely disclosed

Pith reviewed 2026-05-18 04:52 UTC · model grok-4.3

classification 💻 cs.CL
keywords AI detection · journalism · newspapers · content generation · disclosure · opinion pieces · media transparency · local news

The pith

Roughly 9% of newly published American newspaper articles are partially or fully AI-generated.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper audits 186,000 articles from 1,500 U.S. newspapers published in summer 2025 using an AI text detector. It reports that AI generation appears in about 9% of articles overall, showing up more often at smaller local outlets, in weather and technology topics, and inside certain ownership groups. Opinion pieces at three major national papers contain AI content at 6.4 times the rate of news articles from the same outlets. A separate check of 100 AI-flagged pieces found disclosures in only five cases.

Core claim

Using the Pangram AI detector on a dataset of 186K articles published by 1.5K American newspapers in summer 2025, we find that approximately 9% of newly-published articles are either partially or fully AI-generated. This use is uneven, appearing more in smaller outlets, weather and technology topics, and certain ownership groups. Opinion pieces from the Washington Post, New York Times, and Wall Street Journal are 6.4 times more likely to contain AI-generated content than news articles, and a manual audit reveals disclosures in only five of 100 AI-flagged articles.

What carries the argument

Large-scale application of the Pangram AI text detector across 186K newspaper articles, followed by breakdown by outlet size, topic, ownership, and a manual audit of disclosures in flagged items.

If this is right

  • AI generation concentrates more in smaller local newspapers than in large national ones.
  • Opinion sections at major papers contain AI content at far higher rates than straight news reporting.
  • Disclosures of AI assistance appear in only a small minority of cases where it is used.
  • Current editorial standards leave most AI-assisted articles without any reader notice.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • If the detector holds up, local newsrooms may be integrating generative tools faster than larger organizations.
  • AI use in opinion writing by public figures could shift how readers assess the origin of commentary.
  • Widespread undetected AI content might gradually change audience expectations for what counts as original reporting.

Load-bearing premise

The Pangram detector accurately identifies AI-generated text in newspaper articles rather than flagging normal journalistic phrasing, topic-specific language, or outlet style as AI output.
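To make the stakes of this premise concrete, the sketch below applies the standard Rogan-Gladen correction to the reported 9% flag rate. The false-positive rates and the 95% sensitivity are hypothetical values chosen for illustration; neither the paper nor this review reports them for newspaper prose.

```python
def corrected_prevalence(observed_rate, false_positive_rate, sensitivity):
    """Rogan-Gladen estimate of true prevalence from a detector's flag rate.

    observed = sensitivity * true + false_positive_rate * (1 - true),
    solved for `true` and clipped to the range [0, 1].
    """
    denom = sensitivity - false_positive_rate
    if denom <= 0:
        raise ValueError("sensitivity must exceed the false-positive rate")
    estimate = (observed_rate - false_positive_rate) / denom
    return min(1.0, max(0.0, estimate))


OBSERVED = 0.09  # share of articles flagged, as reported in the paper

# Hypothetical detector error rates, for illustration only.
for fpr in (0.0, 0.01, 0.03, 0.05):
    est = corrected_prevalence(OBSERVED, fpr, sensitivity=0.95)
    print(f"assumed FPR {fpr:.0%} -> corrected prevalence {est:.1%}")
```

Under these assumed inputs, a 5% false-positive rate would cut the estimate from 9% to roughly 4.4%, which is why domain-specific error rates matter for the headline figure.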

What would settle it

Independent human experts manually labeling a random sample of articles flagged as AI-generated by Pangram, plus a matched set of unflagged articles from the same newspapers, to measure the detector's actual accuracy rate on this material.
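A minimal sketch of how such a labeled sample could be summarized follows. The label counts are invented for illustration and are not results from the paper. The function name is hypothetical, and the prevalence estimate assumes each sample is drawn at random within its stratum (flagged or unflagged).

```python
def validation_summary(flagged_labels, unflagged_labels, flag_rate):
    """Summarize expert labels for detector-flagged and unflagged articles.

    Each label is True when the human experts judged the article AI-generated.
    `flag_rate` is the share of the full corpus that the detector flagged.
    """
    precision = sum(flagged_labels) / len(flagged_labels)
    miss_rate = sum(unflagged_labels) / len(unflagged_labels)
    # Weight each stratum by its share of the corpus.
    prevalence = flag_rate * precision + (1 - flag_rate) * miss_rate
    return {"precision": precision, "miss_rate": miss_rate, "prevalence": prevalence}


# Hypothetical expert labels: 100 flagged and 100 unflagged articles.
flagged = [True] * 90 + [False] * 10
unflagged = [True] * 2 + [False] * 98

summary = validation_summary(flagged, unflagged, flag_rate=0.09)
for name, value in summary.items():
    print(f"{name}: {value:.1%}")
```

With these made-up labels the implied prevalence is about 9.9%, showing that false positives among flagged articles and misses among unflagged ones can partly offset each other.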

Figures

Figures reproduced from arXiv: 2510.18774 by Bradley Emi, Destiny Akinode, Jenna Russell, Katherine Thai, Marzena Karpinska, Max Spero, Mohit Iyyer.

Figure 1: Map of the United States showing the AI use (%), number of articles, and number of unique newspapers
Figure 2: AI use by topic in recent_news. Weather, science/technology, and health exhibit higher AI use than topics like conflict, crime/justice, and religion.
Figure 3: Heatmap of AI use by publication owner and article topic in …
Figure 4: AI use in recent_news is more frequent in languages other than English. The most prominent such languages include Spanish, Portuguese, Vietnamese, French, and Polish.
Figure 6: Monthly proportion of opinions articles detected as either AI-GENERATED or MIXED from September 2022 to September 2025. AI use in opinion pieces at reputable publications has dramatically increased over time.
Figure 5: An example of an article found in print which …
Figure 7: Changes in AI use in opinions articles year over year (2025 only includes January 1 - September 15). AI likelihood increased across all topics, with especially large gains in opinion articles about Science & Technology, consistent with topic trends in recent_news.
Figure 9: Prompt for classifying topic of articles
Figure 10: Distribution of circulations of articles in the …
Figure 11: Comparison of the lengths of articles in …
Figure 12: Screenshot of Pangram’s API: "Arkansas Department of Agriculture to host NASDA Annual Meeting" by Stuttgart Daily Leader
Figure 13: Screenshot of Pangram’s API: "SEC’s at-large bid preference lacks support, Big Ten commissioner says" by The Associated Press
Figure 14: Distribution of AI Use predictions in recent_news
Figure 16: Distribution of AI Categories in recent_news articles between papers with circulations below 100k (left) and above 100k (right).
Figure 15: Distribution of AI Likelihoods per each AI …
Figure 17: AI use in articles written in languages other than English. The most common languages are Spanish, …
Figure 21: Distribution of AI use predictions in ai_reporters by year.
Figure 19: Distribution of AI Use predictions in opinions
Figure 20: Distribution of AI Likelihoods per each AI …
Figure 22: Distribution of AI Use predictions in ai_reporters.
Figure 23: Distribution of AI Likelihoods per each AI …
Figure 24: AI content patterns in 2025 by reporters (part 1). See Table …
Figure 25: AI content patterns in 2025 by reporters (part 2). See Table …
Original abstract

AI is rapidly transforming journalism, but the extent of its use in published newspaper articles remains unclear. We address this gap by auditing a large-scale dataset of 186K articles from online editions of 1.5K American newspapers published in the summer of 2025. Using Pangram, a state-of-the-art AI detector, we discover that approximately 9% of newly-published articles are either partially or fully AI-generated. This AI use is unevenly distributed, appearing more frequently in smaller, local outlets, in specific topics such as weather and technology, and within certain ownership groups. We also analyze 45K opinion pieces from Washington Post, New York Times, and Wall Street Journal, finding that they are 6.4 times more likely to contain AI-generated content than news articles from the same publications, with many AI-flagged op-eds authored by prominent public figures. Despite this prevalence, we find that AI use is rarely disclosed: a manual audit of 100 AI-flagged articles found only five disclosures of AI use. Overall, our audit highlights the immediate need for greater transparency and updated editorial standards regarding the use of AI in journalism to maintain public trust.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The manuscript audits AI use in journalism via the Pangram detector applied to 186K articles from 1.5K U.S. newspapers (summer 2025), reporting that ~9% are partially or fully AI-generated. Rates are higher in smaller/local outlets, weather/technology topics, and certain ownership groups; opinion pieces from WaPo/NYT/WSJ are 6.4x more likely to be flagged than news articles from the same outlets; and a manual audit of 100 flagged articles finds only five disclosures.

Significance. If the detector classifications are reliable, the work supplies large-scale empirical data on AI adoption patterns in American newspapers and documents low disclosure rates, which bears directly on editorial standards and public trust in journalism. The scale of the corpus and the explicit focus on uneven distribution plus disclosure provide concrete, falsifiable observations that can ground further research in computational social science.

major comments (2)
  1. [Methods] Methods section: the headline 9% prevalence (and all downstream claims about uneven distribution by outlet size, topic, and ownership) rests on Pangram outputs without any reported calibration, false-positive baseline, or human-labeled validation set on newspaper prose. No pre-2023 human-written control corpus or error-rate measurement on this domain is described, leaving open the possibility that formulaic journalistic style (short declaratives, topic-specific phrasing) produces systematic false positives.
  2. [Results] Results (disclosure audit): the claim that AI use is 'rarely disclosed' is based on a post-selection manual review of only 100 AI-flagged articles. This sample is small, lacks reported sampling details, inter-annotator agreement, or confidence intervals, and therefore provides only weak support for the low-disclosure conclusion.
minor comments (2)
  1. [Abstract] Abstract and §3: specify the exact date range for 'summer of 2025' and the precise threshold or probability cutoff used to label an article as 'partially or fully AI-generated' from Pangram's output.
  2. [Results] Figure captions and §4: clarify whether the 6.4x multiplier for opinion pieces is a raw ratio or adjusted for article length or other covariates.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their constructive comments, which have prompted us to clarify and strengthen key aspects of the manuscript. We respond to each major comment below.

Point-by-point responses
  1. Referee: [Methods] Methods section: the headline 9% prevalence (and all downstream claims about uneven distribution by outlet size, topic, and ownership) rests on Pangram outputs without any reported calibration, false-positive baseline, or human-labeled validation set on newspaper prose. No pre-2023 human-written control corpus or error-rate measurement on this domain is described, leaving open the possibility that formulaic journalistic style (short declaratives, topic-specific phrasing) produces systematic false positives.

    Authors: We appreciate the referee's point on domain-specific validation. Pangram was chosen based on its strong reported performance in recent benchmarks, but we acknowledge that no explicit calibration or false-positive analysis on newspaper prose was included. In the revised manuscript, we have added a dedicated limitations subsection in Methods that discusses the risk of false positives from formulaic journalistic writing and reports results from a small post-hoc human validation on a sample of pre-2023 articles. While a comprehensive pre-2023 control corpus was outside the scope of this large-scale audit, the relative differences across outlets, topics, and ownership groups remain informative even under moderate error rates. revision: partial

  2. Referee: [Results] Results (disclosure audit): the claim that AI use is 'rarely disclosed' is based on a post-selection manual review of only 100 AI-flagged articles. This sample is small, lacks reported sampling details, inter-annotator agreement, or confidence intervals, and therefore provides only weak support for the low-disclosure conclusion.

    Authors: We agree that the disclosure audit rests on a modest sample and that additional methodological details would improve transparency. The revised manuscript now specifies that the 100 articles were randomly sampled from the AI-flagged set, describes the annotation protocol in more detail, and reports a binomial confidence interval around the observed 5% disclosure rate. We have also added an explicit statement that the review was performed by a single annotator and note this as a limitation. While these changes do not increase the sample size, they provide a more precise and cautious presentation of the finding. revision: yes
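For reference, a binomial interval for 5 disclosures in 100 articles can be sketched as follows. The rebuttal does not say which interval the authors report, so the choice of the Wilson score interval here is an assumption. The function name is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def wilson_interval(successes, n, confidence=0.95):
    """Wilson score interval for a binomial proportion."""
    if n <= 0:
        raise ValueError("n must be positive")
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # two-sided critical value
    p = successes / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half_width = z * sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half_width, center + half_width


# Counts from the manual audit: 5 disclosures among 100 AI-flagged articles.
low, high = wilson_interval(5, 100)
print(f"disclosure rate 5.0%, 95% Wilson interval {low:.1%} to {high:.1%}")
```

The interval runs from roughly 2% to 11%, so even its upper end is consistent with disclosure being uncommon among flagged articles, though it says nothing about whether the flags themselves are correct.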

Circularity Check

0 steps flagged

No circularity: direct empirical counts on external corpus

Full rationale

This is a pure measurement study that applies an external detector (Pangram) to a fixed corpus of 186K articles and reports observed proportions and correlations. No equations, fitted parameters, or derived predictions are defined in terms of the target statistics; the 9% figure and uneven distributions are direct outputs of the detector applied to the input data. No self-citation chain, uniqueness theorem, or ansatz is invoked to justify the central claim. The derivation is therefore self-contained against external benchmarks and receives the default non-circularity finding.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The measurement depends on the unvalidated domain performance of one commercial detector and the assumption that the collected 186K articles represent typical summer 2025 output.

axioms (1)
  • domain assumption: Pangram detector output corresponds to actual AI generation in newspaper articles
    Pangram is treated as state-of-the-art without reported validation on this genre or these outlet types.

pith-pipeline@v0.9.0 · 5755 in / 1243 out tokens · 52702 ms · 2026-05-18T04:52:05.396674+00:00 · methodology



Forward citations

Cited by 5 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Evaluating Commercial AI Chatbots as News Intermediaries

    cs.CL 2026-05 conditional novelty 7.0

    Commercial AI chatbots reach over 90% multiple-choice accuracy on recent news facts but lose 11-17% in free response and drop to 19-70% on subtle false-premise questions, with retrieval failures causing most errors an...

  2. The Impact of AI-Generated Text on the Internet

    cs.CY 2026-04 unverdicted novelty 7.0

    By mid-2025 roughly 35% of new websites are AI-generated or AI-assisted, correlating with lower semantic diversity and higher positive sentiment but showing no significant drop in factual accuracy or stylistic diversity.

  3. Frankentext: Stitching random text fragments into long-form narratives

    cs.CL 2025-05 conditional novelty 7.0

    Frankentexts force LLMs to compose coherent long-form stories by copying 90% of tokens verbatim from random human snippets, yielding better quality and originality than vanilla generation while evading detectors.

  4. Synthetic Sources?: Auditing Generative Search Engine Citations for Evidence of AI-Generated Sources

    cs.IR 2026-05 unverdicted novelty 6.0

    Audit of ChatGPT, Copilot, Gemini and Perplexity finds ~16% of cited sources are AI-generated across 712 queries on politics, health and environment.

  5. Generative artificial intelligence reduces social welfare through model collapse

    physics.soc-ph 2026-04 unverdicted novelty 6.0

    A game-theoretic model shows that individually rational adoption of generative AI causes model collapse that reduces collective social welfare for important tasks, with habit formation creating spillovers from low-sta...
