AI use in American newspapers is widespread, uneven, and rarely disclosed

2025 · cs.CL · arXiv 2510.18774

6 Pith papers cite this work. Polarity classification is still indexing.


abstract

AI is rapidly transforming journalism, but the extent of its use in published newspaper articles remains unclear. We address this gap by auditing a large-scale dataset of 186K articles from online editions of 1.5K American newspapers published in the summer of 2025. Using Pangram, a state-of-the-art AI detector, we discover that approximately 9% of newly-published articles are either partially or fully AI-generated. This AI use is unevenly distributed, appearing more frequently in smaller, local outlets, in specific topics such as weather and technology, and within certain ownership groups. We also analyze 45K opinion pieces from Washington Post, New York Times, and Wall Street Journal, finding that they are 6.4 times more likely to contain AI-generated content than news articles from the same publications, with many AI-flagged op-eds authored by prominent public figures. Despite this prevalence, we find that AI use is rarely disclosed: a manual audit of 100 AI-flagged articles found only five disclosures of AI use. Overall, our audit highlights the immediate need for greater transparency and updated editorial standards regarding the use of AI in journalism to maintain public trust.

representative citing papers

Evaluating Commercial AI Chatbots as News Intermediaries

cs.CL · 2026-05-21 · conditional · novelty 7.0

Commercial AI chatbots reach over 90% multiple-choice accuracy on recent news facts but lose 11-17% in free response and drop to 19-70% on subtle false-premise questions, with retrieval failures causing most errors and clear Anglophone bias.

The Impact of AI-Generated Text on the Internet

cs.CY · 2026-04-14 · unverdicted · novelty 7.0

By mid-2025 roughly 35% of new websites are AI-generated or AI-assisted, correlating with lower semantic diversity and higher positive sentiment but showing no significant drop in factual accuracy or stylistic diversity.

Frankentext: Stitching random text fragments into long-form narratives

cs.CL · 2025-05-23 · conditional · novelty 7.0

Frankentexts force LLMs to compose coherent long-form stories by copying 90% of tokens verbatim from random human snippets, yielding better quality and originality than vanilla generation while evading detectors.

Hitting a Moving Target: Test-Time Adaptation for AI Text Detection under Continual Distribution Shift

cs.CL · 2026-06-23 · unverdicted · novelty 6.0

Test-time adaptation with semi-supervised learning leverages inference-time homogeneity to maintain AI text detection performance under adversarial humanization, new LLMs, and temporal drift.

Synthetic Sources?: Auditing Generative Search Engine Citations for Evidence of AI-Generated Sources

cs.IR · 2026-05-22 · unverdicted · novelty 6.0

Audit of ChatGPT, Copilot, Gemini and Perplexity finds ~16% of cited sources are AI-generated across 712 queries on politics, health and environment.

Generative artificial intelligence reduces social welfare through model collapse

physics.soc-ph · 2026-04-23 · unverdicted · novelty 6.0

A game-theoretic model shows that individually rational adoption of generative AI causes model collapse that reduces collective social welfare for important tasks, with habit formation creating spillovers from low-stakes to high-value domains.

citing papers explorer

Showing 6 of 6 citing papers.

Evaluating Commercial AI Chatbots as News Intermediaries
cs.CL · 2026-05-21 · conditional · none · ref 48 · internal anchor

The Impact of AI-Generated Text on the Internet
cs.CY · 2026-04-14 · unverdicted · none · ref 26 · internal anchor

Frankentext: Stitching random text fragments into long-form narratives
cs.CL · 2025-05-23 · conditional · none · ref 5 · internal anchor

Hitting a Moving Target: Test-Time Adaptation for AI Text Detection under Continual Distribution Shift
cs.CL · 2026-06-23 · unverdicted · none · ref 78 · internal anchor

Synthetic Sources?: Auditing Generative Search Engine Citations for Evidence of AI-Generated Sources
cs.IR · 2026-05-22 · unverdicted · none · ref 40 · internal anchor

Generative artificial intelligence reduces social welfare through model collapse
physics.soc-ph · 2026-04-23 · unverdicted · none · ref 17 · internal anchor
