AI use in American newspapers is widespread, uneven, and rarely disclosed
6 Pith papers cite this work.
abstract
AI is rapidly transforming journalism, but the extent of its use in published newspaper articles remains unclear. We address this gap by auditing a large-scale dataset of 186K articles from online editions of 1.5K American newspapers published in the summer of 2025. Using Pangram, a state-of-the-art AI detector, we discover that approximately 9% of newly published articles are either partially or fully AI-generated. This AI use is unevenly distributed, appearing more frequently in smaller, local outlets, in specific topics such as weather and technology, and within certain ownership groups. We also analyze 45K opinion pieces from the Washington Post, the New York Times, and the Wall Street Journal, finding that they are 6.4 times more likely to contain AI-generated content than news articles from the same publications, with many AI-flagged op-eds authored by prominent public figures. Despite this prevalence, we find that AI use is rarely disclosed: a manual audit of 100 AI-flagged articles found only five disclosures of AI use. Overall, our audit highlights the immediate need for greater transparency and updated editorial standards regarding the use of AI in journalism to maintain public trust.
citing papers explorer
- Evaluating Commercial AI Chatbots as News Intermediaries
  Commercial AI chatbots reach over 90% multiple-choice accuracy on recent news facts but lose 11-17% in free response and drop to 19-70% on subtle false-premise questions, with retrieval failures causing most errors and clear Anglophone bias.
- The Impact of AI-Generated Text on the Internet
  By mid-2025 roughly 35% of new websites are AI-generated or AI-assisted, correlating with lower semantic diversity and higher positive sentiment but showing no significant drop in factual accuracy or stylistic diversity.
- Frankentext: Stitching random text fragments into long-form narratives
  Frankentexts force LLMs to compose coherent long-form stories by copying 90% of tokens verbatim from random human snippets, yielding better quality and originality than vanilla generation while evading detectors.
- Hitting a Moving Target: Test-Time Adaptation for AI Text Detection under Continual Distribution Shift
  Test-time adaptation with semi-supervised learning leverages inference-time homogeneity to maintain AI text detection performance under adversarial humanization, new LLMs, and temporal drift.
- Synthetic Sources?: Auditing Generative Search Engine Citations for Evidence of AI-Generated Sources
  Audit of ChatGPT, Copilot, Gemini and Perplexity finds ~16% of cited sources are AI-generated across 712 queries on politics, health and environment.
- Generative artificial intelligence reduces social welfare through model collapse
  A game-theoretic model shows that individually rational adoption of generative AI causes model collapse that reduces collective social welfare for important tasks, with habit formation creating spillovers from low-stakes to high-value domains.