3 Pith papers cite this work.
Citing papers
- Can Humans Tell? A Dual-Axis Study of Human Perception of LLM-Generated News
  Humans cannot reliably distinguish LLM-generated news from human-written news across multiple models, with domain expertise providing only modest help and fatigue reducing accuracy over time.
- Quantifying correlations between information overload and fake news during COVID-19 pandemic: a Reddit study with BERT model approach
  Gini index of BERTopic topic distributions on COVID-19 Reddit data shows significant global correlation with fake news fraction but ambiguous results at community level.
- Benchmark Data Contamination of Large Language Models: A Survey
  A survey reviewing benchmark data contamination in LLMs, its impact on evaluation, and alternative assessment approaches.