Combating Misinformation in the Age of LLMs: Opportunities and Challenges, November 2023
6 Pith papers cite this work.
Citation roles: background (4)
citing papers explorer
- Can Humans Tell? A Dual-Axis Study of Human Perception of LLM-Generated News
  Humans cannot reliably distinguish LLM-generated news from human-written news across multiple models, with domain expertise providing only modest help and fatigue reducing accuracy over time.
- ReFACT: A Benchmark for Scientific Confabulation Detection with Positional Error Annotations
  ReFACT benchmark reveals LLMs show a persistent salient distractor failure mode where 61% of incorrect error span predictions are semantically unrelated to actual errors, persisting across model sizes, and comparative judgment yields lower F1 than independent detection.
- Persuasion with Large Language Models: A Survey of Empirical Evidence, Study Methodologies, and Ethical Implications
  LLM-based persuasion systems frequently match or exceed human effectiveness across domains, with key influences from interaction style, model scale, prompt design, and personalization, while posing risks to information integrity, fairness, privacy, and autonomy.
- TrustLLM: Trustworthiness in Large Language Models
  TrustLLM defines eight trustworthiness principles, creates a six-dimension benchmark, and evaluates 16 LLMs showing proprietary models generally lead but some open-source ones are close while over-calibration can hurt utility.
- A Survey on Hallucination in Large Language Models: Principles, Taxonomy, Challenges, and Open Questions
  The paper surveys hallucination in LLMs with an innovative taxonomy, factors, detection methods, benchmarks, mitigation strategies, and open research directions.
- AI Safety Landscape for Large Language Models: Taxonomy, State-of-the-art, and Future Directions
  The paper introduces a taxonomy of AI safety for LLMs organized into Trustworthy AI, Responsible AI, and Safe AI perspectives, accompanied by a review of state-of-the-art methods, challenges, and future directions.