RAID: A shared benchmark for robust evaluation of machine-generated text detectors

Liam Dugan, Alyssa Hwang, Filip Trhlík, Andrew Zhu, Josh Magnus Ludan, Hainiu Xu, Daphne Ippolito, Chris Callison-Burch · 2024

1 Pith paper cites this work. Polarity classification is still indexing.

representative citing papers

Why AI-Generated Text Detection Fails: Evidence from Explainable AI Beyond Benchmark Accuracy

cs.CL · 2026-03-24 · conditional · novelty 6.0

AI-generated text detectors achieve high benchmark accuracy by exploiting unstable dataset-specific linguistic features, as evidenced by cross-domain degradation and differing SHAP explanations across corpora.

citing papers explorer

Showing 1 of 1 citing paper.

Why AI-Generated Text Detection Fails: Evidence from Explainable AI Beyond Benchmark Accuracy · cs.CL · 2026-03-24 · conditional · none · ref 16
