pith. machine review for the scientific record.

arxiv: 2605.06903 · v1 · submitted 2026-05-07 · 💻 cs.CL · cs.AI

Recognition: no theorem link

MELD: Multi-Task Equilibrated Learning Detector for AI-Generated Text

Authors on Pith: no claims yet

Pith reviewed 2026-05-11 01:15 UTC · model grok-4.3

classification 💻 cs.CL cs.AI
keywords AI-generated text detection · multi-task learning · robustness to attacks · EMA distillation · hard-negative ranking · low false-positive rates · generalization to new models · LLM detectors

The pith

MELD adds auxiliary supervision on generator families, attack types and domains to create a robust AI-generated text detector that performs well at low false-positive rates on unseen models.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper argues that single-task binary detectors for AI-generated text stop learning generator-, attack-, and domain-level structure once accuracy on clean data saturates, leading to poor robustness against attacks and new generators. MELD addresses this by training a shared encoder with additional heads that classify the generator family, the attack type, and the source domain, automatically balancing the four losses, and by adding distillation from a clean teacher to an attack-augmented student plus a ranking loss on hard negatives. These changes let the model generalize without further fine-tuning: on a new evaluation set built from recent models of four major providers, it reaches a 99.9 percent true-positive rate at a 1 percent false-positive rate while many baselines drop sharply, and it leads open-source detectors on the RAID leaderboard while competing with commercial systems under attack.

Core claim

MELD attaches generator-family, attack-type, and source-domain heads to a shared encoder, and balances the four losses with learned homoscedastic uncertainty weights. To improve robustness, an EMA teacher predicts on clean inputs while an attack-augmented student is distilled toward the teacher. MELD further uses a hard-negative pairwise ranking loss to enlarge the score margin between AI-generated texts and the most confusable human texts. At inference, all auxiliary heads are discarded.
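The homoscedastic uncertainty weighting follows the Kendall et al. scheme the paper cites; a minimal sketch of how the four losses combine, assuming the standard exp(-s_t)·L_t + s_t parameterization (the paper's exact form may differ, and the loss values below are hypothetical):

```python
import math

def uncertainty_weighted_loss(task_losses, log_vars):
    """Combine per-task losses L_t with learned log-variances s_t.

    Each task contributes exp(-s_t) * L_t + s_t: a lower learned s_t
    yields a larger multiplier on that task, while the additive s_t
    term penalizes driving any variance toward zero.
    """
    return sum(math.exp(-s) * L + s for L, s in zip(task_losses, log_vars))

# Four tasks: main AI/Human head plus generator-family, attack-type,
# and source-domain heads (hypothetical per-task loss values).
losses = [0.3, 1.2, 0.9, 1.5]
log_vars = [0.0, 0.0, 0.0, 0.0]   # equal initialization, as described in Figure 6
total = uncertainty_weighted_loss(losses, log_vars)
```

With equal initialization the weighted sum equals the plain sum; as training moves an auxiliary head to lower s_t (as reported in Figure 6), its relative multiplier grows.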

What carries the argument

The multi-task setup with auxiliary heads for generator family, attack type and domain, balanced by uncertainty weights, plus EMA distillation and hard-negative ranking loss.
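The hard-negative ranking term can be sketched as a batch-level margin loss against the highest-scoring (most confusable) human text; the margin and the scores below are hypothetical, not taken from the paper:

```python
def hard_negative_ranking_loss(ai_scores, human_scores, margin=1.0):
    """Pairwise margin loss pushing each AI score above the hardest
    (highest-scoring) human negative in the batch."""
    hard_negative = max(human_scores)   # most confusable human text
    return sum(max(0.0, margin - (s - hard_negative))
               for s in ai_scores) / len(ai_scores)

# AI texts score [2.5, 0.8]; human texts score [0.2, 1.0],
# so the hard negative is the human text scoring 1.0.
loss = hard_negative_ranking_loss([2.5, 0.8], [0.2, 1.0])
# Per pair: max(0, 1 - 1.5) = 0 and max(0, 1 - (-0.2)) = 1.2, mean 0.6.
```

Only AI texts whose margin over the hardest human is below the target contribute, which is what enlarges the score gap at the decision boundary.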

Load-bearing premise

The auxiliary supervision on generator family, attack type, and domain, along with the EMA distillation and hard-negative ranking, produces a representation that generalizes to unseen generators and attacks instead of overfitting to the specific training distributions.
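The EMA teacher mechanism reduces to a weight average plus a clean-to-attacked consistency term; a minimal sketch with an illustrative momentum value and a squared-error stand-in for the paper's distillation loss (neither hyperparameter nor loss form is specified here):

```python
def ema_update(teacher, student, momentum=0.999):
    """Exponential-moving-average update of teacher parameters toward
    the student. Both are flat lists of parameter values."""
    return [momentum * t + (1.0 - momentum) * s
            for t, s in zip(teacher, student)]

def distill_loss(student_logit, teacher_logit):
    """Consistency between the student's output on an attacked input and
    the EMA teacher's output on the clean input. Squared error is a
    stand-in; the paper may use a temperature-scaled divergence."""
    return (student_logit - teacher_logit) ** 2

teacher = [0.0, 1.0]
student = [1.0, 1.0]
teacher = ema_update(teacher, student, momentum=0.9)  # -> [0.1, 1.0]
```

The teacher lags the student, so its clean-input predictions provide a stable target that the attack-augmented student is pulled toward.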

What would settle it

A test on text produced by an entirely new generator family combined with a novel attack method not present in the training data, checking if the high true-positive rate at low false-positive rate is preserved.
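The headline metric, TPR at a fixed FPR, amounts to thresholding at a quantile of the human (negative) scores; a minimal sketch with made-up scores:

```python
def tpr_at_fpr(ai_scores, human_scores, target_fpr=0.01):
    """True-positive rate at the score threshold that admits at most
    target_fpr of the human (negative) texts as false positives."""
    neg = sorted(human_scores, reverse=True)
    k = int(target_fpr * len(neg))          # tolerated false positives
    threshold = neg[min(k, len(neg) - 1)]
    return sum(s > threshold for s in ai_scores) / len(ai_scores)

# Illustrative scores only: 100 human texts, 3 AI texts.
humans = [i / 100 for i in range(100)]
rate = tpr_at_fpr([0.995, 0.990, 0.500], humans)  # 2 of 3 AI texts detected
```

The proposed test would hold AI scores from an unseen generator family under a novel attack against a threshold calibrated this way and check whether the TPR stays near the reported 99.9 percent.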

Figures

Figures reproduced from arXiv: 2605.06903 by Cheng Wan, Chenjun Li, Johannes C. Paetzold.

Figure 1
Figure 1: Overview of MELD. A shared encoder (Student) is trained with a main classification head and three auxiliary heads for generator family, attack type, and source domain. During training, clean inputs are passed through an EMA teacher, while the student is trained on clean or attack-augmented inputs. The objective combines (i) uncertainty-weighted multi-task classification, (ii) main-head teacher–student dist… view at source ↗
Figure 2
Figure 2: Backbone geometry. UMAP of ∼112,000 embeddings per panel from the evaluated detectors. A robust detector should separate human and AI text, preserve generator-level structure, and keep attacked variants near their clean sources. MELD best matches this geometry, with the highest generator separability, the lowest attack displacement, and visibly less human/AI overlap than the baselines. view at source ↗
Figure 3
Figure 3: Distance-space geometry. Per-detector within-source vs. between-source cosine-distance distributions. A better representation keeps same-source variants close while separating different sources, leading to less overlap between the two distributions. MELD shows the clearest separation and reaches Cohen's d′ = 3.28, ∼7× the strongest baseline (ModernBERT-Detect, d′ = 0.47). view at source ↗
Figure 4
Figure 4: Per-attack robustness on RAID. TPR@5%FPR on the official RAID test set, aggregated over domain, generator, decoding, and repetition. We compare open-source detectors with public papers or models. Bold marks the best cell per attack. "–" denotes an attack not scored by that submission. Analysis: the model learns to keep attacks close to the underlying clean source rather than overfitting to a narrow attack … view at source ↗
Figure 5
Figure 5: Compact MELD training step. Auxiliary heads, the EMA teacher, and ranking supervision are used only during training; inference uses only the main AI/Human head. Panel B: Kendall log-variance trajectories. view at source ↗
Figure 6
Figure 6: Learned log-variance s_t for each task. We initialize all tasks with the same weight and optimize s_t jointly with the encoder. Lower s_t corresponds to a larger multiplier e^{-s_t} in the uncertainty-weighted loss. In our runs, the auxiliary heads move to lower s_t later in training and therefore receive larger relative multipliers than the main head. This suggests that, within the joint objective, … view at source ↗
read the original abstract

Large language models are now embedded in everyday writing workflows, making reliable AI-generated text detection important for academic integrity, content moderation, and provenance tracking. In practice, however, a detector must do more than achieve high aggregate AUROC on clean, in-distribution human and AI text: it should remain robust to attacks and adversarial rewrites, transfer to unseen generators and domains, and operate at low false-positive rates (FPR). Most existing detectors optimize a single AI/Human objective, giving the representation little incentive to learn generator, attack, or domain structure once the binary task saturates. We introduce MELD (Multi-Task Equilibrated Learning Detector), a deployable detector for AI-generated text that enriches binary detection with auxiliary supervision. MELD attaches generator-family, attack-type, and source-domain heads to a shared encoder, and balances the four losses with learned homoscedastic uncertainty weights. To improve robustness, an EMA teacher predicts on clean inputs while an attack-augmented student is distilled toward the teacher. MELD further uses a hard-negative pairwise ranking loss to enlarge the score margin between AI-generated texts and the most confusable human texts. At inference, all auxiliary heads are discarded, giving MELD the same interface and cost as a standard detector. On the public RAID leaderboard, MELD is the strongest open-source detector and is competitive with leading commercial models, especially under attack and at low FPR. Across standard held-out benchmarks, MELD matches or outperforms supervised baselines. We further introduce MELD-eval, a held-out evaluation pool built from recent chat models released by four major LLM providers. Without additional finetuning, MELD achieves 99.9% TPR at 1% FPR on MELD-eval, while many baselines degrade sharply.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, and this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper introduces MELD, a deployable AI-generated text detector that augments binary classification with auxiliary multi-task heads for generator family, attack type, and source domain. These are balanced via learned homoscedastic uncertainty weights; an EMA teacher-student distillation is applied between clean and attack-augmented inputs, and a hard-negative pairwise ranking loss enlarges margins between AI and confusable human texts. Auxiliary heads are discarded at inference. The central claims are that MELD is the strongest open-source detector on the public RAID leaderboard (especially under attack and low FPR), matches or exceeds supervised baselines on standard held-out sets, and achieves 99.9% TPR at 1% FPR on a new held-out MELD-eval pool from four recent LLM providers without further finetuning.

Significance. If the reported generalization holds and is attributable to the multi-task equilibration rather than training-data coverage alone, the work would be a meaningful advance for practical detectors that must handle unseen generators, attacks, and domains while remaining lightweight at inference. The release of MELD-eval as a challenging, recent-LLM held-out set is a concrete community contribution.

major comments (2)
  1. [§4 (Experiments)] §4 (Experiments) and §4.3 (MELD-eval results): the headline 99.9% TPR@1%FPR and RAID leaderboard ranking are presented without ablation studies that isolate the auxiliary heads, EMA distillation, or hard-negative ranking loss. Removing these components (or replacing them with a plain binary baseline trained on the same data) is required to establish that the claimed robustness to unseen generators/attacks is produced by the proposed mechanisms rather than data distribution; this is load-bearing for the central claim.
  2. [§3.1 (Multi-task architecture)] §3.1 (Multi-task architecture): the four losses are balanced by learned homoscedastic uncertainty weights, yet no analysis is provided of the learned weights, their sensitivity to initialization, or whether they actually equilibrate the tasks versus simply defaulting to the binary loss. This directly affects the interpretation of the multi-task benefit.
minor comments (2)
  1. [Table 1 (RAID leaderboard)] Table 1 (RAID leaderboard): the exact training details and hyper-parameters for the re-implemented baselines are not stated, making it impossible to verify that the reported gaps are not due to unequal optimization.
  2. [§5 (MELD-eval construction)] §5 (MELD-eval construction): the prompt templates, temperature settings, and exact model versions used to generate the held-out pool should be listed explicitly for reproducibility.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the thoughtful and constructive report. The two major comments identify important gaps in the experimental validation of our central claims. We agree that both points require additional material and will revise the manuscript accordingly to strengthen the attribution of performance gains to the proposed mechanisms.

read point-by-point responses
  1. Referee: [§4 (Experiments)] §4 (Experiments) and §4.3 (MELD-eval results): the headline 99.9% TPR@1%FPR and RAID leaderboard ranking are presented without ablation studies that isolate the auxiliary heads, EMA distillation, or hard-negative ranking loss. Removing these components (or replacing them with a plain binary baseline trained on the same data) is required to establish that the claimed robustness to unseen generators/attacks is produced by the proposed mechanisms rather than data distribution; this is load-bearing for the central claim.

    Authors: We agree that the current version does not contain the requested ablations. While MELD is compared against external baselines and the RAID leaderboard, we did not include controlled ablations that replace the full model with a plain binary classifier trained on identical data, nor incremental removals of the auxiliary heads, EMA distillation, and pairwise ranking loss. We will add these experiments (both on RAID and on MELD-eval) in the revised manuscript, reporting TPR@1%FPR, AUROC, and attack robustness for each variant. This will allow readers to isolate the contribution of each component versus data coverage alone. revision: yes

  2. Referee: [§3.1 (Multi-task architecture)] §3.1 (Multi-task architecture): the four losses are balanced by learned homoscedastic uncertainty weights, yet no analysis is provided of the learned weights, their sensitivity to initialization, or whether they actually equilibrate the tasks versus simply defaulting to the binary loss. This directly affects the interpretation of the multi-task benefit.

    Authors: We acknowledge the absence of this analysis. The manuscript describes the homoscedastic uncertainty weighting but does not report the converged weight values, their stability across random seeds, or comparisons against fixed-weight multi-task training. In the revision we will add (i) a table or plot of the learned uncertainty parameters at convergence, (ii) sensitivity results for different initializations, and (iii) a direct comparison to a fixed-weight multi-task baseline to demonstrate that the learned weights actively equilibrate the tasks rather than collapsing to the binary loss. revision: yes

Circularity Check

0 steps flagged

No circularity: empirical results on held-out data with no reductive derivations

full rationale

The paper introduces MELD as an empirical multi-task detector using auxiliary heads for generator family/attack/domain, uncertainty-weighted loss balancing, EMA teacher-student distillation, and hard-negative ranking, then discards auxiliaries at inference. All performance claims (99.9% TPR at 1% FPR on MELD-eval, RAID leaderboard ranking) are experimental measurements on explicitly held-out benchmarks and a new evaluation pool constructed from recent LLMs. No equations, first-principles derivations, or predictions appear that reduce claimed generalization to quantities defined by fitted parameters or self-citations inside the paper. The method is self-contained against external benchmarks; auxiliary supervision is presented as a training technique whose benefit is measured, not presupposed by definition.

Axiom & Free-Parameter Ledger

1 free parameter · 2 axioms · 0 invented entities

The central claim rests on standard supervised deep-learning assumptions plus the unproven premise that auxiliary multi-task signals improve robustness without introducing new failure modes. No new physical entities or mathematical axioms are introduced.

free parameters (1)
  • learned homoscedastic uncertainty weights
    Four loss weights are learned during training; they are not fixed a priori and directly affect the final encoder.
axioms (2)
  • domain assumption Multi-task supervision on generator family, attack type, and domain enriches the shared representation for the binary detection task
    Invoked in the description of attaching auxiliary heads and balancing losses.
  • domain assumption EMA teacher on clean inputs plus attack-augmented student distillation improves robustness to adversarial rewrites
    Stated as the mechanism for robustness.

pith-pipeline@v0.9.0 · 5634 in / 1484 out tokens · 26759 ms · 2026-05-11T01:15:46.599354+00:00 · methodology

discussion (0)


Reference graph

Works this paper leans on

44 extracted references · 44 canonical work pages · 2 internal anchors

  1. [1]

GPTZero: Robust detection of LLM-generated texts

George Alexandru Adam, Alexander Cui, Edwin Thomas, Emily Napier, Nazar Shmatko, Jacob Schnell, Jacob Junqi Tian, Alekhya Dronavalli, Edward Tian, and Dongwon Lee. GPTZero: Robust detection of LLM-generated texts. arXiv preprint arXiv:2602.13042, 2026

  2. [2]

    Qwen 3.6 Plus, 2026

Alibaba Cloud. Qwen 3.6 Plus, 2026. API model snapshot qwen3.6-plus-04-02

  3. [3]

    Claude Haiku 4.5, 2025

Anthropic. Claude Haiku 4.5, 2025. URL https://www.anthropic.com/news/claude-haiku-4-5. API model snapshot claude-haiku-4.5-20251001

  4. [4]

    Fast-DetectGPT: Efficient zero-shot detection of machine-generated text via conditional probability curvature

    Guangsheng Bao, Yanbin Zhao, Zhiyang Teng, Linyi Yang, and Yue Zhang. Fast-DetectGPT: Efficient zero-shot detection of machine-generated text via conditional probability curvature. In The Twelfth International Conference on Learning Representations, 2024

  5. [5]

Diversity boosts AI-generated text detection

Advik Raj Basani and Pin-Yu Chen. Diversity boosts AI-generated text detection. arXiv preprint arXiv:2509.18880, 2025

  6. [6]

    Learning to rank using gradient descent

Chris Burges, Tal Shaked, Erin Renshaw, Ari Lazier, Matt Deeds, Nicole Hamilton, and Greg Hullender. Learning to rank using gradient descent. In Proceedings of the 22nd International Conference on Machine Learning, pages 89–96, 2005

  7. [7]

RepreGuard: Detecting LLM-generated text by revealing hidden representation patterns

Xin Chen, Junchao Wu, Shu Yang, Runzhe Zhan, Zeyu Wu, Ziyang Luo, Di Wang, Min Yang, Lidia S Chao, and Derek F Wong. RepreGuard: Detecting LLM-generated text by revealing hidden representation patterns. Transactions of the Association for Computational Linguistics, 13:1812–1831, 2025

  8. [8]

    Machine-generated text detection prevents language model collapse

George Drayson, Emine Yilmaz, and Vasileios Lampos. Machine-generated text detection prevents language model collapse. In Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, pages 29645–29661, 2025

  9. [9]

    RAID: A shared benchmark for robust evaluation of machine-generated text detectors

Liam Dugan, Alyssa Hwang, Filip Trhlík, Andrew Zhu, Josh Magnus Ludan, Hainiu Xu, Daphne Ippolito, and Chris Callison-Burch. RAID: A shared benchmark for robust evaluation of machine-generated text detectors. In Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 12463–12492, 2024

  10. [10]

    GLTR: Statistical detection and visualization of generated text

Sebastian Gehrmann, Hendrik Strobelt, and Alexander M Rush. GLTR: Statistical detection and visualization of generated text. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics: System Demonstrations, pages 111–116, 2019

  11. [11]

    Gemini 3 Flash, 2025

    Google DeepMind. Gemini 3 Flash, 2025. API model snapshot gemini-3-flash-preview-20251217

  12. [12]

    How close is ChatGPT to human experts? Comparison corpus, evaluation, and detection

Biyang Guo, Xin Zhang, Ziyuan Wang, Minqi Jiang, Jinran Nie, Yuxuan Ding, Jianwei Yue, and Yupeng Wu. How close is ChatGPT to human experts? Comparison corpus, evaluation, and detection. arXiv preprint arXiv:2301.07597, 2023

  13. [13]

    DeTeCtive: detecting AI-generated text via multi-level contrastive learning

Xun Guo, Shan Zhang, Yongxin He, Ting Zhang, Wanquan Feng, Haibin Huang, and Chongyang Ma. DeTeCtive: detecting AI-generated text via multi-level contrastive learning. In Proceedings of the 38th International Conference on Neural Information Processing Systems, pages 88320–88347, 2024

  14. [14]

    Spotting LLMs with binoculars: zero-shot detection of machine-generated text

Abhimanyu Hans, Avi Schwarzschild, Valeriia Cherepanova, Hamid Kazemi, Aniruddha Saha, Micah Goldblum, Jonas Geiping, and Tom Goldstein. Spotting LLMs with binoculars: zero-shot detection of machine-generated text. In Proceedings of the 41st International Conference on Machine Learning, pages 17519–17537, 2024

  15. [15]

In defense of the triplet loss for person re-identification

Alexander Hermans, Lucas Beyer, and Bastian Leibe. In defense of the triplet loss for person re-identification. arXiv preprint arXiv:1703.07737, 2017

  16. [16]

    Distilling the Knowledge in a Neural Network

Geoffrey Hinton, Oriol Vinyals, and Jeff Dean. Distilling the knowledge in a neural network. arXiv preprint arXiv:1503.02531, 2015

  17. [17]

    RADAR: robust AI-text detection via adversarial learning

Xiaomeng Hu, Pin-Yu Chen, and Tsung-Yi Ho. RADAR: robust AI-text detection via adversarial learning. In Proceedings of the 37th International Conference on Neural Information Processing Systems, pages 15077–15095, 2023

  18. [18]

    Averaging Weights Leads to Wider Optima and Better Generalization

Pavel Izmailov, Dmitrii Podoprikhin, Timur Garipov, Dmitry Vetrov, and Andrew Gordon Wilson. Averaging weights leads to wider optima and better generalization. arXiv preprint arXiv:1803.05407, 2018

  19. [19]

    Multi-task learning using uncertainty to weigh losses for scene geometry and semantics

Alex Kendall, Yarin Gal, and Roberto Cipolla. Multi-task learning using uncertainty to weigh losses for scene geometry and semantics. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 7482–7491, 2018

  20. [20]

Paraphrasing evades detectors of AI-generated text, but retrieval is an effective defense

Kalpesh Krishna, Yixiao Song, Marzena Karpinska, John Wieting, and Mohit Iyyer. Paraphrasing evades detectors of AI-generated text, but retrieval is an effective defense. Advances in Neural Information Processing Systems, 36:27469–27500, 2023

  21. [21]

    MAGE: Machine-generated text detection in the wild

Yafu Li, Qintong Li, Leyang Cui, Wei Bi, Zhilin Wang, Longyue Wang, Linyi Yang, Shuming Shi, and Yue Zhang. MAGE: Machine-generated text detection in the wild. In Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 36–53, 2024

  22. [22]

GPT detectors are biased against non-native English writers

Weixin Liang, Mert Yuksekgonul, Yining Mao, Eric Wu, and James Zou. GPT detectors are biased against non-native English writers. Patterns, 4(7), 2023

  23. [23]

    Uncertainty regularized multi-task learning

Kourosh Meshgi, Maryam Sadat Mirzaei, and Satoshi Sekine. Uncertainty regularized multi-task learning. In Proceedings of the 12th Workshop on Computational Approaches to Subjectivity, Sentiment & Social Media Analysis, pages 78–88, 2022

  24. [24]

    DetectGPT: zero-shot machine-generated text detection using probability curvature

Eric Mitchell, Yoonho Lee, Alexander Khazatsky, Christopher D Manning, and Chelsea Finn. DetectGPT: zero-shot machine-generated text detection using probability curvature. In Proceedings of the 40th International Conference on Machine Learning, pages 24950–24962, 2023

  25. [25]

    GPT-5.4 Mini, 2026

OpenAI. GPT-5.4 Mini, 2026. API model snapshot gpt-5.4-mini-20260317

  26. [26]

The FineWeb datasets: Decanting the web for the finest text data at scale

Guilherme Penedo, Hynek Kydlíček, Anton Lozhkov, Margaret Mitchell, Colin Raffel, Leandro von Werra, Thomas Wolf, et al. The FineWeb datasets: Decanting the web for the finest text data at scale. Advances in Neural Information Processing Systems, 37:30811–30849, 2024

  27. [27]

    Breaking the Generator Barrier: Disentangled Representation for Generalizable AI-Text Detection

Xiao Pu, Zepeng Cheng, Lin Yuan, Yu Wu, and Xiuli Bi. Breaking the generator barrier: Disentangled representation for generalizable AI-text detection. arXiv preprint arXiv:2604.13692, 2026

  28. [28]

QuillBot AI content detector, 2025

QuillBot, a Learneo, Inc. business. QuillBot AI content detector, 2025. Commercial product; performance reported on the public RAID leaderboard at https://raid-bench.xyz

  29. [29]

    Facenet: A unified embedding for face recognition and clustering

Florian Schroff, Dmitry Kalenichenko, and James Philbin. Facenet: A unified embedding for face recognition and clustering. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 815–823, 2015

  30. [30]

Release strategies and the social impacts of language models

Irene Solaiman, Miles Brundage, Jack Clark, Amanda Askell, Ariel Herbert-Voss, Jeff Wu, Alec Radford, Gretchen Krueger, Jong Wook Kim, Sarah Kreps, et al. Release strategies and the social impacts of language models. arXiv preprint arXiv:1908.09203, 2019

  31. [31]

    Grammarly AI writing detector, 2026

Superhuman Platform Inc. Grammarly AI writing detector, 2026. Commercial product; performance reported on the public RAID leaderboard at https://raid-bench.xyz

  32. [32]

    FAID: Fine-grained AI-generated text detection using multi-task auxiliary and multi-level contrastive learning

Minh Ngoc Ta, Dong Cao Van, Duc-Anh Hoang, Minh Le-Anh, Truong Nguyen, My Anh Tran Nguyen, Yuxia Wang, Preslav Nakov, and Dinh Viet Sang. FAID: Fine-grained AI-generated text detection using multi-task auxiliary and multi-level contrastive learning. In Proceedings of the 19th Conference of the European Chapter of the Association for Computational Linguisti...

  33. [33]

    Mean teachers are better role models: Weight-averaged consistency targets improve semi-supervised deep learning results

Antti Tarvainen and Harri Valpola. Mean teachers are better role models: Weight-averaged consistency targets improve semi-supervised deep learning results. In Proceedings of the 31st International Conference on Neural Information Processing Systems, pages 1195–1204, 2017

  34. [34]

    Modeling the attack: detecting AI-generated text by quantifying adversarial perturbations

LDMS Sai Teja, Annepaka Yadagiri, Sangam Sai Anish, Siva Gopala Krishna Nuthakki, and Partha Pakray. Modeling the attack: detecting AI-generated text by quantifying adversarial perturbations. In 2026 20th International Conference on Ubiquitous Information Management and Communication (IMCOM), pages 1–8. IEEE, 2026

  35. [35]

    TURINGBENCH: A benchmark environment for turing test in the age of neural text generation

Adaku Uchendu, Zeyu Ma, Thai Le, Rui Zhang, and Dongwon Lee. TURINGBENCH: A benchmark environment for turing test in the age of neural text generation. In Findings of the Association for Computational Linguistics: EMNLP 2021, pages 2001–2016, 2021

  36. [36]

Ghostbuster: Detecting text ghostwritten by large language models

Vivek Verma, Eve Fleisig, Nicholas Tomlin, and Dan Klein. Ghostbuster: Detecting text ghostwritten by large language models. In Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers), pages 1702–1717, 2024

  37. [37]

M4GT-Bench: Evaluation benchmark for black-box machine-generated text detection

Yuxia Wang, Jonibek Mansurov, Petar Ivanov, Jinyan Su, Artem Shelmanov, Akim Tsvigun, Osama Mohammed Afzal, Tarek Mahmoud, Giovanni Puccetti, Thomas Arnold, et al. M4GT-Bench: Evaluation benchmark for black-box machine-generated text detection. In Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Paper...

  38. [38]

    Smarter, better, faster, longer: A modern bidirectional encoder for fast, memory efficient, and long context finetuning and inference

Benjamin Warner, Antoine Chaffin, Benjamin Clavié, Orion Weller, Oskar Hallström, Said Taghadouini, Alexis Gallagher, Raja Biswas, Faisal Ladhak, Tom Aarsen, et al. Smarter, better, faster, longer: A modern bidirectional encoder for fast, memory efficient, and long context finetuning and inference. In Proceedings of the 63rd Annual Meeting of the Associati...

  39. [39]

Seq vs seq: An open suite of paired encoders and decoders

Orion Weller, Kathryn Ricci, Marc Marone, Antoine Chaffin, Dawn Lawrie, and Benjamin Van Durme. Seq vs seq: An open suite of paired encoders and decoders. The Fourteenth International Conference on Learning Representations, 2026

  40. [40]

    Advancing machine-generated text detection from an easy to hard supervision perspective

Chenwang Wu, Yiu-ming Cheung, Bo Han, and Defu Lian. Advancing machine-generated text detection from an easy to hard supervision perspective. In The Thirty-ninth Annual Conference on Neural Information Processing Systems, 2025

  41. [41]

DetectRL: Benchmarking LLM-generated text detection in real-world scenarios

Junchao Wu, Runzhe Zhan, Derek F Wong, Shu Yang, Xinyi Yang, Yulin Yuan, and Lidia S Chao. DetectRL: Benchmarking LLM-generated text detection in real-world scenarios. Advances in Neural Information Processing Systems, 37:100369–100401, 2024

  42. [42]

    LLMDet: A third party large language models generated text detection tool

Kangxi Wu, Liang Pang, Huawei Shen, Xueqi Cheng, and Tat-Seng Chua. LLMDet: A third party large language models generated text detection tool. In Findings of the Association for Computational Linguistics: EMNLP 2023, pages 2113–2133, 2023

  43. [43]

    Human texts are outliers: Detecting LLM-generated texts via out-of-distribution detection

Cong Zeng, Shengkun Tang, Yuanzhou Chen, Zhiqiang Shen, Wenchao Yu, Xujiang Zhao, Haifeng Chen, Wei Cheng, et al. Human texts are outliers: Detecting LLM-generated texts via out-of-distribution detection. In The Thirty-ninth Annual Conference on Neural Information Processing Systems, 2025

  44. [44]

WildChat: 1M ChatGPT interaction logs in the wild

Wenting Zhao, Xiang Ren, Jack Hessel, Claire Cardie, Yejin Choi, and Yuntian Deng. WildChat: 1M ChatGPT interaction logs in the wild. In The Twelfth International Conference on Learning Representations, 2024