A Distribution-Free Framework for Rewrite-Based Human-text Detection via Knockoff Filtering

Yi Liu

arXiv: 2606.00402 · v1 · pith:D2JI5JRUnew · submitted 2026-05-29 · 📊 stat.ME · cs.AI · stat.AP

Pith reviewed 2026-06-28 20:59 UTC · model grok-4.3

classification 📊 stat.ME · cs.AI · stat.AP

keywords LLM text detection · knockoff filtering · false discovery rate · distribution-free methods · rewrite detectors · multiple hypothesis testing · AI-generated content

The pith

Rewrite-based LLM text detectors inherit finite-sample FDR guarantees via a simple calibration that treats rewrites as knockoffs.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper establishes that rewrite-based detectors for AI-generated text already produce the paired samples needed for knockoff filtering, so they can be reframed as a multiple-testing problem. This separation lets the detector's scoring rule stay unchanged while a calibration step enforces control over the false discovery rate in finite samples. Because the method requires no distributional assumptions or retraining, it applies directly to existing detectors. A reader would care if the claim holds because it turns heuristic detectors into ones with explicit error-rate guarantees across domains and models.

Core claim

Rewrite-based detection implicitly constructs knockoff samples, enabling LLM-generated text detection to be formulated as a multiple hypothesis testing problem with knockoff structure. This perspective separates the design of detection statistics from the control of false discoveries, allowing existing rewrite detectors to inherit finite-sample false discovery rate (FDR) guarantees through a simple calibration procedure.

What carries the argument

Knockoff filtering on rewrite pairs, which supplies the exchangeability structure required to calibrate any rewrite detector for FDR control.
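The calibration step can be illustrated with the standard knockoff+ thresholding rule. The sketch below is not taken from the paper: it assumes each text already has a signed statistic s_i (large and positive when the detector favors flagging the text, sign-symmetric around zero under the null), and the function names and toy numbers are hypothetical.

```python
from typing import List, Sequence


def knockoff_plus_threshold(s: Sequence[float], q: float) -> float:
    """Return the knockoff+ threshold for signed statistics s at target level q.

    Scans candidate thresholds t taken from the nonzero magnitudes |s_i| and
    returns the smallest t with (1 + #{s_i <= -t}) / max(1, #{s_i >= t}) <= q.
    Returns infinity when no candidate qualifies, so nothing is flagged.
    """
    candidates = sorted({abs(x) for x in s if x != 0})
    for t in candidates:
        negatives = sum(1 for x in s if x <= -t)
        positives = sum(1 for x in s if x >= t)
        if (1 + negatives) / max(1, positives) <= q:
            return t
    return float("inf")


def knockoff_select(s: Sequence[float], q: float) -> List[int]:
    """Indices flagged at level q: those with s_i at or above the threshold."""
    t = knockoff_plus_threshold(s, q)
    return [i for i, x in enumerate(s) if x >= t]


# Hypothetical statistics, for illustration only (not data from the paper).
toy_stats = [5.0, 4.0, 3.0, 2.5, 2.0, -0.5, 0.3, 1.5, -1.0, 1.2]
flagged = knockoff_select(toy_stats, q=0.2)
```

The scan here is quadratic in the number of texts for readability; sorting once and keeping running counts would give the same threshold in O(n log n). Note that the scoring rule never appears in the code, which mirrors the separation between detection statistic and error control described above.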

If this is right

Any existing rewrite detector acquires finite-sample FDR control without retraining or new assumptions.
The same calibration works across three detection models, nineteen domains, and four LLMs while preserving detection power.
Design of the detection score can proceed independently of the error-rate guarantee.
The resulting procedure remains distribution-free.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same knockoff view might let practitioners combine several rewrite detectors under a single joint FDR guarantee.
The calibration step could be packaged as a lightweight post-processor usable on any black-box rewrite detector.
Similar paired-sample constructions in other detection tasks could receive the same distribution-free FDR treatment.

Load-bearing premise

Rewrite-based detection implicitly produces knockoff samples that meet the conditions needed for valid multiple-testing calibration.

What would settle it

A controlled experiment with labeled human and LLM texts in which the calibrated procedure's realized false discovery proportion exceeds the nominal target level.
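A minimal sketch of how such an experiment might be scored, assuming ground-truth labels are available and the calibrated procedure returns a list of flagged indices. The helper names are hypothetical. Because FDR is the expectation of the false discovery proportion (FDP), the sketch also averages FDP over independent replications; a single run above the target is not by itself evidence of a violation.

```python
from typing import List, Sequence, Tuple


def fdp_and_power(selected: Sequence[int], is_null: Sequence[bool]) -> Tuple[float, float]:
    """Realized false discovery proportion and power for one run.

    selected: indices flagged by the calibrated procedure.
    is_null:  ground-truth labels, True where the null hypothesis holds.
    """
    flagged = set(selected)
    false_discoveries = sum(1 for i in flagged if is_null[i])
    true_discoveries = len(flagged) - false_discoveries
    n_alternatives = sum(1 for flag in is_null if not flag)
    fdp = false_discoveries / max(1, len(flagged))   # 0 when nothing is flagged
    power = true_discoveries / max(1, n_alternatives)
    return fdp, power


def mean_fdp(runs: Sequence[Tuple[Sequence[int], Sequence[bool]]]) -> float:
    """Average FDP over replications, an empirical estimate of the FDR."""
    fdps = [fdp_and_power(sel, labels)[0] for sel, labels in runs]
    return sum(fdps) / len(fdps)


# Hypothetical labels and selection, for illustration only.
toy_labels = [False] * 5 + [True] * 5
toy_fdp, toy_power = fdp_and_power([0, 1, 2, 3, 4, 7, 9], toy_labels)
```

Comparing `mean_fdp` against the nominal level across many resampled test sets, per domain and per LLM, is one way the claim could be stressed.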

Figures

Figures reproduced from arXiv: 2606.00402 by Yi Liu.

**Figure 1.** Distribution of knockoff statistics s_i for human (alternative, blue) and AI-generated (null, red) texts in representative high-symmetry cases, one per rewriting method: Religious/GPT-3.5T (Likelihood, frac+ = 0.510, KS-p = 0.988), PersonalCommunication/GPT-4o (IMBD, frac+ = 0.510, KS-p = 0.713), and Sports/GPT-3.5T (L2D, frac+ = 0.510, KS-p = 0.068). The dashed vertical line marks s_i = 0. The near-symmetri…

**Figure 3.** Per-domain FDR and power across all methods and models at each target level.
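The Figure 1 caption summarizes null symmetry with frac+ and a KS p-value. A rough sketch of a comparable diagnostic follows, assuming frac+ means the share of positive statistics among null texts (the truncated caption does not confirm this). The KS test is swapped here for an exact two-sided sign test so the example needs only the standard library; it checks the balance of signs, not the full shape of the distribution, and the function names are hypothetical.

```python
from math import comb
from typing import Sequence


def frac_positive(stats: Sequence[float]) -> float:
    """Share of strictly positive values among the nonzero statistics."""
    nonzero = [s for s in stats if s != 0]
    if not nonzero:
        raise ValueError("need at least one nonzero statistic")
    return sum(1 for s in nonzero if s > 0) / len(nonzero)


def sign_test_pvalue(stats: Sequence[float]) -> float:
    """Exact two-sided sign test of P(s > 0) = 1/2, ignoring zeros."""
    nonzero = [s for s in stats if s != 0]
    n = len(nonzero)
    if n == 0:
        raise ValueError("need at least one nonzero statistic")
    k = sum(1 for s in nonzero if s > 0)
    tail = min(k, n - k)
    # Twice the smaller binomial tail under Binomial(n, 1/2), capped at 1.
    p = 2 * sum(comb(n, j) for j in range(tail + 1)) / 2 ** n
    return min(1.0, p)


# Hypothetical null statistics, for illustration only.
toy_null_stats = [0.4, -0.2, 1.1, -0.9, 0.3, -0.1, 0.0, 0.7]
toy_frac = frac_positive(toy_null_stats)
toy_p = sign_test_pvalue(toy_null_stats)
```

A frac+ near 0.5 with a large p-value is consistent with sign symmetry but does not prove exchangeability; it only fails to reject it on the sample at hand.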

Original abstract

We propose a distribution-free statistical framework that converts arbitrary rewrite-based detectors into detectors with finite-sample FDR guarantees without retraining. Our key observation is that rewrite-based detection implicitly constructs knockoff samples, enabling LLM-generated text detection to be formulated as a multiple hypothesis testing problem with knockoff structure. This perspective separates the design of detection statistics from the control of false discoveries, allowing existing rewrite detectors to inherit finite-sample false discovery rate (FDR) guarantees through a simple calibration procedure. We demonstrate reliable FDR control with meaningful detection power across three detection models, 19 domains, and four LLMs.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

The paper wraps rewrite-based detectors in knockoff filtering to claim finite-sample FDR control, but the exchangeability step that would make the guarantee exact is not derived.

The central move is to notice that a rewrite step already produces something like a knockoff copy, so you can run the usual knockoff filter on top of any existing rewrite detector and get FDR control without retraining. That separation between the detection statistic and the error-rate procedure is the clean part, and the abstract says it works across three models, 19 domains, and four LLMs.

The experiments are presented as showing reliable control plus decent power, which is the right kind of check to run. If the calibration really is distribution-free and only adds a simple post-processing step, that would be useful for anyone who already has a rewrite detector and wants a formal bound on false positives.

The weak point is the exchangeability claim. Finite-sample knockoff FDR control needs the original and the knockoff to be pairwise exchangeable under the null. The abstract treats the rewrite as automatically supplying that property, but gives no derivation showing the joint distribution is symmetric or that the conditional independence holds without extra assumptions on the rewrite distribution. The stress-test concern lands here: if exchangeability is only approximate, the finite-sample guarantee does not follow. Without seeing the actual proof or the precise conditions on the rewrite operator, it is hard to know how much weight the claim can bear.
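For reference, the condition at issue can be written in the standard form used for the knockoff filter. The notation is an assumption made here and may differ from the paper's: x_i is a text, \tilde{x}_i its rewrite, w an antisymmetric contrast, s_i the resulting signed statistic, and q the target FDR level.

```latex
% Assumed notation, not taken from the paper.
% Pairwise exchangeability for each null index i:
\[
  (x_i, \tilde{x}_i) \overset{d}{=} (\tilde{x}_i, x_i)
\]
% Antisymmetric statistic, so that swapping the pair flips the sign:
\[
  s_i = w(x_i, \tilde{x}_i) = -\,w(\tilde{x}_i, x_i)
\]
% Knockoff+ threshold at target level q:
\[
  T = \min\Bigl\{ t > 0 :
      \frac{1 + \#\{i : s_i \le -t\}}{\max\bigl(1,\ \#\{i : s_i \ge t\}\bigr)} \le q \Bigr\}
\]
```

When the signs of the null s_i behave as independent fair coin flips given the magnitudes and the non-null statistics, flagging every i with s_i ≥ T keeps the FDR at or below q in finite samples. The first line is the part that has to be established for the rewrite operator; the other two are mechanical once it holds.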

This is for readers working on statistical guarantees for text detection rather than for people who just want a better practical detector. The idea is worth a serious referee because the framing is new and the empirical checks are in the right direction, even if the key symmetry needs more work.

Referee Report

2 major / 1 minor

Summary. The paper proposes a distribution-free framework that converts arbitrary rewrite-based detectors into ones with finite-sample FDR guarantees by observing that rewrites implicitly construct knockoff samples, allowing LLM-generated text detection to be cast as a multiple hypothesis testing problem; existing detectors then inherit FDR control via a simple calibration step without retraining. Empirical results are reported across three detection models, 19 domains, and four LLMs.

Significance. If the exchangeability claim holds, the separation of statistic design from FDR control would be a meaningful contribution, permitting reuse of existing rewrite detectors with rigorous finite-sample guarantees. The broad empirical scope (multiple models, domains, LLMs) is a strength that would support practical utility if the theoretical foundation is secured.

major comments (2)

[Abstract] Abstract: the central claim that 'rewrite-based detection implicitly constructs knockoff samples' enabling finite-sample FDR control is asserted without a derivation showing that the rewrite operator satisfies the pairwise exchangeability (or the required conditional independence) under the human-text null; if this symmetry fails to hold exactly, the calibration procedure inherits no exact finite-sample guarantee.
[Abstract (and any theoretical section deriving the knockoff property)] The distribution-free assertion rests on the rewrite procedure preserving the joint symmetry needed for knockoff FDR control; without explicit conditions on the rewrite distribution or a proof that the detector statistic satisfies the knockoff filter requirements, the finite-sample guarantee is not established and may reduce to an approximate procedure.

minor comments (1)

[Abstract] Abstract: quantitative statements such as 'reliable FDR control with meaningful detection power' should reference specific tables or figures reporting achieved FDR levels and power across the 19 domains and four LLMs.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the thoughtful and detailed report. The major comments correctly identify that the abstract and theoretical claims regarding the knockoff property require a more explicit derivation of exchangeability. We address each point below and will revise the manuscript to include the requested formal arguments.

Referee: [Abstract] Abstract: the central claim that 'rewrite-based detection implicitly constructs knockoff samples' enabling finite-sample FDR control is asserted without a derivation showing that the rewrite operator satisfies the pairwise exchangeability (or the required conditional independence) under the human-text null; if this symmetry fails to hold exactly, the calibration procedure inherits no exact finite-sample guarantee.

Authors: We agree that the abstract states the key observation without a self-contained derivation. Section 3 of the manuscript sketches why rewrites under the human-text null produce the required exchangeability for knockoff filtering, but the argument is not fully formalized. In revision we will add an explicit proof of pairwise exchangeability (including the precise conditional independence statement) together with the minimal assumptions on the rewrite distribution that make the finite-sample FDR control exact rather than approximate. revision: yes
Referee: [Abstract (and any theoretical section deriving the knockoff property)] The distribution-free assertion rests on the rewrite procedure preserving the joint symmetry needed for knockoff FDR control; without explicit conditions on the rewrite distribution or a proof that the detector statistic satisfies the knockoff filter requirements, the finite-sample guarantee is not established and may reduce to an approximate procedure.

Authors: The referee is correct that the current text does not supply a complete set of conditions on the rewrite distribution nor a line-by-line verification that the detector statistic meets the knockoff filter requirements. We will insert a dedicated subsection that states the necessary conditions on the rewrite operator, proves the joint symmetry, and verifies that the resulting test statistic satisfies the conditions for exact finite-sample FDR control via the knockoff filter. This change will make the guarantee rigorous rather than implicit. revision: yes

Circularity Check

0 steps flagged

No circularity; applies standard knockoff filter via modeling observation

The paper's derivation rests on the modeling claim that rewrite-based detectors implicitly produce knockoff samples satisfying the conditions for the existing knockoff filter, then applies the standard calibration procedure for FDR control. No equations or steps in the provided abstract reduce a prediction or result to a fitted parameter, self-citation, or redefinition by construction. The framework separates detector design from FDR control using external knockoff theory, with no load-bearing self-citations or ansatzes imported from prior author work. This is a normal non-circular application of an established method to a new domain.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

Abstract-only review limits visibility into parameters and assumptions; the central premise is the knockoff construction from rewrite detection.

axioms (1)

Domain assumption: Rewrite-based detection implicitly constructs knockoff samples.
Stated as the key observation that enables the multiple-testing formulation.
