How Human-Like Are Large Language Models? A Register-Aware Linguistic Evaluation Framework


arxiv: 2605.23651 · v1 · pith:444QW6XYnew · submitted 2026-05-22 · 💻 cs.CL


Björn Nieth (1, 4), Marianna Gracheva (2), Michaela Mahlberg (2, 3), Bjoern Eskofier (1, 3, 5, 6), Emmanuelle Salin (1)

(1) Department Artificial Intelligence in Biomedical Engineering (AIBE), FAU Erlangen-Nürnberg, Germany; (2) Department of Digital Humanities and Social Studies (DHSS), FAU Erlangen-Nürnberg, Germany; (3) University of Birmingham, United Kingdom; (4) Chair of AI-supported Therapy Decisions, LMU München, Munich, Germany; (5) Munich Center for Machine Learning (MCML), Munich, Germany; (6) Institute of AI for Health, Helmholtz Zentrum München, Neuherberg, Germany


Pith reviewed 2026-05-25 04:18 UTC · model grok-4.3

classification 💻 cs.CL

keywords: large language models · linguistic evaluation · register variation · Biber features · maximum mean discrepancy · human-likeness · corpus linguistics · text generation


The pith

In every tested setup, large language models deviate from human linguistic patterns, but which model comes closest depends on the register rather than on model size.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces a context-aware framework that measures human-likeness of LLM texts by comparing their distributions of linguistic features against human reference corpora for specific registers. It applies a two-sample test using 67 lexico-grammatical features to seven instruction-tuned models across five English datasets. All LLMs show measurable differences from the human baseline in every register examined. Rankings of which model comes closest shift depending on the register and are not explained by differences in model size. This matters because texts can be factually accurate yet still feel unnatural if they violate the expected frequencies and patterns for a given communicative context.

Core claim

LLMs deviate from the human baseline in every tested setup when their texts are compared on lexico-grammatical feature distributions. The model that produces the distribution closest to human writing changes with the register, and this ordering is not dictated by model size.

What carries the argument

A two-sample Maximum Mean Discrepancy comparison between human and LLM corpora, performed separately for each register using the 67 Biber lexico-grammatical features.
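The comparison described above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the Gaussian (RBF) kernel, the median-distance bandwidth heuristic, and the function names `rbf_kernel`, `median_bandwidth` and `mmd2_unbiased` are assumptions made here, and the random arrays merely stand in for per-text feature vectors.

```python
import numpy as np

def rbf_kernel(a, b, bandwidth):
    """Gaussian kernel matrix between the rows of a and the rows of b."""
    sq_dists = (
        np.sum(a**2, axis=1)[:, None]
        + np.sum(b**2, axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    return np.exp(-np.maximum(sq_dists, 0.0) / (2.0 * bandwidth**2))

def median_bandwidth(x, y):
    """Median pairwise distance over the pooled sample (assumed heuristic)."""
    z = np.vstack([x, y])
    norms = np.sum(z**2, axis=1)
    sq = norms[:, None] + norms[None, :] - 2.0 * z @ z.T
    dists = np.sqrt(np.maximum(sq, 0.0))
    return float(np.median(dists[np.triu_indices_from(dists, k=1)]))

def mmd2_unbiased(x, y, bandwidth=None):
    """Unbiased estimate of squared MMD between samples x and y."""
    if bandwidth is None:
        bandwidth = median_bandwidth(x, y)
    m, n = len(x), len(y)
    k_xx = rbf_kernel(x, x, bandwidth)
    k_yy = rbf_kernel(y, y, bandwidth)
    k_xy = rbf_kernel(x, y, bandwidth)
    # Within-sample terms exclude the diagonal (i == j) pairs.
    term_xx = (k_xx.sum() - np.trace(k_xx)) / (m * (m - 1))
    term_yy = (k_yy.sum() - np.trace(k_yy)) / (n * (n - 1))
    return term_xx + term_yy - 2.0 * k_xy.mean()

# Synthetic stand-ins: 200 texts per corpus, 67 features per text.
rng = np.random.default_rng(0)
human = rng.normal(0.0, 1.0, size=(200, 67))
human_like = rng.normal(0.0, 1.0, size=(200, 67))  # same distribution
shifted = rng.normal(0.5, 1.0, size=(200, 67))     # systematically different

print("same distribution:   ", mmd2_unbiased(human, human_like))
print("shifted distribution:", mmd2_unbiased(human, shifted))
```

Running one such comparison per register, each against that register's own human corpus, gives the register-specific distances the summary refers to.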

If this is right

Evaluation of LLM output must be performed register by register rather than with a single aggregate score.
Larger models are not guaranteed to produce more human-like language distributions than smaller ones.
Different communicative contexts expose different strengths among current open-source models.
The framework supplies a quantitative basis for selecting models according to the intended register of use.
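To make the register-by-register reading concrete, the sketch below ranks models within each register and checks how strongly distance tracks parameter count. All model names, sizes and distance values are hypothetical placeholders invented for illustration, not results from the paper; the helpers `closest_model` and `spearman` are likewise assumptions.

```python
import numpy as np

# Hypothetical placeholder values, NOT results from the paper.
model_sizes_b = {"model_a": 4, "model_b": 8, "model_c": 12, "model_d": 70}
mmd2_by_register = {
    "spoken": {"model_a": 0.30, "model_b": 0.12, "model_c": 0.25, "model_d": 0.20},
    "news": {"model_a": 0.08, "model_b": 0.15, "model_c": 0.05, "model_d": 0.11},
}

def rank(values):
    """Ranks starting at 0; no tie handling, adequate for distinct values."""
    values = np.asarray(values, dtype=float)
    order = np.argsort(values)
    ranks = np.empty(len(values), dtype=float)
    ranks[order] = np.arange(len(values))
    return ranks

def spearman(a, b):
    """Spearman rank correlation as the Pearson correlation of the ranks."""
    return float(np.corrcoef(rank(a), rank(b))[0, 1])

def closest_model(scores):
    """Model with the smallest distance to the human reference corpus."""
    return min(scores, key=scores.get)

for register, scores in mmd2_by_register.items():
    names = sorted(scores)
    rho = spearman(
        [model_sizes_b[name] for name in names],
        [scores[name] for name in names],
    )
    print(register, "closest:", closest_model(scores), "size-vs-distance rho:", round(rho, 2))
```

A strongly negative correlation in every register would suggest that larger models are reliably closer to the human baseline; a correlation that is weak or changes sign across registers is the pattern the bullets above describe.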

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

Fine-tuning on register-specific human data may close the observed gaps more effectively than further scaling.
The same method could be applied to measure how well models handle register shifts within a single conversation.
Training data that under-represents certain registers likely contributes to the systematic deviations found here.

Load-bearing premise

The 67 Biber features together with the MMD statistic capture the aspects of language production that determine whether a text feels human-like in a given register.
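Because the features are frequencies of different grammatical classes, they sit on different scales, and a kernel distance is sensitive to how they are scaled. Several figure captions on this page mention normalization on the full human dataset. The sketch below assumes this means z-scoring every feature with the human corpus mean and standard deviation; that reading, and the names `fit_human_scaler` and `standardize`, are assumptions rather than details confirmed by the paper.

```python
import numpy as np

def fit_human_scaler(human_features, eps=1e-12):
    """Per-feature mean and standard deviation of the human reference corpus."""
    mean = human_features.mean(axis=0)
    std = human_features.std(axis=0)
    # Leave constant features unscaled instead of dividing by zero.
    std = np.where(std < eps, 1.0, std)
    return mean, std

def standardize(features, mean, std):
    """Apply the human-derived scaling to any corpus (human or model)."""
    return (features - mean) / std

# Synthetic stand-ins for feature matrices (rows = texts, columns = features).
rng = np.random.default_rng(0)
human = rng.normal(5.0, 2.0, size=(100, 67))
model = rng.normal(5.5, 1.0, size=(100, 67))

mean, std = fit_human_scaler(human)
human_z = standardize(human, mean, std)
model_z = standardize(model, mean, std)
print(human_z.mean(), model_z.mean())
```

Fitting the scaler on the human corpus only keeps the reference fixed, so every model is measured against the same coordinate system.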

What would settle it

A result in which models of one size class rank closest to the human baseline in every register would show that closeness is dictated by size after all.

Figures

Figures reproduced from arXiv: 2605.23651 by Björn Nieth, Marianna Gracheva, Michaela Mahlberg, Bjoern Eskofier, Emmanuelle Salin.

**Figure 2.** MMD² with a resampled confidence interval for different sample sizes on the XSum dataset.

**Figure 3.** MMD² for all datasets and models to the respective human corpus, where the points indicate the observed MMD² and the whiskers show the 95% CI resampled on coupled samples from the human and model corpus. The orange line in each plot gives the Human-Human MMD² for the respective dataset with the resampled CI. The models on the y-axis are sorted by their observed MMD² distance. Because the distan…

**Figure 4.** Overview of the proposed evaluation frame…

**Figure 5.** MMD² for the prompt stability experiments to the human reference sample of the BNC2014Spoken. Dots indicate the mean value over all prompts, while the band shows the minimum and maximum observed distance for the respective model under all prompt variations.

**Figure 6.** Violin plot of Biber dimension 1 on BNC2014Spoken for human and models in the Zero-Shot setting.

**Figure 7.** MMD² with bootstrapped confidence interval for different sample sizes on all datasets. For BNC2014Spoken the error increases because the dataset has only 1200 samples, so a sample size larger than 600 yields one smaller and one larger subset.

**Figure 8.** Correlation heatmap of the human-AI MMD² on BNC2014Spoken across different prompt variants in the Zero-Shot setting.

**Figure 9.** Human and model distributions for Biber dimensions in the Zero-Shot setting (BNC2014Spoken).

**Figure 10.** Human and model distributions for Biber dimensions in the Zero-Shot setting (S2ORC_ACL).

**Figure 11.** Human and model distributions for Biber dimensions in the Zero-Shot setting (wikiHow).

**Figure 12.** Human and model distributions for Biber dimensions in the Zero-Shot setting (WritingPrompts).

**Figure 13.** Human and model distributions for Biber dimensions in the Zero-Shot setting (XSum).

**Figure 14.** Mean of the normalized linguistic features without standardization to the full human dataset, with …

**Figure 15.** Mean of the normalized linguistic features without standardization to the full human dataset, with the …

**Figure 16.** Mean of the normalized linguistic features without standardization to the full human dataset, with the …

**Figure 17.** Mean of the normalized linguistic features without standardization to the full human dataset, with the …

**Figure 18.** Mean of the normalized linguistic features without standardization to the full human dataset, with the …

**Figure 19.** Wasserstein distance for marginal feature distributions between model and human for BNC2014Spoken

**Figure 20.** Wasserstein distance for marginal feature distributions between model and human for S2ORC_ACL in …

**Figure 21.** Wasserstein distance for marginal feature distributions between model and human for wikiHow in the …

**Figure 22.** Wasserstein distance for marginal feature distributions between model and human for WritingPrompts in …

**Figure 23.** Wasserstein distance for marginal feature distributions between model and human for XSum in the …

**Figure 24.** Observed MMD distance between different models for BNC2014Spoken in the Zero-Shot setting. The …

**Figure 25.** Observed MMD distance between different models for S2ORC_ACL in the Zero-Shot setting. The …

**Figure 26.** Observed MMD distance between different models for wikiHow in the Zero-Shot setting. The MMD …

**Figure 27.** Observed MMD distance between different models for WritingPrompts in the Zero-Shot setting. The …

**Figure 28.** Observed MMD distance between different models for XSum in the Zero-Shot setting. The MMD …

**Figure 29.** Sum of the variances of the 67 linguistic features after normalization on the corresponding full human …
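Several captions above refer to confidence intervals obtained by resampling coupled samples from the human and model corpora. One way such an interval could be computed is sketched below. It rests on assumptions made here: "coupled" is read as drawing the same row indices from both corpora (rows aligned by prompt), subsamples are drawn without replacement, the interval is a simple percentile interval, and the kernel bandwidth is fixed by hand. The function names are hypothetical.

```python
import numpy as np

def mmd2_unbiased(x, y, bandwidth):
    """Unbiased squared MMD with a Gaussian kernel of fixed bandwidth."""
    def kernel(a, b):
        sq = np.sum(a**2, axis=1)[:, None] + np.sum(b**2, axis=1)[None, :] - 2.0 * a @ b.T
        return np.exp(-np.maximum(sq, 0.0) / (2.0 * bandwidth**2))
    m, n = len(x), len(y)
    k_xx, k_yy, k_xy = kernel(x, x), kernel(y, y), kernel(x, y)
    return (
        (k_xx.sum() - np.trace(k_xx)) / (m * (m - 1))
        + (k_yy.sum() - np.trace(k_yy)) / (n * (n - 1))
        - 2.0 * k_xy.mean()
    )

def resampled_interval(human, model, sample_size, n_rounds=200,
                       level=0.95, bandwidth=1.0, seed=0):
    """Percentile interval of MMD^2 over coupled subsamples.

    Assumes row i of `human` and row i of `model` belong together,
    and that sample_size does not exceed the smaller corpus.
    """
    rng = np.random.default_rng(seed)
    n = min(len(human), len(model))
    stats = []
    for _ in range(n_rounds):
        # Same indices for both corpora keeps the samples coupled.
        idx = rng.choice(n, size=sample_size, replace=False)
        stats.append(mmd2_unbiased(human[idx], model[idx], bandwidth))
    lower, upper = np.quantile(stats, [(1 - level) / 2, (1 + level) / 2])
    return float(lower), float(upper)

# Synthetic stand-ins for standardized feature matrices.
rng = np.random.default_rng(0)
human = rng.normal(0.0, 1.0, size=(300, 5))
model = rng.normal(1.0, 1.0, size=(300, 5))
print(resampled_interval(human, model, sample_size=100, bandwidth=2.0))
```

Repeating the call for increasing values of `sample_size` gives the kind of sample-size curve that the Figure 2 and Figure 7 captions describe.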

Original abstract

While factual correctness and task-performance have been in focus of Large Language Model (LLM) research for a long time, the fundamental question of how human-like generated texts are on a linguistic level has been underexplored. From a corpus-linguistic perspective, language production is inherently context-dependent, with distinct communicative contexts giving rise to differences in frequencies and co-occurrence patterns of linguistic features. A text failing to adhere to these patterns can be content-wise correct, but still be unfavorable to human readers. In this work, we propose a context-aware evaluation framework in which human-likeness is assessed using a two-sample problem between the linguistic feature distribution of a human reference corpus for a given register and a corresponding LLM-generated corpus. We implement this framework using the Maximum Mean Discrepancy (MMD) and the 67 lexico-grammatical features introduced by Biber, which are commonly applied in corpus linguistics. In our experiments, we compare seven instruction-tuned, open-source models across five English-language datasets spanning distinct registers against a human baseline. While across all tested setups, LLMs deviate from the human baseline, which models are closest to human language depends on the register and is not dictated by model size.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

The paper applies Biber features plus MMD to show that LLMs deviate from human registers, but that the closest model varies by register, not size.


The core result here is straightforward: across the tested registers, all seven LLMs differ from the human corpora on the 67 Biber features, yet which model sits closest shifts with the register and does not track model size. That register dependence is the main claim worth noting.

The work is new in taking the established Biber feature set and two-sample MMD setup and turning it into a register-aware evaluation for generated text. It does this cleanly by pulling human reference corpora for five English registers and generating matching LLM output for direct comparison. The approach is transparent and builds directly on corpus-linguistic tools that have been used for decades, which gives it a solid empirical footing without introducing new parameters or circular definitions.

The limitation is that the 67 features are counts of specific lexico-grammatical classes. They may not pick up discourse-level or pragmatic signals that matter for human judgments of naturalness in a given register. If the MMD ordering does not match what readers actually rate as human-like, the register-specific conclusions rest on a narrower base than the abstract suggests. The stress-test concern lands because nothing in the reported setup tests alignment with human naturalness ratings or an expanded feature set.

This is useful for readers who already work with register variation or who need a practical way to compare models on linguistic distributions rather than task accuracy. It is worth sending to peer review because the method is reproducible and the question is well-posed, even if the feature set needs more validation against human perception.

Referee Report

1 major / 1 minor

Summary. The manuscript proposes a context-aware evaluation framework for assessing the human-likeness of LLM-generated texts using Maximum Mean Discrepancy (MMD) to compare distributions of 67 Biber lexico-grammatical features between human reference corpora and LLM outputs across five distinct English registers. Experiments with seven instruction-tuned open-source LLMs reveal that all models deviate from human baselines, but the model closest to the human distribution varies depending on the register and is not solely determined by model size.

Significance. If the framework's assumptions hold, this work offers a valuable corpus-linguistic approach to LLM evaluation that accounts for register-specific linguistic patterns, moving beyond task performance metrics. The reliance on established Biber features and MMD contributes to the method's transparency and potential for replication in the field.

major comments (1)

[Abstract] The central claim that 'which models are closest to human language depends on the register and is not dictated by model size' is load-bearing on the 67 Biber features plus two-sample MMD being a sufficient statistic for human-likeness (Abstract). The manuscript provides no evidence that these distances align with human judgments of naturalness or discourse-level properties in the tested registers, nor any ablation against expanded feature sets; if the ordering differs from such external validation, the register-dependence conclusion does not follow from the reported MMD values.

minor comments (1)

[Abstract] The abstract states the main finding but does not name the five registers or seven models; adding these would improve immediate readability.

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for their constructive comments. We respond to the single major comment below.


Referee: [Abstract] The central claim that 'which models are closest to human language depends on the register and is not dictated by model size' is load-bearing on the 67 Biber features plus two-sample MMD being a sufficient statistic for human-likeness (Abstract). The manuscript provides no evidence that these distances align with human judgments of naturalness or discourse-level properties in the tested registers, nor any ablation against expanded feature sets; if the ordering differs from such external validation, the register-dependence conclusion does not follow from the reported MMD values.

Authors: We acknowledge the referee's point that the manuscript does not provide direct evidence linking MMD distances on the Biber feature set to human judgments of naturalness. The 67 features are selected because they are a well-established, replicable set in corpus linguistics for modeling register variation (Biber 1988 and subsequent validation studies). MMD serves as a distribution-level comparator rather than a claim of sufficiency for all aspects of human-likeness. The reported finding is therefore scoped to relative distances within this operationalization: across the five registers, the model minimizing MMD changes and is not monotonically related to parameter count. We agree that external validation would strengthen interpretation. In revision we will (1) temper the abstract wording to emphasize that conclusions concern this specific feature set and metric, (2) add citations to existing literature on the predictive validity of Biber features for perceived register appropriateness, and (3) expand the limitations section to note the absence of human judgment correlation or feature-set ablations as directions for future work. No new experiments are added at this stage.

revision: partial

Circularity Check

0 steps flagged

No circularity; direct empirical comparison to external human corpora


The paper defines human-likeness via two-sample MMD distances on the fixed, externally established set of 67 Biber lexico-grammatical features between LLM-generated texts and independent human reference corpora for each register. No equations, fitted parameters, self-referential definitions, or load-bearing self-citations appear; the reported register-dependent ordering of models follows immediately from these distance computations without any reduction of outputs to inputs by construction. The approach is therefore self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

Abstract-only review yields limited visibility into parameters or assumptions; the framework rests on the domain assumption that Biber features are sufficient proxies for register-specific human language production.

axioms (1)

domain assumption: Biber's 67 lexico-grammatical features capture the relevant frequency and co-occurrence patterns that distinguish registers in human language production.
The entire evaluation framework is built on this standard corpus-linguistic premise as stated in the abstract.


Reference graph

Works this paper leans on

136 extracted references · 51 canonical work pages · 15 internal anchors

[1]

Precision-Recall Curves Using Information Divergence Frontiers , url =

Josip Djolonga and Mario Lucic and Marco Cuturi and Olivier Bachem and Olivier Bousquet and Sylvain Gelly , bibsource =. Precision-Recall Curves Using Information Divergence Frontiers , url =. The 23rd International Conference on Artificial Intelligence and Statistics,
[2]

Ghostbuster: Detecting Text Ghostwritten by Large Language Models , url =

Verma, Vivek and Fleisig, Eve and Tomlin, Nicholas and Klein, Dan , booktitle =. Ghostbuster: Detecting Text Ghostwritten by Large Language Models , url =
[3]

Manning and Chelsea Finn , bibsource =

Eric Mitchell and Yoonho Lee and Alexander Khazatsky and Christopher D. Manning and Chelsea Finn , bibsource =. DetectGPT: Zero-Shot Machine-Generated Text Detection using Probability Curvature , url =. International Conference on Machine Learning,
[4]

Wu, Junchao and Yang, Shu and Zhan, Runzhe and Yuan, Yulin and Chao, Lidia Sam and Wong, Derek Fai , doi =. A. Computational Linguistics , language =
[5]

Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, NeurIPS 2021, December 6-14, 2021, virtual , editor =

Krishna Pillutla and Swabha Swayamdipta and Rowan Zellers and John Thickstun and Sean Welleck and Yejin Choi and Za. Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, NeurIPS 2021, December 6-14, 2021, virtual , editor =

2021
[6]

Zhao and Kelvin Guu and Adams Wei Yu and Brian Lester and Nan Du and Andrew M

Jason Wei and Maarten Bosma and Vincent Y. Zhao and Kelvin Guu and Adams Wei Yu and Brian Lester and Nan Du and Andrew M. Dai and Quoc V. Le , bibsource =. Finetuned Language Models are Zero-Shot Learners , url =. The Tenth International Conference on Learning Representations,
[7]

Linguistic

Li, Ziqi and Zhang, Qi , language =. Linguistic
[8]

Long Ouyang and Jeffrey Wu and Xu Jiang and Diogo Almeida and Carroll L. Wainwright and Pamela Mishkin and Chong Zhang and Sandhini Agarwal and Katarina Slama and Alex Ray and John Schulman and Jacob Hilton and Fraser Kelton and Luke Miller and Maddie Simens and Amanda Askell and Peter Welinder and Paul F. Christiano and Jan Leike and Ryan Lowe , bibsourc...

2022
[9]

Comparing

Zamaraeva, Olga and. Comparing. Proceedings of the 63rd
[10]

Bagdasarov, Sergei and Alves, Diego , booktitle =. Like a
[11]

Differentiating between human-written and

Georgiou, Georgios P , journal =. Differentiating between human-written and
[12]

Register

Myntti, Amanda and Henriksson, Erik and Laippala, Veronika and Pyysalo, Sampo , journal =. Register
[13]

Persona-

Truong, Kimberly Le and Fogliato, Riccardo and Heidari, Hoda and Wu, Zhiwei Steven , booktitle =. Persona-
[14]

Jin, Di and Pan, Eileen and Oufattole, Nassim and Weng, Wei-Hung and Fang, Hanyi and Szolovits, Peter , doi =. What. Applied Sciences , language =
[15]

Bowman , bibsource =

Alex Wang and Yada Pruksachatkun and Nikita Nangia and Amanpreet Singh and Julian Michael and Felix Hill and Omer Levy and Samuel R. Bowman , bibsource =. SuperGLUE:. Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, NeurIPS 2019, December 8-14, 2019, Vancouver, BC, Canada , editor =

2019
[16]

Es, Shahul and James, Jithin and Espinosa Anke, Luis and Schockaert, Steven , booktitle =
[17]

Measuring Massive Multitask Language Understanding , url =

Dan Hendrycks and Collin Burns and Steven Basart and Andy Zou and Mantas Mazeika and Dawn Song and Jacob Steinhardt , bibsource =. Measuring Massive Multitask Language Understanding , url =. 9th International Conference on Learning Representations,
[19]

Proceedings of the

Yadagiri, Annepaka and. Proceedings of the
[20]

Veirano Pinto, Marcia , doi =. Elena. English Language and Linguistics , language =
[21]

Register as a predictor of linguistic variation , url =

Biber,, Douglas , doi =. Register as a predictor of linguistic variation , url =. Corpus Linguistics and Linguistic Theory , language =
[22]

Register, genre, and style , year =

Biber, Douglas and Conrad, Susan , doi =. Register, genre, and style , year =
[23]

Neurobiber:

Alkiek, Kenan and Wegmann, Anna and Zhu, Jian and Jurgens, David , language =. Neurobiber:
[24]

and Aroyehun, Segun , journal =

Zanotto, Sergio E. and Aroyehun, Segun , journal =. Human
[25]

Comparative linguistic analysis framework of human-written vs

Culda, Lia Cornelia and Nerişanu, Raluca Andreea and Cristescu, Marian Pompiliu and Mara, Dumitru Alexandru and Bâra, Adela and Oprea, Simona-Vasilica , doi =. Comparative linguistic analysis framework of human-written vs. machine-generated text , url =. Connection Science , language =
[26]

, journal =

Paech, Samuel J. , journal =
[27]

Wang, Zhengxiang and Tripto, Nafis Irtiza and Park, Solha and Li, Zhenzhen and Zhou, Jiawei , language =. Catch
[28]

M4: Multi-generator, Multi-domain, and Multi-lingual Black-Box Machine-Generated Text Detection , url =

Wang, Yuxia and Mansurov, Jonibek and Ivanov, Petar and Su, Jinyan and Shelmanov, Artem and Tsvigun, Akim and Whitehouse, Chenxi and Mohammed Afzal, Osama and Mahmoud, Tarek and Sasaki, Toru and Arnold, Thomas and Aji, Alham Fikri and Habash, Nizar and Gurevych, Iryna and Nakov, Preslav , booktitle =. M4: Multi-generator, Multi-domain, and Multi-lingual B...
[30]

Liu, Jae Q. J. and Hui, Kelvin T. K. and Al Zoubi, Fadi and Zhou, Zing Z. X. and Samartzis, Dino and Yu, Curtis C. H. and Chang, Jeremy R. and Wong, Arnold Y. L. , doi =. The great detectives: humans versus. International Journal for Educational Integrity , language =
[31]

Stylometry can reveal artificial intelligence authorship, but humans struggle:

Zaitsu, Wataru and Jin, Mingzhe and Ishihara, Shunichi and Tsuge, Satoru and Inaba, Mitsuyuki , doi =. Stylometry can reveal artificial intelligence authorship, but humans struggle:. PLOS One , language =
[32]

Stylometry

Przystalski, Karol and Argasiński, Jan and Grabska-Gradzińska, Iwona and Ochab, Jeremi , doi =. Stylometry
[33]

Nature , language =

Shumailov, Ilia and Shumaylov, Zakhar and Zhao, Yiren and Papernot, Nicolas and Anderson, Ross and Gal, Yarin , doi =. Nature , language =
[34]

Srikant , bibsource =

Shiyu Liang and Yixuan Li and R. Srikant , bibsource =. Enhancing The Reliability of Out-of-distribution Image Detection in Neural Networks , url =. 6th International Conference on Learning Representations,
[35]

Roy and Zoubin Ghahramani , bibsource =

Gintare Karolina Dziugaite and Daniel M. Roy and Zoubin Ghahramani , bibsource =. Training generative neural networks via Maximum Mean Discrepancy optimization , url =. Proceedings of the Thirty-First Conference on Uncertainty in Artificial Intelligence,
[36]

Jordan , bibsource =

Mingsheng Long and Yue Cao and Jianmin Wang and Michael I. Jordan , bibsource =. Learning Transferable Features with Deep Adaptation Networks , url =. Proceedings of the 32nd International Conference on Machine Learning,
[37]

Zhu, Yongchun and Zhuang, Fuzhen and Wang, Jindong and Ke, Guolin and Chen, Jingwu and Bian, Jiang and Xiong, Hui and He, Qing , doi =. Deep. IEEE Transactions on Neural Networks and Learning Systems , keywords =
[38]

and Gil, María Victoria and Glaubitz, Christina and Greiner, Maximilian and Holick, Caroline T

Mirza, Adrian and Alampara, Nawaf and Kunchapu, Sreekanth and Ríos-García, Martiño and Emoekabu, Benedict and Krishnan, Aswanth and Gupta, Tanya and Schilling-Wilhelmi, Mara and Okereke, Macjonathan and Aneesh, Anagha and Asgari, Mehrdad and Eberhardt, Juliane and Elahi, Amir Mohammad and Elbeheiry, Hani M. and Gil, María Victoria and Glaubitz, Christina ...
[40]

Ho and Christopher R

Neel Guha and Julian Nyarko and Daniel E. Ho and Christopher R. LegalBench:. Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, NeurIPS 2023, New Orleans, LA, USA, December 10 - 16, 2023 , editor =

2023
[41]

Bowman , bibsource =

Alex Wang and Amanpreet Singh and Julian Michael and Felix Hill and Omer Levy and Samuel R. Bowman , bibsource =. 7th International Conference on Learning Representations,
[42]

Variation across speech and writing , year =

Biber, Douglas , publisher =. Variation across speech and writing , year =
[43]

Yang, An and Li, Anfeng and Yang, Baosong and Zhang, Beichen and Hui, Binyuan and Zheng, Bo and Yu, Bowen and Gao, Chang and Huang, Chengen and Lv, Chenxu and Zheng, Chujie and Liu, Dayiheng and Zhou, Fan and Huang, Fei and Hu, Feng and Ge, Hao and Wei, Haoran and Lin, Huan and Tang, Jialong and Yang, Jian and Tu, Jianhong and Zhang, Jianwei and Yang, Jia...
[44]

Grattafiori, Aaron and Dubey, Abhimanyu and Jauhri, Abhinav and Pandey, Abhinav and Kadian, Abhishek and Al-Dahle, Ahmad and Letman, Aiesha and Mathur, Akhil and Schelten, Alan and Vaughan, Alex and Yang, Amy and Fan, Angela and Goyal, Anirudh and Hartshorn, Anthony and Yang, Aobo and Mitra, Archi and Sravankumar, Archie and Korenev, Artem and Hinsvark, A...
[45]

and Carey, C

Team, Gemma and Kamath, Aishwarya and Ferret, Johan and Pathak, Shreya and Vieillard, Nino and Merhej, Ramona and Perrin, Sarah and Matejovicova, Tatiana and Ramé, Alexandre and Rivière, Morgane and Rouillard, Louis and Mesnard, Thomas and Cideron, Geoffrey and Grill, Jean-bastien and Ramos, Sabela and Yvinec, Edouard and Casbon, Michelle and Pot, Etienne...
[46]

Apertus:

Apertus, Project and Hernández-Cano, Alejandro and Hägele, Alexander and Huang, Allen Hao and Romanou, Angelika and Solergibert, Antoni-Joan and Pasztor, Barna and Messmer, Bettina and Garbaya, Dhia and Ďurech, Eduard Frank and Hakimi, Ido and Giraldo, Juan García and Ismayilzada, Mete and Foroutan, Negar and Moalla, Skander and Chen, Tiancheng and Sabolč...
[50]

Koupaee, Mahnaz and Wang, William Yang , journal =
[52]

and Bergen, Benjamin K

Chang, Tyler A. and Bergen, Benjamin K. , doi =. Language Model Behavior: A Comprehensive Survey , url =. Computational Linguistics , number =
[53]

and Macke, Jakob H

Bischoff, Sebastian and Darcher, Alana and Deistler, Michael and Gao, Richard and Gerken, Franziska and Gloeckler, Manuel and Haxel, Lisa and Kapoor, Jaivardhan and Lappalainen, Janne K. and Macke, Jakob H. and Moss, Guy and Pals, Matthijs and Pei, Felix and Rapp, Rachel and Sağtekin, A. Erdem and Schröder, Cornelius and Schulz, Auguste and Stefanidi, Zin...
[54]

Ramdas, Aaditya and Garcia, Nicolas and Cuturi, Marco , journal =. On
[55]

and Rasch, Malte J

Gretton, Arthur and Borgwardt, Karsten M. and Rasch, Malte J. and Schölkopf, Bernhard and Smola, Alexander , journal =. A kernel two-sample test , url =
[56]

Personalized Text Generation with Fine-Grained Linguistic Control , url =

Alhafni, Bashar and Kulkarni, Vivek and Kumar, Dhruv and Raheja, Vipul , booktitle =. Personalized Text Generation with Fine-Grained Linguistic Control , url =
[57]

Linguistic

Terčon, Luka and Dobrovoljc, Kaja , language =. Linguistic
[58]

Reinhart, Alex and Markey, Ben and Laudenbach, Michael and Pantusen, Kachatad and Yurko, Ronald and Weinberg, Gordon and Brown, David West , journal =. Do
[59]

Benchmark of stylistic variation in

Milička, Jiří and Marklová, Anna and Cvrček, Václav , journal =. Benchmark of stylistic variation in
[60]

Applied Corpus Linguistics , language =

Berber Sardinha, Tony , doi =. Applied Corpus Linguistics , language =
[61]

Milička, Jiří and Marklová, Anna and Cvrček, Václav , journal =
[62]

Precision-

Djolonga, Josip and Lucic, Mario and Cuturi, Marco and Bachem, Olivier and Bousquet, Olivier and Gelly, Sylvain , editor =. Precision-. Proceedings of the. 2020 , pages =

2020
[63]

Ghostbuster:

Verma, Vivek and Fleisig, Eve and Tomlin, Nicholas and Klein, Dan , year =. Ghostbuster:. Proceedings of the 2024. doi:10.18653/v1/2024.naacl-long.95 , abstract =

work page doi:10.18653/v1/2024.naacl-long.95 2024
[65]

Proceedings of the 35th

Pillutla, Krishna and Swayamdipta, Swabha and Zellers, Rowan and Thickstun, John and Welleck, Sean and Choi, Yejin and Harchaoui, Zaid , year =. Proceedings of the 35th
[66]

International

Wei, Jason and Bosma, Maarten and Zhao, Vincent Y and Guu, Kelvin and Yu, Adams Wei and Lester, Brian and Du, Nan and Dai, Andrew M and Le, Quoc V , year =. International
[67]

Linguistic

Li, Ziqi and Zhang, Qi , year =. Linguistic
[68]

Ouyang, Long and Wu, Jeff and Jiang, Xu and Almeida, Diogo and Wainwright, Carroll L and Mishkin, Pamela and Zhang, Chong and Agarwal, Sandhini and Slama, Katarina and Ray, Alex and Schulman, John and Hilton, Jacob and Kelton, Fraser and Miller, Luke and Simens, Maddie and Askell, Amanda and Welinder, Peter and Christiano, Paul and Leike, Jan and Lowe, Ryan. Training language models to follow instructions with human feedback.
[69]

Bagdasarov, Sergei and Alves, Diego. Like a ... Proceedings of the ...
[70]

Georgiou, Georgios P. Differentiating between human-written and ... Information.
[71]

Myntti, Amanda and Henriksson, Erik and Laippala, Veronika and Pyysalo, Sampo. Register Always Matters: Analysis of LLM Pretraining Data Through the Lens of Language Variation. doi:10.48550/arXiv.2504.01542
[72]

Truong, Kimberly Le and Fogliato, Riccardo and Heidari, Hoda and Wu, Zhiwei Steven. Persona-... Proceedings of the 2025 ..., 2025.
[74]

Es, Shahul and James, Jithin and Espinosa-Anke, Luis and Schockaert, Steven. System ...
[75]

Hendrycks, Dan and Burns, Collin and Basart, Steven and Zou, Andy and Mazeika, Mantas and Song, Dawn and Steinhardt, Jacob. Measuring Massive Multitask Language Understanding. doi:10.48550/arXiv.2009.03300
[80]

Zanotto, Sergio E. and Aroyehun, Segun. Human ... doi:10.48550/arXiv.2412.03025
[81]

Culda, Lia Cornelia and Nerişanu, Raluca Andreea and Cristescu, Marian Pompiliu and Mara, Dumitru Alexandru and Bâra, Adela and Oprea, Simona-Vasilica. Comparative linguistic analysis framework of human-written vs. machine-generated text. Connection Science, 2025. doi:10.1080/09540091.2025.2507183
[82]

Paech, Samuel J. doi:10.48550/arXiv.2312.06281
[83]

Mosca, Edoardo and Abdalla, Mohamed Hesham Ibrahim and Basso, Paolo and Musumeci, Margherita and Groh, Georg. Distinguishing ... Proceedings of the 3rd ..., 2023. doi:10.18653/v1/2023.trustnlp-1.17
[84]

Liu, Jae Q. J. and Hui, Kelvin T. K. and Al Zoubi, Fadi and Zhou, Zing Z. X. and Samartzis, Dino and Yu, Curtis C. H. and Chang, Jeremy R. and Wong, Arnold Y. L. The great detectives: humans versus ... International Journal for Educational Integrity, 2024. doi:10.1007/s40979-024-00155-6
[87]

Shumailov, Ilia and Shumaylov, Zakhar and Zhao, Yiren and Papernot, Nicolas and Anderson, Ross and Gal, Yarin. Nature, 2024. doi:10.1038/s41586-024-07566-y
[88]

Liang, Shiyu and Li, Yixuan and Srikant, R. Enhancing The Reliability of Out-of-distribution Image Detection in Neural Networks.
[89]

Dziugaite, Gintare Karolina and Roy, Daniel M. and Ghahramani, Zoubin. Training generative neural networks via Maximum Mean Discrepancy optimization. doi:10.48550/arXiv.1505.03906
[90]

Long, Mingsheng and Cao, Yue and Wang, Jianmin and Jordan, Michael I. Learning Transferable Features with Deep Adaptation Networks. doi:10.48550/arXiv.1502.02791
[93]

Zellers, Rowan and Holtzman, Ari and Bisk, Yonatan and Farhadi, Ali and Choi, Yejin. Proceedings of the 57th ... doi:10.18653/v1/P19-1472
[94]

Legalbench: ... SSRN Electronic Journal. doi:10.2139/ssrn.4583531
[95]

Wang, Alex and Singh, Amanpreet and Michael, Julian and Hill, Felix and Levy, Omer and Bowman, Samuel. Proceedings of the 2018 ..., 2018. doi:10.18653/v1/W18-5446
[96]

Biber, Douglas. Variation across speech and writing.
[97]

Yang, An and Li, Anfeng and Yang, Baosong and Zhang, Beichen and Hui, Binyuan and Zheng, Bo and Yu, Bowen and Gao, Chang and Huang, Chengen and Lv, Chenxu and Zheng, Chujie and Liu, Dayiheng and Zhou, Fan and Huang, Fei and Hu, Feng and Ge, Hao and Wei, Haoran and Lin, Huan and Tang, Jialong and Yang, Jian and Tu, Jianhong and Zhang, Jianwei and Yang, Jia...

doi:10.48550/arXiv.2505.09388
