DSIPA: Detecting LLM-Generated Texts via Sentiment-Invariant Patterns Divergence Analysis
Pith reviewed 2026-05-07 13:07 UTC · model grok-4.3
The pith
DSIPA detects machine-generated text by measuring the stability of sentiment distributions when writing style is deliberately varied.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
By quantifying sentiment distributional stability under controlled stylistic variation using two unsupervised metrics, DSIPA captures the greater emotional consistency typical of LLM outputs compared to the affective variation in human texts, enabling zero-shot detection without parameter access or training data.
What carries the argument
Sentiment distribution consistency and preservation metrics applied under controlled stylistic variation to reveal behavioral differences.
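The abstract names the two metrics but not their formulas. As a hedged illustration only, the idea can be sketched with Jensen-Shannon divergence over a hypothetical three-class (negative/neutral/positive) sentiment distribution; all function names and the example distributions below are invented for exposition, not taken from the paper.

```python
import math

def js_divergence(p, q):
    """Jensen-Shannon divergence (log base 2) between two discrete distributions."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    def kl(a, b):
        return sum(ai * math.log2(ai / bi) for ai, bi in zip(a, b) if ai > 0)
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

def sentiment_consistency(dists):
    """Mean pairwise JS divergence across the sentiment distributions of
    stylistic variants; lower values mean more stable (LLM-like) sentiment."""
    pairs = [(i, j) for i in range(len(dists)) for j in range(i + 1, len(dists))]
    return sum(js_divergence(dists[i], dists[j]) for i, j in pairs) / len(pairs)

def sentiment_preservation(original, variant):
    """How much of the original sentiment distribution one variant preserves."""
    return 1.0 - js_divergence(original, variant)

# Hypothetical (neg, neu, pos) distributions for three stylistic rewrites.
stable = [[0.1, 0.2, 0.7], [0.12, 0.18, 0.7], [0.1, 0.22, 0.68]]      # LLM-like
volatile = [[0.6, 0.2, 0.2], [0.1, 0.2, 0.7], [0.3, 0.5, 0.2]]        # human-like

assert sentiment_consistency(stable) < sentiment_consistency(volatile)
```

A detector in this spirit would threshold such scores: low pairwise divergence across rewrites signals machine-like affective stability. The paper's actual definitions may differ.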
If this is right
- It achieves higher detection accuracy than prior methods across news, code, essays, papers, and comments.
- The approach generalizes well to different models and domains.
- It maintains performance even when text is adversarially modified or paraphrased.
- No labeled datasets or model internals are required for operation.
Where Pith is reading between the lines
- One could test whether increasing emotional variation in LLM outputs reduces detectability by this method.
- The framework might integrate with other detection signals for hybrid systems.
- It raises the question of whether similar stability patterns exist in other attributes like factual consistency.
- Applications could include automated verification in publishing or social media.
Load-bearing premise
LLMs exhibit more emotionally consistent outputs than human-written texts do.
What would settle it
Demonstrating no significant difference in sentiment stability between LLM-generated and human texts after applying the same stylistic variations would falsify the detection premise.
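That falsification test is straightforward to operationalize. As a sketch under assumptions (per-text stability scores already computed; the score values below are fabricated for illustration), a two-sided permutation test on the group means would detect, or fail to detect, the claimed asymmetry:

```python
import random

def permutation_test(llm_scores, human_scores, n_perm=2000, seed=0):
    """Two-sided permutation test for a difference in mean sentiment-stability
    scores between LLM-generated and human texts. A large p-value under matched
    stylistic variation would undercut the detection premise."""
    rng = random.Random(seed)
    observed = abs(sum(llm_scores) / len(llm_scores)
                   - sum(human_scores) / len(human_scores))
    pooled = list(llm_scores) + list(human_scores)
    n = len(llm_scores)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        diff = abs(sum(pooled[:n]) / n - sum(pooled[n:]) / len(pooled[n:]))
        if diff >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)  # add-one smoothing

# Hypothetical stability scores (higher = more sentiment-stable).
llm = [0.92, 0.88, 0.95, 0.90, 0.93, 0.89, 0.94, 0.91]
human = [0.55, 0.70, 0.62, 0.48, 0.75, 0.58, 0.66, 0.60]
p = permutation_test(llm, human)  # small here, since the groups are well separated
```

If real human and LLM corpora produced a p-value near 1 under the same stylistic perturbations, the load-bearing premise would not survive.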
Original abstract
The rapid advancement of large language models (LLMs) presents new security challenges, particularly in detecting machine-generated text used for misinformation, impersonation, and content forgery. Most existing detection approaches struggle with robustness against adversarial perturbation, paraphrasing attacks, and domain shifts, often requiring restrictive access to model parameters or large labeled datasets. To address this, we propose DSIPA, a novel training-free framework that detects LLM-generated content by quantifying sentiment distributional stability under controlled stylistic variation. It is based on the observation that LLMs typically exhibit more emotionally consistent outputs, while human-written texts display greater affective variation. Our framework operates in a zero-shot, black-box manner, leveraging two unsupervised metrics, sentiment distribution consistency and sentiment distribution preservation, to capture these intrinsic behavioral asymmetries without the need for parameter updates or probability access. Extensive experiments are conducted on state-of-the-art proprietary and open-source models, including GPT-5.2, Gemini-1.5-pro, Claude-3, and LLaMa-3.3. Evaluations on five domains, such as news articles, programming code, student essays, academic papers, and community comments, demonstrate that DSIPA improves F1 detection scores by up to 49.89% over baseline methods. The framework exhibits superior generalizability across domains and strong resilience to adversarial conditions, providing a robust and interpretable behavioral signal for secure content identification in the evolving LLM landscape.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper proposes DSIPA, a training-free, zero-shot, black-box framework for detecting LLM-generated texts. It quantifies two unsupervised metrics—sentiment distribution consistency and sentiment distribution preservation—under controlled stylistic variation, based on the premise that LLMs produce more emotionally consistent outputs than human-written texts. Experiments on models including GPT-5.2, Gemini-1.5-pro, Claude-3, and LLaMA-3.3 across five domains (news, code, essays, papers, comments) claim up to 49.89% F1 improvement over baselines, plus superior generalizability and adversarial resilience.
Significance. If the core asymmetry and metrics prove robust, DSIPA would supply a parameter-free, interpretable behavioral signal that avoids model-parameter access and large labeled datasets, addressing key limitations of current detectors. The training-free design and claimed resilience to paraphrasing/domain shifts represent a genuine strength worth validating.
major comments (2)
- Abstract: the central performance claim (up to 49.89% F1 lift) is stated without naming the baselines, datasets, statistical tests, or error bars; this prevents any assessment of whether the reported gains are load-bearing or artifacts of the chosen sentiment analyzer and text-length confounds.
- Abstract / §2 (Observation): the foundational premise that 'LLMs typically exhibit more emotionally consistent outputs, while human-written texts display greater affective variation' is asserted directly but without supporting statistics (per-domain variance ratios, significance tests, or corpus-level comparisons); because the two metrics are constructed to exploit exactly this asymmetry, its empirical weakness would render the separation unreliable.
minor comments (1)
- The abstract mentions 'extensive experiments' on proprietary and open-source models but supplies no table or section reference for the exact domain splits, attack types, or metric definitions; adding these cross-references would improve readability.
Simulated Author's Rebuttal
We thank the referee for the constructive comments. We address each major comment point by point below, indicating where revisions will be made to strengthen the manuscript.
Point-by-point responses
-
Referee: Abstract: the central performance claim (up to 49.89% F1 lift) is stated without naming the baselines, datasets, statistical tests, or error bars; this prevents any assessment of whether the reported gains are load-bearing or artifacts of the chosen sentiment analyzer and text-length confounds.
Authors: We agree that the abstract would benefit from greater specificity. In the revised manuscript, we will update the abstract to name the baseline detectors (including DetectGPT, GPTZero, and the other methods compared in our experiments), specify the five evaluation domains and corresponding datasets, report that F1 gains are accompanied by standard deviations across repeated runs, and note that statistical significance was evaluated via paired t-tests. To address potential confounds, our experiments already match text-length distributions between human and LLM samples and validate results across multiple sentiment analyzers; we will make these design choices explicit in the abstract and §4. revision: yes
-
Referee: Abstract / §2 (Observation): the foundational premise that 'LLMs typically exhibit more emotionally consistent outputs, while human-written texts display greater affective variation' is asserted directly but without supporting statistics (per-domain variance ratios, significance tests, or corpus-level comparisons); because the two metrics are constructed to exploit exactly this asymmetry, its empirical weakness would render the separation unreliable.
Authors: We acknowledge that the premise is stated in the abstract and at the start of §2 without immediate quantitative backing. Although the effectiveness of the derived metrics is shown through the full experimental results, we agree that explicit support for the underlying asymmetry would improve clarity and rigor. We will revise §2 to add per-domain sentiment variance ratios (LLM vs. human), results of statistical tests for variance differences, and corpus-level comparisons across the five domains. These additions will be placed immediately after the observation statement. revision: yes
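The variance ratios the authors promise for §2 are simple to compute once per-text sentiment scores exist. A minimal sketch, assuming hypothetical compound sentiment scores in [-1, 1] (the domain names and all numbers below are illustrative, not the paper's data):

```python
from statistics import variance

def sentiment_variance_ratio(human_scores, llm_scores):
    """Ratio of human to LLM sentiment-score variance within one domain.
    Values well above 1 would support the asymmetry DSIPA's metrics rely on."""
    return variance(human_scores) / variance(llm_scores)

# Hypothetical per-domain compound sentiment scores (human, LLM).
domains = {
    "news":   ([-0.6, 0.3, 0.8, -0.2, 0.5], [0.10, 0.15, 0.12, 0.08, 0.11]),
    "essays": ([-0.4, 0.7, 0.2, -0.8, 0.6], [0.20, 0.25, 0.18, 0.22, 0.21]),
}
ratios = {name: sentiment_variance_ratio(h, m) for name, (h, m) in domains.items()}
```

Reporting such ratios per domain, alongside a test for variance differences, would give the observation the quantitative backing the referee asks for.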
Circularity Check
No significant circularity; the framework applies an external behavioral observation through unsupervised metrics.
Full rationale
The paper presents DSIPA as a training-free, zero-shot detector that defines two unsupervised metrics (sentiment distribution consistency and preservation) to quantify stability under stylistic variation. This rests on the stated behavioral observation about LLM vs. human affective variation rather than any fitted parameter, self-referential definition, or self-citation chain. No equations appear in the abstract that reduce a claimed result to its inputs by construction, and the method does not rename known empirical patterns or smuggle ansatzes via prior work. The derivation chain is therefore self-contained as an application of the premise to new metrics, with no load-bearing step that collapses into tautology.
Axiom & Free-Parameter Ledger
axioms (1)
- Domain assumption: LLMs typically exhibit more emotionally consistent outputs than human-written texts under stylistic variation.
Forward citations
Cited by 1 Pith paper
- Lightweight Stylistic Consistency Profiling: Robust Detection of LLM-Generated Textual Content for Multimedia Moderation. LiSCP detects LLM-generated text via stylistic consistency profiling across paraphrased variants and reports up to 11.79% better cross-domain accuracy plus robustness to adversarial attacks.
Reference graph
Works this paper leans on
- [1] E. Lei, H. Hsu, and C.-F. Chen, "PaLD: Detection of text partially written by large language models," in The Thirteenth International Conference on Learning Representations, 2025.
- [2] J. Jiang, F. Wang, J. Shen, S. Kim, and S. Kim, "A survey on large language models for code generation," arXiv preprint arXiv:2406.00515, 2024.
- [3] J. Yang, H. Jin, R. Tang, X. Han, Q. Feng, H. Jiang, S. Zhong, B. Yin, and X. Hu, "Harnessing the power of LLMs in practice: A survey on ChatGPT and beyond," ACM Transactions on Knowledge Discovery from Data, 2023.
- [4] P. Zhou, L. Wang, Z. Liu, Y. Hao, P. Hui, S. Tarkoma, and J. Kangasharju, "A survey on generative AI and LLM for video generation, understanding, and streaming," arXiv preprint arXiv:2404.16038, 2024.
- [5] S. Li, X. Lin, Y. Liu, X. Chen, and J. Li, "Trustworthy AI-generative content for intelligent network service: Robustness, security, and fairness," arXiv preprint arXiv:2405.05930, 2024.
- [6] Z. Chu, S. Wang, J. Xie, T. Zhu, Y. Yan, J. Ye, A. Zhong, X. Hu, J. Liang, P. S. Yu et al., "LLM agents for education: Advances and applications," arXiv preprint arXiv:2503.11733, vol. 2, 2025.
- [7] Y. Liu, S. Li, X. Lin, X. Chen, G. Li, Y. Liu, B. Liao, and J. Li, "QoS-aware multi-AIGC service orchestration at edges: An attention-diffusion-aided DRL method," IEEE Transactions on Cognitive Communications and Networking, vol. 11, no. 2, pp. 1078–1090, 2025.
- [8] X. Su, Q. Mao, Z. Wu, X. Lin, S. You, Y. Liao, and C. Xu, "Large language models driven neural architecture search for universal and lightweight disease diagnosis on histopathology slide images," npj Digital Medicine, vol. 8, no. 1, p. 682, 2025.
- [9] J. Lee, T. Le, J. Chen, and D. Lee, "Do language models plagiarize?" in Proceedings of the ACM Web Conference 2023, 2023, pp. 3637–3647.
- [10] J. Pu, Z. Sarwar, S. M. Abdullah, A. Rehman, Y. Kim, P. Bhattacharya, M. Javed, and B. Viswanath, "Deepfake text detection: Limitations and opportunities," in 2023 IEEE Symposium on Security and Privacy (SP). IEEE, 2023, pp. 1613–1630.
- [11] S. Li, X. Lin, J. Wu, Z. Liu, H. Li, T. Ju, X. Chen, and J. Li, "HoneyTrap: Deceiving large language model attackers to honeypot traps with resilient multi-agent defense," arXiv preprint arXiv:2601.04034, 2026.
- [12] J. Wu, R. Zhan, D. Wong, S. Yang, X. Yang, Y. Yuan, and L. Chao, "DetectRL: Benchmarking LLM-generated text detection in real-world scenarios," Advances in Neural Information Processing Systems, vol. 37, pp. 100369–100401, 2024.
- [13] M. Jakesch, J. T. Hancock, and M. Naaman, "Human heuristics for AI-generated language are flawed," Proceedings of the National Academy of Sciences, vol. 120, no. 11, p. e2208839120, 2023.
- [14] J. Cornelius, O. Lithgow-Serrano, S. Mitrović, L. Dolamic, and F. Rinaldi, "BUST: Benchmark for the evaluation of detectors of LLM-generated text," in Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers), 2024, pp. 8022–8050.
- [15] C. Chen and J.-K. Wang, "Online detection of LLM-generated texts via sequential hypothesis testing by betting," in Forty-second International Conference on Machine Learning, 2025.
- [16] E. Mitchell, Y. Lee, A. Khazatsky, C. D. Manning, and C. Finn, "DetectGPT: Zero-shot machine-generated text detection using probability curvature," in International Conference on Machine Learning. PMLR, 2023, pp. 24950–24962.
- [17] X. Hu, P.-Y. Chen, and T.-Y. Ho, "RADAR: Robust AI-text detection via adversarial learning," Advances in Neural Information Processing Systems, vol. 36, 2024.
- [18] V. S. Sadasivan, A. Kumar, S. Balasubramanian, W. Wang, and S. Feizi, "Can AI-generated text be reliably detected?" arXiv preprint arXiv:2303.11156, 2023.
- [19] S. Abdelnabi and M. Fritz, "Adversarial watermarking transformer: Towards tracing text provenance with data hiding," in 2021 IEEE Symposium on Security and Privacy (SP). IEEE, 2021, pp. 121–140.
- [20] Y. Fu, D. Xiong, and Y. Dong, "Watermarking conditional text generation for AI detection: Unveiling challenges and a semantic-aware watermark remedy," in Proceedings of the 38th AAAI Conference on Artificial Intelligence (AAAI 2024), 2024.
- [21] J. Kirchenbauer, J. Geiping, Y. Wen, M. Shu, K. Saifullah, K. Kong, K. Fernando, A. Saha, M. Goldblum, and T. Goldstein, "On the reliability of watermarks for large language models," in International Conference on Learning Representations, 2024.
- [22] I. Solaiman, M. Brundage, J. Clark, A. Askell, A. Herbert-Voss, J. Wu, A. Radford, G. Krueger, J. W. Kim, S. Kreps et al., "Release strategies and the social impacts of language models," arXiv preprint arXiv:1908.09203, 2019.
- [23] R. R. Soto, K. Koch, A. Khan, B. Chen, M. Bishop, and N. Andrews, "Few-shot detection of machine-generated text using style representations," in International Conference on Learning Representations, 2024.
- [24] S. Li, A. Wulianghai, X. Lin, G. Li, X. Chen, J. Wu, and J. Li, "StyleDecipher: Robust and explainable detection of LLM-generated texts with stylistic analysis," arXiv preprint arXiv:2510.12608, 2025.
- [25] G. Bao, Y. Zhao, Z. Teng, L. Yang, and Y. Zhang, "Fast-DetectGPT: Efficient zero-shot detection of machine-generated text via conditional probability curvature," in International Conference on Learning Representations, 2024.
- [26] Y. Tian, H. Chen, X. Wang, Z. Bai, Q. Zhang, R. Li, C. Xu, and Y. Wang, "Multiscale positive-unlabeled detection of AI-generated texts," in International Conference on Learning Representations, 2024.
- [27] K. Krishna, Y. Song, M. Karpinska, J. Wieting, and M. Iyyer, "Paraphrasing evades detectors of AI-generated text, but retrieval is an effective defense," Advances in Neural Information Processing Systems, vol. 36, 2024.
- [28] S. Li, X. Lin, G. Li, Z. Liu, A. Wulianghai, L. Ding, J. Wu, and J. Li, "Model-agnostic sentiment distribution stability analysis for robust LLM-generated texts detection," in Proceedings of the AAAI Conference on Artificial Intelligence, vol. 40, no. 42, 2026, pp. 35608–35616.
- [29] X. Yang, W. Cheng, L. Petzold, W. Y. Wang, and H. Chen, "DNA-GPT: Divergent n-gram analysis for training-free detection of GPT-generated text," arXiv preprint arXiv:2305.17359, 2023.
- [30] X. Deng, V. Bashlovkina, F. Han, S. Baumgartner, and M. Bendersky, "LLMs to the moon? Reddit market sentiment analysis with large language models," in Companion Proceedings of the ACM Web Conference 2023, 2023, pp. 1014–1019.
- [31] H. Naveed, A. U. Khan, S. Qiu, M. Saqib, S. Anwar, M. Usman, N. Akhtar, N. Barnes, and A. Mian, "A comprehensive overview of large language models," arXiv preprint arXiv:2307.06435, 2023.
- [32] S. Vijay, A. Priyanshu, and A. R. KhudaBukhsh, "When neutral summaries are not that neutral: Quantifying political neutrality in LLM-generated news summaries (student abstract)," in Proceedings of the AAAI Conference on Artificial Intelligence, vol. 39, no. 28, 2025, pp. 29514–29516.
- [33] X. Shen, Z. Chen, M. Backes, Y. Shen, and Y. Zhang, ""Do anything now": Characterizing and evaluating in-the-wild jailbreak prompts on large language models," 2024.
- [34] W. Zhang, X. Li, Y. Deng, L. Bing, and W. Lam, "A survey on aspect-based sentiment analysis: Tasks, methods, and challenges," IEEE Transactions on Knowledge and Data Engineering, 2022.
- [35] L. Hu, Z. Liu, Z. Zhao, L. Hou, L. Nie, and J. Li, "A survey of knowledge enhanced pre-trained language models," IEEE Transactions on Knowledge and Data Engineering, 2023.
- [36] H. Pearce, B. Tan, B. Ahmad, R. Karri, and B. Dolan-Gavitt, "Examining zero-shot vulnerability repair with large language models," in 2023 IEEE Symposium on Security and Privacy (SP). IEEE, 2023, pp. 2339–2356.
- [37] N. Scherrer, C. Shi, A. Feder, and D. Blei, "Evaluating the moral beliefs encoded in LLMs," Advances in Neural Information Processing Systems, vol. 36, 2024.
- [38] S. Bubeck, V. Chandrasekaran, R. Eldan, J. Gehrke, E. Horvitz, E. Kamar, P. Lee, Y. T. Lee, Y. Li, S. Lundberg et al., "Sparks of artificial general intelligence: Early experiments with GPT-4," arXiv preprint arXiv:2303.12712, 2023.
- [39] S. Gehrmann, H. Strobelt, and A. M. Rush, "GLTR: Statistical detection and visualization of generated text," in Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics: System Demonstrations, 2019, pp. 111–116.
- [40] Y. Tay, D. Bahri, C. Zheng, C. Brunk, D. Metzler, and A. Tomkins, "Reverse engineering configurations of neural text generation models," arXiv preprint arXiv:2004.06201, 2020.
- [41] A. Uchendu, T. Le, K. Shu, and D. Lee, "Authorship attribution for neural text generation," in Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), 2020, pp. 8384–8395.
- [42] X. Yang, L. Pan, X. Zhao, H. Chen, L. Petzold, W. Y. Wang, and W. Cheng, "A survey on detection of LLMs-generated content," arXiv preprint arXiv:2310.15654, 2023.
- [43] J. Wu, S. Yang, R. Zhan, Y. Yuan, D. F. Wong, and L. S. Chao, "A survey on LLM-generated text detection: Necessity, methods, and future directions," arXiv preprint arXiv:2310.14724, 2023.
- [44] S. Chakraborty, A. Bedi, S. Zhu, B. An, D. Manocha, and F. Huang, "Position: On the possibilities of AI-generated text detection," in Forty-first International Conference on Machine Learning, 2024.
- [45] J. Kirchenbauer, J. Geiping, Y. Wen, J. Katz, I. Miers, and T. Goldstein, "A watermark for large language models," in International Conference on Machine Learning. PMLR, 2023, pp. 17061–17084.
- [46] A. Hans, A. Schwarzschild, V. Cherepanova, H. Kazemi, A. Saha, M. Goldblum, J. Geiping, and T. Goldstein, "Spotting LLMs with binoculars: Zero-shot detection of machine-generated text," in International Conference on Machine Learning. PMLR, 2024, pp. 17519–17537.
- [47] Y. Song, Z. Yuan, S. Zhang, Z. Fang, J. Yu, and F. Liu, "Deep kernel relative test for machine-generated text detection," in The Thirteenth International Conference on Learning Representations, 2025.
- [48] S. Zhang, Y. Song, J. Yang, Y. Li, B. Han, and M. Tan, "Detecting machine-generated texts by multi-population aware optimization for maximum mean discrepancy," in International Conference on Learning Representations, 2024.
- [49] E. Tulchinskii, K. Kuznetsov, L. Kushnareva, D. Cherniavskii, S. Nikolenko, E. Burnaev, S. Barannikov, and I. Piontkovskaya, "Intrinsic dimension estimation for robust detection of AI-generated texts," Advances in Neural Information Processing Systems, vol. 36, 2024.
- [50] C. Mao, C. Vondrick, H. Wang, and J. Yang, "Detecting generated text via rewriting," in The Twelfth International Conference on Learning Representations, 2024.
- [51] E. Dufraisse, A. Popescu, J. Tourille, A. Brun, and J. Deshayes, "MAD-TSC: A multilingual aligned news dataset for target-dependent sentiment classification," in 61st Annual Meeting of the Association for Computational Linguistics, 2023.
- [52] W. Zhang, Y. Deng, B. Liu, S. J. Pan, and L. Bing, "Sentiment analysis in the era of large language models: A reality check," in NAACL-HLT (Findings), 2024.
- [53] Q. Zhong, L. Ding, J. Liu, B. Du, and D. Tao, "Can ChatGPT understand too? A comparative study on ChatGPT and fine-tuned BERT," arXiv preprint arXiv:2302.10198, 2023.
- [54] C. Li, J. Wang, Y. Zhang, K. Zhu, W. Hou, J. Lian, F. Luo, Q. Yang, and X. Xie, "Large language models understand and can be enhanced by emotional stimuli," arXiv preprint arXiv:2307.11760, 2023.
- [55] V. Verma, E. Fleisig, N. Tomlin, and D. Klein, "Ghostbuster: Detecting text ghostwritten by large language models," in Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers), 2024, pp. 1702–1717.
- [56] Y. Liu, Z. Zhong, Y. Liao, Z. Sun, J. Zheng, J. Wei, Q. Gong, F. Tong, Y. Chen, Y. Zhang, and X. He, "On the generalization and adaptation ability of machine-generated text detectors in academic writing," in Proceedings of the 31st ACM SIGKDD Conference on Knowledge Discovery and Data Mining V.2, ser. KDD '25. New York, NY, USA: Association for Comp...
- [57] H. Ma, C. Zhang, Y. Bian, L. Liu, Z. Zhang, P. Zhao, S. Zhang, H. Fu, Q. Hu, and B. Wu, "Fairness-guided few-shot prompting for large language models," Advances in Neural Information Processing Systems, vol. 36, 2024.
- [58] M. Chen, J. Tworek, H. Jun, Q. Yuan, H. P. d. O. Pinto, J. Kaplan, H. Edwards, Y. Burda, N. Joseph, G. Brockman et al., "Evaluating large language models trained on code," arXiv preprint arXiv:2107.03374, 2021.
- [59] E. Tian, "GPTZero: An AI text detector," 2023. [Online]. Available: https://gptzero.me/
- [60] E. Troiano, R. Klinger, and S. Padó, "Lost in back-translation: Emotion preservation in neural machine translation," 2020.