GigaCheck: Detecting LLM-generated Content via Object-Centric Span Localization
Pith reviewed 2026-05-23 19:01 UTC · model grok-4.3
The pith
Treating generated text spans as objects lets a DETR-like model, combined with linguistic encoders, localize AI-generated intervals.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
By integrating a DETR-like vision model with linguistic encoders, GigaCheck achieves precise localization of AI-generated intervals through an object-centric treatment of text spans, transferring the robustness of visual object detection to the textual domain. The shared fine-tuned backbone delivers strong accuracy in both document classification and span localization, and the results indicate that DETR-style architectures generalize beyond pixel space to the regression of generated-text intervals.
What carries the argument
The object-centric span localization head that replaces bounding-box regression with token-span regression on embeddings produced by linguistic encoders.
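To make the box-to-span substitution concrete, here is a minimal sketch of how a token span could be encoded as a normalized (center, width) pair, the 1D analogue of a bounding box, and decoded back. The function names and the half-open span convention are assumptions for illustration; the paper's actual parameterization is not given in this summary.

```python
from typing import Tuple


def span_to_center_width(start: int, end: int, num_tokens: int) -> Tuple[float, float]:
    """Encode a half-open token span [start, end) as normalized (center, width)."""
    if not 0 <= start < end <= num_tokens:
        raise ValueError("span must satisfy 0 <= start < end <= num_tokens")
    center = (start + end) / 2 / num_tokens
    width = (end - start) / num_tokens
    return center, width


def center_width_to_span(center: float, width: float, num_tokens: int) -> Tuple[int, int]:
    """Decode a normalized (center, width) prediction back to a token span."""
    start = round((center - width / 2) * num_tokens)
    end = round((center + width / 2) * num_tokens)
    # Clamp to the document and keep at least one token in the span.
    start = max(0, min(start, num_tokens - 1))
    end = max(start + 1, min(end, num_tokens))
    return start, end
```

Normalizing by document length keeps the regression targets in [0, 1] regardless of how long the text is, which mirrors how image detectors normalize box coordinates by image size.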
If this is right
- Fine-tuned LLM representations support high-accuracy document-level authorship detection with limited data.
- The same representations enable accurate localization of generated intervals inside documents.
- Embeddings learned for one detection task transfer directly to the other task without retraining the backbone.
- DETR-style detection heads apply to non-image domains when the input is replaced by linguistic token sequences.
Where Pith is reading between the lines
- Localization output could feed directly into editing interfaces that rewrite only the flagged sections.
- The same object-centric head might be tested on generated code or dialogue turns to check whether the transfer holds for other sequence types.
- If the approach works, object-detection losses could replace standard boundary losses in tasks such as sentence or paragraph segmentation.
Load-bearing premise
Contiguous spans of generated text possess the same spatial and contextual coherence properties as visual objects, allowing an image-detection architecture to be repurposed for token-span regression without fundamental changes.
What would settle it
A localization benchmark in which standard sequence-labeling methods achieve higher span-level F1 than the DETR-adapted model would show that the object-centric transfer does not improve span localization.
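One way such a comparison could be scored is sketched below. The summary does not define span-level F1, so this sketch assumes a common convention: a predicted span counts as a true positive when it overlaps a not-yet-matched gold span with IoU at or above a threshold (0.5 here). The names `span_iou` and `span_f1` are hypothetical.

```python
from typing import List, Tuple

Span = Tuple[int, int]  # half-open token interval [start, end)


def span_iou(a: Span, b: Span) -> float:
    """Intersection over union of two 1D token intervals."""
    inter = max(0, min(a[1], b[1]) - max(a[0], b[0]))
    union = (a[1] - a[0]) + (b[1] - b[0]) - inter
    return inter / union if union > 0 else 0.0


def span_f1(pred: List[Span], gold: List[Span], iou_threshold: float = 0.5) -> float:
    """Span-level F1 with greedy one-to-one matching (assumed convention)."""
    unmatched = list(gold)
    true_pos = 0
    for p in pred:
        # Pick the remaining gold span that overlaps this prediction most.
        best = max(unmatched, key=lambda g: span_iou(p, g), default=None)
        if best is not None and span_iou(p, best) >= iou_threshold:
            unmatched.remove(best)
            true_pos += 1
    if true_pos == 0:
        return 0.0
    precision = true_pos / len(pred)
    recall = true_pos / len(gold)
    return 2 * precision * recall / (precision + recall)
```

Scoring both the DETR-adapted model and a sequence-labeling baseline with the same function and threshold would keep the comparison like for like.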
Original abstract
With the increasing quality and spread of LLM assistants, the amount of generated content is growing rapidly. In many cases and tasks, such texts are already indistinguishable from those written by humans, and the quality of generation continues to increase. At the same time, detection methods are advancing more slowly than generation models, making it challenging to prevent misuse of generative AI technologies. We propose GigaCheck, a dual-strategy framework for AI-generated text detection. At the document level, we leverage the representation learning of fine-tuned LLMs to discern authorship with high data efficiency. At the span level, we introduce a novel structural adaptation that treats generated text segments as "objects." By integrating a DETR-like vision model with linguistic encoders, we achieve precise localization of AI intervals, effectively transferring the robustness of visual object detection to the textual domain. Experimental results across three classification and three localization benchmarks confirm the robustness of our approach. The shared fine-tuned backbone delivers strong accuracy in both scenarios, highlighting the generalization power of the learned embeddings. Moreover, we successfully demonstrate that visual detection architectures like DETR are not limited to pixel space, effectively generalizing to the localization of generated text spans. To ensure reproducibility and foster further research, we publicly release our source code.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper proposes GigaCheck, a dual-strategy framework for LLM-generated text detection. Document-level detection uses fine-tuned LLMs for authorship classification with high data efficiency. Span-level detection treats generated text segments as 'objects' by integrating a DETR-like vision model with linguistic encoders for precise localization of AI intervals. It claims robust performance across three classification and three localization benchmarks, successful transfer of visual detection robustness to text, and releases code for reproducibility.
Significance. If the empirical results hold with proper controls, the work would demonstrate a viable transfer of set-prediction object detection to 1D token sequences, potentially improving span-level localization over standard sequence labeling. The shared backbone and public code release strengthen reproducibility and generalization claims.
major comments (2)
- [Model Architecture / Span Localization] The central claim (abstract and model description) that a DETR-style architecture can be directly repurposed for token-span regression 'without fundamental changes to the detection head or loss' rests on the untested axiom that generated text intervals possess the same spatial/contextual coherence properties as visual objects. Text is strictly sequential and 1D while DETR was designed for 2D spatial relations and bipartite matching over sets; the manuscript must provide ablations isolating the contribution of the vision transfer versus the fine-tuned LLM backbone, or explicit modifications to the detection head, to substantiate the transfer.
- [Experiments] Abstract and experimental claims assert 'strong accuracy' and 'robustness' across six benchmarks with successful vision-to-text generalization, yet no quantitative metrics, baseline comparisons, ablation tables, or error analysis are referenced in the provided summary. Without these, it is impossible to evaluate whether performance gains are load-bearing or attributable to post-hoc choices.
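The bipartite matching raised in the first major comment can be illustrated in 1D with a small sketch. It assumes a simplified cost (L1 distance on normalized endpoints plus one minus IoU) and a brute-force search over assignments; the paper's actual cost terms and solver are not stated in this summary, and all names here are hypothetical.

```python
from itertools import permutations
from typing import List, Sequence, Tuple

Span = Tuple[float, float]  # normalized (start, end) with start < end


def iou_1d(a: Span, b: Span) -> float:
    """Intersection over union of two 1D intervals."""
    inter = max(0.0, min(a[1], b[1]) - max(a[0], b[0]))
    union = (a[1] - a[0]) + (b[1] - b[0]) - inter
    return inter / union if union > 0 else 0.0


def match_cost(pred: Span, gold: Span) -> float:
    """Assumed pairwise cost: endpoint L1 distance plus (1 - IoU)."""
    l1 = abs(pred[0] - gold[0]) + abs(pred[1] - gold[1])
    return l1 + (1.0 - iou_1d(pred, gold))


def match_spans(preds: Sequence[Span], golds: Sequence[Span]) -> List[Tuple[int, int]]:
    """Return (pred_index, gold_index) pairs with minimal total cost.

    Brute force over assignments, so only suitable for tiny examples.
    """
    if len(golds) > len(preds):
        raise ValueError("need at least as many predictions as gold spans")
    best_cost, best_assign = float("inf"), ()
    for assign in permutations(range(len(preds)), len(golds)):
        cost = sum(match_cost(preds[p], golds[g]) for g, p in enumerate(assign))
        if cost < best_cost:
            best_cost, best_assign = cost, assign
    return [(p, g) for g, p in enumerate(best_assign)]
```

Predictions left unmatched would be treated as "no object" in a set-prediction loss. For realistic numbers of queries, SciPy's `scipy.optimize.linear_sum_assignment` solves the same assignment problem in polynomial time.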
minor comments (2)
- [Abstract] The abstract states results on 'three classification and three localization benchmarks' but does not name them or report numbers; adding this would improve clarity.
- [Methods] Notation for 'text objects' and the precise mapping from token spans to DETR queries/outputs should be defined explicitly in the methods section to avoid ambiguity.
Simulated Author's Rebuttal
We thank the referee for the constructive feedback. We address the two major comments point by point below and indicate where revisions will be made.
- Referee: [Model Architecture / Span Localization] The central claim (abstract and model description) that a DETR-style architecture can be directly repurposed for token-span regression 'without fundamental changes to the detection head or loss' rests on the untested axiom that generated text intervals possess the same spatial/contextual coherence properties as visual objects. Text is strictly sequential and 1D while DETR was designed for 2D spatial relations and bipartite matching over sets; the manuscript must provide ablations isolating the contribution of the vision transfer versus the fine-tuned LLM backbone, or explicit modifications to the detection head, to substantiate the transfer.
Authors: The manuscript presents the DETR-like component as a direct transfer that works empirically, but we agree that the claim would be strengthened by explicit ablations and a clearer description of any 1D adaptations to the matching loss or head. We will add an ablation study (comparing the full model against an LLM-backbone-only sequence labeling baseline) and expand the model section to detail the precise modifications made to the detection head and loss for token sequences.
Revision: yes
- Referee: [Experiments] Abstract and experimental claims assert 'strong accuracy' and 'robustness' across six benchmarks with successful vision-to-text generalization, yet no quantitative metrics, baseline comparisons, ablation tables, or error analysis are referenced in the provided summary. Without these, it is impossible to evaluate whether performance gains are load-bearing or attributable to post-hoc choices.
Authors: The full manuscript contains the requested quantitative results, baseline comparisons, and tables in Sections 4 and 5 (plus appendix), covering the six benchmarks. The abstract provides only a high-level summary, which is standard. We will add explicit cross-references from the abstract and introduction to the result tables and, if space permits, include a brief error analysis subsection.
Revision: partial
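For the sequence-labeling baseline mentioned in the first response, per-token predictions have to be turned into spans before they can be compared with the detection head's output. A minimal sketch of that conversion follows, assuming binary labels where 1 marks a generated token; the function name is hypothetical and the baseline's real post-processing is not described in this summary.

```python
from itertools import groupby
from typing import List, Tuple


def labels_to_spans(labels: List[int]) -> List[Tuple[int, int]]:
    """Collapse per-token 0/1 labels into half-open spans of consecutive 1s."""
    spans = []
    position = 0
    for value, run in groupby(labels):
        length = sum(1 for _ in run)
        if value == 1:
            spans.append((position, position + length))
        position += length
    return spans
```

For example, `labels_to_spans([0, 1, 1, 0, 0, 1])` yields two spans, one covering tokens 1 and 2 and one covering token 5.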
Circularity Check
No circularity: empirical model adaptation with external benchmarks
The paper is an empirical engineering contribution that proposes a dual-strategy detection framework, fine-tunes LLM backbones, and adapts a DETR-style architecture for token-span regression on text. No equations, first-principles derivations, or predictions are presented that reduce claimed performance to quantities defined by the authors' own fitted parameters or self-citations. Results are validated on three classification and three localization benchmarks, with the central claim resting on experimental outcomes rather than any self-referential construction. The adaptation of visual detection methods to text is framed as a transfer learning hypothesis tested empirically, not derived by definition from prior author work.
Axiom & Free-Parameter Ledger
axioms (2)
- domain assumption: Fine-tuned LLM representations capture authorship signals with high data efficiency
- ad hoc to paper: Visual object-detection architectures can be transferred to token sequences without fundamental redesign of the detection head
invented entities (1)
- text objects: no independent evidence