Topic-Based Watermarks for Large Language Models

Alexander Nemecek; Erman Ayday; Yuzhou Jiang

arXiv: 2404.02138 · v6 · submitted 2024-04-02 · cs.CR · cs.CL · cs.LG

Pith reviewed 2026-05-24 02:01 UTC · model grok-4.3

keywords: watermarking · large language models · topic-guided selection · AI-generated text · paraphrasing robustness · token subsets · green-listing · generation quality

The pith

A topic-guided scheme partitions LLM vocabulary into topic-aligned token subsets to embed watermarks that resist paraphrasing while preserving generation quality.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper presents a watermarking approach for large language models that identifies the topic of an input prompt and then favors tokens from a matching subset during text generation. This creates a detectable signature by green-listing semantically related tokens without requiring changes to the core generation process or extra frameworks. The authors argue this balances three goals that often conflict in prior methods: attack resistance, output fluency, and low overhead. Tests across multiple models and benchmarks are said to show quality on par with leading systems alongside better survival under paraphrasing and lexical changes. If the approach holds, it would allow straightforward addition of watermarks to standard pipelines for detecting AI-generated content.

Core claim

By partitioning the vocabulary into topic-specific subsets and selecting the relevant subset from the prompt to bias token probabilities toward aligned items, the method embeds marks that improve robustness to paraphrasing and lexical perturbations while matching the text quality of industry systems and adding negligible overhead, all without external mechanisms beyond normal generation.
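
The biasing step in this claim can be illustrated with a toy example. The following is a minimal sketch, not the authors' implementation: the six-token vocabulary, the two topic lists, and the helper names `bias_logits` and `softmax` are all hypothetical, and the prompt is simply assumed to have been mapped to the "sports" topic. It shows how adding a constant δ to the logits of green-listed tokens shifts probability mass toward the active topic list.

```python
import math

def bias_logits(logits, green_ids, delta=2.0):
    """Return a copy of `logits` with `delta` added to every green-listed token id."""
    biased = list(logits)
    for token_id in green_ids:
        biased[token_id] += delta
    return biased

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical 6-token vocabulary split into two topic lists.
topic_lists = {"sports": {0, 1, 2}, "finance": {3, 4, 5}}
logits = [0.0] * 6  # uniform model preferences, for illustration only

green = topic_lists["sports"]  # assumption: the prompt was mapped to "sports"
probs = softmax(bias_logits(logits, green, delta=2.0))
green_mass = sum(probs[i] for i in green)  # share of probability on green tokens
```

With uniform logits and δ = 2.0, the green list's share of probability rises from 0.5 to e²/(e² + 1), roughly 0.88. Real model logits are far from uniform, so the shift in practice depends on how plausible the green tokens already are.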

What carries the argument

Topic-guided selection of green-listed token subsets from a vocabulary partition, chosen according to the prompt's identified topic.

If this is right

Watermarking becomes possible on any standard LLM pipeline without specialized integrations or post-processing.
Detection remains effective after paraphrasing and word-level changes that defeat earlier watermark schemes.
Generation speed and quality stay comparable to unwatermarked baselines across common benchmarks.
The same method can be applied uniformly to outputs from different models for consistent tracing.
Negligible additional runtime cost beyond normal sampling makes broad deployment feasible.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

If topic detection is noisy on short or ambiguous prompts, the watermark strength would vary by input type in practice.
The approach could extend to dynamic multi-topic handling by blending subsets during long generations.
Combining this token bias with existing statistical detectors might raise the bar for evasion attempts.
Widespread use would create a de facto standard for marking AI text, aiding downstream verification tools.

Load-bearing premise

A relevant topic can be reliably identified from the input prompt to pick the correct token subset, and favoring those tokens keeps the output fluent and coherent without extra fixes.

What would settle it

A test set where automatic topic identification from prompts frequently selects mismatched subsets, resulting in either watermark detection failure or measurable drops in fluency under paraphrasing attacks, would disprove the central performance claims.

Figures

Figures reproduced from arXiv: 2404.02138 by Alexander Nemecek, Erman Ayday, Yuzhou Jiang.

**Figure 1.** Comparison of KGW and TBW vocabulary partitioning. KGW (top) randomly partitions vocabulary V into green/red lists using parameter γ. TBW (bottom) creates semantically meaningful partitions by assigning tokens to predefined topic lists. Prompts are mapped to corresponding topics, making that topic list the active green list. Both methods bias generation toward green lists using parameter δ to adjust logit …

**Figure 2.** Text perplexity comparison using baseline …

**Figure 3.** Detection scores under random (left) and tar…

**Figure 4.** Perplexity comparison for OPT-6.7B with TBW at higher watermark strength (δ = 3.0) and all other schemes at their standard settings. Compared to TBW’s δ = 2.0 results (see §5.2), the increased bias leads to moderate quality degradation while retaining the lowest perplexity among watermarking methods. Lower values indicate higher text quality.

**Figure 5.** Comparison of average generation time (sec …

**Figure 6.** Comparison of average generation time (sec …

**Figure 7.** ROC curves comparing watermark methods on OPT-6.7B and Gemma-7B against PEGASUS (top) and …

**Figure 8.** Word clouds of the top-40 normalized token frequencies for OPT-6.7B (top) and GEMMA-7B (bottom). From left to right, each row shows outputs from: (i) non-watermarked generations, (ii) TBW with bias δ = 2.0, and (iii) TBW with bias δ = 3.0. Across both models and bias strengths, the distributions are dominated by common function words (e.g., “the,” “and,” “to”), with no systematic elevation of topic-specifi…

**Figure 9.** ROC curves for maximum z-score detection on OPT-6.7B and GEMMA-7B. Both models achieve AUC values of 0.996 and 1.000, indicating near-perfect separation between watermarked and non-watermarked content. [Plot residue: z-score density panels for watermarked and non-watermarked text; the OPT-6.7B panel marks Threshold = 4.75.]

**Figure 10.** z-score distributions for the maximum z-score detection method on OPT-6.7B and GEMMA-7B. Both show clear separation between watermarked and non-watermarked text, with OPT-6.7B’s closer overlap explaining its higher false positive rate.

**Figure 11.** Detection score (z-score) vs. bias strength δ across similarity thresholds. Higher δ yields stronger watermark signals, with detection saturating around δ = 5.0. [Plot residue: Distinct-1, Distinct-2, and Distinct-3 panels against bias strength (0.0 to 10.0), with legend values 0.3, 0.5, and 0.7.]

**Figure 12.** Lexical diversity (Distinct-N) vs. bias strength δ. Moderate δ values maintain diversity, while very high strengths lead to increased repetition.

**Figure 13.** Distinct-N scores vs. detection scores across parameter combinations. The relationship demonstrates …

**Figure 14.** Comprehensive heat map of Distinct-N metrics across bias strength …

**Figure 15.** Detection strength vs. number of topics. Mean detector z-score (max-z) as we scale K ∈ {4, 8, 16, 32} on GEMMA-7B with δ=2.0, τ=0.5 over 100 prompts. Error bars denote ±s.d.

**Figure 16.** Text quality vs. number of topics. BERTScore F1 remains flat as K increases, indicating no quality degradation. Error bars denote ±s.d. [Plot residue: BERTScore F1 and z-score axes, with a legend for 4, 8, 16, and 32 topics.]

**Figure 17.** Detection vs. quality trade-off. Per-sample …
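
Several of the captions above report Distinct-N as a lexical diversity measure. The captions do not spell out the exact definition used, so the following is a minimal sketch under the common convention that Distinct-N is the number of unique n-grams divided by the total number of n-grams. The function name `distinct_n` and the two sample sentences are illustrative only.

```python
def distinct_n(tokens, n):
    """Fraction of n-grams in `tokens` that are unique (0.0 if there are none)."""
    ngrams = [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
    if not ngrams:
        return 0.0
    return len(set(ngrams)) / len(ngrams)

# Whitespace tokenization is an assumption made for this sketch.
varied = "the model writes a new sentence every time".split()
repetitive = "the cat the cat the cat the cat".split()

scores = {
    name: [distinct_n(toks, n) for n in (1, 2, 3)]
    for name, toks in (("varied", varied), ("repetitive", repetitive))
}
```

The varied sentence scores 1.0 at every n, while the repetitive one scores 0.25 for Distinct-1. That is the kind of drop the captions associate with very high bias strengths.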

Original abstract

The indistinguishability of large language model (LLM) output from human-authored content poses significant challenges, raising concerns about potential misuse of AI-generated text and its influence on future model training. Watermarking algorithms offer a viable solution by embedding detectable signatures into generated text. However, existing watermarking methods often involve trade-offs among attack robustness, generation quality, and additional overhead such as specialized frameworks or complex integrations. We propose a lightweight, topic-guided watermarking scheme for LLMs that partitions the vocabulary into topic-aligned token subsets. Given an input prompt, the scheme selects a relevant topic-specific token list, effectively "green-listing" semantically aligned tokens to embed robust marks while preserving fluency and coherence. Experimental results across multiple LLMs and state-of-the-art benchmarks demonstrate that our method achieves text quality comparable to industry-leading systems and simultaneously improves watermark robustness against paraphrasing and lexical perturbation attacks, with minimal performance overhead. Our approach avoids reliance on additional mechanisms beyond standard text generation pipelines, enabling straightforward adoption and suggesting a practical path toward globally consistent watermarking of AI-generated content.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

The topic-guided partitioning for watermark green lists is a distinct mechanism, but the robustness claims rest on an untested assumption that topics can be reliably identified from prompts.

The main takeaway is a new watermarking approach that partitions the vocabulary into topic-aligned subsets and selects one based on the input prompt to bias token generation. This differs from standard hash or fixed-list methods by tying the green list directly to prompt semantics. It does well by staying inside ordinary generation pipelines with no added frameworks and by claiming comparable output quality plus stronger resistance to paraphrasing and lexical changes at low overhead. The abstract positions this as practical for broad adoption in content authenticity work.

The soft spot is the topic identification step itself. The scheme needs an accurate topic from the prompt to pick the right subset; if that step is unreliable on short, vague, or multi-topic inputs, the watermark signal drops or fluency suffers. The abstract describes the selection but gives no algorithm, accuracy numbers, or handling for uncertainty, so the reported robustness gains depend on an unexamined precondition. The experiments are asserted across models and benchmarks yet supply no metrics or attack details in the given text, leaving the performance claims hard to judge.

This paper is for researchers working on deployable LLM watermarking and provenance tools. Readers focused on lightweight, integrable methods could extract useful ideas if the topic detection works as assumed. It deserves a serious referee to examine the full implementation, topic detection accuracy, and experimental numbers. I would send it to peer review.

Referee Report

2 major / 0 minor

Summary. The paper proposes a lightweight topic-guided watermarking scheme for LLMs. It partitions the vocabulary into topic-aligned token subsets; given an input prompt, it identifies a relevant topic to select and green-list semantically aligned tokens during generation. This is claimed to embed detectable marks while preserving fluency, achieving text quality comparable to industry systems, and improving robustness to paraphrasing and lexical perturbation attacks with minimal overhead, all without additional frameworks beyond standard generation pipelines.

Significance. If the central claims hold, the work would provide a practical, low-overhead alternative to existing watermarking methods that often require specialized integrations. The emphasis on topic alignment for robustness without sacrificing quality addresses a key tension in the field. The approach's avoidance of extra mechanisms is a concrete strength that could facilitate broader adoption if the topic-selection precondition is validated.

major comments (2)

[Abstract / method description] Abstract and method description: the central robustness claims against paraphrasing and lexical attacks rest on the precondition that a relevant topic can be reliably identified from any input prompt to select the correct token subset. No method, accuracy metrics, fallback procedure, or validation experiments for topic identification (especially on short, ambiguous, or multi-topic prompts) are supplied, leaving the reported improvements dependent on an unexamined assumption.
[Abstract] Abstract: the assertion of 'experimental results across multiple LLMs and state-of-the-art benchmarks' that demonstrate comparable quality and improved robustness supplies no quantitative metrics, attack details, baseline comparisons, or methodology, rendering the performance claims unverifiable from the manuscript text.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the thoughtful and constructive review. The comments identify important areas where the presentation of our topic-guided watermarking method can be strengthened, particularly around the topic identification precondition and the level of detail in the abstract. We address each major comment below and commit to revisions that will make the manuscript more complete and verifiable.

Referee: [Abstract / method description] Abstract and method description: the central robustness claims against paraphrasing and lexical attacks rest on the precondition that a relevant topic can be reliably identified from any input prompt to select the correct token subset. No method, accuracy metrics, fallback procedure, or validation experiments for topic identification (especially on short, ambiguous, or multi-topic prompts) are supplied, leaving the reported improvements dependent on an unexamined assumption.

Authors: We agree that reliable topic identification from the input prompt is a central precondition for the claimed robustness gains, and that the manuscript does not provide sufficient detail on this component. The current description assumes topic selection occurs but does not specify the procedure (e.g., embedding-based matching or a lightweight classifier), report accuracy, or include fallback logic. We will add a dedicated subsection describing the topic identification method, its implementation, accuracy metrics on standard topic classification benchmarks, handling for short/ambiguous/multi-topic prompts (including a default general-topic fallback), and new validation experiments measuring end-to-end watermark performance under these conditions. These additions will directly address the unexamined assumption.

Revision: yes

Referee: [Abstract] Abstract: the assertion of 'experimental results across multiple LLMs and state-of-the-art benchmarks' that demonstrate comparable quality and improved robustness supplies no quantitative metrics, attack details, baseline comparisons, or methodology, rendering the performance claims unverifiable from the manuscript text.

Authors: The abstract is written as a high-level summary per standard conventions. The full manuscript contains the requested details in the Experiments and Evaluation sections: quantitative metrics (perplexity, detection rates), explicit attack descriptions (paraphrasing via specific models and lexical substitutions), baseline comparisons (to prior watermarking schemes), and methodology (models, benchmarks, attack parameters). However, we acknowledge that the abstract could better signpost these results. We will revise the abstract to include concise references to key quantitative outcomes and direct readers to the relevant sections, improving verifiability without exceeding length constraints.

Revision: partial
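
The first response above mentions similarity-based topic matching with a default general-topic fallback, without specifying a procedure. The following is a minimal sketch of what such a step could look like, not the paper's method. Bag-of-words cosine similarity stands in for embedding similarity, and the keyword lists, the `select_topic` helper, and the cutoff `tau` are hypothetical. Whether this `tau` corresponds to the τ in the figure captions is not established by the text here.

```python
import math
from collections import Counter

def cosine(a, b):
    """Cosine similarity between two sparse count vectors (Counters)."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def select_topic(prompt, topic_keywords, tau=0.5, fallback="general"):
    """Pick the most similar topic, or `fallback` when no topic reaches `tau`."""
    prompt_vec = Counter(prompt.lower().split())
    best_topic, best_sim = fallback, 0.0
    for topic, keywords in topic_keywords.items():
        sim = cosine(prompt_vec, Counter(keywords))
        if sim > best_sim:
            best_topic, best_sim = topic, sim
    chosen = best_topic if best_sim >= tau else fallback
    return chosen, best_sim

# Hypothetical topic keyword lists, for illustration only.
topics = {
    "sports": ["football", "match", "team", "score"],
    "finance": ["stock", "market", "bank", "interest"],
}

clear_choice, clear_sim = select_topic("football team score", topics)
vague_choice, vague_sim = select_topic("tell me a story", topics)
weak_choice, weak_sim = select_topic("the stock fell", topics)
```

A clearly on-topic prompt selects "sports", while a vague prompt or one with only a weak match falls back to "general". This is the failure mode the referee raises: the lower the similarity, the less the chosen green list reflects what the model will actually write.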

Circularity Check

0 steps flagged

No circularity: method is an independent algorithmic proposal

The paper describes a new topic-guided watermarking scheme that partitions the vocabulary into topic-aligned subsets and selects green-list tokens from an input prompt. No equations, derivations, or self-citations are shown that reduce the claimed robustness or quality improvements to fitted parameters, self-definitions, or prior author results by construction. The approach is presented as a self-contained algorithmic contribution without load-bearing uniqueness theorems or ansatzes imported from the authors' own prior work. The topic-identification precondition is an assumption about the method's applicability rather than a circular step in any derivation chain.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

The method rests on the feasibility of topic-aligned vocabulary partitioning and prompt-driven subset selection; these are domain assumptions rather than new entities or fitted constants.

axioms (1)

domain assumption: LLM generation can be biased toward topic-aligned token subsets without materially harming fluency or coherence.
Core premise of the green-listing step described in the abstract.

pith-pipeline@v0.9.0 · 5716 in / 1198 out tokens · 63339 ms · 2026-05-24T02:01:03.982268+00:00 · methodology

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel · unclear
Relation between the paper passage and the cited Recognition theorem.

We propose a lightweight, topic-guided watermarking scheme for LLMs that partitions the vocabulary into topic-aligned token subsets... select a relevant topic-specific token list, effectively 'green-listing' semantically aligned tokens

IndisputableMonolith/Foundation/RealityFromDistinction.lean · reality_from_one_distinction · unclear
Relation between the paper passage and the cited Recognition theorem.

z = (g − γ·n) / sqrt(n·γ·(1−γ)) ... maximum z-score detection
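
The quoted z-score formula can be turned into a small detector. The following is a minimal sketch under stated assumptions, not the paper's detection code: g is the number of tokens that fall in a green list, n is the text length, and γ is taken to be that list's share of the vocabulary. "Maximum z-score detection" is assumed to mean scoring the text against every topic list and keeping the largest z. The helper names, the 100-token vocabulary, and the four topic lists are hypothetical. The default threshold of 4.75 echoes the value shown for OPT-6.7B in the figure text and is used here purely for illustration.

```python
import math

def z_score(green_count, n, gamma):
    """z = (g - gamma*n) / sqrt(n*gamma*(1-gamma)) for g green tokens out of n."""
    return (green_count - gamma * n) / math.sqrt(n * gamma * (1 - gamma))

def max_z_detect(token_ids, topic_lists, vocab_size, threshold=4.75):
    """Score the text against each topic list and keep the highest z-score."""
    n = len(token_ids)
    best_topic, best_z = None, float("-inf")
    for topic, green in topic_lists.items():
        gamma = len(green) / vocab_size  # assumption: gamma is the list's vocabulary share
        g = sum(1 for t in token_ids if t in green)
        z = z_score(g, n, gamma)
        if z > best_z:
            best_topic, best_z = topic, z
    return best_topic, best_z, best_z >= threshold

# Hypothetical 100-token vocabulary split evenly into four topic lists.
topic_lists = {
    name: set(range(i * 25, (i + 1) * 25))
    for i, name in enumerate(["a", "b", "c", "d"])
}
# 40 tokens, 34 of them from topic "a" (ids 0-24), the rest scattered.
tokens = list(range(25)) + list(range(9)) + [30, 40, 55, 60, 80, 90]

topic, z, flagged = max_z_detect(tokens, topic_lists, vocab_size=100)
```

Here 34 of 40 tokens fall in topic "a" against an expected 10, giving z of about 8.76, well above the threshold. Text with no bias would land near z = 0 for every list. Taking the maximum over several lists raises the chance of a false positive relative to a single test, which any real threshold would need to account for.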

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Forward citations

Cited by 2 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

The End of Trust: How Agentic AI Breaks Security Assumptions
cs.CR 2026-05 unverdicted novelty 6.0

Agentic AI eliminates the fidelity-scale tradeoff in deception, enabling the Infinite Impostor attack that hijacks trusted relationships at mass scale and requiring a shift to suspect-by-default security based on eval...

Who Gets Flagged? The Pluralistic Evaluation Gap in AI Content Watermarking
cs.CY 2026-04 conditional novelty 6.0

AI content watermarking exhibits detection disparities across languages, cultures, and demographics due to content-dependent signal properties, with benchmarks failing to disaggregate performance and watermarking held...

Reference graph

Works this paper leans on

55 extracted references · 55 canonical work pages · cited by 2 Pith papers · 5 internal anchors

[1]

online" 'onlinestring :=

ENTRY address archivePrefix author booktitle chapter edition editor eid eprint eprinttype howpublished institution journal key month note number organization pages publisher school series title type volume year doi pubmed url lastchecked label extra.label sort.label short.list INTEGERS output.state before.all mid.sentence after.sentence after.block STRING...

work page
[2]

write newline

" write newline "" before.all 'output.state := FUNCTION n.dashify 't := "" t empty not t #1 #1 substring "-" = t #1 #2 substring "--" = not "--" * t #2 global.max substring 't := t #1 #1 substring "-" = "-" * t #2 global.max substring 't := while if t #1 #1 substring * t #2 global.max substring 't := if while FUNCTION word.in bbl.in capitalize " " * FUNCT...

work page
[3]

Association for the Advancement of Artificial Intelligence (AAAI) . 2025. AAAI Launches AI-Powered Peer Review Assessment System . https://aaai.org/aaai-launches-ai-powered-peer-review-assessment-system/. Accessed: 2025-08-25

work page 2025
[4]

Jainit Sushil Bafna, Hardik Mittal, Suyash Sethia, Manish Shrivastava, and Radhika Mamidi. 2024. https://arxiv.org/abs/2407.02978 Mast kalandar at semeval-2024 task 8: On the trail of textual origins: Roberta-bilstm approach to detect ai-generated text . Preprint, arXiv:2407.02978

work page arXiv 2024
[5]

Isabel Cachola, Kyle Lo, Arman Cohan, and Daniel Weld. 2020. https://doi.org/10.18653/v1/2020.findings-emnlp.428 TLDR : Extreme summarization of scientific documents . In Findings of the Association for Computational Linguistics: EMNLP 2020, pages 4766--4777, Online. Association for Computational Linguistics

work page doi:10.18653/v1/2020.findings-emnlp.428 2020
[6]

Canyu Chen and Kai Shu. 2024. Combating misinformation in the age of llms: Opportunities and challenges. AI Magazine, 45(3):354--368

work page 2024
[7]

Cleveland Clinic . 2025. Cleveland Clinic Announces Rollout of Ambience Healthcare's AI Platform . https://newsroom.clevelandclinic.org/2025/02/19/cleveland-clinic-announces-the-rollout-of-ambience-healthcares-ai-platform. Accessed: 2025-08-25

work page 2025
[8]

Sumanth Dathathri, Abigail See, Sumedh Ghaisas, Po-Sen Huang, Rob McAdam, Johannes Welbl, Vandana Bachani, Alex Kaskasoli, Robert Stanforth, Tatiana Matejovicova, and 1 others. 2024. Scalable watermarking for identifying large language model outputs. Nature, 634(8035):818--823

work page 2024
[9]

Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. 2019. https://arxiv.org/abs/1810.04805 Bert: Pre-training of deep bidirectional transformers for language understanding . Preprint, arXiv:1810.04805

work page internal anchor Pith review Pith/arXiv arXiv 2019
[10]

Christiane Fellbaum. 1998. WordNet: An electronic lexical database. MIT press

work page 1998
[11]

Xiaoyan Feng, He Zhang, Yanjun Zhang, Leo Yu Zhang, and Shirui Pan. 2025. https://openreview.net/forum?id=Zvyb3WAg03 Bimark: Unbiased multilayer watermarking for large language models . In Forty-second International Conference on Machine Learning

work page 2025
[12]

Google . 2024. Introducing gemini 2.0: Our new ai model for the agentic era. https://blog.google/technology/google-deepmind/google-gemini-ai-update-december-2024/. Accessed: 2025-07-29

work page 2024
[13]

Aaron Grattafiori, Abhimanyu Dubey, Abhinav Jauhri, Abhinav Pandey, Abhishek Kadian, Ahmad Al-Dahle, Aiesha Letman, Akhil Mathur, Alan Schelten, Alex Vaughan, and 1 others. 2024. The llama 3 herd of models. arXiv preprint arXiv:2407.21783

work page internal anchor Pith review Pith/arXiv arXiv 2024
[14]

Maarten Grootendorst. 2020. Keybert: Minimal keyword extraction with bert. https://maartengr.github.io/KeyBERT/. Accessed: 2025‑07‑29

work page 2020
[15]

Abe Hou, Jingyu Zhang, Tianxing He, Yichen Wang, Yung-Sung Chuang, Hongwei Wang, Lingfeng Shen, Benjamin Van Durme, Daniel Khashabi, and Yulia Tsvetkov. 2024 a . https://doi.org/10.18653/v1/2024.naacl-long.226 S em S tamp: A semantic watermark with paraphrastic robustness for text generation . In Proceedings of the 2024 Conference of the North American Ch...

work page doi:10.18653/v1/2024.naacl-long.226 2024
[16]

Abe Hou, Jingyu Zhang, Yichen Wang, Daniel Khashabi, and Tianxing He. 2024 b . https://doi.org/10.18653/v1/2024.findings-acl.98 k- S em S tamp: A clustering-based semantic watermark for detection of machine-generated text . In Findings of the Association for Computational Linguistics: ACL 2024, pages 1706--1715, Bangkok, Thailand. Association for Computat...

work page doi:10.18653/v1/2024.findings-acl.98 2024
[17]

Hugging Face . 2025. Hugging face - the ai community building the future. https://huggingface.co. Accessed: 2025-08-26

work page 2025
[18]

Mingjia Huo, Sai Ashish Somayajula, Youwei Liang, Ruisi Zhang, Farinaz Koushanfar, and Pengtao Xie. 2024. Token-specific watermarking with enhanced detectability and semantic coherence for large language models. In Proceedings of the 41st International Conference on Machine Learning, ICML'24. JMLR.org

work page 2024
[19]

Niful Islam, Debopom Sutradhar, Humaira Noor, Jarin Tasnim Raya, Monowara Tabassum Maisha, and Dewan Md Farid. 2023. https://arxiv.org/abs/2306.01761 Distinguishing human generated text from chatgpt generated text using machine learning . Preprint, arXiv:2306.01761

work page arXiv 2023
[20]

Nikola Jovanovi\' c , Robin Staab, and Martin Vechev. 2024. Watermark stealing in large language models. In Proceedings of the 41st International Conference on Machine Learning, ICML'24. JMLR.org

work page 2024
[21]

John Kirchenbauer, Jonas Geiping, Yuxin Wen, Jonathan Katz, Ian Miers, and Tom Goldstein. 2023. A watermark for large language models. In International Conference on Machine Learning, pages 17061--17084. PMLR

work page 2023
[22]

Kalpesh Krishna. 2023. ai-detection-paraphrases. https://github.com/martiansideofthemoon/ai-detection-paraphrasesv . Accessed: 2025-08-01

work page 2023
[23]

Kalpesh Krishna, Yixiao Song, Marzena Karpinska, John Wieting, and Mohit Iyyer. 2023. Paraphrasing evades detectors of ai-generated text, but retrieval is an effective defense. In Proceedings of the 37th International Conference on Neural Information Processing Systems, NIPS '23, Red Hook, NY, USA. Curran Associates Inc

work page 2023
[24]

Rohith Kuditipudi, John Thickstun, Tatsunori Hashimoto, and Percy Liang. 2024. https://arxiv.org/abs/2307.15593 Robust distortion-free watermarks for language models . Preprint, arXiv:2307.15593

work page arXiv 2024
[25]

Jooyoung Lee, Thai Le, Jinghui Chen, and Dongwon Lee. 2023. Do language models plagiarize? In Proceedings of the ACM Web Conference 2023, pages 3637--3647

work page 2023
[26]

Taehyun Lee, Seokhee Hong, Jaewoo Ahn, Ilgee Hong, Hwaran Lee, Sangdoo Yun, Jamin Shin, and Gunhee Kim. 2024. https://doi.org/10.18653/v1/2024.acl-long.268 Who wrote this code? watermarking for code generation . In Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 4890--4911, Bangkok, Th...

work page doi:10.18653/v1/2024.acl-long.268 2024
[27]

David D Lewis, Yiming Yang, Tony G Rose, and Fan Li. 2004. Rcv1: A new benchmark collection for text categorization research. Journal of machine learning research, 5(Apr):361--397

work page 2004
[28]

Yu, and Lifang He

Qian Li, Hao Peng, Jianxin Li, Congying Xia, Renyu Yang, Lichao Sun, Philip S. Yu, and Lifang He. 2021. https://arxiv.org/abs/2008.00364 A survey on text classification: From shallow to deep learning . Preprint, arXiv:2008.00364

work page arXiv 2021
[29]

Weixin Liang, Mert Yuksekgonul, Yining Mao, Eric Wu, and James Zou. 2023. Gpt detectors are biased against non-native english writers. Patterns, 4(7)

work page 2023
[30]

Aiwei Liu, Leyi Pan, Xuming Hu, Shiao Meng, and Lijie Wen. 2024. https://arxiv.org/abs/2310.06356 A semantic invariant robust watermark for large language models . Preprint, arXiv:2310.06356

work page arXiv 2024
[31]

Yepeng Liu and Yuheng Bu. 2024. Adaptive text watermark for large language models. In Proceedings of the 41st International Conference on Machine Learning, ICML'24. JMLR.org

work page 2024
[32]

Minjia Mao, Dongjun Wei, Zeyu Chen, Xiao Fang, and Michael Chau. 2025. https://arxiv.org/abs/2405.14604 Watermarking low-entropy generation for large language models: An unbiased and low-risk method . Preprint, arXiv:2405.14604

work page arXiv 2025
[33]

Eric Mitchell, Yoonho Lee, Alexander Khazatsky, Christopher D Manning, and Chelsea Finn. 2023. Detectgpt: Zero-shot machine-generated text detection using probability curvature. In International conference on machine learning, pages 24950--24962. PMLR

work page 2023
[34]

Felix B Mueller, Rebekka G \"o rge, Anna K Bernzen, Janna C Pirk, and Maximilian Poretschkin. 2024. Llms and memorization: On quality and specificity of copyright compliance. In Proceedings of the AAAI/ACM Conference on AI, Ethics, and Society, volume 7, pages 984--996

work page 2024
[35]

Alexander Nemecek, Yuzhou Jiang, and Erman Ayday. 2025. The feasibility of topic-based watermarking on academic peer reviews. arXiv preprint arXiv:2505.21636

work page arXiv 2025
[36]

Newsweek . 2025. World's Best Hospitals 2025 - United States of America . https://rankings.newsweek.com/worlds-best-hospitals-2025/united-states-america. Accessed: 2025-08-25

work page 2025
[37]

Georg Niess and Roman Kern. 2025. https://aclanthology.org/2025.acl-long.145/ Ensemble watermarks for large language models . In Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 2903--2916, Vienna, Austria. Association for Computational Linguistics

work page 2025
[38]

OpenAI . 2022. Introducing chatgpt. https://openai.com/index/chatgpt/. Accessed: 2025-07-29

work page 2022
[39]

OpenAI . 2023. New ai classifier for indicating ai‑written text. https://openai.com/index/new-ai-classifier-for-indicating-ai-written-text. Accessed: 2025-07-29

work page 2023
[40]

Leyi Pan, Aiwei Liu, Zhiwei He, Zitian Gao, Xuandong Zhao, Yijian Lu, Binglin Zhou, Shuliang Liu, Xuming Hu, Lijie Wen, Irwin King, and Philip S. Yu. 2024. https://doi.org/10.18653/v1/2024.emnlp-demo.7 M ark LLM : An open-source toolkit for LLM watermarking . In Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing: System...

work page doi:10.18653/v1/2024.emnlp-demo.7 2024
[41]

Wenjie Qu, Wengrui Zheng, Tianyang Tao, Dong Yin, Yanze Jiang, Zhihua Tian, Wei Zou, Jinyuan Jia, and Jiaheng Zhang. 2025. https://arxiv.org/abs/2401.16820 Provably robust multi-bit watermarking for ai-generated text . Preprint, arXiv:2401.16820

work page arXiv 2025
[42]

Colin Raffel, Noam Shazeer, Adam Roberts, Katherine Lee, Sharan Narang, Michael Matena, Yanqi Zhou, Wei Li, and Peter J. Liu. 2020. Exploring the limits of transfer learning with a unified text-to-text transformer. J. Mach. Learn. Res., 21(1)

work page 2020
[43]

Nils Reimers and Iryna Gurevych. 2020. Making monolingual sentence embeddings multilingual using knowledge distillation. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 4512–4525, Online. Association for Computational Linguistics. https://doi.org/10.18653/v1/2020.emnlp-main.365

[44]

Ryoma Sato, Yuki Takezawa, Han Bao, Kenta Niwa, and Makoto Yamada. 2023. Embarrassingly simple text watermarks. Preprint, arXiv:2310.08920. https://arxiv.org/abs/2310.08920

[45]

Scott Aaronson. 2023. Watermarking of large language models. https://simons.berkeley.edu/talks/scott-aaronson-ut-austin-openai-2023-08-17. Accessed: 2025-07-29

[46]

Ilia Shumailov, Zakhar Shumaylov, Yiren Zhao, Nicolas Papernot, Ross Anderson, and Yarin Gal. 2024. AI models collapse when trained on recursively generated data. Nature, 631(8022):755–759

[47]

SynthID-Team. 2024. Watermarking AI-generated text and video with SynthID. https://deepmind.google/discover/blog/watermarking-ai-generated-text-and-video-with-synthid/. Accessed: 2025-08-26

[48]

Gemma Team, Thomas Mesnard, Cassidy Hardin, Robert Dadashi, Surya Bhupatiraju, Shreya Pathak, Laurent Sifre, Morgane Rivière, Mihir Sanjay Kale, Juliette Love, and 1 others. 2024. Gemma: Open models based on Gemini research and technology. arXiv preprint arXiv:2403.08295

[49]

Shangqing Tu, Yuliang Sun, Yushi Bai, Jifan Yu, Lei Hou, and Juanzi Li. 2024. WaterBench: Towards holistic evaluation of watermarks for large language models. In Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 1517–1542, Bangkok, Thaila... https://doi.org/10.18653/v1/2024.acl-long.83

[50]

Yihan Wu, Zhengmian Hu, Junfeng Guo, Hongyang Zhang, and Heng Huang. 2024. A resilient and accessible distribution-preserving watermark for large language models. In Proceedings of the 41st International Conference on Machine Learning, ICML'24. JMLR.org

[51]

Jingqing Zhang, Yao Zhao, Mohammad Saleh, and Peter J. Liu. 2020a. PEGASUS: Pre-training with extracted gap-sentences for abstractive summarization. In Proceedings of the 37th International Conference on Machine Learning, ICML'20. JMLR.org

[52]

Ruisi Zhang, Shehzeen Samarah Hussain, Paarth Neekhara, and Farinaz Koushanfar. 2024. REMARK-LLM: A robust and efficient watermarking framework for generative large language models. In Proceedings of the 33rd USENIX Conference on Security Symposium, SEC '24, USA. USENIX Association

[53]

Susan Zhang, Stephen Roller, Naman Goyal, Mikel Artetxe, Moya Chen, Shuohui Chen, Christopher Dewan, Mona Diab, Xian Li, Xi Victoria Lin, Todor Mihaylov, Myle Ott, Sam Shleifer, Kurt Shuster, Daniel Simig, Punit Singh Koura, Anjali Sridhar, Tianlu Wang, and Luke Zettlemoyer. 2022. OPT: Open pre-trained transformer language... https://arxiv.org/abs/2205.01068

[54]

Tianyi Zhang, Varsha Kishore, Felix Wu, Kilian Q. Weinberger, and Yoav Artzi. 2020b. BERTScore: Evaluating text generation with BERT. Preprint, arXiv:1904.09675. https://arxiv.org/abs/1904.09675

[55]

Xuandong Zhao, Prabhanjan Vijendra Ananth, Lei Li, and Yu-Xiang Wang. 2024. Provable robust watermarking for AI-generated text. In The Twelfth International Conference on Learning Representations. https://openreview.net/forum?id=SsmT8aO45L


[3]

Association for the Advancement of Artificial Intelligence (AAAI). 2025. AAAI Launches AI-Powered Peer Review Assessment System. https://aaai.org/aaai-launches-ai-powered-peer-review-assessment-system/. Accessed: 2025-08-25

[4]

Jainit Sushil Bafna, Hardik Mittal, Suyash Sethia, Manish Shrivastava, and Radhika Mamidi. 2024. Mast Kalandar at SemEval-2024 Task 8: On the trail of textual origins: RoBERTa-BiLSTM approach to detect AI-generated text. Preprint, arXiv:2407.02978. https://arxiv.org/abs/2407.02978

[5]

Isabel Cachola, Kyle Lo, Arman Cohan, and Daniel Weld. 2020. TLDR: Extreme summarization of scientific documents. In Findings of the Association for Computational Linguistics: EMNLP 2020, pages 4766–4777, Online. Association for Computational Linguistics. https://doi.org/10.18653/v1/2020.findings-emnlp.428

[6]

Canyu Chen and Kai Shu. 2024. Combating misinformation in the age of LLMs: Opportunities and challenges. AI Magazine, 45(3):354–368

[7]

Cleveland Clinic. 2025. Cleveland Clinic Announces Rollout of Ambience Healthcare's AI Platform. https://newsroom.clevelandclinic.org/2025/02/19/cleveland-clinic-announces-the-rollout-of-ambience-healthcares-ai-platform. Accessed: 2025-08-25

[8]

Sumanth Dathathri, Abigail See, Sumedh Ghaisas, Po-Sen Huang, Rob McAdam, Johannes Welbl, Vandana Bachani, Alex Kaskasoli, Robert Stanforth, Tatiana Matejovicova, and 1 others. 2024. Scalable watermarking for identifying large language model outputs. Nature, 634(8035):818–823

[9]

Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. 2019. BERT: Pre-training of deep bidirectional transformers for language understanding. Preprint, arXiv:1810.04805. https://arxiv.org/abs/1810.04805

[10]

Christiane Fellbaum. 1998. WordNet: An electronic lexical database. MIT Press

[11]

Xiaoyan Feng, He Zhang, Yanjun Zhang, Leo Yu Zhang, and Shirui Pan. 2025. BiMark: Unbiased multilayer watermarking for large language models. In Forty-second International Conference on Machine Learning. https://openreview.net/forum?id=Zvyb3WAg03

[12]

Google. 2024. Introducing Gemini 2.0: Our new AI model for the agentic era. https://blog.google/technology/google-deepmind/google-gemini-ai-update-december-2024/. Accessed: 2025-07-29

[13]

Aaron Grattafiori, Abhimanyu Dubey, Abhinav Jauhri, Abhinav Pandey, Abhishek Kadian, Ahmad Al-Dahle, Aiesha Letman, Akhil Mathur, Alan Schelten, Alex Vaughan, and 1 others. 2024. The Llama 3 herd of models. arXiv preprint arXiv:2407.21783

[14]

Maarten Grootendorst. 2020. KeyBERT: Minimal keyword extraction with BERT. https://maartengr.github.io/KeyBERT/. Accessed: 2025-07-29

[15]

Abe Hou, Jingyu Zhang, Tianxing He, Yichen Wang, Yung-Sung Chuang, Hongwei Wang, Lingfeng Shen, Benjamin Van Durme, Daniel Khashabi, and Yulia Tsvetkov. 2024a. SemStamp: A semantic watermark with paraphrastic robustness for text generation. In Proceedings of the 2024 Conference of the North American Ch... https://doi.org/10.18653/v1/2024.naacl-long.226

[16]

Abe Hou, Jingyu Zhang, Yichen Wang, Daniel Khashabi, and Tianxing He. 2024b. k-SemStamp: A clustering-based semantic watermark for detection of machine-generated text. In Findings of the Association for Computational Linguistics: ACL 2024, pages 1706–1715, Bangkok, Thailand. Association for Computat... https://doi.org/10.18653/v1/2024.findings-acl.98

[17]

Hugging Face. 2025. Hugging Face - the AI community building the future. https://huggingface.co. Accessed: 2025-08-26

[18]

Mingjia Huo, Sai Ashish Somayajula, Youwei Liang, Ruisi Zhang, Farinaz Koushanfar, and Pengtao Xie. 2024. Token-specific watermarking with enhanced detectability and semantic coherence for large language models. In Proceedings of the 41st International Conference on Machine Learning, ICML'24. JMLR.org

[19]

Niful Islam, Debopom Sutradhar, Humaira Noor, Jarin Tasnim Raya, Monowara Tabassum Maisha, and Dewan Md Farid. 2023. Distinguishing human generated text from ChatGPT generated text using machine learning. Preprint, arXiv:2306.01761. https://arxiv.org/abs/2306.01761

[20]

Nikola Jovanović, Robin Staab, and Martin Vechev. 2024. Watermark stealing in large language models. In Proceedings of the 41st International Conference on Machine Learning, ICML'24. JMLR.org

[21]

John Kirchenbauer, Jonas Geiping, Yuxin Wen, Jonathan Katz, Ian Miers, and Tom Goldstein. 2023. A watermark for large language models. In International Conference on Machine Learning, pages 17061–17084. PMLR

[22]

Kalpesh Krishna. 2023. ai-detection-paraphrases. https://github.com/martiansideofthemoon/ai-detection-paraphrasesv. Accessed: 2025-08-01

[23]

Kalpesh Krishna, Yixiao Song, Marzena Karpinska, John Wieting, and Mohit Iyyer. 2023. Paraphrasing evades detectors of AI-generated text, but retrieval is an effective defense. In Proceedings of the 37th International Conference on Neural Information Processing Systems, NIPS '23, Red Hook, NY, USA. Curran Associates Inc

[24]

Rohith Kuditipudi, John Thickstun, Tatsunori Hashimoto, and Percy Liang. 2024. Robust distortion-free watermarks for language models. Preprint, arXiv:2307.15593. https://arxiv.org/abs/2307.15593

[25]

Jooyoung Lee, Thai Le, Jinghui Chen, and Dongwon Lee. 2023. Do language models plagiarize? In Proceedings of the ACM Web Conference 2023, pages 3637–3647

[26]

Taehyun Lee, Seokhee Hong, Jaewoo Ahn, Ilgee Hong, Hwaran Lee, Sangdoo Yun, Jamin Shin, and Gunhee Kim. 2024. Who wrote this code? Watermarking for code generation. In Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 4890–4911, Bangkok, Th... https://doi.org/10.18653/v1/2024.acl-long.268

[27]

David D Lewis, Yiming Yang, Tony G Rose, and Fan Li. 2004. RCV1: A new benchmark collection for text categorization research. Journal of Machine Learning Research, 5(Apr):361–397

[28]

Qian Li, Hao Peng, Jianxin Li, Congying Xia, Renyu Yang, Lichao Sun, Philip S. Yu, and Lifang He. 2021. A survey on text classification: From shallow to deep learning. Preprint, arXiv:2008.00364. https://arxiv.org/abs/2008.00364

[29]

Weixin Liang, Mert Yuksekgonul, Yining Mao, Eric Wu, and James Zou. 2023. GPT detectors are biased against non-native English writers. Patterns, 4(7)

[30]

Aiwei Liu, Leyi Pan, Xuming Hu, Shiao Meng, and Lijie Wen. 2024. A semantic invariant robust watermark for large language models. Preprint, arXiv:2310.06356. https://arxiv.org/abs/2310.06356

[31]

Yepeng Liu and Yuheng Bu. 2024. Adaptive text watermark for large language models. In Proceedings of the 41st International Conference on Machine Learning, ICML'24. JMLR.org

[32]

Minjia Mao, Dongjun Wei, Zeyu Chen, Xiao Fang, and Michael Chau. 2025. Watermarking low-entropy generation for large language models: An unbiased and low-risk method. Preprint, arXiv:2405.14604. https://arxiv.org/abs/2405.14604

[33]

Eric Mitchell, Yoonho Lee, Alexander Khazatsky, Christopher D Manning, and Chelsea Finn. 2023. DetectGPT: Zero-shot machine-generated text detection using probability curvature. In International Conference on Machine Learning, pages 24950–24962. PMLR

[34]

Felix B Mueller, Rebekka Görge, Anna K Bernzen, Janna C Pirk, and Maximilian Poretschkin. 2024. LLMs and memorization: On quality and specificity of copyright compliance. In Proceedings of the AAAI/ACM Conference on AI, Ethics, and Society, volume 7, pages 984–996

[35]

Alexander Nemecek, Yuzhou Jiang, and Erman Ayday. 2025. The feasibility of topic-based watermarking on academic peer reviews. arXiv preprint arXiv:2505.21636
