Distortion-free Watermarks are not Truly Distortion-free under Watermark Key Collisions
Language model (LM) watermarking techniques inject a statistical signal into LM-generated content by replacing the random sampling process with pseudo-random sampling, using watermark keys as the random seed. Among these statistical watermarking approaches, distortion-free watermarks are particularly important because they embed watermarks into LM-generated content without compromising generation quality. However, one notable limitation of pseudo-random sampling compared to true-random sampling is that, when the same watermark key is reused (i.e., a key collision), the results of pseudo-random sampling are correlated. This limitation could undermine the distortion-free property. Our studies reveal that key collisions are inevitable because the supply of watermark keys is limited, and that existing distortion-free watermarks exhibit a significant distribution bias relative to the original LM distribution in the presence of key collisions. Moreover, achieving a perfectly distortion-free watermark is impossible: a watermark that stays distortion-free under key collisions cannot embed any statistical signal. To reduce the distribution bias caused by key collisions, we introduce a new family of distortion-free watermarks, the beta-watermark. Experimental results support that the beta-watermark can effectively reduce the distribution bias under key collisions.
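The effect of a key collision on pseudo-random sampling can be illustrated with a toy example. The sketch below is not the paper's beta-watermark or any specific published scheme; it assumes a simple inverse-transform sampler whose random number is derived from a hashed key, and the names `keyed_uniform`, `sample_token`, and `frequencies` are hypothetical. It compares draws that all reuse one key with draws that each use a distinct key.

```python
import hashlib
from collections import Counter


def keyed_uniform(key: str) -> float:
    """Map a watermark key to a pseudo-random number in [0, 1)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    # Keep the top 53 bits so the result is an exact float strictly below 1.
    return (int.from_bytes(digest[:8], "big") >> 11) / 2**53


def sample_token(probs: dict[str, float], u: float) -> str:
    """Inverse-transform sampling: return the token whose CDF interval contains u."""
    cumulative = 0.0
    token = ""
    for token, p in probs.items():
        cumulative += p
        if u < cumulative:
            return token
    return token  # fall back to the last token if rounding leaves a gap


def frequencies(samples: list[str]) -> dict[str, float]:
    """Empirical token frequencies of a list of samples."""
    return {t: n / len(samples) for t, n in Counter(samples).items()}


probs = {"a": 0.5, "b": 0.3, "c": 0.2}  # toy next-token distribution (assumed)

# Key collision: the same key seeds every draw, so every draw is identical.
collided = [sample_token(probs, keyed_uniform("key-0")) for _ in range(1000)]

# No collision: each draw uses its own key, so draws behave like fresh samples.
fresh = [sample_token(probs, keyed_uniform(f"key-{i}")) for i in range(1000)]

print("collided:", frequencies(collided))
print("fresh:   ", frequencies(fresh))
```

With a reused key, all 1000 draws return the same token, so the empirical distribution collapses to a single point instead of matching `probs`. With distinct keys, the frequencies should land close to the toy distribution. This is the kind of correlation under key collisions that the abstract describes, shown here in the simplest possible setting.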
Forward citations
Cited by 3 Pith papers
- Robust Text Watermarking for Large Language Models via Dual Semantic Embeddings
  DEW creates a robust watermark for LLM text by applying vector-space operations to dual embeddings and hiding the signal via key-seeded random projections, showing improved detection after paraphrasing and translation.
- Hidden in Plain Tokens: Simply Robust, Gradient-Free Watermark for Synthetic Audio
  A training-free audio watermarking method that reduces vocabulary via community detection to boost detection robustness by orders of magnitude while resisting audio modifications.
- Robust Text Watermarking for Large Language Models via Dual Semantic Embeddings
  DEW is a semantic watermarking method for LLMs that derives a robust signal from dual embeddings via vector-space algebra and pseudo-random projections, remaining detectable after paraphrasing and translation.