Optimizing adaptive attacks against watermarks for language models
4 Pith papers cite this work. Polarity classification is still indexing.
fields: cs.CR 4
verdicts: UNVERDICTED 4
roles: baseline 1
polarities: baseline 1
citing papers explorer
- RLSpoofer: A Lightweight Evaluator for LLM Watermark Spoofing Resilience
  RLSpoofer trains a 4B model on 100 watermarked paraphrase pairs to spoof PF watermarks at a 62% success rate, far exceeding baselines trained on up to 10,000 samples.
- TimeMark: A Trustworthy Time Watermarking Framework for Exact Generation-Time Recovery from AIGC
  TimeMark is a trustworthy time watermarking framework that achieves exact generation-time recovery from AI-generated content with theoretically perfect accuracy by using time-dependent cryptographic keys, random non-stored bit sequences, and two-stage encoding with error-correcting codes.
- Mitigating Watermark Forgery in Generative Models via Randomized Key Selection
  Randomized per-query key selection, combined with single-key detection acceptance, bounds the forgery success rate independently of the number of collected samples while preserving model utility.
- Position: LLM Watermarking Should Align Stakeholders' Incentives for Practical Adoption
  LLM watermarking adoption is limited by misaligned stakeholder incentives; incentive-aligned approaches such as in-context watermarking can enable practical use in targeted domains like education and peer review.