Optimizing adaptive attacks against watermarks for language models

Diaa, A · 2024 · arXiv 2410.02440

4 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

baseline 1

citation-polarity summary

baseline 1

representative citing papers

RLSpoofer: A Lightweight Evaluator for LLM Watermark Spoofing Resilience

cs.CR · 2026-04-13 · unverdicted · novelty 7.0

RLSpoofer trains a 4B model on 100 watermarked paraphrase pairs to spoof PF watermarks at a 62% success rate, far exceeding baselines trained on up to 10,000 samples.

TimeMark: A Trustworthy Time Watermarking Framework for Exact Generation-Time Recovery from AIGC

cs.CR · 2026-04-14 · unverdicted · novelty 6.0

TimeMark is a trustworthy time watermarking framework that achieves exact generation-time recovery from AI-generated content with theoretically perfect accuracy by using time-dependent cryptographic keys, random non-stored bit sequences, and two-stage encoding with error-correcting codes.

Mitigating Watermark Forgery in Generative Models via Randomized Key Selection

cs.CR · 2025-07-10 · unverdicted · novelty 5.0

Randomized per-query key selection, paired with single-key detection acceptance, bounds the forgery success rate independently of the number of collected samples while preserving model utility.

Position: LLM Watermarking Should Align Stakeholders' Incentives for Practical Adoption

cs.CR · 2025-10-21 · unverdicted · novelty 4.0

LLM watermarking adoption is limited by misaligned stakeholder incentives; incentive-aligned approaches such as in-context watermarking can enable practical use in targeted domains like education and peer review.

citing papers explorer

Showing 4 of 4 citing papers.

RLSpoofer: A Lightweight Evaluator for LLM Watermark Spoofing Resilience · cs.CR · 2026-04-13 · unverdicted · none · ref 29
TimeMark: A Trustworthy Time Watermarking Framework for Exact Generation-Time Recovery from AIGC · cs.CR · 2026-04-14 · unverdicted · none · ref 13
Mitigating Watermark Forgery in Generative Models via Randomized Key Selection · cs.CR · 2025-07-10 · unverdicted · none · ref 11
Position: LLM Watermarking Should Align Stakeholders' Incentives for Practical Adoption · cs.CR · 2025-10-21 · unverdicted · none · ref 17
