A watermark for large language models

John Kirchenbauer, Jonas Geiping, Yuxin Wen, Jonathan Katz, Ian Miers, Tom Goldstein · 2023

8 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

background 3

representative citing papers

Every Bit, Everywhere, All at Once: A Binomial Multibit LLM Watermark

cs.CR · 2026-05-12 · unverdicted · novelty 7.0

A binomial multibit watermarking scheme encodes every payload bit at each LLM token with dynamic redirection, outperforming baselines in accuracy and robustness for large payloads.

RLSpoofer: A Lightweight Evaluator for LLM Watermark Spoofing Resilience

cs.CR · 2026-04-13 · unverdicted · novelty 7.0

RLSpoofer trains a 4B model on 100 watermarked paraphrase pairs to spoof PF watermarks at a 62% success rate, far exceeding baselines trained on up to 10,000 samples.

Towards Distillation-Resistant Large Language Models: An Information-Theoretic Perspective

cs.CL · 2026-02-03 · unverdicted · novelty 7.0

A learned transformation matrix minimizes CMI in teacher logits to degrade distillation performance while preserving task accuracy.

Robust Spectral Watermark for Synthetic Tabular Data

cs.CR · 2025-11-26 · unverdicted · novelty 7.0

TAB-DRW embeds detectable watermarks in the frequency domain of normalized synthetic tabular data via DFT and rank-based pseudorandom bits, achieving robustness to attacks while preserving fidelity and supporting mixed data types.

ArcMark: Distortion-Free Multi-Byte LLM Watermark via Optimal Transport

cs.LG · 2026-02-06 · unverdicted · novelty 6.0

ArcMark is a multi-byte LLM watermark that achieves distortion-free embedding of several bytes per few hundred tokens by treating generation as a channel coding problem and using optimal transport to match distributions.

GhostCite: A Large-Scale Analysis of Citation Validity in the Age of Large Language Models

cs.CR · 2026-02-06 · unverdicted · novelty 6.0

LLMs hallucinate citations at rates from 14.23% to 94.93%, with 1.07% of papers containing invalid citations and an 80.9% increase in 2025.

Fundamental Trade-Offs in Multi-Bit Watermarking of Stochastic Processes

cs.IT · 2026-05-09 · unverdicted · novelty 5.0

Derives matched converse and achievability bounds that characterize optimal trade-offs among false-alarm probability, detection error probability, distortion, and information rate for multi-bit watermarking of stationary ergodic stochastic processes.

Lightweight Stylistic Consistency Profiling: Robust Detection of LLM-Generated Textual Content for Multimedia Moderation

cs.CL · 2026-05-07 · unverdicted · novelty 4.0

LiSCP detects LLM-generated text via stylistic consistency profiling across paraphrased variants and reports up to 11.79% better cross-domain accuracy plus robustness to adversarial attacks.

