
arxiv: 2602.07235 · v2 · pith:GKTJ2DTDnew · submitted 2026-02-06 · 💻 cs.LG · cs.AI · cs.IT · math.IT

ArcMark: Distortion-Free Multi-Byte LLM Watermark via Optimal Transport

Pith reviewed 2026-05-25 07:01 UTC · model grok-4.3

classification 💻 cs.LG · cs.AI · cs.IT · math.IT
keywords LLM watermarking · distortion-free watermark · multi-bit watermark · optimal transport · channel coding · information capacity

The pith

ArcMark uses optimal transport to embed multiple bytes into LLM text without changing the average next-token probabilities.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces ArcMark to encode multi-byte messages such as user or model identifiers into generated text. It formulates the task as a channel coding problem whose capacity bound guides a construction that uses optimal transport to keep the average next-token distribution identical to that of the unwatermarked LLM. The resulting watermark recovers the embedded information more accurately than prior distortion-free methods, even after an adversary changes some tokens, while producing text with unchanged perplexity and downstream task performance.

Core claim

Formulating distortion-free multi-bit LLM watermarking as a channel coding problem yields an information-theoretic capacity; solving the associated optimal transport problem produces ArcMark, which embeds multiple bytes in a few hundred tokens without perturbing the LLM next-token distribution.

What carries the argument

An optimal transport plan between the message distribution and token sequences that exactly preserves the marginal next-token probabilities.
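
To make the marginal-preservation requirement concrete, here is a minimal Python sketch of a much simpler coupling than ArcMark's optimal transport plan: a shared random shift on a circle. It assumes a toy three-token distribution `q`, `p` message points, and `r` equally spaced shift values, with `p` dividing `r` and every probability in `q` a multiple of `1/r`. The function names are hypothetical and nothing here is taken from the paper's implementation.

```python
from fractions import Fraction

def token_for_point(point, q):
    """Return the index of the token whose arc [start, start + q[i]) contains point."""
    cum = Fraction(0)
    for i, prob in enumerate(q):
        cum += prob
        if point < cum:
            return i
    raise ValueError("point must lie in [0, 1)")

def token_marginal(c, p, r, q):
    """Exact token distribution for message symbol c, averaged over the r shifts."""
    marginal = [Fraction(0)] * len(q)
    for v in range(r):
        # Message point c/p rotated by the shared shift v/r, wrapped around the circle.
        point = (Fraction(c, p) + Fraction(v, r)) % 1
        marginal[token_for_point(point, q)] += Fraction(1, r)
    return marginal

# Toy next-token distribution (assumed); arcs of length q[i] tile the circle [0, 1).
q = [Fraction(1, 2), Fraction(1, 4), Fraction(1, 4)]
p, r = 4, 8  # assumed: p message points, r shift values, p divides r

marginals = {c: token_marginal(c, p, r, q) for c in range(p)}
for c, m in marginals.items():
    print(c, m == q)
```

For every message symbol the averaged token distribution equals `q` exactly, which is the property a distortion-free plan must satisfy. The sketch makes no attempt to optimize how well the message can be recovered.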

If this is right

  • Generated text can carry embedded identifiers for the submitting user, exact model version, or even the prompt itself.
  • Watermarked and unwatermarked outputs are statistically indistinguishable in perplexity and in performance on downstream tasks.
  • Message recovery remains reliable when an attacker alters a subset of the generated tokens.
  • Embedding rates are limited by the derived information-theoretic capacity for distortion-free channels.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The approach could support tracing of individual generations back to specific sessions or prompts at scale.
  • Efficient approximations to the optimal transport step would be needed for deployment on very long outputs.
  • Empirical tests across model families would check whether the capacity expression holds beyond the evaluated settings.

Load-bearing premise

The watermarking task admits a transport plan whose marginals match the original LLM distribution while carrying the message.

What would settle it

An experiment that measures the empirical next-token distribution of watermarked outputs and detects a statistically significant shift from the original LLM, or that shows message recovery rates falling below the claimed capacity.

Figures

Figures reproduced from arXiv: 2602.07235 by Atefeh Gilani, Carol Xuan Long, Flavio P. Calmon, Lalitha Sankar, Oliver Kosut, Sajani Vithana.

Figure 1: Message accuracy on Llama3-8B for different watermark embedding lengths. The top row …
Figure 2: Overview of ArcMark: The k-bit watermarking message (for example 001 in the figure) is mapped to a block of n tokens using a random linear code, defined by the generator matrix G, that generates a collection of codewords {C_t}, t = 1, …, n. Each C_t is mapped to one of p equally spaced points on the circle. Additionally, a shared randomly generated point v_t, which takes r equally spaced values around the circle, is generat…
Figure 3: Message accuracy on Llama2-7B for different watermark embedding lengths. The top row …
Figure 4: Bit accuracy results: the left column corresponds to Llama2-7B and the right column to …
Figure 5: Perplexity of watermarked text generated using …
Figure 6: Bit accuracy (left) and message accuracy (right) on Llama3-8B for different numbers of …
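
The encoding stage described in the Figure 2 caption can be sketched in Python as follows. This is an illustration under assumptions, not the paper's code: it takes the random linear code to be over the integers modulo `p` (the caption does not state the field), and the function names, seed, and parameter values are made up for the example.

```python
import random

def random_generator_matrix(n, k, p, seed=0):
    """Draw an n-by-k matrix with entries uniform in {0, ..., p-1} (assumed field)."""
    rng = random.Random(seed)
    return [[rng.randrange(p) for _ in range(k)] for _ in range(n)]

def encode(message_bits, G, p):
    """Codeword C = G * m (mod p), one symbol C_t per token position."""
    return [sum(g * m for g, m in zip(row, message_bits)) % p for row in G]

def circle_positions(codeword, p):
    """Map each symbol to one of p equally spaced points, as a fraction of a full turn."""
    return [c / p for c in codeword]

k, n, p = 3, 8, 5                    # hypothetical sizes: 3-bit message, 8 tokens
G = random_generator_matrix(n, k, p)
codeword = encode([0, 0, 1], G, p)   # the message 001 from the caption
positions = circle_positions(codeword, p)
print(codeword, positions)
```

Because the map is linear, the codeword for 001 is simply the last column of `G`, and the codeword of a sum of messages is the symbol-wise sum modulo `p`.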
Original abstract

Watermarking is an important tool for promoting the responsible use of large language models (LLMs). Existing watermarks insert a signal into generated tokens that either flags LLM-generated text (zero-bit watermarking) or encodes more complex messages (multi-bit watermarking). Though a number of recent approaches insert multiple bits into text without perturbing average next-token predictions, they largely extend design principles from the zero-bit setting, such as encoding a single bit per token. In contrast, a watermarker capable of embedding multiple bytes into the text would dramatically increase the potential applications, by embedding information such as the ID of the user who submitted the prompt, the precise model version that was used, or even the prompt itself. We address this problem by introducing ArcMark: a new watermark construction based on coding and information-theoretic principles that is capable of reliably embedding multiple bytes of information into just a few hundred tokens, without any distortion of the underlying LLM next-token distribution. We derive ArcMark by formulating the distortion-free watermarking problem as a channel coding problem, and deriving an information-theoretic channel capacity that establishes the fundamental limit of embedding information in LLM output in a distortion-free manner. This capacity formulation informs the design of ArcMark. In practice, ArcMark outperforms competing multi-bit distortion-free watermarks in terms of reconstruction accuracy, including in the face of attacks that alter a subset of the LLM text. ArcMark output is also shown to be indistinguishable from unwatermarked text in terms of perplexity, and in downstream task quality.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper introduces ArcMark, a multi-byte LLM watermarking method based on optimal transport. It formulates distortion-free watermarking as a channel coding problem, derives the associated information-theoretic capacity, and uses this to design an encoder/decoder pair that embeds multiple bytes into a few hundred tokens while leaving the LLM next-token conditional distributions unchanged. The construction is claimed to outperform prior multi-bit distortion-free watermarks in reconstruction accuracy (including under partial text alterations), while producing outputs indistinguishable from unwatermarked text in perplexity and downstream task quality.

Significance. If the capacity derivation and OT construction are shown to achieve the zero-distortion property at the claimed finite lengths, the work would provide a principled, capacity-informed approach to high-rate multi-bit watermarking. This could enable practical applications such as embedding user IDs or model provenance directly in generated text without detectable statistical shifts.

major comments (2)
  1. [Capacity derivation and ArcMark construction sections] The central claim that the derived channel capacity directly informs an OT-based construction achieving the stated embedding rates without message-dependent bias in the output marginals is load-bearing. The abstract states that the capacity 'informs the design,' but the manuscript must explicitly verify that the practical OT plan and its inversion preserve the exact original LLM marginal at every position for finite blocks of a few hundred tokens; any gap between the idealized channel model and sequential LLM generation would undermine both the zero-distortion guarantee and the reported rates.
  2. [Experiments section] The experimental claims of outperforming baselines in reconstruction accuracy (including under attacks) and maintaining downstream task quality rest on the zero-distortion property holding in practice. The manuscript should include a direct empirical check (e.g., via KL divergence or next-token prediction statistics) that the watermarked and unwatermarked distributions are identical within sampling error; without this, the performance numbers cannot be attributed to the claimed distortion-free property.
minor comments (2)
  1. [Abstract] The abstract would benefit from naming the specific competing multi-bit distortion-free baselines and reporting at least one quantitative reconstruction-accuracy figure.
  2. [Preliminaries / Method sections] Notation for the OT plan, the channel model, and the capacity expression should be introduced with explicit definitions before being used in the construction.

Simulated Authors' Rebuttal

2 responses · 0 unresolved

We thank the referee for their constructive and detailed comments. We address each major comment below and outline the revisions we will make to strengthen the manuscript.

Point-by-point responses
  1. Referee: [Capacity derivation and ArcMark construction sections] The central claim that the derived channel capacity directly informs an OT-based construction achieving the stated embedding rates without message-dependent bias in the output marginals is load-bearing. The abstract states that the capacity 'informs the design,' but the manuscript must explicitly verify that the practical OT plan and its inversion preserve the exact original LLM marginal at every position for finite blocks of a few hundred tokens; any gap between the idealized channel model and sequential LLM generation would undermine both the zero-distortion guarantee and the reported rates.

    Authors: The ArcMark construction computes an optimal transport plan between the message-conditioned and original next-token distributions at each step, chosen so that its token marginal does not depend on the message; the decoder inverts this plan using the same shared randomness. Because a transport plan is a coupling with fixed marginals, the resulting token distribution equals the original LLM marginal for any finite block length when the plan is applied sequentially. We will add an explicit lemma and short proof in Section 3 (or a new appendix) demonstrating that the finite-block sequential generation preserves the marginal at every position under the channel-coding formulation, thereby closing any perceived gap between the idealized model and the implemented encoder. revision: yes

  2. Referee: [Experiments section] The experimental claims of outperforming baselines in reconstruction accuracy (including under attacks) and maintaining downstream task quality rest on the zero-distortion property holding in practice. The manuscript should include a direct empirical check (e.g., via KL divergence or next-token prediction statistics) that the watermarked and unwatermarked distributions are identical within sampling error; without this, the performance numbers cannot be attributed to the claimed distortion-free property.

    Authors: We agree that an explicit empirical verification would make the zero-distortion claim more robust. In the revised manuscript we will add, in the Experiments section, a direct comparison on a held-out prompt set: (i) average KL divergence between the watermarked and unwatermarked next-token conditional distributions, and (ii) per-token frequency statistics, both shown to be statistically indistinguishable within sampling error. These results will be reported alongside the existing perplexity and downstream-task metrics. revision: yes
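
The first proposed check, an average KL divergence between next-token conditionals, could be computed along these lines. This is a sketch assuming the two distributions at each position are available as aligned probability lists over the same vocabulary; the function names and toy numbers are illustrative and do not come from the paper.

```python
import math

def kl_divergence(p, q):
    """KL(p || q) in nats for two distributions over the same vocabulary."""
    total = 0.0
    for pi, qi in zip(p, q):
        if pi > 0:
            if qi == 0:
                return math.inf  # p puts mass where q has none
            total += pi * math.log(pi / qi)
    return total

def mean_kl(watermarked, unwatermarked):
    """Average KL over positions; each argument is a list of distributions."""
    pairs = list(zip(watermarked, unwatermarked))
    return sum(kl_divergence(p, q) for p, q in pairs) / len(pairs)

# Toy data (assumed): two positions, vocabulary of size 3.
base = [[0.5, 0.25, 0.25], [0.7, 0.2, 0.1]]
same = [list(dist) for dist in base]             # undistorted case
shifted = [[0.6, 0.2, 0.2], [0.7, 0.2, 0.1]]     # first position perturbed

kl_same = mean_kl(same, base)
kl_shifted = mean_kl(shifted, base)
print(kl_same, kl_shifted)
```

Identical distributions give a mean KL of zero and any perturbation gives a positive value. With distributions estimated from samples rather than read from the model, the comparison would also need a noise baseline, for example the same statistic computed between two unwatermarked runs.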

Circularity Check

0 steps flagged

Derivation self-contained; capacity derived from first principles informs OT construction without reduction to inputs

full rationale

The abstract states that the distortion-free watermarking problem is formulated as a channel coding problem, from which an information-theoretic capacity is derived to establish fundamental limits and inform the ArcMark design. No equations, self-citations, fitted parameters, or ansatzes are quoted that reduce the claimed capacity or OT plan to the target rates by construction. The construction is presented as achieving the derived limits via optimal transport without message-dependent bias, and the paper reports empirical outperformance on reconstruction accuracy and perplexity. This is the most common honest finding when the central claim rests on an external information-theoretic formulation rather than a self-referential fit or renamed empirical pattern.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

Abstract-only review. The central claim rests on treating LLM token generation as a memoryless channel whose capacity can be achieved via optimal transport; the abstract lists no explicit free parameters or invented entities.

axioms (1)
  • domain assumption: LLM next-token distributions can be modeled as a channel for which a distortion-free capacity exists and can be achieved by optimal transport
    Invoked when the abstract states the problem is formulated as a channel coding problem whose capacity informs the design.

pith-pipeline@v0.9.0 · 5833 in / 1275 out tokens · 52743 ms · 2026-05-25T07:01:08.880029+00:00 · methodology


Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

  • IndisputableMonolith/Foundation/AlexanderDuality.lean alexander_duality_circle_linking echoes

    ECHOES: this paper passage has the same mathematical shape or conceptual pattern as the Recognition theorem, but is not a direct formal dependency.

    We derive ArcMark by formulating the distortion-free watermarking problem as a channel coding problem, and deriving an information-theoretic channel capacity... ArcMark employs a random linear channel code... represents each message codeword symbol C, generated token X, and side information V as points on the unit circle, and (ii) solves an optimal transport problem that minimizes the arc length

  • IndisputableMonolith/Cost/FunctionalEquation.lean washburn_uniqueness_aczel unclear

    Relation between the paper passage and the cited Recognition theorem.

    Theorem 3.1. The watermarking capacity is given by R_cap = max I(W;X) s.t. Pr(X=x|Q=q)=q(x)
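
As a rough illustration of the quantity in Theorem 3.1, the Python sketch below evaluates I(W;X) for one feasible choice under a single fixed distribution q. It assumes W is an auxiliary variable, uniform on four values, mapped to tokens by a deterministic inverse-CDF rule. This reading of W and all names are assumptions made for the example, and the sketch evaluates the objective at one point rather than carrying out the maximization.

```python
import math
from collections import defaultdict

def mutual_information_bits(joint):
    """I(W;X) in bits from a dict {(w, x): probability}."""
    p_w, p_x = defaultdict(float), defaultdict(float)
    for (w, x), prob in joint.items():
        p_w[w] += prob
        p_x[x] += prob
    return sum(
        prob * math.log2(prob / (p_w[w] * p_x[x]))
        for (w, x), prob in joint.items()
        if prob > 0
    )

# Toy next-token distribution q (assumed) and a deterministic map from W to tokens:
# W in {0, 1} -> token 0, W = 2 -> token 1, W = 3 -> token 2.
q = [0.5, 0.25, 0.25]
token_of = {0: 0, 1: 0, 2: 1, 3: 2}
joint = {(w, x): 0.25 for w, x in token_of.items()}  # W uniform on four values

# Constraint check: the token marginal must equal q.
x_marginal = [sum(pr for (_, x), pr in joint.items() if x == t) for t in range(len(q))]
rate = mutual_information_bits(joint)
print(x_marginal, rate)
```

In this toy case the token is a deterministic function of W, so I(W;X) equals the entropy of q, which is 1.5 bits, while the token marginal stays equal to q.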

What do these tags mean?
  • matches: The paper's claim is directly supported by a theorem in the formal canon.
  • supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
  • extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
  • uses: The paper appears to rely on the theorem as machinery.
  • contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
  • unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Forward citations

Cited by 2 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Every Bit, Everywhere, All at Once: A Binomial Multibit LLM Watermark

    cs.CR · 2026-05 · unverdicted · novelty 7.0

    A binomial multibit watermarking scheme encodes every payload bit at each LLM token with dynamic redirection, outperforming baselines in accuracy and robustness for large payloads.

  2. Covert Multi-bit LLM Watermarking: An Information Theory and Coding Approach

    cs.IT · 2026-05 · unverdicted · novelty 6.0

    Characterizes the exact capacity of multi-bit covert LLM watermarking via Gelfand-Pinsker and channel synthesis, then gives a polar-code algorithm achieving 0.375 bits/token at under 10% BER with negligible perplexity impact.
