Majority Bit-Aware Watermarking For Large Language Models
Pith reviewed 2026-05-19 00:25 UTC · model grok-4.3
The pith
Majority bit-aware encoding embeds detectable messages in LLM text without forcing smaller green lists.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The central claim is that majority bit-aware encoding decouples watermark signal strength from green list size, so a strong, detectable signal is preserved in generated text even when the green list is large.
What carries the argument
Majority bit-aware encoding, a message encoding paradigm that determines the watermark preference from the majority bit rather than green-list size.
If this is right
- Large green lists become usable while still supporting accurate multi-bit message recovery.
- Generation quality improves because fewer tokens are excluded from the preferred set.
- MajorMark+ extends the same benefit to longer embedded messages without extra quality cost.
Where Pith is reading between the lines
- The same encoding idea could be tried on other generative models that sample from large vocabularies.
- It might reduce the usual quality penalty when watermarking is applied to open-ended or creative tasks.
- Detection could remain robust under moderate distribution shifts in sampling temperature or top-p.
Load-bearing premise
The majority-bit mapping produces a statistically detectable bias in token choices that stays reliable no matter how large the green list becomes.
What would settle it
Running the method with large green lists on real LLMs and finding that detection accuracy falls to chance, or that text quality is no better than the old restricted-list baselines, would disprove the claim.
Original abstract
The growing deployment of Large Language Models (LLMs) has raised concerns about their misuse in generating harmful or deceptive content. To address this issue, watermarking methods have been proposed to embed identifiable multi-bit messages into generated text for misuse tracing. However, existing methods often suffer from a fundamental trade-off between text quality and decoding accuracy. In particular, they have to restrict the size of the preferred token set (i.e., green list) during encoding to maintain a detectable watermark signal for decoding, which inevitably degrades generation quality. To improve this trade-off, we propose a novel message encoding paradigm called majority bit-aware encoding, which relaxes the watermark signal strength from the green list size. This strategy allows for a strong watermark signal to be preserved in generated texts even when using a large green list. We introduce two instantiations of this paradigm: MajorMark and MajorMark+, where the latter is specifically optimized for long messages. Extensive experiments on state-of-the-art LLMs demonstrate that our methods achieve higher decoding accuracy and superior text quality compared to prior baselines.
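The trade-off the abstract describes can be illustrated with a small sketch. It assumes the one-proportion z-test on green token counts that earlier green-list watermarks commonly use for detection; it is not the paper's multi-bit decoder, which this page does not describe. The function name `max_z_score` and the 200-token text length are illustrative choices, not taken from the paper.

```python
import math


def max_z_score(num_tokens: int, gamma: float) -> float:
    """Largest z-score a green-count test can reach for a given green ratio.

    Assumed test (not from the paper): z = (g - gamma*T) / sqrt(T*gamma*(1-gamma)),
    where g is the number of green tokens among T scored tokens. The ceiling
    is reached when every token is green, i.e. g = T.
    """
    if not 0.0 < gamma < 1.0:
        raise ValueError("gamma must be strictly between 0 and 1")
    return (num_tokens - gamma * num_tokens) / math.sqrt(
        num_tokens * gamma * (1.0 - gamma)
    )


# The ceiling shrinks as the green list grows, which is the dependence
# the abstract says existing methods cannot avoid.
for gamma in (0.25, 0.5, 0.75, 0.9):
    print(f"gamma={gamma:.2f}  max z over 200 tokens = {max_z_score(200, gamma):.2f}")
```

Under this assumed test the ceiling simplifies to sqrt(T(1 - γ)/γ), so a larger green list leaves less headroom for the same text length.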
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript proposes a novel majority bit-aware encoding paradigm for watermarking large language models to embed multi-bit messages. This approach relaxes the dependence of the watermark signal strength on the green list size, enabling the use of larger green lists to improve text quality while maintaining a strong detectable signal. The authors present two instantiations, MajorMark and MajorMark+, with the latter optimized for long messages, and claim through experiments on state-of-the-art LLMs that these methods achieve higher decoding accuracy and superior text quality compared to prior baselines.
Significance. If the empirical claims hold under realistic conditions, the work addresses a central limitation in existing LLM watermarking schemes by decoupling signal strength from green-list size. This could enable more practical multi-bit watermarking for misuse tracing without the usual quality degradation, representing a targeted improvement over prior trade-off resolutions.
major comments (2)
- Abstract: The claim of 'higher decoding accuracy and superior text quality' from 'extensive experiments' is asserted without any reported quantitative metrics, baselines, or controls, leaving the central empirical support for the majority-bit paradigm unassessable from the provided text.
- §3 (Encoding and Detection): The assumption that majority-bit mapping preserves statistical detectability independent of the green/red partition ratio lacks supporting analysis of detection statistic variance or bias under Zipf-like next-token distributions; under skewed LLM sampling, high-probability tokens may dominate large green lists and render the majority vote either trivial or unsteerable without reintroducing quality penalties.
Simulated Author's Rebuttal
We thank the referee for their constructive comments, which help clarify how to better present the contributions of the majority bit-aware encoding paradigm. We address each major comment below and commit to revisions that strengthen the manuscript without altering its core claims.
Point-by-point responses
- Referee: Abstract: The claim of 'higher decoding accuracy and superior text quality' from 'extensive experiments' is asserted without any reported quantitative metrics, baselines, or controls, leaving the central empirical support for the majority-bit paradigm unassessable from the provided text.
  Authors: We agree that the abstract would benefit from explicit quantitative support for the empirical claims. In the revised manuscript we will update the abstract to include key metrics (e.g., decoding accuracy gains and quality improvements relative to the strongest baselines) drawn directly from the experimental results already reported in the paper body.
  Revision: yes
- Referee: §3 (Encoding and Detection): The assumption that majority-bit mapping preserves statistical detectability independent of the green/red partition ratio lacks supporting analysis of detection statistic variance or bias under Zipf-like next-token distributions; under skewed LLM sampling, high-probability tokens may dominate large green lists and render the majority vote either trivial or unsteerable without reintroducing quality penalties.
  Authors: This observation correctly identifies a gap in the current theoretical justification. While the majority-bit construction is intended to aggregate per-token signals so that detectability does not scale directly with green-list size, we did not supply a variance or bias analysis under realistic (Zipfian) token distributions. We will add a dedicated paragraph and supporting simulation in §3 that derives the expected behavior of the majority-vote statistic and shows that high-probability tokens do not render the signal trivial or force quality degradation.
  Revision: yes
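The referee's concern about skewed token distributions can be probed with a rough simulation. The sketch below is not the authors' promised analysis. It assumes a fixed Zipf-shaped next-token distribution and a green list drawn uniformly at random from the vocabulary, and it measures how much probability mass lands in the green list. The helper names (`zipf_probs`, `green_mass_samples`), the vocabulary size, and the exponent are all hypothetical.

```python
import random
import statistics


def zipf_probs(vocab_size, exponent=1.0):
    """Normalized Zipf-like probabilities: p(rank) proportional to rank**-exponent."""
    weights = [1.0 / (rank ** exponent) for rank in range(1, vocab_size + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def green_mass_samples(probs, gamma, trials, seed=0):
    """Probability mass captured by a random green list, over repeated partitions."""
    rng = random.Random(seed)
    green_size = round(gamma * len(probs))
    masses = []
    for _ in range(trials):
        # Draw a fresh random green list of the requested size.
        green = rng.sample(range(len(probs)), green_size)
        masses.append(sum(probs[token] for token in green))
    return masses


probs = zipf_probs(5000)
for gamma in (0.25, 0.5, 0.75):
    masses = green_mass_samples(probs, gamma, trials=200)
    print(
        f"gamma={gamma:.2f}  mean green mass={statistics.mean(masses):.3f}  "
        f"std={statistics.stdev(masses):.3f}"
    )
```

The mean green mass tracks γ, but the spread across partitions is driven by whether the few highest-probability tokens happen to be green. That spread is the variance the referee asks the authors to characterize.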
Circularity Check
No circularity: novel encoding paradigm presented without reduction to fitted inputs or self-citations
Full rationale
The paper introduces majority bit-aware encoding as a new message encoding paradigm that decouples watermark signal strength from green list size. No equations, fitted parameters, or predictions are described that reduce by construction to prior inputs. The central claim rests on the proposed encoding change and is supported by experiments rather than self-referential derivations or load-bearing self-citations. This is a standard case of an independent methodological contribution with no detectable circular steps in the provided description.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption: LLM generation proceeds by sampling tokens from a probability distribution over a fixed vocabulary.
Lean theorems connected to this paper
- IndisputableMonolith/Cost/FunctionalEquation.lean, washburn_uniqueness_aczel (unclear)
  Relation between the paper passage and the cited Recognition theorem.
  Paper passage: majority bit-aware encoding ... guarantees that γ ≥ 0.5 ... E_m[γ] = 0.5 + 1/√(2πb)
- IndisputableMonolith/Foundation/ArithmeticFromLogic.lean, embed_injective (unclear)
  Relation between the paper passage and the cited Recognition theorem.
  Paper passage: clustering-based decoding ... deterministic decoding ... shard-wise token occurrence count
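The expression E_m[γ] = 0.5 + 1/√(2πb) in the first quoted passage can be checked numerically under one reading of the elided text. The sketch assumes that γ is the fraction of bits in a b-bit message equal to the majority bit, and that messages are uniformly random. Both are assumptions made for illustration, since the passage is truncated. The function names are hypothetical.

```python
import math


def expected_majority_fraction(b: int) -> float:
    """Exact E[max(k, b - k)] / b, where k ~ Binomial(b, 1/2) counts the one-bits."""
    total = sum(math.comb(b, k) * max(k, b - k) for k in range(b + 1))
    return total / (b * 2 ** b)


def approx_majority_fraction(b: int) -> float:
    """Closed-form expression quoted in the passage: 0.5 + 1/sqrt(2*pi*b)."""
    return 0.5 + 1.0 / math.sqrt(2.0 * math.pi * b)


# Compare the exact expectation with the quoted closed form for a few lengths.
for b in (8, 16, 32, 64):
    exact = expected_majority_fraction(b)
    approx = approx_majority_fraction(b)
    print(f"b={b:3d}  exact={exact:.4f}  quoted formula={approx:.4f}")
```

Under this reading the majority fraction is never below 0.5, which matches the quoted γ ≥ 0.5, and the closed form behaves as a large-b approximation of the exact expectation.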
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.