pith. sign in

arxiv: 2605.19402 · v1 · pith:Q4D6WD6Knew · submitted 2026-05-19 · 💻 cs.CR

High-Rate Public-Key Pseudorandom Codes for Edit Errors

Pith reviewed 2026-05-20 04:51 UTC · model grok-4.3

classification 💻 cs.CR
keywords pseudorandom codesedit errorsinsertion-deletionpublic-key cryptographywatermarkingerror-correcting codeshigh-rate constructions
0
0 comments X

The pith

Public-key pseudorandom codes for edit errors reach rates near 1 over large alphabets and near 1/2 over binary.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper shows how to build public-key pseudorandom codes that stay decodable after insertions and deletions while remaining indistinguishable from random strings. It does this by giving a reduction that converts codes robust to substitutions into codes robust to sublinear edit errors, then supplying explicit constructions that achieve high rates for different alphabet sizes. A reader would care because these codes support robust watermarking of generated content that survives typical edits without revealing the watermark. The constructions rely on the same assumptions used for simpler Hamming-error versions.

Core claim

We construct public-key PRCs with rate arbitrarily close to 1 over sufficiently large constant alphabets, and with rate arbitrarily close to 1/2 over the binary alphabet. Moreover, if we allow the alphabet size to be poly(lambda), then our public-key PRCs can attain the Singleton bound for insertion-deletion channels. We also give a reduction showing that binary zero-bit PRCs robust against substitution errors can be transformed into binary zero-bit PRCs robust against sublinear polynomial edit errors.

What carries the argument

A new reduction from zero-bit Hamming-robust PRCs to edit-robust PRCs, combined with algebraic constructions for high-rate pseudorandom codes over edit channels.

If this is right

  • This yields the first high-rate public-key binary PRCs for edit channels.
  • The constructions enable watermarking that tolerates edits while preserving computational indistinguishability from random.
  • When the alphabet size grows polynomially with the security parameter, the codes achieve the optimal rate given by the Singleton bound for insertion-deletion channels.
  • The results apply to any assumption that already gives zero-bit Hamming-robust PRCs.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The reduction technique may extend to other error models such as deletions only or mixed substitution-edit channels.
  • These codes could support applications in data storage over channels prone to insertions and deletions, such as DNA-based storage.
  • Future work could test whether the same reduction works for linear edit error rates instead of only sublinear rates.

Load-bearing premise

The existence of zero-bit pseudorandom codes that are robust against a constant fraction of substitution errors under some cryptographic assumption.

What would settle it

An efficient algorithm that can distinguish the output strings of the constructed codes from uniformly random strings after a small number of insertions and deletions.

read the original abstract

Pseudorandom codes (PRCs), introduced by Christ and Gunn (CRYPTO '2024), are error-correcting codes whose codewords are computationally indistinguishable from uniformly random strings, while still being decodable by someone holding the key. They provide a natural primitive for robust and undetectable watermarking, particularly in applications to AI-generated content. Although recent works have obtained strong results for substitution errors, the edit-error setting remains much less understood, especially in the high-rate regime and over small alphabets. We study public-key pseudorandom codes against edit errors. First, we give a new reduction showing that binary zero-bit PRCs robust against a constant fraction of substitution errors can be transformed into binary zero-bit PRCs robust against edit errors. Consequently, under any assumption that yields zero-bit Hamming-robust PRCs, one also obtains zero-bit PRCs for edit channels, albeit only for the weaker class of sublinear polynomial edit channels, namely channels with edit error rate $1/n^{\gamma}$ for any constant $\gamma>0$. In the high-rate regime, we construct public-key PRCs with rate arbitrarily close to $1$ over sufficiently large constant alphabets, and with rate arbitrarily close to $1/2$ over the binary alphabet. Moreover, if we allow the alphabet size to be $\mathrm{poly}(\lambda)$, where $\lambda$ is the security parameter, then our public-key PRCs can attain the Singleton bound for insertion-deletion channels. Taken together, these results yield the first high-rate public-key binary PRC constructions for edit channels, under the same assumption that yields zero-bit Hamming-robust PRCs.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper introduces a new reduction converting binary zero-bit PRCs that are robust to a constant fraction of substitution errors into zero-bit PRCs robust to sublinear polynomial edit errors (rate 1/n^γ for constant γ>0). It also gives explicit constructions of public-key PRCs achieving rates arbitrarily close to 1 over large constant alphabets, rates arbitrarily close to 1/2 over the binary alphabet, and the Singleton bound when the alphabet size is poly(λ), all for insertion-deletion channels and under the same cryptographic assumptions that yield Hamming-robust PRCs.

Significance. If the reduction and constructions are correct, the work provides the first high-rate public-key PRCs for edit errors in the binary setting. The reduction usefully connects the Hamming and edit settings, and the high-rate results (including approaching the Singleton bound) are technically notable for applications such as robust watermarking of AI-generated content.

major comments (2)
  1. [Abstract] Abstract: the high-rate binary construction is presented as yielding PRCs 'for edit channels' under the same assumption as Hamming-robust PRCs, yet the reduction only guarantees sublinear edit error rate 1/n^γ rather than constant-fraction robustness. If the binary high-rate result routes through this reduction, the achieved δ is sublinear; this distinction should be stated explicitly when the high-rate binary result is claimed, as it affects comparison to standard constant-fraction insertion-deletion models.
  2. [High-rate constructions section] The manuscript should clarify in the high-rate regime section whether the binary rate-1/2 construction achieves constant-fraction edit robustness via a direct argument or inherits only the sublinear guarantee from the reduction; this is load-bearing for the claim of 'first high-rate public-key binary PRC constructions for edit channels'.
minor comments (2)
  1. [Introduction] Add a short paragraph in the introduction comparing the sublinear edit rate obtained here to prior constant-fraction edit-error results for PRCs or related primitives.
  2. [High-rate constructions section] Ensure that all parameter settings (including the dependence on γ and the security parameter λ) are restated when the high-rate binary result is presented.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the careful reading and valuable feedback on our manuscript. The comments highlight an important point regarding the precise robustness guarantees in our high-rate binary constructions, and we agree that explicit clarification will strengthen the presentation. We address each major comment below and will revise the manuscript to incorporate the suggested distinctions.

read point-by-point responses
  1. Referee: [Abstract] Abstract: the high-rate binary construction is presented as yielding PRCs 'for edit channels' under the same assumption as Hamming-robust PRCs, yet the reduction only guarantees sublinear edit error rate 1/n^γ rather than constant-fraction robustness. If the binary high-rate result routes through this reduction, the achieved δ is sublinear; this distinction should be stated explicitly when the high-rate binary result is claimed, as it affects comparison to standard constant-fraction insertion-deletion models.

    Authors: We agree that this distinction is important and should be stated explicitly. The high-rate binary construction (achieving rate approaching 1/2) is obtained by composing our new reduction with an underlying high-rate Hamming-robust PRC; consequently, it inherits only the sublinear polynomial edit-error guarantee (edit rate 1/n^γ for constant γ > 0) rather than constant-fraction robustness. By contrast, the large-alphabet constructions achieve constant-fraction edit robustness. We will revise the abstract to explicitly note that the binary high-rate result yields the first such public-key PRCs for sublinear edit channels under the stated assumptions, while preserving the comparison to Hamming-robust PRCs. revision: yes

  2. Referee: [High-rate constructions section] The manuscript should clarify in the high-rate regime section whether the binary rate-1/2 construction achieves constant-fraction edit robustness via a direct argument or inherits only the sublinear guarantee from the reduction; this is load-bearing for the claim of 'first high-rate public-key binary PRC constructions for edit channels'.

    Authors: We thank the referee for identifying this load-bearing clarification. The binary rate-1/2 construction does not achieve constant-fraction edit robustness via a direct argument; it inherits the sublinear polynomial guarantee (edit rate 1/n^γ) from the reduction. We will add an explicit statement in the high-rate regime section (and in the relevant theorem statement) to make this inheritance clear, thereby accurately qualifying the claim as the first high-rate public-key binary PRCs for sublinear edit channels. revision: yes

Circularity Check

0 steps flagged

No circularity; results built from external assumptions and a new reduction

full rationale

The paper's derivation proceeds by stating a new reduction that converts constant-fraction Hamming-robust zero-bit PRCs into edit-robust versions (limited to sublinear edit rates 1/n^γ) and then presents high-rate constructions under the same external cryptographic assumption introduced by Christ and Gunn. No equation or claim reduces a derived quantity to a fitted parameter or self-defined input by construction, nor does any load-bearing step rely on a self-citation chain or imported uniqueness theorem from the authors' prior work. The central claims remain independent of internal fitting or renaming and are conditioned on the stated external assumption.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

Results rest on the existence of zero-bit Hamming-robust PRCs from prior work and standard cryptographic hardness assumptions; no new free parameters or invented entities are introduced in the abstract.

axioms (1)
  • domain assumption Existence of zero-bit PRCs robust against a constant fraction of substitution errors under some cryptographic assumption
    Invoked to obtain edit-error PRCs via the stated reduction for sublinear polynomial edit channels.

pith-pipeline@v0.9.0 · 5836 in / 1203 out tokens · 33913 ms · 2026-05-20T04:51:55.871392+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

What do these tags mean?
matches
The paper's claim is directly supported by a theorem in the formal canon.
supports
The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends
The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses
The paper appears to rely on the theorem as machinery.
contradicts
The paper's claim conflicts with a theorem or certificate in the canon.
unclear
Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

47 extracted references · 47 canonical work pages · 2 internal anchors

  1. [1]

    IEEE transactions on information theory , volume =

    Asymptotically good codes correcting insertions, deletions, and transpositions , author =. IEEE transactions on information theory , volume =. 2002 , publisher =

  2. [2]

    and Leonard, D.A

    Sakata, S. and Leonard, D.A. and Jensen, H.E. and Hoholdt, T. , year = 1998, month = jul, journal =. Fast Erasure-and-Error Decoding of Algebraic Geometry Codes up to the. doi:10.1109/18.681332 , urldate =

  3. [3]

    Expected length of the longest common subsequence for large alphabets

    Expected Length of the Longest Common Subsequence for Large Alphabets , author =. doi:10.48550/arXiv.math/0308234 , urldate =. arXiv , keywords =:math/0308234 , publisher =

  4. [4]

    IEEE Transactions on Information Theory , volume =

    Linear-Time Encodable/Decodable Codes with near-Optimal Rate , author =. IEEE Transactions on Information Theory , volume =. doi:10.1109/TIT.2005.855587 , urldate =

  5. [5]

    Length of the longest common subsequence between overlapping words

    Length of the Longest Common Subsequence between Overlapping Words , author =. doi:10.48550/arXiv.1803.03238 , urldate =. arXiv , keywords =:1803.03238 , primaryclass =

  6. [6]

    Advances in Neural Information Processing Systems , volume =

    Edit distance robust watermarks via indexing pseudorandom codes , author =. Advances in Neural Information Processing Systems , volume =

  7. [7]

    arXiv preprint arXiv:2512.08918 , year =

    Improved Pseudorandom Codes from Permuted Puzzles , author =. arXiv preprint arXiv:2512.08918 , year =

  8. [8]

    Streaming algorithms for embedding and computing edit distance in the low distance regime , year =

    Chakraborty, Diptarka and Goldenberg, Elazar and Kouck\'. Streaming algorithms for embedding and computing edit distance in the low distance regime , year =. doi:10.1145/2897518.2897577 , booktitle =

  9. [9]

    Annual International Cryptology Conference , pages =

    Pseudorandom Error-Correcting Codes , author =. Annual International Cryptology Conference , pages =

  10. [10]

    arXiv preprint arXiv:2303.17370 , year =

    Linear insertion deletion codes in the high-noise and high-rate regimes , author =. arXiv preprint arXiv:2303.17370 , year =

  11. [11]

    Approximation, Randomization, and Combinatorial Optimization

    Ghentiyala, Surendra and Guruswami, Venkatesan , title =. Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2025) , pages =. 2025 , volume =. doi:10.4230/LIPIcs.APPROX/RANDOM.2025.54 , annote =

  12. [12]

    Synchronization

    Haeupler, Bernhard and Shahrasbi, Amirbehshad , year = 2021, month = oct, journal =. Synchronization. doi:10.1145/3468265 , urldate =

  13. [13]

    Shtetl-Optimized , month =

    Scott Aaronson , title =. Shtetl-Optimized , month =. 2022 , howpublished =

  14. [14]

    Undetectable Watermarks for Language Models , booktitle =

    Miranda Christ and Sam Gunn and Or Zamir , editor =. Undetectable Watermarks for Language Models , booktitle =. 2024 , url =

  15. [15]

    Ideal Pseudorandom Codes , booktitle =

    Omar Alrabiah and Prabhanjan Ananth and Miranda Christ and Yevgeniy Dodis and Sam Gunn , editor =. Ideal Pseudorandom Codes , booktitle =. 2025 , url =. doi:10.1145/3717823.3718309 , timestamp =

  16. [16]

    Theory of Cryptography Conference , pages =

    Black-box crypto is useless for pseudorandom codes , author =. Theory of Cryptography Conference , pages =. 2025 , organization =

  17. [17]

    Theory of Cryptography Conference , pages =

    Separating pseudorandom codes from local oracles , author =. Theory of Cryptography Conference , pages =. 2025 , organization =

  18. [18]

    Transactions on Machine Learning Research , issn =

    Robust Distortion-free Watermarks for Language Models , author =. Transactions on Machine Learning Research , issn =. 2024 , url =

  19. [19]

    Provable Robust Watermarking for

    Xuandong Zhao and Prabhanjan Vijendra Ananth and Lei Li and Yu-Xiang Wang , booktitle =. Provable Robust Watermarking for. 2024 , url =

  20. [20]

    International conference on machine learning , pages =

    A watermark for large language models , author =. International conference on machine learning , pages =. 2023 , organization =

  21. [21]

    The Thirteenth International Conference on Learning Representations , year =

    An Undetectable Watermark for Generative Image Models , author =. The Thirteenth International Conference on Learning Representations , year =

  22. [22]

    2025 IEEE Symposium on Security and Privacy (SP) , pages =

    Watermarking language models for many adaptive users , author =. 2025 IEEE Symposium on Security and Privacy (SP) , pages =. 2025 , organization =

  23. [23]

    The Twelfth International Conference on Learning Representations , year =

    On the Reliability of Watermarks for Large Language Models , author =. The Twelfth International Conference on Learning Representations , year =

  24. [24]

    SIAM Journal on Discrete Mathematics , volume =

    Cheng, Kuan and Guruswami, Venkatesan and Haeupler, Bernhard and Li, Xin , title =. SIAM Journal on Discrete Mathematics , volume =. 2023 , doi =

  25. [25]

    Improved Constructions of Linear Codes for Insertions and Deletions , year =

    Gross, Roee and Con, Roni and Yaakobi, Eitan , journal =. Improved Constructions of Linear Codes for Insertions and Deletions , year =

  26. [26]

    SoK: Watermarking for AI-Generated Content , year =

    Zhao, Xuandong and Gunn, Sam and Christ, Miranda and Fairoze, Jaiden and Fabrega, Andres and Carlini, Nicholas and Garg, Sanjam and Hong, Sanghyun and Nasr, Milad and Tramer, Florian and Jha, Somesh and Li, Lei and Wang, Yu-Xiang and Song, Dawn , booktitle =. SoK: Watermarking for AI-Generated Content , year =

  27. [27]

    2007 , issue_date =

    Ostrovsky, Rafail and Rabani, Yuval , title =. 2007 , issue_date =. doi:10.1145/1284320.1284322 , journal =

  28. [28]

    2012 , isbn =

    Jowhari, Hossein , title =. 2012 , isbn =. doi:10.1007/978-3-642-33090-2_56 , pages =

  29. [29]

    Proceedings of the Fifteenth Annual ACM-SIAM Symposium on Discrete Algorithms , pages =

    Indyk, Piotr , title =. Proceedings of the Fifteenth Annual ACM-SIAM Symposium on Discrete Algorithms , pages =. 2004 , isbn =

  30. [30]

    Bar-Yossef, Ziv and Jayram, T. S. and Krauthgamer, Robert and Kumar, Ravi , title =. 2004 , isbn =. doi:10.1109/FOCS.2004.14 , booktitle =

  31. [31]

    Communication complexity of document exchange , year =

    Cormode, Graham and Paterson, Mike and Sahinalp, S\". Communication complexity of document exchange , year =. Proceedings of the Eleventh Annual ACM-SIAM Symposium on Discrete Algorithms , pages =

  32. [32]

    arXiv preprint arXiv:2602.15323 , year =

    Unforgeable Watermarks for Language Models via Robust Signatures , author =. arXiv preprint arXiv:2602.15323 , year =

  33. [33]

    Advances in neural information processing systems , volume =

    Paraphrasing evades detectors of ai-generated text, but retrieval is an effective defense , author =. Advances in neural information processing systems , volume =

  34. [34]

    Can AI-Generated Text be Reliably Detected?

    Can AI-generated text be reliably detected? , author =. arXiv preprint arXiv:2303.11156 , year =

  35. [35]

    34th USENIX Security Symposium (USENIX Security 25) , pages =

    Provably robust multi-bit watermarking for \ AI-generated \ text , author =. 34th USENIX Security Symposium (USENIX Security 25) , pages =

  36. [36]

    ACM Transactions on Accessible Computing , volume =

    Measuring the accuracy of automatic speech recognition solutions , author =. ACM Transactions on Accessible Computing , volume =. 2024 , publisher =

  37. [37]

    arXiv preprint arXiv:2402.03082 , year =

    Visual text meets low-level vision: A comprehensive survey on visual text processing , author =. arXiv preprint arXiv:2402.03082 , year =

  38. [38]

    Parallel iterative edit models for local sequence transduction , author =. Proceedings of the 2019 conference on empirical methods in natural language processing and the 9th international joint conference on natural language processing (EMNLP-IJCNLP) , pages =

  39. [39]

    IACR Communications in Cryptology , volume =

    Publicly-Detectable Watermarking for Language Models , author =. IACR Communications in Cryptology , volume =

  40. [40]

    2016 IEEE International Symposium on Information Theory (ISIT) , pages =

    Efficiently decodable insertion/deletion codes for high-noise and high-rate regimes , author =. 2016 IEEE International Symposium on Information Theory (ISIT) , pages =. 2016 , organization =

  41. [41]

    IEEE Transactions on Information Theory , volume =

    Deletion codes in the high-noise and high-rate regimes , author =. IEEE Transactions on Information Theory , volume =. 2017 , publisher =

  42. [42]

    IEEE Transactions on Information Theory , volume =

    An improved bound on the fraction of correctable deletions , author =. IEEE Transactions on Information Theory , volume =. 2016 , publisher =

  43. [43]

    2018 IEEE 59th Annual Symposium on Foundations of Computer Science (FOCS) , pages =

    Deterministic document exchange protocols, and almost optimal binary codes for edit errors , author =. 2018 IEEE 59th Annual Symposium on Foundations of Computer Science (FOCS) , pages =. 2018 , organization =

  44. [44]

    2009 , publisher =

    Algebraic function fields and codes , author =. 2009 , publisher =

  45. [45]

    2025 IEEE Symposium on Security and Privacy (SP) , pages =

    Sok: Watermarking for ai-generated content , author =. 2025 IEEE Symposium on Security and Privacy (SP) , pages =. 2025 , organization =

  46. [46]

    Journal of the Royal Statistical Society Series B: Statistical Methodology , pages =

    Robust detection of watermarks for large language models under human edits , author =. Journal of the Royal Statistical Society Series B: Statistical Methodology , pages =. 2025 , publisher =

  47. [47]

    arXiv preprint arXiv:2506.11444 , year =

    Gaussmarker: Robust dual-domain watermark for diffusion models , author =. arXiv preprint arXiv:2506.11444 , year =