
arxiv: 2605.27135 · v1 · pith:EMXY4A4Jnew · submitted 2026-05-26 · 💻 cs.CR · cs.CV

Do Modern Post-Hoc Watermarking Methods Beat Broken-Arrows?

Pith reviewed 2026-06-29 17:27 UTC · model grok-4.3

classification 💻 cs.CR cs.CV
keywords post-hoc watermarking · AI-generated images · robustness · security · false alarm rate · neural networks · classic watermarking · Broken Arrows

The pith

Classic watermarking methods provide superior security compared to modern neural-network post-hoc schemes while maintaining robustness against image transformations.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper compares classic post-hoc watermarking techniques with modern neural-network schemes that target very low false-alarm rates. It evaluates both on robustness to standard image transformations and on security against sophisticated attacks under realistic conditions. The experiments indicate that classic methods such as Broken Arrows are more secure without giving up robustness. The finding matters because identifying AI-generated images depends on reliable watermarking, and resistance to removal and forgery attacks is critical in practice. Readers interested in AI content authenticity will want to know which techniques hold up under real threats.

Core claim

Through direct experimental comparison, classic watermarking outperforms modern techniques in terms of security while maintaining robustness in a realistic scenario.

What carries the argument

A fair comparison protocol applying classic augmentations and recent sophisticated attacks to measure robustness and security metrics across watermarking schemes.
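As a rough illustration of the kind of measurement such a protocol produces, the sketch below computes PSNR between an original and an attacked image and then an attack success rate under a distortion budget. It rests on assumptions not confirmed by the paper: ASR is read as "attack success rate", a removal attack counts as successful when the detector no longer fires and the PSNR stays at or above a chosen threshold, and the toy records are invented for the example. A forgery attack would use the opposite success condition.

```python
import math

def psnr(original, attacked, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equal-length pixel sequences."""
    if len(original) != len(attacked):
        raise ValueError("images must have the same number of pixels")
    mse = sum((a - b) ** 2 for a, b in zip(original, attacked)) / len(original)
    if mse == 0:
        return math.inf  # identical images: no distortion
    return 10.0 * math.log10(peak ** 2 / mse)

def attack_success_rate(records, min_psnr):
    """Fraction of attacked images that evade detection with PSNR >= min_psnr.

    records: list of (psnr_db, still_detected) pairs, one per attacked image.
    The success criterion is an assumption made for this sketch.
    """
    if not records:
        raise ValueError("no records to evaluate")
    wins = sum(1 for p, detected in records if not detected and p >= min_psnr)
    return wins / len(records)

# Hypothetical toy data: (PSNR after attack, watermark still detected?)
records = [(45.0, False), (38.0, False), (25.0, False), (50.0, True)]

# One point of an ASR-versus-PSNR curve per distortion budget.
curve = {t: attack_success_rate(records, t) for t in (20, 30, 40, 50)}
print(curve)
```

Sweeping the PSNR threshold in this way gives one curve per scheme and per attack, which is the shape of comparison the figures on this page appear to report.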

If this is right

  • Security should be prioritized over extremely low false-alarm rates in watermarking design for AI images.
  • Classic methods remain competitive or superior for deployment where attack resistance matters.
  • Modern neural methods require further development to match or exceed classic security levels.
  • Evaluation of watermarking must include sophisticated attacks beyond basic transformations.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • If the attack set is representative, resources might be better spent refining classic methods rather than developing new neural ones.
  • Future work could test these methods against additional real-world threats like model-specific attacks not covered here.
  • Combining elements from both classic and modern approaches might yield improved overall performance.

Load-bearing premise

The specific augmentations and sophisticated attacks selected in the experiments represent the main threats that matter for real-world watermarking deployment.

What would settle it

An experiment showing modern neural watermarking methods resisting a new set of attacks that successfully defeat the classic methods, or vice versa in a different attack suite.

Figures

Figures reproduced from arXiv: 2605.27135 by Enoal Gesny, Eva Giboulot.

Figure 1: Examples of attacks on Videoseal. Recoverable plot text: panels Blind, Oracle, Black-box, and White-box; axes PSNR (dB) versus ASR; legend VideoSeal, TrustMark, Broken Arrows (extracted text truncated).
Figure 2: Comparison of watermarking schemes across sce…
Figure 3: Distribution of the PSNR of the images attacked…
Original abstract

With the rapid proliferation of generative models, such as diffusion models, digital watermarking has emerged as a crucial solution for identifying AI-generated images. Modern post-hoc watermarking schemes use neural networks to achieve an extremely low false-alarm rate while remaining robust to common image transformations. However, there is a lack of comparison between these modern methods and classic ones, particularly in real-world scenarios where robustness and security take precedence over achieving an extremely low false-alarm probability. In this paper, we propose a fair comparison of robustness and security between modern and classic post-hoc watermarking across various types of classic augmentations and recent sophisticated attacks. Our experiments show that, in a realistic scenario, classic watermarking outperforms modern techniques in terms of security while maintaining robustness.
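To make the notion of an "extremely low false-alarm rate" concrete, here is a minimal sketch of the textbook calculation for a multi-bit decoder: if a non-watermarked image yields bits that behave like independent fair coin flips, the chance that at least a given number of them match the expected message is a binomial tail. Whether the paper computes false-alarm probabilities this way is not stated on this page, and the bit lengths and targets below are hypothetical.

```python
from math import comb

def false_alarm_probability(n_bits, min_matches):
    """P(at least min_matches of n_bits match by chance), bits i.i.d. fair coins."""
    if not 0 <= min_matches <= n_bits:
        raise ValueError("min_matches must lie between 0 and n_bits")
    tail = sum(comb(n_bits, k) for k in range(min_matches, n_bits + 1))
    return tail / 2 ** n_bits

def smallest_threshold(n_bits, target_pfa):
    """Smallest number of matching bits whose false-alarm probability is <= target_pfa.

    Returns None when even a perfect match is not rare enough.
    """
    for tau in range(n_bits + 1):
        if false_alarm_probability(n_bits, tau) <= target_pfa:
            return tau
    return None

# Hypothetical settings: a 64-bit payload and a one-in-a-million target.
tau = smallest_threshold(64, 1e-6)
print(tau, false_alarm_probability(64, tau))
```

Under this model a longer payload allows a much lower false-alarm probability at the same fraction of matching bits, which is one reason the false-alarm rate and the security of a scheme are separate questions.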

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper compares modern neural-network-based post-hoc watermarking schemes against classic methods (e.g., Broken Arrows) for detecting AI-generated images. It asserts that, under a realistic threat model consisting of standard image augmentations and recent sophisticated attacks, classic watermarking achieves superior security while preserving comparable robustness.

Significance. If the empirical ranking is shown to be stable under a well-justified threat model, the result would temper enthusiasm for neural post-hoc detectors and indicate that simpler, non-learned schemes may remain preferable when security against removal or forgery is the primary requirement.

major comments (2)
  1. [§4–5] Experimental methodology and results: The manuscript states that it performs “a fair comparison … across various types of classic augmentations and recent sophisticated attacks” but supplies neither an explicit threat model nor an argument that the chosen attack set is representative of deployment-relevant adversaries. No sensitivity analysis or adaptive-attack results are reported to demonstrate that the security ranking is stable when the attacker is allowed to target the neural detector directly.
  2. [Table 2 / Figure 3] Security metrics: The abstract and results claim “outperforms … in terms of security” yet the provided text contains no numerical values for false-positive rates under attack, bit-error rates after removal attempts, or statistical significance tests. Without these quantities the central security claim cannot be verified.
minor comments (2)
  1. [Abstract] The claim of experimental superiority is stated without any quantitative metrics, dataset sizes, or attack descriptions; adding one or two key numbers would make the abstract self-contained.
  2. [§2] Notation: The distinction between “robustness” (survival under benign transformations) and “security” (resistance to adversarial removal) is used throughout but never formally defined; a short paragraph in §2 would eliminate ambiguity.

Simulated Authors' Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback. We address each major comment below and outline the revisions we will make to strengthen the presentation of the threat model and security metrics.

  1. Referee: [§4–5] Experimental methodology and results: The manuscript states that it performs “a fair comparison … across various types of classic augmentations and recent sophisticated attacks” but supplies neither an explicit threat model nor an argument that the chosen attack set is representative of deployment-relevant adversaries. No sensitivity analysis or adaptive-attack results are reported to demonstrate that the security ranking is stable when the attacker is allowed to target the neural detector directly.

    Authors: We agree that a dedicated threat-model subsection would improve clarity and verifiability. In the revised manuscript we will insert a new subsection at the beginning of §4 that (i) formally states the adversary’s goals, knowledge, and capabilities, (ii) justifies the selected augmentations and sophisticated attacks as representative of realistic deployment adversaries, and (iii) reports a sensitivity analysis over the principal attack parameters. We note that our existing attack suite already includes recent non-adaptive sophisticated methods; however, we will also add a short discussion of the computational cost and practical difficulty of fully adaptive attacks against each detector and, if space permits, include a limited set of adaptive-attack results. revision: partial

  2. Referee: [Table 2 / Figure 3] Security metrics: The abstract and results claim “outperforms … in terms of security” yet the provided text contains no numerical values for false-positive rates under attack, bit-error rates after removal attempts, or statistical significance tests. Without these quantities the central security claim cannot be verified.

    Authors: The quantitative results are presented in Table 2 and Figure 3. To make the security claims self-contained in the narrative, we will revise §§5–6 to explicitly quote the key numerical values (false-positive rates under each attack, bit-error rates after removal, and any other security metrics) and will add the results of statistical significance tests (e.g., paired t-tests or Wilcoxon tests with p-values) comparing the classic and neural methods. revision: yes
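The simulated rebuttal names paired t-tests or Wilcoxon tests. As a hedged illustration of what a paired comparison between two schemes looks like, the sketch below swaps in a different technique, a Monte Carlo sign-flip permutation test, because it needs only the standard library. It is not the test the authors commit to, and the per-image scores are hypothetical.

```python
import random

def paired_sign_flip_test(a, b, n_resamples=10000, seed=0):
    """Two-sided Monte Carlo sign-flip permutation test on paired differences.

    a, b: per-image scores for two schemes measured on the same images.
    Under the null hypothesis each paired difference is equally likely to be
    positive or negative, so its sign can be flipped at random.
    """
    if len(a) != len(b) or not a:
        raise ValueError("inputs must be non-empty and of equal length")
    diffs = [x - y for x, y in zip(a, b)]
    observed = abs(sum(diffs))
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    hits = 0
    for _ in range(n_resamples):
        stat = abs(sum(d if rng.random() < 0.5 else -d for d in diffs))
        if stat >= observed:
            hits += 1
    # Add-one correction so the estimated p-value is never exactly zero.
    return (hits + 1) / (n_resamples + 1)

# Hypothetical per-image outcomes (1.0 = attack succeeded) on the same 20 images.
scheme_a = [1.0] * 20
scheme_b = [0.0] * 20
p_value = paired_sign_flip_test(scheme_a, scheme_b)
print(p_value)
```

Pairing matters here because both schemes are attacked on the same images; an unpaired test would discard that structure and typically lose power.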

Circularity Check

0 steps flagged

No circularity: purely empirical comparison with no derivations

The paper is an experimental study that performs direct comparisons of robustness and security metrics between classic and modern watermarking methods under chosen augmentations and attacks. No equations, parameter fittings, uniqueness theorems, or ansatzes are presented; the central claim follows from tabulated experimental outcomes rather than any derivation chain. The choice of attack suite is an experimental design decision (subject to external validity critique) but does not constitute a self-definitional, fitted-input, or self-citation reduction. The work is therefore self-contained against external benchmarks and receives the default non-circularity finding.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Empirical benchmarking paper; no mathematical derivations, fitted parameters, or postulated entities are described in the abstract.

pith-pipeline@v0.9.1-grok · 5651 in / 924 out tokens · 19481 ms · 2026-06-29T17:27:56.204432+00:00 · methodology

