pith. machine review for the scientific record.

arxiv: 2605.09646 · v1 · submitted 2026-05-10 · 💻 cs.CR

Recognition: no theorem link

"Training robust watermarking model may hurt authentication!" Exploring and Mitigating the Identity Leakage in Robust Watermarking

Authors on Pith · no claims yet

Pith reviewed 2026-05-12 03:27 UTC · model grok-4.3

classification 💻 cs.CR
keywords: watermarking · identity leakage · robust watermarking · randomized smoothing · mutual information · image ownership · copyright protection · adversarial attacks

The pith

Making watermarking robust against image edits can heighten identity leakage; the W-IR framework mitigates this with a residual information loss while keeping certified robustness via randomized smoothing.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper establishes that both empirical and certified robust watermarking methods heighten vulnerability to identity leakage attacks, such as forging watermarked images to impersonate owners. It introduces the W-IR framework, which pairs randomized smoothing for certified robustness against pixel-level and coordinate-level perturbations with a residual information loss that minimizes the mutual information between residuals and watermarked images. A sympathetic reader would care because post-processing watermarks are a key tool for copyright protection amid generative AI, yet if robustness undermines authentication security, the protection becomes unreliable. The reported experiments indicate the approach maintains high certified accuracy for authenticity while lowering leakage.

Core claim

Robust watermarking increases susceptibility to identity leakage attacks such as forgery. The W-IR framework achieves identity protection and robustness simultaneously: randomized smoothing provides certified guarantees across the pixel-level and coordinate-level transformation spaces, while a residual information loss minimizes the mutual information between the residual and the watermarked image.

What carries the argument

The W-IR framework, which applies randomized smoothing to certify robustness in pixel-level and coordinate-level spaces and uses residual information loss to minimize mutual information for leakage reduction.
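The smoothing half of that machinery can be sketched in a few lines. This is a toy stand-in, not the paper's implementation: `decode` below is a hypothetical one-bit extractor, and the radius formula is the standard Cohen-style l2 certificate for any smoothed decision.

```python
import math
import random
from statistics import NormalDist

def certified_radius(p_lower: float, sigma: float) -> float:
    """Standard l2 certificate for a smoothed classifier: if the majority
    decision holds with probability >= p_lower under N(0, sigma^2 I) noise,
    it is unchanged for any perturbation with
    ||delta||_2 < sigma * Phi^{-1}(p_lower)."""
    if p_lower <= 0.5:
        return 0.0  # no majority vote, no certificate
    return sigma * NormalDist().inv_cdf(p_lower)

def hoeffding_lower_bound(p_hat: float, n: int, alpha: float = 0.001) -> float:
    # Conservative lower confidence bound on the true vote probability.
    return max(0.0, p_hat - math.sqrt(math.log(1 / alpha) / (2 * n)))

def majority_vote(decode, image, sigma, n=1000, seed=0):
    # Fraction of Gaussian-noised copies on which the extractor agrees.
    rng = random.Random(seed)
    votes = sum(decode([px + rng.gauss(0.0, sigma) for px in image])
                for _ in range(n))
    return max(votes, n - votes) / n

# Hypothetical one-bit extractor: thresholds the mean pixel value.
decode = lambda img: 1 if sum(img) / len(img) > 0.5 else 0
image = [0.8] * 64
p_hat = majority_vote(decode, image, sigma=0.1)
radius = certified_radius(hoeffding_lower_bound(p_hat, 1000), sigma=0.1)
```

The point of the sketch is the shape of the guarantee: the certificate depends only on the smoothed vote probability at inference time, not on how the extractor was trained, which is what lets the paper bolt a leakage-reduction loss onto the same objective.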

If this is right

  • Watermarked images retain high certified verification accuracy after perturbations in pixel values or coordinates.
  • Mutual information minimization makes forging watermarked images for identity theft harder.
  • The framework provides a practical way to train robust models without the usual increase in leakage.
  • Experiments show the method works across different attack strengths while preserving authentication.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Watermark designers should incorporate leakage testing as a required step alongside robustness certification.
  • The residual loss idea might apply to other embedding tasks where preventing inference of hidden data matters.
  • Certified robustness claims would need re-validation if the smoothing is combined with generative AI pipelines.
  • Similar identity protection layers could be tested on video or audio watermarking to check for cross-domain leakage.

Load-bearing premise

The residual information loss sufficiently lowers mutual information to block identity leakage without creating new attack surfaces or lowering certified robustness.
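What that premise demands of the loss can be pictured with a toy plug-in MI estimate. This histogram sketch is not the paper's variational surrogate, and the variable names (`leaky_residual`, `clean_residual`) are illustrative: when the residual tracks the watermarked image, their mutual information is large, and driving it toward zero is what the residual information loss targets.

```python
import numpy as np

def mutual_information(x, y, bins=16):
    """Plug-in MI estimate (in nats) from a 2-D histogram of paired samples."""
    joint, _, _ = np.histogram2d(x, y, bins=bins)
    pxy = joint / joint.sum()
    px = pxy.sum(axis=1, keepdims=True)   # marginal of x (rows)
    py = pxy.sum(axis=0, keepdims=True)   # marginal of y (columns)
    nz = pxy > 0
    return float((pxy[nz] * np.log(pxy[nz] / (px @ py)[nz])).sum())

rng = np.random.default_rng(0)
w = rng.normal(size=50_000)                                # watermarked-image stand-in
leaky_residual = 0.9 * w + 0.1 * rng.normal(size=w.size)   # residual that tracks w
clean_residual = rng.normal(size=w.size)                   # residual independent of w

mi_leaky = mutual_information(leaky_residual, w)   # high: identity leaks
mi_clean = mutual_information(clean_residual, w)   # near zero: the training target
```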

What would settle it

An experiment in which an attacker forges a convincing watermarked image from the W-IR output by recovering leaked identity details, or where certified accuracy drops under the tested perturbations.

Figures

Figures reproduced from arXiv: 2605.09646 by Kui Ren, Qingyu Liu, Xinyu Zhang, Yuan Hong, Zhongjie Ba, Ziping Dong.

Figure 1. Our main contributions are: (1) the discovery …

Figure 2. Visualization for the three-facet performance (A – …

Figure 1. To enhance watermark robustness, researchers have developed various empirical defenses, including the integration of image augmentation [4], [22] and adversarial training [3] into the training phases of encoder and decoder networks. However, they are broken by adaptive or stronger attacks [17], [16]. Certified defenses [23] end the cat-and-mouse game between attacks and defenses primarily for classifica…

Figure 4. Authentication phase of image watermarking.

Figure 6. Identity forgery attacks. … can publish w on the Internet. When authentication for copyright or ownership is required, the user sends both the watermarked image w and the secret watermark t to the third-party provider. The provider subsequently returns a result indicating whether the secret matches the watermarked image, with responses categorized as "yes"/"no". Note that users typically maintain consisten…

Figure 8. Identity extraction attack. … coefficient (∈ [−1, 1]) [43] between the secret watermark distance and the residual image distance series may reach as high as 0.95 on StegaStamp (CelebA), indicating a strong positive correlation between these two sequences.

Figure 9. Rule-based perturbations. Left to right: original …

Figure 10. The cluster results for residual images on HiD…

Figure 12. Information content of feature representations.

Figure 13. Identity leakage mitigation in watermarking.

Figure 14. Impact of m and ζ on COCO (StegaStamp).

Figure 16. Certified accuracy (w/ L_RIL) at different radii against additive Gaussian noise. Datasets: left, COCO; right, CelebA. Models: top, StegaStamp; bottom, HiDDeN.

Figure 19. Impact of secret watermark pairs' distances under …
Original abstract

The rapid advancement of generative AI has underscored the critical need for identifying image ownership and protecting copyrights. This makes post-processing image watermarking an essential tool -- it involves embedding a specific watermark message into an image, with successful verification if a similar message can be decoded from the watermarked image. However, this method is susceptible to both adversarial attacks that manipulate the watermarked image to yield an unverified message upon decoding, and the proposed identity leakage-related attacks (e.g., forging watermarked images). The threat of identity leakage is particularly exacerbated in both empirical and certified robust watermarking methods. To defend against the aforementioned attacks, we propose W-IR, the first image watermarking framework that simultaneously incorporates identity protection and robustness. To enhance model robustness, we introduce a novel randomized smoothing technique as part of a robust watermarking, that offers certified robustness against perturbations across two distinct transformation spaces: pixel-level and coordinate-level. Moreover, to further mitigate identity leakage, we propose a new strategy based on residual information loss, aimed at minimizing the mutual information between the residual and watermarked images. Our work strikes a superior balance between robustness and identity leakage mitigation. Extensive experiments demonstrate that our W-IR framework achieves high certified accuracy for authenticity while effectively reducing identity leakage. The code is available at https://github.com/holdrain/W-I-R.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper proposes W-IR, the first image watermarking framework that jointly addresses robustness and identity leakage. It augments standard watermark embedding with randomized smoothing to obtain certified robustness against perturbations in both pixel-level and coordinate-level transformation spaces, and adds a residual information loss term to the training objective that minimizes mutual information between the residual and the watermarked image, thereby mitigating forging attacks that exploit identity leakage. The central claim is that this combination achieves a superior robustness-leakage trade-off, supported by extensive experiments that report high certified accuracy while reducing leakage.

Significance. If the certified radii remain valid under the joint objective and the empirical leakage reduction is reproducible, the work would provide a concrete, verifiable advance in post-processing watermarking for generative-AI content. The dual-space certification and the public code release are concrete strengths that would aid follow-up research.

major comments (2)
  1. [§3] §3 (Randomized Smoothing subsection): the certification analysis assumes a fixed extractor, yet the residual information loss term is added to the same training objective; no analytic bound is derived that shows the surrogate MI minimization leaves the certified radii in pixel and coordinate spaces unchanged or that it reduces forgery success probability below the attack threshold.
  2. [§5] §5 (Experimental results): the abstract and results claim 'high certified accuracy' and 'effective reduction' of identity leakage, but the text provides no concrete attack models, evaluation metrics (e.g., forgery success rate, certified radius values), or statistical significance tests that would allow verification of the 'superior balance' claim against baselines.
minor comments (2)
  1. [Abstract] Abstract: the phrase 'the first image watermarking framework' should be qualified with a brief comparison to the most closely related prior certified or leakage-aware methods.
  2. [§3.2] Notation: the residual information loss is described only in prose; an explicit equation relating the surrogate loss to I(residual; watermarked image) would improve clarity.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the detailed and constructive feedback. We address each major comment below, clarifying the technical points and outlining planned revisions to strengthen the manuscript.

Point-by-point responses
  1. Referee: [§3] §3 (Randomized Smoothing subsection): the certification analysis assumes a fixed extractor, yet the residual information loss term is added to the same training objective; no analytic bound is derived that shows the surrogate MI minimization leaves the certified radii in pixel and coordinate spaces unchanged or that it reduces forgery success probability below the attack threshold.

    Authors: The randomized smoothing certification theorems (for both pixel and coordinate spaces) are model-agnostic and apply to any fixed extractor at inference time once the model parameters are determined; they depend only on the smoothed prediction behavior under the chosen noise distribution and do not require assumptions on the training procedure. The residual information loss term acts as a regularizer during training to reduce mutual information between the residual and the watermarked image, but it does not alter the post-training certification guarantees. We agree that an explicit analytic bound linking the MI surrogate to unchanged radii or to a strict reduction in forgery success probability is not derived in the current draft. In the revision we will add a dedicated paragraph in §3 explaining why the certification remains valid independently of the training objective, together with empirical verification that certified radii are preserved under the joint objective and that forgery success rates fall below the attack threshold used in the experiments. revision: partial

  2. Referee: [§5] §5 (Experimental results): the abstract and results claim 'high certified accuracy' and 'effective reduction' of identity leakage, but the text provides no concrete attack models, evaluation metrics (e.g., forgery success rate, certified radius values), or statistical significance tests that would allow verification of the 'superior balance' claim against baselines.

    Authors: The full experimental section already defines the forging attack model (adversary reconstructs a watermarked image by exploiting residual identity leakage), reports forgery success rate as the primary leakage metric, lists concrete certified radii (e.g., r=0.5, 1.0 in pixel space and corresponding coordinate radii), and compares certified accuracy and leakage against multiple baselines. However, we acknowledge that the presentation could be clearer and that statistical significance tests (e.g., paired t-tests across runs) are not explicitly included. In the revised manuscript we will expand §5 with a dedicated subsection that (i) restates the attack model formally, (ii) tabulates exact certified radius values and forgery success rates for all methods, and (iii) adds statistical significance results to support the superiority claim. revision: yes

Circularity Check

0 steps flagged

No circularity: claims rest on proposed techniques and experiments, not on self-referential fits or definitions.

full rationale

The paper introduces W-IR by adding a randomized smoothing layer for certified robustness in pixel and coordinate spaces plus a residual information loss term to approximate mutual-information minimization. Neither component is defined in terms of the other or of the final certified-accuracy metric; the loss is a standard surrogate objective whose effect is measured empirically rather than asserted by construction. No load-bearing step reduces a prediction to a fitted parameter, invokes a self-citation uniqueness theorem, or renames an input as an output. The central claim of a superior robustness-leakage balance is therefore an empirical observation, not a tautology.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The framework rests on standard assumptions about randomized smoothing providing certification bounds and mutual information minimization reducing leakage; no explicit free parameters or invented entities are named in the abstract.

axioms (2)
  • domain assumption Randomized smoothing yields certified robustness bounds against perturbations in pixel-level and coordinate-level spaces.
    Invoked to claim certified accuracy for authenticity verification.
  • domain assumption Minimizing mutual information between residual and watermarked images reduces identity leakage risk.
    Core of the proposed mitigation strategy.

pith-pipeline@v0.9.0 · 5558 in / 1249 out tokens · 49374 ms · 2026-05-12T03:27:21.959395+00:00 · methodology

discussion (0)


Reference graph

Works this paper leans on

71 extracted references · 71 canonical work pages · 2 internal anchors

  1. [1] "DALL·E 3 — OpenAI," https://openai.com/index/dall-e-3/, 2024.
  2. [2] "Stable Diffusion 3 — Stability AI," https://stability.ai/news/stable-diffusion-3, 2024.
  3. [3] J. Zhu, R. Kaplan, J. Johnson, and L. Fei-Fei, "HiDDeN: Hiding data with deep networks," in Proceedings of the European Conference on Computer Vision (ECCV), 2018, pp. 657–672.
  4. [4] M. Tancik, B. Mildenhall, and R. Ng, "StegaStamp: Invisible hyperlinks in physical photographs," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020, pp. 2117–2126.
  5. [5] C. Zhang, P. Benz, A. Karjauv, G. Sun, and I. S. Kweon, "UDH: Universal deep hiding for steganography, watermarking, and light field messaging," Advances in Neural Information Processing Systems, vol. 33, pp. 10223–10234, 2020.
  6. [6] "Meta, Google, and OpenAI promise the White House they'll develop AI responsibly — The Verge," https://www.theverge.com/2023/7/21/23802274/artificial-intelligence-meta-google-openai-white-house-security-safety, 2024.
  7. [7] A. Ray and S. Roy, "Recent trends in image watermarking techniques for copyright protection: a survey," International Journal of Multimedia Information Retrieval, vol. 9, no. 4, pp. 249–270, 2020.
  8. [8] R. Rombach, A. Blattmann, D. Lorenz, P. Esser, and B. Ommer, "High-resolution image synthesis with latent diffusion models," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022, pp. 10684–10695.
  9. [9] S. Dathathri, A. See, S. Ghaisas, P.-S. Huang, R. McAdam, J. Welbl, V. Bachani, A. Kaskasoli, R. Stanforth, T. Matejovicova et al., "Scalable watermarking for identifying large language model outputs," Nature, vol. 634, no. 8035, pp. 818–823, 2024.
  10. [10] "Watermark detection for Amazon Titan Image Generator now available in Amazon Bedrock." [Online]. Available: https://aws.amazon.com/about-aws/whats-new/2024/04/watermark-detection-amazon-titan-image-generator-bedrock/?nc1=h ls

  11. [11] Y. Mehdi, "Announcing Microsoft Copilot, your everyday AI companion — The Official Microsoft Blog," Sep. 2023. [Online]. Available: https://blogs.microsoft.com/blog/2023/09/21/announcing-microsoft-copilot-your-everyday-ai-companion/
  12. [12] R. B. Wolfgang and E. J. Delp, "A watermark for digital images," in Proceedings of the 3rd IEEE International Conference on Image Processing, vol. 3. IEEE, 1996, pp. 219–222.
  13. [13] X. Zhao, K. Zhang, Z. Su, S. Vasan, I. Grishchenko, C. Kruegel, G. Vigna, Y.-X. Wang, and L. Li, "Invisible image watermarks are provably removable using generative AI," Advances in Neural Information Processing Systems, vol. 37, pp. 8643–8672, 2024.
  14. [14] I. J. Goodfellow, J. Shlens, and C. Szegedy, "Explaining and harnessing adversarial examples," in International Conference on Learning Representations, 2015.
  15. [15] L. Engstrom, B. Tran, D. Tsipras, L. Schmidt, and A. Madry, "Exploring the landscape of spatial robustness," in International Conference on Machine Learning. PMLR, 2019, pp. 1802–1811.
  16. [16] M. Saberi, V. S. Sadasivan, K. Rezaei, A. Kumar, A. Chegini, W. Wang, and S. Feizi, "Robustness of AI-image detectors: Fundamental limits and practical attacks," arXiv preprint arXiv:2310.00076, 2023.
  17. [17] Z. Jiang, J. Zhang, and N. Z. Gong, "Evading watermark based detection of AI-generated content," in Proceedings of the 2023 ACM SIGSAC Conference on Computer and Communications Security, 2023, pp. 1168–1181.
  18. [18] W. Wan, J. Wang, Y. Zhang, J. Li, H. Yu, and J. Sun, "A comprehensive survey on robust image watermarking," Neurocomputing, vol. 488, pp. 226–247, 2022.
  19. [19] Z. Jiang, M. Guo, Y. Hu, and N. Z. Gong, "Watermark-based detection and attribution of AI-generated content," arXiv preprint arXiv:2404.04254, 2024.
  20. [20] P. V. Sanivarapu, K. N. Rajesh, K. M. Hosny, and M. M. Fouda, "Digital watermarking system for copyright protection and authentication of images using cryptographic techniques," Applied Sciences, vol. 12, no. 17, p. 8724, 2022.

  21. [21] M. Kutter, S. V. Voloshynovskiy, and A. Herrigel, "Watermark copy attack," in Security and Watermarking of Multimedia Contents II. SPIE, 2000, pp. 371–380.
  22. [22]
  23. [23] Z. Jia, H. Fang, and W. Zhang, "MBRS: Enhancing robustness of DNN-based watermarking by mini-batch of real and simulated JPEG compression," in Proceedings of the 29th ACM International Conference on Multimedia, 2021, pp. 41–49.

  24. [24] L. Li, T. Xie, and B. Li, "SoK: Certified robustness for deep neural networks," in 2023 IEEE Symposium on Security and Privacy (SP). IEEE, 2023, pp. 1289–1310.
  25. [25] J. Cohen, E. Rosenfeld, and Z. Kolter, "Certified adversarial robustness via randomized smoothing," in International Conference on Machine Learning. PMLR, 2019, pp. 1310–1320.
  26. [26] D. Zhang, M. Ye, C. Gong, Z. Zhu, and Q. Liu, "Black-box certification with randomized smoothing: A functional optimization based framework," Advances in Neural Information Processing Systems, vol. 33, pp. 2316–2326, 2020.
  27. [27] B. Poole, S. Ozair, A. Van Den Oord, A. Alemi, and G. Tucker, "On variational bounds of mutual information," in International Conference on Machine Learning. PMLR, 2019, pp. 5171–5180.
  28. [28] M. I. Belghazi, A. Baratin, S. Rajeshwar, S. Ozair, Y. Bengio, A. Courville, and D. Hjelm, "Mutual information neural estimation," in International Conference on Machine Learning. PMLR, 2018, pp. 531–540.
  29. [29] X. Tian, Z. Zhang, S. Lin, Y. Qu, Y. Xie, and L. Ma, "Farewell to mutual information: Variational distillation for cross-modal person re-identification," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2021, pp. 1522–1531.
  30. [30] J. Johnson, A. Alahi, and L. Fei-Fei, "Perceptual losses for real-time style transfer and super-resolution," in Computer Vision – ECCV 2016, Part II. Springer, 2016, pp. 694–711.
  31. [31] R. Zhang, P. Isola, A. A. Efros, E. Shechtman, and O. Wang, "The unreasonable effectiveness of deep features as a perceptual metric," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2018, pp. 586–595.
  32. [32] N. Lukas and F. Kerschbaum, "PTW: Pivotal tuning watermarking for pre-trained image generators," in 32nd USENIX Security Symposium (USENIX Security 23), 2023, pp. 2241–2258.
  33. [33] P. Fernandez, G. Couairon, H. Jégou, M. Douze, and T. Furon, "The stable signature: Rooting watermarks in latent diffusion models," in Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023, pp. 22466–22477.

  34. [34] B. An, M. Ding, T. Rabbani, A. Agrawal, Y. Xu, C. Deng, S. Zhu, A. Mohamed, Y. Wen, T. Goldstein et al., "WAVES: Benchmarking the robustness of image watermarks," arXiv preprint arXiv:2401.08573, 2024.
  35. [35] R. Ma, M. Guo, Y. Hou, F. Yang, Y. Li, H. Jia, and X. Xie, "Towards blind watermarking: Combining invertible and non-invertible mechanisms," in Proceedings of the 30th ACM International Conference on Multimedia, 2022, pp. 1532–1542.
  36. [36] M. Lecuyer, V. Atlidakis, R. Geambasu, D. Hsu, and S. Jana, "Certified robustness to adversarial examples with differential privacy," in 2019 IEEE Symposium on Security and Privacy (SP). IEEE, 2019, pp. 656–672.
  37. [37] J. Jia, B. Wang, X. Cao, and N. Z. Gong, "Certified robustness of community detection against adversarial structural perturbation via randomized smoothing," in Proceedings of The Web Conference 2020, 2020, pp. 2718–2724.
  38. [38] B. Wang, J. Jia, X. Cao, and N. Z. Gong, "Certified robustness of graph neural networks against adversarial structural perturbation," in Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data Mining, 2021, pp. 1645–1653.
  39. [39] X. Zhang, H. Hong, Y. Hong, P. Huang, B. Wang, Z. Ba, and K. Ren, "Text-CRS: A generalized certified robustness framework against textual adversarial attacks," in 2024 IEEE Symposium on Security and Privacy (SP). IEEE Computer Society, 2023, pp. 53–53.
  40. [40] W. Ding, Y. Ming, Z. Cao, and C.-T. Lin, "A generalized deep neural network approach for digital watermarking analysis," IEEE Transactions on Emerging Topics in Computational Intelligence, vol. 6, no. 3, pp. 613–627, 2021.
  41. [41] L. Van der Maaten and G. Hinton, "Visualizing data using t-SNE," Journal of Machine Learning Research, vol. 9, no. 11, 2008.
  42. [42] K. Krishna and M. N. Murty, "Genetic K-means algorithm," IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics), vol. 29, no. 3, pp. 433–439, 1999.
  43. [43] G. Doërr and J.-L. Dugelay, "Collusion issue in video watermarking," in Security, Steganography, and Watermarking of Multimedia Contents VII, vol. 5681. SPIE, 2005, pp. 685–696.
  44. [44] J. Benesty, J. Chen, Y. Huang, and I. Cohen, "Pearson correlation coefficient," Noise Reduction in Speech Processing, pp. 1–4, 2009.

  45. [45] R. Ma, M. Guo, Y. Hou, F. Yang, Y. Li, H. Jia, and X. Xie, "Towards blind watermarking: Combining invertible and non-invertible mechanisms," in Proceedings of the 30th ACM International Conference on Multimedia. Lisboa, Portugal: ACM, Oct. 2022, pp. 1532–1542.
  46. [46] J. C. Pérez, M. Alfarra, S. Giancola, B. Ghanem et al., "3DeformRS: Certifying spatial deformations on point clouds," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022, pp. 15169–15179.
  47. [47] M. Alfarra, A. Bibi, N. Khan, P. H. Torr, and B. Ghanem, "DeformRS: Certifying input deformations with randomized smoothing," in Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, no. 6, 2022, pp. 6001–6009.
  48. [48] L. Li, M. Weber, X. Xu, L. Rimanic, B. Kailkhura, T. Xie, C. Zhang, and B. Li, "TSS: Transformation-specific smoothing for robustness certification," in Proceedings of the 2021 ACM SIGSAC Conference on Computer and Communications Security, 2021, pp. 535–557.
  49. [49] X. Nguyen, M. J. Wainwright, and M. I. Jordan, "Estimating divergence functionals and the likelihood ratio by convex risk minimization," IEEE Transactions on Information Theory, vol. 56, no. 11, pp. 5847–5861, 2010.
  50. [50] T. Karras, T. Aila, S. Laine, and J. Lehtinen, "Progressive growing of GANs for improved quality, stability, and variation," in International Conference on Learning Representations, 2018. [Online]. Available: https://openreview.net/forum?id=Hk99zCeAb
  51. [51] T.-Y. Lin, M. Maire, S. Belongie, J. Hays, P. Perona, D. Ramanan, P. Dollár, and C. L. Zitnick, "Microsoft COCO: Common objects in context," in Computer Vision – ECCV 2014, Part V. Springer, 2014, pp. 740–755.
  52. [52] P. J. Rousseeuw, "Silhouettes: a graphical aid to the interpretation and validation of cluster analysis," Journal of Computational and Applied Mathematics, vol. 20, pp. 53–65, 1987.
  53. [53] L. Lei, K. Gai, J. Yu, and L. Zhu, "DiffuseTrace: A transparent and flexible watermarking scheme for latent diffusion model," arXiv preprint arXiv:2405.02696, 2024.

  54. [54]

    Anti- screenshot watermarking algorithm for archival image based on deep learning model,

    W. Gu, C.-C. Chang, Y. Bai, Y. Fan, L. Tao, and L. Li, “Anti- screenshot watermarking algorithm for archival image based on deep learning model,”Entropy, vol. 25, no. 2, p. 288, 2023

  55. [55]

    Lsb based digital image watermarking for gray scale image,

    D. Chopra, P . Gupta, G. Sanjay, and A. Gupta, “Lsb based digital image watermarking for gray scale image,”IOSR Journal of Com- puter Engineering, vol. 6, no. 1, pp. 36–41, 2012

  56. [56]

    Combined dwt-dct digital image watermarking,

    A. Al-Haj, “Combined dwt-dct digital image watermarking,”Jour- nal of computer science, vol. 3, no. 9, pp. 740–746, 2007

  57. [57]

    Learning- based image steganography and watermarking: A survey,

    K. Hu, M. Wang, X. Ma, J. Chen, X. Wang, and X. Wang, “Learning- based image steganography and watermarking: A survey,”Expert Systems with Applications, vol. 249, p. 123715, Sep. 2024

  58. [58]

    H. Fang, Y. Qiu, K. Chen, J. Zhang, W. Zhang, and E.-C. Chang, “Flow-based robust watermarking with invertible noise layer for black-box distortions,” in Proceedings of the AAAI Conference on Artificial Intelligence, vol. 37, no. 4, 2023, pp. 5054–5061

  59. [59]

    N. Yu, V. Skripniuk, S. Abdelnabi, and M. Fritz, “Artificial Fingerprinting for Generative Models: Rooting Deepfake Attribution in Training Data,” in 2021 IEEE/CVF International Conference on Computer Vision (ICCV). Montreal, QC, Canada: IEEE, Oct. 2021, pp. 14428–14437

  60. [60]

    N. Yu, V. Skripniuk, D. Chen, L. S. Davis, and M. Fritz, “Responsible disclosure of generative models using scalable fingerprinting,” in International Conference on Learning Representations, 2021

  61. [61]

    Y. Wen, J. Kirchenbauer, J. Geiping, and T. Goldstein, “Tree-ring watermarks: Fingerprints for diffusion images that are invisible and robust,” arXiv preprint arXiv:2305.20030, 2023

  62. [62]

    Y. Zhao, T. Pang, C. Du, X. Yang, N.-M. Cheung, and M. Lin, “A recipe for watermarking diffusion models,” arXiv preprint arXiv:2303.10137, 2023

  63. [63]

    A. Madry, A. Makelov, L. Schmidt, D. Tsipras, and A. Vladu, “Towards deep learning models resistant to adversarial attacks,” arXiv preprint arXiv:1706.06083, 2017

  64. [64]

    J. Chen, M. I. Jordan, and M. J. Wainwright, “HopSkipJumpAttack: A query-efficient decision-based attack,” in 2020 IEEE Symposium on Security and Privacy (SP). IEEE, 2020, pp. 1277–1294

  65. [65]

    Y. Hu, Z. Jiang, M. Guo, and N. Gong, “A Transfer Attack to Image Watermarks,” Mar. 2024

  66. [66]

    X. Zhao, K. Zhang, Z. Su, S. Vasan, I. Grishchenko, C. Kruegel, G. Vigna, Y.-X. Wang, and L. Li, “Invisible Image Watermarks Are Provably Removable Using Generative AI,” Aug. 2023

  67. [67]

    J. Ballé, D. Minnen, S. Singh, S. J. Hwang, and N. Johnston, “Variational image compression with a scale hyperprior,” arXiv preprint arXiv:1802.01436, 2018

  68. [68]

    Z. Cheng, H. Sun, M. Takeuchi, and J. Katto, “Learned image compression with discretized gaussian mixture likelihoods and attention modules,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020, pp. 7939–7948

  69. [69]

    P. Esser, R. Rombach, and B. Ommer, “Taming transformers for high-resolution image synthesis,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2021, pp. 12873–12883

  70. [70]

    J. Song, C. Meng, and S. Ermon, “Denoising diffusion implicit models,” arXiv preprint arXiv:2010.02502, 2020

  71. [71]

    Z. Yang, K. Zeng, K. Chen, H. Fang, W. Zhang, and N. Yu, “Gaussian Shading: Provable Performance-Lossless Image Watermarking for Diffusion Models,” May 2024, accepted by CVPR 2024

APPENDIX A
PROOF FOR RESIDUAL INFORMATION LOSS

Theorem 3. To deal with the estimation of mutual information in Eq. 13, we introduce the following ...
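The appendix fragment above invokes a variational estimate of mutual information, for which reference [49] (Nguyen, Wainwright, and Jordan) supplies the standard convex-dual lower bound. As an illustrative sketch only — this is the generic NWJ bound I(X;Y) ≥ E_{p(x,y)}[f(x,y)] − e⁻¹ E_{p(x)p(y)}[e^{f(x,y)}], not the paper's Theorem 3, and the function names and toy critic are hypothetical:

```python
import numpy as np

def nwj_lower_bound(critic, joint_xy, marg_xy):
    """Nguyen-Wainwright-Jordan variational lower bound on I(X;Y):
    I(X;Y) >= E_p(x,y)[f(x,y)] - e^{-1} E_p(x)p(y)[exp(f(x,y))].
    `joint_xy` holds pairs drawn from the joint distribution; `marg_xy`
    holds pairs with y shuffled (an approximation of the product of
    marginals). `critic` maps a pair to a scalar score."""
    joint_term = np.mean([critic(x, y) for x, y in joint_xy])
    marg_term = np.mean([np.exp(critic(x, y)) for x, y in marg_xy]) / np.e
    return joint_term - marg_term

# Toy check: a constant critic f = 1 makes both terms equal 1, so the
# bound evaluates to ~0 (up to float rounding) regardless of the data;
# a trainable critic would be maximized to tighten the bound.
rng = np.random.default_rng(0)
x = rng.normal(size=1000)
y = x + 0.5 * rng.normal(size=1000)   # correlated pair from the joint
y_shuf = rng.permutation(y)           # shuffle y to break the dependence
const = lambda a, b: 1.0
print(nwj_lower_bound(const, list(zip(x, y)), list(zip(x, y_shuf))))
```

In practice the critic is a small network whose parameters are optimized to maximize this bound, which is how neural mutual-information estimators built on [49] are typically trained.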