pith. machine review for the scientific record.

arxiv: 2605.09646 · v1 · submitted 2026-05-10 · 💻 cs.CR

Recognition: no theorem link

"Training robust watermarking model may hurt authentication!" Exploring and Mitigating the Identity Leakage in Robust Watermarking

Authors on Pith · no claims yet

Pith reviewed 2026-05-12 03:27 UTC · model grok-4.3

classification 💻 cs.CR
keywords: watermarking · identity leakage · robust watermarking · randomized smoothing · mutual information · image ownership · copyright protection · adversarial attacks

The pith

Making watermarking robust against image edits can heighten identity leakage; the W-IR framework mitigates this with a residual information loss while keeping certified robustness via randomized smoothing.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper establishes that both empirical and certified robust watermarking methods heighten vulnerability to identity leakage attacks, such as forging watermarked images to impersonate owners. It introduces the W-IR framework, which pairs randomized smoothing for certified robustness against pixel-level and coordinate-level perturbations with a residual information loss that minimizes the mutual information between residuals and watermarked images. A sympathetic reader would care because post-processing watermarks are a key tool for copyright protection amid generative AI, yet if robustness undermines authentication security, the protection becomes unreliable. The reported experiments indicate the approach maintains high certified accuracy for authenticity while lowering leakage.

Core claim

Robust watermarking increases susceptibility to identity leakage attacks such as forgery. The W-IR framework achieves identity protection and robustness simultaneously: randomized smoothing provides certified guarantees across the pixel-level and coordinate-level transformation spaces, while a residual information loss minimizes the mutual information between the residual and the watermarked image.

What carries the argument

The W-IR framework, which applies randomized smoothing to certify robustness in pixel-level and coordinate-level spaces and uses residual information loss to minimize mutual information for leakage reduction.
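The smoothing half of that machinery can be sketched in a few lines. This is a toy stand-in, not the paper's implementation: `decode` below is a hypothetical one-bit extractor, and the radius formula is the standard Cohen-style l2 certificate for any smoothed decision.

```python
import math
import random
from statistics import NormalDist

def certified_radius(p_lower: float, sigma: float) -> float:
    """Standard l2 certificate for a smoothed classifier: if the majority
    decision holds with probability >= p_lower under N(0, sigma^2 I) noise,
    it is unchanged for any perturbation with
    ||delta||_2 < sigma * Phi^{-1}(p_lower)."""
    if p_lower <= 0.5:
        return 0.0  # no majority vote, no certificate
    return sigma * NormalDist().inv_cdf(p_lower)

def hoeffding_lower_bound(p_hat: float, n: int, alpha: float = 0.001) -> float:
    # Conservative lower confidence bound on the true vote probability.
    return max(0.0, p_hat - math.sqrt(math.log(1 / alpha) / (2 * n)))

def majority_vote(decode, image, sigma, n=1000, seed=0):
    # Fraction of Gaussian-noised copies on which the extractor agrees.
    rng = random.Random(seed)
    votes = sum(decode([px + rng.gauss(0.0, sigma) for px in image])
                for _ in range(n))
    return max(votes, n - votes) / n

# Hypothetical one-bit extractor: thresholds the mean pixel value.
decode = lambda img: 1 if sum(img) / len(img) > 0.5 else 0
image = [0.8] * 64
p_hat = majority_vote(decode, image, sigma=0.1)
radius = certified_radius(hoeffding_lower_bound(p_hat, 1000), sigma=0.1)
```

The point of the sketch is the shape of the guarantee: the certificate depends only on the smoothed vote probability at inference time, not on how the extractor was trained, which is what lets the paper bolt a leakage-reduction loss onto the same objective.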

If this is right

  • Watermarked images retain high certified verification accuracy after perturbations in pixel values or coordinates.
  • Mutual information minimization makes forging watermarked images for identity theft harder.
  • The framework provides a practical way to train robust models without the usual increase in leakage.
  • Experiments show the method works across different attack strengths while preserving authentication.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Watermark designers should incorporate leakage testing as a required step alongside robustness certification.
  • The residual loss idea might apply to other embedding tasks where preventing inference of hidden data matters.
  • Certified robustness claims would need re-validation if the smoothing is combined with generative AI pipelines.
  • Similar identity protection layers could be tested on video or audio watermarking to check for cross-domain leakage.

Load-bearing premise

The residual information loss sufficiently lowers mutual information to block identity leakage without creating new attack surfaces or lowering certified robustness.
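What that premise demands of the loss can be pictured with a toy plug-in MI estimate. This histogram sketch is not the paper's variational surrogate, and the variable names (`leaky_residual`, `clean_residual`) are illustrative: when the residual tracks the watermarked image, their mutual information is large, and driving it toward zero is what the residual information loss targets.

```python
import numpy as np

def mutual_information(x, y, bins=16):
    """Plug-in MI estimate (in nats) from a 2-D histogram of paired samples."""
    joint, _, _ = np.histogram2d(x, y, bins=bins)
    pxy = joint / joint.sum()
    px = pxy.sum(axis=1, keepdims=True)   # marginal of x (rows)
    py = pxy.sum(axis=0, keepdims=True)   # marginal of y (columns)
    nz = pxy > 0
    return float((pxy[nz] * np.log(pxy[nz] / (px @ py)[nz])).sum())

rng = np.random.default_rng(0)
w = rng.normal(size=50_000)                                # watermarked-image stand-in
leaky_residual = 0.9 * w + 0.1 * rng.normal(size=w.size)   # residual that tracks w
clean_residual = rng.normal(size=w.size)                   # residual independent of w

mi_leaky = mutual_information(leaky_residual, w)   # high: identity leaks
mi_clean = mutual_information(clean_residual, w)   # near zero: the training target
```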

What would settle it

An experiment in which an attacker forges a convincing watermarked image from the W-IR output by recovering leaked identity details, or where certified accuracy drops under the tested perturbations.

Figures

Figures reproduced from arXiv: 2605.09646 by Kui Ren, Qingyu Liu, Xinyu Zhang, Yuan Hong, Zhongjie Ba, Ziping Dong.

Figure 1. Our main contributions are: (1) the discovery …

Figure 2. Visualization for the three-facet performance (A – …

Figure 1. To enhance watermark robustness, researchers have developed various empirical defenses, including the integration of image augmentation [4], [22] and adversarial training [3] into the training phases of encoder and decoder networks. However, they are broken by adaptive or stronger attacks [17], [16]. Certified defenses [23] end the cat-and-mouse game between attacks and defenses primarily for classifica…

Figure 4. Authentication phase of image watermarking.

Figure 6. Identity forgery attacks. … can publish w on the Internet. When authentication for copyright or ownership is required, the user sends both the watermarked image w and the secret watermark t to the third-party provider. The provider subsequently returns a result indicating whether the secret matches the watermarked image, with responses categorized as "yes"/"no". Note that users typically maintain consisten…

Figure 8. Identity extraction attack. … coefficient (∈ [−1, 1]) [43] between the secret watermark distance and the residual image distance series may reach as high as 0.95 on StegaStamp (CelebA), indicating a strong positive correlation between these two sequences.

Figure 9. Rule-based perturbations. Left to right: original …

Figure 10. The cluster results for residual images on HiD…

Figure 12. Information content of feature representations.

Figure 13. Identity leakage mitigation in watermarking.

Figure 14. Impact of m and ζ on COCO (StegaStamp).

Figure 16. Certified accuracy (w/ L_RIL) at different radii against additive Gaussian noise. Datasets: left, COCO; right, CelebA. Models: top, StegaStamp; bottom, HiDDeN.

Figure 19. Impact of secret watermark pairs' distances under …
Original abstract

The rapid advancement of generative AI has underscored the critical need for identifying image ownership and protecting copyrights. This makes post-processing image watermarking an essential tool -- it involves embedding a specific watermark message into an image, with successful verification if a similar message can be decoded from the watermarked image. However, this method is susceptible to both adversarial attacks that manipulate the watermarked image to yield an unverified message upon decoding, and the proposed identity leakage-related attacks (e.g., forging watermarked images). The threat of identity leakage is particularly exacerbated in both empirical and certified robust watermarking methods. To defend against the aforementioned attacks, we propose W-IR, the first image watermarking framework that simultaneously incorporates identity protection and robustness. To enhance model robustness, we introduce a novel randomized smoothing technique as part of a robust watermarking, that offers certified robustness against perturbations across two distinct transformation spaces: pixel-level and coordinate-level. Moreover, to further mitigate identity leakage, we propose a new strategy based on residual information loss, aimed at minimizing the mutual information between the residual and watermarked images. Our work strikes a superior balance between robustness and identity leakage mitigation. Extensive experiments demonstrate that our W-IR framework achieves high certified accuracy for authenticity while effectively reducing identity leakage. The code is available at https://github.com/holdrain/W-I-R.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper proposes W-IR, the first image watermarking framework that jointly addresses robustness and identity leakage. It augments standard watermark embedding with randomized smoothing to obtain certified robustness against perturbations in both pixel-level and coordinate-level transformation spaces, and adds a residual information loss term to the training objective that minimizes mutual information between the residual and the watermarked image, thereby mitigating forging attacks that exploit identity leakage. The central claim is that this combination achieves a superior robustness-leakage trade-off, supported by extensive experiments that report high certified accuracy while reducing leakage.

Significance. If the certified radii remain valid under the joint objective and the empirical leakage reduction is reproducible, the work would provide a concrete, verifiable advance in post-processing watermarking for generative-AI content. The dual-space certification and the public code release are concrete strengths that would aid follow-up research.

major comments (2)
  1. [§3] §3 (Randomized Smoothing subsection): the certification analysis assumes a fixed extractor, yet the residual information loss term is added to the same training objective; no analytic bound is derived that shows the surrogate MI minimization leaves the certified radii in pixel and coordinate spaces unchanged or that it reduces forgery success probability below the attack threshold.
  2. [§5] §5 (Experimental results): the abstract and results claim 'high certified accuracy' and 'effective reduction' of identity leakage, but the text provides no concrete attack models, evaluation metrics (e.g., forgery success rate, certified radius values), or statistical significance tests that would allow verification of the 'superior balance' claim against baselines.
minor comments (2)
  1. [Abstract] Abstract: the phrase 'the first image watermarking framework' should be qualified with a brief comparison to the most closely related prior certified or leakage-aware methods.
  2. [§3.2] Notation: the residual information loss is described only in prose; an explicit equation relating the surrogate loss to I(residual; watermarked image) would improve clarity.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the detailed and constructive feedback. We address each major comment below, clarifying the technical points and outlining planned revisions to strengthen the manuscript.

Point-by-point responses
  1. Referee: [§3] §3 (Randomized Smoothing subsection): the certification analysis assumes a fixed extractor, yet the residual information loss term is added to the same training objective; no analytic bound is derived that shows the surrogate MI minimization leaves the certified radii in pixel and coordinate spaces unchanged or that it reduces forgery success probability below the attack threshold.

    Authors: The randomized smoothing certification theorems (for both pixel and coordinate spaces) are model-agnostic and apply to any fixed extractor at inference time once the model parameters are determined; they depend only on the smoothed prediction behavior under the chosen noise distribution and do not require assumptions on the training procedure. The residual information loss term acts as a regularizer during training to reduce mutual information between the residual and the watermarked image, but it does not alter the post-training certification guarantees. We agree that an explicit analytic bound linking the MI surrogate to unchanged radii or to a strict reduction in forgery success probability is not derived in the current draft. In the revision we will add a dedicated paragraph in §3 explaining why the certification remains valid independently of the training objective, together with empirical verification that certified radii are preserved under the joint objective and that forgery success rates fall below the attack threshold used in the experiments. revision: partial

  2. Referee: [§5] §5 (Experimental results): the abstract and results claim 'high certified accuracy' and 'effective reduction' of identity leakage, but the text provides no concrete attack models, evaluation metrics (e.g., forgery success rate, certified radius values), or statistical significance tests that would allow verification of the 'superior balance' claim against baselines.

    Authors: The full experimental section already defines the forging attack model (adversary reconstructs a watermarked image by exploiting residual identity leakage), reports forgery success rate as the primary leakage metric, lists concrete certified radii (e.g., r=0.5, 1.0 in pixel space and corresponding coordinate radii), and compares certified accuracy and leakage against multiple baselines. However, we acknowledge that the presentation could be clearer and that statistical significance tests (e.g., paired t-tests across runs) are not explicitly included. In the revised manuscript we will expand §5 with a dedicated subsection that (i) restates the attack model formally, (ii) tabulates exact certified radius values and forgery success rates for all methods, and (iii) adds statistical significance results to support the superiority claim. revision: yes

Circularity Check

0 steps flagged

No circularity: claims rest on proposed techniques and experiments, not on self-referential fits or definitions.

full rationale

The paper introduces W-IR by adding a randomized smoothing layer for certified robustness in pixel and coordinate spaces plus a residual information loss term to approximate mutual-information minimization. Neither component is defined in terms of the other or of the final certified-accuracy metric; the loss is a standard surrogate objective whose effect is measured empirically rather than asserted by construction. No load-bearing step reduces a prediction to a fitted parameter, invokes a self-citation uniqueness theorem, or renames an input as an output. The central claim of a superior robustness-leakage balance is therefore an empirical observation, not a tautology.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The framework rests on standard assumptions about randomized smoothing providing certification bounds and mutual information minimization reducing leakage; no explicit free parameters or invented entities are named in the abstract.

axioms (2)
  • domain assumption Randomized smoothing yields certified robustness bounds against perturbations in pixel-level and coordinate-level spaces.
    Invoked to claim certified accuracy for authenticity verification.
  • domain assumption Minimizing mutual information between residual and watermarked images reduces identity leakage risk.
    Core of the proposed mitigation strategy.

pith-pipeline@v0.9.0 · 5558 in / 1249 out tokens · 49374 ms · 2026-05-12T03:27:21.959395+00:00 · methodology

discussion (0)


Reference graph

Works this paper leans on

71 extracted references · 71 canonical work pages · 2 internal anchors

  1. [1] "DALL·E 3 — OpenAI," https://openai.com/index/dall-e-3/, 2024.
  2. [2] "Stable Diffusion 3 — Stability AI," https://stability.ai/news/stable-diffusion-3, 2024.
  3. [3] J. Zhu, R. Kaplan, J. Johnson, and L. Fei-Fei, "HiDDeN: Hiding data with deep networks," in Proceedings of the European Conference on Computer Vision (ECCV), 2018, pp. 657–672.
  4. [4] M. Tancik, B. Mildenhall, and R. Ng, "StegaStamp: Invisible hyperlinks in physical photographs," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020, pp. 2117–2126.
  5. [5] C. Zhang, P. Benz, A. Karjauv, G. Sun, and I. S. Kweon, "UDH: Universal deep hiding for steganography, watermarking, and light field messaging," Advances in Neural Information Processing Systems, vol. 33, pp. 10223–10234, 2020.
  6. [6] "Meta, Google, and OpenAI promise the White House they'll develop AI responsibly — The Verge," https://www.theverge.com/2023/7/21/23802274/artificial-intelligence-meta-google-openai-white-house-security-safety, 2024.
  7. [7] A. Ray and S. Roy, "Recent trends in image watermarking techniques for copyright protection: a survey," International Journal of Multimedia Information Retrieval, vol. 9, no. 4, pp. 249–270, 2020.
  8. [8] R. Rombach, A. Blattmann, D. Lorenz, P. Esser, and B. Ommer, "High-resolution image synthesis with latent diffusion models," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022, pp. 10684–10695.
  9. [9] S. Dathathri, A. See, S. Ghaisas, P.-S. Huang, R. McAdam, J. Welbl, V. Bachani, A. Kaskasoli, R. Stanforth, T. Matejovicova et al., "Scalable watermarking for identifying large language model outputs," Nature, vol. 634, no. 8035, pp. 818–823, 2024.
  10. [10] "Watermark detection for Amazon Titan Image Generator now available in Amazon Bedrock." [Online]. Available: https://aws.amazon.com/about-aws/whats-new/2024/04/watermark-detection-amazon-titan-image-generator-bedrock/?nc1=h ls

  11. [11] Y. Mehdi, "Announcing Microsoft Copilot, your everyday AI companion — The Official Microsoft Blog," Sep. 2023. [Online]. Available: https://blogs.microsoft.com/blog/2023/09/21/announcing-microsoft-copilot-your-everyday-ai-companion/
  12. [12] R. B. Wolfgang and E. J. Delp, "A watermark for digital images," in Proceedings of the 3rd IEEE International Conference on Image Processing, vol. 3. IEEE, 1996, pp. 219–222.
  13. [13] X. Zhao, K. Zhang, Z. Su, S. Vasan, I. Grishchenko, C. Kruegel, G. Vigna, Y.-X. Wang, and L. Li, "Invisible image watermarks are provably removable using generative AI," Advances in Neural Information Processing Systems, vol. 37, pp. 8643–8672, 2024.
  14. [14] I. J. Goodfellow, J. Shlens, and C. Szegedy, "Explaining and harnessing adversarial examples," in International Conference on Learning Representations, 2015.
  15. [15] L. Engstrom, B. Tran, D. Tsipras, L. Schmidt, and A. Madry, "Exploring the landscape of spatial robustness," in International Conference on Machine Learning. PMLR, 2019, pp. 1802–1811.
  16. [16] M. Saberi, V. S. Sadasivan, K. Rezaei, A. Kumar, A. Chegini, W. Wang, and S. Feizi, "Robustness of AI-image detectors: Fundamental limits and practical attacks," arXiv preprint arXiv:2310.00076, 2023.
  17. [17] Z. Jiang, J. Zhang, and N. Z. Gong, "Evading watermark based detection of AI-generated content," in Proceedings of the 2023 ACM SIGSAC Conference on Computer and Communications Security, 2023, pp. 1168–1181.
  18. [18] W. Wan, J. Wang, Y. Zhang, J. Li, H. Yu, and J. Sun, "A comprehensive survey on robust image watermarking," Neurocomputing, vol. 488, pp. 226–247, 2022.
  19. [19] Z. Jiang, M. Guo, Y. Hu, and N. Z. Gong, "Watermark-based detection and attribution of AI-generated content," arXiv preprint arXiv:2404.04254, 2024.
  20. [20] P. V. Sanivarapu, K. N. Rajesh, K. M. Hosny, and M. M. Fouda, "Digital watermarking system for copyright protection and authentication of images using cryptographic techniques," Applied Sciences, vol. 12, no. 17, p. 8724, 2022.

  21. [21] M. Kutter, S. V. Voloshynovskiy, and A. Herrigel, "Watermark copy attack," in Security and Watermarking of Multimedia Contents II. SPIE, 2000, pp. 371–380.
  22. [22]
  23. [23] Z. Jia, H. Fang, and W. Zhang, "MBRS: Enhancing robustness of DNN-based watermarking by mini-batch of real and simulated JPEG compression," in Proceedings of the 29th ACM International Conference on Multimedia, 2021, pp. 41–49.

  24. [24] L. Li, T. Xie, and B. Li, "SoK: Certified robustness for deep neural networks," in 2023 IEEE Symposium on Security and Privacy (SP). IEEE, 2023, pp. 1289–1310.
  25. [25] J. Cohen, E. Rosenfeld, and Z. Kolter, "Certified adversarial robustness via randomized smoothing," in International Conference on Machine Learning. PMLR, 2019, pp. 1310–1320.
  26. [26] D. Zhang, M. Ye, C. Gong, Z. Zhu, and Q. Liu, "Black-box certification with randomized smoothing: A functional optimization based framework," Advances in Neural Information Processing Systems, vol. 33, pp. 2316–2326, 2020.
  27. [27] B. Poole, S. Ozair, A. Van Den Oord, A. Alemi, and G. Tucker, "On variational bounds of mutual information," in International Conference on Machine Learning. PMLR, 2019, pp. 5171–5180.
  28. [28] M. I. Belghazi, A. Baratin, S. Rajeshwar, S. Ozair, Y. Bengio, A. Courville, and D. Hjelm, "Mutual information neural estimation," in International Conference on Machine Learning. PMLR, 2018, pp. 531–540.
  29. [29] X. Tian, Z. Zhang, S. Lin, Y. Qu, Y. Xie, and L. Ma, "Farewell to mutual information: Variational distillation for cross-modal person re-identification," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2021, pp. 1522–1531.
  30. [30] J. Johnson, A. Alahi, and L. Fei-Fei, "Perceptual losses for real-time style transfer and super-resolution," in Computer Vision – ECCV 2016, Part II. Springer, 2016, pp. 694–711.
  31. [31] R. Zhang, P. Isola, A. A. Efros, E. Shechtman, and O. Wang, "The unreasonable effectiveness of deep features as a perceptual metric," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2018, pp. 586–595.
  32. [32] N. Lukas and F. Kerschbaum, "PTW: Pivotal tuning watermarking for pre-trained image generators," in 32nd USENIX Security Symposium (USENIX Security 23), 2023, pp. 2241–2258.
  33. [33] P. Fernandez, G. Couairon, H. Jégou, M. Douze, and T. Furon, "The stable signature: Rooting watermarks in latent diffusion models," in Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023, pp. 22466–22477.

  34. [34] B. An, M. Ding, T. Rabbani, A. Agrawal, Y. Xu, C. Deng, S. Zhu, A. Mohamed, Y. Wen, T. Goldstein et al., "WAVES: Benchmarking the robustness of image watermarks," arXiv preprint arXiv:2401.08573, 2024.
  35. [35] R. Ma, M. Guo, Y. Hou, F. Yang, Y. Li, H. Jia, and X. Xie, "Towards blind watermarking: Combining invertible and non-invertible mechanisms," in Proceedings of the 30th ACM International Conference on Multimedia, 2022, pp. 1532–1542.
  36. [36] M. Lecuyer, V. Atlidakis, R. Geambasu, D. Hsu, and S. Jana, "Certified robustness to adversarial examples with differential privacy," in 2019 IEEE Symposium on Security and Privacy (SP). IEEE, 2019, pp. 656–672.
  37. [37] J. Jia, B. Wang, X. Cao, and N. Z. Gong, "Certified robustness of community detection against adversarial structural perturbation via randomized smoothing," in Proceedings of The Web Conference 2020, 2020, pp. 2718–2724.
  38. [38] B. Wang, J. Jia, X. Cao, and N. Z. Gong, "Certified robustness of graph neural networks against adversarial structural perturbation," in Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data Mining, 2021, pp. 1645–1653.
  39. [39] X. Zhang, H. Hong, Y. Hong, P. Huang, B. Wang, Z. Ba, and K. Ren, "Text-CRS: A generalized certified robustness framework against textual adversarial attacks," in 2024 IEEE Symposium on Security and Privacy (SP). IEEE Computer Society, 2023, pp. 53–53.
  40. [40] W. Ding, Y. Ming, Z. Cao, and C.-T. Lin, "A generalized deep neural network approach for digital watermarking analysis," IEEE Transactions on Emerging Topics in Computational Intelligence, vol. 6, no. 3, pp. 613–627, 2021.
  41. [41] L. Van der Maaten and G. Hinton, "Visualizing data using t-SNE," Journal of Machine Learning Research, vol. 9, no. 11, 2008.
  42. [42] K. Krishna and M. N. Murty, "Genetic K-means algorithm," IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics), vol. 29, no. 3, pp. 433–439, 1999.
  43. [43] G. Doërr and J.-L. Dugelay, "Collusion issue in video watermarking," in Security, Steganography, and Watermarking of Multimedia Contents VII, vol. 5681. SPIE, 2005, pp. 685–696.
  44. [44] J. Benesty, J. Chen, Y. Huang, and I. Cohen, "Pearson correlation coefficient," Noise Reduction in Speech Processing, pp. 1–4, 2009.

  45. [45] R. Ma, M. Guo, Y. Hou, F. Yang, Y. Li, H. Jia, and X. Xie, "Towards blind watermarking: Combining invertible and non-invertible mechanisms," in Proceedings of the 30th ACM International Conference on Multimedia. Lisboa, Portugal: ACM, Oct. 2022, pp. 1532–1542.
  46. [46] J. C. Pérez, M. Alfarra, S. Giancola, B. Ghanem et al., "3DeformRS: Certifying spatial deformations on point clouds," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022, pp. 15169–15179.
  47. [47] M. Alfarra, A. Bibi, N. Khan, P. H. Torr, and B. Ghanem, "DeformRS: Certifying input deformations with randomized smoothing," in Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, no. 6, 2022, pp. 6001–6009.
  48. [48] L. Li, M. Weber, X. Xu, L. Rimanic, B. Kailkhura, T. Xie, C. Zhang, and B. Li, "TSS: Transformation-specific smoothing for robustness certification," in Proceedings of the 2021 ACM SIGSAC Conference on Computer and Communications Security, 2021, pp. 535–557.
  49. [49] X. Nguyen, M. J. Wainwright, and M. I. Jordan, "Estimating divergence functionals and the likelihood ratio by convex risk minimization," IEEE Transactions on Information Theory, vol. 56, no. 11, pp. 5847–5861, 2010.
  50. [50] T. Karras, T. Aila, S. Laine, and J. Lehtinen, "Progressive growing of GANs for improved quality, stability, and variation," in International Conference on Learning Representations, 2018. [Online]. Available: https://openreview.net/forum?id=Hk99zCeAb
  51. [51] T.-Y. Lin, M. Maire, S. Belongie, J. Hays, P. Perona, D. Ramanan, P. Dollár, and C. L. Zitnick, "Microsoft COCO: Common objects in context," in Computer Vision – ECCV 2014, Part V. Springer, 2014, pp. 740–755.
  52. [52] P. J. Rousseeuw, "Silhouettes: a graphical aid to the interpretation and validation of cluster analysis," Journal of Computational and Applied Mathematics, vol. 20, pp. 53–65, 1987.
  53. [53] L. Lei, K. Gai, J. Yu, and L. Zhu, "DiffuseTrace: A transparent and flexible watermarking scheme for latent diffusion model," arXiv preprint arXiv:2405.02696, 2024.

  54. [54]

    Anti- screenshot watermarking algorithm for archival image based on deep learning model,

    W. Gu, C.-C. Chang, Y. Bai, Y. Fan, L. Tao, and L. Li, “Anti- screenshot watermarking algorithm for archival image based on deep learning model,”Entropy, vol. 25, no. 2, p. 288, 2023

  55. [55]

    Lsb based digital image watermarking for gray scale image,

    D. Chopra, P . Gupta, G. Sanjay, and A. Gupta, “Lsb based digital image watermarking for gray scale image,”IOSR Journal of Com- puter Engineering, vol. 6, no. 1, pp. 36–41, 2012

  56. [56]

    Combined dwt-dct digital image watermarking,

    A. Al-Haj, “Combined dwt-dct digital image watermarking,”Jour- nal of computer science, vol. 3, no. 9, pp. 740–746, 2007

  57. [57]

    Learning- based image steganography and watermarking: A survey,

    K. Hu, M. Wang, X. Ma, J. Chen, X. Wang, and X. Wang, “Learning- based image steganography and watermarking: A survey,”Expert Systems with Applications, vol. 249, p. 123715, Sep. 2024

  58. [58]

    H. Fang, Y. Qiu, K. Chen, J. Zhang, W. Zhang, and E.-C. Chang, “Flow-based robust watermarking with invertible noise layer for black-box distortions,” in Proceedings of the AAAI Conference on Artificial Intelligence, vol. 37, no. 4, 2023, pp. 5054–5061

  59. [59]

    N. Yu, V. Skripniuk, S. Abdelnabi, and M. Fritz, “Artificial Fingerprinting for Generative Models: Rooting Deepfake Attribution in Training Data,” in 2021 IEEE/CVF International Conference on Computer Vision (ICCV). Montreal, QC, Canada: IEEE, Oct. 2021, pp. 14428–14437

  60. [60]

    N. Yu, V. Skripniuk, D. Chen, L. S. Davis, and M. Fritz, “Responsible disclosure of generative models using scalable fingerprinting,” in International Conference on Learning Representations, 2021

  61. [61]

    Y. Wen, J. Kirchenbauer, J. Geiping, and T. Goldstein, “Tree-ring watermarks: Fingerprints for diffusion images that are invisible and robust,” arXiv preprint arXiv:2305.20030, 2023

  62. [62]

    Y. Zhao, T. Pang, C. Du, X. Yang, N.-M. Cheung, and M. Lin, “A recipe for watermarking diffusion models,” arXiv preprint arXiv:2303.10137, 2023

  63. [63]

    A. Madry, A. Makelov, L. Schmidt, D. Tsipras, and A. Vladu, “Towards deep learning models resistant to adversarial attacks,” arXiv preprint arXiv:1706.06083, 2017

  64. [64]

    J. Chen, M. I. Jordan, and M. J. Wainwright, “HopSkipJumpAttack: A query-efficient decision-based attack,” in 2020 IEEE Symposium on Security and Privacy (SP). IEEE, 2020, pp. 1277–1294

  65. [65]

    Y. Hu, Z. Jiang, M. Guo, and N. Gong, “A Transfer Attack to Image Watermarks,” Mar. 2024

  66. [66]

    X. Zhao, K. Zhang, Z. Su, S. Vasan, I. Grishchenko, C. Kruegel, G. Vigna, Y.-X. Wang, and L. Li, “Invisible Image Watermarks Are Provably Removable Using Generative AI,” Aug. 2023

  67. [67]

    J. Ballé, D. Minnen, S. Singh, S. J. Hwang, and N. Johnston, “Variational image compression with a scale hyperprior,” arXiv preprint arXiv:1802.01436, 2018

  68. [68]

    Z. Cheng, H. Sun, M. Takeuchi, and J. Katto, “Learned image compression with discretized gaussian mixture likelihoods and attention modules,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020, pp. 7939–7948

  69. [69]

    P. Esser, R. Rombach, and B. Ommer, “Taming transformers for high-resolution image synthesis,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2021, pp. 12873–12883

  70. [70]

    J. Song, C. Meng, and S. Ermon, “Denoising diffusion implicit models,” arXiv preprint arXiv:2010.02502, 2020

  71. [71]

    Z. Yang, K. Zeng, K. Chen, H. Fang, W. Zhang, and N. Yu, “Gaussian Shading: Provable Performance-Lossless Image Watermarking for Diffusion Models,” May 2024, accepted by CVPR 2024

APPENDIX A
PROOF FOR RESIDUAL INFORMATION LOSS

Theorem 3. To deal with the estimation of mutual information in Eq. 13, we introduce the following ...
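The appendix fragment above invokes a variational estimate of mutual information, for which reference [49] (Nguyen, Wainwright, and Jordan) supplies the standard convex-dual lower bound. As an illustrative sketch only — this is the generic NWJ bound I(X;Y) ≥ E_{p(x,y)}[f(x,y)] − e⁻¹ E_{p(x)p(y)}[e^{f(x,y)}], not the paper's Theorem 3, and the function names and toy critic are hypothetical:

```python
import numpy as np

def nwj_lower_bound(critic, joint_xy, marg_xy):
    """Nguyen-Wainwright-Jordan variational lower bound on I(X;Y):
    I(X;Y) >= E_p(x,y)[f(x,y)] - e^{-1} E_p(x)p(y)[exp(f(x,y))].
    `joint_xy` holds pairs drawn from the joint distribution; `marg_xy`
    holds pairs with y shuffled (an approximation of the product of
    marginals). `critic` maps a pair to a scalar score."""
    joint_term = np.mean([critic(x, y) for x, y in joint_xy])
    marg_term = np.mean([np.exp(critic(x, y)) for x, y in marg_xy]) / np.e
    return joint_term - marg_term

# Toy check: a constant critic f = 1 makes both terms equal 1, so the
# bound evaluates to ~0 (up to float rounding) regardless of the data;
# a trainable critic would be maximized to tighten the bound.
rng = np.random.default_rng(0)
x = rng.normal(size=1000)
y = x + 0.5 * rng.normal(size=1000)   # correlated pair from the joint
y_shuf = rng.permutation(y)           # shuffle y to break the dependence
const = lambda a, b: 1.0
print(nwj_lower_bound(const, list(zip(x, y)), list(zip(x, y_shuf))))
```

In practice the critic is a small network whose parameters are optimized to maximize this bound, which is how neural mutual-information estimators built on [49] are typically trained.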