pith. machine review for the scientific record. sign in

arxiv: 2604.22550 · v1 · submitted 2026-04-24 · 💻 cs.CR · cs.AI

Recognition: unknown

ArmSSL: Adversarial Robust Black-Box Watermarking for Self-Supervised Learning Pre-trained Encoders

Authors on Pith no claims yet

Pith reviewed 2026-05-08 11:34 UTC · model grok-4.3

classification 💻 cs.CR cs.AI
keywords self-supervised learningwatermarkingblack-box verificationadversarial robustnessencoder IP protectionrepresentation entanglementdistribution alignment
0
0 comments X

The pith

ArmSSL watermarks self-supervised encoders so owners can verify theft via black-box queries while the marks resist detection and removal.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper develops ArmSSL to protect SSL pre-trained encoders as intellectual property. It solves the problem that existing watermarks either fail when the encoder is embedded in downstream tasks or can be spotted and stripped because watermark samples stand out as unusual data. The method creates a clear verification signal by forcing large differences between how the model processes ordinary inputs and specially chosen watermark inputs. At the same time it hides those watermark inputs by mixing their internal representations with ordinary ones and making their overall statistics match normal data. A reference-guided tuning step keeps the encoder's normal behavior almost unchanged. If correct, this would let owners prove ownership of stolen models even after they have been fine-tuned or deployed, without paying a noticeable performance cost.

Core claim

By enlarging the feature-space discrepancy between each clean input and its paired watermark counterpart, ArmSSL produces a reliable black-box verification signal. Latent representation entanglement mixes watermark features with non-source-class clean features to prevent dense clusters, while distribution alignment reduces statistical differences so watermark samples appear in-distribution. Reference-guided tuning aligns the watermarked encoder's outputs on clean data with the original encoder's outputs, keeping utility intact. Experiments across five SSL frameworks and nine datasets show the approach maintains verification accuracy, near-zero utility loss, and resistance to multiple removal

What carries the argument

Paired discrepancy enlargement for verification, combined with latent representation entanglement and distribution alignment to disguise watermark samples as ordinary data.

If this is right

  • Owners obtain a statistical test on query pairs that confirms ownership even after the encoder is used in new tasks.
  • Watermark samples no longer form isolated clusters, blocking simple outlier-based detection methods.
  • Downstream task accuracy stays essentially the same as the unmarked encoder.
  • The scheme works for multiple popular self-supervised frameworks without requiring white-box access.
  • End-to-end comparisons show better trade-offs than prior watermarking techniques on the same benchmarks.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • If the hiding technique generalizes, similar entanglement and alignment steps could protect other types of pre-trained models beyond SSL.
  • Widespread use might lower the practical value of stealing encoders, because verification becomes feasible even after deployment.
  • The method opens the question of whether stronger future attacks could still separate the entangled representations without large utility cost.

Load-bearing premise

Entangling watermark representations with clean ones and aligning their distributions will keep watermark samples from forming a detectable out-of-distribution cluster while still allowing reliable verification and unchanged downstream performance.

What would settle it

An attack that either extracts or erases the watermark without harming downstream accuracy, or that reliably flags the watermark samples as out-of-distribution despite the entanglement and alignment steps.

Figures

Figures reproduced from arXiv: 2604.22550 by Anmin Fu, Boyu Kuang, Chunyi Zhou, Liquan Chen, Yansong Gao, Yongqi Jiang.

Figure 1
Figure 1. Figure 1: An illustration of an attacker stealing and fine view at source ↗
Figure 2
Figure 2. Figure 2: PCA visualization of representations of clean view at source ↗
Figure 3
Figure 3. Figure 3: Reverse-engineered triggers by DECREE. visualization, watermark samples in watermarked encoders form distinct, tightly clustered anomalies (see Fig. 2b,2c), while those in clean encoders overlap with clean samples (see Fig. 2a). This phenomenon likely arises from over￾fitting [35], where feature representations become domi￾nated by watermark-specific patterns instead of benign class￾discriminative features… view at source ↗
Figure 4
Figure 4. Figure 4: The maximum margins computed by MM-BD for view at source ↗
Figure 5
Figure 5. Figure 5: ArmSSL has two phases: watermark embedding and ownership verification. aforementioned objectives. It can be formalized as: Ltotal = αLrbt + βLwm + Lref , (1) where hyperparameters α and β are introduced to balance the adversarial robustness and verification capability. Algo￾rithm 1 in Appendix A.1 details this embedding procedure. To seamlessly integrate the watermark into the well￾trained encoder by fine-… view at source ↗
Figure 6
Figure 6. Figure 6: Overhead of ArmSSL’s watermark embedding. ResNet-18 encoder pretrained on Imagenette. As shown in TABLE 4, we assess the p-values across different down￾stream classifiers using the numbers of 20, 50, 100, 200, and 500 probing pairs. The results indicate that ArmSSL can successfully verify ownership even with as few as 20 probing pairs. As the number of probes increases from 20 to 500, the p-values steadily… view at source ↗
Figure 7
Figure 7. Figure 7: Knowledgeable Attacker (III) with knowing the watermark samples, trigger, and complete embedding pipeline. view at source ↗
Figure 9
Figure 9. Figure 9: The ArmSSL’s p-value under DINOv2 algorithm view at source ↗
read the original abstract

Self-supervised learning (SSL) encoders are invaluable intellectual property (IP). However, no existing SSL watermarking for IP protection can concurrently satisfy the following two practical requirements: (1) provide ownership verification capability under black-box suspect model access once the stolen encoders are used in downstream tasks; (2) be robust under adversarial watermark detection or removal, because the watermark samples form a distinguishable out-of-distribution (OOD) cluster. We propose ArmSSL, an SSL watermarking framework that assures black-box verifiability and adversarial robustness while preserving utility. For verification, we introduce paired discrepancy enlargement, enforcing feature-space orthogonality between the clean and its watermark counterpart to produce a reliable verification signal in black-box against the suspect model. For adversarial robustness, ArmSSL integrates latent representation entanglement and distribution alignment to suppress the OOD clustering. The former entangles watermark representations with clean representations (i.e., from non-source-class) to avoid forming a dense cluster of watermark samples, while the latter minimizes the distributional discrepancy between watermark and clean representations, thereby disguising watermark samples as natural in-distribution data. For utility, a reference-guided watermark tuning strategy is designed to allow the watermark to be learned as a small side task without affecting the main task by aligning the watermarked encoder's outputs with those of the original clean encoder on normal data. Extensive experiments across five mainstream SSL frameworks and nine benchmark datasets, along with end-to-end comparisons with SOTAs, demonstrate that ArmSSL achieves superior ownership verification, negligible utility degradation, and strong robustness against various adversarial detection and removal.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper proposes ArmSSL, a watermarking framework for self-supervised learning (SSL) pre-trained encoders. It enables black-box ownership verification via paired discrepancy enlargement that enforces feature-space orthogonality between clean and watermark sample pairs. Adversarial robustness is achieved by combining latent representation entanglement (to avoid dense watermark clusters) with distribution alignment (to disguise watermark samples as in-distribution). Utility preservation uses reference-guided watermark tuning that aligns watermarked encoder outputs with the original on clean data. Experiments across five SSL frameworks and nine datasets, with comparisons to SOTAs, claim superior verification reliability, negligible utility loss, and strong robustness to detection/removal attacks.

Significance. If the central claims hold, the work is significant for IP protection of SSL encoders, filling gaps in prior schemes that lack concurrent black-box verifiability and adversarial robustness. The multi-component design (orthogonality for signal, entanglement+alignment for OOD suppression, reference tuning for utility) and broad experimental coverage across frameworks/datasets are strengths; reproducible code or parameter-free derivations are not mentioned.

major comments (2)
  1. [§3.2] §3.2 (paired discrepancy enlargement and distribution alignment): the claim that alignment can suppress OOD clustering while preserving a reliable orthogonality-based verification signal is load-bearing, yet no ablation quantifies the verification metric (e.g., AUC or cosine discrepancy magnitude) before versus after the alignment step; this leaves the compatibility of minimizing distributional discrepancy with maintaining feature-space orthogonality unverified.
  2. [§4] §4 (experimental evaluation): while end-to-end comparisons with SOTAs are reported, the manuscript does not include per-attack success rates, threshold choices for black-box verification, or statistical significance tests across the nine datasets, making it difficult to assess whether the claimed superiority is robust to hyperparameter variation.
minor comments (2)
  1. Notation for the combined loss (entanglement + alignment + reference terms) is introduced without an explicit equation numbering the full objective; adding this would improve clarity.
  2. Figure captions for the OOD visualization and verification ROC curves could explicitly state the number of runs and error bars used.

Simulated Author's Rebuttal

2 responses · 0 unresolved

Dear Editor, We thank the referee for the constructive and detailed comments on our manuscript. We address each major comment point by point below, providing clarifications and committing to revisions that strengthen the empirical support for our claims. All suggested additions are feasible with our existing experimental setup.

read point-by-point responses
  1. Referee: [§3.2] §3.2 (paired discrepancy enlargement and distribution alignment): the claim that alignment can suppress OOD clustering while preserving a reliable orthogonality-based verification signal is load-bearing, yet no ablation quantifies the verification metric (e.g., AUC or cosine discrepancy magnitude) before versus after the alignment step; this leaves the compatibility of minimizing distributional discrepancy with maintaining feature-space orthogonality unverified.

    Authors: We agree that an explicit ablation would better demonstrate the compatibility of the components. The paired discrepancy enlargement loss directly optimizes feature-space orthogonality between clean-watermark pairs, while distribution alignment minimizes marginal distributional discrepancy to suppress OOD clustering without altering the pairwise constraint. In the revised manuscript, we will add an ablation study reporting verification AUC and average cosine discrepancy magnitudes on the nine datasets both before and after the alignment step. This will empirically confirm that the orthogonality signal remains reliable (AUC > 0.95) post-alignment. revision: yes

  2. Referee: [§4] §4 (experimental evaluation): while end-to-end comparisons with SOTAs are reported, the manuscript does not include per-attack success rates, threshold choices for black-box verification, or statistical significance tests across the nine datasets, making it difficult to assess whether the claimed superiority is robust to hyperparameter variation.

    Authors: We acknowledge that greater granularity in the experimental section would aid assessment of robustness. In the revision, we will expand §4 to include: (i) per-attack success rates (detection and removal) for all baselines and ArmSSL across the five SSL frameworks; (ii) explicit description of black-box verification threshold selection (e.g., 95th percentile of clean-pair discrepancies on a held-out set); and (iii) statistical significance tests (paired t-tests with p-values) comparing verification AUC and utility metrics across the nine datasets. These additions will show that superiority holds under the reported hyperparameter ranges. revision: yes

Circularity Check

0 steps flagged

No circularity in claimed derivation chain

full rationale

The paper's core contributions are presented as independent design choices: paired discrepancy enlargement for verification signal, latent entanglement plus distribution alignment for OOD suppression, and reference-guided tuning for utility preservation. These are motivated by distinct goals and do not reduce to self-definition, fitted parameters renamed as predictions, or load-bearing self-citations. No equations or derivation steps are exhibited that equate outputs to inputs by construction. The potential tension between alignment and discrepancy (noted in skeptic analysis) is a compatibility question, not a circularity reduction.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Review is based solely on the abstract; no explicit free parameters, axioms, or invented entities are identifiable. The approach implicitly relies on standard assumptions that feature spaces allow measurable discrepancies and that fine-tuning can align behaviors without side effects.

pith-pipeline@v0.9.0 · 5601 in / 1126 out tokens · 83476 ms · 2026-05-08T11:34:51.987635+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Reference graph

Works this paper leans on

52 extracted references · 3 canonical work pages · 1 internal anchor

  1. [1]

    A simple framework for contrastive learning of visual representations,

    T. Chen, S. Kornblith, M. Norouziet al., “A simple framework for contrastive learning of visual representations,” inProc. Int. Conf. Mach. Learn. (ICML), 2020, pp. 1597–1607

  2. [2]

    A survey on self-supervised learning: Algorithms, applications, and future trends,

    J. Gui, T. Chen, J. Zhanget al., “A survey on self-supervised learning: Algorithms, applications, and future trends,”IEEE Trans. Pattern Anal. Mach. Intell., vol. 46, no. 12, pp. 9052–9071, 2024

  3. [3]

    Hssas: Optimizing nlp models by combining self-supervised learning and neural architecture search,

    S. Qi, D. Wang, Y . Fanet al., “Hssas: Optimizing nlp models by combining self-supervised learning and neural architecture search,” inProc. Int. Conf. Robot. Autom. Intell. Control (ICRAIC), 2024, pp. 71–75

  4. [4]

    Deepdriving: Learning affordance for direct perception in autonomous driving,

    C. Chen, A. Seff, A. L. Kornhauseret al., “Deepdriving: Learning affordance for direct perception in autonomous driving,” inProc. ICCV. IEEE Computer Society, 2015, pp. 2722–2730

  5. [5]

    https://ai.meta.com/blog/self-supervised- learning-the-dark-matter-of-intelligence,

    Y . LeCun and I. Misra, “https://ai.meta.com/blog/self-supervised- learning-the-dark-matter-of-intelligence,” March 4, 2021

  6. [6]

    Sslguard: A watermarking scheme for self-supervised learning pre-trained encoders,

    T. Cong, X. He, and Y . Zhang, “Sslguard: A watermarking scheme for self-supervised learning pre-trained encoders,” inACM Conf. Comput. Commun. Secur. (CCS), 2022, pp. 579–593

  7. [7]

    Stolenencoder: stealing pre-trained en- coders in self-supervised learning,

    Y . Liu, J. Jia, H. Liuet al., “Stolenencoder: stealing pre-trained en- coders in self-supervised learning,” inACM Conf. Comput. Commun. Secur. (CCS), 2022, pp. 2115–2128

  8. [8]

    On the difficulty of defending self-supervised learning against model extraction,

    A. Dziedzic, N. Dhawan, M. A. Kaleemet al., “On the difficulty of defending self-supervised learning against model extraction,” in International Conference on Machine Learning. PMLR, 2022, pp. 5757–5776

  9. [9]

    Ssl-wm: A black-box watermarking approach for encoders pre-trained by self-supervised learning,

    P. Lv, P. Li, S. Zhuet al., “Ssl-wm: A black-box watermarking approach for encoders pre-trained by self-supervised learning,”Proc. Netw. Distrib. Syst. Secur. Symp. (NDSS), 2024

  10. [10]

    Watermarking pre-trained encoders in contrastive learning,

    Y . Wu, H. Qiu, T. Zhanget al., “Watermarking pre-trained encoders in contrastive learning,” inProc. Int. Conf. Data Intell. Secur. (ICDIS). IEEE, 2022, pp. 228–233

  11. [11]

    Detecting backdoors in pre-trained encoders,

    S. Feng, G. Tao, S. Chenget al., “Detecting backdoors in pre-trained encoders,” inProc. IEEE/CVF Conf. Comput. Vis. Pattern Recognit. (CVPR), 2023, pp. 16 352–16 362

  12. [12]

    Mm-bd: Post-training detection of backdoor attacks with arbitrary backdoor pattern types using a maximum margin statistic,

    H. Wang, Z. Xiang, D. J. Milleret al., “Mm-bd: Post-training detection of backdoor attacks with arbitrary backdoor pattern types using a maximum margin statistic,” inProc. IEEE Symp. Secur. Priv. (SP), 2024, pp. 1994–2012

  13. [13]

    Generalized sliced wasser- stein distances,

    S. Kolouri, K. Nadjahi, U. Simsekliet al., “Generalized sliced wasser- stein distances,”Proc. Adv. Neural Inf. Process. Syst. (NeurIPS), vol. 32, 2019

  14. [14]

    A comparison review of transfer learning and self-supervised learning: Definitions, applica- tions, advantages and limitations,

    Z. Zhao, L. Alzubaidi, J. Zhanget al., “A comparison review of transfer learning and self-supervised learning: Definitions, applica- tions, advantages and limitations,”Expert Syst. Appl., vol. 242, p. 122807, 2024

  15. [15]

    Improved Baselines with Momentum Contrastive Learning

    X. Chen, H. Fan, R. Girshicket al., “Improved baselines with momentum contrastive learning,”arXiv preprint arXiv:2003.04297, 2020

  16. [16]

    Bootstrap your own latent-a new approach to self-supervised learning,

    J.-B. Grill, F. Strub, F. Altch ´eet al., “Bootstrap your own latent-a new approach to self-supervised learning,”Proc. Adv. Neural Inf. Process. Syst. (NeurIPS), vol. 33, pp. 21 271–21 284, 2020

  17. [17]

    Exploring simple siamese representation learning,

    X. Chen and K. He, “Exploring simple siamese representation learning,” inProc. IEEE/CVF Conf. Comput. Vis. Pattern Recognit. (CVPR), 2021, pp. 15 750–15 758

  18. [18]

    Dinov2: Learning robust visual features without supervision,

    M. Oquab, T. Darcet, T. Moutakanniet al., “Dinov2: Learning robust visual features without supervision,”Trans. Mach. Learn. Res., 2024

  19. [19]

    Embedding watermarks into deep neural networks,

    Y . Uchida, Y . Nagai, S. Sakazawaet al., “Embedding watermarks into deep neural networks,” inProc. Int. Conf. Multimedia Retr. (ICMR), 2017, pp. 269–277

  20. [20]

    Deepmarks: A secure finger- printing framework for digital rights management of deep learning models,

    H. Chen, B. D. Rouhani, C. Fuet al., “Deepmarks: A secure finger- printing framework for digital rights management of deep learning models,” inProc. Int. Conf. Multimedia Retr. (ICMR), 2019, pp. 105– 113

  21. [21]

    Watermarking deep neural networks with greedy residuals

    H. Liu, Z. Weng, and Y . Zhu, “Watermarking deep neural networks with greedy residuals.” inProc. Int. Conf. Mach. Learn. (ICML), 2021, pp. 6978–6988

  22. [22]

    Robust watermarking of neural network with exponential weighting,

    R. Namba and J. Sakuma, “Robust watermarking of neural network with exponential weighting,” inACM Conf. Comput. Commun. Secur. (CCS), 2019, pp. 228–240

  23. [23]

    Deep- eclipse: How to break white-box dnn-watermarking schemes,

    A. Pegoraro, C. Segna, K. Kumari, and A.-R. Sadeghi, “Deep- eclipse: How to break white-box dnn-watermarking schemes,” in Proc. USENIX Secur. Symp. (USENIX Security ’24), 2024, pp. 5287– 5304

  24. [24]

    Entangled watermarks as a defense against model extraction,

    H. Jia, C. A. Choquette-Choo, V . Chandrasekaranet al., “Entangled watermarks as a defense against model extraction,” in30th USENIX security symposium (USENIX Security 21), 2021, pp. 1937–1954

  25. [25]

    Move: Effective and harmless ownership verification via embedded external features,

    Y . Li, L. Zhu, X. Jiaet al., “Move: Effective and harmless ownership verification via embedded external features,”IEEE Trans. Pattern Anal. Mach. Intell., 2025

  26. [26]

    Explanation as a watermark: Towards harmless and multi-bit model ownership verification via watermarking feature attribution,

    S. Shao, Y . Li, H. Yaoet al., “Explanation as a watermark: Towards harmless and multi-bit model ownership verification via watermarking feature attribution,”Proc. Netw. Distrib. Syst. Secur. Symp. (NDSS), 2025

  27. [27]

    Robust watermarking for deep neural networks via bi-level optimization,

    P. Yang, Y . Lao, and P. Li, “Robust watermarking for deep neural networks via bi-level optimization,” inProc. IEEE Int. Conf. Comput. Vis. (ICCV), 2021, pp. 14 841–14 850

  28. [28]

    Aime: Watermarking ai models by leveraging errors,

    D. Mehta, N. Mondol, F. Farahmandiet al., “Aime: Watermarking ai models by leveraging errors,” inProc. Des. Autom. Test Eur.(DATE). IEEE, 2022, pp. 304–309

  29. [29]

    Dataset ownership verification in contrastive pre-trained models,

    Y . Xie, J. Song, M. Xue, H. Zhang, X. Wang, B. Hu, G. Chen, and M. Song, “Dataset ownership verification in contrastive pre-trained models,”Proc. Int. Conf. Learn. Represent. (ICLR), 2025

  30. [30]

    Dataset inference for self-supervised models,

    A. Dziedzic, H. Duan, M. A. Kaleemet al., “Dataset inference for self-supervised models,”Proc. Adv. Neural Inf. Process. Syst. (NeurIPS), vol. 35, pp. 12 058–12 070, 2022

  31. [31]

    Bucks for buckets (b4b): Active defenses against stealing encoders,

    J. Dubi ´nski, S. Pawlak, F. Boenischet al., “Bucks for buckets (b4b): Active defenses against stealing encoders,”Advances in Neural Information Processing Systems, vol. 36, pp. 55 237–55 259, 2023

  32. [32]

    Warden: Multi-directional backdoor watermarks for embedding-as-a-service copyright protec- tion,

    A. Shetty, Y . Teng, K. He, and Q. Xu, “Warden: Multi-directional backdoor watermarks for embedding-as-a-service copyright protec- tion,”Proc. Annu. Meet. Assoc. Comput Linguist., pp. 13 430–13 444, 2024

  33. [33]

    Your fixed watermark is fragile: Towards semantic-aware watermark for eaas copyright protection,

    Z. Fei, B. Yi, J. Genget al., “Your fixed watermark is fragile: Towards semantic-aware watermark for eaas copyright protection,”arXiv e- prints, pp. arXiv–2411, 2024

  34. [34]

    Watermarking vision-language pre- trained models for multi-modal embedding as a service,

    Y . Tang, J. Yu, K. Gaiet al., “Watermarking vision-language pre- trained models for multi-modal embedding as a service,”arXiv preprint arXiv:2311.05863, 2023

  35. [35]

    Effective backdoor defense by exploiting sensitivity of poisoned samples,

    W. Chen, B. Wu, and H. Wang, “Effective backdoor defense by exploiting sensitivity of poisoned samples,”Proc. Adv. Neural Inf. Process. Syst. (NeurIPS), pp. 9727–9737, 2022

  36. [36]

    Distributions of angles in random packing on spheres,

    T. Cai, J. Fan, and T. Jiang, “Distributions of angles in random packing on spheres,”J. Mach. Learn. Res., vol. 14, no. 1, pp. 1837– 1864, 2013

  37. [37]

    Algorithm as 136: A k-means clustering algorithm,

    J. A. Hartigan and M. A. Wong, “Algorithm as 136: A k-means clustering algorithm,”Journal of the royal statistical society. series c (applied statistics), vol. 28, no. 1, pp. 100–108, 1979

  38. [38]

    A density-based algorithm for discovering clusters in large spatial databases with noise,

    M. Ester, H.-P. Kriegel, J. Sander, X. Xuet al., “A density-based algorithm for discovering clusters in large spatial databases with noise,” inkdd, vol. 96, no. 34, 1996, pp. 226–231

  39. [39]

    Distribution preserving backdoor attack in self-supervised learning,

    G. Tao, Z. Wang, S. Fenget al., “Distribution preserving backdoor attack in self-supervised learning,” inProc. IEEE Symp. Secur. Priv. (SP), 2024, pp. 2029–2047

  40. [40]

    An introduction to mathematical statistics (hoboken, nj,

    R. J. Larsen and M. L. Marx, “An introduction to mathematical statistics (hoboken, nj,” 2005

  41. [41]

    Division and union: Latent model watermarking,

    Z. Dai, Y . Gao, B. Kuanget al., “Division and union: Latent model watermarking,”IEEE Transactions on Information Forensics and Security, 2025

  42. [42]

    Badencoder: Backdoor attacks to pre- trained encoders in self-supervised learning,

    J. Jia, Y . Liu, and N. Z. Gong, “Badencoder: Backdoor attacks to pre- trained encoders in self-supervised learning,” inProc. IEEE Symp. Secur. Priv. (SP). IEEE, 2022, pp. 2043–2059

  43. [43]

    Machine unlearning: Taxonomy, metrics, applications, challenges, and prospects,

    N. Li, C. Zhou, Y . Gaoet al., “Machine unlearning: Taxonomy, metrics, applications, challenges, and prospects,”IEEE Trans. Neural Netw. Learn. Syst., 2025

  44. [44]

    Imagenet: A large-scale hierarchical image database,

    J. Deng, W. Donget al., “Imagenet: A large-scale hierarchical image database,” inProc. CVPR, 2009, pp. 248–255

  45. [45]

    Tiny imagenet visual recognition challenge,

    Y . Le and X. Yang, “Tiny imagenet visual recognition challenge,”CS 231N, vol. 7, no. 7, p. 3, 2015

  46. [46]

    https://tensorflow.google.cn/datasets/catalog/imagenette

    “https://tensorflow.google.cn/datasets/catalog/imagenette.”

  47. [47]

    https://tensorflow.google.cn/datasets/catalog/cifar100

    “https://tensorflow.google.cn/datasets/catalog/cifar100.”

  48. [48]

    Learning multiple layers of features from tiny images

    G. H. A. Krizhevskyet al., “Learning multiple layers of features from tiny images.” 2009

  49. [49]

    An analysis of single-layer networks in unsupervised feature learning,

    A. Coates, A. Ng, and H. Lee, “An analysis of single-layer networks in unsupervised feature learning,” inProceedings of the fourteenth in- ternational conference on artificial intelligence and statistics. JMLR Workshop and Conference Proceedings, 2011, pp. 215–223

  50. [50]

    Cinic-10 is not imagenet or cifar-10.arXiv preprint arXiv:1810.03505, 2018

    L. N. Darlow, E. J. Crowley, A. Antoniou, and A. J. Storkey, “Cinic- 10 is not imagenet or cifar-10,”arXiv preprint arXiv:1810.03505, 2018

  51. [51]

    http://ufldl.stanford.edu/housenumbers

    “http://ufldl.stanford.edu/housenumbers.”

  52. [52]

    The german traffic sign recogni- tion benchmark: a multi-class classification competition,

    J. Stallkamp, M. Schlipsinget al., “The german traffic sign recogni- tion benchmark: a multi-class classification competition,” inThe 2011 international joint conference on neural networks. IEEE, 2011, pp. 1453–1460. Appendix A. Algorithm A.1. Algorithm For Watermark Embedding The algorithm for the watermark embedding is presented in Algorithm 1. A.2. Alg...