Repurposing and Evaluating the (In)Feasibility of Dataset Poisoning enabled Watermarking for Contrastive Learning

Anmin Fu; Boyu Kuang; Derek Abbott; Gaurav Varshney; Haodong Li; Qi Chang; Yansong Gao; Zhiyang Dai

arXiv: 2605.01834 · v1 · submitted 2026-05-03 · cs.CR · cs.AI

Zhiyang Dai, Yansong Gao, Boyu Kuang, Haodong Li, Qi Chang, Gaurav Varshney, Derek Abbott, Anmin Fu

Pith reviewed 2026-05-10 14:44 UTC · model grok-4.3

keywords: contrastive learning, backdoor attacks, dataset watermarking, data poisoning, intellectual property protection, statistical verification, CL models, data ownership

The pith

Trigger samples from data-poisoning attacks can be repurposed as verifiable watermarks for protecting contrastive learning datasets.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper evaluates data-poisoning backdoor attacks on contrastive learning models and finds most have poor adaptability, low success rates, limited portability, and restrictive assumptions such as knowledge of downstream tasks. It observes that trigger samples nonetheless exhibit clear statistical divergence from clean samples, which can be turned into a watermark for proving dataset ownership. A unified density metric enables statistical verification, and a multi-level scheme adapts the watermark to feature-level, soft-label, or hard-label outputs. This approach matters because contrastive learning often relies on third-party or internet-scale data where ownership claims are hard to enforce. Experiments confirm that certain attacks can function as effective watermarks, albeit with trade-offs in fidelity, verifiability, and robustness.

Core claim

Trigger samples from data-poisoning backdoor attacks exhibit distinguishable statistical divergence from clean samples in contrastive learning, which can be leveraged through a unified density metric for verification and a multi-level watermarking scheme that adapts to feature-level, soft-label, or hard-label outputs, allowing weak backdoor effects to serve as reliable signals for dataset IP protection despite the original attacks' limitations.

What carries the argument

The statistical divergence of trigger samples from clean data, quantified by a unified density metric and embedded through a multi-level watermarking scheme that matches different CL output formats.
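This summary does not specify how the unified density metric is constructed. As a rough illustration only, the sketch below assumes a Gaussian kernel density estimate fitted on clean-sample features and scores each query sample by its average kernel value. The function name `kde_density`, the bandwidth, and the toy 2-D features are hypothetical and are not taken from the paper.

```python
import math
import random

def kde_density(query, reference, bandwidth=0.5):
    """Unnormalised Gaussian KDE density of `query` under the `reference` points.

    Assumption: a plain isotropic Gaussian kernel stands in for the paper's
    unified density metric, whose actual definition is not given here.
    """
    total = 0.0
    for ref in reference:
        sq_dist = sum((q - r) ** 2 for q, r in zip(query, ref))
        total += math.exp(-sq_dist / (2.0 * bandwidth ** 2))
    return total / len(reference)

# Toy data (hypothetical): clean features near the origin, trigger features shifted.
rng = random.Random(0)
clean = [[rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)] for _ in range(200)]
held_out_clean = [[rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)] for _ in range(50)]
trigger = [[rng.gauss(4.0, 0.3), rng.gauss(4.0, 0.3)] for _ in range(50)]

# Score both groups against the same clean reference set.
mean_clean = sum(kde_density(x, clean) for x in held_out_clean) / len(held_out_clean)
mean_trigger = sum(kde_density(x, clean) for x in trigger) / len(trigger)
print(f"mean density, held-out clean: {mean_clean:.4f}")
print(f"mean density, trigger:        {mean_trigger:.4f}")
```

In this toy setup the trigger samples sit in a low-density region of the clean distribution, which is the kind of divergence a density-based check would look for. Whether real trigger samples behave this way is exactly what the paper's experiments have to show.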

If this is right

Backdoor attacks with low success rates can still function as IP protection signals when paired with statistical verification.
Watermarks can be embedded without requiring knowledge of any downstream task.
A single poisoning method can support verification at feature, soft-label, or hard-label levels depending on the model output.
Dataset owners gain a practical way to assert ownership even when full backdoor success is not achieved.
Trade-offs among fidelity, verifiability, and robustness must be balanced for deployment in real CL pipelines.
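To see why a low attack success rate can still carry a verifiable signal, consider a one-sided binomial test: if a model trained without the watermark would send a trigger sample to the target class only by chance, even a modest excess becomes statistically significant once enough trigger samples are queried. The sketch below is illustrative. The chance rate, the sample counts, and the helper `binomial_tail` are assumptions, and the paper's own verification relies on a density metric rather than this particular test.

```python
import math

def binomial_tail(successes, trials, chance_rate):
    """P(X >= successes) for X ~ Binomial(trials, chance_rate)."""
    return sum(
        math.comb(trials, k) * chance_rate ** k * (1.0 - chance_rate) ** (trials - k)
        for k in range(successes, trials + 1)
    )

# Hypothetical numbers: 10 classes (chance rate 0.1), 200 trigger queries,
# of which 40 land in the target class (a 20% "attack success rate").
p_value = binomial_tail(successes=40, trials=200, chance_rate=0.1)
print(f"p-value for 40/200 hits at chance 0.1: {p_value:.2e}")
```

A 20% hit rate would be a poor backdoor, yet under these assumed numbers it is very unlikely to arise by chance, which is the intuition behind pairing weak effects with statistical verification.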

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same statistical markers might be adapted to detect unauthorized use in other self-supervised learning settings beyond contrastive learning.
Standard backdoor defenses could unintentionally strip these watermarks, creating a need for watermark-specific robustness tests.
Dataset providers may need protocols to scan for embedded statistical signatures before releasing data publicly.
Combining this technique with non-poisoning watermark methods could strengthen overall dataset protection strategies.

Load-bearing premise

Trigger samples from poisoning attacks maintain reliable statistical divergence from clean samples that can be verified without substantially harming contrastive learning performance or being removed by standard preprocessing.

What would settle it

An experiment in which common data augmentations or normalization steps used in contrastive learning eliminate the statistical divergence, rendering the density metric unable to distinguish trigger samples from clean ones.
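One way to frame such an experiment is to measure how well a score separates trigger samples from clean ones before and after a perturbation. The sketch below uses the area under the ROC curve, computed as the Mann-Whitney statistic, on toy scalar scores. The `auc` and `perturb` helpers, the additive noise used as a crude stand-in for augmentation, and all numbers are hypothetical and say nothing about the paper's actual results.

```python
import random

def auc(positive_scores, negative_scores):
    """Probability that a random positive outscores a random negative (ties count half)."""
    wins = 0.0
    for p in positive_scores:
        for n in negative_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positive_scores) * len(negative_scores))

rng = random.Random(1)
clean = [rng.gauss(0.0, 1.0) for _ in range(300)]
trigger = [rng.gauss(2.0, 1.0) for _ in range(300)]

def perturb(scores, noise_std):
    # Assumption: additive Gaussian noise is a crude proxy for augmentation effects.
    return [s + rng.gauss(0.0, noise_std) for s in scores]

auc_before = auc(trigger, clean)
auc_after = auc(perturb(trigger, 5.0), perturb(clean, 5.0))
print(f"AUC before perturbation: {auc_before:.3f}")
print(f"AUC after perturbation:  {auc_after:.3f}")
```

An AUC that falls toward 0.5 after realistic augmentations would indicate that the divergence has been eliminated. An AUC that stays high would support the load-bearing premise.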

Figures

Figures reproduced from arXiv: 2605.01834 by Anmin Fu, Boyu Kuang, Derek Abbott, Gaurav Varshney, Haodong Li, Qi Chang, Yansong Gao, Zhiyang Dai.

**Figure 1.** Comparison of model accuracy and attack success rate under different backdoor attacks and model structures on CIFAR10.

**Figure 2.** Comparison of model accuracy and attack success rate under different backdoor attacks and model structures on ImageNet100.

**Figure 3.** The process of repurposing feasible data-poisoning-only backdoor attacks in CL into a dataset watermarking method. The framework consists …

**Figure 4.** Watermarked sample example: CIFAR10 (left), ImageNet100 (right).

**Figure 5.** Comparison of TPR and FPR for SSL-Backdoor, CTRL, BLTO and NA under different thresholds (feature, soft-label, and hard-label levels).

Original abstract

Contrastive learning (CL) reduces annotation cost via auto-derived supervisory signals. Since large-scale in-house CL datasets are infeasible, reliance on third-party or internet data is common. Recent studies show CL models are vulnerable to data-poisoning backdoor attacks, but their generalization and robustness are underexplored. We systematically evaluate existing data-poisoning backdoor attacks on CL, revealing limitations: poor dataset adaptability, low success rates, limited portability, and restrictive assumptions (e.g., downstream task knowledge). Interestingly, trigger samples exhibit distinguishable statistical divergence from clean samples, which inspires repurposing it as a watermark for dataset IP protection. Direct repurposing is challenging due to low success rates; we overcome this by statistical verification using a unified density metric. We further propose a multi-level watermarking scheme adapting to feature-level, soft-label, or hard-label outputs in CL. Experiments show some backdoor attacks can be repurposed as effective watermarks with trade-offs among fidelity, verifiability, and robustness. This work demonstrates weak backdoor effects become reliable signals for dataset IP protection in challenging CL settings.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note: private letter to a colleague

This paper repurposes backdoor triggers as watermarks for contrastive learning datasets using a density metric and multi-level scheme, but the robustness under real CL augmentations is not clearly established.

The main point is that they take existing poisoning attacks, note their low success rates on CL, spot statistical differences in trigger samples, and try to turn those into verifiable watermarks with a unified density check plus a multi-level design for different output types. That repurposing angle and the adaptation to CL are the actual new pieces here. The systematic check of prior attack limitations is also useful as a starting point for why direct backdoors don't transfer well to self-supervised settings. They report that some attacks can be made to work with the usual fidelity-robustness trade-offs, which at least frames the problem concretely for dataset IP protection. The multi-level scheme shows they considered how CL models produce features versus labels.

The soft spot is the verification step itself. The density metric has to remain reliable after CL training, which includes heavy random crops, flips, color jitter, and blur. The abstract does not describe whether the metric is applied to raw triggers, post-augmentation samples, or model outputs, nor how it is calibrated or tested against standard preprocessing. Without those details or any reported rates, baselines, or ablation numbers, it is hard to judge if the divergence actually survives in practice or just looks good on paper.

This is aimed at researchers working on data provenance and poisoning in self-supervised learning. A reader in that area would pick up the idea and the evaluation of attack weaknesses, but would need the full experiments to decide if the claims hold. I would send it for peer review. The direction is worth referee time even if the current write-up needs tighter evidence on the metric's invariance.

Referee Report

3 major / 2 minor

Summary. The paper evaluates limitations of existing data-poisoning backdoor attacks when applied to contrastive learning (CL), including poor adaptability, low success rates, and restrictive assumptions. It observes statistical divergence between trigger and clean samples, repurposes this divergence as a dataset watermark via a unified density metric for statistical verification, and introduces a multi-level scheme supporting feature-level, soft-label, and hard-label outputs. Experiments are reported to show that certain backdoor attacks can be turned into effective watermarks, albeit with trade-offs in fidelity, verifiability, and robustness.

Significance. If the central claims hold, the work would demonstrate a practical route to dataset IP protection in CL settings by converting weak poisoning signals into verifiable watermarks without requiring new attack machinery. The systematic evaluation of backdoor limitations on CL is a clear positive contribution; the multi-level adaptation to different CL output formats could broaden applicability if the density metric proves stable.

major comments (3)

[Abstract and Experiments section] The claim that 'experiments show some backdoor attacks can be repurposed as effective watermarks' is not supported by any reported quantitative success rates, baseline comparisons against non-poisoning watermarking methods, or ablation results on the unified density metric; without these, the central repurposing claim cannot be assessed for practical utility.
[Section describing the unified density metric] No calibration procedure, threshold selection method, or invariance analysis under standard CL augmentations (random crops, color jitter, Gaussian blur) is provided; the skeptic's concern that trigger divergence collapses under these operations directly undermines the verifiability guarantee required for a robust watermark.
[Multi-level watermarking scheme (feature/soft/hard-label variants)] The paper does not report how the density metric is adapted across output types or whether fidelity to the original CL objective is preserved; this is load-bearing for the claim that the approach works 'in challenging CL settings.'

minor comments (2)

[Abstract] The abstract would be strengthened by including at least one key quantitative result (e.g., watermark detection accuracy or downstream accuracy drop) to convey the scale of the reported trade-offs.
[Method section] Notation for the unified density metric should be defined explicitly with a formula or pseudocode early in the method section to avoid ambiguity when comparing trigger versus clean distributions.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the constructive and detailed feedback on our manuscript. We address each major comment point by point below, providing clarifications where possible and committing to revisions that strengthen the presentation of our results without overstating the current content.

Referee: [Abstract and Experiments section] The claim that 'experiments show some backdoor attacks can be repurposed as effective watermarks' is not supported by any reported quantitative success rates, baseline comparisons against non-poisoning watermarking methods, or ablation results on the unified density metric; without these, the central repurposing claim cannot be assessed for practical utility.

Authors: We acknowledge that the experiments section reports verification performance using the density metric but does not include the specific quantitative success rates, baseline comparisons to non-poisoning watermarking methods, or ablations on the density metric that the referee requests. To enable proper assessment of the repurposing claim, we will revise the experiments section to add explicit numerical success rates for watermark verification, baseline comparisons against existing non-poisoning watermarking approaches, and ablation studies isolating components of the unified density metric.
Revision: yes
Referee: [Section describing the unified density metric] No calibration procedure, threshold selection method, or invariance analysis under standard CL augmentations (random crops, color jitter, Gaussian blur) is provided; the skeptic's concern that trigger divergence collapses under these operations directly undermines the verifiability guarantee required for a robust watermark.

Authors: The referee correctly notes the absence of these methodological details in the current description of the unified density metric. We will add a new subsection that specifies the calibration procedure, the threshold selection method (e.g., via empirical quantiles on clean samples), and empirical invariance analysis under standard CL augmentations including random crops, color jitter, and Gaussian blur. This will either demonstrate stability of the statistical divergence or clearly delineate the conditions under which the verifiability guarantee holds.
Revision: yes
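The threshold selection mentioned in this response, empirical quantiles on clean samples, could look roughly like the following. This is a sketch under stated assumptions: scores are scalars where higher means more trigger-like, and the helpers `quantile_threshold` and `rates`, the target false-positive rate, and the toy scores are all hypothetical rather than the authors' procedure.

```python
import random

def quantile_threshold(clean_scores, target_fpr):
    """Clean score that at most a `target_fpr` fraction of clean scores exceed."""
    ordered = sorted(clean_scores)
    allowed_exceedances = int(target_fpr * len(ordered))
    index = len(ordered) - 1 - allowed_exceedances
    return ordered[max(0, index)]

def rates(trigger_scores, clean_scores, threshold):
    """True-positive and false-positive rates when flagging scores above the threshold."""
    tpr = sum(s > threshold for s in trigger_scores) / len(trigger_scores)
    fpr = sum(s > threshold for s in clean_scores) / len(clean_scores)
    return tpr, fpr

# Toy scores (hypothetical): calibrate on one clean set, evaluate on fresh samples.
rng = random.Random(2)
calibration_clean = [rng.gauss(0.0, 1.0) for _ in range(1000)]
test_clean = [rng.gauss(0.0, 1.0) for _ in range(1000)]
test_trigger = [rng.gauss(2.5, 1.0) for _ in range(1000)]

threshold = quantile_threshold(calibration_clean, target_fpr=0.05)
tpr, fpr = rates(test_trigger, test_clean, threshold)
print(f"threshold={threshold:.3f}  TPR={tpr:.3f}  FPR={fpr:.3f}")
```

Calibrating on clean scores alone fixes the false-positive rate in advance and needs no access to trigger samples, so the true-positive rate is then whatever the divergence supports. Sweeping `target_fpr` would trace out a TPR/FPR curve of the kind Figure 5 reports.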
Referee: [Multi-level watermarking scheme (feature/soft/hard-label variants)] The paper does not report how the density metric is adapted across output types or whether fidelity to the original CL objective is preserved; this is load-bearing for the claim that the approach works 'in challenging CL settings.'

Authors: We agree that the adaptation of the density metric across output types and the preservation of fidelity to the CL objective require explicit reporting. The metric is applied directly to the respective representations (embeddings for feature-level, probability vectors for soft-label, and discrete predictions for hard-label). We will expand the scheme description to detail this adaptation and include new experimental results quantifying fidelity (e.g., change in contrastive loss and downstream accuracy) for each variant to support the claim in challenging CL settings.
Revision: yes
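The adaptation described in this response, one metric applied to embeddings, probability vectors, or discrete predictions, can be pictured as mapping each output type to a vector and then computing a common statistic. The sketch below is an illustration built on assumptions: `to_vector`, `cosine`, `concentration`, the one-hot encoding of hard labels, and mean pairwise cosine similarity as the shared statistic are hypothetical choices, not the paper's scheme.

```python
import math
from itertools import combinations

def to_vector(output, level, num_classes=None):
    """Map a model output to a vector; hard labels become one-hot vectors."""
    if level in ("feature", "soft"):
        return [float(v) for v in output]
    if level == "hard":
        return [1.0 if i == output else 0.0 for i in range(num_classes)]
    raise ValueError(f"unknown level: {level}")

def cosine(u, v):
    """Cosine similarity, defined as 0.0 when either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm > 0.0 else 0.0

def concentration(vectors):
    """Mean pairwise cosine similarity; higher means outputs cluster more tightly."""
    pairs = list(combinations(vectors, 2))
    return sum(cosine(u, v) for u, v in pairs) / len(pairs)

# Toy hard-label outputs (hypothetical): trigger samples mostly collapse to class 3.
trigger_labels = [3, 3, 3, 7, 3, 3]
clean_labels = [0, 1, 2, 3, 4, 5]
trigger_score = concentration([to_vector(y, "hard", num_classes=10) for y in trigger_labels])
clean_score = concentration([to_vector(y, "hard", num_classes=10) for y in clean_labels])
print(f"hard-label concentration: trigger={trigger_score:.3f}, clean={clean_score:.3f}")
```

The same `concentration` call accepts embeddings or probability vectors once they pass through `to_vector`, which is the sense in which a single statistic can serve all three levels. Hard labels discard the most information, so one would expect the signal to be weakest there.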

Circularity Check

0 steps flagged

No significant circularity; empirical observation of divergence plus new verification metric adds independent content.

The paper starts from published backdoor attacks, empirically notes distinguishable statistical divergence in trigger samples, and introduces a unified density metric plus multi-level scheme to repurpose them as watermarks. This chain does not reduce to self-definition, fitted parameters renamed as predictions, or load-bearing self-citations. The central results (trade-offs in fidelity/verifiability/robustness) are presented as experimental outcomes rather than quantities forced by the inputs. No equations or derivations in the provided text exhibit the reduction patterns; the work is self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract-only review yields no explicit free parameters, axioms, or invented entities; the unified density metric is invoked as a verification tool but its construction and any thresholds are not specified.

Zhiyang Dai received the bachelor’s degree in Qian Xuesen College from Nanjing University of Science and Technology, Nanjing, China, in 2021, where he is currently pursuing the Ph.D. degree with the School of Cyber Science and Engineering from Nanjing University of Science and Technology, Nanjin...


work page 2023

[9] [9]

Data poisoning based backdoor attacks to contrastive learning,

J. Zhang, H. Liu, J. Jia, and N. Z. Gong, “Data poisoning based backdoor attacks to contrastive learning,” inIEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR, 2024, pp. 24 357–24 366

work page 2024

[10] [10]

Momentum contrast for unsupervised visual representation learning,

K. He, H. Fan, Y . Wuet al., “Momentum contrast for unsupervised visual representation learning,” inIEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR, 2020, pp. 9729–9738

work page 2020

[11] [11]

Improved baselines with momentum contrastive learning,

X. Chen, H. Fan, R. Girshick, and K. He, “Improved baselines with momentum contrastive learning,”arXiv preprint, 2020

work page 2020

[12] [12]

Backdoor attacks on self-supervised learning,

A. Saha, A. Tejankar, S. A. Koohpayegani, and H. Pirsiavash, “Backdoor attacks on self-supervised learning,” inIEEE/CVF Conference on Com- puter Vision and Pattern Recognition, CVPR, 2022, pp. 13 337–13 346

work page 2022

[13] [13]

Poisonedencoder: Poisoning the unlabeled pre-training data in contrastive learning,

H. Liu, J. Jia, and N. Z. Gong, “Poisonedencoder: Poisoning the unlabeled pre-training data in contrastive learning,” inUSENIX Security Symposium, USENIX Security, 2022, pp. 3629–3645

work page 2022

[14] [14]

An embarrassingly simple backdoor attack on self-supervised learning,

C. Li, R. Pang, Z. Xiet al., “An embarrassingly simple backdoor attack on self-supervised learning,” inIEEE/CVF International Conference on Computer Vision, ICCV, 2023, pp. 4367–4378

work page 2023

[15] [15]

Backdoor contrastive learning via bi-level trigger optimization,

W. Sun, X. Zhang, H. Luet al., “Backdoor contrastive learning via bi-level trigger optimization,” inInternational Conference on Learning Representations, ICLR, 2024

work page 2024

[16] [16]

Backdooring self-supervised contrastive learning by noisy alignment,

T. Chen, J. Gui, M. Dong, J. Jia, L. Fang, and J. Liu, “Backdooring self-supervised contrastive learning by noisy alignment,” inIEEE/CVF International Conference on Computer Vision, ICCV, 2025

work page 2025

[17] [17]

Authors up in arms,

Baird Holm LLP, “Authors up in arms,” September 2025. [Online]. Available: https://www.bairdholm.com/blog/authors-up-in-arms/

work page 2025

[18] [18]

Authors sue meta platforms over copyright infringement in ai training dataset,

Mogin Law LLP, “Authors sue meta platforms over copyright infringement in ai training dataset,” June 2025. [Online]. Available: https://www.lexology.com/library/detail.aspx?g= 0550195e-6912-4864-8a31-426796844a56

work page 2025

[19] [19]

Sslguard: A watermarking scheme for self-supervised learning pre-trained encoders,

T. Cong, X. He, and Y . Zhang, “Sslguard: A watermarking scheme for self-supervised learning pre-trained encoders,” inACM SIGSAC Conference on Computer and Communications Security, CCS, 2022, pp. 579–593

work page 2022

[20] [20]

SSL-WM: A black-box watermarking approach for encoders pre-trained by self-supervised learning,

P. Lv, P. Liet al., “SSL-WM: A black-box watermarking approach for encoders pre-trained by self-supervised learning,” inDistributed System Security Symposium, NDSS, 2024

work page 2024

[21] [21]

Watermarking pre-trained encoders in contrastive learning,

Y . Wu, H. Qiu, T. Zhanget al., “Watermarking pre-trained encoders in contrastive learning,” inInternational Conference on Data Intelligence and Security, ICDIS 2022, 2022, pp. 228–233

work page 2022

[22] [22]

Fit-print: Towards false-claim-resistant model ownership verification via targeted fingerprint,

S. Shao, H. Zhu, Y . Liet al., “Fit-print: Towards false-claim-resistant model ownership verification via targeted fingerprint,”arXiv preprint, 2025

work page 2025

[23] [23]

Pointncbw: Toward dataset own- ership verification for point clouds via negative clean-label backdoor watermark,

C. Wei, Y . Wang, K. Gaoet al., “Pointncbw: Toward dataset own- ership verification for point clouds via negative clean-label backdoor watermark,”IEEE Transactions on Information Forensics and Security, vol. 20, pp. 191–206, 2025

work page 2025

[24] [24]

Entropymark: Towards more harm- less backdoor watermark via entropy-based constraint for open-source dataset copyright protection,

M. Sun, R. Wang, Z. Zhuet al., “Entropymark: Towards more harm- less backdoor watermark via entropy-based constraint for open-source dataset copyright protection,” inIEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR, 2025, pp. 30 692–30 701

work page 2025

[25] [25]

BERT: pre-training of deep bidirectional transformers for language understanding,

J. Devlin, M. Chang, K. Lee, and K. Toutanova, “BERT: pre-training of deep bidirectional transformers for language understanding,” inNorth American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2019, pp. 4171–4186

work page 2019

[26] [26]

Context encoders: Feature learning by inpainting,

D. Pathak, P. Kr ¨ahenb¨uhl, J. Donahueet al., “Context encoders: Feature learning by inpainting,” inIEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR, 2016, pp. 2536–2544

work page 2016

[27] [27]

A simple framework for contrastive learning of visual representations,

T. Chen, S. Kornblith, M. Norouzi, and G. Hinton, “A simple framework for contrastive learning of visual representations,” inInternational conference on machine learning, ICML, 2020, pp. 1597–1607

work page 2020

[28] [28]

Exploring simple siamese representation learning,

X. Chen and K. He, “Exploring simple siamese representation learning,” inIEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR, 2021, pp. 15 750–15 758

work page 2021

[29] [29]

Bootstrap your own latent - A new approach to self-supervised learning,

J. Grill, F. Strubet al., “Bootstrap your own latent - A new approach to self-supervised learning,” inNeural Information Processing Systems, NeurIPS, 2020

work page 2020

[30] [30]

Generative adver- sarial networks,

I. J. Goodfellow, J. Pouget-Abadie, M. Mirzaet al., “Generative adver- sarial networks,”arXiv preprint, 2014

work page 2014

[31] [31]

Auto-encoding variational bayes,

D. P. Kingma and M. Welling, “Auto-encoding variational bayes,” in International Conference on Learning Representations, ICLR, 2014

work page 2014

[32] [32]

Poisoning and backdooring contrastive learn- ing,

N. Carlini and A. Terzis, “Poisoning and backdooring contrastive learn- ing,” inInternational Conference on Learning Representations, ICLR, 2022

work page 2022

[33] [33]

Badencoder: Backdoor attacks to pre- trained encoders in self-supervised learning,

J. Jia, Y . Liu, and N. Z. Gong, “Badencoder: Backdoor attacks to pre- trained encoders in self-supervised learning,” inIEEE Symposium on Security and Privacy, SP, 2022, pp. 2043–2059

work page 2022

[34] [34]

Ghostencoder: Stealthy backdoor attacks with dynamic triggers to pre-trained encoders in self-supervised learning,

Q. Wang, C. Yin, L. Fanget al., “Ghostencoder: Stealthy backdoor attacks with dynamic triggers to pre-trained encoders in self-supervised learning,”Comput. Secur., vol. 142, p. 103855, 2024

work page 2024

[35] [35]

Badclip: Dual-embedding guided backdoor attack on multimodal contrastive learning,

S. Liang, M. Zhu, A. Liuet al., “Badclip: Dual-embedding guided backdoor attack on multimodal contrastive learning,” inIEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR, 2024, pp. 24 645–24 654

work page 2024

[36] [36]

Badclip: Trigger-aware prompt learning for backdoor attacks on CLIP,

J. Bai, K. Gao, S. Minet al., “Badclip: Trigger-aware prompt learning for backdoor attacks on CLIP,” inIEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR, 2024, pp. 24 239–24 250

work page 2024

[37] [37]

Distribution preserving backdoor attack in self-supervised learning,

G. Tao, Z. Wang, S. Fenget al., “Distribution preserving backdoor attack in self-supervised learning,” inIEEE Symposium on Security and Privacy, SP, 2024, pp. 2029–2047

work page 2024

[38] [38]

A reliable data-based bandwidth selection method for kernel density estimation,

S. J. Sheather and M. C. Jones, “A reliable data-based bandwidth selection method for kernel density estimation,”Journal of the Royal Statistical Society: Series B (Methodological), vol. 53, no. 3, pp. 683– 690, 1991. JOURNAL OF LATEX CLASS FILES, VOL. X, NO. X, MAY 2026 15

work page 1991

[39] [39]

Generalized sliced wasser- stein distances,

S. Kolouri, K. Nadjahi, U. Simsekliet al., “Generalized sliced wasser- stein distances,” inNeural Information Processing Systems, NeurIPS, 2019, pp. 261–272

work page 2019

[40] [40]

Did you train on my dataset? towards public dataset protection with clean-label backdoor watermarking,

R. Tang, Q. Feng, N. Liuet al., “Did you train on my dataset? towards public dataset protection with clean-label backdoor watermarking,” arXiv preprint, 2023

work page 2023

[41] [41]

Untargeted backdoor watermark: Towards harmless and stealthy dataset copyright protection,

Y . Li, Y . Bai, Y . Jianget al., “Untargeted backdoor watermark: Towards harmless and stealthy dataset copyright protection,” inNeural Informa- tion Processing Systems, NeurIPS, 2022

work page 2022

[42] [42]

Open-sourced dataset protection via backdoor watermarking,

Y . Li, Z. Zhang, J. Baiet al., “Open-sourced dataset protection via backdoor watermarking,”arXiv preprint, 2020

work page 2020

[43] [43]

Dataset inference: Ownership resolution in machine learning,

P. Maini, M. Yaghini, and N. Papernot, “Dataset inference: Ownership resolution in machine learning,” inInternational Conference on Learning Representations, ICLR, 2021

work page 2021

[44] [44]

Label- only membership inference attacks,

C. A. Choquette-Choo, F. Tram `er, N. Carlini, and N. Papernot, “Label- only membership inference attacks,” inInternational Conference on Machine Learning, ICML, vol. 139, 2021, pp. 1964–1974

work page 2021

[45] [45]

Radioactive data: tracing through training,

A. Sablayrolles, M. Douze, C. Schmid, and H. J ´egou, “Radioactive data: tracing through training,” inInternational Conference on Machine Learning, ICML, vol. 119, 2020, pp. 8326–8335

work page 2020

[46] [46]

Dataset inference for self-supervised models,

A. Dziedzic, H. Duanet al., “Dataset inference for self-supervised models,” inNeural Information Processing Systems, NeurIPS, 2022

work page 2022

[47] [47]

Dataset ownership verification in contrastive pre-trained models,

Y . Xie, J. Songet al., “Dataset ownership verification in contrastive pre-trained models,” inInternational Conference on Learning Repre- sentations, ICLR, 2025

work page 2025

[48] [48]

A dwt, dct and svd based watermarking technique to protect the image piracy,

M. M. Rahman, “A dwt, dct and svd based watermarking technique to protect the image piracy,”International Journal of Managing Public Sector Information & Communication Technologies, vol. 4, no. 2, pp. 21–32, 2013

work page 2013

[49] [49]

Learning multiple layers of features from tiny images,

A. Krizhevsky, G. Hintonet al., “Learning multiple layers of features from tiny images,” 2009

work page 2009

[50] [50]

Deep residual learning for image recognition,

K. He, X. Zhanget al., “Deep residual learning for image recognition,” inIEEE Conference on Computer Vision and Pattern Recognition, CVPR, 2016, pp. 770–778

work page 2016

[51] [51]

Imagenet: A large-scale hierarchical image database,

J. Deng, W. Dong, R. Socheret al., “Imagenet: A large-scale hierarchical image database,” inIEEE Computer Society Conference on Computer Vision and Pattern Recognition, CVPR, 2009, pp. 248–255

work page 2009

[52] [52]

Image quality assess- ment: from error visibility to structural similarity,

Z. Wang, A. Bovik, H. Sheikh, and E. Simoncelli, “Image quality assess- ment: from error visibility to structural similarity,”IEEE Transactions on Image Processing, vol. 13, no. 4, pp. 600–612, 2004

work page 2004

[53] [53]

The unreasonable effectiveness of deep features as a perceptual metric,

R. Zhang, P. Isola, A. A. Efroset al., “The unreasonable effectiveness of deep features as a perceptual metric,” inIEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR, 2018, pp. 586–595

work page 2018

[54] [54]

Dreamsim: Learning new dimen- sions of human visual similarity using synthetic data,

S. Fu, N. Tamir, S. Sundaramet al., “Dreamsim: Learning new dimen- sions of human visual similarity using synthetic data,”arXiv preprint, 2023

work page 2023

[55] [55]

An analysis of single-layer networks in unsupervised feature learning,

A. Coates, A. Y . Ng, and H. Lee, “An analysis of single-layer networks in unsupervised feature learning,” inInternational Conference on Artificial Intelligence and Statistics, AISTATS, vol. 15, 2011, pp. 215–223

work page 2011

[56] [56]

https://tensorflow.google.cn/datasets/catalog/imagenette

“https://tensorflow.google.cn/datasets/catalog/imagenette.” Zhiyang Daireceived the bachelor’s degree in Qian Xuesen College from Nanjing University of Science and Technology, Nanjing, China, in 2021, where he is currently pursuing the Ph.D. degree with the School of Cyber Science and Engineering from Nan- jing University of Science and Technology, Nanjin...

work page 2021