BlackMarks: Blackbox Multibit Watermarking for Deep Neural Networks

Bita Darvish Rouhani; Farinaz Koushanfar; Huili Chen

arXiv: 1904.00344 · v1 · pith:TBJ2ZCRNnew · submitted 2019-03-31 · cs.MM · cs.CR

classification cs.MM cs.CR

keywords blackmarks, model, owner, signature, watermark, binary, black-box, corresponding

Deep Neural Networks (DNNs) have created a paradigm shift in our ability to comprehend raw data in various important fields ranging from computer vision and natural language processing to intelligence warfare and healthcare. While DNNs are increasingly deployed either in a white-box setting where the model internals are publicly known, or a black-box setting where only the model outputs are known, a practical concern is protecting the models against Intellectual Property (IP) infringement. We propose BlackMarks, the first end-to-end multi-bit watermarking framework that is applicable in the black-box scenario. BlackMarks takes the pre-trained unmarked model and the owner's binary signature as inputs and outputs the corresponding marked model with a set of watermark keys. To do so, BlackMarks first designs a model-dependent encoding scheme that maps all possible classes in the task to bit '0' and bit '1' by clustering the output activations into two groups. Given the owner's watermark signature (a binary string), a set of key image and label pairs is designed using targeted adversarial attacks. The watermark (WM) is then embedded in the prediction behavior of the target DNN by fine-tuning the model with the generated WM key set. To extract the WM, the remote model is queried with the WM key images and the owner's signature is decoded from the corresponding predictions according to the designed encoding scheme. We perform a comprehensive evaluation of BlackMarks' performance on the MNIST, CIFAR10, and ImageNet datasets and corroborate its effectiveness and robustness. BlackMarks preserves the functionality of the original DNN and incurs negligible WM embedding runtime overhead, as low as 2.054%.
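
The encoding and extraction steps described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not say which clustering algorithm is used, so a plain 2-means over per-class mean activations is assumed here, and the function names (build_encoding, decode_signature, bit_error_rate), the toy activations, and the hard-coded predictions standing in for queries to a remote model are all hypothetical. Key generation via targeted adversarial attacks and the fine-tuning step are left out.

```python
import numpy as np

def build_encoding(class_activations, n_iter=20):
    """Map each class to bit 0 or 1 by clustering per-class activations.

    ASSUMPTION: a simple 2-means is used; the abstract only says the
    output activations are clustered into two groups.
    class_activations: array of shape (num_classes, dim).
    """
    acts = np.asarray(class_activations, dtype=float)
    # Deterministic init: class 0 and the class farthest from it.
    far = int(np.argmax(np.linalg.norm(acts - acts[0], axis=1)))
    centroids = np.stack([acts[0], acts[far]])
    for _ in range(n_iter):
        dists = np.linalg.norm(acts[:, None, :] - centroids[None, :, :], axis=2)
        bits = np.argmin(dists, axis=1)
        for b in (0, 1):
            if np.any(bits == b):  # skip the update if a group is empty
                centroids[b] = acts[bits == b].mean(axis=0)
    return bits

def decode_signature(predicted_labels, class_to_bit):
    """Translate the predicted class of each key image into one bit."""
    return class_to_bit[np.asarray(predicted_labels)]

def bit_error_rate(decoded, signature):
    """Fraction of decoded bits that differ from the owner's signature."""
    return float(np.mean(np.asarray(decoded) != np.asarray(signature)))

# Toy data: 4 classes whose mean activations form two obvious groups.
toy_activations = [[0.0, 0.0], [1.0, 0.0], [10.0, 10.0], [9.0, 10.0]]
encoding = build_encoding(toy_activations)   # array([0, 0, 1, 1])

# Hypothetical predictions returned for four key images.
predictions = [0, 2, 3, 1]
signature = np.array([0, 1, 1, 0])
decoded = decode_signature(predictions, encoding)
ber = bit_error_rate(decoded, signature)
print(encoding, decoded, ber)
```

In this sketch a bit error rate of zero means every key image was classified into a class from the group matching its signature bit. How a real ownership decision is made from the decoded bits is not specified in the abstract.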

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Dynamics-Level Watermarking of Flow Matching Models with Random Codes
cs.LG 2026-05 unverdicted novelty 7.0

Presents dynamics-level watermarking for flow matching models via random coding over continuous channels, embedding key-dependent perturbations in the velocity field that preserve the generated distribution and enable...