arxiv: 2304.13850 · v3 · pith:46VVZHRBnew · submitted 2023-04-26 · 💻 cs.CV · cs.CR · cs.LG

Do SSL Models Have Déjà Vu? A Case of Unintended Memorization in Self-supervised Learning

classification 💻 cs.CV cs.CR cs.LG
keywords memorization, learning, models, algorithms, different, image, parts, self-supervised

Self-supervised learning (SSL) algorithms can produce useful image representations by learning to associate different parts of natural images with one another. However, when taken to the extreme, SSL models can unintentionally memorize specific parts of individual training samples rather than learning semantically meaningful associations. In this work, we perform a systematic study of the unintended memorization of image-specific information in SSL models, which we refer to as déjà vu memorization. Concretely, we show that given the trained model and a crop of a training image containing only the background (e.g., water, sky, grass), it is possible to infer the foreground object with high accuracy or even visually reconstruct it. Furthermore, we show that déjà vu memorization is common to different SSL algorithms, is exacerbated by certain design choices, and cannot be detected by conventional techniques for evaluating representation quality. Our study of déjà vu memorization reveals previously unknown privacy risks in SSL models and suggests potential practical mitigation strategies. Code is available at https://github.com/facebookresearch/DejaVu.
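To make the background-crop inference concrete, the following is a minimal sketch, not the authors' implementation (see the linked repository for that). It assumes the attacker already has an embedding of the background crop from the trained model, plus a labeled set of reference embeddings, and guesses the foreground label by a nearest-neighbor vote. The function names, the use of cosine similarity, the choice of k, and the toy vectors and labels are all hypothetical.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def knn_label_vote(query_emb, ref_embs, ref_labels, k=3):
    """Guess a label for query_emb by majority vote over the k
    reference embeddings most similar to it (hypothetical helper)."""
    ranked = sorted(
        range(len(ref_embs)),
        key=lambda i: cosine(query_emb, ref_embs[i]),
        reverse=True,
    )
    votes = Counter(ref_labels[i] for i in ranked[:k])
    return votes.most_common(1)[0][0]


# Toy stand-ins for model outputs; a real attack would embed images
# with the trained SSL model instead of using hand-written vectors.
ref_embs = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9], [0.8, 0.2]]
ref_labels = ["boat", "boat", "cow", "cow", "boat"]
crop_emb = [0.95, 0.05]  # embedding of a background-only crop

predicted = knn_label_vote(crop_emb, ref_embs, ref_labels, k=3)
print(predicted)
```

In this sketch, a guess that is consistently more accurate on training images than on held-out images would point to memorization rather than a genuine background-to-object correlation; how the paper separates the two is not stated in the abstract.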

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Dataset Watermarking for Closed LLMs with Provable Detection

    cs.LG 2026-05 unverdicted novelty 7.0

    A new watermarking method for closed LLMs boosts random word-pair co-occurrences via rephrasing and detects the signal statistically in outputs, working reliably even when the watermarked data is only 1% of fine-tunin...