pith. machine review for the scientific record.

arxiv: 1412.0035 · v1 · submitted 2014-11-26 · 💻 cs.CV

Recognition: unknown

Understanding Deep Image Representations by Inverting Them

Authors on Pith: no claims yet
classification 💻 cs.CV
keywords: image representations, CNNs, understanding, information, invert, question, recent
Abstract

Image representations, from SIFT and Bag of Visual Words to Convolutional Neural Networks (CNNs), are a crucial component of almost any image understanding system. Nevertheless, our understanding of them remains limited. In this paper we conduct a direct analysis of the visual information contained in representations by asking the following question: given an encoding of an image, to which extent is it possible to reconstruct the image itself? To answer this question we contribute a general framework to invert representations. We show that this method can invert representations such as HOG and SIFT more accurately than recent alternatives while being applicable to CNNs too. We then use this technique to study the inverse of recent state-of-the-art CNN image representations for the first time. Among our findings, we show that several layers in CNNs retain photographically accurate information about the image, with different degrees of geometric and photometric invariance.
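The framework the abstract describes poses inversion as an optimization problem: given a target code Φ(x₀), search for an image x minimizing ‖Φ(x) − Φ(x₀)‖² plus a regularizer. As a minimal, hedged sketch of that idea, the toy example below uses a random linear projection as the "representation" and plain gradient descent, so it stays self-contained; the paper itself inverts HOG, SIFT, and CNN features with natural-image priors. All names here (`phi`, `invert_representation`, the hyperparameters) are illustrative assumptions, not from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def phi(x, W):
    """Toy differentiable 'representation': a linear projection (stand-in
    for HOG/SIFT/CNN features, which the paper actually inverts)."""
    return W @ x

def invert_representation(target_code, W, lam=1e-3, lr=0.005, steps=4000):
    """Gradient descent on ||phi(x) - target_code||^2 + lam * ||x||^2,
    the simplest instance of the inversion objective in the abstract."""
    x = rng.standard_normal(W.shape[1])          # random initialization
    for _ in range(steps):
        residual = phi(x, W) - target_code       # Phi(x) - Phi(x0)
        grad = 2 * W.T @ residual + 2 * lam * x  # analytic gradient
        x -= lr * grad
    return x

W = rng.standard_normal((8, 16))   # 16-dim 'image' -> 8-dim code (lossy)
x0 = rng.standard_normal(16)       # ground-truth 'image'
x_rec = invert_representation(phi(x0, W), W)

# The reconstruction's code should closely match the target code, even
# though x itself is underdetermined -- mirroring the paper's point that
# invariances of the representation leave parts of the image unrecoverable.
print(np.linalg.norm(phi(x_rec, W) - phi(x0, W)))
```

Because the projection discards half the dimensions, many images share the same code; the regularizer picks one reconstruction among them, which is the role the paper's natural-image priors play for real feature extractors.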

This paper has not been read by Pith yet.

discussion (0)


Forward citations

Cited by 2 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Deep Dreams Are Made of This: Visualizing Monosemantic Features in Diffusion Models

    cs.LG · 2026-05 · unverdicted · novelty 7.0

    LVO applies optimization-based feature visualization to latent diffusion models after disentangling their representations with sparse autoencoders, yielding recognizable concept images on a fine-tuned Stable Diffusion...

  2. Open Problems in Mechanistic Interpretability

    cs.LG · 2025-01 · unverdicted · novelty 3.0

    A review paper that organizes conceptual, practical, and socio-technical open problems in mechanistic interpretability.