pith. machine review for the scientific record.

arxiv: 1904.00760 · v1 · submitted 2019-03-20 · 💻 cs.CV · cs.LG · stat.ML

Approximating CNNs with Bag-of-local-Features models works surprisingly well on ImageNet

keywords: features, image, deep, imagenet, architecture, bag-of-feature, decisions, dnns

Deep Neural Networks (DNNs) excel on many complex perceptual tasks, but it has proven notoriously difficult to understand how they reach their decisions. Here we introduce a high-performance DNN architecture on ImageNet whose decisions are considerably easier to explain. Our model, a simple variant of the ResNet-50 architecture called BagNet, classifies an image based on the occurrences of small local image features without taking into account their spatial ordering. This strategy is closely related to the bag-of-feature (BoF) models popular before the onset of deep learning and reaches a surprisingly high accuracy on ImageNet (87.6% top-5 for 33 x 33 px features and AlexNet performance for 17 x 17 px features). The constraint on local features makes it straightforward to analyse how exactly each part of the image influences the classification. Furthermore, the BagNets behave similarly to state-of-the-art deep neural networks such as VGG-16, ResNet-152 or DenseNet-169 in terms of feature sensitivity, error distribution and interactions between image parts. This suggests that the improvements of DNNs over previous bag-of-feature classifiers in the last few years are mostly achieved by better fine-tuning rather than by qualitatively different decision strategies.
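
The bag-of-local-features idea in the abstract can be illustrated with a minimal sketch in plain Python. It is not the BagNet implementation: the abstract only says that small local features are used without regard to spatial ordering, so the sketch assumes that per-patch class evidence is combined by a plain average. The names `extract_patches`, `bag_logits` and `toy_patch_logits` are hypothetical, and `toy_patch_logits` is a hand-written stand-in for the learned network that would score each patch.

```python
from typing import Callable, List

Image = List[List[float]]  # grayscale image stored as rows of pixel values


def extract_patches(image: Image, q: int, stride: int) -> List[Image]:
    """Return all q x q patches taken at the given stride (no padding)."""
    height, width = len(image), len(image[0])
    patches = []
    for top in range(0, height - q + 1, stride):
        for left in range(0, width - q + 1, stride):
            patches.append([row[left:left + q] for row in image[top:top + q]])
    return patches


def bag_logits(patches: List[Image],
               patch_logits: Callable[[Image], List[float]]) -> List[float]:
    """Average per-patch class evidence (assumed aggregation rule).

    Each patch is scored on its own, so the result does not depend on
    where a patch came from or on the order of the list.
    """
    per_patch = [patch_logits(p) for p in patches]
    n_classes = len(per_patch[0])
    return [sum(l[c] for l in per_patch) / len(per_patch)
            for c in range(n_classes)]


def toy_patch_logits(patch: Image) -> List[float]:
    """Hypothetical per-patch scorer: class 0 = 'bright', class 1 = 'dark'."""
    flat = [v for row in patch for v in row]
    mean = sum(flat) / len(flat)
    return [mean, 1.0 - mean]


# A 4 x 4 toy image split into four non-overlapping 2 x 2 patches.
image = [
    [1.0, 1.0, 0.0, 0.0],
    [1.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
]
patches = extract_patches(image, q=2, stride=2)

# Per-patch evidence for class 0 doubles as a coarse attribution map:
# it shows how much each image region contributes to the decision.
evidence_class0 = [toy_patch_logits(p)[0] for p in patches]

logits = bag_logits(patches, toy_patch_logits)
shuffled_logits = bag_logits(list(reversed(patches)), toy_patch_logits)
print(logits, shuffled_logits)  # → [0.3125, 0.6875] [0.3125, 0.6875]
```

Reversing the patch list leaves the image-level scores unchanged, which is the sense in which such a model ignores spatial ordering. Because the image-level score is an average of independent per-patch scores, the contribution of each region can be read off directly, as `evidence_class0` does here.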

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. ShapeY: A Principled Framework for Measuring Shape Recognition Capacity via Nearest-Neighbor Matching

    cs.CV · 2026-04 · unverdicted · novelty 6.0

    ShapeY is a benchmark dataset and nearest-neighbor protocol that measures shape-based recognition in vision models, revealing that even state-of-the-art networks fail to generalize consistently across 3D viewpoints an...