
arxiv: 1611.06596 · v3 · pith:URYZUOFKnew · submitted 2016-11-20 · 💻 cs.CV

Object Recognition with and without Objects

classification 💻 cs.CV
keywords networks · object · recognition · different · performance · visual · background · context

While recent deep neural networks have achieved promising performance on object recognition, they rely implicitly on the visual contents of the whole image. In this paper, we train deep neural networks on the foreground (object) and background (context) regions of images respectively. Compared with human recognition in the same situations, networks trained on pure background without objects achieve highly reasonable recognition performance that beats humans by a large margin when only context is given. However, humans still outperform networks when only the pure object is available, which indicates that networks and human beings have different mechanisms for understanding an image. Furthermore, we straightforwardly combine multiple trained networks to explore the different visual cues learned by different networks. Experiments show that useful visual hints can be explicitly learned separately and then combined to achieve higher performance, which verifies the advantages of the proposed framework.
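As a rough illustration of the setup the abstract describes, the sketch below splits an image into a foreground (object) crop and a background (context) image, then combines the class scores of several networks. It is a minimal sketch under stated assumptions, not the authors' implementation: the abstract does not say how regions are defined or how networks are combined, so the bounding-box format, the zero-fill of the object region, the score averaging, and the names `split_foreground_background` and `combine_scores` are all hypothetical.

```python
import numpy as np


def split_foreground_background(image, box):
    """Split an H x W x C image using a bounding box.

    box is (top, left, bottom, right) in pixels; this format is an
    assumption made for the example, not taken from the paper.
    """
    top, left, bottom, right = box
    # Foreground: the cropped object region.
    foreground = image[top:bottom, left:right].copy()
    # Background: the full image with the object region zeroed out
    # (zero-fill is an assumption; other fill strategies are possible).
    background = image.copy()
    background[top:bottom, left:right] = 0
    return foreground, background


def combine_scores(score_list):
    """Average per-class probability vectors from several networks.

    Plain averaging is one simple combination rule, chosen here only
    for illustration.
    """
    return np.mean(np.stack(score_list), axis=0)


# Toy 4 x 4 RGB image with non-zero pixel values.
image = np.arange(4 * 4 * 3, dtype=np.uint8).reshape(4, 4, 3) + 1
fg, bg = split_foreground_background(image, (1, 1, 3, 3))

# Hypothetical two-class scores from a foreground-trained and a
# background-trained network.
combined = combine_scores([np.array([0.7, 0.3]), np.array([0.4, 0.6])])
predicted_class = int(np.argmax(combined))
```

In this toy setup each network would be trained on only one of the two views, and the combination step is where complementary cues from object and context can be merged.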



Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. MAPS: A Synthetic Dataset for Probing Vision Models in a Controlled 3D Scene Space

    cs.CV 2026-05 unverdicted novelty 7.0

    MAPS provides 2618 validated 3D meshes and a controllable rendering pipeline to attribute vision model recognition failures to specific scene parameters, finding camera distance and elevation as the dominant failure f...