Interpretability Beyond Classification Output: Semantic Bottleneck Networks

2019 · cs.CV · arXiv 1907.10882

1 Pith paper cites this work. Polarity classification is still indexing.

abstract

Today's deep learning systems deliver high performance based on end-to-end training. While they deliver strong performance, these systems are hard to interpret. To address this issue, we propose Semantic Bottleneck Networks (SBN): deep networks with semantically interpretable intermediate layers that all downstream results are based on. As a consequence, the analysis on what the final prediction is based on is transparent to the engineer and failure cases and modes can be analyzed and avoided by high-level reasoning. We present a case study on street scene segmentation to demonstrate the feasibility and power of SBN. In particular, we start from a well performing classic deep network which we adapt to house a SB-Layer containing task related semantic concepts (such as object-parts and materials). Importantly, we can recover state of the art performance despite a drastic dimensionality reduction from 1000s (non-semantic feature) to 10s (semantic concept) channels. Additionally we show how the activations of the SB-Layer can be used for both the interpretation of failure cases of the network as well as for confidence prediction of the resulting output. For the first time, e.g., we show interpretable segmentation results for most predictions at over 99% accuracy.
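
The bottleneck idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes the SB-Layer acts as a per-pixel linear projection (a 1x1 convolution) from about a thousand feature channels down to a few dozen concept channels, and that the downstream classifier sees only those concept channels. The sigmoid squashing, the channel counts, the random weights, and all function and variable names are hypothetical choices made for illustration.

```python
import math
import random


def linear_1x1(pixels, weights):
    """Per-pixel linear map, equivalent to a 1x1 convolution.

    pixels:  list of feature vectors, one per pixel (length C_in each)
    weights: C_out rows, each of length C_in
    """
    return [[sum(w * x for w, x in zip(row, px)) for row in weights]
            for px in pixels]


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def semantic_bottleneck_forward(features, w_concepts, w_classes):
    """Project features to concept activations, then classify from concepts only."""
    # Bottleneck: many non-semantic channels -> few concept channels.
    # The sigmoid is an assumption of this sketch, not taken from the paper.
    concepts = [[sigmoid(v) for v in px]
                for px in linear_1x1(features, w_concepts)]
    # Downstream prediction has access to nothing but the concept activations.
    scores = linear_1x1(concepts, w_classes)
    labels = [max(range(len(s)), key=s.__getitem__) for s in scores]
    return concepts, labels


# Hypothetical sizes: 1024 feature channels, 16 concepts, 5 classes, 4 pixels.
C_IN, N_CONCEPTS, N_CLASSES, N_PIXELS = 1024, 16, 5, 4
rng = random.Random(0)

features = [[rng.random() for _ in range(C_IN)] for _ in range(N_PIXELS)]
w_concepts = [[rng.uniform(-0.05, 0.05) for _ in range(C_IN)]
              for _ in range(N_CONCEPTS)]
w_classes = [[rng.uniform(-1.0, 1.0) for _ in range(N_CONCEPTS)]
             for _ in range(N_CLASSES)]

concepts, labels = semantic_bottleneck_forward(features, w_concepts, w_classes)
print("concept channels per pixel:", len(concepts[0]))
print("predicted class per pixel:", labels)
```

Because every class score here is computed from the concept activations alone, inspecting `concepts` for a misclassified pixel shows which concept channels drove that prediction. This is the kind of inspection the abstract describes for analyzing failure cases.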

representative citing papers

Investigating Concept Alignment Using Implausible Category Members

cs.AI · 2026-05-20 · unverdicted · novelty 6.0

AI models misalign with humans on concept boundaries when probed with implausible category members, such as classifying words as vehicles or vegetables as fruit.

citing papers explorer

Showing 1 of 1 citing paper.

Investigating Concept Alignment Using Implausible Category Members

cs.AI · 2026-05-20 · unverdicted · none · ref 20 · internal anchor