Explainable Neural Networks based on Additive Index Models
Machine learning algorithms have seen increasing use in recent years because of their flexibility in model fitting and their improved predictive performance. However, the complexity of these models makes it hard for the data analyst to interpret the results and explain them without additional tools. This has led to much research on approaches for understanding model behavior. In this paper, we present the Explainable Neural Network (xNN), a structured neural network designed specifically to learn interpretable features. Unlike in fully connected neural networks, the features engineered by the xNN can be extracted from the network in a relatively straightforward manner and the results displayed. With appropriate regularization, the xNN provides a parsimonious explanation of the relationship between the features and the output. We illustrate this interpretable feature-engineering property on simulated examples.
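As a rough illustration of the additive index model named in the title, the sketch below evaluates a function of the form f(x) = mu + sum_k gamma_k * h_k(beta_k . x), where each term depends on the inputs only through a single linear projection. This is an assumption about the general model family, not the paper's implementation: the function name `additive_index_model`, the fixed ridge functions (`np.square`, `np.tanh`), and all numeric values are hypothetical, and in an xNN the projections and ridge functions would be learned rather than specified by hand.

```python
import numpy as np

def additive_index_model(x, betas, ridge_fns, gammas, mu=0.0):
    """Evaluate f(x) = mu + sum_k gamma_k * h_k(beta_k . x) for a batch of inputs.

    All names here are illustrative assumptions, not taken from the paper.
    """
    x = np.asarray(x, dtype=float)            # shape (n_samples, n_features)
    projections = x @ np.asarray(betas).T     # shape (n_samples, K): one index per term
    out = np.full(x.shape[0], mu, dtype=float)
    for k, h in enumerate(ridge_fns):
        # Each term sees the inputs only through its own 1-D projection,
        # so it can be plotted and inspected on its own.
        out += gammas[k] * h(projections[:, k])
    return out

# Hypothetical two-term model on three input features.
betas = np.array([[1.0, 0.0, 0.0],    # first index uses feature 0 only
                  [0.0, 1.0, 1.0]])   # second index sums features 1 and 2
ridge_fns = [np.square, np.tanh]      # stand-ins for learned ridge functions
gammas = np.array([2.0, -1.0])

x = np.array([[1.0, 0.5, -0.5],
              [2.0, 0.0, 0.0]])
y = additive_index_model(x, betas, ridge_fns, gammas, mu=0.5)
print(y)
```

Because every term is a univariate function of one projection, each pair (beta_k, h_k) can be examined separately, which is the kind of structure that makes the learned features comparatively easy to extract and display.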
Forward citations
Cited by 2 Pith papers
- Sparse Deep Additive Model with Interactions: Enhancing Interpretability and Predictability
  SDAMI detects interactions in high-dimensional data via an Effect Footprint principle and models them using sparsity, group lasso, and dedicated deep subnetworks for improved interpretability.
- Explainability Methods for Hardware Trojan Detection: A Systematic Comparison
  Compares domain-aware, case-based, and feature attribution explainability methods for gate-level hardware Trojan detection on the Trust-Hub benchmark dataset.