Explaining Concept Shift with Interpretable Feature Attribution

Alistair Turcan; Bryan Wilder; Ruiqi Lyu


Concept shift occurs when the distribution of labels conditioned on the features changes between domains, which can leave even a well-tuned ML model miscalibrated on a new domain. Identifying the features responsible for the shift gives insight into how feature-label relationships differ between domains, particularly when the domains differ along a scientifically relevant dimension such as time, disease status, or population. In this paper, we propose SGShift, a method for attributing performance degradation under concept shift in tabular data to a sparse set of shifted features. We frame concept shift as a feature selection task: learning the features that explain performance differences between models in the source and target domains. This framing lets SGShift adapt powerful statistical tools, such as generalized additive models, knockoffs, and absorption, to identify these shifted features. We conduct extensive experiments on synthetic and real data across various ML models and find that SGShift identifies shifted features much more accurately than baseline methods, requires few samples in the shifted domain, and is robust to complex cases of concept shift.
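To make the feature-selection framing concrete, here is a minimal sketch and not the authors' implementation. It assumes a binary label, a source model whose logits are known exactly, and a linear correction term fitted on target data with a plain L1 penalty. The L1 penalty stands in for the knockoff and absorption steps named in the abstract, which are not reproduced here. The function name `fit_sparse_shift`, the simulated data, and the penalty strength `lam` are all hypothetical choices made for illustration.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit_sparse_shift(X, y, offset, lam=0.05, n_iter=500):
    """L1-penalized logistic regression with a fixed offset (proximal gradient).

    The offset holds the source model's logits, so the fitted coefficients
    only describe how the target domain departs from the source model.
    """
    n, d = X.shape
    # Step size 1/L, where L upper-bounds the curvature of the mean logistic loss.
    step = 1.0 / (0.25 * np.linalg.norm(X, 2) ** 2 / n)
    delta = np.zeros(d)
    for _ in range(n_iter):
        p = sigmoid(offset + X @ delta)
        grad = X.T @ (p - y) / n
        z = delta - step * grad
        # Soft-thresholding: the proximal operator of the L1 penalty.
        delta = np.sign(z) * np.maximum(np.abs(z) - step * lam, 0.0)
    return delta

# Simulated target domain (illustrative assumption, not data from the paper).
rng = np.random.default_rng(0)
n, d = 2000, 10
beta_source = np.linspace(-1.0, 1.0, d)   # stand-in for the source model
X = rng.normal(size=(n, d))
true_delta = np.zeros(d)
true_delta[[2, 7]] = [1.5, -1.5]          # only features 2 and 7 shift
y = rng.binomial(1, sigmoid(X @ (beta_source + true_delta)))

offset = X @ beta_source                  # source-model logits on target data
delta_hat = fit_sparse_shift(X, y, offset)
ranking = np.argsort(-np.abs(delta_hat))  # features ordered by estimated shift
print(ranking[:2], np.round(delta_hat, 2))
```

In this sketch, the features with the largest estimated corrections are the candidates for concept shift. The L1 penalty shrinks the estimates toward zero, so their magnitudes are useful for ranking rather than as unbiased effect sizes.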
