A framework for the quantitative evaluation of disentangled representations
2 Pith papers cite this work. Polarity classification is still indexing.
Fields: cs.LG (2)
Verdicts: UNVERDICTED (2)
citing papers explorer
- A Multi-Level Causal Intervention Framework for Mechanistic Interpretability in Variational Autoencoders
  Introduces a causal intervention framework with new metrics for mechanistic interpretability of VAEs and reports empirical findings from extensive experiments on multiple models and datasets.
- Posterior-Calibrated Causal Circuits in Variational Autoencoders: Why Image-Domain Interpretability Fails on Tabular Data
  Tabular VAEs show ~50% lower causal circuit modularity than image VAEs, with beta-VAE CES collapsing to 0.043 versus 0.133 due to reconstruction degradation, challenging direct transfer of image interpretability techniques.