A Meta-Transfer Objective for Learning to Disentangle Causal Mechanisms

Anirudh Goyal; Christopher Pal; Nasim Rahaman; Olexa Bilaniuk; Rosemary Ke; Sébastien Lachapelle; Tristan Deleu; Yoshua Bengio

arxiv: 1901.10912 · v2 · pith:3P2LKE57new · submitted 2019-01-30 · 💻 cs.LG · stat.ML

Yoshua Bengio, Tristan Deleu, Nasim Rahaman, Rosemary Ke, Sébastien Lachapelle, Olexa Bilaniuk, Anirudh Goyal, Christopher Pal

keywords: causal, changes, mechanisms, variables, adaptation, interventions, sparse, actions

We propose to meta-learn causal structures based on how fast a learner adapts to new distributions arising from sparse distributional changes, e.g. due to interventions, actions of agents and other sources of non-stationarities. We show that under this assumption, the correct causal structural choices lead to faster adaptation to modified distributions because the changes are concentrated in one or just a few mechanisms when the learned knowledge is modularized appropriately. This leads to sparse expected gradients and a lower effective number of degrees of freedom needing to be relearned while adapting to the change. It motivates using the speed of adaptation to a modified distribution as a meta-learning objective. We demonstrate how this can be used to determine the cause-effect relationship between two observed variables. The distributional changes do not need to correspond to standard interventions (clamping a variable), and the learner has no direct knowledge of these interventions. We show that causal structures can be parameterized via continuous variables and learned end-to-end. We then explore how these ideas could be used to also learn an encoder that would map low-level observed variables to unobserved causal variables leading to faster adaptation out-of-distribution, learning a representation space where one can satisfy the assumptions of independent mechanisms and of small and sparse changes in these mechanisms due to actions and non-stationarities.
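
The claim about sparse expected gradients can be illustrated with a small numerical sketch. It is not the authors' code. It assumes two discrete variables with a ground-truth structure A -> B, tabular models parameterized by softmax logits, and a transfer distribution in which only p(A) changes; the names `factorize` and `expected_grads` and the choice N = 5 are hypothetical. Both factorizations are fitted exactly to the training joint, and the exact expected gradient of the log-likelihood under the transfer distribution is then computed for each module.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 5  # categories per variable (hypothetical choice for illustration)

# Assumed ground truth: A -> B, with tabular mechanisms p(A) and p(B | A).
p_A = rng.dirichlet(np.ones(N))              # training marginal p(A)
p_B_given_A = rng.dirichlet(np.ones(N), N)   # row a holds p(B | A=a)

# Transfer distribution: only the mechanism p(A) is changed.
q_A = rng.dirichlet(np.ones(N))

train_joint = p_A[:, None] * p_B_given_A     # indexed [a, b]
transfer_joint = q_A[:, None] * p_B_given_A  # indexed [a, b]


def factorize(joint):
    """Split joint[x, y] into the marginal p(x) and the conditional p(y | x)."""
    marginal = joint.sum(axis=1)
    conditional = joint / marginal[:, None]
    return marginal, conditional


def expected_grads(marginal, conditional, target_joint):
    """Exact gradient of E_target[log p(x) p(y | x)] w.r.t. softmax logits.

    For a softmax-parameterized table the gradient w.r.t. the logits is
    (target probabilities - model probabilities), weighted by the target
    probability of the conditioning value for the conditional module.
    """
    t_marginal, t_conditional = factorize(target_joint)
    g_marginal = t_marginal - marginal
    g_conditional = t_marginal[:, None] * (t_conditional - conditional)
    return g_marginal, g_conditional


# Causal model p(A) p(B | A) and anticausal model p(B) p(A | B),
# both matching the training joint exactly.
g_causal = expected_grads(*factorize(train_joint), transfer_joint)
g_anti = expected_grads(*factorize(train_joint.T), transfer_joint.T)

for name, (g_m, g_c) in [("A -> B", g_causal), ("B -> A", g_anti)]:
    nonzero = int((np.abs(g_m) > 1e-9).sum() + (np.abs(g_c) > 1e-9).sum())
    print(f"{name}: {nonzero} logits with non-zero expected gradient")
```

Under these assumptions the conditional module of the causal factorization receives a zero expected gradient, so only the N logits of p(A) need to be relearned, whereas in the anticausal factorization both p(B) and p(A | B) generally have to move.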

Forward citations

Cited by 2 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Prognostic Value of Lung Ultrasound Biomarkers for Readmission Risk in Congestive Heart Failure: A Pilot Data-Driven Analysis
eess.SP 2026-05 unverdicted novelty 6.0

Pilot study uses pretrained video encoder features from lung ultrasound to predict 30-day CHF readmission, finding lower-lung views and temporal differences most informative with top MLP F1 of 0.80.

TabICL: A Tabular Foundation Model for In-Context Learning on Large Data
cs.LG 2025-02 unverdicted novelty 6.0

TabICL scales in-context learning to large tabular data via column-then-row attention for row embeddings followed by a transformer, matching TabPFNv2 speed and performance while outperforming it and CatBoost on datase...