
hub Canonical reference

Invariant Risk Minimization

Canonical reference. 71% of citing Pith papers cite this work as background.

89 Pith papers citing it
Background 71% of classified citations
abstract

We introduce Invariant Risk Minimization (IRM), a learning paradigm to estimate invariant correlations across multiple training distributions. To achieve this goal, IRM learns a data representation such that the optimal classifier, on top of that data representation, matches for all training distributions. Through theory and experiments, we show how the invariances learned by IRM relate to the causal structures governing the data and enable out-of-distribution generalization.
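The abstract's central idea, a representation on top of which the same classifier is optimal in every training environment, can be illustrated with a small numerical sketch. The code below is not taken from this page: it assumes a linear representation `phi`, squared-error loss, and a gradient-penalty surrogate in the style of the practical IRMv1 objective (the squared gradient of each environment's risk with respect to a fixed scalar classifier w = 1). The names `irm_penalty`, `irm_objective`, `make_env`, and the synthetic data are hypothetical and chosen only for illustration.

```python
import numpy as np

def irm_penalty(features, targets):
    """Squared gradient of the squared-error risk w.r.t. a scalar classifier w, at w = 1.

    risk(w) = mean((w * features - targets) ** 2), so
    d risk / d w at w = 1 equals mean(2 * (features - targets) * features).
    """
    grad = np.mean(2.0 * (features - targets) * features)
    return grad ** 2

def irm_objective(phi, envs, lam=1.0):
    """Sum over environments of risk plus lam times the invariance penalty."""
    total = 0.0
    for x, y in envs:
        f = x @ phi                      # linear representation (assumption)
        risk = np.mean((f - y) ** 2)     # per-environment squared-error risk
        total += risk + lam * irm_penalty(f, y)
    return total

# Synthetic environments (assumption): y depends on x1 in the same way everywhere,
# while x2 tracks y with an environment-specific scale.
rng = np.random.default_rng(0)

def make_env(scale, n=1000):
    x1 = rng.normal(size=n)
    y = x1.copy()
    x2 = scale * y + rng.normal(size=n)
    return np.stack([x1, x2], axis=1), y

envs = [make_env(0.5), make_env(2.0)]
invariant = np.array([1.0, 0.0])   # uses only the stable feature x1
spurious = np.array([0.0, 1.0])    # uses only the environment-dependent feature x2

print(irm_objective(invariant, envs), irm_objective(spurious, envs))
```

In this toy setup the predictor that relies only on `x1` fits every environment with the same classifier, so both its risk and its penalty vanish, while the predictor that relies on `x2` is penalized because its best rescaling differs between environments.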


citation-role summary

background 12 method 1 other 1


claims ledger

  • background Θ ⊆ R^d are convex and compact, and let θ∗ ∈ Θ be a minimizer of the worst-group objective R(θ). Then there exists a distribution Q∗ ∈ Q such that θ∗ ∈ arg min_θ E_{z∼Q∗}[ℓ(θ; z)]. However, this equivalence breaks down when the loss ℓ is non-convex: Counterexample 1. Consider a uniform data distribution P supported on two points Z = {z1, z2}, and let ℓ(θ; z) be as in Figure 4, with Θ = [0, 1]. The DRO solution θ∗ achieves a worst-case loss of R(θ∗) = 0.6. Now consider any weights (w1, w2) ∈ ∆2 and w.l.o.g. let w



representative citing papers

Prediction-Intervention Games and Invariant Sets

stat.ML · 2026-05-16 · unverdicted · novelty 7.0

In prediction-intervention games, stable-blanket predictors are at least as good as causal-parent predictors for two classes of follower objectives and can be worst-case optimal under additional conditions.

Continual Learning of Domain-Invariant Representations

cs.LG · 2026-05-15 · unverdicted · novelty 7.0

Introduces replay-based continual learning with sequential invariance alignment to learn domain-invariant representations, outperforming baselines on generalization to unseen domains across six datasets in vision, medicine, manufacturing, and ecology.

Spectral Gradient Surgery for Domain-Generalizable Dataset Distillation

cs.LG · 2026-05-13 · unverdicted · novelty 7.0

Spectral Gradient Surgery disentangles class-discriminative and domain-specific signals in distribution-matching distilled datasets by analyzing gradient agreement in the spectral domain, yielding better out-of-distribution performance.

Measuring Behavior Portability in Large Language Models

cs.AI · 2026-06-22 · unverdicted · novelty 6.0

A new framework measures behavioral portability of LLMs across payoff-equivalent environments and reports substantial systematic transfer losses in seven economic decision problems.

Unsupervised Causal Abstractions Discovery

cs.LG · 2026-06-17 · unverdicted · novelty 6.0

Low-rank graphs induce latents that form causal abstractions, with identifiability results and a practical objective enabling unsupervised learning of high-level SCMs from low-level measurements.
