Invariant Risk Minimization

Canonical reference. 71% of citing Pith papers cite this work as background.

75 Pith papers citing it
Background 71% of classified citations
abstract

We introduce Invariant Risk Minimization (IRM), a learning paradigm to estimate invariant correlations across multiple training distributions. To achieve this goal, IRM learns a data representation such that the optimal classifier, on top of that data representation, matches for all training distributions. Through theory and experiments, we show how the invariances learned by IRM relate to the causal structures governing the data and enable out-of-distribution generalization.
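The abstract describes the goal (a representation on top of which the same classifier is optimal in every training distribution) without giving an objective. As a minimal sketch only, the snippet below uses a gradient-penalty surrogate in the style of the IRMv1 objective, under assumptions not stated on this page: squared loss, scalar predictions, and a fixed dummy classifier scale w = 1. The names `risk`, `irm_penalty`, and `irm_objective` are hypothetical, and `lam` is an assumed penalty weight.

```python
from typing import Sequence, Tuple

# One environment = (predictions phi(x), targets y). Hypothetical layout.
Environment = Tuple[Sequence[float], Sequence[float]]


def risk(preds: Sequence[float], targets: Sequence[float], w: float = 1.0) -> float:
    """Mean squared error of the scaled predictions w * phi(x)."""
    n = len(preds)
    return sum((w * p - y) ** 2 for p, y in zip(preds, targets)) / n


def irm_penalty(preds: Sequence[float], targets: Sequence[float]) -> float:
    """Squared derivative of the risk w.r.t. the dummy scale w, taken at w = 1.

    For squared loss, d/dw mean((w*p - y)^2) at w = 1 is mean(2*(p - y)*p).
    A zero penalty means w = 1 is a stationary point in this environment.
    """
    n = len(preds)
    grad = sum(2.0 * (p - y) * p for p, y in zip(preds, targets)) / n
    return grad ** 2


def irm_objective(envs: Sequence[Environment], lam: float) -> float:
    """Sum over environments of risk plus lam times the invariance penalty."""
    return sum(risk(p, y) + lam * irm_penalty(p, y) for p, y in envs)


if __name__ == "__main__":
    # Toy environments: the first is fit exactly, the second over-predicts.
    envs = [([1.0, 2.0], [1.0, 2.0]), ([2.0, 2.0], [1.0, 1.0])]
    print(irm_objective(envs, lam=0.5))
```

The penalty is computed per environment, so a representation is only rewarded when the same scale is simultaneously stationary in every training distribution, which is the matching condition the abstract describes. In practice the derivative would usually come from automatic differentiation rather than the closed form used here.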

citation-role summary

background 12 · method 1 · other 1

claims ledger

  • background Θ ⊆ R^d are convex and compact, and let θ∗ ∈ Θ be a minimizer of the worst-group objective R(θ). Then there exists a distribution Q∗ ∈ Q such that θ∗ ∈ arg min_θ E_{z∼Q∗}[ℓ(θ; z)]. However, this equivalence breaks down when the loss ℓ is non-convex: Counterexample 1. Consider a uniform data distribution P supported on two points Z = {z1, z2}, and let ℓ(θ; z) be as in Figure 4, with Θ = [0, 1]. The DRO solution θ∗ achieves a worst-case loss of R(θ∗) = 0.6. Now consider any weights (w1, w2) ∈ ∆2 and w.l.o.g. let w

representative citing papers

Prediction-Intervention Games and Invariant Sets

stat.ML · 2026-05-16 · unverdicted · novelty 7.0

In prediction-intervention games, stable-blanket predictors are at least as good as causal-parent predictors for two classes of follower objectives and can be worst-case optimal under additional conditions.

Continual Learning of Domain-Invariant Representations

cs.LG · 2026-05-15 · unverdicted · novelty 7.0

Introduces replay-based continual learning with sequential invariance alignment to learn domain-invariant representations, outperforming baselines on generalization to unseen domains across six datasets in vision, medicine, manufacturing, and ecology.

Spectral Gradient Surgery for Domain-Generalizable Dataset Distillation

cs.LG · 2026-05-13 · unverdicted · novelty 7.0

Spectral Gradient Surgery disentangles class-discriminative and domain-specific signals in distribution-matching distilled datasets by analyzing gradient agreement in the spectral domain, yielding better out-of-distribution performance.

Understanding Generalization through Decision Pattern Shift

cs.LG · 2026-05-13 · unverdicted · novelty 6.0

DPS quantifies deviation of per-sample decision patterns from class averages and shows linear correlation with generalization gaps while unifying degradation scenarios into a continuous trajectory.

citing papers explorer

Showing 50 of 75 citing papers.