Invariant Risk Minimization

David Lopez-Paz, Ishaan Gulrajani, Léon Bottou, Martin Arjovsky · 2019 · stat.ML · arXiv 1907.02893

Canonical reference. 71% of citing Pith papers cite this work as background.

119 Pith papers citing it

abstract

We introduce Invariant Risk Minimization (IRM), a learning paradigm to estimate invariant correlations across multiple training distributions. To achieve this goal, IRM learns a data representation such that the optimal classifier, on top of that data representation, matches for all training distributions. Through theory and experiments, we show how the invariances learned by IRM relate to the causal structures governing the data and enable out-of-distribution generalization.
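
The abstract's central requirement, that the optimal classifier on top of the representation be the same in every training distribution, can be illustrated with a gradient penalty in the spirit of the paper's practical IRMv1 relaxation. The sketch below is a minimal illustration only: the squared loss, the linear featurizer, the toy data generator, and all function names (`irmv1_penalty`, `irm_objective`, `make_env`) are assumptions made here and do not come from this page.

```python
import numpy as np

def irmv1_penalty(preds, targets):
    # Squared-error risk of a scalar "dummy" classifier w placed on top of the
    # representation: R(w) = mean((w * preds - targets)^2).
    # Its derivative at w = 1 is mean(2 * (preds - targets) * preds).
    # The penalty is that derivative squared: zero when w = 1 is already optimal.
    grad_at_one = np.mean(2.0 * (preds - targets) * preds)
    return grad_at_one ** 2

def irm_objective(envs, phi, lam=1.0):
    # Sum over environments of (risk + lam * penalty) for a linear featurizer phi.
    total = 0.0
    for x, y in envs:
        preds = x @ phi
        total += np.mean((preds - y) ** 2) + lam * irmv1_penalty(preds, y)
    return total

def make_env(rng, n, spurious_noise):
    # Hypothetical toy data: x1 causes y; x2 is a noisy copy of y whose
    # noise level changes from one environment to the next.
    x1 = rng.normal(size=n)
    y = x1 + 0.1 * rng.normal(size=n)
    x2 = y + spurious_noise * rng.normal(size=n)
    return np.column_stack([x1, x2]), y

rng = np.random.default_rng(0)
envs = [make_env(rng, 10_000, s) for s in (0.5, 1.0)]

invariant = np.array([1.0, 0.0])  # predictor that uses only x1
spurious = np.array([0.0, 1.0])   # predictor that uses only x2

print("invariant:", irm_objective(envs, invariant))
print("spurious: ", irm_objective(envs, spurious))
```

In this toy setup the best rescaling of `x2` differs between the two environments, so the spurious predictor is penalized, while the predictor built on `x1` is already close to optimal in both environments and incurs almost no penalty.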

citation-role summary

background 12 · method 1 · other 1

citation-polarity summary

background 10 · unclear 3 · use method 1

claims ledger

background Θ ⊆ ℝ^d are convex and compact, and let θ* ∈ Θ be a minimizer of the worst-group objective R(θ). Then there exists a distribution Q* ∈ Q such that θ* ∈ arg min_θ E_{z∼Q*}[ℓ(θ; z)]. However, this equivalence breaks down when the loss ℓ is non-convex: Counterexample 1. Consider a uniform data distribution P supported on two points Z = {z1, z2}, and let ℓ(θ; z) be as in Figure 4, with Θ = [0, 1]. The DRO solution θ* achieves a worst-case loss of R(θ*) = 0.6. Now consider any weights (w1, w2) ∈ Δ2 and w.l.o.g. let w

authors

David Lopez-Paz, Ishaan Gulrajani, Léon Bottou, Martin Arjovsky

representative citing papers

The Matching Principle: A Geometric Theory of Loss Functions for Nuisance-Robust Representation Learning

cs.LG · 2026-05-21 · unverdicted · novelty 8.0 · 2 refs

Robustness methods estimate the task covariance Sigma_task; the matching principle requires each penalty matrix's range to cover that of Sigma_task in order to drive deployment drift to zero.

The Statistical Cost of Adaptation in Multi-Source Transfer Learning

math.ST · 2026-05-10 · unverdicted · novelty 8.0

Multi-source transfer learning incurs an intrinsic adaptation cost that can exceed one, with phase transitions separating regimes where bias-agnostic estimators match oracle performance from those where they cannot.

CouCE: A Unified Causal Framework for Debiased Deep Metric Learning

cs.CV · 2026-06-29 · unverdicted · novelty 7.0

CouCE is a unified causal framework using Orthogonal Dictionary-Based Backdoor Adjustment and Multi-Scale Randomized Causal Intervention to debias deep metric learning against two distinct confounders.

Repair-before-veto control for safe lithium-ion fast charging under unknown ambient and cooling-fault conditions

eess.SY · 2026-06-26 · unverdicted · novelty 7.0

RACL-B safely completes fast charging across all nine tested ambient-temperature and cooling-health conditions in a DFN battery model, achieving 37.9% faster charging than the safest fixed current while minimizing plated lithium.

Adversarial Domain Prompt Tuning and Generation for Single Domain Generalization

cs.CV · 2026-06-19 · unverdicted · novelty 7.0

PAPT uses adversarial prompt tuning on diffusion models to generate domain-style images while preserving category features, claiming superior single-domain generalization performance.

When Dynamics Models Read the Wrong Time Steps: Label-Free Event Credit Re-Anchoring for Robust Global Readouts

cs.LG · 2026-06-16 · unverdicted · novelty 7.0

CREST re-anchors global readouts in dynamics models to transient events via event-versus-rest contrast on learned features, reducing OOD error on gear, impact, and bearing systems while restoring event credit.

Martingale Doppelgänger-Eval: An Identification Framework for Auditing Candlestick Understanding in Vision-Language Models

q-fin.CP · 2026-06-16 · unverdicted · novelty 7.0

Introduces an identification benchmark with martingale-null markets, counterfactual pairs, and trend swaps to determine if VLMs ground responses in candlestick evidence or rely on trend extrapolation.

Is Spurious Correlation Removal Always Learnable?

cs.LG · 2026-06-11 · unverdicted · novelty 7.0

A conditional computational barrier exists for learning k=1 invariant subspaces in samplable multi-environment instances under sparse recovery hardness; the minimax risk is Theta(k(d-k)/(n|E|)), with a phase transition at n* ~ k(d-k)/(|E| gamma^2).

Implicit Neural Representations of Individual Behavior

cs.LG · 2026-06-10 · unverdicted · novelty 7.0

Behavioral INR adapts INRs to behavior by mapping states to actions with FiLM-modulated episode latents for self-supervised policy inference in unlabeled data, with new policy OOD definitions.

Customization under Fire: Plugin Poisoning in Text-to-Image Ecosystem

cs.CR · 2026-06-08 · unverdicted · novelty 7.0

PoisonLoRA demonstrates ~100% attack success rates for stealthy LoRA poisoning via concept hijacking and task injection on real platforms, with robustness to base model transfer and multiple remixes.

Invariant Gradient Alignment for Robust Reasoning Distillation

cs.LG · 2026-06-03 · unverdicted · novelty 7.0

Invariant Gradient Alignment uses Logical Isomer Sets and a Continuous Gradient Conflict Mask to tighten OOD generalization bounds and boost empirical performance over ERM in reasoning distillation.

EarthShift: a benchmark for measuring robustness to real-world distribution shifts in Earth observation

cs.CV · 2026-05-28 · unverdicted · novelty 7.0

EarthShift is a new benchmark using paired datasets to measure robustness of geospatial foundation models to realistic distribution shifts, finding consistent 15-20% performance drops out-of-distribution across 8 models and 11 tasks.

Cumulative Meta-Learning from Active Learning Queries for Robustness to Spurious Correlations

cs.LG · 2026-05-20 · unverdicted · novelty 7.0

CAML meta-learns a progressively refined inductive bias from active-learning queries to improve robustness to spurious correlations, reporting accuracy gains on minority groups across several benchmarks.

Identifiable Multimodal Causal Representation Learning under Partial Latent Sharing

cs.LG · 2026-05-18 · unverdicted · novelty 7.0

Establishes component-wise identifiability guarantees for partially shared causal latents in multimodal nonlinear mixing and introduces a differentiable Wasserstein-based module for recovery.

FML-bench: A Controlled Study of AI Research Agent Strategies from the Perspective of Search Dynamics

cs.LG · 2026-05-17 · unverdicted · novelty 7.0 · 2 refs

FML-Bench shows a simple greedy hill-climber nearly matches tree search on dense-opportunity tasks while an adaptive agent that broadens search on stagnation outperforms six baselines across 18 tasks.

Prediction-Intervention Games and Invariant Sets

stat.ML · 2026-05-16 · unverdicted · novelty 7.0

In prediction-intervention games, stable-blanket predictors are at least as good as causal-parent predictors for two classes of follower objectives and can be worst-case optimal under additional conditions.

Continual Learning of Domain-Invariant Representations

cs.LG · 2026-05-15 · unverdicted · novelty 7.0

Introduces replay-based continual learning with sequential invariance alignment to learn domain-invariant representations, outperforming baselines on generalization to unseen domains across six datasets in vision, medicine, manufacturing, and ecology.

TILT: Target-induced loss tilting under covariate shift

cs.LG · 2026-05-14 · conditional · novelty 7.0

TILT adds a target-data penalty on an auxiliary predictor component to induce effective importance weighting for unsupervised domain adaptation under covariate shift.

Separating Shortcut Transition from Cross-Family OOD Failure in a Minimal Model

cs.LG · 2026-05-13 · unverdicted · novelty 7.0 · 2 refs

A minimal model shows that the training-side switch to a shortcut rule does not uniformly produce cross-family OOD failure, as outcomes depend on the held-out family's shortcut correlation.

Spectral Gradient Surgery for Domain-Generalizable Dataset Distillation

cs.LG · 2026-05-13 · unverdicted · novelty 7.0

Spectral Gradient Surgery disentangles class-discriminative and domain-specific signals in distribution-matching distilled datasets by analyzing gradient agreement in the spectral domain, yielding better out-of-distribution performance.

Privacy-Aware Video Anomaly Detection through Orthogonal Subspace Projection

cs.CV · 2026-05-09 · unverdicted · novelty 7.0

A new orthogonal projection module for video anomaly detection suppresses facial attributes via weak face-presence signals and cosine alignment while preserving anomaly-relevant features like pose and motion.

Flatness and Gradient Alignment Are Both Necessary: Spectral-Aware Gradient-Aligned Exploration for Multi-Distribution Learning

cs.LG · 2026-05-08 · unverdicted · novelty 7.0

Excess risk decomposes into independent alignment (trace of inverse average Hessian times gradient covariance) and curvature terms, so both flatness and gradient alignment are required; SAGE achieves this and sets new SOTA on DomainBed.

Are We Making Progress in Multimodal Domain Generalization? A Comprehensive Benchmark Study

cs.CV · 2026-05-07 · unverdicted · novelty 7.0

A large-scale benchmark finds that recent multimodal domain generalization methods give only marginal gains over a plain ERM baseline, with no method winning consistently and all degrading sharply under corruption or missing modalities.

eXplaining to Learn (eX2L): Regularization Using Contrastive Visual Explanation Pairs for Distribution Shifts

cs.CV · 2026-05-07 · unverdicted · novelty 7.0

eX2L improves robustness to distribution shifts by penalizing similarity between Grad-CAM maps of a label classifier and a confounder classifier, reaching new SOTA average and worst-group accuracy on the Spawrious benchmark.

citing papers explorer

Prediction-Intervention Games and Invariant Sets

stat.ML · 2026-05-16 · unverdicted · none · ref 1 · internal anchor

In prediction-intervention games, stable-blanket predictors are at least as good as causal-parent predictors for two classes of follower objectives and can be worst-case optimal under additional conditions.

Anchor PCA

stat.ML · 2026-06-04 · unverdicted · none · ref 4 · internal anchor

Anchor PCA recovers a maximal invariant subspace for multi-domain data via PCA on a modified target matrix that trades off explained variance with domain agreement.

Unsupervised Identification and Removal of Spurious Correlations During Fine-Tuning

stat.ML · 2026-05-26 · unverdicted · none · ref 1 · internal anchor

Spurious latent factors in fine-tuning can be identified unsupervised from naive LoRA weights and removed via gradient projection of associated patterns to reduce bias and misalignment while preserving task performance.

Robust Representation Learning through Explicit Environment Modeling

stat.ML · 2026-04-28 · unverdicted · none · ref 2 · internal anchor

Explicitly modeling and marginalizing environment variation via generalized random-intercept models produces representations that support robust average prediction across unseen environments and outperform invariant-learning methods in challenging distribution-shift settings.

Environment-Robust Representation Learning with Empirical Bayes

stat.ML · 2026-06-03 · unverdicted · none · ref 3 · internal anchor

An empirical Bayes variational inference method learns environment-robust latent variables from multi-environment data for improved prediction in unseen environments.

Causality as the Statistical Conscience of Artificial Intelligence: From Pearl's Ladder to Trustworthy Machines

stat.ML · 2026-05-22 · unverdicted · none · ref 3 · internal anchor

Causality is required for out-of-distribution generalization in AI, with a necessity theorem and unified causal estimators proposed to fix failure modes like hallucination and reward hacking.
