Pith · machine review for the scientific record

arXiv:2605.02658 · v2 · submitted 2026-05-04 · 💻 cs.AI


Deciphering Shortcut Learning from an Evolutionary Game Theory Perspective


Pith reviewed 2026-05-08 19:25 UTC · model grok-4.3

classification 💻 cs.AI
keywords shortcut learning · evolutionary game theory · gradient descent · stochastic gradient descent · neural tangent features · shortcut bias · core subnetwork · deep learning

The pith

Gradient descent primarily optimizes shortcut subnetworks, while stochastic gradient descent primarily optimizes core subnetworks in deep neural networks.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper models shortcut learning by defining core and shortcut features and casting the training process as an evolutionary game in which data samples are players and neural tangent features are their strategies. It demonstrates that gradient descent converges to a stochastically stable state favoring the shortcut subnetwork, whereas stochastic gradient descent converges to one favoring the core subnetwork. The analysis further uses a stochastic differential equation to show how data and optimization noise modulate the bias toward shortcuts.

Core claim

Assuming the existence of core and shortcut subnetworks, we model data samples as players with neural tangent features as strategies in an evolutionary game. We find that gradient descent and stochastic gradient descent lead to two distinct stochastically stable states, with the former primarily optimizing the shortcut subnetwork and the latter primarily optimizing the core subnetwork. Investigation through a continuous stochastic differential equation reveals the impact of data noise and optimization noise on the formation of shortcut bias.

What carries the argument

The evolutionary game model of neural network training, with data samples as players and neural tangent features as strategies, which produces stochastically stable states under GD and SGD.
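The replicator-dynamics machinery behind such stochastically stable states can be sketched in a few lines. The 2×2 payoff matrix below is an illustrative coordination game, not the paper's U(t); strategy 0 stands in for the core feature and strategy 1 for the shortcut.

```python
import numpy as np

def replicator_step(x, U, dt=0.01):
    """One Euler step of replicator dynamics: x_i' = x_i * ((U x)_i - x^T U x)."""
    payoffs = U @ x          # payoff of each pure strategy against the population
    avg = x @ payoffs        # population-average payoff
    return x + dt * x * (payoffs - avg)

# Hypothetical payoff matrix for a coordination game (both pure states are
# equilibria); the numbers are placeholders, not the paper's fitted values.
U = np.array([[1.0, 0.2],
              [0.6, 1.2]])

x = np.array([0.5, 0.5])     # equal initial strategy shares
for _ in range(2000):
    x = replicator_step(x, U)
# From a 50/50 start, the shortcut strategy's higher payoff captures the
# population, and x converges to a pure state near (0, 1).
print(x)
```

The sum of the shares is conserved by construction, so the state stays a valid population distribution; which pure state wins depends only on the basin in which the initial shares sit.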

Load-bearing premise

Neural networks contain separable core and shortcut subnetworks, and their training can be modeled as an evolutionary game using neural tangent features as strategies for data sample players.

What would settle it

A direct test would involve training simple networks on synthetic data with explicit core and shortcut features and measuring whether the subnetwork weights align with the predicted optimization preferences under GD versus SGD.
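A minimal version of that test can be run with a linear probe on two synthetic features rather than the subnetwork setting the paper studies, so it checks the flavor of the prediction, not the theorem itself. The feature scales and the 90% shortcut correlation below are arbitrary illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data: a weak "core" feature that always matches the label, and a
# strong "shortcut" feature that matches it only 90% of the time.
n = 1000
y = rng.choice([-1.0, 1.0], size=n)
core = 0.5 * y + 0.3 * rng.standard_normal(n)
agree = rng.random(n) < 0.9
shortcut = 2.0 * np.where(agree, y, -y) + 0.3 * rng.standard_normal(n)
X = np.stack([core, shortcut], axis=1)

def train(batch_size, steps=2000, lr=0.1):
    """Logistic regression trained with mini-batch gradient steps."""
    w = np.zeros(2)
    for _ in range(steps):
        idx = rng.choice(n, size=batch_size, replace=False)
        xb, yb = X[idx], y[idx]
        margin = yb * (xb @ w)
        grad = -(yb / (1 + np.exp(margin))) @ xb / batch_size
        w -= lr * grad
    return w

w_gd = train(batch_size=n)    # full-batch "GD"
w_sgd = train(batch_size=16)  # mini-batch "SGD"
print("GD  weights (core, shortcut):", w_gd)
print("SGD weights (core, shortcut):", w_sgd)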

Figures

Figures reproduced from arXiv: 2605.02658 by Kuo Gai, Shihua Zhang, Xiayang Li.

Figure 1. a and b, Evolution of the three feature types on clean samples (a) and after noise injection (Gaussian noise with a standard deviation of 0.5) (b). c, Shortcut bias as a function of noise strength under different batch sizes.
Figure 2. Schematic diagram of the sub-network hypothesis. The hidden-layer neurons are divided into two categories, core neurons and shortcut neurons, and the connecting edges (weights) are divided accordingly: within the two major categories there are two sub-categories E1^f and E1^n (E2^f and E2^n) that model core (shortcut) features and noise for E1 (E2), respectively.
Figure 3. A schematic diagram of strategy transfer during the training process. Under the sub-network hypothesis, each sample has two strategies, a core strategy and a shortcut strategy; the corresponding evolutionary paths are marked in blue and red, respectively. When the payoff of one strategy is higher, the population adopting that strategy grows at the next epoch.
Figure 4. Simulating full-batch versus mini-batch training on the Colored MNIST dataset with a fully connected neural network. a and b, PCA visualizations of the original data and of the corresponding per-sample model gradients under the full-batch setting, at epoch 0 and the final epoch, respectively. c and d, the same illustrations under the mini-batch setting.
Figure 5. Numerical simulation of the stochastic differential equation (SDE) model. a, Evolution of the strengths of the two subnetworks (w1 and w2) over iterations under two data-noise levels (τ = 0.3 and τ = 0.8). b, The proportion of the core strategy (α) as a function of optimization noise. c, The shortcut bias, quantified by the difference in subnetwork strengths E[w2(∞) − w1(∞)].
read the original abstract

Shortcut learning causes deep learning models to rely on non-essential features within the data. However, its formation in deep neural network training still lacks theoretical understanding. In this paper, we provide a formal definition of core and shortcut features and employ evolutionary game theory to analyze the origins of shortcut bias by modeling data samples as players and their corresponding neural tangent features as strategies, assuming the existence of core and shortcut subnetworks. We find that gradient descent (GD) and stochastic gradient descent (SGD) lead to two distinct stochastically stable states, each corresponding to a different strategy. The former primarily optimizes the shortcut subnetwork, while the latter primarily optimizes the core subnetwork. We investigate the influence of these strategies on shortcut bias through a continuous stochastic differential equation, and reveal the impact of data noise and optimization noise on the formation of shortcut bias. In brief, our work employs evolutionary game theory to characterize the dynamics of shortcut bias formation and provides a theoretical view on its mitigation.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, and this is the friction.

Referee Report

3 major / 1 minor

Summary. The manuscript claims to provide a theoretical analysis of shortcut learning in deep neural networks by defining core and shortcut features and applying evolutionary game theory. Data samples are modeled as players in a game where strategies are given by neural tangent features, under the assumption of separate core and shortcut subnetworks. The key finding is that deterministic gradient descent (GD) and stochastic gradient descent (SGD) converge to different stochastically stable states, with GD favoring optimization of the shortcut subnetwork and SGD favoring the core subnetwork. The paper then uses a stochastic differential equation (SDE) to study how data noise and optimization noise affect the formation of shortcut bias.

Significance. If the central mapping from optimization dynamics to evolutionary stable states is valid, the work offers a novel perspective on why shortcut learning occurs and how SGD might preferentially avoid it compared to GD. The use of evolutionary game theory to characterize the dynamics and the SDE analysis for noise effects could contribute to theoretical understanding in the field of deep learning generalization and bias. However, without demonstrated derivations linking the game payoffs to actual gradient flows, the significance remains potential rather than realized. The approach is creative but requires substantiation to impact the literature on shortcut learning.

major comments (3)
  1. [Section 3] In the evolutionary game theory model (Section 3), the modeling choice of data samples as players and neural tangent features as strategies is introduced without derivation from the neural network loss or gradient updates. The payoff matrix and replicator dynamics are not shown to arise from the actual loss gradients with respect to the assumed core/shortcut subnetwork weights, making the stable-state claims an artifact of the external framework rather than a reduction from training dynamics.
  2. [Section 4] The central claim that GD and SGD lead to two distinct stochastically stable states, with the former optimizing the shortcut subnetwork and the latter the core subnetwork (Abstract and Section 4), lacks explicit equilibria calculations, stability proofs, or supporting derivations. No equations demonstrate how deterministic vs. noisy updates select different strategies under the defined NT-feature payoffs.
  3. [Section 5] The SDE analysis (Section 5) linking data noise and optimization noise to shortcut bias formation assumes that the stochastically stable states correspond directly to subnetwork optimization via NT features. This correspondence is not derived from the gradient flow or loss landscape, so the noise-impact conclusions inherit the unsupported mapping from the discrete game model.
minor comments (1)
  1. [Section 2] The formal definitions of core and shortcut features (Section 2) would benefit from a concrete low-dimensional example or diagram to clarify how they decompose the input features and subnetworks.
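For a two-strategy game, the equilibria and stability calculations requested in major comment 2 are mechanical: replicator dynamics reduce to a scalar ODE whose fixed points are classified by the sign of the one-dimensional Jacobian. The payoff entries below are illustrative placeholders, not the manuscript's.

```python
import numpy as np

# With two strategies, replicator dynamics reduce to a scalar ODE on the
# share y of strategy 1: dy/dt = f(y) = y (1 - y) * gap(y).
a, b, c, d = 1.0, 0.2, 0.6, 1.2   # illustrative payoff matrix U = [[a, b], [c, d]]

def gap(y):
    # payoff of strategy 1 minus payoff of strategy 0 when strategy-1 share is y
    return (c * (1 - y) + d * y) - (a * (1 - y) + b * y)

def f(y):
    return y * (1 - y) * gap(y)

# Fixed points: the pure states y = 0 and y = 1, plus the interior root of gap.
y_int = (a - c) / ((a - c) + (d - b))
for y0 in (0.0, y_int, 1.0):
    eps = 1e-6
    fprime = (f(y0 + eps) - f(y0 - eps)) / (2 * eps)  # 1-D Jacobian, numerically
    print(f"y = {y0:.3f}: f' = {fprime:+.3f} ->",
          "stable" if fprime < 0 else "unstable")
```

For these coordination-game payoffs both pure states come out stable and the interior fixed point is the unstable boundary between their basins, which is exactly the setting in which noise determines the stochastically stable state.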

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the insightful and constructive comments on our manuscript. We address each major comment point by point below, providing clarifications on the modeling choices and derivations while indicating revisions to strengthen the connections between the evolutionary game theory framework and neural network training dynamics.

read point-by-point responses
  1. Referee: [Section 3] In the evolutionary game theory model (Section 3), the modeling choice of data samples as players and neural tangent features as strategies is introduced without derivation from the neural network loss or gradient updates. The payoff matrix and replicator dynamics are not shown to arise from the actual loss gradients with respect to the assumed core/shortcut subnetwork weights, making the stable-state claims an artifact of the external framework rather than a reduction from training dynamics.

    Authors: We agree that an explicit derivation is necessary to establish the link. In the revised manuscript, we will add a new subsection in Section 3 deriving the payoff matrix directly from the expected loss function under the core and shortcut subnetwork decomposition in the neural tangent kernel regime. This derivation will show how the replicator dynamics emerge as an approximation to the gradient updates on the subnetwork weights, grounding the stable-state analysis in the training dynamics rather than an external imposition. revision: yes

  2. Referee: [Section 4] The central claim that GD and SGD lead to two distinct stochastically stable states, with the former optimizing the shortcut subnetwork and the latter the core subnetwork (Abstract and Section 4), lacks explicit equilibria calculations, stability proofs, or supporting derivations. No equations demonstrate how deterministic vs. noisy updates select different strategies under the defined NT-feature payoffs.

    Authors: The equilibria and stability analysis are based on the replicator dynamics equations presented in Section 4. To address this, we will expand the section with explicit fixed-point calculations for both the deterministic (GD) and stochastic (SGD) cases, including the Jacobian matrix for local stability proofs and the explicit effect of the noise term in shifting the equilibrium toward the core strategy. Additional equations will illustrate the selection mechanism under the NT-feature payoffs. revision: yes

  3. Referee: [Section 5] The SDE analysis (Section 5) linking data noise and optimization noise to shortcut bias formation assumes that the stochastically stable states correspond directly to subnetwork optimization via NT features. This correspondence is not derived from the gradient flow or loss landscape, so the noise-impact conclusions inherit the unsupported mapping from the discrete game model.

    Authors: We acknowledge the importance of rigorously connecting the discrete game model to the continuous SDE. In the revision, we will derive the SDE as the continuous approximation to the stochastic replicator dynamics, explicitly linking it to the underlying gradient flow with additive noise terms. This will substantiate the correspondence to core/shortcut subnetwork optimization and support the analysis of how data noise and optimization noise modulate shortcut bias formation. revision: yes
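The discrete-to-continuous link promised in response 3 can be prototyped with an Euler–Maruyama integrator. Both the drift (a coordination-game payoff gap) and the multiplicative noise term below are assumed forms for illustration, not the manuscript's derived SDE; y is read as the core-strategy share.

```python
import numpy as np

rng = np.random.default_rng(1)

# Euler-Maruyama for a stochastic replicator equation of the assumed form
#   dy = y (1 - y) gap(y) dt + sigma * y (1 - y) dW.
def gap(y):
    return -0.4 + 1.4 * y   # illustrative payoff gap with an interior root

def simulate(sigma, y0=0.2, dt=1e-3, steps=20_000):
    y = y0
    for _ in range(steps):
        drift = y * (1 - y) * gap(y)
        noise = sigma * y * (1 - y) * np.sqrt(dt) * rng.standard_normal()
        y = min(max(y + drift * dt + noise, 0.0), 1.0)  # keep y a valid share
    return y

print("deterministic (sigma=0.0):", simulate(sigma=0.0))  # decays toward 0 (all-shortcut)
print("noisy (sigma=1.5):       ", simulate(sigma=1.5))   # noise may push y into the core basin
```

Starting below the interior fixed point, the deterministic flow is absorbed by the shortcut state, while sufficiently strong noise can carry trajectories across the basin boundary, the qualitative mechanism the SDE analysis attributes to SGD.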

Circularity Check

0 steps flagged

No significant circularity; external modeling framework yields independent analysis

full rationale

The paper introduces core/shortcut subnetworks and neural tangent features as explicit modeling assumptions, then applies evolutionary game theory (players as data samples, strategies as NT features) to derive stochastically stable states under GD versus SGD via replicator dynamics and SDE. These states are computed outcomes within the assumed game, not reductions of the target phenomenon to its own inputs by construction. No self-definitional loops, fitted parameters renamed as predictions, or load-bearing self-citations appear; the framework is presented as an analytical tool rather than a first-principles derivation that collapses. The central claims about shortcut bias formation are interpretive results of the model, not tautological.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The central claim rests on two key modeling premises introduced to enable the game-theoretic analysis, with no free parameters or new postulated physical entities.

axioms (2)
  • domain assumption Existence of core and shortcut subnetworks within the neural network
    Invoked to separate the strategies available to the evolutionary game players.
  • ad hoc to paper Data samples can be modeled as players whose strategies are neural tangent features in an evolutionary game
    This is the foundational modeling step that allows application of evolutionary game theory to shortcut bias dynamics.

pith-pipeline@v0.9.0 · 5463 in / 1511 out tokens · 39201 ms · 2026-05-08T19:25:40.777486+00:00 · methodology

discussion (0)

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

  • IndisputableMonolith.Cost.FunctionalEquation washburn_uniqueness_aczel — unclear

    Relation between the paper passage and the cited Recognition theorem.

    We employ evolutionary game theory to analyze the origins of shortcut bias by modeling data samples as players and their corresponding neural tangent features as strategies, assuming the existence of core and shortcut subnetworks.

  • IndisputableMonolith.Foundation.BranchSelection branch_selection — unclear

    Relation between the paper passage and the cited Recognition theorem.

    The payoff matrix U(t) = ((1+γw2−w1, −γ(1+γw2−w1)), (γ(1−w2−γw1), 1−w2−γw1)) ... full-batch gradient descent favors shortcuts, whereas mini-batch SGD favors core features.

What do these tags mean?
  • matches — The paper's claim is directly supported by a theorem in the formal canon.
  • supports — The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
  • extends — The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
  • uses — The paper appears to rely on the theorem as machinery.
  • contradicts — The paper's claim conflicts with a theorem or certificate in the canon.
  • unclear — Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

56 extracted references · 6 canonical work pages · 1 internal anchor
