pith. machine review for the scientific record.

arxiv: 2603.09145 · v3 · submitted 2026-03-10 · 💻 cs.LG · cs.AI

Recognition: no theorem link

Causally Sufficient and Necessary Feature Expansion for Class-Incremental Learning

Authors on Pith: no claims yet

Pith reviewed 2026-05-15 13:35 UTC · model grok-4.3

classification 💻 cs.LG · cs.AI
keywords class-incremental learning · causal learning · feature expansion · counterfactual generation · catastrophic forgetting · regularization method · probability of necessity and sufficiency

The pith

Causal PNS regularization guides feature expansion to avoid collisions in class-incremental learning

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Expansion-based class-incremental learning freezes old features to avoid forgetting, but the task-specific features learned for new tasks can collide with them. The paper identifies spurious correlations as the root cause, operating in two scopes: intra-task, through shortcut reliance, and inter-task, through semantic confusion between visually similar classes. It introduces CPNS, an extension of the probability of necessity and sufficiency, to measure how causally complete and separable the representations are. A twin-network counterfactual generator produces intra-task counterfactual features and inter-task interfering features to minimize the corresponding PNS risks. The result is a plug-and-play regularizer that aims to let features expand without drifting into the feature spaces of earlier tasks.

Core claim

The paper extends PNS to CPNS for CIL: a measure that quantifies both the causal completeness of intra-task representations and the separability of inter-task representations. By minimizing the associated PNS risks with a dual-scope counterfactual generator built on twin networks, feature expansion can be guided to mitigate collisions while preserving old knowledge.
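The PNS machinery can be made concrete with a toy score. The sketch below is our own minimal reading, not the paper's equations: it treats Pearl's lower bound PNS ≥ max(0, P(y | x) − P(y | x′)), where x′ is a counterfactual feature vector with the causal content perturbed, as a per-sample score, and turns one minus its mean into a risk. The function names and the softmax-classifier setup are illustrative assumptions.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def pns_risk(logits_factual, logits_counterfactual, labels):
    """Toy PNS-style risk. Pearl's lower bound for a binary intervention:
    PNS >= max(0, P(y | do(x)) - P(y | do(x'))). A small bound means the
    prediction does not depend on the presumed causal content."""
    n = len(labels)
    p_f = softmax(logits_factual)[np.arange(n), labels]
    p_cf = softmax(logits_counterfactual)[np.arange(n), labels]
    pns_lb = np.clip(p_f - p_cf, 0.0, 1.0)
    # Risk: penalize representations whose PNS lower bound is small.
    return float(1.0 - pns_lb.mean())
```

Under this reading, a representation whose predictions survive counterfactual perturbation of its causal content (p_cf stays high, the shortcut-reliance failure mode) gets a small PNS bound and hence a large risk.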

What carries the argument

CPNS regularization via a dual-scope counterfactual generator that creates intra-task counterfactuals for causal completeness and inter-task interfering features for separability.
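As a rough intuition for what "dual-scope" generation means, here is a toy sketch under strong assumptions we are adding ourselves: features are split into presumed causal and nuisance dimensions by a known boolean mask. The paper's twin networks learn this decomposition; nothing below is the authors' actual generator.

```python
import numpy as np

def intra_task_cf(z_task, causal_mask, rng):
    # Intra-task counterfactual: keep each sample's causal dims and
    # resample nuisance dims from other samples of the same task,
    # perturbing only the presumed shortcut features.
    perm = rng.permutation(len(z_task))
    z_cf = z_task.copy()
    z_cf[:, ~causal_mask] = z_task[perm][:, ~causal_mask]
    return z_cf

def inter_task_cf(z_task, z_other, causal_mask, rng):
    # Inter-task interfering feature: splice another task's causal dims
    # into this task's nuisance context, probing inter-task separability.
    idx = rng.integers(0, len(z_other), size=len(z_task))
    z_cf = z_task.copy()
    z_cf[:, causal_mask] = z_other[idx][:, causal_mask]
    return z_cf
```

The intra-task variant feeds the completeness term (predictions should change when causal content is kept but context shifts only if features are shortcut-reliant); the inter-task variant feeds the separability term.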

Load-bearing premise

Spurious feature correlations primarily cause the feature collisions observed in expansion-based class-incremental learning.

What would settle it

Compare the feature space overlap between tasks in models trained with and without the proposed CPNS regularization on standard CIL benchmarks like CIFAR or ImageNet subsets; if overlap remains high despite the method, the claim would be weakened.
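One concrete way to run that check is linear CKA (Kornblith et al., ref [21] in the paper's bibliography), computed between old-task and new-task feature matrices on a shared probe set; persistently high CKA would indicate the overlap the method is supposed to reduce. A minimal sketch, with the metric being our choice rather than necessarily the paper's:

```python
import numpy as np

def linear_cka(X, Y):
    # Linear CKA between feature matrices X (n, d1) and Y (n, d2),
    # rows paired by sample. Returns a value in [0, 1]: 1 for identical
    # representations (invariant to rotation and isotropic scaling),
    # near 0 for uncorrelated features.
    X = X - X.mean(axis=0)
    Y = Y - Y.mean(axis=0)
    num = np.linalg.norm(Y.T @ X, "fro") ** 2
    den = np.linalg.norm(X.T @ X, "fro") * np.linalg.norm(Y.T @ Y, "fro")
    return float(num / den)
```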

Figures

Figures reproduced from arXiv: 2603.09145 by Bin Liu, Jiangtao Hu, Jielei Chu, Jie Wang, Tianrui Li, Ya Liu, Zhen Zhang.

Figure 1: (a) and (b): Old→New misclassification rates grouped by semantic overlap on CUB200. Old classes are …
Figure 2: Illustration of feature suppression and collision.
Figure 3: Structural Causal Model (SCM) for expansion …
Figure 4: Accuracy curves for CPNS on various scenarios and baselines.
Figure 5: Validation of intra-task and inter-task counterfactual …
Figure 6: Examples of intra-task and inter-task counterfactual …
Figure 7: Complementary comparison with counterfactual …
Figure 8: Hyperparameter sensitivity analysis on CIFAR-100.
Figure 9: The parameter sensitivity experiment of β (Eq. 9) in the CIFAR-100 10-10 scenario. The method uses two hyperparameters, λ and γ, to balance PNS_intra against PNS_inter and to weight the KL divergence during counterfactual generation; the best combination is found by grid search on average incremental accuracy (Avg). …
Figure 10: t-SNE visualization on the CUB200 dataset. Panels (a)–(d) correspond to the baseline DER, PNS …
Figure 11: Grad-CAM visualization on the CUB200 dataset.
read the original abstract

Current expansion-based methods for Class Incremental Learning (CIL) effectively mitigate catastrophic forgetting by freezing old features. However, such task-specific features learned from the new task may collide with the old features. From a causal perspective, spurious feature correlations are the main cause of this collision, manifesting in two scopes: (i) guided by empirical risk minimization (ERM), intra-task spurious correlations cause task-specific features to rely on shortcut features. These non-robust features are vulnerable to interference, inevitably drifting into the feature space of other tasks; (ii) inter-task spurious correlations induce semantic confusion between visually similar classes across tasks. To address this, we propose a Probability of Necessity and Sufficiency (PNS)-based regularization method to guide feature expansion in CIL. Specifically, we first extend the definition of PNS to expansion-based CIL, termed CPNS, which quantifies both the causal completeness of intra-task representations and the separability of inter-task representations. We then introduce a dual-scope counterfactual generator based on twin networks to ensure the measurement of CPNS, which simultaneously generates: (i) intra-task counterfactual features to minimize intra-task PNS risk and ensure causal completeness of task-specific features, and (ii) inter-task interfering features to minimize inter-task PNS risk, ensuring the separability of inter-task representations. Theoretical analyses confirm its reliability. The regularization is a plug-and-play method for expansion-based CIL to mitigate feature collision. Extensive experiments demonstrate the effectiveness of the proposed method.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 2 minor

Summary. The manuscript proposes CPNS, an extension of the probability of necessity and sufficiency (PNS) to class-incremental learning (CIL), to guide feature expansion and mitigate collisions caused by spurious correlations. It introduces a dual-scope counterfactual generator using twin networks to produce intra-task counterfactual features and inter-task interfering features, minimizing intra- and inter-task PNS risks. The method is presented as a plug-and-play regularization with theoretical reliability and empirical effectiveness in reducing feature collision in expansion-based CIL.

Significance. If the theoretical analyses hold and the counterfactuals are valid, this could provide a novel causal framework for improving feature separability in CIL, addressing limitations of freezing-based methods. The plug-and-play aspect makes it potentially impactful for practical CIL systems, though its significance depends on demonstrating that the gains stem from the causal mechanism rather than general regularization.

major comments (3)
  1. [Abstract] The central claim that spurious feature correlations are the main cause of feature collision (manifesting as intra-task shortcuts and inter-task semantic confusion) is asserted without derivation or supporting analysis; the manuscript must show why this factor dominates over others, such as optimization dynamics or representation capacity.
  2. [Theoretical Analysis] Theoretical section on CPNS definition: The extension of PNS to CPNS quantifies causal completeness and separability using the same learned feature representations that the regularization acts upon, creating a potential circularity; without explicit identification conditions (known SCM, no unmeasured confounding, positivity) for the high-dimensional ERM feature space, the causal interpretation of the regularization term lacks grounding.
  3. [Experiments] Experiments section: The abstract states that extensive experiments demonstrate effectiveness, yet no quantitative results, error bars, ablation studies, or specific comparisons to baselines are referenced; this prevents assessment of whether reported gains arise from the claimed CPNS mechanism or from generic regularization effects.
minor comments (2)
  1. [Method] The dual-scope counterfactual generator description should include pseudocode or explicit equations for how intra-task and inter-task features are generated and how the twin networks are trained.
  2. [Notation] Notation for CPNS and the risk terms should be introduced with a clear table or definitions list to avoid ambiguity when reading the theoretical claims.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the constructive comments, which help clarify the presentation of our causal framework for class-incremental learning. We address each major comment below and indicate the corresponding revisions.

read point-by-point responses
  1. Referee: [Abstract] The central claim that spurious feature correlations are the main cause of feature collision (manifesting as intra-task shortcuts and inter-task semantic confusion) is asserted without derivation or supporting analysis; the manuscript must show why this dominates over other factors such as optimization dynamics or representation capacity.

    Authors: We agree that the abstract states the claim concisely. The full manuscript derives this in the introduction and Section 3 by constructing a causal graph for expansion-based CIL under ERM, showing that shortcut features arise from intra-task confounding and inter-task semantic overlap even when representation capacity is sufficient. To strengthen the dominance argument, we will add a short illustrative example and a paragraph contrasting with optimization dynamics (e.g., via a controlled simulation where capacity is fixed but spurious paths remain). revision: partial

  2. Referee: [Theoretical Analysis] The extension of PNS to CPNS quantifies causal completeness and separability using the same learned feature representations that the regularization acts upon, creating a potential circularity; without explicit identification conditions (known SCM, no unmeasured confounding, positivity) for the high-dimensional ERM feature space, the causal interpretation of the regularization term lacks grounding.

    Authors: We acknowledge the circularity concern. CPNS is computed on interventional distributions produced by the twin-network counterfactual generators, which approximate interventions independently of the final classifier features. We will revise the theoretical section to explicitly list the identification assumptions: (i) positivity in the feature space, (ii) no unmeasured confounding between task data and the learned representations (standard in ERM settings), and (iii) the structural causal model is sufficiently approximated by the dual generators. This grounds the causal claims without altering the method. revision: yes

  3. Referee: [Experiments] The abstract states that extensive experiments demonstrate effectiveness, yet no quantitative results, error bars, ablation studies, or specific comparisons to baselines are referenced; this prevents assessment of whether reported gains arise from the claimed CPNS mechanism or from generic regularization effects.

    Authors: The abstract summarizes results at a high level for brevity, while Section 5 contains the full quantitative evaluation (accuracy tables with standard deviations over 5 runs, ablation on intra- vs. inter-task generators, and comparisons to recent expansion-based CIL baselines). To improve accessibility, we will expand the abstract with one sentence referencing key gains (e.g., +3.2% average accuracy on CIFAR-100) and direct readers to the corresponding tables and ablation figures. revision: yes

Circularity Check

1 steps flagged

CPNS definition and twin-network generator form a self-referential loop on the same learned features

specific steps
  1. self definitional [Abstract]
    "we first extend the definition of PNS to expansion-based CIL, termed CPNS, which quantifies both the causal completeness of intra-task representations and the separability of inter-task representations. We then introduce a dual-scope counterfactual generator based on twin networks to ensure the measurement of CPNS, which simultaneously generates: (i) intra-task counterfactual features to minimize intra-task PNS risk and ensure causal completeness of task-specific features, and (ii) inter-task interfering features to minimize inter-task PNS risk"

    CPNS is explicitly defined as a quantifier of properties (completeness, separability) of the task-specific features; the generator is then introduced solely to produce the counterfactuals needed to measure and minimize CPNS risk on those identical features. The regularization therefore enforces the definitional properties it claims to quantify, reducing the claimed causal guidance to a self-referential adjustment of the input representations.

full rationale

The paper's core derivation extends PNS to CPNS to quantify intra-task completeness and inter-task separability of the very representations produced by ERM-trained expansion, then deploys a dual-scope twin-network generator whose sole purpose is to produce counterfactuals that minimize the CPNS risk on those same representations. Because no independent SCM, positivity conditions, or external identification strategy is supplied, the regularization term reduces to enforcing the definitional properties by construction rather than deriving new causal constraints. This matches the self-definitional pattern and yields partial circularity (score 6) while leaving room for empirical utility.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 2 invented entities

The central claim rests on the domain assumption that spurious correlations drive feature collisions and on the new definitions of CPNS and the counterfactual generator; no free parameters or independent evidence for the invented components are stated in the abstract.

axioms (1)
  • domain assumption Spurious feature correlations are the main cause of collision between task-specific features in expansion-based CIL
    Explicitly stated as the causal perspective motivating the work.
invented entities (2)
  • CPNS no independent evidence
    purpose: Quantify causal completeness of intra-task representations and separability of inter-task representations
    New extension of PNS defined for CIL.
  • dual-scope counterfactual generator no independent evidence
    purpose: Generate intra-task counterfactual features and inter-task interfering features via twin networks
    New component introduced to measure CPNS.

pith-pipeline@v0.9.0 · 5580 in / 1332 out tokens · 38771 ms · 2026-05-15T13:35:51.534430+00:00 · methodology

discussion (0)


Reference graph

Works this paper leans on

54 extracted references · 54 canonical work pages · 2 internal anchors

  1. [1]

    A comprehensive survey of continual learning: Theory, method and application,

    L. Wang, X. Zhang, H. Su, and J. Zhu, “A comprehensive survey of continual learning: Theory, method and application,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 46, no. 8, pp. 5362–5383, 2024

  2. [2]

    Schedule-robust continual learning,

    R. Wang, M. Ciccone, M. Pontil, and C. Ciliberto, “Schedule-robust continual learning,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 48, no. 2, pp. 1424–1436, 2026

  3. [3]

    iCaRL: Incremental classifier and representation learning,

    S.-A. Rebuffi, A. Kolesnikov, G. Sperl, and C. H. Lampert, “iCaRL: Incremental classifier and representation learning,” in Proceedings of the IEEE conference on Computer Vision and Pattern Recognition, 2017, pp. 2001–2010

  4. [4]

    Large scale incremental learning,

    Y. Wu, Y. Chen, L. Wang, Y. Ye, Z. Liu, Y. Guo, and Y. Fu, “Large scale incremental learning,” in Proceedings of the IEEE conference on Computer Vision and Pattern Recognition, 2019, pp. 374–382

  5. [5]

    Class-incremental learning: survey and performance evaluation on image classification,

    M. Masana, X. Liu, B. Twardowski, M. Menta, A. D. Bagdanov, and J. Van De Weijer, “Class-incremental learning: survey and performance evaluation on image classification,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 45, no. 5, pp. 5513–5533, 2022

  6. [6]

    Class-incremental learning: A survey,

    D.-W. Zhou, Q.-W. Wang, Z.-H. Qi, H.-J. Ye, D.-C. Zhan, and Z. Liu, “Class-incremental learning: A survey,” IEEE Transactions on Pattern Analysis and Machine Intelligence, pp. 9851–9873, 2024

  7. [7]

    CRNet: A fast continual learning framework with random theory,

    D. Li and Z. Zeng, “CRNet: A fast continual learning framework with random theory,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 45, no. 9, pp. 10731–10744, 2023

  8. [8]

    Catastrophic interference in connectionist networks: The sequential learning problem,

    M. McCloskey and N. J. Cohen, “Catastrophic interference in connectionist networks: The sequential learning problem,” in Psychology of Learning and Motivation, 1989, vol. 24, pp. 109– 165

  9. [9]

    Adaptive progressive continual learning,

    J. Xu, J. Ma, X. Gao, and Z. Zhu, “Adaptive progressive continual learning,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 44, no. 10, pp. 6715–6728, 2022

  10. [10]

    Adaptive resonance theory: How a brain learns to consciously attend, learn, and recognize a changing world,

    S. Grossberg, “Adaptive resonance theory: How a brain learns to consciously attend, learn, and recognize a changing world,” Neural Networks, vol. 37, pp. 1–47, 2013

  11. [11]

    DyTox: Transformers for continual learning with dynamic token expansion,

    A. Douillard, A. Ramé, G. Couairon, and M. Cord, “DyTox: Transformers for continual learning with dynamic token expansion,” in Proceedings of the IEEE conference on Computer Vision and Pattern Recognition, 2022, pp. 9285–9295

  12. [12]

    Resolving task confusion in dynamic expansion architectures for class incremental learning,

    B. Huang, Z. Chen, P. Zhou, J. Chen, and Z. Wu, “Resolving task confusion in dynamic expansion architectures for class incremental learning,” in Proceedings of the AAAI Conference on Artificial Intelligence, vol. 37, no. 1, 2023, pp. 908–916

  13. [13]

    FOSTER: Feature boosting and compression for class-incremental learning,

    F.-Y. Wang, D.-W. Zhou, H.-J. Ye, and D.-C. Zhan, “FOSTER: Feature boosting and compression for class-incremental learning,” in European Conference on Computer Vision, 2022, pp. 398–414

  14. [14]

    DER: Dynamically expandable representation for class incremental learning,

    S. Yan, J. Xie, and X. He, “DER: Dynamically expandable representation for class incremental learning,” in Proceedings of the IEEE conference on Computer Vision and Pattern Recognition, 2021, pp. 3014–3023

  15. [15]

    Multi-granularity regularized re-balancing for class incremental learning,

    H. Chen, Y. Wang, and Q. Hu, “Multi-granularity regularized re-balancing for class incremental learning,” IEEE Transactions on Knowledge and Data Engineering, vol. 35, no. 7, pp. 7263– 7277, 2023

  16. [16]

    Complementary learning subnetworks towards parameter-efficient class-incremental learning,

    D. Li, Z. Zeng, W. Dai, and P. N. Suganthan, “Complementary learning subnetworks towards parameter-efficient class-incremental learning,” IEEE Transactions on Knowledge and Data Engineering, vol. 37, no. 6, pp. 3240–3252, 2025

  17. [17]

    Task-agnostic guided feature expansion for class-incremental learning,

    B. Zheng, D.-W. Zhou, H.-J. Ye, and D.-C. Zhan, “Task-agnostic guided feature expansion for class-incremental learning,” in Proceedings of the Computer Vision and Pattern Recognition Conference, 2025, pp. 10 099–10 109

  18. [18]

    An overview of statistical learning theory,

    V. N. Vapnik, “An overview of statistical learning theory,” IEEE Transactions on Neural Networks, vol. 10, no. 5, pp. 988–999, 1999

  19. [19]

    The caltech-ucsd birds-200-2011 dataset,

    C. Wah, S. Branson, P. Welinder, P. Perona, and S. Belongie, “The caltech-ucsd birds-200-2011 dataset,” 2011

  20. [20]

    Prototypical verbalizer for prompt-based few-shot tuning,

    G. Cui, S. Hu, N. Ding, L. Huang, and Z. Liu, “Prototypical verbalizer for prompt-based few-shot tuning,” in Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2022, pp. 7014–7024

  21. [21]

    Similarity of neural network representations revisited,

    S. Kornblith, M. Norouzi, H. Lee, and G. Hinton, “Similarity of neural network representations revisited,” in International Conference on Machine Learning. PMLR, 2019, pp. 3519–3529

  22. [22]

    Comprehensive quality assessment method for neutron radiographic images based on CNN and visual salience,

    Z. Zhang, C.-B. Meng, X.-L. Jiang, C.-Y. Zhao, S. Qiao, and T. Zhang, “Comprehensive quality assessment method for neutron radiographic images based on CNN and visual salience,” Nuclear Science and Techniques, vol. 36, no. 7, p. 118, 2025

  23. [23]

    ImageNet: A large-scale hierarchical image database,

    J. Deng, W. Dong, R. Socher, L.-J. Li, K. Li, and L. Fei-Fei, “ImageNet: A large-scale hierarchical image database,” in 2009 IEEE conference on computer vision and pattern recognition. IEEE, 2009, pp. 248–255

  24. [24]

    Pearl, Causality

    J. Pearl, Causality. Cambridge university press, 2009

  25. [25]

    Invariant learning via probability of sufficient and necessary causes,

    M. Yang, Z. Fang, Y. Zhang, Y. Du, F. Liu, J.-F. Ton, J. Wang, and J. Wang, “Invariant learning via probability of sufficient and necessary causes,” Advances in Neural Information Processing Systems, vol. 36, pp. 79832–79857, 2023

  26. [26]

    Progressive Neural Networks

    A. A. Rusu, N. C. Rabinowitz, G. Desjardins, H. Soyer, J. Kirkpatrick, K. Kavukcuoglu, R. Pascanu, and R. Hadsell, “Progressive neural networks,” arXiv preprint arXiv:1606.04671, 2016

  27. [27]

    Progress & compress: A scalable framework for continual learning,

    J. Schwarz, W. Czarnecki, J. Luketina, A. Grabska-Barwinska, Y. W. Teh, R. Pascanu, and R. Hadsell, “Progress & compress: A scalable framework for continual learning,” in International Conference on Machine Learning. PMLR, 2018, pp. 4528–4537

  28. [28]

    Beef: Bi-compatible class-incremental learning via energy-based expansion and fusion,

    F.-Y. Wang, D.-W. Zhou, L. Liu, H.-J. Ye, Y. Bian, D.-C. Zhan, and P. Zhao, “Beef: Bi-compatible class-incremental learning via energy-based expansion and fusion,” in International Conference on Learning Representations, 2022

  29. [29]

    Causal representation learning from multi-modal biomedical observations,

    Y. Sun, L. Kong, G. Chen, L. Li, G. Luo, Z. Li, Y. Zhang, Y. Zheng, M. Yang, P. Stojanov et al., “Causal representation learning from multi-modal biomedical observations,” ArXiv, pp. arXiv–2411, 2025

  30. [30]

    Multi-view causal representation learning with partial observability,

    D. Yao, D. Xu, S. Lachapelle, S. Magliacane, P. Taslakian, G. Martius, J. v. Kügelgen, and F. Locatello, “Multi-view causal representation learning with partial observability,” in 12th International Conference on Learning Representations, 2024

  31. [31]

    Interventional causal representation learning,

    K. Ahuja, D. Mahajan, Y. Wang, and Y. Bengio, “Interventional causal representation learning,” in International Conference on Machine Learning. PMLR, 2023, pp. 372–407

  32. [32]

    Weakly supervised causal representation learning,

    J. Brehmer, P. De Haan, P. Lippe, and T. S. Cohen, “Weakly supervised causal representation learning,” Advances in Neural Information Processing Systems, vol. 35, pp. 38319–38331, 2022

  33. [33]

    Counterfactual fairness,

    M. J. Kusner, J. Loftus, C. Russell, and R. Silva, “Counterfactual fairness,” Advances in Neural Information Processing Systems, vol. 30, 2017

  34. [34]

    Counterfactual samples synthesizing and training for robust visual question answering,

    L. Chen, Y. Zheng, Y. Niu, H. Zhang, and J. Xiao, “Counterfactual samples synthesizing and training for robust visual question answering,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 45, no. 11, pp. 13218–13234, 2023

  35. [35]

    Counterfactual visual explanations,

    Y. Goyal, Z. Wu, J. Ernst, D. Batra, D. Parikh, and S. Lee, “Counterfactual visual explanations,” in International Conference on Machine Learning, ser. Proceedings of Machine Learning Research, K. Chaudhuri and R. Salakhutdinov, Eds., vol. 97. PMLR, 09–15 Jun 2019, pp. 2376–2384

  36. [36]

    Rademacher complexity for adversarially robust generalization,

    D. Yin, R. Kannan, and P. Bartlett, “Rademacher complexity for adversarially robust generalization,” in International Conference on Machine Learning. PMLR, 2019, pp. 7085–7094

  37. [37]

    Learning multiple layers of features from tiny images,

    A. Krizhevsky, G. Hinton et al., “Learning multiple layers of features from tiny images,” 2009

  38. [38]

    Birds 525 species- image classification,

    “Birds 525 species- image classification,” 2023

  39. [39]

    Automated flower classification over a large number of classes,

    M.-E. Nilsback and A. Zisserman, “Automated flower classification over a large number of classes,” in 2008 Sixth Indian Conference on Computer Vision, Graphics & Image Processing. IEEE, 2008, pp. 722–729

  40. [40]

    Food-101 - mining discriminative components with random forests,

    L. Bossard, M. Guillaumin, and L. V. Gool, “Food-101 - mining discriminative components with random forests,” in European Conference on Computer Vision, 2014

  41. [41]

    Learning a unified classifier incrementally via rebalancing,

    S. Hou, X. Pan, C. C. Loy, Z. Wang, and D. Lin, “Learning a unified classifier incrementally via rebalancing,” in Proceedings of the IEEE conference on Computer Vision and Pattern Recognition, 2019, pp. 831–839

  42. [42]

    Semantic drift compensation for class-incremental learning,

    L. Yu, B. Twardowski, X. Liu, L. Herranz, K. Wang, Y. Cheng, S. Jui, and J. v. d. Weijer, “Semantic drift compensation for class-incremental learning,” in Proceedings of the IEEE conference on Computer Vision and Pattern Recognition, 2020, pp. 6982–6991

  43. [43]

    Beef: Bi-compatible class-incremental learning via energy-based expansion and fusion,

    F.-Y. Wang, D.-W. Zhou, L. Liu, H.-J. Ye, Y. Bian, D.-C. Zhan, and P. Zhao, “Beef: Bi-compatible class-incremental learning via energy-based expansion and fusion,” in International Conference on Learning Representations, 2023

  44. [44]

    A protocol for evaluating model interpretation methods from visual explanations,

    H. Behzadi-Khormouji and J. Oramas, “A protocol for evaluating model interpretation methods from visual explanations,” in Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, 2023, pp. 1421–1429

  45. [45]

    Towards Deep Learning Models Resistant to Adversarial Attacks

    A. Madry, A. Makelov, L. Schmidt, D. Tsipras, and A. Vladu, “Towards deep learning models resistant to adversarial attacks,” arXiv preprint arXiv:1706.06083, 2017

  46. [46]

    Transport-based counterfactual models,

    L. De Lara, A. González-Sanz, N. Asher, L. Risser, and J.-M. Loubes, “Transport-based counterfactual models,” Journal of Machine Learning Research, vol. 25, no. 1, Jan. 2024

  47. [47]

    Diffusion visual counterfactual explanations,

    M. Augustin, V. Boreiko, F. Croce, and M. Hein, “Diffusion visual counterfactual explanations,” Advances in Neural Information Processing Systems, vol. 35, pp. 364–377, 2022

  48. [48]

    Towards the causal complete cause of multi-modal representation learning,

    J. Wang, S. Zhao, W. Qiang, J. Li, C. Zheng, F. Sun, and H. Xiong, “Towards the causal complete cause of multi-modal representation learning,” arXiv preprint arXiv:2407.14058, 2024

  49. [49]

    Self-supervised learning from a multi-view perspective,

    Y.-H. Tsai, Y. Wu, R. Salakhutdinov, and L.-P. Morency, “Self-supervised learning from a multi-view perspective,” in Proceedings of the International Conference on Learning Representations, 2021

  50. [50]

    Statistical aspects of Wasserstein distances,

    V. M. Panaretos and Y. Zemel, “Statistical aspects of Wasserstein distances,” Annual Review of Statistics and its Application, vol. 6, no. 1, pp. 405–431, 2019

  51. [51]

    A tutorial on the cross-entropy method,

    P.-T. De Boer, D. P. Kroese, S. Mannor, and R. Y. Rubinstein, “A tutorial on the cross-entropy method,” Annals of Operations Research, vol. 134, no. 1, pp. 19–67, 2005

  52. [52]

    Approximating the Kullback–Leibler divergence between Gaussian mixture models,

    J. R. Hershey and P. A. Olsen, “Approximating the Kullback–Leibler divergence between Gaussian mixture models,” in 2007 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP ’07), vol. 4. IEEE, 2007, pp. IV–317

  53. [53]

    Visualizing data using t-SNE,

    L. Van der Maaten and G. Hinton, “Visualizing data using t-SNE,” Journal of Machine Learning Research, vol. 9, no. 11, 2008

  54. [54]

    Grad-CAM: Visual explanations from deep networks via gradient-based localization,

    R. R. Selvaraju, M. Cogswell, A. Das, R. Vedantam, D. Parikh, and D. Batra, “Grad-CAM: Visual explanations from deep networks via gradient-based localization,” in Proceedings of the IEEE International Conference on Computer Vision, 2017, pp. 618–626