arxiv: 2606.24161 · v1 · pith:INVXRWYQnew · submitted 2026-06-23 · 💻 cs.CV

Dual-Branch Cross-Projection Debiasing through Diffusion-based Disentanglement

Pith reviewed 2026-06-26 01:05 UTC · model grok-4.3

classification 💻 cs.CV
keywords debiasing · foundation models · spurious correlations · diffusion models · prompt tuning · worst group accuracy · group unsupervised learning

The pith

Dual-branch cross-projection removes spurious features via diffusion-disentangled concepts without group labels.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Foundation models trained on biased data latch onto non-causal attributes and generalize poorly to minority groups, a problem that is harder to correct when group labels are unavailable. The paper first uses diffusion models to disentangle concept representations and mine reliable spurious attributes through a confidence-guided process. It then applies a dual-branch prompt-tuning scheme that isolates target and spurious representations and removes the latter via cross null-space projection. The authors report state-of-the-art worst-group accuracy among group-unsupervised methods on four benchmarks while tuning at most 0.22% of model parameters.

Core claim

Confidence-guided Bias Concept Mining (CBCM) extracts semantically aligned spurious attributes from diffusion-disentangled representations without annotations; Dual-branch Cross-projection Debiasing (DCD) then separates target and spurious features into parallel branches and explicitly nulls spurious directions while preserving target semantics.

What carries the argument

Dual-branch Cross-projection Debiasing (DCD) paired with Confidence-guided Bias Concept Mining (CBCM), where diffusion disentanglement supplies pseudo-supervision and cross null-space projection removes spurious information across branches.
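
As a rough illustration of the null-space step, the sketch below builds the standard orthogonal projector P = I - pinv(S) S for a matrix S whose rows are spurious directions and applies it to a batch of features. This is generic linear algebra in NumPy, not the paper's implementation: the function name `null_space_projector`, the shapes, and the random data are assumptions, and the paper's cross-branch variant (each branch projected against the other) is not reproduced here.

```python
import numpy as np


def null_space_projector(spurious_dirs: np.ndarray) -> np.ndarray:
    """Return P = I - pinv(S) @ S, the orthogonal projector onto the
    complement of the row space of S (rows = spurious directions)."""
    s = np.atleast_2d(spurious_dirs)
    dim = s.shape[1]
    return np.eye(dim) - np.linalg.pinv(s) @ s


# Hypothetical data: 2 spurious directions and 5 feature vectors in 8-D.
rng = np.random.default_rng(0)
spurious = rng.normal(size=(2, 8))
features = rng.normal(size=(5, 8))

P = null_space_projector(spurious)
debiased = features @ P  # P is symmetric, so right-multiplying projects each row

# Largest remaining component of any debiased feature along any spurious direction.
residual = np.abs(debiased @ spurious.T).max()
print(f"max residual along spurious directions: {residual:.2e}")
```

After projection, `residual` is zero up to floating-point error, so no linear readout along the given directions survives. Whether target semantics survive depends on how much the target features overlap with those directions, which is the point the referee report presses on.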

If this is right

  • Group-unsupervised debiasing becomes feasible on foundation models by tuning only a small set of prompt parameters (at most 0.22% of the model in the reported experiments).
  • Worst-group performance improves without explicit group or attribute labels at training time.
  • The same pipeline applies across multiple vision benchmarks with consistent gains among unsupervised methods.
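
Worst-group accuracy, the metric behind these claims, is the minimum accuracy over the (label, spurious attribute) groups. A minimal sketch, assuming integer labels and a binary spurious attribute; the function name and the toy arrays are hypothetical, and group annotations are used here only for evaluation, not for training.

```python
import numpy as np


def worst_group_accuracy(y_true, y_pred, spurious):
    """Return (worst-group accuracy, per-group accuracies).

    A group is one (label, spurious attribute) combination; empty groups are skipped.
    """
    y_true, y_pred, spurious = map(np.asarray, (y_true, y_pred, spurious))
    accs = {}
    for y in np.unique(y_true):
        for a in np.unique(spurious):
            mask = (y_true == y) & (spurious == a)
            if mask.any():
                accs[(int(y), int(a))] = float((y_pred[mask] == y_true[mask]).mean())
    return min(accs.values()), accs


# Toy case: the predictor simply copies the spurious attribute.
y_true = [0, 0, 0, 0, 1, 1, 1, 1]
spurious = [0, 0, 0, 1, 0, 1, 1, 1]
y_pred = [0, 0, 0, 1, 0, 1, 1, 1]

wga, per_group = worst_group_accuracy(y_true, y_pred, spurious)
avg = float((np.asarray(y_pred) == np.asarray(y_true)).mean())
print(avg, wga)  # 0.75 average accuracy, 0.0 on the worst group
```

The toy predictor scores well on average because the majority groups agree with the spurious attribute, yet it fails completely on the two minority groups, which is the failure mode the metric is designed to expose.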

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The approach could be tested on language or multimodal models where diffusion-style disentanglement is replaced by other generative priors.
  • If the mined concepts prove stable across datasets, they might serve as reusable bias detectors for downstream auditing tasks.

Load-bearing premise

Diffusion-disentangled concept representations can identify spurious attributes that match real-world biases without any attribute annotations.

What would settle it

A controlled test on a dataset whose known spurious correlations the diffusion model fails to separate, showing no gain in worst-group accuracy over a single-branch baseline.

Figures

Figures reproduced from arXiv: 2606.24161 by De Cheng, Dongsheng Li, Lingfeng He, Nannan Wang, Xiangqian Zhao, Xinyang Jiang, Zhipeng Xu, Zilong Wang.

Figure 1: Unlike single-branch methods that may damage …
Figure 2: Overview of the proposed framework. The framework consists of two stages: bias concept mining and dual-branch …
Figure 3: Group-wise recall of the pseudo spurious labels …
Figure 4: Qualitative analysis of concept manipulation and quantized reconstruction in diffusion generation. Middle: original …
Figure 5: Effect of the sampling exponent 𝛼 in the weighted sampling strategy across Waterbirds, CelebA, and MetaShift. Results are averaged over three random seeds, and the error bars denote the standard deviation across different runs. … over-emphasized, the model may overfit these samples and lose its ability to generalize well across all groups. Across all three datasets, the best worst-group accuracy is achieved …
Figure 6: Additional qualitative results of concept manipulation and quantized reconstruction on CMNIST …
Figure 7: Waterbirds GradCAM
Original abstract

Foundation models trained on biased datasets often rely on spurious correlations between target labels and non-causal attributes, resulting in poor generalization on minority groups. Bias mitigation remains challenging due to two fundamental issues. First, when group labels are unavailable, existing group-unsupervised methods typically infer spurious attributes implicitly from model behavior, making it difficult to identify spurious factors that are semantically aligned with real-world biases. Second, even with pseudo spurious supervision, most existing debiasing methods follow a single-branch design that operates within a single shared feature space, where target and spurious attributes are intrinsically entangled. To address the first challenge, we introduce Confidence-guided Bias Concept Mining (CBCM), which leverages diffusion-disentangled, semantically grounded concept representations to identify reliable spurious attributes without attribute annotations. To address the second challenge, we propose Dual-branch Cross-projection Debiasing (DCD), a prompt-tuning framework that separates target and spurious representations into two branches and explicitly removes spurious information through cross null-space projection while preserving target-relevant semantics. Extensive experiments on four benchmark datasets show that our method achieves state-of-the-art worst group accuracy among group-unsupervised approaches, while tuning at most 0.22% of the model parameters. The source code is available in the supplementary materials.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper argues that foundation models rely on spurious correlations that are hard to mitigate when group labels are unavailable, and proposes two components to address this: Confidence-guided Bias Concept Mining (CBCM), which uses diffusion models to produce disentangled concept representations for identifying spurious attributes without annotations, and Dual-branch Cross-projection Debiasing (DCD), a prompt-tuning method that separates target and spurious features into two branches and applies cross null-space projection to remove spurious information. Experiments on four benchmarks are reported to reach state-of-the-art worst-group accuracy among group-unsupervised methods while tuning at most 0.22% of parameters; the source code is provided in the supplementary materials.

Significance. If the central claims hold, the work would be significant for group-unsupervised bias mitigation by offering an explicit mechanism to surface semantically grounded spurious factors via diffusion disentanglement and to enforce separation via dual-branch projection, rather than implicit inference within a shared space. The parameter efficiency and the availability of source code in the supplementary materials are additional strengths that would make the approach practically attractive if the gains prove robust.

major comments (2)
  1. [§3, CBCM] The central claim that diffusion-disentangled concepts yield reliable, semantically aligned spurious attributes without any attribute annotations or validation is load-bearing for attributing the reported worst-group gains to the method. No independent check (human validation of mined concepts, alignment with known bias factors on Waterbirds/CelebA, or ablation replacing mined concepts with random directions) is described that would confirm the mapping holds rather than surfacing unrelated factors such as lighting or style.
  2. [§4, DCD and experiments] The cross null-space projection in the dual-branch setup is presented as explicitly removing spurious information while preserving target semantics, but the manuscript provides no quantitative verification (e.g., via concept activation vectors or post-projection spurious correlation metrics) that the projection direction identified by CBCM is the correct one; if the mined direction is misaligned, the worst-group improvements reduce to an unverified assumption.
minor comments (2)
  1. [Abstract] The abstract and introduction would benefit from a brief statement of the precise datasets used and the definition of 'group-unsupervised' to avoid ambiguity with related work.
  2. [§4] Notation for the null-space projection operator and the two branches should be introduced with a single equation or diagram for clarity before the experimental results.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive comments highlighting the need for stronger validation of the semantic alignment in CBCM and the effectiveness of the projection in DCD. We address each point below and commit to revisions that add the requested checks where feasible.

  1. Referee: [§3, CBCM] The central claim that diffusion-disentangled concepts yield reliable, semantically aligned spurious attributes without any attribute annotations or validation is load-bearing for attributing the reported worst-group gains to the method. No independent check (human validation of mined concepts, alignment with known bias factors on Waterbirds/CelebA, or ablation replacing mined concepts with random directions) is described that would confirm the mapping holds rather than surfacing unrelated factors such as lighting or style.

    Authors: We agree that direct validation of the mined concepts would strengthen the attribution of gains to CBCM. The original manuscript relies on end-to-end worst-group accuracy on benchmarks with established spurious factors (background in Waterbirds, gender in CelebA) as indirect support. In the revision we will add (1) an ablation replacing CBCM-mined directions with random vectors, and (2) qualitative examples of mined concepts on Waterbirds/CelebA showing alignment with documented biases. Human validation is inherently subjective and was not performed; we view the random-direction ablation as the most objective check.

    Revision: yes

  2. Referee: [§4, DCD and experiments] The cross null-space projection in the dual-branch setup is presented as explicitly removing spurious information while preserving target semantics, but the manuscript provides no quantitative verification (e.g., via concept activation vectors or post-projection spurious correlation metrics) that the projection direction identified by CBCM is the correct one; if the mined direction is misaligned, the worst-group improvements reduce to an unverified assumption.

    Authors: We acknowledge the absence of direct post-projection metrics in the original submission. The dual-branch design and cross-projection are motivated by the separation of target and spurious branches, with gains measured via worst-group accuracy. In the revision we will add quantitative verification: the correlation between the projected features and known spurious attributes (where group labels are available for analysis), and concept activation vector similarity before and after projection. Both will be reported on at least two benchmarks.

    Revision: yes
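
The two checks promised in these responses can be prototyped on synthetic data before touching a real model. The sketch below plants a spurious attribute along one known direction, then compares how well a simple mean-difference probe recovers the attribute after projecting out (a) that direction, standing in for a well-mined one, and (b) a random direction. All names and the data-generating process are assumptions made for illustration; none of it comes from the paper.

```python
import numpy as np


def project_out(features: np.ndarray, direction: np.ndarray) -> np.ndarray:
    """Remove the component of each feature vector along `direction`."""
    u = direction / np.linalg.norm(direction)
    return features - np.outer(features @ u, u)


def probe_correlation(features: np.ndarray, attr: np.ndarray) -> float:
    """Absolute Pearson correlation between the attribute (+1/-1) and the
    score of a mean-difference linear probe fitted on the same features."""
    w = features[attr == 1].mean(axis=0) - features[attr == -1].mean(axis=0)
    return float(abs(np.corrcoef(features @ w, attr)[0, 1]))


# Synthetic setup (assumption): isotropic noise plus a planted attribute direction.
rng = np.random.default_rng(0)
n, d = 2000, 16
attr = rng.choice([-1, 1], size=n)
planted = rng.normal(size=d)
planted /= np.linalg.norm(planted)
features = rng.normal(size=(n, d)) + 3.0 * attr[:, None] * planted

corr_before = probe_correlation(features, attr)
corr_mined = probe_correlation(project_out(features, planted), attr)
corr_random = probe_correlation(project_out(features, rng.normal(size=d)), attr)
print(f"before={corr_before:.2f} mined={corr_mined:.2f} random={corr_random:.2f}")
```

If the mined direction is well aligned, the probe correlation should fall toward chance after projection, while the random-direction ablation should leave it largely intact. A small gap between the two would point to the misalignment the referee is worried about.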

Circularity Check

0 steps flagged

No circularity detected in derivation chain

The provided abstract and description introduce CBCM for mining spurious attributes via diffusion-disentangled representations and DCD for cross-projection debiasing as independent methodological contributions. No equations, fitted parameters renamed as predictions, self-citations as load-bearing premises, or ansatzes imported from prior author work are present in the text. The performance claims rest on external benchmark experiments rather than reducing to quantities defined by the method's own inputs. The derivation chain is therefore self-contained against external evaluation.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The central claim rests on the premise that diffusion models produce semantically grounded disentangled representations usable for bias mining and that cross null-space projection cleanly separates target from spurious signals; both are domain assumptions not derived in the abstract.

axioms (2)
  • Domain assumption: Diffusion models can produce disentangled, semantically grounded concept representations that align with real-world spurious attributes.
    Invoked to justify CBCM without attribute annotations.
  • Domain assumption: Target and spurious attributes remain intrinsically entangled in a single shared feature space, necessitating a dual-branch design.
    Stated as the motivation for DCD.

pith-pipeline@v0.9.1-grok · 5772 in / 1273 out tokens · 22575 ms · 2026-06-26T01:05:11.642118+00:00 · methodology
