Adversarial Domain Prompt Tuning and Generation for Single Domain Generalization

De Cheng; Dongsheng Li; Nannan Wang; Xinbo Gao; Xinyang Jiang; Zhipeng Xu

arxiv: 2606.21736 · v1 · pith:P5S3BYHVnew · submitted 2026-06-19 · 💻 cs.CV

Adversarial Domain Prompt Tuning and Generation for Single Domain Generalization

Zhipeng Xu , De Cheng , Xinyang Jiang , Nannan Wang , Dongsheng Li , Xinbo Gao This is my paper

Pith reviewed 2026-06-26 14:16 UTC · model grok-4.3

classification 💻 cs.CV

keywords single domain generalizationprompt tuningdiffusion modelsadversarial learningdata generationout-of-domain augmentationtext-to-image models

0 comments

The pith

Adversarial tuning of two prompt sets inside a frozen diffusion model generates out-of-domain images that improve single-domain generalization.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper seeks to establish that a model trained on one domain can generalize to many unseen domains by augmenting its training set with images produced by a pre-trained text-to-image diffusion model. It does this by learning one prompt set that holds category identity and a second prompt set that holds domain style, then optimizing the two sets against each other so that style changes while category stays fixed. This removes the need to write manual descriptions for every possible target domain. If the separation holds, training data can be expanded automatically to cover a wide range of visual styles without collecting new real images.

Core claim

The Progressive Adversarial Prompt Tuning framework learns two sets of abstract prompts as conditions for a pre-trained diffusion model: one set encodes domain-invariant category information while the other encodes domain-specific styles. Adversarial optimization between the two sets lets the diffusion model produce images that vary in style across many domains yet keep the original category features intact, yielding higher accuracy on unseen domains than prior single-domain generalization methods.

What carries the argument

Two adversarially optimized sets of abstract prompts that condition a frozen diffusion model to separate category content from domain style.

If this is right

A single-domain training set can be expanded with generated images that cover many unseen visual styles.
No hand-written domain descriptions are needed to produce the varied training images.
The generated images preserve enough category detail that classifiers trained on them outperform earlier single-domain generalization techniques.
The same frozen diffusion model can be reused across different source domains by retuning only the prompt sets.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

If prompt separation works reliably, the same idea could be tested on other generative backbones such as autoregressive image models.
Applying the method to fine-grained categories might reveal whether the learned prompts can control attributes at the part level rather than whole-object level.
Running the generated images through a downstream task that requires precise localization could test whether style variation harms spatial structure.

Load-bearing premise

Two learned prompt sets can be made to carry only category information or only style information inside the diffusion model without the categories becoming unrecognizable.

What would settle it

Train a classifier on images generated by the method and measure its accuracy on a held-out multi-domain test set; if accuracy is no higher than a classifier trained only on the original single domain, the claim is false.

Figures

Figures reproduced from arXiv: 2606.21736 by De Cheng, Dongsheng Li, Nannan Wang, Xinbo Gao, Xinyang Jiang, Zhipeng Xu.

**Figure 2.** Figure 2: The framework of our proposed method. We first train a set of category prompts [PITH_FULL_IMAGE:figures/full_fig_p003_2.png] view at source ↗

**Figure 3.** Figure 3: This figure shows the impact of the Progressive Training [PITH_FULL_IMAGE:figures/full_fig_p005_3.png] view at source ↗

**Figure 4.** Figure 4: This figure shows the training data generated by the T2I model under different input conditions, corresponding to each row in the [PITH_FULL_IMAGE:figures/full_fig_p008_4.png] view at source ↗

**Figure 5.** Figure 5: Impact of domain prompt number in Eq. 7 N on PACS and VLCS dataset on SDG setting and multi-source DG setting. mainly introduce diverse styles obtain relatively slight performance gains on VLCS dataset compared to PACS ones. However, our AD significantly improves the model performance of 0.28%∼6.40% across each domain and 3.42% on Avg-acc (see 3 rd and 4 th row in Tab. 6). This observation indicates our … view at source ↗

**Figure 6.** Figure 6: Parameter analysis of λ in Eq. 15 on PACS dataset. 4.6. Parameter Analysis We conduct experiments to analyze the impact of the hyperparameters in our method on PACS dataset, including λ in Eq. 15. The results are shown in [PITH_FULL_IMAGE:figures/full_fig_p008_6.png] view at source ↗

read the original abstract

Single domain generalization (SDG) aims to learn a robust model, which could perform well on many unseen domains while there is only one single domain available for training. One of the promising directions for achieving single-domain generalization is to generate out-of-domain (OOD) training data through data augmentation or image generation. Given the rapid advancements in AI-generated content (AIGC), this paper is the first to propose leveraging powerful pre-trained text-to-image (T2I) foundation models to create the training data. However, manually designing textual prompts to generate images for all possible domains is often impractical, and some domain characteristics may be too abstract to describe with words. To address these challenges, we propose a novel Progressive Adversarial Prompt Tuning (PAPT) framework for pre-trained diffusion models. Instead of relying on static textual domains, our approach learns two sets of abstract prompts as conditions for the diffusion model: one that captures domain-invariant category information and another that models domain-specific styles. This adversarial learning mechanism enables the T2I model to generate images in various domain styles while preserving key categorical features. Extensive experiments demonstrate the effectiveness of the proposed method, achieving superior performances to state-of-the-art single-domain generalization approaches.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

The paper introduces adversarial prompt tuning on frozen T2I models to generate OOD data for single-domain generalization, but the abstract leaves the separation mechanism and supporting results underspecified.

read the letter

The core idea is to skip manual domain prompts and instead tune two sets of abstract prompts inside a pre-trained diffusion model: one set for category-invariant content and one for domain-specific style. The adversarial game is meant to let the model produce varied training images while keeping class identity intact. This is presented as the first application of T2I foundation models to SDG data generation.

The approach targets a practical bottleneck. When only one source domain is available, generating synthetic OOD samples without hand-designed text descriptions could reduce reliance on domain expertise. The progressive aspect of the tuning might help stabilize the optimization over time.

The main soft spot is the lack of any described safeguard against prompt leakage. Nothing in the abstract mentions a reconstruction loss, contrastive term, or classifier regularizer that would keep category prompts from absorbing style signals or style prompts from shifting semantics. If the adversarial objective collapses into a trade-off, the generated images could fail to improve generalization. The stress-test note flags exactly this risk, and the abstract supplies no equations or implementation details to address it.

Experiments are called extensive and superior to prior SDG methods, yet the abstract contains no numbers, baselines, or ablation results. Without those, the central claim cannot be evaluated against its own evidence.

The work is aimed at vision researchers already combining generative models with domain generalization. A reader looking for prompt-based augmentation techniques might extract the method description, but the current write-up is too thin on verification to stand alone.

I would send this to peer review. The direction is timely and the problem is real; a revised version with concrete losses, ablations, and quantitative results could be worth referee time even if the separation claim needs tightening.

Referee Report

2 major / 1 minor

Summary. The manuscript proposes the Progressive Adversarial Prompt Tuning (PAPT) framework for single-domain generalization (SDG). It is the first to leverage pre-trained text-to-image diffusion models to synthesize out-of-domain training data by adversarially learning two sets of abstract prompts—one capturing domain-invariant category information and the other modeling domain-specific styles—without manual domain descriptions. The central claim is that this mechanism allows the frozen T2I model to generate diverse domain-styled images while preserving categorical fidelity, yielding superior performance over state-of-the-art SDG methods as shown by extensive experiments.

Significance. If the disentanglement and performance claims hold, the work would advance SDG by automating OOD data generation via foundation models, removing reliance on hand-crafted prompts and potentially improving robustness when only single-domain training data is available.

major comments (2)

[Abstract] Abstract: the claim that adversarial optimization of the two prompt sets 'enables the T2I model to generate images in various domain styles while preserving key categorical features' is load-bearing, yet the abstract supplies no description of the adversarial objective, any category-reconstruction term, contrastive regularizer, or classifier-based fidelity loss that would prevent style prompts from drifting category semantics or invariant prompts from absorbing style leakage.
[Abstract] Abstract: the assertion of 'extensive experiments' demonstrating 'superior performances to state-of-the-art single-domain generalization approaches' is unsupported by any quantitative results, baseline comparisons, ablation details, or error analysis, preventing verification of the central performance claim.

minor comments (1)

The acronym PAPT is introduced in the abstract but the title uses a different phrasing ('Adversarial Domain Prompt Tuning and Generation'); consistent terminology would aid readability.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive comments on the abstract. We respond point-by-point below and indicate where revisions will be made.

read point-by-point responses

Referee: [Abstract] Abstract: the claim that adversarial optimization of the two prompt sets 'enables the T2I model to generate images in various domain styles while preserving key categorical features' is load-bearing, yet the abstract supplies no description of the adversarial objective, any category-reconstruction term, contrastive regularizer, or classifier-based fidelity loss that would prevent style prompts from drifting category semantics or invariant prompts from absorbing style leakage.

Authors: We agree the abstract is concise and omits explicit mention of the loss terms. The full manuscript (Section 3.2) defines the progressive adversarial objective with a category-reconstruction loss, contrastive regularizer, and classifier-based fidelity term to enforce the claimed disentanglement. To address the concern, we will revise the abstract to briefly reference these mechanisms. revision: yes
Referee: [Abstract] Abstract: the assertion of 'extensive experiments' demonstrating 'superior performances to state-of-the-art single-domain generalization approaches' is unsupported by any quantitative results, baseline comparisons, ablation details, or error analysis, preventing verification of the central performance claim.

Authors: Abstracts are space-constrained and conventionally summarize rather than enumerate numbers or tables; the manuscript provides those details in Section 4 (Tables 1-4, Figures 3-5) with SOTA comparisons, ablations, and error analysis. We will explore adding one high-level quantitative statement if it fits the length limit. revision: partial

Circularity Check

0 steps flagged

No circularity: novel prompt-tuning framework with no self-referential reductions

full rationale

The paper introduces a new Progressive Adversarial Prompt Tuning (PAPT) method that learns two abstract prompt sets inside a frozen T2I diffusion model via adversarial optimization. No equations, fitted parameters, or self-citations appear in the provided text that would reduce the claimed OOD generation or SDG performance to inputs by construction. The derivation chain consists of a proposed mechanism (adversarial prompt learning for category-style separation) whose outputs are evaluated experimentally rather than defined into the inputs. This is a standard non-circular proposal of a new technique.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract-only review supplies no explicit free parameters, axioms, or invented entities; the method implicitly assumes the diffusion model can be conditioned on learned prompts without further specification.

pith-pipeline@v0.9.1-grok · 5756 in / 1127 out tokens · 15027 ms · 2026-06-26T14:16:00.274122+00:00 · methodology

discussion (0)

Reference graph

Works this paper leans on

78 extracted references · 13 canonical work pages · 7 internal anchors

[1]

Invariant Risk Minimization

Martin Arjovsky, L ´eon Bottou, Ishaan Gulrajani, and David Lopez-Paz. Invariant risk minimization.arXiv preprint arXiv:1907.02893, 2019. 6

work page internal anchor Pith review Pith/arXiv arXiv 1907
[2]

Recognition in terra incognita

Sara Beery, Grant Van Horn, and Pietro Perona. Recognition in terra incognita. InProceedings of the European confer- ence on computer vision (ECCV), pages 456–473, 2018. 6

2018
[3]

Domain generalization by marginal transfer learning.Journal of machine learning re- search, 22(2):1–55, 2021

Gilles Blanchard, Aniket Anand Deshmukh, Urun Dogan, Gyemin Lee, and Clayton Scott. Domain generalization by marginal transfer learning.Journal of machine learning re- search, 22(2):1–55, 2021. 6

2021
[4]

Domain generalization by solving jigsaw puzzles

Fabio M Carlucci, Antonio D’Innocente, Silvia Bucci, Bar- bara Caputo, and Tatiana Tommasi. Domain generalization by solving jigsaw puzzles. InProceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 2229–2238, 2019. 2

2019
[5]

Domain generalization by mutual-information regu- larization with pre-trained models

Junbum Cha, Kyungjae Lee, Sungrae Park, and Sanghyuk Chun. Domain generalization by mutual-information regu- larization with pre-trained models. InEuropean conference on computer vision, pages 440–457. Springer, 2022. 7

2022
[6]

Meta-causal learning for single domain generalization

Jin Chen, Zhi Gao, Xinxiao Wu, and Jiebo Luo. Meta-causal learning for single domain generalization. InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 7683–7692, 2023. 6

2023
[7]

Improved test-time adaptation for domain generalization

Liang Chen, Yong Zhang, Yibing Song, Ying Shan, and Lingqiao Liu. Improved test-time adaptation for domain generalization. InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 24172– 24182, 2023. 6

2023
[8]

Domain generalization via rationale invariance

Liang Chen, Yong Zhang, Yibing Song, Anton Van Den Hen- gel, and Lingqiao Liu. Domain generalization via rationale invariance. InProceedings of the IEEE/CVF International Conference on Computer Vision, pages 1751–1760, 2023. 6

2023
[9]

Efficient bilateral cross- modality cluster matching for unsupervised visible-infrared person reid

De Cheng, Lingfeng He, Nannan Wang, Shizhou Zhang, Zhen Wang, and Xinbo Gao. Efficient bilateral cross- modality cluster matching for unsupervised visible-infrared person reid. InProceedings of the 31st ACM International Conference on Multimedia, pages 1325–1333, 2023. 1

2023
[10]

Disentangled prompt rep- resentation for domain generalization

De Cheng, Zhipeng Xu, Xinyang Jiang, Nannan Wang, Dongsheng Li, and Xinbo Gao. Disentangled prompt rep- resentation for domain generalization. InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 23595–23604, 2024. 1, 2

2024
[11]

Prompt disentanglement via language guidance and representation alignment for domain generalization.IEEE Transactions on Pattern Analysis and Machine Intelligence, 2026

De Cheng, Zhipeng Xu, Xinyang Jiang, Dongsheng Li, Nan- nan Wang, and Xinbo Gao. Prompt disentanglement via language guidance and representation alignment for domain generalization.IEEE Transactions on Pattern Analysis and Machine Intelligence, 2026

2026
[12]

Interference-isolated elastic weight consolidation and knowledge calibration for incremental ob- ject detection

De Cheng, Mingyue Zeng, Zhipeng Xu, Di Xu, Nannan Wang, and Xinbo Gao. Interference-isolated elastic weight consolidation and knowledge calibration for incremental ob- ject detection. InProceedings of the Fourteenth Interna- tional Conference on Learning Representations, 2026. 1

2026
[13]

Progressive random con- volutions for single domain generalization

Seokeon Choi, Debasmit Das, Sungha Choi, Seunghan Yang, Hyunsin Park, and Sungrack Yun. Progressive random con- volutions for single domain generalization. InProceedings of the IEEE/CVF Conference on Computer Vision and Pat- tern Recognition, pages 10312–10322, 2023. 6

2023
[14]

Directly Fine-Tuning Diffusion Models on Differentiable Rewards

Kevin Clark, Paul Vicol, Kevin Swersky, and David J Fleet. Directly fine-tuning diffusion models on differentiable re- wards.arXiv preprint arXiv:2309.17400, 2023. 3

work page internal anchor Pith review Pith/arXiv arXiv 2023
[15]

Attention consistency on visual corruptions for single-source domain generalization

Ilke Cugu, Massimiliano Mancini, Yanbei Chen, and Zeynep Akata. Attention consistency on visual corruptions for single-source domain generalization. InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 4165–4174, 2022. 6

2022
[16]

Imagenet: A large-scale hierarchical image database

Jia Deng, Wei Dong, Richard Socher, Li-Jia Li, Kai Li, and Li Fei-Fei. Imagenet: A large-scale hierarchical image database. In2009 IEEE conference on computer vision and pattern recognition, pages 248–255. Ieee, 2009. 6

2009
[17]

Cogview: Mastering text-to-image generation via transformers.Advances in neural information processing systems, 34:19822–19835, 2021

Ming Ding, Zhuoyi Yang, Wenyi Hong, Wendi Zheng, Chang Zhou, Da Yin, Junyang Lin, Xu Zou, Zhou Shao, Hongxia Yang, et al. Cogview: Mastering text-to-image generation via transformers.Advances in neural information processing systems, 34:19822–19835, 2021. 2

2021
[18]

Domain generalization via model-agnostic learning of semantic features.Advances in neural informa- tion processing systems, 32, 2019

Qi Dou, Daniel Coelho de Castro, Konstantinos Kamnitsas, and Ben Glocker. Domain generalization via model-agnostic learning of semantic features.Advances in neural informa- tion processing systems, 32, 2019. 2

2019
[19]

Adversarially adaptive normal- ization for single domain generalization

Xinjie Fan, Qifei Wang, Junjie Ke, Feng Yang, Boqing Gong, and Mingyuan Zhou. Adversarially adaptive normal- ization for single domain generalization. InProceedings of the IEEE/CVF conference on Computer Vision and Pattern Recognition, pages 8208–8217, 2021. 2

2021
[20]

An Image is Worth One Word: Personalizing Text-to-Image Generation using Textual Inversion

Rinon Gal, Yuval Alaluf, Yuval Atzmon, Or Patash- nik, Amit H Bermano, Gal Chechik, and Daniel Cohen- Or. An image is worth one word: Personalizing text-to- image generation using textual inversion.arXiv preprint arXiv:2208.01618, 2022. 3

work page internal anchor Pith review Pith/arXiv arXiv 2022
[21]

Domain-adversarial training of neural networks.Journal of machine learning research, 17(59):1–35, 2016

Yaroslav Ganin, Evgeniya Ustinova, Hana Ajakan, Pas- cal Germain, Hugo Larochelle, Franc ¸ois Laviolette, Mario March, and Victor Lempitsky. Domain-adversarial training of neural networks.Journal of machine learning research, 17(59):1–35, 2016. 1, 2, 6, 7

2016
[22]

Generative adversarial networks.Commu- nications of the ACM, 63(11):139–144, 2020

Ian Goodfellow, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil Ozair, Aaron Courville, and Yoshua Bengio. Generative adversarial networks.Commu- nications of the ACM, 63(11):139–144, 2020. 2

2020
[23]

Domaindrop: Sup- pressing domain-sensitive channels for domain generaliza- tion

Jintao Guo, Lei Qi, and Yinghuan Shi. Domaindrop: Sup- pressing domain-sensitive channels for domain generaliza- tion. InProceedings of the IEEE/CVF international confer- ence on computer vision, pages 19114–19124, 2023. 7

2023
[24]

Deep residual learning for image recognition

Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. Deep residual learning for image recognition. InProceed- ings of the IEEE conference on computer vision and pattern recognition, pages 770–778, 2016. 6

2016
[25]

Augmix: A simple data processing method to improve robustness and uncertainty.arXiv preprint arXiv:1912.02781, 2019

Dan Hendrycks, Norman Mu, Ekin D Cubuk, Barret Zoph, Justin Gilmer, and Balaji Lakshminarayanan. Augmix: A simple data processing method to improve robustness and uncertainty.arXiv preprint arXiv:1912.02781, 2019. 6

work page arXiv 1912
[26]

Distilling the Knowledge in a Neural Network

Geoffrey Hinton, Oriol Vinyals, and Jeff Dean. Distill- ing the knowledge in a neural network.arXiv preprint arXiv:1503.02531, 2015. 4

work page internal anchor Pith review Pith/arXiv arXiv 2015
[27]

Denoising dif- fusion probabilistic models.Advances in neural information processing systems, 33:6840–6851, 2020

Jonathan Ho, Ajay Jain, and Pieter Abbeel. Denoising dif- fusion probabilistic models.Advances in neural information processing systems, 33:6840–6851, 2020. 4

2020
[28]

Representation enhancement-stabilization: Reducing bias- variance of domain generalization

Wei Huang, Yilei Shi, Zhitong Xiong, and Xiao Xiang Zhu. Representation enhancement-stabilization: Reducing bias- variance of domain generalization. 7
[29]

Self-challenging improves cross-domain generalization

Zeyi Huang, Haohan Wang, Eric P Xing, and Dong Huang. Self-challenging improves cross-domain generalization. In Computer vision–ECCV 2020: 16th European conference, Glasgow, UK, August 23–28, 2020, proceedings, part II 16, pages 124–140. Springer, 2020. 6, 7

2020
[30]

Undoing the dam- age of dataset bias

Aditya Khosla, Tinghui Zhou, Tomasz Malisiewicz, Alexei A Efros, and Antonio Torralba. Undoing the dam- age of dataset bias. InComputer Vision–ECCV 2012: 12th European Conference on Computer Vision, Florence, Italy, October 7-13, 2012, Proceedings, Part I 12, pages 158–171. Springer, 2012. 2

2012
[31]

Out-of-distribution general- ization via risk extrapolation (rex)

David Krueger, Ethan Caballero, Joern-Henrik Jacobsen, Amy Zhang, Jonathan Binas, Dinghuai Zhang, Remi Le Priol, and Aaron Courville. Out-of-distribution general- ization via risk extrapolation (rex). InInternational confer- ence on machine learning, pages 5815–5826. PMLR, 2021. 6, 7

2021
[32]

Deeper, broader and artier domain generaliza- tion

Da Li, Yongxin Yang, Yi-Zhe Song, and Timothy M Hospedales. Deeper, broader and artier domain generaliza- tion. InProceedings of the IEEE international conference on computer vision, pages 5542–5550, 2017. 5, 6

2017
[33]

Learning to generalize: Meta-learning for do- main generalization

Da Li, Yongxin Yang, Yi-Zhe Song, and Timothy Hospedales. Learning to generalize: Meta-learning for do- main generalization. InProceedings of the AAAI conference on artificial intelligence, 2018. 7

2018
[34]

Prompt-driven dynamic object-centric learning for single do- main generalization

Deng Li, Aming Wu, Yaowei Wang, and Yahong Han. Prompt-driven dynamic object-centric learning for single do- main generalization. InProceedings of the IEEE/CVF Con- ference on Computer Vision and Pattern Recognition, pages 17606–17615, 2024. 6

2024
[35]

Uncertainty model- ing for out-of-distribution generalization.arXiv preprint arXiv:2202.03958, 2022

Xiaotong Li, Yongxing Dai, Yixiao Ge, Jun Liu, Ying Shan, and Ling-Yu Duan. Uncertainty model- ing for out-of-distribution generalization.arXiv preprint arXiv:2202.03958, 2022. 6

work page arXiv 2022
[36]

Domain generalization via conditional invari- ant representations

Ya Li, Mingming Gong, Xinmei Tian, Tongliang Liu, and Dacheng Tao. Domain generalization via conditional invari- ant representations. InProceedings of the AAAI conference on artificial intelligence, 2018. 6, 7

2018
[37]

Transfer feature learning with joint distribution adaptation

Mingsheng Long, Jianmin Wang, Guiguang Ding, Jiaguang Sun, and Philip S Yu. Transfer feature learning with joint distribution adaptation. InProceedings of the IEEE inter- national conference on computer vision, pages 2200–2207,
[38]

Grounding stylistic do- main generalization with quantitative domain shift measures and synthetic scene images

Yiran Luo, Joshua Feinglass, Tejas Gokhale, Kuan-Cheng Lee, Chitta Baral, and Yezhou Yang. Grounding stylistic do- main generalization with quantitative domain shift measures and synthetic scene images. InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 7303–7313, 2024. 7

2024
[39]

Reducing domain gap by reduc- ing style bias

Hyeonseob Nam, HyunJae Lee, Jongchan Park, Wonjun Yoon, and Donggeun Yoo. Reducing domain gap by reduc- ing style bias. InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 8690– 8699, 2021. 6, 7

2021
[40]

GLIDE: Towards Photorealistic Image Generation and Editing with Text-Guided Diffusion Models

Alex Nichol, Prafulla Dhariwal, Aditya Ramesh, Pranav Shyam, Pamela Mishkin, Bob McGrew, Ilya Sutskever, and Mark Chen. Glide: Towards photorealistic image generation and editing with text-guided diffusion models.arXiv preprint arXiv:2112.10741, 2021. 2

work page internal anchor Pith review Pith/arXiv arXiv 2021
[41]

Permuted adain: Reducing the bias towards global statistics in image clas- sification

Oren Nuriel, Sagie Benaim, and Lior Wolf. Permuted adain: Reducing the bias towards global statistics in image clas- sification. InProceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 9482–9491,
[42]

Transferrable prototypical networks for unsupervised domain adaptation

Yingwei Pan, Ting Yao, Yehao Li, Yu Wang, Chong-Wah Ngo, and Tao Mei. Transferrable prototypical networks for unsupervised domain adaptation. InProceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 2239–2247, 2019. 1

2019
[43]

Moment matching for multi-source domain adaptation

Xingchao Peng, Qinxun Bai, Xide Xia, Zijun Huang, Kate Saenko, and Bo Wang. Moment matching for multi-source domain adaptation. InProceedings of the IEEE/CVF inter- national conference on computer vision, pages 1406–1415,
[44]

Domain agnostic learning with disentangled repre- sentations

Xingchao Peng, Zijun Huang, Ximeng Sun, and Kate Saenko. Domain agnostic learning with disentangled repre- sentations. InInternational conference on machine learning, pages 5102–5112. PMLR, 2019. 2

2019
[45]

Learning to learn single domain generalization

Fengchun Qiao, Long Zhao, and Xi Peng. Learning to learn single domain generalization. InProceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 12556–12565, 2020. 2

2020
[46]

Bmd: A general class-balanced multicen- tric dynamic prototype strategy for source-free domain adap- tation

Sanqing Qu, Guang Chen, Jing Zhang, Zhijun Li, Wei He, and Dacheng Tao. Bmd: A general class-balanced multicen- tric dynamic prototype strategy for source-free domain adap- tation. InEuropean conference on computer vision, pages 165–182. Springer, 2022. 1

2022
[47]

Modality-agnostic debiasing for sin- gle domain generalization

Sanqing Qu, Yingwei Pan, Guang Chen, Ting Yao, Changjun Jiang, and Tao Mei. Modality-agnostic debiasing for sin- gle domain generalization. InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 24142–24151, 2023. 5, 6

2023
[48]

Upcycling models under domain and category shift

Sanqing Qu, Tianpei Zou, Florian R ¨ohrbein, Cewu Lu, Guang Chen, Dacheng Tao, and Changjun Jiang. Upcycling models under domain and category shift. InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 20019–20028, 2023. 1

2023
[49]

Learning transferable visual models from natural language supervi- sion

Alec Radford, Jong Wook Kim, Chris Hallacy, Aditya Ramesh, Gabriel Goh, Sandhini Agarwal, Girish Sastry, Amanda Askell, Pamela Mishkin, Jack Clark, et al. Learning transferable visual models from natural language supervi- sion. InInternational conference on machine learning, pages 8748–8763. PMLR, 2021. 4

2021
[50]

Fishr: Invariant gradient variances for out-of-distribution generalization

Alexandre Rame, Corentin Dancette, and Matthieu Cord. Fishr: Invariant gradient variances for out-of-distribution generalization. InInternational Conference on Machine Learning, pages 18347–18377. PMLR, 2022. 6

2022
[51]

Zero-shot text-to-image generation

Aditya Ramesh, Mikhail Pavlov, Gabriel Goh, Scott Gray, Chelsea V oss, Alec Radford, Mark Chen, and Ilya Sutskever. Zero-shot text-to-image generation. InInternational confer- ence on machine learning, pages 8821–8831. Pmlr, 2021. 2

2021
[52]

High-resolution image synthesis with latent diffusion models

Robin Rombach, Andreas Blattmann, Dominik Lorenz, Patrick Esser, and Bj ¨orn Ommer. High-resolution image synthesis with latent diffusion models. InProceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 10684–10695, 2022. 2, 3, 4

2022
[53]

U- net: Convolutional networks for biomedical image segmen- tation

Olaf Ronneberger, Philipp Fischer, and Thomas Brox. U- net: Convolutional networks for biomedical image segmen- tation. InMedical image computing and computer-assisted intervention–MICCAI 2015: 18th international conference, Munich, Germany, October 5-9, 2015, proceedings, part III 18, pages 234–241. Springer, 2015. 4

2015
[54]

Distributionally Robust Neural Networks for Group Shifts: On the Importance of Regularization for Worst-Case Generalization

Shiori Sagawa, Pang Wei Koh, Tatsunori B Hashimoto, and Percy Liang. Distributionally robust neural networks for group shifts: On the importance of regularization for worst- case generalization.arXiv preprint arXiv:1911.08731, 2019. 6, 7

work page internal anchor Pith review Pith/arXiv arXiv 1911
[55]

Generalizing Across Domains via Cross-Gradient Training

Shiv Shankar, Vihari Piratla, Soumen Chakrabarti, Sid- dhartha Chaudhuri, Preethi Jyothi, and Sunita Sarawagi. Generalizing across domains via cross-gradient training. arXiv preprint arXiv:1804.10745, 2018. 2

work page internal anchor Pith review Pith/arXiv arXiv 2018
[56]

Unknown domain inconsistency minimization for domain generalization

Seungjae Shin, HeeSun Bae, Byeonghu Na, Yoon-Yeong Kim, and Il-chul Moon. Unknown domain inconsistency minimization for domain generalization. InThe Twelfth In- ternational Conference on Learning Representations. 5, 6
[57]

Deep coral: Correlation alignment for deep domain adaptation

Baochen Sun and Kate Saenko. Deep coral: Correlation alignment for deep domain adaptation. InComputer Vision– ECCV 2016 Workshops: Amsterdam, The Netherlands, Oc- tober 8-10 and 15-16, 2016, Proceedings, Part III 14, pages 443–450. Springer, 2016. 6, 7

2016
[58]

Rethinking multi- domain generalization with a general learning objective

Zhaorui Tan, Xi Yang, and Kaizhu Huang. Rethinking multi- domain generalization with a general learning objective. In Proceedings of the IEEE/CVF Conference on Computer Vi- sion and Pattern Recognition, pages 23512–23522, 2024. 7

2024
[59]

Springer science & business media, 2013

Vladimir Vapnik.The nature of statistical learning theory. Springer science & business media, 2013. 6, 7

2013
[60]

Deep hashing network for unsupervised domain adaptation

Hemanth Venkateswara, Jose Eusebio, Shayok Chakraborty, and Sethuraman Panchanathan. Deep hashing network for unsupervised domain adaptation. InProceedings of the IEEE conference on computer vision and pattern recogni- tion, pages 5018–5027, 2017. 6

2017
[61]

Addressing model vul- nerability to distributional shifts over image transformation sets

Riccardo V olpi and Vittorio Murino. Addressing model vul- nerability to distributional shifts over image transformation sets. InProceedings of the IEEE/CVF International Confer- ence on Computer Vision, pages 7980–7989, 2019. 2

2019
[62]

Generalizing to unseen domains via adversarial data augmentation.Ad- vances in neural information processing systems, 31, 2018

Riccardo V olpi, Hongseok Namkoong, Ozan Sener, John C Duchi, Vittorio Murino, and Silvio Savarese. Generalizing to unseen domains via adversarial data augmentation.Ad- vances in neural information processing systems, 31, 2018. 2

2018
[63]

Meta-learning for domain generalization in semantic parsing.arXiv preprint arXiv:2010.11988, 2020

Bailin Wang, Mirella Lapata, and Ivan Titov. Meta-learning for domain generalization in semantic parsing.arXiv preprint arXiv:2010.11988, 2020. 2

work page arXiv 2010
[64]

Stpr: Spatiotemporal preservation and routing for exemplar-free video class-incremental learning

Huaijie Wang, De Cheng, Guozhang Li, Zhipeng Xu, Lingfeng He, Jie Li, Nannan Wang, and Xinbo Gao. Stpr: Spatiotemporal preservation and routing for exemplar-free video class-incremental learning. InProceedings of the Four- teenth International Conference on Learning Representa- tions, 2026. 1

2026
[65]

Generalizing to unseen domains: A survey on do- main generalization.IEEE transactions on knowledge and data engineering, 35(8):8052–8072, 2022

Jindong Wang, Cuiling Lan, Chang Liu, Yidong Ouyang, Tao Qin, Wang Lu, Yiqiang Chen, Wenjun Zeng, and S Yu Philip. Generalizing to unseen domains: A survey on do- main generalization.IEEE transactions on knowledge and data engineering, 35(8):8052–8072, 2022. 1

2022
[66]

Sharpness-aware gradient matching for domain generaliza- tion

Pengfei Wang, Zhaoxiang Zhang, Zhen Lei, and Lei Zhang. Sharpness-aware gradient matching for domain generaliza- tion. InProceedings of the IEEE/CVF Conference on Com- puter Vision and Pattern Recognition, pages 3769–3778,
[67]

Learning to diversify for single do- main generalization

Zijian Wang, Yadan Luo, Ruihong Qiu, Zi Huang, and Mahsa Baktashmotlagh. Learning to diversify for single do- main generalization. InProceedings of the IEEE/CVF In- ternational Conference on Computer Vision, pages 834–843,
[68]

Imagere- ward: Learning and evaluating human preferences for text- to-image generation.Advances in Neural Information Pro- cessing Systems, 36, 2024

Jiazheng Xu, Xiao Liu, Yuchen Wu, Yuxuan Tong, Qinkai Li, Ming Ding, Jie Tang, and Yuxiao Dong. Imagere- ward: Learning and evaluating human preferences for text- to-image generation.Advances in Neural Information Pro- cessing Systems, 36, 2024. 3

2024
[69]

Reasoning-driven multimodal llm for domain generalization

Zhipeng Xu, Zilong Wang, Xinyang Jiang, Dongsheng Li, De Cheng, and Nannan Wang. Reasoning-driven multimodal llm for domain generalization. InThe Fourteenth Interna- tional Conference on Learning Representations, 2026. 1

2026
[70]

Improve unsupervised domain adaptation with mixup training.arXiv preprint arXiv:2001.00677, 2020

Shen Yan, Huan Song, Nanxiang Li, Lincan Zou, and Liu Ren. Improve unsupervised domain adaptation with mixup training.arXiv preprint arXiv:2001.00677, 2020. 6

work page arXiv 2001
[71]

Cutmix: Regu- larization strategy to train strong classifiers with localizable features

Sangdoo Yun, Dongyoon Han, Seong Joon Oh, Sanghyuk Chun, Junsuk Choe, and Youngjoon Yoo. Cutmix: Regu- larization strategy to train strong classifiers with localizable features. InProceedings of the IEEE/CVF international con- ference on computer vision, pages 6023–6032, 2019. 6

2019
[72]

mixup: Beyond empirical risk minimiza- tion

Hongyi Zhang, Moustapha Cisse, Yann N Dauphin, and David Lopez-Paz. mixup: Beyond empirical risk minimiza- tion. InInternational Conference on Learning Representa- tions, 2018. 6

2018
[73]

Towards principled disentanglement for domain generalization

Hanlin Zhang, Yi-Fan Zhang, Weiyang Liu, Adrian Weller, Bernhard Sch ¨olkopf, and Eric P Xing. Towards principled disentanglement for domain generalization. InProceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 8024–8034, 2022. 1

2022
[74]

Adaptive risk min- imization: Learning to adapt to domain shift.Advances in Neural Information Processing Systems, 34:23664–23678,

Marvin Zhang, Henrik Marklund, Nikita Dhawan, Abhishek Gupta, Sergey Levine, and Chelsea Finn. Adaptive risk min- imization: Learning to adapt to domain shift.Advances in Neural Information Processing Systems, 34:23664–23678,
[75]

Exact feature distribution matching for arbitrary style transfer and domain generalization

Yabin Zhang, Minghan Li, Ruihuang Li, Kui Jia, and Lei Zhang. Exact feature distribution matching for arbitrary style transfer and domain generalization. InProceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 8035–8045, 2022. 6

2022
[76]

Maximum-entropy adversarial data augmentation for im- proved generalization and robustness.Advances in Neural Information Processing Systems, 33:14435–14447, 2020

Long Zhao, Ting Liu, Xi Peng, and Dimitris Metaxas. Maximum-entropy adversarial data augmentation for im- proved generalization and robustness.Advances in Neural Information Processing Systems, 33:14435–14447, 2020. 2

2020
[77]

Domain generalization with mixstyle.arXiv preprint arXiv:2104.02008, 2021

Kaiyang Zhou, Yongxin Yang, Yu Qiao, and Tao Xi- ang. Domain generalization with mixstyle.arXiv preprint arXiv:2104.02008, 2021. 2, 6, 7

work page arXiv 2021
[78]

Surrogate gap minimization improves sharpness-aware training.arXiv preprint arXiv:2203.08065,

Juntang Zhuang, Boqing Gong, Liangzhe Yuan, Yin Cui, Hartwig Adam, Nicha Dvornek, Sekhar Tatikonda, James Duncan, and Ting Liu. Surrogate gap minimization improves sharpness-aware training.arXiv preprint arXiv:2203.08065,

work page arXiv

[1] [1]

Invariant Risk Minimization

Martin Arjovsky, L ´eon Bottou, Ishaan Gulrajani, and David Lopez-Paz. Invariant risk minimization.arXiv preprint arXiv:1907.02893, 2019. 6

work page internal anchor Pith review Pith/arXiv arXiv 1907

[2] [2]

Recognition in terra incognita

Sara Beery, Grant Van Horn, and Pietro Perona. Recognition in terra incognita. InProceedings of the European confer- ence on computer vision (ECCV), pages 456–473, 2018. 6

2018

[3] [3]

Domain generalization by marginal transfer learning.Journal of machine learning re- search, 22(2):1–55, 2021

Gilles Blanchard, Aniket Anand Deshmukh, Urun Dogan, Gyemin Lee, and Clayton Scott. Domain generalization by marginal transfer learning.Journal of machine learning re- search, 22(2):1–55, 2021. 6

2021

[4] [4]

Domain generalization by solving jigsaw puzzles

Fabio M Carlucci, Antonio D’Innocente, Silvia Bucci, Bar- bara Caputo, and Tatiana Tommasi. Domain generalization by solving jigsaw puzzles. InProceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 2229–2238, 2019. 2

2019

[5] [5]

Domain generalization by mutual-information regu- larization with pre-trained models

Junbum Cha, Kyungjae Lee, Sungrae Park, and Sanghyuk Chun. Domain generalization by mutual-information regu- larization with pre-trained models. InEuropean conference on computer vision, pages 440–457. Springer, 2022. 7

2022

[6] [6]

Meta-causal learning for single domain generalization

Jin Chen, Zhi Gao, Xinxiao Wu, and Jiebo Luo. Meta-causal learning for single domain generalization. InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 7683–7692, 2023. 6

2023

[7] [7]

Improved test-time adaptation for domain generalization

Liang Chen, Yong Zhang, Yibing Song, Ying Shan, and Lingqiao Liu. Improved test-time adaptation for domain generalization. InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 24172– 24182, 2023. 6

2023

[8] [8]

Domain generalization via rationale invariance

Liang Chen, Yong Zhang, Yibing Song, Anton Van Den Hen- gel, and Lingqiao Liu. Domain generalization via rationale invariance. InProceedings of the IEEE/CVF International Conference on Computer Vision, pages 1751–1760, 2023. 6

2023

[9] [9]

Efficient bilateral cross- modality cluster matching for unsupervised visible-infrared person reid

De Cheng, Lingfeng He, Nannan Wang, Shizhou Zhang, Zhen Wang, and Xinbo Gao. Efficient bilateral cross- modality cluster matching for unsupervised visible-infrared person reid. InProceedings of the 31st ACM International Conference on Multimedia, pages 1325–1333, 2023. 1

2023

[10] [10]

Disentangled prompt rep- resentation for domain generalization

De Cheng, Zhipeng Xu, Xinyang Jiang, Nannan Wang, Dongsheng Li, and Xinbo Gao. Disentangled prompt rep- resentation for domain generalization. InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 23595–23604, 2024. 1, 2

2024

[11] [11]

Prompt disentanglement via language guidance and representation alignment for domain generalization.IEEE Transactions on Pattern Analysis and Machine Intelligence, 2026

De Cheng, Zhipeng Xu, Xinyang Jiang, Dongsheng Li, Nan- nan Wang, and Xinbo Gao. Prompt disentanglement via language guidance and representation alignment for domain generalization.IEEE Transactions on Pattern Analysis and Machine Intelligence, 2026

2026

[12] [12]

Interference-isolated elastic weight consolidation and knowledge calibration for incremental ob- ject detection

De Cheng, Mingyue Zeng, Zhipeng Xu, Di Xu, Nannan Wang, and Xinbo Gao. Interference-isolated elastic weight consolidation and knowledge calibration for incremental ob- ject detection. InProceedings of the Fourteenth Interna- tional Conference on Learning Representations, 2026. 1

2026

[13] [13]

Progressive random con- volutions for single domain generalization

Seokeon Choi, Debasmit Das, Sungha Choi, Seunghan Yang, Hyunsin Park, and Sungrack Yun. Progressive random con- volutions for single domain generalization. InProceedings of the IEEE/CVF Conference on Computer Vision and Pat- tern Recognition, pages 10312–10322, 2023. 6

2023

[14] [14]

Directly Fine-Tuning Diffusion Models on Differentiable Rewards

Kevin Clark, Paul Vicol, Kevin Swersky, and David J Fleet. Directly fine-tuning diffusion models on differentiable re- wards.arXiv preprint arXiv:2309.17400, 2023. 3

work page internal anchor Pith review Pith/arXiv arXiv 2023

[15] [15]

Attention consistency on visual corruptions for single-source domain generalization

Ilke Cugu, Massimiliano Mancini, Yanbei Chen, and Zeynep Akata. Attention consistency on visual corruptions for single-source domain generalization. InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 4165–4174, 2022. 6

2022

[16] [16]

Imagenet: A large-scale hierarchical image database

Jia Deng, Wei Dong, Richard Socher, Li-Jia Li, Kai Li, and Li Fei-Fei. Imagenet: A large-scale hierarchical image database. In2009 IEEE conference on computer vision and pattern recognition, pages 248–255. Ieee, 2009. 6

2009

[17] [17]

Cogview: Mastering text-to-image generation via transformers.Advances in neural information processing systems, 34:19822–19835, 2021

Ming Ding, Zhuoyi Yang, Wenyi Hong, Wendi Zheng, Chang Zhou, Da Yin, Junyang Lin, Xu Zou, Zhou Shao, Hongxia Yang, et al. Cogview: Mastering text-to-image generation via transformers.Advances in neural information processing systems, 34:19822–19835, 2021. 2

2021

[18] [18]

Domain generalization via model-agnostic learning of semantic features.Advances in neural informa- tion processing systems, 32, 2019

Qi Dou, Daniel Coelho de Castro, Konstantinos Kamnitsas, and Ben Glocker. Domain generalization via model-agnostic learning of semantic features.Advances in neural informa- tion processing systems, 32, 2019. 2

2019

[19] [19]

Adversarially adaptive normal- ization for single domain generalization

Xinjie Fan, Qifei Wang, Junjie Ke, Feng Yang, Boqing Gong, and Mingyuan Zhou. Adversarially adaptive normal- ization for single domain generalization. InProceedings of the IEEE/CVF conference on Computer Vision and Pattern Recognition, pages 8208–8217, 2021. 2

2021

[20] [20]

An Image is Worth One Word: Personalizing Text-to-Image Generation using Textual Inversion

Rinon Gal, Yuval Alaluf, Yuval Atzmon, Or Patash- nik, Amit H Bermano, Gal Chechik, and Daniel Cohen- Or. An image is worth one word: Personalizing text-to- image generation using textual inversion.arXiv preprint arXiv:2208.01618, 2022. 3

work page internal anchor Pith review Pith/arXiv arXiv 2022

[21] [21]

Domain-adversarial training of neural networks.Journal of machine learning research, 17(59):1–35, 2016

Yaroslav Ganin, Evgeniya Ustinova, Hana Ajakan, Pas- cal Germain, Hugo Larochelle, Franc ¸ois Laviolette, Mario March, and Victor Lempitsky. Domain-adversarial training of neural networks.Journal of machine learning research, 17(59):1–35, 2016. 1, 2, 6, 7

2016

[22] [22]

Generative adversarial networks.Commu- nications of the ACM, 63(11):139–144, 2020

Ian Goodfellow, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil Ozair, Aaron Courville, and Yoshua Bengio. Generative adversarial networks.Commu- nications of the ACM, 63(11):139–144, 2020. 2

2020

[23] [23]

Domaindrop: Sup- pressing domain-sensitive channels for domain generaliza- tion

Jintao Guo, Lei Qi, and Yinghuan Shi. Domaindrop: Sup- pressing domain-sensitive channels for domain generaliza- tion. InProceedings of the IEEE/CVF international confer- ence on computer vision, pages 19114–19124, 2023. 7

2023

[24] [24]

Deep residual learning for image recognition

Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. Deep residual learning for image recognition. InProceed- ings of the IEEE conference on computer vision and pattern recognition, pages 770–778, 2016. 6

2016

[25] [25]

Augmix: A simple data processing method to improve robustness and uncertainty.arXiv preprint arXiv:1912.02781, 2019

Dan Hendrycks, Norman Mu, Ekin D Cubuk, Barret Zoph, Justin Gilmer, and Balaji Lakshminarayanan. Augmix: A simple data processing method to improve robustness and uncertainty.arXiv preprint arXiv:1912.02781, 2019. 6

work page arXiv 1912

[26] [26]

Distilling the Knowledge in a Neural Network

Geoffrey Hinton, Oriol Vinyals, and Jeff Dean. Distill- ing the knowledge in a neural network.arXiv preprint arXiv:1503.02531, 2015. 4

work page internal anchor Pith review Pith/arXiv arXiv 2015

[27] [27]

Denoising dif- fusion probabilistic models.Advances in neural information processing systems, 33:6840–6851, 2020

Jonathan Ho, Ajay Jain, and Pieter Abbeel. Denoising dif- fusion probabilistic models.Advances in neural information processing systems, 33:6840–6851, 2020. 4

2020

[28] [28]

Representation enhancement-stabilization: Reducing bias- variance of domain generalization

Wei Huang, Yilei Shi, Zhitong Xiong, and Xiao Xiang Zhu. Representation enhancement-stabilization: Reducing bias- variance of domain generalization. 7

[29] [29]

Self-challenging improves cross-domain generalization

Zeyi Huang, Haohan Wang, Eric P Xing, and Dong Huang. Self-challenging improves cross-domain generalization. In Computer vision–ECCV 2020: 16th European conference, Glasgow, UK, August 23–28, 2020, proceedings, part II 16, pages 124–140. Springer, 2020. 6, 7

2020

[30] [30]

Undoing the dam- age of dataset bias

Aditya Khosla, Tinghui Zhou, Tomasz Malisiewicz, Alexei A Efros, and Antonio Torralba. Undoing the dam- age of dataset bias. InComputer Vision–ECCV 2012: 12th European Conference on Computer Vision, Florence, Italy, October 7-13, 2012, Proceedings, Part I 12, pages 158–171. Springer, 2012. 2

2012

[31] [31]

Out-of-distribution general- ization via risk extrapolation (rex)

David Krueger, Ethan Caballero, Joern-Henrik Jacobsen, Amy Zhang, Jonathan Binas, Dinghuai Zhang, Remi Le Priol, and Aaron Courville. Out-of-distribution general- ization via risk extrapolation (rex). InInternational confer- ence on machine learning, pages 5815–5826. PMLR, 2021. 6, 7

2021

[32] [32]

Deeper, broader and artier domain generaliza- tion

Da Li, Yongxin Yang, Yi-Zhe Song, and Timothy M Hospedales. Deeper, broader and artier domain generaliza- tion. InProceedings of the IEEE international conference on computer vision, pages 5542–5550, 2017. 5, 6

2017

[33] [33]

Learning to generalize: Meta-learning for do- main generalization

Da Li, Yongxin Yang, Yi-Zhe Song, and Timothy Hospedales. Learning to generalize: Meta-learning for do- main generalization. InProceedings of the AAAI conference on artificial intelligence, 2018. 7

2018

[34] [34]

Prompt-driven dynamic object-centric learning for single do- main generalization

Deng Li, Aming Wu, Yaowei Wang, and Yahong Han. Prompt-driven dynamic object-centric learning for single do- main generalization. InProceedings of the IEEE/CVF Con- ference on Computer Vision and Pattern Recognition, pages 17606–17615, 2024. 6

2024

[35] [35]

Uncertainty model- ing for out-of-distribution generalization.arXiv preprint arXiv:2202.03958, 2022

Xiaotong Li, Yongxing Dai, Yixiao Ge, Jun Liu, Ying Shan, and Ling-Yu Duan. Uncertainty model- ing for out-of-distribution generalization.arXiv preprint arXiv:2202.03958, 2022. 6

work page arXiv 2022

[36] [36]

Domain generalization via conditional invari- ant representations

Ya Li, Mingming Gong, Xinmei Tian, Tongliang Liu, and Dacheng Tao. Domain generalization via conditional invari- ant representations. InProceedings of the AAAI conference on artificial intelligence, 2018. 6, 7

2018

[37] [37]

Transfer feature learning with joint distribution adaptation

Mingsheng Long, Jianmin Wang, Guiguang Ding, Jiaguang Sun, and Philip S Yu. Transfer feature learning with joint distribution adaptation. InProceedings of the IEEE inter- national conference on computer vision, pages 2200–2207,

[38] [38]

Grounding stylistic do- main generalization with quantitative domain shift measures and synthetic scene images

Yiran Luo, Joshua Feinglass, Tejas Gokhale, Kuan-Cheng Lee, Chitta Baral, and Yezhou Yang. Grounding stylistic do- main generalization with quantitative domain shift measures and synthetic scene images. InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 7303–7313, 2024. 7

2024

[39] [39]

Reducing domain gap by reduc- ing style bias

Hyeonseob Nam, HyunJae Lee, Jongchan Park, Wonjun Yoon, and Donggeun Yoo. Reducing domain gap by reduc- ing style bias. InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 8690– 8699, 2021. 6, 7

2021

[40] [40]

GLIDE: Towards Photorealistic Image Generation and Editing with Text-Guided Diffusion Models

Alex Nichol, Prafulla Dhariwal, Aditya Ramesh, Pranav Shyam, Pamela Mishkin, Bob McGrew, Ilya Sutskever, and Mark Chen. Glide: Towards photorealistic image generation and editing with text-guided diffusion models.arXiv preprint arXiv:2112.10741, 2021. 2

work page internal anchor Pith review Pith/arXiv arXiv 2021

[41] [41]

Permuted adain: Reducing the bias towards global statistics in image clas- sification

Oren Nuriel, Sagie Benaim, and Lior Wolf. Permuted adain: Reducing the bias towards global statistics in image clas- sification. InProceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 9482–9491,

[42] [42]

Transferrable prototypical networks for unsupervised domain adaptation

Yingwei Pan, Ting Yao, Yehao Li, Yu Wang, Chong-Wah Ngo, and Tao Mei. Transferrable prototypical networks for unsupervised domain adaptation. InProceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 2239–2247, 2019. 1

2019

[43] [43]

Moment matching for multi-source domain adaptation

Xingchao Peng, Qinxun Bai, Xide Xia, Zijun Huang, Kate Saenko, and Bo Wang. Moment matching for multi-source domain adaptation. InProceedings of the IEEE/CVF inter- national conference on computer vision, pages 1406–1415,

[44] [44]

Domain agnostic learning with disentangled repre- sentations

Xingchao Peng, Zijun Huang, Ximeng Sun, and Kate Saenko. Domain agnostic learning with disentangled repre- sentations. InInternational conference on machine learning, pages 5102–5112. PMLR, 2019. 2

2019

[45] [45]

Learning to learn single domain generalization

Fengchun Qiao, Long Zhao, and Xi Peng. Learning to learn single domain generalization. InProceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 12556–12565, 2020. 2

2020

[46] [46]

Bmd: A general class-balanced multicen- tric dynamic prototype strategy for source-free domain adap- tation

Sanqing Qu, Guang Chen, Jing Zhang, Zhijun Li, Wei He, and Dacheng Tao. Bmd: A general class-balanced multicen- tric dynamic prototype strategy for source-free domain adap- tation. InEuropean conference on computer vision, pages 165–182. Springer, 2022. 1

2022

[47] [47]

Modality-agnostic debiasing for sin- gle domain generalization

Sanqing Qu, Yingwei Pan, Guang Chen, Ting Yao, Changjun Jiang, and Tao Mei. Modality-agnostic debiasing for sin- gle domain generalization. InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 24142–24151, 2023. 5, 6

2023

[48] [48]

Upcycling models under domain and category shift

Sanqing Qu, Tianpei Zou, Florian R ¨ohrbein, Cewu Lu, Guang Chen, Dacheng Tao, and Changjun Jiang. Upcycling models under domain and category shift. InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 20019–20028, 2023. 1

2023

[49] [49]

Learning transferable visual models from natural language supervi- sion

Alec Radford, Jong Wook Kim, Chris Hallacy, Aditya Ramesh, Gabriel Goh, Sandhini Agarwal, Girish Sastry, Amanda Askell, Pamela Mishkin, Jack Clark, et al. Learning transferable visual models from natural language supervi- sion. InInternational conference on machine learning, pages 8748–8763. PMLR, 2021. 4

2021

[50] [50]

Fishr: Invariant gradient variances for out-of-distribution generalization

Alexandre Rame, Corentin Dancette, and Matthieu Cord. Fishr: Invariant gradient variances for out-of-distribution generalization. InInternational Conference on Machine Learning, pages 18347–18377. PMLR, 2022. 6

2022

[51] [51]

Zero-shot text-to-image generation

Aditya Ramesh, Mikhail Pavlov, Gabriel Goh, Scott Gray, Chelsea V oss, Alec Radford, Mark Chen, and Ilya Sutskever. Zero-shot text-to-image generation. InInternational confer- ence on machine learning, pages 8821–8831. Pmlr, 2021. 2

2021

[52] [52]

High-resolution image synthesis with latent diffusion models

Robin Rombach, Andreas Blattmann, Dominik Lorenz, Patrick Esser, and Bj ¨orn Ommer. High-resolution image synthesis with latent diffusion models. InProceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 10684–10695, 2022. 2, 3, 4

2022

[53] [53]

U- net: Convolutional networks for biomedical image segmen- tation

Olaf Ronneberger, Philipp Fischer, and Thomas Brox. U- net: Convolutional networks for biomedical image segmen- tation. InMedical image computing and computer-assisted intervention–MICCAI 2015: 18th international conference, Munich, Germany, October 5-9, 2015, proceedings, part III 18, pages 234–241. Springer, 2015. 4

2015

[54] [54]

Distributionally Robust Neural Networks for Group Shifts: On the Importance of Regularization for Worst-Case Generalization

Shiori Sagawa, Pang Wei Koh, Tatsunori B Hashimoto, and Percy Liang. Distributionally robust neural networks for group shifts: On the importance of regularization for worst- case generalization.arXiv preprint arXiv:1911.08731, 2019. 6, 7

work page internal anchor Pith review Pith/arXiv arXiv 1911

[55] [55]

Generalizing Across Domains via Cross-Gradient Training

Shiv Shankar, Vihari Piratla, Soumen Chakrabarti, Sid- dhartha Chaudhuri, Preethi Jyothi, and Sunita Sarawagi. Generalizing across domains via cross-gradient training. arXiv preprint arXiv:1804.10745, 2018. 2

work page internal anchor Pith review Pith/arXiv arXiv 2018

[56] [56]

Unknown domain inconsistency minimization for domain generalization

Seungjae Shin, HeeSun Bae, Byeonghu Na, Yoon-Yeong Kim, and Il-chul Moon. Unknown domain inconsistency minimization for domain generalization. InThe Twelfth In- ternational Conference on Learning Representations. 5, 6

[57] [57]

Deep coral: Correlation alignment for deep domain adaptation

Baochen Sun and Kate Saenko. Deep coral: Correlation alignment for deep domain adaptation. InComputer Vision– ECCV 2016 Workshops: Amsterdam, The Netherlands, Oc- tober 8-10 and 15-16, 2016, Proceedings, Part III 14, pages 443–450. Springer, 2016. 6, 7

2016

[58] [58]

Rethinking multi- domain generalization with a general learning objective

Zhaorui Tan, Xi Yang, and Kaizhu Huang. Rethinking multi- domain generalization with a general learning objective. In Proceedings of the IEEE/CVF Conference on Computer Vi- sion and Pattern Recognition, pages 23512–23522, 2024. 7

2024

[59] [59]

Springer science & business media, 2013

Vladimir Vapnik.The nature of statistical learning theory. Springer science & business media, 2013. 6, 7

2013

[60] [60]

Deep hashing network for unsupervised domain adaptation

Hemanth Venkateswara, Jose Eusebio, Shayok Chakraborty, and Sethuraman Panchanathan. Deep hashing network for unsupervised domain adaptation. InProceedings of the IEEE conference on computer vision and pattern recogni- tion, pages 5018–5027, 2017. 6

2017

[61] [61]

Addressing model vul- nerability to distributional shifts over image transformation sets

Riccardo V olpi and Vittorio Murino. Addressing model vul- nerability to distributional shifts over image transformation sets. InProceedings of the IEEE/CVF International Confer- ence on Computer Vision, pages 7980–7989, 2019. 2

2019

[62] [62]

Generalizing to unseen domains via adversarial data augmentation.Ad- vances in neural information processing systems, 31, 2018

Riccardo V olpi, Hongseok Namkoong, Ozan Sener, John C Duchi, Vittorio Murino, and Silvio Savarese. Generalizing to unseen domains via adversarial data augmentation.Ad- vances in neural information processing systems, 31, 2018. 2

2018

[63] [63]

Meta-learning for domain generalization in semantic parsing.arXiv preprint arXiv:2010.11988, 2020

Bailin Wang, Mirella Lapata, and Ivan Titov. Meta-learning for domain generalization in semantic parsing.arXiv preprint arXiv:2010.11988, 2020. 2

work page arXiv 2010

[64] [64]

Stpr: Spatiotemporal preservation and routing for exemplar-free video class-incremental learning

Huaijie Wang, De Cheng, Guozhang Li, Zhipeng Xu, Lingfeng He, Jie Li, Nannan Wang, and Xinbo Gao. Stpr: Spatiotemporal preservation and routing for exemplar-free video class-incremental learning. InProceedings of the Four- teenth International Conference on Learning Representa- tions, 2026. 1

2026

[65] [65]

Generalizing to unseen domains: A survey on do- main generalization.IEEE transactions on knowledge and data engineering, 35(8):8052–8072, 2022

Jindong Wang, Cuiling Lan, Chang Liu, Yidong Ouyang, Tao Qin, Wang Lu, Yiqiang Chen, Wenjun Zeng, and S Yu Philip. Generalizing to unseen domains: A survey on do- main generalization.IEEE transactions on knowledge and data engineering, 35(8):8052–8072, 2022. 1

2022

[66] [66]

Sharpness-aware gradient matching for domain generaliza- tion

Pengfei Wang, Zhaoxiang Zhang, Zhen Lei, and Lei Zhang. Sharpness-aware gradient matching for domain generaliza- tion. InProceedings of the IEEE/CVF Conference on Com- puter Vision and Pattern Recognition, pages 3769–3778,

[67] [67]

Learning to diversify for single do- main generalization

Zijian Wang, Yadan Luo, Ruihong Qiu, Zi Huang, and Mahsa Baktashmotlagh. Learning to diversify for single do- main generalization. InProceedings of the IEEE/CVF In- ternational Conference on Computer Vision, pages 834–843,

[68] [68]

Imagere- ward: Learning and evaluating human preferences for text- to-image generation.Advances in Neural Information Pro- cessing Systems, 36, 2024

Jiazheng Xu, Xiao Liu, Yuchen Wu, Yuxuan Tong, Qinkai Li, Ming Ding, Jie Tang, and Yuxiao Dong. Imagere- ward: Learning and evaluating human preferences for text- to-image generation.Advances in Neural Information Pro- cessing Systems, 36, 2024. 3

2024

[69] [69]

Reasoning-driven multimodal llm for domain generalization

Zhipeng Xu, Zilong Wang, Xinyang Jiang, Dongsheng Li, De Cheng, and Nannan Wang. Reasoning-driven multimodal llm for domain generalization. InThe Fourteenth Interna- tional Conference on Learning Representations, 2026. 1

2026

[70] [70]

Improve unsupervised domain adaptation with mixup training.arXiv preprint arXiv:2001.00677, 2020

Shen Yan, Huan Song, Nanxiang Li, Lincan Zou, and Liu Ren. Improve unsupervised domain adaptation with mixup training.arXiv preprint arXiv:2001.00677, 2020. 6

work page arXiv 2001

[71] [71]

Cutmix: Regu- larization strategy to train strong classifiers with localizable features

Sangdoo Yun, Dongyoon Han, Seong Joon Oh, Sanghyuk Chun, Junsuk Choe, and Youngjoon Yoo. Cutmix: Regu- larization strategy to train strong classifiers with localizable features. InProceedings of the IEEE/CVF international con- ference on computer vision, pages 6023–6032, 2019. 6

2019

[72] [72]

mixup: Beyond empirical risk minimiza- tion

Hongyi Zhang, Moustapha Cisse, Yann N Dauphin, and David Lopez-Paz. mixup: Beyond empirical risk minimiza- tion. InInternational Conference on Learning Representa- tions, 2018. 6

2018

[73] [73]

Towards principled disentanglement for domain generalization

Hanlin Zhang, Yi-Fan Zhang, Weiyang Liu, Adrian Weller, Bernhard Sch ¨olkopf, and Eric P Xing. Towards principled disentanglement for domain generalization. InProceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 8024–8034, 2022. 1

2022

[74] [74]

Adaptive risk min- imization: Learning to adapt to domain shift.Advances in Neural Information Processing Systems, 34:23664–23678,

Marvin Zhang, Henrik Marklund, Nikita Dhawan, Abhishek Gupta, Sergey Levine, and Chelsea Finn. Adaptive risk min- imization: Learning to adapt to domain shift.Advances in Neural Information Processing Systems, 34:23664–23678,

[75] [75]

Exact feature distribution matching for arbitrary style transfer and domain generalization

Yabin Zhang, Minghan Li, Ruihuang Li, Kui Jia, and Lei Zhang. Exact feature distribution matching for arbitrary style transfer and domain generalization. InProceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 8035–8045, 2022. 6

2022

[76] [76]

Maximum-entropy adversarial data augmentation for im- proved generalization and robustness.Advances in Neural Information Processing Systems, 33:14435–14447, 2020

Long Zhao, Ting Liu, Xi Peng, and Dimitris Metaxas. Maximum-entropy adversarial data augmentation for im- proved generalization and robustness.Advances in Neural Information Processing Systems, 33:14435–14447, 2020. 2

2020

[77] [77]

Domain generalization with mixstyle.arXiv preprint arXiv:2104.02008, 2021

Kaiyang Zhou, Yongxin Yang, Yu Qiao, and Tao Xi- ang. Domain generalization with mixstyle.arXiv preprint arXiv:2104.02008, 2021. 2, 6, 7

work page arXiv 2021

[78] [78]

Surrogate gap minimization improves sharpness-aware training.arXiv preprint arXiv:2203.08065,

Juntang Zhuang, Boqing Gong, Liangzhe Yuan, Yin Cui, Hartwig Adam, Nicha Dvornek, Sekhar Tatikonda, James Duncan, and Ting Liu. Surrogate gap minimization improves sharpness-aware training.arXiv preprint arXiv:2203.08065,

work page arXiv