
arxiv: 2311.14756 · v2 · submitted 2023-11-23 · 💻 cs.LG · cs.AI

Task-Distributionally Robust Data-Free Meta-Learning

Pith reviewed 2026-05-24 05:40 UTC · model grok-4.3

classification 💻 cs.LG cs.AI
keywords data-free meta-learning · task distribution shift · task distribution corruption · model inversion · meta-learning robustness · synthetic task reconstruction · automatic model selection · trustworthy machine learning

The pith

Data-free meta-learning is vulnerable to task distribution shifts and corruption by harmful models, which a three-component framework mitigates via synthetic reconstruction, memory interpolation, and automatic selection.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper establishes that data-free meta-learning from pre-trained models without original data suffers from two overlooked problems: sequential shifts in task distributions that cause forgetting of prior meta-knowledge, and exposure to untrustworthy models that corrupt the learning process. It proposes a framework that first reconstructs synthetic tasks from the models using inversion techniques, then replays interpolated historical tasks to retain earlier knowledge, and finally applies automatic selection to exclude harmful models. A sympathetic reader would care because real-world meta-learning often occurs with evolving or unverified model pools and no access to source data, making robustness essential for practical deployment. The work shows these components together enable trustworthy meta-learning under distribution uncertainty.

Core claim

By reconstructing synthetic tasks from multiple pre-trained models, replaying interpolated historical tasks to recall previous meta-knowledge, and incorporating an automatic model selection mechanism to filter untrustworthy models, data-free meta-learning achieves robustness against task-distribution shift and task-distribution corruption without requiring original training data or labeled validation sets.

What carries the argument

The trustworthy DFML framework consisting of synthetic task reconstruction via model inversion, meta-learning with task memory interpolation, and automatic model selection.

If this is right

  • Synthetic task reconstruction allows meta-learning to proceed from any collection of pre-trained models even when original data is unavailable or private.
  • Task memory interpolation prevents catastrophic forgetting as the sequence of tasks evolves over time.
  • Automatic model selection removes the need for manual vetting and protects against deceptive or low-quality models in the pool.
  • The overall approach operates in a fully data-free regime while addressing both robustness and security concerns simultaneously.
  • Releasing the code enables direct testing of these robustness gains on new model collections.
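
The task memory interpolation named in the second bullet can be pictured as a mixup-style blend of two stored tasks. The sketch below is an illustration under assumptions, not the paper's procedure: the `interpolate_tasks` helper, the Beta(alpha, alpha) mixing coefficient, and the representation of a task as a list of (features, soft label) pairs are all hypothetical choices.

```python
import random


def interpolate_tasks(task_a, task_b, alpha=0.5, rng=None):
    """Blend two stored tasks example by example (hypothetical helper).

    Each task is a list of (features, soft_label) pairs of equal length.
    """
    rng = rng or random.Random()
    lam = rng.betavariate(alpha, alpha)  # mixing coefficient in [0, 1]
    mixed = []
    for (xa, ya), (xb, yb) in zip(task_a, task_b):
        # Convex combination of features and of soft labels
        x = [lam * u + (1 - lam) * v for u, v in zip(xa, xb)]
        y = [lam * u + (1 - lam) * v for u, v in zip(ya, yb)]
        mixed.append((x, y))
    return lam, mixed


# Two tiny 2-way tasks held in a replay memory (made-up values)
memory = [
    [([1.0, 0.0], [1.0, 0.0]), ([0.0, 1.0], [0.0, 1.0])],
    [([0.5, 0.5], [1.0, 0.0]), ([0.2, 0.8], [0.0, 1.0])],
]
lam, replay_task = interpolate_tasks(memory[0], memory[1], rng=random.Random(0))
```

Because the blend is a convex combination, each interpolated soft label still sums to one, so the replayed task can be fed to the same loss as a reconstructed one.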

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The interpolation strategy may generalize to other continual or online meta-learning settings where task order is unpredictable.
  • If inversion quality varies across models, weighting the reconstructed tasks by estimated fidelity could further improve results.
  • Similar selection mechanisms might apply in federated or distributed learning where participants contribute models of unknown quality.
  • The framework highlights that robustness in data-free settings requires explicit mechanisms for both memory retention and source filtering.
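
The fidelity-weighting idea in the second bullet could look roughly like the following. This is a sketch of the editorial extension only, with assumed ingredients: the per-model fidelity scores, the softmax weighting, and the `temperature` parameter are hypothetical and do not come from the paper.

```python
import math


def fidelity_weights(scores, temperature=1.0):
    """Softmax over per-model fidelity scores (higher score, larger weight)."""
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp((s - top) / temperature) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def weighted_meta_loss(task_losses, scores, temperature=1.0):
    """Average the per-task losses using fidelity-derived weights."""
    weights = fidelity_weights(scores, temperature)
    return sum(w * loss for w, loss in zip(weights, task_losses))


# Hypothetical fidelity scores and losses for three reconstructed tasks
scores = [2.0, 0.5, -1.0]
losses = [0.4, 0.9, 1.5]
weights = fidelity_weights(scores)
loss = weighted_meta_loss(losses, scores)
```

A large `temperature` pushes the weights toward a uniform average, which recovers unweighted meta-training as a limiting case.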

Load-bearing premise

Model inversion produces synthetic tasks representative enough of the original unseen distributions for effective meta-learning, and the automatic selector can distinguish beneficial from harmful models without any labeled validation data.

What would settle it

Two experiments would settle it. First, run the framework on a model pool where inversion yields tasks that miss key features of the original distribution, and check whether meta-test accuracy on held-out tasks collapses relative to baselines. Second, plant a known harmful model in the pool and check whether the selector admits it and performance degrades.

Figures

Figures reproduced from arXiv: 2311.14756 by Baoyuan Wu, Chun Yuan, Dacheng Tao, Li Shen, Yongxian Wei, Zhenyi Wang, Zixuan Hu.

Figure 1: Concepts of TDS and TDC. (Left) At a certain training it

Figure 2: Vulnerabilities of PURER for TDS (left) and TDC (right).

Figure 3: TDC arises due to the use of untrusted pre-trained models,

Figure 4: Stability comparison of SPAN versus PURER across 60K meta-training iterations. This figure illustrates the superiority of SPAN, highlighting its significantly higher and more stable meta-testing accuracy throughout the entire meta-training phase.
(Adjacent body text captured with the caption:) … and pre-trained models employed a Conv4 architecture for consistency with existing works. Hyperparameters included an inner learning rate of 0.01 and an outer lear…

Figure 5: (Top) Performance Improvements with AMS on CIFAR-FS. (Bottom) Increasing trends in RSR indicate AMS’s enhanced

Figure 6: Visualization of data generated from pre-trained Conv4.

Figure 7: T-SNE visualization of interpolated tasks.
Original abstract

Data-Free Meta-Learning (DFML) aims to enable efficient learning of unseen few-shot tasks, by meta-learning from multiple pre-trained models without accessing their original training data. While existing DFML methods typically generate synthetic data from these models to perform meta-learning, a comprehensive analysis of DFML's robustness, particularly its failure modes and vulnerability to potential attacks, remains notably absent. Such an analysis is crucial as algorithms often operate in complex and uncertain real-world environments. This paper fills this significant gap by systematically investigating the robustness of DFML, identifying two critical but previously overlooked vulnerabilities: Task-Distribution Shift (TDS) and Task-Distribution Corruption (TDC). TDS refers to the sequential shifts in the evolving task distribution, leading to the catastrophic forgetting of previously learned meta-knowledge. TDC exposes a security flaw of DFML, revealing its susceptibility to attacks when the pre-trained model pool includes untrustworthy models that deceptively claim to be beneficial but are actually harmful. To mitigate these vulnerabilities, we propose a trustworthy DFML framework comprising three components: synthetic task reconstruction, meta-learning with task memory interpolation, and automatic model selection. Specifically, utilizing model inversion techniques, we reconstruct synthetic tasks from multiple pre-trained models to perform meta-learning. To prevent forgetting, we introduce a strategy to replay interpolated historical tasks to efficiently recall previous meta-knowledge. Furthermore, our framework seamlessly incorporates an automatic model selection mechanism to automatically filter out untrustworthy models during the meta-learning process. Code is available at https://github.com/Egg-Hu/Trustworthy-DFML.
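
To make "model inversion" concrete, here is a minimal sketch under strong simplifying assumptions. It freezes a toy linear softmax classifier and runs gradient descent on the input until the model predicts a chosen class. The toy weights, the `invert` helper, the L2 prior, and the step size are hypothetical. The paper works with pre-trained Conv4 networks, and its actual inversion objective is not given in the text above.

```python
import math


def softmax(z):
    top = max(z)
    exps = [math.exp(v - top) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


def logits(W, b, x):
    """Linear layer: one logit per row of W."""
    return [sum(w * v for w, v in zip(row, x)) + b_k for row, b_k in zip(W, b)]


def invert(W, b, target, steps=200, lr=0.5, l2=0.01):
    """Gradient descent on the input so the frozen model predicts `target`."""
    x = [0.0] * len(W[0])
    for _ in range(steps):
        p = softmax(logits(W, b, x))
        # d(cross-entropy)/d(logit_k) = p_k - 1[k == target]
        err = [p_k - (1.0 if k == target else 0.0) for k, p_k in enumerate(p)]
        # Chain rule through the linear layer, plus an L2 prior on the input
        grad = [sum(err[k] * W[k][j] for k in range(len(W))) + l2 * x[j]
                for j in range(len(x))]
        x = [x_j - lr * g for x_j, g in zip(x, grad)]
    return x


# A frozen toy "pre-trained model" with 3 classes and 2 input features
W = [[1.0, -1.0], [-1.0, 1.0], [0.5, 0.5]]
b = [0.0, 0.0, 0.0]
synthetic = {c: invert(W, b, c) for c in range(3)}
```

Only the input is updated while the weights stay fixed, which is what lets synthetic examples be produced without access to the original training data.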

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper identifies two previously overlooked vulnerabilities in Data-Free Meta-Learning (DFML)—Task-Distribution Shift (TDS), which causes catastrophic forgetting of meta-knowledge due to evolving task distributions, and Task-Distribution Corruption (TDC), which arises from untrustworthy models in the pre-trained pool—and proposes a trustworthy DFML framework with three components: synthetic task reconstruction via model inversion, meta-learning augmented by task memory interpolation for replay, and an automatic model selection mechanism to filter harmful models. The approach is evaluated on standard few-shot benchmarks with code released.

Significance. If the empirical results hold, the work makes a meaningful contribution by systematically analyzing robustness failures in DFML and providing concrete mitigations, with the public code release supporting reproducibility. This could improve reliability of data-free meta-learning in uncertain environments, though the significance depends on whether the proposed components demonstrably outperform baselines under controlled TDS and TDC conditions.

major comments (2)
  1. [Abstract / Framework description] The central claim that synthetic tasks reconstructed via model inversion are sufficiently representative for effective meta-learning and TDC detection (as stated in the abstract and framework description) is load-bearing; without explicit quantitative fidelity checks (e.g., feature-space divergence, downstream performance gap, or diversity metrics between synthetic and held-out real distributions), it is unclear whether the interpolation replay and selection mechanism can recover true meta-knowledge or reliably separate beneficial from harmful models.
  2. [Proposed framework (automatic model selection)] The automatic model selection component is described as filtering untrustworthy models during meta-learning, but the manuscript must specify the exact selection criterion (e.g., a loss threshold, consistency metric, or learned heuristic) and demonstrate via ablation that it does not rely on the same synthetic data used for evaluation, to avoid circularity in the TDC mitigation claim.
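
One of the fidelity checks requested in major comment 1, a feature-space divergence, could be computed as a squared maximum mean discrepancy (MMD). The sketch below uses the standard biased estimator with an RBF kernel. The kernel bandwidth `gamma` and the toy feature vectors are assumptions for illustration, and nothing here reflects measurements from the paper.

```python
import math


def rbf(x, y, gamma):
    """RBF kernel exp(-gamma * ||x - y||^2)."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))


def mmd2(xs, ys, gamma=1.0):
    """Biased estimate of squared MMD between two samples of feature vectors."""
    def mean_kernel(a, b):
        return sum(rbf(u, v, gamma) for u in a for v in b) / (len(a) * len(b))
    return mean_kernel(xs, xs) + mean_kernel(ys, ys) - 2.0 * mean_kernel(xs, ys)


# Made-up feature vectors: a "real" sample, a nearby synthetic one, a distant one
real = [[0.0, 0.0], [0.1, -0.1], [-0.1, 0.1]]
close = [[0.05, 0.0], [0.0, 0.05], [-0.05, -0.05]]
far = [[2.0, 2.0], [2.1, 1.9], [1.9, 2.1]]

gap_close = mmd2(real, close)
gap_far = mmd2(real, far)
```

A smaller value means the two samples are harder to tell apart under the chosen kernel, so reporting it alongside downstream accuracy would separate inversion quality from meta-learner quality.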
minor comments (2)
  1. [Method] Notation for task memory interpolation (e.g., how historical tasks are sampled and combined) should be formalized with an equation or algorithm box for clarity.
  2. [Introduction / Analysis] The abstract mentions 'comprehensive analysis' of failure modes, but the manuscript should include a dedicated section or table enumerating the attack models or shift scenarios considered.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive comments on our manuscript. We address each major comment below, indicating where revisions to the manuscript are warranted to strengthen the presentation of the framework and its claims.

Point-by-point responses
  1. Referee: [Abstract / Framework description] The central claim that synthetic tasks reconstructed via model inversion are sufficiently representative for effective meta-learning and TDC detection (as stated in the abstract and framework description) is load-bearing; without explicit quantitative fidelity checks (e.g., feature-space divergence, downstream performance gap, or diversity metrics between synthetic and held-out real distributions), it is unclear whether the interpolation replay and selection mechanism can recover true meta-knowledge or reliably separate beneficial from harmful models.

    Authors: We agree that the representativeness of the synthetic tasks is central to the claims. The manuscript currently relies on downstream few-shot classification performance as indirect validation of utility. To directly address the concern, we will add quantitative fidelity analyses in the revision, including feature-space divergence metrics (e.g., MMD or FID between synthetic and held-out real task distributions) and diversity measures, along with performance gap comparisons where real data is available for reference. These additions will clarify the basis for the interpolation and selection components. revision: yes

  2. Referee: [Proposed framework (automatic model selection)] The automatic model selection component is described as filtering untrustworthy models during meta-learning, but the manuscript must specify the exact selection criterion (e.g., a loss threshold, consistency metric, or learned heuristic) and demonstrate via ablation that it does not rely on the same synthetic data used for evaluation, to avoid circularity in the TDC mitigation claim.

    Authors: We acknowledge that the manuscript describes the automatic model selection at a high level without providing the precise criterion or the requested ablation. In the revision we will explicitly state the selection criterion (a consistency-based metric computed on reconstructed tasks) and include an ablation study that isolates the selection mechanism from the primary evaluation data to mitigate circularity concerns. This will be added to the experimental section. revision: yes
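
The second response names only "a consistency-based metric computed on reconstructed tasks" and gives no formula. One possible reading, offered purely as an assumption, scores each model by how often the label attached to a reconstructed example agrees with a second label source (for instance, the meta-learner's prediction after adaptation), then drops models below a cutoff. The `consistency` and `select_models` helpers, the 0.6 threshold, and the toy records are hypothetical.

```python
def consistency(claimed, predicted):
    """Fraction of reconstructed examples where the two label sources agree."""
    return sum(c == p for c, p in zip(claimed, predicted)) / len(claimed)


def select_models(records, threshold=0.6):
    """Keep models whose consistency score meets a (hypothetical) threshold."""
    scores = {name: consistency(c, p) for name, (c, p) in records.items()}
    kept = sorted(name for name, s in scores.items() if s >= threshold)
    return kept, scores


# Per model: (labels claimed for reconstructed examples, labels predicted)
records = {
    "model_a": ([0, 1, 2, 1, 0], [0, 1, 2, 1, 1]),  # 4 of 5 agree
    "model_b": ([0, 1, 2, 1, 0], [2, 0, 1, 1, 2]),  # 1 of 5 agree
    "model_c": ([2, 2, 0, 1, 0], [2, 2, 0, 1, 0]),  # 5 of 5 agree
}
kept, scores = select_models(records)
```

The ablation the referee asks for would amount to computing these scores on reconstructed tasks that are disjoint from those used to report final accuracy.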

Circularity Check

0 steps flagged

No circularity: framework components defined independently of evaluation

full rationale

The paper identifies TDS and TDC vulnerabilities and proposes a three-component framework (synthetic task reconstruction via model inversion, meta-learning with task memory interpolation, automatic model selection). No equations, fitted parameters, or self-citations appear in the provided text that reduce any claimed result to its own inputs by construction. The components are procedural definitions whose success is evaluated externally rather than tautologically. This matches the default case of a self-contained methodological paper with no load-bearing self-referential steps.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract-only review yields no explicit free parameters, axioms, or invented entities; any implementation details such as interpolation weights or selection thresholds would be free parameters but are not stated here.

pith-pipeline@v0.9.0 · 5826 in / 1148 out tokens · 22080 ms · 2026-05-24T05:40:20.884673+00:00 · methodology

