arxiv: 2507.22057 · v2 · pith:STTQOMAQnew · submitted 2025-07-29 · 💻 cs.CV

MetaLab: Few-Shot Game Changer for Image Recognition

Pith reviewed 2026-05-21 23:57 UTC · model grok-4.3

classification 💻 cs.CV
keywords few-shot image recognition · meta-learning · CIELab color space · graph neural networks · domain transformation · one-shot learning · coherent learning

The pith

MetaLab transforms images to CIELab space and uses mutual graph learning to reach near 99 percent accuracy with one sample per class.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper proposes MetaLab as a new approach to few-shot image recognition that converts input images into the CIELab color space and runs two collaborating networks to pull out grouped features while allowing lightness and color information to learn from each other. A sympathetic reader would care because conventional image recognition needs large labeled sets, but this method targets the practical gap where only one example per class is available, as in rare species identification or specialized industrial inspection. If the central claim holds, recognition systems could operate effectively in data-scarce environments without retraining on massive datasets or heavy domain-specific adjustments.

Core claim

The authors claim that CIELab-guided coherent meta-learning, built from LabNet for domain transformation and feature grouping plus coherent LabGNN for mutual learning between lightness and color graphs, produces features that deliver high accuracy, robust performance, and strong generalization on one-shot per class across coarse-grained, fine-grained, and cross-domain benchmarks, reaching approximately 99 percent accuracy close to human recognition levels.

What carries the argument

The argument rests on CIELab-Guided Coherent Meta-Learning, which pairs LabNet (it transforms images to CIELab space and extracts grouped features) with coherent LabGNN (it performs mutual learning between a lightness graph and a color graph).
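
As a rough illustration of the domain-transformation step only, the sketch below converts sRGB pixels to CIELab with the standard D65 formulas and splits the result into a lightness channel (L) and a color pair (a, b). The function names `srgb_to_lab` and `split_lightness_color` are hypothetical. The paper's own conversion settings are not given on this page, and nothing here reproduces LabNet's feature grouping or LabGNN.

```python
import numpy as np

# Standard linear-sRGB -> XYZ matrix and D65 reference white.
# These are textbook constants, not values taken from the paper.
_RGB_TO_XYZ = np.array([
    [0.4124564, 0.3575761, 0.1804375],
    [0.2126729, 0.7151522, 0.0721750],
    [0.0193339, 0.1191920, 0.9503041],
])
_WHITE_D65 = np.array([0.95047, 1.0, 1.08883])


def srgb_to_lab(rgb):
    """Convert an (..., 3) sRGB array with values in [0, 1] to CIELab."""
    rgb = np.asarray(rgb, dtype=np.float64)
    # Undo the sRGB transfer curve to obtain linear light.
    linear = np.where(rgb <= 0.04045,
                      rgb / 12.92,
                      ((rgb + 0.055) / 1.055) ** 2.4)
    # Linear RGB -> XYZ, normalised by the reference white.
    xyz = linear @ _RGB_TO_XYZ.T / _WHITE_D65
    # Piecewise cube-root compression used by CIELab.
    delta = 6.0 / 29.0
    f = np.where(xyz > delta ** 3,
                 np.cbrt(xyz),
                 xyz / (3.0 * delta ** 2) + 4.0 / 29.0)
    lightness = 116.0 * f[..., 1] - 16.0
    a = 500.0 * (f[..., 0] - f[..., 1])
    b = 200.0 * (f[..., 1] - f[..., 2])
    return np.stack([lightness, a, b], axis=-1)


def split_lightness_color(lab):
    """Split a Lab array into (L,) and (a, b) parts along the last axis."""
    return lab[..., :1], lab[..., 1:]
```

A call such as `split_lightness_color(srgb_to_lab(image))` on an (H, W, 3) float image yields an (H, W, 1) lightness map and an (H, W, 2) color map, which is the kind of two-stream input a lightness graph and a color graph could each consume.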

If this is right

  • High accuracy and robustness hold across four coarse-grained, four fine-grained, and four cross-domain benchmarks when using one example per class.
  • Effective generalization occurs to new classes and domains without requiring large-scale retraining or post-processing adjustments.
  • Performance approaches the human recognition ceiling while keeping visual deviation low.
  • The two-network structure reduces reliance on extensive hyperparameter searches for few-shot tasks.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The approach could lower the data-collection burden for deploying image classifiers in specialized fields such as medical imaging or quality control.
  • Similar color-space guidance might transfer to other few-shot problems like object detection or segmentation where domain shifts are common.
  • If the mutual learning step proves stable, it could simplify training pipelines by removing the need for separate domain-adaptation stages.

Load-bearing premise

Converting images to CIELab space and letting lightness and color graphs learn from each other produces features that generalize to unseen classes and domains with little or no extra tuning.

What would settle it

Running the method on a fresh cross-domain few-shot benchmark and obtaining accuracy well below 99 percent or far from human levels with exactly one sample per class would show the central claim does not hold.
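
One way to make "well below 99 percent" concrete is to report the mean accuracy over many test episodes together with a 95 percent confidence interval. The sketch below uses the usual normal approximation (the 1.96 factor). The helper names and the decision rule in `contradicts_claim` are illustrative assumptions, not a protocol described in the paper.

```python
import math
import statistics


def mean_and_ci95(episode_accuracies):
    """Return (mean, half_width) of a normal-approximation 95% interval."""
    n = len(episode_accuracies)
    if n < 2:
        raise ValueError("need at least two episodes to estimate spread")
    mean = statistics.fmean(episode_accuracies)
    std = statistics.stdev(episode_accuracies)  # sample standard deviation
    half_width = 1.96 * std / math.sqrt(n)
    return mean, half_width


def contradicts_claim(episode_accuracies, claimed=0.99):
    """Hypothetical rule: the claim fails if the whole interval sits below it."""
    mean, half_width = mean_and_ci95(episode_accuracies)
    return mean + half_width < claimed
```

With only a handful of episodes the interval is wide, so a test of this kind is informative only when several hundred episodes or more are evaluated.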

Figures

Figures reproduced from arXiv: 2507.22057 by Chaofei Qi, Jianbin Qiu, Zhitai Liu.

Figure 1: Our MetaLab and Human Visual Meta-Learning
Figure 2: Network architecture of our LabNet. Input RGB images are converted into the CIELab (LAB) color space from the …
Figure 3: Network architecture of our LabGNN. Our graph networks are composed of two symmetric sub-graphs: color and …
Figure 4: Distinctions between the RGB and LAB channels.
Figure 5: Initializing Light graph and Color graph edges.
Figure 6: Procedure for Color Layering and Light Gradient.
Figure 7: Generalization analysis on high-way-1-shot sce…
Figure 8: Ablation experiments on LBHL, CED, and GG. Left subfigure illustrates the ablation experiments of LBHL on …
Original abstract

Difficult few-shot image recognition has significant application prospects, yet remaining the substantial technical gaps with the conventional large-scale image recognition. In this paper, we have proposed an efficient original method for few-shot image recognition, called CIELab-Guided Coherent Meta-Learning (MetaLab). Structurally, our MetaLab comprises two collaborative neural networks: LabNet, which can perform domain transformation for the CIELab color space and extract rich grouped features, and coherent LabGNN, which can facilitate mutual learning between lightness graph and color graph. For sufficient certification, we have implemented extensive comparative studies on four coarse-grained benchmarks, four fine-grained benchmarks, and four cross-domain few-shot benchmarks. Specifically, our method can achieve high accuracy, robust performance, and effective generalization capability with one-shot sample per class. Overall, all experiments have demonstrated that our MetaLab can approach 99% ↑↓ accuracy, reaching the human recognition ceiling with little visual deviation.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 2 minor

Summary. The paper proposes MetaLab, a few-shot image recognition method consisting of LabNet (which transforms images into CIELab color space and extracts grouped features) and coherent LabGNN (which performs mutual learning between lightness and color graphs). It reports extensive experiments across four coarse-grained, four fine-grained, and four cross-domain benchmarks, claiming that the approach achieves high accuracy, robust performance, and effective generalization with only one sample per class, approaching 99% accuracy and the human recognition ceiling.

Significance. If the performance claims were reproducible under standard protocols with proper baselines and ablations, the integration of CIELab domain transformation with graph-based coherence learning could represent a meaningful contribution to few-shot learning by exploiting color space properties for better feature generalization. The multi-benchmark evaluation scope is a positive aspect.

major comments (3)
  1. [Abstract] Abstract: The central claim that MetaLab 'can approach 99% ↑↓ accuracy, reaching the human recognition ceiling' is presented without any baseline comparisons (e.g., to ProtoNet or MAML), error bars, dataset splits, or statistical details. This directly undermines the load-bearing performance assertions, as such numbers substantially exceed published one-shot results on standard benchmarks like mini-ImageNet.
  2. [Experiments] Experiments section: No training recipes, hyperparameter search protocol, ablation isolating the color-graph coherence term, or code availability is described, making it impossible to determine whether the reported results follow standard 5-way 1-shot evaluation or involve post-hoc adjustments, data leakage, or non-standard splits.
  3. [Method] Method: The description of mutual learning in LabGNN between lightness and color graphs lacks explicit equations or loss formulations for the coherence term, preventing verification that the features truly generalize to unseen classes without extensive tuning (the weakest assumption in the approach).
minor comments (2)
  1. [Abstract] The notation '99% ↑↓ accuracy' in the abstract is non-standard and should be clarified or replaced with conventional accuracy reporting.
  2. [Experiments] The manuscript would benefit from a table comparing against published baselines on the exact same splits used for the four benchmark categories.
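
For readers unfamiliar with the "standard 5-way 1-shot evaluation" the comments above refer to, the following sketch samples one such episode from a list of integer class labels. It is a generic illustration, not the authors' pipeline: the function name `sample_episode`, the default of 15 query images per class, and the index-based data layout are all assumptions.

```python
import random


def sample_episode(labels, n_way=5, k_shot=1, n_query=15, rng=None):
    """Sample one N-way K-shot episode.

    labels: sequence where labels[i] is the class of example i.
    Returns (support, query), each a list of (example_index, episode_label).
    """
    if rng is None:
        rng = random.Random()
    # Group example indices by class.
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)
    # Only classes with enough examples can fill both support and query sets.
    needed = k_shot + n_query
    eligible = [c for c, idxs in by_class.items() if len(idxs) >= needed]
    if len(eligible) < n_way:
        raise ValueError("not enough classes with sufficient examples")
    support, query = [], []
    for episode_label, cls in enumerate(rng.sample(eligible, n_way)):
        picked = rng.sample(by_class[cls], needed)
        # Support and query come from one draw without replacement,
        # so no image appears in both sets.
        support += [(i, episode_label) for i in picked[:k_shot]]
        query += [(i, episode_label) for i in picked[k_shot:]]
    return support, query
```

Drawing the episode's classes only from a held-out test split, and keeping the support and query indices disjoint as done here, are two of the conditions a reader would want confirmed before comparing reported numbers against published baselines.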

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for their detailed and constructive review of our manuscript. We address each major comment below, providing clarifications and committing to revisions that strengthen the presentation and reproducibility of our work without altering the core claims or experimental scope.

Point-by-point responses
  1. Referee: [Abstract] Abstract: The central claim that MetaLab 'can approach 99% ↑↓ accuracy, reaching the human recognition ceiling' is presented without any baseline comparisons (e.g., to ProtoNet or MAML), error bars, dataset splits, or statistical details. This directly undermines the load-bearing performance assertions, as such numbers substantially exceed published one-shot results on standard benchmarks like mini-ImageNet.

    Authors: The 99% figure represents the average accuracy across our 12 benchmarks (four coarse-grained, four fine-grained, and four cross-domain), which differ from the standard mini-ImageNet 5-way 1-shot protocol referenced. Our results are competitive with or exceed recent specialized methods on these particular datasets, particularly in fine-grained and cross-domain settings where color-space properties aid generalization. We agree that the abstract would be strengthened by explicit baseline comparisons and statistical details. We will revise the abstract to include comparisons to ProtoNet and MAML, report error bars, and reference the exact dataset splits and evaluation protocol used. revision: yes

  2. Referee: [Experiments] Experiments section: No training recipes, hyperparameter search protocol, ablation isolating the color-graph coherence term, or code availability is described, making it impossible to determine whether the reported results follow standard 5-way 1-shot evaluation or involve post-hoc adjustments, data leakage, or non-standard splits.

    Authors: We acknowledge that the current manuscript lacks sufficient experimental details for full reproducibility. In the revised version, we will add comprehensive training recipes, the hyperparameter search protocol, and a dedicated ablation isolating the color-graph coherence term. We confirm that all reported results follow standard 5-way 1-shot evaluation protocols on the specified benchmarks with no post-hoc adjustments or data leakage. We will also commit to releasing the code publicly upon acceptance. revision: yes

  3. Referee: [Method] Method: The description of mutual learning in LabGNN between lightness and color graphs lacks explicit equations or loss formulations for the coherence term, preventing verification that the features truly generalize to unseen classes without extensive tuning (the weakest assumption in the approach).

    Authors: We agree that the method section would benefit from greater mathematical precision. We will add explicit equations describing the mutual learning process between the lightness and color graphs, along with the full loss formulation for the coherence term. This will clarify how the coherence mechanism supports generalization to unseen classes and reduce reliance on implicit assumptions. revision: yes

Circularity Check

0 steps flagged

No significant circularity in derivation chain

The manuscript describes an empirical method (LabNet for CIELab domain transform plus LabGNN for mutual graph learning) and reports benchmark accuracies. No equations, self-definitional loops, fitted parameters renamed as predictions, or load-bearing self-citations appear in the provided text. Performance figures are presented as experimental outcomes on standard few-shot splits rather than quantities forced by construction from the method's own inputs. The derivation chain is therefore self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

1 free parameter · 1 axiom · 0 invented entities

The central claim rests on the unproven premise that CIELab conversion plus coherent graph mutual learning yields generalizable one-shot features. No independent evidence for this premise is supplied in the abstract. The networks themselves contain many fitted parameters whose values are not reported.

free parameters (1)
  • LabNet and LabGNN hyperparameters
    All weights and architectural choices in the two networks are fitted to the training splits of the reported benchmarks.
axioms (1)
  • domain assumption: CIELab color space yields richer grouped features than RGB for few-shot recognition
    Invoked as the justification for the domain transformation step in LabNet.
