
arxiv: 2507.22041 · v2 · pith:3IOJ243Znew · submitted 2025-07-29 · 💻 cs.CV

Shallow Deep Learning Can Still Excel in Fine-Grained Few-Shot Learning

Pith reviewed 2026-05-22 00:10 UTC · model grok-4.3

classification 💻 cs.CV
keywords fine-grained few-shot learning · shallow backbone · location-aware feature clustering · ConvNet-4 · position encoding · frequency domain embedding · feature fusion · few-shot classification

The pith

With location-aware enhancements, a shallow ConvNet-4 matches deep ResNet12 performance in fine-grained few-shot learning.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper re-evaluates the assumption that deep backbones are required for fine-grained few-shot learning and tests whether a shallow architecture can match or exceed them. It introduces the location-aware constellation network (LCN-4) built on ConvNet-4, featuring a module that combines spatial feature fusion, clustering, and hidden location cues to cut overall loss. Two supporting techniques restore positional data lost in ordinary convolutions: grid position encoding and frequency domain location embedding. Experiments across three standard benchmarks show LCN-4 beating prior ConvNet-4 methods while reaching or surpassing most ResNet12 results. A reader would care because the work questions whether added depth is the main route to handling scarce, detailed visual examples.

Core claim

We introduce a location-aware constellation network (LCN-4) equipped with a location-aware feature clustering module that proficiently encodes and integrates spatial feature fusion, feature clustering, and recessive feature location, thereby significantly minimizing the overall loss. We also propose a general grid position encoding compensation to address positional information missing in convolutions and a frequency domain location embedding technique to offset location loss in clustering features. Validation on three representative fine-grained few-shot benchmarks shows that LCN-4 notably outperforms the ConvNet-4 based state-of-the-arts and achieves performance on par with or superior to

What carries the argument

The location-aware feature clustering module, which integrates spatial feature fusion, feature clustering, and recessive feature location, aided by grid position encoding compensation and frequency domain location embedding.
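
The page gives no formula for grid position encoding compensation, so the following is only a minimal sketch of one familiar way to hand positional information back to a convolutional feature map: appending two channels that hold each cell's normalized row and column coordinates. The function names, the [-1, 1] normalization, and the nested-list tensor layout are assumptions made for illustration, not the authors' formulation.

```python
def grid_position_channels(height, width):
    """Return two H x W grids with normalized row and column coordinates in [-1, 1]."""
    def norm(i, n):
        # Map index 0..n-1 onto [-1, 1]; a single-cell axis maps to 0.
        return 0.0 if n == 1 else 2.0 * i / (n - 1) - 1.0

    ys = [[norm(r, height) for _ in range(width)] for r in range(height)]
    xs = [[norm(c, width) for c in range(width)] for _ in range(height)]
    return ys, xs


def append_grid_encoding(feature_map):
    """Append the two coordinate channels to a C x H x W feature map (nested lists).

    Assumption: the encoding is concatenated as extra channels; the paper may
    instead add or otherwise fuse it with the features.
    """
    height = len(feature_map[0])
    width = len(feature_map[0][0])
    ys, xs = grid_position_channels(height, width)
    return feature_map + [ys, xs]


# Hypothetical 64-channel, 5 x 5 feature map filled with zeros.
features = [[[0.0] * 5 for _ in range(5)] for _ in range(64)]
encoded = append_grid_encoding(features)
```

After this step every spatial cell carries its own coordinates, so later layers can tell apart identical local patterns that occur at different positions.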

If this is right

  • LCN-4 outperforms previous ConvNet-4 based state-of-the-art methods on the tested benchmarks.
  • LCN-4 reaches or exceeds the accuracy of most ResNet12-based methods.
  • Grid position encoding compensation restores positional information lost during standard convolution.
  • Frequency domain location embedding reduces location loss inside the clustering step.
  • The results support the view that shallow backbones can fully encode few-shot instances when location awareness is added.
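
How the frequency domain location embedding is defined is not stated on this page. As a stand-in, the sketch below uses the sinusoidal position encoding known from Transformer models, which describes a location through sines and cosines at several frequencies. The function names, the `base` constant, and the choice to concatenate row and column encodings are all assumptions; the paper's technique may differ substantially.

```python
import math


def sinusoidal_embedding(position, dim, base=10000.0):
    """Encode a scalar position as `dim` interleaved sin/cos values (dim must be even)."""
    if dim % 2 != 0:
        raise ValueError("dim must be even")
    values = []
    for k in range(dim // 2):
        # Frequencies decay geometrically from 1 toward 1/base.
        freq = 1.0 / (base ** (2 * k / dim))
        values.append(math.sin(position * freq))
        values.append(math.cos(position * freq))
    return values


def location_embedding_2d(row, col, dim):
    """Concatenate row and column encodings; each axis receives dim // 2 values."""
    if dim % 4 != 0:
        raise ValueError("dim must be divisible by 4")
    half = dim // 2
    return sinusoidal_embedding(row, half) + sinusoidal_embedding(col, half)


# Hypothetical use: an 8-dimensional embedding for the cell at row 1, column 2.
embedding = location_embedding_2d(1, 2, 8)
```

An embedding of this kind could be added to, or concatenated with, a clustered feature so that the cluster assignment keeps a trace of where in the image the feature came from.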

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same location compensation ideas could be tested on other low-data vision tasks where spatial layout is critical.
  • Resource-limited settings might benefit from replacing deep networks with these enhanced shallow models.
  • The modules might improve standard few-shot classification outside the fine-grained setting.
  • Similar positional fixes could be explored in other shallow architectures for efficiency gains.

Load-bearing premise

The location-aware feature clustering module can proficiently encode and integrate spatial feature fusion, feature clustering, and recessive feature location to minimize overall loss in a shallow backbone.

What would settle it

Running LCN-4 on the three fine-grained few-shot benchmarks and finding that its accuracy does not exceed prior ConvNet-4 methods or match most ResNet12 results would disprove the central claim.
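
The test described above comes down to scoring many N-way K-shot episodes. The sketch below shows how one episode's accuracy might be computed, assuming a nearest-prototype rule with squared Euclidean distance; that rule is a common few-shot baseline swapped in for illustration, not necessarily the classifier LCN-4 uses. The feature vectors and labels are placeholders, not data from the paper.

```python
def prototype(vectors):
    """Average a list of equal-length feature vectors into one class prototype."""
    count = len(vectors)
    return [sum(v[i] for v in vectors) / count for i in range(len(vectors[0]))]


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def episode_accuracy(support, queries):
    """Score one episode.

    support: {label: [feature vectors]} with K vectors per class.
    queries: [(feature vector, true label)].
    """
    prototypes = {label: prototype(vecs) for label, vecs in support.items()}
    correct = 0
    for features, true_label in queries:
        # Predict the class whose prototype is closest to the query.
        predicted = min(
            prototypes,
            key=lambda label: squared_distance(features, prototypes[label]),
        )
        if predicted == true_label:
            correct += 1
    return correct / len(queries)


# Placeholder 2-way 1-shot episode with made-up 2-D features.
support_set = {"a": [[0.0, 0.0]], "b": [[1.0, 1.0]]}
query_set = [([0.1, 0.0], "a"), ([0.9, 1.0], "b"), ([0.2, 0.1], "b")]
accuracy = episode_accuracy(support_set, query_set)
```

Averaging this score over many randomly sampled episodes, for each backbone under the same protocol, would give the numbers the comparison depends on.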

Figures

Figures reproduced from arXiv: 2507.22041 by Chaofei Qi, Chao Ye, Jianbin Qiu, Weiyang Lin, Zhitai Liu.

Figure 1: Accuracy comparison of our proposed LCN-4 with sev
Figure 2: Illustration of our LCN-4 which consists of four primary modules: two base Stem modules and two Constellation modules. Stem
Figure 3: Classification Confusion Heatmaps Comparison of our LCN-4 on 5way-1shot Scenarios, with Baseline (BL, ConstellationNet) and
Figure 4: t-SNE Visualization of LCN-4 and ConstellationNet in Classification and 5way-1shot Scenarios on Aircraft-Fewshot.
Original abstract

Deep learning has witnessed the extensive utilization across a wide spectrum of domains, including fine-grained few-shot learning (FGFSL) which heavily depends on deep backbones. Nonetheless, shallower deep backbones such as ConvNet-4, are not commonly preferred because they're prone to extract a larger quantity of non-abstract visual attributes. In this paper, we initially re-evaluate the relationship between network depth and the ability to fully encode few-shot instances, and delve into whether shallow deep architecture could effectuate comparable or superior performance to mainstream deep backbone. Fueled by the inspiration from vanilla ConvNet-4, we introduce a location-aware constellation network (LCN-4), equipped with a cutting-edge location-aware feature clustering module. This module can proficiently encoder and integrate spatial feature fusion, feature clustering, and recessive feature location, thereby significantly minimizing the overall loss. Specifically, we innovatively put forward a general grid position encoding compensation to effectively address the issue of positional information missing during the feature extraction process of specific ordinary convolutions. Additionally, we further propose a general frequency domain location embedding technique to offset for the location loss in clustering features. We have carried out validation procedures on three representative fine-grained few-shot benchmarks. Relevant experiments have established that LCN-4 notably outperforms the ConvNet-4 based State-of-the-Arts and achieves performance that is on par with or superior to most ResNet12-based methods, confirming the correctness of our conjecture.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 2 minor

Summary. The paper re-evaluates the role of network depth in fine-grained few-shot learning and proposes the Location-aware Constellation Network (LCN-4), a ConvNet-4 variant augmented with a location-aware feature clustering module. This module is claimed to integrate spatial feature fusion, feature clustering, and recessive feature location via a general grid position encoding compensation and a frequency domain location embedding technique. Experiments on three standard FGFSL benchmarks are reported to show LCN-4 outperforming prior ConvNet-4 methods and matching or exceeding most ResNet-12 baselines, supporting the conjecture that suitably modified shallow backbones can compete with deeper ones.

Significance. If the performance gains are shown to stem from the proposed module rather than differences in training protocol or implementation, the result would be significant: it would demonstrate that depth is not strictly required for competitive FGFSL performance and could encourage more efficient architectures. The work supplies an empirical test of a conjecture that challenges the prevailing preference for deep backbones in this domain.

major comments (3)
  1. §3 (Location-aware feature clustering module): The description states that the module 'proficiently encoder and integrate spatial feature fusion, feature clustering, and recessive feature location' but supplies no equations, pseudocode, or derivation showing how the general grid position encoding compensation and frequency domain location embedding are formulated or combined. Without these details the central claim that the module minimizes overall loss in a shallow backbone cannot be verified or reproduced.
  2. Results section / Table 1 (or equivalent): The reported outperformance of LCN-4 over ConvNet-4 SOTAs and parity with ResNet-12 methods is presented without mention of error bars, statistical significance tests, or ablation studies isolating the contribution of each new component. This leaves open the possibility that gains arise from unstated differences in augmentation, optimizer, or training schedule rather than the module itself.
  3. §4 (Experimental setup): The manuscript does not explicitly state whether the ResNet-12 baselines were re-implemented under identical hyper-parameters, data splits, and augmentation policies as LCN-4. Any mismatch would undermine the cross-architecture comparison that supports the main conjecture.
minor comments (2)
  1. Abstract: 'encoder' should read 'encode'; 'recessive feature location' is non-standard terminology and should be defined on first use.
  2. §3: The paper would benefit from a small diagram or pseudocode block illustrating the data flow through the location-aware feature clustering module.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the constructive and detailed feedback. We address each major comment below and will revise the manuscript to improve clarity, rigor, and reproducibility.

  1. Referee: §3 (Location-aware feature clustering module): The description states that the module 'proficiently encoder and integrate spatial feature fusion, feature clustering, and recessive feature location' but supplies no equations, pseudocode, or derivation showing how the general grid position encoding compensation and frequency domain location embedding are formulated or combined. Without these details the central claim that the module minimizes overall loss in a shallow backbone cannot be verified or reproduced.

    Authors: We agree that the current description lacks sufficient mathematical detail for independent verification. In the revised manuscript we will add the explicit equations defining the general grid position encoding compensation and the frequency domain location embedding, together with a derivation of how they are combined inside the location-aware feature clustering module. Pseudocode for the full module will also be included to show the integration steps and the resulting loss minimization. revision: yes

  2. Referee: Results section / Table 1 (or equivalent): The reported outperformance of LCN-4 over ConvNet-4 SOTAs and parity with ResNet-12 methods is presented without mention of error bars, statistical significance tests, or ablation studies isolating the contribution of each new component. This leaves open the possibility that gains arise from unstated differences in augmentation, optimizer, or training schedule rather than the module itself.

    Authors: We acknowledge that the absence of error bars, ablations, and significance testing weakens the strength of the empirical claims. We will add standard-deviation error bars computed over multiple random seeds, include ablation tables that isolate the grid-position-encoding and frequency-domain-embedding components, and report paired statistical significance tests (e.g., t-tests) against the baselines. These additions will help demonstrate that the observed gains are attributable to the proposed module rather than training-protocol differences. revision: yes

  3. Referee: §4 (Experimental setup): The manuscript does not explicitly state whether the ResNet-12 baselines were re-implemented under identical hyper-parameters, data splits, and augmentation policies as LCN-4. Any mismatch would undermine the cross-architecture comparison that supports the main conjecture.

    Authors: The ResNet-12 baselines were re-implemented using exactly the same hyper-parameters, data splits, and augmentation policies as LCN-4. To remove any ambiguity we will expand §4 with an explicit statement of this shared experimental protocol, including the precise hyper-parameter values and augmentation settings employed for both architectures. revision: yes
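
The error bars and paired tests promised in response 2 can be computed with very little machinery. The sketch below assumes one accuracy value per random seed for each method, with seeds matched between the two methods; the helper names and the accuracy values are placeholders for illustration, not results from the paper. It returns only the paired t statistic, which would still need to be compared against a t distribution with n - 1 degrees of freedom to obtain a p-value.

```python
import math
import statistics


def mean_and_std(values):
    """Mean and sample standard deviation across seeds (needs at least two values)."""
    return statistics.fmean(values), statistics.stdev(values)


def paired_t_statistic(method_a, method_b):
    """Paired t statistic for per-seed accuracies of two methods on matched seeds."""
    if len(method_a) != len(method_b):
        raise ValueError("both methods need one value per shared seed")
    diffs = [a - b for a, b in zip(method_a, method_b)]
    n = len(diffs)
    mean_diff = statistics.fmean(diffs)
    std_diff = statistics.stdev(diffs)
    if std_diff == 0.0:
        raise ValueError("differences are constant; the t statistic is undefined")
    # Standard error of the mean difference is std_diff / sqrt(n).
    return mean_diff / (std_diff / math.sqrt(n))


# Placeholder per-seed accuracies (made up for illustration only).
proposed = [0.62, 0.64, 0.63, 0.65]
baseline = [0.60, 0.61, 0.62, 0.61]
proposed_mean, proposed_std = mean_and_std(proposed)
t_value = paired_t_statistic(proposed, baseline)
```

Pairing by seed removes seed-to-seed variance that both methods share, which is why it is usually preferred over an unpaired comparison when the runs are matched.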

Circularity Check

0 steps flagged

No circularity: empirical claims rest on external benchmark validation


The paper's central claim is that the proposed LCN-4 with its location-aware feature clustering module outperforms ConvNet-4 SOTAs and matches or exceeds ResNet12 methods on three fine-grained few-shot benchmarks. This is established through experimental validation rather than any mathematical derivation chain. The abstract describes the module's intended capabilities at a high level (spatial feature fusion, clustering, and location encoding) but provides no equations, ansatzes, or fitted parameters that reduce the reported performance gains to the module definition itself. No self-citations, uniqueness theorems, or renamings of known results are invoked in the provided text to load-bear the conjecture. The result is therefore self-contained against external benchmarks and does not exhibit any of the enumerated circularity patterns.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 1 invented entity

The central claim rests on the untested premise that the added modules fully compensate for reduced abstraction depth; no independent evidence for this compensation is supplied in the abstract.

axioms (1)
  • domain assumption: Shallow convolutional networks can reach deep-network performance in FGFSL once spatial location information is explicitly restored.
    Invoked in the abstract when the authors state that LCN-4 confirms their conjecture about network depth.
invented entities (1)
  • location-aware feature clustering module (no independent evidence)
    purpose: To integrate spatial fusion, clustering, and recessive location cues inside a shallow backbone.
    New component introduced by the paper; no external validation or falsifiable prediction outside the reported benchmarks is mentioned.




Reference graph

Works this paper leans on

106 extracted references · 106 canonical work pages · 2 internal anchors

  1. [1]

    2016 Deep residual learning for image recognition

    He, K.; Zhang, X.; Ren, S.; and Sun, J. 2016 Deep residual learning for image recognition . In Proceedings of the IEEE conference on computer vision and pattern recognition(CVPR), 770-778

  2. [3]

    Vettoruzzo, A.; Bouguelia, M.; Vanschoren, J.; Rögnvaldsson, T.S.; and Santosh, K. 2023. Advances and Challenges in Meta-Learning: A Technical Review . In IEEE Transactions on Pattern Analysis and Machine Intelligence, 4763-4779

  3. [4]

    Krizhevsky, A.; Sutskever, I.; and Hinton, G. E. 2012. Imagenet classification with deep convolutional neural networks . In Advances in neural information processing systems(NeurIPS), 1-9

  4. [5]

    Yao G., Min L., Yusen Z., Zhuzhen H., Yujie H.. 2024. Few-shot image generation with reverse contrastive learning, Neural Networks . In Neural Networks, 154-164

  5. [6]

    Wang, Y.; Xu, C.; Liu, C.; Zhang, L.; and Fu, Y. 2020. Instance Credibility Inference for Few-Shot Learning . In IEEE/CVF Conference on Computer Vision and Pattern Recognition(CVPR), 12833-12842

  6. [7]

    Wu, J.; Chang, D.; Sain, A.; Li, X.; Ma, Z.; Cao, J.; Guo, J.; and Song, Y.Z. 2023. Bi-directional feature reconstruction network for fine-grained few-shot image classification . In Proceedings of the AAAI Conference on Artificial Intelligence(AAAI), 2821-2829

  7. [8]

    Xie, J.; Long, F.; Lv, J.; Wang, Q.; and Li, P. 2022. Joint Distribution Matters: Deep Brownian Distance Covariance for Few-Shot Classification . In IEEE/CVF Conference on Computer Vision and Pattern Recognition(CVPR), 7962-7971

  8. [9]

    Hong, J.; Fang, P.; Li, W.; Zhang, T.; Simon, C.; Harandi, M.; and Petersson, L. 2021. Reinforced Attention for Few-Shot Learning and Beyond . In IEEE/CVF Conference on Computer Vision and Pattern Recognition(CVPR), 913-923

  9. [10]

    Oh, J.; Yoo, H.; Kim, C.; and Yun, S. 2021. BOIL: Towards Representation Change for Few-shot Learning . In International Conference on Learning Representations(ICLR)

  10. [11]

    Li, Y.; Tarlow, D.; Brockschmidt, M.; and Zemel, R.S. 2015. Gated Graph Sequence Neural Networks . In International Conference on Learning Representations(ICLR)

  11. [12]

    Liu, Z.; Lin, Y.; Cao, Y.; Hu, H.; Wei, Y.; Zhang, Z.; Lin, S.; and Guo, B. 2021. Swin Transformer: Hierarchical Vision Transformer using Shifted Windows . In IEEE/CVF International Conference on Computer Vision(ICCV), 9992-10002

  12. [13]

    Yao, H.; Zhang, C.; Wei, Y.; Jiang, M.; Wang, S.; Huang, J.; Chawla, N.; and Li, Z.J. 2019. Graph Few-shot Learning via Knowledge Transfer . In Proceedings of the AAAI Conference on Artificial Intelligence(AAAI), 6656-6663

  13. [14]

    Chen, R.; Chen, T.; Hui, X.; Wu, H.; Li, G.; and Lin, L. 2019. Knowledge Graph Transfer Network for Few-Shot Recognition . In Proceedings of the AAAI Conference on Artificial Intelligence(AAAI), 10575-10582

  14. [15]

    Zhang, Q.; Wu, X.; Yang, Q.; Zhang, C.; and Zhang, X. 2022. Few-shot Heterogeneous Graph Learning via Cross-domain Knowledge Transfer . In Proceedings of SIGKDD Conference on Knowledge Discovery and Data Mining(KDD), 2450-2460

  15. [16]

    Tong, X.; Yin, J.; Han, B.; and Qv, H. 2020. Few-Shot Learning With Attention-Weighted Graph Convolutional Networks For Hyperspectral Image Classification . In IEEE International Conference on Image Processing(ICIP), 1686-1690

  16. [17]

    Zhang, X.; Zhang, Y.; and Zhang, Z. 2021. Multi-granularity Recurrent Attention Graph Neural Network for Few-Shot Learning . In Conference on Multimedia Modeling(MMM), 147-158

  17. [18]

    Cheng H.; Zhou J.T.; Tay W.P.; and Wen B. 2023. Graph Neural Networks With Triple Attention for Few-Shot Learning . In IEEE Transactions on Multimedia, 8225-8239

  18. [19]

    Liu, L.; Hamilton, W.; Long, G.; Jiang, J.; and Larochelle, H. 2021. A universal representation transformer layer for few-shot image classification . In International Conference on Learning Representations(ICLR)

  19. [20]

    Gan, T.; Li, W.; Lu, Y.; and He, Y. 2021. Transformer-based few-shot learning for image classification . In Artificial Intelligence for Communications and Networks: AICON, 68-74

  20. [21]

    Wang, X.; Wang, X.; Jiang, B.; and Luo, B. 2023. Few-shot learning meets transformer: Unified query-support transformers for few-shot classification . In IEEE Transactions on Circuits and Systems for Video Technology, 7789-7802

  21. [22]

    Jiang, B.; Zhao, K.; and Tang, J. 2022. RGTransformer: Region-graph transformer for image representation and few-shot classification . In IEEE Signal Processing Letters, 792-796

  22. [23]

    He, Y.; Liang, W.; Zhao, D.; Zhou, H.Y.; Ge, W.; Yu, Y.; and Zhang, W. 2022. Attribute surrogates learning and spectral tokens pooling in transformers for few-shot learning . In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition(CVPR), 9119-9129

  23. [24]

    Wu, J.; Tian, X.; and Zhong, G. 2022. Supervised Contrastive Representation Embedding Based on Transformer for Few-Shot Classification . In Journal of Physics: Conference Series, 12-22

  24. [25]

    Cai, J.; Zhang, Y.; Guo, J.; Zhao, X.; Lv, J.; and Hu, Y. 2022. St-pn: A spatial transformed prototypical network for few-shot sar image classification . In Remote Sensing, 2000-2019

  25. [26]

    Li, Z.; Xue, Z.; Xu, Q.; Zhang, L.; Zhu, T.; and Zhang, M. 2023. SPFormer: Self-pooling transformer for few-shot hyperspectral image classification . In IEEE Transactions on Geoscience and Remote Sensing, 1-19

  26. [27]

    Xu, W.; Xu, Y.; Wang, H.; and Tu, Z. 2021. Attentional constellation nets for few-shot learning . In International Conference on Learning Representations(ICLR)

  27. [28]

    Felzenszwalb, P.F.; and Huttenlocher, D.P. 2005. Pictorial structures for object recognition . In International Journal of Computer Vision, 55-79

  28. [29]

    Sudderth, E.B.; Torralba, A.; Freeman, W.T.; and Willsky, A.S. 2005. Learning hierarchical models of scenes, objects, and parts . In IEEE/CVF International Conference on Computer Vision(ICCV), 1331-1338

  29. [30]

    Fei-Fei, L.; Fergus, R.; and Perona, P. 2006. One-shot learning of object categories . In IEEE Transactions on Pattern Analysis and Machine Intelligence, 594-611

  30. [31]

    Zhu, S.C.; and Mumford, D. 2007. A stochastic grammar of images . In Foundations and Trends® in Computer Graphics and Vision, 259-362

  31. [32]

    Li, X.; Song, Q.; Wu, J.; Zhu, R.; Ma, Z.; and Xue, J.H. 2023. Locally-enriched cross-reconstruction for few-shot fine-grained image classification . In IEEE Transactions on Circuits and Systems for Video Technology, 7530-7540

  32. [33]

    Li, Y.; Bian, C.; and Chen, H. 2023. Generalized ridge regression-based channelwise feature map weighted reconstruction network for fine-grained few-shot ship classification . In IEEE Transactions on Geoscience and Remote Sensing, 1-10

  33. [34]

    Wu, J.; Chang, D.; Sain, A.; Li, X.; Ma, Z.; Cao, J.; Guo, J.; and Song, Y.Z. 2024. Bi-directional ensemble feature reconstruction network for few-shot fine-grained classification . In IEEE Transactions on Pattern Analysis and Machine Intelligence, 1-16

  34. [35]

    2024 Cross-Layer and Cross-Sample Feature Optimization Network for Few-Shot Fine-Grained Image Classification

    Ma, Z.X.; Chen, Z.D.; Zhao, L.J.; Zhang, Z.C.; Luo, X.; and Xu, X.S. 2024 Cross-Layer and Cross-Sample Feature Optimization Network for Few-Shot Fine-Grained Image Classification . In Proceedings of the AAAI Conference on Artificial Intelligence(AAAI), 4136-4144

  35. [36]

    Xu, J.; Le, H.; Huang, M.; Athar, S.; and Samaras, D. 2021. Variational feature disentangling for fine-grained few-shot classification . In IEEE/CVF International Conference on Computer Vision(ICCV), 8812-8821

  36. [37]

    Huang, H.; Zhang, J.; Zhang, J.; Xu, J.; and Wu, Q. 2020. Low-rank pairwise alignment bilinear network for few-shot fine-grained image classification . In IEEE Transactions on Multimedia, 1666-1680

  37. [38]

    HENC: Hierarchical embedding network with center calibration for few-shot fine-grained SAR target classification

    Yang, M.; Bai, X.; Wang, L.; and Zhou, F., 2023. HENC: Hierarchical embedding network with center calibration for few-shot fine-grained SAR target classification . In IEEE Transactions on Image Processing, 3324-3337

  38. [39]

    Vinyals, O.; Blundell, C.; Lillicrap, T.; and Wierstra, D. 2016. Matching networks for one shot learning . In Advances in neural information processing systems(NeurIPS), 1-9

  39. [40]

    Sung, F.; Yang, Y.; Zhang, L.; Xiang, T.; Torr, P.H.; and Hospedales, T.M. 2018. Learning to compare: Relation network for few-shot learning . In Proceedings of the IEEE conference on computer vision and pattern recognition(CVPR), 1199-1208

  40. [41]

    Chen, W.Y.; Liu, Y.C.; Kira, Z.; Wang, Y.C.F.; and Huang, J.B. 2019. A closer look at few-shot classification . In International Conference on Learning Representations(ICLR)

  41. [42]

    Li, W.; Wang, L.; Xu, J.; Huo, J.; Gao, Y.; and Luo, J. 2019. Revisiting local descriptor based image-to-class measure for few-shot learning . In Proceedings of the IEEE conference on computer vision and pattern recognition(CVPR), 7260-7268

  42. [43]

    Simon, C.; Koniusz, P.; Nock, R.; and Harandi, M. 2020. Adaptive subspaces for few-shot learning . In Proceedings of the IEEE conference on computer vision and pattern recognition(CVPR), 4136-4145

  43. [44]

    and Xue, J.H., 2020

    Li, X., Wu, J., Sun, Z., Ma, Z., Cao, J. and Xue, J.H., 2020. BSNet: Bi-similarity network for few-shot fine-grained image classification . In IEEE Transactions on Image Processing, 1318-1331

  44. [45]

    Afrasiyabi, A.; Lalonde, J.F.; and Gagné, C. 2021. Mixture-based feature space learning for few-shot image classification . In Proceedings of the IEEE/CVF international conference on computer vision(ICCV), 9041-9051

  45. [46]

    Wertheimer, D.; Tang, L.; and Hariharan, B. 2021. Few-shot classification with feature map reconstruction networks . In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition(CVPR), 8012-8021

  46. [47]

    Lee, S.; Moon, W.; and Heo, J.P. 2022. Task discrepancy maximization for fine-grained few-shot classification . In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition(CVPR), 5331-5340

  47. [48]

    Xu, J.; Le, H.; Huang, M.; Athar, S.; and Samaras, D. 2021. Variational feature disentangling for fine-grained few-shot classification . In Proceedings of the IEEE/CVF international conference on computer vision(ICCV), 8812-8821

  48. [49]

    Kang, D.; Kwon, H.; Min, J.; and Cho, M. 2021. Relational embedding for few-shot classification . In Proceedings of the IEEE/CVF international conference on computer vision(ICCV), 8822-8833

  49. [50]

    Wah, C.; Branson, S.; Welinder, P.; Perona, P.; and Belongie, S. 2011. The caltech-ucsd birds-200-2011 dataset

  50. [52]

    and Zisserman, A., 2006 A visual vocabulary for flower classification

    Nilsback, M.E. and Zisserman, A., 2006 A visual vocabulary for flower classification . In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition(CVPR), 1447-1454

  51. [53]

    Zhao, W.; Yang, L.; Dang, C.; Rocchetta, R.; Valdebenito, M.A.; and Moens, D. 2022. Enriching stochastic model updating metrics: An efficient Bayesian approach using Bray-Curtis distance and an adaptive binning algorithm . In Mechanical Systems and Signal Processing, 1-18

  52. [54]

    Zhang, C.; Cai, Y.; Lin, G.; and Shen, C. 2022. Deepemd: Differentiable earth mover's distance for few-shot learning . In IEEE Transactions on Pattern Analysis and Machine Intelligence, 5632-5648

  53. [55]

    and Li, P., 2022

    Xie, J., Long, F., Lv, J., Wang, Q. and Li, P., 2022. Joint distribution matters: Deep brownian distance covariance for few-shot classification . In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition(CVPR), 7972-7981

  54. [56]

    , " * write output.state after.block = add.period write newline

    ENTRY address archivePrefix author booktitle chapter edition editor eid eprint howpublished institution isbn journal key month note number organization pages publisher school series title type volume year label extra.label sort.label short.list INTEGERS output.state before.all mid.sentence after.sentence after.block FUNCTION init.state.consts #0 'before.a...

  55. [57]

    write newline

    " write newline "" before.all 'output.state := FUNCTION n.dashify 't := "" t empty not t #1 #1 substring "-" = t #1 #2 substring "--" = not "--" * t #2 global.max substring 't := t #1 #1 substring "-" = "-" * t #2 global.max substring 't := while if t #1 #1 substring * t #2 global.max substring 't := if while FUNCTION word.in bbl.in capitalize " " * FUNCT...

  56. [58]

    Zhang, Shaoqing Ren and Jian Sun

    Kaiming He, X. Zhang, Shaoqing Ren and Jian Sun. Deep Residual Learning for Image Recognition. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition , pages 770-778, 2016

  57. [59]

    Wide Residual Networks

    Sergey Zagoruyko and Nikos Komodakis. Wide Residual Networks. In arXiv preprint arXiv:1605.07146 , 2016

  58. [60]

    Vanschoren, T

    Anna Vettoruzzo, Mohamed-Rafik Bouguelia, J. Vanschoren, T. S. R \"o gnvaldsson, and Kc Santosh. Advances and Challenges in Meta-Learning: A Technical Review. IEEE Transactions on Pattern Analysis and Machine Intelligence , 46(7): 4763-4779, 2023

  59. [61]

    Alex Krizhevsky, Ilya Sutskever, and Geoffrey E. Hinton. Imagenet classification with deep convolutional neural networks. In Advances in Neural Information Processing Systems , pages 1106-1114, 2012

  60. [62]

    Xu, Chen Liu, Li Zhang, and Yanwei Fu

    Yikai Wang, C. Xu, Chen Liu, Li Zhang, and Yanwei Fu. Instance Credibility Inference for Few-Shot Learning. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition , pages 12833-12842, 2020

  61. [63]

    Bi-directional Feature Reconstruction Network for Fine-Grained Few-Shot Image Classification

    Jijie Wu, Dongliang Chang, Aneeshan Sain, Xiaoxu Li, Zhanyu Ma, Jie Cao, Jun Guo, and Yi-Zhe Song. Bi-directional Feature Reconstruction Network for Fine-Grained Few-Shot Image Classification. In Proceedings of the AAAI Conference on Artificial Intelligence , pages 2821-2829, 2023

  62. [64]

    Jiangtao Xie, Fei Long, Jiaming Lv, Qilong Wang, and P. Li. Joint Distribution Matters: Deep Brownian Distance Covariance for Few-Shot Classification. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition , pages 7962-7971, 2022

  63. [65]

    BOIL: Towards Representation Change for Few-shot Learning In International Conference on Learning Representations , 2021

    Jaehoon Oh, Hyungjun Yoo, ChangHwan Kim, and Seyoung Yun. BOIL: Towards Representation Change for Few-shot Learning In International Conference on Learning Representations , 2021

  64. [66]

    Tarlow, Marc Brockschmidt, and Richard S

    Yujia Li, D. Tarlow, Marc Brockschmidt, and Richard S. Zemel. Gated Graph Sequence Neural Networks. In International Conference on Learning Representations , 2015

  65. [67]

    Swin Transformer: Hierarchical Vision Transformer using Shifted Windows

    Ze Liu, Yutong Lin, Yue Cao, Han Hu, Yixuan Wei, Zheng Zhang, Stephen Lin, and Baining Guo. Swin Transformer: Hierarchical Vision Transformer using Shifted Windows. In Proceedings of the IEEE/CVF International Conference on Computer Vision , pages 9992-10002, 2021

  66. [68]

    Knowledge Graph Transfer Network for Few-Shot Recognition

    Riquan Chen, Tianshui Chen, Xiaolu Hui, Hefeng Wu, Guanbin Li, and Liang Lin. Knowledge Graph Transfer Network for Few-Shot Recognition. In Proceedings of the AAAI Conference on Artificial Intelligence , pages 10575-10582, 2019

  67. [69]

    Few-Shot Learning with Graph Neural Networks

    Victor Garcia Satorras and Joan Bruna. Few-Shot Learning with Graph Neural Networks. In International Conference on Learning Representations , 2018

  68. [70]

    DPGN: Distribution Propagation Graph Network for Few-Shot Learning

    Ling Yang, Liang Li, Zilun Zhang, Xinyu Zhou, Erjin Zhou, and Yu Liu. DPGN: Distribution Propagation Graph Network for Few-Shot Learning. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 13387-13396, 2020

  69. [71]

    Person identification from fingernails and knuckles images using deep learning features and the Bray-Curtis similarity measure

    Mona Alghamdi, Plamen P. Angelov, and Lopez Pellicer Alvaro. Person identification from fingernails and knuckles images using deep learning features and the Bray-Curtis similarity measure. Neurocomputing, 513:83-93, 2022

  70. [72]

    Few-Shot Learning Meets Transformer: Unified Query-Support Transformers for Few-Shot Classification

    Xixi Wang, Xiao Wang, Bo Jiang, and Bin Luo. Few-Shot Learning Meets Transformer: Unified Query-Support Transformers for Few-Shot Classification. IEEE Transactions on Circuits and Systems for Video Technology, 33(12):7789-7802, 2023

  71. [73]

    Attribute Surrogates Learning and Spectral Tokens Pooling in Transformers for Few-shot Learning

    Yang He, Weihan Liang, Dongyang Zhao, Hong-Yu Zhou, Weifeng Ge, Yizhou Yu, and Wenqiang Zhang. Attribute Surrogates Learning and Spectral Tokens Pooling in Transformers for Few-shot Learning. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 9119-9129, 2022

  72. [74]

    SPFormer: Self-Pooling Transformer for Few-Shot Hyperspectral Image Classification

    Ziyu Li, Zhaohui Xue, Qi Xu, Ling Zhang, Tianzhi Zhu, and Mengxue Zhang. SPFormer: Self-Pooling Transformer for Few-Shot Hyperspectral Image Classification. IEEE Transactions on Geoscience and Remote Sensing, 1-19, 2023

  73. [75]

    Attentional Constellation Nets for Few-Shot Learning

    Weijian Xu, Yifan Xu, Huaijin Wang, and Zhuowen Tu. Attentional Constellation Nets for Few-Shot Learning. In International Conference on Learning Representations, 2021

  74. [76]

    Pictorial Structures for Object Recognition

    Pedro F. Felzenszwalb and Daniel P. Huttenlocher. Pictorial Structures for Object Recognition. International Journal of Computer Vision, 61:55-79, 2005

  75. [77]

    Learning hierarchical models of scenes, objects, and parts

    Erik B. Sudderth, Antonio Torralba, William T. Freeman, and Alan S. Willsky. Learning hierarchical models of scenes, objects, and parts. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 1331-1338, 2005

  76. [78]

    One-shot learning of object categories

    Fei-Fei Li, Rob Fergus, and Pietro Perona. One-shot learning of object categories. IEEE Transactions on Pattern Analysis and Machine Intelligence, 28(4):594-611, 2006

  77. [79]

    Locally-Enriched Cross-Reconstruction for Few-Shot Fine-Grained Image Classification

    Xiaoxu Li, Qi Song, Jijie Wu, Rui Zhu, Zhanyu Ma, and Jing-Hao Xue. Locally-Enriched Cross-Reconstruction for Few-Shot Fine-Grained Image Classification. IEEE Transactions on Circuits and Systems for Video Technology, 33(12):7530-7540, 2023

  78. [80]

    Cross-Layer and Cross-Sample Feature Optimization Network for Few-Shot Fine-Grained Image Classification

    Zhenxiang Ma, Zhenduo Chen, Lijun Zhao, Ziya Zhang, Xin Luo, and Xinshun Xu. Cross-Layer and Cross-Sample Feature Optimization Network for Few-Shot Fine-Grained Image Classification. In Proceedings of the AAAI Conference on Artificial Intelligence, pages 4136-4144, 2024

  79. [81]

    Low-Rank Pairwise Alignment Bilinear Network For Few-Shot Fine-Grained Image Classification

    Huaxi Huang, Junjie Zhang, Jian Zhang, Jingsong Xu, and Qiang Wu. Low-Rank Pairwise Alignment Bilinear Network For Few-Shot Fine-Grained Image Classification. IEEE Transactions on Multimedia, 23:1666-1680, 2019

  80. [82]

    HENC: Hierarchical embedding network with center calibration for few-shot fine-grained SAR target classification

    Minjia Yang, Xueru Bai, Li Wang, and Feng Zhou. HENC: Hierarchical embedding network with center calibration for few-shot fine-grained SAR target classification. IEEE Transactions on Image Processing, 32:3324-3337, 2023

Showing first 80 references.