pith. machine review for the scientific record.

arxiv: 2604.16936 · v1 · submitted 2026-04-18 · 💻 cs.CV · cs.AI

Recognition: unknown

Adaptive receptive field-based spatial-frequency feature reconstruction network for few-shot fine-grained image classification


Pith reviewed 2026-05-10 06:55 UTC · model grok-4.3

classification 💻 cs.CV cs.AI
keywords few-shot fine-grained image classification · adaptive receptive field · spatial-frequency features · feature reconstruction · episodic training · FSFGIC · ARF-SFR-Net

The pith

A network that picks receptive field sizes based on each input image, then extracts and fuses spatial and frequency features, to improve few-shot fine-grained classification.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper identifies the difficulty of selecting appropriate receptive field sizes when extracting spatial and frequency descriptors from varied input images in few-shot fine-grained tasks. To solve this, it introduces ARF-SFR-Net, which learns to set receptive field sizes adaptively for each sample, extracts the corresponding features, and fuses them for reconstruction. The network slots directly into standard episodic training pipelines and trains end-to-end without pre-training. Experiments across several FSFGIC benchmarks show the approach outperforms prior feature-reconstruction methods.
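The episodic pipeline the network slots into follows the standard N-way K-shot protocol. A minimal sketch of that generic protocol (not code from ARF-SFR-Net itself; names and defaults are illustrative):

```python
import random
from collections import defaultdict

def sample_episode(labels, n_way=5, k_shot=1, q_queries=15, seed=None):
    """Sample one N-way K-shot episode from a labeled pool.

    labels: list of (sample_id, class_id) pairs drawn from base classes.
    Returns (support, query), each a list of (sample_id, class_id).
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for sid, cid in labels:
        by_class[cid].append(sid)
    # Pick N classes, then K support + Q query samples per class,
    # without replacement so support and query never overlap.
    classes = rng.sample(sorted(by_class), n_way)
    support, query = [], []
    for cid in classes:
        picks = rng.sample(by_class[cid], k_shot + q_queries)
        support += [(sid, cid) for sid in picks[:k_shot]]
        query += [(sid, cid) for sid in picks[k_shot:]]
    return support, query
```

Training repeats this sampling thousands of times over the base classes; "end-to-end from scratch" means the whole network, adaptive module included, is optimized inside this loop with no pre-trained backbone.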

Core claim

The designed ARF-SFR-Net has the capability to adaptively determine receptive field sizes for obtaining spatial and frequency features, and effectively fuse them for reconstruction and FSFGIC tasks.

What carries the argument

ARF-SFR-Net, which adaptively selects input-dependent receptive field sizes to extract spatial and frequency descriptors before fusing them for reconstruction.
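The paper's exact ARF module is not reproduced here, but input-dependent receptive-field selection can be sketched in the style of selective-kernel attention: parallel branches with different field sizes, gated by a softmax computed from the input itself. Everything below (box-filter branches as stand-ins for convolutions, the random gating matrix) is illustrative, not the authors' architecture:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def adaptive_rf_fuse(feat, kernel_sizes=(3, 5, 7), rng=None):
    """Sketch of input-dependent receptive-field selection.

    feat: (C, H, W) feature map. Each branch smooths with a different
    box filter (a proxy for a conv with that receptive field); a tiny
    gating head maps the globally pooled input to softmax weights over
    branches, so the effective field size depends on the input.
    """
    rng = np.random.default_rng(0) if rng is None else rng
    C, H, W = feat.shape
    branches = []
    for k in kernel_sizes:
        pad = k // 2
        padded = np.pad(feat, ((0, 0), (pad, pad), (pad, pad)), mode="edge")
        out = np.zeros_like(feat)
        for i in range(H):
            for j in range(W):
                out[:, i, j] = padded[:, i:i + k, j:j + k].mean(axis=(1, 2))
        branches.append(out)
    stacked = np.stack(branches)                    # (B, C, H, W)
    gap = feat.mean(axis=(1, 2))                    # (C,) global descriptor
    W_gate = rng.normal(scale=0.1, size=(len(kernel_sizes), C))
    weights = softmax(W_gate @ gap)                 # (B,) input-dependent
    fused = np.tensordot(weights, stacked, axes=1)  # (C, H, W)
    return fused, weights
```

A fixed-receptive-field baseline corresponds to forcing `weights` to a one-hot vector; the paper's claim is that letting the gate depend on the input beats any such fixed choice.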

If this is right

  • The network integrates into any episodic training loop for end-to-end training from scratch.
  • Feature reconstruction quality improves when spatial and frequency descriptors are drawn from input-specific receptive fields.
  • Performance gains appear on multiple standard FSFGIC benchmarks relative to prior state-of-the-art feature-based methods.
  • The adaptive mechanism addresses the core difficulty of choosing receptive field size across different category inputs.
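The second bullet turns on pairing spatial descriptors with frequency descriptors. A minimal sketch of what a two-domain descriptor pair looks like, using a 2-D FFT as a stand-in for whatever frequency transform the paper actually employs:

```python
import numpy as np

def spatial_frequency_descriptor(feat):
    """Illustrative spatial + frequency descriptors (not the paper's
    exact operators; the 2-D FFT here is one common frequency choice).

    feat: (C, H, W) feature map. Returns the spatial descriptor (the
    map itself), a frequency descriptor (log-magnitude spectrum per
    channel), and a naive channel-wise concatenation as fusion.
    """
    spectrum = np.fft.fft2(feat, axes=(-2, -1))
    freq = np.log1p(np.abs(np.fft.fftshift(spectrum, axes=(-2, -1))))
    # The paper learns the fusion; concatenation just shows that the
    # two streams share spatial layout and can be combined per pixel.
    fused = np.concatenate([feat, freq], axis=0)
    return {"spatial": feat, "frequency": freq, "fused": fused}
```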

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same input-dependent field selection could be tested in other low-data regimes such as few-shot object detection or segmentation.
  • If the adaptation proves stable, it might reduce the engineering effort spent on manual receptive-field hyper-parameter search in convolutional backbones.
  • Sample-specific field sizing could interact with other few-shot techniques such as prototype networks or meta-learning updates.

Load-bearing premise

Making receptive field size depend on the input image will reliably produce stronger descriptors than any fixed size without adding instability or extra overfitting risk under few-shot constraints.

What would settle it

A controlled experiment in which a fixed-receptive-field version of the same architecture matches or exceeds the adaptive version's accuracy on the same few-shot fine-grained benchmarks under identical episodic training.

Figures

Figures reproduced from arXiv: 2604.16936 by Changming Sun, Jun Hu, Linyue Zhang, Lixian Liu, Tuo Wang, Weichuan Zhang, Wenyi Zeng, Yongsheng Gao, Zicheng Pan.

Figure 1: (a) The impact of receptive field size on the FSFGIC accuracy of BDFRNet [46]. [PITH_FULL_IMAGE:figures/full_fig_p003_1.png]
Figure 2: The pipeline of the proposed ARF-SFR-Net for a 5-way 1-shot FSFGIC task. [PITH_FULL_IMAGE:figures/full_fig_p009_2.png]
Figure 3: The proposed adaptive receptive field (ARF) strategy. [PITH_FULL_IMAGE:figures/full_fig_p010_3.png]
Figure 4: The architecture of ARF-SFR based on the ResNet-12 backbone. [PITH_FULL_IMAGE:figures/full_fig_p017_4.png]
Figure 5: Examples of the loss and accuracy curves of BDFRNet and the proposed method. [PITH_FULL_IMAGE:figures/full_fig_p021_5.png]
Figure 6: Heatmaps of six images visualized in different domains. [PITH_FULL_IMAGE:figures/full_fig_p028_6.png]
original abstract

Feature reconstruction techniques are widely applied for few-shot fine-grained image classification (FSFGIC). Our research indicates that one of the main challenges facing existing feature-based FSFGIC methods is how to choose the size of the receptive field to extract feature descriptors (including spatial and frequency feature descriptors) from different category input images, thereby better performing the FSFGIC tasks. To address this, an adaptive receptive field-based spatial-frequency feature reconstruction network (ARF-SFR-Net) is proposed. The designed ARF-SFR-Net has the capability to adaptively determine receptive field sizes for obtaining spatial and frequency features, and effectively fuse them for reconstruction and FSFGIC tasks. The designed ARF-SFR-Net can be easily embedded into a given episodic training mechanism for end-to-end training from scratch. Extensive experiments on multiple FSFGIC benchmarks demonstrate the effectiveness and superiority of the proposed ARF-SFR-Net over state-of-the-art approaches. The code is available at: https://github.com/ICL-SUST/ARF-SFR-Net.git.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 2 minor

Summary. The manuscript introduces ARF-SFR-Net for few-shot fine-grained image classification (FSFGIC). It addresses challenges in selecting receptive field sizes for spatial and frequency feature descriptors by proposing an adaptive mechanism that determines these sizes from input images, fuses the resulting features for reconstruction, and integrates into standard episodic training for end-to-end optimization. Experiments on multiple FSFGIC benchmarks are reported to demonstrate superiority over state-of-the-art methods, with code released publicly.

Significance. If the adaptive receptive-field module reliably produces more discriminative descriptors without introducing instability, the approach could advance feature reconstruction techniques in data-scarce fine-grained settings by making receptive-field selection content-dependent rather than fixed. The public code release supports reproducibility, which is a clear strength.

major comments (3)
  1. [§3] §3 (Method): The adaptive receptive-field predictor is trained jointly on the same small support sets as the downstream classifier; the manuscript provides no explicit regularization, meta-regularization, or capacity-control mechanism for this predictor, leaving open the possibility that reported gains arise from episode-specific memorization rather than a general mapping from image content to optimal field size.
  2. [§4.3] §4.3 (Ablations): Direct comparisons isolating the adaptive module against strong fixed-size receptive-field baselines (with matched capacity) are not presented in sufficient detail; without these, it is impossible to confirm that adaptivity, rather than added parameters or the spatial-frequency fusion alone, drives the benchmark improvements.
  3. [Table 1] Table 1 and §4.2 (Results): Accuracy gains are stated without standard deviations across random seeds or statistical significance tests; this makes it difficult to judge whether the superiority claims are robust or sensitive to particular data splits and hyperparameter choices in the few-shot episodic setting.
minor comments (2)
  1. [Abstract] Abstract: The phrase 'extensive experiments on multiple FSFGIC benchmarks' would be more informative if the number of datasets and primary metrics were named explicitly.
  2. [§2] §2 (Related Work): A brief discussion of recent adaptive-receptive-field methods outside the few-shot literature would help clarify the novelty of the proposed predictor.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the constructive and detailed comments. We address each major comment below and outline the revisions planned for the resubmitted manuscript.

point-by-point responses
  1. Referee: [§3] §3 (Method): The adaptive receptive-field predictor is trained jointly on the same small support sets as the downstream classifier; the manuscript provides no explicit regularization, meta-regularization, or capacity-control mechanism for this predictor, leaving open the possibility that reported gains arise from episode-specific memorization rather than a general mapping from image content to optimal field size.

    Authors: We acknowledge the concern that joint end-to-end training on limited support sets could lead to episode-specific fitting rather than learning a general content-to-receptive-field mapping. The episodic training procedure itself exposes the predictor to thousands of distinct support-query pairs sampled from base classes, which functions as implicit regularization. Nevertheless, to directly address the point we will add a new subsection in the revised §3 that analyzes the stability of predicted receptive-field sizes across episodes and datasets, including qualitative visualizations and quantitative variance measures. We will also introduce a lightweight output-entropy regularization term on the predictor during revision and report its effect. revision: partial

  2. Referee: [§4.3] §4.3 (Ablations): Direct comparisons isolating the adaptive module against strong fixed-size receptive-field baselines (with matched capacity) are not presented in sufficient detail; without these, it is impossible to confirm that adaptivity, rather than added parameters or the spatial-frequency fusion alone, drives the benchmark improvements.

    Authors: We agree that the current ablation study does not fully isolate the benefit of adaptivity from capacity or fusion effects. In the revised §4.3 we will add new experiments that compare the adaptive module against fixed-size receptive-field baselines whose total parameter count is matched by increasing the number of parallel fixed branches or adjusting channel widths. These controlled comparisons will be presented alongside the existing ablations to demonstrate that the observed gains are attributable to the adaptive mechanism. revision: yes

  3. Referee: [Table 1] Table 1 and §4.2 (Results): Accuracy gains are stated without standard deviations across random seeds or statistical significance tests; this makes it difficult to judge whether the superiority claims are robust or sensitive to particular data splits and hyperparameter choices in the few-shot episodic setting.

    Authors: This observation is correct; the absence of variability measures limits the strength of the superiority claims. In the revised manuscript we will recompute all entries in Table 1 (and the corresponding text in §4.2) as means ± standard deviations over five independent runs with different random seeds. We will also include paired statistical significance tests (e.g., t-tests) against the strongest baselines to quantify the robustness of the reported improvements. revision: yes
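The promised protocol is standard. A sketch of the arithmetic on hypothetical per-seed accuracies (no values from the paper):

```python
import math
from statistics import mean, stdev

def paired_t(acc_a, acc_b):
    """Paired t-statistic for per-seed accuracies of two methods.

    The rebuttal proposes means +/- standard deviations over seeds plus
    paired significance tests; this shows the arithmetic on made-up
    numbers, not results reported in the manuscript.
    """
    diffs = [a - b for a, b in zip(acc_a, acc_b)]
    n = len(diffs)
    se = stdev(diffs) / math.sqrt(n)  # standard error of the mean difference
    return mean(diffs) / se           # compare against t_{n-1} critical value

# Hypothetical 5-seed accuracies (%): adaptive vs fixed receptive field.
adaptive = [85.2, 84.9, 85.5, 85.1, 85.3]
fixed = [84.1, 84.3, 83.9, 84.0, 84.2]
t = paired_t(adaptive, fixed)  # |t| > 2.776 rejects H0 at alpha=0.05, df=4
```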

Circularity Check

0 steps flagged

No derivation reduces to fitted input or self-citation by construction; architectural proposal validated empirically

full rationale

The paper presents ARF-SFR-Net as a novel network architecture with adaptive receptive-field modules for spatial-frequency feature reconstruction in few-shot fine-grained classification. No equations or claims in the abstract or described method reduce a prediction to a fitted parameter by construction, nor does any load-bearing premise rely on self-citation of an unverified uniqueness theorem. The central claim is that the adaptive design improves performance, which is asserted to be shown via experiments on benchmarks rather than derived tautologically from the training objective. This matches the default expectation of no significant circularity for an empirical architecture paper.

Axiom & Free-Parameter Ledger

1 free parameter · 1 axiom · 0 invented entities

The central claim rests on standard deep-learning assumptions (gradient-based optimization works, episodic training is a valid proxy for few-shot generalization) plus the unproven premise that adaptive receptive fields will outperform fixed ones without additional regularization.

free parameters (1)
  • adaptive receptive-field parameters
    The network must learn parameters that decide receptive-field size per input; these are fitted during training and directly affect the extracted features.
axioms (1)
  • domain assumption Episodic training on base classes produces transferable features for novel classes
    The paper states the network can be embedded into any episodic training mechanism, inheriting this common but unproven assumption of the few-shot literature.

pith-pipeline@v0.9.0 · 5507 in / 1176 out tokens · 27497 ms · 2026-05-10T06:55:45.621973+00:00 · methodology

discussion (0)


Reference graph

Works this paper leans on

85 extracted references · 4 canonical work pages

  1. [1]

    Y. Liao, Y. Gao, W. Zhang, Neuron abandoning attention flow: Visual explanation of dynamics inside CNN models, IEEE Transactions on Pattern Analysis and Machine Intelligence (2026)

  2. [2]

    Y. Liao, Y. Gao, W. Zhang, Dynamic accumulated attention map for interpreting evolution of decision-making in vision transformer, Pattern Recognition 165 (2025) 111607

  3. [3]

    J. Jing, S. Liu, G. Wang, W. Zhang, C. Sun, Recent advances on image edge detection: A comprehensive review, Neurocomputing 503 (2022) 259–271

  4. [4]

    C. Wah, S. Branson, P. Welinder, P. Perona, S. Belongie, The Caltech-UCSD Birds-200-2011 Dataset, California Institute of Technology (2011)

  5. [5]

    A. Khosla, N. Jayadevaprakash, B. Yao, F.-F. Li, Novel dataset for fine-grained image categorization: Stanford Dogs, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshop on Fine-Grained Visual Categorization, Vol. 2, 2011

  6. [6]

    J. Krause, M. Stark, J. Deng, L. Fei-Fei, 3D object representations for fine-grained categorization, in: Proceedings of the IEEE International Conference on Computer Vision Workshops, 2013

  7. [7]

    W. Zhang, C. Sun, Y. Gao, Image intensity variation information for interest point detection, IEEE Transactions on Pattern Analysis and Machine Intelligence 45 (8) (2023) 9883–9894

  8. [8]

    B. Ma, J. Guo, T.-T. Zhai, A. van der Schaaf, R. J. Steenbakkers, L. V. van Dijk, S. Both, J. A. Langendijk, W. Zhang, B. Qiu, et al., CT-based deep multi-label learning prediction model for outcome in patients with oropharyngeal squamous cell carcinoma, Medical Physics 50 (10) (2023) 6190–6200

  9. [9]

    T. Liu, J. Xu, T. Lei, Y. Wang, X. Du, W. Zhang, Z. Lv, M. Gong, AEKAN: Exploring superpixel-based autoencoder Kolmogorov-Arnold network for unsupervised multimodal change detection, IEEE Transactions on Geoscience and Remote Sensing (2024)

  10. [10]

    Y. Li, Y. Bi, W. Zhang, C. Sun, Multi-scale anisotropic gaussian kernels for image edge detection, IEEE Access 8 (2019) 1803–1812

  11. [11]

    M. A. Islam, J. Zhou, W. Zhang, Y. Gao, Background-aware band selection for object tracking in hyperspectral videos, IEEE Geoscience and Remote Sensing Letters 20 (2023) 1–5

  12. [12]

    Y. Li, B. Feng, W. Zhang, Mutual interference mitigation of millimeter-wave radar based on variational mode decomposition and signal reconstruction, Remote Sensing 15 (3) (2023) 557

  13. [13]

    J. Wang, W. Zhang, A survey of corner detection methods, in: 2018 2nd International Conference on Electrical Engineering and Automation (ICEEA 2018), Atlantis Press, 2018, pp. 214–219

  14. [14]

    Y. An, J. Jing, W. Zhang, Edge detection using multi-directional anisotropic Gaussian directional derivative, Signal, Image and Video Processing 17 (7) (2023) 3767–3774

  15. [15]

    Y. Li, Y. Bi, W. Zhang, J. Ren, J. Chen, M2GF: Multi-scale and multi-directional Gabor filters for image edge detection, Applied Sciences 13 (16) (2023) 9409

  16. [16]

    M. Wang, W. Zhang, C. Sun, A. Sowmya, Corner detection based on shearlet transform and multi-directional structure tensor, Pattern Recognition 103 (2020) 107299

  17. [17]

    B. Qiu, J. Guo, J. Kraeima, H. H. Glas, W. Zhang, R. J. Borra, M. J. H. Witjes, P. M. van Ooijen, Recurrent convolutional neural networks for 3D mandible segmentation in computed tomography, Journal of Personalized Medicine 11 (6) (2021) 492

  18. [18]

    Y. Li, W. Zhang, Traffic flow digital twin generation for highway scenario based on radar-camera paired fusion, Scientific Reports 13 (1) (2023) 642

  19. [19]

    W. Zhang, C. Sun, Corner detection using second-order generalized Gaussian directional derivative representations, IEEE Transactions on Pattern Analysis and Machine Intelligence 43 (4) (2019) 1213–1224

  20. [20]

    W.-C. Zhang, F.-P. Wang, L. Zhu, Z.-F. Zhou, Corner detection using Gabor filters, IET Image Processing 8 (11) (2014) 639–646

  21. [21]

    W. Zhang, C. Sun, T. Breckon, N. Alshammari, Discrete curvature representations for noise robust image corner detection, IEEE Transactions on Image Processing 28 (9) (2019) 4444–4459

  22. [22]

    W. Zhang, C. Sun, Corner detection using multi-directional structure tensor with multiple scales, International Journal of Computer Vision 128 (2) (2020) 438–459

  23. [23]

    W.-C. Zhang, P.-L. Shui, Contour-based corner detection via angle difference of principal directions of anisotropic Gaussian directional derivatives, Pattern Recognition 48 (9) (2015) 2785–2797

  24. [24]

    W. Zhang, Y. Zhao, T. P. Breckon, L. Chen, Noise robust image edge detection based upon the automatic anisotropic Gaussian kernels, Pattern Recognition 63 (2017) 193–205

  25. [25]

    P.-L. Shui, W.-C. Zhang, Noise-robust edge detector combining isotropic and anisotropic Gaussian kernels, Pattern Recognition 45 (2) (2012) 806–820

  26. [26]

    P.-L. Shui, W.-C. Zhang, Corner detection and classification using anisotropic directional derivative representations, IEEE Transactions on Image Processing 22 (8) (2013) 3204–3218. doi:10.1109/TIP.2013.2259834

  27. [27]

    G. Guo, J. Wan, W. Zhang, GAttenRNN: a recurrent neural network for spatio-temporal prediction learning based on gated transformer, International Journal of Machine Learning and Cybernetics 17 (5) (2026) 207

  28. [28]

    J. Song, A. Sowmya, W. Zhang, C. Sun, Efficient transformer with compressed attention for stereo image super-resolution, Knowledge-Based Systems (2025) 114844

  29. [29]

    Y. Liao, U. E. Akpudo, J. Zhang, Y. Gao, J. Zhou, W. Zeng, W. Zhang, Visual explanation via similar feature activation for metric learning, arXiv preprint arXiv:2506.01636 (2025)

  30. [30]

    J. Huang, X. Rao, W. Zhang, J. Song, X. Sun, Heart rate detection using motion compensation with multiple ROIs, in: Proceedings of the 2022 11th International Conference on Computing and Pattern Recognition, 2022, pp. 431–438

  31. [31]

    X. Tang, S. Cen, Z. Deng, Z. Zhang, Y. Meng, J. Xie, C. Tang, W. Zhang, G. Zhao, Cascading attention enhancement network for rgb-d indoor scene segmentation, Computer Vision and Image Understanding 259 (2025) 104411

  32. [32]

    J. Lu, W. Wu, K. Gao, P. Mao, W. Zhang, T. Wang, L. Ma, J. Guo, Z. Wu, Y. Hu, et al., Meningioma analysis and diagnosis using limited labeled samples, arXiv preprint arXiv:2602.13335 (2026)

  33. [33]

    J. Ren, Y. An, T. Lei, J. Yang, W. Zhang, Z. Pan, Y. Liao, Y. Gao, C. Sun, W. Zhang, Adaptive feature selection-based feature reconstruction network for few-shot learning, Pattern Recognition (2026) 112289

  34. [34]

    J. Lu, G. Peng, W. Zhang, C. Sun, Track-before-detect algorithm based on cost-reference particle filter bank for weak target detection, IEEE Access 11 (2023) 121688–121701

  35. [35]

    W. Zhang, Y. Zhao, Y. Gao, C. Sun, Re-abstraction and perturbing support pair network for few-shot fine-grained image classification, Pattern Recognition 148 (2024) 110158

  36. [36]

    J. Wang, J. Lu, J. Yang, M. Wang, W. Zhang, An unbiased feature estimation network for few-shot fine-grained image classification, Sensors 24 (23) (2024) 7737

  37. [37]

    T. Lei, W. Song, W. Zhang, X. Du, C. Li, L. He, A. K. Nandi, Semi-supervised 3-D medical image segmentation using multiconsistency learning with fuzzy perception-guided target selection, IEEE Transactions on Radiation and Plasma Medical Sciences 9 (4) (2024) 421–432

  38. [38]

    J. Jing, S. Liu, C. Liu, T. Gao, W. Zhang, C. Sun, A novel decision mechanism for image edge detection, in: International Conference on Intelligent Computing, Springer, 2021, pp. 274–287

  39. [39]

    M. Wang, B. Zheng, G. Wang, J. Yang, J. Lu, W. Zhang, A principal component analysis-based feature optimization network for few-shot fine-grained image classification, Mathematics 13 (7) (2025) 1098

  40. [40]

    Y. Liao, W. Zhang, Y. Gao, C. Sun, X. Yu, ASRSNet: Automatic salient region selection network for few-shot fine-grained image classification, in: International Conference on Pattern Recognition and Artificial Intelligence, Springer, 2022, pp. 627–638

  41. [41]

    J. Lu, W. Zhang, Y. Zhao, C. Sun, Image local structure information learning for fine-grained visual classification, Scientific Reports 12 (1) (2022) 19205

  42. [42]

    W. Zhang, X. Liu, Z. Xue, Y. Gao, C. Sun, NDPNet: A novel non-linear data projection network for few-shot fine-grained image classification, arXiv preprint arXiv:2106.06988 (2021)

  43. [43]

    J. Ren, C. Li, Y. An, W. Zhang, C. Sun, Few-shot fine-grained image classification: A comprehensive review, AI 5 (1) (2024) 405–425

  44. [44]

    Z. Pan, W. Zhang, X. Yu, M. Zhang, Y. Gao, Pseudo-set frequency refinement architecture for fine-grained few-shot class-incremental learning, Pattern Recognition 155 (2024) 110686

  45. [45]

    J. Ren, Y. Zhao, W. Zhang, C. Sun, Zero-shot incremental learning using spatial-frequency feature representations, Scientific Reports 15 (1) (2025) 10932

  46. [46]

    J. Wu, D. Chang, A. Sain, X. Li, Z. Ma, J. Cao, J. Guo, Y.-Z. Song, Bi-directional ensemble feature reconstruction network for few-shot fine-grained classification, IEEE Transactions on Pattern Analysis and Machine Intelligence 46 (9) (2024) 6082–6096

  47. [47]

    D. Wertheimer, L. Tang, B. Hariharan, Few-shot classification with feature map reconstruction networks, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021, pp. 8012–8021

  48. [48]

    R. R. Selvaraju, M. Cogswell, A. Das, R. Vedantam, D. Parikh, D. Batra, Grad-CAM: Visual explanations from deep networks via gradient-based localization, in: Proceedings of the IEEE International Conference on Computer Vision, 2017, pp. 618–626

  49. [49]

    Z. Zheng, H. Ren, Y. Wu, W. Zhang, H. Lu, Y. Yang, H. T. Shen, Fully unsupervised domain-agnostic image retrieval, IEEE Transactions on Circuits and Systems for Video Technology 34 (6) (2023) 5077–5090

  50. [50]

    J. Jing, C. Liu, W. Zhang, Y. Gao, C. Sun, ECFRNet: Effective corner feature representations network for image corner detection, Expert Systems with Applications 211 (2023) 118673

  51. [51]

    D. Wertheimer, B. Hariharan, Few-shot learning with localization in realistic settings, in: Proceedings of the Conference on Computer Vision and Pattern Recognition, 2019, pp. 6558–6567

  52. [52]

    X. Ruan, G. Lin, C. Long, S. Lu, Few-shot fine-grained classification with spatial attentive comparison, Knowledge-Based Systems 218 (2021) 106840

  53. [53]

    Y. Wang, Y. Ji, W. Wang, B. Wang, Bi-channel attention meta learning for few-shot fine-grained image recognition, Expert Systems with Applications 242 (2024) 122741

  54. [54]

    Z. Leng, M. Wang, Q. Wan, Y. Xu, B. Yan, S. Sun, Meta-learning of feature distribution alignment for enhanced feature sharing, Knowledge-Based Systems 296 (2024) 111875

  55. [55]

    Y. Wu, S. Chanda, M. Hosseinzadeh, Z. Liu, Y. Wang, Few-shot learning of compact models via task-specific meta distillation, in: Proceedings of the IEEE Winter Conference on Applications of Computer Vision, 2023, pp. 6265–6274

  56. [56]

    S. Xu, F. Zhang, X. Wei, J. Wang, Dual attention networks for few-shot fine-grained recognition, in: Proceedings of the Association for the Advancement of Artificial Intelligence, 2022

  57. [57]

    B. Zhang, J. Yuan, B. Li, T. Chen, J. Fan, B. Shi, Learning cross-image object semantic relation in transformer for few-shot fine-grained image classification, in: Proceedings of the ACM International Conference on Multimedia, 2022, pp. 2135–2144

  58. [58]

    C. Zhang, Y. Cai, G. Lin, C. Shen, DeepEMD: Differentiable earth mover's distance for few-shot learning, IEEE Transactions on Pattern Analysis and Machine Intelligence 45 (5) (2022) 5632–5648

  59. [59]

    Z.-X. Ma, Z.-D. Chen, L.-J. Zhao, Z.-C. Zhang, X. Luo, X.-S. Xu, Cross-layer and cross-sample feature optimization network for few-shot fine-grained image classification, in: Proceedings of Conference on Association for the Advancement of Artificial Intelligence, Vol. 38, 2024, pp. 4136–4144

  60. [60]

    Z.-X. Ma, Z.-D. Chen, T. Zheng, X. Luo, Z. Jia, X.-S. Xu, Few-shot fine-grained image classification with progressively feature refinement and continuous relationship modeling, in: Proceedings of the Association for the Advancement of Artificial Intelligence, Vol. 39, 2025, pp. 6036–6044

  61. [61]

    S. Yang, X. Li, D. Chang, Z. Ma, J.-H. Xue, Channel-spatial support-query cross-attention for fine-grained few-shot image classification, in: Proceedings of the ACM International Conference on Multimedia, 2024, pp. 9175–9183

  62. [62]

    S. Lee, W. Moon, H. S. Seong, J.-P. Heo, Task-oriented channel attention for fine-grained few-shot classification, IEEE Transactions on Pattern Analysis and Machine Intelligence 47 (3) (2025) 1448–1463

  63. [63]

    X. Long, X. Wang, C. Yang, Z. He, Q. He, X. Chen, ATARS: Adaptive task-aware feature learning for few-shot fine-grained classification, Knowledge-Based Systems 338 (2026) 115485

  64. [64]

    H. Cheng, S. Yang, J. T. Zhou, L. Guo, B. Wen, Frequency guidance matters in few-shot learning, in: Proceedings of the IEEE International Conference on Computer Vision, 2023, pp. 11814–11824

  65. [65]

    R. Zhou, J. Chen, Y. Shi, L. Wang, W. Wang, J. Sun, C. Zhang, Meta-exploiting frequency prior for cross-domain few-shot learning, in: Advances in Neural Information Processing Systems, 2024

  66. [66]

    Z. Ji, Z. Wang, X. Liu, Y. Yu, Y. Pang, J. Han, Frequency-spatial complementation: Unified channel-specific style attack for cross-domain few-shot learning, IEEE Transactions on Image Processing 34 (2025) 2242–2253

  67. [67]

    M. Cao, X. Chen, J. Zhang, Z. Wang, L. Zhang, Spatial–spectral–semantic cross-domain few-shot learning for hyperspectral image classification, IEEE Transactions on Geoscience and Remote Sensing 62 (2024) 1–15

  68. [68]

    J. Bao, J. Jing, W. Zhang, C. Liu, T. Gao, A corner detection method based on adaptive multi-directional anisotropic diffusion, Multimedia Tools and Applications 81 (20) (2022) 28729–28754

  69. [69]

    B. Xi, Y. Zhang, J. Li, Y. Huang, Y. Li, Z. Li, J. Chanussot, Transductive few-shot learning with enhanced spectral–spatial embedding for hyperspectral image classification, IEEE Transactions on Image Processing 34 (2025) 854–868

  70. [70]

    J. Snell, K. Swersky, R. Zemel, Prototypical networks for few-shot learning, in: Advances in Neural Information Processing Systems, 2017, pp. 4077–4087

  71. [71]

    K. Lee, S. Maji, A. Ravichandran, S. Soatto, Meta-learning with differentiable convex optimization, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019, pp. 10657–10665

  72. [72]

    S. Lee, W. Moon, J.-P. Heo, Task discrepancy maximization for fine-grained few-shot classification, in: Proceedings of the Conference on Computer Vision and Pattern Recognition, 2022, pp. 5331–5340

  73. [73]

    Z. Sun, W. Zheng, P. Guo, M. Wang, TST_MFL: Two-stage training based metric fusion learning for few-shot image classification, Information Fusion 113 (2025) 102611

  74. [74]

    L.-J. Zhao, Z.-D. Chen, Z.-X. Ma, X. Luo, X.-S. Xu, Angular isotonic loss guided multi-layer integration for few-shot fine-grained image classification, IEEE Transactions on Image Processing 33 (2024) 3778–3792

  75. [75]

    Z.-X. Ma, Z.-D. Chen, L.-J. Zhao, Z.-C. Zhang, T. Zheng, X. Luo, X.-S. Xu, Bi-directional task-guided network for few-shot fine-grained image classification, in: Proceedings of the ACM International Conference on Multimedia, 2024, pp. 8277–8286

  76. [76]

    H. Huang, J. Zhang, J. Zhang, J. Xu, Q. Wu, Low-rank pairwise alignment bilinear network for few-shot fine-grained image classification, IEEE Transactions on Multimedia 23 (2020) 1666–1680

  77. [77]

    C. Wang, H. Fu, H. Ma, PaCL: Part-level Contrastive Learning for Fine-grained Few-shot Image Classification, in: Proceedings of the ACM International Conference on Multimedia, 2022, pp. 6416–6424

  78. [78]

    L. Yu, Z. Guan, W. Zhao, Y. Yang, J. Tan, Adaptive task-aware refining network for few-shot fine-grained image classification, IEEE Transactions on Circuits and Systems for Video Technology 35 (3) (2025) 2301–2314

  79. [79]

    C. Qi, C. Ye, W. Lin, Z. Liu, J. Qiu, Binohem: Binocular singular Hellinger metametric for fine-grained few-shot classification, IEEE Transactions on Image Processing 34 (2025) 7264–7277

  80. [80]

    J. Xie, F. Long, J. Lv, Q. Wang, P. Li, Joint distribution matters: Deep Brownian distance covariance for few-shot classification, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2022, pp. 10865–10874

Showing first 80 references.