pith. machine review for the scientific record.

arxiv: 2604.16936 · v1 · submitted 2026-04-18 · 💻 cs.CV · cs.AI

Recognition: unknown

Adaptive receptive field-based spatial-frequency feature reconstruction network for few-shot fine-grained image classification


Pith reviewed 2026-05-10 06:55 UTC · model grok-4.3

classification 💻 cs.CV cs.AI
keywords few-shot fine-grained image classification · adaptive receptive field · spatial-frequency features · feature reconstruction · episodic training · FSFGIC · ARF-SFR-Net

The pith

A network that picks receptive field sizes based on each input image, then extracts and fuses spatial and frequency features, to improve few-shot fine-grained classification.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper identifies the difficulty of selecting appropriate receptive field sizes when extracting spatial and frequency descriptors from varied input images in few-shot fine-grained tasks. To solve this, it introduces ARF-SFR-Net, which learns to set receptive field sizes adaptively for each sample, extracts the corresponding features, and fuses them for reconstruction. The network slots directly into standard episodic training pipelines and trains end-to-end without pre-training. Experiments across several FSFGIC benchmarks show the approach outperforms prior feature-reconstruction methods.
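The episodic pipeline the network slots into follows the standard N-way K-shot protocol. A minimal sketch of that generic protocol (not code from ARF-SFR-Net itself; names and defaults are illustrative):

```python
import random
from collections import defaultdict

def sample_episode(labels, n_way=5, k_shot=1, q_queries=15, seed=None):
    """Sample one N-way K-shot episode from a labeled pool.

    labels: list of (sample_id, class_id) pairs drawn from base classes.
    Returns (support, query), each a list of (sample_id, class_id).
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for sid, cid in labels:
        by_class[cid].append(sid)
    # Pick N classes, then K support + Q query samples per class,
    # without replacement so support and query never overlap.
    classes = rng.sample(sorted(by_class), n_way)
    support, query = [], []
    for cid in classes:
        picks = rng.sample(by_class[cid], k_shot + q_queries)
        support += [(sid, cid) for sid in picks[:k_shot]]
        query += [(sid, cid) for sid in picks[k_shot:]]
    return support, query
```

Training repeats this sampling thousands of times over the base classes; "end-to-end from scratch" means the whole network, adaptive module included, is optimized inside this loop with no pre-trained backbone.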

Core claim

The designed ARF-SFR-Net has the capability to adaptively determine receptive field sizes for obtaining spatial and frequency features, and effectively fuse them for reconstruction and FSFGIC tasks.

What carries the argument

ARF-SFR-Net, which adaptively selects input-dependent receptive field sizes to extract spatial and frequency descriptors before fusing them for reconstruction.
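The paper's exact ARF module is not reproduced here, but input-dependent receptive-field selection can be sketched in the style of selective-kernel attention: parallel branches with different field sizes, gated by a softmax computed from the input itself. Everything below (box-filter branches as stand-ins for convolutions, the random gating matrix) is illustrative, not the authors' architecture:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def adaptive_rf_fuse(feat, kernel_sizes=(3, 5, 7), rng=None):
    """Sketch of input-dependent receptive-field selection.

    feat: (C, H, W) feature map. Each branch smooths with a different
    box filter (a proxy for a conv with that receptive field); a tiny
    gating head maps the globally pooled input to softmax weights over
    branches, so the effective field size depends on the input.
    """
    rng = np.random.default_rng(0) if rng is None else rng
    C, H, W = feat.shape
    branches = []
    for k in kernel_sizes:
        pad = k // 2
        padded = np.pad(feat, ((0, 0), (pad, pad), (pad, pad)), mode="edge")
        out = np.zeros_like(feat)
        for i in range(H):
            for j in range(W):
                out[:, i, j] = padded[:, i:i + k, j:j + k].mean(axis=(1, 2))
        branches.append(out)
    stacked = np.stack(branches)                    # (B, C, H, W)
    gap = feat.mean(axis=(1, 2))                    # (C,) global descriptor
    W_gate = rng.normal(scale=0.1, size=(len(kernel_sizes), C))
    weights = softmax(W_gate @ gap)                 # (B,) input-dependent
    fused = np.tensordot(weights, stacked, axes=1)  # (C, H, W)
    return fused, weights
```

A fixed-receptive-field baseline corresponds to forcing `weights` to a one-hot vector; the paper's claim is that letting the gate depend on the input beats any such fixed choice.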

If this is right

  • The network integrates into any episodic training loop for end-to-end training from scratch.
  • Feature reconstruction quality improves when spatial and frequency descriptors are drawn from input-specific receptive fields.
  • Performance gains appear on multiple standard FSFGIC benchmarks relative to prior state-of-the-art feature-based methods.
  • The adaptive mechanism addresses the core difficulty of choosing receptive field size across different category inputs.
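The second bullet turns on pairing spatial descriptors with frequency descriptors. A minimal sketch of what a two-domain descriptor pair looks like, using a 2-D FFT as a stand-in for whatever frequency transform the paper actually employs:

```python
import numpy as np

def spatial_frequency_descriptor(feat):
    """Illustrative spatial + frequency descriptors (not the paper's
    exact operators; the 2-D FFT here is one common frequency choice).

    feat: (C, H, W) feature map. Returns the spatial descriptor (the
    map itself), a frequency descriptor (log-magnitude spectrum per
    channel), and a naive channel-wise concatenation as fusion.
    """
    spectrum = np.fft.fft2(feat, axes=(-2, -1))
    freq = np.log1p(np.abs(np.fft.fftshift(spectrum, axes=(-2, -1))))
    # The paper learns the fusion; concatenation just shows that the
    # two streams share spatial layout and can be combined per pixel.
    fused = np.concatenate([feat, freq], axis=0)
    return {"spatial": feat, "frequency": freq, "fused": fused}
```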

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same input-dependent field selection could be tested in other low-data regimes such as few-shot object detection or segmentation.
  • If the adaptation proves stable, it might reduce the engineering effort spent on manual receptive-field hyper-parameter search in convolutional backbones.
  • Sample-specific field sizing could interact with other few-shot techniques such as prototype networks or meta-learning updates.

Load-bearing premise

Making receptive field size depend on the input image will reliably produce stronger descriptors than any fixed size without adding instability or extra overfitting risk under few-shot constraints.

What would settle it

A controlled experiment in which a fixed-receptive-field version of the same architecture matches or exceeds the adaptive version's accuracy on the same few-shot fine-grained benchmarks under identical episodic training.

Figures

Figures reproduced from arXiv: 2604.16936 by Changming Sun, Jun Hu, Linyue Zhang, Lixian Liu, Tuo Wang, Weichuan Zhang, Wenyi Zeng, Yongsheng Gao, Zicheng Pan.

Figure 1: (a) The impact of receptive field size on the FSFGIC accuracy of BDFRNet [46]. [PITH_FULL_IMAGE:figures/full_fig_p003_1.png]
Figure 2: The pipeline of the proposed ARF-SFR-Net for a 5-way 1-shot FSFGIC task. [PITH_FULL_IMAGE:figures/full_fig_p009_2.png]
Figure 3: The proposed adaptive receptive field (ARF) strategy. [PITH_FULL_IMAGE:figures/full_fig_p010_3.png]
Figure 4: The architecture of ARF-SFR based on the ResNet-12 backbone. [PITH_FULL_IMAGE:figures/full_fig_p017_4.png]
Figure 5: Examples of the loss and accuracy curves of BDFRNet and the proposed method. [PITH_FULL_IMAGE:figures/full_fig_p021_5.png]
Figure 6: Heatmaps of six images visualized in different domains. [PITH_FULL_IMAGE:figures/full_fig_p028_6.png]
original abstract

Feature reconstruction techniques are widely applied for few-shot fine-grained image classification (FSFGIC). Our research indicates that one of the main challenges facing existing feature-based FSFGIC methods is how to choose the size of the receptive field to extract feature descriptors (including spatial and frequency feature descriptors) from different category input images, thereby better performing the FSFGIC tasks. To address this, an adaptive receptive field-based spatial-frequency feature reconstruction network (ARF-SFR-Net) is proposed. The designed ARF-SFR-Net has the capability to adaptively determine receptive field sizes for obtaining spatial and frequency features, and effectively fuse them for reconstruction and FSFGIC tasks. The designed ARF-SFR-Net can be easily embedded into a given episodic training mechanism for end-to-end training from scratch. Extensive experiments on multiple FSFGIC benchmarks demonstrate the effectiveness and superiority of the proposed ARF-SFR-Net over state-of-the-art approaches. The code is available at: https://github.com/ICL-SUST/ARF-SFR-Net.git.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 2 minor

Summary. The manuscript introduces ARF-SFR-Net for few-shot fine-grained image classification (FSFGIC). It addresses challenges in selecting receptive field sizes for spatial and frequency feature descriptors by proposing an adaptive mechanism that determines these sizes from input images, fuses the resulting features for reconstruction, and integrates into standard episodic training for end-to-end optimization. Experiments on multiple FSFGIC benchmarks are reported to demonstrate superiority over state-of-the-art methods, with code released publicly.

Significance. If the adaptive receptive-field module reliably produces more discriminative descriptors without introducing instability, the approach could advance feature reconstruction techniques in data-scarce fine-grained settings by making receptive-field selection content-dependent rather than fixed. The public code release supports reproducibility, which is a clear strength.

major comments (3)
  1. [§3] §3 (Method): The adaptive receptive-field predictor is trained jointly on the same small support sets as the downstream classifier; the manuscript provides no explicit regularization, meta-regularization, or capacity-control mechanism for this predictor, leaving open the possibility that reported gains arise from episode-specific memorization rather than a general mapping from image content to optimal field size.
  2. [§4.3] §4.3 (Ablations): Direct comparisons isolating the adaptive module against strong fixed-size receptive-field baselines (with matched capacity) are not presented in sufficient detail; without these, it is impossible to confirm that adaptivity, rather than added parameters or the spatial-frequency fusion alone, drives the benchmark improvements.
  3. [Table 1] Table 1 and §4.2 (Results): Accuracy gains are stated without standard deviations across random seeds or statistical significance tests; this makes it difficult to judge whether the superiority claims are robust or sensitive to particular data splits and hyperparameter choices in the few-shot episodic setting.
minor comments (2)
  1. [Abstract] Abstract: The phrase 'extensive experiments on multiple FSFGIC benchmarks' would be more informative if the number of datasets and primary metrics were named explicitly.
  2. [§2] §2 (Related Work): A brief discussion of recent adaptive-receptive-field methods outside the few-shot literature would help clarify the novelty of the proposed predictor.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the constructive and detailed comments. We address each major comment below and outline the revisions planned for the resubmitted manuscript.

point-by-point responses
  1. Referee: [§3] §3 (Method): The adaptive receptive-field predictor is trained jointly on the same small support sets as the downstream classifier; the manuscript provides no explicit regularization, meta-regularization, or capacity-control mechanism for this predictor, leaving open the possibility that reported gains arise from episode-specific memorization rather than a general mapping from image content to optimal field size.

    Authors: We acknowledge the concern that joint end-to-end training on limited support sets could lead to episode-specific fitting rather than learning a general content-to-receptive-field mapping. The episodic training procedure itself exposes the predictor to thousands of distinct support-query pairs sampled from base classes, which functions as implicit regularization. Nevertheless, to directly address the point we will add a new subsection in the revised §3 that analyzes the stability of predicted receptive-field sizes across episodes and datasets, including qualitative visualizations and quantitative variance measures. We will also introduce a lightweight output-entropy regularization term on the predictor during revision and report its effect. revision: partial

  2. Referee: [§4.3] §4.3 (Ablations): Direct comparisons isolating the adaptive module against strong fixed-size receptive-field baselines (with matched capacity) are not presented in sufficient detail; without these, it is impossible to confirm that adaptivity, rather than added parameters or the spatial-frequency fusion alone, drives the benchmark improvements.

    Authors: We agree that the current ablation study does not fully isolate the benefit of adaptivity from capacity or fusion effects. In the revised §4.3 we will add new experiments that compare the adaptive module against fixed-size receptive-field baselines whose total parameter count is matched by increasing the number of parallel fixed branches or adjusting channel widths. These controlled comparisons will be presented alongside the existing ablations to demonstrate that the observed gains are attributable to the adaptive mechanism. revision: yes

  3. Referee: [Table 1] Table 1 and §4.2 (Results): Accuracy gains are stated without standard deviations across random seeds or statistical significance tests; this makes it difficult to judge whether the superiority claims are robust or sensitive to particular data splits and hyperparameter choices in the few-shot episodic setting.

    Authors: This observation is correct; the absence of variability measures limits the strength of the superiority claims. In the revised manuscript we will recompute all entries in Table 1 (and the corresponding text in §4.2) as means ± standard deviations over five independent runs with different random seeds. We will also include paired statistical significance tests (e.g., t-tests) against the strongest baselines to quantify the robustness of the reported improvements. revision: yes
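The promised protocol is standard. A sketch of the arithmetic on hypothetical per-seed accuracies (no values from the paper):

```python
import math
from statistics import mean, stdev

def paired_t(acc_a, acc_b):
    """Paired t-statistic for per-seed accuracies of two methods.

    The rebuttal proposes means +/- standard deviations over seeds plus
    paired significance tests; this shows the arithmetic on made-up
    numbers, not results reported in the manuscript.
    """
    diffs = [a - b for a, b in zip(acc_a, acc_b)]
    n = len(diffs)
    se = stdev(diffs) / math.sqrt(n)  # standard error of the mean difference
    return mean(diffs) / se           # compare against t_{n-1} critical value

# Hypothetical 5-seed accuracies (%): adaptive vs fixed receptive field.
adaptive = [85.2, 84.9, 85.5, 85.1, 85.3]
fixed = [84.1, 84.3, 83.9, 84.0, 84.2]
t = paired_t(adaptive, fixed)  # |t| > 2.776 rejects H0 at alpha=0.05, df=4
```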

Circularity Check

0 steps flagged

No derivation reduces to fitted input or self-citation by construction; architectural proposal validated empirically

full rationale

The paper presents ARF-SFR-Net as a novel network architecture with adaptive receptive-field modules for spatial-frequency feature reconstruction in few-shot fine-grained classification. No equations or claims in the abstract or described method reduce a prediction to a fitted parameter by construction, nor does any load-bearing premise rely on self-citation of an unverified uniqueness theorem. The central claim is that the adaptive design improves performance, which is asserted to be shown via experiments on benchmarks rather than derived tautologically from the training objective. This matches the default expectation of no significant circularity for an empirical architecture paper.

Axiom & Free-Parameter Ledger

1 free parameter · 1 axiom · 0 invented entities

The central claim rests on standard deep-learning assumptions (gradient-based optimization works, episodic training is a valid proxy for few-shot generalization) plus the unproven premise that adaptive receptive fields will outperform fixed ones without additional regularization.

free parameters (1)
  • adaptive receptive-field parameters
    The network must learn parameters that decide receptive-field size per input; these are fitted during training and directly affect the extracted features.
axioms (1)
  • domain assumption Episodic training on base classes produces transferable features for novel classes
    The paper states the network can be embedded into any episodic training mechanism, inheriting this common but unproven assumption of the few-shot literature.

pith-pipeline@v0.9.0 · 5507 in / 1176 out tokens · 27497 ms · 2026-05-10T06:55:45.621973+00:00 · methodology

discussion (0)


Reference graph

Works this paper leans on

85 extracted references · 4 canonical work pages

  1. [1]

    Y. Liao, Y. Gao, W. Zhang, Neuron abandoning attention flow: Visual explanation of dynamics inside CNN models, IEEE Transactions on Pattern Analysis and Machine Intelligence (2026)

  2. [2]

    Y. Liao, Y. Gao, W. Zhang, Dynamic accumulated attention map for interpreting evolution of decision-making in vision transformer, Pattern Recognition 165 (2025) 111607

  3. [3]

    J. Jing, S. Liu, G. Wang, W. Zhang, C. Sun, Recent advances on image edge detection: A comprehensive review, Neurocomputing 503 (2022) 259–271

  4. [4]

    C. Wah, S. Branson, P. Welinder, P. Perona, S. Belongie, The Caltech-UCSD Birds-200-2011 Dataset, California Institute of Technology (2011)

  5. [5]

    A. Khosla, N. Jayadevaprakash, B. Yao, F.-F. Li, Novel dataset for fine-grained image categorization: Stanford Dogs, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshop on Fine-Grained Visual Categorization, Vol. 2, 2011

  6. [6]

    J. Krause, M. Stark, J. Deng, L. Fei-Fei, 3D object representations for fine-grained categorization, in: Proceedings of the IEEE International Conference on Computer Vision Workshops, 2013

  7. [7]

    W. Zhang, C. Sun, Y. Gao, Image intensity variation information for interest point detection, IEEE Transactions on Pattern Analysis and Machine Intelligence 45 (8) (2023) 9883–9894

  8. [8]

    B. Ma, J. Guo, T.-T. Zhai, A. van der Schaaf, R. J. Steenbakkers, L. V. van Dijk, S. Both, J. A. Langendijk, W. Zhang, B. Qiu, et al., CT-based deep multi-label learning prediction model for outcome in patients with oropharyngeal squamous cell carcinoma, Medical Physics 50 (10) (2023) 6190–6200

  9. [9]

    T. Liu, J. Xu, T. Lei, Y. Wang, X. Du, W. Zhang, Z. Lv, M. Gong, AEKAN: Exploring superpixel-based autoencoder Kolmogorov-Arnold network for unsupervised multimodal change detection, IEEE Transactions on Geoscience and Remote Sensing (2024)

  10. [10]

    Y. Li, Y. Bi, W. Zhang, C. Sun, Multi-scale anisotropic gaussian kernels for image edge detection, IEEE Access 8 (2019) 1803–1812

  11. [11]

    M. A. Islam, J. Zhou, W. Zhang, Y. Gao, Background-aware band selection for object tracking in hyperspectral videos, IEEE Geoscience and Remote Sensing Letters 20 (2023) 1–5

  12. [12]

    Y. Li, B. Feng, W. Zhang, Mutual interference mitigation of millimeter-wave radar based on variational mode decomposition and signal reconstruction, Remote Sensing 15 (3) (2023) 557

  13. [13]

    J. Wang, W. Zhang, A survey of corner detection methods, in: 2018 2nd International Conference on Electrical Engineering and Automation (ICEEA 2018), Atlantis Press, 2018, pp. 214–219

  14. [14]

    Y. An, J. Jing, W. Zhang, Edge detection using multi-directional anisotropic Gaussian directional derivative, Signal, Image and Video Processing 17 (7) (2023) 3767–3774

  15. [15]

    Y. Li, Y. Bi, W. Zhang, J. Ren, J. Chen, M2GF: Multi-scale and multi-directional Gabor filters for image edge detection, Applied Sciences 13 (16) (2023) 9409

  16. [16]

    M. Wang, W. Zhang, C. Sun, A. Sowmya, Corner detection based on shearlet transform and multi-directional structure tensor, Pattern Recognition 103 (2020) 107299

  17. [17]

    B. Qiu, J. Guo, J. Kraeima, H. H. Glas, W. Zhang, R. J. Borra, M. J. H. Witjes, P. M. van Ooijen, Recurrent convolutional neural networks for 3D mandible segmentation in computed tomography, Journal of Personalized Medicine 11 (6) (2021) 492

  18. [18]

    Y. Li, W. Zhang, Traffic flow digital twin generation for highway scenario based on radar-camera paired fusion, Scientific Reports 13 (1) (2023) 642

  19. [19]

    W. Zhang, C. Sun, Corner detection using second-order generalized Gaussian directional derivative representations, IEEE Transactions on Pattern Analysis and Machine Intelligence 43 (4) (2019) 1213–1224

  20. [20]

    W.-C. Zhang, F.-P. Wang, L. Zhu, Z.-F. Zhou, Corner detection using Gabor filters, IET Image Processing 8 (11) (2014) 639–646

  21. [21]

    W. Zhang, C. Sun, T. Breckon, N. Alshammari, Discrete curvature representations for noise robust image corner detection, IEEE Transactions on Image Processing 28 (9) (2019) 4444–4459

  22. [22]

    W. Zhang, C. Sun, Corner detection using multi-directional structure tensor with multiple scales, International Journal of Computer Vision 128 (2) (2020) 438–459

  23. [23]

    W.-C. Zhang, P.-L. Shui, Contour-based corner detection via angle difference of principal directions of anisotropic Gaussian directional derivatives, Pattern Recognition 48 (9) (2015) 2785–2797

  24. [24]

    W. Zhang, Y. Zhao, T. P. Breckon, L. Chen, Noise robust image edge detection based upon the automatic anisotropic Gaussian kernels, Pattern Recognition 63 (2017) 193–205

  25. [25]

    P.-L. Shui, W.-C. Zhang, Noise-robust edge detector combining isotropic and anisotropic Gaussian kernels, Pattern Recognition 45 (2) (2012) 806–820

  26. [26]

    P.-L. Shui, W.-C. Zhang, Corner detection and classification using anisotropic directional derivative representations, IEEE Transactions on Image Processing 22 (8) (2013) 3204–3218. doi:10.1109/TIP.2013.2259834

  27. [27]

    G. Guo, J. Wan, W. Zhang, GAttenRNN: a recurrent neural network for spatio-temporal prediction learning based on gated transformer, International Journal of Machine Learning and Cybernetics 17 (5) (2026) 207

  28. [28]

    J. Song, A. Sowmya, W. Zhang, C. Sun, Efficient transformer with compressed attention for stereo image super-resolution, Knowledge-Based Systems (2025) 114844

  29. [29]

    Y. Liao, U. E. Akpudo, J. Zhang, Y. Gao, J. Zhou, W. Zeng, W. Zhang, Visual explanation via similar feature activation for metric learning, arXiv preprint arXiv:2506.01636 (2025)

  30. [30]

    J. Huang, X. Rao, W. Zhang, J. Song, X. Sun, Heart rate detection using motion compensation with multiple ROIs, in: Proceedings of the 2022 11th International Conference on Computing and Pattern Recognition, 2022, pp. 431–438

  31. [31]

    X. Tang, S. Cen, Z. Deng, Z. Zhang, Y. Meng, J. Xie, C. Tang, W. Zhang, G. Zhao, Cascading attention enhancement network for rgb-d indoor scene segmentation, Computer Vision and Image Understanding 259 (2025) 104411

  32. [32]

    J. Lu, W. Wu, K. Gao, P. Mao, W. Zhang, T. Wang, L. Ma, J. Guo, Z. Wu, Y. Hu, et al., Meningioma analysis and diagnosis using limited labeled samples, arXiv preprint arXiv:2602.13335 (2026)

  33. [33]

    J. Ren, Y. An, T. Lei, J. Yang, W. Zhang, Z. Pan, Y. Liao, Y. Gao, C. Sun, W. Zhang, Adaptive feature selection-based feature reconstruction network for few-shot learning, Pattern Recognition (2026) 112289

  34. [34]

    J. Lu, G. Peng, W. Zhang, C. Sun, Track-before-detect algorithm based on cost-reference particle filter bank for weak target detection, IEEE Access 11 (2023) 121688–121701

  35. [35]

    W. Zhang, Y. Zhao, Y. Gao, C. Sun, Re-abstraction and perturbing support pair network for few-shot fine-grained image classification, Pattern Recognition 148 (2024) 110158

  36. [36]

    J. Wang, J. Lu, J. Yang, M. Wang, W. Zhang, An unbiased feature estimation network for few-shot fine-grained image classification, Sensors 24 (23) (2024) 7737

  37. [37]

    T. Lei, W. Song, W. Zhang, X. Du, C. Li, L. He, A. K. Nandi, Semi-supervised 3-D medical image segmentation using multiconsistency learning with fuzzy perception-guided target selection, IEEE Transactions on Radiation and Plasma Medical Sciences 9 (4) (2024) 421–432

  38. [38]

    J. Jing, S. Liu, C. Liu, T. Gao, W. Zhang, C. Sun, A novel decision mechanism for image edge detection, in: International Conference on Intelligent Computing, Springer, 2021, pp. 274–287

  39. [39]

    M. Wang, B. Zheng, G. Wang, J. Yang, J. Lu, W. Zhang, A principal component analysis-based feature optimization network for few-shot fine-grained image classification, Mathematics 13 (7) (2025) 1098

  40. [40]

    Y. Liao, W. Zhang, Y. Gao, C. Sun, X. Yu, ASRSNet: Automatic salient region selection network for few-shot fine-grained image classification, in: International Conference on Pattern Recognition and Artificial Intelligence, Springer, 2022, pp. 627–638

  41. [41]

    J. Lu, W. Zhang, Y. Zhao, C. Sun, Image local structure information learning for fine-grained visual classification, Scientific Reports 12 (1) (2022) 19205

  42. [42]

    W. Zhang, X. Liu, Z. Xue, Y. Gao, C. Sun, NDPNet: A novel non-linear data projection network for few-shot fine-grained image classification, arXiv preprint arXiv:2106.06988 (2021)

  43. [43]

    J. Ren, C. Li, Y. An, W. Zhang, C. Sun, Few-shot fine-grained image classification: A comprehensive review, AI 5 (1) (2024) 405–425

  44. [44]

    Z. Pan, W. Zhang, X. Yu, M. Zhang, Y. Gao, Pseudo-set frequency refinement architecture for fine-grained few-shot class-incremental learning, Pattern Recognition 155 (2024) 110686

  45. [45]

    J. Ren, Y. Zhao, W. Zhang, C. Sun, Zero-shot incremental learning using spatial-frequency feature representations, Scientific Reports 15 (1) (2025) 10932

  46. [46]

    J. Wu, D. Chang, A. Sain, X. Li, Z. Ma, J. Cao, J. Guo, Y.-Z. Song, Bi-directional ensemble feature reconstruction network for few-shot fine-grained classification, IEEE Transactions on Pattern Analysis and Machine Intelligence 46 (9) (2024) 6082–6096

  47. [47]

    D. Wertheimer, L. Tang, B. Hariharan, Few-shot classification with feature map reconstruction networks, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021, pp. 8012–8021

  48. [48]

    R. R. Selvaraju, M. Cogswell, A. Das, R. Vedantam, D. Parikh, D. Batra, Grad-CAM: Visual explanations from deep networks via gradient-based localization, in: Proceedings of the IEEE International Conference on Computer Vision, 2017, pp. 618–626

  49. [49]

    Z. Zheng, H. Ren, Y. Wu, W. Zhang, H. Lu, Y. Yang, H. T. Shen, Fully unsupervised domain-agnostic image retrieval, IEEE Transactions on Circuits and Systems for Video Technology 34 (6) (2023) 5077–5090

  50. [50]

    J. Jing, C. Liu, W. Zhang, Y. Gao, C. Sun, ECFRNet: Effective corner feature representations network for image corner detection, Expert Systems with Applications 211 (2023) 118673

  51. [51]

    D. Wertheimer, B. Hariharan, Few-shot learning with localization in realistic settings, in: Proceedings of the Conference on Computer Vision and Pattern Recognition, 2019, pp. 6558–6567

  52. [52]

    X. Ruan, G. Lin, C. Long, S. Lu, Few-shot fine-grained classification with spatial attentive comparison, Knowledge-Based Systems 218 (2021) 106840

  53. [53]

    Y. Wang, Y. Ji, W. Wang, B. Wang, Bi-channel attention meta learning for few-shot fine-grained image recognition, Expert Systems with Applications 242 (2024) 122741

  54. [54]

    Z. Leng, M. Wang, Q. Wan, Y. Xu, B. Yan, S. Sun, Meta-learning of feature distribution alignment for enhanced feature sharing, Knowledge-Based Systems 296 (2024) 111875

  55. [55]

    Y. Wu, S. Chanda, M. Hosseinzadeh, Z. Liu, Y. Wang, Few-shot learning of compact models via task-specific meta distillation, in: Proceedings of the IEEE Winter Conference on Applications of Computer Vision, 2023, pp. 6265–6274

  56. [56]

    S. Xu, F. Zhang, X. Wei, J. Wang, Dual attention networks for few-shot fine-grained recognition, in: Proceedings of the Association for the Advancement of Artificial Intelligence, 2022

  57. [57]

    B. Zhang, J. Yuan, B. Li, T. Chen, J. Fan, B. Shi, Learning cross-image object semantic relation in transformer for few-shot fine-grained image classification, in: Proceedings of the ACM International Conference on Multimedia, 2022, pp. 2135–2144

  58. [58]

    C. Zhang, Y. Cai, G. Lin, C. Shen, DeepEMD: Differentiable earth mover's distance for few-shot learning, IEEE Transactions on Pattern Analysis and Machine Intelligence 45 (5) (2022) 5632–5648

  59. [59]

    Z.-X. Ma, Z.-D. Chen, L.-J. Zhao, Z.-C. Zhang, X. Luo, X.-S. Xu, Cross-layer and cross-sample feature optimization network for few-shot fine-grained image classification, in: Proceedings of Conference on Association for the Advancement of Artificial Intelligence, Vol. 38, 2024, pp. 4136–4144

  60. [60]

    Z.-X. Ma, Z.-D. Chen, T. Zheng, X. Luo, Z. Jia, X.-S. Xu, Few-shot fine-grained image classification with progressively feature refinement and continuous relationship modeling, in: Proceedings of the Association for the Advancement of Artificial Intelligence, Vol. 39, 2025, pp. 6036–6044

  61. [61]

    S. Yang, X. Li, D. Chang, Z. Ma, J.-H. Xue, Channel-spatial support-query cross-attention for fine-grained few-shot image classification, in: Proceedings of the ACM International Conference on Multimedia, 2024, pp. 9175–9183

  62. [62]

    S. Lee, W. Moon, H. S. Seong, J.-P. Heo, Task-oriented channel attention for fine-grained few-shot classification, IEEE Transactions on Pattern Analysis and Machine Intelligence 47 (3) (2025) 1448–1463

  63. [63]

    X. Long, X. Wang, C. Yang, Z. He, Q. He, X. Chen, ATARS: Adaptive task-aware feature learning for few-shot fine-grained classification, Knowledge-Based Systems 338 (2026) 115485

  64. [64]

    H. Cheng, S. Yang, J. T. Zhou, L. Guo, B. Wen, Frequency guidance matters in few-shot learning, in: Proceedings of the IEEE International Conference on Computer Vision, 2023, pp. 11814–11824

  65. [65]

    R. Zhou, J. Chen, Y. Shi, L. Wang, W. Wang, J. Sun, C. Zhang, Meta-exploiting frequency prior for cross-domain few-shot learning, in: Advances in Neural Information Processing Systems, 2024

  66. [66]

    Z. Ji, Z. Wang, X. Liu, Y. Yu, Y. Pang, J. Han, Frequency-spatial complementation: Unified channel-specific style attack for cross-domain few-shot learning, IEEE Transactions on Image Processing 34 (2025) 2242–2253

  67. [67]

    M. Cao, X. Chen, J. Zhang, Z. Wang, L. Zhang, Spatial–spectral–semantic cross-domain few-shot learning for hyperspectral image classification, IEEE Transactions on Geoscience and Remote Sensing 62 (2024) 1–15

  68. [68]

    J. Bao, J. Jing, W. Zhang, C. Liu, T. Gao, A corner detection method based on adaptive multi-directional anisotropic diffusion, Multimedia Tools and Applications 81 (20) (2022) 28729–28754

  69. [69]

    B. Xi, Y. Zhang, J. Li, Y. Huang, Y. Li, Z. Li, J. Chanussot, Transductive few-shot learning with enhanced spectral–spatial embedding for hyperspectral image classification, IEEE Transactions on Image Processing 34 (2025) 854–868

  70. [70]

    J. Snell, K. Swersky, R. Zemel, Prototypical networks for few-shot learning, in: Advances in Neural Information Processing Systems, 2017, pp. 4077–4087

  71. [71]

    K. Lee, S. Maji, A. Ravichandran, S. Soatto, Meta-learning with differentiable convex optimization, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019, pp. 10657–10665

  72. [72]

    S. Lee, W. Moon, J.-P. Heo, Task discrepancy maximization for fine-grained few-shot classification, in: Proceedings of the Conference on Computer Vision and Pattern Recognition, 2022, pp. 5331–5340

  73. [73]

    Z. Sun, W. Zheng, P. Guo, M. Wang, TST_MFL: Two-stage training based metric fusion learning for few-shot image classification, Information Fusion 113 (2025) 102611

  74. [74]

    L.-J. Zhao, Z.-D. Chen, Z.-X. Ma, X. Luo, X.-S. Xu, Angular isotonic loss guided multi-layer integration for few-shot fine-grained image classification, IEEE Transactions on Image Processing 33 (2024) 3778–3792

  75. [75]

    Z.-X. Ma, Z.-D. Chen, L.-J. Zhao, Z.-C. Zhang, T. Zheng, X. Luo, X.-S. Xu, Bi-directional task-guided network for few-shot fine-grained image classification, in: Proceedings of the ACM International Conference on Multimedia, 2024, pp. 8277–8286

  76. [76]

    H. Huang, J. Zhang, J. Zhang, J. Xu, Q. Wu, Low-rank pairwise alignment bilinear network for few-shot fine-grained image classification, IEEE Transactions on Multimedia 23 (2020) 1666–1680

  77. [77]

    C. Wang, H. Fu, H. Ma, PaCL: Part-level Contrastive Learning for Fine-grained Few-shot Image Classification, in: Proceedings of the ACM International Conference on Multimedia, 2022, pp. 6416–6424

  78. [78]

    L. Yu, Z. Guan, W. Zhao, Y. Yang, J. Tan, Adaptive task-aware refining network for few-shot fine-grained image classification, IEEE Transactions on Circuits and Systems for Video Technology 35 (3) (2025) 2301–2314

  79. [79]

    C. Qi, C. Ye, W. Lin, Z. Liu, J. Qiu, Binohem: Binocular singular Hellinger metametric for fine-grained few-shot classification, IEEE Transactions on Image Processing 34 (2025) 7264–7277

  80. [80]

    J. Xie, F. Long, J. Lv, Q. Wang, P. Li, Joint distribution matters: Deep Brownian distance covariance for few-shot classification, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2022, pp. 10865–10874

Showing first 80 references.