LoFT: Parameter-Efficient Fine-Tuning for Long-tailed Semi-Supervised Learning in Open-World Scenarios

Bing Su; Jiahao Chen; Zhiyuan Huang

arxiv: 2509.09926 · v5 · submitted 2025-09-12 · 💻 cs.LG · cs.CV

Pith reviewed 2026-05-18 17:58 UTC · model grok-4.3

keywords long-tailed semi-supervised learning · foundation models · parameter-efficient fine-tuning · open-world scenarios · generalization bounds · balanced posterior error · out-of-distribution samples

The pith

Fine-tuning foundation models reduces hypothesis complexity and tightens generalization bounds for long-tailed semi-supervised learning.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper argues that moving long-tailed semi-supervised learning from scratch training to foundation-model fine-tuning reduces hypothesis complexity, which tightens the generalization bound and lowers the balanced posterior error. It further argues that foundation-model features are more compact, which shrinks the region in which outliers are accepted and gives a geometric basis for robustness to noise and OOD data. Prior methods trained from scratch often suffer from overconfidence and low-quality pseudo-labels; the paper's position is that starting from rich pre-trained representations mitigates both. The work proposes the LoFT framework for parameter-efficient fine-tuning and LoFT-OW to handle open-world unlabeled data that may include out-of-distribution samples. If correct, this would mean better accuracy on imbalanced datasets with few labeled examples in the tail classes.

Core claim

Utilizing a foundation model significantly reduces the hypothesis complexity, which tightens the generalization bound and in turn minimizes the Balanced Posterior Error (BPE). The feature compactness of foundation models strictly compresses the acceptance region for outliers, providing a geometric guarantee for robustness. This insight leads to the LoFT framework for parameter-efficient fine-tuning in long-tailed semi-supervised learning and an extension LoFT-OW for open-world scenarios with potential OOD samples in unlabeled data.
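For orientation, complexity-based generalization bounds of the kind this claim appeals to usually take the shape below. This is the textbook Rademacher-complexity bound for a loss bounded in [0, 1], shown only as a sketch; it is not the bound derived in the paper, and the symbols (R for population risk, R̂_n for empirical risk on n samples, ℜ_n for Rademacher complexity, δ for the failure probability) are assumptions introduced here.

```latex
% Textbook uniform-convergence bound (illustrative; not the paper's statement).
% With probability at least 1 - \delta over an i.i.d. sample of size n,
% simultaneously for every hypothesis h in the class \mathcal{H}:
\[
  R(h) \;\le\; \widehat{R}_n(h)
  \;+\; 2\,\mathfrak{R}_n(\ell \circ \mathcal{H})
  \;+\; \sqrt{\frac{\log(1/\delta)}{2n}}
\]
```

Read this way, the claim is that restricting learning to a parameter-efficient update around a pre-trained model makes the middle (complexity) term smaller than it would be for a network trained from scratch, so the gap between empirical and population risk narrows.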

What carries the argument

Theoretical proofs that foundation models reduce hypothesis complexity and compress outlier acceptance regions, which motivate the parameter-efficient fine-tuning in the LoFT method.

Load-bearing premise

The proofs assume that foundation models provide feature compactness and reduced hypothesis complexity that improve the bounds in long-tailed semi-supervised learning even in the presence of pseudo-label noise.

What would settle it

Finding that the balanced posterior error does not decrease or that the outlier acceptance region does not shrink when using a foundation model instead of training from scratch on long-tailed datasets would falsify the central claim.
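The comparison described above needs a class-balanced error measure. The paper's exact definition of Balanced Posterior Error is not reproduced on this page, so the sketch below uses the standard balanced error rate (the mean of per-class error rates) as a stand-in; the function name `balanced_error` and the toy labels are hypothetical.

```python
import numpy as np

def balanced_error(y_true, y_pred, num_classes):
    """Mean of per-class error rates; classes absent from y_true are skipped."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    per_class = []
    for k in range(num_classes):
        mask = y_true == k
        if mask.any():
            # Fraction of class-k samples that were not predicted as k.
            per_class.append(np.mean(y_pred[mask] != k))
    return float(np.mean(per_class))

# Toy long-tailed labels: class 0 is the head, class 2 is the tail.
y_true = [0, 0, 0, 0, 0, 0, 1, 1, 1, 2]
y_pred = [0, 0, 0, 0, 0, 0, 1, 1, 0, 0]

plain_error = float(np.mean(np.asarray(y_true) != np.asarray(y_pred)))
bal_error = balanced_error(y_true, y_pred, num_classes=3)
print(plain_error, bal_error)
```

On this toy data the plain error rate looks modest because the head class dominates, while the balanced error is much higher because the single tail sample is misclassified. That gap is why a balanced measure is the relevant quantity when comparing a foundation model against a from-scratch baseline on long-tailed data.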

Figures

Figures reproduced from arXiv: 2509.09926 by Bing Su, Jiahao Chen, Zhiyuan Huang.

**Figure 2.** The reliability diagrams on (a) ImageNet-LT and (b) Places365-LT based on training from scratch and PEFT, respec…

**Figure 3.** Illustration of the proposed LoFT-OW. H(p, q) denotes the cross-entropy. … where a weakly augmented view is used to generate pseudo-labels, and a strongly augmented view is used to obtain logits for optimization. To better handle uncertain predictions, we partition unlabeled samples into high-confidence and low-confidence subsets based on their Maximum Softmax Probability (MSP), and apply different optimiza…

**Figure 4.** Visualizations of unlabeled samples and their pre…

**Figure 5.** Ablation studies on hyper-parameter cu. The horizontal axis represents the value of cu, and the vertical axis represents the accuracy. … Compared to PEFT, LoFT consistently achieves higher accuracy with both CLIP and OpenCLIP backbones, reaching 73.3% and 73.9%, respectively. These improvements over strong baselines and prior methods (e.g., FixMatch+CCL at 67.8%) highlight LoFT’s effectiveness beyond small…
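The Figure 3 text describes pseudo-labels taken from a weakly augmented view, logits from a strongly augmented view, and a split of unlabeled samples by Maximum Softmax Probability (MSP). A minimal NumPy sketch of that split follows, under stated assumptions: the threshold value 0.95, the function names, and the toy logits are hypothetical, and only the high-confidence cross-entropy H(p, q) is shown because the treatment of the low-confidence subset is truncated in the caption.

```python
import numpy as np

def log_softmax(logits):
    """Numerically stable log-softmax over the class axis."""
    z = logits - logits.max(axis=1, keepdims=True)
    return z - np.log(np.exp(z).sum(axis=1, keepdims=True))

def split_by_msp(weak_logits, threshold=0.95):
    """Pseudo-labels and a high-confidence mask from the weak view.

    `threshold` is an illustrative value, not taken from the paper.
    """
    probs = np.exp(log_softmax(weak_logits))
    msp = probs.max(axis=1)          # Maximum Softmax Probability
    pseudo = probs.argmax(axis=1)    # hard pseudo-label per sample
    high = msp >= threshold          # True = high-confidence subset
    return pseudo, msp, high

def pseudo_label_ce(strong_logits, pseudo, mask):
    """Cross-entropy H(p, q) with one-hot pseudo-labels p, averaged over mask."""
    if not mask.any():
        return 0.0
    nll = -log_softmax(strong_logits)[np.arange(len(pseudo)), pseudo]
    return float(nll[mask].mean())

# Toy logits for three unlabeled samples and three classes.
weak = np.array([[8.0, 0.0, 0.0], [1.0, 0.9, 0.8], [0.0, 0.0, 6.0]])
strong = np.array([[5.0, 1.0, 0.0], [0.5, 1.0, 0.2], [0.0, 2.0, 3.0]])

pseudo, msp, high = split_by_msp(weak)
loss_high = pseudo_label_ce(strong, pseudo, high)
low = ~high  # low-confidence subset; its objective is not specified here
print(pseudo, high, loss_high)
```

The second sample has nearly uniform weak-view probabilities, so it falls into the low-confidence subset and is excluded from the pseudo-label loss in this sketch.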

Original abstract

Long-tailed semi-supervised learning (LTSSL) presents a formidable challenge where models must overcome the scarcity of tail samples while mitigating the noise from unreliable pseudo-labels. Most prior LTSSL methods are designed to train models from scratch, which often leads to issues such as overconfidence and low-quality pseudo-labels. To address this problem, we first theoretically prove that utilizing a foundation model significantly reduces the hypothesis complexity, which tightens the generalization bound and in turn minimizes the Balanced Posterior Error (BPE). Furthermore, we demonstrate that the feature compactness of foundation models strictly compresses the acceptance region for outliers, providing a geometric guarantee for robustness. Motivated by these theoretical insights, we extend LTSSL into the foundation model fine-tuning paradigm and propose a novel framework: LoFT (Long-tailed semi-supervised learning via parameter-efficient Fine-Tuning). Furthermore, we explore a more practical setting by investigating semi-supervised learning under open-world conditions, where the unlabeled data may include out-of-distribution (OOD) samples. To handle this problem, we propose LoFT-OW (LoFT under Open-World scenarios) to improve the discriminative ability. Experimental results on multiple benchmarks demonstrate that our method achieves superior performance. Code is available at https://github.com/games-liker/LoFT
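The overconfidence that the abstract attributes to from-scratch training is commonly quantified with reliability diagrams (as in Figure 2) and the expected calibration error (ECE). The sketch below is a minimal NumPy version of equal-width-bin ECE; the function name, the bin count, and the toy inputs are assumptions for illustration and are not the paper's evaluation code.

```python
import numpy as np

def expected_calibration_error(confidences, correct, n_bins=10):
    """Weighted average of |accuracy - confidence| over equal-width bins."""
    confidences = np.asarray(confidences, dtype=float)
    correct = np.asarray(correct, dtype=float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        in_bin = (confidences > lo) & (confidences <= hi)
        if in_bin.any():
            gap = abs(correct[in_bin].mean() - confidences[in_bin].mean())
            ece += in_bin.mean() * gap  # in_bin.mean() is the bin's sample share
    return float(ece)

# Overconfident toy model: 95% confidence but only half the predictions correct.
ece_overconfident = expected_calibration_error([0.95] * 4, [1, 0, 1, 0])
# Calibrated toy model: 75% confidence and 75% accuracy.
ece_calibrated = expected_calibration_error([0.75] * 4, [1, 1, 1, 0])
print(ece_overconfident, ece_calibrated)
```

A reliability diagram plots the per-bin accuracy against the per-bin confidence; ECE summarizes the same gaps in a single number, so a lower value for the fine-tuned model than for the from-scratch model would be consistent with the calibration argument.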

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note · private letter to a colleague

LoFT shifts LTSSL to PEFT on foundation models with theory on complexity and compactness, but the bounds look vulnerable to pseudo-label noise.

This paper moves long-tailed semi-supervised learning from scratch training to parameter-efficient fine-tuning of foundation models, and adds an open-world variant that handles OOD samples in the unlabeled pool. The central contribution is the LoFT framework plus LoFT-OW, motivated by two theoretical claims: foundation models reduce hypothesis complexity enough to tighten generalization bounds and cut Balanced Posterior Error, and their feature compactness geometrically shrinks the acceptance region for outliers.

Referee Report

2 major / 2 minor

Summary. The paper addresses long-tailed semi-supervised learning (LTSSL) in open-world settings by leveraging foundation models via parameter-efficient fine-tuning. It claims to theoretically prove that foundation models reduce hypothesis complexity (tightening generalization bounds and minimizing Balanced Posterior Error or BPE) and that their feature compactness strictly compresses the outlier acceptance region for geometric robustness. Motivated by this, it proposes the LoFT framework and its open-world extension LoFT-OW, with experiments showing superior performance on benchmarks; code is released.

Significance. If the theoretical claims hold after accounting for pseudo-label noise and tail imbalance, the work could meaningfully advance LTSSL by providing a foundation-model-based paradigm with explicit generalization and robustness guarantees. The release of code supports reproducibility, which strengthens the contribution if the experiments are detailed and the bounds are made rigorous.

major comments (2)

[Theoretical Analysis / Abstract] Abstract and theoretical section: the central claim that foundation models reduce hypothesis complexity and tighten the generalization bound to minimize BPE does not incorporate a pseudo-label error term or tail-class imbalance. The derivations appear to treat the foundation-model properties as invariant to the semi-supervised fine-tuning objective; this is load-bearing because the actual regime uses noisy pseudo-labels on long-tailed data, so the claimed tightening may not transfer (see stress-test concern).
[Theoretical Analysis] Geometric argument: the demonstration that feature compactness strictly compresses the acceptance region for outliers lacks any analysis of how this guarantee persists after parameter-efficient fine-tuning on imbalanced data that may contain OOD samples. Without an explicit error term or invariance proof, the robustness claim for LoFT-OW is not yet supported.

minor comments (2)

[Experiments] Experiments section: the abstract states superior performance but provides no specific metrics, baselines, or error bars; these should be added with clear comparisons to prior LTSSL methods to substantiate the claims.
[Notation / Theory] Notation: ensure BPE and related quantities are defined consistently and that any equations in the theoretical section are numbered for easy reference.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive comments on our work. We address each of the major comments below and indicate the revisions we will make to the manuscript.

Referee: [Theoretical Analysis / Abstract] Abstract and theoretical section: the central claim that foundation models reduce hypothesis complexity and tighten the generalization bound to minimize BPE does not incorporate a pseudo-label error term or tail-class imbalance. The derivations appear to treat the foundation-model properties as invariant to the semi-supervised fine-tuning objective; this is load-bearing because the actual regime uses noisy pseudo-labels on long-tailed data, so the claimed tightening may not transfer (see stress-test concern).

Authors: We appreciate this observation. Our theoretical analysis focuses on the reduction in hypothesis complexity provided by the foundation model itself, which serves as a foundation for the subsequent fine-tuning. The parameter-efficient fine-tuning approach is designed to maintain proximity to the pre-trained model, thereby preserving the benefits in the generalization bound. However, we acknowledge that explicitly incorporating a pseudo-label error term and accounting for tail-class imbalance would provide a more complete picture. In the revised manuscript, we will extend the theoretical section to include a discussion of these factors and their impact on the balanced posterior error. We will also conduct additional stress tests to demonstrate the robustness of the bound under noisy pseudo-labels. revision: yes
Referee: [Theoretical Analysis] Geometric argument: the demonstration that feature compactness strictly compresses the acceptance region for outliers lacks any analysis of how this guarantee persists after parameter-efficient fine-tuning on imbalanced data that may contain OOD samples. Without an explicit error term or invariance proof, the robustness claim for LoFT-OW is not yet supported.

Authors: Thank you for highlighting this gap. The geometric argument is based on the inherent feature compactness of foundation models. To address how this persists after parameter-efficient fine-tuning, particularly under imbalance and potential OOD samples in the open-world setting, we will add an analysis in the revised version. This will include an invariance argument showing that the updates from fine-tuning do not substantially alter the compressed acceptance region, supported by an error term. This will bolster the theoretical support for LoFT-OW. revision: yes

Circularity Check

0 steps flagged

Theoretical proofs presented as self-contained first-principles results with no reduction to inputs or self-citations

The paper states it 'first theoretically prove[s]' that foundation models reduce hypothesis complexity (tightening the generalization bound and minimizing BPE) and that feature compactness 'strictly compresses the acceptance region for outliers'. These are framed as derivations internal to the manuscript rather than outputs of a fit, a renamed empirical pattern, or a load-bearing self-citation chain. No equations or sections in the provided text exhibit a step where the claimed result is equivalent to its own inputs by construction, nor is any prior work by the same authors invoked to justify uniqueness or an ansatz. The derivation chain therefore remains independent of the target claims.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The central claim rests on domain assumptions about foundation model properties rather than new free parameters or invented entities; full details of the proofs and any implicit assumptions are not visible in the abstract.

axioms (2)

domain assumption: Foundation models provide feature compactness that compresses outlier acceptance regions in LTSSL.
Invoked to provide a geometric guarantee for robustness against unreliable pseudo-labels and OOD samples.
domain assumption: Utilizing foundation models reduces hypothesis complexity and tightens generalization bounds for LTSSL.
Central to minimizing Balanced Posterior Error, as stated in the theoretical analysis.

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel · unclear
Paper passage: We first theoretically prove that utilizing a foundation model significantly reduces the hypothesis complexity, which tightens the generalization bound and in turn minimizes the Balanced Posterior Error (BPE).

IndisputableMonolith/Foundation/RealityFromDistinction.lean · reality_from_one_distinction · unclear
Paper passage: the feature compactness of foundation models strictly compresses the acceptance region for outliers

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

39 extracted references · 39 canonical work pages · 6 internal anchors

[3] Chen, S.; Ge, C.; Tong, Z.; Wang, J.; Song, Y.; Wang, J.; and Luo, P. 2022. Adaptformer: Adapting vision transformers for scalable visual recognition. Advances in Neural Information Processing Systems, 35: 16664–16678.
[4] Cui, Y.; Jia, M.; Lin, T.-Y.; Song, Y.; and Belongie, S. 2019. Class-balanced loss based on effective number of samples. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 9268–9277.
[5] Dong, B.; Zhou, P.; Yan, S.; and Zuo, W. 2022. Lpt: Long-tailed prompt tuning for image classification. arXiv preprint arXiv:2210.01033.
[6] Dosovitskiy, A.; Beyer, L.; Kolesnikov, A.; Weissenborn, D.; Zhai, X.; Unterthiner, T.; Dehghani, M.; Minderer, M.; Heigold, G.; Gelly, S.; et al. 2020. An image is worth 16x16 words: Transformers for image recognition at scale. arXiv preprint arXiv:2010.11929.
[7] Goodfellow, I. J.; Bulatov, Y.; Ibarz, J.; Arnoud, S.; and Shet, V. 2013. Multi-digit number recognition from street view imagery using deep convolutional neural networks. arXiv preprint arXiv:1312.6082.
[8] Guo, C.; Pleiss, G.; Sun, Y.; and Weinberger, K. Q. 2017. On calibration of modern neural networks. In International conference on machine learning, 1321–1330. PMLR.
[9] Hendrycks, D.; and Gimpel, K. 2016. A baseline for detecting misclassified and out-of-distribution examples in neural networks. arXiv preprint arXiv:1610.02136.
[10] Hendrycks, D.; Mazeika, M.; and Dietterich, T. 2018. Deep anomaly detection with outlier exposure. arXiv preprint arXiv:1812.04606.
[11] Hou, Y.; and Jia, Y. 2025. A Square Peg in a Square Hole: Meta-Expert for Long-Tailed Semi-Supervised Learning. arXiv preprint arXiv:2505.16341.
[12] Kang, B.; Xie, S.; Rohrbach, M.; Yan, Z.; Gordo, A.; Feng, J.; and Kalantidis, Y. 2019. Decoupling representation and classifier for long-tailed recognition. arXiv preprint arXiv:1910.09217.
[13] Krizhevsky, A.; Hinton, G.; et al. 2009. Learning multiple layers of features from tiny images.
[14] Le, Y.; and Yang, X. 2015. Tiny imagenet visual recognition challenge. CS 231N, 7(7): 3.
[15] Li, L.; Tao, B.; Han, L.; Zhan, D.-c.; and Ye, H.-j. 2024. Twice class bias correction for imbalanced semi-supervised learning. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 38, 13563–13571.
[16] Lin, T.-Y.; Maire, M.; Belongie, S.; Hays, J.; Perona, P.; Ramanan, D.; Dollár, P.; and Zitnick, C. L. 2014. Microsoft coco: Common objects in context. In European conference on computer vision, 740–755. Springer.
[17] Liu, K.; Fu, Z.; Jin, S.; Chen, C.; Chen, Z.; Jiang, R.; Zhou, F.; Chen, Y.; and Ye, J. 2024. Rethinking out-of-distribution detection on imbalanced data distribution. Advances in Neural Information Processing Systems, 37: 109152–109176.
[18] Liu, Z.; Miao, Z.; Zhan, X.; Wang, J.; Gong, B.; and Yu, S. X. 2019. Large-scale long-tailed recognition in an open world. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2537–2546.
[19] Ma, C.; Elezi, I.; Deng, J.; Dong, W.; and Xu, C. 2024. Three heads are better than one: Complementary experts for long-tailed semi-supervised learning. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 38, 14229–14237.
[20] Menon, A. K.; Jayasumana, S.; Rawat, A. S.; Jain, H.; Veit, A.; and Kumar, S. 2020. Long-tail learning via logit adjustment. arXiv preprint arXiv:2007.07314.
[21] Miao, W.; Pang, G.; Bai, X.; Li, T.; and Zheng, J. 2024. Out-of-distribution detection in long-tailed recognition with calibrated outlier class learning. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 38, 4216–4224.
[22] Ouali, Y.; Hudelot, C.; and Tami, M. 2020. An overview of deep semi-supervised learning. arXiv preprint arXiv:2006.05278.
[23] Peng, H.; Pian, W.; Sun, M.; and Li, P. 2023. Dynamic re-weighting for long-tailed semi-supervised learning. In Proceedings of the IEEE/CVF winter conference on applications of computer vision, 6464–6474.
[24] Radford, A.; Kim, J. W.; Hallacy, C.; Ramesh, A.; Goh, G.; Agarwal, S.; Sastry, G.; Askell, A.; Mishkin, P.; Clark, J.; et al. 2021. Learning transferable visual models from natural language supervision. In International conference on machine learning, 8748–8763. PMLR.
[25] Sanchez Aimar, E.; Helgesen, N.; Xu, Y.; Kuhlmann, M.; and Felsberg, M. 2024. Flexible Distribution Alignment: Towards Long-Tailed Semi-supervised Learning with Proper Calibration. In European Conference on Computer Vision, 307–327. Springer.
[26] Sanchez Aimar, E.; Jonnarth, A.; Felsberg, M.; and Kuhlmann, M. 2023. Balanced Product of Calibrated Experts for Long-Tailed Recognition. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 19967–19977.
[27] Shi, J.-X.; Wei, T.; Zhou, Z.; Shao, J.-J.; Han, X.-Y.; and Li, Y.-F. 2024. Long-Tail Learning with Foundation Model: Heavy Fine-Tuning Hurts. In Forty-first International Conference on Machine Learning.
[28] Sohn, K.; Berthelot, D.; Carlini, N.; Zhang, Z.; Zhang, H.; Raffel, C. A.; Cubuk, E. D.; Kurakin, A.; and Li, C.-L. 2020. Fixmatch: Simplifying semi-supervised learning with consistency and confidence. Advances in neural information processing systems, 33: 596–608.
[29] Tian, C.; Wang, W.; Zhu, X.; Dai, J.; and Qiao, Y. 2022. Vl-ltr: Learning class-wise visual-linguistic representation for long-tailed visual recognition. In European Conference on Computer Vision, 73–91. Springer.
[30] Tomani, C.; Gruber, S.; Erdem, M. E.; Cremers, D.; and Buettner, F. 2021. Post-Hoc Uncertainty Calibration for Domain Drift Scenarios. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 10124–10132.
[31] Wei, C.; Sohn, K.; Mellina, C.; Yuille, A.; and Yang, F. 2021. Crest: A class-rebalancing self-training framework for imbalanced semi-supervised learning. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, 10857–10866.
[32] Wei, T.; and Gan, K. 2023. Towards Realistic Long-Tailed Semi-Supervised Learning: Consistency Is All You Need. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 3469–3478.
[33] Xu, Z.; Chai, Z.; and Yuan, C. 2021. Towards calibrated model for long-tailed visual recognition from prior perspective. Advances in Neural Information Processing Systems, 34: 7139–7152.
[34] Yu, F.; Seff, A.; Zhang, Y.; Song, S.; Funkhouser, T.; and Xiao, J. 2015. Lsun: Construction of a large-scale image dataset using deep learning with humans in the loop. arXiv preprint arXiv:1506.03365.
[35] Zhang, H.; Cisse, M.; Dauphin, Y. N.; and Lopez-Paz, D. 2017. mixup: Beyond empirical risk minimization. arXiv preprint arXiv:1710.09412.
[36] Zheng, H.; Zhou, L.; Li, H.; Su, J.; Wei, X.; and Xu, X. 2024. Bem: Balanced and entropy-based mix for long-tailed semi-supervised learning. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 22893–22903.
[37] Zhong, Z.; Cui, J.; Liu, S.; and Jia, J. 2021. Improving calibration for long-tailed recognition. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, 16489–16498.
[38] Zhou, B.; Lapedriza, A.; Khosla, A.; Oliva, A.; and Torralba, A. 2017. Places: A 10 million Image Database for Scene Recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence.
[39] Zhou, Z.-H.; Fang, S.; Zhou, Z.-J.; Wei, T.; Wan, Y.; and Zhang, M.-L. 2024. Continuous contrastive learning for long-tailed semi-supervised recognition. Advances in Neural Information Processing Systems, 37: 51411–51435.
