RankOOD -- Class Ranking-based Out-of-Distribution Detection
Pith reviewed 2026-05-17 04:58 UTC · model grok-4.3
The pith
Training a classifier to predict class-specific rankings lets it flag out-of-distribution inputs that violate those rankings even when they receive high class probability.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
With a deep learning model trained using cross-entropy loss, each in-distribution class induces a consistent ranking pattern among the other classes. RankOOD extracts these rank lists from an initial classifier and then trains a second model with the Plackett-Luce loss to treat the class rank as the predicted variable. An out-of-distribution input may still receive high probability for an in-distribution class, but the probability that its induced ranking respects the learned permutation for that class is low, providing a detection signal.
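The first stage can be sketched roughly as follows. The text above does not say how the per-class rank list is aggregated from the initial classifier, so this sketch assumes one simple option: average the logits of all training samples that share a label and sort the classes by that mean. The function name `extract_rank_lists` and the mean-logit aggregation are illustrative assumptions, not the authors' procedure.

```python
from collections import defaultdict


def extract_rank_lists(logits, labels, num_classes):
    """Return one class permutation per label, ordered by mean logit (descending).

    Assumption: mean-logit aggregation is a hypothetical choice; the paper's
    actual extraction rule is not given in the text above.
    """
    sums = defaultdict(lambda: [0.0] * num_classes)
    counts = defaultdict(int)
    for row, label in zip(logits, labels):
        counts[label] += 1
        for k, value in enumerate(row):
            sums[label][k] += value

    rank_lists = {}
    for label, total in sums.items():
        means = [v / counts[label] for v in total]
        # Sort class indices from highest to lowest mean logit.
        rank_lists[label] = sorted(range(num_classes), key=lambda k: -means[k])
    return rank_lists


# Toy logits from a hypothetical 3-class initial classifier.
logits = [[4.0, 1.0, 2.0], [5.0, 0.5, 2.5], [0.2, 3.0, 1.0]]
labels = [0, 0, 1]
rank_lists = extract_rank_lists(logits, labels, num_classes=3)
```

Each resulting list is a fixed permutation that the second training stage would treat as its target for that class.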
What carries the argument
Plackett-Luce loss applied to fixed per-class ranking permutations extracted from an initial classifier
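As a minimal sketch of that loss, the Plackett-Luce negative log-likelihood of a target permutation can be computed from per-class scores as below. The function name `plackett_luce_nll` and the plain-Python implementation are illustrative assumptions; an actual training setup would use an autodiff framework.

```python
import math


def plackett_luce_nll(scores, permutation):
    """Negative log-likelihood of `permutation` under Plackett-Luce.

    scores[k] is the model's real-valued score (logit) for class k;
    permutation lists class indices from most to least preferred.
    """
    nll = 0.0
    for i, chosen in enumerate(permutation):
        remaining = [scores[k] for k in permutation[i:]]
        # Numerically stable log-sum-exp over the classes not yet placed.
        m = max(remaining)
        log_norm = m + math.log(sum(math.exp(s - m) for s in remaining))
        nll -= scores[chosen] - log_norm
    return nll


scores = [2.0, 0.5, 1.0]
loss_aligned = plackett_luce_nll(scores, [0, 2, 1])   # matches the score order
loss_reversed = plackett_luce_nll(scores, [1, 2, 0])  # opposite of the score order
```

The loss is smallest when the target permutation agrees with the descending order of the scores, which is what pushes the second-stage model toward the fixed per-class ranking.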
If this is right
- OOD scoring reduces to computing the Plackett-Luce probability of the observed ranking given the model's top class.
- The method can be added on top of any initial classifier without changing the inference architecture at test time.
- Performance gains appear on near-OOD benchmarks where semantic overlap makes probability-based scores unreliable.
- The framework directly reuses preference-modeling losses already common in large-model alignment.
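The scoring rule in the first bullet can be sketched as follows, assuming the fixed permutations are available as a dictionary keyed by class index. The names `pl_log_prob`, `rank_ood_score`, and `rank_lists` are hypothetical, and the toy permutations are made up for illustration.

```python
import math


def pl_log_prob(scores, permutation):
    """Log-probability of a full class permutation under Plackett-Luce."""
    total = 0.0
    for i, chosen in enumerate(permutation):
        rest = permutation[i:]
        m = max(scores[k] for k in rest)
        log_norm = m + math.log(sum(math.exp(scores[k] - m) for k in rest))
        total += scores[chosen] - log_norm
    return total


def rank_ood_score(scores, rank_lists):
    """Higher means more in-distribution-like.

    Looks up the fixed permutation of the top-scoring class and returns its
    Plackett-Luce log-probability under the current scores.
    """
    top_class = max(range(len(scores)), key=lambda k: scores[k])
    return pl_log_prob(scores, rank_lists[top_class])


# Hypothetical fixed permutations for a 3-class problem.
rank_lists = {0: [0, 2, 1], 1: [1, 2, 0], 2: [2, 0, 1]}
consistent = rank_ood_score([4.0, 0.5, 2.0], rank_lists)  # scores order 0 > 2 > 1
violating = rank_ood_score([4.0, 2.0, 0.5], rank_lists)   # scores order 0 > 1 > 2
```

Both toy inputs assign the same top score to class 0, yet the second one orders the remaining classes against the stored permutation and therefore receives a lower score.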
Where Pith is reading between the lines
- The same ranking-consistency idea could be tested on non-image modalities if a natural ordering among output tokens or labels can be defined.
- If the initial rank lists are noisy, the second training stage may amplify errors; an ablation that varies the quality of the extracted permutations would quantify sensitivity.
- Combining the ranking score with existing energy or distance-based OOD scores might produce a stronger ensemble detector.
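One way the suggested permutation-quality ablation could be set up is sketched below: corrupt each extracted rank list with a controlled number of random adjacent swaps and record how far the result is from the clean list. The noise model, the helper names, and the use of Kendall tau distance as the quality measure are all assumptions made for illustration.

```python
import random


def perturb_permutation(permutation, num_swaps, rng):
    """Return a copy with `num_swaps` random adjacent transpositions applied."""
    noisy = list(permutation)
    for _ in range(num_swaps):
        i = rng.randrange(len(noisy) - 1)
        noisy[i], noisy[i + 1] = noisy[i + 1], noisy[i]
    return noisy


def kendall_tau_distance(a, b):
    """Number of class pairs that the two permutations order differently."""
    position_in_b = {item: idx for idx, item in enumerate(b)}
    mapped = [position_in_b[item] for item in a]
    return sum(
        1
        for i in range(len(mapped))
        for j in range(i + 1, len(mapped))
        if mapped[i] > mapped[j]
    )


rng = random.Random(0)  # fixed seed so the sketch is repeatable
clean = list(range(10))
noisy = perturb_permutation(clean, num_swaps=5, rng=rng)
distance = kendall_tau_distance(clean, noisy)
```

Sweeping `num_swaps` and retraining the second stage on the perturbed lists would give a sensitivity curve of detection quality against rank-list noise.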
Load-bearing premise
Out-of-distribution inputs that receive high probability for some in-distribution class will still produce a ranking whose probability under the Plackett-Luce model is reliably low.
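Written out, the premise concerns the quantity below. The notation is chosen here for illustration: s_k(x) is the second-stage model's score for class k, K is the number of in-distribution classes, and \pi_c is the fixed permutation stored for class c.

```latex
\hat{c}(x) = \arg\max_{k} s_k(x), \qquad
P\big(\pi_{\hat{c}} \mid x\big)
  = \prod_{i=1}^{K}
    \frac{\exp\big(s_{\pi_{\hat{c}}(i)}(x)\big)}
         {\sum_{j=i}^{K} \exp\big(s_{\pi_{\hat{c}}(j)}(x)\big)}
```

The premise is that this probability is reliably smaller for out-of-distribution x than for in-distribution x, even when the softmax probability of \hat{c}(x) is high.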
What would settle it
A controlled test set in which many out-of-distribution images produce high Plackett-Luce probability for the ranking of their top predicted class would show the detection rule fails to separate them from in-distribution images.
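Such a test is commonly summarized with FPR95, the metric quoted in the abstract. A minimal sketch is given below, assuming that higher scores mean more in-distribution-like and that the threshold is placed so at least 95% of in-distribution samples are accepted; the function name and the toy scores are illustrative.

```python
def fpr_at_95_tpr(id_scores, ood_scores):
    """Fraction of OOD samples accepted when about 95% of ID samples are accepted.

    Convention assumed here: higher score means more in-distribution-like.
    """
    ordered = sorted(id_scores)
    # Index of the 5th percentile, so at least 95% of ID scores are >= threshold.
    cut = len(ordered) * 5 // 100
    threshold = ordered[cut]
    accepted_ood = sum(1 for s in ood_scores if s >= threshold)
    return accepted_ood / len(ood_scores)


# Toy scores: 100 ID samples and 4 OOD samples.
id_scores = [float(v) for v in range(100)]
ood_scores = [1.0, 3.0, 50.0, 80.0]
fpr95 = fpr_at_95_tpr(id_scores, ood_scores)
```

A failure of the detection rule would show up as an FPR95 close to 1, meaning most out-of-distribution inputs score above the threshold.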
Original abstract
We propose RankOOD, a rank-based Out-of-Distribution (OOD) detection approach based on training a model with the Placket-Luce loss, which is now extensively used for preference alignment tasks in foundational models. Our approach is based on the insight that with a deep learning model trained using the Cross Entropy Loss, in-distribution (ID) class prediction induces a ranking pattern for each ID class prediction. The RankOOD framework formalizes the insight by first extracting a rank list for each class using an initial classifier and then uses another round of training with the Plackett-Luce loss, where the class rank, a fixed permutation for each class, is the predicted variable. An OOD example may get assigned with high probability to an ID example, but the probability of it respecting the ranking classification is likely to be small. RankOOD, achieves SOTA performance on the near-ODD TinyImageNet evaluation benchmark, reducing FPR95 by 4.3%.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper proposes RankOOD, a two-stage OOD detection method. An initial cross-entropy classifier is used to extract a fixed permutation (rank list) per ID class. A second training stage then optimizes the model with the Plackett-Luce loss so that it predicts these fixed permutations. Detection treats an input as OOD if its Plackett-Luce probability for the predicted ranking is low, even when the input receives high probability for some ID class. The abstract reports that this yields SOTA performance on the near-OOD TinyImageNet benchmark, reducing FPR95 by 4.3%.
Significance. If the separation between ID and OOD samples in Plackett-Luce ranking probability can be shown to be robust and not an artifact of the two-stage procedure, the approach would constitute a novel use of ranking losses for OOD detection. The connection to preference-alignment techniques is interesting and could generalize to other ranking-based models. However, the significance is currently limited by the absence of detailed experimental protocols, ablations, and theoretical justification for why the second stage produces the claimed separation rather than simply reinforcing the original classifier.
major comments (2)
- Abstract: the SOTA claim (4.3% FPR95 reduction on TinyImageNet) is presented without any description of the experimental protocol, baselines, number of runs, statistical significance, or ablation of the two-stage training. This information is load-bearing for the central performance claim and must be supplied before the result can be evaluated.
- Method description (two-stage procedure): the fixed permutations are extracted from the initial CE model; no derivation or analysis shows why retraining with the Plackett-Luce loss on these fixed targets produces a reliable drop in ranking probability for OOD inputs that still receive high ID-class probability. The separation assumption therefore remains an unverified modeling choice rather than a demonstrated property.
minor comments (2)
- Abstract: 'Placket-Luce' should be spelled 'Plackett-Luce'; 'near-ODD' should be 'near-OOD'.
- The manuscript should include a clear statement of the detection score (e.g., whether it is the Plackett-Luce likelihood itself or a normalized version) and how it is thresholded.
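To make the distinction in the last comment concrete, one possible normalized score is sketched below: the log-ratio between the ranking's Plackett-Luce probability and the probability 1/K! of a uniformly random ranking, followed by a simple threshold test. This normalization and the helper names are assumptions for illustration; the manuscript's actual score is not specified in the text above.

```python
import math


def normalize_against_uniform(pl_log_prob, num_classes):
    """Log-ratio of the ranking's probability to that of a uniformly random ranking.

    A uniform ranking over K classes has probability 1/K!, so the ratio is
    pl_log_prob + log(K!). This normalization is an illustrative assumption.
    """
    return pl_log_prob + math.lgamma(num_classes + 1)


def is_ood(score, threshold):
    """Flag the input as OOD when its score falls below the threshold."""
    return score < threshold


# A ranking exactly as likely as a random one over 5 classes normalizes to 0.
uniform_case = normalize_against_uniform(-math.log(120), num_classes=5)
flagged = is_ood(uniform_case, threshold=1.0)
```

The raw likelihood shrinks as the number of classes grows, so a normalized version of this kind keeps thresholds comparable across label spaces of different size.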
Simulated Author's Rebuttal
We thank the referee for the constructive comments on our manuscript. We agree that the abstract and method sections require additional details to support the performance claims and to justify the two-stage procedure. We have revised the manuscript to address these points directly.
Point-by-point responses
- Referee: Abstract: the SOTA claim (4.3% FPR95 reduction on TinyImageNet) is presented without any description of the experimental protocol, baselines, number of runs, statistical significance, or ablation of the two-stage training. This information is load-bearing for the central performance claim and must be supplied before the result can be evaluated.
  Authors: We agree that the abstract as originally written does not supply sufficient context for the reported 4.3% FPR95 reduction. In the revised version we have expanded the abstract to state that experiments follow the standard near-OOD protocol on TinyImageNet, compare against recent baselines from the literature, report averages over five independent runs with standard deviations, and reference the full protocol, statistical significance tests, and two-stage ablations now detailed in Section 4 and the appendix. This change makes the central claim evaluable while respecting abstract length limits.
  Revision: yes
- Referee: Method description (two-stage procedure): the fixed permutations are extracted from the initial CE model; no derivation or analysis shows why retraining with the Plackett-Luce loss on these fixed targets produces a reliable drop in ranking probability for OOD inputs that still receive high ID-class probability. The separation assumption therefore remains an unverified modeling choice rather than a demonstrated property.
  Authors: We acknowledge that the original manuscript presents the separation as an insight without a formal derivation. The core modeling choice is that each ID class induces a stable ranking permutation under cross-entropy training; the second stage then trains the model to predict that exact permutation via the Plackett-Luce loss. OOD inputs that receive high top-class probability are still unlikely to produce the full class-specific ranking because they lie outside the ID data manifold that generated the permutation. In the revision we have added a short theoretical paragraph in Section 3 that derives the expected drop in Plackett-Luce probability for OOD samples and an ablation study in Section 4.3 that isolates the contribution of the second training stage versus simply reusing the original classifier. These additions convert the assumption into an explicitly analyzed property.
  Revision: yes
Circularity Check
RankOOD derivation is self-contained with no circular reductions
The paper describes a two-stage training process: initial cross-entropy training to extract class-specific ranking permutations, followed by Plackett-Luce loss training to align predictions with these fixed permutations. The OOD detection relies on the Plackett-Luce probability of the predicted ranking. This procedure introduces an explicit second training stage with a distinct loss function, and the central performance claim on TinyImageNet is presented as an empirical result rather than a quantity derived by construction from fitted parameters within the same equations. No self-citations or ansatzes are invoked to justify the separation between ID and OOD ranking probabilities. The derivation chain does not reduce to its inputs by definition or statistical forcing.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption: Cross-entropy-trained classifiers induce a consistent ranking pattern for each in-distribution class.
Lean theorems connected to this paper
- IndisputableMonolith/Cost/FunctionalEquation.lean, washburn_uniqueness_aczel (tag: unclear)
  Relation between the paper passage and the cited Recognition theorem: unclear.
  Paper passage: "We employ a Listwise Maximum Likelihood Estimation (ListMLE) objective to learn the rank structure... under the Plackett–Luce model"
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.