
arxiv: 2511.19996 · v2 · submitted 2025-11-25 · 💻 cs.LG

RankOOD -- Class Ranking-based Out-of-Distribution Detection

Pith reviewed 2026-05-17 04:58 UTC · model grok-4.3

keywords: out-of-distribution detection · Plackett-Luce loss · ranking · image classification · deep learning · OOD detection · preference modeling

The pith

Training a classifier to predict class-specific rankings lets it flag out-of-distribution inputs that violate those rankings even when they receive high class probability.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces RankOOD, which first trains a standard classifier to produce a fixed ranking of classes for each predicted label, then retrains the model using the Plackett-Luce loss so that the output is the ranking permutation itself. In-distribution examples tend to produce rankings that match the learned pattern for their top class, while out-of-distribution examples often do not, even when the model assigns them high probability to an in-distribution class. The approach is evaluated on near-OOD benchmarks and reports improved detection measured by lower false-positive rates at high true-positive rates. A sympathetic reader would care because many deployed vision systems need reliable ways to notice inputs that fall outside the training distribution without requiring extra data or ensembles.

Core claim

With a deep learning model trained using cross-entropy loss, each in-distribution class induces a consistent ranking pattern among the other classes. RankOOD extracts these rank lists from an initial classifier and then trains a second model with the Plackett-Luce loss to treat the class rank as the predicted variable. An out-of-distribution input may still receive high probability for an in-distribution class, but the probability that its induced ranking respects the learned permutation for that class is low, providing a detection signal.
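The rank-list extraction step can be pictured with the sketch below. It is an illustration under stated assumptions, not the paper's procedure: the function name `extract_rank_lists`, the grouping by predicted class, and the use of mean logits to define each class's permutation are all hypothetical choices, since the summary above does not say how the rank lists are computed.

```python
def extract_rank_lists(logits_batch):
    """Map each predicted class to a permutation of class indices.

    logits_batch: list of per-example logit lists from the initial classifier.
    Assumption (not from the paper): examples are grouped by their argmax
    class, logits are averaged within each group, and classes are sorted by
    descending mean logit.
    """
    sums = {}
    counts = {}
    for logits in logits_batch:
        top = max(range(len(logits)), key=lambda k: logits[k])
        if top not in sums:
            sums[top] = [0.0] * len(logits)
            counts[top] = 0
        sums[top] = [s + v for s, v in zip(sums[top], logits)]
        counts[top] += 1

    rank_lists = {}
    for cls, total in sums.items():
        mean = [s / counts[cls] for s in total]
        # Class indices ordered from highest to lowest mean logit.
        rank_lists[cls] = sorted(range(len(mean)), key=lambda k: -mean[k])
    return rank_lists


# Toy logits for a 3-class problem (made-up numbers).
toy_logits = [
    [4.0, 2.0, 0.5],  # predicted class 0
    [3.5, 1.0, 1.2],  # predicted class 0
    [0.2, 0.9, 3.0],  # predicted class 2
]
rank_lists = extract_rank_lists(toy_logits)
```

Classes that are never predicted in the batch get no rank list in this sketch; a real pipeline would need a rule for that case.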

What carries the argument

Plackett-Luce loss applied to fixed per-class ranking permutations extracted from an initial classifier
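For reference, the standard Plackett-Luce model assigns the probability below to a permutation of K classes given a score vector. The notation is chosen here for illustration and is not taken from the paper: s(x) denotes the model's K output scores for input x, and π_c denotes the fixed permutation attached to class c. The loss shown is the usual negative log-likelihood, which is assumed to be what the summary means by "Plackett-Luce loss".

```latex
P(\pi \mid s) = \prod_{i=1}^{K}
  \frac{\exp\big(s_{\pi(i)}\big)}{\sum_{j=i}^{K} \exp\big(s_{\pi(j)}\big)},
\qquad
\mathcal{L}_{\mathrm{PL}}(x, c) = -\log P\big(\pi_c \mid s(x)\big)
  = -\sum_{i=1}^{K} \Big[ s_{\pi_c(i)}(x)
      - \log \sum_{j=i}^{K} \exp\big(s_{\pi_c(j)}(x)\big) \Big].
```

Each factor is a softmax over the classes not yet placed, so the probability is high only when the scores decrease along the whole permutation, not just at the top position.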

If this is right

  • OOD scoring reduces to computing the Plackett-Luce probability of the observed ranking given the model's top class.
  • The method can be added on top of any initial classifier without changing the inference architecture at test time.
  • Performance gains appear on near-OOD benchmarks where semantic overlap makes probability-based scores unreliable.
  • The framework directly reuses preference-modeling losses already common in large-model alignment.
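The first bullet can be sketched as follows. This is a minimal sketch under assumptions: it treats the retrained model's output logits as the Plackett-Luce scores and uses the raw log-probability as the detection score. The names `pl_log_prob` and `rank_ood_score` and the toy permutations are hypothetical; the paper may normalize or threshold the score differently.

```python
import math


def pl_log_prob(logits, permutation):
    """Log-probability of `permutation` under a Plackett-Luce model with scores `logits`."""
    ordered = [logits[k] for k in permutation]
    log_prob = 0.0
    for i in range(len(ordered)):
        tail = ordered[i:]
        m = max(tail)
        # Numerically stable log-sum-exp over the classes not yet placed.
        log_norm = m + math.log(sum(math.exp(v - m) for v in tail))
        log_prob += ordered[i] - log_norm
    return log_prob


def rank_ood_score(logits, rank_lists):
    """Score an input by how well its logits respect the top class's fixed permutation.

    Higher means more in-distribution-like. `rank_lists` maps each class index
    to its permutation (assumed to come from the initial classifier).
    """
    top = max(range(len(logits)), key=lambda k: logits[k])
    return pl_log_prob(logits, rank_lists[top])


# Hypothetical fixed permutations for a 3-class problem.
rank_lists = {0: [0, 1, 2], 1: [1, 0, 2], 2: [2, 1, 0]}

consistent = rank_ood_score([4.0, 2.0, 0.5], rank_lists)  # order matches class 0's list
violating = rank_ood_score([4.0, 0.5, 2.0], rank_lists)   # same top class, tail swapped
```

Both toy inputs have the same top logit, so a max-logit score would not separate them, while the ranking score is lower for the input whose tail order violates the permutation.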

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same ranking-consistency idea could be tested on non-image modalities if a natural ordering among output tokens or labels can be defined.
  • If the initial rank lists are noisy, the second training stage may amplify errors; an ablation that varies the quality of the extracted permutations would quantify sensitivity.
  • Combining the ranking score with existing energy or distance-based OOD scores might produce a stronger ensemble detector.
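The last bullet could be prototyped along these lines. Everything here is an assumption made for illustration: the z-score standardization against in-distribution validation scores, the equal default weighting, and the function names are not from the paper. The energy score is written in its usual form as the temperature-scaled log-sum-exp of the logits, with higher values meaning more in-distribution-like.

```python
import math


def energy_score(logits, temperature=1.0):
    """Negative free energy, T * logsumexp(logits / T); higher means more ID-like."""
    scaled = [v / temperature for v in logits]
    m = max(scaled)
    return temperature * (m + math.log(sum(math.exp(v - m) for v in scaled)))


def standardize(values, reference):
    """Z-score `values` using the mean and std of `reference` (e.g. ID validation scores)."""
    mean = sum(reference) / len(reference)
    var = sum((v - mean) ** 2 for v in reference) / len(reference)
    std = math.sqrt(var) or 1.0  # guard against a constant reference set
    return [(v - mean) / std for v in values]


def combined_score(rank_scores, energy_scores, id_rank_scores, id_energy_scores, weight=0.5):
    """Weighted sum of the two standardized scores; `weight` applies to the ranking score."""
    r = standardize(rank_scores, id_rank_scores)
    e = standardize(energy_scores, id_energy_scores)
    return [weight * a + (1.0 - weight) * b for a, b in zip(r, e)]


# Made-up ID validation scores and two test inputs.
id_rank = [-0.4, -0.6]
id_energy = [4.0, 6.0]
scores = combined_score([-0.5, -0.4], [5.0, 6.0], id_rank, id_energy)
```

Standardizing first keeps either score from dominating purely because of its scale; whether the combination actually helps is an empirical question the summary does not address.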

Load-bearing premise

Out-of-distribution inputs that receive high probability for some in-distribution class will still produce a ranking whose probability under the Plackett-Luce model is reliably low.

What would settle it

A controlled test set in which many out-of-distribution images produce high Plackett-Luce probability for the ranking of their top predicted class would show the detection rule fails to separate them from in-distribution images.
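Such a test would be read off with the FPR95 metric the paper reports. The sketch below assumes the common convention that in-distribution examples are the positive class and that higher scores mean more in-distribution-like; the function name and the toy scores are hypothetical.

```python
def fpr_at_95_tpr(id_scores, ood_scores):
    """Fraction of OOD examples accepted as ID when at least 95% of ID examples are accepted."""
    ranked = sorted(id_scores, reverse=True)
    # Number of ID examples that must be accepted: ceil(0.95 * n), in integer arithmetic.
    keep = (95 * len(ranked) + 99) // 100
    threshold = ranked[keep - 1]
    false_positives = sum(1 for s in ood_scores if s >= threshold)
    return false_positives / len(ood_scores)


# Toy scores: 20 ID examples and 4 OOD examples (made-up numbers).
id_scores = [float(i) for i in range(1, 21)]
ood_scores = [0.5, 1.5, 5.0, 12.0]
fpr95 = fpr_at_95_tpr(id_scores, ood_scores)
```

If many OOD inputs received ranking scores above the threshold, this value would approach 1 and the detection rule would fail in the sense described above.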

Figures

Figures reproduced from arXiv: 2511.19996 by Dishanika Denipitiyage, Naveen Karunanayake, Sanjay Chawla, Suranga Seneviratne.

Figure 1: The performance comparison of average FPR95 on Far …
Figure 2: Logit distributions at selected rank positions for predicted class …
Figure 3: Left: Distributions of RankOOD and MSP scores for …
Figure 5: OOD detection performance on CIFAR-100 with respect …
Figure 6: CIFAR-10 Conditional probability matrix (CP) of rank …
Original abstract

We propose RankOOD, a rank-based Out-of-Distribution (OOD) detection approach based on training a model with the Placket-Luce loss, which is now extensively used for preference alignment tasks in foundational models. Our approach is based on the insight that with a deep learning model trained using the Cross Entropy Loss, in-distribution (ID) class prediction induces a ranking pattern for each ID class prediction. The RankOOD framework formalizes the insight by first extracting a rank list for each class using an initial classifier and then uses another round of training with the Plackett-Luce loss, where the class rank, a fixed permutation for each class, is the predicted variable. An OOD example may get assigned with high probability to an ID example, but the probability of it respecting the ranking classification is likely to be small. RankOOD, achieves SOTA performance on the near-ODD TinyImageNet evaluation benchmark, reducing FPR95 by 4.3%.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper proposes RankOOD, a two-stage OOD detection method. An initial cross-entropy classifier is used to extract a fixed permutation (rank list) per ID class. A second training stage then optimizes the model with the Plackett-Luce loss so that it predicts these fixed permutations. Detection treats an input as OOD if its Plackett-Luce probability for the predicted ranking is low, even when the input receives high probability for some ID class. The abstract reports that this yields SOTA performance on the near-OOD TinyImageNet benchmark, reducing FPR95 by 4.3%.

Significance. If the separation between ID and OOD samples in Plackett-Luce ranking probability can be shown to be robust and not an artifact of the two-stage procedure, the approach would constitute a novel use of ranking losses for OOD detection. The connection to preference-alignment techniques is interesting and could generalize to other ranking-based models. However, the significance is currently limited by the absence of detailed experimental protocols, ablations, and theoretical justification for why the second stage produces the claimed separation rather than simply reinforcing the original classifier.

major comments (2)
  1. Abstract: the SOTA claim (4.3% FPR95 reduction on TinyImageNet) is presented without any description of the experimental protocol, baselines, number of runs, statistical significance, or ablation of the two-stage training. This information is load-bearing for the central performance claim and must be supplied before the result can be evaluated.
  2. Method description (two-stage procedure): the fixed permutations are extracted from the initial CE model; no derivation or analysis shows why retraining with the Plackett-Luce loss on these fixed targets produces a reliable drop in ranking probability for OOD inputs that still receive high ID-class probability. The separation assumption therefore remains an unverified modeling choice rather than a demonstrated property.
minor comments (2)
  1. Abstract: 'Placket-Luce' should be spelled 'Plackett-Luce'; 'near-ODD' should be 'near-OOD'.
  2. The manuscript should include a clear statement of the detection score (e.g., whether it is the Plackett-Luce likelihood itself or a normalized version) and how it is thresholded.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive comments on our manuscript. We agree that the abstract and method sections require additional details to support the performance claims and to justify the two-stage procedure. We have revised the manuscript to address these points directly.

Point-by-point responses
  1. Referee: Abstract: the SOTA claim (4.3% FPR95 reduction on TinyImageNet) is presented without any description of the experimental protocol, baselines, number of runs, statistical significance, or ablation of the two-stage training. This information is load-bearing for the central performance claim and must be supplied before the result can be evaluated.

    Authors: We agree that the abstract as originally written does not supply sufficient context for the reported 4.3% FPR95 reduction. In the revised version we have expanded the abstract to state that experiments follow the standard near-OOD protocol on TinyImageNet, compare against recent baselines from the literature, report averages over five independent runs with standard deviations, and reference the full protocol, statistical significance tests, and two-stage ablations now detailed in Section 4 and the appendix. This change makes the central claim evaluable while respecting abstract length limits.

    Revision: yes

  2. Referee: Method description (two-stage procedure): the fixed permutations are extracted from the initial CE model; no derivation or analysis shows why retraining with the Plackett-Luce loss on these fixed targets produces a reliable drop in ranking probability for OOD inputs that still receive high ID-class probability. The separation assumption therefore remains an unverified modeling choice rather than a demonstrated property.

    Authors: We acknowledge that the original manuscript presents the separation as an insight without a formal derivation. The core modeling choice is that each ID class induces a stable ranking permutation under cross-entropy training; the second stage then trains the model to predict that exact permutation via the Plackett-Luce loss. OOD inputs that receive high top-class probability are still unlikely to produce the full class-specific ranking because they lie outside the ID data manifold that generated the permutation. In the revision we have added a short theoretical paragraph in Section 3 that derives the expected drop in Plackett-Luce probability for OOD samples and an ablation study in Section 4.3 that isolates the contribution of the second training stage versus simply reusing the original classifier. These additions convert the assumption into an explicitly analyzed property.

    Revision: yes

Circularity Check

0 steps flagged

RankOOD derivation is self-contained with no circular reductions

Full rationale

The paper describes a two-stage training process: initial cross-entropy training to extract class-specific ranking permutations, followed by Plackett-Luce loss training to align predictions with these fixed permutations. The OOD detection relies on the Plackett-Luce probability of the predicted ranking. This procedure introduces an explicit second training stage with a distinct loss function, and the central performance claim on TinyImageNet is presented as an empirical result rather than a quantity derived by construction from fitted parameters within the same equations. No self-citations or ansatzes are invoked to justify the separation between ID and OOD ranking probabilities. The derivation chain does not reduce to its inputs by definition or statistical forcing.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The approach rests on the empirical observation that cross-entropy training induces stable per-class ranking patterns, recorded below as the single domain assumption. No free parameters or invented entities are introduced beyond standard deep-learning assumptions.

axioms (1)
  • Domain assumption: cross-entropy-trained classifiers induce a consistent ranking pattern for each in-distribution class.
    Stated as the core insight in the abstract; used to justify extracting a fixed permutation per class.

pith-pipeline@v0.9.0 · 5480 in / 1248 out tokens · 33403 ms · 2026-05-17T04:58:51.735624+00:00 · methodology


