pith. machine review for the scientific record.

arxiv: 2602.02409 · v3 · submitted 2026-02-02 · 💻 cs.CV

Recognition: 2 Lean theorem links

Catalyst: Out-of-Distribution Detection via Elastic Scaling

Authors on Pith: no claims yet

Pith reviewed 2026-05-16 07:55 UTC · model grok-4.3

classification 💻 cs.CV
keywords out-of-distribution detection · post-hoc method · elastic scaling · pre-pooling feature map · channel-wise statistics · false positive rate · ResNet · CIFAR-10

The pith

Catalyst improves out-of-distribution detection by multiplicatively scaling baseline scores with an input-dependent factor derived from pre-pooling channel-wise statistics.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper contends that standard post-hoc OOD methods discard valuable information by depending solely on output logits or globally averaged features. Catalyst recovers this signal by deriving an input-specific scaling factor gamma from the mean, standard deviation, and maximum activations across channels in the feature map before pooling. This factor then elastically scales any baseline score, increasing the separation between in-distribution and out-of-distribution examples. The method is shown to work with various detectors and yields large reductions in false positive rates on CIFAR-10, CIFAR-100, and ImageNet benchmarks using ResNet architectures. It requires no retraining and is presented as complementary to prior approaches.

Core claim

Catalyst is a post-hoc framework that computes an input-dependent scaling factor (γ) on-the-fly from the raw channel-wise statistics of the pre-pooling feature map and fuses it multiplicatively with existing baseline OOD scores to push the ID and OOD distributions further apart.

What carries the argument

Elastic scaling via an input-dependent factor γ computed from channel-wise mean, standard deviation, and maximum activation of the pre-pooling feature map, which multiplicatively modulates baseline scores.
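The paper's equations are not reproduced on this page, so the exact aggregation and fusion rule below are assumptions: this minimal sketch takes γ as a simple average of the three channel-statistic cues and fuses it by plain multiplication. The names `catalyst_gamma` and `elastic_scale` are illustrative, not the authors' released code.

```python
import numpy as np

def catalyst_gamma(feature_map: np.ndarray) -> float:
    """Illustrative gamma from pre-pooling channel statistics.

    feature_map: (C, H, W) activations from the penultimate block.
    Gamma here is a simple average of the three cues' channel means;
    the paper's exact aggregation may differ.
    """
    per_channel = feature_map.reshape(feature_map.shape[0], -1)  # (C, H*W)
    mu = per_channel.mean(axis=1)     # channel-wise mean
    sigma = per_channel.std(axis=1)   # channel-wise standard deviation
    mx = per_channel.max(axis=1)      # channel-wise maximum activation
    return float((mu.mean() + sigma.mean() + mx.mean()) / 3.0)

def elastic_scale(baseline_score: float, gamma: float) -> float:
    """Multiplicative fusion: an input whose gamma is larger gets its
    baseline score amplified, widening the ID/OOD gap."""
    return gamma * baseline_score
```

The key property is that γ is computed per input at inference time, with no fitted parameters, so it can wrap any existing baseline score.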

If this is right

  • Integrates seamlessly with logit-based methods such as Energy, ReAct, and SCALE.
  • Provides significant boosts to distance-based detectors like KNN.
  • Reduces average false positive rate by 32.87% on CIFAR-10 with ResNet-18.
  • Achieves 27.94% reduction on CIFAR-100 and 22.25% on ImageNet with ResNet-50.
  • Demonstrates that pre-pooling statistics offer complementary signal to GAP and logits.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Global average pooling may be discarding per-channel distributional information that is useful for distinguishing OOD inputs.
  • Detectors could be redesigned to use these statistics directly instead of post-hoc scaling.
  • The approach might generalize to other vision architectures or even non-vision domains where intermediate activations are available.
  • It suggests revisiting intermediate layer representations for other safety tasks in neural networks.

Load-bearing premise

The raw channel-wise statistics of the pre-pooling feature map contain a rich complementary signal that is systematically discarded by global average pooling and logit-based scoring.

What would settle it

Running Catalyst on a held-out OOD benchmark with a ResNet backbone and finding that the Catalyst-scaled scores yield a higher false positive rate (FPR) than the unscaled baseline would falsify the improvement claim.
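That test can be sketched with the standard FPR-at-95%-TPR metric. The helper names are hypothetical, and the convention below assumes higher scores mean in-distribution:

```python
import numpy as np

def fpr_at_95_tpr(id_scores: np.ndarray, ood_scores: np.ndarray) -> float:
    """FPR at the threshold that admits 95% of ID samples.

    Convention: higher score = more in-distribution.
    """
    thresh = np.percentile(id_scores, 5)         # keep the top 95% of ID
    return float((ood_scores >= thresh).mean())  # OOD wrongly admitted

def catalyst_helps(id_base, ood_base, id_scaled, ood_scaled) -> bool:
    """The improvement claim is falsified if the scaled scores give a
    higher FPR than the unscaled baseline."""
    return fpr_at_95_tpr(id_scaled, ood_scaled) <= fpr_at_95_tpr(id_base, ood_base)
```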

Figures

Figures reproduced from arXiv: 2602.02409 by Abid Hassan, Nenad Medvidovic, Saad Shafiq, Tuan Ngo.

Figure 1. Information cues from each channel before the penultimate layer of a ResNet-50 trained on ImageNet-1k, evaluated with Texture as the OOD dataset. The x-axis shows channel indices; the y-axis shows cue strength. Left to right: (a) µ(x): mean activation, (b) σ(x): standard deviation, (c) max(x): dominant activation, and (d) H(x): entropy per channel. Existing methods have under-explored these distinctiv…
Figure 2. Illustration of Catalyst's effectiveness. The model is ResNet-50 trained on ImageNet-1k, evaluated on Texture (OOD). Here, γ computed from the channel-maximum statistic (m) is applied multiplicatively to the baseline ReAct. (a) The unscaled score distribution shows more significant overlap than (b) the Catalyst-scaled score distribution.
Figure 3. Sensitivity analysis of the clipping percentile (p) on Catalyst(m) performance. All values are averaged over four OOD test datasets for a ResNet-50 (ImageNet).
Figure 4. Distribution of the scaling factor γ from the penultimate layer of a ResNet-50 trained on ImageNet-1k, evaluated with Texture as the OOD dataset. The scaling factors show clear separation between ID and OOD samples. Left to right: (a) µ(x): mean, (b) σ(x): standard deviation, (c) max(x): max.
Figure 5. Distributions of the scaling factor γ, derived from the penultimate layer of a MobileNet-V2 model trained on ImageNet-1k. The rows (top to bottom) correspond to the OOD datasets SUN, Places365, Texture, and iNaturalist. The columns (left to right) correspond to the statistical cue used to compute γ: (a) mean: µ(x), (b) standard deviation: σ(x), and (c) maximum value: max(x)…
Figure 6. Distribution of the scaling factor γ from the penultimate layer of a ResNet-18 trained on CIFAR-100, evaluated with Places365 as the OOD dataset. The scaling factors show high overlap between ID and OOD samples. Left to right: (a) µ(x): mean, (b) σ(x): standard deviation, (c) max(x): max.
Figure 7. Distribution of the scaling factor γ from the penultimate layer of a DenseNet-101 trained on CIFAR-100, evaluated with Places365 as the OOD dataset. The scaling factors show high overlap between ID and OOD samples. Left to right: (a) µ(x): mean, (b) σ(x): standard deviation, (c) max(x): max.
Figure 8. Distributions of the scaling factor γ computed from the four residual stages. The model was trained on ImageNet-1K (ID) and evaluated against Texture (OOD). (a–c) The γ distributions from the early-to-mid stages (Layer 1 to Layer 3) show significant overlap between ID and OOD samples, rendering them ineffective as a discriminative signal. (d) In sharp contrast, the distribution from the final residua…
Figure 9. Distribution of the scaling factor γ computed using the median statistic. The model is a DenseNet-101 trained on CIFAR-100 (ID), evaluated against the SVHN dataset (OOD). The plot reveals that the OOD distribution is shifted to the right of the ID distribution, indicating that OOD samples produce a higher γ value than ID samples. This contradicts the core assumption of our method, leading to degraded OOD…
Figure 10. Superior OOD separation of γ_entropy as a standalone score on CIFAR-100. The model is a ResNet-18 trained on CIFAR-100 (ID), evaluated against the Texture dataset (OOD). (Left) Significant distribution overlap between ID and OOD using the baseline Energy score. (Right) Dramatically improved separation using the standalone γ_entropy score. This visualization confirms the finding from…
Original abstract

Out-of-distribution (OOD) detection is critical for the safe deployment of deep neural networks. State-of-the-art post-hoc methods typically derive OOD scores from the output logits or penultimate feature vector obtained via global average pooling (GAP). We contend that this exclusive reliance on the logit or feature vector discards a rich, complementary signal: the raw channel-wise statistics of the pre-pooling feature map lost in GAP. In this paper, we introduce Catalyst, a post-hoc framework that exploits these under-explored signals. Catalyst computes an input-dependent scaling factor ($\gamma$) on-the-fly from these raw statistics (e.g., mean, standard deviation, and maximum activation). This $\gamma$ is then fused with the existing baseline score, multiplicatively modulating it -- an $\textit{elastic scaling}$ -- to push the ID and OOD distributions further apart. We demonstrate Catalyst is a generalizable framework: it seamlessly integrates with logit-based methods (e.g., Energy, ReAct, SCALE) and also provides a significant boost to distance-based detectors like KNN. As a result, Catalyst achieves substantial and consistent performance gains, reducing the average False Positive Rate by 32.87 on CIFAR-10 (ResNet-18), 27.94% on CIFAR-100 (ResNet-18), and 22.25% on ImageNet (ResNet-50). Our results highlight the untapped potential of pre-pooling statistics and demonstrate that Catalyst is complementary to existing OOD detection approaches. Our code is available here: https://github.com/bingabid/Catalyst

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 1 minor

Summary. The paper introduces Catalyst, a post-hoc OOD detection framework that computes an input-dependent scaling factor γ from raw channel-wise statistics (mean, std, max) of the pre-pooling feature map. This γ is multiplicatively fused with baseline scores (e.g., Energy, ReAct, SCALE, KNN) via elastic scaling to increase separation between in-distribution and out-of-distribution samples. Experiments on CIFAR-10/100 (ResNet-18) and ImageNet (ResNet-50) report average FPR reductions of 32.87%, 27.94%, and 22.25% respectively, with code released for reproducibility.

Significance. If the gains are shown to arise specifically from the pre-pooling channel statistics rather than generic input-dependent modulation, the work would be significant by identifying an under-exploited signal complementary to logits and GAP features. The method is simple, training-free, and generalizes across logit-based and distance-based detectors, offering a practical enhancement for safe deployment of DNNs.

major comments (2)
  1. [Method (γ computation and elastic scaling)] The central claim that raw channel-wise statistics of the pre-pooling feature map supply a rich complementary signal systematically lost by GAP (abstract and method description) is load-bearing but unsupported by controls. No experiments replace the statistic-derived γ with a random draw from the same range or a simple function of the baseline score alone; without these, the reported FPR reductions cannot be attributed to the specific statistics rather than any input-dependent scaling.
  2. [Experiments] §4 (experiments): the reported average FPR reductions (32.87 on CIFAR-10, etc.) are presented without error bars, ablation on the choice of statistics (mean/std/max), or statistical significance tests. This undermines assessment of whether the gains are robust or subject to post-hoc selection across the three datasets and two architectures.
minor comments (1)
  1. [Abstract] Abstract: the phrase 'reducing the average False Positive Rate by 32.87' should specify units (e.g., percentage points) and the exact baseline method for each number to avoid ambiguity.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback. We address each major comment point-by-point below. We agree that the suggested controls and statistical analyses would strengthen the paper and have revised the manuscript to include them.

Point-by-point responses
  1. Referee: [Method (γ computation and elastic scaling)] The central claim that raw channel-wise statistics of the pre-pooling feature map supply a rich complementary signal systematically lost by GAP (abstract and method description) is load-bearing but unsupported by controls. No experiments replace the statistic-derived γ with a random draw from the same range or a simple function of the baseline score alone; without these, the reported FPR reductions cannot be attributed to the specific statistics rather than any input-dependent scaling.

    Authors: We agree that explicit controls are required to isolate the contribution of the pre-pooling channel statistics. In the revised manuscript we add two sets of controls: (1) γ is replaced by a random scalar drawn uniformly from the empirical range of observed γ values on the same dataset, and (2) γ is replaced by a simple monotonic function of the baseline score alone (e.g., γ = 1 + 0.1 × baseline). The new results, reported in an expanded Section 4.3 and Table 3, show that only the statistic-derived γ produces the claimed FPR reductions; the random and baseline-only variants yield negligible or negative gains. These additions directly support the central claim without altering the original method. revision: yes
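The two controls described here can be sketched as follows. Synthetic NumPy arrays stand in for real scores, and `random_gamma_control` and `baseline_only_control` are illustrative names, not the authors' revised code:

```python
import numpy as np

def random_gamma_control(scores: np.ndarray, gammas: np.ndarray,
                         rng: np.random.Generator) -> np.ndarray:
    """Control 1: replace the statistic-derived gamma with a uniform
    draw from the empirical gamma range observed on the same data."""
    fake = rng.uniform(gammas.min(), gammas.max(), size=scores.shape)
    return fake * scores

def baseline_only_control(scores: np.ndarray) -> np.ndarray:
    """Control 2: gamma as a simple monotonic function of the baseline
    score alone (gamma = 1 + 0.1 * score), as in the rebuttal."""
    return (1.0 + 0.1 * scores) * scores
```

If either control reproduced the gains, the FPR reductions could not be attributed to the pre-pooling statistics specifically; the rebuttal's claim is that only the statistic-derived γ does.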

  2. Referee: [Experiments] §4 (experiments): the reported average FPR reductions (32.87 on CIFAR-10, etc.) are presented without error bars, ablation on the choice of statistics (mean/std/max), or statistical significance tests. This undermines assessment of whether the gains are robust or subject to post-hoc selection across the three datasets and two architectures.

    Authors: We acknowledge the absence of error bars, statistic ablations, and significance testing. The revised version reruns all experiments over five random seeds and reports mean FPR ± standard deviation. We add a full ablation (new Table 4) that evaluates every subset of {mean, std, max} for computing γ, confirming that the full triplet is optimal. We also include paired t-test p-values (all < 0.01) comparing Catalyst-augmented scores against the corresponding baselines on each dataset/architecture pair. These results appear in Section 4 and the supplementary material. revision: yes
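The statistic ablation described here can be sketched as a loop over every non-empty subset of {mean, std, max}. The averaging aggregation and all function names are assumptions for illustration, not the authors' harness:

```python
import itertools
import numpy as np

# Channel-wise cues over a (C, H, W) pre-pooling feature map.
STATS = {
    "mean": lambda fm: fm.mean(axis=(1, 2)),
    "std":  lambda fm: fm.std(axis=(1, 2)),
    "max":  lambda fm: fm.max(axis=(1, 2)),
}

def gamma_for_subset(feature_map: np.ndarray, subset) -> float:
    """Gamma as the average of the chosen channel statistics.
    The aggregation is illustrative; the paper's rule may differ."""
    cues = [STATS[name](feature_map).mean() for name in subset]
    return float(np.mean(cues))

def ablate(feature_map: np.ndarray) -> dict:
    """Evaluate every non-empty subset of {mean, std, max}, mirroring
    the ablation the rebuttal describes."""
    out = {}
    for r in range(1, len(STATS) + 1):
        for subset in itertools.combinations(STATS, r):
            out[subset] = gamma_for_subset(feature_map, subset)
    return out
```

In a real run each subset's γ would be fused with the baseline score and scored by FPR over seeds, which is what the reported mean ± standard deviation would summarize.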

Circularity Check

0 steps flagged

No circularity: γ is computed directly from input statistics, with no fitting to OOD labels and no load-bearing self-citations

full rationale

The paper defines Catalyst's core mechanism as computing an input-dependent γ on-the-fly from the raw channel-wise statistics (mean, std, max) of the pre-pooling feature map, then multiplicatively scaling an existing baseline score. This computation uses only the current input's own activations and does not fit parameters to OOD labels, baseline scores, or target metrics. No equation reduces the claimed FPR reductions to a fitted parameter by construction, and the text contains no load-bearing self-citations or uniqueness theorems imported from prior author work. The reported gains are presented as empirical results on standard benchmarks rather than as a derived necessity, so the argument is not circular.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

No free parameters, axioms, or invented entities are introduced; the method relies on standard assumptions of post-hoc OOD detection and computes gamma directly from feature statistics.

pith-pipeline@v0.9.0 · 5593 in / 982 out tokens · 23246 ms · 2026-05-16T07:55:05.133685+00:00 · methodology

discussion (0)


Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

What do these tags mean?
matches: the paper's claim is directly supported by a theorem in the formal canon.
supports: the theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: the paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: the paper appears to rely on the theorem as machinery.
contradicts: the paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

91 extracted references · 91 canonical work pages · 3 internal anchors

  1. [1]

    NoiseOut: A Simple Way to Prune Neural Networks

    Mohammad Babaeizadeh, Paris Smaragdis, and Roy H. Campbell. Noiseout: A simple way to prune neural net- works.CoRR, abs/1611.06211, 2016. 8

  2. [2]

    Gradorth: A simple yet efficient out- of-distribution detection with orthogonal projection of gra- dients

    Sima Behpour, Thang Doan, Xin Li, Wenbin He, Liang Gou, and Liu Ren. Gradorth: A simple yet efficient out- of-distribution detection with orthogonal projection of gra- dients. InThirty-seventh Conference on Neural Information Processing Systems, 2023. 6, 8, 4, 13, 14

  3. [3]

    Cimpoi, S

    M. Cimpoi, S. Maji, I. Kokkinos, S. Mohamed, , and A. Vedaldi. Describing textures in the wild. InProceedings of the IEEE Conf. on Computer Vision and Pattern Recogni- tion, page 3606–3613, 2014. 4, 5, 6, 9, 11, 12

  4. [4]

    Density estimation using real NVP

    Laurent Dinh, Jascha Sohl-Dickstein, and Samy Bengio. Density estimation using real NVP. InInternational Con- ference on Learning Representations, 2017. 8

  5. [5]

    Extremely simple activation shaping for out- of-distribution detection

    Andrija Djurisic, Nebojsa Bozanic, Arjun Ashok, and Rosanne Liu. Extremely simple activation shaping for out- of-distribution detection. InThe Eleventh International Con- ference on Learning Representations, 2023. 1, 2, 4, 6, 8, 13, 14, 24

  6. [6]

    An image is worth 16x16 words: Transformers for image recognition at scale

    Alexey Dosovitskiy, Lucas Beyer, Alexander Kolesnikov, Dirk Weissenborn, Xiaohua Zhai, Thomas Unterthiner, Mostafa Dehghani, Matthias Minderer, Georg Heigold, Syl- vain Gelly, Jakob Uszkoreit, and Neil Houlsby. An image is worth 16x16 words: Transformers for image recognition at scale. InInternational Conference on Learning Representa- tions, 2021. 8

  7. [7]

    Unknown-aware object detection: Learning what you don’t know from videos in the wild

    Xuefeng Du, Xin Wang, Gabriel Gozum, and Yixuan Li. Unknown-aware object detection: Learning what you don’t know from videos in the wild. InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2022. 8

  8. [8]

    To- wards unknown-aware learning with virtual outlier synthe- sis

    Xuefeng Du, Zhaoning Wang, Mu Cai, and Sharon Li. To- wards unknown-aware learning with virtual outlier synthe- sis. InInternational Conference on Learning Representa- tions, 2022. 8

  9. [9]

    Can au- tonomous vehicles identify, recover from, and adapt to dis- tribution shifts? InInternational Conference on Machine Learning (ICML), 2020

    Angelos Filos, Panagiotis Tigas, Rowan McAllister, Nicholas Rhinehart, Sergey Levine, and Yarin Gal. Can au- tonomous vehicles identify, recover from, and adapt to dis- tribution shifts? InInternational Conference on Machine Learning (ICML), 2020. 1

  10. [10]

    Dropout as a bayesian approximation: Representing model uncertainty in deep learning

    Yarin Gal and Zoubin Ghahramani. Dropout as a bayesian approximation: Representing model uncertainty in deep learning. InProceedings of The 33rd International Confer- ence on Machine Learning, pages 1050–1059, 2016. 8

  11. [11]

    Soumya Suvra Ghosal, Yiyou Sun, and Yixuan Li. How to overcome curse-of-dimensionality for out-of-distribution de- tection? InProceedings of the Thirty-Eighth AAAI Con- ference on Artificial Intelligence and Thirty-Sixth Confer- ence on Innovative Applications of Artificial Intelligence and Fourteenth Symposium on Educational Advances in Artificial Intell...

  12. [12]

    Battle of the backbones: A large-scale comparison of pretrained models across computer vision tasks

    Micah Goldblum, Hossein Souri, Renkun Ni, Manli Shu, Viraj Uday Prabhu, Gowthami Somepalli, Prithvijit Chat- topadhyay, Mark Ibrahim, Adrien Bardes, Judy Hoffman, Rama Chellappa, Andrew Gordon Wilson, and Tom Gold- stein. Battle of the backbones: A large-scale comparison of pretrained models across computer vision tasks. InThirty- seventh Conference on Ne...

  13. [13]

    Deep residual learning for image recognition

    Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. Deep residual learning for image recognition. InProceed- ings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016. 5, 8

  14. [14]

    Exploring channel-aware typical features for out-of-distribution detec- tion.Proceedings of the AAAI Conference on Artificial Intel- ligence, 38:12402–12410, 2024

    Rundong He, Yue Yuan, Zhongyi Han, Fan Wang, Wan Su, Yilong Yin, Tongliang Liu, and Yongshun Gong. Exploring channel-aware typical features for out-of-distribution detec- tion.Proceedings of the AAAI Conference on Artificial Intel- ligence, 38:12402–12410, 2024. 14

  15. [15]

    Why relu networks yield high-confidence predictions far away from the training data and how to mitigate the prob- lem

    Matthias Hein, Maksym Andriushchenko, and Julian Bitter- wolf. Why relu networks yield high-confidence predictions far away from the training data and how to mitigate the prob- lem. In2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 41–50, 2019. 1, 8

  16. [16]

    A baseline for detect- ing misclassified and out-of-distribution examples in neural networks

    Dan Hendrycks and Kevin Gimpel. A baseline for detect- ing misclassified and out-of-distribution examples in neural networks. InInternational Conference on Learning Repre- sentations, 2017. 1, 2, 4, 7, 8, 13, 14, 24

  17. [17]

    Deep anomaly detection with outlier exposure

    Dan Hendrycks, Mantas Mazeika, and Thomas Dietterich. Deep anomaly detection with outlier exposure. InInterna- tional Conference on Learning Representations, 2019. 8

  18. [18]

    Generalized odin: Detecting out-of-distribution im- age without learning from out-of-distribution data

    Yen-Chang Hsu, Yilin Shen, Hongxia Jin, and Zsolt Kira. Generalized odin: Detecting out-of-distribution im- age without learning from out-of-distribution data. In2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 10948–10957, 2020. 8, 13, 14

  19. [19]

    Weinberger

    Gao Huang, Zhuang Liu, Laurens van der Maaten, and Kil- ian Q. Weinberger. Densely connected convolutional net- works. InProceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017. 5

  20. [20]

    Mos: Towards scaling out-of- distribution detection for large semantic space

    Rui Huang and Yixuan Li. Mos: Towards scaling out-of- distribution detection for large semantic space. InProceed- ings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 8710–8719, 2021. 8

  21. [21]

    On the impor- tance of gradients for detecting distributional shifts in the wild

    Rui Huang, Andrew Geng, and Yixuan Li. On the impor- tance of gradients for detecting distributional shifts in the wild. InAdvances in Neural Information Processing Sys- tems, pages 677–689. Curran Associates, Inc., 2021. 6, 8, 13, 14

  22. [22]

    Stacked generative adversarial networks

    Xun Huang, Yixuan Li, Omid Poursaeed, John Hopcroft, and Serge Belongie. Stacked generative adversarial networks. InProceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017. 8

  23. [23]

    Ood-maml: Meta- learning for few-shot out-of-distribution detection and clas- sification

    Taewon Jeong and Heeyoung Kim. Ood-maml: Meta- learning for few-shot out-of-distribution detection and clas- sification. InAdvances in Neural Information Processing Systems, pages 3907–3916, 2020. 8

  24. [24]

    Billion- scale similarity search with GPUs.IEEE Transactions on Big Data, 7(3):535–547, 2019

    Jeff Johnson, Matthijs Douze, and Herv ´e J ´egou. Billion- scale similarity search with GPUs.IEEE Transactions on Big Data, 7(3):535–547, 2019. 4

  25. [25]

    Training OOD detectors in their natural habitats

    Julian Katz-Samuels, Julia B Nakhleh, Robert Nowak, and Yixuan Li. Training OOD detectors in their natural habitats. InProceedings of the 39th International Conference on Ma- chine Learning, pages 10848–10865, 2022. 8

  26. [26]

    Auto-encoding varia- tional bayes, 2014

    Diederik P Kingma and Max Welling. Auto-encoding varia- tional bayes, 2014. 8

  27. [27]

    Learning multiple layers of features from tiny images

    Alex Krizhevsky, Geoffrey Hinton, et al. Learning multiple layers of features from tiny images. 2009. 4

  28. [28]

    Simple and scalable predictive uncertainty esti- mation using deep ensembles

    Balaji Lakshminarayanan, Alexander Pritzel, and Charles Blundell. Simple and scalable predictive uncertainty esti- mation using deep ensembles. InAdvances in Neural Infor- mation Processing Systems, 2017. 8

  29. [29]

    Training confidence-calibrated classifiers for detecting out- of-distribution samples

    Kimin Lee, Honglak Lee, Kibok Lee, and Jinwoo Shin. Training confidence-calibrated classifiers for detecting out- of-distribution samples. InInternational Conference on Learning Representations, 2018. 8

  30. [30]

    A simple unified framework for detecting out-of-distribution samples and adversarial attacks

    Kimin Lee, Kibok Lee, Honglak Lee, and Jinwoo Shin. A simple unified framework for detecting out-of-distribution samples and adversarial attacks. InAdvances in Neural In- formation Processing Systems, 2018. 2, 6, 8, 4, 13, 14

  31. [31]

    Pruning filters for efficient convnets

    Hao Li, Asim Kadav, Igor Durdanovic, Hanan Samet, and Hans Peter Graf. Pruning filters for efficient convnets. InIn- ternational Conference on Learning Representations, 2017. 8

  32. [32]

    Shiyu Liang, Yixuan Li, and R. Srikant. Enhancing the re- liability of out-of-distribution image detection in neural net- works. InInternational Conference on Learning Represen- tations, 2018. 2, 4, 8, 1, 13, 14, 24

  33. [33]

    Mood: Multi-level out-of-distribution detection

    Ziqian Lin, Sreya Dutta Roy, and Yixuan Li. Mood: Multi-level out-of-distribution detection. In2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 15308–15318, 2021. 8

  34. [34]

    Fast decision boundary based out-of-distribution detector.ICML Workshop or arXiv preprint, 2023

    Litian Liu and Yao Qin. Fast decision boundary based out-of-distribution detector.ICML Workshop or arXiv preprint, 2023. 6, 8, 13, 14

  35. [35]

    Detecting out-of-distribution through the lens of neural collapse

    Litian Liu and Yao Qin. Detecting out-of-distribution through the lens of neural collapse. InIEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2025. 6, 8, 13

  36. [36]

    Energy-based out-of-distribution detection

    Weitang Liu, Xiaoyun Wang, John Owens, and Yixuan Li. Energy-based out-of-distribution detection. InAdvances in Neural Information Processing Systems, pages 21464– 21475. Curran Associates, Inc., 2020. 1, 2, 4, 5, 7, 8, 13, 14, 24

  37. [37]

    A convnet for the 2020s

    Zhuang Liu, Hanzi Mao, Chao-Yuan Wu, Christoph Feicht- enhofer, Trevor Darrell, and Saining Xie. A convnet for the 2020s. InProceedings of the IEEE/CVF Conference on Com- puter Vision and Pattern Recognition (CVPR), pages 11976– 11986, 2022. 8

  38. [38]

    A simple baseline for bayesian uncertainty in deep learning

    Wesley J Maddox, Pavel Izmailov, Timur Garipov, Dmitry P Vetrov, and Andrew Gordon Wilson. A simple baseline for bayesian uncertainty in deep learning. InAdvances in Neural Information Processing Systems, 2019. 8

  39. [39]

    Predictive uncertainty es- timation via prior networks

    Andrey Malinin and Mark Gales. Predictive uncertainty es- timation via prior networks. InAdvances in Neural Informa- tion Processing Systems, 2018. 8

  40. [40]

    Reverse kl-divergence training of prior networks: Improved uncertainty and adver- sarial robustness

    Andrey Malinin and Mark Gales. Reverse kl-divergence training of prior networks: Improved uncertainty and adver- sarial robustness. InAdvances in Neural Information Pro- cessing Systems, 2019. 8

  41. [41]

    Towards neural net- works that provably know when they don’t know

    Alexander Meinke and Matthias Hein. Towards neural net- works that provably know when they don’t know. InInter- national Conference on Learning Representations, 2020. 8

  42. [42]

    POEM: Out-of- distribution detection with posterior sampling

    Yifei Ming, Ying Fan, and Yixuan Li. POEM: Out-of- distribution detection with posterior sampling. InProceed- ings of the 39th International Conference on Machine Learn- ing, pages 15650–15665, 2022. 8

  43. [43]

    How to exploit hyperspherical embeddings for out-of-distribution detection? InThe Eleventh International Conference on Learning Representations, 2023

    Yifei Ming, Yiyou Sun, Ousmane Dia, and Yixuan Li. How to exploit hyperspherical embeddings for out-of-distribution detection? InThe Eleventh International Conference on Learning Representations, 2023. 14

  44. [44]

    Provable guarantees for understanding out-of-distribution detection

    Peyman Morteza and Yixuan Li. Provable guarantees for understanding out-of-distribution detection. InProceedings of the AAAI conference on Aritificial Intelligence, 2021. 8

  45. [45]

    Do deep generative models know what they don’t know? InInternational Con- ference on Learning Representations, 2019

    Eric Nalisnick, Akihiro Matsukawa, Yee Whye Teh, Dilan Gorur, and Balaji Lakshminarayanan. Do deep generative models know what they don’t know? InInternational Con- ference on Learning Representations, 2019. 8

  46. [46]

    Yuval Netzer, Tao Wang, Adam Coates, Alessandro Bis- sacco, Bo Wu, and Andrew Y . Ng. Reading digits in natural images with unsupervised feature learning. InNIPS Work- shop on Deep Learning and Unsupervised Feature Learning,

  47. [47]

    Dnn modularization via activation-driven training,

    Tuan Ngo, Abid Hassan, Saad Shafiq, and Nenad Medvi- dovic. Dnn modularization via activation-driven training,

  48. [48]

    Deep neural networks are easily fooled: High confidence predictions for unrecognizable images

    Anh Nguyen, Jason Yosinski, and Jeff Clune. Deep neural networks are easily fooled: High confidence predictions for unrecognizable images. In2015 IEEE Conference on Com- puter Vision and Pattern Recognition (CVPR), pages 427– 436, 2015. 1, 8

  49. [49]

    Nearest neighbor guidance for out-of-distribution detection,

    Jaewoo Park, Yoon Gyo Jung, and Andrew Beng Jin Teoh. Nearest neighbor guidance for out-of-distribution detection,

  50. [50]

    Adascale: Adaptive scaling for ood de- tection, 2025

    Sudarshan Regmi. Adascale: Adaptive scaling for ood de- tection, 2025. 6, 17

  51. [51]

    Stochastic backpropagation and approximate inference in deep generative models, 2014

    Danilo Jimenez Rezende, Shakir Mohamed, and Daan Wier- stra. Stochastic backpropagation and approximate inference in deep generative models, 2014. 8

  52. [52]

    A. G. Roy, J. Ren, S. Azizi, A. Loh, V. Natarajan, B. Mustafa, N. Pawlowski, J. Freyberg, Y. Liu, and Z. Beaver. Does your dermatology classifier know what it doesn’t know? Detecting the long-tail of unseen conditions. CoRR, arXiv:2104.03829,

  53. [53]

    Vikash Sehwag, Mung Chiang, and Prateek Mittal. SSD: A unified framework for self-supervised outlier detection. In Proceedings of the International Conference on Learning Representations, 2021. 8, 4

  54. [54]

    Yiyou Sun and Yixuan Li. DICE: Leveraging sparsification for out-of-distribution detection. In Computer Vision – ECCV 2022, pages 691–708. Springer Nature Switzerland,

  55. [55]

    1, 2, 6, 8, 4, 13, 14, 15, 24

  56. [56]

    Yiyou Sun, Chuan Guo, and Yixuan Li. ReAct: Out-of-distribution detection with rectified activations. In Advances in Neural Information Processing Systems, pages 144–157. Curran Associates, Inc., 2021. 1, 2, 3, 4, 5, 6, 8, 13, 14, 15, 24

  57. [57]

    Yiyou Sun, Yifei Ming, Xiaojin Zhu, and Yixuan Li. Out-of-distribution detection with deep nearest neighbors. In Proceedings of the 39th International Conference on Machine Learning, pages 20827–20840, 2022. 1, 2, 4, 6, 8, 13, 14, 24

  58. [58]

    E. Tabak and Cristina Turner. A family of nonparametric density estimation algorithms. Communications on Pure and Applied Mathematics, 66:145–164, 2013. 8

  59. [59]

    Joost Van Amersfoort, Lewis Smith, Yee Whye Teh, and Yarin Gal. Uncertainty estimation using a single deep deterministic neural network. In Proceedings of the 37th International Conference on Machine Learning, pages 9690–9700,

  60. [60]

    Aaron van den Oord, Nal Kalchbrenner, Lasse Espeholt, Koray Kavukcuoglu, Oriol Vinyals, and Alex Graves. Conditional image generation with PixelCNN decoders. In Advances in Neural Information Processing Systems, 2016. 8

  61. [61]

    Grant Van Horn, Oisin Mac Aodha, Yang Song, Yin Cui, Chen Sun, Alex Shepard, Hartwig Adam, Pietro Perona, and Serge Belongie. The iNaturalist species classification and detection dataset. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2018. 5, 9

  62. [62]

    Haoran Wang, Weitang Liu, Alex Bocchieri, and Yixuan Li. Can multi-label classification networks know what they don’t know? In Advances in Neural Information Processing Systems, 2021. 8

  63. [63]

    Haoqi Wang, Zhizhong Li, Litong Feng, and Wayne Zhang. ViM: Out-of-distribution with virtual-logit matching. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022. 5, 8, 7, 13, 14

  64. [64]

    Xiaosong Wang, Yifan Peng, Le Lu, Zhiyong Lu, Mohammadhadi Bagheri, and Ronald M. Summers. ChestX-ray8: Hospital-scale chest x-ray database and benchmarks on weakly-supervised classification and localization of common thorax diseases. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017. 1

  65. [65]

    Hongxin Wei, Renchunzi Xie, Hao Cheng, Lei Feng, Bo An, and Yixuan Li. Mitigating neural network overconfidence with logit normalization. In Proceedings of the 39th International Conference on Machine Learning, 2022. 8

  66. [66]

    Sanghyun Woo, Shoubhik Debnath, Ronghang Hu, Xinlei Chen, Zhuang Liu, In So Kweon, and Saining Xie. ConvNeXt V2: Co-designing and scaling ConvNets with masked autoencoders. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 16133–16142, 2023. 8

  67. [67]

    Jianxiong Xiao, James Hays, Krista A. Ehinger, Aude Oliva, and Antonio Torralba. SUN database: Large-scale scene recognition from abbey to zoo. In 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, pages 3485–3492, 2010. 5, 9

  68. [68]

    Kai Xu, Rongyu Chen, Gianni Franchi, and Angela Yao. Scaling for training time and post-hoc out-of-distribution detection enhancement. In The Twelfth International Conference on Learning Representations, 2024. 1, 2, 8, 13, 14, 16

  69. [69]

    Pingmei Xu, Krista A. Ehinger, Yinda Zhang, Adam Finkelstein, Sanjeev R. Kulkarni, and Jianxiong Xiao. TurkerGaze: Crowdsourcing saliency with webcam based eye tracking. CoRR, arXiv:1504.06755, 2015. 5, 6, 11, 12

  70. [70]

    Jingkang Yang, Haoqi Wang, Litong Feng, Xiaopeng Yan, Huabin Zheng, Wayne Zhang, and Ziwei Liu. Semantically coherent out-of-distribution detection. In 2021 IEEE/CVF International Conference on Computer Vision (ICCV), 2021. 8

  71. [71]

    Fisher Yu, Yinda Zhang, Shuran Song, Ari Seff, and Jianxiong Xiao. LSUN: Construction of a large-scale image dataset using deep learning with humans in the loop. CoRR, arXiv:1506.03365, 2015. 4, 5, 6, 11, 12

  72. [72]

    Matthew D. Zeiler and Rob Fergus. Visualizing and understanding convolutional networks, 2013. 7

  73. [73]

    Jingyang Zhang, Jingkang Yang, Pengyun Wang, Haoqi Wang, Yueqian Lin, Haoran Zhang, Yiyou Sun, Xuefeng Du, Kaiyang Zhou, Wayne Zhang, Yixuan Li, Ziwei Liu, Yiran Chen, and Hai Li. OpenOOD v1.5: Enhanced benchmark for out-of-distribution detection. arXiv preprint arXiv:2306.09301, 2023. 5

  74. [74]

    Bolei Zhou, Agata Lapedriza, Aditya Khosla, Aude Oliva, and Antonio Torralba. Places: A 10 million image database for scene recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence, 40(6):1452–1464, 2017. 4, 5, 6, 9, 11, 12

  75. [75]

    Yao Zhu, YueFeng Chen, Chuanlong Xie, Xiaodan Li, Rong Zhang, Hui Xue, Xiang Tian, Bolun Zheng, and Yaowu Chen. Boosting out-of-distribution detection with typical features,

  76. [76]

    13, 14

Catalyst: Out-of-Distribution Detection via Elastic Scaling
Supplementary Material

A. Description of Baseline Methods

In resonance with existing work [5, 36, 54, 55], for the reader’s convenience, we summarize in detail a few common techniques for defining OOD scores that measure the degree of ID-ness on the given sample. All the methods derive t...

1. Compute the p-th percentile threshold t of h(x).
2. Let s1 = Σ h(x), the sum of all activation values before pruning.
3. Set all values in h(x) less than t to zero.
4. Let s2 = Σ h(x), the sum after pruning.
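The four steps above can be sketched in a few lines of pure Python. This is an illustrative helper, not the paper's code: the function name, the interpolation rule for the percentile (linear, matching common numerical libraries), and the toy feature vector are all assumptions.

```python
def percentile_prune(h, p):
    """Prune a flattened feature map h(x) at its p-th percentile.

    Returns the pruned activations plus the sums before (s1) and
    after (s2) pruning, as in the four steps described above.
    """
    # Step 1: p-th percentile threshold t of h(x), via linear interpolation.
    xs = sorted(h)
    k = (len(xs) - 1) * p / 100.0
    lo = int(k)
    hi = min(lo + 1, len(xs) - 1)
    t = xs[lo] + (k - lo) * (xs[hi] - xs[lo])

    # Step 2: sum of all activation values before pruning.
    s1 = sum(h)

    # Step 3: zero out every activation below the threshold.
    pruned = [0.0 if v < t else v for v in h]

    # Step 4: sum of the surviving activations.
    s2 = sum(pruned)
    return pruned, s1, s2

# Toy usage: a 5-element "feature map" pruned at the 60th percentile.
pruned, s1, s2 = percentile_prune([0.1, 0.5, 2.0, 0.2, 3.0], p=60.0)
```

The ratio s1/s2 (or a function of it) is what such pruning-based detectors typically feed back into the score; the sketch only reproduces the bookkeeping steps listed here.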
