Early High-Frequency Injection for Geometry-Sensitive OOD Detection
Pith reviewed 2026-05-21 05:09 UTC · model grok-4.3
The pith
Injecting high-frequency input components early reshapes neural features to separate in-distribution from out-of-distribution samples more cleanly.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Under matched training and scoring settings, early high-frequency injection (EIHF) reshapes class-conditional feature geometry and reduces ID/OOD Mahalanobis score overlap. The method works by exposing higher-frequency bands at the input before the first convolution, exploiting the observation that these bands induce stronger feature discrepancy than low-frequency bands.
What carries the argument
EIHF, an input-side intervention that adds high-frequency evidence before the first convolution to reshape class-conditional feature geometry for geometry-sensitive OOD scoring.
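As a rough illustration of what an input-side high-frequency channel could look like, here is a minimal NumPy sketch. The passage quoted in the Lean theorems section of this page describes EIHF as appending a fixed high-frequency residual channel before the first convolution, and the referee report notes that the filtering method, scaling, and masking are not specified. Everything concrete below is therefore an assumption: the ideal radial low-pass filter, the `cutoff` parameter, the grayscale averaging, and the function names `high_freq_residual` and `eihf_input` are hypothetical and may differ from the authors' code.

```python
import numpy as np

def high_freq_residual(gray: np.ndarray, cutoff: float = 0.25) -> np.ndarray:
    """Return `gray` minus a low-pass copy of itself.

    Assumption: an ideal radial low-pass filter in the Fourier domain.
    `cutoff` is a hypothetical parameter, given as a fraction of the
    Nyquist frequency (0.5 cycles per pixel).
    """
    h, w = gray.shape
    fy = np.fft.fftfreq(h)[:, None]          # vertical frequencies, cycles/pixel
    fx = np.fft.fftfreq(w)[None, :]          # horizontal frequencies
    radius = np.sqrt(fx**2 + fy**2)
    low_mask = radius <= cutoff * 0.5        # keep only low frequencies
    low = np.fft.ifft2(np.fft.fft2(gray) * low_mask).real
    return gray - low                        # what the low-pass copy misses

def eihf_input(image: np.ndarray, cutoff: float = 0.25) -> np.ndarray:
    """Append one fixed high-frequency channel to a (C, H, W) image."""
    gray = image.mean(axis=0)                # assumed grayscale conversion
    residual = high_freq_residual(gray, cutoff)
    return np.concatenate([image, residual[None]], axis=0)

rng = np.random.default_rng(0)
x = rng.random((3, 32, 32))                  # toy RGB image
x_aug = eihf_input(x)
print(x_aug.shape)
```

In this sketch the original channels pass through unchanged and the network's first convolution would receive C + 1 input channels. How the paper handles that detail is not stated in this summary.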
If this is right
- Geometry-sensitive post-hoc detectors such as Mahalanobis distance scoring receive the largest gains.
- Performance improves on the CIFAR-100 and ImageNet-100 benchmarks: gains on CIFAR-100, and the best average FPR95 with second-best average AUROC on ImageNet-100.
- No change to the training loss or architecture is required.
- A limitation appears on the scene-centric Places shift.
Where Pith is reading between the lines
- The same frequency-band analysis could be applied to other representation-learning objectives to decide where to inject information.
- Input-level frequency interventions may complement or replace some post-training scoring refinements.
- The limitation on scene-centric data suggests that frequency content interacts with the semantic granularity of the shift.
Load-bearing premise
The band-wise MMD squared diagnostic correctly identifies higher-frequency bands as carrying stronger ID/OOD separability that early injection can exploit.
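To make the premise concrete, the diagnostic can be sketched as: split each image into frequency bands, extract features from each band, and compare ID and OOD features with a squared MMD estimate. The sketch below is illustrative only. The radial FFT band mask, the band edges, the median-heuristic bandwidth, and the random-projection `extract_features` (a stand-in for a trained encoder) are all assumptions, not details taken from the paper.

```python
import numpy as np

def band_pass(images: np.ndarray, lo: float, hi: float) -> np.ndarray:
    """Keep radial frequencies in [lo, hi) cycles/pixel for (N, H, W) images."""
    _, h, w = images.shape
    fy = np.fft.fftfreq(h)[:, None]
    fx = np.fft.fftfreq(w)[None, :]
    radius = np.sqrt(fx**2 + fy**2)
    mask = (radius >= lo) & (radius < hi)
    return np.fft.ifft2(np.fft.fft2(images) * mask).real

def mmd2_rbf(x: np.ndarray, y: np.ndarray, sigma: float) -> float:
    """Biased (V-statistic) estimate of squared MMD with a Gaussian kernel."""
    def gram(a, b):
        sq = (a**2).sum(1)[:, None] + (b**2).sum(1)[None, :] - 2.0 * a @ b.T
        return np.exp(-np.maximum(sq, 0.0) / (2.0 * sigma**2))
    return float(gram(x, x).mean() + gram(y, y).mean() - 2.0 * gram(x, y).mean())

def median_sigma(z: np.ndarray) -> float:
    """Median pairwise distance, a common bandwidth heuristic (assumed here)."""
    d = np.sqrt(((z[:, None] - z[None]) ** 2).sum(-1))
    return float(np.median(d[d > 0]))

def extract_features(images: np.ndarray, proj: np.ndarray) -> np.ndarray:
    """Hypothetical stand-in for a trained encoder: a fixed linear projection."""
    return images.reshape(len(images), -1) @ proj

rng = np.random.default_rng(0)
proj = rng.normal(size=(16 * 16, 8))
id_imgs = rng.normal(size=(64, 16, 16))               # toy "ID" images
ood_imgs = rng.normal(scale=1.5, size=(64, 16, 16))   # toy "OOD" images

band_edges = [0.0, 0.1, 0.25, 0.75]   # 0.75 exceeds the largest radius (~0.707)
scores = {}
for lo, hi in zip(band_edges[:-1], band_edges[1:]):
    f_id = extract_features(band_pass(id_imgs, lo, hi), proj)
    f_ood = extract_features(band_pass(ood_imgs, lo, hi), proj)
    sigma = median_sigma(np.concatenate([f_id, f_ood]))
    scores[(lo, hi)] = mmd2_rbf(f_id, f_ood, sigma)

for band, value in scores.items():
    print(band, value)
```

The toy data says nothing about which band separates better on real images. The premise stands or falls on the same loop run with the trained encoders (CE, SimCLR, SupCon, PALM) and real ID/OOD pairs.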
What would settle it
Measure the Mahalanobis score distributions on a held-out ID/OOD pair with and without EIHF. If the overlap between the ID and OOD distributions does not decrease relative to the baseline, the central claim is false.
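One way to run that check is sketched below, under stated assumptions. The score follows the common class-conditional Gaussian formulation with a shared covariance, and the overlap is measured with a histogram overlap coefficient. The paper's exact overlap measure is not given in this summary, so `overlap_coefficient` and its `bins` parameter are hypothetical choices, and the features are synthetic stand-ins for encoder outputs.

```python
import numpy as np

def fit_mahalanobis(feats: np.ndarray, labels: np.ndarray):
    """Estimate class means and a shared (tied) precision matrix."""
    classes = np.unique(labels)
    means = np.stack([feats[labels == c].mean(0) for c in classes])
    centered = feats - means[np.searchsorted(classes, labels)]
    cov = centered.T @ centered / len(feats)
    return means, np.linalg.pinv(cov)

def mahalanobis_score(feats, means, precision):
    """Negative squared distance to the nearest class mean (higher = more ID)."""
    diff = feats[:, None, :] - means[None, :, :]            # (N, C, D)
    d2 = np.einsum("ncd,de,nce->nc", diff, precision, diff)
    return -d2.min(axis=1)

def overlap_coefficient(a: np.ndarray, b: np.ndarray, bins: int = 50) -> float:
    """Shared histogram mass of two score samples, in [0, 1]."""
    lo, hi = min(a.min(), b.min()), max(a.max(), b.max())
    pa, _ = np.histogram(a, bins=bins, range=(lo, hi))
    pb, _ = np.histogram(b, bins=bins, range=(lo, hi))
    return float(np.minimum(pa / pa.sum(), pb / pb.sum()).sum())

# Synthetic features standing in for encoder outputs.
rng = np.random.default_rng(0)
train = np.concatenate([rng.normal(0, 1, (200, 4)), rng.normal(4, 1, (200, 4))])
labels = np.repeat([0, 1], 200)
means, precision = fit_mahalanobis(train, labels)

id_test = np.concatenate([rng.normal(0, 1, (100, 4)), rng.normal(4, 1, (100, 4))])
ood_test = rng.normal(-6, 1, (200, 4))
id_s = mahalanobis_score(id_test, means, precision)
ood_s = mahalanobis_score(ood_test, means, precision)
overlap = overlap_coefficient(id_s, ood_s)
print(overlap)
```

To test the claim, the same statistic would be computed twice on the same held-out pair, once with baseline features and once with EIHF features, with training and scoring settings matched.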
Original abstract
Post-hoc OOD detectors score logits or features after training, so their success depends on the geometry already encoded in the representation. We revisit this assumption through a band-wise MMD^2 analysis across CE, SimCLR, SupCon, and the OOD-oriented representation method PALM. In our diagnostic, low-frequency input bands induce weaker ID/OOD feature discrepancy, whereas higher-frequency bands tend to provide stronger separability. This observation motivates EIHF, an input-side intervention that exposes high-frequency evidence before the first convolution without changing the training objective. EIHF is strongest for geometry-sensitive OOD detection: under matched training and scoring settings, it reshapes class-conditional feature geometry and reduces ID/OOD Mahalanobis score overlap. Experiments on CIFAR-100 and ImageNet-100 show gains on CIFAR-100 and the best average FPR95 with second-best average AUROC on ImageNet-100, while also revealing a limitation on the scene-centric Places shift. Code is available at https://anonymous.4open.science/r/EIHF.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper claims that a band-wise MMD² analysis across CE, SimCLR, SupCon, and PALM representations shows weaker ID/OOD separability in low-frequency input bands and stronger separability in higher-frequency bands. This motivates EIHF, an input-side intervention that injects high-frequency evidence before the first convolution without altering the training objective. Under matched training and scoring, EIHF reshapes class-conditional feature geometry and reduces Mahalanobis ID/OOD overlap. It yields gains on CIFAR-100 and the best average FPR95 (with second-best average AUROC) on ImageNet-100; the paper also notes a limitation on the Places shift. Code is released.
Significance. If the central claim holds after addressing the diagnostic, EIHF provides a lightweight, training-agnostic way to improve geometry-sensitive post-hoc OOD detectors by directly influencing early feature geometry. The empirical results on standard benchmarks, code availability, and frequency-based diagnostic add practical and conceptual value to the OOD literature.
major comments (1)
- [Band-wise MMD² analysis (motivation)] The band-wise MMD² diagnostic may confound frequency content with per-band energy. Higher-frequency bands inherently carry lower amplitude; without explicit per-band energy or variance normalization before feature extraction, any MMD² increase could trace to signal-strength differences rather than frequency-specific discriminative structure. This assumption is load-bearing for motivating the early-injection intervention (see Abstract and motivation section).
minor comments (2)
- [Abstract] The abstract and methods lack precise implementation details on the high-frequency injection (filtering method, scaling, exact masking).
- [Experiments] Statistical significance of reported gains should be included, along with expanded controls or analysis for the Places limitation.
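For the statistical-significance request, one possible approach (an assumption on our part, not something the paper is known to do) is a percentile bootstrap over the ID and OOD score samples. The sketch below uses the standard definitions of FPR95 and AUROC for scores where higher means more in-distribution. The helper name `bootstrap_ci` and its parameters are hypothetical.

```python
import numpy as np

def fpr_at_95_tpr(id_scores: np.ndarray, ood_scores: np.ndarray) -> float:
    """Fraction of OOD samples accepted when about 95% of ID samples are accepted."""
    thresh = np.percentile(id_scores, 5)     # about 95% of ID scores lie above
    return float((ood_scores >= thresh).mean())

def auroc(id_scores: np.ndarray, ood_scores: np.ndarray) -> float:
    """P(ID score > OOD score), with ties counted as one half."""
    greater = (id_scores[:, None] > ood_scores[None, :]).mean()
    ties = (id_scores[:, None] == ood_scores[None, :]).mean()
    return float(greater + 0.5 * ties)

def bootstrap_ci(metric, id_scores, ood_scores, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for a metric of (ID, OOD) scores."""
    rng = np.random.default_rng(seed)
    stats = []
    for _ in range(n_boot):
        i = rng.choice(id_scores, size=len(id_scores), replace=True)
        o = rng.choice(ood_scores, size=len(ood_scores), replace=True)
        stats.append(metric(i, o))
    lo, hi = np.percentile(stats, [100 * alpha / 2, 100 * (1 - alpha / 2)])
    return float(lo), float(hi)

# Toy scores: ID centered higher than OOD.
rng = np.random.default_rng(1)
id_scores = rng.normal(1.0, 1.0, 500)
ood_scores = rng.normal(-1.0, 1.0, 500)
print(auroc(id_scores, ood_scores), bootstrap_ci(auroc, id_scores, ood_scores))
print(fpr_at_95_tpr(id_scores, ood_scores))
```

The pairwise comparison in `auroc` builds an N by M matrix, so a rank-based implementation would be preferable for large test sets. Intervals over training seeds would be a separate and complementary control.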
Simulated Author's Rebuttal
We thank the referee for the constructive feedback and for highlighting a potential confound in our band-wise MMD² diagnostic. We address this point directly below and have revised the manuscript to strengthen the motivation for EIHF.
- Referee: [Band-wise MMD² analysis (motivation)] The band-wise MMD² diagnostic may confound frequency content with per-band energy. Higher-frequency bands inherently carry lower amplitude; without explicit per-band energy or variance normalization before feature extraction, any MMD² increase could trace to signal-strength differences rather than frequency-specific discriminative structure. This assumption is load-bearing for motivating the early-injection intervention (see Abstract and motivation section).
  Authors: We agree that the absence of per-band energy normalization is a valid concern: higher-frequency bands do carry lower amplitude on average, and this could partially drive the observed MMD² trends. In the original analysis we did not apply explicit per-band variance normalization before feature extraction. To resolve this, we have revised the diagnostic so that each frequency band is normalized by its own standard deviation prior to MMD² computation. The updated results (new Figure 2 and expanded Section 3.2) preserve the key pattern across CE, SimCLR, SupCon, and PALM representations: low-frequency bands continue to yield weaker ID/OOD separability, while higher-frequency bands yield stronger separability. We have also added a brief discussion of this normalization step in the motivation section to make the frequency-specific claim more robust. These changes directly address the load-bearing assumption for EIHF.
  Revision: yes
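The normalization step described in the rebuttal can be sketched as follows. The rebuttal says only that each band is normalized by its own standard deviation, so the details here are assumptions: the scale is estimated once per band from the ID images and applied to both ID and OOD images (so that genuine ID/OOD differences are not erased), and the band edges and function names are hypothetical.

```python
import numpy as np

def band_pass(images: np.ndarray, lo: float, hi: float) -> np.ndarray:
    """Keep radial frequencies in [lo, hi) cycles/pixel for (N, H, W) images."""
    _, h, w = images.shape
    fy = np.fft.fftfreq(h)[:, None]
    fx = np.fft.fftfreq(w)[None, :]
    radius = np.sqrt(fx**2 + fy**2)
    mask = (radius >= lo) & (radius < hi)
    return np.fft.ifft2(np.fft.fft2(images) * mask).real

def normalize_bands(id_band: np.ndarray, ood_band: np.ndarray, eps: float = 1e-8):
    """Divide both sets by the ID band's standard deviation (assumed choice)."""
    scale = id_band.std() + eps
    return id_band / scale, ood_band / scale

def smooth_noise(rng, n: int, size: int = 16) -> np.ndarray:
    """Toy images dominated by low frequencies (2-D cumulative sums of noise)."""
    noise = rng.normal(size=(n, size, size))
    return np.cumsum(np.cumsum(noise, axis=1), axis=2)

rng = np.random.default_rng(0)
id_imgs, ood_imgs = smooth_noise(rng, 32), smooth_noise(rng, 32)

band_edges = [0.0, 0.1, 0.25, 0.75]
raw_std, norm_std = {}, {}
for lo, hi in zip(band_edges[:-1], band_edges[1:]):
    id_b, ood_b = band_pass(id_imgs, lo, hi), band_pass(ood_imgs, lo, hi)
    id_n, ood_n = normalize_bands(id_b, ood_b)
    raw_std[(lo, hi)] = float(id_b.std())    # per-band amplitude before scaling
    norm_std[(lo, hi)] = float(id_n.std())   # close to 1 after scaling

print(raw_std)
print(norm_std)
```

After this step every band enters the feature extractor at a comparable amplitude, so any remaining difference in MMD² across bands is harder to attribute to signal strength alone.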
Circularity Check
No circularity: independent diagnostic motivates intervention with empirical validation
The derivation begins with a band-wise MMD² analysis across multiple training methods (CE, SimCLR, SupCon, PALM) that identifies frequency-dependent ID/OOD separability in features. This observation directly motivates the EIHF input intervention, which is then evaluated by measuring changes in class-conditional geometry and Mahalanobis score overlap on CIFAR-100 and ImageNet-100. No equations reduce the claimed gains to a fitted parameter or self-referential definition; the central result is the measured effect of the intervention under matched settings. No load-bearing self-citations or uniqueness theorems are invoked. The chain is self-contained against external benchmarks.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption: Higher-frequency input bands induce stronger ID/OOD feature discrepancy than low-frequency bands.
Lean theorems connected to this paper
- IndisputableMonolith/Foundation/AbsoluteFloorClosure.lean: reality_from_one_distinction (tag: unclear)
  Relation between the paper passage and the cited Recognition theorem: unclear.
  Passage: "band-wise MMD² analysis... low-frequency input bands induce weaker ID/OOD feature discrepancy, whereas higher-frequency bands tend to provide stronger separability... EIHF... appends a fixed high-frequency residual channel before the first convolution"
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.