ConjNorm: Tractable Density Estimation for Out-of-Distribution Detection

Bo Peng; Yadan Luo; Yixuan Li; Yonggang Zhang; Zhen Fang

arxiv: 2402.17888 · v5 · pith:RUASZ4XOnew · submitted 2024-02-27 · 💻 cs.LG · cs.AI

ConjNorm: Tractable Density Estimation for Out-of-Distribution Detection

Bo Peng , Yadan Luo , Yonggang Zhang , Yixuan Li , Zhen Fang This is my paper

Pith reviewed 2026-05-25 08:33 UTC · model grok-4.3

classification 💻 cs.LG cs.AI

keywords out-of-distribution detectiondensity estimationBregman divergenceConjNormMonte Carlo estimatornorm coefficientexponential family distributions

0 comments

The pith

ConjNorm reframes density estimation for out-of-distribution detection as optimization of a norm coefficient under Bregman divergence.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Many OOD detection approaches rely on scores from logits or distances that may not capture true data density. The paper offers a unified view using Bregman divergence to cover exponential family distributions. It derives ConjNorm, turning density design into finding the right norm coefficient p for the dataset. A Monte Carlo importance sampling method provides an unbiased estimate of the needed partition function. This setup delivers better OOD detection across benchmarks.

Core claim

We propose a novel theoretical framework grounded in Bregman divergence, which extends distribution considerations to encompass an exponential family of distributions. Leveraging the conjugation constraint revealed in our theorem, we introduce a ConjNorm method, reframing density function design as a search for the optimal norm coefficient p against the given dataset. In light of the computational challenges of normalization, we devise an unbiased and analytically tractable estimator of the partition function using the Monte Carlo-based importance sampling technique.

What carries the argument

The conjugation constraint from the Bregman divergence theorem that reframes density function design as a search for the optimal norm coefficient p.

Load-bearing premise

The conjugation constraint from the Bregman divergence theorem allows reframing density function design as a search for the optimal norm coefficient p against the given dataset.

What would settle it

A demonstration that the Monte Carlo estimator is biased on real datasets or that ConjNorm fails to outperform existing methods would falsify the performance claims.

Figures

Figures reproduced from arXiv: 2402.17888 by Bo Peng, Yadan Luo, Yixuan Li, Yonggang Zhang, Zhen Fang.

**Figure 1.** Figure 1: Illustration of the alignment of GEM score and true density of Gaussian (Left) and Gamma (Right) distributions. Distance-based OOD methods (Lee et al., 2017) target on deriving gθ(z, k) by assessing the proximity of the input to the k-th prototype µk. The selection of appropriate similarity metrics is crucial in capturing the intrinsic geometric data relationships. One of the most representative metrics… view at source ↗

**Figure 2.** Figure 2: Evaluations of different partition function estimation baselines on ImageNet: Left: Mo [PITH_FULL_IMAGE:figures/full_fig_p006_2.png] view at source ↗

**Figure 3.** Figure 3: Ablation study using feature extractions from (a) the first, (b) the second, and (c) the last [PITH_FULL_IMAGE:figures/full_fig_p008_3.png] view at source ↗

**Figure 4.** Figure 4: Ablation study w.r.t varing sampling ratio [PITH_FULL_IMAGE:figures/full_fig_p008_4.png] view at source ↗

**Figure 5.** Figure 5: Comparisons of varying q when p is fixed at 2.5 (Left) and 3.0 (Right) on CIFAR-100 [PITH_FULL_IMAGE:figures/full_fig_p009_5.png] view at source ↗

read the original abstract

Post-hoc out-of-distribution (OOD) detection has garnered intensive attention in reliable machine learning. Many efforts have been dedicated to deriving score functions based on logits, distances, or rigorous data distribution assumptions to identify low-scoring OOD samples. Nevertheless, these estimate scores may fail to accurately reflect the true data density or impose impractical constraints. To provide a unified perspective on density-based score design, we propose a novel theoretical framework grounded in Bregman divergence, which extends distribution considerations to encompass an exponential family of distributions. Leveraging the conjugation constraint revealed in our theorem, we introduce a \textsc{ConjNorm} method, reframing density function design as a search for the optimal norm coefficient $p$ against the given dataset. In light of the computational challenges of normalization, we devise an unbiased and analytically tractable estimator of the partition function using the Monte Carlo-based importance sampling technique. Extensive experiments across OOD detection benchmarks empirically demonstrate that our proposed \textsc{ConjNorm} has established a new state-of-the-art in a variety of OOD detection setups, outperforming the current best method by up to 13.25$\%$ and 28.19$\%$ (FPR95) on CIFAR-100 and ImageNet-1K, respectively.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

ConjNorm turns density-based OOD scoring into a search over norm coefficient p via a Bregman conjugation constraint and adds an MC estimator, with sizable reported gains, but the core validity claims rest on unshown derivations.

read the letter

The main point on this paper is that ConjNorm uses a Bregman divergence theorem to reframe density estimation for OOD detection as finding the right p in a norm, then supplies a Monte Carlo importance sampler for the partition function that is claimed to be unbiased and tractable. It reports clear improvements over prior methods on standard benchmarks, up to 13.25% and 28.19% FPR95 on CIFAR-100 and ImageNet-1K. The specific combination of the conjugation constraint with the p-search and the estimator looks like the actual new piece, extending earlier tools in a way not directly covered in the cited work. The paper does a reasonable job of giving a unified theoretical view on density-based scores and then running the usual OOD detection experiments across multiple setups, which is useful for seeing where the numbers land. If the theorem and estimator hold, the framing could help avoid some of the ad-hoc choices in earlier scores. The soft spots sit mainly in the soundness and circularity areas. The abstract states the constraint allows reframing density design as p-search and that the estimator is unbiased, but without the actual theorem or proof it is impossible to check whether the resulting function is a legitimate density or merely a score, or whether the sampler stays unbiased in high-dimensional image space. Optimizing p against the given dataset also raises the possibility that some of the reported gains come from dataset-specific fitting rather than the framework itself. Those two links are the least secured, exactly as the stress-test note flags. This work is aimed at people already working on post-hoc OOD methods and density estimation in reliable ML. A reader focused on that subfield would get value from the experiments and the proposed lens, even if they later disagree with the approach. It has enough structure and empirical scale to deserve a serious referee who can inspect the derivations and the exact procedure for choosing p. I would send it to peer review rather than desk reject.

Referee Report

3 major / 2 minor

Summary. The manuscript proposes ConjNorm for post-hoc OOD detection. Grounded in Bregman divergence, it extends considerations to an exponential family and uses a conjugation constraint to reframe density design as a search for the optimal norm coefficient p on the given dataset. An unbiased Monte Carlo importance-sampling estimator is introduced for the partition function to address normalization. Experiments across OOD benchmarks report new SOTA results, with FPR95 gains up to 13.25% on CIFAR-100 and 28.19% on ImageNet-1K over prior best methods.

Significance. If the Bregman theorem and unbiasedness of the estimator hold, the work supplies a unified theoretical lens on density-based OOD scores together with a tractable estimator, and the reported gains would represent a substantial empirical advance over existing logit-, distance-, and density-based detectors.

major comments (3)

[Theoretical framework] The Bregman-divergence theorem and conjugation constraint (theoretical development section): the central claim that this constraint legitimately reframes density estimation as a search over p and yields a valid density (rather than a fitted score) cannot be assessed without the explicit theorem statement, proof, and any assumptions on the exponential family.
[Estimator] Monte Carlo importance-sampling estimator for the partition function (method section): the assertion of unbiasedness is load-bearing for tractability and for attributing performance gains to the framework rather than to post-hoc fitting; the derivation, proposal distribution, and variance behavior in high-dimensional image regimes must be shown explicitly.
[Experiments] Selection of the norm coefficient p (experimental protocol): the method searches for optimal p against the given dataset; it is unclear whether this search is performed solely on ID training data or involves validation/test splits, which would introduce circularity and undermine the claim that scores retain independent grounding.

minor comments (2)

[Introduction/Theory] Notation for the exponential family and the resulting density should be introduced with explicit equations early in the theoretical section to aid readability.
[Experiments] Table captions and axis labels on the main result figures should explicitly state the evaluation metric (FPR95) and the baselines being compared.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the thoughtful and detailed review. The three major comments identify areas where additional explicit detail would strengthen the manuscript. We address each point below and indicate the corresponding revisions.

read point-by-point responses

Referee: [Theoretical framework] The Bregman-divergence theorem and conjugation constraint (theoretical development section): the central claim that this constraint legitimately reframes density estimation as a search over p and yields a valid density (rather than a fitted score) cannot be assessed without the explicit theorem statement, proof, and any assumptions on the exponential family.

Authors: Theorem 1 in Section 3 states the conjugation constraint and its consequence for reframing the density as a search over the norm coefficient p within the exponential family. The full proof appears in Appendix A, under the assumption that the base measure is positive and the natural parameter space is convex. To improve accessibility we will move a concise statement of the theorem and the key proof steps into the main text of Section 3. revision: yes
Referee: [Estimator] Monte Carlo importance-sampling estimator for the partition function (method section): the assertion of unbiasedness is load-bearing for tractability and for attributing performance gains to the framework rather than to post-hoc fitting; the derivation, proposal distribution, and variance behavior in high-dimensional image regimes must be shown explicitly.

Authors: Section 4.2 derives the unbiased estimator via importance sampling with the in-distribution empirical measure as the proposal; unbiasedness follows directly from the standard Monte Carlo identity. We will insert the complete derivation, the explicit proposal distribution, and a short analysis of variance scaling with dimension into the main text of Section 4.2, together with additional high-dimensional variance diagnostics in the supplementary material. revision: yes
Referee: [Experiments] Selection of the norm coefficient p (experimental protocol): the method searches for optimal p against the given dataset; it is unclear whether this search is performed solely on ID training data or involves validation/test splits, which would introduce circularity and undermine the claim that scores retain independent grounding.

Authors: The search for p is performed exclusively on the ID training set (using an internal validation split carved from the training data) and never touches OOD or test data. This protocol is stated in Section 5.1. We will add an explicit sentence clarifying that no OOD or test information is used during p selection. revision: yes

Circularity Check

0 steps flagged

No significant circularity detected

full rationale

The paper introduces its own Bregman divergence theorem and conjugation constraint to reframe density design as a search over norm coefficient p on the given (in-distribution) dataset, followed by an MC importance-sampling estimator for the partition function. This constitutes an explicit modeling choice and fitting procedure whose outputs are then evaluated on separate OOD benchmarks; the derivation chain does not reduce by construction to prior inputs, self-citations, or renamed known results. The central empirical claims rest on independent validation rather than tautological re-use of fitted quantities as predictions. No load-bearing self-citation or self-definitional step is present in the provided text.

Axiom & Free-Parameter Ledger

1 free parameters · 1 axioms · 0 invented entities

Review performed on abstract only; full derivations, assumptions, and experimental protocols unavailable. The ledger therefore records only elements explicitly named in the abstract.

free parameters (1)

norm coefficient p
Described as searched against the given dataset to obtain the optimal value for the density function.

axioms (1)

domain assumption Bregman divergence framework extends distribution considerations to an exponential family of distributions
Invoked as the grounding for the proposed theorem on conjugation constraints.

pith-pipeline@v0.9.0 · 5762 in / 1391 out tokens · 24901 ms · 2026-05-25T08:33:46.042749+00:00 · methodology

discussion (0)

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

On the Provable Importance of Gradients for Language-Assisted Image Clustering
cs.CV 2025-10 unverdicted novelty 6.0

GradNorm selects positive nouns via gradient magnitudes from cross-entropy loss, with an error bound proving it subsumes prior CLIP methods and delivers SOTA clustering results.

Reference graph

Works this paper leans on

87 extracted references · 87 canonical work pages · cited by 1 Pith paper · 6 internal anchors

[1]

Line: Out-of-distribution detection by leveraging important neurons

Yong Hyun Ahn, Gyeong-Moon Park, and Seong Tae Kim. Line: Out-of-distribution detection by leveraging important neurons. arXiv preprint arXiv:2303.13995, 2023

work page arXiv 2023
[2]

Building Normalizing Flows with Stochastic Interpolants

Michael S Albergo and Eric Vanden-Eijnden. Building normalizing flows with stochastic interpolants. arXiv preprint arXiv:2209.15571, 2022

work page internal anchor Pith review Pith/arXiv arXiv 2022
[3]

Exponential families and mixture families of probability distributions

Shun-ichi Amari. Exponential families and mixture families of probability distributions. In Information Geometry and Its Applications, pp.\ 31--49. Springer, 2016

work page 2016
[4]

Relative loss bounds for on-line density estimation with the exponential family of distributions

Katy S Azoury and Manfred K Warmuth. Relative loss bounds for on-line density estimation with the exponential family of distributions. Machine learning, 43: 0 211--246, 2001

work page 2001
[5]

On the effectiveness of out-of-distribution data in self-supervised long-tail learning

Jianhong Bai, Zuozhu Liu, Hualiang Wang, Jin Hao, Yang Feng, Huanpeng Chu, and Haoji Hu. On the effectiveness of out-of-distribution data in self-supervised long-tail learning. arXiv preprint arXiv:2306.04934, 2023

work page arXiv 2023
[6]

Density modeling of images using a generalized normalization transformation

Johannes Ball \'e , Valero Laparra, and Eero P Simoncelli. Density modeling of images using a generalized normalization transformation. arXiv preprint arXiv:1511.06281, 2015

work page arXiv 2015
[7]

Clustering with bregman divergences

Arindam Banerjee, Srujana Merugu, Inderjit S Dhillon, Joydeep Ghosh, and John Lafferty. Clustering with bregman divergences. Journal of machine learning research, 6 0 (10), 2005

work page 2005
[8]

Adaptive importance sampling for multilevel monte carlo euler method

Mohamed Ben Alaya, Kaouther Hajji, and Ahmed Kebaier. Adaptive importance sampling for multilevel monte carlo euler method. Stochastics, 95 0 (2): 0 303--327, 2023

work page 2023
[9]

The relaxation method of finding the common point of convex sets and its application to the solution of problems in convex programming

Lev M Bregman. The relaxation method of finding the common point of convex sets and its application to the solution of problems in convex programming. USSR computational mathematics and mathematical physics, 7 0 (3): 0 200--217, 1967

work page 1967
[10]

Fundamentals of statistical exponential families: with applications in statistical decision theory

Lawrence D Brown. Fundamentals of statistical exponential families: with applications in statistical decision theory. Ims, 1986

work page 1986
[11]

Learning imbalanced datasets with label-distribution-aware margin loss

Kaidi Cao, Colin Wei, Adrien Gaidon, Nikos Arechiga, and Tengyu Ma. Learning imbalanced datasets with label-distribution-aware margin loss. Advances in neural information processing systems, 32, 2019

work page 2019
[12]

Adversarial reciprocal points learning for open set recognition

Guangyao Chen, Peixi Peng, Xiangqian Wang, and Yonghong Tian. Adversarial reciprocal points learning for open set recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence, 44 0 (11): 0 8065--8081, 2021 a

work page 2021
[13]

Atom: Robustifying out-of-distribution detection using outlier mining

Jiefeng Chen, Yixuan Li, Xi Wu, Yingyu Liang, and Somesh Jha. Atom: Robustifying out-of-distribution detection using outlier mining. In Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13--17, 2021, Proceedings, Part III 21, pp.\ 430--445. Springer, 2021 b

work page 2021
[14]

Milestones in autonomous driving and intelligent vehicles: Survey of surveys

Long Chen, Yuchen Li, Chao Huang, Bai Li, Yang Xing, Daxin Tian, Li Li, Zhongxu Hu, Xiaoxiang Na, Zixuan Li, et al. Milestones in autonomous driving and intelligent vehicles: Survey of surveys. IEEE Transactions on Intelligent Vehicles, 8 0 (2): 0 1046--1056, 2022

work page 2022
[15]

A tutorial on kernel density estimation and recent advances

Yen-Chi Chen. A tutorial on kernel density estimation and recent advances. Biostatistics & Epidemiology, 1 0 (1): 0 161--187, 2017

work page 2017
[16]

Bregman deviations of generic exponential families

Sayak Ray Chowdhury, Patrick Saux, Odalric Maillard, and Aditya Gopalan. Bregman deviations of generic exponential families. In The Thirty Sixth Annual Conference on Learning Theory, pp.\ 394--449. PMLR, 2023

work page 2023
[17]

Describing textures in the wild

Mircea Cimpoi, Subhransu Maji, Iasonas Kokkinos, Sammy Mohamed, and Andrea Vedaldi. Describing textures in the wild. In Proceedings of the IEEE conference on computer vision and pattern recognition, pp.\ 3606--3613, 2014

work page 2014
[18]

Density estimation using Real NVP

Laurent Dinh, Jascha Sohl-Dickstein, and Samy Bengio. Density estimation using real nvp. arXiv preprint arXiv:1605.08803, 2016

work page internal anchor Pith review Pith/arXiv arXiv 2016
[19]

Extremely simple activation shaping for out-of-distribution detection

Andrija Djurisic, Nebojsa Bozanic, Arjun Ashok, and Rosanne Liu. Extremely simple activation shaping for out-of-distribution detection. arXiv preprint arXiv:2209.09858, 2022

work page arXiv 2022
[20]

Vos: Learning what you don't know by virtual outlier synthesis

Xuefeng Du, Zhaoning Wang, Mu Cai, and Yixuan Li. Vos: Learning what you don't know by virtual outlier synthesis. arXiv preprint arXiv:2202.01197, 2022

work page arXiv 2022
[21]

Is out-of-distribution detection learnable? In NeurIPS, 2022

Zhen Fang, Yixuan Li, Jie Lu, Jiahua Dong, Bo Han, and Feng Liu. Is out-of-distribution detection learnable? In NeurIPS, 2022

work page 2022
[22]

Relative expected instantaneous loss bounds

J \"u rgen Forster and Manfred K Warmuth. Relative expected instantaneous loss bounds. Journal of Computer and System Sciences, 64 0 (1): 0 76--102, 2002

work page 2002
[23]

A review on speech recognition technique

Santosh K Gaikwad, Bharti W Gawali, and Pravin Yannawar. A review on speech recognition technique. International Journal of Computer Applications, 10 0 (3): 0 16--24, 2010

work page 2010
[24]

Made: Masked autoencoder for distribution estimation

Mathieu Germain, Karol Gregor, Iain Murray, and Hugo Larochelle. Made: Masked autoencoder for distribution estimation. In International conference on machine learning, pp.\ 881--889. PMLR, 2015

work page 2015
[25]

Flow-gan: Combining maximum likelihood and adversarial learning in generative models

Aditya Grover, Manik Dhar, and Stefano Ermon. Flow-gan: Combining maximum likelihood and adversarial learning in generative models. In Proceedings of the AAAI conference on artificial intelligence, volume 32, 2018

work page 2018
[26]

Noise-contrastive estimation of unnormalized statistical models, with applications to natural image statistics

Michael U Gutmann and Aapo Hyv \"a rinen. Noise-contrastive estimation of unnormalized statistical models, with applications to natural image statistics. Journal of machine learning research, 13 0 (2), 2012 a

work page 2012
[27]

Noise-contrastive estimation of unnormalized statistical models, with applications to natural image statistics

Michael U Gutmann and Aapo Hyv \"a rinen. Noise-contrastive estimation of unnormalized statistical models, with applications to natural image statistics. Journal of machine learning research, 13 0 (2), 2012 b

work page 2012
[28]

Deep residual learning for image recognition

Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. Deep residual learning for image recognition. In Proceedings of the IEEE conference on computer vision and pattern recognition, pp.\ 770--778, 2016

work page 2016
[29]

A Baseline for Detecting Misclassified and Out-of-Distribution Examples in Neural Networks

Dan Hendrycks and Kevin Gimpel. A baseline for detecting misclassified and out-of-distribution examples in neural networks. arXiv preprint arXiv:1610.02136, 2016

work page internal anchor Pith review Pith/arXiv arXiv 2016
[30]

Scaling out-of-distribution detection for real-world settings

Dan Hendrycks, Steven Basart, Mantas Mazeika, Andy Zou, Joe Kwon, Mohammadreza Mostajabi, Jacob Steinhardt, and Dawn Song. Scaling out-of-distribution detection for real-world settings. arXiv preprint arXiv:1911.11132, 2019

work page arXiv 1911
[31]

Densely connected convolutional networks

Gao Huang, Zhuang Liu, Laurens Van Der Maaten, and Kilian Q Weinberger. Densely connected convolutional networks. In Proceedings of the IEEE conference on computer vision and pattern recognition, pp.\ 4700--4708, 2017

work page 2017
[32]

Mos: Towards scaling out-of-distribution detection for large semantic space

Rui Huang and Yixuan Li. Mos: Towards scaling out-of-distribution detection for large semantic space. 2021 ieee. In CVF Conference on Computer Vision and Pattern Recognition (CVPR) pp, pp.\ 8706--8715, 2021

work page 2021
[33]

A historical perspective of speech recognition

Xuedong Huang, James Baker, and Raj Reddy. A historical perspective of speech recognition. Communications of the ACM, 57 0 (1): 0 94--103, 2014

work page 2014
[34]

Detecting out-of-distribution data through in-distribution class prior

Xue Jiang, Feng Liu, Zhen Fang, Hong Chen, Tongliang Liu, Feng Zheng, and Bo Han. Detecting out-of-distribution data through in-distribution class prior. 2023

work page 2023
[35]

Training ood detectors in their natural habitats

Julian Katz-Samuels, Julia B Nakhleh, Robert Nowak, and Yixuan Li. Training ood detectors in their natural habitats. In International Conference on Machine Learning, pp.\ 10848--10865. PMLR, 2022

work page 2022
[36]

Supervised contrastive learning

Prannay Khosla, Piotr Teterwak, Chen Wang, Aaron Sarna, Yonglong Tian, Phillip Isola, Aaron Maschinot, Ce Liu, and Dilip Krishnan. Supervised contrastive learning. Advances in neural information processing systems, 33: 0 18661--18673, 2020

work page 2020
[37]

Robust kernel density estimation

JooSeuk Kim and Clayton D Scott. Robust kernel density estimation. The Journal of Machine Learning Research, 13 0 (1): 0 2529--2565, 2012

work page 2012
[38]

Learning multiple layers of features from tiny images

Alex Krizhevsky, Geoffrey Hinton, et al. Learning multiple layers of features from tiny images. 2009

work page 2009
[39]

Imagenet classification with deep convolutional neural networks

Alex Krizhevsky, Ilya Sutskever, and Geoffrey E Hinton. Imagenet classification with deep convolutional neural networks. Advances in neural information processing systems, 25, 2012

work page 2012
[40]

Tiny imagenet visual recognition challenge

Ya Le and Xuan Yang. Tiny imagenet visual recognition challenge. 2015

work page 2015
[41]

Training Confidence-calibrated Classifiers for Detecting Out-of-Distribution Samples

Kimin Lee, Honglak Lee, Kibok Lee, and Jinwoo Shin. Training confidence-calibrated classifiers for detecting out-of-distribution samples. arXiv preprint arXiv:1711.09325, 2017

work page internal anchor Pith review Pith/arXiv arXiv 2017
[42]

A simple unified framework for detecting out-of-distribution samples and adversarial attacks

Kimin Lee, Kibok Lee, Honglak Lee, and Jinwoo Shin. A simple unified framework for detecting out-of-distribution samples and adversarial attacks. Advances in neural information processing systems, 31, 2018

work page 2018
[43]

Your diffusion model is secretly a zero-shot classifier

Alexander C Li, Mihir Prabhudesai, Shivam Duggal, Ellis Brown, and Deepak Pathak. Your diffusion model is secretly a zero-shot classifier. arXiv preprint arXiv:2303.16203, 2023

work page arXiv 2023
[44]

Enhancing the reliability of out-of-distribution image detection in neural networks

Shiyu Liang, Yixuan Li, and Rayadurgam Srikant. Enhancing the reliability of out-of-distribution image detection in neural networks. arXiv preprint arXiv:1706.02690, 2017

work page arXiv 2017
[45]

Estimating the partition function by discriminance sampling

Qiang Liu, Jian Peng, Alexander Ihler, and John Fisher III. Estimating the partition function by discriminance sampling. In Proceedings of the Thirty-First Conference on Uncertainty in Artificial Intelligence, pp.\ 514--522, 2015

work page 2015
[46]

Energy-based out-of-distribution detection

Weitang Liu, Xiaoyun Wang, John Owens, and Yixuan Li. Energy-based out-of-distribution detection. Advances in neural information processing systems, 33: 0 21464--21475, 2020

work page 2020
[47]

Class-incremental learning: survey and performance evaluation on image classification

Marc Masana, Xialei Liu, Bart omiej Twardowski, Mikel Menta, Andrew D Bagdanov, and Joost Van De Weijer. Class-incremental learning: survey and performance evaluation on image classification. IEEE Transactions on Pattern Analysis and Machine Intelligence, 45 0 (5): 0 5513--5533, 2022

work page 2022
[48]

Poem: Out-of-distribution detection with posterior sampling

Yifei Ming, Ying Fan, and Yixuan Li. Poem: Out-of-distribution detection with posterior sampling. In International Conference on Machine Learning, pp.\ 15650--15665. PMLR, 2022 a

work page 2022
[49]

How to exploit hyperspherical embeddings for out-of-distribution detection? arXiv preprint arXiv:2203.04450, 2022 b

Yifei Ming, Yiyou Sun, Ousmane Dia, and Yixuan Li. How to exploit hyperspherical embeddings for out-of-distribution detection? arXiv preprint arXiv:2203.04450, 2022 b

work page arXiv 2022
[50]

Learning word embeddings efficiently with noise-contrastive estimation

Andriy Mnih and Koray Kavukcuoglu. Learning word embeddings efficiently with noise-contrastive estimation. Advances in neural information processing systems, 26, 2013

work page 2013
[51]

Provable guarantees for understanding out-of-distribution detection

Peyman Morteza and Yixuan Li. Provable guarantees for understanding out-of-distribution detection. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 36, pp.\ 7831--7840, 2022

work page 2022
[52]

Integral probability metrics and their generating classes of functions

Alfred M \"u ller. Integral probability metrics and their generating classes of functions. Advances in applied probability, 29 0 (2): 0 429--443, 1997

work page 1997
[53]

Reading digits in natural images with unsupervised feature learning

Yuval Netzer, Tao Wang, Adam Coates, Alessandro Bissacco, Bo Wu, and Andrew Y Ng. Reading digits in natural images with unsupervised feature learning. 2011

work page 2011
[54]

Masked autoregressive flow for density estimation

George Papamakarios, Theo Pavlakou, and Iain Murray. Masked autoregressive flow for density estimation. Advances in neural information processing systems, 30, 2017

work page 2017
[55]

Pytorch: An imperative style, high-performance deep learning library

Adam Paszke, Sam Gross, Francisco Massa, Adam Lerer, James Bradbury, Gregory Chanan, Trevor Killeen, Zeming Lin, Natalia Gimelshein, Luca Antiga, et al. Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems, 32, 2019

work page 2019
[56]

Scalable diffusion models with transformers

William Peebles and Saining Xie. Scalable diffusion models with transformers. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pp.\ 4195--4205, 2023

work page 2023
[57]

Variational inference with normalizing flows

Danilo Rezende and Shakir Mohamed. Variational inference with normalizing flows. In International conference on machine learning, pp.\ 1530--1538. PMLR, 2015

work page 2015
[58]

High-resolution image synthesis with latent diffusion models

Robin Rombach, Andreas Blattmann, Dominik Lorenz, Patrick Esser, and Bj \"o rn Ommer. High-resolution image synthesis with latent diffusion models. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp.\ 10684--10695, 2022

work page 2022
[59]

Mobilenetv2: Inverted residuals and linear bottlenecks

Mark Sandler, Andrew Howard, Menglong Zhu, Andrey Zhmoginov, and Liang-Chieh Chen. Mobilenetv2: Inverted residuals and linear bottlenecks. In Proceedings of the IEEE conference on computer vision and pattern recognition, pp.\ 4510--4520, 2018

work page 2018
[60]

A survey on approaches of object detection

Sanjivani Shantaiya, Keshri Verma, and Kamal Mehta. A survey on approaches of object detection. International Journal of Computer Applications, 65 0 (18), 2013

work page 2013
[61]

Dice: Leveraging sparsification for out-of-distribution detection

Yiyou Sun and Yixuan Li. Dice: Leveraging sparsification for out-of-distribution detection. In European Conference on Computer Vision, pp.\ 691--708. Springer, 2022

work page 2022
[62]

React: Out-of-distribution detection with rectified activations

Yiyou Sun, Chuan Guo, and Yixuan Li. React: Out-of-distribution detection with rectified activations. Advances in Neural Information Processing Systems, 34: 0 144--157, 2021

work page 2021
[63]

Out-of-distribution detection with deep nearest neighbors

Yiyou Sun, Yifei Ming, Xiaojin Zhu, and Yixuan Li. Out-of-distribution detection with deep nearest neighbors. In International Conference on Machine Learning, pp.\ 20827--20840. PMLR, 2022

work page 2022
[64]

Csi: Novelty detection via contrastive learning on distributionally shifted instances

Jihoon Tack, Sangwoo Mo, Jongheon Jeong, and Jinwoo Shin. Csi: Novelty detection via contrastive learning on distributionally shifted instances. Advances in neural information processing systems, 33: 0 11839--11852, 2020

work page 2020
[65]

Importance sampling: a review

Surya T Tokdar and Robert E Kass. Importance sampling: a review. Wiley Interdisciplinary Reviews: Computational Statistics, 2 0 (1): 0 54--60, 2010

work page 2010
[66]

Tsybakov

Alexandre B. Tsybakov. Introduction to nonparametric estimation. 2008

work page 2008
[67]

Neural autoregressive distribution estimation

Benigno Uria, Marc-Alexandre C \^o t \'e , Karol Gregor, Iain Murray, and Hugo Larochelle. Neural autoregressive distribution estimation. The Journal of Machine Learning Research, 17 0 (1): 0 7184--7220, 2016

work page 2016
[68]

Partial and asymmetric contrastive learning for out-of-distribution detection in long-tailed recognition

Haotao Wang, Aston Zhang, Yi Zhu, Shuai Zheng, Mu Li, Alex J Smola, and Zhangyang Wang. Partial and asymmetric contrastive learning for out-of-distribution detection in long-tailed recognition. In International Conference on Machine Learning, pp.\ 23446--23458. PMLR, 2022

work page 2022
[69]

Out-of-distribution detection with implicit outlier transformation

Qizhou Wang, Junjie Ye, Feng Liu, Quanyu Dai, Marcus Kalander, Tongliang Liu, Jianye Hao, and Bo Han. Out-of-distribution detection with implicit outlier transformation. arXiv preprint arXiv:2303.05033, 2023

work page arXiv 2023
[70]

Mitigating neural network overconfidence with logit normalization

Hongxin Wei, Renchunzi Xie, Hao Cheng, Lei Feng, Bo An, and Yixuan Li. Mitigating neural network overconfidence with logit normalization. In International Conference on Machine Learning, pp.\ 23631--23644. PMLR, 2022

work page 2022
[71]

Unsupervised feature learning via non-parametric instance discrimination

Zhirong Wu, Yuanjun Xiong, Stella X Yu, and Dahua Lin. Unsupervised feature learning via non-parametric instance discrimination. In Proceedings of the IEEE conference on computer vision and pattern recognition, pp.\ 3733--3742, 2018

work page 2018
[72]

Sun database: Large-scale scene recognition from abbey to zoo

Jianxiong Xiao, James Hays, Krista A Ehinger, Aude Oliva, and Antonio Torralba. Sun database: Large-scale scene recognition from abbey to zoo. In 2010 IEEE computer society conference on computer vision and pattern recognition, pp.\ 3485--3492. IEEE, 2010 a

work page 2010
[73]

Sun database: Large-scale scene recognition from abbey to zoo

Jianxiong Xiao, James Hays, Krista A Ehinger, Aude Oliva, and Antonio Torralba. Sun database: Large-scale scene recognition from abbey to zoo. In 2010 IEEE computer society conference on computer vision and pattern recognition, pp.\ 3485--3492. IEEE, 2010 b

work page 2010
[74]

TurkerGaze: Crowdsourcing Saliency with Webcam based Eye Tracking

Pingmei Xu, Krista A Ehinger, Yinda Zhang, Adam Finkelstein, Sanjeev R Kulkarni, and Jianxiong Xiao. Turkergaze: Crowdsourcing saliency with webcam based eye tracking. arXiv preprint arXiv:1504.06755, 2015

work page internal anchor Pith review Pith/arXiv arXiv 2015
[75]

LSUN: Construction of a Large-scale Image Dataset using Deep Learning with Humans in the Loop

Fisher Yu, Ari Seff, Yinda Zhang, Shuran Song, Thomas Funkhouser, and Jianxiong Xiao. Lsun: Construction of a large-scale image dataset using deep learning with humans in the loop. arXiv preprint arXiv:1506.03365, 2015

work page internal anchor Pith review Pith/arXiv arXiv 2015
[76]

Out-of-distribution detection based on in-distribution data patterns memorization with modern hopfield energy

Jinsong Zhang, Qiang Fu, Xu Chen, Lun Du, Zelin Li, Gang Wang, Shi Han, Dongmei Zhang, et al. Out-of-distribution detection based on in-distribution data patterns memorization with modern hopfield energy. In The Eleventh International Conference on Learning Representations, 2022

work page 2022
[77]

Understanding failures in out-of-distribution detection with deep generative models

Lily Zhang, Mark Goldstein, and Rajesh Ranganath. Understanding failures in out-of-distribution detection with deep generative models. In International Conference on Machine Learning, pp.\ 12427--12436. PMLR, 2021

work page 2021
[78]

Object detection with deep learning: A review

Zhong-Qiu Zhao, Peng Zheng, Shou-tao Xu, and Xindong Wu. Object detection with deep learning: A review. IEEE transactions on neural networks and learning systems, 30 0 (11): 0 3212--3232, 2019

work page 2019
[79]

Improving calibration for long-tailed recognition

Zhisheng Zhong, Jiequan Cui, Shu Liu, and Jiaya Jia. Improving calibration for long-tailed recognition. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp.\ 16489--16498, 2021

work page 2021
[80]

Places: A 10 million image database for scene recognition

Bolei Zhou, Agata Lapedriza, Aditya Khosla, Aude Oliva, and Antonio Torralba. Places: A 10 million image database for scene recognition. IEEE transactions on pattern analysis and machine intelligence, 40 0 (6): 0 1452--1464, 2017

work page 2017

Showing first 80 references.

[1] [1]

Line: Out-of-distribution detection by leveraging important neurons

Yong Hyun Ahn, Gyeong-Moon Park, and Seong Tae Kim. Line: Out-of-distribution detection by leveraging important neurons. arXiv preprint arXiv:2303.13995, 2023

work page arXiv 2023

[2] [2]

Building Normalizing Flows with Stochastic Interpolants

Michael S Albergo and Eric Vanden-Eijnden. Building normalizing flows with stochastic interpolants. arXiv preprint arXiv:2209.15571, 2022

work page internal anchor Pith review Pith/arXiv arXiv 2022

[3] [3]

Exponential families and mixture families of probability distributions

Shun-ichi Amari. Exponential families and mixture families of probability distributions. In Information Geometry and Its Applications, pp.\ 31--49. Springer, 2016

work page 2016

[4] [4]

Relative loss bounds for on-line density estimation with the exponential family of distributions

Katy S Azoury and Manfred K Warmuth. Relative loss bounds for on-line density estimation with the exponential family of distributions. Machine learning, 43: 0 211--246, 2001

work page 2001

[5] [5]

On the effectiveness of out-of-distribution data in self-supervised long-tail learning

Jianhong Bai, Zuozhu Liu, Hualiang Wang, Jin Hao, Yang Feng, Huanpeng Chu, and Haoji Hu. On the effectiveness of out-of-distribution data in self-supervised long-tail learning. arXiv preprint arXiv:2306.04934, 2023

work page arXiv 2023

[6] [6]

Density modeling of images using a generalized normalization transformation

Johannes Ball \'e , Valero Laparra, and Eero P Simoncelli. Density modeling of images using a generalized normalization transformation. arXiv preprint arXiv:1511.06281, 2015

work page arXiv 2015

[7] [7]

Clustering with bregman divergences

Arindam Banerjee, Srujana Merugu, Inderjit S Dhillon, Joydeep Ghosh, and John Lafferty. Clustering with bregman divergences. Journal of machine learning research, 6 0 (10), 2005

work page 2005

[8] [8]

Adaptive importance sampling for multilevel monte carlo euler method

Mohamed Ben Alaya, Kaouther Hajji, and Ahmed Kebaier. Adaptive importance sampling for multilevel monte carlo euler method. Stochastics, 95 0 (2): 0 303--327, 2023

work page 2023

[9] [9]

The relaxation method of finding the common point of convex sets and its application to the solution of problems in convex programming

Lev M Bregman. The relaxation method of finding the common point of convex sets and its application to the solution of problems in convex programming. USSR computational mathematics and mathematical physics, 7 0 (3): 0 200--217, 1967

work page 1967

[10] [10]

Fundamentals of statistical exponential families: with applications in statistical decision theory

Lawrence D Brown. Fundamentals of statistical exponential families: with applications in statistical decision theory. Ims, 1986

work page 1986

[11] [11]

Learning imbalanced datasets with label-distribution-aware margin loss

Kaidi Cao, Colin Wei, Adrien Gaidon, Nikos Arechiga, and Tengyu Ma. Learning imbalanced datasets with label-distribution-aware margin loss. Advances in neural information processing systems, 32, 2019

work page 2019

[12] [12]

Adversarial reciprocal points learning for open set recognition

Guangyao Chen, Peixi Peng, Xiangqian Wang, and Yonghong Tian. Adversarial reciprocal points learning for open set recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence, 44 0 (11): 0 8065--8081, 2021 a

work page 2021

[13] [13]

Atom: Robustifying out-of-distribution detection using outlier mining

Jiefeng Chen, Yixuan Li, Xi Wu, Yingyu Liang, and Somesh Jha. Atom: Robustifying out-of-distribution detection using outlier mining. In Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13--17, 2021, Proceedings, Part III 21, pp.\ 430--445. Springer, 2021 b

work page 2021

[14] [14]

Milestones in autonomous driving and intelligent vehicles: Survey of surveys

Long Chen, Yuchen Li, Chao Huang, Bai Li, Yang Xing, Daxin Tian, Li Li, Zhongxu Hu, Xiaoxiang Na, Zixuan Li, et al. Milestones in autonomous driving and intelligent vehicles: Survey of surveys. IEEE Transactions on Intelligent Vehicles, 8 0 (2): 0 1046--1056, 2022

work page 2022

[15] [15]

A tutorial on kernel density estimation and recent advances

Yen-Chi Chen. A tutorial on kernel density estimation and recent advances. Biostatistics & Epidemiology, 1 0 (1): 0 161--187, 2017

work page 2017

[16] [16]

Bregman deviations of generic exponential families

Sayak Ray Chowdhury, Patrick Saux, Odalric Maillard, and Aditya Gopalan. Bregman deviations of generic exponential families. In The Thirty Sixth Annual Conference on Learning Theory, pp.\ 394--449. PMLR, 2023

work page 2023

[17] [17]

Describing textures in the wild

Mircea Cimpoi, Subhransu Maji, Iasonas Kokkinos, Sammy Mohamed, and Andrea Vedaldi. Describing textures in the wild. In Proceedings of the IEEE conference on computer vision and pattern recognition, pp.\ 3606--3613, 2014

work page 2014

[18] [18]

Density estimation using Real NVP

Laurent Dinh, Jascha Sohl-Dickstein, and Samy Bengio. Density estimation using real nvp. arXiv preprint arXiv:1605.08803, 2016

work page internal anchor Pith review Pith/arXiv arXiv 2016

[19] [19]

Extremely simple activation shaping for out-of-distribution detection

Andrija Djurisic, Nebojsa Bozanic, Arjun Ashok, and Rosanne Liu. Extremely simple activation shaping for out-of-distribution detection. arXiv preprint arXiv:2209.09858, 2022

work page arXiv 2022

[20] [20]

Vos: Learning what you don't know by virtual outlier synthesis

Xuefeng Du, Zhaoning Wang, Mu Cai, and Yixuan Li. Vos: Learning what you don't know by virtual outlier synthesis. arXiv preprint arXiv:2202.01197, 2022

work page arXiv 2022

[21] [21]

Is out-of-distribution detection learnable? In NeurIPS, 2022

Zhen Fang, Yixuan Li, Jie Lu, Jiahua Dong, Bo Han, and Feng Liu. Is out-of-distribution detection learnable? In NeurIPS, 2022

work page 2022

[22] [22]

Relative expected instantaneous loss bounds

J \"u rgen Forster and Manfred K Warmuth. Relative expected instantaneous loss bounds. Journal of Computer and System Sciences, 64 0 (1): 0 76--102, 2002

work page 2002

[23] [23]

A review on speech recognition technique

Santosh K Gaikwad, Bharti W Gawali, and Pravin Yannawar. A review on speech recognition technique. International Journal of Computer Applications, 10 0 (3): 0 16--24, 2010

work page 2010

[24] [24]

Made: Masked autoencoder for distribution estimation

Mathieu Germain, Karol Gregor, Iain Murray, and Hugo Larochelle. Made: Masked autoencoder for distribution estimation. In International conference on machine learning, pp.\ 881--889. PMLR, 2015

work page 2015

[25] [25]

Flow-gan: Combining maximum likelihood and adversarial learning in generative models

Aditya Grover, Manik Dhar, and Stefano Ermon. Flow-gan: Combining maximum likelihood and adversarial learning in generative models. In Proceedings of the AAAI conference on artificial intelligence, volume 32, 2018

work page 2018

[26] [26]

Noise-contrastive estimation of unnormalized statistical models, with applications to natural image statistics

Michael U Gutmann and Aapo Hyv \"a rinen. Noise-contrastive estimation of unnormalized statistical models, with applications to natural image statistics. Journal of machine learning research, 13 0 (2), 2012 a

work page 2012

[27] [27]

Noise-contrastive estimation of unnormalized statistical models, with applications to natural image statistics

Michael U Gutmann and Aapo Hyv \"a rinen. Noise-contrastive estimation of unnormalized statistical models, with applications to natural image statistics. Journal of machine learning research, 13 0 (2), 2012 b

work page 2012

[28] [28]

Deep residual learning for image recognition

Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. Deep residual learning for image recognition. In Proceedings of the IEEE conference on computer vision and pattern recognition, pp.\ 770--778, 2016

work page 2016

[29] [29]

A Baseline for Detecting Misclassified and Out-of-Distribution Examples in Neural Networks

Dan Hendrycks and Kevin Gimpel. A baseline for detecting misclassified and out-of-distribution examples in neural networks. arXiv preprint arXiv:1610.02136, 2016

work page internal anchor Pith review Pith/arXiv arXiv 2016

[30] [30]

Scaling out-of-distribution detection for real-world settings

Dan Hendrycks, Steven Basart, Mantas Mazeika, Andy Zou, Joe Kwon, Mohammadreza Mostajabi, Jacob Steinhardt, and Dawn Song. Scaling out-of-distribution detection for real-world settings. arXiv preprint arXiv:1911.11132, 2019

work page arXiv 1911

[31] [31]

Densely connected convolutional networks

Gao Huang, Zhuang Liu, Laurens Van Der Maaten, and Kilian Q Weinberger. Densely connected convolutional networks. In Proceedings of the IEEE conference on computer vision and pattern recognition, pp.\ 4700--4708, 2017

work page 2017

[32] [32]

Mos: Towards scaling out-of-distribution detection for large semantic space

Rui Huang and Yixuan Li. Mos: Towards scaling out-of-distribution detection for large semantic space. 2021 ieee. In CVF Conference on Computer Vision and Pattern Recognition (CVPR) pp, pp.\ 8706--8715, 2021

work page 2021

[33] [33]

A historical perspective of speech recognition

Xuedong Huang, James Baker, and Raj Reddy. A historical perspective of speech recognition. Communications of the ACM, 57 0 (1): 0 94--103, 2014

work page 2014

[34] [34]

Detecting out-of-distribution data through in-distribution class prior

Xue Jiang, Feng Liu, Zhen Fang, Hong Chen, Tongliang Liu, Feng Zheng, and Bo Han. Detecting out-of-distribution data through in-distribution class prior. 2023

work page 2023

[35] [35]

Training ood detectors in their natural habitats

Julian Katz-Samuels, Julia B Nakhleh, Robert Nowak, and Yixuan Li. Training ood detectors in their natural habitats. In International Conference on Machine Learning, pp.\ 10848--10865. PMLR, 2022

work page 2022

[36] [36]

Supervised contrastive learning

Prannay Khosla, Piotr Teterwak, Chen Wang, Aaron Sarna, Yonglong Tian, Phillip Isola, Aaron Maschinot, Ce Liu, and Dilip Krishnan. Supervised contrastive learning. Advances in neural information processing systems, 33: 0 18661--18673, 2020

work page 2020

[37] [37]

Robust kernel density estimation

JooSeuk Kim and Clayton D Scott. Robust kernel density estimation. The Journal of Machine Learning Research, 13 0 (1): 0 2529--2565, 2012

work page 2012

[38] [38]

Learning multiple layers of features from tiny images

Alex Krizhevsky, Geoffrey Hinton, et al. Learning multiple layers of features from tiny images. 2009

work page 2009

[39] [39]

Imagenet classification with deep convolutional neural networks

Alex Krizhevsky, Ilya Sutskever, and Geoffrey E Hinton. Imagenet classification with deep convolutional neural networks. Advances in neural information processing systems, 25, 2012

work page 2012

[40] [40]

Tiny imagenet visual recognition challenge

Ya Le and Xuan Yang. Tiny imagenet visual recognition challenge. 2015

work page 2015

[41] [41]

Training Confidence-calibrated Classifiers for Detecting Out-of-Distribution Samples

Kimin Lee, Honglak Lee, Kibok Lee, and Jinwoo Shin. Training confidence-calibrated classifiers for detecting out-of-distribution samples. arXiv preprint arXiv:1711.09325, 2017

work page internal anchor Pith review Pith/arXiv arXiv 2017

[42] [42]

A simple unified framework for detecting out-of-distribution samples and adversarial attacks

Kimin Lee, Kibok Lee, Honglak Lee, and Jinwoo Shin. A simple unified framework for detecting out-of-distribution samples and adversarial attacks. Advances in neural information processing systems, 31, 2018

work page 2018

[43] [43]

Your diffusion model is secretly a zero-shot classifier

Alexander C Li, Mihir Prabhudesai, Shivam Duggal, Ellis Brown, and Deepak Pathak. Your diffusion model is secretly a zero-shot classifier. arXiv preprint arXiv:2303.16203, 2023

work page arXiv 2023

[44] [44]

Enhancing the reliability of out-of-distribution image detection in neural networks

Shiyu Liang, Yixuan Li, and Rayadurgam Srikant. Enhancing the reliability of out-of-distribution image detection in neural networks. arXiv preprint arXiv:1706.02690, 2017

work page arXiv 2017

[45] [45]

Estimating the partition function by discriminance sampling

Qiang Liu, Jian Peng, Alexander Ihler, and John Fisher III. Estimating the partition function by discriminance sampling. In Proceedings of the Thirty-First Conference on Uncertainty in Artificial Intelligence, pp.\ 514--522, 2015

work page 2015

[46] [46]

Energy-based out-of-distribution detection

Weitang Liu, Xiaoyun Wang, John Owens, and Yixuan Li. Energy-based out-of-distribution detection. Advances in neural information processing systems, 33: 0 21464--21475, 2020

work page 2020

[47] [47]

Class-incremental learning: survey and performance evaluation on image classification

Marc Masana, Xialei Liu, Bart omiej Twardowski, Mikel Menta, Andrew D Bagdanov, and Joost Van De Weijer. Class-incremental learning: survey and performance evaluation on image classification. IEEE Transactions on Pattern Analysis and Machine Intelligence, 45 0 (5): 0 5513--5533, 2022

work page 2022

[48] [48]

Poem: Out-of-distribution detection with posterior sampling

Yifei Ming, Ying Fan, and Yixuan Li. Poem: Out-of-distribution detection with posterior sampling. In International Conference on Machine Learning, pp.\ 15650--15665. PMLR, 2022 a

work page 2022

[49] [49]

How to exploit hyperspherical embeddings for out-of-distribution detection? arXiv preprint arXiv:2203.04450, 2022 b

Yifei Ming, Yiyou Sun, Ousmane Dia, and Yixuan Li. How to exploit hyperspherical embeddings for out-of-distribution detection? arXiv preprint arXiv:2203.04450, 2022 b

work page arXiv 2022

[50] [50]

Learning word embeddings efficiently with noise-contrastive estimation

Andriy Mnih and Koray Kavukcuoglu. Learning word embeddings efficiently with noise-contrastive estimation. Advances in neural information processing systems, 26, 2013

work page 2013

[51] [51]

Provable guarantees for understanding out-of-distribution detection

Peyman Morteza and Yixuan Li. Provable guarantees for understanding out-of-distribution detection. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 36, pp.\ 7831--7840, 2022

work page 2022

[52] [52]

Integral probability metrics and their generating classes of functions

Alfred M \"u ller. Integral probability metrics and their generating classes of functions. Advances in applied probability, 29 0 (2): 0 429--443, 1997

work page 1997

[53] [53]

Reading digits in natural images with unsupervised feature learning

Yuval Netzer, Tao Wang, Adam Coates, Alessandro Bissacco, Bo Wu, and Andrew Y Ng. Reading digits in natural images with unsupervised feature learning. 2011

work page 2011

[54] [54]

Masked autoregressive flow for density estimation

George Papamakarios, Theo Pavlakou, and Iain Murray. Masked autoregressive flow for density estimation. Advances in neural information processing systems, 30, 2017

work page 2017

[55] [55]

Pytorch: An imperative style, high-performance deep learning library

Adam Paszke, Sam Gross, Francisco Massa, Adam Lerer, James Bradbury, Gregory Chanan, Trevor Killeen, Zeming Lin, Natalia Gimelshein, Luca Antiga, et al. Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems, 32, 2019

work page 2019

[56] [56]

Scalable diffusion models with transformers

William Peebles and Saining Xie. Scalable diffusion models with transformers. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pp.\ 4195--4205, 2023

work page 2023

[57] [57]

Variational inference with normalizing flows

Danilo Rezende and Shakir Mohamed. Variational inference with normalizing flows. In International conference on machine learning, pp.\ 1530--1538. PMLR, 2015

work page 2015

[58] [58]

High-resolution image synthesis with latent diffusion models

Robin Rombach, Andreas Blattmann, Dominik Lorenz, Patrick Esser, and Bj \"o rn Ommer. High-resolution image synthesis with latent diffusion models. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp.\ 10684--10695, 2022

work page 2022

[59] [59]

Mobilenetv2: Inverted residuals and linear bottlenecks

Mark Sandler, Andrew Howard, Menglong Zhu, Andrey Zhmoginov, and Liang-Chieh Chen. Mobilenetv2: Inverted residuals and linear bottlenecks. In Proceedings of the IEEE conference on computer vision and pattern recognition, pp.\ 4510--4520, 2018

work page 2018

[60] [60]

A survey on approaches of object detection

Sanjivani Shantaiya, Keshri Verma, and Kamal Mehta. A survey on approaches of object detection. International Journal of Computer Applications, 65 0 (18), 2013

work page 2013

[61] [61]

Dice: Leveraging sparsification for out-of-distribution detection

Yiyou Sun and Yixuan Li. Dice: Leveraging sparsification for out-of-distribution detection. In European Conference on Computer Vision, pp.\ 691--708. Springer, 2022

work page 2022

[62] [62]

React: Out-of-distribution detection with rectified activations

Yiyou Sun, Chuan Guo, and Yixuan Li. React: Out-of-distribution detection with rectified activations. Advances in Neural Information Processing Systems, 34: 0 144--157, 2021

work page 2021

[63] [63]

Out-of-distribution detection with deep nearest neighbors

Yiyou Sun, Yifei Ming, Xiaojin Zhu, and Yixuan Li. Out-of-distribution detection with deep nearest neighbors. In International Conference on Machine Learning, pp.\ 20827--20840. PMLR, 2022

work page 2022

[64] [64]

Csi: Novelty detection via contrastive learning on distributionally shifted instances

Jihoon Tack, Sangwoo Mo, Jongheon Jeong, and Jinwoo Shin. Csi: Novelty detection via contrastive learning on distributionally shifted instances. Advances in neural information processing systems, 33: 0 11839--11852, 2020

work page 2020

[65] [65]

Importance sampling: a review

Surya T Tokdar and Robert E Kass. Importance sampling: a review. Wiley Interdisciplinary Reviews: Computational Statistics, 2 0 (1): 0 54--60, 2010

work page 2010

[66] [66]

Tsybakov

Alexandre B. Tsybakov. Introduction to nonparametric estimation. 2008

work page 2008

[67] [67]

Neural autoregressive distribution estimation

Benigno Uria, Marc-Alexandre C \^o t \'e , Karol Gregor, Iain Murray, and Hugo Larochelle. Neural autoregressive distribution estimation. The Journal of Machine Learning Research, 17 0 (1): 0 7184--7220, 2016

work page 2016

[68] [68]

Partial and asymmetric contrastive learning for out-of-distribution detection in long-tailed recognition

Haotao Wang, Aston Zhang, Yi Zhu, Shuai Zheng, Mu Li, Alex J Smola, and Zhangyang Wang. Partial and asymmetric contrastive learning for out-of-distribution detection in long-tailed recognition. In International Conference on Machine Learning, pp.\ 23446--23458. PMLR, 2022

work page 2022

[69] [69]

Out-of-distribution detection with implicit outlier transformation

Qizhou Wang, Junjie Ye, Feng Liu, Quanyu Dai, Marcus Kalander, Tongliang Liu, Jianye Hao, and Bo Han. Out-of-distribution detection with implicit outlier transformation. arXiv preprint arXiv:2303.05033, 2023

work page arXiv 2023

[70] [70]

Mitigating neural network overconfidence with logit normalization

Hongxin Wei, Renchunzi Xie, Hao Cheng, Lei Feng, Bo An, and Yixuan Li. Mitigating neural network overconfidence with logit normalization. In International Conference on Machine Learning, pp.\ 23631--23644. PMLR, 2022

work page 2022

[71] [71]

Unsupervised feature learning via non-parametric instance discrimination

Zhirong Wu, Yuanjun Xiong, Stella X Yu, and Dahua Lin. Unsupervised feature learning via non-parametric instance discrimination. In Proceedings of the IEEE conference on computer vision and pattern recognition, pp.\ 3733--3742, 2018

work page 2018

[72] [72]

Sun database: Large-scale scene recognition from abbey to zoo

Jianxiong Xiao, James Hays, Krista A Ehinger, Aude Oliva, and Antonio Torralba. Sun database: Large-scale scene recognition from abbey to zoo. In 2010 IEEE computer society conference on computer vision and pattern recognition, pp.\ 3485--3492. IEEE, 2010 a

work page 2010

[73] [73]

Sun database: Large-scale scene recognition from abbey to zoo

Jianxiong Xiao, James Hays, Krista A Ehinger, Aude Oliva, and Antonio Torralba. Sun database: Large-scale scene recognition from abbey to zoo. In 2010 IEEE computer society conference on computer vision and pattern recognition, pp.\ 3485--3492. IEEE, 2010 b

work page 2010

[74] [74]

TurkerGaze: Crowdsourcing Saliency with Webcam based Eye Tracking

Pingmei Xu, Krista A Ehinger, Yinda Zhang, Adam Finkelstein, Sanjeev R Kulkarni, and Jianxiong Xiao. Turkergaze: Crowdsourcing saliency with webcam based eye tracking. arXiv preprint arXiv:1504.06755, 2015

work page internal anchor Pith review Pith/arXiv arXiv 2015

[75] [75]

LSUN: Construction of a Large-scale Image Dataset using Deep Learning with Humans in the Loop

Fisher Yu, Ari Seff, Yinda Zhang, Shuran Song, Thomas Funkhouser, and Jianxiong Xiao. Lsun: Construction of a large-scale image dataset using deep learning with humans in the loop. arXiv preprint arXiv:1506.03365, 2015

work page internal anchor Pith review Pith/arXiv arXiv 2015

[76] [76]

Out-of-distribution detection based on in-distribution data patterns memorization with modern hopfield energy

Jinsong Zhang, Qiang Fu, Xu Chen, Lun Du, Zelin Li, Gang Wang, Shi Han, Dongmei Zhang, et al. Out-of-distribution detection based on in-distribution data patterns memorization with modern hopfield energy. In The Eleventh International Conference on Learning Representations, 2022

work page 2022

[77] [77]

Understanding failures in out-of-distribution detection with deep generative models

Lily Zhang, Mark Goldstein, and Rajesh Ranganath. Understanding failures in out-of-distribution detection with deep generative models. In International Conference on Machine Learning, pp.\ 12427--12436. PMLR, 2021

work page 2021

[78] [78]

Object detection with deep learning: A review

Zhong-Qiu Zhao, Peng Zheng, Shou-tao Xu, and Xindong Wu. Object detection with deep learning: A review. IEEE transactions on neural networks and learning systems, 30 0 (11): 0 3212--3232, 2019

work page 2019

[79] [79]

Improving calibration for long-tailed recognition

Zhisheng Zhong, Jiequan Cui, Shu Liu, and Jiaya Jia. Improving calibration for long-tailed recognition. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp.\ 16489--16498, 2021

work page 2021

[80] [80]

Places: A 10 million image database for scene recognition

Bolei Zhou, Agata Lapedriza, Aditya Khosla, Aude Oliva, and Antonio Torralba. Places: A 10 million image database for scene recognition. IEEE transactions on pattern analysis and machine intelligence, 40 0 (6): 0 1452--1464, 2017

work page 2017