Possibilistic Predictive Uncertainty for Deep Learning

Jeremie Houssineau; Piotr Koniusz; Yao Ni; Yew Soon Ong

arxiv: 2605.00600 · v2 · pith:T6XEL2WTnew · submitted 2026-05-01 · 💻 cs.LG · cs.AI · cs.CV

Yao Ni, Jeremie Houssineau, Yew Soon Ong, Piotr Koniusz

Pith reviewed 2026-05-09 19:11 UTC · model grok-4.3

classification 💻 cs.LG · cs.AI · cs.CV

keywords epistemic uncertainty · possibility theory · Dirichlet approximation · deep neural networks · uncertainty quantification · evidential deep learning · posterior projection

The pith

Deep neural networks can quantify epistemic uncertainty by projecting possibilistic posteriors over parameters onto predictions via supremum operators and approximating them with learnable Dirichlet possibility functions.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Deep neural networks often produce overconfident predictions on unfamiliar inputs, creating a need for reliable epistemic uncertainty estimates. Bayesian approaches offer principled uncertainty but are computationally heavy, while faster alternatives lack strong theoretical grounding. This paper introduces Dirichlet-approximated possibilistic posterior predictions, or DAPPr, which defines a possibilistic posterior over model parameters. It projects that posterior into prediction space using supremum operators and approximates the result with learnable Dirichlet possibility functions. The method produces a simple training objective with closed-form solutions and performs competitively or better than existing evidential deep learning techniques on standard benchmarks.

Core claim

We introduce Dirichlet-approximated possibilistic posterior predictions (DAPPr), a principled framework leveraging possibility theory. We define a possibilistic posterior over parameters, project this posterior to the prediction space via supremum operators, and approximate the projected posterior using learnable Dirichlet possibility functions. This projection-and-approximation strategy yields a simple training objective with closed-form solutions. Extensive experiments across diverse benchmarks demonstrate that our approach achieves competitive or superior uncertainty quantification performance compared to state-of-the-art evidential deep learning methods while maintaining both principled derivation and computational efficiency.

What carries the argument

The supremum projection of a possibilistic posterior over parameters onto prediction space, followed by approximation with learnable Dirichlet possibility functions.
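The projection step can be illustrated on a finite toy problem. The sketch below is an illustration under stated assumptions, not the authors' implementation: the parameter grid, the `predict` map, and the names `project_sup` and `param_possibility` are all hypothetical. It shows the property that separates a possibilistic push-forward from Bayesian marginalization: parameters that lead to the same prediction are combined by a maximum rather than a sum.

```python
from collections import defaultdict


def project_sup(param_possibility, predict):
    """Push a possibility function over parameters forward to predictions.

    param_possibility: dict mapping a parameter value to a possibility in [0, 1].
    predict: function mapping a parameter value to a hashable prediction.
    Returns a dict mapping each prediction to the largest possibility among
    the parameters that produce it (the supremum on a finite grid).
    """
    projected = defaultdict(float)
    for theta, possibility in param_possibility.items():
        y = predict(theta)
        projected[y] = max(projected[y], possibility)
    return dict(projected)


# Hypothetical toy model: a 1-D threshold classifier evaluated at input x = 0.5.
# The parameter theta is the threshold; the prediction is class 1 if x > theta.
param_possibility = {-1.0: 0.2, 0.0: 0.6, 0.4: 1.0, 0.9: 0.7}


def predict(theta, x=0.5):
    return int(x > theta)


projected = project_sup(param_possibility, predict)
# Three thresholds predict class 1 (possibilities 0.2, 0.6, 1.0) and one
# predicts class 0 (possibility 0.7), so class 1 gets 1.0 and class 0 gets 0.7.
```

Because the maximum possibility over parameters is 1, the projected function also attains 1, so it remains a normalised possibility function on the prediction space.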

If this is right

Yields a simple training objective with closed-form solutions.
Achieves competitive or superior uncertainty quantification performance compared to state-of-the-art evidential deep learning methods.
Maintains both principled derivation from possibility theory and computational efficiency.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The closed-form solutions could make uncertainty modeling easier to integrate into very large models without added sampling costs.
Possibility theory might offer better handling of uncertainty in settings where strict probabilistic assumptions do not apply, such as adversarial inputs.
The projection step could be adapted to other output types beyond classification to quantify uncertainty in regression or structured prediction tasks.

Load-bearing premise

The supremum-based projection of the possibilistic posterior onto prediction space, when approximated by Dirichlet functions, rigorously quantifies epistemic uncertainty.

What would settle it

An experiment in which the method assigns low uncertainty to out-of-distribution inputs where the model makes errors, or where its uncertainty estimates fail to improve upon those from standard evidential deep learning on the same benchmarks.

Figures

Figures reproduced from arXiv: 2605.00600 by Jeremie Houssineau, Piotr Koniusz, Yao Ni, Yew Soon Ong.

**Figure 1.** PyTorch-style pseudocode for the DAPPr loss (∼10 lines of code) and its usage by simply replacing the cross-entropy loss.

**Figure 2.** Test accuracy and OOD AUPR (↑) for varying λ on CIFAR-100 and Stanford Dogs. OOD AUPR averaged over their corresponding OOD datasets. [Adjacent plot panels: (a) Epistemic Uncertainty and (b) Accuracy, each plotted against Data Points (× 1000), comparing EDL and DAPPr.]

**Figure 4.** Distribution of normalised α0 for ID and OOD samples. EDL produces heavily overlapping distributions, making OOD detection difficult. DAPPr clearly separates ID (high α0) from OOD (low α0), enabling reliable detection of distribution shift. Plot details: x-axis from 0.0 to 1.0; y-axis Density; series ID (CIFAR-10), OOD (SVHN), OOD (CIFAR-100).
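How a total concentration α0 can serve as an ID-versus-OOD score is sketched below. This is a minimal sketch under stated assumptions: α0 is taken to be the sum of the Dirichlet parameters, the normalisation is assumed to be min-max over the pooled samples (the caption does not say which normalisation is used), AUPR is computed as average precision with ID as the positive class, and the α values are made up for illustration.

```python
def total_concentration(alpha):
    """alpha_0: the sum of the Dirichlet parameters for one sample."""
    return sum(alpha)


def min_max_normalise(values):
    """Rescale values to [0, 1]; an assumed normalisation, not the paper's."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def average_precision(scores, labels):
    """Average precision: mean precision at the rank of each positive sample,
    with samples ranked by descending score."""
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    hits, precision_sum = 0, 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / hits if hits else 0.0


# Hypothetical Dirichlet parameters: confident ID samples, diffuse OOD samples.
id_alphas = [[20, 1, 1], [1, 15, 2]]
ood_alphas = [[2, 1, 1], [2, 2, 1]]

alpha0 = [total_concentration(a) for a in id_alphas + ood_alphas]
scores = min_max_normalise(alpha0)
labels = [1, 1, 0, 0]  # 1 = in-distribution, 0 = out-of-distribution
aupr = average_precision(scores, labels)
```

In this toy case every ID sample has a higher α0 than every OOD sample, so the ranking is perfect; overlapping score distributions, as described for EDL in the caption, would lower the average precision.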

**Figure 5.** x-axis: L_true; y-axis: S_x. Perturbed label fine-tuning: fine-tune θ_0 identically, but replace the label of x with a randomly sampled soft label p ∈ Δ^{K−1}, yielding model θ_p and loss L_p = L(θ_p; D \ {(x, y)}). For each sample, we compute the maximum loss deviation S_x = max_p |L_p − L_true| over multiple random perturbations.
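The sensitivity score S_x from this caption can be sketched as follows. The fine-tuning itself is not reproduced: `perturbed_loss` is a hypothetical callable standing in for "fine-tune with soft label p, then evaluate the loss on the remaining data". Sampling soft labels uniformly on the simplex is also an assumption, since the caption only says "randomly sampled".

```python
import random


def sample_soft_label(num_classes, rng):
    """Draw a point uniformly from the probability simplex by normalising
    independent exponential draws (equivalent to Dirichlet(1, ..., 1))."""
    draws = [rng.expovariate(1.0) for _ in range(num_classes)]
    total = sum(draws)
    return [d / total for d in draws]


def max_loss_deviation(loss_true, perturbed_loss, num_classes,
                       num_perturbations=8, seed=0):
    """S_x = max over sampled p of |L_p - L_true|.

    perturbed_loss: hypothetical callable p -> L_p that hides the
    fine-tuning and evaluation described in the caption.
    """
    rng = random.Random(seed)
    deviations = []
    for _ in range(num_perturbations):
        p = sample_soft_label(num_classes, rng)
        deviations.append(abs(perturbed_loss(p) - loss_true))
    return max(deviations)


# Toy stand-in for the fine-tuning pipeline: the loss grows as the soft
# label moves mass away from class 2 (purely illustrative numbers).
def toy_perturbed_loss(p):
    return 0.30 + 0.10 * (1.0 - p[2])


s_x = max_loss_deviation(0.30, toy_perturbed_loss, num_classes=3)
```

A larger S_x means the remaining data constrain the model less around x, which is the quantity plotted on the y-axis.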

Original abstract

Deep neural networks achieve impressive results across diverse applications, yet their overconfidence on unseen inputs necessitates reliable epistemic uncertainty modeling. Existing methods for uncertainty modeling face a fundamental dilemma: Bayesian approaches provide principled estimates but remain computationally prohibitive, while efficient second-order predictors lack rigorous connections between their specific objectives and epistemic uncertainty quantification. To resolve this dilemma, we introduce Dirichlet-approximated possibilistic posterior predictions (DAPPr), a principled framework grounded in possibility theory. We define a possibilistic posterior over parameters, project it to the prediction space via supremum operators, and approximate the projected posterior using learnable Dirichlet possibility functions. This projection-and-approximation strategy yields a simple training objective with closed-form solutions. Despite its simplicity, extensive experiments across diverse benchmarks show that DAPPr achieves competitive or superior uncertainty quantification performance over state-of-the-art second-order predictors while maintaining both principled derivation and computational efficiency. Code is available at https://github.com/MaxwellYaoNi/DAPPr.
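To make "Dirichlet possibility function" concrete, the sketch below uses one common construction: a Dirichlet density rescaled so that its maximum equals 1. Whether the paper uses exactly this parameterisation is an assumption, and the function name `dirichlet_possibility` is hypothetical. The construction requires every α_k > 1 so that the mode lies in the interior of the simplex.

```python
import math


def dirichlet_possibility(p, alpha):
    """Max-normalised Dirichlet density evaluated at p.

    The density is proportional to prod_k p_k ** (alpha_k - 1), and for
    alpha_k > 1 its mode is m_k = (alpha_k - 1) / (alpha_0 - K). Dividing
    by the value at the mode gives a function with supremum 1.
    Assumes every p_k is strictly positive and the p_k sum to 1.
    """
    if any(a <= 1.0 for a in alpha):
        raise ValueError("this sketch assumes alpha_k > 1 for every class")
    num_classes = len(alpha)
    alpha0 = sum(alpha)
    mode = [(a - 1.0) / (alpha0 - num_classes) for a in alpha]
    log_ratio = sum(
        (a - 1.0) * (math.log(p_k) - math.log(m_k))
        for a, p_k, m_k in zip(alpha, p, mode)
    )
    return math.exp(log_ratio)


uniform = [1 / 3, 1 / 3, 1 / 3]
flat = dirichlet_possibility(uniform, [3.0, 2.0, 2.0])    # low concentration
sharp = dirichlet_possibility(uniform, [9.0, 5.0, 5.0])   # same mode, higher concentration
at_mode = dirichlet_possibility([0.5, 0.25, 0.25], [3.0, 2.0, 2.0])
```

Both parameter vectors share the mode (0.5, 0.25, 0.25), but the higher-concentration one assigns less possibility to the uniform prediction. Under this construction, a flatter function leaves more predictions plausible, which is how it would express greater epistemic uncertainty.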

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

The paper offers a possibilistic alternative to Bayesian and evidential uncertainty in deep nets via supremum projection and Dirichlet approximation, but the abstract leaves the key derivation steps unshown so the epistemic claim is hard to assess.

The main point is that this work tries to split the difference between slow principled Bayesian uncertainty and fast but loosely justified second-order methods by borrowing from possibility theory. It defines a possibilistic posterior over parameters, pushes it forward to the output space with supremum operators, and then approximates the result with learnable Dirichlet possibility functions to get a closed-form training loss. Experiments reportedly match or beat evidential deep learning on standard benchmarks while staying efficient.

Referee Report

2 major / 2 minor

Summary. The paper introduces Dirichlet-approximated possibilistic posterior predictions (DAPPr), a framework that defines a possibilistic posterior over neural network parameters, projects it onto the prediction space via supremum operators, and approximates the result using learnable Dirichlet possibility functions. This yields a simple training objective with closed-form solutions for epistemic uncertainty quantification. Experiments across benchmarks show competitive or superior performance relative to evidential deep learning baselines while claiming both principled derivation and computational efficiency.

Significance. If the supremum-based projection and Dirichlet approximation are shown to rigorously preserve epistemic semantics from possibility theory without collapsing to a data-dependent heuristic, the work could meaningfully address the trade-off between principled uncertainty estimates and efficiency. The closed-form objective and reproducible code (promised) would strengthen its contribution as a falsifiable alternative to Bayesian or evidential methods.

major comments (2)

[§3] §3 (Possibilistic Posterior and Projection): The central claim that the supremum projection of the possibilistic posterior onto prediction space rigorously quantifies epistemic uncertainty (rather than producing a convenient optimizable objective) requires explicit verification. The abstract asserts this follows from possibility-theoretic axioms, but the derivation steps showing that the supremum correctly marginalizes parameter uncertainty without introducing uncontrolled bias or reducing to a fit of the Dirichlet parameters must be provided and checked against the axioms; absent this, the epistemic semantics remain unestablished.
[§4] §4 (Dirichlet Approximation and Training Objective): The claim of a 'principled' closed-form objective depends on the Dirichlet possibility functions being an approximation that does not alter the epistemic character of the projected posterior. If the parameters of these functions are optimized directly against the training loss (as implied by the learnable setup), this risks circularity where the uncertainty estimate is defined by the fit rather than derived independently; the manuscript must demonstrate that the approximation error is bounded in a way that preserves epistemic quantification.

minor comments (2)

[Abstract] The abstract states 'projects this posterior' (subject-verb agreement error); correct to 'project' for grammatical consistency.
[§2] Notation for the supremum operator and Dirichlet parameters should be introduced with explicit definitions and contrasted with standard Bayesian marginalization to aid readability.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive and detailed review. The comments raise important points about the rigor of the theoretical derivations, which we address below by clarifying the foundations and committing to expansions in the revised manuscript.

Referee: [§3] §3 (Possibilistic Posterior and Projection): The central claim that the supremum projection of the possibilistic posterior onto prediction space rigorously quantifies epistemic uncertainty (rather than producing a convenient optimizable objective) requires explicit verification. The abstract asserts this follows from possibility-theoretic axioms, but the derivation steps showing that the supremum correctly marginalizes parameter uncertainty without introducing uncontrolled bias or reducing to a fit of the Dirichlet parameters must be provided and checked against the axioms; absent this, the epistemic semantics remain unestablished.

Authors: We agree that Section 3 would benefit from additional formal detail. In the revised manuscript we will insert a new proposition with proof establishing that the supremum projection is the standard marginalization operator under possibility theory (Zadeh's extension principle), which by construction yields a possibility measure on the prediction space whose value at each output represents the highest compatibility with any parameter setting. This step is independent of the subsequent Dirichlet approximation and does not introduce bias beyond the semantics of possibility measures; it satisfies maxitivity and normalization by definition. We will explicitly cross-reference the relevant axioms and show that the projection step alone already encodes epistemic uncertainty before any approximation is applied. revision: yes
Referee: [§4] §4 (Dirichlet Approximation and Training Objective): The claim of a 'principled' closed-form objective depends on the Dirichlet possibility functions being an approximation that does not alter the epistemic character of the projected posterior. If the parameters of these functions are optimized directly against the training loss (as implied by the learnable setup), this risks circularity where the uncertainty estimate is defined by the fit rather than derived independently; the manuscript must demonstrate that the approximation error is bounded in a way that preserves epistemic quantification.

Authors: We accept that the manuscript should more clearly separate the projection step from the approximation step and bound the error. In the revision we will add an analysis showing that the Dirichlet family can represent possibility distributions with controlled approximation error (via the fact that any continuous possibility function on the simplex can be approximated arbitrarily closely by a Dirichlet possibility function under the sup-norm). The training objective minimizes a discrepancy measure between the projected posterior and this parametric family; once the parameters are obtained, epistemic uncertainty is read off directly from the resulting possibility function using closed-form expressions that inherit the semantics of the projection. We will include a proposition bounding the total variation between the true projected possibility and its Dirichlet approximation, thereby ensuring that the epistemic character is preserved up to a quantifiable error that vanishes with better approximation. This removes any appearance of circularity, as the uncertainty measure is defined by the approximated possibility, not by the loss value itself. revision: yes
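For reference, the objects discussed in this exchange can be written out as follows. The notation is assumed for illustration and is not taken from the paper: g maps parameters to predictions, π_Θ is the possibilistic posterior, p_Θ a Bayesian posterior density, and f_α a Dirichlet possibility function with parameters α. The block states definitions only; it does not establish the bounds the rebuttal promises.

```latex
% Notation assumed for illustration (not taken from the paper).
\begin{align}
  \Pi_Y(A) &= \sup_{\theta \,:\, g(\theta) \in A} \pi_\Theta(\theta)
    && \text{(supremum projection)} \\
  P_Y(A) &= \int \mathbf{1}\{ g(\theta) \in A \}\, p_\Theta(\theta)\, \mathrm{d}\theta
    && \text{(Bayesian marginalization, for contrast)} \\
  \Pi(A \cup B) &= \max\bigl( \Pi(A), \Pi(B) \bigr)
    && \text{(maxitivity)} \\
  \sup_{\theta} \pi_\Theta(\theta) &= 1
    && \text{(normalization)} \\
  \varepsilon(\alpha) &= \sup_{p \in \Delta^{K-1}}
     \bigl| \pi_Y(p) - f_\alpha(p) \bigr|
    && \text{(sup-norm approximation error)}
\end{align}
```

The first two lines differ only in how parameters are aggregated, which is the contrast the referee asks to be made explicit; the last line is the quantity a bound on the Dirichlet approximation would need to control.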

Circularity Check

0 steps flagged

No significant circularity detected in derivation chain

The paper's core construction defines a possibilistic posterior over parameters, applies supremum-based projection to prediction space, and approximates the result with Dirichlet possibility functions to derive a training objective with closed-form solutions. This sequence is presented as following from possibility theory without the projection or approximation reducing to a tautological fit of the target uncertainty quantity itself. No equations or steps are shown to equate the epistemic uncertainty output directly to the fitted parameters by construction, and the framework maintains independent content through its claimed axiomatic grounding and empirical validation on benchmarks. The derivation remains self-contained against external possibility-theoretic principles rather than relying on self-referential definitions or load-bearing self-citations.

Axiom & Free-Parameter Ledger

1 free parameter · 2 axioms · 2 invented entities

The central claim rests on possibility theory as an alternative to probability, the validity of supremum projection for transferring uncertainty from parameters to predictions, and the adequacy of Dirichlet functions as approximators; these are introduced without independent empirical or formal support in the abstract.

free parameters (1)

Dirichlet possibility function parameters
Learnable parameters that define the approximating distribution and are optimized as part of the training objective.

axioms (2)

domain assumption: Possibility theory provides a valid representation of epistemic uncertainty over neural-network parameters
Invoked when defining the possibilistic posterior.
domain assumption: Supremum operator correctly projects the parameter-level possibilistic posterior onto the prediction space
Central step in the projection-and-approximation strategy.

invented entities (2)

possibilistic posterior over parameters (no independent evidence)
purpose: To encode epistemic uncertainty in a non-probabilistic manner
New construct introduced to replace Bayesian posterior.
Dirichlet possibility functions (no independent evidence)
purpose: To approximate the projected possibilistic posterior with closed-form tractability
Learnable approximation introduced for computational efficiency.

pith-pipeline@v0.9.0 · 5476 in / 1556 out tokens · 40101 ms · 2026-05-09T19:11:00.236027+00:00 · methodology
