Pandora's Regret: A Proper Scoring Rule for Evaluating Sequential Search
Pith reviewed 2026-05-10 15:27 UTC · model grok-4.3
The pith
Sequential search costs induce Pandora's Regret, a closed-form strictly proper scoring rule that penalizes rank reversals where distractors outrank the true class.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Pandora's Regret is obtained by analyzing the expected cost of optimal sequential testing under varying per-test costs and subtracting the cost that would be incurred if the search were guided by the true probabilities. The resulting expression is closed-form, pairwise additive, and strictly proper: in expectation, any report that deviates from the true probabilities scores strictly worse than the truthful one. It penalizes rank-reversing miscalibrations in addition to magnitude errors and yields a one-parameter Beta family whose parameter trades off the two kinds of penalty while retaining an interpretation as excess search cost.
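To make "excess search cost" concrete, here is a minimal sketch of the simplest case only: every test costs one unit, and classes are tested in decreasing order of reported probability. The function names and the example numbers are illustrative assumptions, and this is not the paper's closed-form rule.

```python
import numpy as np

def expected_search_cost(order, p_true):
    """Expected number of unit-cost tests when classes are tried in `order`."""
    p_true = np.asarray(p_true, dtype=float)
    positions = np.arange(1, len(order) + 1)  # the k-th class tested costs k tests in total
    return float(np.dot(p_true[order], positions))

def search_regret(q_model, p_true):
    """Extra expected tests from following the model's ranking instead of the true one."""
    model_order = np.argsort(-np.asarray(q_model, dtype=float), kind="stable")
    oracle_order = np.argsort(-np.asarray(p_true, dtype=float), kind="stable")
    return expected_search_cost(model_order, p_true) - expected_search_cost(oracle_order, p_true)

p = [0.6, 0.3, 0.1]                        # hypothetical true class probabilities
print(search_regret([0.6, 0.3, 0.1], p))   # truthful report
print(search_regret([0.3, 0.6, 0.1], p))   # top two classes swapped
print(search_regret([0.5, 0.4, 0.1], p))   # magnitude error, ranking intact
```

In this uniform-cost toy the third report incurs no regret because its ranking is unchanged, so the toy is proper but not strictly proper. The source's construction analyzes varying testing costs, which this sketch does not attempt.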
What carries the argument
Pandora's Regret, the closed-form excess expected cost of optimal sequential search under the model's probabilities, which supplies both the strict properness and the pairwise additive structure.
If this is right
- Log loss, accuracy, and macro-F1 each embed an implicit decision model that does not match the sequential-search utility.
- Pandora-based metrics can be used to select or tune models when the downstream task is sequential testing.
- The Beta family lets practitioners choose how heavily to penalize rank swaps versus probability magnitude while keeping a cost interpretation.
- The construction extends the decision-theoretic approach to proper scoring rules from binary to multiclass sequential settings.
Where Pith is reading between the lines
- Similar cost-based derivations may produce aligned scoring rules for other sequential decision problems such as adaptive testing or active learning.
- In diagnostic pipelines the rule could be used directly as a training objective rather than only for post-hoc evaluation.
- If the pairwise structure generalizes, the same method might yield proper rules for partial-information search settings where not all alternatives are tested.
Load-bearing premise
The expected cost of optimal search under the model's probabilities admits a pairwise decomposition whose closed form remains strictly proper for arbitrary testing-cost regimes.
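One way to see why a pairwise structure is plausible: in a search with perfect tests and a single true class, the cost paid before stopping is the cost of the true class plus the cost of each distractor tested earlier. The sketch below checks that identity numerically. It assumes tests are ordered by decreasing probability-to-cost ratio (the classic rule for this simplified setting, which may differ from the paper's Pandora formulation), and all names and numbers are hypothetical. It does not establish the paper's closed form or its properness.

```python
import numpy as np

def ratio_order(q, costs):
    """Order classes by decreasing q_i / c_i; ties go to the lower index."""
    q, costs = np.asarray(q, dtype=float), np.asarray(costs, dtype=float)
    return np.argsort(-(q / costs), kind="stable")

def realized_cost(order, costs, y):
    """Total cost paid when testing in `order` until class y is found."""
    costs = np.asarray(costs, dtype=float)
    position = int(np.where(order == y)[0][0])
    return float(costs[order[: position + 1]].sum())

def pairwise_cost(q, costs, y):
    """Same cost written as c_y plus one term per (distractor, true class) pair."""
    q, costs = np.asarray(q, dtype=float), np.asarray(costs, dtype=float)
    ratio = q / costs
    total = costs[y]
    for j in range(len(q)):
        if j == y:
            continue
        # Distractor j is tested before y iff it wins the pairwise ratio comparison.
        if ratio[j] > ratio[y] or (ratio[j] == ratio[y] and j < y):
            total += costs[j]
    return float(total)

q = [0.5, 0.3, 0.2]        # hypothetical reported probabilities
costs = [4.0, 1.0, 1.0]    # hypothetical per-class testing costs
order = ratio_order(q, costs)
for y in range(len(q)):
    print(y, realized_cost(order, costs, y), pairwise_cost(q, costs, y))
```

Each indicator depends only on the pair (j, y), which is the kind of separability the premise requires. Whether it survives in the paper's general cost regime is exactly what the premise leaves open.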
What would settle it
A model that reports the true class probabilities but incurs higher Pandora's Regret than a model that reverses the ranking of the true class and one distractor, on the same test set.
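A population-level version of this check can be sketched as follows. Since the source does not reproduce the closed form of Pandora's Regret, the Brier score is used here purely as a stand-in loss. `swap_counterexamples` is a hypothetical helper that looks for rank-swapped reports scoring at least as well, in expectation, as the truthful one.

```python
import numpy as np

def expected_score(score_fn, p_true, q_report):
    """Expected loss of reporting q_report when the label is drawn from p_true."""
    return float(sum(p_true[y] * score_fn(q_report, y) for y in range(len(p_true))))

def brier(q, y):
    """Stand-in loss (multiclass Brier score); NOT the paper's rule."""
    onehot = np.zeros(len(q))
    onehot[y] = 1.0
    return float(np.sum((np.asarray(q, dtype=float) - onehot) ** 2))

def swap_counterexamples(score_fn, p_true, tol=1e-12):
    """Pairs (i, j) whose swapped report does not score strictly worse than the truth."""
    p_true = np.asarray(p_true, dtype=float)
    truthful = expected_score(score_fn, p_true, p_true)
    found = []
    for i in range(len(p_true)):
        for j in range(i + 1, len(p_true)):
            if p_true[i] == p_true[j]:
                continue  # swapping equal entries changes nothing
            q = p_true.copy()
            q[i], q[j] = q[j], q[i]
            if expected_score(score_fn, p_true, q) <= truthful + tol:
                found.append((i, j))
    return found

p = [0.6, 0.3, 0.1]  # hypothetical true class probabilities
print(swap_counterexamples(brier, p))                 # a strictly proper loss yields no pairs
print(swap_counterexamples(lambda q, y: 0.0, p))      # a constant loss fails on every pair
```

Passing an implementation of the paper's rule as `score_fn` would turn this into a direct test of the claim; any non-empty result would be a counterexample. The empirical version described above would compare mean scores on a shared test set instead of expectations.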
Original abstract
In sequential search, alternatives are tested until the true class is found. Standard proper scoring rules like log loss are local, ignoring the ranking of competitors and misaligning model evaluation with search utility. We show that sequential search induces a pairwise structure that overcomes this. By analyzing the expected cost of optimal search under varying testing costs, we derive Pandora's Regret: a closed-form, pairwise-additive, and strictly proper scoring rule. Pandora's Regret both elicits true probabilities and penalizes rank-reversing miscalibrations where distractors outrank the true class. Our construction yields a one-parameter Beta family that balances penalties for rank-swapping versus probability magnitude, while retaining a grounded interpretation as expected search cost. We prove that log loss, accuracy, and macro-F1 rely on implicit decision models misaligned with sequential search. Across 597 MedMNIST models, Pandora-based metrics better predict clinical diagnostic costs than standard alternatives, extending decision-theoretic scoring rule construction to the multiclass setting.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript derives Pandora's Regret as a closed-form, pairwise-additive, strictly proper scoring rule from the expected cost of optimal sequential search under Pandora's problem. It claims this rule elicits true probabilities, penalizes rank-reversing miscalibrations, and belongs to a one-parameter Beta family balancing rank-swap and magnitude penalties while retaining an expected-search-cost interpretation. Standard metrics (log loss, accuracy, macro-F1) are shown to rely on misaligned implicit decision models. Empirical results across 597 MedMNIST models indicate Pandora-based metrics better predict clinical diagnostic costs than alternatives.
Significance. If the derivation is sound, the work meaningfully extends decision-theoretic scoring-rule construction to multiclass sequential search, offering a utility-aligned alternative for applications such as medical diagnosis where ranking and search costs matter. The large-scale empirical comparison on 597 models provides concrete evidence of practical advantage and is a clear strength.
major comments (2)
- [Abstract and §3 (derivation)] Abstract and theoretical derivation (Pandora's problem analysis): the central claim that the expected optimal search cost yields a closed-form pairwise-additive strictly proper rule that generalizes to any testing-cost regime is load-bearing. When class-specific costs are heterogeneous, optimal thresholds in Pandora's problem generally depend on the full probability vector in a non-separable manner; this risks breaking the claimed pairwise structure and additivity. Explicit closed-form expression and verification for non-uniform costs are required.
- [§4 (Beta family)] Beta-family construction: the one-parameter Beta family is presented as both tunable and grounded in expected search cost, yet the parameter-selection procedure is not detailed. If the choice is post-hoc or data-dependent, it undermines the claim of a parameter-free derivation from first principles and the interpretation as expected cost.
minor comments (1)
- [Empirical evaluation] Notation for the Beta parameter and the exact definition of 'clinical diagnostic costs' in the empirical section should be stated explicitly in the main text rather than deferred to supplements.
Simulated Author's Rebuttal
We thank the referee for the insightful comments and for acknowledging the significance of our contribution. We address the major comments point by point below and have made revisions to strengthen the manuscript.
- Referee: [Abstract and §3 (derivation)] Abstract and theoretical derivation (Pandora's problem analysis): the central claim that the expected optimal search cost yields a closed-form pairwise-additive strictly proper rule that generalizes to any testing-cost regime is load-bearing. When class-specific costs are heterogeneous, optimal thresholds in Pandora's problem generally depend on the full probability vector in a non-separable manner; this risks breaking the claimed pairwise structure and additivity. Explicit closed-form expression and verification for non-uniform costs are required.
  Authors: We thank the referee for pointing out this critical aspect of the derivation. Our analysis in §3 starts with the general case of heterogeneous testing costs in Pandora's problem. Although optimal thresholds can depend on the probability vector, the regret in expected optimal search cost decomposes into a sum of pairwise terms because the search continues until the true class is found, and each misranked pair contributes independently to the accumulated cost. We will include the explicit closed-form expression for arbitrary costs in the revised manuscript and provide a mathematical verification of the additivity property to confirm that the structure holds. Revision: yes
- Referee: [§4 (Beta family)] Beta-family construction: the one-parameter Beta family is presented as both tunable and grounded in expected search cost, yet the parameter-selection procedure is not detailed. If the choice is post-hoc or data-dependent, it undermines the claim of a parameter-free derivation from first principles and the interpretation as expected cost.
  Authors: The Beta family parameter is not selected post-hoc but corresponds directly to the testing cost in the Pandora formulation, providing a tunable balance while remaining grounded. We will revise §4 to include a detailed description of how the parameter is determined from the cost regime, including examples for different cost settings, to clarify that it does not undermine the first-principles derivation. Revision: yes
Circularity Check
Derivation of Pandora's Regret from expected optimal search cost is independent and non-circular
The paper constructs Pandora's Regret by analyzing the expected cost of optimal sequential search under varying testing costs, drawing on the external decision-theoretic framework of Pandora's problem. This provides an independent grounding for the closed-form, pairwise-additive, and strictly proper properties rather than defining the rule in terms of itself or fitting parameters to the target evaluation metric. The one-parameter Beta family arises directly from the construction as a tunable balance between rank-swapping and magnitude penalties while preserving the expected-cost interpretation. No load-bearing self-citations, imported uniqueness theorems, or ansatzes smuggled via prior work are present. The derivation chain remains self-contained against external benchmarks of search-cost minimization and does not reduce to its inputs by construction.
Axiom & Free-Parameter Ledger
free parameters (1)
- Beta family parameter
axioms (2)
- domain assumption: Sequential search induces a pairwise structure on the scoring rule
- domain assumption: Optimal testing order is determined by the model's reported probabilities