When Individually Calibrated Models Become Collectively Miscalibrated
Pith reviewed 2026-05-20 20:17 UTC · model grok-4.3
The pith
Individually calibrated models become collectively miscalibrated under Brier-score aggregation with correlated beliefs.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Under Brier-score-based aggregation with positively correlated beliefs, each agent's individually optimal report systematically underestimates the positive-class probability, yielding a Price of Anarchy greater than one whenever Cov(b_i, b_j) > 0. In the canonical setting with n=5, pairwise correlation=0.5, base rate=0.3, the empirically measured PoA in false-negative rate reaches 7.25x. VCG-based aggregation aligns incentives and achieves dominant-strategy incentive compatibility.
What carries the argument
The game-theoretic strategic response to Brier-score aggregation, where agents optimize local scores without coordination, leading to underestimation when beliefs covary positively.
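For reference, a minimal sketch of the single-agent baseline that the strategic argument departs from: when an agent is scored only on its own report, the expected Brier score under belief b is minimized by reporting b. The function names and the grid search are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def expected_brier(report, belief):
    """Expected squared error (report - y)^2 when y ~ Bernoulli(belief)."""
    return belief * (report - 1.0) ** 2 + (1.0 - belief) * report ** 2

def best_report(belief, grid_size=1001):
    """Grid-search the report in [0, 1] with the lowest expected Brier score."""
    grid = np.linspace(0.0, 1.0, grid_size)
    return float(grid[np.argmin(expected_brier(grid, belief))])
```

Here `best_report(0.3)` lands on the grid point at 0.3. The core claim above concerns the case where this truthfulness property no longer holds, namely Brier-score-based aggregation with positively correlated beliefs.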
If this is right
- Each agent's report underestimates the positive-class probability under positive covariance.
- The aggregate shows a higher false-negative rate, with a measured PoA of 7.25x in the canonical example.
- VCG aggregation provides incentive compatibility and maintains accuracy on real datasets.
- Adaptive weighting improves performance under distribution shift.
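The idea of rewarding marginal contribution can be sketched as follows. This is a hypothetical leave-one-out scheme, assuming the aggregate is the plain average of reports and the loss is the Brier score of that average; it is not the paper's VCG payment rule, which this summary does not spell out.

```python
import numpy as np

def brier(prob, outcome):
    """Squared error between a probability and a 0/1 outcome."""
    return (prob - outcome) ** 2

def marginal_contributions(reports, outcome):
    """Loss of the average without agent i, minus loss of the full average.

    A positive value means the aggregate is better with agent i included.
    """
    reports = np.asarray(reports, dtype=float)
    n = len(reports)
    if n < 2:
        raise ValueError("need at least two agents")
    loss_all = brier(reports.mean(), outcome)
    total = reports.sum()
    contributions = np.empty(n)
    for i in range(n):
        mean_without_i = (total - reports[i]) / (n - 1)
        contributions[i] = brier(mean_without_i, outcome) - loss_all
    return contributions
```

With reports `[0.9, 0.1]` and outcome 1, the first agent's contribution is positive and the second agent's is negative, since removing the second agent would move the average toward the realized outcome.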
Where Pith is reading between the lines
- Similar miscalibration could occur in other aggregation rules if they do not account for strategic reporting.
- Monitoring correlations between model predictions could help detect potential collective miscalibration.
- Extending this to non-probabilistic settings or different loss functions might reveal analogous incentive issues.
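The correlation monitoring suggested above could be sketched as below. The function name and the (n_samples, n_models) layout are assumptions made for illustration; the paper does not prescribe a monitoring procedure.

```python
import numpy as np

def mean_pairwise_correlation(predictions):
    """Average off-diagonal Pearson correlation between model predictions.

    predictions: array of shape (n_samples, n_models), one column per model.
    """
    predictions = np.asarray(predictions, dtype=float)
    corr = np.corrcoef(predictions, rowvar=False)  # columns are variables
    n_models = corr.shape[0]
    off_diagonal = corr[~np.eye(n_models, dtype=bool)]
    return float(off_diagonal.mean())
```

A value well above zero would flag the positive-covariance regime in which the paper predicts collective miscalibration. A model whose predictions are constant yields NaN correlations and should be filtered out first.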
Load-bearing premise
Agents independently optimize their local Brier score reports without coordination and treat the aggregation rule as fixed when choosing their reports.
What would settle it
Comparing the frequency of positive outcomes to the aggregated probability estimate when agents use Brier-optimal reports versus when they report truthfully, in a controlled setting with known positive correlations.
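A minimal sketch of the comparison described above, assuming the aggregated probabilities and the realized 0/1 outcomes are available as arrays. The function name, the equal-width binning, and the default of 10 bins are illustrative choices, not the paper's protocol.

```python
import numpy as np

def calibration_summary(agg_probs, outcomes, n_bins=10):
    """Return (binned calibration gap, signed bias) for aggregated forecasts.

    The gap is the bin-weighted mean of |observed frequency - mean forecast|.
    The signed bias is observed frequency minus mean forecast; a positive
    value means the aggregate underestimates the positive class.
    """
    agg_probs = np.asarray(agg_probs, dtype=float)
    outcomes = np.asarray(outcomes, dtype=float)
    bin_index = np.minimum((agg_probs * n_bins).astype(int), n_bins - 1)
    gap = 0.0
    for k in range(n_bins):
        in_bin = bin_index == k
        if in_bin.any():
            diff = outcomes[in_bin].mean() - agg_probs[in_bin].mean()
            gap += in_bin.mean() * abs(diff)
    signed_bias = outcomes.mean() - agg_probs.mean()
    return float(gap), float(signed_bias)
```

Running it once on aggregates built from truthful reports and once on aggregates built from Brier-optimal reports, with the same outcomes, gives the two numbers to compare.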
Original abstract
Probabilistic prediction systems often aggregate probability estimates from multiple models into a single decision. A common assumption is that if each model is individually calibrated, the aggregate prediction will also be well calibrated. We show that this assumption fails in multi-agent settings: individually calibrated predictors can become collectively miscalibrated when their predictions interact strategically, in the game-theoretic sense of Brier-optimal local response, even without deliberate coordination. This phenomenon arises naturally when agents are independently trained on overlapping data. We prove that under Brier-score-based aggregation with positively correlated beliefs, each agent's individually optimal report systematically underestimates the positive-class probability, yielding a Price of Anarchy greater than one whenever Cov(b_i, b_j) > 0. In a canonical setting (n = 5 agents, pairwise correlation = 0.5, base rate = 0.3), the empirically measured PoA in false-negative rate reaches 7.25x. In contrast, VCG-based aggregation aligns incentives by rewarding marginal contribution, achieving dominant-strategy incentive compatibility and near-optimal performance. Experiments on three real-world datasets (NSL-KDD, UNSW-NB15, Credit Card Fraud) show that VCG provides strong robustness while maintaining comparable accuracy. It performs particularly well in data-sparse and adversarial settings, and adaptive weighting further improves performance under distribution shift.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper claims that individually Brier-calibrated predictors become collectively miscalibrated under Brier-score aggregation when beliefs are positively correlated, because each agent’s myopic best response systematically underestimates the positive-class probability. It proves this underestimation result, shows that the resulting Price of Anarchy exceeds 1 whenever Cov(b_i, b_j) > 0, and reports an empirical PoA of 7.25× in false-negative rate for the canonical parameter set (n=5, pairwise correlation=0.5, base rate=0.3). The work contrasts this with VCG-based aggregation, which is dominant-strategy incentive compatible, and supports the claims with experiments on NSL-KDD, UNSW-NB15, and Credit Card Fraud datasets.
Significance. If the central game-theoretic result holds, the paper identifies a mechanism by which strategic local optimization can induce collective miscalibration even when every individual model is calibrated, with direct implications for ensemble methods and multi-model decision systems. The explicit PoA quantification, the VCG incentive-alignment proposal, and the three-dataset empirical evaluation are concrete strengths that would make the contribution noteworthy in the machine-learning literature.
major comments (2)
- [Proof of underestimation (Section 3)] The derivation of the individually optimal report (r_i = n b_i − E[∑_{j≠i} b_j | b_i]) treats the other agents’ reports as fixed at their private beliefs b_j. In a symmetric game the reports must satisfy the fixed-point condition that the assumed r_j equal the equilibrium strategy; substituting the equilibrium strategy back into the conditional expectations changes the bias term and can eliminate or reverse the claimed systematic underestimation. This assumption is load-bearing for both the underestimation theorem and the PoA > 1 claim.
- [Canonical setting and PoA measurement (Section 4)] The reported PoA of 7.25× is obtained under the myopic best-response model with a specific parameter triple (n=5, correlation=0.5, base rate=0.3). No sensitivity analysis or equilibrium-consistent re-computation is provided, so it is unclear whether the quantitative result survives the fixed-point correction raised in the preceding comment.
minor comments (2)
- [Experiments] The PoA figure is presented without error bars or bootstrap intervals; adding these would make the empirical claim more robust.
- [Preliminaries] Notation for the aggregation rule (average of reports) and the exact Brier-score objective should be stated explicitly once at the beginning of the formal section.
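The bootstrap intervals requested in the first minor comment could be computed along these lines. This sketch assumes the PoA in false-negative rate is the ratio of the false-negative rate under strategic reports to the rate under a reference (for example truthful) aggregate, evaluated on the same labelled examples. That reading, and all names below, are assumptions rather than the paper's definitions.

```python
import numpy as np

def false_negative_rate(y_true, y_pred):
    """Fraction of true positives that were predicted negative."""
    y_true = np.asarray(y_true, dtype=bool)
    y_pred = np.asarray(y_pred, dtype=bool)
    positives = y_true.sum()
    if positives == 0:
        return np.nan
    return float((y_true & ~y_pred).sum() / positives)

def bootstrap_fnr_ratio(y_true, pred_strategic, pred_reference,
                        n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for FNR(strategic) / FNR(reference)."""
    rng = np.random.default_rng(seed)
    y_true = np.asarray(y_true, dtype=bool)
    pred_strategic = np.asarray(pred_strategic, dtype=bool)
    pred_reference = np.asarray(pred_reference, dtype=bool)
    n = len(y_true)
    ratios = []
    for _ in range(n_boot):
        idx = rng.integers(0, n, size=n)  # resample examples with replacement
        num = false_negative_rate(y_true[idx], pred_strategic[idx])
        den = false_negative_rate(y_true[idx], pred_reference[idx])
        if np.isfinite(num) and np.isfinite(den) and den > 0:
            ratios.append(num / den)  # skip resamples with undefined ratio
    lo, hi = np.quantile(ratios, [alpha / 2, 1 - alpha / 2])
    return float(lo), float(hi)
```

Resamples in which the reference false-negative rate is zero are dropped, so the interval is only meaningful when the reference aggregate misses a non-trivial number of positives.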
Simulated Author's Rebuttal
We thank the referee for the careful and insightful review. The comments highlight important distinctions between myopic best responses and full Nash equilibrium, which we address below. We outline planned revisions to clarify assumptions and strengthen the quantitative claims.
- Referee: [Proof of underestimation (Section 3)] The derivation of the individually optimal report (r_i = n b_i − E[∑_{j≠i} b_j | b_i]) treats the other agents’ reports as fixed at their private beliefs b_j. In a symmetric game the reports must satisfy the fixed-point condition that the assumed r_j equal the equilibrium strategy; substituting the equilibrium strategy back into the conditional expectations changes the bias term and can eliminate or reverse the claimed systematic underestimation. This assumption is load-bearing for both the underestimation theorem and the PoA > 1 claim.
  Authors: We appreciate this observation on the modeling choice. Our analysis is explicitly framed under myopic best-response dynamics, in which each agent optimizes its report while treating others' reports as fixed at their private beliefs. This corresponds to the natural setting of independently trained models on overlapping data, where agents do not coordinate on a joint equilibrium strategy. The underestimation theorem and the resulting PoA > 1 result are derived and stated under this myopic regime, which we believe is the appropriate model for the paper's claims about collective miscalibration. We agree that a symmetric Nash equilibrium would require solving the fixed-point equations. In the revision we will add a clarifying paragraph in Section 3 that explicitly states the myopic assumption, contrasts it with full equilibrium, and notes that the directional bias from positive covariance is expected to persist (though possibly attenuated) under equilibrium play.
  Revision: partial
- Referee: [Canonical setting and PoA measurement (Section 4)] The reported PoA of 7.25× is obtained under the myopic best-response model with a specific parameter triple (n=5, correlation=0.5, base rate=0.3). No sensitivity analysis or equilibrium-consistent re-computation is provided, so it is unclear whether the quantitative result survives the fixed-point correction raised in the preceding comment.
  Authors: We acknowledge that the 7.25× figure is presented for a single canonical parameter set under the myopic model. In the revised manuscript we will expand Section 4 with a sensitivity analysis over ranges of n, pairwise correlation, and base rate, still under myopic best responses. In addition, we will numerically solve the symmetric fixed-point equations for the equilibrium reports at the canonical parameters and report the resulting PoA value, thereby directly addressing whether the quantitative conclusion is robust to the equilibrium correction.
  Revision: yes
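The difference between the myopic report and a symmetric fixed point can be illustrated with a linear sketch. It rests on two assumptions that are not in the paper: the conditional expectation of another agent's belief is linear, E[b_j | b_i] = mu + rho (b_i − mu), and the best response to others playing a strategy r(·) is r_i = n b_i − E[∑_{j≠i} r(b_j) | b_i], which reduces to the referee's formula when others are truthful. Clipping reports to [0, 1] is ignored.

```python
def best_response(intercept, slope, n, rho, mu):
    """Best response to others playing r(b) = intercept + slope * b.

    Uses the assumed linear rule E[b_j | b_i] = mu + rho * (b_i - mu).
    Returns the (intercept, slope) of the responding agent's linear report.
    """
    new_slope = n - (n - 1) * rho * slope
    new_intercept = -(n - 1) * (intercept + slope * (1.0 - rho) * mu)
    return new_intercept, new_slope

def myopic_coeffs(n, rho, mu):
    """Report when others are assumed truthful, i.e. r(b) = b."""
    return best_response(0.0, 1.0, n, rho, mu)

def symmetric_fixed_point_coeffs(n, rho, mu):
    """Linear strategy that is a best response to itself (closed form)."""
    slope = n / (1.0 + (n - 1) * rho)
    intercept = -(n - 1) * (1.0 - rho) * mu * slope / n
    return intercept, slope
```

At the canonical parameters (n=5, rho=0.5, mu=0.3) the myopic report is 3 b_i − 0.6, while the symmetric fixed point is (5/3) b_i − 0.2, so the slope is smaller at the fixed point in this sketch. The closed form is used because naive iteration of `best_response` does not converge when (n − 1) rho exceeds 1.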
Circularity Check
No circularity: derivation is explicit game-theoretic model with independent simulation
The paper derives the underestimation and PoA > 1 directly from the closed-form solution to each agent's local Brier minimization treating other reports as fixed at b_j, then computes the resulting false-negative-rate ratio on explicitly chosen parameters (n=5, correlation=0.5, base rate=0.3). This is a forward derivation from stated assumptions rather than any reduction of the target quantity to a fitted input, self-citation chain, or definitional equivalence. The empirical PoA figure is a simulation output under those parameters, not a prediction forced by reusing the same data or equilibrium fixed-point. No load-bearing self-citations, ansatzes, or renamings appear in the derivation chain.
Axiom & Free-Parameter Ledger
free parameters (3)
- pairwise correlation = 0.5
- base rate = 0.3
- number of agents = 5
axioms (2)
- domain assumption: Each agent independently selects the report that minimizes its expected Brier score given the fixed aggregation rule
- domain assumption: Beliefs are positively correlated across agents
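One way to generate beliefs matching the ledger's parameters is a shared-factor construction, sketched below. The Gaussian latent model, the `spread` parameter, and the clipping to [0, 1] are assumptions for illustration; the paper's belief-generating process is not described in this summary.

```python
import numpy as np

def sample_beliefs(n_samples, n_agents=5, rho=0.5, base_rate=0.3,
                   spread=0.1, seed=0):
    """Draw beliefs with pairwise latent correlation rho around base_rate.

    Each agent's latent signal mixes one shared factor with private noise,
    which gives every pair of agents correlation rho before clipping.
    """
    rng = np.random.default_rng(seed)
    shared = rng.standard_normal((n_samples, 1))
    private = rng.standard_normal((n_samples, n_agents))
    latent = np.sqrt(rho) * shared + np.sqrt(1.0 - rho) * private
    return np.clip(base_rate + spread * latent, 0.0, 1.0)
```

With `spread=0.1` the clipping is rarely active, so the empirical pairwise correlation stays close to `rho`; larger spreads would distort it.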
Lean theorems connected to this paper
- IndisputableMonolith/Cost/FunctionalEquation.lean, washburn_uniqueness_aczel (tag: unclear)
  Relation between the paper passage and the cited Recognition theorem: unclear.
  Paper passage: Under the Brier score mechanism with n ≥ 2 agents whose beliefs are correlated and outcome Pr(y=1 | b_1, ..., b_n) = (1/n) ∑_j b_j, reporting m_i = b_i is not the Brier-optimal strategy. The Brier-optimal report for agent i is m*_i = E[y | b_i] ≠ b_i.
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.