Corruptions of Supervised Learning Problems: Typology and Mitigations
Pith reviewed 2026-05-24 07:12 UTC · model grok-4.3
The pith
Markov kernels on data distributions unify all corruptions in supervised learning and enable loss corrections beyond labels.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
By focusing on changes to the underlying probability distributions via Markov kernels, the approach constructs a provably exhaustive corruption framework that unifies existing models, enables a systematic comparison of Bayes risks in clean and corrupted settings, and supplies loss-correction formulas for attribute and joint corruption under a generalized paradigm with weaker requirements.
What carries the argument
Markov kernels applied to the joint distribution of inputs and labels, which induce the corrupted distributions and thereby determine the altered loss and hypothesis class.
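A minimal finite-state sketch of this mechanism, not taken from the paper: when there are finitely many (x, y) pairs, a Markov kernel is a row-stochastic table and the corrupted joint is its pushforward. The names `push_forward`, `clean`, and `label_flip`, and all numbers, are illustrative assumptions.

```python
from itertools import product


def push_forward(joint, kernel):
    """Apply a Markov kernel to a finite joint distribution.

    joint:  dict mapping (x, y) -> probability
    kernel: dict mapping (x, y) -> dict mapping (x', y') -> probability,
            where every inner dict sums to 1 (row-stochastic)
    """
    corrupted = {}
    for state, p in joint.items():
        for new_state, k in kernel[state].items():
            corrupted[new_state] = corrupted.get(new_state, 0.0) + p * k
    return corrupted


# Hypothetical clean joint over one binary attribute and one binary label.
clean = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.4}

# Label-only kernel (assumed for illustration): keep x, flip y with probability 0.2.
flip = 0.2
label_flip = {
    (x, y): {(x, y): 1 - flip, (x, 1 - y): flip}
    for x, y in product((0, 1), repeat=2)
}

corrupted = push_forward(clean, label_flip)
```

In this toy case the attribute marginal is unchanged because the kernel never moves mass between values of x; an attribute or joint kernel would simply use different rows.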
If this is right
- Existing corruption models receive a single exhaustive framework and consistent nomenclature.
- Label corruptions affect only the loss function while attribute corruptions affect both loss and hypothesis class.
- Loss-correction methods extend to dependent corruption types and to attribute and joint cases.
- The classical loss-correction paradigm must be replaced by one with weaker requirements.
- Bayes-risk comparisons become a systematic tool for predicting corruption consequences.
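For orientation, the classical label-noise case that these claims generalize can be sketched as follows. This is the standard backward correction for class-conditional binary label noise from the label-noise literature, not the paper's attribute or joint formulas, which this page does not reproduce. The function names and the noise rates are assumptions for illustration.

```python
def invert_2x2(m):
    """Invert a 2x2 matrix given as nested lists."""
    (a, b), (c, d) = m
    det = a * d - b * c
    if abs(det) < 1e-12:
        raise ValueError("noise matrix is singular; correction undefined")
    return [[d / det, -b / det], [-c / det, a / det]]


def backward_corrected_losses(losses, noise):
    """Return corrected losses, one per observed label.

    losses[y]          clean loss of one fixed prediction against label y
    noise[y][y_tilde]  P(observed label y_tilde | clean label y)

    The result satisfies: averaging corrected[y_tilde] over the noise
    row for y gives back losses[y].
    """
    inv = invert_2x2(noise)
    return [sum(inv[i][j] * losses[j] for j in range(2)) for i in range(2)]


# Hypothetical noise rates and clean losses for a single prediction.
noise = [[0.8, 0.2], [0.3, 0.7]]
losses = [0.1, 2.3]

corrected = backward_corrected_losses(losses, noise)

# Averaging the corrected loss over the noisy label recovers the clean loss.
recovered = [sum(noise[y][t] * corrected[t] for t in range(2)) for y in range(2)]
```

The correction needs an invertible noise matrix, which is one of the requirements a weaker paradigm would presumably relax.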
Where Pith is reading between the lines
- The kernel view could guide construction of algorithms that correct attribute corruptions without assuming independence.
- Data-collection pipelines could be audited by estimating the effective Markov kernel they apply.
- The same distribution-change language might transfer to settings beyond supervised learning where distributions are altered.
- Empirical tests could check whether every observed corruption type admits a Markov-kernel representation.
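The auditing idea above could look like the following sketch in the simplest, label-only case. It assumes a small audited subset with trusted labels is available, which the page does not state, and estimates the kernel by counting. `estimate_label_kernel` and the `audit` data are hypothetical.

```python
from collections import Counter, defaultdict


def estimate_label_kernel(pairs):
    """Estimate P(observed label | clean label) from (clean, observed) pairs."""
    counts = defaultdict(Counter)
    for clean, observed in pairs:
        counts[clean][observed] += 1
    return {
        clean: {obs: n / sum(row.values()) for obs, n in row.items()}
        for clean, row in counts.items()
    }


# Hypothetical audit: trusted label first, pipeline label second.
audit = (
    [("cat", "cat")] * 8
    + [("cat", "dog")] * 2
    + [("dog", "dog")] * 9
    + [("dog", "cat")] * 1
)

kernel = estimate_label_kernel(audit)
```

Each row of the estimate is an empirical conditional distribution, so it sums to one by construction; estimating attribute or joint kernels would need paired clean and corrupted inputs as well.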
Load-bearing premise
Every modification to a supervised learning problem, including changes to the model class and loss, can be captured by applying Markov kernels solely to the data-generating distributions.
What would settle it
A concrete corruption scenario that alters the learning problem yet cannot be expressed as the result of any Markov kernel acting on the original input-label distribution.
Original abstract
Corruption is notoriously widespread in data collection. Despite extensive research, the existing literature predominantly focuses on specific settings and learning scenarios, lacking a unified view of corruption modelization and mitigation. In this work, we develop a general theory of corruption, which incorporates all modifications to a supervised learning problem, including changes in model class and loss. Focusing on changes to the underlying probability distributions via Markov kernels, our approach leads to three novel opportunities. First, it enables the construction of a novel, provably exhaustive corruption framework, distinguishing among different corruption types. This serves to unify existing models and establish a consistent nomenclature. Second, it facilitates a systematic analysis of corruption's consequences on learning tasks, by comparing Bayes risks in the clean and corrupted scenarios. Notably, while label corruptions affect only the loss function, attribute corruptions additionally influence the hypothesis class. Third, building upon these results, we investigate mitigations for various corruption types. We expand existing loss-correction methods for label corruption to handle dependent corruption types. Our findings highlight the necessity to generalize this classical corruption-corrected learning framework to a new paradigm with weaker requirements to encompass more corruption types. We provide such a paradigm as well as loss correction formulas in the attribute and joint corruption cases.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript develops a general theory of corruption in supervised learning by modeling all modifications—including to the hypothesis class and loss—via Markov kernels applied to the data-generating joint distribution P(X,Y). This is claimed to yield a provably exhaustive corruption typology that unifies existing models, a Bayes-risk comparison showing label corruptions affect only the loss while attribute corruptions also affect the hypothesis class, and explicit loss-correction formulas for attribute and joint corruptions under a generalized paradigm with weaker requirements than classical label-noise correction.
Significance. If the exhaustiveness claim and the loss-correction derivations hold, the work would supply a unifying modeling language and systematic mitigation approach for a broad range of corruptions, moving the literature beyond case-by-case treatments of label noise or specific attribute shifts. The Bayes-risk distinction between corruption types would be a useful organizing principle for choosing correction strategies.
major comments (4)
- [Abstract] The central claim that Markov kernels on P(X,Y) alone induce 'all modifications to a supervised learning problem, including changes in model class and loss' is asserted without an explicit construction or proof showing how an arbitrary restriction or expansion of the hypothesis class H is realized solely by the kernel; this construction is load-bearing for both the exhaustiveness and the unification claims.
- [Abstract] The 'provably exhaustive' corruption framework is announced, but no proof of exhaustiveness, no enumeration of the corruption types, and no verification that every possible modification is captured appear in the manuscript; without these the typology cannot be assessed as exhaustive.
- [Abstract] The loss-correction formulas for attribute and joint corruption cases are promised under a 'generalized paradigm with weaker requirements,' yet neither the paradigm nor the formulas are supplied; these formulas are the concrete output needed to substantiate the mitigation contribution.
- [Abstract] The Bayes-risk comparison is described qualitatively ('label corruptions affect only the loss function, attribute corruptions additionally influence the hypothesis class'), but no explicit risk expressions, no clean-versus-corrupted risk difference, and no derivation showing why the hypothesis class is unaffected by label kernels are given; this distinction is load-bearing for the claimed systematic analysis.
Simulated Author's Rebuttal
We thank the referee for their careful reading and constructive critique of the abstract. The comments correctly identify that several central claims are asserted at a high level without sufficient explicit support or cross-references in the current manuscript. We agree that these points require clarification and will make the requested additions and revisions to the abstract and body to substantiate the claims.
Point-by-point responses
- Referee: [Abstract] The central claim that Markov kernels on P(X,Y) alone induce 'all modifications to a supervised learning problem, including changes in model class and loss' is asserted without an explicit construction or proof showing how an arbitrary restriction or expansion of the hypothesis class H is realized solely by the kernel; this construction is load-bearing for both the exhaustiveness and the unification claims.
  Authors: We agree the abstract would be strengthened by an explicit pointer to the construction. In the full manuscript (Section 3), we define a corruption as a Markov kernel K acting on the joint P(X,Y) and show that the induced marginals and conditionals can render certain hypotheses in H suboptimal or infeasible under the corrupted measure, thereby realizing effective restrictions or expansions of the hypothesis class without altering H itself. We will revise the abstract to include a one-sentence reference to this construction and the relevant section. Revision: yes.
- Referee: [Abstract] The 'provably exhaustive' corruption framework is announced, but no proof of exhaustiveness, no enumeration of the corruption types, and no verification that every possible modification is captured appear in the manuscript; without these the typology cannot be assessed as exhaustive.
  Authors: The referee is correct that the abstract announces exhaustiveness without supplying the supporting argument or enumeration in the visible text. The manuscript classifies corruptions according to whether the kernel acts on the label marginal, the attribute marginal, or the joint; exhaustiveness follows from the fact that any measurable transformation of P(X,Y) can be represented by some Markov kernel. We will add a short paragraph (or subsection) enumerating the three primary types with the supporting argument and will update the abstract to reference this enumeration. Revision: yes.
- Referee: [Abstract] The loss-correction formulas for attribute and joint corruption cases are promised under a 'generalized paradigm with weaker requirements,' yet neither the paradigm nor the formulas are supplied; these formulas are the concrete output needed to substantiate the mitigation contribution.
  Authors: We acknowledge that the abstract promises the formulas and the generalized paradigm but does not exhibit them. The derivations appear in Section 5, where we relax the classical independence assumption between clean and corrupted labels and obtain explicit correction terms for attribute and joint kernels. We will revise the abstract to state that the formulas are derived in Section 5 and will ensure the paradigm is clearly named and contrasted with prior work. Revision: yes.
- Referee: [Abstract] The Bayes-risk comparison is described qualitatively ('label corruptions affect only the loss function, attribute corruptions additionally influence the hypothesis class'), but no explicit risk expressions, no clean-versus-corrupted risk difference, and no derivation showing why the hypothesis class is unaffected by label kernels are given; this distinction is load-bearing for the claimed systematic analysis.
  Authors: The comment is accurate: the abstract gives only the qualitative distinction. Section 4 supplies the explicit expressions R*(P) versus R*_K(P_K) and shows that a label-only kernel leaves the argmin over H unchanged while modifying the loss, whereas an attribute kernel changes both the effective loss and the measure with respect to which the risk is evaluated. We will add a concise statement of the risk difference to the abstract together with a reference to Section 4. Revision: yes.
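The kind of statement the referee asks for in the label-only case can be sketched with standard definitions. This is a hedged reconstruction, not the manuscript's Section 4: the loss symbol \ell, the corrected loss \tilde\ell, the label kernel K_Y, and the choice to take the infimum over H are assumptions made here for illustration.

```latex
% Risk of a hypothesis and its infimum over the class H (assumed notation).
R^*(P) \;=\; \inf_{h \in \mathcal{H}} \; \mathbb{E}_{(X,Y) \sim P}\big[\ell(h(X), Y)\big]

% Corrupted joint: pushforward of P through a Markov kernel K.
P_K(B) \;=\; \int K(z, B)\, P(\mathrm{d}z)

% Label-only kernel K_Y: if a corrected loss \tilde\ell satisfies, for all x and y,
\int \tilde\ell\big(h(x), \tilde y\big)\, K_Y(y, \mathrm{d}\tilde y) \;=\; \ell\big(h(x), y\big),

% then the corrected risk on corrupted data equals the clean risk for every h,
\mathbb{E}_{(X,\tilde Y) \sim P_K}\big[\tilde\ell(h(X), \tilde Y)\big]
  \;=\; \mathbb{E}_{(X,Y) \sim P}\big[\ell(h(X), Y)\big]
  \qquad \text{for all } h \in \mathcal{H},

% so the minimizers over H coincide and only the loss has changed.
```

No analogous identity is written here for attribute kernels, because the page gives no formula for that case.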
Circularity Check
No significant circularity; framework derived from Markov kernel modeling choice
full rationale
The paper models corruptions via Markov kernels applied to data-generating distributions and derives an exhaustive framework, Bayes risk distinctions, and loss corrections from that choice. No equations, fitted parameters, or self-citations are shown reducing any central claim (exhaustive unification, risk comparisons, or correction formulas) to a tautology or input by construction. The derivation is self-contained against the stated modeling assumptions; the reader's assessment of score 2 aligns with absence of load-bearing self-reference or definitional collapse.
Axiom & Free-Parameter Ledger
axioms (1)
- Domain assumption: all modifications to a supervised learning problem can be represented by Markov kernels acting on the underlying probability distributions.
invented entities (1)
- Provably exhaustive corruption framework (no independent evidence)