Decision Tree Learning for Uncertain Clinical Measurements

Anders Jonsson; Bart Bijnens; Cecília Nunes; Hélène Langet; Mathieu De Craene; Oscar Camara

arxiv: 1907.11325 · v1 · pith:EMIKERXUnew · submitted 2019-07-25 · 📊 stat.AP

Cecília Nunes, Hélène Langet, Mathieu De Craene, Oscar Camara, Bart Bijnens, Anders Jonsson

Pith reviewed 2026-05-24 15:31 UTC · model grok-4.3

classification 📊 stat.AP

keywords: decision trees, uncertain data, clinical measurements, probabilistic thresholds, soft training, noise robustness, regularization, medical diagnosis

The pith

Modeling uncertainty as noise only during decision tree training produces smaller trees that retain accuracy as noise increases.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper separates the use of probabilistic thresholds into three distinct phases of decision tree construction and use: locating split values, assigning training examples to branches, and issuing predictions on new cases. It tests each phase independently on data where measurement error is represented by independent noise distributions. The training phases produce a regularizing effect that shrinks the resulting trees while accuracy holds steady or improves slightly with rising noise; the prediction phase alone yields no such gain. This separation clarifies that the benefit comes from how the tree is grown rather than how it is later applied.
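As a minimal sketch of what a probabilistic threshold can look like, the Python below assumes zero-mean Gaussian noise with a known standard deviation `sigma` (the paper's Figure 2 refers to a normal uncertainty model, but its exact formulation may differ). The names `prob_left` and `soft_split` and the sample values are hypothetical. Each training instance is sent to both children with fractional weights, which corresponds to the second phase, assigning training examples to branches.

```python
import math


def prob_left(x, tau, sigma):
    """P(x + eps <= tau) for eps ~ Normal(0, sigma^2); a hard split when sigma == 0."""
    if sigma == 0:
        return 1.0 if x <= tau else 0.0
    return 0.5 * (1.0 + math.erf((tau - x) / (sigma * math.sqrt(2.0))))


def soft_split(values, weights, tau, sigma):
    """Divide each instance weight between the left and right child."""
    left = [w * prob_left(x, tau, sigma) for x, w in zip(values, weights)]
    right = [w - l for w, l in zip(weights, left)]
    return left, right


# Hypothetical measurements of one feature, each with unit weight.
values = [40.0, 50.0, 60.0]
weights = [1.0, 1.0, 1.0]
left, right = soft_split(values, weights, tau=50.0, sigma=5.0)
# The instance sitting exactly on the threshold is shared half and half.
```

Setting `sigma=0.0` recovers the ordinary hard split, so the same routine can cover both the standard and the soft variant.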

Core claim

Soft training approaches that realize noise distributions when searching for split thresholds and when splitting training instances achieve a regularizing effect, leading to significant reductions in decision tree size while maintaining accuracy for increased noise; soft evaluation during prediction shows no benefit in handling noise.
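One way to picture soft threshold search is to score a candidate threshold with expected (fractional) class counts instead of hard counts. The sketch below is an illustration under assumptions, not the authors' implementation: it assumes Gaussian noise with known `sigma`, entropy-based information gain, and integer class labels; `soft_information_gain` and the toy data are hypothetical.

```python
import math


def prob_left(x, tau, sigma):
    """P(x + eps <= tau) for eps ~ Normal(0, sigma^2); a hard split when sigma == 0."""
    if sigma == 0:
        return 1.0 if x <= tau else 0.0
    return 0.5 * (1.0 + math.erf((tau - x) / (sigma * math.sqrt(2.0))))


def entropy(counts):
    """Shannon entropy (bits) of a vector of non-negative, possibly fractional counts."""
    total = sum(counts)
    if total <= 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)


def soft_information_gain(xs, ys, tau, sigma, n_classes=2):
    """Information gain of threshold tau using expected per-class counts."""
    parent = [0.0] * n_classes
    left = [0.0] * n_classes
    for x, y in zip(xs, ys):
        parent[y] += 1.0
        left[y] += prob_left(x, tau, sigma)
    # Clamp to guard against tiny negative values from floating-point rounding.
    right = [max(0.0, p - l) for p, l in zip(parent, left)]
    n = sum(parent)
    if n == 0:
        return 0.0
    return (entropy(parent)
            - (sum(left) / n) * entropy(left)
            - (sum(right) / n) * entropy(right))


# Toy data: one feature, two perfectly separable classes.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [0, 0, 1, 1]
gain_hard = soft_information_gain(xs, ys, tau=2.5, sigma=0.0)
gain_soft = soft_information_gain(xs, ys, tau=2.5, sigma=1.0)
```

In this toy example the gain of the same threshold is lower when noise is modeled (`gain_soft < gain_hard`), which is consistent with, though not proof of, the regularizing effect the paper reports.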

What carries the argument

A probabilistic decision tree that independently realizes noise distributions in three phases: (1) searching for split thresholds, (2) splitting the training instances, and (3) generating predictions for unseen data.

If this is right

Decision trees trained with soft thresholds can be smaller yet equally accurate when input measurements contain noise.
The regularization benefit arises specifically from the training phases rather than from probabilistic prediction.
Larger noise levels do not degrade accuracy when the soft training steps are used.
Interpretability is preserved because the final tree structure remains a standard decision tree.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

The same three-phase separation could be applied to other tree-based methods such as random forests to test whether the regularization generalizes.
If clinical measurements exhibit systematic bias rather than zero-mean noise, the observed size reduction may not hold.
The approach could be extended by learning the noise distribution parameters jointly with the tree rather than assuming them known.

Load-bearing premise

That modeling measurement uncertainty as independently realized noise distributions across the three phases is enough to capture the relevant uncertainty structure in clinical data.

What would settle it

A dataset in which measurement errors are correlated across features or across patients, tested under the same three-phase protocol, where the size-reduction effect disappears or reverses.
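To make such a test concrete, one would need a noise generator whose errors are correlated across features. The sketch below is one simple construction, offered as an assumption rather than anything taken from the paper: a shared Gaussian component is mixed with a private one so that every pair of features has correlation `rho`. The function names and parameter values are hypothetical.

```python
import math
import random


def correlated_noise(n_features, sigma, rho, rng):
    """One noise vector with per-feature std sigma and pairwise correlation rho."""
    if not 0.0 <= rho <= 1.0:
        raise ValueError("rho must lie in [0, 1] for this construction")
    shared = rng.gauss(0.0, 1.0)  # component common to all features of one patient
    return [
        sigma * (math.sqrt(rho) * shared
                 + math.sqrt(1.0 - rho) * rng.gauss(0.0, 1.0))
        for _ in range(n_features)
    ]


def sample_correlation(rho, n_samples=20000, seed=0):
    """Empirical correlation between two noise components, as a sanity check."""
    rng = random.Random(seed)
    pairs = [correlated_noise(2, 1.0, rho, rng) for _ in range(n_samples)]
    mean_a = sum(a for a, _ in pairs) / n_samples
    mean_b = sum(b for _, b in pairs) / n_samples
    cov = sum((a - mean_a) * (b - mean_b) for a, b in pairs) / n_samples
    var_a = sum((a - mean_a) ** 2 for a, _ in pairs) / n_samples
    var_b = sum((b - mean_b) ** 2 for _, b in pairs) / n_samples
    return cov / math.sqrt(var_a * var_b)
```

With `rho=0.0` the generator reduces to independent noise, so the same protocol can be run at several values of `rho` to see whether the size reduction persists.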

Figures

Figures reproduced from arXiv: 1907.11325 by Anders Jonsson, Bart Bijnens, Cecília Nunes, Hélène Langet, Mathieu De Craene, Oscar Camara.

**Figure 2.** Probability of misclassifying (x(t), 1) as a function of the standard deviation σ of the normal uncertainty model. In 2a, the uncertainty model is considered only for the training instances x(1) and x(4), simulating soft training propagation (STP), while x(t) is certain. In 2b, x(t) has normally distributed noise, as in soft evaluation (SE).

**Figure 3.** (a) Ejection fraction (EF) data of the Data Science Bo…

**Figure 4.** Information gain computation the ejection fraction…

**Figure 5.** Results of the experiments displayed as boxplots of t…

Original abstract

Clinical decision requires reasoning in the presence of imperfect data. DTs are a well-known decision support tool, owing to their interpretability, fundamental in safety-critical contexts such as medical diagnosis. However, learning DTs from uncertain data leads to poor generalization, and generating predictions for uncertain data hinders prediction accuracy. Several methods have suggested the potential of probabilistic decisions at the internal nodes in making DTs robust to uncertainty. Some approaches only employ probabilistic thresholds during evaluation. Others also consider the uncertainty in the learning phase, at the expense of increased computational complexity or reduced interpretability. The existing methods have not clarified the merit of a probabilistic approach in the distinct phases of DT learning, nor when the uncertainty is present in the training or the test data. We present a probabilistic DT approach that models measurement uncertainty as a noise distribution, independently realized: (1) when searching for the split thresholds, (2) when splitting the training instances, and (3) when generating predictions for unseen data. The soft training approaches (1, 2) achieved a regularizing effect, leading to significant reductions in DT size, while maintaining accuracy, for increased noise. Soft evaluation (3) showed no benefit in handling noise.
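Soft evaluation, phase (3) in the abstract, can be sketched as a prediction that follows both branches of every node and mixes the leaf class distributions by the probability of reaching each leaf. The code below is a hypothetical illustration under assumed Gaussian noise with known per-feature standard deviations; the tree encoding, `TREE`, and `soft_predict` are invented for this example and are not taken from the paper.

```python
import math


def prob_left(x, tau, sigma):
    """P(x + eps <= tau) for eps ~ Normal(0, sigma^2); a hard split when sigma == 0."""
    if sigma == 0:
        return 1.0 if x <= tau else 0.0
    return 0.5 * (1.0 + math.erf((tau - x) / (sigma * math.sqrt(2.0))))


# Hypothetical tree: an internal node is (feature_index, threshold, left, right);
# a leaf is a list of class probabilities.
TREE = (0, 50.0, [0.9, 0.1], (1, 3.0, [0.2, 0.8], [0.6, 0.4]))


def soft_predict(node, x, sigmas):
    """Class distribution for x, weighting each leaf by its reach probability."""
    if isinstance(node, list):  # leaf
        return node
    feature, tau, left, right = node
    p = prob_left(x[feature], tau, sigmas[feature])
    dist_left = soft_predict(left, x, sigmas)
    dist_right = soft_predict(right, x, sigmas)
    return [p * a + (1.0 - p) * b for a, b in zip(dist_left, dist_right)]


# Feature 0 sits exactly on its threshold and is noisy; feature 1 is certain.
soft = soft_predict(TREE, [50.0, 10.0], [5.0, 0.0])
# With all sigmas at zero the routine returns the single leaf a hard tree would reach.
hard = soft_predict(TREE, [40.0, 0.0], [0.0, 0.0])
```

Because the tree itself is unchanged, this step can be switched on or off independently of how the tree was trained, which is what allows the phases to be compared in isolation.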

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

The paper isolates probabilistic noise handling to training phases versus evaluation and finds regularization only from the former, but the independent noise model is a potential limitation for real clinical data.

The one thing to take away is that modeling measurement uncertainty as independent noise during the training phases of decision tree construction produces smaller trees at the same accuracy level when noise increases, while applying it only at evaluation time provides no such benefit. What is new here is the explicit separation of the probabilistic treatment across the three phases and the finding that the regularization comes from the learning steps rather than prediction. Prior approaches mixed these without isolating the effects.

The paper does a reasonable job of showing this differential impact through its experiments, at least according to the abstract, and it keeps the focus on maintaining interpretability, which matters for clinical use.

The main concern is whether treating the noise as independently realized in each phase captures real clinical measurement errors. Those errors are often correlated across variables due to patient factors or instrument issues, and independent sampling might create an artificial regularization that disappears under more realistic dependence structures. The abstract also gives no information on the datasets used, the specific noise distributions, the baselines compared against, or any statistical significance, so the strength of the size reduction claim is difficult to judge without the full details.

This work is aimed at people developing interpretable decision support tools for medical applications where data is noisy. A reader looking for ways to make decision trees more robust without losing transparency would find the phase distinction useful. I would send it to peer review because the core idea of isolating the phases is worth checking with proper experiments, even if revisions are needed to address the noise model.

Referee Report

2 major / 0 minor

Summary. The manuscript introduces a probabilistic decision tree framework that models measurement uncertainty via independent noise distributions realized separately in three phases: (1) split threshold search, (2) training instance splitting, and (3) prediction on unseen data. It reports that soft training in phases (1) and (2) produces a regularizing effect with significantly smaller trees at maintained accuracy under increased noise, while soft evaluation in phase (3) yields no benefit.

Significance. If the empirical results hold under realistic conditions, the phase-specific analysis offers a clear way to isolate where probabilistic handling of uncertainty improves DT practicality in clinical settings, particularly by reducing model size (and thus improving interpretability) without accuracy loss. The explicit separation of training versus evaluation phases is a methodological strength that could guide future work on robust DTs.

major comments (2)

[Abstract] Abstract (noise model): The central claim that soft training yields smaller trees at maintained accuracy rests on treating uncertainty as independently realized noise distributions across the three phases. Clinical measurements commonly exhibit correlated errors (e.g., shared instrument drift or patient physiology across features), which independent per-phase sampling does not reproduce. This independence assumption is load-bearing for the practical conclusion; without experiments using multivariate or correlated noise, the reported regularization benefit may not transfer to clinical data.
[Abstract] Abstract (empirical support): The abstract asserts 'significant reductions in DT size' and 'maintaining accuracy' but supplies no information on datasets, noise distribution families, baseline comparators, number of replicates, or statistical tests. These details are required to evaluate whether the regularization effect is robust or an artifact of the chosen experimental conditions.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback on our manuscript. We address each major comment below. We agree that the abstract requires additional empirical details and will revise it to include them. For the noise model, we will add discussion of the independence assumption as a modeling choice while noting its implications.

Referee: [Abstract] Abstract (noise model): The central claim that soft training yields smaller trees at maintained accuracy rests on treating uncertainty as independently realized noise distributions across the three phases. Clinical measurements commonly exhibit correlated errors (e.g., shared instrument drift or patient physiology across features), which independent per-phase sampling does not reproduce. This independence assumption is load-bearing for the practical conclusion; without experiments using multivariate or correlated noise, the reported regularization benefit may not transfer to clinical data.

Authors: Our framework deliberately models uncertainty via independent noise distributions realized separately in each phase precisely to isolate the effects of soft decisions during threshold search, instance splitting, and prediction. This separation is central to the phase-specific analysis. While we recognize that correlated errors occur in clinical measurements, the regularization benefit of soft training is shown under the independent model. We will revise the manuscript to explicitly state this modeling assumption and discuss its potential limitations for direct applicability to correlated clinical data.
Revision: partial

Referee: [Abstract] Abstract (empirical support): The abstract asserts 'significant reductions in DT size' and 'maintaining accuracy' but supplies no information on datasets, noise distribution families, baseline comparators, number of replicates, or statistical tests. These details are required to evaluate whether the regularization effect is robust or an artifact of the chosen experimental conditions.

Authors: We agree that the abstract would be strengthened by including these details. In the revised manuscript we will expand the abstract to specify the datasets, noise distribution families, baseline comparators, number of replicates, and statistical tests used to support the reported reductions in tree size and maintained accuracy.
Revision: yes

Circularity Check

0 steps flagged

No circularity: empirical claims rest on experimental comparisons, not derivations or self-referential reductions.

The paper describes probabilistic decision tree methods that model measurement uncertainty as independent noise distributions applied in three phases (threshold search, instance splitting, prediction). It reports empirical results showing regularization effects from soft training phases. No equations, derivations, or first-principles claims are present that reduce outputs to inputs by construction. No self-citations are invoked as load-bearing uniqueness theorems or ansatzes. The central claims are statistical outcomes from experiments, which are externally falsifiable and do not exhibit any of the enumerated circularity patterns.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

Abstract-only review provides no equations, so free parameters, axioms, and invented entities cannot be enumerated in detail; the core modeling choice of independent noise realizations is treated as a domain assumption.

axioms (1)

Domain assumption: Measurement uncertainty in clinical data can be adequately represented as independent noise distributions realized separately during split search, instance assignment, and prediction.
This modeling choice underpins the three-phase approach described in the abstract.
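Written out, the assumption can be read as a diagonal noise covariance. The notation below is an illustrative assumption, not the paper's own: x is the true feature vector of one patient, x̃ the observed one, d the number of features, and the Gaussian family is suggested by the normal uncertainty model mentioned in the Figure 2 caption.

```latex
\tilde{\mathbf{x}} = \mathbf{x} + \boldsymbol{\varepsilon}, \qquad
\boldsymbol{\varepsilon} \sim \mathcal{N}(\mathbf{0}, \Sigma), \qquad
\Sigma = \operatorname{diag}(\sigma_1^2, \dots, \sigma_d^2)
\;\Longleftrightarrow\;
\operatorname{Cov}(\varepsilon_j, \varepsilon_k) = 0 \quad (j \neq k)
```

A correlated error model would instead allow nonzero off-diagonal entries in Σ, which is the case the referee's first major comment asks about.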

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel
Relation between the paper passage and the cited Recognition theorem: unclear
Paper passage: "models measurement uncertainty as a noise distribution, independently realized: (1) when searching for the split thresholds, (2) when splitting the training instances, and (3) when generating predictions"

IndisputableMonolith/Foundation/RealityFromDistinction.lean · reality_from_one_distinction
Relation between the paper passage and the cited Recognition theorem: unclear
Paper passage: "The soft training approaches (1, 2) achieved a regularizing effect, leading to significant reductions in DT size, while maintaining accuracy, for increased noise."

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

[1] [1]

The coming of age o f artiﬁcial intelligence in medicine,

V . L. Patel, E. H. Shortliffe, M. Stefanelli, P . Szolovits, M . R. Berthold, R. Bellazzi, and A. Abu-Hanna, “The coming of age o f artiﬁcial intelligence in medicine,” Artiﬁcial Intelligence in Medicine , vol. 46, no. 1, pp. 5–17, 2009

work page 2009

[2] [2]

Health Informatics via Machine Learning for the Clinical M an- agement of Patients,

D. A. Clifton, K. E. Niehaus, P . Charlton, and G. W. Colopy , “Health Informatics via Machine Learning for the Clinical M an- agement of Patients,” Y earb Med Inform, vol. 10, no. 1, pp. 38–43, 2015

work page 2015

[3] [3]

Exploratory medical k nowl- edge discovery: Experiences and issues,

J. Roddick, P . Fule, and W. Graco, “Exploratory medical k nowl- edge discovery: Experiences and issues,” ACM SIGKDD Explo- rations Newsletter, pp. 2–7, 2003

work page 2003

[4] [4]

Intelligent data analysis for medical diagnosis: Using ma chine learning and temporal abstraction,

N. Lavraˇ c, I. Kononenko, E. Keravnou, M. Kukar, and B. Zu pan, “Intelligent data analysis for medical diagnosis: Using ma chine learning and temporal abstraction,” AI Communications , vol. 11, no. 3, pp. 191–218, 1998

work page 1998

[5] [5]

Data quality: A sta tistical perspective,

A. F. Karr, A. P . Sanil, and D. L. Banks, “Data quality: A sta tistical perspective,” Statistical Methodology , vol. 3, no. 2, pp. 137–173, 2006

work page 2006

[6] [6]

Evaluat ion of measurement data - guide to the expression of uncertainty in measurement,

W. G. . Joint Committee for Guides in Metrology , “Evaluat ion of measurement data - guide to the expression of uncertainty in measurement,” in T ech. Rep. JCGM 100: 2008 (BIPM, IEC, IFCC, ILAC, ISO, IUP AC, IUP AP and OIML, 2008

work page 2008

[7] [7]

Uniqueness of medical da ta mining,

K. J. Cios and G. William Moore, “Uniqueness of medical da ta mining,” Artiﬁcial Intelligence in Medicine , vol. 26, no. 1-2, pp. 1–24, 2002

work page 2002

[8] [8]

Intra- and interobserver variability in th e mea- surements of abdominal aortic and common iliac artery diame ter with computed tomography . The Tromsø study,

K. Singh, B. K. Jacobsen, S. Solberg, K. H. Bønaa, S. Kumar, R. B ajic, and E. Arnesen, “Intra- and interobserver variability in th e mea- surements of abdominal aortic and common iliac artery diame ter with computed tomography . The Tromsø study,” European Journal Vascular and Endovascular Surgery, vol. 25, no. 5, pp. 399–407, 2003

work page 2003

[9] [9]

Measuring left ventricular ejecti on fraction-techniques and potential pitfalls,

T. Foley , S. Mankad, N. Anavekar, C. Bonnichsen, M. Morris , T. Miller, and P . Araoz, “Measuring left ventricular ejecti on fraction-techniques and potential pitfalls,” European Cardiology , vol. 8, no. 2, pp. 108–114, 2012

work page 2012

[10] [10]

Comparison of imaging techniques to assess appendage anatomy and measurements for left atrial a p- pendage closure device selection

J. R. Lopez-Minguez, R. Gonzalez-Fernandez, C. Fernan dez-V egas, V . Millan-Nunez, M. E. Fuentes-Canamero, J. M. Nogales-Asensio, J. Doncel-V ecino, M. Y uste Dominguez, L. Garcia Serrano, and D. Sanchez Quintana, “Comparison of imaging techniques to assess appendage anatomy and measurements for left atrial a p- pendage closure device selection.” The Jou...

work page 2014

[11] T. S. Genders, B. S. Ferket, and M. M. Hunink, “The quantitative science of evaluating imaging evidence,” JACC: Cardiovascular Imaging, vol. 10, no. 3, pp. 264–275, 2017

[12] S. de Haan, K. de Boer, J. Commandeur, A. M. Beek, A. C. van Rossum, and C. P. Allaart, “Assessment of left ventricular ejection fraction in patients eligible for ICD therapy: Discrepancy between cardiac magnetic resonance imaging and 2D echocardiography,” Netherlands Heart Journal, vol. 22, no. 10, pp. 449–455, 2014

[13] L. W. Green, “Closing the chasm between research and practice: evidence of and for change,” Health Promotion Journal of Australia, vol. 25, no. 1, pp. 25–29, 2014.

(PREPRINT) IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, SUBMITTED FOR REVIEW, AUGUST 2019

[14] J. Quinlan et al., “Interactive dichotomizer, id3,” Eds. Morgan Kaufmann, Springer-Verlag, 1979

[15] R. Quinlan, C4.5: Programs for Machine Learning. San Mateo, CA: Morgan Kaufmann Publishers, 1993

[16] L. Breiman, J. H. Friedman, R. A. Olshen, and C. J. Stone, Classification and Regression Trees. Belmont, CA: Wadsworth International Group, 1984

[17] G. V. Kass, “An exploratory technique for investigating large quantities of categorical data,” Applied Statistics, pp. 119–127, 1980

[18] S. F. Weng, J. Reps, J. Kai, J. M. Garibaldi, and N. Qureshi, “Can machine-learning improve cardiovascular risk prediction using routine clinical data?” PLOS ONE, vol. 12, no. 4, 2017

[19] “Regulation (EU) 2016/679 of the European Parliament and of the Council of 27 April 2016 on the protection of natural persons with regard to the processing of personal data and on the free movement of such data, and repealing Directive 95/46/EC,” 2016 O.J. L 119, 4.5.:1–88

[20] J. R. Quinlan, “Decision trees as probabilistic classifiers,” in Proceedings of the 4th International Workshop on Machine Learning. Morgan Kaufmann, 1987, pp. 31–37

[21] J. Dvořák and P. Savický, “Softening splits in decision trees using simulated annealing,” in Adaptive and Natural Computing Algorithms, 8th International Conference, ICANNGA 2007, Warsaw, Poland, April 11-14, 2007, Proceedings, Part I, 2007, pp. 721–729

[22] S. Tsang, B. Kao, K. Y. Yip, W.-S. Ho, and S. D. Lee, “Decision trees for uncertain data,” IEEE Transactions on Knowledge and Data Engineering, vol. 23, no. 1, pp. 64–78, 2011

[23] O. Irsoy, O. T. Yıldız, and E. Alpaydın, “Soft decision trees,” in Pattern Recognition (ICPR), 2012 21st International Conference on. IEEE, 2012, pp. 1819–1822

[24] Y. Yuan, “Induction of fuzzy decision trees,” Fuzzy Sets and Systems, vol. 69, no. 2, pp. 125–139, 1995

[25] X. Wang, B. Chen, G. Qian, and F. Ye, “On the optimization of fuzzy decision trees,” Fuzzy Sets and Systems, vol. 112, no. 1, pp. 117–125, May 2000

[26] A. Segatori, F. Marcelloni, and W. Pedrycz, “On Distributed Fuzzy Decision Trees for Big Data,” IEEE Transactions on Fuzzy Systems, pp. 1–1, 2017

[27] J. R. Quinlan, “Probabilistic decision trees,” Machine learning: an artificial intelligence approach, vol. 3, pp. 140–152, 1990

[28] M. I. Jordan and R. A. Jacobs, “Hierarchical mixtures of experts and the EM algorithm,” Neural Computation, vol. 6, no. 2, pp. 181–214, 1994

[29] L. Hyafil and R. L. Rivest, “Constructing optimal binary decision trees is NP-complete,” Information Processing Letters, vol. 5, no. 1, pp. 15–17, 1976

[30] J. R. Quinlan, “Induction of decision trees,” Machine Learning, vol. 1, no. 1, pp. 81–106, 1986

[31] L. Rokach and O. Maimon, “Top-down induction of decision trees classifiers - A survey,” IEEE Transactions on Systems, Man and Cybernetics Part C: Applications and Reviews, vol. 35, no. 4, pp. 476–487, 2005

[32] Quinlan, Ross. Ross Quinlan’s personal homepage. Accessed: 2018-06-03. [Online]. Available: www.rulequest.com/Personal/

[33] J. A. Hoeting, D. Madigan, A. E. Raftery, and C. T. Volinsky, “Bayesian model averaging: a tutorial,” Statistical Science, pp. 382–401, 1999

[34] J. D’Hooge, D. Barbosa, H. Gao, P. Claus, D. Prater, J. Hamilton, P. Lysyansky, Y. Abe, Y. Ito, H. Houle et al., “Two-dimensional speckle tracking echocardiography: standardization efforts based on synthetic ultrasound data,” Eur Heart J Cardiovasc Imaging, vol. 17, no. 6, pp. 693–701, 2016

[35] M. Kearns, Y. Mansour, A. Y. Ng, and D. Ron, “An experimental and theoretical comparison of model selection methods,” Machine Learning, vol. 50, pp. 7–50, 1997

[36] T. Niblett and I. Bratko, “Learning decision rules in noisy domains,” in Proceedings of Expert Systems ’86, The 6th Annual Technical Conference on Research and Development in Expert Systems III. Cambridge University Press, 1986, pp. 25–34

[37] M. Lichman, “UCI Machine Learning Repository,” 2013. [Online]. Available: http://archive.ics.uci.edu/ml

[38] J. Alcalá-Fdez, A. Fernández, J. Luengo, J. Derrac, S. García, L. Sánchez, and F. Herrera, “KEEL data-mining software tool: Data set repository, integration of algorithms and experimental analysis framework,” Journal of Multiple-Valued Logic and Soft Computing, vol. 17, no. 2-3, pp. 255–287, 2011

[39] I. Guyon, “Design of experiments for the NIPS 2003 variable selection benchmark,” 2003. [Online]. Available: clopinet.com/isabelle/Projects/NIPS2003

[40] F. Pedregosa, G. Varoquaux, A. Gramfort, V. Michel, B. Thirion, O. Grisel, M. Blondel, P. Prettenhofer, R. Weiss, V. Dubourg, J. Vanderplas, A. Passos, D. Cournapeau, M. Brucher, M. Perrot, and E. Duchesnay, “Scikit-learn: Machine learning in Python,” Journal of Machine Learning Research, vol. 12, pp. 2825–2830, 2011

[41] D. Jensen and T. Oates, “The effects of training set size on decision tree complexity,” in Proceedings of the 14th International Conference on Machine Learning, 1999, pp. 254–262

[42] National Heart, Lung, and Blood Institute, “Data Science Bowl Cardiac Challenge Data,” 2015. [Online]. Available: www.kaggle.com/c/second-annual-data-science-bowl

[43] P. Ponikowski et al., “2016 ESC Guidelines for the diagnosis and treatment of acute and chronic heart failure: The Task Force for the diagnosis and treatment of acute and chronic heart failure of the European Society of Cardiology (ESC) Developed with the special contribution of the Heart Failure Association (HFA) of the ESC,” European heart journal ...

work page 2016

[44] J. Demsar, “Statistical Comparison of Classifiers over Multiple Data Sets,” Journal of Machine Learning Research, vol. 7, no. 7, pp. 1–30, 2006

[45] F. Wilcoxon, “Individual Comparisons by Ranking Methods,” Biometrics Bulletin, vol. 1, no. 6, pp. 80–83, 1945

[46] C. Clopper and E. Pearson, “The use of confidence or fiducial limits illustrated in the case of the binomial,” Biometrika, vol. 26, no. 4, p. 404, 1934.

APPENDIX A
PARAMETER TUNING

Figures A.1 and A.2 display the average value of the parameters that con...