
arxiv: 2605.20143 · v1 · pith:USMGG3SMnew · submitted 2026-05-19 · 📊 stat.AP · stat.CO · stat.ML

Semi-Parametric Bayesian Additive Regression Trees for Risk Prediction with High-Dimensional Epigenetic Signatures and Low-Dimensional Covariates

Pith reviewed 2026-05-20 03:04 UTC · model grok-4.3

classification 📊 stat.AP · stat.CO · stat.ML
keywords semi-parametric BART · Bayesian additive regression trees · epigenetic signatures · variable selection · risk prediction · multiple myeloma · high-dimensional data · 5-hydroxymethylcytosine

The pith

A semi-parametric BART model places low-dimensional covariates in a parametric component with interpretable coefficients while modeling high-dimensional epigenetic predictors through the tree ensemble.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces spBART to overcome a limitation in standard BART models. Standard BART treats every predictor the same inside the tree ensemble, which hides the separate contributions of a few important covariates and makes variable selection difficult when most predictors are high-dimensional. spBART therefore adds an explicit parametric regression term for the low-dimensional covariates so that their coefficients remain directly interpretable. The high-dimensional epigenetic signatures continue to be handled by the usual BART tree sum, which captures nonlinear effects and interactions. The authors also supply a cross-validation routine that pools posterior inclusion probabilities across folds and applies Bayesian false-discovery-rate control to produce a stable, parsimonious set of selected loci. When fitted to pooled 5-hydroxymethylcytosine profiles from two multiple-myeloma studies, the model reaches an AUC of 0.96 on held-out data while returning only a small number of candidate loci.

Core claim

spBART augments the standard Bayesian additive regression tree ensemble with a parametric linear component for low-dimensional covariates. This separation yields directly interpretable regression coefficients for the covariates while the tree ensemble retains its flexibility for complex, nonlinear associations among the high-dimensional epigenetic predictors. A cross-validation procedure aggregates posterior inclusion probabilities across folds and imposes Bayesian false-discovery-rate control to perform stable variable selection. Applied to genome-wide 5-hydroxymethylcytosine profiles from 869 participants in two multiple-myeloma case-control studies, the model identifies a parsimonious set of candidate loci and achieves an AUC of 0.96 in a held-out validation set.

What carries the argument

The semi-parametric BART (spBART) model, which augments the nonparametric BART tree ensemble with a separate parametric regression component for low-dimensional covariates.
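
As a schematic only, this structure can be written in standard sum-of-trees notation. The symbols are assumptions chosen for illustration: $x_i$ for the low-dimensional covariates, $z_i$ for the epigenetic predictors, and $m$ trees $T_j$ with leaf values $M_j$. The Gaussian error term is also assumed; for a binary case-control outcome a link function such as probit would take its place, and this page does not say which the paper uses.

```latex
y_i = x_i^\top \beta + \sum_{j=1}^{m} g(z_i;\, T_j, M_j) + \varepsilon_i,
\qquad \varepsilon_i \sim \mathcal{N}(0, \sigma^2)
```

In this form $\beta$ carries the interpretable covariate effects, and only $z_i$ enters the trees.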

If this is right

  • Interpretable coefficients are obtained for the effects of low-dimensional covariates such as clinical or demographic factors.
  • Stable variable selection remains feasible in high-dimensional epigenetic settings despite complex dependence among predictors.
  • Strong out-of-sample discrimination (AUC 0.96) is achieved in held-out data for multiple-myeloma risk prediction.
  • A unified modeling framework combines covariate adjustment with flexible tree-based prediction and controlled variable selection.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The parametric-nonparametric split may simplify adjustment for known confounders when the low-dimensional covariates include clinical variables that precede the epigenetic measurements.
  • The same separation could be tested on other paired data types such as gene-expression profiles together with basic demographic covariates.
  • Changing the number of cross-validation folds or the exact Bayesian FDR threshold would provide a direct check on the robustness of the selected loci.
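
One way such a robustness check might be scored is the mean pairwise Jaccard overlap between the locus sets selected under different settings. The sketch below is illustrative, not the paper's procedure; the function names, run labels, and locus names are all hypothetical.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard overlap of two collections of selected loci."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0  # two empty selections agree trivially
    return len(a & b) / len(a | b)


def mean_pairwise_jaccard(selections):
    """Average Jaccard overlap over all pairs of selection runs."""
    pairs = list(combinations(selections, 2))
    if not pairs:
        raise ValueError("need at least two selection runs to compare")
    return sum(jaccard(a, b) for a, b in pairs) / len(pairs)


# Hypothetical selections from three runs with different settings.
runs = {
    "5 folds": ["locus_a", "locus_b", "locus_c"],
    "10 folds": ["locus_a", "locus_b"],
    "10 folds, stricter FDR": ["locus_a"],
}
stability = mean_pairwise_jaccard(list(runs.values()))
print(f"mean pairwise Jaccard: {stability:.3f}")
```

Values near 1 would indicate that the selected set barely moves as settings change; values near 0 would indicate that the selection depends heavily on them.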

Load-bearing premise

The cross-validation procedure that aggregates posterior inclusion probabilities across folds and applies Bayesian false-discovery-rate control produces stable variable selection even when the high-dimensional epigenetic predictors are dependent and interact with the parametric component.
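
To make the premise concrete, here is a minimal sketch of one common form of Bayesian FDR control applied to aggregated posterior inclusion probabilities (PIPs). It assumes the folds are combined by a simple mean and that the selected set is the largest top-ranked set whose average of (1 − PIP) stays at or below a target level; the paper's exact aggregation and thresholding rules are not given on this page, and all names and numbers are hypothetical.

```python
from statistics import mean


def aggregate_pips(fold_pips):
    """Average each predictor's PIP across folds (assumed aggregation rule).

    Every fold is assumed to report a PIP for the same set of predictors.
    """
    names = fold_pips[0].keys()
    return {name: mean(fold[name] for fold in fold_pips) for name in names}


def bayesian_fdr_select(pips, alpha=0.05):
    """Return the largest top-ranked set whose expected FDR is <= alpha.

    The expected FDR of a set is the mean of (1 - PIP) over its members.
    Because predictors are visited in decreasing PIP order, that mean can
    only grow, so the first violation ends the search.
    """
    ranked = sorted(pips.items(), key=lambda kv: kv[1], reverse=True)
    selected, running_error = [], 0.0
    for name, pip in ranked:
        running_error += 1.0 - pip
        if running_error / (len(selected) + 1) > alpha:
            break
        selected.append(name)
    return selected


# Hypothetical PIPs from two cross-validation folds.
fold_pips = [
    {"locus_a": 0.99, "locus_b": 0.95, "locus_c": 0.40, "locus_d": 0.05},
    {"locus_a": 0.97, "locus_b": 0.91, "locus_c": 0.60, "locus_d": 0.15},
]
selected = bayesian_fdr_select(aggregate_pips(fold_pips), alpha=0.05)
print(selected)
```

Note that the (1 − PIP) average treats each PIP as a calibrated posterior probability, which is precisely what strong dependence among predictors can undermine.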

What would settle it

Re-running the analysis on an independent data set of comparable size and observing whether the AUC falls substantially below 0.96 or the selected loci fail to overlap with those reported in the original validation set.
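
For the AUC half of that check, the statistic can be computed directly from held-out labels and predicted risks as the probability that a randomly chosen case is scored above a randomly chosen control. The sketch below uses made-up labels and scores, and the function name is hypothetical; it does not account for any sampling weights.

```python
def auc(labels, scores):
    """AUC as P(score of a case > score of a control), with ties counted as 1/2."""
    cases = [s for y, s in zip(labels, scores) if y == 1]
    controls = [s for y, s in zip(labels, scores) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for case_score in cases:
        for control_score in controls:
            if case_score > control_score:
                wins += 1.0
            elif case_score == control_score:
                wins += 0.5
    return wins / (len(cases) * len(controls))


# Made-up held-out data: 1 = case, 0 = control.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.3, 0.4, 0.2, 0.1]
print(f"AUC = {auc(labels, scores):.3f}")
```

The pairwise loop is quadratic in sample size, which is fine for a cohort of a few hundred; a rank-based formula would be the usual choice for larger data.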

Figures

Figures reproduced from arXiv: 2605.20143 by Brian C.-H. Chiu, Parveen Bhatti, Saurabh Bhandari, Yuan Ji.

Figure 1. Decision-tree diagram with splits "Is X1 > 0.5?" and "Is X2 > −0.2?" and leaf predictions 0.8 (high risk), 0.6 (medium risk), and 0.1 (low risk).

Original abstract

In the era of precision medicine, genome-wide epigenetic modifications offer rich data that could inform risk prediction. However, these data are high-dimensional and exhibit complex dependence structures, which makes it difficult to jointly model them with low-dimensional covariates when the goal is to obtain interpretable effect estimates for covariate adjustment. Standard Bayesian additive regression trees (BART) provide strong predictive performance but treat all predictors uniformly within the tree ensemble, obscuring the contributions of significant covariates and complicating variable selection in high-dimensional settings. We propose a semi-parametric BART model (spBART) that addresses this limitation by modeling low-dimensional covariates through a parametric component with interpretable coefficients, while capturing complex nonlinear associations among high-dimensional predictors through the tree ensemble. To perform stable variable selection, we develop a cross-validation-based procedure that aggregates posterior inclusion probabilities across folds and applies Bayesian false discovery rate control. We apply the proposed method to a pooled case--control analysis of high-dimensional genome-wide 5-hydroxymethylcytosine profiles derived from circulating cell-free DNA in two multiple myeloma studies ($N = 869$). The approach identifies a parsimonious set of candidate loci and achieves strong out-of-sample discrimination (AUC $= 0.96$) in a held-out validation set. Overall, spBART provides a unified framework for combining interpretable covariate inference with flexible modeling and variable selection in high-dimensional biomedical studies.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The manuscript proposes a semi-parametric Bayesian additive regression trees (spBART) model that integrates a parametric component for low-dimensional covariates to obtain interpretable coefficients with a nonparametric tree ensemble for capturing complex associations in high-dimensional epigenetic predictors. It introduces a cross-validation procedure that aggregates posterior inclusion probabilities (PIPs) across folds and applies Bayesian false discovery rate (BFDR) control for variable selection. In an application to pooled case-control data from two multiple myeloma studies involving genome-wide 5-hydroxymethylcytosine (5hmC) profiles in circulating cell-free DNA (N = 869), the method identifies a parsimonious set of candidate loci and reports an area under the curve (AUC) of 0.96 on a held-out validation set.

Significance. If the variable selection procedure is robust to the dependence structures inherent in epigenetic data, spBART could provide a useful framework for risk prediction in precision medicine by balancing interpretability of covariate effects with flexible modeling of high-dimensional predictors. The reported high out-of-sample AUC suggests strong predictive performance, and the focus on parsimony aids in identifying biologically relevant loci. However, the significance hinges on validation of the selection stability, which is not fully detailed.

major comments (2)
  1. [§3] §3 (variable selection procedure): The cross-validation aggregation of posterior inclusion probabilities followed by Bayesian FDR control implicitly assumes exchangeability or independence of predictors across folds. Epigenetic data exhibit substantial local and global correlations from chromatin domains, co-regulation, and batch effects; when this assumption fails, the selected loci may be unstable or exhibit inflated false discovery rates. This directly undermines the central claim of a reliable 'parsimonious set of candidate loci' and requires either a simulation study under realistic correlation structures or explicit sensitivity checks to support the reported results.
  2. [§2] §2 (model specification): The description of spBART does not detail how the parametric component for low-dimensional covariates interacts with the tree ensemble during fitting or selection. Unmodeled leakage of covariate effects into the nonparametric component could compromise both the interpretability of the parametric coefficients and the stability of high-dimensional variable selection, which is load-bearing for the claimed separation of roles and the AUC = 0.96 result.
minor comments (2)
  1. [Abstract] The abstract reports AUC = 0.96 on held-out data but provides no information on the exact splitting procedure, whether case-control sampling was accounted for in the AUC calculation, or any calibration checks.
  2. [Application] In the application section, clarify the pooling of the two multiple myeloma studies and whether batch correction or normalization was applied to the 5hmC profiles prior to modeling, as technical artifacts could affect both selection and prediction.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their constructive and insightful comments. We have addressed each major comment point by point below, providing clarifications and indicating revisions made to strengthen the manuscript.

  1. Referee: §3 (variable selection procedure): The cross-validation aggregation of posterior inclusion probabilities followed by Bayesian FDR control implicitly assumes exchangeability or independence of predictors across folds. Epigenetic data exhibit substantial local and global correlations from chromatin domains, co-regulation, and batch effects; when this assumption fails, the selected loci may be unstable or exhibit inflated false discovery rates. This directly undermines the central claim of a reliable 'parsimonious set of candidate loci' and requires either a simulation study under realistic correlation structures or explicit sensitivity checks to support the reported results.

    Authors: We thank the referee for this important observation on dependence structures in epigenetic data. Our cross-validation aggregation of PIPs was developed specifically to improve selection stability across data partitions, which can partially buffer against correlation-induced variability. However, we agree that explicit checks under realistic correlation patterns would better support the robustness claims. In the revised manuscript, we have added a dedicated simulation study that generates synthetic data with correlation structures calibrated to epigenetic profiles (local chromatin-domain correlations, co-regulation blocks, and batch effects). Results confirm that the aggregated PIP + BFDR procedure maintains nominal FDR control and produces stable locus selections under moderate-to-high dependence. We have also included sensitivity analyses on the real multiple myeloma data examining selection stability across varying fold counts and correlation-adjusted priors. revision: yes

  2. Referee: §2 (model specification): The description of spBART does not detail how the parametric component for low-dimensional covariates interacts with the tree ensemble during fitting or selection. Unmodeled leakage of covariate effects into the nonparametric component could compromise both the interpretability of the parametric coefficients and the stability of high-dimensional variable selection, which is load-bearing for the claimed separation of roles and the AUC = 0.96 result.

    Authors: We agree that the original exposition of the fitting procedure was insufficiently detailed. The spBART model is defined as Y = Xβ + f(Z) + ε, with β estimated parametrically and f implemented via BART on the high-dimensional epigenetic predictors Z. In the MCMC sampler, the tree ensemble is updated on the residuals after subtracting the current parametric fit, while β is drawn from its full conditional given the current tree predictions; this alternating scheme explicitly prevents leakage of covariate effects into the nonparametric component. Variable selection (aggregated PIPs and BFDR) is performed exclusively on the predictors entering the tree ensemble. We have substantially expanded Section 2 with the complete set of full-conditional distributions, pseudocode for the sampler, and a diagram illustrating the separation of roles. These additions directly support the interpretability of the parametric coefficients and the stability of high-dimensional selection underlying the reported AUC. revision: yes
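
The alternating scheme described in this response can be illustrated with the conditional update for $\beta$. This is a sketch under assumptions not stated on this page: a Gaussian prior $\beta \sim \mathcal{N}(0, \Sigma_0)$, Gaussian errors with variance $\sigma^2$, and $f(Z)$ held fixed at the current tree predictions.

```latex
r = Y - f(Z), \qquad
V = \left( \frac{X^\top X}{\sigma^2} + \Sigma_0^{-1} \right)^{-1}, \qquad
\beta \mid r, \sigma^2 \sim \mathcal{N}\!\left( \frac{V X^\top r}{\sigma^2},\; V \right)
```

Under the same assumptions, the trees are then refreshed against the complementary residual $Y - X\beta$, so each component is fitted to what the other has not explained.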

Circularity Check

0 steps flagged

No significant circularity in spBART proposal or variable selection


The paper proposes a new semi-parametric BART extension (spBART) that separates parametric modeling of low-dimensional covariates from tree-ensemble modeling of high-dimensional epigenetic predictors, then applies an independent cross-validation procedure to aggregate posterior inclusion probabilities and control Bayesian FDR for variable selection. The reported AUC of 0.96 is obtained on a held-out validation set after model fitting, providing external evaluation rather than any in-sample reduction. No equations or procedures in the abstract or described methodology equate a claimed prediction or result to a fitted input by construction, and no self-citations are invoked as load-bearing uniqueness theorems. The central claims rest on the model structure and out-of-sample performance rather than tautological re-use of fitted quantities.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

Review performed on abstract only; specific free parameters, axioms, and invented entities cannot be audited in detail. The approach rests on standard BART tree priors plus the added assumption that low-dimensional covariates can be isolated in a parametric component without loss of important interactions.

axioms (1)
  • domain assumption: Low-dimensional covariates exert effects that are adequately captured by a parametric (interpretable-coefficient) component while high-dimensional epigenetic predictors require nonparametric tree modeling.
    This separation is the core modeling choice stated in the abstract.

pith-pipeline@v0.9.0 · 5799 in / 1484 out tokens · 61905 ms · 2026-05-20T03:04:25.886867+00:00 · methodology


Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

  • IndisputableMonolith/Cost/FunctionalEquation.lean washburn_uniqueness_aczel unclear

    Relation between the paper passage and the cited Recognition theorem.

    We propose a semi-parametric BART model (spBART) that addresses this limitation by modeling low-dimensional covariates through a parametric component with interpretable coefficients, while capturing complex nonlinear associations among high-dimensional predictors through the tree ensemble... cross-validation-based procedure that aggregates posterior inclusion probabilities across folds and applies Bayesian false discovery rate control.

  • IndisputableMonolith/Foundation/RealityFromDistinction.lean reality_from_one_distinction unclear

    Relation between the paper passage and the cited Recognition theorem.

    The approach identifies a parsimonious set of candidate loci and achieves strong out-of-sample discrimination (AUC = 0.96)

What do these tags mean?
  • matches: The paper's claim is directly supported by a theorem in the formal canon.
  • supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
  • extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
  • uses: The paper appears to rely on the theorem as machinery.
  • contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
  • unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
