
arxiv: 2604.17956 · v1 · submitted 2026-04-20 · 💻 cs.LG · stat.ME

Federated Rule Ensemble Method in Medical Data

Pith reviewed 2026-05-10 05:10 UTC · model grok-4.3

classification 💻 cs.LG stat.ME
keywords federated learning · RuleFit · interpretable models · medical data · differential privacy · gradient boosting trees · sparse optimization · distributed healthcare

The pith

A federated RuleFit method builds one interpretable model across hospitals while matching centralized accuracy without sharing raw data.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces a federated RuleFit framework that lets multiple medical institutions train a single global rule-based model without pooling patient records. It first uses differentially private histograms to agree on feature cutoffs so every site defines the same rules, then generates rules locally with gradient boosting trees and estimates their importance through federated L1-regularized optimization. A sympathetic reader would care because privacy laws block data sharing yet clinical tools need both high accuracy and human-understandable explanations. Simulations show the method reaches performance close to a centralized RuleFit while beating other federated baselines; real medical data confirm it yields competitive predictions plus explicit rules.

Core claim

We proposed a federated RuleFit framework to construct a unified and interpretable global model for distributed environments. It integrates three components: preprocessing based on differentially private histograms to estimate shared cutoff values, enabling consistent rule definitions and reducing heterogeneity across clients; local rule generation using gradient boosting decision trees with shared cutoffs; and coefficient estimation via ℓ1-regularized optimization using a Federated Dual Averaging algorithm for sparse and consistent variable selection. In simulation studies, the proposed method achieved a performance comparable to that of centralized RuleFit while outperforming existing federated approaches.

What carries the argument

Differentially private histograms that compute shared cutoff values for consistent rule definitions across clients, paired with Federated Dual Averaging for sparse coefficient estimation.
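
To make the shared-cutoff idea concrete, here is a minimal Python sketch of how noisy per-client histograms could be combined into common cutoffs. It is an illustration under stated assumptions, not the paper's procedure: the function names are hypothetical, the bin edges are assumed to be agreed on in advance, each client adds Laplace noise with scale 1/ε (sensitivity 1 under add/remove adjacency), and cutoffs are placed at the upper edge of the bin where the aggregated cumulative count first reaches each requested quantile. Privacy accounting across several features is not handled.

```python
import random
from bisect import bisect_right

def laplace_noise(scale, rng):
    # The difference of two i.i.d. exponentials with mean `scale`
    # follows a Laplace(0, scale) distribution.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)

def local_dp_histogram(values, edges, epsilon, rng):
    """Client side: bin counts plus Laplace noise (assumed sensitivity 1)."""
    counts = [0] * (len(edges) - 1)
    for v in values:
        # Clamp out-of-range values into the first or last bin.
        idx = min(max(bisect_right(edges, v) - 1, 0), len(counts) - 1)
        counts[idx] += 1
    return [c + laplace_noise(1.0 / epsilon, rng) for c in counts]

def shared_cutoffs(noisy_histograms, edges, quantiles):
    """Server side: sum noisy counts, clip negatives, read off quantile edges."""
    total = [max(sum(col), 0.0) for col in zip(*noisy_histograms)]
    n = sum(total)
    cutoffs = []
    for q in quantiles:
        target, running = q * n, 0.0
        for i, c in enumerate(total):
            running += c
            if running >= target:
                cutoffs.append(edges[i + 1])
                break
        else:
            # Fallback for rounding at the top end of the range.
            cutoffs.append(edges[-1])
    return cutoffs

# Hypothetical usage: three clients, one feature on a 0-100 scale.
rng = random.Random(0)
edges = list(range(0, 101, 10))
clients = [[rng.uniform(0, 100) for _ in range(2000)] for _ in range(3)]
hists = [local_dp_histogram(c, edges, epsilon=1.0, rng=rng) for c in clients]
cuts = shared_cutoffs(hists, edges, [0.25, 0.5, 0.75])
```

Because every client would then split on the same `cuts`, a rule such as "feature > cutoff" means the same thing at every site, which is the consistency property the premise below depends on.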

Load-bearing premise

Differentially private histograms can produce sufficiently accurate shared cutoff values to keep rule definitions consistent across heterogeneous client datasets without materially degrading downstream model quality.

What would settle it

Apply the method to client datasets whose feature ranges differ sharply, then measure whether the resulting global model's predictive accuracy falls more than five percentage points below the centralized RuleFit baseline on the same test distribution.

Figures

Figures reproduced from arXiv: 2604.17956 by Kensuke Tanioka, Ke Wan, Toshio Shimokawa.

Figure 1: Overview of the proposed federated RuleFit framework.
    2.2 Proposed Algorithm. The conventional RuleFit algorithm [26] employs a two-step procedure. In the first step, rule terms are generated using GBDT [27]. In the second step, linear terms are incorporated, and the coefficients of both the rule and linear terms are estimated using LASSO [31]. However, this framework assumes that all training data are centra…
Figure 2: Predictive performance under three simulation scenarios, expressed as differences relative to the proposed method. Positive values indicate a better performance than that of the proposed method, whereas negative values indicate worse performance. The horizontal dashed line at zero represents no performance differences compared to the proposed method. Panel A corresponds to Scenario 1, where the number of p…
Figure 3: Subgroup evaluation for each rule. Bars represent mortality rates within (in-subgroup) and outside (out-of-subgroup) the population defined by each rule.
    4.2 Performance evaluation. For the real-data application, data from each institution were randomly split into training and test sets, with 70% used for training and 30% used for testing. This division was conducted independently within each institution to…
Figure 4: Subgroup evaluation for each rule.
    … of the candidate rule set and increased computational burden. To overcome this challenge, we introduced a preprocessing step based on DP histograms to construct a shared set of candidate cut-off values. This component harmonized the split points across clients in a privacy-preserving manner and constrained the growth of the candidate rule set. The results presented in App…
Figure 5: Impact of preprocessing design choices on predictive performance for the proposed method. The number of bins and quantile cut-offs used in the differentially private (DP) histogram are varied under Scenario 1, with the number of clients fixed at M = 5. Performance is evaluated in terms of AUC, accuracy, F1-score, computation time, and the number of extracted rules. Results are compared with a baseline with…
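
The subgroup evaluation described in the Figure 3 caption (mortality rate inside versus outside the population a rule defines) can be sketched in a few lines of Python. This is a hedged illustration only: the rule representation, the feature names `age` and `sbp`, and all records below are made up for the example and do not come from the paper.

```python
def rule_applies(record, rule):
    """A rule is assumed to be a conjunction of (feature, operator, cutoff) conditions."""
    for feature, op, cutoff in rule:
        value = record[feature]
        if op == "<=" and not value <= cutoff:
            return False
        if op == ">" and not value > cutoff:
            return False
    return True

def subgroup_rates(records, outcomes, rule):
    """Return (in-subgroup event rate, out-of-subgroup event rate)."""
    inside = [y for r, y in zip(records, outcomes) if rule_applies(r, rule)]
    outside = [y for r, y in zip(records, outcomes) if not rule_applies(r, rule)]

    def rate(ys):
        return sum(ys) / len(ys) if ys else float("nan")

    return rate(inside), rate(outside)

# Made-up records: outcome 1 = death, 0 = survival.
records = [
    {"age": 80, "sbp": 85},
    {"age": 75, "sbp": 95},
    {"age": 40, "sbp": 120},
    {"age": 82, "sbp": 130},
    {"age": 30, "sbp": 80},
]
outcomes = [1, 1, 0, 0, 0]
rule = [("age", ">", 70), ("sbp", "<=", 100)]  # hypothetical rule

in_rate, out_rate = subgroup_rates(records, outcomes, rule)
```

On this toy data the rule selects the first two records, so the in-subgroup rate is 1.0 and the out-of-subgroup rate is 0.0. A large gap of this kind is what makes a rule clinically readable.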
Original abstract

Machine learning has become integral to medical research and is increasingly applied in clinical settings to support diagnosis and decision-making; however, its effectiveness depends on access to large, diverse datasets, which are limited within single institutions. Although integrating data across institutions can address this limitation, privacy regulations and data ownership constraints hinder these efforts. Federated learning enables collaborative model training without sharing raw data; however, most methods rely on complex architectures that lack interpretability, limiting clinical applicability. Therefore, we proposed a federated RuleFit framework to construct a unified and interpretable global model for distributed environments. It integrates three components: preprocessing based on differentially private histograms to estimate shared cutoff values, enabling consistent rule definitions and reducing heterogeneity across clients; local rule generation using gradient boosting decision trees with shared cutoffs; and coefficient estimation via $\ell_1$-regularized optimization using a Federated Dual Averaging algorithm for sparse and consistent variable selection. In simulation studies, the proposed method achieved a performance comparable to that of centralized RuleFit while outperforming existing federated approaches. Real-world analysis demonstrated its ability to provide interpretable insights with competitive predictive accuracy. Therefore, the proposed framework offers a practical and effective solution for interpretable and reliable modeling in federated learning environments.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 1 minor

Summary. The paper proposes a federated RuleFit framework for medical data that integrates differentially private histograms for shared cutoff estimation to ensure consistent rule definitions across clients, local rule generation via gradient boosting decision trees using those cutoffs, and ℓ1-regularized coefficient estimation via Federated Dual Averaging. It claims that simulation studies show performance comparable to centralized RuleFit and superior to other federated baselines, while real-world analysis yields interpretable insights with competitive predictive accuracy.

Significance. If the central empirical claims hold with proper validation, the work would offer a practical contribution by extending interpretable rule ensembles to federated settings in privacy-constrained medical domains. The constructive pipeline combining DP preprocessing, GBDT rules, and federated dual averaging addresses heterogeneity without raw data sharing. No machine-checked proofs or reproducible code are mentioned, but the approach is falsifiable via the reported simulation and real-world comparisons.

major comments (2)
  1. Abstract: the claim of 'performance comparable to that of centralized RuleFit while outperforming existing federated approaches' provides no quantitative metrics (e.g., AUC values, tables, or figures), statistical tests, baseline descriptions, or details on data heterogeneity. This is load-bearing for the central claim, as the abstract is the only evidence presented.
  2. Preprocessing component (differentially private histograms for shared cutoffs): the assumption that noisy histograms yield sufficiently accurate global cutoffs to preserve rule consistency under client heterogeneity is untested. No ablation on privacy budget ε versus cutoff deviation or final model quality is reported, and the subsequent federated dual averaging cannot compensate for divergent rule sets; this directly undermines the simulation comparability claim.
minor comments (1)
  1. The abstract and method description use 'Federated Dual Averaging' without specifying the exact update rule or convergence analysis relative to centralized ℓ1 optimization.
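
For orientation on the update rule in question, the following Python sketch shows one common form of dual averaging for an ℓ1-regularized objective in a federated loop, in the spirit of federated composite optimization [29]. Everything specific here is an assumption: squared-error loss, a unit server step size, a quadratic distance function, and the hypothetical names `fed_dual_averaging`, `lam`, and `eta`. It is not the paper's algorithm, whose exact update is not given in the material above.

```python
def soft_threshold(z, tau):
    """Coordinate-wise soft-thresholding, the proximal map of tau * ||.||_1."""
    return [(abs(v) - tau) * (1.0 if v > 0 else -1.0) if abs(v) > tau else 0.0
            for v in z]

def local_gradient(w, X, y):
    """Gradient of the local loss (1 / 2n) * sum_i (x_i . w - y_i)^2."""
    n, d = len(X), len(w)
    g = [0.0] * d
    for xi, yi in zip(X, y):
        r = sum(a * b for a, b in zip(xi, w)) - yi
        for j in range(d):
            g[j] += r * xi[j] / n
    return g

def fed_dual_averaging(clients, dim, lam, eta, rounds, local_steps):
    z = [0.0] * dim  # server dual state (accumulated negative gradients)
    step = 0         # total number of local steps taken so far
    for _ in range(rounds):
        client_states = []
        for X, y in clients:
            z_m = list(z)
            for k in range(local_steps):
                # Primal point: the threshold grows with the step count,
                # so the l1 penalty accumulates like the gradients do.
                w = soft_threshold(z_m, eta * lam * (step + k))
                g = local_gradient(w, X, y)
                z_m = [zj - eta * gj for zj, gj in zip(z_m, g)]
            client_states.append(z_m)
        # The server averages dual states, not primal coefficients.
        z = [sum(col) / len(col) for col in zip(*client_states)]
        step += local_steps
    return soft_threshold(z, eta * lam * step)

# Toy check with an orthogonal design, where the lasso solution is known:
# true coefficients (2, 0, 0) and lam = 0.5 give (1.5, 0, 0).
X = [[1, 1, 1], [1, -1, -1], [-1, 1, -1], [-1, -1, 1]]
y = [2, 2, -2, -2]
X2 = [[-a for a in row] for row in X]
y2 = [-v for v in y]
w = fed_dual_averaging([(X, y), (X2, y2)], dim=3, lam=0.5, eta=0.5,
                       rounds=20, local_steps=5)
```

Averaging in the dual space is the design point worth noting: thresholding happens after aggregation, so coefficients that every client pushes toward zero stay exactly zero in the global model, whereas averaging already-thresholded primal coefficients can reintroduce small nonzero values.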

Simulated Authors' Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback on our manuscript. The comments highlight opportunities to strengthen the presentation of empirical results and the validation of key components. We address each major comment below and will revise the manuscript accordingly to improve clarity and rigor.

Point-by-point responses
  1. Referee: Abstract: the claim of 'performance comparable to that of centralized RuleFit while outperforming existing federated approaches' provides no quantitative metrics (e.g., AUC values, tables, or figures), statistical tests, baseline descriptions, or details on data heterogeneity. This is load-bearing for the central claim, as the abstract is the only evidence presented.

    Authors: We agree that the abstract would be strengthened by including quantitative metrics to support the central claim. The full manuscript (Section 4) reports simulation results with specific AUC values, comparisons to baselines (including descriptions of the federated approaches), details on data heterogeneity settings, and statistical comparisons. However, the abstract currently provides only a high-level summary. In revision, we will update the abstract to incorporate key quantitative results from the simulations (e.g., AUC values and baseline performance) while remaining concise. This will make the abstract self-contained without altering the manuscript's core findings. revision: yes

  2. Referee: Preprocessing component (differentially private histograms for shared cutoffs): the assumption that noisy histograms yield sufficiently accurate global cutoffs to preserve rule consistency under client heterogeneity is untested. No ablation on privacy budget ε versus cutoff deviation or final model quality is reported, and the subsequent federated dual averaging cannot compensate for divergent rule sets; this directly undermines the simulation comparability claim.

    Authors: We acknowledge that the current manuscript lacks an explicit ablation study on the privacy budget ε and its effects on cutoff accuracy or downstream model quality. This is a valid observation, as such analysis would better substantiate the robustness of the shared cutoffs under noise and heterogeneity. In the revised version, we will add an ablation experiment varying ε and reporting resulting cutoff deviations along with final AUC in the simulation settings. This will directly test the assumption and support the comparability to centralized RuleFit. The federated dual averaging step relies on consistent rules from the shared cutoffs, and the added results will address this dependency. revision: yes
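
The ablation promised in the second response could start from something as small as the Python sketch below, which measures how far a noisy median cutoff drifts from the noise-free one as ε changes. It is a toy illustration under assumptions, not the authors' experiment: the pooled bin counts are invented, each of `n_clients` sites is assumed to add independent Laplace noise with scale 1/ε to its own counts, and only the median cutoff is tracked.

```python
import random

def laplace(scale, rng):
    # Difference of two i.i.d. exponentials with mean `scale` is Laplace(0, scale).
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)

def median_cutoff(counts, edges):
    """Upper edge of the bin where the cumulative (clipped) count reaches half."""
    clipped = [max(c, 0.0) for c in counts]
    total, running = sum(clipped), 0.0
    for i, c in enumerate(clipped):
        running += c
        if running >= 0.5 * total:
            return edges[i + 1]
    return edges[-1]

def mean_cutoff_deviation(counts, edges, epsilon, n_clients, repeats, seed=0):
    """Average |noisy cutoff - exact cutoff| over repeated noise draws."""
    rng = random.Random(seed)
    exact = median_cutoff(counts, edges)
    deviations = []
    for _ in range(repeats):
        # Summing client histograms also sums their independent noise terms.
        noisy = [c + sum(laplace(1.0 / epsilon, rng) for _ in range(n_clients))
                 for c in counts]
        deviations.append(abs(median_cutoff(noisy, edges) - exact))
    return sum(deviations) / repeats

# Invented pooled counts for one feature with five bins.
counts = [5, 10, 20, 10, 5]
edges = [0, 10, 20, 30, 40, 50]
for eps in (0.05, 0.5, 5.0, 100.0):
    dev = mean_cutoff_deviation(counts, edges, eps, n_clients=5, repeats=200)
    print(f"epsilon={eps}: mean cutoff deviation={dev:.2f}")
```

A full ablation would pair each ε with the downstream AUC as well, since a cutoff that moves by one bin may or may not change which rules the boosted trees generate.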

Circularity Check

0 steps flagged

No circularity: constructive pipeline evaluated against external baselines

Full rationale

The paper describes a constructive three-stage pipeline (DP-histogram preprocessing for shared cutoffs, local GBDT rule generation with those cutoffs, and federated dual averaging for L1-regularized coefficients) whose performance is asserted via simulation studies and real-world comparisons to centralized RuleFit and other federated baselines. No equations or derivations are presented that reduce a claimed result to its own inputs by construction; there are no self-definitional steps, fitted parameters renamed as predictions, load-bearing self-citations, or uniqueness theorems. The method remains independently falsifiable against external benchmarks, so the derivation chain is self-contained.

Axiom & Free-Parameter Ledger

2 free parameters · 2 axioms · 0 invented entities

The framework rests on standard domain assumptions from differential privacy and federated optimization rather than new postulates; free parameters such as the privacy budget and regularization strength are implicit but not quantified in the abstract.

free parameters (2)
  • privacy budget ε
    Controls noise added to histograms for cutoff estimation; value not stated in abstract.
  • ℓ1 regularization strength λ
    Determines sparsity in the final coefficient estimation; value not stated in abstract.
axioms (2)
  • domain assumption: Differentially private histograms yield cutoff values accurate enough for consistent rule definitions across sites
    Invoked in the preprocessing stage to reduce heterogeneity.
  • domain assumption: Federated Dual Averaging produces sparse, consistent variable selection across clients
    Used for the global coefficient estimation step.
    Used for the global coefficient estimation step.
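
For reference, the two free parameters enter through standard textbook formulas, written out below. The notation is an assumption made for illustration (bin count c_b, sensitivity Δ, M clients with n_m records each, n records in total, rule terms r_k, linear terms x_j, loss ℓ); the paper's own notation and weighting may differ.

```latex
% Laplace mechanism on a histogram bin count c_b
% (\Delta = 1 when neighbouring datasets differ by adding or removing one record):
\tilde{c}_b = c_b + \eta_b,
\qquad \eta_b \sim \mathrm{Laplace}\!\left(0, \tfrac{\Delta}{\varepsilon}\right)

% RuleFit predictor with K rule terms and p linear terms [26]:
F(x) = \beta_0 + \sum_{k=1}^{K} \alpha_k \, r_k(x) + \sum_{j=1}^{p} \gamma_j \, x_j

% \ell_1-regularized coefficient estimation pooled over M clients:
\min_{\beta_0, \alpha, \gamma} \;
\frac{1}{n} \sum_{m=1}^{M} \sum_{i=1}^{n_m}
\ell\bigl(y_{mi}, F(x_{mi})\bigr)
+ \lambda \bigl( \lVert \alpha \rVert_1 + \lVert \gamma \rVert_1 \bigr),
\qquad n = \sum_{m=1}^{M} n_m
```

A smaller ε means a larger noise scale Δ/ε and therefore less reliable cutoffs, while a larger λ drives more of the α_k and γ_j to exactly zero and so shortens the final rule list.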


Reference graph

Works this paper leans on

38 extracted references · 38 canonical work pages · 1 internal anchor

  [1] Eric J. Topol. High-performance medicine: the convergence of human and artificial intelligence. Nature Medicine, 25:44–56, 1 2019.
  [2] Alvin Rajkomar, Jeffrey Dean, and Isaac Kohane. Machine learning in medicine. New England Journal of Medicine, 380:1347–1358, 4 2019.
  [3] Nikhil R. Sahni and Brandon Carrus. Artificial intelligence in U.S. health care delivery. New England Journal of Medicine, 389(4):348–358, 2023.
  [4] Thuy D. Nguyen, Christopher M. Whaley, Kosali Simon, Neil Mehta, Hao Yu, Ryan K. McBain, Ateev Mehrotra, and Jonathan H. Cantor. Adoption of artificial intelligence in the health care sector. JAMA Health Forum, 6(11):e255029–e255029, 11 2025.
  [5] Nicola Rieke, Jonny Hancox, Wenqi Li, Fausto Milletarì, Holger R. Roth, Shadi Albarqouni, Spyridon Bakas, Mathieu N. Galtier, Bennett A. Landman, Klaus Maier-Hein, Sébastien Ourselin, Micah Sheller, Ronald M. Summers, Andrew Trask, Daguang Xu, Maximilian Baust, and M. Jorge Cardoso. The future of digital health with federated learning. npj Digital Medicin...
  [6] Micah J. Sheller, Brandon Edwards, G. Anthony Reina, Jason Martin, Sarthak Pati, Aikaterini Kotrotsou, Mikhail Milchenko, Weilin Xu, Daniel Marcus, Rivka R. Colen, and Spyridon Bakas. Federated learning in medicine: facilitating multi-institutional collaborations without sharing patient data. Scientific Reports, 10:12598, 7 2020.
  [7] Willem G van Panhuis, Proma Paul, Claudia Emerson, John Grefenstette, Richard Wilder, Abraham J Herbst, David Heymann, and Donald S Burke. A systematic review of barriers to data sharing in public health. BMC Public Health, 14:1144, 12 2014.
  [8] Volker Tresp, J. Marc Overhage, Markus Bundschus, Shahrooz Rabizadeh, Peter A. Fasching, and Shipeng Yu. Going digital: A survey on digitalization and large-scale data analytics in healthcare. Proceedings of the IEEE, 104:2180–2206, 11 2016.
  [9] Min Chen, Yongfeng Qian, Jing Chen, Kai Hwang, Shiwen Mao, and Long Hu. Privacy protection and intrusion avoidance for cloudlet-based medical data sharing. IEEE Transactions on Cloud Computing, 8:1274–1283, 10 2020.
  [10] Brendan McMahan, Eider Moore, Daniel Ramage, Seth Hampson, and Blaise Aguera y Arcas. Communication-efficient learning of deep networks from decentralized data. In Artificial intelligence and statistics, pages 1273–
  [11] Theodora S. Brisimi, Ruidi Chen, Theofanie Mela, Alex Olshevsky, Ioannis Ch. Paschalidis, and Wei Shi. Federated learning of predictive models from federated electronic health records. International Journal of Medical Informatics, 112:59–67, 4 2018.
  [12] Ittai Dayan, Holger R. Roth, Aoxiao Zhong, Ahmed Harouni, Amilcare Gentili, Anas Z. Abidin, Andrew Liu, Anthony Beardsworth Costa, Bradford J. Wood, Chien-Sung Tsai, Chih-Hung Wang, Chun-Nan Hsu, C. K. Lee, Peiying Ruan, Daguang Xu, Dufan Wu, Eddie Huang, Felipe Campos Kitamura, Griffin Lacey, Gustavo César de Antônio Corradi, Gustavo Nino, Hao-Hsin Shin, ...
  [13] Akhil Vaid, Suraj K Jaladanki, Jie Xu, Shelly Teng, Arvind Kumar, Samuel Lee, Sulaiman Somani, Ishan Paranjpe, Jessica K De Freitas, Tingyi Wanyan, Kipp W Johnson, Mesude Bicak, Eyal Klang, Young Joon Kwon, Anthony Costa, Shan Zhao, Riccardo Miotto, Alexander W Charney, Erwin Böttinger, Zahi A Fayad, Girish N Nadkarni, ... Federated learning of electronic health records to improve mortality prediction in hospitalized patients with covid-19: Machine learning approach.
  [14] Zhen Ling Teo, Liyuan Jin, Nan Liu, Siqi Li, Di Miao, Xiaoman Zhang, Wei Yan Ng, Ting Fang Tan, Deborah Meixuan Lee, Kai Jie Chua, John Heng, Yong Liu, Rick Siow Mong Goh, and Daniel Shu Wei Ting. Federated machine learning in healthcare: A systematic review on clinical applications and technical architecture. Cell Reports Medicine, 5:101419, 2 2024.
  [15] Zezhong Ma, Nur Intan Raihana Ruhaiyem, Meng Zhang, Kamarul Imran Musa, Tengku Muhammad Hanis, Tianyun Xiao, Dianbo Hua, and Hao Li. A review of federated learning technology and its research progress in healthcare applications. Applied Intelligence, 55:765, 7 2025.
  [16] Natalia Díaz-Rodríguez, Javier Del Ser, Mark Coeckelbergh, Marcos López de Prado, Enrique Herrera-Viedma, and Francisco Herrera. Connecting the dots in trustworthy artificial intelligence: From ai principles, ethics, and key requirements to responsible ai systems and regulation. Information Fusion, 99:101896, 11 2023.
  [17] W. Nicholson Price. Big data and black-box medical algorithms. Science Translational Medicine, 10, 12 2018.
  [18] Jeremy Petch, Shuang Di, and Walter Nelson. Opening the black box: The promise and limitations of explainable machine learning in cardiology. Canadian Journal of Cardiology, 38:204–213, 2 2022.
  [19] Riccardo Guidotti, Anna Monreale, Salvatore Ruggieri, Franco Turini, Fosca Giannotti, and Dino Pedreschi. A survey of methods for explaining black box models. ACM Computing Surveys, 51:1–42, 9 2019.
  [20] Guan Wang. Interpret federated learning with shapley values. arXiv preprint arXiv:1905.04519, 2019.
  [21] L. S. Shapley. 17. A Value for n-Person Games, pages 307–318. Princeton University Press, 12 1953.
  [22] Anran Li, Rui Liu, Ming Hu, Luu Anh Tuan, and Han Yu. Towards interpretable federated learning. arXiv preprint arXiv:2302.13473, 2023.
  [23] Jorn S Heerink, Ruud Oudega, Rogier Hopstaken, Hendrik Koffijberg, and Ron Kusters. Clinical decision rules in primary care: necessary investments for sustainable healthcare. Primary health care research & development, 24:e34, 5 2023.
  [24] Qian Xu, Wenzhao Xie, Bolin Liao, Chao Hu, Lu Qin, Zhengzijin Yang, Huan Xiong, Yi Lyu, Yue Zhou, and Aijing Luo. Interpretability of clinical decision support systems based on artificial intelligence from technological and medical perspective: A systematic review. Journal of Healthcare Engineering, 2023, 1 2023.
  [25] A. Argente-Garrido, C. Zuheros, M.V. Luzón, and F. Herrera. An interpretable client decision tree aggregation process for federated learning. Information Sciences, 694:121711, 3 2025.
  [26] J H Friedman and B E Popescu. Predictive learning via rule ensembles. The Annals of Applied Statistics, 2(3):916–954, 2008.
  [27] Jerome H Friedman. Greedy function approximation: a gradient boosting machine. Annals of statistics, pages 1189–1232, 2001.
  [28] Cynthia Dwork, Frank McSherry, Kobbi Nissim, and Adam Smith. Calibrating Noise to Sensitivity in Private Data Analysis, pages 265–284. Springer, 2006.
  [29] Honglin Yuan, Manzil Zaheer, and Sashank Reddi. Federated composite optimization. In International Conference on Machine Learning, pages 12253–12266. PMLR, 2021.
  [30] Dashan Gao, Xin Yao, and Qiang Yang. A survey on heterogeneous federated learning. arXiv preprint arXiv:2210.04505, 2022.
  [31] Robert Tibshirani. Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society Series B: Statistical Methodology, 58:267–288, 1 1996.
  [32] Anne-Christin Hauschild, Marta Lemanczyk, Julian Matschinske, Tobias Frisch, Olga Zolotareva, Andreas Holzinger, Jan Baumbach, and Dominik Heider. Federated random forests can improve local performance of predictive models for various healthcare applications. Bioinformatics, 38:2278–2286, 4 2022.
  [33] Jia Xu, Zhenjie Zhang, Xiaokui Xiao, Yin Yang, Ge Yu, and Marianne Winslett. Differentially private histogram publication. The VLDB Journal, 22:797–822, 12 2013.
  [34] Yurii Nesterov. Primal-dual subgradient methods for convex problems. Mathematical Programming, 120:221–259, 8 2009.
  [35] Lin Xiao. Dual averaging methods for regularized stochastic learning and online optimization. Journal of Machine Learning Research, 2010.
  [36] Marianne A Jonker, Hassan Pazira, and Anthony CC Coolen. Bayesian federated inference for estimating statistical models based on nonshared multicenter data sets. Statistics in Medicine, 43:2421–2438, 5 2024.
  [37] Francisco Herrera, Daniel Jiménez-López, Alberto Argente-Garrido, Nuria Rodríguez-Barroso, Cristina Zuheros, Ignacio Aguilera-Martos, Beatriz Bello, Mario García-Márquez, and M Luzón. Flex: Flexible federated learning framework. arXiv preprint arXiv:2404.06127, 2024.
  [38] Jos M. Th. Draaisma, Anton F. J. de Haan, and R Jan A. Goris. Preventable trauma deaths in the Netherlands - a prospective multicenter study. The Journal of Trauma: Injury, Infection, and Critical Care, 29:1552–1557, 11 1989.

A PREPRINT - APRIL 21, 2026

A Differential Privacy and Laplace Mechanism
A.1 Differential Privacy
Differential privacy is a mathe...