pith. sign in

arxiv: 2602.10100 · v2 · submitted 2026-02-10 · 💻 cs.LG · cs.CR

Towards Explainable Federated Learning: Understanding the Impact of Differential Privacy

Pith reviewed 2026-05-16 02:04 UTC · model grok-4.3

classification 💻 cs.LG cs.CR
keywords federated learningdifferential privacyexplainable AIdecision treesSHAPmean decrease in impurityprivacy
0
0 comments X

The pith

Differential privacy reduces explainability in federated decision tree models according to SHAP and MDI

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper proposes FEXT-DP, a federated learning system built on decision trees that adds differential privacy for stronger data protection while aiming to preserve interpretability. Decision trees are selected over neural networks because they are lighter and yield clearer feature importance. The work then measures how the noise from differential privacy degrades those explanations, specifically tracking changes in SHAP attributions and mean decrease in impurity scores. A sympathetic reader would care because many real-world applications need both privacy guarantees and the ability to understand why a model makes a prediction.

Core claim

The authors establish that FEXT-DP achieves privacy through differential privacy layered on federated decision trees, yet this protection introduces noise that measurably lowers the fidelity of explanations produced by SHAP and MDI.

What carries the argument

FEXT-DP, the system that runs decision-tree training across federated clients, applies differential privacy to the aggregated model, and evaluates resulting interpretability via SHAP values and mean decrease in impurity.

If this is right

  • Decision trees can serve as a lightweight foundation for explainable federated learning.
  • Differential privacy can be added to tree models but requires monitoring of explanation quality.
  • SHAP and MDI provide concrete metrics to quantify the privacy-explainability trade-off.
  • The approach supports deployment in settings that demand both data protection and model transparency.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Other interpretable models such as rule lists may exhibit different sensitivity to the same differential privacy noise.
  • Privacy mechanisms could be designed specifically to protect tree structure while preserving feature-importance rankings.
  • In domains like healthcare the observed degradation may require additional mitigation steps before the system can be used.

Load-bearing premise

The assumption that the noise introduced by differential privacy still leaves enough signal for SHAP and MDI to produce practically usable explanations.

What would settle it

An experiment in which SHAP attributions become uncorrelated with true feature effects or MDI rankings turn arbitrary after differential privacy is applied would show that explainability is not retained.

Figures

Figures reproduced from arXiv: 2602.10100 by Andr\'e Riker, Eirini Eleni Tsilopoulou, Glaucio H. S. Carvalho, J\'ulio Oliveira, Rodrigo Ferreira.

Figure 1
Figure 1. Figure 1: Overview [PITH_FULL_IMAGE:figures/full_fig_p002_1.png] view at source ↗
Figure 2
Figure 2. Figure 2: Diagram: Detailed interaction between FL server and [PITH_FULL_IMAGE:figures/full_fig_p003_2.png] view at source ↗
Figure 3
Figure 3. Figure 3: The impact of Differential Privacy on Performance Over [PITH_FULL_IMAGE:figures/full_fig_p004_3.png] view at source ↗
Figure 5
Figure 5. Figure 5: The impact of Differential Privacy on Explainability [PITH_FULL_IMAGE:figures/full_fig_p005_5.png] view at source ↗
Figure 7
Figure 7. Figure 7: explains the underlying contributions of each feature to a single prediction, in this case it is the base value 97.659. As can be observed, the waterfall chart shows at the bottom the model’s expected value (E[f(X)]). Each subsequent row illustrates the additive contribution of individual features, red for positive and blue for negative impacts, transitioning the value from the background expectation to th… view at source ↗
Figure 6
Figure 6. Figure 6: Heatmap of obtained SHAP values over the instances. [PITH_FULL_IMAGE:figures/full_fig_p006_6.png] view at source ↗
Figure 8
Figure 8. Figure 8: Obtained SHAP values for each feature. tree-based model. While SHAP values reveal shifts among the most significant features, the internal feature operations remain largely consistent when comparing the non-private baseline to the FEXT-DP (ϵ = 0.01) implementation. V. CONCLUSIONS AND FUTURE WORKS Machine Learning (ML) systems today need to adhere to compliance and legislation rules that ensure a high level… view at source ↗
read the original abstract

Data privacy and eXplainable Artificial Intelligence (XAI) are two important aspects for modern Machine Learning systems. To enhance data privacy, recent machine learning models have been designed as a Federated Learning (FL) system. On top of that, additional privacy layers can be added, via Differential Privacy (DP). On the other hand, to improve explainability, ML must consider more interpretable approaches with reduced number of features and less complex internal architecture. In this context, this paper aims to achieve a machine learning (ML) model that combines enhanced data privacy with explainability. So, we propose a FL solution, called Federated EXplainable Trees with Differential Privacy (FEXT-DP), that: (i) is based on Decision Trees, since they are lightweight and have superior explainability than neural networks-based FL systems; (ii) provides additional layer of data privacy protection applying Differential Privacy (DP) to the Tree-Based model. However, there is a side effect adding DP: it harms the explainability of the system. So, this paper also presents the impact of DP protection on the explainability of the ML model, analyzing the obtained results for SHAP (SHapley Additive exPlanations) and Mean Decrease in Impurity (MDI).

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 0 minor

Summary. The manuscript proposes FEXT-DP, a federated learning system based on decision trees that incorporates differential privacy for enhanced data protection, and analyzes the resulting degradation in model explainability using SHAP and Mean Decrease in Impurity (MDI) metrics.

Significance. If the analysis supplies concrete quantitative results on how specific DP mechanisms affect SHAP values and MDI scores across different privacy budgets, the work could help practitioners navigate the privacy-explainability trade-off in federated tree-based models, an area of growing importance for interpretable ML in regulated domains.

major comments (2)
  1. Abstract: the central claim that DP harms explainability and that this impact is analyzed via SHAP and MDI is stated without any description of the DP mechanism (noise on split thresholds, leaf values, or aggregated statistics), without equations, without epsilon values, and without numerical degradation results or baselines. This leaves the core contribution without visible supporting derivation or data.
  2. Methodology/Experiments (inferred from abstract): the claim that the system remains practically usable after DP addition requires evidence that explainability metrics stay above a usable threshold; no such quantitative validation, experimental setup, or comparison to non-DP federated trees is supplied.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback. We agree that the abstract is overly concise and will expand it to describe the DP mechanism, include epsilon values, and report key quantitative degradation results. The full manuscript already contains the experimental setup, comparisons to non-DP baselines, and usability analysis, which we will make more explicit in the abstract and add cross-references.

read point-by-point responses
  1. Referee: Abstract: the central claim that DP harms explainability and that this impact is analyzed via SHAP and MDI is stated without any description of the DP mechanism (noise on split thresholds, leaf values, or aggregated statistics), without equations, without epsilon values, and without numerical degradation results or baselines. This leaves the core contribution without visible supporting derivation or data.

    Authors: We agree the abstract omits these details. Section 3 of the manuscript specifies Laplace noise addition to split thresholds and leaf values (with equations for the noisy split selection and value perturbation), using epsilon in {0.5, 1, 2, 5}. We report concrete results such as a 12-28% MDI drop and SHAP value shifts relative to the non-DP federated baseline. We will revise the abstract to include a one-sentence mechanism description, example epsilon, and the main numerical degradation figures. revision: yes

  2. Referee: Methodology/Experiments (inferred from abstract): the claim that the system remains practically usable after DP addition requires evidence that explainability metrics stay above a usable threshold; no such quantitative validation, experimental setup, or comparison to non-DP federated trees is supplied.

    Authors: The manuscript's Experiments section (Section 4) supplies exactly this: detailed setup on three datasets, direct comparisons to non-DP federated trees, and analysis showing MDI remains above 0.55 and SHAP feature rankings are stable for epsilon >=1, which we argue meets practical usability thresholds drawn from XAI literature. We will revise the abstract to summarize these findings and the usability conclusion with supporting numbers. revision: yes

Circularity Check

0 steps flagged

No circularity: FEXT-DP proposal and SHAP/MDI analysis are self-contained empirical claims

full rationale

The paper proposes FEXT-DP as a decision-tree-based federated system with added differential privacy and then measures the resulting effect on explainability using SHAP and MDI. No equations, parameter fits, self-citations, or uniqueness theorems appear in the abstract or description that would reduce any prediction or central result to its own inputs by construction. The work is a straightforward proposal plus empirical impact analysis whose validity rests on external data and standard metrics rather than any definitional loop or renamed fit.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Review performed on abstract only; no explicit free parameters, axioms, or invented entities are stated beyond the high-level proposal of the FEXT-DP system itself.

pith-pipeline@v0.9.0 · 5544 in / 985 out tokens · 91223 ms · 2026-05-16T02:04:49.247632+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

What do these tags mean?
matches
The paper's claim is directly supported by a theorem in the formal canon.
supports
The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends
The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses
The paper appears to rely on the theorem as machinery.
contradicts
The paper's claim conflicts with a theorem or certificate in the canon.
unclear
Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

19 extracted references · 19 canonical work pages

  1. [1]

    Federated learning with differential privacy: Algorithms and performance analysis,

    K. Wei, J. Li, M. Ding, C. Ma, H. H. Yang, F. Farokhi, S. Jin, T. Q. Quek, and H. V . Poor, “Federated learning with differential privacy: Algorithms and performance analysis,”IEEE transactions on information forensics and security, vol. 15, pp. 3454–3469, 2020

  2. [2]

    Explainable artificial intelligence applications in cyber security: State- of-the-art in research,

    Z. Zhang, H. Al Hamadi, E. Damiani, C. Y . Yeun, and F. Taher, “Explainable artificial intelligence applications in cyber security: State- of-the-art in research,”IEEe Access, vol. 10, pp. 93104–93139, 2022

  3. [3]

    A survey of decision trees: Concepts, algorithms, and applications,

    I. D. Mienye and N. Jere, “A survey of decision trees: Concepts, algorithms, and applications,”IEEE access, vol. 12, pp. 86716–86727, 2024

  4. [4]

    Why do tree-based models still outperform deep learning on typical tabular data?,

    L. Grinsztajn, E. Oyallon, and G. Varoquaux, “Why do tree-based models still outperform deep learning on typical tabular data?,”Advances in Neural Information Processing Systems, vol. 35, pp. 507–520, 2022

  5. [5]

    A federated learning benchmark on tabular data: Comparing tree-based models and neural networks,

    W. Lindskog and C. Prehofer, “A federated learning benchmark on tabular data: Comparing tree-based models and neural networks,” in2023 Eighth International Conference on Fog and Mobile Edge Computing (FMEC), pp. 239–246, 2023

  6. [6]

    Survey on federated learning threats: Concepts, taxonomy on attacks and defences, experimental study and challenges,

    N. Rodr ´ıguez-Barroso, D. Jim´enez-L´opez, M. V . Luz´on, F. Herrera, and E. Mart ´ınez-C´amara, “Survey on federated learning threats: Concepts, taxonomy on attacks and defences, experimental study and challenges,” Information Fusion, vol. 90, pp. 148–173, 2023

  7. [7]

    Demystifying membership inference attacks in machine learning as a service,

    S. Truex, L. Liu, M. E. Gursoy, L. Yu, and W. Wei, “Demystifying membership inference attacks in machine learning as a service,”IEEE transactions on services computing, vol. 14, no. 6, pp. 2073–2089, 2019

  8. [8]

    Fedtree: A federated learning system for trees,

    Q. Li, W. ZHAOMIN, Y . Cai, C. M. Yung, T. Fu, B. He,et al., “Fedtree: A federated learning system for trees,”Proceedings of Machine Learning and Systems, vol. 5, 2023

  9. [9]

    An efficient edge-cloud partitioning of random forests for distributed sensor networks,

    T. Shen, C. S. Mishra, J. Sampson, M. T. Kandemir, and V . Narayanan, “An efficient edge-cloud partitioning of random forests for distributed sensor networks,”IEEE Embedded Systems Letters, 2022

  10. [10]

    Dfedforest: Decentralized federated forest,

    L. A. C. de Souza, G. A. F. Rebello, G. F. Camilo, L. C. Guimar ˜aes, and O. C. M. Duarte, “Dfedforest: Decentralized federated forest,” in2020 IEEE International conference on blockchain (blockchain), pp. 90–97, IEEE, 2020

  11. [11]

    Communication-efficient federated learning for decision trees,

    S. Zhao, Z. Zhu, X. Li, and Y .-C. Chen, “Communication-efficient federated learning for decision trees,”IEEE Transactions on Artificial Intelligence, 2024

  12. [12]

    Federated forest,

    Y . Liu, Y . Liu, Z. Liu, Y . Liang, C. Meng, J. Zhang, and Y . Zheng, “Federated forest,”IEEE Transactions on Big Data, vol. 8, no. 3, pp. 843– 854, 2020

  13. [13]

    Securegbm: Secure multi-party gradient boosting,

    Z. Feng, H. Xiong, C. Song, S. Yang, B. Zhao, L. Wang, Z. Chen, S. Yang, L. Liu, and J. Huan, “Securegbm: Secure multi-party gradient boosting,” in2019 IEEE international conference on big data (big data), pp. 1312–1321, IEEE, 2019

  14. [14]

    Federated boosted decision trees with differential privacy,

    S. Maddock, G. Cormode, T. Wang, C. Maple, and S. Jha, “Federated boosted decision trees with differential privacy,” inProceedings of the 2022 ACM SIGSAC conference on computer and communications security, pp. 2249–2263, 2022

  15. [15]

    Differential privacy protection method for decision tree ensemble model,

    X. Qin, L. Ge, and J. Wang, “Differential privacy protection method for decision tree ensemble model,” inProceedings of the 2023 11th In- ternational Conference on Communications and Broadband Networking, pp. 52–58, 2023

  16. [16]

    Differentially private classification with decision tree ensemble,

    X. Liu, Q. Li, T. Li, and D. Chen, “Differentially private classification with decision tree ensemble,”Applied Soft Computing, vol. 62, pp. 807– 816, 2018

  17. [17]

    Data driven prediction models of energy use of appliances in a low-energy house,

    L. M. Candanedo, V . Feldheim, and D. Deramaix, “Data driven prediction models of energy use of appliances in a low-energy house,”Energy and buildings, vol. 140, pp. 81–97, 2017

  18. [18]

    Chapter 4 - understanding your data,

    J. J. Berman, “Chapter 4 - understanding your data,” inData Simpli- fication(J. J. Berman, ed.), pp. 135–187, Boston: Morgan Kaufmann, 2016

  19. [19]

    Chapter 6 - selection of variables and factor derivation,

    D. Nettleton, “Chapter 6 - selection of variables and factor derivation,” inCommercial Data Mining(D. Nettleton, ed.), pp. 79–104, Boston: Morgan Kaufmann, 2014