arxiv: 2512.19131 · v2 · pith:EJD54PGFnew · submitted 2025-12-22 · 💻 cs.DC · cs.LG

Evidential Trust-Aware Model Personalization in Decentralized Federated Learning for Wearable IoT

Pith reviewed 2026-05-22 12:35 UTC · model grok-4.3

classification 💻 cs.DC cs.LG
keywords decentralized federated learning · evidential deep learning · model personalization · peer compatibility · epistemic uncertainty · wearable IoT · trust-aware aggregation · non-IID data

The pith

Epistemic uncertainty from Dirichlet-based evidential models directly signals which peers have matching data distributions for selective collaboration.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper presents Murmura, a decentralized federated learning framework for wearable IoT devices that personalizes local models by selectively aggregating updates only from compatible peers. Its core mechanism uses epistemic uncertainty produced by Dirichlet evidential models: when a peer model is evaluated on a node's local validation data, high uncertainty flags distributional mismatch and triggers exclusion. This replaces heuristic peer selection with an explicit trust score computed through cross-evaluation and adaptive thresholds. On three wearable datasets the method limits accuracy drop from IID to non-IID conditions to 0.9 percent versus 19.3 percent for baselines, while converging 7.4 times faster and remaining stable across hyperparameter settings.
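
To make the uncertainty signal concrete, here is a minimal sketch of one common epistemic-uncertainty measure for Dirichlet evidential classifiers: the vacuity u = K / S from Sensoy et al. (2018), where K is the number of classes and S is the sum of the Dirichlet parameters. Whether Murmura uses exactly this quantity is an assumption; the function names and the toy evidence values are hypothetical and serve only as illustration.

```python
from typing import Sequence


def dirichlet_vacuity(evidence: Sequence[float]) -> float:
    """Vacuity u = K / S for one sample (Sensoy et al., 2018).

    evidence: non-negative per-class evidence e_k from an evidential head.
    The Dirichlet parameters are alpha_k = e_k + 1 and S = sum_k alpha_k.
    Assumption: this is the epistemic-uncertainty measure of interest.
    """
    if any(e < 0 for e in evidence):
        raise ValueError("evidence must be non-negative")
    num_classes = len(evidence)
    strength = sum(e + 1.0 for e in evidence)  # S = sum_k alpha_k
    return num_classes / strength


def mean_vacuity(batch_evidence: Sequence[Sequence[float]]) -> float:
    """Average vacuity of one peer model over a node's validation samples."""
    return sum(dirichlet_vacuity(e) for e in batch_evidence) / len(batch_evidence)


# Toy values (hypothetical): a peer that collects strong evidence on the local
# validation samples versus a peer that collects almost none.
confident = [[40.0, 1.0, 0.5], [0.2, 35.0, 1.0]]
unfamiliar = [[0.3, 0.2, 0.1], [0.0, 0.4, 0.1]]
print(mean_vacuity(confident), mean_vacuity(unfamiliar))
```

With zero evidence the vacuity is 1, and it shrinks toward 0 as total evidence grows, which is why a high value on local validation data can be read as "this peer has seen little data like mine".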

Core claim

Epistemic uncertainty measured when a peer's Dirichlet evidential model evaluates a node's local validation samples serves as a direct indicator of distributional mismatch, allowing nodes to compute compatibility scores and perform trust-aware aggregation that excludes incompatible influence while still forming personalized models through selective collaboration with compatible peers.

What carries the argument

The trust-aware aggregation mechanism that derives peer compatibility scores from epistemic uncertainty on local validation samples and applies adaptive thresholds to decide which models to incorporate.
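
The mechanism described above can be sketched roughly as follows. This is not the authors' implementation: the mapping from uncertainty to trust (1 minus mean uncertainty), the adaptive threshold rule (the larger of a fixed floor and the mean neighbor trust), and all function and parameter names are assumptions chosen for illustration, since the summary does not specify them.

```python
from statistics import mean
from typing import Dict, List


def trust_scores(peer_uncertainty: Dict[str, float]) -> Dict[str, float]:
    """Map mean epistemic uncertainty in [0, 1] to trust (assumed: 1 - u)."""
    return {peer: 1.0 - u for peer, u in peer_uncertainty.items()}


def adaptive_threshold(scores: Dict[str, float], floor: float = 0.5) -> float:
    """Hypothetical rule: the larger of a fixed floor and the mean neighbor trust."""
    return max(floor, mean(scores.values()))


def aggregate(local: List[float], peers: Dict[str, List[float]],
              scores: Dict[str, float], self_weight: float = 1.0) -> List[float]:
    """Trust-weighted average of local parameters and peers above the threshold."""
    if not scores:
        return list(local)  # no neighbors: keep the local model unchanged
    tau = adaptive_threshold(scores)
    kept = {p: s for p, s in scores.items() if s >= tau}
    total = self_weight + sum(kept.values())
    merged = [self_weight * w for w in local]
    for peer, s in kept.items():
        for i, w in enumerate(peers[peer]):
            merged[i] += s * w
    return [w / total for w in merged]


# Peer "a" looks compatible (low uncertainty); peer "b" does not and is excluded.
scores = trust_scores({"a": 0.1, "b": 0.85})
merged = aggregate([1.0, 0.0], {"a": [0.0, 1.0], "b": [10.0, 10.0]}, scores)
print(merged)
```

Because each node runs this with its own validation data, two neighbors can end up with different sets of accepted peers, which is what makes the resulting models personalized rather than global.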

If this is right

  • Nodes exclude peers that produce high epistemic uncertainty on local data, preventing negative transfer in heterogeneous environments.
  • Personalized models retain near-IID accuracy with only 0.9 percent degradation across non-IID wearable datasets.
  • Convergence occurs 7.4 times faster than baseline decentralized methods while remaining robust to hyperparameter variation.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same uncertainty signal could be used to detect gradual distribution drift over time without additional monitoring modules.
  • In very large networks the method may naturally form implicit clusters of devices with similar sensing conditions.
  • Replacing the Dirichlet evidential head with other uncertainty estimators would test whether the compatibility signal is specific to this formulation.

Load-bearing premise

Epistemic uncertainty observed when a peer model processes local validation samples reliably reflects data-distribution mismatch rather than model initialization, training noise, or architectural differences.

What would settle it

Run controlled experiments in which all nodes draw from identical data distributions yet still observe high epistemic uncertainty when cross-evaluating peer models; if compatibility scores then fail to predict actual performance gains, the claimed direct link collapses.

Figures

Figures reproduced from arXiv: 2512.19131 by Murtaza Rangwala, Rajkumar Buyya, Richard O. Sinnott.

Figure 1: Murmura framework architecture showing the three-layer design. The …
Figure 2: Model accuracy across data heterogeneity levels (Dirichlet …
Figure 5: Convergence speed comparison showing rounds to reach peak accu…
Figure 4: Model personalization under high heterogeneity ( …
Original abstract

Decentralized federated learning (DFL) enables collaborative model training across edge devices without centralized coordination, offering resilience against single points of failure. However, statistical heterogeneity arising from non-identically distributed local data creates a fundamental challenge: nodes must learn personalized models adapted to their local distributions while selectively collaborating with compatible peers. Existing approaches either enforce a single global model that fits no one well, or rely on heuristic peer selection mechanisms that cannot distinguish between peers with genuinely incompatible data distributions and those with valuable complementary knowledge. We present Murmura, a framework that leverages evidential deep learning to enable trust-aware model personalization in DFL. Our key insight is that epistemic uncertainty from Dirichlet-based evidential models directly indicates peer compatibility: high epistemic uncertainty when a peer's model evaluates local data reveals distributional mismatch, enabling nodes to exclude incompatible influence while maintaining personalized models through selective collaboration. Murmura introduces a trust-aware aggregation mechanism that computes peer compatibility scores through cross-evaluation on local validation samples and personalizes model aggregation based on evidential trust with adaptive thresholds. Evaluation on three wearable IoT datasets (UCI HAR, PAMAP2, PPG-DaLiA) demonstrates that Murmura reduces performance degradation from IID to non-IID conditions compared to baseline (0.9% vs. 19.3%), achieves 7.4$\times$ faster convergence, and maintains stable accuracy across hyperparameter choices. These results establish evidential uncertainty as a principled foundation for compatibility-aware personalization in decentralized heterogeneous environments.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper proposes Murmura, a framework for decentralized federated learning (DFL) in wearable IoT that uses Dirichlet-based evidential deep learning to compute epistemic uncertainty during cross-evaluation of peer models on local validation data. High epistemic uncertainty is interpreted as a direct signal of distributional mismatch, enabling a trust-aware aggregation mechanism with adaptive thresholds for selective collaboration and personalized models. Experiments on UCI HAR, PAMAP2, and PPG-DaLiA report reduced degradation from IID to non-IID settings (0.9% vs. 19.3%) and 7.4× faster convergence relative to baselines.

Significance. If the central claim holds, the work supplies a principled uncertainty-driven alternative to heuristic peer selection in heterogeneous DFL, with potential value for edge IoT personalization. The multi-dataset evaluation and focus on evidential models are strengths; however, the absence of detailed experimental protocols limits the strength of the reported gains.

major comments (2)
  1. [§3] §3 (Trust-aware aggregation): the claim that epistemic uncertainty 'directly indicates' distributional mismatch is load-bearing for the compatibility scores and selective aggregation, yet the manuscript provides no ablations or controlled experiments that isolate this signal from confounders such as random initialization, optimization noise, or minor architectural differences under identical data distributions.
  2. [§5] §5 (Evaluation): the quantitative claims (0.9% vs 19.3% degradation, 7.4× faster convergence) are central to supporting the framework's effectiveness, but the text omits baseline implementation details, hyperparameter search procedures, statistical testing, number of independent runs, and precise non-IID partitioning methods, leaving the results only weakly reproducible.
minor comments (2)
  1. [§3.2] Clarify the exact mathematical definition of the compatibility score (e.g., whether it is simply 1 minus normalized epistemic uncertainty or involves additional terms) and ensure it appears in an equation rather than only in prose.
  2. [Figures in §5] Add error bars or standard deviations to convergence and accuracy plots to demonstrate stability across runs.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive comments and positive assessment of the work's potential significance. We address each major comment below and will revise the manuscript to incorporate the suggested improvements for greater rigor and reproducibility.

  1. Referee: [§3] §3 (Trust-aware aggregation): the claim that epistemic uncertainty 'directly indicates' distributional mismatch is load-bearing for the compatibility scores and selective aggregation, yet the manuscript provides no ablations or controlled experiments that isolate this signal from confounders such as random initialization, optimization noise, or minor architectural differences under identical data distributions.

    Authors: We acknowledge that the current manuscript does not contain explicit ablations isolating epistemic uncertainty from potential confounders under matched distributions. The core claim draws from the established properties of Dirichlet evidential deep learning, in which epistemic uncertainty specifically quantifies insufficient evidence in the training distribution rather than aleatoric noise or optimization artifacts. Nevertheless, to directly address the concern, we will add a dedicated controlled-experiment subsection in the revised §3. These experiments will train peer models on identical data distributions while varying random seeds, optimization trajectories, and minor architectural perturbations, demonstrating that cross-evaluation epistemic uncertainty remains low in such cases and rises substantially only under distributional mismatch. This addition will provide the requested empirical isolation.

    Revision: yes

  2. Referee: [§5] §5 (Evaluation): the quantitative claims (0.9% vs 19.3% degradation, 7.4× faster convergence) are central to supporting the framework's effectiveness, but the text omits baseline implementation details, hyperparameter search procedures, statistical testing, number of independent runs, and precise non-IID partitioning methods, leaving the results only weakly reproducible.

    Authors: We agree that the evaluation section requires substantially more detail to support reproducibility. In the revised manuscript we will expand §5 with: (i) precise descriptions of all baseline implementations and their adaptation to the decentralized setting, (ii) the full hyperparameter search grid, selection criteria, and final values used, (iii) the number of independent runs (five runs with distinct random seeds, reporting mean and standard deviation), (iv) statistical significance testing (paired t-tests and Wilcoxon signed-rank tests with p-values), and (v) the exact non-IID partitioning procedure (Dirichlet label skew with concentration parameter α = 0.5, plus sensor-specific feature skew for the wearable datasets). We will also release the full experimental code upon acceptance.

    Revision: yes
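
For readers unfamiliar with the partitioning scheme named in the response, the following is a minimal sketch of Dirichlet label skew as it is commonly done in federated learning experiments: for each class, node shares are drawn from a symmetric Dirichlet(α) and that class's samples are split accordingly. It is an illustration under assumptions, not the authors' procedure; the function name, the seed handling, and the synthetic labels are hypothetical, and the sensor-specific feature skew mentioned above is not covered.

```python
import random
from collections import defaultdict
from typing import Dict, List, Sequence


def dirichlet_partition(labels: Sequence[int], num_nodes: int,
                        alpha: float = 0.5, seed: int = 0) -> List[List[int]]:
    """Split sample indices across nodes with Dirichlet(alpha) label skew."""
    rng = random.Random(seed)
    by_class: Dict[int, List[int]] = defaultdict(list)
    for idx, y in enumerate(labels):
        by_class[y].append(idx)

    nodes: List[List[int]] = [[] for _ in range(num_nodes)]
    for y in sorted(by_class):
        idxs = by_class[y]
        rng.shuffle(idxs)
        # A Dirichlet sample is a vector of Gamma(alpha, 1) draws, normalized.
        gammas = [rng.gammavariate(alpha, 1.0) for _ in range(num_nodes)]
        total = sum(gammas)
        start, cum = 0, 0.0
        for n, g in enumerate(gammas):
            cum += g / total
            # The last node takes the remainder so every index is assigned once.
            end = len(idxs) if n == num_nodes - 1 else round(cum * len(idxs))
            end = max(start, min(len(idxs), end))
            nodes[n].extend(idxs[start:end])
            start = end
    return nodes


# Synthetic example: 600 samples, 6 classes, 5 nodes, alpha = 0.5.
labels = [i % 6 for i in range(600)]
parts = dirichlet_partition(labels, num_nodes=5, alpha=0.5, seed=1)
print([len(p) for p in parts])
```

Smaller α concentrates each class on fewer nodes (stronger heterogeneity), while large α approaches an even, IID-like split.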

Circularity Check

0 steps flagged

No significant circularity in derivation chain

The paper's core mechanism computes peer compatibility from epistemic uncertainty outputs of Dirichlet evidential models during cross-evaluation on local validation samples, then applies these scores in trust-aware aggregation with adaptive thresholds. This is presented as a direct insight rather than a closed derivation; the uncertainty signal is an output of the evidential model, not redefined or fitted to force the compatibility result. No self-definitional loops, fitted inputs renamed as predictions, or load-bearing self-citations appear in the claims. The framework is evaluated on external wearable IoT datasets (UCI HAR, PAMAP2, PPG-DaLiA) with reported metrics, keeping the central claim independent of its own inputs.

Axiom & Free-Parameter Ledger

1 free parameter · 1 axiom · 0 invented entities

The central claim rests on one domain assumption about uncertainty as a mismatch indicator plus adaptive thresholds that function as free parameters; no new physical entities are postulated.

free parameters (1)
  • adaptive trust thresholds
    Thresholds used to decide inclusion in aggregation are adaptive and therefore require data-dependent selection or tuning.
axioms (1)
  • domain assumption: Epistemic uncertainty from a peer model on local data directly signals distributional mismatch
    This premise is invoked as the key insight that turns uncertainty into a compatibility score.

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. OpenCLAW-Nexus: A Self-Reinforcing Trust Framework for Byzantine-Resilient Decentralized Federated Learning

    cs.NI 2026-04 unverdicted novelty 5.0

    OpenCLAW-Nexus uses a single discounted Beta-reputation model to unify reputation-based node selection, Rep-FedAvg aggregation, and reputation-aware BFT consensus, achieving Byzantine resilience in decentralized FL wi...
