JiRAIYA: A Reputation-Based Hierarchical Federated Learning Framework on Web3

Pallav Kumar Baruah; Venkata Raghava Kurada

arxiv: 2606.13180 · v1 · pith:PBPBY3NFnew · submitted 2026-06-11 · 💻 cs.DC

Pith reviewed 2026-06-27 05:58 UTC · model grok-4.3

classification 💻 cs.DC

keywords federated learning · web3 · hierarchical architecture · consensus mechanism · reputation system · novelty detection · model poisoning · transparency

The pith

A Web3 hierarchy lets delegated managers reach consensus on encoded model updates to keep federated learning transparent and resistant to poisoning.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces a federated learning system built on Web3 tools to address opaque aggregation and limited auditability in enterprise FL. It organizes participants into federations run by delegated managers who broadcast encoded updates, evaluate them independently with novelty detection, and accept only those that pass consensus. A reputation-based backup keeps model generation running if primary paths fail. Real-world experiments test resilience against adversarial attacks. The design aims to move reliable FL into open, decentralized settings without relying on external validators.

Core claim

Model updates are encoded and broadcast to all managers, who independently evaluate their validity using novelty detection; updates approved by consensus are incorporated into the global model, while a reputation score mechanism provides backup to ensure continued model generation.
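
A minimal sketch of the broadcast-and-vote step described above, assuming each manager scores an update by its distance from the centroid of a locally trusted reference set and that a simple majority decides. The `Manager` class, the `tolerance` multiplier, and the `quorum` value are illustrative assumptions, not the paper's actual novelty detector or consensus rule, and the encoding of updates is not modelled.

```python
import math
from statistics import mean


def l2_distance(a, b):
    """Euclidean distance between two equally sized vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


class Manager:
    """Hypothetical manager holding a set of trusted reference updates."""

    def __init__(self, reference_updates, tolerance=3.0):
        dims = len(reference_updates[0])
        self.tolerance = tolerance
        # Centroid of the reference updates, one coordinate at a time.
        self.centroid = [mean(u[i] for u in reference_updates) for i in range(dims)]
        # Typical spread of the reference updates around the centroid.
        self.scale = mean(l2_distance(u, self.centroid) for u in reference_updates)

    def approves(self, update):
        # Novelty check (assumed): reject updates that lie far outside
        # the spread of what this manager has already seen.
        return l2_distance(update, self.centroid) <= self.tolerance * self.scale


def consensus_accepts(managers, update, quorum=0.5):
    """Accept the update only if strictly more than `quorum` of managers approve."""
    votes = sum(1 for m in managers if m.approves(update))
    return votes > quorum * len(managers)
```

Each manager evaluates the same broadcast update independently, so a single lenient manager cannot admit an update on its own; only the vote count matters.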

What carries the argument

Hierarchical architecture of delegated managers that broadcast encoded updates for independent consensus evaluation combined with novelty detection and reputation-based backup.

If this is right

Every accepted model update becomes visible and auditable by all managers in the hierarchy.
Training can continue without external validators because managers handle evaluation internally.
Reputation scores allow the system to fall back to reliable participants when others drop out.
The framework supports FL participation beyond closed enterprise networks by leveraging Web3 transparency.
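
One way to picture the reputation fallback mentioned in this list, as a sketch under stated assumptions: when no update passes consensus, the round falls back to the participant with the highest reputation score. The function names, the default score of 0.5, and the fixed `step` adjustment are hypothetical and are not taken from the paper.

```python
def select_updates(candidates, approved_ids, reputation):
    """Pick which participants' updates feed the next model.

    candidates:   dict of participant id -> update
    approved_ids: set of ids whose updates passed consensus
    reputation:   dict of participant id -> score in [0, 1]
    """
    if approved_ids:
        # Normal path: use everything the managers approved.
        return sorted(approved_ids)
    # Backup path (assumed): nothing passed consensus, so fall back to
    # the single most reputable participant to keep training going.
    best = max(candidates, key=lambda pid: reputation.get(pid, 0.0))
    return [best]


def update_reputation(reputation, pid, approved, step=0.1):
    """Raise or lower a participant's score after a round, clamped to [0, 1]."""
    score = reputation.get(pid, 0.5)
    score = score + step if approved else score - step
    reputation[pid] = min(1.0, max(0.0, score))
    return reputation[pid]
```

Note that a fallback of this kind preserves continuity but does not validate the chosen update, so it inherits whatever weaknesses the reputation scores have.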

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same broadcast-and-consensus pattern could be extended to record an immutable training log on a blockchain for post-hoc audits.
Open participant pools might grow larger than in traditional FL because evaluation is distributed rather than centralized.
If novelty detection parameters are tuned per federation, the approach could adapt to domain-specific data distributions without global retraining.
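
The immutable-log extension above can be illustrated with a hash chain, the basic structure underlying blockchain ledgers. This is a local sketch only: the entry fields (`round`, `update`, `accepted`) are hypothetical, and a real deployment would write to a smart contract or ledger rather than a Python list.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder hash for the first entry's predecessor


def _digest(body):
    """Deterministic SHA-256 digest of a JSON-serialisable dict."""
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()


def append_entry(log, round_id, update_digest, accepted):
    """Append one accept/reject decision, chained to the previous entry."""
    prev_hash = log[-1]["entry_hash"] if log else GENESIS
    body = {"round": round_id, "update": update_digest,
            "accepted": accepted, "prev": prev_hash}
    log.append({**body, "entry_hash": _digest(body)})
    return log[-1]["entry_hash"]


def verify(log):
    """Return True only if no entry has been altered or reordered."""
    prev_hash = GENESIS
    for entry in log:
        body = {k: entry[k] for k in ("round", "update", "accepted", "prev")}
        if entry["prev"] != prev_hash or entry["entry_hash"] != _digest(body):
            return False
        prev_hash = entry["entry_hash"]
    return True
```

Because each entry commits to the hash of its predecessor, changing any past decision invalidates every later entry, which is what makes a post-hoc audit meaningful.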

Load-bearing premise

Broadcasting updates to multiple managers for consensus evaluation plus novelty detection can filter out bad updates effectively while keeping communication costs low.

What would settle it

An experiment in which a poisoning attack succeeds in altering the global model even after the consensus and novelty checks, or where the measured communication volume exceeds that of validator-based alternatives.
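
To make the communication side of that test concrete, the following back-of-the-envelope counter compares message counts per round. It is a rough sketch: the two voting patterns and the validator round trip are assumptions made for illustration, and it ignores payload size and on-chain gas, both of which a real measurement would need to include.

```python
def consensus_messages(num_updates, num_managers, all_to_all_votes=False):
    """Messages per round when every update is broadcast to every manager."""
    broadcast = num_updates * num_managers
    if all_to_all_votes:
        # Assumed variant: each manager sends its vote to every other manager.
        votes = num_updates * num_managers * (num_managers - 1)
    else:
        # Assumed variant: each manager posts one vote per update to a shared ledger.
        votes = num_updates * num_managers
    return broadcast + votes


def validator_messages(num_updates, num_validators):
    """Each update goes to each external validator, which returns one verdict."""
    return 2 * num_updates * num_validators
```

Under these assumptions, ledger-style voting grows linearly in the number of managers while all-to-all voting grows quadratically, so the voting pattern largely decides whether the low-overhead premise can hold.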

Figures

Figures reproduced from arXiv: 2606.13180 by Pallav Kumar Baruah, Venkata Raghava Kurada.

**Figure 1.** Hierarchical Arrangement of the Roles. [PITH_FULL_IMAGE:figures/full_fig_p009_1.png]

**Figure 2.** Workflow of the JiRAIYA FL Framework. [PITH_FULL_IMAGE:figures/full_fig_p010_2.png]

**Figure 3.** Gas, Communication Usage and Test Accuracy of JiRAIYA. [PITH_FULL_IMAGE:figures/full_fig_p017_3.png] (Source page footer: Preprint submitted to Elsevier.)

**Figure 4.** Gas, Communication Usage and Test Accuracy of JiRAIYA. [PITH_FULL_IMAGE:figures/full_fig_p018_4.png]

**Figure 5.** Data Poisoning Attacks with Varying Degree of Flip, Proportion. (a) Model Poisoning with 2 Global Rounds (b) Model Poisoning with 3 Global Rounds [PITH_FULL_IMAGE:figures/full_fig_p019_5.png]

**Figure 6.** Model Poisoning Attacks with Varying Degree of Flip, Proportion. [PITH_FULL_IMAGE:figures/full_fig_p019_6.png]

Original abstract

Federated Learning(FL) is predominantly deployed in enterprise environments, where limited transparency and restricted auditability hinder broader adoption. Existing FL systems often suffer from opaque aggregation processes, making it unclear which model updates are accepted or discarded. Current mitigation strategies typically rely on external validators introducing additional computational and communication overhead. In this paper, we propose a novel FL framework that leverages existing Web3 technologies to enhance transparency, trust and auditability throughout the training process. The framework adopts a hierarchical architecture in which delegated managers orchestrate the FL training process within their respective federations. To mitigate adversarial and poisoning attacks, a combination of novelty detection and consensus mechanisms were employed. Model updates are encoded and broad casted to all managers, who independently evaluate their validity and those model updates that are approved by the consensus are incorporated into the global model. Additionally, a reputation score based backup mechanism is employed to ensure model generation. Extensive experiments conducted under real world scenarios demonstrate the effectiveness, resilience of the proposed framework, highlighting its potential to enable transparent FL beyond traditional enterprise setting.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

JiRAIYA sketches a Web3 hierarchical FL setup with manager consensus and novelty checks but supplies no data, threat model, or attack results to support its resilience claims.

The main takeaway is that this paper outlines a new architecture called JiRAIYA for running federated learning on Web3. Delegated managers handle local federations, model updates get encoded and broadcast for independent novelty detection, consensus among managers decides acceptance, and a reputation score acts as backup to keep training going. The goal is more transparency and auditability without extra external validators.

What stands out as new is the specific layering: Web3 broadcast plus manager-level consensus to replace opaque central aggregation. It builds on known pieces like novelty detection and reputation but ties them to a hierarchical Web3 structure aimed at non-enterprise settings.

The idea addresses a genuine pain point in current FL systems around trust and visibility. Using existing Web3 mechanisms for encoding and agreement is a reasonable direction if the overhead stays manageable.

The soft spots are substantial and sit right at the center. The abstract states that extensive real-world experiments show effectiveness and resilience against adversarial and poisoning attacks, yet the text gives no methods, datasets, metrics, error bars, or attack success rates. There is also no threat model describing the fraction of malicious managers, attack types like model replacement, or how the novelty threshold and consensus rules perform under those conditions. Without those, it is impossible to judge whether the consensus step actually filters bad updates or just adds communication cost.
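
To show what a stated Byzantine fraction would pin down, here is a small sketch assuming plain threshold voting among managers, where an update is accepted only if strictly more than a `quorum` fraction approve. The functions and the rule are illustrative assumptions; the paper's actual consensus rule is not described in this text.

```python
import math
from fractions import Fraction


def votes_needed(num_managers, quorum=Fraction(1, 2)):
    """Smallest vote count that is strictly more than quorum * num_managers."""
    return math.floor(quorum * num_managers) + 1


def tolerated_byzantine(num_managers, quorum=Fraction(1, 2)):
    """Largest number of colluding managers that can neither force an
    acceptance on their own nor block an update every honest manager approves."""
    need = votes_needed(num_managers, quorum)
    cannot_force = need - 1          # colluders alone stay below the threshold
    cannot_block = num_managers - need  # honest managers alone still reach it
    return min(cannot_force, cannot_block)
```

For example, with 7 managers a simple majority tolerates 3 colluders under this model, while a two-thirds quorum tolerates only 2, because a stricter threshold makes blocking easier.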

The reputation backup helps with continuity but does not strengthen the validation step itself. If the paper has more in the full text, that would change the picture, but based on what is here the claims rest on assertion rather than shown results.

This is for people already working at the intersection of blockchain and distributed ML who want architecture sketches. It does not yet have the grounding for a serious referee process. I would not send it to peer review until the experimental evaluation and threat analysis are added and the results are reported in detail.

Referee Report

2 major / 2 minor

Summary. The paper proposes JiRAIYA, a hierarchical federated learning framework that uses Web3 technologies (including blockchain for transparency) to address opacity in traditional FL aggregation. Delegated managers orchestrate training within federations; model updates are encoded and broadcast to all managers, who apply novelty detection and reach consensus to accept or reject updates before incorporation into the global model. A reputation-score-based backup mechanism ensures continuity. The authors claim that extensive experiments under real-world scenarios demonstrate the framework's effectiveness and resilience against adversarial and poisoning attacks, enabling transparent FL beyond enterprise settings.

Significance. If the central claims on attack mitigation and real-world performance hold, the work would offer a concrete path to auditable, decentralized FL by repurposing existing Web3 primitives rather than introducing new external validators. The hierarchical manager design and reputation backup are pragmatic engineering choices that could reduce single points of failure, but the absence of quantitative attack evaluations limits the immediate impact on the FL security literature.

major comments (2)

[Section 3] Section 3 (framework description): the consensus-plus-novelty-detection mechanism is presented at a high level without a formal threat model (e.g., Byzantine fraction among managers, model-replacement or backdoor attack definitions, or assumptions on manager collusion). This is load-bearing because the central claim that broadcasting updates plus independent evaluation reliably filters poisoning attacks rests on unstated security assumptions.
[Experiments section] Experiments section (and abstract): the manuscript asserts 'extensive experiments conducted under real world scenarios' demonstrate resilience, yet reports only aggregate effectiveness metrics. No attack success rates, false-positive rates on benign updates, ablation on the novelty detector, or comparison against baselines under explicit poisoning are provided. This undermines the resilience claim that is central to the contribution.
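
The missing quantities in the second major comment can be computed from per-update ground truth, as in this sketch. The function and field names are hypothetical. The second metric is the share of malicious updates that slip past the filter, which is only a proxy for attack success; a full evaluation would also measure the effect on the global model.

```python
def filter_metrics(is_malicious, was_accepted):
    """Summarise how an update filter behaved.

    is_malicious: list of bools, ground truth per update
    was_accepted: list of bools, filter decision per update (same order)
    """
    pairs = list(zip(is_malicious, was_accepted))
    benign_total = sum(1 for m, _ in pairs if not m)
    malicious_total = sum(1 for m, _ in pairs if m)
    # Benign updates wrongly rejected by the filter.
    benign_rejected = sum(1 for m, a in pairs if not m and not a)
    # Malicious updates wrongly accepted by the filter.
    malicious_accepted = sum(1 for m, a in pairs if m and a)
    return {
        "false_positive_rate": benign_rejected / benign_total if benign_total else 0.0,
        "malicious_acceptance_rate": malicious_accepted / malicious_total if malicious_total else 0.0,
    }
```

Running the same function with the novelty detector disabled would give the ablation the referee asks for, since both runs share one definition of each rate.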

minor comments (2)

[Abstract] Abstract: 'broad casted' should be 'broadcast'; 'the effectiveness, resilience' is missing 'and'.
Notation for reputation scores and consensus thresholds is introduced without a clear table or equation reference, making it difficult to reproduce the exact validation logic.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback. We address the two major comments point by point below and commit to revisions that directly strengthen the security analysis and experimental evidence.

Referee: [Section 3] Section 3 (framework description): the consensus-plus-novelty-detection mechanism is presented at a high level without a formal threat model (e.g., Byzantine fraction among managers, model-replacement or backdoor attack definitions, or assumptions on manager collusion). This is load-bearing because the central claim that broadcasting updates plus independent evaluation reliably filters poisoning attacks rests on unstated security assumptions.

Authors: We agree that a formal threat model is necessary to rigorously ground the security claims. In the revised manuscript we will insert a dedicated threat-model subsection in Section 3 that explicitly states (i) the assumed Byzantine fraction among managers, (ii) definitions of model-replacement and backdoor attacks, and (iii) collusion assumptions. This will clarify how the broadcast-plus-consensus design is intended to filter poisoning under those assumptions.

Revision: yes

Referee: [Experiments section] Experiments section (and abstract): the manuscript asserts 'extensive experiments conducted under real world scenarios' demonstrate resilience, yet reports only aggregate effectiveness metrics. No attack success rates, false-positive rates on benign updates, ablation on the novelty detector, or comparison against baselines under explicit poisoning are provided. This undermines the resilience claim that is central to the contribution.

Authors: We accept that the current experimental presentation does not supply the quantitative attack metrics needed to substantiate the resilience claims. In the revision we will augment the experiments section with (i) attack success rates under poisoning and backdoor scenarios, (ii) false-positive rates on benign updates, (iii) ablation results isolating the novelty detector, and (iv) direct comparisons against standard FL baselines under the same attack models. These additions will be reported alongside the existing aggregate metrics.

Revision: yes

Circularity Check

0 steps flagged

No circularity: framework proposal relies on external Web3 primitives, not self-derived inputs.

The manuscript describes a hierarchical FL architecture using delegated managers, novelty detection, consensus on encoded updates, and reputation-based backup. No equations, fitted parameters, or derivations are presented that reduce to the paper's own inputs by construction. No self-citations are invoked as load-bearing uniqueness theorems or ansatzes. The central claims rest on the proposed mechanisms and claimed experiments rather than any self-definitional or fitted-input reduction. This matches the expected non-circular outcome for a systems-proposal paper.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 1 invented entity

Based on abstract only; the reputation score mechanism appears introduced by the paper as a core component without external grounding.

invented entities (1)

reputation score based backup mechanism (no independent evidence)
purpose: to ensure model generation in the hierarchical FL process
Introduced to address potential failures in consensus; no independent evidence or falsifiable prediction provided in abstract.

pith-pipeline@v0.9.1-grok · 5716 in / 1343 out tokens · 28268 ms · 2026-06-27T05:58:45.274766+00:00 · methodology
