BlockRaFT: A Distributed Framework for Fault-Tolerant and Scalable Blockchain Nodes
Pith reviewed 2026-05-10 08:16 UTC · model grok-4.3
The pith
BlockRaFT turns a blockchain node into a RAFT-coordinated cluster that partitions tasks for better scalability and crash tolerance.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
BlockRaFT elects a leader with the RAFT protocol inside a cluster of machines that together execute a blockchain node. Stateless tasks run only at the leader, while stateful tasks are coordinated and replicated to the followers. This separation, plus a concurrent Merkle tree that decouples execution from tree updates, is presented as a way to improve workload balance, fault tolerance, and performance compared with a conventional node running on a single machine.
What carries the argument
RAFT leader-follower cluster that partitions blockchain node tasks by statefulness, centralizing stateless work and replicating stateful work, combined with concurrent Merkle tree updates.
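As a rough illustration of the partitioning idea, the sketch below routes a task either to the leader alone or to the whole cluster. It assumes that "stateful" means "mutates persistent node state", which is the definition given in the simulated rebuttal; the `Task` class, the `route` function, and the task and machine names are hypothetical and do not come from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    name: str
    mutates_state: bool  # assumption: stateful == mutates persistent node state


def route(task: Task, leader: str, followers: list[str]) -> list[str]:
    """Return the machines that must handle the task."""
    if task.mutates_state:
        # Stateful work is replicated: the leader plus every follower.
        return [leader, *followers]
    # Stateless work runs only at the leader.
    return [leader]


# Hypothetical example tasks and machine names.
verify = Task("verify_signature", mutates_state=False)
apply_block = Task("apply_block", mutates_state=True)

verify_targets = route(verify, "m0", ["m1", "m2"])
apply_targets = route(apply_block, "m0", ["m1", "m2"])
```

Whether a given operation is truly stateless is exactly what the referee report questions, so the boolean flag here hides the hard part of the design.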
If this is right
- Nodes can tolerate crashes of individual machines without losing the ability to process blocks or queries.
- Adding machines to the cluster increases capacity without redesigning the core blockchain logic.
- Load balancing across the cluster improves overall resource utilization.
- Decoupling smart-contract execution from Merkle-tree updates removes one major serial bottleneck.
- The same task-partition pattern could be reused in other state-heavy distributed applications.
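The decoupling of execution from Merkle-tree updates can be pictured with a minimal sketch. This is not the paper's concurrent Merkle tree: it assumes a plain binary tree over sorted key/value leaves and a single background worker, and every name in it (`merkle_root`, `execute`, the sample transactions) is hypothetical. It only shows that execution can hand a snapshot to a tree builder and move on to the next block.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor


def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def merkle_root(state: dict[str, int]) -> bytes:
    """Root of a binary Merkle tree over sorted (key, value) leaves."""
    level = [h(f"{k}={v}".encode()) for k, v in sorted(state.items())]
    if not level:
        return h(b"")
    while len(level) > 1:
        if len(level) % 2:  # duplicate the last node on odd-sized levels
            level.append(level[-1])
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]


def execute(state: dict[str, int], txs) -> dict[str, int]:
    """Apply transactions without touching the tree; return a snapshot."""
    for key, delta in txs:
        state[key] = state.get(key, 0) + delta
    return dict(state)


txs_block1 = [("alice", 10), ("bob", 5)]
txs_block2 = [("alice", -3), ("carol", 7)]

state: dict[str, int] = {}
with ThreadPoolExecutor(max_workers=1) as pool:
    snap1 = execute(state, txs_block1)
    root1_future = pool.submit(merkle_root, snap1)  # tree work runs in background
    snap2 = execute(state, txs_block2)              # execution continues meanwhile
    root2_future = pool.submit(merkle_root, snap2)
    root1, root2 = root1_future.result(), root2_future.result()
```

Copying the whole state per block is a simplification; a real design would more likely track only the keys touched by each block.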
Where Pith is reading between the lines
- Validators might run on cheaper commodity hardware clusters instead of single high-end servers.
- The approach could be wrapped around existing blockchain clients with minimal changes to the node software itself.
- Real deployments would need to measure whether RAFT messaging and state-replication overhead stay negligible at scale.
- Similar intra-node distribution might help other systems that keep large mutable state, such as distributed databases.
Load-bearing premise
Blockchain node tasks can be split into stateful and stateless categories with low overhead, and RAFT coordination will not add new consistency, security, or performance problems.
What would settle it
A side-by-side benchmark that measures transaction throughput, latency, and crash recovery time for a BlockRaFT cluster versus a standard single-node blockchain client under rising load.
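A measurement harness for such a comparison might look like the sketch below. It is only a skeleton under stated assumptions: `single_node_stub` and `cluster_stub` are placeholders that do trivial local work, so the numbers they produce say nothing about BlockRaFT. A real run would replace them with clients that submit transactions to each system, and would add a crash-recovery timer.

```python
import statistics
import time


def benchmark(submit, n_txs: int) -> dict:
    """Submit n_txs transactions one by one; report throughput and latency."""
    latencies = []
    start = time.perf_counter()
    for tx_id in range(n_txs):
        t0 = time.perf_counter()
        submit(tx_id)
        latencies.append(time.perf_counter() - t0)
    elapsed = time.perf_counter() - start
    return {
        "throughput_tps": n_txs / elapsed,
        "mean_latency_s": statistics.fmean(latencies),
        "p99_latency_s": sorted(latencies)[int(0.99 * (n_txs - 1))],
    }


def single_node_stub(tx_id: int) -> None:
    sum(range(100))  # placeholder for "submit to a single-node client"


def cluster_stub(tx_id: int) -> None:
    sum(range(100))  # placeholder for "submit to a BlockRaFT cluster"


results = {
    name: benchmark(fn, 200)
    for name, fn in [("single_node", single_node_stub), ("cluster", cluster_stub)]
}
```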
Original abstract
Blockchain technology enhances transparency by maintaining a distributed ledger among mutually untrusting parties. Despite its advantages, scalability and availability remain critical bottlenecks that hinder widespread adoption. The increasing complexity of blockchain nodes further necessitates robust fault tolerance and high throughput to ensure seamless operations. We present BlockRaFT, a crash-tolerant distributed framework designed to improve both the scalability and reliability of blockchain node operations. BlockRaFT framework utilizes RAFT consensus protocol to elect a leader within a cluster of systems. The elected leader coordinates and distributes workloads across follower nodes, thereby optimizing resource utilization and work load balancing. We analyzed the tasks performed by blockchain nodes and partition them according to their stateful and stateless characteristics. Stateless operations are centralized at the leader, while stateful operations are replicated and coordinated across the cluster to ensure consistency and fault tolerance. We evaluate whether this distributed intra-node architecture provides measurable benefits over traditional single-node execution models in terms of scalability, availability, and performance. Additionally, we introduce a concurrent Merkle tree optimization that decouples smart contract execution from tree updates, significantly reducing one of the significant performance overheads in blockchain systems. Our design philosophy is rooted in utilizing the well-established principles of distributed computing and customizing them for the blockchain domain rather than reinventing them.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper proposes BlockRaFT, a crash-tolerant distributed intra-node framework for blockchain systems. It applies the RAFT consensus protocol to elect a leader within a cluster of machines that coordinate workloads, partitions node tasks into stateful and stateless categories (centralizing stateless work at the leader while replicating stateful work), and introduces a concurrent Merkle-tree optimization that decouples smart-contract execution from tree updates. The central claim is that this architecture delivers measurable gains in scalability, availability, and performance relative to conventional single-node blockchain execution.
Significance. If the partitioning and RAFT integration can be shown to avoid new consistency or latency bottlenecks while delivering the claimed gains, the work would offer a practical way to strengthen blockchain node robustness by reusing established distributed-systems primitives rather than inventing new ones. The concurrent Merkle-tree idea addresses a known hot spot, but without quantitative evidence the practical impact remains unclear.
major comments (3)
- [Evaluation] Evaluation section: the manuscript states that an evaluation of scalability, availability, and performance benefits was performed, yet supplies no quantitative results, baselines, error bars, methodology details, or comparison against single-node execution. This absence prevents verification of the central claim.
- [Design / Task Partitioning] Task-partitioning design (Section 3 or equivalent): the assumption that blockchain operations can be cleanly separated into stateless and stateful categories with low cross-node synchronization cost is load-bearing, but no analysis quantifies the frequency or overhead of state handoffs. Operations labeled stateless (e.g., signature verification) frequently require read access to account balances or contract storage, creating potential serialization points and consistency windows with the external P2P network.
- [Concurrent Merkle Tree Optimization] Concurrent Merkle-tree optimization and RAFT integration: the design decouples execution from tree updates, but the manuscript does not specify how the RAFT log guarantees that followers obtain identical state snapshots before executing stateful work, leaving open the risk of divergent views or additional consistency mechanisms.
minor comments (2)
- [Abstract] Abstract: 'work load balancing' should read 'workload balancing', with 'workload' written as a single word.
- [Abstract] Abstract: the phrase 'significantly reducing one of the significant performance overheads' is redundant; rephrase for conciseness.
Simulated Author's Rebuttal
We thank the referee for their constructive and detailed comments on our manuscript. We address each major comment point by point below, providing clarifications where possible and committing to revisions that strengthen the presentation of our results and design rationale.
- Referee: [Evaluation] Evaluation section: the manuscript states that an evaluation of scalability, availability, and performance benefits was performed, yet supplies no quantitative results, baselines, error bars, methodology details, or comparison against single-node execution. This absence prevents verification of the central claim.
  Authors: We acknowledge that the current manuscript does not present the quantitative evaluation results, even though the abstract and introduction state that such an evaluation was performed using a prototype. The experiments compared BlockRaFT against a baseline single-node implementation on a cluster, measuring throughput, latency under load, and recovery time after simulated crashes. In the revised version we will expand the Evaluation section with the specific metrics, baselines, error bars, workload descriptions, and direct comparisons to single-node execution so that the central claims can be verified.
  Revision: yes
- Referee: [Design / Task Partitioning] Task-partitioning design (Section 3 or equivalent): the assumption that blockchain operations can be cleanly separated into stateless and stateful categories with low cross-node synchronization cost is load-bearing, but no analysis quantifies the frequency or overhead of state handoffs. Operations labeled stateless (e.g., signature verification) frequently require read access to account balances or contract storage, creating potential serialization points and consistency windows with the external P2P network.
  Authors: Our partitioning classifies operations by whether they mutate persistent node state (stateful, replicated via RAFT) or not (stateless, centralized at the leader). Reads required by some stateless operations are served from the leader’s current committed state; updates are applied only through the RAFT log, limiting cross-node handoffs to committed state changes. We agree that the manuscript lacks a quantitative breakdown of handoff frequency and overhead. The revised version will include measurements of these costs together with a discussion of how read consistency is maintained relative to the external P2P layer.
  Revision: yes
- Referee: [Concurrent Merkle Tree Optimization] Concurrent Merkle-tree optimization and RAFT integration: the design decouples execution from tree updates, but the manuscript does not specify how the RAFT log guarantees that followers obtain identical state snapshots before executing stateful work, leaving open the risk of divergent views or additional consistency mechanisms.
  Authors: Stateful work is executed only after the corresponding RAFT log entries have been committed, so every follower applies the identical sequence of state updates before performing those operations. The concurrent Merkle-tree optimization runs tree updates in parallel with execution but still respects the log commit order for state visibility. We will revise the relevant sections to explicitly describe these synchronization points and the resulting guarantee of identical snapshots, thereby clarifying that no additional consistency mechanisms are required beyond standard RAFT.
  Revision: yes
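The two mechanisms the rebuttal relies on (reads served from committed state, and replicas applying the same committed log in the same order) can be sketched as follows. This is a toy model, not RAFT itself: replication is simulated by appending to both logs directly, there is no election or quorum, and the `Replica` class and all of its methods are hypothetical.

```python
import hashlib
import json


class Replica:
    """One machine in the cluster; all names here are hypothetical."""

    def __init__(self):
        self.log = []      # replicated entries as (key, delta)
        self.applied = 0   # number of committed entries applied so far
        self.state = {}    # committed state, the only state readers see

    def append(self, key, delta):
        self.log.append((key, delta))
        return len(self.log)  # 1-based index of the new entry

    def commit_up_to(self, index):
        """Apply entries in log order once they are known to be committed."""
        index = min(index, len(self.log))
        while self.applied < index:
            key, delta = self.log[self.applied]
            self.state[key] = self.state.get(key, 0) + delta
            self.applied += 1

    def read(self, key):
        return self.state.get(key, 0)

    def digest(self):
        blob = json.dumps(self.state, sort_keys=True).encode()
        return hashlib.sha256(blob).hexdigest()


leader, follower = Replica(), Replica()
for key, delta in [("alice", 10), ("bob", 5), ("alice", -3)]:
    last = leader.append(key, delta)
    follower.append(key, delta)  # stands in for RAFT log replication

read_before = leader.read("alice")  # entries appended but not yet committed
leader.commit_up_to(last)
follower.commit_up_to(last - 1)     # follower lags by one committed entry
lagging_differs = follower.digest() != leader.digest()
follower.commit_up_to(last)         # follower catches up
read_after = leader.read("alice")
```

The lagging follower shows the window the referee is asking about: digests match only once both replicas have applied the same committed prefix, so any stateful work started before that point would see a different snapshot.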
Circularity Check
No circularity; architecture applies standard RAFT to analyzed task partitions
The paper's central claims rest on partitioning blockchain node operations into stateful and stateless categories (presented as an analysis of existing tasks) and then applying the established RAFT consensus protocol for leader election and workload distribution. No equations, fitted parameters, or predictions are defined in terms of themselves. The concurrent Merkle tree optimization is introduced as a decoupling technique without reference to self-referential success metrics. The design explicitly roots itself in 'well-established principles of distributed computing' rather than deriving uniqueness or correctness from author prior work or self-citations. Evaluation compares the proposed intra-node distribution against single-node baselines using conventional scalability and performance measures. No load-bearing step reduces by construction to its own inputs.
Axiom & Free-Parameter Ledger
axioms (2)
- [standard math] RAFT consensus protocol guarantees crash fault tolerance and correct leader election within a small cluster.
- [domain assumption] Blockchain node operations can be partitioned into stateless and stateful categories with negligible overhead and preserved correctness.
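For reference, the crash tolerance behind the first axiom follows from majority quorums: a RAFT cluster of n machines needs floor(n/2) + 1 of them to make progress, so it tolerates floor((n - 1)/2) crashed machines. The helper names below are hypothetical; the arithmetic is standard.

```python
def majority(n: int) -> int:
    """Smallest number of machines that forms a quorum in a cluster of n."""
    return n // 2 + 1


def tolerated_crashes(n: int) -> int:
    """Largest number of crashed machines that still leaves a quorum."""
    return (n - 1) // 2


# cluster size -> (quorum size, crashes tolerated)
table = {n: (majority(n), tolerated_crashes(n)) for n in (1, 3, 4, 5)}
```

Note that going from 3 to 4 machines raises the quorum size without tolerating any extra crash, which is why odd cluster sizes are the usual choice.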
invented entities (2)
- BlockRaFT framework: no independent evidence
- Concurrent Merkle tree optimization: no independent evidence
Reference graph
Works this paper leans on
- [1] S. Nakamoto, “Bitcoin: A Peer-to-Peer Electronic Cash System,” https://bitcoin.org/bitcoin.pdf, 2008.
- [2] U. Bodkhe, S. Tanwar, K. Parekh, P. Khanpara, S. Tyagi, N. Kumar, and M. Alazab, “Blockchain for industry 4.0: A comprehensive review,” IEEE Access, vol. 8, pp. 79764–79800, 2020.
- [3] E. Androulaki, A. Barger, V. Bortnikov, C. Cachin, K. Christidis, A. De Caro, D. Enyeart, C. Ferris, G. Laventman, Y. Manevich, S. Muralidharan, C. Murthy, B. Nguyen, M. Sethi, G. Singh, K. Smith, A. Sorniotti, C. Stathakopoulou, M. Vukolić, S. W. Cocco, and J. Yellick, “Hyperledger Fabric: A Distributed Operating System for Permissioned Blockchains,” ...
- [4] T. Dickerson, P. Gazzillo, M. Herlihy, and E. Koskinen, “Adding Concurrency to Smart Contracts,” ser. PODC ’17. New York, NY, USA: Association for Computing Machinery, 2017, p. 303–312.
- [5] P. S. Anjana, H. Attiya, S. Kumari, S. Peri, and A. Somani, “Efficient Concurrent Execution of Smart Contracts in Blockchains Using Object-Based Transactional Memory,” in Networked Systems, C. Georgiou and R. Majumdar, Eds. Cham: Springer International Publishing, 2021, pp. 77–93.
- [6] M. Piduguralla, S. Chakraborty, P. S. Anjana, and S. Peri, “Dag-based efficient parallel scheduler for blockchains: Hyperledger sawtooth as a case study,” in Euro-Par 2023: Parallel Processing, J. Cano, M. D. Dikaiakos, G. A. Papadopoulos, M. Pericàs, and R. Sakellariou, Eds. Cham: Springer Nature Switzerland, 2023, pp. 184–198.
- [7] Y. Gilad, R. Hemo, S. Micali, G. Vlachos, and N. Zeldovich, “Algorand: Scaling byzantine agreements for cryptocurrencies,” in Proceedings of the 26th Symposium on Operating Systems Principles, ser. SOSP ’17. New York, NY, USA: Association for Computing Machinery, 2017, p. 51–68. [Online]. Available: https://doi.org/10.1145/3132747.3132757
- [8] P. Vasin, “Blackcoin’s proof-of-stake protocol v2,” URL: https://blackcoin.co/blackcoin-pos-protocol-v2-whitepaper.pdf, vol. 71, 2014.
- [9] T. Kunz, J. P. Black, D. J. Taylor, and T. Basten, “POET: Target-System Independent Visualizations of Complex Distributed-Applications Executions,” The Computer Journal, vol. 40, no. 8, 1997.
- [10] H. Liu, X. Luo, H. Liu, and X. Xia, “Merkle tree: A fundamental component of blockchains,” in 2021 International Conference on Electronic Information Engineering and Computer Science (EIECS), 2021, pp. 556–561.
- [11] D. Ongaro and J. Ousterhout, “In search of an understandable consensus algorithm,” in Proceedings of the 2014 USENIX Conference on USENIX Annual Technical Conference, ser. USENIX ATC’14. USA: USENIX Association, 2014, p. 305–320.
- [12] L. Lamport, “Paxos made simple,” ACM SIGACT News (Distributed Computing Column) 32, 4 (Whole Number 121, December 2001), pp. 51–58, December 2001. [Online]. Available: https://www.microsoft.com/en-us/research/publication/paxos-made-simple/
- [13] R. E. Tarjan, “Efficiency of a good but not linear set union algorithm,” J. ACM, vol. 22, no. 2, p. 215–225, Apr. 1975. [Online]. Available: https://doi.org/10.1145/321879.321884
- [14] T. etcd Authors, “etcd: A distributed, reliable key-value store for critical data,” GitHub repository, 2025, accessed: 2025-02-19. [Online]. Available: https://github.com/etcd-io/etcd
- [15] Z. Gao, Y. Hu, and Q. Wu, “Jellyfish merkle tree (2021),” 2021.
- [16] J. Kalidhindi, A. Kazorian, A. Khera, and C. Pari, “Angela: A sparse, distributed, and highly concurrent merkle tree,” UC Berkeley, Berkeley, 2018.
- [17] D. E. Difallah, A. Pavlo, C. Curino, and P. Cudré-Mauroux, “Oltp-bench: An extensible testbed for benchmarking relational databases,” PVLDB, vol. 7, no. 4, pp. 277–288, 2013. [Online]. Available: http://www.vldb.org/pvldb/vol7/p277-difallah.pdf
- [18] cmu-db, “Benchbase: Multi-dbms sql benchmarking framework via jdbc,” https://github.com/cmu-db/benchbase, 2025, accessed: 2025-08-09.
- [19] K. Gao, C. Sun, S. Wang, D. Li, Y. Zhou, H. H. Liu, L. Zhu, and M. Zhang, “Buffer-based end-to-end request event monitoring in the cloud,” in 19th USENIX Symposium on Networked Systems Design and Implementation (NSDI 22). Renton, WA: USENIX Association, Apr. 2022, pp. 829–843. [Online]. Available: https://www.usenix.org/conference/nsdi22/presentation/gao-kaihui
- [20] Anonymous. (n.d.) Blockraft. Anonymous code repository hosted on 4open.science. [Online]. Available: https://anonymous.4open.science/r/BlockRAFT-00C1
- [21] P. S. Anjana, S. Kumari, S. Peri, S. Rathor, and A. Somani, “An Efficient Framework for Optimistic Concurrent Execution of Smart Contracts,” in 2019 27th Euromicro International Conference on Parallel, Distributed and Network-Based Processing (PDP), 2019, pp. 83–92.
- [22] Y. Hay and R. Friedman, “Batch-schedule-execute: On optimizing concurrent deterministic scheduling for blockchains,” in 2024 43rd International Symposium on Reliable Distributed Systems (SRDS), 2024, pp. 163–174.
- [23] M. J. Amiri, D. Agrawal, and A. El Abbadi, “ParBlockchain: Leveraging Transaction Parallelism in Permissioned Blockchain Systems,” in 2019 IEEE 39th International Conference on Distributed Computing Systems (ICDCS), 2019, pp. 1337–1347.
- [24] R. Gelashvili, A. Spiegelman, Z. Xiang, G. Danezis, Z. Li, D. Malkhi, Y. Xia, and R. Zhou, “Block-STM: Scaling Blockchain Execution by Turning Ordering Curse to a Performance Blessing,” 2022.
- [25] Q. Wang, J. Yu, S. Chen, and Y. Xiang, “Sok: Dag-based blockchain systems,” ACM Comput. Surv., vol. 55, no. 12, Mar. 2023. [Online]. Available: https://doi.org/10.1145/3576899
- [26] H. Hellani, L. Sliman, A. E. Samhat, and E. Exposito, “Tangle the blockchain: towards connecting blockchain and dag,” in 2021 IEEE 30th International Conference on Enabling Technologies: Infrastructure for Collaborative Enterprises (WETICE), 2021, pp. 63–68.
- [27] Y. Sompolinsky, S. Wyborski, and A. Zohar, “Phantom ghostdag: a scalable generalization of nakamoto consensus: September 2, 2021,” in Proceedings of the 3rd ACM Conference on Advances in Financial Technologies, ser. AFT ’21. New York, NY, USA: Association for Computing Machinery, 2021, p. 57–70. [Online]. Available: https://doi.org/10.1145/3479722.3480990
- [28] H. Dang, T. T. A. Dinh, D. Loghin, E.-C. Chang, Q. Lin, and B. C. Ooi, “Towards Scaling Blockchain Systems via Sharding,” in Proceedings of the 2019 International Conference on Management of Data, ser. SIGMOD ’19. New York, NY, USA: Association for Computing Machinery, 2019, p. 123–140.
- [29] L. Luu, V. Narayanan, C. Zheng, K. Baweja, S. Gilbert, and P. Saxena, “A secure sharding protocol for open blockchains,” ser. CCS ’16. New York, NY, USA: Association for Computing Machinery, 2016, p. 17–30.
- [30] S. Baheti, P. S. Anjana, S. Peri, and Y. Simmhan, “DiPETrans: A framework for Distributed Parallel Execution of Transactions of Blocks in Blockchains,” Concurrency and Computation: Practice and Experience, vol. 34, no. 10, p. e6804, 2022.
- [31] Q. Kniep, L. Kokoris-Kogias, A. Sonnino, I. Zablotchi, and N. Zhang, “Pilotfish: Distributed execution for scalable blockchains,” in Financial Cryptography and Data Security (FC), Miyakojima, Japan, Apr. 2025.
- [32] P. Li, M. Song, M. Xing, Z. Xiao, Q. Ding, S. Guan, and J. Long, “Spring: Improving the throughput of sharding blockchain via deep reinforcement learning based state placement,” ser. WWW ’24. New York, NY, USA: Association for Computing Machinery, 2024, p. 2836–2846. [Online]. Available: https://doi.org/10.1145/3589334.3645386
- [33] R. Data, “Redpanda: The future of streaming data,” GitHub repository, 2025, accessed: 2025-02-19. [Online]. Available: https://github.com/redpanda-data/redpanda
APPENDIX
A. Implementation Details
We have implemented the proposed BlockRaFT framework using the C++ programming language. We utilize ETCD [14], which serves as a distributed, asynchronous share...
Wallet Smart Contract: This contract represents a decentralized banking or token-based system, where wallet balances and fund transfers are securely recorded and verifiable through the blockchain’s global state. The transactions under this SCT are:
- deposit <client_id> <key> <amount>: Adds funds to a wallet.
- withdraw <client_id> <key> <amount>: Remove...