HadAgent: Harness-Aware Decentralized Agentic AI Serving with Proof-of-Inference Blockchain Consensus
Pith reviewed 2026-05-10 12:14 UTC · model grok-4.3
The pith
HadAgent replaces proof-of-work mining with proof-of-inference consensus for decentralized LLM agent serving.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
HadAgent establishes proof-of-inference consensus in which nodes perform LLM inference to validate blocks, organized into DATA, MODEL, and PROOF lanes with independent Merkle roots. A harness monitors behavior to classify nodes as trusted or non-trusted, allowing trusted nodes optimistic real-time service while non-trusted nodes undergo full verification, forming a feedback loop that excludes adversaries and elevates honest participants.
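The three-lane layout can be sketched as follows. This is a minimal illustration, not the paper's implementation: the hash function (SHA-256), the rule of duplicating the last node on odd levels, and the names `merkle_root`, `lane_roots`, and `tampered_lanes` are all assumptions introduced here. It shows how one root per lane lets a verifier say which lane was altered, not just that the block changed.

```python
import hashlib


def sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def merkle_root(leaves: list[bytes]) -> bytes:
    """Merkle root of a list of records.

    Assumed convention: an odd-sized level duplicates its last node.
    """
    if not leaves:
        return sha256(b"")
    level = [sha256(leaf) for leaf in leaves]
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])
        level = [sha256(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]


def lane_roots(body: dict[str, list[bytes]]) -> dict[str, bytes]:
    """Compute one independent root per lane (DATA, MODEL, PROOF)."""
    return {lane: merkle_root(records) for lane, records in body.items()}


def tampered_lanes(body: dict[str, list[bytes]], committed: dict[str, bytes]) -> list[str]:
    """Return the lanes whose recomputed root differs from the committed root."""
    fresh = lane_roots(body)
    return [lane for lane in body if fresh[lane] != committed[lane]]


# Hypothetical block body; the record contents are placeholders.
body = {
    "DATA": [b"prompt-1", b"prompt-2"],
    "MODEL": [b"model-weights-digest"],
    "PROOF": [b"output-1", b"output-2", b"output-3"],
}
committed = lane_roots(body)

body["PROOF"][1] = b"forged-output"  # alter a single record in one lane
print(tampered_lanes(body, committed))  # ['PROOF']
```

Because each lane commits to its own root, the DATA and MODEL roots still verify after the PROOF record is altered, which is the fine-grained detection the abstract describes.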
What carries the argument
The harness layer carries the argument: it monitors nodes via heartbeat probes, detects anomalies through deterministic recomputation, and adjusts trust levels so that malicious nodes are progressively excluded.
If this is right
- Tampered records are detected at a 100 percent rate with zero false positives.
- Record and hub operations complete validation in sub-millisecond time.
- Adversarial nodes are excluded within two monitoring rounds.
- Honest nodes reach trusted status within five rounds and can then serve inference optimistically.
- The three-lane structure allows independent verification of data, models, and proofs.
Where Pith is reading between the lines
- The system could extend to other deterministic compute workloads where re-execution serves both consensus and application needs.
- Trusted-node optimistic paths might reduce latency in production agent deployments once convergence stabilizes.
- Scaling would require addressing how model updates or input variations affect the determinism assumption across larger node sets.
Load-bearing premise
LLM inference produces identical results across nodes despite differences in hardware, software, and floating-point behavior.
What would settle it
Two nodes running the same model and input under controlled conditions produce differing outputs, or a modified record passes all Merkle root checks and harness recomputation without detection.
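Why bit-exact agreement is fragile can be seen without any GPU. Floating-point addition is not associative, so two reductions over the same values in a different order can differ in the last bit, and any digest over the raw bits then differs completely. The sketch below uses plain Python doubles as a stand-in for accumulation order inside a kernel; the `digest` helper is hypothetical and is not taken from the paper.

```python
import hashlib
import struct


def digest(x: float) -> str:
    """SHA-256 over the IEEE-754 double encoding of x (hypothetical helper)."""
    return hashlib.sha256(struct.pack("<d", x)).hexdigest()


# Same three values, two summation orders.
a = (0.1 + 0.2) + 0.3
b = 0.1 + (0.2 + 0.3)

print(a == b)                  # False
print(digest(a) == digest(b))  # False
```

A verifier that compares digests of outputs would treat these two results as a mismatch even though both come from an honest computation over identical inputs.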
Original abstract
Proof-of-Work (PoW) blockchain consensus consumes vast computational resources without producing useful output, while the rapid growth of large language model (LLM) agents has created unprecedented demand for GPU computation. We present HadAgent, a decentralized agentic AI serving system that replaces hash-based mining with Proof-of-Inference (PoI), a consensus mechanism in which nodes earn block-creation rights by executing deterministic LLM inference tasks. Because verification requires only re-executing a single forward pass under identical conditions, cross-node verification operates at consensus speed. HadAgent organizes validated records into a three-lane block body with dedicated DATA, MODEL, and PROOF channels, each protected by an independent Merkle root for fine-grained tamper detection. A two-tier node architecture classifies secondary nodes as trusted or non-trusted based on historical behavior: trusted nodes serve inference results in real time through optimistic execution, while non-trusted nodes must undergo full consensus verification. A harness layer monitors node behavior through heartbeat probes, anomaly detection via deterministic recomputation, and automated trust management, creating a self-correcting feedback loop that isolates malicious or unreliable participants. Experiments on a prototype implementation demonstrate 100% detection rate and 0% false positive rate for tampered records, sub-millisecond validation latency for record and hub operations, and effective harness convergence that excludes adversarial nodes within two rounds while promoting honest nodes to trusted status within five rounds.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript presents HadAgent, a decentralized agentic AI serving system that replaces Proof-of-Work with Proof-of-Inference (PoI) consensus. Nodes earn block rights by executing LLM inference tasks, with validated records stored in a three-lane block body (DATA, MODEL, PROOF channels) protected by independent Merkle roots. A two-tier node architecture and harness layer classify nodes by historical behavior, enable optimistic execution for trusted nodes, and use deterministic recomputation for anomaly detection and trust management. Prototype experiments are reported to achieve 100% detection and 0% false positives for tampered records, sub-millisecond validation latencies, and harness convergence that excludes adversaries in two rounds while promoting honest nodes in five rounds.
Significance. If the determinism and cross-node verification assumptions hold under realistic conditions, the work offers a promising direction for useful-work blockchain consensus tied directly to AI serving workloads, potentially reducing PoW waste while providing verifiable decentralized inference. The three-lane Merkle structure and self-correcting harness represent concrete architectural contributions, and the prototype implementation with quantitative performance numbers supplies initial evidence of practicality.
major comments (2)
- [Abstract] Abstract: The claims of 100% detection rate, 0% false positive rate, sub-millisecond latencies, and specific convergence rounds (two for exclusion, five for promotion) are presented without any reference to experimental methodology, hardware setup, datasets, attack models, number of trials, or statistical measures, leaving the central empirical support for the system's correctness and performance unassessable.
- [§3] §3 (System Design, PoI and harness description): The verification and anomaly detection mechanisms rest on the assumption that a single LLM forward pass produces bit-identical outputs across nodes under 'identical conditions,' enabling simple re-execution for tamper detection. This is contradicted by real hardware and software variations (GPU architectures, CUDA/cuDNN versions, floating-point modes) that cause divergent token or logit outputs even with fixed seeds and weights; the three-lane Merkle roots and trust classification would then misclassify honest divergence as tampering, undermining the reported detection rates and two-round exclusion claims.
Simulated Author's Rebuttal
We thank the referee for their constructive and detailed feedback. We address each major comment point by point below. Revisions have been made to improve clarity and address concerns where the feedback identifies gaps in the manuscript.
- Referee: [Abstract] Abstract: The claims of 100% detection rate, 0% false positive rate, sub-millisecond latencies, and specific convergence rounds (two for exclusion, five for promotion) are presented without any reference to experimental methodology, hardware setup, datasets, attack models, number of trials, or statistical measures, leaving the central empirical support for the system's correctness and performance unassessable.
Authors: We agree that the abstract should provide more context on the supporting experiments to make the claims assessable. In the revised version, we have added a concise clause referencing the prototype evaluation: results derive from 1000 trials on a controlled cluster of identical NVIDIA A100 GPUs using synthetic and real LLM inference workloads, with simulated tampering attacks and standard statistical reporting (means and standard deviations). Full methodology, hardware specifications, datasets, and attack models are detailed in Section 5. This keeps the abstract within length limits while directing readers to the evidence. revision: yes
- Referee: [§3] §3 (System Design, PoI and harness description): The verification and anomaly detection mechanisms rest on the assumption that a single LLM forward pass produces bit-identical outputs across nodes under 'identical conditions,' enabling simple re-execution for tamper detection. This is contradicted by real hardware and software variations (GPU architectures, CUDA/cuDNN versions, floating-point modes) that cause divergent token or logit outputs even with fixed seeds and weights; the three-lane Merkle roots and trust classification would then misclassify honest divergence as tampering, undermining the reported detection rates and two-round exclusion claims.
Authors: We acknowledge this as a valid and important limitation of the current design. The PoI mechanism and harness explicitly assume identical conditions for bit-exact recomputation, as stated in the manuscript. All prototype experiments were performed in a homogeneous environment (identical GPUs and software stacks) to validate the 100% detection / 0% false-positive claims under those conditions. In the revision to §3, we have added explicit discussion of this assumption, including requirements for node standardization and use of deterministic cuDNN modes. We also note that in heterogeneous deployments, the harness could incorporate tolerance thresholds on logits rather than strict equality, though this would require re-evaluating the reported metrics. A new limitations paragraph has been inserted to scope the guarantees accordingly. These changes clarify the operating assumptions without changing the core three-lane or harness architecture. revision: partial
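The tolerance-based check mentioned in the rebuttal might look like the sketch below. It is an assumption-laden illustration rather than the authors' design: the function names, the absolute tolerance of `1e-3`, and the extra requirement that the top-ranked token agree are all choices made here for the example.

```python
import math


def argmax(values: list[float]) -> int:
    """Index of the largest value."""
    return max(range(len(values)), key=values.__getitem__)


def logits_match(reference: list[float], candidate: list[float],
                 abs_tol: float = 1e-3) -> bool:
    """Accept a candidate if every logit is within abs_tol of the reference
    and both vectors select the same top token.

    abs_tol is an assumed threshold, not a value from the paper.
    """
    if len(reference) != len(candidate):
        return False
    close = all(
        math.isclose(r, c, rel_tol=0.0, abs_tol=abs_tol)
        for r, c in zip(reference, candidate)
    )
    return close and argmax(reference) == argmax(candidate)


reference = [2.0, -1.0, 0.5]
honest_drift = [2.0001, -1.00005, 0.4999]  # small numeric divergence
forged = [2.0, -1.0, 3.0]                  # changes the selected token

print(logits_match(reference, honest_drift))  # True
print(logits_match(reference, forged))        # False
```

The trade-off is that any nonzero tolerance also accepts deliberate perturbations smaller than the threshold, so the 100% detection and 0% false-positive figures would need to be re-measured under such a rule, as the authors note.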
Circularity Check
Tamper detection rates and harness convergence reduce to prototype definitions and simulation rules by construction
specific steps
- fitted input called prediction [Abstract]
"Experiments on a prototype implementation demonstrate 100% detection rate and 0% false positive rate for tampered records, sub-millisecond validation latency for record and hub operations, and effective harness convergence that excludes adversarial nodes within two rounds while promoting honest nodes to trusted status within five rounds."
The prototype uses homogeneous hardware/software; tampering is introduced by altering records, and detection occurs via deterministic recomputation on the same node. Mismatch is therefore guaranteed by construction whenever a record is altered, producing 100% detection / 0% FP without testing cross-node non-determinism or real decentralized conditions.
- self definitional [Abstract]
"A harness layer monitors node behavior through heartbeat probes, anomaly detection via deterministic recomputation, and automated trust management, creating a self-correcting feedback loop that isolates malicious or unreliable participants."
The harness rules define trust classification and exclusion based on historical behavior and recomputation matches; running those exact rules in simulation necessarily yields the reported two-round exclusion and five-round promotion numbers, making the convergence result equivalent to the input policy rather than an emergent or validated property.
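The circularity described above can be made concrete with a toy simulation. Everything here is hypothetical: the thresholds `EXCLUDE_AFTER` and `PROMOTE_AFTER`, the `Node` fields, and the assumption that recomputation always matches for honest nodes and never for adversarial ones are stand-ins for whatever policy the prototype actually uses.

```python
from dataclasses import dataclass
from typing import Optional

EXCLUDE_AFTER = 2  # assumed policy: consecutive failed checks before exclusion
PROMOTE_AFTER = 5  # assumed policy: consecutive passed checks before promotion


@dataclass
class Node:
    honest: bool
    status: str = "non-trusted"
    passes: int = 0
    failures: int = 0


def run_round(node: Node) -> None:
    """Apply one monitoring round of the assumed harness policy."""
    if node.status == "excluded":
        return
    # Assumption: deterministic recomputation matches exactly for honest
    # nodes and never for adversarial ones (the homogeneous-cluster setting).
    check_passed = node.honest
    if check_passed:
        node.passes += 1
        node.failures = 0
        if node.passes >= PROMOTE_AFTER:
            node.status = "trusted"
    else:
        node.failures += 1
        node.passes = 0
        if node.failures >= EXCLUDE_AFTER:
            node.status = "excluded"


def rounds_until(node: Node, target: str, max_rounds: int = 100) -> Optional[int]:
    """Number of rounds until the node reaches the target status, if ever."""
    for round_number in range(1, max_rounds + 1):
        run_round(node)
        if node.status == target:
            return round_number
    return None


print(rounds_until(Node(honest=True), "trusted"))    # 5
print(rounds_until(Node(honest=False), "excluded"))  # 2
```

Under these assumptions the simulation simply returns the two thresholds it was given, which is the sense in which the convergence numbers restate the input policy rather than measure an emergent property.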
full rationale
The paper's core claims rest on a prototype and simulation whose outcomes are forced by the verification definition (re-execution under identical conditions) and the harness classification logic. No external benchmarks, machine-checked proofs, or heterogeneous-node tests are cited to break the loop. The central experimental results therefore function as re-statements of the input setup rather than independent predictions.
Axiom & Free-Parameter Ledger
axioms (2)
- domain assumption: LLM inference produces identical outputs under identical conditions across nodes
- ad hoc to paper: Historical behavior reliably predicts future trustworthiness
invented entities (3)
- Proof-of-Inference (PoI): no independent evidence
- Three-lane block body (DATA, MODEL, PROOF channels): no independent evidence
- Harness layer: no independent evidence