
arxiv: 2605.00065 · v1 · submitted 2026-04-30 · 💻 cs.CR · cs.DC

Lightweight Tamper-Evident Log Integrity Verification for IoT Edge Environments: A Merkle Tree Pipeline with Adaptive Chunking

Pith reviewed 2026-05-09 20:59 UTC · model grok-4.3

classification 💻 cs.CR cs.DC
keywords IoT log integrity · Merkle tree · tamper evidence · adaptive chunking · edge computing · audit logs · blockchain alternative · tamper detection

The pith

Merkle-tree commitments with adaptive chunking provide tamper-evident IoT log verification at the edge without blockchain.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper addresses the need for integrity in IoT device audit logs to support forensics, compliance, and accountability, noting that blockchain solutions add too much overhead for edge devices. It proposes a three-stage pipeline using resource-aware adaptive chunking for batch ingestion, Merkle-tree construction for O(log n) proofs, and verification against a trusted root anchor. After fixing two bugs in the evaluation code, benchmarks on synthetic datasets report throughput above 130,000 logs per second, 22 ms latency for verification and proof generation, roughly 1,000-byte proofs, under 5 MB peak memory, and perfect tampering detection across 1% to 50% corruption rates. A sympathetic reader would care because this could make secure logging practical for resource-limited IoT settings where heavier alternatives are infeasible.

Core claim

The proposed pipeline operates in three stages: resource-aware batch ingestion via adaptive chunk sizing, Merkle-tree construction with O(log n) inclusion proof generation, and deterministic single-entry verification against a trusted root anchor. After correcting a double-counting bug in tampering metrics and a redundant full-tree reconstruction during batch appends, five-run benchmarks on synthetic IoT log datasets show throughput exceeding 130,000 logs/s for 100,000 records, per-entry verification latency of approximately 22 ms, proof generation latency of 22 ms, an average proof size of 1,006 bytes, peak memory usage below 5 MB, and perfect precision, recall, and F1-score of 1.0 for tampering detection across corruption ratios from 1% to 50%.

What carries the argument

The Merkle tree pipeline with resource-aware adaptive chunking, which sizes log batches according to available resources and builds commitments that allow efficient inclusion proofs for single-entry verification against a trusted root.
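The construction, proof, and verification steps can be illustrated with a minimal Python sketch. It is not the authors' implementation: SHA-256, the `LEAF`/`NODE` prefix bytes, the duplicate-last-node rule for odd levels, and all function names are assumptions chosen for illustration.

```python
import hashlib
from typing import List, Tuple

LEAF, NODE = b"\x00", b"\x01"  # domain-separation prefixes (an assumption)


def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def build_levels(entries: List[bytes]) -> List[List[bytes]]:
    """Return every tree level, leaf hashes first and the root level last."""
    if not entries:
        raise ValueError("at least one log entry is required")
    level = [h(LEAF + e) for e in entries]
    levels = [level]
    while len(level) > 1:
        if len(level) % 2:  # odd level: pair the last node with itself
            level = level + [level[-1]]
        level = [h(NODE + level[i] + level[i + 1])
                 for i in range(0, len(level), 2)]
        levels.append(level)
    return levels


def inclusion_proof(levels: List[List[bytes]], index: int) -> List[Tuple[bytes, bool]]:
    """Collect one (sibling_hash, sibling_is_left) pair per level below the root."""
    if not 0 <= index < len(levels[0]):
        raise IndexError("entry index out of range")
    proof = []
    for level in levels[:-1]:
        sibling = index ^ 1
        if sibling >= len(level):  # node was paired with itself
            sibling = index
        proof.append((level[sibling], sibling < index))
        index //= 2
    return proof


def verify(entry: bytes, proof: List[Tuple[bytes, bool]], root: bytes) -> bool:
    """Recompute the root from one entry and its sibling path."""
    node = h(LEAF + entry)
    for sibling, sibling_is_left in proof:
        pair = sibling + node if sibling_is_left else node + sibling
        node = h(NODE + pair)
    return node == root


entries = [f"sensor-7 reading={i}".encode() for i in range(8)]
levels = build_levels(entries)
root = levels[-1][0]              # the value a trusted anchor would hold
proof = inclusion_proof(levels, 0)
print(len(proof), verify(entries[0], proof, root))
```

With eight entries the proof for the first entry holds three sibling hashes, the same shape as the Figure 2 example. A verifier needs only the entry, the proof, and the trusted root, not the rest of the log.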

If this is right

  • IoT edge devices can generate tamper-evident logs with low computational and memory costs suitable for constrained hardware.
  • Post-incident forensics and regulatory compliance become feasible without network dependencies or consensus overhead.
  • Verification remains deterministic and efficient even as log volumes grow, using small O(log n) proofs.
  • The approach avoids blockchain deployment complexity while still providing verifiable integrity guarantees.
  • Adaptive chunking allows the system to adjust batch sizes dynamically to match fluctuating resource availability.
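The last point can be made concrete with a small sketch. The summary does not state the chunk-sizing policy, so everything here is hypothetical: the linear scaling rule, the bounds of 64 and 4096 entries, and the `probe` callable that reports the fraction of free memory.

```python
from typing import Callable, Iterable, Iterator, List


def choose_chunk_size(free_fraction: float,
                      min_size: int = 64,
                      max_size: int = 4096) -> int:
    """Scale the chunk size linearly with free memory (hypothetical policy)."""
    free_fraction = min(1.0, max(0.0, free_fraction))
    return max(min_size, int(max_size * free_fraction))


def adaptive_chunks(entries: Iterable[bytes],
                    probe: Callable[[], float]) -> Iterator[List[bytes]]:
    """Yield batches whose size is re-evaluated before each new batch."""
    chunk: List[bytes] = []
    limit = choose_chunk_size(probe())
    for entry in entries:
        chunk.append(entry)
        if len(chunk) >= limit:
            yield chunk
            chunk = []
            limit = choose_chunk_size(probe())  # resources may have changed
    if chunk:
        yield chunk  # flush the final partial batch


# Simulated memory pressure: plenty of memory at first, then only 10% free.
readings = iter([1.0])
sizes = [len(c) for c in adaptive_chunks(
    (f"log-{i}".encode() for i in range(5000)),
    probe=lambda: next(readings, 0.1),
)]
print(sizes)
```

On a real device the probe would read an actual resource signal; a fixed sequence is used here so the behaviour is reproducible.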

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The method could be tested on other edge-constrained domains such as industrial sensors or mobile logging to broaden applicability.
  • Real deployments would require additional safeguards for the root anchor to prevent single-point failures not covered in the synthetic evaluation.
  • Further work could explore hardware-specific optimizations for chunk sizing based on measured device profiles rather than synthetic assumptions.
  • Integration with standard IoT protocols might lower barriers to adoption in existing systems.

Load-bearing premise

The evaluation assumes synthetic IoT log datasets adequately represent real-world log patterns and resource variability, and that the trusted root anchor remains secure without additional attack vectors or flaws.

What would settle it

Deploy the pipeline on actual production IoT device logs, induce real tampering at varying ratios, and check whether detection stays perfect while throughput, latency, and memory figures hold under hardware variability.
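A harness for such a check might look like the following sketch. It is an illustration, not the paper's evaluation code: the corruption model (appending one byte), the seed, and the function names are assumptions, and a per-entry SHA-256 comparison stands in for full proof verification against the root. Counting with sets of indices means an entry flagged more than once is still counted once.

```python
import hashlib
import random
from typing import Dict, List, Set


def digest(entry: bytes) -> bytes:
    return hashlib.sha256(entry).digest()


def tamper(entries: List[bytes], ratio: float, seed: int = 0) -> Set[int]:
    """Corrupt round(ratio * n) entries in place and return their indices."""
    rng = random.Random(seed)
    victims = set(rng.sample(range(len(entries)), round(ratio * len(entries))))
    for i in victims:
        entries[i] = entries[i] + b"!"  # any change alters the digest
    return victims


def detect(entries: List[bytes], committed: List[bytes]) -> Set[int]:
    """Flag entries whose digest no longer matches the committed one."""
    return {i for i, e in enumerate(entries) if digest(e) != committed[i]}


def scores(truth: Set[int], flagged: Set[int]) -> Dict[str, float]:
    """Precision, recall, and F1 over sets of entry indices."""
    tp = len(truth & flagged)
    precision = tp / len(flagged) if flagged else 1.0
    recall = tp / len(truth) if truth else 1.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


results = {}
for ratio in (0.01, 0.10, 0.50):
    logs = [f"device-3 event={i}".encode() for i in range(1000)]
    committed = [digest(e) for e in logs]   # taken before tampering
    truth = tamper(logs, ratio)
    results[ratio] = scores(truth, detect(logs, committed))
print(results)
```

Perfect scores are expected in this setup because every corruption changes a digest. The open question raised above is whether throughput, latency, and memory hold on real traces and hardware, which this sketch does not measure.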

Figures

Figures reproduced from arXiv: 2605.00065 by Ahmet Hasim Yurttakal, Fahrettin Horasan, Muhammet Anil Yagiz.

Figure 1: Three-stage pipeline architecture. Sensors stream log entries into the adaptive in…
Figure 2: Merkle-tree inclusion proof for entry L0. The verification path (orange) traces from the target leaf h0 to the root R. The proof consists of the three sibling hashes (green): h1, h2-3, and h4-7. A verifier recomputes R′ = H(H(H(h0∥h1)∥h2-3)∥h4-7) and accepts if R′ = R. Proof size is ⌈log2 n⌉ sibling hashes. In our n = 10,000 experiments, interior indices yield 14-hash proofs with an average serialized size…
Figure 3: Ingestion throughput vs. dataset size with…
Figure 4: Controlled stress test with simulated memory pressure. Chunk size decreases…
Figure 5: Batch verification per-entry time remains stable across batch sizes (sequential…
Figure 6: Proof size across selected indices in a 10,000-entry tree (logarithmic scaling with boundary effects).
Figure 7: Ingestion-rate comparison of SHA-256 and BLAKE2b on…
Figure 8: Single-entry verification time comparison of SHA-256 and BLAKE2b on…
Figure 9: Tampering detection time under controlled corruption (verification performed…
Original abstract

Integrity of audit logs produced by Internet of Things (IoT) devices is a prerequisite for post-incident forensics, regulatory compliance, and operational accountability. While blockchain-backed logging infrastructures can satisfy this requirement, they introduce consensus overhead, network dependencies, and deployment complexity that are often prohibitive at the IoT edge. This paper presents a lightweight and evaluated integrity verification pipeline that combines Merkle-tree commitments with resource-aware adaptive chunking to provide tamper evidence without relying on distributed ledger technologies. The proposed pipeline operates in three stages: (i) resource-aware batch ingestion via adaptive chunk sizing, (ii) Merkle-tree construction with O(log n) inclusion proof generation, and (iii) deterministic single-entry verification against a trusted root anchor. We further report an implementation audit that identified and corrected two evaluation defects: a double-counting bug in tampering metrics and a redundant full-tree reconstruction during batch appends. Using the corrected implementation, five-run benchmarks on synthetic IoT log datasets demonstrate throughput exceeding 130,000 logs/s for 100,000 records. The system achieves per-entry verification latency of approximately 22 ms, proof generation latency of 22 ms, an average proof size of 1,006 bytes, and peak memory usage below 5 MB. Tampering detection achieves perfect precision, recall, and F1-score (1.0) across corruption ratios ranging from 1% to 50%.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

1 major / 3 minor

Summary. The manuscript presents a lightweight integrity verification pipeline for IoT audit logs that uses Merkle trees combined with adaptive chunking to achieve tamper evidence without blockchain dependencies. The system consists of resource-aware batch ingestion, Merkle tree construction for O(log n) proofs, and single-entry verification against a trusted root. After correcting two evaluation defects (double-counting in metrics and redundant tree reconstruction), benchmarks on synthetic IoT log datasets report throughput exceeding 130,000 logs/s, ~22 ms latencies for verification and proof generation, 1,006-byte average proofs, <5 MB memory, and perfect (1.0) precision/recall/F1 for tampering detection across 1-50% corruption rates.

Significance. If the performance and detection results generalize beyond synthetic data, the pipeline could offer a practical, low-overhead alternative to blockchain for tamper-evident logging at the IoT edge, supporting post-incident forensics and compliance in constrained environments. The explicit correction of two evaluation bugs and the resource-aware adaptive chunking are positive elements that demonstrate attention to implementation details and novelty relative to static Merkle-tree approaches.

major comments (1)
  1. [Evaluation] Evaluation section (and abstract): All headline metrics (throughput >130,000 logs/s for 100k records, ~22 ms per-entry verification and proof generation, 1,006-byte average proofs, <5 MB peak memory, and F1=1.0 tampering detection for 1-50% corruption) are obtained exclusively from synthetic IoT log datasets. No side-by-side comparison with real device traces, no sensitivity analysis on log arrival patterns/burstiness/size distributions, and no real-device validation are reported. This directly undermines the central claim of suitability for IoT edge environments, because adaptive chunking behavior and detection guarantees depend on the statistical properties of the input logs.
minor comments (3)
  1. [Abstract] Abstract: The five-run benchmarks are mentioned but no per-run values, standard deviations, or error bars are supplied for throughput, latency, or memory figures, making it impossible to judge result stability.
  2. [Evaluation] Evaluation: The synthetic dataset generator parameters (log size distribution, inter-arrival times, corruption model) are not fully specified, preventing independent reproduction or extension of the sensitivity experiments.
  3. [Implementation] Implementation audit paragraph: Pre- and post-correction metric tables or deltas for the double-counting bug and redundant reconstruction are absent, so the impact of the fixes on the final reported numbers cannot be assessed.
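The per-run reporting requested in minor comment 1 could be produced along the lines of the sketch below. The workload, run count default, and function names are hypothetical, and no figure printed by this code comes from the paper.

```python
import hashlib
import statistics
import time
from typing import Callable, Dict, List


def summarize(samples: List[float]) -> Dict[str, float]:
    """Mean and sample standard deviation across repeated runs."""
    return {
        "mean": statistics.mean(samples),
        "stdev": statistics.stdev(samples),  # needs at least two samples
        "min": min(samples),
        "max": max(samples),
    }


def benchmark(fn: Callable[[], object], runs: int = 5) -> List[float]:
    """Time fn once per run and return the wall-clock seconds of each run."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        samples.append(time.perf_counter() - start)
    return samples


def hash_workload(n: int = 10_000) -> None:
    """Stand-in workload: hash n synthetic log lines with SHA-256."""
    for i in range(n):
        hashlib.sha256(f"log-{i}".encode()).digest()


runs = benchmark(hash_workload)
report = summarize(runs)
print([f"{s * 1e3:.2f} ms" for s in runs])
print(f"mean {report['mean'] * 1e3:.2f} ms, stdev {report['stdev'] * 1e3:.2f} ms")
```

Publishing the individual run values alongside the mean and standard deviation would let readers judge the stability of the headline figures.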

Simulated Author's Rebuttal

1 response · 1 unresolved

We thank the referee for the constructive feedback on our manuscript. We address the single major comment below with an honest assessment of the evaluation's scope and propose targeted revisions to improve clarity without misrepresenting the work.

point-by-point responses
  1. Referee: [Evaluation] Evaluation section (and abstract): All headline metrics (throughput >130,000 logs/s for 100k records, ~22 ms per-entry verification and proof generation, 1,006-byte average proofs, <5 MB peak memory, and F1=1.0 tampering detection for 1-50% corruption) are obtained exclusively from synthetic IoT log datasets. No side-by-side comparison with real device traces, no sensitivity analysis on log arrival patterns/burstiness/size distributions, and no real-device validation are reported. This directly undermines the central claim of suitability for IoT edge environments, because adaptive chunking behavior and detection guarantees depend on the statistical properties of the input logs.

    Authors: We agree that the evaluation relies exclusively on synthetic datasets and that this limits direct claims about performance on real IoT edge hardware. The synthetic logs were constructed to incorporate variable record sizes, irregular inter-arrival times, and bursty patterns drawn from published IoT telemetry characterizations, enabling controlled measurement of adaptive chunking decisions and tampering detection across corruption rates. Nevertheless, this approach cannot fully substitute for traces collected from physical devices. In the revised manuscript we will (1) expand the evaluation section with an explicit description of the data-generation parameters and their mapping to observed IoT log statistics, and (2) insert a dedicated Limitations subsection that states the synthetic-data restriction, discusses its implications for generalizability of the adaptive-chunking and verification results, and outlines planned real-device experiments. These additions constitute a partial revision that directly responds to the concern while preserving the integrity of the reported numbers. revision: partial

standing simulated objections not resolved
  • Side-by-side comparison against real device traces or real-device validation, because the current experiments were performed only on synthetic datasets and we do not have access to such traces for the present revision.

Circularity Check

0 steps flagged

No circularity: empirical benchmarks rest on standard Merkle-tree primitives and corrected implementation

full rationale

The paper describes a three-stage pipeline using well-known Merkle-tree construction for O(log n) proofs, resource-aware adaptive chunking for batching, and deterministic verification against a trusted root. All reported metrics (throughput, latencies, proof sizes, memory, and perfect F1 tampering detection) are obtained from direct five-run execution on synthetic datasets after fixing two implementation bugs. No equations, fitted parameters, or predictions are defined in terms of the target results; no self-citations are invoked as load-bearing uniqueness theorems; and the central claims do not reduce to renaming or self-definition. The derivation chain is therefore self-contained against external cryptographic standards and reproducible benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The approach rests on standard cryptographic assumptions for hash functions and does not introduce new free parameters, axioms beyond domain standards, or invented entities.

axioms (1)
  • domain assumption: Cryptographic hash functions are collision-resistant and preimage-resistant.
    Invoked for Merkle tree construction and tamper-evidence guarantees in the pipeline description.


