Effcient logging and querying for Blockchain-based cross-site genomic dataset access audit
Pith reviewed 2026-05-24 20:18 UTC · model grok-4.3
The pith
A blockchain-based log with hierarchical timestamps enables efficient range and AND queries for cross-site genomic dataset access audits.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
By layering a hierarchical timestamp structure onto an immutable blockchain ledger, the system supports efficient logging and querying of genomic dataset access records. The structure enables fast range queries on timestamps and complex AND queries containing multiple predicates while retaining the ledger's security, compatibility, and immutability guarantees. On the test data supplied by the iDASH competition organizers, the authors report at least an order-of-magnitude improvement in range-query speed, faster retrieval for AND queries, and 25 percent lower storage use.
What carries the argument
Hierarchical timestamp structure layered on the blockchain ledger to index the timestamp field for range and compound queries.
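The review does not reproduce the paper's actual layout, so the following is only a minimal sketch of one way a hierarchical timestamp index over an append-only log could work. The class name `HierarchicalTimestampLog`, the bucket widths in `WIDTHS`, and the use of integer second timestamps are all assumptions made for illustration, not details taken from the paper.

```python
from collections import defaultdict

# Hypothetical bucket widths in seconds, coarse to fine (day, hour, minute, second).
# Each width divides the previous one, and the finest width is 1.
WIDTHS = (86400, 3600, 60, 1)


class HierarchicalTimestampLog:
    """Append-only log with one bucket key per level for every record."""

    def __init__(self):
        self.records = []                 # never rewritten, only appended to
        self.buckets = defaultdict(list)  # (width, ts // width) -> record positions

    def append(self, timestamp, payload):
        pos = len(self.records)
        self.records.append((timestamp, payload))
        for width in WIDTHS:
            self.buckets[(width, timestamp // width)].append(pos)
        return pos

    def cover(self, lo, hi):
        """Split [lo, hi) into aligned buckets, preferring the coarsest that fits."""
        keys = []
        t = lo
        while t < hi:
            for width in WIDTHS:
                if t % width == 0 and t + width <= hi:
                    keys.append((width, t // width))
                    t += width
                    break
        return keys

    def range_query(self, lo, hi):
        """Return records with lo <= timestamp < hi, in append order."""
        positions = []
        for key in self.cover(lo, hi):
            positions.extend(self.buckets.get(key, []))
        return [self.records[p] for p in sorted(positions)]
```

In this sketch a range query touches only the buckets that cover the interval instead of scanning every record. The cost is that each record is indexed once per level, so whether the paper's 25 percent storage saving is compatible with a layout like this one cannot be judged from the review alone.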
If this is right
- Range queries on the timestamp field run at least ten times faster than without the structure.
- Complex AND queries that combine multiple predicates retrieve results more quickly.
- Overall storage footprint drops by 25 percent relative to the baseline method.
- The module can be added to existing blockchain platforms without altering their core protocols.
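As a rough illustration of the AND-query point above, one common approach is to keep a posting set per field value and intersect the sets, starting with the smallest. This is a sketch under assumptions: the class `KeyedLog` and the field names used in the usage note (`user`, `resource`, `activity`) are hypothetical, and the review does not say that the paper evaluates AND queries this way.

```python
from collections import defaultdict


class KeyedLog:
    """Append-only log where each record is reachable by every (field, value) pair."""

    def __init__(self):
        self.records = []
        self.postings = defaultdict(set)  # (field, value) -> set of record positions

    def append(self, record):
        pos = len(self.records)
        self.records.append(dict(record))
        for field, value in record.items():
            self.postings[(field, value)].add(pos)
        return pos

    def and_query(self, **predicates):
        """Return records matching every field=value predicate, in append order."""
        sets = [self.postings.get(item, set()) for item in predicates.items()]
        if not sets:
            return list(self.records)
        sets.sort(key=len)            # intersect the most selective predicate first
        hits = set(sets[0])
        for other in sets[1:]:
            hits &= other
            if not hits:
                break                 # no record can satisfy the remaining predicates
        return [self.records[p] for p in sorted(hits)]
```

A call such as `log.and_query(user="u1", activity="read")` then reads only the two posting sets involved rather than the whole log.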
Where Pith is reading between the lines
- The same timestamp hierarchy could be reused for audit logs in other regulated data-sharing settings such as electronic health records.
- Because the structure sits on top of the ledger, it could be ported to different blockchain implementations with only minor adjustments.
- Longer-term tests on production-scale genomic access logs would reveal whether query speed and storage gains hold when the number of sites and records grows beyond competition test sizes.
Load-bearing premise
The hierarchical timestamp structure can be added to an immutable blockchain ledger without losing the ledger's security, immutability, or platform compatibility.
What would settle it
Deploy the structure on an actual blockchain instance and observe that range-query latency shows no improvement over a naive full scan or that total storage exceeds the baseline implementation.
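A check of this kind could be organised roughly as follows. This is a toy harness, not the paper's evaluation: the data is synthetic, `full_scan` stands in for the naive baseline, and `indexed` uses a sorted list with `bisect` as a stand-in for whatever index is under test. On a real deployment both query functions would be replaced by calls against the blockchain instance.

```python
import bisect
import random
import statistics
import time


def time_queries(query, workload, repeats=5):
    """Run the whole workload `repeats` times; return total seconds per repeat."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        for lo, hi in workload:
            query(lo, hi)
        samples.append(time.perf_counter() - start)
    return samples


random.seed(0)
# Synthetic timestamps, sorted so the stand-in index can binary-search them.
timestamps = sorted(random.randrange(1_000_000) for _ in range(5_000))


def full_scan(lo, hi):
    """Naive baseline: examine every record."""
    return [t for t in timestamps if lo <= t < hi]


def indexed(lo, hi):
    """Stand-in index: two binary searches on the sorted timestamps."""
    left = bisect.bisect_left(timestamps, lo)
    right = bisect.bisect_left(timestamps, hi)
    return timestamps[left:right]


workload = [(start, start + 5_000) for start in random.sample(range(990_000), 20)]

# Both strategies must return identical answers before timings mean anything.
assert all(full_scan(lo, hi) == indexed(lo, hi) for lo, hi in workload)

for name, query in (("full scan", full_scan), ("indexed", indexed)):
    samples = time_queries(query, workload)
    print(f"{name}: median {statistics.median(samples):.5f}s, "
          f"stdev {statistics.stdev(samples):.5f}s")
```

Reporting the median together with the spread across repeats also speaks to the referee's request for variance analysis.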
Original abstract
Background: Genomic data have been collected by different institutions and companies and need to be shared for broader use. In a cross-site genomic data sharing system, a secure and transparent access control audit module plays an essential role in ensuring the accountability. The 2018 iDASH competition first track provides us with an opportunity to design efficient logging and querying system for cross-site genomic dataset access audit. We designed a blockchain-based log system which can provide a light-weight and widely compatible module for existing blockchain platforms. The submitted solution won the third place of the competition. In this paper, we report the technical details in our system.
Methods: We present two methods: baseline method and enhanced method. We started with the baseline method and then adjusted our implementation based on the competition evaluation criteria and characteristics of the log system. To overcome obstacles of indexing on the immutable Blockchain system, we designed a hierarchical timestamp structure which supports efficient range queries on the timestamp field.
Results: We implemented our methods in Python3, tested the scalability, and compared the performance using the test data supplied by competition organizer. We successfully boosted the log retrieval speed for complex AND queries that contain multiple predicates. For the range query, we boosted the speed for at least one order of magnitude. The storage usage is reduced by 25%.
Conclusion: We demonstrate that Blockchain can be used to build a time and space efficient log and query genomic dataset audit trail. Therefore, it provides a promising solution for sharing genomic data with accountability requirement across multiple sites.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper claims that a blockchain-based logging and querying system, using baseline and enhanced methods with a novel hierarchical timestamp structure, enables efficient range and AND queries on immutable ledgers for cross-site genomic dataset access audits. Implemented in Python and evaluated on iDASH-supplied test data, it reports ≥10× speedup on range queries, improved AND-query performance, 25% storage reduction, and a third-place competition result, concluding that blockchain provides a promising solution for accountable genomic data sharing.
Significance. If the empirical results hold, the work is significant as a practical demonstration of adapting blockchain for efficient audit logging in a high-stakes domain, preserving immutability while adding query performance via the hierarchical structure. The competition placement and concrete performance numbers on supplied data provide external grounding for the efficiency claims in genomic data accountability.
major comments (2)
- [Results] Results section: the performance claims (order-of-magnitude range-query gains and 25% storage reduction) rest on competition data but lack explicit baseline implementation details, query workload specifications, or error/variance analysis, which are load-bearing for verifying the central efficiency claim.
- [Methods] Methods section: the hierarchical timestamp structure is presented as overcoming immutable-ledger indexing obstacles, but without pseudocode, formal invariants, or analysis of its effect on blockchain security/compatibility properties, the claim that it preserves ledger guarantees while enabling queries cannot be fully assessed.
minor comments (1)
- The abstract and title contain minor typographical issues (e.g., 'Effcient' in the provided title) that should be corrected for clarity.
Simulated Author's Rebuttal
We thank the referee for the constructive feedback. We address the two major comments below and will incorporate clarifications and additions in a revised manuscript to strengthen verifiability of the efficiency claims and the hierarchical timestamp design.
- Referee: [Results] Results section: the performance claims (order-of-magnitude range-query gains and 25% storage reduction) rest on competition data but lack explicit baseline implementation details, query workload specifications, or error/variance analysis, which are load-bearing for verifying the central efficiency claim.
Authors: We agree that additional implementation and workload details are required for independent verification. In the revision we will expand the Results section with: (1) explicit description of the baseline method implementation (Python3 code structure and data structures used), (2) the precise query workload specifications supplied by the iDASH organizers (number and types of range and AND queries), and (3) any available run-time statistics or variance from the competition evaluation runs. The reported ≥10× range-query speedup and 25% storage reduction were measured on the organizer-provided test data; we will make these measurement conditions explicit. revision: yes
- Referee: [Methods] Methods section: the hierarchical timestamp structure is presented as overcoming immutable-ledger indexing obstacles, but without pseudocode, formal invariants, or analysis of its effect on blockchain security/compatibility properties, the claim that it preserves ledger guarantees while enabling queries cannot be fully assessed.
Authors: We accept that the current presentation is insufficient for full assessment. In the revised Methods section we will add: (1) pseudocode for the hierarchical timestamp construction and query traversal, (2) the key invariants maintained by the structure (e.g., monotonicity and completeness with respect to the underlying ledger), and (3) a short argument that the structure is an auxiliary indexing layer that does not modify block contents, consensus rules, or cryptographic commitments, thereby preserving the original blockchain's security and compatibility properties. revision: yes
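The invariants named in the rebuttal are not spelled out, so the sketch below fixes one possible interpretation purely for illustration: "completeness" is taken to mean that every ledger position appears in exactly one bucket, and "monotonicity" that positions inside each bucket are in append order. The hash chain is a simplified stand-in for a real ledger, and the names `chain_digest`, `build_index`, and `check_invariants` are hypothetical.

```python
import hashlib
import json
from collections import defaultdict


def chain_digest(entries):
    """Fold the entries into a simple SHA-256 hash chain and return the final hash."""
    digest = b"\x00" * 32
    for entry in entries:
        payload = json.dumps(entry, sort_keys=True).encode()
        digest = hashlib.sha256(digest + payload).digest()
    return digest.hex()


def build_index(entries, width=60):
    """Auxiliary index: bucket id -> ledger positions. It only reads the entries."""
    index = defaultdict(list)
    for pos, entry in enumerate(entries):
        index[entry["ts"] // width].append(pos)
    return dict(index)


def check_invariants(entries, index):
    """Return (complete, monotone) under the assumed meanings given above."""
    seen = [pos for bucket in index.values() for pos in bucket]
    complete = sorted(seen) == list(range(len(entries)))
    monotone = all(bucket == sorted(bucket) for bucket in index.values())
    return complete, monotone
```

Comparing `chain_digest(entries)` before and after `build_index(entries)` gives a mechanical check of the argument that the index is an auxiliary layer: if building it leaves the digest unchanged, the ledger contents were not modified.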
Circularity Check
No significant circularity detected
The paper describes an engineering system for blockchain-based genomic audit logging, with a baseline method and an enhanced hierarchical-timestamp indexing approach. All load-bearing claims rest on concrete implementation details, Python code, and empirical benchmarks (range-query speedup, storage reduction, AND-query performance) measured against iDASH-supplied test data. No equations, fitted parameters, or predictions are present that reduce by construction to the inputs; standard blockchain immutability is treated as an external property rather than derived internally. The work is therefore self-contained against external benchmarks.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption: Blockchain platforms provide immutability and transparency sufficient for audit logs while allowing auxiliary indexing structures.
invented entities (1)
- hierarchical timestamp structure: no independent evidence