arxiv: 2606.20315 · v1 · pith:EL6ULRTYnew · submitted 2026-06-18 · 🧬 q-bio.GN · cs.CR

bioETH-Beacon: A Confidential On-Chain Genomic Beacon with Encrypted Counts, Filters, and Bounded Noise over a Fully Homomorphic EVM

Pith reviewed 2026-06-26 14:51 UTC · model grok-4.3

classification 🧬 q-bio.GN cs.CR
keywords beacon protocol · fully homomorphic encryption · genomic privacy · smart contracts · encrypted queries · membership inference · variant counts

The pith

bioETH-Beacon runs Beacon variant-count queries over encrypted genomic data on a fully homomorphic EVM without a trusted evaluator.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper presents a smart-contract prototype that lets hospitals upload encrypted marker-count entries and researchers submit encrypted queries for aggregate variant counts. The contract executes the queries on-chain using fully homomorphic operations, then releases the encrypted result only to the requester listed in an on-chain access control list via an off-chain key service. Queries are organized in a 3x4 grid of tiers and families that cover genotype, sex, age, and phenotype data, with bounded noise added on genotype paths to limit probing attacks. Experiments on synthetic panels show expected gas scaling and that pre-aggregation reduces costs when public marker presence is acceptable.
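For orientation, the aggregate that such a contract computes can be written as a plaintext reference model. This is a minimal sketch and not the authors' contract code: on the fhEVM the stored markers, the counts, and the query would all be ciphertexts, and the comparison and addition below would be homomorphic operations. The names `Entry` and `aggregate_count` and the sample data are illustrative assumptions.

```python
from typing import List, Tuple

# (marker_id, count); on-chain both fields would be encrypted values.
Entry = Tuple[int, int]


def aggregate_count(entries: List[Entry], query_marker: int) -> int:
    """Sum the counts of all entries whose marker equals the queried marker.

    The loop is written branch-free on purpose: an encrypted evaluator cannot
    branch on a hidden comparison, so it multiplies by a 0/1 match flag and
    always performs the addition.
    """
    total = 0
    for marker, count in entries:
        match = int(marker == query_marker)  # stands in for an encrypted equality
        total += match * count               # stands in for an encrypted add
    return total


entries = [(101, 3), (202, 5), (101, 4), (303, 1)]
print(aggregate_count(entries, 101))  # 7
```

Every entry is touched regardless of whether it matches, which is why the cost of a full encrypted scan grows with the number of stored entries.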

Core claim

bioETH-Beacon provides a research prototype for confidential Beacon-style genomic querying without a trusted compute evaluator by executing aggregate count queries over encrypted data on a fully homomorphic EVM, with results released only to named requesters through an on-chain ACL.
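The release rule in this claim (one named requester per result) can be illustrated with a toy access check. This is a sketch under stated assumptions: `ResultStore`, `publish`, and `release` are hypothetical names, the addresses are made up, and returning the stored value stands in for the decryption that the off-chain key service would perform.

```python
class ResultStore:
    """Toy model of ACL-gated result release; all names are hypothetical."""

    def __init__(self) -> None:
        self._results = {}  # handle -> opaque encrypted result
        self._acl = {}      # handle -> the one address allowed to obtain it

    def publish(self, handle: str, ciphertext: bytes, requester: str) -> None:
        """Record a result handle together with its named requester."""
        self._results[handle] = ciphertext
        self._acl[handle] = requester

    def release(self, handle: str, caller: str) -> bytes:
        """Hand the result over only if the caller is the named requester."""
        if self._acl.get(handle) != caller:
            raise PermissionError(f"{caller} is not authorized for {handle}")
        return self._results[handle]


store = ResultStore()
store.publish("query-1", b"opaque-ciphertext", requester="0xResearcher")
print(store.release("query-1", "0xResearcher"))  # b'opaque-ciphertext'
```

In this model the check itself is the trusted step, which is why the referee report below asks for the key service to be covered by the trust analysis.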

What carries the argument

The fhEVM smart contract that performs encrypted count, filter, and bounded-noise operations on genomic marker entries, structured as a 3x4 tier-by-query-family grid.

If this is right

  • Queries remain hidden from hosts because both inputs and the computation occur in encrypted form.
  • Bounded on-chain noise can be injected for genotype queries to reduce the effectiveness of repeated rare-variant probes.
  • Pre-aggregation lowers gas when the presence of a marker can be treated as public information.
  • Different tiers let users choose stronger confidentiality at higher cost or lower cost at reduced confidentiality.
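The pre-aggregation trade-off in the third bullet can be made concrete with a small operation count. This summary does not describe how the paper implements pre-aggregation, so the following is only one plausible reading: counts are merged per marker ahead of time, with marker identifiers visible as dictionary keys (that is, marker presence treated as public). The function names and data are assumptions for illustration, and the `ops` counters are a stand-in for gas, not a gas model.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Entry = Tuple[int, int]  # (marker_id, count)


def scan(entries: List[Entry], query_marker: int) -> Tuple[int, int]:
    """Full scan with a hidden marker: one compare-and-add per stored entry."""
    total, ops = 0, 0
    for marker, count in entries:
        total += count if marker == query_marker else 0
        ops += 1
    return total, ops


def preaggregate(entries: List[Entry]) -> Dict[int, int]:
    """Merge counts per marker once, at upload time; marker ids become visible."""
    totals: Dict[int, int] = defaultdict(int)
    for marker, count in entries:
        totals[marker] += count
    return dict(totals)


def lookup(totals: Dict[int, int], query_marker: int) -> Tuple[int, int]:
    """Query against pre-aggregated totals: a single keyed read."""
    return totals.get(query_marker, 0), 1


entries = [(101, 3), (202, 5), (101, 4), (303, 1), (101, 2)]
print(scan(entries, 101))                  # (9, 5)
print(lookup(preaggregate(entries), 101))  # (9, 1)
```

Both paths return the same total; the difference is that the scan hides which marker was asked about at a cost proportional to the number of entries, while the lookup is cheap because the marker key is exposed.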

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • If the gas and security assumptions hold, Beacon networks could expand to more institutions without exposing query patterns to any single party.
  • The tiered structure implies that real deployments would need usage data to pick the right confidentiality-cost balance for different query families.
  • The approach could be extended to other aggregate genomic statistics beyond simple counts if the fhEVM primitives support the required operations.

Load-bearing premise

A fully homomorphic EVM can perform the required encrypted count, filter, and noise operations at practical gas cost while preventing the membership-inference attacks the design targets.

What would settle it

A measurement showing either that membership-inference attacks succeed on the noisy outputs for realistic cohort sizes or that the gas cost of a single query exceeds feasible limits for typical Beacon deployments.

Figures

Figures reproduced from arXiv: 2606.20315 by Christos Galanopoulos, Ilias Georgakopoulos-Soares, Kimon Antonios Provatas.

Figure 1. Graphical Abstract. bioETH-Beacon executes a Beacon-style aggregate-count primitive entirely in the encrypted domain on a fully homomorphic EVM. Hospitals upload encrypted marker-count entries; an authorized researcher submits a single encrypted marker query; the contract returns an encrypted aggregate without exposing raw genomic data, contributor counts, or the queried marker. Optional bounded noise on g…
Figure 2. System architecture. Approved hospitals encrypt marker–count pairs off-chain using fhevmjs and submit them with an input proof. The public BeaconRegistry gates uploads and queries; the encrypted ConfidentialBeacon family stores and scans encrypted state and emits a requester-private result handle decrypted via the off-chain KMS.
… count is perturbed by an integer drawn uniformly from {0, …, 𝐵 − 1}, wh…
Figure 3. Dataset and query lifecycles. Datasets become immutable at finalization and can only be replaced by registering a new shell. Queries are monotone: processQueryChunk only advances nextEntryIndex, so permissionless relayers cannot stall a query.
… layer role is implicit (e.g., a consortium-funded service that watches QueryCreated events and pays gas to drive processQueryChunk calls to completion on the reque…
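The monotone query lifecycle described in the Figure 3 caption can be sketched as a small state machine. This is a plaintext illustration only, not the contract: `processQueryChunk` and `nextEntryIndex` come from the caption, while the `Query` dataclass, its other fields, and the sample data are assumptions.

```python
from dataclasses import dataclass
from typing import List, Tuple

Entry = Tuple[int, int]  # (marker_id, count)


@dataclass
class Query:
    marker: int
    next_entry_index: int = 0  # mirrors nextEntryIndex in the caption
    accumulator: int = 0
    done: bool = False


def process_query_chunk(query: Query, entries: List[Entry], chunk_size: int) -> None:
    """Advance the scan by at most chunk_size entries.

    The index only ever moves forward, so any caller (for example a relayer)
    can make progress, and no caller can rewind or stall the query.
    """
    if query.done:
        return
    end = min(query.next_entry_index + chunk_size, len(entries))
    for marker, count in entries[query.next_entry_index:end]:
        query.accumulator += count if marker == query.marker else 0
    query.next_entry_index = end
    query.done = end == len(entries)


entries = [(101, 3), (202, 5), (101, 4), (303, 1), (101, 2)]
q = Query(marker=101)
calls = 0
while not q.done:
    process_query_chunk(q, entries, chunk_size=2)
    calls += 1
print(q.accumulator, calls)  # 9 3
```

Calling `process_query_chunk` again after completion is a no-op, which is one simple way to keep repeated or competing relayer calls harmless.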
Figure 4. The 3×4 tier-by-query-family grid. Twelve contracts cover the cross-product of tiers and query families; a thirteenth (G1) collapses the four families into a single encrypted four-way conjunctive query.
Figure 5. Per-entry FHE scan flow. HCU figures refer to the euint32 eq + euint64 add reference path (T3). The accumulator add dominates sequential depth and sets the chunk-size ceiling.
… the accumulator. At 133,000 sequential-depth HCU per step and a 5,000,000 per-transaction budget, the theoretical ceiling is ⌊5 × 10⁶ / 133,000⌋ = 37 entries per chunk. We tested this empirically with Hardhat; chunk sizes 5, 10, 20, 2…
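The chunk-size ceiling quoted in the Figure 5 text follows from integer division of the per-transaction budget by the per-step cost. The sketch below reproduces that arithmetic using the two figures given there; the 1,000-entry dataset size and the helper name `transactions_needed` are illustrative assumptions, and the calculation ignores any other per-transaction costs.

```python
import math

HCU_PER_STEP = 133_000         # sequential-depth HCU per scanned entry (from the text)
HCU_BUDGET_PER_TX = 5_000_000  # per-transaction HCU budget (from the text)

# Largest whole number of entries whose sequential cost fits in one transaction.
max_entries_per_chunk = HCU_BUDGET_PER_TX // HCU_PER_STEP


def transactions_needed(n_entries: int, chunk_size: int = max_entries_per_chunk) -> int:
    """Number of chunked transactions required to scan n_entries."""
    return math.ceil(n_entries / chunk_size)


print(max_entries_per_chunk)      # 37
print(transactions_needed(1000))  # 28
```

As a check, 37 × 133,000 = 4,921,000 fits within the budget, while 38 × 133,000 = 5,054,000 does not.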
Figure 6. Bounded on-chain noise injection. Left: Overhead gas decreases as a fraction of total query gas as 𝑁 grows. Right: The noise sample 𝜈 is generated inside the FHE coprocessor and never exists in plaintext; the coordinator triggers the injection but cannot observe or substitute the sampled value before the block mines.
6.2 Trust Model
The coordinator triggers injectQueryNoise(queryId) with no noise paramet…
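The bounded perturbation mentioned in the figure text (an integer drawn uniformly from {0, …, 𝐵 − 1} and added to the count) can be simulated in plaintext to show what a requester can and cannot infer from one answer. This is a sketch under stated assumptions: in the paper the sample is generated inside the FHE coprocessor and never appears in plaintext, whereas here Python's `secrets.randbelow` stands in for that sampler, and both function names are hypothetical.

```python
import secrets


def noisy_count(true_count: int, bound: int) -> int:
    """Add noise drawn uniformly from {0, ..., bound - 1} to the true count."""
    if bound < 1:
        raise ValueError("bound must be at least 1")
    return true_count + secrets.randbelow(bound)


def feasible_true_counts(observed: int, bound: int) -> range:
    """All true counts consistent with a single observed noisy answer."""
    return range(max(0, observed - bound + 1), observed + 1)


observed = noisy_count(true_count=7, bound=4)
print(observed, list(feasible_true_counts(observed, 4)))
```

With a single answer the requester learns only that the true count lies in a window of at most 𝐵 values. Because the noise is bounded and non-negative, the window is hard-edged rather than probabilistic, which is a different guarantee from unbounded differential-privacy noise.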
Figure 7. Threat model (seven on-chain adversaries).
Figure 8. Tier-level gas tradeoffs and scaling behav…
Figure 9. Scaling behavior and lifecycle crossover
Original abstract

The Global Alliance for Genomics and Health (GA4GH) Beacon protocol lets researchers ask whether a genomic variant has been observed in a participating cohort and receive aggregate variant-level counts. As Beacon networks grow, two privacy risks remain: host institutions can see plaintext queries, and repeated rare-variant queries can support membership-inference attacks. We present bioETH-Beacon, a smart-contract prototype that runs the Beacon "aggregate count" query over encrypted data on a fully homomorphic Ethereum Virtual Machine (fhEVM). Hospitals upload encrypted marker-count entries, authorized researchers submit encrypted marker queries, and the contract returns an encrypted answer that is released, via an off-chain key-management service, only to the requester named in the contract's on-chain ACL. The design is organized as a 3x4 tier-by-query-family grid spanning genotype, sex, age, and phenotype queries, with tiers that trade stronger confidentiality for lower query cost. For genotype paths, the prototype can add bounded on-chain noise to mitigate probing attacks. Experiments on synthetic panels derived from a Polygenic Score (PGS) catalog show the expected scaling behavior and demonstrate that pre-aggregation can substantially reduce query gas when public marker presence is an acceptable trade-off. Overall, bioETH-Beacon provides a research prototype for confidential Beacon-style genomic querying without a trusted compute evaluator.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 0 minor

Summary. The paper presents bioETH-Beacon, a smart-contract prototype on a fully homomorphic EVM (fhEVM) that supports confidential GA4GH Beacon-style aggregate count queries over encrypted genomic marker data. Hospitals upload encrypted counts; authorized researchers submit encrypted queries organized in a 3x4 tier-by-query-family grid (genotype/sex/age/phenotype); the contract computes an encrypted result that is released only to the ACL-specified requester via an off-chain key-management service. For genotype paths the design optionally adds bounded on-chain noise; synthetic experiments on PGS-derived panels are used to illustrate scaling behavior and gas savings from pre-aggregation when public marker presence is acceptable. The central contribution is framed as a research prototype that removes the need for a trusted compute evaluator.

Significance. If the unverified performance and privacy properties hold, the work would offer a concrete on-chain architecture for privacy-preserving Beacon queries that directly addresses host-visible plaintext queries and membership-inference risks without introducing a trusted evaluator for computation. The tiered design and pre-aggregation trade-off provide a practical starting point for balancing confidentiality and cost in genomic data sharing; the synthetic validation demonstrates feasibility of the scaling claims even if full security analysis is absent.

major comments (2)
  1. [Abstract] The claim that bioETH-Beacon 'provides a research prototype for confidential Beacon-style genomic querying without a trusted compute evaluator' is load-bearing yet unsupported by any gas-cost measurements, security proofs, or empirical evaluation showing that the bounded noise raises the bar against membership-inference attacks; the manuscript supplies only an architecture description and synthetic scaling results.
  2. [Abstract] The off-chain key-management service that performs ACL-gated decryption and release lies outside the 'no trusted compute evaluator' guarantee; because result release is part of the end-to-end query path, this component must be analyzed for its impact on the overall trust model.

Simulated Author's Rebuttal

2 responses · 0 unresolved

Thank you for the constructive feedback on the abstract. We address the two major comments below, agreeing to revisions that better scope the claims and clarify the trust model.

Point-by-point responses
  1. Referee: [Abstract] The claim that bioETH-Beacon 'provides a research prototype for confidential Beacon-style genomic querying without a trusted compute evaluator' is load-bearing yet unsupported by any gas-cost measurements, security proofs, or empirical evaluation showing that the bounded noise raises the bar against membership-inference attacks; the manuscript supplies only an architecture description and synthetic scaling results.

    Authors: The full manuscript includes synthetic scaling results that encompass gas-cost measurements demonstrating pre-aggregation benefits. We concur that formal security proofs and empirical membership-inference evaluations for the bounded noise are absent, as the work focuses on architectural design and feasibility. We will revise the abstract to precisely state the contributions as a prototype with design, ACL-based release, and synthetic scaling experiments, without overstating security guarantees. revision: yes

  2. Referee: [Abstract] The off-chain key-management service that performs ACL-gated decryption and release lies outside the 'no trusted compute evaluator' guarantee; because result release is part of the end-to-end query path, this component must be analyzed for its impact on the overall trust model.

    Authors: The design intentionally separates on-chain homomorphic computation (no trusted evaluator) from off-chain result release via the key-management service for ACL enforcement. We agree that the end-to-end trust model requires explicit discussion. We will expand the manuscript to analyze the trust assumptions for the key-management service and its role in the query path. revision: yes

Circularity Check

0 steps flagged

No circularity in system-design prototype

Full rationale

The manuscript describes an architectural prototype and experimental scaling results on synthetic panels; it contains no equations, fitted parameters, or derivation chain that could reduce to its own inputs by construction. All load-bearing claims rest on unquantified assumptions about fhEVM performance and noise efficacy rather than on any self-referential definitions or self-citation loops. The contribution is therefore self-contained as a design artifact.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract-only review; no explicit free parameters, axioms, or invented entities are stated beyond the high-level system components.

pith-pipeline@v0.9.1-grok · 5795 in / 1156 out tokens · 25867 ms · 2026-06-26T14:51:09.215330+00:00 · methodology

