Exploring the Effects of Entanglement on Quantum Machine Learning of Pathogen Epitope-Receptor Binding
Pith reviewed 2026-06-30 10:10 UTC · model grok-4.3
The pith
High-entanglement ZZ feature map reduces training overfit in hybrid QNN for epitope binding classification while keeping competitive test accuracy.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Among the four feature-map configurations tested in the hybrid Embedding-QNN workflow, the high-entanglement all-to-all ZZ feature map yields the lowest training AUAC together with the highest test/training AUAC ratio while maintaining test-set accuracy competitive with both the classical CNN benchmark and the other quantum maps; the paper interprets this pattern as evidence that entanglement topology influences overfitting on this N=80 epitope-receptor binding task.
What carries the argument
The ZZ feature map with all-to-all two-qubit entangling gates placed in the embedding stage before the variational quantum neural network layers.
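To make "entanglement topology" concrete, the sketch below enumerates which qubit pairs receive a two-qubit gate under an all-to-all pattern versus a nearest-neighbour chain, and reports the average node degree of each coupling graph. The qubit count `n` and the helper names are assumptions chosen for illustration; the sketch does not reproduce the paper's circuits.

```python
from itertools import combinations

def full_pairs(n):
    """All-to-all pattern: every unordered qubit pair (i, j) with i < j."""
    return list(combinations(range(n), 2))

def linear_pairs(n):
    """Nearest-neighbour chain: only adjacent qubits (i, i + 1)."""
    return [(i, i + 1) for i in range(n - 1)]

def avg_degree(n, pairs):
    """Average node degree of the coupling graph (each pair touches two qubits)."""
    return 2 * len(pairs) / n

n = 4  # hypothetical qubit count, chosen only for illustration
for name, pairs in (("all-to-all", full_pairs(n)), ("nearest-neighbour", linear_pairs(n))):
    print(name, len(pairs), "pairs, average degree", avg_degree(n, pairs))
```

The all-to-all pair count grows as n(n-1)/2 while the chain grows as n-1, which is one reason the two patterns can behave so differently as design variables.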
If this is right
- Entanglement topology in the feature map functions as an independent design variable that can be adjusted to lower training-set overfit on sparse biological classification problems.
- The ZZ configuration preserves test accuracy while lowering the training AUAC, implying improved generalization relative to the low-entanglement and non-entangling maps on this task.
- The same pattern of results would be expected to appear in other small-scale epitope or receptor-binding datasets if the entanglement effect is robust.
- Further evaluation with noise models or actual hardware runs is required before claiming practical advantage on NISQ devices.
Where Pith is reading between the lines
- If the advantage persists on larger epitope libraries, entanglement topology could become a standard hyperparameter in quantum screening pipelines for vaccine design.
- Testing the same maps on datasets with different sequence lengths or binding thresholds would reveal whether the effect is specific to 9-mers or generalizes across molecular representations.
- Pairing the ZZ map with classical post-processing layers might amplify or cancel the observed generalization benefit.
- The result leaves open whether an intermediate entanglement density between the tested low and high patterns would produce an even better ratio.
Load-bearing premise
The observed differences in AUAC ratios are caused by the entanglement topology of each feature map rather than by uncontrolled factors such as random seed, optimizer choice, or the specific 40:30:30 split on the N=80 dataset.
What would settle it
Re-running the identical workflow across several independent random seeds and at least two different train-validation-test partitions and finding that the ZZ map no longer produces the highest test/training AUAC ratio would falsify the claim that entanglement topology drives the reduced overfit.
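A minimal sketch of how such a re-run could be organized: generate several stratified 40:30:30 partitions from different split seeds and cross them with independent initialization seeds. The balanced label vector, the seed ranges, and the function name `stratified_split` are assumptions; the class balance of the actual dataset is not stated here, and the training call itself is left out.

```python
import random

def stratified_split(labels, seed, fractions=(0.4, 0.3, 0.3)):
    """Split sample indices into train/val/test, preserving class proportions."""
    rng = random.Random(seed)
    train, val, test = [], [], []
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        n_train = round(fractions[0] * len(idx))
        n_val = round(fractions[1] * len(idx))
        train += idx[:n_train]
        val += idx[n_train:n_train + n_val]
        test += idx[n_train + n_val:]  # remainder goes to the test subset
    return train, val, test

# Hypothetical balanced labels for N=80 (the real class balance is an assumption).
labels = ["Strong"] * 40 + ["Weak"] * 40

# Two partitions crossed with five initialization seeds gives ten runs per feature map.
runs = [
    (split_seed, init_seed, stratified_split(labels, split_seed))
    for split_seed in range(2)
    for init_seed in range(5)
]
print(len(runs), "runs; subset sizes:", [len(part) for part in runs[0][2]])
```

Keeping the split seed separate from the initialization seed lets partition effects be told apart from optimizer and initialization effects.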
Original abstract
Parameterized quantum circuits (PQCs) provide a flexible substrate for hybrid quantum machine learning (QML), but their practical value on Noisy Intermediate-Scale Quantum (NISQ) devices remains an empirical question, especially because training depth and scale can introduce optimization challenges such as barren plateaus. Here we study how the number and topology of two-qubit entangling gates in the feature-map stage influence a fixed hybrid QNN workflow for classifying strong versus weak epitope-receptor binding in Porcine Reproductive and Respiratory Syndrome (PRRS) vaccine design. The dataset consists of docking-derived binding affinities for N=80 9-mer epitopes, labeled as Strong or Weak binding, and partitioned into training, validation, and test subsets using a 40:30:30 split. We compare a classical CNN benchmark with a hybrid Embedding-QNN architecture under four feature-map configurations: a non-entangling Z feature map, an all-to-all high-entanglement ZZ feature map, and two interleaved nearest-neighbour entanglement patterns of low and high depth. Among the configurations tested, the high-entanglement ZZ feature map is seen to provide the strongest evidence of reduced training-set overfit, with a lower training area under the accuracy curve (AUAC) and the highest test/training AUAC ratio, while preserving competitive test-set accuracy. These results do not establish a general QML advantage, but they suggest that feature-map entanglement topology is a meaningful design variable for sparse biological screening tasks and warrants further evaluation with additional metrics, larger datasets, and noise-aware or hardware-based experiments.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript empirically compares four feature-map variants (non-entangling Z, high-entanglement all-to-all ZZ, and two interleaved nearest-neighbour patterns) inside a fixed hybrid Embedding-QNN workflow for binary classification of strong vs. weak epitope-receptor binding on an N=80 docking-derived dataset of 9-mer epitopes, using a single 40:30:30 train/val/test split. It reports that the high-entanglement ZZ map yields the lowest training AUAC, the highest test/training AUAC ratio, and competitive test accuracy, suggesting that entanglement topology can mitigate overfitting in this sparse biological screening task without establishing a general QML advantage.
Significance. If the observed AUAC differences can be shown to arise specifically from entanglement topology rather than uncontrolled stochasticity, the result would usefully highlight feature-map design as a controllable variable for NISQ-era QML on small biological datasets. The work supplies concrete, reproducible metrics on an explicit workflow and four variants but does not claim broad superiority over classical methods.
Major comments (2)
- [Abstract] Abstract and central empirical claim: the attribution of reduced training AUAC and elevated test/training AUAC ratio specifically to the high-entanglement ZZ topology rests on a single 40:30:30 split of N=80 samples with no reported multiple random seeds, fixed-seed sweeps, or k-fold cross-validation. On this scale, PQC training stochasticity (initialization, optimizer path, barren-plateau effects) could produce the observed ordering without any topological cause.
- [Abstract] Abstract: no error bars, standard deviations, or statistical significance tests accompany the reported AUAC values or ratios, so it is impossible to assess whether the differences between the four feature maps exceed the variability expected from the small dataset and single partition.
Minor comments (2)
- [Abstract] The abstract states that results 'do not establish a general QML advantage' yet the title and framing emphasize entanglement effects; a brief clarification of scope in the introduction would help readers.
- Notation for AUAC (area under the accuracy curve) is introduced without an explicit definition or reference to how the curve is constructed from the validation or test predictions.
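Since the AUAC construction is not defined in the material above, the sketch below shows one plausible reading: the trapezoidal area under an accuracy-versus-epoch curve, normalized by the number of epoch intervals so that a constant accuracy of 1.0 gives an AUAC of 1.0. This definition, the function names, and the example curves are all assumptions, not the paper's stated method.

```python
def auac(accuracies):
    """Normalized trapezoidal area under an accuracy-vs-epoch curve (assumed definition)."""
    if len(accuracies) < 2:
        raise ValueError("need at least two epochs to form an area")
    area = sum((a + b) / 2.0 for a, b in zip(accuracies, accuracies[1:]))
    return area / (len(accuracies) - 1)

def auac_ratio(test_curve, train_curve):
    """Test/training AUAC ratio; values near 1 suggest less train-test divergence."""
    return auac(test_curve) / auac(train_curve)

# Hypothetical per-epoch accuracies, for illustration only (not data from the paper).
train_curve = [0.5, 0.7, 0.9, 1.0]
test_curve = [0.5, 0.6, 0.7, 0.7]
print(round(auac(train_curve), 4), round(auac(test_curve), 4),
      round(auac_ratio(test_curve, train_curve), 4))
```

Under this reading, a lower training AUAC with an unchanged test AUAC raises the ratio, which matches the pattern the summary describes for the ZZ map.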
Simulated Author's Rebuttal
We thank the referee for their detailed and constructive feedback on our manuscript. We agree with the concerns regarding the statistical robustness of our empirical results and will revise the manuscript to address these issues by incorporating multiple runs and statistical measures.
Point-by-point responses
- Referee: [Abstract] Abstract and central empirical claim: the attribution of reduced training AUAC and elevated test/training AUAC ratio specifically to the high-entanglement ZZ topology rests on a single 40:30:30 split of N=80 samples with no reported multiple random seeds, fixed-seed sweeps, or k-fold cross-validation. On this scale, PQC training stochasticity (initialization, optimizer path, barren-plateau effects) could produce the observed ordering without any topological cause.
  Authors: We fully acknowledge this limitation. Our current study used a single data split, and the observed differences could indeed be influenced by training stochasticity. In the revised version, we will run each feature map with multiple random seeds (at least 5, and up to 10) and report mean AUAC values with standard deviations. This will allow us to better attribute any consistent differences to the entanglement topology rather than to random variation. We will also explore the feasibility of k-fold cross-validation given the computational constraints.
  Revision: yes
- Referee: [Abstract] Abstract: no error bars, standard deviations, or statistical significance tests accompany the reported AUAC values or ratios, so it is impossible to assess whether the differences between the four feature maps exceed the variability expected from the small dataset and single partition.
  Authors: We agree that the absence of error bars and statistical tests makes it difficult to evaluate the significance of the results. We will update the manuscript to include error bars based on multiple runs and perform statistical significance tests (such as t-tests) between the different feature maps to determine if the observed differences are statistically meaningful.
  Revision: yes
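The reporting promised in these responses could be sketched as follows: a per-map mean and sample standard deviation across seeds, plus a Welch t statistic with Welch-Satterthwaite degrees of freedom for a pairwise comparison. The choice of Welch's unequal-variance form, the function names, and the numbers are assumptions; the values are placeholders and are not results from the paper.

```python
from math import sqrt
from statistics import mean, stdev, variance

def summarize(values):
    """Return (mean, sample standard deviation) over per-seed values."""
    return mean(values), stdev(values)

def welch_t(a, b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    va, vb = variance(a) / len(a), variance(b) / len(b)
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    dof = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, dof

# Placeholder per-seed test/training AUAC ratios for two hypothetical feature maps.
# Illustrative only: these are NOT values reported in the paper.
map_a = [0.70, 0.74, 0.68, 0.72, 0.71]
map_b = [0.85, 0.80, 0.88, 0.83, 0.84]

for name, vals in (("map_a", map_a), ("map_b", map_b)):
    m, s = summarize(vals)
    print(f"{name}: {m:.3f} +/- {s:.3f} (n={len(vals)})")

t_stat, dof = welch_t(map_b, map_a)
print(f"Welch t = {t_stat:.2f}, dof = {dof:.1f}")
```

Turning the statistic into a p-value needs the Student t distribution, for example via `scipy.stats.ttest_ind(a, b, equal_var=False)`. With four feature maps there are six pairwise comparisons, so a multiple-comparison correction would likely be needed as well.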
Circularity Check
No significant circularity: empirical measurements only
Full rationale
The paper reports direct empirical AUAC values computed on a fixed 40:30:30 split of an N=80 dataset for four explicitly defined feature-map circuits. No derived quantity is obtained by fitting a parameter to one subset and then relabeling a closely related quantity as a prediction, nor is any central result obtained by self-citation to an unverified uniqueness theorem or ansatz. The observed training/test AUAC ratios are computed quantities from the same evaluation procedure applied to each circuit; they do not reduce to the input definitions by construction. The derivation chain is therefore self-contained against external benchmarks.