Quantum Kernel Advantage over Classical Collapse in Medical Foundation Model Embeddings

Aldo Marzullo; Ariel Guerra-Adames; Chi-Yu Chen; Dax Enshan Koh; Felipe Ocampo Osorio; J. Alejandro Andrade; Leo Anthony Celi; Noah Dane Hebdon; Rafi Al Attrach; Rahul Gorijavolu

arxiv: 2604.24597 · v1 · submitted 2026-04-27 · 🪐 quant-ph · cs.AI

Sebastian Cajas Ordóñez, Felipe Ocampo Osorio, Dax Enshan Koh, Rafi Al Attrach, Aldo Marzullo, Ariel Guerra-Adames, J. Alejandro Andrade, Siong Thye Goh, Chi-Yu Chen, Rahul Gorijavolu, Xue Yang, Noah Dane Hebdon, Leo Anthony Celi

Pith reviewed 2026-05-08 03:52 UTC · model grok-4.3

keywords: quantum kernel · QSVM · medical imaging · class imbalance · MIMIC-CXR · foundation model embeddings · effective rank · kernel collapse

The pith

Quantum support vector machines avoid classical majority-class collapse on imbalanced chest X-ray insurance classification.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper tests quantum kernels against classical kernels in a binary insurance prediction task on MIMIC-CXR chest radiographs, using frozen embeddings from three medical foundation models reduced by PCA to q dimensions. It runs a two-tier comparison with identical inputs to both sides: untuned QSVM versus untuned linear SVM, then untuned QSVM versus C-tuned RBF SVM. In all 18 Tier-1 setups the quantum kernel wins minority-class F1, often by large margins, while the linear kernel collapses to majority-class output on 90-100 percent of random seeds. The quantum kernel reaches an effective rank of 69.80 at q=11, far above the classical value, and the advantage persists against the tuned classical baseline in seven Tier-2 tests. A qubit sweep shows the onset of concentration depends on the embedding architecture.

Core claim

In binary insurance classification on MIMIC-CXR using PCA-q features from MedSigLIP-448, RAD-DINO, and ViT-patch32, the quantum kernel in QSVM produces a higher effective rank than the linear kernel and thereby prevents the collapse to majority-class prediction that occurs with the classical linear SVM regardless of regularization parameter C. Across every tested qubit count and embedding source, untuned QSVM records higher minority-class F1 than untuned linear SVM, with a mean gain of 0.293 at q=11 on MedSigLIP-448, and still outperforms a tuned RBF SVM in all seven Tier-2 comparisons.

What carries the argument

The quantum kernel Gram matrix obtained by applying a feature map to the PCA-reduced q-dimensional embeddings from the foundation models. Its eigenspectrum yields an effective rank of up to 69.80 at q=11, while the linear kernel's rank stays low (a linear-kernel Gram matrix on q features has rank at most q) and the classical collapse does not change with C.
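The rank argument can be made concrete with a small sketch. This page does not say which effective-rank definition the paper uses, so the participation ratio below, (sum of eigenvalues)^2 / (sum of squared eigenvalues), is an assumed stand-in. The function names and toy data are hypothetical. For a symmetric Gram matrix the ratio equals trace(K)^2 divided by the squared Frobenius norm, so no eigensolver is needed.

```python
# Sketch only: the effective-rank definition used in the paper is not stated
# on this page. The participation ratio is an assumed stand-in.

def participation_ratio(gram):
    """Return (sum of eigenvalues)^2 / (sum of squared eigenvalues).

    For a symmetric matrix this equals trace(K)^2 / ||K||_F^2.
    """
    n = len(gram)
    trace = sum(gram[i][i] for i in range(n))
    frob_sq = sum(gram[i][j] ** 2 for i in range(n) for j in range(n))
    return trace * trace / frob_sq


def linear_gram(rows):
    """Gram matrix of the linear kernel k(x, y) = <x, y>."""
    return [[sum(a * b for a, b in zip(x, y)) for y in rows] for x in rows]


# Hypothetical toy data: 6 samples with q = 2 features each.
features = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0],
            [0.1, 0.9], [0.7, 0.7], [0.5, 0.4]]
identity = [[1.0 if i == j else 0.0 for j in range(6)] for i in range(6)]

pr_linear = participation_ratio(linear_gram(features))
pr_identity = participation_ratio(identity)
print(f"linear kernel, q=2 features: {pr_linear:.3f}")
print(f"identity Gram matrix, n=6:   {pr_identity:.3f}")
```

The linear-kernel value cannot exceed q however many samples are added, whereas a Gram matrix that separates every sample (the identity case) reaches the sample count. That ceiling is the gap the effective-rank comparison points at.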

If this is right

The classical linear kernel collapses to majority-class prediction on 90-100 percent of seeds at every qubit count and remains C-invariant.
QSVM maintains non-trivial recall and wins minority-class F1 in all 18 Tier-1 configurations (17 at p < 0.001, 1 at p < 0.01).
At q=11 with MedSigLIP-448, mean QSVM F1 reaches 0.343 versus 0.050 for the linear kernel.
Under Tier 2, untuned QSVM still wins all seven tested configurations against C-tuned RBF SVM with mean gain 0.068.
A full qubit sweep shows architecture-dependent concentration onset across the three embedding models.
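The two quantities in the list above, minority-class F1 and the fraction of seeds that collapse to the majority class, can be sketched as follows. The label encoding (1 = minority, 0 = majority), the function names, and the toy predictions are assumptions for illustration and are not taken from the paper's code.

```python
def minority_f1(y_true, y_pred, minority=1):
    """F1 score computed for the minority class only."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == minority and p == minority)
    fp = sum(1 for t, p in pairs if t != minority and p == minority)
    fn = sum(1 for t, p in pairs if t == minority and p != minority)
    if tp == 0:
        return 0.0  # no minority sample recovered, so F1 is zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def is_collapsed(y_pred, majority=0):
    """True when every prediction is the majority class."""
    return all(p == majority for p in y_pred)


def collapse_rate(preds_by_seed, majority=0):
    """Fraction of seeds whose predictions are all majority class."""
    collapsed = sum(1 for preds in preds_by_seed if is_collapsed(preds, majority))
    return collapsed / len(preds_by_seed)


# Hypothetical imbalanced test labels: 7 majority, 3 minority.
y_true = [0, 0, 0, 0, 0, 0, 0, 1, 1, 1]
collapsed_pred = [0] * 10
useful_pred = [0, 0, 0, 0, 0, 0, 1, 1, 1, 0]

f1_collapsed = minority_f1(y_true, collapsed_pred)
f1_useful = minority_f1(y_true, useful_pred)
rate = collapse_rate([collapsed_pred, collapsed_pred, useful_pred])
print(f1_collapsed, f1_useful, rate)
```

A collapsed classifier still scores 70 percent accuracy on this toy split, which is why minority-class F1 rather than accuracy is the informative metric under imbalance.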

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

If the rank advantage survives on hardware, quantum kernels could reduce reliance on extensive hyperparameter search for class-imbalanced medical tasks.
The same mechanism might apply to other high-dimensional outputs from foundation models beyond radiographs.
Testing whether the effective-rank gap closes under realistic noise would directly test whether the observed separation is hardware-limited or fundamental to the kernel construction.
Extending the comparison to multi-class or regression versions of the same embeddings could show whether the collapse-avoidance property generalizes.

Load-bearing premise

That noiseless simulation of the quantum kernel after PCA reduction gives a representative test of advantage that would hold on real hardware for this medical task and dataset.

What would settle it

Running the identical QSVM pipeline on current noisy quantum hardware and observing that its minority-class F1 falls below the tuned classical RBF SVM for the same embeddings and qubit counts.
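One way to picture why hardware noise could erase the separation is a toy model, sketched here under strong assumptions. A single global depolarizing channel of strength p is taken to act before measurement, so each ideal kernel entry k becomes (1 - p) * k + p / 2^q. The noise model, the 3x3 matrix, and the function names are illustrative assumptions. None of them comes from the paper, which reports noiseless simulation only.

```python
from statistics import pvariance


def depolarize_gram(gram, p, num_qubits):
    """Toy noise model (assumed): mix every entry toward 1 / 2^q."""
    dim = 2 ** num_qubits
    return [[(1.0 - p) * k + p / dim for k in row] for row in gram]


def offdiag_values(gram):
    """All off-diagonal entries of a square matrix."""
    n = len(gram)
    return [gram[i][j] for i in range(n) for j in range(n) if i != j]


# Hypothetical ideal kernel matrix for three samples on q = 3 qubits.
ideal = [[1.0, 0.8, 0.1],
         [0.8, 1.0, 0.3],
         [0.1, 0.3, 1.0]]
noisy = depolarize_gram(ideal, p=0.5, num_qubits=3)

var_ideal = pvariance(offdiag_values(ideal))
var_noisy = pvariance(offdiag_values(noisy))
print(f"off-diagonal variance, ideal: {var_ideal:.4f}")
print(f"off-diagonal variance, noisy: {var_noisy:.4f}")
```

In this model the spread of off-diagonal entries shrinks by a factor of (1 - p)^2, so samples become harder to tell apart as p grows. Whether real devices behave this way for the paper's circuits is exactly what a hardware run would have to establish.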

Figures

Figures reproduced from arXiv: 2604.24597 by Aldo Marzullo, Ariel Guerra-Adames, Chi-Yu Chen, Dax Enshan Koh, Felipe Ocampo Osorio, J. Alejandro Andrade, Leo Anthony Celi, Noah Dane Hebdon, Rafi Al Attrach, Rahul Gorijavolu, Sebastian Cajas Ordóñez, Siong Thye Goh, Xue Yang.

**Figure 1.** Three-stage preprocessing pipeline applied to all …

**Figure 2.** Linear kernel …

**Figure 3.** Quantum vs. linear kernel eigenspectrum comparison …

**Figure 4.** Quantum kernel matrix …

**Figure 5.** Partial qubit sweep ( …

**Figure 6.** Quantum kernel eigenvalue spectra for all three embedding models at …

**Figure 7.** Quantum kernel eigenspectrum for MedSigLIP-448 …

**Figure 9.** Quantum kernel heatmaps for all three embedding models at …

**Figure 10.** PCA feature space ( …

**Figure 11.** PCA scatter of MedSigLIP-448 embeddings projected to 2 components (total explained variance: 21.8%). Train set: …

Original abstract

We provide evidence of quantum kernel advantage under noiseless simulation in binary insurance classification on MIMIC-CXR chest radiographs using quantum support vector machines (QSVM) with frozen embeddings from three medical foundation models (MedSigLIP-448, RAD-DINO, ViT-patch32). We propose a two-tier fair comparison framework in which both classifiers receive identical PCA-q features. At Tier 1 (untuned QSVM vs. untuned linear SVM, C = 1 both sides), QSVM wins minority-class F1 in all 18 tested configurations (17 at p < 0.001, 1 at p < 0.01). The classical linear kernel collapses to majority-class prediction on 90-100% of seeds at every qubit count, while QSVM maintains non-trivial recall. At q = 11 (MedSigLIP-448 plateau center), QSVM achieves mean F1 = 0.343 vs. classical F1 = 0.050 (F1 gain = +0.293, p < 0.001) without hyperparameter tuning. Under Tier 2 (untuned QSVM vs. C-tuned RBF SVM), QSVM wins all seven tested configurations (mean gain +0.068, max +0.112). Eigenspectrum analysis reveals quantum kernel effective rank reaches 69.80 at q = 11, far exceeding linear kernel rank, while classical collapse remains C-invariant. A full qubit sweep reveals architecture-dependent concentration onset across models. Code: https://github.com/sebasmos/qml-medimage
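The abstract reports per-configuration p-values, but this page does not name the test behind them. As a hedged illustration of how a paired comparison over random seeds can be tested, the sketch below runs an exact one-sided sign-flip permutation test on per-seed F1 differences. The choice of test, the number of seeds, and the numbers are all hypothetical.

```python
from itertools import product


def sign_flip_p_value(diffs):
    """Exact one-sided paired sign-flip test for mean(diffs) > 0.

    Enumerates all 2^n sign assignments, so it is only practical for small n.
    """
    observed = sum(diffs)
    hits = 0
    for signs in product((1, -1), repeat=len(diffs)):
        flipped = sum(s * d for s, d in zip(signs, diffs))
        if flipped >= observed - 1e-12:  # tolerance for float round-off
            hits += 1
    return hits / 2 ** len(diffs)


# Hypothetical per-seed F1 gains (quantum minus classical) over 10 seeds.
gains = [0.31, 0.28, 0.30, 0.25, 0.33, 0.29, 0.27, 0.32, 0.26, 0.30]
p_value = sign_flip_p_value(gains)
print(p_value)
```

With n paired seeds the smallest attainable one-sided p-value is 1 / 2^n, so at least 10 seeds are needed before this particular test can report p < 0.001.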

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note: private letter to a colleague

Quantum kernels avoid majority-class collapse on imbalanced medical embeddings in noiseless simulation, with a clear rank-based explanation and reproducible stats.

The main point is that quantum kernels keep non-trivial minority-class recall where classical kernels collapse to majority predictions on these chest X-ray tasks. The authors show this across 18 configurations using frozen embeddings from three medical foundation models, with QSVM beating untuned linear SVM on minority-class F1 every time and still edging out C-tuned RBF SVM in all seven Tier-2 configurations. The effective-rank gap (quantum around 70 at q=11 versus classical staying low and C-invariant) directly tracks the performance difference, which is a clean observation rather than post-hoc fitting.

Referee Report

2 major / 2 minor

Summary. The manuscript claims empirical evidence of quantum kernel advantage in noiseless simulations of QSVMs applied to PCA-reduced embeddings from medical foundation models (MedSigLIP-448, RAD-DINO, ViT-patch32) for binary insurance classification on the MIMIC-CXR dataset. Using a two-tier comparison framework with identical PCA-q features for quantum and classical models, it reports that untuned QSVM outperforms untuned linear SVM (C=1) in minority-class F1 across all 18 configurations (with large gains, e.g., +0.293 at q=11 for MedSigLIP-448), while classical kernels collapse to majority-class predictions; QSVM also beats C-tuned RBF SVM in all 7 tested cases. Eigenspectrum analysis shows quantum kernels achieve much higher effective rank (~69.8 at q=11) than classical ones, explaining the non-collapse, with architecture-dependent concentration in qubit sweeps. Code is provided for reproducibility.

Significance. If the results hold under the stated conditions, the work offers a clear, reproducible demonstration of how quantum kernels can mitigate the collapse problem in imbalanced medical classification tasks where classical kernels fail, supported by consistent statistical significance across models and seeds. The two-tier design, effective-rank explanation, and linked code are strengths that make the empirical claims more credible than typical quantum ML benchmarks. This could encourage targeted follow-up on quantum methods for healthcare embeddings, though the noiseless scope limits immediate practical impact.

major comments (2)

[Results and eigenspectrum analysis] The central results rest on noiseless simulation of the quantum kernel; while the paper scopes its claims appropriately, the effective-rank advantage (reaching 69.80 at q=11) and F1 gains may not persist under realistic noise or hardware constraints, which could induce concentration or rank reduction not captured here. A brief analysis or caveat on this point in the discussion would strengthen the interpretation of the qubit-sweep results.
[Methods] Full experimental details on embedding extraction from the foundation models, exact train/test splits of MIMIC-CXR, and the precise implementation of the quantum kernel (e.g., feature map and circuit depth) are referenced only via the code repository. These should be summarized in the methods section to allow verification of the 18 configurations and the PCA-q reduction without external access, as they are load-bearing for reproducing the reported F1 values and p-values.

minor comments (2)

[Results] Ensure that all 18 Tier-1 and 7 Tier-2 configurations are explicitly tabulated or referenced to specific figures/tables, including the exact q values and models tested, to improve clarity of the 'all configurations' claim.
[Abstract and results] The abstract states 'architecture-dependent concentration onset' but does not specify the onset qubit counts per model; adding this detail or a reference to the relevant figure would aid readers.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive comments. Both major points have been addressed by adding a targeted caveat in the Discussion and expanding the Methods section with the requested experimental details.

Referee: [Results and eigenspectrum analysis] The central results rest on noiseless simulation of the quantum kernel; while the paper scopes its claims appropriately, the effective-rank advantage (reaching 69.80 at q=11) and F1 gains may not persist under realistic noise or hardware constraints, which could induce concentration or rank reduction not captured here. A brief analysis or caveat on this point in the discussion would strengthen the interpretation of the qubit-sweep results.

Authors: We agree that an explicit caveat strengthens interpretation of the qubit-sweep results. We have added a concise paragraph in the Discussion section noting that the reported effective-rank advantage and F1 gains are obtained under noiseless simulation and that hardware noise could induce additional concentration or rank reduction not captured in the present experiments. This addition clarifies the scope without changing the core empirical claims.
Revision: yes

Referee: [Methods] Full experimental details on embedding extraction from the foundation models, exact train/test splits of MIMIC-CXR, and the precise implementation of the quantum kernel (e.g., feature map and circuit depth) are referenced only via the code repository. These should be summarized in the methods section to allow verification of the 18 configurations and the PCA-q reduction without external access, as they are load-bearing for reproducing the reported F1 values and p-values.

Authors: We accept this recommendation. The revised Methods section now includes a self-contained summary of the embedding extraction pipelines for MedSigLIP-448, RAD-DINO, and ViT-patch32; the precise MIMIC-CXR train/test split (including patient-level stratification and seed handling); and the quantum feature map together with circuit depth and PCA-q reduction procedure. These additions enable direct verification of all 18 configurations and reported statistics without external code access.
Revision: yes

Circularity Check

0 steps flagged

No significant circularity

The paper reports empirical performance comparisons (minority-class F1 scores across 18 configurations) and direct eigenspectrum measurements (effective rank of quantum vs. classical kernels) on PCA-reduced embeddings from medical foundation models. No derivation chain, first-principles prediction, or ansatz is claimed that reduces by the paper's own equations to fitted inputs or self-citations. The central observations (QSVM non-collapse, quantum effective rank of ~69.8 at q=11, architecture-dependent concentration) are independent experimental outputs, not constructed from the performance metrics or prior self-citations. Code and statistics over seeds are provided, so the results can be checked without relying on the authors' earlier work.

Axiom & Free-Parameter Ledger

2 free parameters · 2 axioms · 0 invented entities

The claim rests on standard assumptions of kernel methods and noiseless quantum simulation; no new entities are postulated and free parameters are limited to the fixed untuned C=1 and the choice of qubit count q.

free parameters (2)

C=1: fixed, untuned regularization for both QSVM and linear SVM in Tier 1, to keep the comparison fair.
q (PCA components / qubits): sweep variable; q=11 is highlighted as the plateau center for MedSigLIP-448.

axioms (2)

Domain assumption: noiseless quantum simulation faithfully represents the ideal quantum kernel matrix for the given feature map. Invoked throughout the QSVM experiments and eigenspectrum analysis.
Domain assumption: PCA-q reduction preserves the relevant discriminative information equally for quantum and classical kernels. Central to the two-tier fair comparison framework.
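To make the first assumption concrete, here is a minimal sketch of what a noiseless quantum kernel matrix is for one very simple feature map. The feature map is an assumption: each of the q PCA features sets one RY rotation angle on its own qubit, with no entangling gates, which gives the closed form k(x, y) = prod_i cos^2((x_i - y_i) / 2). The paper's actual feature map and circuit depth are not described on this page and may differ.

```python
import math


def fidelity_kernel(x, y):
    """|<psi(y)|psi(x)>|^2 for the product state RY(x_1)|0> ... RY(x_q)|0>.

    Assumed angle-encoding feature map, one qubit per feature.
    """
    value = 1.0
    for xi, yi in zip(x, y):
        value *= math.cos((xi - yi) / 2.0) ** 2
    return value


def quantum_gram(rows):
    """Noiseless kernel (Gram) matrix over a list of feature vectors."""
    return [[fidelity_kernel(x, y) for y in rows] for x in rows]


# Hypothetical PCA-2 features, already scaled to rotation angles.
samples = [[0.0, 0.0], [math.pi, 0.0], [math.pi / 2, math.pi / 2]]
K = quantum_gram(samples)
for row in K:
    print([round(v, 4) for v in row])
```

A matrix like K can be passed to any SVM solver that accepts a precomputed kernel. A product-state map of this kind is classically easy to evaluate, so it illustrates the pipeline shape only and says nothing about quantum advantage.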

Reference graph

Works this paper leans on

37 extracted references · 2 canonical work pages · 1 internal anchor

[1] Vojtěch Havlíček, Antonio D. Córcoles, Kristan Temme, Aram W. Harrow, Abhinav Kandala, Jerry M. Chow, and Jay M. Gambetta. Supervised learning with quantum-enhanced feature spaces. Nature, 567(7747):209–212, 2019.
[2] M. Schuld and N. Killoran. Quantum machine learning in feature Hilbert spaces. Physical Review Letters, 122:040504, 2019.
[3] M. Schuld. Supervised quantum machine learning models are kernel methods, 2021.
[4] Yunchao Liu, Srinivasan Arunachalam, and Kristan Temme. A rigorous and robust quantum speed-up in supervised machine learning. Nature Physics, 17(9):1013–1017, 2021.
[5] S. Jerbi, L. J. Fiderer, H. Poulsen Nautrup, J. M. Kübler, H. J. Briegel, and V. Dunjko. Quantum machine learning beyond kernel methods. Nature Communications, 14:517, 2023.
[6] Joseph Bowles, Shahnawaz Ahmed, and Maria Schuld. Better than classical? the subtle art of benchmarking quantum machine learning models. arXiv preprint arXiv:2403.07059, 2024.
[7] Sebastián Andrés Cajas Ordóñez, Luis Fernando Torres Torres, Mario Bifulco, Carlos Andres Duran, Cristian Bosch, and Ricardo Simon Carbajo. Embedding aware quantum classical svms for scalable quantum machine learning. In Marco Baioletti, Miguel Angel Gonzalez, Corrado Loglisci, Angelo Oddi, Riccardo Rasconi, and Ramiro Varela, editors, Proceedings ... (2025)
[8] Alistair E. W. Johnson, Tom J. Pollard, Seth J. Berkowitz, Nathaniel R. Greenbaum, Matthew P. Lungren, Chih-Ying Deng, Roger G. Mark, and Steven Horng. MIMIC-CXR, a de-identified publicly available database of chest radiographs with free-text reports. Scientific Data, 6:317, 2019.
[9] A. E. W. Johnson, T. J. Pollard, N. R. Greenbaum, M. P. Lungren, C.-Y. Deng, Y. Peng, Z. Lu, R. G. Mark, S. J. Berkowitz, and S. Horng. MIMIC-CXR-JPG, a large publicly available database of labeled chest radiographs, 2019.
[10] Judy Wawira Gichoya, Imon Banerjee, Ananth Reddy Bhimireddy, John L. Burns, Leo Anthony Celi, Li-Ching Chen, Ramon Correa, Natalie Dullerud, Marzyeh Ghassemi, Shih-Cheng Huang, Po-Chih Kuo, Matthew P. Lungren, Lyle J. Palmer, Brandon J. Price, Saptarshi Purkayastha, Ayis T. Pyrros, Lauren Oakden-Rayner, Chima Okechukwu, Laleh Seyyed-Kalantari, Hari Triv... (2022)
[11] Chi-Yu Chen, Rawan Abulibdeh, Arash Asgari, Sebastián Andrés Cajas Ordóñez, Leo Anthony Celi, Deirdre Goode, Hassan Hamidi, Laleh Seyyed-Kalantari, Ned McCague, Thomas Sounack, et al. Algorithms trained on normal chest x-rays can predict health insurance types. arXiv preprint arXiv:2511.11030, 2025.
[12] Charles Jones, Daniel C Castro, Fabio De Sousa Ribeiro, Ozan Oktay, Melissa McCradden, and Ben Glocker. A causal perspective on dataset bias in machine learning for medical imaging. Nature Machine Intelligence, 6(2):138–146, 2024.
[13] Laleh Seyyed-Kalantari, Haoran Zhang, Matthew B. A. McDermott, Irene Y. Chen, and Marzyeh Ghassemi. Underdiagnosis bias of artificial intelligence algorithms applied to chest radiographs in under-served patient populations. Nature Medicine, 27(12):2176–2182, 2021.
[14] Ziad Obermeyer, Brian Powers, Christine Vogeli, and Sendhil Mullainathan. Dissecting racial bias in an algorithm used to manage the health of populations. Science, 366(6464):447–453, 2019.
[15] Xiaohua Zhai, Basil Mustafa, Alexander Kolesnikov, and Lucas Beyer. Sigmoid loss for language image pre-training. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), pages 11975–11986, 2023.
[16] Fernando Pérez-García, Harshita Sharma, Sam Bond-Taylor, Kenza Bouzid, Valentina Salvatelli, Maximilian Ilse, Shruthi Bannur, Daniel C Castro, Anton Schwaighofer, Matthew P Lungren, et al. Exploring scalable medical image encoders beyond text supervision. Nature Machine Intelligence, 7(1):119–130, 2025.
[17] A. Dosovitskiy, L. Beyer, A. Kolesnikov, D. Weissenborn, X. Zhai, T. Unterthiner, M. Dehghani, M. Minderer, G. Heigold, S. Gelly, J. Uszkoreit, and N. Houlsby. An image is worth 16×16 words: Transformers for image recognition at scale. In Proceedings of the International Conference on Learning Representations (ICLR), 2021.
[18] Supanut Thanasilp, Samson Wang, Marco Cerezo, and Zoë Holmes. Exponential concentration in quantum kernel methods. Nature Communications, 15(1):5200, 2024.
[19] H.-Y. Huang, M. Broughton, M. Mohseni, R. Babbush, S. Boixo, H. Neven, and J. R. McClean. Power of data in quantum machine learning. Nature Communications, 12:2631, 2021.
[20] Jonas Kübler, Simon Buchholz, and Bernhard Schölkopf. The inductive bias of quantum kernels. Advances in Neural Information Processing Systems, 34:12661–12673, 2021.
[21] M. Larocca, S. Thanasilp, S. Wang, K. Sharma, J. Biamonte, P. J. Coles, L. Cincio, J. R. McClean, Z. Holmes, and M. Cerezo. Barren plateaus in variational quantum computing. Nature Reviews Physics, 7:174–189, 2025.
[22] Amira Abbas, David Sutter, Christa Zoufal, Aurelien Lucchi, Alessio Figalli, and Stefan Woerner. The power of quantum neural networks. Nature Computational Science, 1:403–409, 2021.
[23] D. Peral-García, J. Cruz-Benito, and F. J. García-Peñalvo. Systematic literature review: Quantum machine learning and its applications. Computer Science Review, 51:100619, 2024.
[24] A. Senokosov, A. Sedykh, A. Sagingalieva, B. Kyriacou, and A. Melnikov. Quantum machine learning for image classification. Machine Learning: Science and Technology, 5:015040, 2024.
[25] Vladimir N. Vapnik. The Nature of Statistical Learning Theory. Springer, New York, 2nd edition, 1998.
[26] B. Schölkopf and A. J. Smola. Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT Press, Cambridge, MA, 2002.
[27] C. Coleman, C. Yeh, S. Mussmann, B. Mirzasoleiman, P. Bailis, P. Liang, J. Leskovec, and M. Zaharia. Selection via proxy: Efficient data selection for deep learning. In Proceedings of the International Conference on Learning Representations (ICLR), 2020.
[28] M. Sokolova and G. Lapalme. A systematic analysis of performance measures for classification tasks. Information Processing & Management, 45(4):427–437, 2009.
[29] Davide Chicco and Giuseppe Jurman. The advantages of the Matthews correlation coefficient (MCC) over F1 score and accuracy in binary classification evaluation. BMC Genomics, 21:6, 2020.
[30] Felipe Ocampo Osorio, Santiago Pedroza Gomez, David Esteban Rebellón Sanchez, Richard Ramirez Fernandez, Reinel Tabares-Soto, Mario Alejandro Bravo-Ortíz, and Gustavo Adolfo Cruz Suarez. Predicting no-shows at outpatient appointments in internal medicine using machine learning models. PeerJ Computer Science, 11:e2762, 2025.
[31] Jarrod R McClean, Sergio Boixo, Vadim N Smelyanskiy, Ryan Babbush, and Hartmut Neven. Barren plateaus in quantum neural network training landscapes. Nature Communications, 9(1):4812, 2018.
Appendix A: Supplementary Figures. This appendix collects additional figures that complement the main text. All experiments use DT9 preprocessing, seed 0, and trace n...
[32] Quantum Kernel Eigenspectra (All Models). Figure 6 shows the quantum kernel eigenvalue spectra for all three embedding models at q = 4 and q = 6. These complement the MedSigLIP q = 6 spectrum shown in the main text (Figure 3).
[33] Quantum Kernel Heatmaps (All Models). Figure 9 shows the quantum kernel matrices K_Q at q = 4 and q = 6 for all three models.
[34] PCA Feature Space: Class Separation at q = 4 and q = 6. Figure 10 shows the PCA-compressed training data at q = 4 and q = 6 for all three models. The substantial class overlap visible in every panel provides a geometric explanation for why the linear kernel collapses.
[35] PCA Geometry of MedSigLIP-448 at q = 2.
[36] ViT-patch32-GAP Pooling Ablation. To assess the effect of pooling strategy on quantum kernel performance, we evaluate a global average pooling (GAP) variant of ViT-patch32 alongside the CLS-token variant reported in the main text. Both variants produce 768-dimensional embeddings from the same frozen ViT-patch32 backbone; the only difference is the aggrega...
[37] ViT-patch16-cls Patch-Size Ablation. To assess the effect of patch size on quantum kernel performance, we evaluate a ViT with patch size 16 (ViT-patch16-cls, 768-dimensional CLS-token embeddings) alongside the ViT-patch32-cls variant reported in the main text. Both variants use the same frozen ViT backbone architecture; the only difference is the spati...

[1] [1]

C´ orcoles, Kristan Temme, Aram W

Vojtˇ ech Havl´ ıˇ cek, Antonio D. C´ orcoles, Kristan Temme, Aram W. Harrow, Abhinav Kandala, Jerry M. Chow, and Jay M. Gambetta. Supervised learning with quantum- enhanced feature spaces.Nature, 567(7747):209–212, 2019

2019

[2] [2]

Schuld and N

M. Schuld and N. Killoran. Quantum machine learn- ing in feature Hilbert spaces.Physical Review Letters, 122:040504, 2019

2019

[3] [3]

M. Schuld. Supervised quantum machine learning models are kernel methods, 2021

2021

[4] [4]

A rigorous and robust quantum speed-up in supervised machine learning.Nature Physics, 17(9):1013– 1017, 2021

Yunchao Liu, Srinivasan Arunachalam, and Kristan Temme. A rigorous and robust quantum speed-up in supervised machine learning.Nature Physics, 17(9):1013– 1017, 2021

2021

[5] [5]

Jerbi, L

S. Jerbi, L. J. Fiderer, H. Poulsen Nautrup, J. M. K¨ ubler, H. J. Briegel, and V. Dunjko. Quantum machine learning beyond kernel methods.Nature Communications, 14:517, 2023

2023

[6] [6]

Better than classical? the subtle artofbenchmarkingquantummachinelearningmodels

Joseph Bowles, Shahnawaz Ahmed, and Maria Schuld. Better than classical? the subtle art of benchmark- ing quantum machine learning models.arXiv preprint arXiv:2403.07059, 2024

work page arXiv 2024

[7] [7]

Embedding aware quantum classical svms for scalable quantum machine learning

Sebasti´ an Andr´ es Cajas Ord´ o˜ nez, Luis Fernando Torres Torres, Mario Bifulco, Carlos Andres Duran, Cristian Bosch, and Ricardo Simon Carbajo. Embedding aware quantum classical svms for scalable quantum machine learning. In Marco Baioletti, Miguel Angel Gonzalez, Corrado Loglisci, Angelo Oddi, Riccardo Rasconi, and Ramiro Varela, editors,Proceedings ...

2025

[8] [8]

Alistair E. W. Johnson, Tom J. Pollard, Seth J. Berkowitz, Nathaniel R. Greenbaum, Matthew P. Lungren, Chih- Ying Deng, Roger G. Mark, and Steven Horng. MIMIC- CXR, a de-identified publicly available database of chest radiographs with free-text reports.Scientific Data, 6:317, 2019

2019

[9] [9]

A. E. W. Johnson, T. J. Pollard, N. R. Greenbaum, M. P. Lungren, C.-Y. Deng, Y. Peng, Z. Lu, R. G. Mark, S. J. Berkowitz, and S. Horng. MIMIC-CXR-JPG, a large publicly available database of labeled chest radiographs, 2019

2019

[10] [10]

Burns, Leo Anthony Celi, Li-Ching Chen, Ramon Correa, Natalie Dullerud, Marzyeh Ghas- semi, Shih-Cheng Huang, Po-Chih Kuo, Matthew P

Judy Wawira Gichoya, Imon Banerjee, Ananth Reddy Bhimireddy, John L. Burns, Leo Anthony Celi, Li-Ching Chen, Ramon Correa, Natalie Dullerud, Marzyeh Ghas- semi, Shih-Cheng Huang, Po-Chih Kuo, Matthew P. Lungren, Lyle J. Palmer, Brandon J. Price, Saptarshi Purkayastha, Ayis T. Pyrros, Lauren Oakden-Rayner, Chima Okechukwu, Laleh Seyyed-Kalantari, Hari Triv...

2022

[11] [11]

Algorithms Trained on Normal Chest X-rays Can Predict Health Insurance Types

Chi-Yu Chen, Rawan Abulibdeh, Arash Asgari, Sebasti´ an Andr´ es Cajas Ord´ o˜ nez, Leo Anthony Celi, Deirdre Goode, Hassan Hamidi, Laleh Seyyed-Kalantari, Ned McCague, Thomas Sounack, et al. Algorithms trained on normal chest x-rays can predict health insurance types.arXiv preprint arXiv:2511.11030, 2025

work page internal anchor Pith review Pith/arXiv arXiv 2025

[12] [12]

A causal perspective on dataset bias in machine learning for medical imaging.Nature Machine Intelligence, 6(2):138– 146, 2024

Charles Jones, Daniel C Castro, Fabio De Sousa Ribeiro, Ozan Oktay, Melissa McCradden, and Ben Glocker. A causal perspective on dataset bias in machine learning for medical imaging.Nature Machine Intelligence, 6(2):138– 146, 2024

2024

[13] [13]

Laleh Seyyed-Kalantari, Haoran Zhang, Matthew B. A. McDermott, Irene Y. Chen, and Marzyeh Ghassemi. Un- derdiagnosis bias of artificial intelligence algorithms ap- plied to chest radiographs in under-served patient popu- lations.Nature Medicine, 27(12):2176–2182, 2021

2021

[14] [14]

Dissecting racial bias in an algo- rithm used to manage the health of populations.Science, 366(6464):447–453, 2019

Ziad Obermeyer, Brian Powers, Christine Vogeli, and Sendhil Mullainathan. Dissecting racial bias in an algo- rithm used to manage the health of populations.Science, 366(6464):447–453, 2019

2019

[15] [15]

Sigmoid loss for language image pre-training

Xiaohua Zhai, Basil Mustafa, Alexander Kolesnikov, and Lucas Beyer. Sigmoid loss for language image pre-training. InProceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), pages 11975–11986, 2023

2023

[16] [16]

Exploring scal- able medical image encoders beyond text supervision

Fernando P´ erez-Garc´ ıa, Harshita Sharma, Sam Bond- Taylor, Kenza Bouzid, Valentina Salvatelli, Maxim- ilian Ilse, Shruthi Bannur, Daniel C Castro, Anton Schwaighofer, Matthew P Lungren, et al. Exploring scal- able medical image encoders beyond text supervision. Nature Machine Intelligence, 7(1):119–130, 2025

2025

[17] [17]

Dosovitskiy, L

A. Dosovitskiy, L. Beyer, A. Kolesnikov, D. Weissenborn, X. Zhai, T. Unterthiner, M. Dehghani, M. Minderer, G. Heigold, S. Gelly, J. Uszkoreit, and N. Houlsby. An image is worth 16 ×16 words: Transformers for image recognition at scale. InProceedings of the International Conference on Learning Representations (ICLR), 2021

2021

[18] [18]

Exponential concentration in quantum kernel methods.Nature Communications, 15(1):5200, 2024

Supanut Thanasilp, Samson Wang, Marco Cerezo, and Zo¨ e Holmes. Exponential concentration in quantum kernel methods.Nature Communications, 15(1):5200, 2024

2024

[19] [19]

Huang, M

H.-Y. Huang, M. Broughton, M. Mohseni, R. Babbush, S. Boixo, H. Neven, and J. R. McClean. Power of data in quantum machine learning.Nature Communications, 12:2631, 2021

2021

[20] Jonas Kübler, Simon Buchholz, and Bernhard Schölkopf. The inductive bias of quantum kernels. Advances in Neural Information Processing Systems, 34:12661–12673, 2021.

[21] M. Larocca, S. Thanasilp, S. Wang, K. Sharma, J. Biamonte, P. J. Coles, L. Cincio, J. R. McClean, Z. Holmes, and M. Cerezo. Barren plateaus in variational quantum computing. Nature Reviews Physics, 7:174–189, 2025.

[22] Amira Abbas, David Sutter, Christa Zoufal, Aurelien Lucchi, Alessio Figalli, and Stefan Woerner. The power of quantum neural networks. Nature Computational Science, 1:403–409, 2021.

[23] D. Peral-García, J. Cruz-Benito, and F. J. García-Peñalvo. Systematic literature review: Quantum machine learning and its applications. Computer Science Review, 51:100619, 2024.

[24] A. Senokosov, A. Sedykh, A. Sagingalieva, B. Kyriacou, and A. Melnikov. Quantum machine learning for image classification. Machine Learning: Science and Technology, 5:015040, 2024.

[25] Vladimir N. Vapnik. The Nature of Statistical Learning Theory. Springer, New York, 2nd edition, 1998.

[26] B. Schölkopf and A. J. Smola. Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT Press, Cambridge, MA, 2002.

[27] C. Coleman, C. Yeh, S. Mussmann, B. Mirzasoleiman, P. Bailis, P. Liang, J. Leskovec, and M. Zaharia. Selection via proxy: Efficient data selection for deep learning. In Proceedings of the International Conference on Learning Representations (ICLR), 2020.

[28] M. Sokolova and G. Lapalme. A systematic analysis of performance measures for classification tasks. Information Processing & Management, 45(4):427–437, 2009.

[29] Davide Chicco and Giuseppe Jurman. The advantages of the Matthews correlation coefficient (MCC) over F1 score and accuracy in binary classification evaluation. BMC Genomics, 21:6, 2020.

[30] Felipe Ocampo Osorio, Santiago Pedroza Gomez, David Esteban Rebellón Sanchez, Richard Ramirez Fernandez, Reinel Tabares-Soto, Mario Alejandro Bravo-Ortíz, and Gustavo Adolfo Cruz Suarez. Predicting no-shows at outpatient appointments in internal medicine using machine learning models. PeerJ Computer Science, 11:e2762, 2025.

[31] Jarrod R McClean, Sergio Boixo, Vadim N Smelyanskiy, Ryan Babbush, and Hartmut Neven. Barren plateaus in quantum neural network training landscapes. Nature Communications, 9(1):4812, 2018.

Appendix A: Supplementary Figures

This appendix collects additional figures that complement the main text. All experiments use DT9 preprocessing, seed 0, and trace n...

Quantum Kernel Eigenspectra (All Models)

Figure 6 shows the quantum kernel eigenvalue spectra for all three embedding models at q = 4 and q = 6. These complement the MedSigLIP q = 6 spectrum shown in the main text (Figure 3).
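As a rough illustration of how an eigenvalue spectrum of this kind can be computed from any symmetric kernel matrix, the sketch below uses NumPy on synthetic data. It is not the authors' code: the data are random stand-ins, the kernels are classical (linear and RBF) rather than quantum, and the effective rank is assumed here to be the exponential of the Shannon entropy of the normalized eigenvalues, which may differ from the definition used in the paper. The names `kernel_spectrum` and `effective_rank` are hypothetical.

```python
import numpy as np

def kernel_spectrum(K):
    """Return the eigenvalues of a symmetric kernel matrix, largest first."""
    K = 0.5 * (K + K.T)                    # symmetrize against round-off
    eigvals = np.linalg.eigvalsh(K)[::-1]  # eigvalsh returns ascending order
    return np.clip(eigvals, 0.0, None)     # clip tiny negative round-off values

def effective_rank(eigvals):
    """Entropy-based effective rank (ASSUMED definition, see the text above)."""
    total = eigvals.sum()
    if total <= 0:
        return 0.0
    p = eigvals / total
    p = p[p > 0]                           # skip zeros so log is defined
    return float(np.exp(-(p * np.log(p)).sum()))

# Synthetic stand-in: 50 points in q = 6 dimensions (not the paper's embeddings).
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 6))

K_lin = X @ X.T                                        # linear kernel, rank <= 6
sq_dist = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
K_rbf = np.exp(-0.5 * sq_dist)                         # RBF kernel, gamma = 0.5 (arbitrary)

print("linear:", effective_rank(kernel_spectrum(K_lin)))
print("rbf:   ", effective_rank(kernel_spectrum(K_rbf)))
```

Under this definition an identity matrix of size n has effective rank n and a rank-one matrix has effective rank 1, so the linear kernel on q-dimensional inputs can never exceed q.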

Quantum Kernel Heatmaps (All Models)

Figure 9 shows the quantum kernel matrices K_Q at q = 4 and q = 6 for all three models.

PCA Feature Space: Class Separation at q = 4 and q = 6

Figure 10 shows the PCA-compressed training data at q = 4 and q = 6 for all three models. The substantial class overlap visible in every panel provides a geometric explanation for why the linear kernel collapses.
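The PCA compression to q dimensions referred to here can be sketched as follows. This is a minimal NumPy version under stated assumptions, not the authors' pipeline: the 768-dimensional inputs are random stand-ins for frozen embeddings, the projection is assumed to be fitted on the training split only and then applied to the test split, and `pca_project` is a hypothetical helper name.

```python
import numpy as np

def pca_project(X_train, X_test, q):
    """Fit PCA on the training split and project both splits to q dimensions."""
    mean = X_train.mean(axis=0)
    # The right singular vectors of the centred training data are the principal axes.
    _, _, Vt = np.linalg.svd(X_train - mean, full_matrices=False)
    components = Vt[:q]                      # top-q axes, shape (q, d)
    Z_train = (X_train - mean) @ components.T
    Z_test = (X_test - mean) @ components.T  # test data reuse the training mean and axes
    return Z_train, Z_test

# Synthetic stand-in for frozen 768-dimensional embeddings (not real data).
rng = np.random.default_rng(0)
X_train = rng.normal(size=(200, 768))
X_test = rng.normal(size=(40, 768))

for q in (4, 6):
    Z_train, Z_test = pca_project(X_train, X_test, q)
    print(q, Z_train.shape, Z_test.shape)
```

Plotting pairs of columns of `Z_train` coloured by class label would give a scatter of the kind the figure describes; labels are omitted here because the synthetic data carry none.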

PCA Geometry of MedSigLIP-448 at q = 2

ViT-patch32-GAP Pooling Ablation

To assess the effect of pooling strategy on quantum kernel performance, we evaluate a global average pooling (GAP) variant of ViT-patch32 alongside the CLS-token variant reported in the main text. Both variants produce 768-dimensional embeddings from the same frozen ViT-patch32 backbone; the only difference is the aggrega...
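The two pooling strategies named in this ablation, CLS-token readout and global average pooling, can be illustrated on a generic ViT token sequence. The sketch below is an assumption-laden toy, not the authors' implementation: it assumes the CLS token sits at index 0, that GAP averages the patch tokens only (excluding CLS), and that the input is 224×224 so a patch size of 32 gives 49 patch tokens. The token values are random and the function names are hypothetical.

```python
import numpy as np

def cls_pool(tokens):
    """Use the CLS token (ASSUMED to be at index 0) as the image embedding."""
    return tokens[0]

def gap_pool(tokens):
    """Average the patch tokens; excluding the CLS token is an assumption."""
    return tokens[1:].mean(axis=0)

# Random stand-in for the final-layer tokens of one image:
# 1 CLS token + 49 patch tokens (49 assumes a 224x224 input with patch size 32).
rng = np.random.default_rng(0)
tokens = rng.normal(size=(1 + 49, 768))

emb_cls = cls_pool(tokens)
emb_gap = gap_pool(tokens)
print(emb_cls.shape, emb_gap.shape)  # both are 768-dimensional
```

Both readouts yield a vector of the same width, so the downstream PCA and kernel steps can stay unchanged when the pooling strategy is swapped.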

ViT-patch16-cls Patch-Size Ablation

To assess the effect of patch size on quantum kernel performance, we evaluate a ViT with patch size 16 (ViT-patch16-cls, 768-dimensional CLS-token embeddings) alongside the ViT-patch32-cls variant reported in the main text. Both variants use the same frozen ViT backbone architecture; the only difference is the spati...