Scalable Quantum Machine Learning via Multi-layer Fully-Connected Variational Quantum Circuits
Pith reviewed 2026-05-15 21:09 UTC · model grok-4.3
The pith
Multi-layer fully-connected variational quantum circuits scale trainable parameters linearly with input dimension while matching or beating monolithic VQCs and matched deep neural networks on regression, classification, and PDE tasks.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
FC-VQC decomposes high-dimensional inputs into fixed-size local variational quantum circuit blocks connected by deterministic block-mixing rules. Each quantum computation remains local to its block, and the number of trainable quantum parameters scales linearly with input dimension. Across tabular regression, tabular classification, and spatio-temporal BSDE/PDE approximation tasks, this architecture improves performance over monolithic VQC baselines and achieves competitive or improved performance relative to structure-matched deep neural network baselines while using substantially fewer trainable parameters.
What carries the argument
The multi-layer fully-connected variational quantum circuit (FC-VQC) architecture, which partitions inputs into fixed-size local VQC blocks and links them with deterministic block-mixing rules that propagate information without extra trainable quantum parameters.
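The block-then-mix pattern described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration rather than the paper's implementation: `local_block` is a toy product-state stand-in for a local VQC (per qubit, RY(x) then RY(theta) on |0>, followed by a Z measurement, which yields cos(x + theta)), `ring_mix` uses plain neighbour averaging on a ring as the fixed mixing rule, and the input length is assumed to be divisible by `block_size`.

```python
import math

def local_block(x_block, theta_block):
    # Hypothetical stand-in for one local VQC block: each qubit applies
    # RY(x) then RY(theta) to |0> and measures <Z>, giving cos(x + theta).
    return [math.cos(x + t) for x, t in zip(x_block, theta_block)]

def ring_mix(blocks, radius=1):
    # Fixed, parameter-free mixing (assumed rule): average each block with
    # its neighbours within `radius` on a ring of blocks.
    n = len(blocks)
    mixed = []
    for b in range(n):
        # A set avoids counting a neighbour twice when the ring is short.
        idx = {(b + k) % n for k in range(-radius, radius + 1)}
        width = len(blocks[b])
        mixed.append([sum(blocks[j][i] for j in idx) / len(idx)
                      for i in range(width)])
    return mixed

def fc_vqc_forward(x, thetas, block_size, radius=1):
    # thetas[layer][block] holds `block_size` trainable angles.
    blocks = [x[i:i + block_size] for i in range(0, len(x), block_size)]
    for layer_thetas in thetas:
        blocks = [local_block(h, t) for h, t in zip(blocks, layer_thetas)]
        blocks = ring_mix(blocks, radius)
    # Flatten the per-block outputs back into one feature vector.
    return [v for blk in blocks for v in blk]
```

In this toy version the only trainable quantities are the angles in `thetas`, one per input feature per layer, so the count grows linearly with `len(x)`; the mixing step contributes none.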
If this is right
- Trainable parameters grow linearly rather than quadratically or worse with input dimension.
- Each local quantum block stays small enough to simulate classically and to optimize with gradient-based methods on near-term hardware.
- The same modular pattern delivers gains on both supervised learning tasks and differential-equation approximation.
- Parameter count drops substantially relative to deep neural networks of matched width and depth.
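The scaling contrast in the list above can be made concrete with a rough parameter count. This is a sketch under stated assumptions, not the paper's accounting: the function names, the choice of three rotation angles per qubit per circuit layer, and the dense baseline whose hidden width equals the input dimension are all hypothetical.

```python
import math

def fc_vqc_params(input_dim, block_qubits=3, circuit_depth=3,
                  rotations_per_qubit=3, mixing_layers=1):
    # Assumed layout: one input feature per qubit, so the number of blocks
    # grows linearly with the input dimension.
    n_blocks = math.ceil(input_dim / block_qubits)
    # Each block has a fixed parameter budget independent of input_dim.
    per_block = block_qubits * circuit_depth * rotations_per_qubit
    return mixing_layers * n_blocks * per_block

def dense_params(input_dim, hidden_layers=3):
    # Assumed baseline: every hidden layer has width == input_dim, which
    # makes each weight matrix input_dim x input_dim (quadratic growth).
    per_layer = input_dim * input_dim + input_dim  # weights + biases
    output_head = input_dim + 1                    # scalar output
    return hidden_layers * per_layer + output_head
```

Under these assumptions, doubling `input_dim` doubles `fc_vqc_params` while roughly quadrupling `dense_params`. A dense baseline with fixed hidden width would scale differently, so the comparison depends on how "matched" is defined.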
Where Pith is reading between the lines
- The same block-and-fixed-mixing pattern could be applied to other families of quantum circuits to improve their scaling.
- Success of fixed mixing rules implies that explicit learned cross terms may be less necessary in quantum ML than often assumed.
- Larger input dimensions become practical on hardware limited to small qubit counts if local blocks fit within available qubits.
Load-bearing premise
The fixed deterministic block-mixing rules preserve enough expressivity to capture cross-block correlations without needing additional trainable quantum parameters or post-processing adjustments.
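One necessary (though not sufficient) condition for this premise is that every output block can be influenced by every input block after enough layers. The sketch below tracks only that dependency structure for a sliding-window rule of a given radius on a line of blocks; the function name and the line topology are assumptions, and the code says nothing about whether the influence is expressive enough.

```python
def receptive_fields(n_blocks, radius, n_layers):
    # deps[b] is the set of input blocks that can influence block b.
    deps = [{b} for b in range(n_blocks)]
    for _ in range(n_layers):
        # After one mixing step, block b inherits the dependencies of
        # every block within `radius` positions of it.
        deps = [
            set().union(*(deps[j] for j in range(max(0, b - radius),
                                                 min(n_blocks, b + radius + 1))))
            for b in range(n_blocks)
        ]
    return deps
```

With radius r, each layer widens a block's receptive field by r blocks on each side, so an edge block in a line of n blocks needs about (n - 1) / r layers to see the whole input. A shallower stack leaves some block pairs unable to interact at all.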
What would settle it
A dataset in which cross-block correlations are essential and cannot be recovered by the fixed mixing rules, causing FC-VQC accuracy to fall below both monolithic VQCs and matched DNNs even after increasing local block size.
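A simple candidate for such a stress test is a synthetic regression target that multiplies one feature from the first block by one feature from the last block. The generator below is a hypothetical sketch (its name, the uniform sampling range, and the choice of product target are assumptions, and the input dimension is assumed divisible by the block size). A model that only adds up independent per-block outputs cannot represent this target, because a product of two variables is not a sum of single-variable functions.

```python
import random

def make_cross_block_dataset(n_samples, input_dim, block_size, seed=0):
    # Seeded generator so the dataset is reproducible across runs.
    rng = random.Random(seed)
    far = input_dim - block_size  # index of the first feature in the last block
    X, y = [], []
    for _ in range(n_samples):
        x = [rng.uniform(-1.0, 1.0) for _ in range(input_dim)]
        X.append(x)
        # The target couples the first block and the last block only.
        y.append(x[0] * x[far])
    return X, y
```

Training the same architecture with mixing enabled and disabled on data like this would show directly whether the fixed rules carry the cross-block signal.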
Original abstract
Variational Quantum Circuits (VQC) are promising models for quantum machine learning, but standard monolithic architectures face an expressivity–trainability dilemma: small circuits can be under-parameterized, while larger circuits are difficult to simulate and optimize. We propose Multi-Layer Fully-Connected Variational Quantum Circuits (FC-VQC), a modular framework that decomposes high-dimensional inputs into fixed-size local VQC blocks connected by deterministic block-mixing rules. This design keeps each quantum computation local while allowing the number of trainable quantum parameters to scale linearly with input dimension. We evaluate FC-VQC across tabular regression, tabular classification, and spatio-temporal BSDE/PDE approximation. Across the evaluated tasks, FC-VQC improves over monolithic VQC baselines and achieves competitive or improved performance relative to structure-matched deep neural network (DNN) baselines, while using substantially fewer trainable parameters.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper proposes Multi-Layer Fully-Connected Variational Quantum Circuits (FC-VQC), a modular architecture that decomposes high-dimensional inputs into fixed-size local VQC blocks linked by deterministic block-mixing rules. This yields linear scaling of trainable quantum parameters with input dimension while targeting improved performance over monolithic VQCs and competitiveness with structure-matched DNNs on tabular regression, classification, and spatio-temporal BSDE/PDE tasks.
Significance. If the empirical gains hold under rigorous controls, the work would demonstrate a practical route to scaling VQCs without exponential parameter growth or barren-plateau issues, offering a hybrid quantum-classical model with substantially lower parameter counts than matched DNNs. The linear-scaling claim and explicit comparison to DNN baselines are the strongest potential contributions, provided the mixing rules demonstrably propagate cross-block correlations.
major comments (3)
- [§3.2, Block-Mixing Rules] The deterministic mixing (fixed permutations, summations, or tensor contractions) is presented as sufficient to recover data-dependent cross-block correlations without extra trainable parameters. No proof or ablation is given showing that this static wiring preserves expressivity for tasks with non-local dependencies (e.g., BSDE/PDE); if the mixing is data-independent, the overall circuit may reduce to disconnected local blocks, undermining both the performance claim over monolithic VQCs and the linear-scaling advantage.
- [§4, Experimental Setup] The abstract and results claim competitive or superior performance versus DNN baselines with far fewer parameters, yet no full specification of baseline architectures, hyperparameter matching, error bars, data exclusion criteria, or statistical tests is provided. Without these, the central empirical claim cannot be evaluated and the “substantially fewer trainable parameters” comparison remains unverified.
- [Tables 2-3, Performance Metrics] Reported improvements lack standard deviations across runs and any analysis of variance; for the BSDE/PDE tasks the reported gains could be within noise, weakening the claim that FC-VQC is competitive with DNNs while using a parameter count that scales only linearly.
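The kind of reporting the last comment asks for can be sketched with the standard library alone. The helper names below are hypothetical, and the inputs are meant to be per-run scores from independent seeds; Welch's t-test is used here as one reasonable choice for comparing two methods with unequal variances, not as the test the authors applied.

```python
import math
import statistics

def summarize(scores):
    # Mean and sample standard deviation across independent runs.
    return statistics.mean(scores), statistics.stdev(scores)

def welch_t(a, b):
    # Welch's t statistic and Welch-Satterthwaite degrees of freedom for
    # two samples that may have unequal variances and sizes.
    ma, mb = statistics.mean(a), statistics.mean(b)
    va, vb = statistics.variance(a), statistics.variance(b)
    sa, sb = va / len(a), vb / len(b)  # squared standard errors
    t = (ma - mb) / math.sqrt(sa + sb)
    df = (sa + sb) ** 2 / (sa ** 2 / (len(a) - 1) + sb ** 2 / (len(b) - 1))
    return t, df
```

The resulting `t` and `df` can be converted to a p-value with any Student-t distribution routine. A gain whose magnitude is small relative to the reported standard deviations should be described as within noise.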
minor comments (2)
- [Eq. (7)] Notation for the mixing operator (e.g., Eq. (7)) is introduced without an explicit definition of its action on the quantum state vector; a short matrix or circuit-diagram expansion would clarify the claim of locality.
- [Introduction] The manuscript cites prior VQC scaling work but omits direct comparison to recent tensor-network or hybrid quantum-classical approaches that also target linear parameter scaling.
Simulated Author's Rebuttal
We thank the referee for the constructive and detailed comments. We address each major point below, providing clarifications and committing to revisions that strengthen the empirical and theoretical support for FC-VQC.
Point-by-point responses
- Referee: [§3.2, Block-Mixing Rules] The deterministic mixing (fixed permutations, summations, or tensor contractions) is presented as sufficient to recover data-dependent cross-block correlations without extra trainable parameters. No proof or ablation is given showing that this static wiring preserves expressivity for tasks with non-local dependencies (e.g., BSDE/PDE); if the mixing is data-independent, the overall circuit may reduce to disconnected local blocks, undermining both the performance claim over monolithic VQCs and the linear-scaling advantage.
  Authors: The mixing rules are data-independent by design, but they operate across successive layers on quantum states that already encode data-dependent information from prior blocks. This layered propagation enables cross-block correlations to accumulate, analogous to how fixed-weight connections in classical fully-connected networks still permit rich feature interactions. We agree that the manuscript would benefit from stronger evidence and will add (i) an ablation comparing FC-VQC variants with and without inter-block mixing on the BSDE/PDE tasks and (ii) a concise theoretical argument showing that the multi-layer composition preserves the ability to represent non-local functions. These additions will demonstrate that the architecture does not collapse to disconnected local blocks. Revision: yes.
- Referee: [§4, Experimental Setup] The abstract and results claim competitive or superior performance versus DNN baselines with far fewer parameters, yet no full specification of baseline architectures, hyperparameter matching, error bars, data exclusion criteria, or statistical tests is provided. Without these, the central empirical claim cannot be evaluated and the “substantially fewer trainable parameters” comparison remains unverified.
  Authors: We accept that §4 currently lacks sufficient detail for independent verification. In the revised manuscript we will expand the experimental setup to report: (a) exact DNN architectures (layer widths, depths, activations, and initialization), (b) the hyperparameter search protocol and final values used for both FC-VQC and DNNs, (c) the number of independent runs and how error bars were computed, (d) data preprocessing, splitting, and any exclusion criteria, and (e) the statistical tests applied to compare methods. These additions will make the parameter-efficiency and performance claims fully reproducible. Revision: yes.
- Referee: [Tables 2-3, Performance Metrics] Reported improvements lack standard deviations across runs and any analysis of variance; for the BSDE/PDE tasks the reported gains could be within noise, weakening the claim that FC-VQC is competitive with DNNs while using a parameter count that scales only linearly.
  Authors: We agree that standard deviations and variance analysis are essential. We will revise Tables 2 and 3 to include standard deviations computed over the multiple independent runs already performed. We will also add a short discussion of observed variance, with particular attention to the BSDE/PDE tasks, and will qualify performance claims where differences fall within statistical noise. If appropriate, we will report p-values or confidence intervals to support the competitiveness statement. Revision: yes.
Circularity Check
No significant circularity; claims rest on empirical evaluation
Full rationale
The paper introduces FC-VQC as a modular architecture with deterministic block-mixing and reports performance gains via direct experimental comparisons against monolithic VQCs and DNN baselines. No derivation chain is presented that reduces a claimed result to its own inputs by construction, self-citation, or fitted-parameter renaming. The linear scaling and expressivity claims are architectural design choices validated empirically rather than derived from equations that presuppose the target outcome. This is the standard non-circular case for an empirical proposal paper.
Axiom & Free-Parameter Ledger
axioms (1)
- standard math: Standard assumptions underlying variational quantum circuits and quantum state evolution.
Lean theorems connected to this paper
- IndisputableMonolith/Foundation/AlexanderDuality.lean, theorem alexander_duality_circle_linking
  Relation between the paper passage and the cited Recognition theorem: unclear
  Paper passage: deterministic block-mixing rules... sliding-window... ring topology... fully-connected block mixing... parallel block mixing (Eqs. 9-11, Theorems 4.2-4.3)
- IndisputableMonolith/Cost/FunctionalEquation.lean, theorem washburn_uniqueness_aczel
  Relation between the paper passage and the cited Recognition theorem: unclear
  Paper passage: linear scalability O(d)... parameter count scales linearly with number of blocks
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.