pith. machine review for the scientific record.

arxiv: 2604.09957 · v1 · submitted 2026-04-10 · 🪐 quant-ph

Recognition: unknown

Mitigating Barren Plateaus in Variational Quantum Circuits through PDE-Constrained Loss Functions


Pith reviewed 2026-05-10 16:36 UTC · model grok-4.3

classification 🪐 quant-ph
keywords variational quantum circuits · barren plateaus · PDE constraints · gradient variance · physics-informed quantum models · quantum simulation

The pith

Embedding PDE constraints into variational quantum circuit loss functions prevents exponential gradient vanishing.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper establishes that physics-informed loss functions built from local partial differential equation residuals at collocation points keep gradient variance from decaying exponentially with system size. These losses inherit the polynomial scaling behavior of local cost functions while the constraints themselves narrow the optimization landscape and concentrate useful gradient directions. Numerical tests on the heat equation, Burgers equation, and shallow-water equations across 4-8 qubits and shallow circuits confirm higher gradient variance, sub-maximal entanglement, and faster convergence compared with global or unconstrained costs.

Core claim

Physics-constrained loss functions composed of local PDE residuals evaluated at spatial collocation points inherit the favorable polynomial scaling of local cost functions while benefiting from constraint-induced landscape narrowing that concentrates gradient information, thereby mitigating barren plateaus in variational quantum circuits.

What carries the argument

The PDE-constrained loss, formed by summing squared residuals of the governing partial differential equation at chosen spatial collocation points, which replaces or augments the usual observable-based cost to enforce physical consistency during training.
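
A minimal sketch of such a loss, under stated assumptions: the 1D heat equation u_t = ν u_xx stands in for the governing PDE, an ordinary Python function `u(x, t)` stands in for the circuit output, and derivatives are taken by central finite differences. The names `heat_residual` and `pde_loss`, the diffusivity `nu`, and the step `h` are illustrative and do not come from the paper, which may map circuit outputs to derivatives differently.

```python
import numpy as np

def heat_residual(u, x, t, nu=0.1, h=1e-3):
    """Residual r = u_t - nu * u_xx at points (x, t), via central differences."""
    u_t = (u(x, t + h) - u(x, t - h)) / (2.0 * h)
    u_xx = (u(x + h, t) - 2.0 * u(x, t) + u(x - h, t)) / h**2
    return u_t - nu * u_xx

def pde_loss(u, xs, t, nu=0.1):
    """Sum of squared residuals over the spatial collocation points xs."""
    r = heat_residual(u, xs, t, nu)
    return float(np.sum(r**2))

nu = 0.1
xs = np.linspace(0.1, 0.9, 8)  # 8 interior collocation points (arbitrary choice)

def exact(x, t):
    """Exact solution of u_t = nu * u_xx on [0, 1] with zero boundary values."""
    return np.exp(-nu * np.pi**2 * t) * np.sin(np.pi * x)

def wrong(x, t):
    """Candidate with the wrong decay rate, so it violates the PDE."""
    return np.exp(-3.0 * t) * np.sin(np.pi * x)

loss_exact = pde_loss(exact, xs, t=0.5, nu=nu)
loss_wrong = pde_loss(wrong, xs, t=0.5, nu=nu)
print(loss_exact, loss_wrong)
```

The loss is close to zero for the exact solution and clearly positive for the candidate that breaks the PDE, which is the property a residual-based cost relies on. In a variational setting, `u` would be replaced by a parameterized model and the loss minimized over its parameters.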

If this is right

  • Gradient variance decays at most polynomially, rather than exponentially, as qubit number increases.
  • Constraint-induced landscape narrowing produces stable training even at moderate circuit depths.
  • Structured ansatze stay in a sub-maximal entanglement regime that preserves trainability.
  • Physics-constrained models reach lower loss values in fewer epochs than global-cost baselines.
  • The approach extends naturally to variational quantum simulation of physical systems.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Similar constraint-based losses could be constructed for other differential or algebraic equations beyond the tested PDEs.
  • The method may reduce the need for specialized ansatz designs or initialization tricks when scaling to larger quantum systems.
  • Hybrid quantum-classical workflows could systematically incorporate domain knowledge through the loss rather than through the circuit architecture alone.

Load-bearing premise

That PDE residuals evaluated at collocation points can be embedded into the variational quantum circuit loss without introducing new gradient-vanishing mechanisms or prohibitive computational overhead.

What would settle it

Direct measurement of gradient variance scaling for PDE-constrained circuits on 20+ qubit systems; exponential decay would falsify the polynomial lower bound.
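
One hedged way to read such a measurement is to compare two straight-line fits: log-variance against n (exponential decay) and log-variance against log n (power-law decay). The sketch below uses synthetic placeholder curves, not data from the paper, and the helper name `compare_decay_models` is hypothetical. With only a handful of qubit counts the two models can be hard to separate, so this is a diagnostic rather than a proof.

```python
import numpy as np

def compare_decay_models(n_qubits, variances):
    """Compare exponential and power-law fits to gradient variance vs. qubit count."""
    n = np.asarray(n_qubits, dtype=float)
    log_var = np.log(np.asarray(variances, dtype=float))

    def sse(x):
        # Sum of squared errors of a least-squares line through (x, log_var).
        coeffs = np.polyfit(x, log_var, 1)
        return float(np.sum((np.polyval(coeffs, x) - log_var) ** 2))

    sse_exp = sse(n)            # log Var linear in n     -> exponential decay
    sse_poly = sse(np.log(n))   # log Var linear in log n -> power-law decay
    better = "exponential" if sse_exp < sse_poly else "polynomial"
    return {"exponential": sse_exp, "polynomial": sse_poly, "better_fit": better}

# Synthetic placeholder curves (NOT measurements from the paper).
ns = np.arange(4, 25, 4)          # 4, 8, ..., 24 qubits
synthetic_exp = 2.0 ** (-ns)      # variance halves with every added qubit
synthetic_poly = ns ** (-2.0)     # variance falls as 1 / n^2

result_exp = compare_decay_models(ns, synthetic_exp)
result_poly = compare_decay_models(ns, synthetic_poly)
print(result_exp["better_fit"], result_poly["better_fit"])
```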

Figures

Figures reproduced from arXiv: 2604.09957 by Midhun Chakkravarthy, Prasad Nimantha Madusanka Ukwatta Hewage, Ruvan Kumara Abeysekara.

Figure 1. Gradient variance scaling with qubit count …
Figure 2. Gradient variance scaling with circuit depth …
Figure 3. Gradient variance comparison across PDE constraint types …
Figure 4. Bipartite entanglement entropy for nearest-neighbor (left) and all-to-all (right) entangling …
Figure 5. Training convergence comparison …
Figure 6. Distribution of per-parameter gradient variance at …

The barren plateau phenomenon, where cost function gradients vanish exponentially with system size, remains a fundamental obstacle to training variational quantum circuits (VQCs) at scale. We demonstrate, both theoretically and numerically, that embedding partial differential equation (PDE) constraints into the VQC loss function provides a natural and effective mitigation mechanism against barren plateaus. We derive analytical gradient variance lower bounds showing that physics-constrained loss functions composed of local PDE residuals evaluated at spatial collocation points inherit the favorable polynomial scaling of local cost functions, while additionally benefiting from constraint-induced landscape narrowing that concentrates gradient information. Systematic numerical experiments on the one-dimensional heat equation, Burgers' equation, and the Saint-Venant shallow water equations quantify the gradient variance across 4-8 qubits and 1-5 layer depths, comparing global cost, local cost, PDE-constrained, and PDE-constrained with structured ansatz configurations. We find that PDE-constrained circuits exhibit favorable gradient variance scaling with system size, with the physics constraints creating a stabilizing effect that resists exponential gradient vanishing. Entanglement entropy analysis reveals that structured ansatze operate in a sub-maximal entanglement regime consistent with trainability. Convergence experiments confirm that physics-constrained VQCs achieve lower loss values in fewer epochs. These results establish PDE constraints as a principled, physically motivated strategy for designing trainable variational quantum circuits, with direct implications for quantum physics-informed neural networks and variational quantum simulation.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The manuscript claims that embedding PDE constraints into the loss function of variational quantum circuits (VQCs) mitigates barren plateaus. It derives analytical lower bounds on gradient variance showing that physics-constrained losses composed of local PDE residuals at collocation points inherit the polynomial scaling of local cost functions while gaining additional landscape narrowing. Numerical experiments on the 1D heat equation, Burgers' equation, and Saint-Venant equations with 4-8 qubits and 1-5 layer depths compare global, local, PDE-constrained, and structured-ansatz configurations, reporting favorable variance scaling, sub-maximal entanglement, and faster convergence.

Significance. If the central claims hold, the work supplies a physically motivated route to trainable VQCs for quantum physics-informed neural networks and variational simulation. The combination of analytical bounds with systematic numerics across multiple PDEs and depths, plus entanglement entropy analysis, is a concrete strength. The approach extends prior local-cost results without requiring new hardware assumptions.

major comments (2)
  1. [§3.2] §3.2 (Analytical Gradient Variance Bounds): The lower-bound derivation for Var(∂L/∂θ_j) where L = Σ_i r_i(θ, x_i)^2 treats the sum of individual Var(∂r_i/∂θ_j) terms as the dominant contribution that inherits polynomial scaling. The full variance expression necessarily includes all pairwise Cov(∂r_i/∂θ_j, ∂r_k/∂θ_j) for i ≠ k. No separate estimate or sign control is supplied for these cross terms, which share the same variational parameters through the VQC; when the number of collocation points grows or the PDE couples the residuals, the bound does not rigorously guarantee retention of polynomial scaling.
  2. [§4.3] §4.3 (Numerical Gradient Variance Results): The reported variance values for PDE-constrained losses are shown only for fixed collocation-point counts (implicit in the 1D PDE discretizations). No scaling plot or table entry varies the number of points while holding qubit count fixed, leaving open whether the observed polynomial behavior survives the covariance contributions that become more numerous with finer spatial resolution.
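
The structure behind both major comments can be written out explicitly. The notation g_i (the contribution of collocation point i to the gradient) and N (the number of collocation points) is introduced here for illustration and is not taken from the manuscript.

```latex
\begin{aligned}
g_i &:= \frac{\partial\, r_i^2}{\partial \theta_j} = 2\, r_i\, \frac{\partial r_i}{\partial \theta_j},
\qquad \frac{\partial L}{\partial \theta_j} = \sum_{i=1}^{N} g_i, \\
\operatorname{Var}\!\left(\frac{\partial L}{\partial \theta_j}\right)
&= \sum_{i=1}^{N} \operatorname{Var}(g_i) + 2 \sum_{i<k} \operatorname{Cov}(g_i, g_k), \\
\left|\operatorname{Cov}(g_i, g_k)\right| &\le \sqrt{\operatorname{Var}(g_i)\,\operatorname{Var}(g_k)} .
\end{aligned}
```

The diagonal sum is a valid lower bound on the full variance only when the cross-term sum is non-negative. The Cauchy-Schwarz inequality limits the magnitude of each cross term but says nothing about its sign, and the number of cross terms grows as N(N − 1)/2.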
minor comments (2)
  1. [Figure 3] Figure 3 caption: the distinction between 'PDE-constrained' and 'PDE-constrained + structured ansatz' curves is not stated explicitly in the legend or caption text.
  2. [§2.2] Notation: the symbol for the PDE residual r_i is introduced without an explicit definition of how the VQC output is mapped to the spatial derivative operators appearing in the residual (e.g., via finite-difference or automatic differentiation on the quantum state).

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the careful and constructive review of our manuscript. The comments raise important points about the rigor of our analytical bounds and the comprehensiveness of our numerical validation. We respond to each major comment below and indicate the revisions we will make.

  1. Referee: §3.2 (Analytical Gradient Variance Bounds): The lower-bound derivation for Var(∂L/∂θ_j) where L = Σ_i r_i(θ, x_i)^2 treats the sum of individual Var(∂r_i/∂θ_j) terms as the dominant contribution that inherits polynomial scaling. The full variance expression necessarily includes all pairwise Cov(∂r_i/∂θ_j, ∂r_k/∂θ_j) for i ≠ k. No separate estimate or sign control is supplied for these cross terms, which share the same variational parameters through the VQC; when the number of collocation points grows or the PDE couples the residuals, the bound does not rigorously guarantee retention of polynomial scaling.

    Authors: The referee correctly identifies that the full variance of the gradient includes covariance terms between the contributions from different residuals. Our derivation in §3.2 establishes the lower bound by emphasizing the local character of each PDE residual r_i, which individually follows the polynomial scaling known for local cost functions. However, we did not provide an explicit estimate or sign control for the cross-covariances. We will revise §3.2 to include a dedicated discussion of these terms, showing under the locality of collocation points and boundedness of the PDE residuals that the covariances remain controlled and do not cancel the polynomial lower bound. This will make the guarantee rigorous. revision: yes

  2. Referee: §4.3 (Numerical Gradient Variance Results): The reported variance values for PDE-constrained losses are shown only for fixed collocation-point counts (implicit in the 1D PDE discretizations). No scaling plot or table entry varies the number of points while holding qubit count fixed, leaving open whether the observed polynomial behavior survives the covariance contributions that become more numerous with finer spatial resolution.

    Authors: We agree that varying the collocation-point count while holding qubit number fixed is necessary to confirm robustness against growing covariance contributions. The current experiments employ fixed, discretization-appropriate point counts for each PDE. In the revised manuscript we will add new numerical results in §4.3 that explicitly vary the number of collocation points (e.g., for 6-qubit circuits) and report the corresponding gradient-variance scaling, thereby directly testing whether the polynomial behavior persists. revision: yes
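
A sketch of the diagnostic such an experiment could report: given sampled per-point gradient contributions, split the variance of their sum into the diagonal part and the cross-covariance part. The function name `variance_decomposition` and the synthetic samples are assumptions made for illustration; nothing here is data or code from the paper.

```python
import numpy as np

def variance_decomposition(g):
    """Split Var(sum_i g_i) into diagonal and cross-covariance parts.

    g[k, i] is the contribution of collocation point i to dL/dtheta_j
    at random initialization k (K rows, N >= 2 columns).
    """
    cov = np.cov(g, rowvar=False)                  # N x N sample covariance
    diag_sum = float(np.trace(cov))                # sum_i Var(g_i)
    cross_sum = float(cov.sum() - np.trace(cov))   # sum over i != k of Cov(g_i, g_k)
    total = float(np.var(g.sum(axis=1), ddof=1))   # Var(sum_i g_i)
    return diag_sum, cross_sum, total

# Synthetic placeholder samples (NOT data from the paper).
rng = np.random.default_rng(0)
K, N = 200, 6
shared = rng.normal(size=(K, 1))        # component common to all collocation points
noise = 0.1 * rng.normal(size=(K, N))   # independent per-point component

aligned = shared + noise                                # contributions move together
alternating = (-1.0) ** np.arange(N) * shared + noise   # neighbours pull in opposite directions

diag_a, cross_a, total_a = variance_decomposition(aligned)
diag_o, cross_o, total_o = variance_decomposition(alternating)
print(diag_a, cross_a, total_a)
print(diag_o, cross_o, total_o)
```

In the aligned case the cross terms add to the diagonal sum. In the alternating case they are negative and cancel most of it, which is the scenario the referee's first comment asks the authors to rule out.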

Circularity Check

0 steps flagged

No significant circularity detected in the derivation chain


The paper claims to derive new analytical gradient variance lower bounds for PDE-constrained losses, asserting that these inherit polynomial scaling from local cost functions while adding constraint-induced narrowing. This is framed as an independent derivation rather than a direct reduction to inputs by construction. No load-bearing self-citations, self-definitional loops, or fitted parameters renamed as predictions appear in the abstract or described chain. The numerical experiments on specific PDEs and entanglement analysis provide separate content. The central claim does not reduce to its own assumptions or prior results by definition; any inheritance is presented as shown via new bounds, not assumed.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The central claim rests on extending known gradient-variance results for local observables and on the domain assumption that PDE residuals can be evaluated locally via circuit outputs; no free parameters or new entities are introduced in the abstract.

axioms (2)
  • standard math: Gradient variance for local cost functions in shallow variational quantum circuits decays at most polynomially with system size
    The paper explicitly builds its lower bounds on this established property of local costs.
  • domain assumption: PDE residuals can be computed from VQC outputs at a finite set of spatial collocation points
    Required for embedding the constraints into the loss without additional quantum resources.
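
The first axiom can be probed at toy scale with the sampling protocol excerpted in the reference graph (K = 25 initializations drawn from Uniform[0, 2π), per-parameter variance, then the mean). The sketch below is a NumPy statevector simulation under assumptions that are not from the paper: RX-RY rotation layers, a nearest-neighbour CZ chain, parameter-shift gradients, and the sample variance with `ddof=1`. All function names are hypothetical. A run at 4 qubits illustrates the procedure only and cannot confirm or refute any scaling claim.

```python
import numpy as np

def rx(t):
    c, s = np.cos(t / 2), np.sin(t / 2)
    return np.array([[c, -1j * s], [-1j * s, c]])

def ry(t):
    c, s = np.cos(t / 2), np.sin(t / 2)
    return np.array([[c, -s], [s, c]], dtype=complex)

def apply_1q(state, gate, q, n):
    """Apply a 2x2 gate to qubit q (axis q of the reshaped statevector)."""
    psi = np.tensordot(gate, state.reshape([2] * n), axes=([1], [q]))
    return np.moveaxis(psi, 0, q).reshape(-1)

def apply_cz(state, q1, q2, n):
    """Flip the sign of amplitudes where qubits q1 and q2 are both 1."""
    psi = state.reshape([2] * n).copy()
    idx = [slice(None)] * n
    idx[q1], idx[q2] = 1, 1
    psi[tuple(idx)] *= -1
    return psi.reshape(-1)

def z_signs(n, qubits):
    """Diagonal (+1/-1) of the tensor product of Z on the listed qubits."""
    basis = np.arange(2**n)
    signs = np.ones(2**n)
    for q in qubits:
        signs *= 1 - 2 * ((basis >> (n - 1 - q)) & 1)  # qubit 0 is the leading axis
    return signs

def cost(params, n, layers, signs):
    """Expectation of the Z-string after RX-RY layers with a CZ chain (assumed ansatz)."""
    state = np.zeros(2**n, dtype=complex)
    state[0] = 1.0
    p = params.reshape(layers, n, 2)
    for l in range(layers):
        for q in range(n):
            state = apply_1q(state, rx(p[l, q, 0]), q, n)
            state = apply_1q(state, ry(p[l, q, 1]), q, n)
        for q in range(n - 1):
            state = apply_cz(state, q, q + 1, n)
    return float(np.sum(signs * np.abs(state) ** 2))

def grad(params, n, layers, signs):
    """Parameter-shift gradient, exact for rotations generated by a Pauli operator."""
    g = np.zeros_like(params)
    for j in range(params.size):
        shift = np.zeros_like(params)
        shift[j] = np.pi / 2
        g[j] = 0.5 * (cost(params + shift, n, layers, signs)
                      - cost(params - shift, n, layers, signs))
    return g

def mean_gradient_variance(n, layers, qubits, K=25, seed=0):
    rng = np.random.default_rng(seed)
    signs = z_signs(n, qubits)
    grads = np.array([grad(rng.uniform(0, 2 * np.pi, 2 * n * layers), n, layers, signs)
                      for _ in range(K)])
    var_j = grads.var(axis=0, ddof=1)  # variance over the K initializations, per parameter
    return float(var_j.mean())         # mean over the p = 2nL parameters

n, layers = 4, 2
var_global = mean_gradient_variance(n, layers, qubits=range(n))  # <Z Z Z Z>
var_local = mean_gradient_variance(n, layers, qubits=[0])        # <Z> on the first qubit
print(var_global, var_local)
```

Repeating the last three lines over a range of `n` and `layers` gives the kind of variance-versus-size curve the review discusses, within the limits of what a classical simulation can reach.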

pith-pipeline@v0.9.0 · 5575 in / 1389 out tokens · 71307 ms · 2026-05-10T16:36:56.213542+00:00 · methodology


Reference graph

Works this paper leans on

43 extracted references · 7 canonical work pages · 3 internal anchors

  1. [1]

    Preserves information locality: entanglement between qubits i and j grows with circuit depth, mimicking the causal cone structure of local PDEs

  2. [2]

    Avoids the fully random regime: the circuit does not form a 2-design for shallow depths, maintaining the conditions for Eq. (8)

  3. [3]

    PDE + structured

    Minimizes gate count: G = L(3n − 1) versus G = L(n(n − 1)/2 + 2n) for all-to-all entanglement. The combination of PDE-constrained loss with structured ansatz constitutes our proposed “PDE + structured” configuration, which we hypothesize provides the strongest barren plateau mitigation. V. NUMERICAL EXPERIMENTS. A. Experimental Setup. All quantum circuit simulation...

  4. [4]

    Sample K = 25 random parameter initializations ϕ^(k) ∼ Uniform[0, 2π)^(2nL)

  5. [5]

    Compute the gradient ∇_ϕ L(ϕ^(k)) for each initialization

  6. [6]

    Compute the per-parameter variance: Var_j = Var_k[∂L/∂ϕ_j^(k)]

  7. [7]

    gradient floor

    Report the mean: Var = (1/p) Σ_{j=1}^{p} Var_j. Loss function configurations. We compare four settings: • Global cost: L = ⟨⊗_k σ_Z^(k)⟩ with all-to-all CNOT entanglement. • Local cost: L = ⟨σ_Z^(1)⟩ with all-to-all CNOT entanglement. • PDE-constrained: Data + physics loss with all-to-all entanglement. • PDE + structured: Data + physics loss with nearest-neighbor entanglement...

  8. [8]

    Variational quantum algorithms,

    M. Cerezo, A. Arrasmith, R. Babbush, S. C. Benjamin, S. Endo, K. Fujii, J. R. McClean, K. Mitarai, X. Yuan, L. Cincio, and P. J. Coles, “Variational quantum algorithms,” Nat. Rev. Phys. 3, 625–644 (2021)

  9. [9]

    Noisy intermediate-scale quantum algorithms,

    K. Bharti, A. Cervera-Lierta, T. H. Kyaw, T. Haug, S. Alperin-Lea, A. Anand, M. Degroote, H. Heimonen, J. S. Kottmann, T. Menke, W.-K. Mok, S. Sim, L.-C. Kwek, and A. Aspuru-Guzik, “Noisy intermediate-scale quantum algorithms,” Rev. Mod. Phys. 94, 015004 (2022)

  10. [10]

    A variational eigenvalue solver on a photonic quantum processor,

    A. Peruzzo, J. McClean, P. Shadbolt, M.-H. Yung, X.-Q. Zhou, P. J. Love, A. Aspuru-Guzik, and J. L. O’Brien, “A variational eigenvalue solver on a photonic quantum processor,” Nat. Commun. 5, 4213 (2014)

  11. [11]

    A Quantum Approximate Optimization Algorithm

    E. Farhi, J. Goldstone, and S. Gutmann, “A quantum approximate optimization algorithm,” arXiv:1411.4028 (2014)

  12. [12]

    Machine Learning with Quantum Computers

    M. Schuld and F. Petruccione, Machine Learning with Quantum Computers, 2nd ed. (Springer, Cham, 2021)

  13. [13]

    Parameterized quantum circuits as machine learning models,

    M. Benedetti, E. Lloyd, S. Sack, and M. Fiorentini, “Parameterized quantum circuits as machine learning models,” Quantum Sci. Technol. 4, 043001 (2019)

  14. [14]

    Barren plateaus in quantum neural network training landscapes,

    J. R. McClean, S. Boixo, V. N. Smelyanskiy, R. Babbush, and H. Neven, “Barren plateaus in quantum neural network training landscapes,” Nat. Commun. 9, 4812 (2018)

  15. [15]

    Cost function dependent barren plateaus in shallow parametrized quantum circuits,

    M. Cerezo, A. Sone, T. Volkoff, L. Cincio, and P. J. Coles, “Cost function dependent barren plateaus in shallow parametrized quantum circuits,” Nat. Commun. 12, 1791 (2021)

  16. [16]

    Absence of barren plateaus in quantum convolutional neural networks,

    A. Pesah, M. Cerezo, S. Wang, T. Volkoff, A. T. Sornborger, and P. J. Coles, “Absence of barren plateaus in quantum convolutional neural networks,” Phys. Rev. X 11, 041011 (2021)

  17. [17]

    Connecting ansatz expressibility to gradient magnitudes and barren plateaus,

    Z. Holmes, K. Sharma, M. Cerezo, and P. J. Coles, “Connecting ansatz expressibility to gradient magnitudes and barren plateaus,” PRX Quantum 3, 010313 (2022)

  18. [18]

    Diagnosing barren plateaus with tools from quantum optimal control,

    M. Larocca, P. Czarnik, K. Sharma, G. Muraleedharan, P. J. Coles, and M. Cerezo, “Diagnosing barren plateaus with tools from quantum optimal control,” Quantum 6, 824 (2022)

  19. [19]

    A unified theory of barren plateaus for deep parametrized quantum circuits,

    M. Ragone, B. N. Bakalov, F. Sauvage, A. F. Kemper, C. Ortiz Marrero, M. Larocca, and M. Cerezo, “A unified theory of barren plateaus for deep parametrized quantum circuits,” Nat. Commun. 15, 7172 (2024)

  20. [20]

    An initialization strategy for addressing barren plateaus in parametrized quantum circuits,

    E. Grant, L. Wossnig, M. Ostaszewski, and M. Benedetti, “An initialization strategy for addressing barren plateaus in parametrized quantum circuits,” Quantum 3, 214 (2019)

  21. [21]

    Layerwise learning for quantum neural networks,

    A. Skolik, J. R. McClean, M. Mohseni, P. van der Smagt, and M. Leib, “Layerwise learning for quantum neural networks,” Quantum Mach. Intell. 3, 5 (2021)

  22. [22]

    Entanglement devised barren plateau mitigation,

    T. L. Patti, K. Najafi, X. Gao, and S. F. Yelin, “Entanglement devised barren plateau mitigation,” Phys. Rev. Research 3, 033090 (2021)

  23. [23]

    Avoiding barren plateaus using classical shadows,

    S. H. Sack, R. A. Medina, A. A. Michailidis, R. Kueng, and M. Cerezo, “Avoiding barren plateaus using classical shadows,” PRX Quantum 3, 020365 (2022)

  24. [24]

    Quantum-Enhanced Convergence of Physics-Informed Neural Networks

    N. Klement, V. Eyring, and M. Schwabe, “Explaining the advantage of quantum-enhanced physics-informed neural networks,” arXiv:2601.15046 (2026)

  25. [25]

    AQ-PINNs: Attention-enhanced quantum physics-informed neural networks for carbon-efficient climate modeling,

    S. Dutta, N. Innan, S. Ben Yahia, and M. Shafique, “AQ-PINNs: Attention-enhanced quantum physics-informed neural networks for carbon-efficient climate modeling,” arXiv:2409.01626 (2024)

  26. [26]

    Pattnaik et al

    O. Kyriienko et al., “QCPINN: Quantum-classical physics-informed neural networks for solving PDEs,” arXiv:2503.16678 (2025)

  27. [27]

    Hybrid quantum physics-informed neural network: Towards efficient learning of high-speed flows,

    S. A. Stein et al., “Hybrid quantum physics-informed neural network: Towards efficient learning of high-speed flows,” arXiv:2503.02202 (2025)

  28. [28]

    Quantum physics-informed neural networks,

    I. García-Barrenechea, S. Borràs, and J. Latorre, “Quantum physics-informed neural networks,” Entropy 26, 649 (2024)

  29. [29]

    Trainable embedding quantum physics informed neural networks,

    H. Tezuka et al., “Trainable embedding quantum physics informed neural networks,” Sci. Rep. 15, 3894 (2025)

  30. [30]

    Hardware-efficient variational quantum eigensolver for small molecules and quantum magnets,

    A. Kandala, A. Mezzacapo, K. Temme, M. Takita, M. Brink, J. M. Chow, and J. M. Gambetta, “Hardware-efficient variational quantum eigensolver for small molecules and quantum magnets,” Nature 549, 242–246 (2017)

  31. [31]

    Quantum circuit learning,

    K. Mitarai, M. Negoro, M. Kitagawa, and K. Fujii, “Quantum circuit learning,” Phys. Rev. A 98, 032309 (2018)

  32. [32]

    Evaluating analytic gradients on quantum hardware,

    M. Schuld, V. Bergholm, C. Gogolin, J. Izaac, and N. Killoran, “Evaluating analytic gradients on quantum hardware,” Phys. Rev. A 99, 032331 (2019)

  33. [33]

    Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations,

    M. Raissi, P. Perdikaris, and G. E. Karniadakis, “Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations,” J. Comput. Phys. 378, 686–707 (2019)

  34. [34]

    PennyLane: Automatic differentiation of hybrid quantum-classical computations

    V. Bergholm et al., “PennyLane: Automatic differentiation of hybrid quantum-classical computations,” arXiv:1811.04968 (2018)

  35. [35]

    E. F. Toro, Shock-Capturing Methods for Free-Surface Shallow Flows (John Wiley & Sons, Chichester, 2001)

  36. [36]

    Hamiltonian variational ansatz without barren plateaus,

    C. Park and N. Killoran, “Hamiltonian variational ansatz without barren plateaus,” Quantum 8, 1239 (2024)

  37. [37]

    Colloquium: Area laws for the entanglement entropy,

    J. Eisert, M. Cramer, and M. B. Plenio, “Colloquium: Area laws for the entanglement entropy,” Rev. Mod. Phys. 82, 277–306 (2010)

  38. [38]

    Large gradients via correlation in random parameterized quantum circuits,

    T. Volkoff and P. J. Coles, “Large gradients via correlation in random parameterized quantum circuits,” Quantum Sci. Technol. 6, 025008 (2021)

  39. [39]

    Effect of barren plateaus on gradient-free optimization,

    A. Arrasmith, Z. Holmes, M. Cerezo, and P. J. Coles, “Effect of barren plateaus on gradient-free optimization,” Quantum 5, 558 (2021)

  40. [40]

    The power of quantum neural networks,

    A. Abbas, D. Sutter, C. Zoufal, A. Lucchi, A. Figalli, and S. Woerner, “The power of quantum neural networks,” Nat. Comput. Sci. 1, 403–409 (2021)

  41. [41]

    A review of barren plateaus in variational quantum computing

    M. Larocca, S. Thanasilp, S. Wang, K. Sharma, J. Biamonte, P. J. Coles, L. Cincio, J. R. McClean, Z. Holmes, and M. Cerezo, “A review of barren plateaus in variational quantum computing,” arXiv:2405.00781 (2024)

  42. [42]

    Supervised learning with quantum-enhanced feature spaces,

    V. Havlíček, A. D. Córcoles, K. Temme, A. W. Harrow, A. Kandala, J. M. Chow, and J. M. Gambetta, “Supervised learning with quantum-enhanced feature spaces,” Nature 567, 209–212 (2019)

  43. [43]

    Characterizing possible failure modes in physics-informed neural networks,

    A. S. Krishnapriyan, A. Gholami, S. Zhe, R. M. Kirby, and M. W. Mahoney, “Characterizing possible failure modes in physics-informed neural networks,” in Advances in Neural Information Processing Systems (NeurIPS) (2021), Vol. 34.
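
As a worked instance of the gate-count formulas quoted in entry 3 of the list above, take n = 8 qubits and L = 5 layers, the largest values in the ranges the abstract reports. The subscripted labels are introduced here for readability and are not the paper's notation.

```latex
\begin{aligned}
G_{\text{structured}} &= L(3n - 1) = 5\,(3 \cdot 8 - 1) = 115, \\
G_{\text{all-to-all}} &= L\!\left(\tfrac{n(n-1)}{2} + 2n\right) = 5\,(28 + 16) = 220 .
\end{aligned}
```

At this size the nearest-neighbor layout uses roughly half the gates, and the gap widens with n because the all-to-all count grows quadratically.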