Exponentially many initializations to avoid barren plateaus
Pith reviewed 2026-06-26 23:54 UTC · model grok-4.3
The pith
A first-moment diagnostic reveals exponentially many inequivalent initializations escape barren plateaus in quantum ansatze.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Our results show that one can generate exponentially many families of inequivalent initialization strategies. The first-moment framework supplies an operator-level diagnostic that determines when an initialization escapes the fully concentrated barren-plateau fixed point and that distinguishes the biases induced by different strategies. Many shifted, biased, and non-symmetric parameter distributions satisfy the diagnostic and therefore avoid concentration, yet these choices are not equivalent; numerics further indicate that first-moment-distinct initializations can converge to different minima.
What carries the argument
The first-moment framework: an operator-level diagnostic that checks whether the expected loss under a given parameter distribution stays away from the fully concentrated barren-plateau fixed point, thereby flagging when an initialization escapes it.
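As a rough illustration of what a first-moment check looks like, the sketch below uses a toy loss that is an assumption of this note and not the paper's operator-level construction: C(theta) = prod_i cos(theta_i), which is the expectation of Z on every qubit after independent single-qubit RY rotations of |0...0>. For independent Gaussian angles theta_i ~ Normal(mu, sigma^2), the identity E[cos(theta)] = cos(mu) * exp(-sigma^2 / 2) gives the expected loss in closed form. The function names and the specific widths and shifts are hypothetical.

```python
import math

def first_moment_cos(mu, sigma):
    """E[cos(theta)] for theta ~ Normal(mu, sigma^2)."""
    return math.cos(mu) * math.exp(-sigma ** 2 / 2)

def expected_loss(n, mu, sigma):
    """E[C] for the toy loss C = prod_i cos(theta_i) with n i.i.d. Gaussian angles.

    Independence lets the expectation factorize into n identical terms.
    """
    return first_moment_cos(mu, sigma) ** n

if __name__ == "__main__":
    n = 20
    width = 1.0 / math.sqrt(n)  # hypothetical width that shrinks with n
    # Uniform angles on [-pi, pi] have E[cos(theta)] = 0, so E[C] = 0 (the concentrated value).
    print("uniform          :", 0.0)
    # Fixed width sigma = 1: the first moment exp(-n/2) decays exponentially in n.
    print("wide Gaussian    :", expected_loss(n, 0.0, 1.0))
    # Width 1/sqrt(n): the first moment equals exp(-1/2) for every n.
    print("narrow Gaussian  :", expected_loss(n, 0.0, width))
    # Shifted narrow Gaussian: still non-vanishing, but a different bias.
    print("shifted Gaussian :", expected_loss(n, 0.1, width))
```

In this toy setting the narrow and shifted Gaussians both keep a non-vanishing first moment yet give different values, which mirrors the claim that distributions passing the diagnostic need not be equivalent. It says nothing about the variance of the loss.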
If this is right
- Barren-plateau avoidance is highly non-unique; many distinct distributions satisfy the escape condition.
- First-moment-distinct initializations are inequivalent and need not reach the same final solution quality.
- Different escaping initializations can be compared directly by the biases their distributions induce on the loss operator.
- Avoiding concentration via initialization replaces one scaling problem with the task of selecting among multiple distinct trainable regions.
Where Pith is reading between the lines
- The diagnostic could be applied to rank candidate initializations by how strongly they bias the loss toward particular solution classes.
- One could test whether the exponentially many families partition the space of possible minima into regions with systematically different properties such as depth or entanglement.
- Combining the first-moment test with a secondary criterion that favors one pocket over others might restore a unique preferred initialization for a given task.
Load-bearing premise
The first-moment diagnostic is sufficient to decide whether an initialization escapes concentration for the full loss landscape and the entire optimization trajectory.
What would settle it
A numerical experiment in which an initialization strategy that satisfies the first-moment diagnostic nevertheless produces loss values whose variance across random instances is exponentially small in the number of qubits.
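A minimal sketch of how such an experiment could be set up follows. It again uses the toy loss C(theta) = prod_i cos(theta_i), which is an assumption made here for illustration and not one of the paper's cost functions. The sampler names, the shot count, and the 1/sqrt(n) width are likewise hypothetical. For uniform angles the exact variance of this toy loss is 2^(-n), which gives a reference for the Monte Carlo estimate.

```python
import math
import random

def toy_loss(thetas):
    """Toy loss C(theta) = prod_i cos(theta_i); an illustrative assumption."""
    return math.prod(math.cos(t) for t in thetas)

def loss_variance(n, sampler, shots=20000, seed=0):
    """Monte Carlo estimate of Var[C] over `shots` random initializations of n angles."""
    rng = random.Random(seed)  # seeded so that repeated runs agree
    values = [toy_loss([sampler(rng, n) for _ in range(n)]) for _ in range(shots)]
    mean = sum(values) / shots
    return sum((v - mean) ** 2 for v in values) / shots

def uniform(rng, n):
    """Naive initialization: each angle uniform on [-pi, pi]."""
    return rng.uniform(-math.pi, math.pi)

def narrow_gaussian(rng, n):
    """Hypothetical initialization: zero-mean Gaussian with width 1/sqrt(n)."""
    return rng.gauss(0.0, 1.0 / math.sqrt(n))

if __name__ == "__main__":
    # Columns: n, variance under uniform angles, variance under the narrow Gaussian.
    for n in (2, 4, 8, 12):
        print(n, loss_variance(n, uniform), loss_variance(n, narrow_gaussian))
```

To use this as the test described above, one would replace `toy_loss` with the actual cost function and `narrow_gaussian` with an initialization that passes the first-moment diagnostic, then check whether the estimated variance decays exponentially as n grows.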
Original abstract
Barren plateaus are stated as an average-case phenomenon: pick an ansatz, initialize it naively, and concentration follows. This has led to the common view that a potential cure for barren plateaus is simply to initialize the parameters more carefully. Here we show that the situation is subtler. We introduce a first-moment framework that gives a simple operator-level diagnostic for when an initialization may escape the fully concentrated barren-plateau fixed point, and for comparing the biases induced by different initialization strategies. Our framework recovers several known initialization schemes such as identity and Gaussian initialization, but also shows that barren-plateau avoidance is highly non-unique. Indeed, many shifted, biased, and non-symmetric parameter distributions can avoid concentration, and these choices need not be equivalent. In fact, our results show that one can generate exponentially many families of inequivalent initialization strategies. Then, our numerics indicate that different first-moment-distinct initializations can lead to different attained minima, suggesting that avoiding barren plateaus via smart initializations can trade the exponential concentration problem for the challenge of selecting the right trainable pocket amongst many options.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript introduces a first-moment operator-level diagnostic for determining when parameter initializations escape the fully concentrated barren-plateau fixed point in variational quantum algorithms. It recovers known schemes such as identity and Gaussian initialization, shows that many other biased and non-symmetric distributions also avoid the fixed point, and concludes that exponentially many inequivalent families of initializations exist. Numerical experiments suggest that first-moment-distinct initializations can lead to different attained minima.
Significance. If the first-moment diagnostic is shown to guarantee escape from exponential variance concentration, the result would clarify that initialization-based barren-plateau avoidance is highly non-unique and that the problem reduces to selecting among multiple trainable regions. The explicit recovery of known schemes and the operator-level comparison of biases are useful contributions. No machine-checked proofs or open reproducible code are referenced.
major comments (1)
- [Abstract] Central claim on exponentially many families: the first-moment diagnostic is presented as sufficient to escape the fully concentrated barren-plateau fixed point. Barren plateaus, however, are defined by exponential decay of Var[C] with qubit number. A non-zero first-moment bias does not by itself prevent the loss from concentrating around the biased mean, and no derivation or numerical evidence is supplied showing that the full loss distribution escapes the concentrating regime.
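The gap between first and second moments raised in this comment can be written out explicitly. The notation below is introduced here for illustration and is not taken from the paper: D is the initialization distribution over parameters theta, and delta_{theta_0} is a point mass at a fixed parameter vector theta_0. The point-mass case is a deliberately extreme example showing that knowledge of the first moment alone places no lower bound on the variance.

```latex
% Variance of the loss over the initialization distribution D
\operatorname{Var}_{\theta \sim \mathcal{D}}[C(\theta)]
  = \mathbb{E}_{\theta \sim \mathcal{D}}\!\left[ C(\theta)^2 \right]
  - \left( \mathbb{E}_{\theta \sim \mathcal{D}}[C(\theta)] \right)^2 .

% Extreme case: a point mass at \theta_0 has a first moment that can be
% non-zero, while the variance vanishes identically.
\mathcal{D} = \delta_{\theta_0}
  \quad\Longrightarrow\quad
  \mathbb{E}_{\mathcal{D}}[C] = C(\theta_0),
  \qquad
  \operatorname{Var}_{\mathcal{D}}[C] = 0 .
```

Any argument that an initialization avoids variance concentration therefore needs information about the second moment in addition to the first.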
minor comments (1)
- [Numerics] The numerical results are summarized without error bars, dataset sizes, or explicit description of the cost-function instances, which weakens the claim that different initializations reach distinct minima.
Simulated Author's Rebuttal
We thank the referee for their careful reading and for identifying a key point of clarification needed in the abstract. We address the major comment below.
Point-by-point responses
- Referee: [Abstract] Central claim on exponentially many families: the first-moment diagnostic is presented as sufficient to escape the fully concentrated barren-plateau fixed point. Barren plateaus, however, are defined by exponential decay of Var[C] with qubit number. A non-zero first-moment bias does not by itself prevent the loss from concentrating around the biased mean, and no derivation or numerical evidence is supplied showing that the full loss distribution escapes the concentrating regime.
Authors: We agree that a non-zero first-moment bias alone does not guarantee that the variance of the loss will remain non-concentrated. The manuscript introduces the first-moment diagnostic specifically to detect escape from the fully concentrated fixed point (i.e., the operator-level bias is non-vanishing), and shows that this condition is satisfied by exponentially many inequivalent families, including but not limited to the known identity-initialization scheme that is already established to avoid barren plateaus. However, we do not claim or derive that every first-moment-nonzero distribution necessarily escapes exponential variance concentration; the abstract's phrasing that these initializations "avoid barren plateaus" is therefore imprecise. We will revise the abstract and the relevant discussion sections to state that the diagnostic identifies families that escape the zero-bias fully concentrated fixed point (a necessary condition), while noting that second-moment analysis would be required to confirm variance non-concentration in general. This is a clarification rather than a change to the core technical results on the first-moment operator comparison and the exponential count of families.
Revision: partial
Circularity Check
No circularity: first-moment diagnostic is an independent operator framework
The paper introduces a new first-moment operator diagnostic to identify initializations that escape the concentrated barren-plateau fixed point, recovers known schemes such as identity and Gaussian initialization, and derives the existence of exponentially many inequivalent families directly from the framework's definitions and operator-level comparisons. No step reduces a claimed prediction or uniqueness result to a fitted parameter, self-referential definition, or load-bearing self-citation chain; the inequivalence and non-uniqueness claims are consequences of the diagnostic rather than tautological renamings or imported ansatzes. The derivation remains self-contained against external benchmarks for barren-plateau variance concentration.
Axiom & Free-Parameter Ledger
axioms (1)
- Domain assumption: concentration of the loss is governed by its first moment under the chosen initialization distribution.