Large Language Models Can Help Mitigate Barren Plateaus in Quantum Neural Networks

Chaowen Guan; Jun Zhuang

arXiv: 2502.13166 · v3 · submitted 2025-02-17 · 🪐 quant-ph · cs.AI · cs.CL · cs.LG

Pith reviewed 2026-05-23 03:17 UTC · model grok-4.3

classification 🪐 quant-ph · cs.AI · cs.CL · cs.LG

keywords barren plateaus · quantum neural networks · large language models · parameter initialization · submartingale property · gradient variance · NISQ · AdaInit

The pith

Large language models can iteratively synthesize initial parameters for quantum neural networks that maintain non-negligible gradient variance.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces a method to address barren plateaus in quantum neural networks, where gradients become too small to train effectively as the qubit count increases. It proposes using large language models to generate starting parameters in an adaptive loop that incorporates data features and gradient signals. The approach relies on the submartingale property to argue that the process improves parameter quality over iterations and converges to usable initial parameters. This matters for NISQ-era quantum models because static random initializations often fail on larger systems, while this method claims to keep training signals alive. Unlike one-shot initialization schemes, it adjusts dynamically to the model and dataset at hand.

Core claim

AdaInit leverages large language models with the submartingale property to iteratively synthesize initial parameters for QNNs that yield non-negligible gradient variance, thereby mitigating BPs, with theoretical guarantees of convergence and empirical outperformance across various QNN scales.

What carries the argument

AdaInit framework that uses LLMs guided by the submartingale property to adaptively explore the parameter space while incorporating dataset characteristics and gradient feedback.
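The shape of such a feedback loop can be sketched in a few lines. This is only an illustration under loud assumptions: `propose` is a hypothetical random stand-in for the LLM call, `toy_grad_variance` is a hypothetical stand-in score on a toy cost (not the paper's QNN or its training step), and the accept-if-better rule is assumed rather than taken from the paper.

```python
import numpy as np

def toy_grad_variance(theta0, rng, n_probe=200, jitter=0.1):
    """Stand-in score (assumption): variance of dE/dtheta_1 for the toy cost
    E(theta) = prod_i cos(theta_i), sampled in a small neighbourhood of theta0."""
    theta = theta0 + jitter * rng.standard_normal((n_probe, theta0.size))
    grad_first = -np.sin(theta[:, 0]) * np.prod(np.cos(theta[:, 1:]), axis=1)
    return float(np.var(grad_first))

def propose(best, rng, n_params, scale=0.5):
    """Stand-in for the LLM call (assumption): a random proposal,
    centred on the best candidate found so far once one exists."""
    if best is None:
        return rng.uniform(-np.pi, np.pi, size=n_params)
    return best + scale * rng.standard_normal(n_params)

def adaptive_init(n_params=6, n_iters=50, seed=0):
    """Iterate: propose -> score -> compare with historical maximum -> update."""
    rng = np.random.default_rng(seed)
    best, s_max, trace = None, 0.0, []
    for _ in range(n_iters):
        theta0 = propose(best, rng, n_params)
        var = toy_grad_variance(theta0, rng)
        improvement = max(var - s_max, 0.0)  # gain over the historical maximum
        if improvement > 0.0:                # assumed acceptance rule
            best, s_max = theta0, var
        trace.append(s_max)                  # historical maximum after this step
    return best, s_max, trace

if __name__ == "__main__":
    best, s_max, trace = adaptive_init()
    print("best score found:", s_max)
```

In this sketch the recorded historical maximum can never decrease, which is the kind of monotone behaviour a submartingale argument would formalize. Whether an LLM-driven proposal step behaves this way on a real QNN is exactly what the paper has to establish.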

If this is right

Training of quantum neural networks can proceed on larger qubit counts without the gradient signal disappearing.
Parameter initialization shifts from fixed distributions chosen in advance to an adaptive loop responsive to the specific data and model.
The submartingale property supplies a convergence proof that the iterative refinement improves the chance of finding effective starting points.
Empirical comparisons demonstrate higher maintained gradient variance than conventional static initialization techniques.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Classical language models might serve as search oracles for other quantum optimization problems beyond initialization.
Hybrid quantum-classical pipelines could incorporate LLMs to compensate for training instabilities in near-term hardware.
The same iterative prompting idea may extend to circuit design or ansatz selection tasks where gradient information is also sparse.

Load-bearing premise

Large language models can be prompted to generate parameter sets that satisfy the submartingale property and thereby produce non-negligible gradient variance, with the process converging for any QNN architecture or dataset.

What would settle it

Apply AdaInit to a QNN with twenty or more qubits on a standard benchmark dataset and measure whether the final gradient variance remains exponentially small, matching or underperforming standard random initialization.
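To make the measurement concrete, here is a minimal sketch of how gradient variance can be estimated as the qubit count grows. It assumes a toy product-state circuit (one RY rotation per qubit, measured with a global Z observable), for which the cost is E(θ) = ∏ cos(θ_i) and the variance of ∂E/∂θ_1 under uniform initialization is exactly 2^(-N). This is not the paper's backbone circuit or dataset, and the function name is hypothetical.

```python
import numpy as np

def grad_variance_uniform(n_qubits, n_samples=200_000, seed=0):
    """Monte Carlo estimate of Var[dE/dtheta_1] for the toy cost
    E(theta) = prod_i cos(theta_i), with theta_i ~ Uniform[-pi, pi]."""
    rng = np.random.default_rng(seed)
    theta = rng.uniform(-np.pi, np.pi, size=(n_samples, n_qubits))
    # dE/dtheta_1 = -sin(theta_1) * prod_{i>1} cos(theta_i)
    grad = -np.sin(theta[:, 0]) * np.prod(np.cos(theta[:, 1:]), axis=1)
    return float(np.var(grad))

if __name__ == "__main__":
    for n in (2, 4, 6, 8, 10):
        est = grad_variance_uniform(n)
        print(f"N={n:2d}  estimated={est:.3e}  exact={2.0 ** -n:.3e}")
```

A real test of the claim would replace the toy gradient with gradients from the actual QNN and compare AdaInit's initial parameters against the random baseline at each N.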

Figures

Figures reproduced from arXiv: 2502.13166 by Chaowen Guan, Jun Zhuang.

**Figure 1.** Example of BPs’ mitigation process. A plateau-dominated loss landscape (1st image), a.k.a. BPs, could be gradually recovered to the normal case (3rd image) after mitigation. In recent years, there have been significant advancements in quantum computing, particularly with the advent of noisy intermediate-scale quantum (NISQ) devices (Preskill, 2018). Within this research landscape, quantum neural networks…

**Figure 2.** Our proposed framework follows an iterative process over $T$ iterations (gray area). In the $t$-th iteration, we perform four sequential steps: (i) generate $\theta_0^{(t)}$ using a Gen AI model, $f(\cdot)$; (ii) compute $\mathrm{Var}[\partial E^{(t)}]$ after the QNN’s training; (iii) calculate the EI, $\Delta^{(t)}$; and (iv) update the prompts $x_p^{(t+1)}$, the historical maximum gradient variance $S^{(t)}$, and the effective candidates $\theta_0^{*}$ for the next iteration. Dashed arrow…

**Figure 3.** Analysis of gradient variance trends in the first element of …

**Figure 4.** Example of three classic distributions commonly used for initialization. In the figure, the red dots represent the initial values of the model parameters. Generating initial model parameters of QNNs using our framework can help mitigate BPs. We analyze gradient variance trends in the first element of QNNs’ model parameters across varying qubit and layer settings for three classic initialization distribut…

**Figure 5.** Analysis of prompts’ impact, i.e., investigate whether data description (desc.) and gradient …

**Figure 6.** Comparison between two strategies and our framework, which is initialized with the corresponding data distribution for a fair comparison. Comparison with initialization-based strategies. We compare our framework with two representative initialization-based strategies, GaInit (Zhang et al., 2022) and BeInit (Kulshrestha & Safro, 2022). Both of them leverage well-designed Normal and Beta distributions to ini…

**Figure 7.** Analysis of the sensitivity of hyperparameters, including Temperature and Top P. The grid …

**Figure 8.** Comparison of model parameter initializations using a classic method, random initializer (RI), and LLMs. All methods apply a uniform distribution. Contribution of LLMs. Besides comparing our framework with the classic method, we further investigate LLMs’ contribution to the initialization process on Iris. Specifically, we compare the generator within our framework when initialized using a random initial…

**Figure 9.** Assessment of the simulation time in QNN training. Simulation time in QNN training. We assess the simulation time of QNN training under varying model sizes (number of qubits, N ∈ [2, 20]) and subsampled MNIST dataset sizes (number of instances, |D| ∈ [800, 4000]). We train QNNs for 30 epochs and present the average runtime per epoch. When varying N, we fix the number of layers L at 2; when varying |D|, we…

**Figure 10.** Trade-off analysis between computational cost and performance benefits for 2 qubits (left) and 20 qubits (right) setups. Trade-off analysis. To analyze the trade-off in …

**Figure 11.** We analyze the patterns of expected improvement and the corresponding gradient variance …

**Figure 12.** Trade-off analysis of the assumed lower bound, comparing polynomial terms against the exponential baseline, shown on a log scale. Empirical analysis of the assumed lower bound. To determine the assumed lower bound, 1/(poly(N,L)K), we conduct a trade-off analysis. A larger polynomial coefficient enlarges the admissible regime, but at the cost of including cases with vanishingly small gradient variance, whe…

**Figure 13.** Architecture of our backbone quantum circuit. The number of rotation gates in this study is fixed as 3. Model architecture of the quantum circuit. In this study, we evaluate our framework using a backbone QNN consisting of a quantum circuit followed by a fully connected layer. Classical data are first encoded into quantum states via angle encoding, where each feature is mapped to rotation gates (e.g., RX)…

Original abstract

In the era of noisy intermediate-scale quantum (NISQ) computing, Quantum Neural Networks (QNNs) have emerged as a promising approach for various applications, yet their training is often hindered by barren plateaus (BPs), where gradient variance vanishes exponentially as the qubit size increases. Most initialization-based mitigation strategies rely heavily on pre-designed static parameter distributions, thereby lacking adaptability to diverse model sizes or data conditions. To address these limitations, we propose AdaInit, a foundational framework that leverages large language models with the submartingale property to iteratively synthesize initial parameters for QNNs that yield non-negligible gradient variance, thereby mitigating BPs. Unlike conventional one-shot initialization methods, AdaInit adaptively explores the parameter space by incorporating dataset characteristics and gradient feedback, with theoretical guarantees of convergence to finding a set of effective initial parameters for QNNs. We provide rigorous theoretical analyses of the submartingale-based process and empirically validate that AdaInit consistently outperforms existing initialization methods in maintaining higher gradient variance across various QNN scales. We believe this work may initiate a new avenue to mitigate BPs.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

AdaInit tries LLM-driven iterative initialization for QNNs using submartingales to keep gradients alive, but the mapping from the math to actual variance is not shown.

AdaInit is an adaptive initialization method for quantum neural networks that uses large language models in a loop, conditioned on data and gradients, to produce starting parameters whose gradient variance stays non-negligible. The submartingale property is meant to give convergence guarantees that static random draws lack. That is the core new piece: replacing one-shot distributions with an LLM-guided search that claims independence from architecture and dataset size.

The paper correctly identifies that most prior fixes are rigid and do not adjust to the loss landscape at hand. If the empirical side holds, the approach could let people run more QNN experiments on current hardware without immediate gradient collapse. The framing of the barren-plateau problem is straightforward and the motivation for adaptivity is reasonable.

The soft spot is the theoretical step that matters most. The abstract and stress-test note both leave unclear how the submartingale is defined on the gradient variance itself rather than an auxiliary score, and why LLM token sampling would preserve the inequality when the circuit or loss changes. No filtration or explicit random variable is visible in the provided description, so the claimed guarantee does not obviously transfer to the training barrier. The empirical claim of consistent outperformance is stated without numbers, scales, error bars, or dataset details, which makes it impossible to judge effect size or controls.

This paper is for quantum machine learning groups that already work on initialization and want to test LLM-assisted variants. A reader who needs a new practical trick might skim it for the idea, but would have to reconstruct the proofs and rerun the experiments to trust the results. It deserves peer review so that referees can check whether the submartingale argument actually lands on gradient variance and whether the reported gains survive scrutiny.

Referee Report

2 major / 2 minor

Summary. The manuscript proposes AdaInit, a framework that uses large language models (LLMs) with the submartingale property to iteratively synthesize initial parameters for Quantum Neural Networks (QNNs). The method incorporates dataset characteristics and gradient feedback to produce initializations yielding non-negligible gradient variance, thereby mitigating barren plateaus. It claims theoretical guarantees of convergence independent of QNN architecture or dataset, along with empirical outperformance over existing initialization strategies across various QNN scales.

Significance. If the claimed submartingale construction can be rigorously shown to guarantee non-vanishing gradient variance at initialization independently of architecture, the work would be significant as the first adaptive, LLM-driven approach to BP mitigation in QNNs. This could open a new research direction combining language models with quantum circuit optimization. The empirical validation of consistent outperformance is a potential strength, though its robustness cannot be assessed without the missing dataset and error-bar details.

major comments (2)

[Abstract] The claim of 'rigorous theoretical analyses of the submartingale-based process' and convergence 'independent of the specific QNN architecture or dataset' lacks any explicit filtration, definition of the controlled random variable, or proof sketch. It is therefore unclear whether the submartingale is defined directly on Var(∇L) or on an auxiliary score, preventing verification that the guarantee transfers to BP mitigation.
[Theoretical analysis section] The submartingale property is invoked to ensure non-negligible gradient variance via iterative LLM calls conditioned on gradient feedback, but no argument is given showing that LLM token sampling preserves the conditional-expectation inequality when the underlying QNN circuit depth, entanglement structure, or loss landscape changes. This mapping is load-bearing for the central claim of architecture-independent convergence.

minor comments (2)

The abstract provides no information on the specific QNN architectures, datasets, or number of runs used in the empirical validation, nor any error bars or statistical tests supporting the claim of consistent outperformance.
Notation for the submartingale (e.g., the process X_t and the filtration F_t) should be introduced explicitly when first mentioned to improve readability for readers outside the immediate subfield.
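For reference, the standard textbook definition that this comment asks the authors to state can be written as below. The symbols $X_t$ and $\mathcal{F}_t$ follow the notation suggested in the comment; how the paper ties $X_t$ to a gradient-variance quantity is for the authors to specify and is not assumed here.

```latex
% Standard definition of a submartingale (see e.g. Williams, 1991).
% (X_t) is a submartingale with respect to the filtration (F_t) if, for all t >= 0:
\begin{align*}
  &X_t \text{ is } \mathcal{F}_t\text{-measurable}, \qquad
   \mathbb{E}\,|X_t| < \infty, \\
  &\mathbb{E}\!\left[ X_{t+1} \mid \mathcal{F}_t \right] \;\ge\; X_t
   \quad \text{almost surely}.
\end{align*}
% Taking expectations on both sides gives a non-decreasing mean:
%   E[X_{t+1}] >= E[X_t].
```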

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive comments on our manuscript. We address each major comment below and will revise the paper accordingly to improve the clarity and completeness of the theoretical analysis.

Referee: [Abstract] The claim of 'rigorous theoretical analyses of the submartingale-based process' and convergence 'independent of the specific QNN architecture or dataset' lacks any explicit filtration, definition of the controlled random variable, or proof sketch. It is therefore unclear whether the submartingale is defined directly on Var(∇L) or on an auxiliary score, preventing verification that the guarantee transfers to BP mitigation.

Authors: We agree that the abstract is concise and omits the mathematical details. In the theoretical analysis, the submartingale is defined directly on the sequence of gradient variances Var(∇L), with the filtration given by the sigma-algebra generated by the history of LLM-generated parameter sets and observed gradient feedback up to iteration t. The controlled random variable is Var(∇L) itself. We will revise the abstract to explicitly reference these definitions and include a brief proof sketch in the theoretical section showing how the submartingale property implies non-vanishing variance with positive probability.
Revision: yes

Referee: [Theoretical analysis section] The submartingale property is invoked to ensure non-negligible gradient variance via iterative LLM calls conditioned on gradient feedback, but no argument is given showing that LLM token sampling preserves the conditional-expectation inequality when the underlying QNN circuit depth, entanglement structure, or loss landscape changes. This mapping is load-bearing for the central claim of architecture-independent convergence.

Authors: The submartingale inequality is preserved by construction because each LLM call is conditioned on the current gradient feedback from the specific QNN instance, and the prompt is designed to sample parameters whose expected variance is at least as large as the previous step. This feedback-driven adaptation makes the process independent of fixed circuit properties such as depth or entanglement. We acknowledge that an explicit invariance argument under changes to these properties is not fully elaborated. We will add a dedicated paragraph in the theoretical section deriving that the conditional expectation depends only on the feedback signal and not on the internal QNN structure.
Revision: partial
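The "preserved by construction" claim could, in principle, be probed empirically. The sketch below is a minimal diagnostic, assuming one has recorded the tracked quantity per iteration across several independent runs; the array layout and function names are hypothetical. It checks only that the mean increment is non-negative at every step, which is a necessary condition for the submartingale property and not a proof of it.

```python
import numpy as np

def mean_increments(traces):
    """traces: array-like of shape (n_runs, n_iters).
    Returns the mean of X_{t+1} - X_t across runs, one value per step."""
    traces = np.asarray(traces, dtype=float)
    return np.diff(traces, axis=1).mean(axis=0)

def passes_necessary_check(traces, tol=0.0):
    """True if no mean increment falls below -tol.
    Necessary, not sufficient, for the submartingale inequality."""
    return bool(np.all(mean_increments(traces) >= -tol))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    raw = rng.random((100, 20))                      # hypothetical per-iteration scores
    running_max = np.maximum.accumulate(raw, axis=1) # historical maximum per run
    print("running maximum passes:", passes_necessary_check(running_max))
    print("raw scores pass:", passes_necessary_check(raw))
```

A running maximum passes trivially because it never decreases. The harder question raised by the referee is whether the raw per-iteration gradient variance, rather than its historical maximum, satisfies the inequality.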

Circularity Check

0 steps flagged

No circularity: submartingale property invoked as external tool with claimed independent theoretical analysis

The provided abstract and description present AdaInit as using LLMs equipped with the submartingale property (an external mathematical construct) to generate initial parameters, followed by separate rigorous theoretical analyses of convergence. No equations, definitions, or claims reduce the non-negligible gradient variance result to a quantity defined from the output itself, a fitted parameter renamed as prediction, or a self-citation chain. The derivation chain is therefore self-contained against external benchmarks, with the submartingale serving as an independent premise rather than a constructed tautology.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Based solely on the abstract; no concrete free parameters, axioms, or invented entities are identifiable from the provided text.

Reference graph

Works this paper leans on

[1] Josh Achiam, Steven Adler, Sandhini Agarwal, Lama Ahmad, Ilge Akkaya, Florencia Leoni Aleman, Diogo Almeida, Janko Altenschmidt, Sam Altman, Shyamal Anadkat, et al. GPT-4 technical report. arXiv preprint arXiv:2303.08774, 2023.
[2] Anthropic. Introducing Claude 3.5 Sonnet, June 2024. URL https://www.anthropic.com/news/claude-3-5-sonnet. Accessed: 2025-02-15.
[3] Kishor Bharti and Tobias Haug. Quantum-assisted simulator. Physical Review A, 104(4):042418, 2021.
[4] Marco Cerezo, Akira Sone, Tyler Volkoff, Lukasz Cincio, and Patrick J Coles. Cost function dependent barren plateaus in shallow parametrized quantum circuits. Nature Communications, 12(1):1791, 2021.
[5] Jack Cunningham and Jun Zhuang. Investigating and mitigating barren plateaus in variational quantum circuits: A survey. arXiv preprint arXiv:2407.17706, 2024.
[6] Yuxuan Du, Tao Huang, Shan You, Min-Hsiu Hsieh, and Dacheng Tao. Quantum circuit architecture search for variational quantum algorithms. npj Quantum Information, 8(1):62, 2022.
[7] Edward Farhi, Jeffrey Goldstone, and Sam Gutmann. A quantum approximate optimization algorithm. arXiv preprint arXiv:1411.4028, 2014.
[8] Nic Freeman and Robin Stephenson. MAS352: Stochastic processes and financial mathematics (notes). https://nicfreeman1209.github.io/Website/MASx52/html/notes_1.html, 2025. Accessed: 2025-05-15.
[9] Edward Grant, Leonard Wossnig, Mateusz Ostaszewski, and Marcello Benedetti. An initialization strategy for addressing barren plateaus in parametrized quantum circuits. Quantum, 2019.
[10] Aaron Grattafiori, Abhimanyu Dubey, Abhinav Jauhri, Abhinav Pandey, Abhishek Kadian, Ahmad Al-Dahle, Aiesha Letman, Akhil Mathur, Alan Schelten, Alex Vaughan, et al. The Llama 3 herd of models. arXiv preprint arXiv:2407.21783, 2024.
[11] Harper R Grimsley, George S Barron, Edwin Barnes, Sophia E Economou, and Nicholas J Mayhall. Adaptive, problem-tailored variational quantum eigensolver mitigates rough parameter landscapes and barren plateaus. npj Quantum Information, 9(1):19, 2023.
[12] Valentin Heyraud, Zejian Li, Kaelan Donatella, Alexandre Le Boité, and Cristiano Ciuti. Efficient estimation of trainability for variational quantum circuits. PRX Quantum, 4(4):040335, 2023.
[13] Zoë Holmes, Kunal Sharma, Marco Cerezo, and Patrick J Coles. Connecting ansatz expressibility to gradient magnitudes and barren plateaus. PRX Quantum, 3(1):010313, 2022.
[14] Aaron Hurst, Adam Lerer, Adam P Goucher, Adam Perelman, Aditya Ramesh, Aidan Clark, AJ Ostrow, Akila Welihinda, Alan Hayes, Alec Radford, et al. GPT-4o system card. arXiv preprint arXiv:2410.21276, 2024.
[15] Abhinav Kandala, Antonio Mezzacapo, Kristan Temme, Maika Takita, Markus Brink, Jerry M Chow, and Jay M Gambetta. Hardware-efficient variational quantum eigensolver for small molecules and quantum magnets. Nature, 549(7671):242–246, 2017.
[16] Muhammad Kashif and Saif Al-Kuwari. ResQNets: a residual approach for mitigating barren plateaus in quantum neural networks. EPJ Quantum Technology, 2024.
[17] Muhammad Kashif, Muhammad Rashid, Saif Al-Kuwari, and Muhammad Shafique. Alleviating barren plateaus in parameterized quantum machine learning circuits: Investigating advanced parameter initialization strategies. In 2024 Design, Automation & Test in Europe Conference & Exhibition (DATE), pp. 1–6. IEEE, 2024.
[18] Ankit Kulshrestha and Ilya Safro. BeInit: Avoiding barren plateaus in variational quantum algorithms. In 2022 IEEE International Conference on Quantum Computing and Engineering (QCE), pp. 197–203. IEEE, 2022.
[19] Guangxi Li, Zhixin Song, and Xin Wang. VSQL: Variational shadow quantum learning for classification. In Proceedings of the AAAI Conference on Artificial Intelligence, 2021.
[20] Huan-Yu Liu, Tai-Ping Sun, Yu-Chun Wu, Yong-Jian Han, and Guo-Ping Guo. Mitigating barren plateaus with transfer-learning-inspired parameter initializations. New Journal of Physics, 25(1):013039, 2023.
[21] Xia Liu, Geng Liu, Hao-Kai Zhang, Jiaxin Huang, and Xin Wang. Mitigating barren plateaus of variational quantum eigensolvers. IEEE Transactions on Quantum Engineering, 2024.
[22] Jarrod R McClean, Sergio Boixo, Vadim N Smelyanskiy, Ryan Babbush, and Hartmut Neven. Barren plateaus in quantum neural network training landscapes. Nature Communications, 9(1):4812, 2018.
[23] Antonio A Mele, Glen B Mbeng, Giuseppe E Santoro, Mario Collura, and Pietro Torta. Avoiding barren plateaus via transferability of smooth solutions in a Hamiltonian variational ansatz. Physical Review A, 106(6):L060401, 2022.
[24] Carlos Ortiz Marrero, Mária Kieferová, and Nathan Wiebe. Entanglement-induced barren plateaus. PRX Quantum, 2(4):040316, 2021.
[25] Mateusz Ostaszewski, Edward Grant, and Marcello Benedetti. Structure optimization for parameterized quantum circuits. Quantum, 5:391, 2021.
[26] Chae-Yeun Park and Nathan Killoran. Hamiltonian variational ansatz without barren plateaus. Quantum, 8:1239, 2024.
[27] John Preskill. Quantum computing in the NISQ era and beyond. Quantum, 2:79, 2018.
[28] Han Qi, Lei Wang, Hongsheng Zhu, Abdullah Gani, and Changqing Gong. The barren plateaus of quantum neural networks: review, taxonomy and trends. Quantum Information Processing, 22(12):435, 2023.
[29] Sonny Rappaport, Gaurav Gyawali, Tiago Sereno, and Michael J Lawler. Measurement-induced landscape transitions in hybrid variational quantum circuits. arXiv preprint arXiv:2312.09135, 2023.
[30] Stefan H Sack, Raimel A Medina, Alexios A Michailidis, Richard Kueng, and Maksym Serbyn. Avoiding barren plateaus using classical shadows. PRX Quantum, 3(2):020365, 2022.
[31] Antonio Sannia, Francesco Tacchino, Ivano Tavernelli, Gian Luca Giorgi, and Roberta Zambrini. Engineered dissipation to mitigate barren plateaus. npj Quantum Information, 10(1):81, 2024.
[32] Raja Selvarajan, Manas Sajjan, Travis S Humble, and Sabre Kais. Dimensionality reduction with variational encoders based on subsystem purification. Mathematics, 2023.
[33] Yun Shang and Xiao Shi. Avoiding barren plateaus via Gaussian mixture model. New Journal of Physics, 2025.
[34] Samuel A Stein, Betis Baheri, Daniel Chen, Ying Mao, Qiang Guan, Ang Li, Bo Fang, and Shuai Xu. QuGAN: A quantum state fidelity based generative adversarial network. In 2021 IEEE International Conference on Quantum Computing and Engineering (QCE), pp. 71–81. IEEE, 2021.
[35] Yudai Suzuki, Hiroshi Yano, Rudy Raymond, and Naoki Yamamoto. Normalized gradient descent for variational quantum algorithms. In 2021 IEEE International Conference on Quantum Computing and Engineering (QCE), pp. 1–9. IEEE, 2021.
[36] Gemini Team, Petko Georgiev, Ving Ian Lei, Ryan Burnell, Libin Bai, Anmol Gulati, Garrett Tanzer, Damien Vincent, Zhufeng Pan, Shibo Wang, et al. Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context. arXiv preprint arXiv:2403.05530, 2024.
[37] Cenk Tüysüz, Giuseppe Clemente, Arianna Crippa, Tobias Hartung, Stefan Kühn, and Karl Jansen. Classical splitting of parametrized quantum circuits. Quantum Machine Intelligence, 2023.
[38] David Williams. Probability with martingales. Cambridge University Press, 1991.
[39] Kaining Zhang, Liu Liu, Min-Hsiu Hsieh, and Dacheng Tao. Escaping from the barren plateau via Gaussian initializations in deep variational quantum circuits. Advances in Neural Information Processing Systems, 2022.
[40] Jun Zhuang, Jack Cunningham, and Chaowen Guan. Improving trainability of variational quantum circuits via regularization strategies. arXiv preprint arXiv:2405.01606, 2024.

[1] [1]

GPT-4 Technical Report

Josh Achiam, Steven Adler, Sandhini Agarwal, Lama Ahmad, Ilge Akkaya, Florencia Leoni Aleman, Diogo Almeida, Janko Altenschmidt, Sam Altman, Shyamal Anadkat, et al. Gpt-4 technical report. arXiv preprint arXiv:2303.08774, 2023

work page internal anchor Pith review Pith/arXiv arXiv 2023

[2] [2]

Introducing claude 3.5 sonnet, June 2024

Anthropic. Introducing claude 3.5 sonnet, June 2024. URL https://www.anthropic.com/news/claude-3-5-sonnet. Accessed: 2025-02-15

work page 2024

[3] [3]

Quantum-assisted simulator

Kishor Bharti and Tobias Haug. Quantum-assisted simulator. Physical Review A, 104 0 (4): 0 042418, 2021

work page 2021

[4] Marco Cerezo, Akira Sone, Tyler Volkoff, Lukasz Cincio, and Patrick J Coles. Cost function dependent barren plateaus in shallow parametrized quantum circuits. Nature Communications, 12(1):1791, 2021.

[5] Jack Cunningham and Jun Zhuang. Investigating and mitigating barren plateaus in variational quantum circuits: A survey. arXiv preprint arXiv:2407.17706, 2024.

[6] Yuxuan Du, Tao Huang, Shan You, Min-Hsiu Hsieh, and Dacheng Tao. Quantum circuit architecture search for variational quantum algorithms. npj Quantum Information, 8(1):62, 2022.

[7] Edward Farhi, Jeffrey Goldstone, and Sam Gutmann. A quantum approximate optimization algorithm. arXiv preprint arXiv:1411.4028, 2014.

[8] Nic Freeman and Robin Stephenson. MAS352: Stochastic processes and financial mathematics (notes). https://nicfreeman1209.github.io/Website/MASx52/html/notes_1.html, 2025. Accessed: 2025-05-15.

[9] Edward Grant, Leonard Wossnig, Mateusz Ostaszewski, and Marcello Benedetti. An initialization strategy for addressing barren plateaus in parametrized quantum circuits. Quantum, 2019.

[10] Aaron Grattafiori, Abhimanyu Dubey, Abhinav Jauhri, Abhinav Pandey, Abhishek Kadian, Ahmad Al-Dahle, Aiesha Letman, Akhil Mathur, Alan Schelten, Alex Vaughan, et al. The Llama 3 herd of models. arXiv preprint arXiv:2407.21783, 2024.

[11] Harper R Grimsley, George S Barron, Edwin Barnes, Sophia E Economou, and Nicholas J Mayhall. Adaptive, problem-tailored variational quantum eigensolver mitigates rough parameter landscapes and barren plateaus. npj Quantum Information, 9(1):19, 2023.

[12] Valentin Heyraud, Zejian Li, Kaelan Donatella, Alexandre Le Boité, and Cristiano Ciuti. Efficient estimation of trainability for variational quantum circuits. PRX Quantum, 4(4):040335, 2023.

[13] Zoë Holmes, Kunal Sharma, Marco Cerezo, and Patrick J Coles. Connecting ansatz expressibility to gradient magnitudes and barren plateaus. PRX Quantum, 3(1):010313, 2022.

[14] Aaron Hurst, Adam Lerer, Adam P Goucher, Adam Perelman, Aditya Ramesh, Aidan Clark, AJ Ostrow, Akila Welihinda, Alan Hayes, Alec Radford, et al. GPT-4o system card. arXiv preprint arXiv:2410.21276, 2024.

[15] Abhinav Kandala, Antonio Mezzacapo, Kristan Temme, Maika Takita, Markus Brink, Jerry M Chow, and Jay M Gambetta. Hardware-efficient variational quantum eigensolver for small molecules and quantum magnets. Nature, 549(7671):242–246, 2017.

[16] Muhammad Kashif and Saif Al-Kuwari. ResQNets: A residual approach for mitigating barren plateaus in quantum neural networks. EPJ Quantum Technology, 2024.

[17] Muhammad Kashif, Muhammad Rashid, Saif Al-Kuwari, and Muhammad Shafique. Alleviating barren plateaus in parameterized quantum machine learning circuits: Investigating advanced parameter initialization strategies. In 2024 Design, Automation & Test in Europe Conference & Exhibition (DATE), pp. 1–6. IEEE, 2024.

[18] Ankit Kulshrestha and Ilya Safro. BEINIT: Avoiding barren plateaus in variational quantum algorithms. In 2022 IEEE International Conference on Quantum Computing and Engineering (QCE), pp. 197–203. IEEE, 2022.

[19] Guangxi Li, Zhixin Song, and Xin Wang. VSQL: Variational shadow quantum learning for classification. In Proceedings of the AAAI Conference on Artificial Intelligence, 2021.

[20] Huan-Yu Liu, Tai-Ping Sun, Yu-Chun Wu, Yong-Jian Han, and Guo-Ping Guo. Mitigating barren plateaus with transfer-learning-inspired parameter initializations. New Journal of Physics, 25(1):013039, 2023.

[21] Xia Liu, Geng Liu, Hao-Kai Zhang, Jiaxin Huang, and Xin Wang. Mitigating barren plateaus of variational quantum eigensolvers. IEEE Transactions on Quantum Engineering, 2024.

[22] Jarrod R McClean, Sergio Boixo, Vadim N Smelyanskiy, Ryan Babbush, and Hartmut Neven. Barren plateaus in quantum neural network training landscapes. Nature Communications, 9(1):4812, 2018.

[23] Antonio A Mele, Glen B Mbeng, Giuseppe E Santoro, Mario Collura, and Pietro Torta. Avoiding barren plateaus via transferability of smooth solutions in a Hamiltonian variational ansatz. Physical Review A, 106(6):L060401, 2022.

[24] Carlos Ortiz Marrero, Mária Kieferová, and Nathan Wiebe. Entanglement-induced barren plateaus. PRX Quantum, 2(4):040316, 2021.

[25] Mateusz Ostaszewski, Edward Grant, and Marcello Benedetti. Structure optimization for parameterized quantum circuits. Quantum, 5:391, 2021.

[26] Chae-Yeun Park and Nathan Killoran. Hamiltonian variational ansatz without barren plateaus. Quantum, 8:1239, 2024.

[27] John Preskill. Quantum computing in the NISQ era and beyond. Quantum, 2:79, 2018.

[28] Han Qi, Lei Wang, Hongsheng Zhu, Abdullah Gani, and Changqing Gong. The barren plateaus of quantum neural networks: review, taxonomy and trends. Quantum Information Processing, 22(12):435, 2023.

[29] Sonny Rappaport, Gaurav Gyawali, Tiago Sereno, and Michael J Lawler. Measurement-induced landscape transitions in hybrid variational quantum circuits. arXiv preprint arXiv:2312.09135, 2023.

[30] Stefan H Sack, Raimel A Medina, Alexios A Michailidis, Richard Kueng, and Maksym Serbyn. Avoiding barren plateaus using classical shadows. PRX Quantum, 3(2):020365, 2022.

[31] Antonio Sannia, Francesco Tacchino, Ivano Tavernelli, Gian Luca Giorgi, and Roberta Zambrini. Engineered dissipation to mitigate barren plateaus. npj Quantum Information, 10(1):81, 2024.

[32] Raja Selvarajan, Manas Sajjan, Travis S Humble, and Sabre Kais. Dimensionality reduction with variational encoders based on subsystem purification. Mathematics, 2023.

[33] Yun Shang and Xiao Shi. Avoiding barren plateaus via Gaussian mixture model. New Journal of Physics, 2025.

[34] Samuel A Stein, Betis Baheri, Daniel Chen, Ying Mao, Qiang Guan, Ang Li, Bo Fang, and Shuai Xu. QuGAN: A quantum state fidelity based generative adversarial network. In 2021 IEEE International Conference on Quantum Computing and Engineering (QCE), pp. 71–81. IEEE, 2021.

[35] Yudai Suzuki, Hiroshi Yano, Rudy Raymond, and Naoki Yamamoto. Normalized gradient descent for variational quantum algorithms. In 2021 IEEE International Conference on Quantum Computing and Engineering (QCE), pp. 1–9. IEEE, 2021.

[36] Gemini Team, Petko Georgiev, Ving Ian Lei, Ryan Burnell, Libin Bai, Anmol Gulati, Garrett Tanzer, Damien Vincent, Zhufeng Pan, Shibo Wang, et al. Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context. arXiv preprint arXiv:2403.05530, 2024.

[37] Cenk Tüysüz, Giuseppe Clemente, Arianna Crippa, Tobias Hartung, Stefan Kühn, and Karl Jansen. Classical splitting of parametrized quantum circuits. Quantum Machine Intelligence, 2023.

[38] David Williams. Probability with martingales. Cambridge University Press, 1991.

[39] Kaining Zhang, Liu Liu, Min-Hsiu Hsieh, and Dacheng Tao. Escaping from the barren plateau via Gaussian initializations in deep variational quantum circuits. Advances in Neural Information Processing Systems, 2022.

[40] Jun Zhuang, Jack Cunningham, and Chaowen Guan. Improving trainability of variational quantum circuits via regularization strategies. arXiv preprint arXiv:2405.01606, 2024.
