Hessian-vector products for tensor networks via recursive tangent-state propagation

Christian B. Mendl; Isabel Nha Minh Le; Roeland Wiersema

arxiv: 2604.20384 · v1 · submitted 2026-04-22 · 🪐 quant-ph · physics.comp-ph

Hessian-vector products for tensor networks via recursive tangent-state propagation

Isabel Nha Minh Le , Roeland Wiersema , Christian B. Mendl This is my paper

Pith reviewed 2026-05-10 00:21 UTC · model grok-4.3

classification 🪐 quant-ph physics.comp-ph

keywords tensor networksHessian-vector productRiemannian optimizationtrust-region methodquantum circuit compressiontangent-state propagationsecond-order optimizationspin-chain time evolution

0 comments

The pith

Recursive tangent-state propagation computes scalable Hessian-vector products for tensor networks without building the full matrix.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper sets out to demonstrate that second-order optimization, which resists local minima better than first-order methods, can be applied to large tensor networks by replacing the full Hessian with an analytical product against any vector. It does this through a two-pass algorithm that propagates tangent states recursively while holding the virtual bond dimension fixed. This matters for quantum circuit tasks because first-order approaches like Riemannian ADAM converge slowly on time-evolution circuits, whereas the new kernel integrates directly into a Riemannian trust-region solver. If the method works as described, it removes the main computational barrier that previously made second-order methods impractical for these systems.

Core claim

An analytical Hessian-vector product kernel is introduced for arbitrary compositions of linear maps. The kernel is realized as a two-pass algorithm that performs recursive tangent-state propagation while keeping the virtual bond dimension bounded, thereby guaranteeing scalability. When this kernel is placed inside a Riemannian trust-region optimizer for quantum circuit compression, the resulting procedure produces up to four orders of magnitude higher fidelity than naive Trotterization on time-evolution circuits for spin chains and exhibits markedly smoother and faster convergence than standard first-order Riemannian methods.

What carries the argument

Recursive tangent-state propagation, the two-pass mechanism that carries tangent vectors through successive linear maps while enforcing a fixed virtual bond dimension to evaluate the Hessian-vector product.

If this is right

Second-order Riemannian optimization becomes practical for tensor networks whose size previously ruled out explicit Hessian construction.
Quantum circuit compression of time-evolution operators on spin chains reaches fidelities four orders of magnitude above naive Trotterization.
Convergence trajectories become smoother and require fewer iterations than those produced by first-order Riemannian ADAM.
The same kernel applies without modification to any composition of linear maps that can be represented as a tensor network.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The bounded-dimension propagation technique may extend to the computation of higher-order derivative-vector products in the same tensor-network setting.
The method could be tested on tensor-network ansatzes used in classical machine-learning models to check whether the same scalability gains appear outside quantum circuits.
If the bounded virtual bond dimension remains accurate for deeper or wider networks, the approach would directly support optimization of larger many-body systems without additional truncation analysis.

Load-bearing premise

Bounding the virtual bond dimension during recursive tangent-state propagation still yields the exact Hessian-vector product rather than an uncontrolled approximation.

What would settle it

On a small tensor network where the full Hessian can be stored and multiplied explicitly, compare the output of the recursive kernel against a finite-difference approximation of the directional derivative of the gradient.

Figures

Figures reproduced from arXiv: 2604.20384 by Christian B. Mendl, Isabel Nha Minh Le, Roeland Wiersema.

**Figure 1.** Figure 1: Visualization of the gradient and its directional derivative, i.e., the Hessian-vector product H(f(z)) [v]. The gradient ∇f represents the local slope; the difference between the gradient at point z and a perturbed point z + ϵv defines the Hessian-vector product as ϵ → 0. on a vector v, denoted as H(f(z)) [v], without constructing the full matrix ( [PITH_FULL_IMAGE:figures/full_fig_p001_1.png] view at source ↗

**Figure 2.** Figure 2: TN representation as a composition of linear [PITH_FULL_IMAGE:figures/full_fig_p004_2.png] view at source ↗

**Figure 3.** Figure 3: Computational passes for a brickwall circuit. [PITH_FULL_IMAGE:figures/full_fig_p006_3.png] view at source ↗

**Figure 4.** Figure 4: Riemannian quantum circuit optimization results for various spin chains. (Top) Approximation accuracy of the optimized circuit compared to naive Trotterization, evaluated on a test batch of 100 Haar-random product states. (Bottom) The trust-region method correctly reproduces the ADAM optimization results [PITH_FULL_IMAGE:figures/full_fig_p009_4.png] view at source ↗

**Figure 5.** Figure 5: Convergence of Riemannian quantum circuit [PITH_FULL_IMAGE:figures/full_fig_p010_5.png] view at source ↗

read the original abstract

Optimizing tensor networks with standard first-order methods often leads to slow convergence and entrapment in local minima. Although second-order optimization offers enhanced robustness, explicitly constructing the full Hessian matrix is computationally prohibitive for large-scale systems. In this work, we bypass this bottleneck by introducing an analytical Hessian-vector product kernel designed for arbitrary compositions of linear maps. This two-pass algorithm leverages recursive tangent-state propagation with a bounded virtual bond dimension to guarantee scalability. We demonstrate the practical utility of this kernel by integrating it into a Riemannian trust-region framework for quantum circuit compression. Evaluated on time-evolution circuits for various spin chains, our second-order approach achieves up to a four-order-of-magnitude improvement in fidelity over naive Trotterization, while delivering significantly smoother, faster convergence than conventional first-order methods such as Riemannian ADAM.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

The paper gives a recursive tangent-state method for exact Hessian-vector products in tensor networks without building the full matrix, but the bounded bond dimension raises questions about whether truncation errors stay controlled.

read the letter

The main point is that this work develops a two-pass recursive algorithm to compute Hessian-vector products for tensor networks by propagating tangent states, avoiding the cost of the full Hessian. They fold it into a Riemannian trust-region optimizer and test it on compressing time-evolution circuits for spin chains, where it reaches up to four orders of magnitude higher fidelity than plain Trotterization and converges more smoothly than first-order Riemannian ADAM. That empirical side is the clearest win so far. The approach is new in its specific recursive kernel for arbitrary compositions of linear maps inside tensor networks, and the bounded virtual bond dimension is a reasonable engineering choice to keep the method scalable. The reported gains on concrete circuits show the idea can deliver practical speed and accuracy improvements in variational quantum settings. The soft spot sits in the exactness claim. The abstract presents the kernel as analytical, yet deliberately bounds the bond dimension during propagation to guarantee scalability. In tensor networks that bound is usually a truncation step, and nothing in the given description shows how the recursion preserves the exact product or cancels the resulting error. Without the derivation or an error analysis in the abstract, it is difficult to judge whether the method stays exact or quietly becomes approximate for larger systems. The full manuscript may resolve this, but the abstract leaves the central guarantee under-supported. This paper is for people already working on second-order methods for tensor networks or variational quantum algorithms who need better convergence than first-order optimizers. A reader focused on circuit compression or TN simulation would get usable implementation ideas and benchmark numbers. It is worth sending to peer review because the core problem is real and the reported results are concrete, even if the exactness question needs referee attention on the bounded-propagation step.

Referee Report

2 major / 2 minor

Summary. The paper introduces an analytical Hessian-vector product (HVP) kernel for tensor networks formed by arbitrary compositions of linear maps. The kernel is realized via a two-pass recursive tangent-state propagation algorithm that deliberately bounds the virtual bond dimension for scalability. This HVP is embedded in a Riemannian trust-region optimizer and applied to quantum circuit compression of time-evolution circuits on spin chains, where the second-order method reportedly yields up to four orders of magnitude higher fidelity than naive Trotterization and smoother convergence than first-order Riemannian ADAM.

Significance. If the HVP is shown to be exact (or with rigorously controlled error) despite the bond-dimension bound, the work supplies a missing computational primitive for second-order optimization of tensor networks. This would be valuable for variational quantum algorithms and many-body simulations where first-order methods converge slowly. The reported fidelity gains on spin-chain circuits provide initial evidence of practical impact, though quantitative validation details are needed to assess the strength of the empirical claims.

major comments (2)

[Abstract and recursive tangent-state propagation section (likely §3)] The central claim of an 'analytical' (exact) HVP is load-bearing, yet the abstract and method description rely on recursive tangent-state propagation with an explicitly bounded virtual bond dimension. Any such bound introduces a truncation whose error vanishes only if the intermediate tangent states remain exactly within the truncated manifold. The manuscript must demonstrate (via projection identities, closed-form cancellation, or an explicit error analysis) that the recursion preserves exactness; otherwise the 'analytical' guarantee does not hold. This issue directly affects the weakest assumption identified in the stress-test note.
[Application to quantum circuit compression and numerical results] The integration into the Riemannian trust-region framework assumes the supplied HVP is sufficiently accurate for the trust-region subproblem solver. No quantitative error analysis or comparison against finite-difference or full-Hessian baselines is referenced in the provided abstract; such validation is required to confirm that the observed fidelity improvements are attributable to the second-order information rather than to other implementation details.

minor comments (2)

[Abstract] The abstract states 'large fidelity gains' and 'four-order-of-magnitude improvement' without specifying the system sizes, circuit depths, or exact fidelity metric used; these details should be added for reproducibility.
[Method description] Notation for the tangent states and the two-pass algorithm would benefit from an explicit pseudocode listing or a small illustrative diagram showing the forward and backward recursions.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their constructive and insightful comments, which help clarify the presentation of our analytical HVP kernel and its validation. We address each major comment below, indicating the revisions we will incorporate.

read point-by-point responses

Referee: [Abstract and recursive tangent-state propagation section (likely §3)] The central claim of an 'analytical' (exact) HVP is load-bearing, yet the abstract and method description rely on recursive tangent-state propagation with an explicitly bounded virtual bond dimension. Any such bound introduces a truncation whose error vanishes only if the intermediate tangent states remain exactly within the truncated manifold. The manuscript must demonstrate (via projection identities, closed-form cancellation, or an explicit error analysis) that the recursion preserves exactness; otherwise the 'analytical' guarantee does not hold. This issue directly affects the weakest assumption identified in the stress-test note.

Authors: We agree that demonstrating exactness of the HVP with respect to the truncated manifold is essential for the 'analytical' claim. The recursive tangent-state propagation is constructed so that each tangent vector is explicitly projected onto the tangent space of the current bond-dimension truncation at every step; this ensures the computed product is exact for the truncated network rather than an approximation to the untruncated Hessian. We will add a new subsection to §3 containing a formal proof via projection identities showing that the two-pass recursion commutes with these projections and therefore preserves exactness on the truncated manifold. We will also include a short discussion of the (controlled) difference from the untruncated case. revision: yes
Referee: [Application to quantum circuit compression and numerical results] The integration into the Riemannian trust-region framework assumes the supplied HVP is sufficiently accurate for the trust-region subproblem solver. No quantitative error analysis or comparison against finite-difference or full-Hessian baselines is referenced in the provided abstract; such validation is required to confirm that the observed fidelity improvements are attributable to the second-order information rather than to other implementation details.

Authors: We acknowledge that stronger quantitative validation is needed to attribute the reported fidelity gains specifically to the second-order information. In the revised manuscript we will add a dedicated validation subsection (within the numerical results) that (i) compares the analytical HVP against finite-difference approximations on small spin-chain instances where the full Hessian remains tractable, (ii) reports observed relative errors and their scaling with bond dimension, and (iii) includes an ablation comparing trust-region steps using the analytical HVP versus first-order Riemannian ADAM on the same circuits. These additions will directly address the attribution question. revision: yes

Circularity Check

0 steps flagged

No circularity: analytical derivation is self-contained

full rationale

The paper introduces an analytical Hessian-vector product via recursive tangent-state propagation for tensor networks. No load-bearing step reduces by construction to a fitted parameter, self-definition, or self-citation chain; the bounded bond dimension is explicitly a scalability choice rather than a hidden redefinition of the target quantity. The derivation chain remains independent of its inputs and does not rely on renaming known results or smuggling ansatzes through citations.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

Based solely on the abstract, the work relies on standard domain assumptions in tensor networks and differentiable programming; no free parameters, new entities, or ad-hoc axioms are introduced.

axioms (1)

domain assumption Tensor networks consist of compositions of linear maps that admit analytical differentiation.
Required for tangent-state propagation to compute Hessian-vector products exactly.

pith-pipeline@v0.9.0 · 5434 in / 1219 out tokens · 48015 ms · 2026-05-10T00:21:18.418552+00:00 · methodology

discussion (0)

Reference graph

Works this paper leans on

49 extracted references · 49 canonical work pages

[1]

We benchmark its efficacy against the first-order Riemannian ADAM optimizer uti- lized in our recent study [13]

Optimizing Trotter circuits for spin chains To evaluate the performance of the HVP kernel, we op- timize Trotter circuits using a second-order Riemannian trust-region algorithm [15]. We benchmark its efficacy against the first-order Riemannian ADAM optimizer uti- lized in our recent study [13]. As a primary benchmark, we consider the transverse- field Isi...

work page
[2]

trusted region

Comparing convergence behavior To evaluate optimization efficiency, we compare the convergence speed of the Riemannian ADAM and trust- region optimizers in Fig. 5. For both the Ising and Heisenberg models, we optimize a quantum circuit com- posed of 11 layers. We enforce translational invariance across the circuit, meaning each layerℓis constructed by rep...

work page 2000
[3]

S. R. White, Density matrix formulation for quantum renormalization groups, Phys. Rev. Lett.69, 2863 (1992)

work page 1992
[4]

Schollw¨ ock, The density-matrix renormalization group in the age of matrix product states, Ann

U. Schollw¨ ock, The density-matrix renormalization group in the age of matrix product states, Ann. Phys.326, 96 (2011)

work page 2011
[5]

Or´ us, A practical introduction to tensor networks: Matrix product states and projected entangled pair states, Ann

R. Or´ us, A practical introduction to tensor networks: Matrix product states and projected entangled pair states, Ann. Phys.349, 117 (2014)

work page 2014
[6]

G. K.-L. Chan and S. Sharma, The density matrix renor- malization group in quantum chemistry, Annu. Rev. Phys. Chem.62, 465 (2011)

work page 2011
[7]

G. K. Chan, A. Keselman, N. Nakatani, Z. Li, and S. R. White, Matrix product operators, matrix product states, and ab initio density matrix renormalization group algo- rithms, J. Chem. Phys.145(2016)

work page 2016
[8]

Stoudenmire and D

E. Stoudenmire and D. J. Schwab, Supervised learning with tensor networks, Adv. Neural Inf. Process. Syst.29 (2016)

work page 2016
[9]

J. A. Reyes and E. M. Stoudenmire, Multi-scale tensor network architecture for machine learning, Mach. Learn.: Sci. Technol.2, 035036 (2021)

work page 2021
[10]

I. L. Markov and Y. Shi, Simulating quantum computa- tion by contracting tensor networks, SIAM J. Comput. 38, 963 (2008)

work page 2008
[11]

Seitz, I

P. Seitz, I. Medina, E. Cruz, Q. Huang, and C. B. Mendl, Simulating quantum circuits using tree tensor networks, Quantum7, 964 (2023)

work page 2023
[12]

Haegeman, J

J. Haegeman, J. I. Cirac, T. J. Osborne, I. Pizorn, H. Ver- schelde, and F. Verstraete, Time-dependent variational principle for quantum lattices, Phys. Rev. Lett.107, 070601 (2011)

work page 2011
[13]

Haegeman, C

J. Haegeman, C. Lubich, I. Oseledets, B. Vandereycken, and F. Verstraete, Unifying time evolution and optimiza- tion with matrix product states, Phys. Rev. B94, 165116 (2016)

work page 2016
[14]

Zauner-Stauber, L

V. Zauner-Stauber, L. Vanderstraeten, M. T. Fishman, F. Verstraete, and J. Haegeman, Variational optimization algorithms for uniform matrix product states, Phys. Rev. B97, 045145 (2018)

work page 2018
[15]

I. N. M. Le, S. Sun, and C. B. Mendl, Riemannian quan- tum circuit optimization based on matrix product oper- ators, Quantum9, 1833 (2025)

work page 2025
[16]

Zhang, R

Y. Zhang, R. Wiersema, J. Carrasquilla, L. Cincio, and Y. B. Kim, Scalable quantum dynamics compi- lation via quantum machine learning, arXiv preprint arXiv:2409.16346 (2024)

work page arXiv 2024
[17]

Absil, R

P.-A. Absil, R. Mahony, and R. Sepulchre,Optimiza- tion algorithms on matrix manifolds(Princeton Univer- sity Press, 2008)

work page 2008
[18]

Haegeman, T

J. Haegeman, T. J. Osborne, and F. Verstraete, Post- matrix product state methods: To tangent space and beyond, Phys. Rev. B88, 075133 (2013). 12

work page 2013
[19]

Vanderstraeten, J

L. Vanderstraeten, J. Haegeman, P. Corboz, and F. Ver- straete, Gradient methods for variational optimization of projected entangled-pair states, Phys. Rev. B94, 155123 (2016)

work page 2016
[20]

Kotil, R

A. Kotil, R. Banerjee, Q. Huang, and C. B. Mendl, Rie- mannian quantum circuit optimization for Hamiltonian simulation, J. Phys. A: Math. Theor.57, 135303 (2024)

work page 2024
[21]

Putterer, M

F. Putterer, M. M. Zumpe, I. N. M. Le, Q. Huang, and C. B. Mendl, High-performance contraction of quan- tum circuits for Riemannian optimization, arXiv preprint arXiv:2506.23775 (2025)

work page arXiv 2025
[22]

arXiv preprint arXiv:2501.18691 , year=

M. Ben-Dov and J. Chen, Regularized second-order optimization of tensor-network Born machines, arXiv preprint arXiv:2501.18691 (2025)

work page arXiv 2025
[23]

Nocedal and S

J. Nocedal and S. J. Wright,Numerical optimization (Springer, 2006)

work page 2006
[24]

B. A. Pearlmutter, Fast exact multiplication by the Hes- sian, Neural Comput.6, 147 (1994)

work page 1994
[25]

Griewank and A

A. Griewank and A. Walther,Evaluating derivatives: principles and techniques of algorithmic differentiation (SIAM, 2008)

work page 2008
[26]

Liao, J.-G

H.-J. Liao, J.-G. Liu, L. Wang, and T. Xiang, Differ- entiable programming tensor networks, Phys. Rev. X9, 031041 (2019)

work page 2019
[27]

Francuz, N

A. Francuz, N. Schuch, and B. Vanhecke, Stable and effi- cient differentiation of tensor network algorithms, Phys. Rev. Research7, 013237 (2025)

work page 2025
[29]

Khatri, R

S. Khatri, R. LaRose, A. Poremba, L. Cincio, A. T. Sorn- borger, and P. J. Coles, Quantum-assisted quantum com- piling, Quantum3, 140 (2019)

work page 2019
[30]

M. C. Caro, H.-Y. Huang, N. Ezzell, J. Gibbs, A. T. Sornborger, L. Cincio, P. J. Coles, and Z. Holmes, Out- of-distribution generalization for learning quantum dy- namics, Nat. Commun.14, 3751 (2023)

work page 2023
[31]

Gibbs and L

J. Gibbs and L. Cincio, Deep circuit compression for quantum dynamics via tensor networks, Quantum9, 1789 (2025)

work page 2025
[32]

Robertson, A

N. Robertson, A. Akhriev, J. Vala, and S. Zhuk, Ap- proximate quantum compiling for quantum simulation: A tensor network based approach, ACM Trans. Quan- tum Comput.6(2025)

work page 2025
[33]

Causer, F

L. Causer, F. Jung, A. Mitra, F. Pollmann, and A. Gammon-Smith, Scalable simulation of nonequilib- rium quantum dynamics via classically optimized unitary circuits, Phys. Rev. Research6, 033062 (2024)

work page 2024
[34]

A short tutorial on Wirtinger Calculus with applications in quantum information,

M. D’Anna, Y. Zhang, R. Wiersema, M. S. Rudolph, and J. Carrasquilla, Circuit compression for 2d quantum dynamics, arXiv preprint arXiv:2312.04858 (2025)

work page arXiv 2025
[35]

I. A. Luchnikov, A. Ryzhov, S. N. Filippov, and H. Ouer- dane, QGOpt: Riemannian optimization for quantum technologies, SciPost Phys.10, 079 (2021)

work page 2021
[36]

I. A. Luchnikov, M. E. Krechetov, and S. N. Filippov, Riemannian geometry and automatic differentiation for optimization problems of quantum physics and quantum technologies, New J. Phys.23, 073006 (2021)

work page 2021
[37]

M. Herlihy, The art of multiprocessor programming, in Proceedings of the Twenty-Fifth Annual ACM Sympo- sium on Principles of Distributed Computing, PODC ’06 (Association for Computing Machinery, New York, NY, USA, 2006) p. 1–2

work page 2006
[38]

Gibbs and L

J. Gibbs and L. Cincio, Learning circuits with infi- nite tensor networks, arXiv preprint arXiv:2506.02105 (2025)

work page arXiv 2025
[39]

A. W. Sandvik and G. Vidal, Variational quantum Monte Carlo simulations with tensor-network states, Phys. Rev. Lett.99, 220602 (2007)

work page 2007
[40]

Liu, S.-J

W.-Y. Liu, S.-J. Dong, Y.-J. Han, G.-C. Guo, and L. He, Gradient optimization of finite projected entangled pair states, Phys. Rev. B95, 195154 (2017)

work page 2017
[41]

Liu, Y.-Z

W.-Y. Liu, Y.-Z. Huang, S.-S. Gong, and Z.-C. Gu, Accu- rate simulation for finite projected entangled pair states in two dimensions, Phys. Rev. B103, 235155 (2021)

work page 2021
[42]

Wu and Z

Y. Wu and Z. Dai, Algorithms for variational Monte Carlo calculations of fermion projected entangled pair states in the swap gates formulation and the detailed bal- ance of tensor network sequential sampling, Chin. Phys. B35, 020502 (2026)

work page 2026
[43]

T. Chen, J. Liu, Y. Wu, P. Zhang, and Y. Deng, Varia- tional Monte Carlo (VMC) with row-update projected entangled-pair states (PEPS) and its applications in quantum spin glasses, arXiv preprint arXiv:2601.20608 (2026)

work page arXiv 2026
[44]

R. J. Webber and M. Lindsey, Rayleigh-Gauss-Newton optimization with enhanced sampling for variational Monte Carlo, Phys. Rev. Research4, 033099 (2022)

work page 2022
[45]

Lanczos, An iteration method for the solution of the eigenvalue problem of linear differential and integral op- erators, J

C. Lanczos, An iteration method for the solution of the eigenvalue problem of linear differential and integral op- erators, J. Res. Natl. Bur. Stand.45, 255 (1950)

work page 1950
[46]

ADAM implementation by qiskit v0.18 (2021)

work page 2021
[47]

trust-region implementation by rqcopt v1.0.0 (2022)

work page 2022
[48]

Bradbury, R

J. Bradbury, R. Frostig, P. Hawkins, M. J. Johnson, C. Leary, D. Maclaurin, G. Necula, A. Paszke, J. Van- derPlas, S. Wanderman-Milne, and Q. Zhang, JAX: com- posable transformations of Python+NumPy programs (2018)

work page 2018
[49]

C. T. Kelley,Iterative methods for optimization(SIAM, 1999)

work page 1999
[50]

Hessian-vector products for tensor networks via recursive tangent-state propagation

J. Shewchuk,An introduction to the conjugate gradi- ent method without the agonizing pain, Technical Report CMU-CS-94-125 (School of Computer Science, Carnegie Mellon University, 1994). 1 Supplementary Material for: “Hessian-vector products for tensor networks via recursive tangent-state propagation” Appendix A: Complex derivatives In this section, we sum...

work page 1994

[1] [1]

We benchmark its efficacy against the first-order Riemannian ADAM optimizer uti- lized in our recent study [13]

Optimizing Trotter circuits for spin chains To evaluate the performance of the HVP kernel, we op- timize Trotter circuits using a second-order Riemannian trust-region algorithm [15]. We benchmark its efficacy against the first-order Riemannian ADAM optimizer uti- lized in our recent study [13]. As a primary benchmark, we consider the transverse- field Isi...

work page

[2] [2]

trusted region

Comparing convergence behavior To evaluate optimization efficiency, we compare the convergence speed of the Riemannian ADAM and trust- region optimizers in Fig. 5. For both the Ising and Heisenberg models, we optimize a quantum circuit com- posed of 11 layers. We enforce translational invariance across the circuit, meaning each layerℓis constructed by rep...

work page 2000

[3] [3]

S. R. White, Density matrix formulation for quantum renormalization groups, Phys. Rev. Lett.69, 2863 (1992)

work page 1992

[4] [4]

Schollw¨ ock, The density-matrix renormalization group in the age of matrix product states, Ann

U. Schollw¨ ock, The density-matrix renormalization group in the age of matrix product states, Ann. Phys.326, 96 (2011)

work page 2011

[5] [5]

Or´ us, A practical introduction to tensor networks: Matrix product states and projected entangled pair states, Ann

R. Or´ us, A practical introduction to tensor networks: Matrix product states and projected entangled pair states, Ann. Phys.349, 117 (2014)

work page 2014

[6] [6]

G. K.-L. Chan and S. Sharma, The density matrix renor- malization group in quantum chemistry, Annu. Rev. Phys. Chem.62, 465 (2011)

work page 2011

[7] [7]

G. K. Chan, A. Keselman, N. Nakatani, Z. Li, and S. R. White, Matrix product operators, matrix product states, and ab initio density matrix renormalization group algo- rithms, J. Chem. Phys.145(2016)

work page 2016

[8] [8]

Stoudenmire and D

E. Stoudenmire and D. J. Schwab, Supervised learning with tensor networks, Adv. Neural Inf. Process. Syst.29 (2016)

work page 2016

[9] [9]

J. A. Reyes and E. M. Stoudenmire, Multi-scale tensor network architecture for machine learning, Mach. Learn.: Sci. Technol.2, 035036 (2021)

work page 2021

[10] [10]

I. L. Markov and Y. Shi, Simulating quantum computa- tion by contracting tensor networks, SIAM J. Comput. 38, 963 (2008)

work page 2008

[11] [11]

Seitz, I

P. Seitz, I. Medina, E. Cruz, Q. Huang, and C. B. Mendl, Simulating quantum circuits using tree tensor networks, Quantum7, 964 (2023)

work page 2023

[12] [12]

Haegeman, J

J. Haegeman, J. I. Cirac, T. J. Osborne, I. Pizorn, H. Ver- schelde, and F. Verstraete, Time-dependent variational principle for quantum lattices, Phys. Rev. Lett.107, 070601 (2011)

work page 2011

[13] [13]

Haegeman, C

J. Haegeman, C. Lubich, I. Oseledets, B. Vandereycken, and F. Verstraete, Unifying time evolution and optimiza- tion with matrix product states, Phys. Rev. B94, 165116 (2016)

work page 2016

[14] [14]

Zauner-Stauber, L

V. Zauner-Stauber, L. Vanderstraeten, M. T. Fishman, F. Verstraete, and J. Haegeman, Variational optimization algorithms for uniform matrix product states, Phys. Rev. B97, 045145 (2018)

work page 2018

[15] [15]

I. N. M. Le, S. Sun, and C. B. Mendl, Riemannian quan- tum circuit optimization based on matrix product oper- ators, Quantum9, 1833 (2025)

work page 2025

[16] [16]

Zhang, R

Y. Zhang, R. Wiersema, J. Carrasquilla, L. Cincio, and Y. B. Kim, Scalable quantum dynamics compi- lation via quantum machine learning, arXiv preprint arXiv:2409.16346 (2024)

work page arXiv 2024

[17] [17]

Absil, R

P.-A. Absil, R. Mahony, and R. Sepulchre,Optimiza- tion algorithms on matrix manifolds(Princeton Univer- sity Press, 2008)

work page 2008

[18] [18]

Haegeman, T

J. Haegeman, T. J. Osborne, and F. Verstraete, Post- matrix product state methods: To tangent space and beyond, Phys. Rev. B88, 075133 (2013). 12

work page 2013

[19] [19]

Vanderstraeten, J

L. Vanderstraeten, J. Haegeman, P. Corboz, and F. Ver- straete, Gradient methods for variational optimization of projected entangled-pair states, Phys. Rev. B94, 155123 (2016)

work page 2016

[20] [20]

Kotil, R

A. Kotil, R. Banerjee, Q. Huang, and C. B. Mendl, Rie- mannian quantum circuit optimization for Hamiltonian simulation, J. Phys. A: Math. Theor.57, 135303 (2024)

work page 2024

[21] [21]

Putterer, M

F. Putterer, M. M. Zumpe, I. N. M. Le, Q. Huang, and C. B. Mendl, High-performance contraction of quan- tum circuits for Riemannian optimization, arXiv preprint arXiv:2506.23775 (2025)

work page arXiv 2025

[22] [22]

arXiv preprint arXiv:2501.18691 , year=

M. Ben-Dov and J. Chen, Regularized second-order optimization of tensor-network Born machines, arXiv preprint arXiv:2501.18691 (2025)

work page arXiv 2025

[23] [23]

Nocedal and S

J. Nocedal and S. J. Wright,Numerical optimization (Springer, 2006)

work page 2006

[24] [24]

B. A. Pearlmutter, Fast exact multiplication by the Hes- sian, Neural Comput.6, 147 (1994)

work page 1994

[25] [25]

Griewank and A

A. Griewank and A. Walther,Evaluating derivatives: principles and techniques of algorithmic differentiation (SIAM, 2008)

work page 2008

[26] [26]

Liao, J.-G

H.-J. Liao, J.-G. Liu, L. Wang, and T. Xiang, Differ- entiable programming tensor networks, Phys. Rev. X9, 031041 (2019)

work page 2019

[27] [27]

Francuz, N

A. Francuz, N. Schuch, and B. Vanhecke, Stable and effi- cient differentiation of tensor network algorithms, Phys. Rev. Research7, 013237 (2025)

work page 2025

[28] [29]

Khatri, R

S. Khatri, R. LaRose, A. Poremba, L. Cincio, A. T. Sorn- borger, and P. J. Coles, Quantum-assisted quantum com- piling, Quantum3, 140 (2019)

work page 2019

[29] [30]

M. C. Caro, H.-Y. Huang, N. Ezzell, J. Gibbs, A. T. Sornborger, L. Cincio, P. J. Coles, and Z. Holmes, Out- of-distribution generalization for learning quantum dy- namics, Nat. Commun.14, 3751 (2023)

work page 2023

[30] [31]

Gibbs and L

J. Gibbs and L. Cincio, Deep circuit compression for quantum dynamics via tensor networks, Quantum9, 1789 (2025)

work page 2025

[31] [32]

Robertson, A

N. Robertson, A. Akhriev, J. Vala, and S. Zhuk, Ap- proximate quantum compiling for quantum simulation: A tensor network based approach, ACM Trans. Quan- tum Comput.6(2025)

work page 2025

[32] [33]

Causer, F

L. Causer, F. Jung, A. Mitra, F. Pollmann, and A. Gammon-Smith, Scalable simulation of nonequilib- rium quantum dynamics via classically optimized unitary circuits, Phys. Rev. Research6, 033062 (2024)

work page 2024

[33] [34]

A short tutorial on Wirtinger Calculus with applications in quantum information,

M. D’Anna, Y. Zhang, R. Wiersema, M. S. Rudolph, and J. Carrasquilla, Circuit compression for 2d quantum dynamics, arXiv preprint arXiv:2312.04858 (2025)

work page arXiv 2025

[34] [35]

I. A. Luchnikov, A. Ryzhov, S. N. Filippov, and H. Ouer- dane, QGOpt: Riemannian optimization for quantum technologies, SciPost Phys.10, 079 (2021)

work page 2021

[35] [36]

I. A. Luchnikov, M. E. Krechetov, and S. N. Filippov, Riemannian geometry and automatic differentiation for optimization problems of quantum physics and quantum technologies, New J. Phys.23, 073006 (2021)

work page 2021

[36] [37]

M. Herlihy, The art of multiprocessor programming, in Proceedings of the Twenty-Fifth Annual ACM Sympo- sium on Principles of Distributed Computing, PODC ’06 (Association for Computing Machinery, New York, NY, USA, 2006) p. 1–2

work page 2006

[37] [38]

Gibbs and L

J. Gibbs and L. Cincio, Learning circuits with infi- nite tensor networks, arXiv preprint arXiv:2506.02105 (2025)

work page arXiv 2025

[38] [39]

A. W. Sandvik and G. Vidal, Variational quantum Monte Carlo simulations with tensor-network states, Phys. Rev. Lett.99, 220602 (2007)

work page 2007

[39] [40]

Liu, S.-J

W.-Y. Liu, S.-J. Dong, Y.-J. Han, G.-C. Guo, and L. He, Gradient optimization of finite projected entangled pair states, Phys. Rev. B95, 195154 (2017)

work page 2017

[40] [41]

Liu, Y.-Z

W.-Y. Liu, Y.-Z. Huang, S.-S. Gong, and Z.-C. Gu, Accu- rate simulation for finite projected entangled pair states in two dimensions, Phys. Rev. B103, 235155 (2021)

work page 2021

[41] [42]

Wu and Z

Y. Wu and Z. Dai, Algorithms for variational Monte Carlo calculations of fermion projected entangled pair states in the swap gates formulation and the detailed bal- ance of tensor network sequential sampling, Chin. Phys. B35, 020502 (2026)

work page 2026

[42] [43]

T. Chen, J. Liu, Y. Wu, P. Zhang, and Y. Deng, Varia- tional Monte Carlo (VMC) with row-update projected entangled-pair states (PEPS) and its applications in quantum spin glasses, arXiv preprint arXiv:2601.20608 (2026)

work page arXiv 2026

[43] [44]

R. J. Webber and M. Lindsey, Rayleigh-Gauss-Newton optimization with enhanced sampling for variational Monte Carlo, Phys. Rev. Research4, 033099 (2022)

work page 2022

[44] [45]

Lanczos, An iteration method for the solution of the eigenvalue problem of linear differential and integral op- erators, J

C. Lanczos, An iteration method for the solution of the eigenvalue problem of linear differential and integral op- erators, J. Res. Natl. Bur. Stand.45, 255 (1950)

work page 1950

[45] [46]

ADAM implementation by qiskit v0.18 (2021)

work page 2021

[46] [47]

trust-region implementation by rqcopt v1.0.0 (2022)

work page 2022

[47] [48]

Bradbury, R

J. Bradbury, R. Frostig, P. Hawkins, M. J. Johnson, C. Leary, D. Maclaurin, G. Necula, A. Paszke, J. Van- derPlas, S. Wanderman-Milne, and Q. Zhang, JAX: com- posable transformations of Python+NumPy programs (2018)

work page 2018

[48] [49]

C. T. Kelley,Iterative methods for optimization(SIAM, 1999)

work page 1999

[49] [50]

Hessian-vector products for tensor networks via recursive tangent-state propagation

J. Shewchuk,An introduction to the conjugate gradi- ent method without the agonizing pain, Technical Report CMU-CS-94-125 (School of Computer Science, Carnegie Mellon University, 1994). 1 Supplementary Material for: “Hessian-vector products for tensor networks via recursive tangent-state propagation” Appendix A: Complex derivatives In this section, we sum...

work page 1994