
arxiv: 2604.20384 · v1 · submitted 2026-04-22 · 🪐 quant-ph · physics.comp-ph

Hessian-vector products for tensor networks via recursive tangent-state propagation

Pith reviewed 2026-05-10 00:21 UTC · model grok-4.3

classification 🪐 quant-ph physics.comp-ph
keywords tensor networks · Hessian-vector product · Riemannian optimization · trust-region method · quantum circuit compression · tangent-state propagation · second-order optimization · spin-chain time evolution

The pith

Recursive tangent-state propagation computes scalable Hessian-vector products for tensor networks without building the full matrix.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper sets out to demonstrate that second-order optimization, which resists local minima better than first-order methods, can be applied to large tensor networks by replacing the full Hessian with an analytical product against any vector. It does this through a two-pass algorithm that propagates tangent states recursively while holding the virtual bond dimension fixed. This matters for quantum circuit tasks because first-order approaches like Riemannian ADAM converge slowly on time-evolution circuits, whereas the new kernel integrates directly into a Riemannian trust-region solver. If the method works as described, it removes the main computational barrier that previously made second-order methods impractical for these systems.
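
For orientation, the Hessian-vector product referred to above is the directional derivative of the gradient, in the notation of the paper's Figure 1 caption. The explicit limit below is the standard real-variable definition, written out here as a reading aid rather than quoted from the paper:

```latex
H(f(z))[v] \;=\; \lim_{\epsilon \to 0} \frac{\nabla f(z + \epsilon v) - \nabla f(z)}{\epsilon}
```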

Core claim

An analytical Hessian-vector product kernel is introduced for arbitrary compositions of linear maps. The kernel is realized as a two-pass algorithm that performs recursive tangent-state propagation while keeping the virtual bond dimension bounded, thereby guaranteeing scalability. When this kernel is placed inside a Riemannian trust-region optimizer for quantum circuit compression, the resulting procedure produces up to four orders of magnitude higher fidelity than naive Trotterization on time-evolution circuits for spin chains and exhibits markedly smoother and faster convergence than standard first-order Riemannian methods.
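
To illustrate why a Hessian-vector product alone is enough for a trust-region step, here is a minimal sketch of the standard Steihaug-Toint truncated conjugate-gradient subproblem solver in plain Python. It is a Euclidean stand-in, not the paper's Riemannian solver: the names `truncated_cg`, `boundary_step`, and the `hvp` callback are hypothetical, and the 2x2 matrix is only a toy substitute for the kernel.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def axpy(alpha, x, y):
    """Return alpha * x + y."""
    return [alpha * a + b for a, b in zip(x, y)]

def boundary_step(s, d, radius):
    """Move from s along d until the trust-region boundary ||s + tau d|| = radius."""
    dd, sd, ss = dot(d, d), dot(s, d), dot(s, s)
    tau = (-sd + math.sqrt(sd * sd + dd * (radius * radius - ss))) / dd
    return axpy(tau, d, s)

def truncated_cg(grad, hvp, radius, tol=1e-10, max_iter=50):
    """Approximately minimise g.s + 0.5 s.H s subject to ||s|| <= radius.

    The Hessian H is only ever accessed through the callback hvp(d) = H d.
    """
    s = [0.0] * len(grad)
    r = list(grad)                    # residual r = g + H s
    d = [-ri for ri in r]             # first search direction: steepest descent
    rr = dot(r, r)
    if math.sqrt(rr) < tol:
        return s
    for _ in range(max_iter):
        Hd = hvp(d)
        dHd = dot(d, Hd)
        if dHd <= 0.0:                # negative curvature: go to the boundary
            return boundary_step(s, d, radius)
        alpha = rr / dHd
        s_next = axpy(alpha, d, s)
        if math.sqrt(dot(s_next, s_next)) >= radius:
            return boundary_step(s, d, radius)
        s = s_next
        r = axpy(alpha, Hd, r)
        rr_next = dot(r, r)
        if math.sqrt(rr_next) < tol:
            return s
        d = axpy(rr_next / rr, d, [-ri for ri in r])
        rr = rr_next
    return s

# Toy problem (assumption): a fixed 2x2 positive-definite matrix plays the Hessian.
H = [[4.0, 1.0], [1.0, 3.0]]
g = [-1.0, -2.0]
hvp = lambda d: [dot(row, d) for row in H]

s_free = truncated_cg(g, hvp, radius=10.0)   # unconstrained Newton step fits inside
s_small = truncated_cg(g, hvp, radius=0.1)   # step is clipped to the boundary
print(s_free, s_small)
```

With a large radius the solver returns the Newton step; with a small radius it stops on the trust-region boundary. In a Riemannian setting the same iteration would run in the tangent space at the current point, which this sketch does not attempt.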

What carries the argument

Recursive tangent-state propagation, the two-pass mechanism that carries tangent vectors through successive linear maps while enforcing a fixed virtual bond dimension to evaluate the Hessian-vector product.
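
The two-pass idea can be sketched on the simplest composition of linear maps, a real-valued chain f = phi . (A_n ... A_1 psi). This is an illustrative toy under stated assumptions, not the paper's kernel: the function `grad_and_hvp` and all helper names are hypothetical, the maps are small dense matrices, and the tangent state is a single vector of the same size as the state, which is only a loose analogue of a bounded virtual bond dimension.

```python
import random

def matvec(M, x):
    """Return M @ x for a list-of-rows matrix M."""
    return [sum(m * xi for m, xi in zip(row, x)) for row in M]

def matvec_t(M, y):
    """Return M^T @ y."""
    return [sum(M[i][j] * y[i] for i in range(len(M))) for j in range(len(M[0]))]

def add(u, v):
    return [a + b for a, b in zip(u, v)]

def outer(u, v):
    return [[a * b for b in v] for a in u]

def grad_and_hvp(maps, dirs, psi, phi):
    """Gradient of f = phi . (A_n ... A_1 psi) and its HVP along `dirs`."""
    n = len(maps)
    # Forward pass: states x_k and tangent states dx_k, propagated together.
    x, dx = [psi], [[0.0] * len(psi)]
    for A, V in zip(maps, dirs):
        dx.append(add(matvec(V, x[-1]), matvec(A, dx[-1])))
        x.append(matvec(A, x[-1]))
    # Backward pass: co-states y and tangent co-states dy, assembled on the fly.
    y, dy = phi, [0.0] * len(phi)
    grads, hvps = [None] * n, [None] * n
    for k in reversed(range(n)):
        grads[k] = outer(y, x[k])
        hvps[k] = [add(p, q) for p, q in zip(outer(dy, x[k]), outer(y, dx[k]))]
        dy = add(matvec_t(dirs[k], y), matvec_t(maps[k], dy))
        y = matvec_t(maps[k], y)
    return grads, hvps

random.seed(0)
d, n, eps = 3, 3, 1e-4

def rand_mat():
    return [[random.uniform(-1, 1) for _ in range(d)] for _ in range(d)]

maps = [rand_mat() for _ in range(n)]
dirs = [rand_mat() for _ in range(n)]
psi = [random.uniform(-1, 1) for _ in range(d)]
phi = [random.uniform(-1, 1) for _ in range(d)]

def shifted(sign):
    """Return the maps displaced by sign * eps along dirs."""
    return [[[a + sign * eps * v for a, v in zip(ra, rv)]
             for ra, rv in zip(A, V)] for A, V in zip(maps, dirs)]

_, hvp = grad_and_hvp(maps, dirs, psi, phi)
g_plus, _ = grad_and_hvp(shifted(+1), dirs, psi, phi)
g_minus, _ = grad_and_hvp(shifted(-1), dirs, psi, phi)

# Compare the analytical HVP with a central finite difference of the gradient.
max_err = max(
    abs((gp - gm) / (2 * eps) - h)
    for Gp, Gm, H in zip(g_plus, g_minus, hvp)
    for rp, rm, rh in zip(Gp, Gm, H)
    for gp, gm, h in zip(rp, rm, rh)
)
print("max deviation from central finite difference:", max_err)
```

Because f is linear in each map, the diagonal Hessian blocks vanish and the product for map k only needs the tangent state arriving from the right and the tangent co-state arriving from the left. No Hessian matrix is formed at any point.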

If this is right

  • Second-order Riemannian optimization becomes practical for tensor networks whose size previously ruled out explicit Hessian construction.
  • Quantum circuit compression of time-evolution operators on spin chains reaches fidelities up to four orders of magnitude above naive Trotterization.
  • Convergence trajectories become smoother and require fewer iterations than those produced by first-order Riemannian ADAM.
  • The same kernel applies without modification to any composition of linear maps that can be represented as a tensor network.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The bounded-dimension propagation technique may extend to the computation of higher-order derivative-vector products in the same tensor-network setting.
  • The method could be tested on tensor-network ansatzes used in classical machine-learning models to check whether the same scalability gains appear outside quantum circuits.
  • If the bounded virtual bond dimension remains accurate for deeper or wider networks, the approach would directly support optimization of larger many-body systems without additional truncation analysis.

Load-bearing premise

Bounding the virtual bond dimension during recursive tangent-state propagation still yields the exact Hessian-vector product rather than an uncontrolled approximation.

What would settle it

On a small tensor network where the full Hessian can be stored and multiplied explicitly, compare the output of the recursive kernel against both the explicit Hessian-vector product and a finite-difference approximation of the directional derivative of the gradient.
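
A check of this kind could look like the following sketch. Everything in it is an assumption made for illustration: the "network" is a toy contraction of a fixed 2x2x2 tensor with three 2-vectors, the function names are hypothetical, and the central finite difference stands in for the kernel under test. In a real check, the kernel's output would be passed to `relative_error` as the candidate.

```python
import math

# Toy objective (assumption): f(a, b, c) = sum_ijk T[i][j][k] a[i] b[j] c[k],
# with the parameters stored as one flat vector z = a + b + c.
T = [[[0.5, -1.0], [2.0, 0.3]], [[-0.7, 1.5], [0.9, -0.2]]]
IDX = [(i, j, k) for i in range(2) for j in range(2) for k in range(2)]

def gradient(z):
    a, b, c = z[0:2], z[2:4], z[4:6]
    g = [0.0] * 6
    for i, j, k in IDX:
        t = T[i][j][k]
        g[i] += t * b[j] * c[k]
        g[2 + j] += t * a[i] * c[k]
        g[4 + k] += t * a[i] * b[j]
    return g

def full_hessian(z):
    """Explicit 6x6 Hessian; feasible only because the problem is tiny."""
    a, b, c = z[0:2], z[2:4], z[4:6]
    H = [[0.0] * 6 for _ in range(6)]
    for i, j, k in IDX:
        t = T[i][j][k]
        blocks = ((i, 2 + j, t * c[k]), (i, 4 + k, t * b[j]), (2 + j, 4 + k, t * a[i]))
        for row, col, val in blocks:
            H[row][col] += val
            H[col][row] += val
    return H

def reference_hvp(z, v):
    return [sum(h * vi for h, vi in zip(row, v)) for row in full_hessian(z)]

def finite_difference_hvp(z, v, eps):
    """Central difference of the gradient along v."""
    zp = [zi + eps * vi for zi, vi in zip(z, v)]
    zm = [zi - eps * vi for zi, vi in zip(z, v)]
    return [(gp - gm) / (2 * eps) for gp, gm in zip(gradient(zp), gradient(zm))]

def relative_error(candidate, reference):
    num = math.sqrt(sum((c - r) ** 2 for c, r in zip(candidate, reference)))
    den = math.sqrt(sum(r ** 2 for r in reference))
    return num / den

z = [1.0, -0.5, 0.3, 0.8, -1.2, 0.4]
v = [0.2, 0.1, -0.3, 0.5, 0.7, -0.6]
exact = reference_hvp(z, v)
errors = {eps: relative_error(finite_difference_hvp(z, v, eps), exact)
          for eps in (1e-2, 1e-4, 1e-6)}
for eps, err in errors.items():
    print(f"eps={eps:g}  relative error={err:.2e}")
```

Scanning several step sizes helps separate truncation error from floating-point cancellation. A kernel that is exact should agree with the explicit product to round-off, independent of any step size.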

Figures

Figures reproduced from arXiv: 2604.20384 by Christian B. Mendl, Isabel Nha Minh Le, Roeland Wiersema.

Figure 1: Visualization of the gradient and its directional derivative, i.e., the Hessian-vector product H(f(z))[v]. The gradient ∇f represents the local slope; the difference between the gradient at point z and a perturbed point z + ϵv defines the Hessian-vector product as ϵ → 0.
Figure 2: TN representation as a composition of linear ...
Figure 3: Computational passes for a brickwall circuit.
Figure 4: Riemannian quantum circuit optimization results for various spin chains. (Top) Approximation accuracy of the optimized circuit compared to naive Trotterization, evaluated on a test batch of 100 Haar-random product states. (Bottom) The trust-region method correctly reproduces the ADAM optimization results
Figure 5: Convergence of Riemannian quantum circuit ...
Original abstract

Optimizing tensor networks with standard first-order methods often leads to slow convergence and entrapment in local minima. Although second-order optimization offers enhanced robustness, explicitly constructing the full Hessian matrix is computationally prohibitive for large-scale systems. In this work, we bypass this bottleneck by introducing an analytical Hessian-vector product kernel designed for arbitrary compositions of linear maps. This two-pass algorithm leverages recursive tangent-state propagation with a bounded virtual bond dimension to guarantee scalability. We demonstrate the practical utility of this kernel by integrating it into a Riemannian trust-region framework for quantum circuit compression. Evaluated on time-evolution circuits for various spin chains, our second-order approach achieves up to a four-order-of-magnitude improvement in fidelity over naive Trotterization, while delivering significantly smoother, faster convergence than conventional first-order methods such as Riemannian ADAM.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper introduces an analytical Hessian-vector product (HVP) kernel for tensor networks formed by arbitrary compositions of linear maps. The kernel is realized via a two-pass recursive tangent-state propagation algorithm that deliberately bounds the virtual bond dimension for scalability. This HVP is embedded in a Riemannian trust-region optimizer and applied to quantum circuit compression of time-evolution circuits on spin chains, where the second-order method reportedly yields up to four orders of magnitude higher fidelity than naive Trotterization and smoother convergence than first-order Riemannian ADAM.

Significance. If the HVP is shown to be exact (or with rigorously controlled error) despite the bond-dimension bound, the work supplies a missing computational primitive for second-order optimization of tensor networks. This would be valuable for variational quantum algorithms and many-body simulations where first-order methods converge slowly. The reported fidelity gains on spin-chain circuits provide initial evidence of practical impact, though quantitative validation details are needed to assess the strength of the empirical claims.

major comments (2)
  1. [Abstract and recursive tangent-state propagation section (likely §3)] The central claim of an 'analytical' (exact) HVP is load-bearing, yet the abstract and method description rely on recursive tangent-state propagation with an explicitly bounded virtual bond dimension. Any such bound introduces a truncation whose error vanishes only if the intermediate tangent states remain exactly within the truncated manifold. The manuscript must demonstrate (via projection identities, closed-form cancellation, or an explicit error analysis) that the recursion preserves exactness; otherwise the 'analytical' guarantee does not hold. This issue directly affects the weakest assumption identified in the stress-test note.
  2. [Application to quantum circuit compression and numerical results] The integration into the Riemannian trust-region framework assumes the supplied HVP is sufficiently accurate for the trust-region subproblem solver. No quantitative error analysis or comparison against finite-difference or full-Hessian baselines is referenced in the provided abstract; such validation is required to confirm that the observed fidelity improvements are attributable to the second-order information rather than to other implementation details.
minor comments (2)
  1. [Abstract] The abstract reports 'up to a four-order-of-magnitude improvement in fidelity' without specifying the system sizes, circuit depths, or exact fidelity metric used; these details should be added for reproducibility.
  2. [Method description] Notation for the tangent states and the two-pass algorithm would benefit from an explicit pseudocode listing or a small illustrative diagram showing the forward and backward recursions.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their constructive and insightful comments, which help clarify the presentation of our analytical HVP kernel and its validation. We address each major comment below, indicating the revisions we will incorporate.

Point-by-point responses
  1. Referee: [Abstract and recursive tangent-state propagation section (likely §3)] The central claim of an 'analytical' (exact) HVP is load-bearing, yet the abstract and method description rely on recursive tangent-state propagation with an explicitly bounded virtual bond dimension. Any such bound introduces a truncation whose error vanishes only if the intermediate tangent states remain exactly within the truncated manifold. The manuscript must demonstrate (via projection identities, closed-form cancellation, or an explicit error analysis) that the recursion preserves exactness; otherwise the 'analytical' guarantee does not hold. This issue directly affects the weakest assumption identified in the stress-test note.

    Authors: We agree that demonstrating exactness of the HVP with respect to the truncated manifold is essential for the 'analytical' claim. The recursive tangent-state propagation is constructed so that each tangent vector is explicitly projected onto the tangent space of the current bond-dimension truncation at every step; this ensures the computed product is exact for the truncated network rather than an approximation to the untruncated Hessian. We will add a new subsection to §3 containing a formal proof via projection identities showing that the two-pass recursion commutes with these projections and therefore preserves exactness on the truncated manifold. We will also include a short discussion of the (controlled) difference from the untruncated case. revision: yes

  2. Referee: [Application to quantum circuit compression and numerical results] The integration into the Riemannian trust-region framework assumes the supplied HVP is sufficiently accurate for the trust-region subproblem solver. No quantitative error analysis or comparison against finite-difference or full-Hessian baselines is referenced in the provided abstract; such validation is required to confirm that the observed fidelity improvements are attributable to the second-order information rather than to other implementation details.

    Authors: We acknowledge that stronger quantitative validation is needed to attribute the reported fidelity gains specifically to the second-order information. In the revised manuscript we will add a dedicated validation subsection (within the numerical results) that (i) compares the analytical HVP against finite-difference approximations on small spin-chain instances where the full Hessian remains tractable, (ii) reports observed relative errors and their scaling with bond dimension, and (iii) includes an ablation comparing trust-region steps using the analytical HVP versus first-order Riemannian ADAM on the same circuits. These additions will directly address the attribution question. revision: yes

Circularity Check

0 steps flagged

No circularity: analytical derivation is self-contained

Full rationale

The paper introduces an analytical Hessian-vector product via recursive tangent-state propagation for tensor networks. No load-bearing step reduces by construction to a fitted parameter, self-definition, or self-citation chain; the bounded bond dimension is explicitly a scalability choice rather than a hidden redefinition of the target quantity. The derivation chain remains independent of its inputs and does not rely on renaming known results or smuggling ansatzes through citations.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

Based solely on the abstract, the work relies on standard domain assumptions in tensor networks and differentiable programming; no free parameters, new entities, or ad-hoc axioms are introduced.

axioms (1)
  • Domain assumption: tensor networks consist of compositions of linear maps that admit analytical differentiation.
    Required for tangent-state propagation to compute Hessian-vector products exactly.


