pith. machine review for the scientific record.

arxiv: 2604.00037 · v2 · submitted 2026-03-24 · 🧮 math.NA · cs.NA · physics.comp-ph · quant-ph

Recognition: no theorem link

Fast elementwise operations on tensor trains with alternating cross interpolation

Authors on Pith: no claims yet

Pith reviewed 2026-05-15 00:22 UTC · model grok-4.3

classification 🧮 math.NA · cs.NA · physics.comp-ph · quant-ph

keywords tensor trains · matrix product states · elementwise operations · alternating cross interpolation · high-dimensional tensors · numerical linear algebra · partial differential equations

The pith

The alternating cross interpolation algorithm performs elementwise operations on tensor trains in O(χ³) time while controlling error.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Tensor trains compress high-dimensional data so that calculations on the data remain efficient. In applications such as TT-based solvers for nonlinear partial differential equations, the dominant cost is elementwise multiplication or similar operations on several tensor trains. Existing error-controlled methods for these operations scale as O(χ⁴) with the TT rank χ. The paper shows that an alternating cross interpolation procedure can reduce the cost to O(χ³) whenever the rank of the result stays below χ², while still providing rigorous error bounds. Numerical tests on benchmark problems confirm the faster scaling and the practical speedups that result for ranks typical in applications.
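To see where the quartic cost comes from, consider the simplest nontrivial case: a two-core TT is just a rank-χ matrix factorization, and the elementwise (Hadamard) product of two rank-χ matrices generically has rank χ². Recompressing from that squared rank is what drives the O(χ⁴) scaling of standard error-controlled methods. A small NumPy check of this rank growth (illustrative only; the paper treats general d-core TTs, where the same Kronecker-product structure of the cores bounds the output ranks by χ²):

```python
import numpy as np

# d = 2 case of a tensor train: a rank-chi matrix factorization.
rng = np.random.default_rng(0)
chi, n = 3, 50
A = rng.standard_normal((n, chi)) @ rng.standard_normal((chi, n))
B = rng.standard_normal((n, chi)) @ rng.standard_normal((chi, n))

H = A * B  # elementwise (Hadamard) product

# The product's rank is bounded by chi**2 = 9, and generic random
# factors attain the bound -- this squared rank is what the O(chi^4)
# recompression step has to digest.
print(np.linalg.matrix_rank(A), np.linalg.matrix_rank(B), np.linalg.matrix_rank(H))
```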

Core claim

We present the alternating cross interpolation (ACI) algorithm that performs such operations in O(χ³), while maintaining error control.

What carries the argument

The alternating cross interpolation (ACI) algorithm, which builds successive low-rank approximations by alternating between cross selections and interpolations to realize the elementwise product or similar operation.
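The paper's ACI works sweep-wise on TT cores; as a matrix-level illustration of the underlying cross-interpolation idea (a sketch of greedy adaptive cross approximation, not the authors' algorithm), a skeleton approximation can be built from a handful of selected rows and columns, never forming the full product:

```python
import numpy as np

def cross_approx(M, tol=1e-10, max_rank=20):
    """Greedy cross (skeleton) approximation with full pivoting.

    Builds M ~ U @ V from selected rows and columns of the residual,
    one rank-1 "cross" per step. Full pivoting is used here only for
    simplicity; practical cross interpolation uses partial pivoting
    so that only O(rank) fibers of M are ever evaluated.
    """
    R = np.array(M, dtype=float)          # residual
    us, vs = [], []
    for _ in range(max_rank):
        i, j = np.unravel_index(np.argmax(np.abs(R)), R.shape)
        if abs(R[i, j]) < tol:
            break
        u = R[:, j] / R[i, j]             # selected column, scaled by pivot
        v = R[i, :].copy()                # selected row
        R -= np.outer(u, v)
        us.append(u)
        vs.append(v)
    return np.column_stack(us), np.vstack(vs)

# A rank-3 matrix is recovered exactly from three crosses.
rng = np.random.default_rng(1)
M = rng.standard_normal((40, 3)) @ rng.standard_normal((3, 40))
U, V = cross_approx(M)
print(U.shape[1], np.linalg.norm(M - U @ V))
```

Each step touches one row and one column of the residual, which is what keeps the per-step cost linear in the matrix dimension rather than quadratic.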

If this is right

  • TT-based solvers for nonlinear PDEs become faster for the ranks encountered in practice.
  • Elementwise operations can be performed with the same error control previously available only at quartic cost.
  • The method extends directly to other elementwise operations beyond multiplication.
  • Significant runtime reductions are observed on standard benchmark problems.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Similar cross-interpolation strategies might accelerate other tensor-network contractions that currently rely on quartic intermediates.
  • The approach could be combined with rank-adaptive truncation to handle cases where the output rank temporarily grows.
  • In high-dimensional scientific computing, the cubic scaling opens previously inaccessible problem sizes.

Load-bearing premise

The rank of the output tensor train stays smaller than χ² so that the improved scaling applies without rank explosion.
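This premise is not exotic: for the quadratic nonlinearities common in the paper's PDE use case, symmetry alone already caps the output rank at χ(χ+1)/2, strictly below χ². A minimal check in the d = 2 (matrix) case, assuming generic random factors:

```python
import numpy as np

# Elementwise squaring -- the model nonlinearity u -> u^2 in PDE
# solvers -- of a rank-chi matrix (the d = 2 TT case). By symmetry of
# the terms (u_i * u_j), the result has rank at most chi*(chi+1)/2.
rng = np.random.default_rng(2)
chi, n = 3, 60
A = rng.standard_normal((n, chi)) @ rng.standard_normal((chi, n))
S = A * A
print(np.linalg.matrix_rank(S))  # 6 = chi*(chi+1)/2, not chi**2 = 9
```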

What would settle it

A concrete test case in which the elementwise operation produces an output TT whose rank exceeds χ², after which the measured runtime reverts to quartic scaling or the error bound is lost.

Figures

Figures reproduced from arXiv: 2604.00037 by Marc K. Ritter.

Figure 1. Multiplication of Gaussians for algorithm verification.
Figure 2. Hadamard product of random Fourier series, Eq.
Figure 3. Similar to Fig.
read the original abstract

Tensor trains (TTs), also known as matrix product states (MPS), are compressed representations of high-dimensional data that can be efficiently manipulated to perform calculations on the data. In many applications, such as TT-based solvers for nonlinear partial differential equations, the most expensive step is an elementwise multiplication or similar elementwise operation on multiple TTs. Known error-controlled algorithms for such operations scale as $O(\chi^4)$, where $\chi$ is the TT rank. If the rank of the output is smaller than $\chi^2$, it is possible to formulate algorithms with better scaling. In this work, we present the alternating cross interpolation (ACI) algorithm that performs such operations in $O(\chi^3)$, while maintaining error control. We demonstrate these properties on benchmark problems, achieving a significant speedup for TT ranks that are commonly encountered in practical applications.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The manuscript introduces the alternating cross interpolation (ACI) algorithm for performing elementwise operations (such as multiplication) on tensor trains (TTs/MPS). It claims that ACI achieves O(χ³) complexity with error control whenever the output TT rank remains smaller than χ², improving on the standard O(χ⁴) scaling of existing error-controlled methods, and demonstrates the speedup on benchmark problems relevant to nonlinear PDE solvers.

Significance. If the O(χ³) scaling and error control hold under the stated rank condition, the result would meaningfully accelerate the dominant step in many TT-based applications. The approach builds directly on established cross-interpolation and TT techniques without introducing free parameters or fitted constants, which strengthens the contribution. The central practical value rests on whether ACI reliably keeps intermediate and output ranks from reaching Θ(χ²) without hidden costs that would restore O(χ⁴) scaling.

major comments (2)
  1. [Abstract and §3] Abstract and §3 (algorithm description): the O(χ³) claim is conditioned on output rank remaining < χ², yet the alternating cross sweeps are not shown to include a provable truncation mechanism that prevents intermediate ranks from reaching Θ(χ²) before the final compression step; without this, per-sweep cost reverts to O(χ⁴) and the headline complexity does not hold uniformly.
  2. [§4] §4 (complexity analysis): the derivation of the improved scaling assumes that the cross-interpolation procedure itself enforces the rank bound without additional truncation overhead; no explicit bound or pseudocode step is given that guarantees this for general elementwise operations, leaving the complexity claim dependent on an unproven assumption about the target applications.
minor comments (2)
  1. [Figure 2] Figure 2 caption: the legend does not distinguish the ACI timing curve from the reference O(χ⁴) method; add explicit labels or line styles for clarity.
  2. [Notation throughout] Notation: the symbol χ is used both for the input TT ranks and (implicitly) for the output rank bound; introduce a distinct symbol (e.g., χ_out) when stating the condition χ_out < χ².

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the careful reading and constructive comments on the complexity claims. We address each major point below and will revise the manuscript to make the rank-control mechanism and conditional nature of the O(χ³) bound fully explicit.

read point-by-point responses
  1. Referee: [Abstract and §3] Abstract and §3 (algorithm description): the O(χ³) claim is conditioned on output rank remaining < χ², yet the alternating cross sweeps are not shown to include a provable truncation mechanism that prevents intermediate ranks from reaching Θ(χ²) before the final compression step; without this, per-sweep cost reverts to O(χ⁴) and the headline complexity does not hold uniformly.

    Authors: We agree that the presentation in §3 would be strengthened by an explicit truncation step. In the ACI procedure each cross-interpolation is performed on unfoldings whose dimensions are governed by the current TT cores; because the target approximation is a TT whose rank is bounded by the output rank χ_out < χ², the number of selected fibers per core remains O(χ). To guarantee that intermediate ranks never exceed this bound during the alternating sweeps, we will insert a standard TT truncation (with relative tolerance matching the overall error control) after each sweep. This truncation costs O(χ³) when the working rank is O(χ) and therefore preserves the headline complexity. The revised pseudocode and accompanying text in §3 will document this step. revision: yes

  2. Referee: [§4] §4 (complexity analysis): the derivation of the improved scaling assumes that the cross-interpolation procedure itself enforces the rank bound without additional truncation overhead; no explicit bound or pseudocode step is given that guarantees this for general elementwise operations, leaving the complexity claim dependent on an unproven assumption about the target applications.

    Authors: Section 4 derives the O(χ³) scaling under the explicit hypothesis that the output TT admits rank χ_out < χ². For arbitrary elementwise operations the exact product may reach rank Θ(χ²), but ACI is designed for the practically relevant regime in which the result is compressible to rank o(χ²) (as occurs for the nonlinear terms in the PDE benchmarks). We do not claim a universal a-priori bound that holds for every possible nonlinearity without truncation; the algorithm relies on the final compression and the added per-sweep truncation described above to enforce the rank condition. We will revise §4 to state this conditional assumption clearly and to reference the new truncation step, removing any ambiguity about hidden O(χ⁴) costs. revision: partial

Circularity Check

0 steps flagged

No significant circularity; derivation self-contained

full rationale

The ACI algorithm extends established tensor-train and cross-interpolation methods with an explicit conditioning on output rank remaining below χ² to achieve O(χ³) scaling. No load-bearing step reduces by construction to a fitted parameter, self-definition, or self-citation chain; the complexity claim follows directly from the rank assumption and standard cross-interpolation sweeps without renaming known results or smuggling ansatzes. The derivation chain is independent of the present paper's own outputs and relies on externally verifiable TT literature.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

The central claim rests on standard assumptions of tensor train representations and cross-interpolation techniques from prior literature; no new free parameters, axioms, or invented entities are introduced in the abstract.

pith-pipeline@v0.9.0 · 5444 in / 944 out tokens · 39215 ms · 2026-05-15T00:22:10.671973+00:00 · methodology

discussion (0)


Reference graph

Works this paper leans on

48 extracted references · 48 canonical work pages · 1 internal anchor

  1. [1] M. C. Bañuls, Tensor network algorithms: A route map, Annual Review of Condensed Matter Physics 14, 173 (2023), doi:10.1146/annurev-conmatphys-040721-022705
  2. [2] R. Orús, A practical introduction to tensor networks: Matrix product states and projected entangled pair states, Annals of Physics 349, 117 (2014), doi:10.1016/j.aop.2014.06.013
  3. [3] S. R. White, Density matrix formulation for quantum renormalization groups, Physical Review Letters 69(19), 2863 (1992), doi:10.1103/PhysRevLett.69.2863
  4. [4] I. P. McCulloch, From density-matrix renormalization group to matrix product states, Journal of Statistical Mechanics: Theory and Experiment 2007(10), P10014 (2007), doi:10.1088/1742-5468/2007/10/P10014, cond-mat/0701428
  5. [5] U. Schollwöck, The density-matrix renormalization group in the age of matrix product states, Annals of Physics 326(1), 96 (2011), doi:10.1016/j.aop.2010.09.012
  6. [6] Y. Núñez Fernández, M. Jeannin, P. T. Dumitrescu, T. Kloss, J. Kaye, O. Parcollet and X. Waintal, Learning Feynman diagrams with tensor trains, Physical Review X 12(4), 041018 (2022), doi:10.1103/PhysRevX.12.041018
  7. [7] N. Jolly, Y. N. Fernández and X. Waintal, Tensorized orbitals for computational chemistry, doi:10.48550/arXiv.2308.03508 (2023), 2308.03508
  8. [8] L. Arenstein and M. Kastoryano, Full grid solution for multi-asset options pricing with tensor networks, doi:10.48550/arXiv.2601.00009 (2025), 2601.00009
  9. [9] K. Glau, D. Kressner and F. Statti, Low-rank tensor approximation for Chebyshev interpolation in parametric option pricing, SIAM Journal on Financial Mathematics 11(3), 897 (2020), doi:10.1137/19M1244172
  10. [10] M. Kastoryano and N. Pancotti, A highly efficient tensor network algorithm for multi-asset Fourier options pricing, doi:10.48550/arXiv.2203.02804 (2022), 2203.02804
  11. [11] R. Sakurai, H. Takahashi and K. Miyamoto, Learning parameter dependence for Fourier-based option pricing with tensor trains, Mathematics 13(11), 1828 (2025), doi:10.3390/math13111828, 2405.00701
  12. [12] H. Shinaoka, M. Wallerberger, Y. Murakami, K. Nogaki, R. Sakurai, P. Werner and A. Kauch, Multiscale space-time ansatz for correlation functions of quantum systems based on quantics tensor trains, Physical Review X 13(2), 021015 (2023), doi:10.1103/PhysRevX.13.021015
  13. [13] S. Rohshap, M. K. Ritter, H. Shinaoka, J. von Delft, M. Wallerberger and A. Kauch, Two-particle calculations with quantics tensor trains: Solving the parquet equations, Physical Review Research 7(2), 023087 (2025), doi:10.1103/PhysRevResearch.7.023087
  14. [14] M. K. Ritter, Y. Núñez Fernández, M. Wallerberger, J. von Delft, H. Shinaoka and X. Waintal, Quantics tensor cross interpolation for high-resolution parsimonious representations of multivariate functions, Physical Review Letters 132(5), 056501 (2024), doi:10.1103/PhysRevLett.132.056501
  15. [15] A. O. Fumega, M. Niedermeier and J. L. Lado, Correlated states in super-moiré materials with a kernel polynomial quantics tensor cross interpolation algorithm, 2D Materials 12(1), 015018 (2024), doi:10.1088/2053-1583/ad9d59
  16. [16] N. Gourianov, M. Lubasch, S. Dolgov, Q. Y. van den Berg, H. Babaee, P. Givi, M. Kiffner and D. Jaksch, A quantum-inspired approach to exploit turbulence structures, Nature Computational Science 2(1), 30 (2022), doi:10.1038/s43588-021-00181-1
  17. [17] E. Kornev, S. Dolgov, K. Pinto, M. Pflitsch, M. Perelshtein and A. Melnikov, Numerical solution of the incompressible Navier-Stokes equations for chemical mixers via quantum-inspired tensor train finite element method, doi:10.48550/arXiv.2305.10784 (2023), 2305.10784
  18. [18] R. D. Peddinti, S. Pisoni, A. Marini, P. Lott, H. Argentieri, E. Tiunov and L. Aolita, Quantum-inspired framework for computational fluid dynamics, Communications Physics 7(1), 1 (2024), doi:10.1038/s42005-024-01623-8
  19. [19] N. Gourianov, P. Givi, D. Jaksch and S. B. Pope, Tensor networks enable the calculation of turbulence probability distributions, Science Advances 11(5), eads5990 (2025), doi:10.1126/sciadv.ads5990
  20. [20] L. Hölscher, P. Rao, L. Müller, J. Klepsch, A. Luckow, T. Stollenwerk and F. K. Wilhelm, Quantum-inspired fluid simulation of two-dimensional turbulence with GPU acceleration, Physical Review Research 7(1), 013112 (2025), doi:10.1103/PhysRevResearch.7.013112
  21. [21] E. M. Stoudenmire and S. R. White, Minimally entangled typical thermal state algorithms, New Journal of Physics 12(5), 055026 (2010), doi:10.1088/1367-2630/12/5/055026
  22. [22] B.-B. Chen, L. Chen, Z. Chen, W. Li and A. Weichselbaum, Exponential thermal tensor network approach for quantum lattice models, Physical Review X 8(3), 031082 (2018), doi:10.1103/PhysRevX.8.031082
  23. [23] L. Ma, M. Fishman, E. M. Stoudenmire and E. Solomonik, Approximate contraction of arbitrary tensor networks with a flexible and efficient density matrix algorithm, Quantum 8, 1580 (2024), doi:10.22331/q-2024-12-27-1580
  24. [24] C. Camaño, E. N. Epperly and J. A. Tropp, Successive randomized compression: A randomized algorithm for the compressed MPO-MPS product, Quantum 10, 2022 (2026), doi:10.22331/q-2026-03-10-2022, 2504.06475
  25. [25] A. Bou-Comas, M. Płodzień, L. Tagliacozzo and J. J. García-Ripoll, Quantics tensor train for solving Gross-Pitaevskii equation, doi:10.48550/arXiv.2507.03134 (2025), 2507.03134
  26. [26] Q.-C. Chen, I.-K. Liu, J.-W. Li and C.-M. Chung, Solving the Gross-Pitaevskii equation with quantic tensor trains: Ground states and nonlinear dynamics, doi:10.48550/arXiv.2507.04279 (2025), 2507.04279
  27. [27] I. Oseledets and E. Tyrtyshnikov, TT-cross approximation for multidimensional arrays, Linear Algebra and its Applications 432(1), 70 (2010), doi:10.1016/j.laa.2009.07.024
  28. [28] I. V. Oseledets, Tensor-train decomposition, SIAM Journal on Scientific Computing 33(5), 2295 (2011), doi:10.1137/090752286
  29. [29] D. Savostyanov and I. Oseledets, Fast adaptive interpolation of multi-dimensional arrays in tensor train format, in The 2011 International Workshop on Multidimensional (nD) Systems, pp. 1–8, IEEE, ISBN 978-1-61284-815-0, doi:10.1109/nDS.2011.6076873 (2011)
  30. [30] D. V. Savostyanov, Quasioptimality of maximum-volume cross interpolation of tensors, Linear Algebra and its Applications 458, 217 (2014), doi:10.1016/j.laa.2014.06.006
  31. [31] S. Dolgov and D. Savostyanov, Parallel cross interpolation for high-precision calculation of high-dimensional integrals, Computer Physics Communications 246, 106869 (2020), doi:10.1016/j.cpc.2019.106869
  32. [32] Y. Núñez Fernández, M. K. Ritter, M. Jeannin, J.-W. Li, T. Kloss, T. Louvet, S. Terasaki, O. Parcollet, J. von Delft, H. Shinaoka and X. Waintal, Learning tensor networks with tensor cross interpolation: New algorithms and libraries, SciPost Physics 18(3), 104 (2025), doi:10.21468/SciPostPhys.18.3.104
  33. [33] S. V. Dolgov and D. V. Savostyanov, Alternating minimal energy methods for linear systems in higher dimensions, SIAM Journal on Scientific Computing 36(5), A2248 (2014), doi:10.1137/140953289
  34. [34] Z. Meng, Y. Khoo, J. Li and E. M. Stoudenmire, Recursive sketched interpolation: Efficient Hadamard products of tensor trains, doi:10.48550/arXiv.2602.17974 (2026), 2602.17974
  35. [35] A. A. Michailidis, C. Fenton and M. Kiffner, Element-wise multiplication of tensor trains, SIAM Journal on Scientific Computing 47(5), B1158 (2025), doi:10.1137/24M1714149
  36. [36] Z. Sun, J. Huang, C. Xiao and C. Yang, HaTT: Hadamard avoiding TT recompression, doi:10.48550/arXiv.2410.04385 (2025), 2410.04385
  37. [37] P. Cazeaux, M.-S. Dupuy and R. F. Justiniano, Linear-scaling tensor train sketching, doi:10.48550/arXiv.2603.11009 (2026), 2603.11009
  38. [38] S. A. Goreinov and E. E. Tyrtyshnikov, Quasioptimality of skeleton approximation of a matrix in the Chebyshev norm, Doklady Mathematics 83(3), 374 (2011), doi:10.1134/S1064562411030355
  39. [39] L. Devos, M. Van Damme and J. Haegeman, MPSKit, doi:10.5281/zenodo.10654900 (2026)
  40. [40] L. Devos and J. Haegeman, TensorKit.jl: A Julia package for large-scale tensor computations, with a hint of category theory, doi:10.48550/arXiv.2508.10076 (2025), 2508.10076
  41. [41] I. V. Oseledets, Approximation of 2^d × 2^d matrices using tensor decomposition, SIAM Journal on Matrix Analysis and Applications 31(4), 2130 (2010), doi:10.1137/090757861
  42. [42] M. Lindsey, Multiscale interpolative construction of quantized tensor trains (2023), 2311.12554
  43. [43] J. Chen, E. Stoudenmire and S. R. White, Quantum Fourier transform has small entanglement, PRX Quantum 4(4), 040318 (2023), doi:10.1103/PRXQuantum.4.040318
  44. [44] J. Chen and M. Lindsey, Direct interpolative construction of the discrete Fourier transform as a matrix product operator, Applied and Computational Harmonic Analysis 81, 101817 (2026), doi:10.1016/j.acha.2025.101817
  45. [45] V. A. Kazeev and B. N. Khoromskij, Low-rank explicit QTT representation of the Laplace operator and its inverse, SIAM Journal on Matrix Analysis and Applications 33(3), 742 (2012), doi:10.1137/100820479
  46. [46] G. Grosso, M. K. Ritter, S. Rohshap, S. Badr, A. Kauch, M. Wallerberger, J. v. Delft and H. Shinaoka, Adaptive patching for tensor train computations, doi:10.48550/arXiv.2602.22372 (2026), 2602.22372
  47. [47] J. Tindall, M. Stoudenmire and R. Levy, Compressing multivariate functions with tree tensor networks, doi:10.48550/arXiv.2410.03572 (2024), 2410.03572
  48. [48] M. K. Ritter, Code repository for 'Fast elementwise operations on tensor trains with alternating cross interpolation', doi:10.5281/zenodo.19208286
  49. [49] M. K. Ritter, AlternatingCrossInterpolation.jl (2026)
  50. [50] Tensor4all, tensor4all.org