pith. machine review for the scientific record.

arxiv: 2604.00037 · v2 · submitted 2026-03-24 · 🧮 math.NA · cs.NA · physics.comp-ph · quant-ph

Recognition: no theorem link

Fast elementwise operations on tensor trains with alternating cross interpolation

Authors on Pith: no claims yet

Pith reviewed 2026-05-15 00:22 UTC · model grok-4.3

classification 🧮 math.NA · cs.NA · physics.comp-ph · quant-ph

keywords tensor trains · matrix product states · elementwise operations · alternating cross interpolation · high-dimensional tensors · numerical linear algebra · partial differential equations

The pith

The alternating cross interpolation algorithm performs elementwise operations on tensor trains in O(χ³) time while controlling error.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Tensor trains compress high-dimensional data so that calculations on the data remain efficient. In applications such as TT-based solvers for nonlinear partial differential equations, the dominant cost is elementwise multiplication or similar operations on several tensor trains. Existing error-controlled methods for these operations scale as O(χ⁴) with the TT rank χ. The paper shows that an alternating cross interpolation procedure can reduce the cost to O(χ³) whenever the rank of the result stays below χ², while still providing rigorous error bounds. Numerical tests on benchmark problems confirm the faster scaling and the practical speedups that result for ranks typical in applications.
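To see where the quartic cost comes from, consider the simplest nontrivial case: a two-core TT is just a rank-χ matrix factorization, and the elementwise (Hadamard) product of two rank-χ matrices generically has rank χ². Recompressing from that squared rank is what drives the O(χ⁴) scaling of standard error-controlled methods. A small NumPy check of this rank growth (illustrative only; the paper treats general d-core TTs, where the same Kronecker-product structure of the cores bounds the output ranks by χ²):

```python
import numpy as np

# d = 2 case of a tensor train: a rank-chi matrix factorization.
rng = np.random.default_rng(0)
chi, n = 3, 50
A = rng.standard_normal((n, chi)) @ rng.standard_normal((chi, n))
B = rng.standard_normal((n, chi)) @ rng.standard_normal((chi, n))

H = A * B  # elementwise (Hadamard) product

# The product's rank is bounded by chi**2 = 9, and generic random
# factors attain the bound -- this squared rank is what the O(chi^4)
# recompression step has to digest.
print(np.linalg.matrix_rank(A), np.linalg.matrix_rank(B), np.linalg.matrix_rank(H))
```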

Core claim

We present the alternating cross interpolation (ACI) algorithm that performs such operations in O(χ³), while maintaining error control.

What carries the argument

The alternating cross interpolation (ACI) algorithm, which builds successive low-rank approximations by alternating between cross selections and interpolations to realize the elementwise product or similar operation.
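The paper's ACI works sweep-wise on TT cores; as a matrix-level illustration of the underlying cross-interpolation idea (a sketch of greedy adaptive cross approximation, not the authors' algorithm), a skeleton approximation can be built from a handful of selected rows and columns, never forming the full product:

```python
import numpy as np

def cross_approx(M, tol=1e-10, max_rank=20):
    """Greedy cross (skeleton) approximation with full pivoting.

    Builds M ~ U @ V from selected rows and columns of the residual,
    one rank-1 "cross" per step. Full pivoting is used here only for
    simplicity; practical cross interpolation uses partial pivoting
    so that only O(rank) fibers of M are ever evaluated.
    """
    R = np.array(M, dtype=float)          # residual
    us, vs = [], []
    for _ in range(max_rank):
        i, j = np.unravel_index(np.argmax(np.abs(R)), R.shape)
        if abs(R[i, j]) < tol:
            break
        u = R[:, j] / R[i, j]             # selected column, scaled by pivot
        v = R[i, :].copy()                # selected row
        R -= np.outer(u, v)
        us.append(u)
        vs.append(v)
    return np.column_stack(us), np.vstack(vs)

# A rank-3 matrix is recovered exactly from three crosses.
rng = np.random.default_rng(1)
M = rng.standard_normal((40, 3)) @ rng.standard_normal((3, 40))
U, V = cross_approx(M)
print(U.shape[1], np.linalg.norm(M - U @ V))
```

Each step touches one row and one column of the residual, which is what keeps the per-step cost linear in the matrix dimension rather than quadratic.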

If this is right

  • TT-based solvers for nonlinear PDEs become faster for the ranks encountered in practice.
  • Elementwise operations can be performed with the same error control previously available only at quartic cost.
  • The method extends directly to other elementwise operations beyond multiplication.
  • Significant runtime reductions are observed on standard benchmark problems.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Similar cross-interpolation strategies might accelerate other tensor-network contractions that currently rely on quartic intermediates.
  • The approach could be combined with rank-adaptive truncation to handle cases where the output rank temporarily grows.
  • In high-dimensional scientific computing, the cubic scaling opens previously inaccessible problem sizes.

Load-bearing premise

The rank of the output tensor train stays smaller than χ² so that the improved scaling applies without rank explosion.
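This premise is not exotic: for the quadratic nonlinearities common in the paper's PDE use case, symmetry alone already caps the output rank at χ(χ+1)/2, strictly below χ². A minimal check in the d = 2 (matrix) case, assuming generic random factors:

```python
import numpy as np

# Elementwise squaring -- the model nonlinearity u -> u^2 in PDE
# solvers -- of a rank-chi matrix (the d = 2 TT case). By symmetry of
# the terms (u_i * u_j), the result has rank at most chi*(chi+1)/2.
rng = np.random.default_rng(2)
chi, n = 3, 60
A = rng.standard_normal((n, chi)) @ rng.standard_normal((chi, n))
S = A * A
print(np.linalg.matrix_rank(S))  # 6 = chi*(chi+1)/2, not chi**2 = 9
```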

What would settle it

A concrete test case in which the elementwise operation produces an output TT whose rank exceeds χ², after which the measured runtime reverts to quartic scaling or the error bound is lost.

Figures

Figures reproduced from arXiv: 2604.00037 by Marc K. Ritter.

Figure 1. Multiplication of Gaussians for algorithm verification.
Figure 2. Hadamard product of random Fourier series, Eq.
Figure 3. Similar to Fig.
read the original abstract

Tensor trains (TTs), also known as matrix product states (MPS), are compressed representations of high-dimensional data that can be efficiently manipulated to perform calculations on the data. In many applications, such as TT-based solvers for nonlinear partial differential equations, the most expensive step is an elementwise multiplication or similar elementwise operation on multiple TTs. Known error-controlled algorithms for such operations scale as $O(\chi^4)$, where $\chi$ is the TT rank. If the rank of the output is smaller than $\chi^2$, it is possible to formulate algorithms with better scaling. In this work, we present the alternating cross interpolation (ACI) algorithm that performs such operations in $O(\chi^3)$, while maintaining error control. We demonstrate these properties on benchmark problems, achieving a significant speedup for TT ranks that are commonly encountered in practical applications.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The manuscript introduces the alternating cross interpolation (ACI) algorithm for performing elementwise operations (such as multiplication) on tensor trains (TTs/MPS). It claims that ACI achieves O(χ³) complexity with error control whenever the output TT rank remains smaller than χ², improving on the standard O(χ⁴) scaling of existing error-controlled methods, and demonstrates the speedup on benchmark problems relevant to nonlinear PDE solvers.

Significance. If the O(χ³) scaling and error control hold under the stated rank condition, the result would meaningfully accelerate the dominant step in many TT-based applications. The approach builds directly on established cross-interpolation and TT techniques without introducing free parameters or fitted constants, which strengthens the contribution. The central practical value rests on whether ACI reliably keeps intermediate and output ranks from reaching Θ(χ²) without hidden costs that would restore O(χ⁴) scaling.

major comments (2)
  1. [Abstract and §3] Abstract and §3 (algorithm description): the O(χ³) claim is conditioned on output rank remaining < χ², yet the alternating cross sweeps are not shown to include a provable truncation mechanism that prevents intermediate ranks from reaching Θ(χ²) before the final compression step; without this, per-sweep cost reverts to O(χ⁴) and the headline complexity does not hold uniformly.
  2. [§4] §4 (complexity analysis): the derivation of the improved scaling assumes that the cross-interpolation procedure itself enforces the rank bound without additional truncation overhead; no explicit bound or pseudocode step is given that guarantees this for general elementwise operations, leaving the complexity claim dependent on an unproven assumption about the target applications.
minor comments (2)
  1. [Figure 2] Figure 2 caption: the legend does not distinguish the ACI timing curve from the reference O(χ⁴) method; add explicit labels or line styles for clarity.
  2. [Notation throughout] Notation: the symbol χ is used both for the input TT ranks and (implicitly) for the output rank bound; introduce a distinct symbol (e.g., χ_out) when stating the condition χ_out < χ².

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the careful reading and constructive comments on the complexity claims. We address each major point below and will revise the manuscript to make the rank-control mechanism and conditional nature of the O(χ³) bound fully explicit.

read point-by-point responses
  1. Referee: [Abstract and §3] Abstract and §3 (algorithm description): the O(χ³) claim is conditioned on output rank remaining < χ², yet the alternating cross sweeps are not shown to include a provable truncation mechanism that prevents intermediate ranks from reaching Θ(χ²) before the final compression step; without this, per-sweep cost reverts to O(χ⁴) and the headline complexity does not hold uniformly.

    Authors: We agree that the presentation in §3 would be strengthened by an explicit truncation step. In the ACI procedure each cross-interpolation is performed on unfoldings whose dimensions are governed by the current TT cores; because the target approximation is a TT whose rank is bounded by the output rank χ_out < χ², the number of selected fibers per core remains O(χ). To guarantee that intermediate ranks never exceed this bound during the alternating sweeps, we will insert a standard TT truncation (with relative tolerance matching the overall error control) after each sweep. This truncation costs O(χ³) when the working rank is O(χ) and therefore preserves the headline complexity. The revised pseudocode and accompanying text in §3 will document this step. revision: yes

  2. Referee: [§4] §4 (complexity analysis): the derivation of the improved scaling assumes that the cross-interpolation procedure itself enforces the rank bound without additional truncation overhead; no explicit bound or pseudocode step is given that guarantees this for general elementwise operations, leaving the complexity claim dependent on an unproven assumption about the target applications.

    Authors: Section 4 derives the O(χ³) scaling under the explicit hypothesis that the output TT admits rank χ_out < χ². For arbitrary elementwise operations the exact product may reach rank Θ(χ²), but ACI is designed for the practically relevant regime in which the result is compressible to rank o(χ²) (as occurs for the nonlinear terms in the PDE benchmarks). We do not claim a universal a-priori bound that holds for every possible nonlinearity without truncation; the algorithm relies on the final compression and the added per-sweep truncation described above to enforce the rank condition. We will revise §4 to state this conditional assumption clearly and to reference the new truncation step, removing any ambiguity about hidden O(χ⁴) costs. revision: partial

Circularity Check

0 steps flagged

No significant circularity; derivation self-contained

full rationale

The ACI algorithm extends established tensor-train and cross-interpolation methods with an explicit conditioning on output rank remaining below χ² to achieve O(χ³) scaling. No load-bearing step reduces by construction to a fitted parameter, self-definition, or self-citation chain; the complexity claim follows directly from the rank assumption and standard cross-interpolation sweeps without renaming known results or smuggling ansatzes. The derivation chain is independent of the present paper's own outputs and relies on externally verifiable TT literature.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

The central claim rests on standard assumptions of tensor train representations and cross-interpolation techniques from prior literature; no new free parameters, axioms, or invented entities are introduced in the abstract.

pith-pipeline@v0.9.0 · 5444 in / 944 out tokens · 39215 ms · 2026-05-15T00:22:10.671973+00:00 · methodology

discussion (0)


Reference graph

Works this paper leans on

48 extracted references · 48 canonical work pages · 1 internal anchor

  1. [1] M. C. Bañuls, Tensor network algorithms: A route map, Annual Review of Condensed Matter Physics 14, 173 (2023), doi:10.1146/annurev-conmatphys-040721-022705
  2. [2] R. Orús, A practical introduction to tensor networks: Matrix product states and projected entangled pair states, Annals of Physics 349, 117 (2014), doi:10.1016/j.aop.2014.06.013
  3. [3] S. R. White, Density matrix formulation for quantum renormalization groups, Physical Review Letters 69(19), 2863 (1992), doi:10.1103/PhysRevLett.69.2863
  4. [4] I. P. McCulloch, From density-matrix renormalization group to matrix product states, Journal of Statistical Mechanics: Theory and Experiment 2007(10), P10014 (2007), doi:10.1088/1742-5468/2007/10/P10014, cond-mat/0701428
  5. [5] U. Schollwöck, The density-matrix renormalization group in the age of matrix product states, Annals of Physics 326(1), 96 (2011), doi:10.1016/j.aop.2010.09.012
  6. [6] Y. Núñez Fernández, M. Jeannin, P. T. Dumitrescu, T. Kloss, J. Kaye, O. Parcollet and X. Waintal, Learning Feynman diagrams with tensor trains, Physical Review X 12(4), 041018 (2022), doi:10.1103/PhysRevX.12.041018
  7. [7] N. Jolly, Y. N. Fernández and X. Waintal, Tensorized orbitals for computational chemistry, doi:10.48550/arXiv.2308.03508 (2023), 2308.03508
  8. [8] L. Arenstein and M. Kastoryano, Full grid solution for multi-asset options pricing with tensor networks, doi:10.48550/arXiv.2601.00009 (2025), 2601.00009
  9. [9] K. Glau, D. Kressner and F. Statti, Low-rank tensor approximation for Chebyshev interpolation in parametric option pricing, SIAM Journal on Financial Mathematics 11(3), 897 (2020), doi:10.1137/19M1244172
  10. [10] M. Kastoryano and N. Pancotti, A highly efficient tensor network algorithm for multi-asset Fourier options pricing, doi:10.48550/arXiv.2203.02804 (2022), 2203.02804
  11. [11] R. Sakurai, H. Takahashi and K. Miyamoto, Learning parameter dependence for Fourier-based option pricing with tensor trains, Mathematics 13(11), 1828 (2025), doi:10.3390/math13111828, 2405.00701
  12. [12] H. Shinaoka, M. Wallerberger, Y. Murakami, K. Nogaki, R. Sakurai, P. Werner and A. Kauch, Multiscale space-time ansatz for correlation functions of quantum systems based on quantics tensor trains, Physical Review X 13(2), 021015 (2023), doi:10.1103/PhysRevX.13.021015
  13. [13] S. Rohshap, M. K. Ritter, H. Shinaoka, J. von Delft, M. Wallerberger and A. Kauch, Two-particle calculations with quantics tensor trains: Solving the parquet equations, Physical Review Research 7(2), 023087 (2025), doi:10.1103/PhysRevResearch.7.023087
  14. [14] M. K. Ritter, Y. Núñez Fernández, M. Wallerberger, J. von Delft, H. Shinaoka and X. Waintal, Quantics tensor cross interpolation for high-resolution parsimonious representations of multivariate functions, Physical Review Letters 132(5), 056501 (2024), doi:10.1103/PhysRevLett.132.056501
  15. [15] A. O. Fumega, M. Niedermeier and J. L. Lado, Correlated states in super-moiré materials with a kernel polynomial quantics tensor cross interpolation algorithm, 2D Materials 12(1), 015018 (2024), doi:10.1088/2053-1583/ad9d59
  16. [16] N. Gourianov, M. Lubasch, S. Dolgov, Q. Y. van den Berg, H. Babaee, P. Givi, M. Kiffner and D. Jaksch, A quantum-inspired approach to exploit turbulence structures, Nature Computational Science 2(1), 30 (2022), doi:10.1038/s43588-021-00181-1
  17. [17] E. Kornev, S. Dolgov, K. Pinto, M. Pflitsch, M. Perelshtein and A. Melnikov, Numerical solution of the incompressible Navier-Stokes equations for chemical mixers via quantum-inspired tensor train finite element method, doi:10.48550/arXiv.2305.10784 (2023), 2305.10784
  18. [18] R. D. Peddinti, S. Pisoni, A. Marini, P. Lott, H. Argentieri, E. Tiunov and L. Aolita, Quantum-inspired framework for computational fluid dynamics, Communications Physics 7(1), 1 (2024), doi:10.1038/s42005-024-01623-8
  19. [19] N. Gourianov, P. Givi, D. Jaksch and S. B. Pope, Tensor networks enable the calculation of turbulence probability distributions, Science Advances 11(5), eads5990 (2025), doi:10.1126/sciadv.ads5990
  20. [20] L. Hölscher, P. Rao, L. Müller, J. Klepsch, A. Luckow, T. Stollenwerk and F. K. Wilhelm, Quantum-inspired fluid simulation of two-dimensional turbulence with GPU acceleration, Physical Review Research 7(1), 013112 (2025), doi:10.1103/PhysRevResearch.7.013112
  21. [21] E. M. Stoudenmire and S. R. White, Minimally entangled typical thermal state algorithms, New Journal of Physics 12(5), 055026 (2010), doi:10.1088/1367-2630/12/5/055026
  22. [22] B.-B. Chen, L. Chen, Z. Chen, W. Li and A. Weichselbaum, Exponential thermal tensor network approach for quantum lattice models, Physical Review X 8(3), 031082 (2018), doi:10.1103/PhysRevX.8.031082
  23. [23] L. Ma, M. Fishman, E. M. Stoudenmire and E. Solomonik, Approximate contraction of arbitrary tensor networks with a flexible and efficient density matrix algorithm, Quantum 8, 1580 (2024), doi:10.22331/q-2024-12-27-1580
  24. [24] C. Camaño, E. N. Epperly and J. A. Tropp, Successive randomized compression: A randomized algorithm for the compressed MPO-MPS product, Quantum 10, 2022 (2026), doi:10.22331/q-2026-03-10-2022, 2504.06475
  25. [25] A. Bou-Comas, M. Płodzień, L. Tagliacozzo and J. J. García-Ripoll, Quantics tensor train for solving Gross-Pitaevskii equation, doi:10.48550/arXiv.2507.03134 (2025), 2507.03134
  26. [26] Q.-C. Chen, I.-K. Liu, J.-W. Li and C.-M. Chung, Solving the Gross-Pitaevskii equation with quantic tensor trains: Ground states and nonlinear dynamics, doi:10.48550/arXiv.2507.04279 (2025), 2507.04279
  27. [27] I. Oseledets and E. Tyrtyshnikov, TT-cross approximation for multidimensional arrays, Linear Algebra and its Applications 432(1), 70 (2010), doi:10.1016/j.laa.2009.07.024
  28. [28] I. V. Oseledets, Tensor-train decomposition, SIAM Journal on Scientific Computing 33(5), 2295 (2011), doi:10.1137/090752286
  29. [29] D. Savostyanov and I. Oseledets, Fast adaptive interpolation of multi-dimensional arrays in tensor train format, in The 2011 International Workshop on Multidimensional (nD) Systems, pp. 1–8, IEEE, ISBN 978-1-61284-815-0, doi:10.1109/nDS.2011.6076873 (2011)
  30. [30] D. V. Savostyanov, Quasioptimality of maximum-volume cross interpolation of tensors, Linear Algebra and its Applications 458, 217 (2014), doi:10.1016/j.laa.2014.06.006
  31. [31] S. Dolgov and D. Savostyanov, Parallel cross interpolation for high-precision calculation of high-dimensional integrals, Computer Physics Communications 246, 106869 (2020), doi:10.1016/j.cpc.2019.106869
  32. [32] Y. Núñez Fernández, M. K. Ritter, M. Jeannin, J.-W. Li, T. Kloss, T. Louvet, S. Terasaki, O. Parcollet, J. von Delft, H. Shinaoka and X. Waintal, Learning tensor networks with tensor cross interpolation: New algorithms and libraries, SciPost Physics 18(3), 104 (2025), doi:10.21468/SciPostPhys.18.3.104
  33. [33] S. V. Dolgov and D. V. Savostyanov, Alternating minimal energy methods for linear systems in higher dimensions, SIAM Journal on Scientific Computing 36(5), A2248 (2014), doi:10.1137/140953289
  34. [34] Z. Meng, Y. Khoo, J. Li and E. M. Stoudenmire, Recursive sketched interpolation: Efficient Hadamard products of tensor trains, doi:10.48550/arXiv.2602.17974 (2026), 2602.17974
  35. [35] A. A. Michailidis, C. Fenton and M. Kiffner, Element-wise multiplication of tensor trains, SIAM Journal on Scientific Computing 47(5), B1158 (2025), doi:10.1137/24M1714149
  36. [36] Z. Sun, J. Huang, C. Xiao and C. Yang, HaTT: Hadamard avoiding TT recompression, doi:10.48550/arXiv.2410.04385 (2025), 2410.04385
  37. [37] P. Cazeaux, M.-S. Dupuy and R. F. Justiniano, Linear-scaling tensor train sketching, doi:10.48550/arXiv.2603.11009 (2026), 2603.11009
  38. [38] S. A. Goreinov and E. E. Tyrtyshnikov, Quasioptimality of skeleton approximation of a matrix in the Chebyshev norm, Doklady Mathematics 83(3), 374 (2011), doi:10.1134/S1064562411030355
  39. [39] L. Devos, M. Van Damme and J. Haegeman, MPSKit, doi:10.5281/zenodo.10654900 (2026)
  40. [40] L. Devos and J. Haegeman, TensorKit.jl: A Julia package for large-scale tensor computations, with a hint of category theory, doi:10.48550/arXiv.2508.10076 (2025), 2508.10076
  41. [41] I. V. Oseledets, Approximation of 2^d × 2^d matrices using tensor decomposition, SIAM Journal on Matrix Analysis and Applications 31(4), 2130 (2010), doi:10.1137/090757861
  42. [42] M. Lindsey, Multiscale interpolative construction of quantized tensor trains (2023), 2311.12554
  43. [43] J. Chen, E. Stoudenmire and S. R. White, Quantum Fourier transform has small entanglement, PRX Quantum 4(4), 040318 (2023), doi:10.1103/PRXQuantum.4.040318
  44. [44] J. Chen and M. Lindsey, Direct interpolative construction of the discrete Fourier transform as a matrix product operator, Applied and Computational Harmonic Analysis 81, 101817 (2026), doi:10.1016/j.acha.2025.101817
  45. [45] V. A. Kazeev and B. N. Khoromskij, Low-rank explicit QTT representation of the Laplace operator and its inverse, SIAM Journal on Matrix Analysis and Applications 33(3), 742 (2012), doi:10.1137/100820479
  46. [46] G. Grosso, M. K. Ritter, S. Rohshap, S. Badr, A. Kauch, M. Wallerberger, J. v. Delft and H. Shinaoka, Adaptive patching for tensor train computations, doi:10.48550/arXiv.2602.22372 (2026), 2602.22372
  47. [47] J. Tindall, M. Stoudenmire and R. Levy, Compressing multivariate functions with tree tensor networks, doi:10.48550/arXiv.2410.03572 (2024), 2410.03572
  48. [48] M. K. Ritter, Code repository for 'Fast elementwise operations on tensor trains with alternating cross interpolation', doi:10.5281/zenodo.19208286
  49. [49] M. K. Ritter, AlternatingCrossInterpolation.jl (2026)
  50. [50] Tensor4all, tensor4all.org