pith. sign in

arxiv: 2604.25532 · v1 · submitted 2026-04-28 · 🪐 quant-ph

Bond-dimension scaling of a local-refinement advantage over hyperoptimized tensor-network contraction on Sycamore like topologies

Pith reviewed 2026-05-07 16:39 UTC · model grok-4.3

classification 🪐 quant-ph
keywords tensor network contractionSycamore topologybond dimensionlocal refinementnearest-neighbor interchangecotengraquantum circuit simulationcontraction order
0
0 comments X

The pith

Appending nearest-neighbor interchange refinement to cotengra produces larger cost savings on Sycamore-like graphs as bond dimension increases.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper establishes that a local refinement using nearest-neighbor interchanges, added after cotengra hyperoptimization, lowers the predicted contraction cost on graphs with Sycamore processor connectivity. This improvement grows roughly linearly with bond dimension chi, from 15 bits at chi=2 to 116 bits at chi=16 for 500-qubit instances. The refiner consistently wins across all 25 tested seeds at each chi value. In contrast, random 3-regular graphs and QAOA p=2 graphs show negligible median gaps under 1 bit, with win rates approaching random chance at higher chi. The same pattern holds when contracting actual random quantum circuits on the Sycamore envelope at depths from 4 to 12. A reader would care if they simulate quantum circuits classically, since better contraction orders at high bond dimensions directly cut the resources needed for high-fidelity simulations.

Core claim

We identify a missing local-refinement stage in the cotengra tensor-network contraction pipeline and show that its impact grows monotonically with bond dimension on the connectivity graph of Sycamore-like topologies. Appending a nearest-neighbor interchange search to the cotengra output at matched 8-s wallclock yields a median predicted cost-model gap Delta T at n=500 that grows monotonically and approximately linearly in chi, from ~15 bits at chi=2 to ~116 bits at chi=16, with the refiner winning on 25/25 seeds at every tested chi. Control families show small gaps, and the advantage is confirmed on real circuit contractions.

What carries the argument

nearest-neighbor interchange (NNI) search, a local swap of adjacent contraction steps in the tree that finds lower-cost sequences after the main hyperoptimizer finishes

If this is right

  • The advantage grows with bond dimension and is largest in regimes relevant to physical contraction.
  • The benefit is specific to the Sycamore-like topology and not a general effect of extra search time.
  • An ablation confirms that the refinement search, rather than the Pareto rule, produces the improvement.
  • On random circuits at depths 4 to 12, the refined orders win in all tested cases.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Integrating the NNI search directly into the primary optimizer might produce even larger gains without a separate wallclock budget.
  • If the cost model holds on real hardware, this approach could lower classical resources needed for high-fidelity simulations of near-term quantum processors.
  • Other quantum hardware connectivities might benefit from similar topology-tuned local refinements.
  • The linear scaling suggests the relative advantage could become even more substantial at bond dimensions beyond those tested.

Load-bearing premise

The cost model used to compute the predicted time gap accurately reflects real contraction runtimes on the hardware, and the random Sycamore-like graphs represent the workloads from actual quantum circuits.

What would settle it

Measuring actual wallclock contraction times on hardware for Sycamore-like graphs at chi=16 and n=500 to check whether the refined orders run faster than the cotengra orders by a factor matching the predicted 116-bit gap.

Figures

Figures reproduced from arXiv: 2604.25532 by Rub\'en Dar\'io Guerrero.

Figure 1
Figure 1. Figure 1: Pipeline and cost structure. (a) A tensor network N = (T , E) (left) is contracted via a rooted binary contraction tree τ (right). Each internal node v contracts two children; the total floating-point work fT sums over all internal nodes, and the peak intermediate tensor size fS (log2 entry count) is the maximum over them (highlighted in vermillion). The four cost axes (fT , fS, fσ, fε) quantify total work… view at source ↗
Figure 2
Figure 2. Figure 2: Bond-dimension scaling of the refinement advantage at view at source ↗
Figure 3
Figure 3. Figure 3: Length scaling of the predicted FLOP-cost reduction at view at source ↗
Figure 4
Figure 4. Figure 4: Per-seed fT (refiner) vs fT (cotengra-hyper), bits. Below-diagonal: refiner wins. Shape encodes topology; color encodes n (log). The dashed offset parallel to the identity line marks a 2 7 = 128× FLOP reduction, i.e., ∆fT = 7 bits, anchored to the Sycamore-like cluster at n=500. Inset: ∆fT distribution on symlog axis (all topologies pooled); median +0.58 bits, Sycamore-like tail visible. contribution is th… view at source ↗
Figure 5
Figure 5. Figure 5: External validation on the Sycamore-53 connectivity graph [ view at source ↗
Figure 6
Figure 6. Figure 6: Mechanism of the Sycamore-like advantage. (a) view at source ↗
read the original abstract

We identify a missing local-refinement stage in the cotengra tensor-network contraction pipeline and show that its impact grows monotonically with bond dimension on the \emph{connectivity graph} of Sycamore-like topologies. Appending a nearest-neighbor interchange (NNI) search to the \cotengra{} output at matched 8-s wallclock yields a median \emph{predicted} cost-model gap $\Delta\fT$ at $n{=}500$ that grows monotonically and approximately linearly in $\chi$, from $\sim\!15$~bits at $\chi{=}2$ to $\sim\!116$~bits at $\chi{=}16$ (Fig.~\ref{fig:chi_sweep}), with the refiner winning on $25/25$ seeds at every tested $\chi$. Two control families -- random $3$-regular and QAOA $p{=}2$ interaction graphs -- show median $|\Delta\fT| \leq 0.71$~bits across both controls at every $\chi$, with refiner win rate falling toward chance as $\chi$ grows; the signal is topology-specific, not a generic refinement-budget effect. An ablation establishes that refinement itself, not the four-axis Pareto acceptance rule, drives the gain ($|\Delta\fT| \lesssim 0.1$ bits between scalar and Pareto arms at $\chi{=}2$). The Sycamore-circuit envelope (App.~\ref{em:sec:results:syccirc}) reports the corresponding refinement on actual random circuits at depths $m \in \{4, 6, 8, 10, 12\}$, where the refiner wins on $5/5$ instances at every depth. The advantage is therefore largest precisely in the bond-dimension regime relevant to physical contraction.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

1 major / 3 minor

Summary. The manuscript claims that appending a nearest-neighbor interchange (NNI) local-refinement search to the output of the cotengra hyperoptimizer produces a topology-specific advantage on Sycamore-like connectivity graphs. At fixed 8 s wallclock and n=500, the median predicted cost-model gap ΔT grows monotonically and approximately linearly with bond dimension χ (∼15 bits at χ=2 to ∼116 bits at χ=16), with the refiner winning on 25/25 seeds at every tested χ. Two control families (random 3-regular graphs and QAOA p=2 interaction graphs) exhibit median |ΔT| ≤ 0.71 bits with win rates falling toward chance; an ablation isolates the gain to the refinement step rather than the Pareto acceptance rule (|ΔT| ≲ 0.1 bits at χ=2). The advantage is confirmed on actual random Sycamore circuits at depths m=4 to 12 (5/5 wins).

Significance. If the reported scaling holds, the work identifies a simple, low-overhead post-processing step that can meaningfully improve predicted contraction costs for tensor networks on Sycamore-like topologies, with the benefit increasing in the higher-χ regime most relevant to quantum-circuit simulation. The topology-specific signal is convincingly isolated by the two control families, the Pareto-vs-scalar ablation, and the extension to real circuit instances. Explicit framing of all quantitative results as predicted cost-model quantities (rather than measured wall-clock time) is a methodological strength that keeps the central claim internally consistent.

major comments (1)
  1. [Ablation (abstract and results section)] The ablation establishing that refinement (rather than the four-axis Pareto rule) drives the gain is quantified only at χ=2 (|ΔT| ≲ 0.1 bits). Because the central claim concerns the monotonic scaling of ΔT with χ, the ablation should be extended across the full χ range (or at least to χ=8 and χ=16) to confirm that the observed linear growth is attributable to NNI refinement itself rather than an interaction with the acceptance criterion.
minor comments (3)
  1. [Abstract] In the abstract, the symbols ΔT and cotengra appear without inline definition or forward reference; a parenthetical gloss or citation to the methods section would improve standalone readability.
  2. [Fig. 1] Figure 1 (chi_sweep) should include per-seed distributions or interquartile ranges in addition to the median to allow visual assessment of the consistency of the 25/25 win rate and the linearity of the ΔT growth.
  3. [Appendix em:sec:results:syccirc] The Sycamore-circuit envelope (App. em:sec:results:syccirc) reports only win counts (5/5); tabulating the per-depth median ΔT values would let readers compare the scaling behavior on actual circuits versus the random-graph ensemble.

Simulated Author's Rebuttal

1 responses · 0 unresolved

We thank the referee for the positive assessment and the constructive comment on the ablation. We agree that extending the ablation is necessary to fully support the scaling claim and will revise the manuscript accordingly.

read point-by-point responses
  1. Referee: [Ablation (abstract and results section)] The ablation establishing that refinement (rather than the four-axis Pareto rule) drives the gain is quantified only at χ=2 (|ΔT| ≲ 0.1 bits). Because the central claim concerns the monotonic scaling of ΔT with χ, the ablation should be extended across the full χ range (or at least to χ=8 and χ=16) to confirm that the observed linear growth is attributable to NNI refinement itself rather than an interaction with the acceptance criterion.

    Authors: We agree that the ablation at χ=2 alone leaves open the possibility of an interaction between the Pareto acceptance rule and the observed scaling of ΔT with χ. In the revised manuscript we will extend the ablation to χ=8 and χ=16 (and report the full range if space permits). This requires only additional runs of the scalar versus Pareto arms at the same 8 s wall-clock budget; the resulting |ΔT| values will be added to the results section and referenced in the abstract. We expect the difference to remain ≲ 0.2 bits, thereby confirming that the monotonic growth is attributable to NNI refinement itself. revision: yes

Circularity Check

0 steps flagged

No significant circularity detected

full rationale

The paper reports an empirical scaling result obtained by applying an external nearest-neighbor interchange refiner to contraction trees produced by the independent cotengra optimizer, then evaluating both trees with a separate cost model to obtain the predicted gap ΔT. No equation or definition inside the paper reduces ΔT to a fitted parameter or to a quantity defined in terms of itself; the monotonic growth with χ is observed across 25 seeds on Sycamore-like graphs and contrasted with near-zero gaps on control topologies. The Sycamore-circuit envelope simply reapplies the identical procedure to random-circuit instances. All quantitative claims are therefore direct comparisons against an external baseline rather than self-referential derivations.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

The central claim rests on the accuracy of an external cost model for contraction time and on the representativeness of the generated Sycamore-like graphs; no free parameters or new entities are introduced in the abstract.

axioms (1)
  • domain assumption The cost model used to compute ΔT is a faithful proxy for actual contraction runtime
    Invoked when reporting median predicted cost gaps

pith-pipeline@v0.9.0 · 5633 in / 1181 out tokens · 43262 ms · 2026-05-07T16:39:49.214558+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Reference graph

Works this paper leans on

20 extracted references · 20 canonical work pages · 1 internal anchor

  1. [1]

    Hyper-optimized tensor network contraction.Quantum, 5:410, 2021

    Johnnie Gray and Stefanos Kourtis. Hyper-optimized tensor network contraction.Quantum, 5:410, 2021

  2. [2]

    Markov and Yaoyun Shi

    Igor L. Markov and Yaoyun Shi. Simulating quantum computation by contracting tensor networks.SIAM Journal on Computing, 38(3):963–981, 2008

  3. [3]

    Mucciolo, and Andrei E

    Stefanos Kourtis, Claudio Chamon, Eduardo R. Mucciolo, and Andrei E. Ruckenstein. Fast counting with tensor networks.SciPost Physics, 7(5):060, 2019

  4. [4]

    High-quality hypergraph partition- ing.ACM Journal of Experimental Algorithmics, 27:1.9:1–1.9:39, 2023

    Sebastian Schlag, Tobias Heuer, Lars Gottesb¨ uren, et al. High-quality hypergraph partition- ing.ACM Journal of Experimental Algorithmics, 27:1.9:1–1.9:39, 2023

  5. [5]

    Solving the sampling problem of the sycamore quantum circuits.Physical Review Letters, 129(9):090502, 2022

    Feng Pan, Keyang Chen, and Pan Zhang. Solving the sampling problem of the sycamore quantum circuits.Physical Review Letters, 129(9):090502, 2022

  6. [6]

    Huang, F

    Cupjin Huang, Fang Zhang, Michael Newman, Junhua Cai, et al. Classical simulation of quantum supremacy circuits.arXiv preprint arXiv:2005.06787, 2020

  7. [7]

    quantum supremacy

    Yong Liu, Xin Liu, Fang Li, Haohuan Fu, et al. Closing the “quantum supremacy” gap: achiev- ing real-time simulation of a random quantum circuit using a new sunway supercomputer. Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis (SC ’21), 2021

  8. [8]

    A Quantum Approximate Optimization Algorithm

    Edward Farhi, Jeffrey Goldstone, and Sam Gutmann. A quantum approximate optimization algorithm.arXiv preprint arXiv:1411.4028, 2014

  9. [9]

    Quantum supremacy using a programmable superconducting processor.Nature, 574(7779):505–510, 2019

    Frank Arute, Kunal Arya, Ryan Babbush, Dave Bacon, et al. Quantum supremacy using a programmable superconducting processor.Nature, 574(7779):505–510, 2019

  10. [10]

    Gleb Kalachev, Pavel Panteleev, and Man-Hong Yung. Classical simulation of quantum supremacy circuits via tensor-network contraction with optimised slicing.arXiv preprint 15 2 4 8 16 Bond dimension  −40 −30 −20 −10 0 10 20 30 f LOGIT T ¡ f PLS T (bits) " PLS lower # LOGIT lower (a) Same hyper seed: fT difference between refiners  = 2  = 4  = 8  = 160...

  11. [11]

    D. F. Robinson. Comparison of labeled trees with valency three.Journal of Combinatorial Theory, Series B, 11(2):105–119, 1971

  12. [12]

    Swofford.PAUP*: Phylogenetic Analysis Using Parsimony (and Other Methods)

    David L. Swofford.PAUP*: Phylogenetic Analysis Using Parsimony (and Other Methods). Sinauer Associates, 1996. Standard reference for NNI-based hill-climbing in phylogenetics; bibkey?– author to verify edition/year (PAUP* 4.0 was used widely from 2002 onward)

  13. [13]

    Higham.Accuracy and Stability of Numerical Algorithms

    Nicholas J. Higham.Accuracy and Stability of Numerical Algorithms. SIAM, 2 edition, 2002

  14. [14]

    A four-player potential game for barren plateau-aware quantum ansatz design.arXiv preprint, 2026

    Rub´ en Dar´ ıo Guerrero. A four-player potential game for barren plateau-aware quantum ansatz design.arXiv preprint, 2026

  15. [15]

    Ibm quantum heavy-hexagon connectivity.https://quantum-computing

    IBM Quantum. Ibm quantum heavy-hexagon connectivity.https://quantum-computing. ibm.com/, 2021. Heavy-hexagon lattice; see also Chamberland et al., Phys. Rev. X 10, 011022 (2020)

  16. [16]

    Classical Simulation of Intermediate-Size Quantum Circuits

    Jianxin Chen, Fang Zhang, Cupjin Huang, Michael Newman, and Yaoyun Shi. Classical simulation of intermediate-size quantum circuits.arXiv preprint arXiv:1805.01450, 2018

  17. [17]

    quimb: A python package for quantum information and many-body calcula- tions.Journal of Open Source Software, 3(29):819, 2018

    Johnnie Gray. quimb: A python package for quantum information and many-body calcula- tions.Journal of Open Source Software, 3(29):819, 2018

  18. [18]

    Rub´ en Dar´ ıo Guerrero. Reproducibility kit for ”bond-dimension scaling of a local-refinement advantage over hyperoptimized tensor-network contraction on sycamore-like topologies”: data, headline contraction trees, and verification scripts.https://zenodo.org/records/19852180, 2026

  19. [19]

    Daniel G. A. Smith and Johnnie Gray. opt einsum — a python package for optimizing con- traction order for einsum-like expressions.Journal of Open Source Software, 3(26):753, 2018

  20. [20]

    CUTLASS: Cuda templates for linear algebra subroutines.https: //github.com/NVIDIA/cutlass, 2024

    NVIDIA Corporation. CUTLASS: Cuda templates for linear algebra subroutines.https: //github.com/NVIDIA/cutlass, 2024. Version pinned at submission. 18