Bond-dimension scaling of a local-refinement advantage over hyperoptimized tensor-network contraction on Sycamore like topologies
Pith reviewed 2026-05-07 16:39 UTC · model grok-4.3
The pith
Appending nearest-neighbor interchange refinement to cotengra produces larger cost savings on Sycamore-like graphs as bond dimension increases.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
We identify a missing local-refinement stage in the cotengra tensor-network contraction pipeline and show that its impact grows monotonically with bond dimension on the connectivity graph of Sycamore-like topologies. Appending a nearest-neighbor interchange search to the cotengra output at matched 8-s wallclock yields a median predicted cost-model gap Delta T at n=500 that grows monotonically and approximately linearly in chi, from ~15 bits at chi=2 to ~116 bits at chi=16, with the refiner winning on 25/25 seeds at every tested chi. Control families show small gaps, and the advantage is confirmed on real circuit contractions.
What carries the argument
nearest-neighbor interchange (NNI) search, a local swap of adjacent contraction steps in the tree that finds lower-cost sequences after the main hyperoptimizer finishes
If this is right
- The advantage grows with bond dimension and is largest in regimes relevant to physical contraction.
- The benefit is specific to the Sycamore-like topology and not a general effect of extra search time.
- An ablation confirms that the refinement search, rather than the Pareto rule, produces the improvement.
- On random circuits at depths 4 to 12, the refined orders win in all tested cases.
Where Pith is reading between the lines
- Integrating the NNI search directly into the primary optimizer might produce even larger gains without a separate wallclock budget.
- If the cost model holds on real hardware, this approach could lower classical resources needed for high-fidelity simulations of near-term quantum processors.
- Other quantum hardware connectivities might benefit from similar topology-tuned local refinements.
- The linear scaling suggests the relative advantage could become even more substantial at bond dimensions beyond those tested.
Load-bearing premise
The cost model used to compute the predicted time gap accurately reflects real contraction runtimes on the hardware, and the random Sycamore-like graphs represent the workloads from actual quantum circuits.
What would settle it
Measuring actual wallclock contraction times on hardware for Sycamore-like graphs at chi=16 and n=500 to check whether the refined orders run faster than the cotengra orders by a factor matching the predicted 116-bit gap.
Figures
read the original abstract
We identify a missing local-refinement stage in the cotengra tensor-network contraction pipeline and show that its impact grows monotonically with bond dimension on the \emph{connectivity graph} of Sycamore-like topologies. Appending a nearest-neighbor interchange (NNI) search to the \cotengra{} output at matched 8-s wallclock yields a median \emph{predicted} cost-model gap $\Delta\fT$ at $n{=}500$ that grows monotonically and approximately linearly in $\chi$, from $\sim\!15$~bits at $\chi{=}2$ to $\sim\!116$~bits at $\chi{=}16$ (Fig.~\ref{fig:chi_sweep}), with the refiner winning on $25/25$ seeds at every tested $\chi$. Two control families -- random $3$-regular and QAOA $p{=}2$ interaction graphs -- show median $|\Delta\fT| \leq 0.71$~bits across both controls at every $\chi$, with refiner win rate falling toward chance as $\chi$ grows; the signal is topology-specific, not a generic refinement-budget effect. An ablation establishes that refinement itself, not the four-axis Pareto acceptance rule, drives the gain ($|\Delta\fT| \lesssim 0.1$ bits between scalar and Pareto arms at $\chi{=}2$). The Sycamore-circuit envelope (App.~\ref{em:sec:results:syccirc}) reports the corresponding refinement on actual random circuits at depths $m \in \{4, 6, 8, 10, 12\}$, where the refiner wins on $5/5$ instances at every depth. The advantage is therefore largest precisely in the bond-dimension regime relevant to physical contraction.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript claims that appending a nearest-neighbor interchange (NNI) local-refinement search to the output of the cotengra hyperoptimizer produces a topology-specific advantage on Sycamore-like connectivity graphs. At fixed 8 s wallclock and n=500, the median predicted cost-model gap ΔT grows monotonically and approximately linearly with bond dimension χ (∼15 bits at χ=2 to ∼116 bits at χ=16), with the refiner winning on 25/25 seeds at every tested χ. Two control families (random 3-regular graphs and QAOA p=2 interaction graphs) exhibit median |ΔT| ≤ 0.71 bits with win rates falling toward chance; an ablation isolates the gain to the refinement step rather than the Pareto acceptance rule (|ΔT| ≲ 0.1 bits at χ=2). The advantage is confirmed on actual random Sycamore circuits at depths m=4 to 12 (5/5 wins).
Significance. If the reported scaling holds, the work identifies a simple, low-overhead post-processing step that can meaningfully improve predicted contraction costs for tensor networks on Sycamore-like topologies, with the benefit increasing in the higher-χ regime most relevant to quantum-circuit simulation. The topology-specific signal is convincingly isolated by the two control families, the Pareto-vs-scalar ablation, and the extension to real circuit instances. Explicit framing of all quantitative results as predicted cost-model quantities (rather than measured wall-clock time) is a methodological strength that keeps the central claim internally consistent.
major comments (1)
- [Ablation (abstract and results section)] The ablation establishing that refinement (rather than the four-axis Pareto rule) drives the gain is quantified only at χ=2 (|ΔT| ≲ 0.1 bits). Because the central claim concerns the monotonic scaling of ΔT with χ, the ablation should be extended across the full χ range (or at least to χ=8 and χ=16) to confirm that the observed linear growth is attributable to NNI refinement itself rather than an interaction with the acceptance criterion.
minor comments (3)
- [Abstract] In the abstract, the symbols ΔT and cotengra appear without inline definition or forward reference; a parenthetical gloss or citation to the methods section would improve standalone readability.
- [Fig. 1] Figure 1 (chi_sweep) should include per-seed distributions or interquartile ranges in addition to the median to allow visual assessment of the consistency of the 25/25 win rate and the linearity of the ΔT growth.
- [Appendix em:sec:results:syccirc] The Sycamore-circuit envelope (App. em:sec:results:syccirc) reports only win counts (5/5); tabulating the per-depth median ΔT values would let readers compare the scaling behavior on actual circuits versus the random-graph ensemble.
Simulated Author's Rebuttal
We thank the referee for the positive assessment and the constructive comment on the ablation. We agree that extending the ablation is necessary to fully support the scaling claim and will revise the manuscript accordingly.
read point-by-point responses
-
Referee: [Ablation (abstract and results section)] The ablation establishing that refinement (rather than the four-axis Pareto rule) drives the gain is quantified only at χ=2 (|ΔT| ≲ 0.1 bits). Because the central claim concerns the monotonic scaling of ΔT with χ, the ablation should be extended across the full χ range (or at least to χ=8 and χ=16) to confirm that the observed linear growth is attributable to NNI refinement itself rather than an interaction with the acceptance criterion.
Authors: We agree that the ablation at χ=2 alone leaves open the possibility of an interaction between the Pareto acceptance rule and the observed scaling of ΔT with χ. In the revised manuscript we will extend the ablation to χ=8 and χ=16 (and report the full range if space permits). This requires only additional runs of the scalar versus Pareto arms at the same 8 s wall-clock budget; the resulting |ΔT| values will be added to the results section and referenced in the abstract. We expect the difference to remain ≲ 0.2 bits, thereby confirming that the monotonic growth is attributable to NNI refinement itself. revision: yes
Circularity Check
No significant circularity detected
full rationale
The paper reports an empirical scaling result obtained by applying an external nearest-neighbor interchange refiner to contraction trees produced by the independent cotengra optimizer, then evaluating both trees with a separate cost model to obtain the predicted gap ΔT. No equation or definition inside the paper reduces ΔT to a fitted parameter or to a quantity defined in terms of itself; the monotonic growth with χ is observed across 25 seeds on Sycamore-like graphs and contrasted with near-zero gaps on control topologies. The Sycamore-circuit envelope simply reapplies the identical procedure to random-circuit instances. All quantitative claims are therefore direct comparisons against an external baseline rather than self-referential derivations.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption The cost model used to compute ΔT is a faithful proxy for actual contraction runtime
Reference graph
Works this paper leans on
-
[1]
Hyper-optimized tensor network contraction.Quantum, 5:410, 2021
Johnnie Gray and Stefanos Kourtis. Hyper-optimized tensor network contraction.Quantum, 5:410, 2021
work page 2021
-
[2]
Igor L. Markov and Yaoyun Shi. Simulating quantum computation by contracting tensor networks.SIAM Journal on Computing, 38(3):963–981, 2008
work page 2008
-
[3]
Stefanos Kourtis, Claudio Chamon, Eduardo R. Mucciolo, and Andrei E. Ruckenstein. Fast counting with tensor networks.SciPost Physics, 7(5):060, 2019
work page 2019
-
[4]
Sebastian Schlag, Tobias Heuer, Lars Gottesb¨ uren, et al. High-quality hypergraph partition- ing.ACM Journal of Experimental Algorithmics, 27:1.9:1–1.9:39, 2023
work page 2023
-
[5]
Feng Pan, Keyang Chen, and Pan Zhang. Solving the sampling problem of the sycamore quantum circuits.Physical Review Letters, 129(9):090502, 2022
work page 2022
- [6]
-
[7]
Yong Liu, Xin Liu, Fang Li, Haohuan Fu, et al. Closing the “quantum supremacy” gap: achiev- ing real-time simulation of a random quantum circuit using a new sunway supercomputer. Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis (SC ’21), 2021
work page 2021
-
[8]
A Quantum Approximate Optimization Algorithm
Edward Farhi, Jeffrey Goldstone, and Sam Gutmann. A quantum approximate optimization algorithm.arXiv preprint arXiv:1411.4028, 2014
work page internal anchor Pith review arXiv 2014
-
[9]
Quantum supremacy using a programmable superconducting processor.Nature, 574(7779):505–510, 2019
Frank Arute, Kunal Arya, Ryan Babbush, Dave Bacon, et al. Quantum supremacy using a programmable superconducting processor.Nature, 574(7779):505–510, 2019
work page 2019
-
[10]
Gleb Kalachev, Pavel Panteleev, and Man-Hong Yung. Classical simulation of quantum supremacy circuits via tensor-network contraction with optimised slicing.arXiv preprint 15 2 4 8 16 Bond dimension  −40 −30 −20 −10 0 10 20 30 f LOGIT T ¡ f PLS T (bits) " PLS lower # LOGIT lower (a) Same hyper seed: fT difference between refiners  = 2  = 4  = 8  = 160...
-
[11]
D. F. Robinson. Comparison of labeled trees with valency three.Journal of Combinatorial Theory, Series B, 11(2):105–119, 1971
work page 1971
-
[12]
Swofford.PAUP*: Phylogenetic Analysis Using Parsimony (and Other Methods)
David L. Swofford.PAUP*: Phylogenetic Analysis Using Parsimony (and Other Methods). Sinauer Associates, 1996. Standard reference for NNI-based hill-climbing in phylogenetics; bibkey?– author to verify edition/year (PAUP* 4.0 was used widely from 2002 onward)
work page 1996
-
[13]
Higham.Accuracy and Stability of Numerical Algorithms
Nicholas J. Higham.Accuracy and Stability of Numerical Algorithms. SIAM, 2 edition, 2002
work page 2002
-
[14]
A four-player potential game for barren plateau-aware quantum ansatz design.arXiv preprint, 2026
Rub´ en Dar´ ıo Guerrero. A four-player potential game for barren plateau-aware quantum ansatz design.arXiv preprint, 2026
work page 2026
-
[15]
Ibm quantum heavy-hexagon connectivity.https://quantum-computing
IBM Quantum. Ibm quantum heavy-hexagon connectivity.https://quantum-computing. ibm.com/, 2021. Heavy-hexagon lattice; see also Chamberland et al., Phys. Rev. X 10, 011022 (2020)
work page 2021
-
[16]
Classical Simulation of Intermediate-Size Quantum Circuits
Jianxin Chen, Fang Zhang, Cupjin Huang, Michael Newman, and Yaoyun Shi. Classical simulation of intermediate-size quantum circuits.arXiv preprint arXiv:1805.01450, 2018
work page Pith review arXiv 2018
-
[17]
Johnnie Gray. quimb: A python package for quantum information and many-body calcula- tions.Journal of Open Source Software, 3(29):819, 2018
work page 2018
-
[18]
Rub´ en Dar´ ıo Guerrero. Reproducibility kit for ”bond-dimension scaling of a local-refinement advantage over hyperoptimized tensor-network contraction on sycamore-like topologies”: data, headline contraction trees, and verification scripts.https://zenodo.org/records/19852180, 2026
-
[19]
Daniel G. A. Smith and Johnnie Gray. opt einsum — a python package for optimizing con- traction order for einsum-like expressions.Journal of Open Source Software, 3(26):753, 2018
work page 2018
-
[20]
CUTLASS: Cuda templates for linear algebra subroutines.https: //github.com/NVIDIA/cutlass, 2024
NVIDIA Corporation. CUTLASS: Cuda templates for linear algebra subroutines.https: //github.com/NVIDIA/cutlass, 2024. Version pinned at submission. 18
work page 2024
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.