StableShots: Online Shot Stopping for Quantum Circuit Execution

Alessandro Bocci; Antonio Brogi; Ernesto Pimentel; Giuseppe Bisicchia

arxiv: 2606.22170 · v1 · pith:JAQ4XHO2new · submitted 2026-06-20 · 🪐 quant-ph · cs.ET · cs.SE

Giuseppe Bisicchia, Alessandro Bocci, Ernesto Pimentel, Antonio Brogi

Pith reviewed 2026-06-26 11:39 UTC · model grok-4.3

classification 🪐 quant-ph · cs.ET · cs.SE

keywords quantum circuits · shot allocation · online stopping rule · total variation distance · adaptive execution · quantum simulation · IBM backends · distribution convergence

The pith

StableShots stops quantum circuit measurements once cumulative distributions show repeated local stability in total variation distance.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper establishes a black-box method that runs a quantum circuit in small batches and halts execution after seeing the empirical output distribution stabilize. Fixed shot counts chosen in advance often undersample or waste resources on noisy hardware. By tracking total-variation distance between successive cumulative distributions, the approach adapts the shot budget to each circuit and backend. Validation on 180 traces from six circuit families shows the calibrated rule meets a TVD threshold of 0.05 on held-out tests at a median cost of 7,650 shots while fixed baselines either miss the threshold more often or consume more shots.

Core claim

StableShots executes a fixed circuit in small batches, monitors the total-variation distance between cumulative empirical distributions, and stops after repeated evidence of local stability. With validation-only calibration and 100 repeated backend-holdout splits across 180 QSimBench traces spanning six circuit families, six sizes from 4 to 14 qubits, and five noisy IBM simulated backends, the selected configuration reaches TVD <= 0.05 on all held-out test evaluations with median 7,650 shots, whereas fixed-shot baselines either fail more often or spend substantially more shots.

What carries the argument

The online stopping rule that halts after repeated local stability in total-variation distance between cumulative empirical distributions from small batches of shots.
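
The rule described above can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the batch size, the stability threshold `eps`, the `patience` count, and the `max_shots` cap are all illustrative, since this page does not report the calibrated values. `run_batch` stands in for any backend call that returns a list of measured bitstrings.

```python
import random
from collections import Counter

def tvd(p, q):
    """Total-variation distance between two distributions given as dicts."""
    keys = set(p) | set(q)
    return 0.5 * sum(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in keys)

def normalize(counts):
    """Turn raw outcome counts into an empirical distribution."""
    total = sum(counts.values())
    return {k: v / total for k, v in counts.items()}

def stable_stop(run_batch, batch_size=100, eps=0.01, patience=3, max_shots=20000):
    """Run batches until `patience` consecutive cumulative updates move by <= eps.

    All default values are assumptions chosen for illustration only.
    """
    counts = Counter()
    prev = None
    streak = 0
    shots = 0
    while shots < max_shots:
        counts.update(run_batch(batch_size))
        shots += batch_size
        cur = normalize(counts)
        if prev is not None:
            # Reset the streak whenever one update exceeds the threshold.
            streak = streak + 1 if tvd(prev, cur) <= eps else 0
            if streak >= patience:
                break
        prev = cur
    return normalize(counts), shots

# Usage with a hypothetical 2-qubit sampler (seeded for repeatability).
rng = random.Random(0)
outcomes, weights = ["00", "01", "10", "11"], [0.45, 0.05, 0.05, 0.45]
estimate, used = stable_stop(lambda n: rng.choices(outcomes, weights, k=n))
```

Requiring several consecutive stable updates, rather than one, is what the page calls "repeated evidence": a single small step can occur by chance, while a streak is harder to produce accidentally.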

If this is right

The adaptive rule meets the TVD <= 0.05 target on every held-out evaluation while using a median of only 7,650 shots.
Fixed-shot baselines either exceed the TVD target more frequently or require substantially higher shot counts to match the same reliability.
The calibration uses only validation data and generalizes across 100 backend-holdout splits without retraining per test backend.
The same stability criterion applies uniformly to six circuit families and five noisy simulated backends spanning 4 to 14 qubits.
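
A backend-holdout split, as referred to in the list above, can be sketched as follows. This is an assumption-laden illustration: the trace records, the family and backend labels, the even-numbered qubit sizes, and the choice to hold out exactly one backend per split are all hypothetical, because the page gives only the totals (six families, six sizes from 4 to 14 qubits, five backends, 180 traces).

```python
import random
from itertools import product

def backend_holdout_split(traces, rng, n_test_backends=1):
    """Hold out whole backends so no test backend is seen during calibration."""
    backends = sorted({t["backend"] for t in traces})
    test_backends = set(rng.sample(backends, n_test_backends))
    calibration = [t for t in traces if t["backend"] not in test_backends]
    test = [t for t in traces if t["backend"] in test_backends]
    return calibration, test

# Hypothetical trace index: 6 families x 6 sizes x 5 backends = 180 traces.
families = [f"family_{i}" for i in range(6)]
sizes = [4, 6, 8, 10, 12, 14]  # assumed spacing; the page only says "4 to 14"
backends = [f"backend_{i}" for i in range(5)]
traces = [{"family": f, "qubits": q, "backend": b}
          for f, q, b in product(families, sizes, backends)]

rng = random.Random(0)
splits = [backend_holdout_split(traces, rng) for _ in range(100)]
```

With a single held-out backend there are only five distinct partitions, so 100 repetitions would have to randomize something further, for example the validation subset used for calibration; the page does not say which.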

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

Circuit compilers or runtime libraries could embed the stopping rule to allocate shots automatically instead of requiring users to guess budgets.
The batch-stability approach might extend to other sampling-based quantum tasks such as variational algorithms where distribution convergence is also the goal.
Combining the rule with existing error-mitigation post-processing could further reduce the effective shot cost needed for a target accuracy.

Load-bearing premise

Repeated evidence of local stability in total-variation distance between cumulative empirical distributions reliably signals that the empirical distribution has converged sufficiently to the true output distribution.
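
One reason this premise needs care can be written down directly. The following is a short derivation, assuming the monitored quantity is the distance between consecutive cumulative distributions and that each batch has a fixed size $b$ (the symbols $\hat p_n$, $\hat q_b$, and $b$ are notation introduced here, not taken from the paper). Let $\hat p_n$ be the cumulative empirical distribution after $n$ shots and $\hat q_b$ the empirical distribution of the next $b$ shots alone.

```latex
\hat p_{n+b} \;=\; \frac{n}{n+b}\,\hat p_n \;+\; \frac{b}{n+b}\,\hat q_b
\quad\Longrightarrow\quad
\mathrm{TVD}\!\left(\hat p_{n+b},\, \hat p_n\right)
\;=\; \frac{b}{n+b}\,\mathrm{TVD}\!\left(\hat q_b,\, \hat p_n\right)
\;\le\; \frac{b}{n+b}.
```

Under these assumptions the monitored distance shrinks at least as fast as $b/(n+b)$ whatever the true distribution is. For instance, with $b = 100$ and $n = 9{,}900$ it cannot exceed $0.01$. A fixed stability threshold therefore acts partly as a minimum shot count, and only the factor $\mathrm{TVD}(\hat q_b, \hat p_n)$ carries information about how well the running estimate predicts fresh data.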

What would settle it

Running the method on additional circuits or real hardware and finding cases where it stops yet later independent measurements yield TVD above 0.05 on a substantial fraction of trials.
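
The check described above amounts to comparing each stopped estimate against an independent reference sample and counting misses. A minimal sketch, assuming outcome counts are available as `Counter` objects; the function names and the pairing of stopped and reference counts are hypothetical, and only the 0.05 target comes from the page.

```python
from collections import Counter

def tvd_from_counts(a, b):
    """Total-variation distance between two count tables, each normalized first."""
    na, nb = sum(a.values()), sum(b.values())
    keys = set(a) | set(b)
    return 0.5 * sum(abs(a.get(k, 0) / na - b.get(k, 0) / nb) for k in keys)

def violation_rate(trials, target=0.05):
    """Fraction of (stopped_counts, reference_counts) pairs whose TVD exceeds target."""
    misses = sum(tvd_from_counts(s, r) > target for s, r in trials)
    return misses / len(trials)

# Two toy trials: one matches its reference, one misses a 20% outcome entirely.
trials = [
    (Counter({"00": 50, "11": 50}), Counter({"00": 500, "11": 500})),
    (Counter({"00": 100}), Counter({"00": 800, "11": 200})),
]
rate = violation_rate(trials)
```

The reference counts should come from shots that were not used by the stopping rule, otherwise the check reuses the data it is meant to audit.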

Original abstract

Quantum circuit execution estimates output distributions by repeated measurements, yet developers commonly choose a fixed shot budget before execution. This static choice is brittle: low budgets can under-sample the distribution, while high budgets waste measurements. In this paper, we present StableShots, a black-box online stopping rule for static quantum circuits. The method executes a fixed circuit in small batches, monitors the total-variation distance between cumulative empirical distributions, and stops after repeated evidence of local stability. We evaluate StableShots on 180 QSimBench traces spanning six circuit families, six sizes from 4 to 14 qubits, and five noisy IBM simulated backends. With validation-only calibration and 100 repeated backend-holdout splits, the selected configuration reaches TVD <= 0.05 on all held-out test evaluations with median 7,650 shots, whereas fixed-shot baselines either fail more often or spend substantially more shots.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note · private letter to a colleague

StableShots gives a practical batch-wise TVD stability rule that beats fixed-shot baselines on the tested circuits and backends, but the link from observed stability to true-distribution convergence is the part that still needs checking.

The main takeaway is a black-box stopping rule that runs a circuit in small batches, tracks total-variation distance between successive cumulative empirical distributions, and halts after repeated local stability. On 180 QSimBench traces across six circuit families, qubit counts 4–14, and five noisy IBM simulators, the calibrated version hits TVD ≤ 0.05 on every held-out backend split with a median of 7,650 shots while fixed-shot baselines either miss the threshold more often or use substantially more measurements. The 100 repeated holdout splits and validation-only calibration are set up cleanly, so the reported numbers are not obviously overfit.

What the work does well is turn a routine engineering choice into something that can be decided online without prior knowledge of the output distribution. The repeated-stability criterion is a concrete, implementable addition to the usual fixed-budget or simple-convergence heuristics.

The soft spot is the assumption that repeated small TVD between cumulative batches reliably signals that the empirical distribution has gotten close to the backend’s true output. That link can fail when rare outcomes remain unsampled or when noise produces an early plateau in the observed marginals. The abstract and stress-test note both flag this, and without seeing the full failure-mode analysis or sensitivity checks on batch size and stability threshold, it is hard to know how often the rule would trigger too early on circuits outside the six families tested.
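
The unsampled-outcome concern can be made concrete with a simple support-counting bound. This is a general fact about empirical distributions, offered as a worked illustration; the symbols $N$, $n$, $u$, and $S$ are notation introduced here. Let $u$ be the uniform distribution over $N = 2^{q}$ bitstrings and let $\hat p_n$ be an empirical distribution built from $n$ shots, so its support $S$ has at most $n$ elements.

```latex
\mathrm{TVD}\!\left(\hat p_n,\, u\right)
\;\ge\; u\!\left(S^{c}\right)
\;=\; 1 - \frac{|S|}{N}
\;\ge\; 1 - \frac{n}{N},
\qquad
\text{e.g. } q = 14,\; n = 7{,}650:\quad 1 - \frac{7650}{16384} \approx 0.53 .
```

So for a perfectly flat 14-qubit output, no estimate built from 7,650 shots can be within 0.05 of the true distribution, however stable successive batches look. This page does not specify the reference distribution against which the reported TVD is measured, so the bound describes flat distributions in general and is not a statement about the paper's results.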

This is for people who actually run or simulate quantum circuits and want to cut measurement cost without guessing the shot budget in advance. A reader working on quantum execution or benchmarking would get immediate use from the method and the comparison.

It is worth sending to peer review. The empirical setup is solid enough to merit referee time even if the theoretical justification for the stopping criterion stays light.

Referee Report

2 major / 0 minor

Summary. The paper introduces StableShots, a black-box online stopping rule for static quantum circuits. Circuits are executed in small batches; the method monitors total-variation distance between successive cumulative empirical distributions and halts after repeated evidence of local stability. On 180 QSimBench traces (six circuit families, 4–14 qubits, five noisy IBM simulated backends), validation-only calibration plus 100 repeated backend-holdout splits yields a configuration that attains TVD ≤ 0.05 on every held-out test evaluation at a median of 7,650 shots, outperforming fixed-shot baselines that either exceed the TVD threshold more often or consume substantially more shots.

Significance. If the central empirical claim holds, the work supplies a practical, calibration-light procedure for dynamically allocating shots while controlling distribution error. The repeated backend-holdout design with validation-only calibration is a methodological strength that reduces the risk of test-set overfitting. The breadth of circuit families and backends supplies useful evidence of applicability within the evaluated regime.

major comments (2)

The headline performance (TVD ≤ 0.05 on all held-out evaluations) rests on the unproven link that repeated small-batch TVD stability between cumulative empirical distributions reliably signals proximity to the backend’s true output distribution. This assumption can fail when low-probability outcomes remain unsampled or when noise produces early plateaus; the 100 splits supply empirical support only inside the tested set and do not constitute a general guarantee.
[Evaluation] The abstract and evaluation description report the selected configuration but supply neither the exact stability criterion (number of consecutive stable batches, TVD threshold), the batch size, nor any sensitivity analysis of these free parameters. Without these details the reported median of 7,650 shots cannot be reproduced or stress-tested outside the authors’ implementation.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the thoughtful review and for highlighting important points regarding the empirical nature of our claims and reproducibility. We respond to each major comment below.

Referee: The headline performance (TVD ≤ 0.05 on all held-out evaluations) rests on the unproven link that repeated small-batch TVD stability between cumulative empirical distributions reliably signals proximity to the backend’s true output distribution. This assumption can fail when low-probability outcomes remain unsampled or when noise produces early plateaus; the 100 splits supply empirical support only inside the tested set and do not constitute a general guarantee.

Authors: We agree that StableShots is a heuristic method without a theoretical convergence guarantee to the true distribution. The manuscript frames the approach as an empirical online stopping rule whose reliability is supported by the repeated backend-holdout evaluation across 180 traces. We will revise the introduction and discussion sections to explicitly note the heuristic character, to acknowledge potential failure modes such as unsampled rare outcomes or noise-induced plateaus, and to clarify that the reported performance is specific to the evaluated regime rather than a general proof. revision: yes
Referee: [Evaluation] The abstract and evaluation description report the selected configuration but supply neither the exact stability criterion (number of consecutive stable batches, TVD threshold), the batch size, nor any sensitivity analysis of these free parameters. Without these details the reported median of 7,650 shots cannot be reproduced or stress-tested outside the authors’ implementation.

Authors: The full methods section of the manuscript specifies the batch size, the number of consecutive stable batches required, and the internal TVD threshold used for the stopping decision, along with the validation-only calibration procedure. However, these parameters are not restated in the abstract or the main evaluation narrative. We will revise the evaluation section to include the exact hyperparameter values and add a sensitivity analysis (varying batch size and stability window) to improve reproducibility and allow external stress-testing. revision: yes
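
The sensitivity analysis promised above could take the form of an offline replay over recorded traces. The sketch below is one possible shape for it, assuming a trace is simply a list of measured outcomes in order; the function names, the grid values, and the threshold `eps` are hypothetical and are not the hyperparameters from the paper's methods section.

```python
from collections import Counter
from itertools import product

def stop_index(trace, batch_size, eps, patience):
    """Replay a recorded trace and return the shot count at which the rule stops."""
    counts = Counter()
    prev = None
    streak = 0
    for start in range(0, len(trace), batch_size):
        counts.update(trace[start:start + batch_size])
        n = sum(counts.values())
        cur = {k: v / n for k, v in counts.items()}
        if prev is not None:
            keys = set(cur) | set(prev)
            step = 0.5 * sum(abs(cur.get(k, 0.0) - prev.get(k, 0.0)) for k in keys)
            streak = streak + 1 if step <= eps else 0
            if streak >= patience:
                return n
        prev = cur
    return len(trace)  # rule never fired: the whole trace was consumed

def sweep(trace, batch_sizes, patiences, eps=0.01):
    """Map each (batch_size, patience) pair to the number of shots it would use."""
    return {(b, k): stop_index(trace, b, eps, k)
            for b, k in product(batch_sizes, patiences)}

# Degenerate trace with a single outcome: the rule stops as early as it can.
grid = sweep(["0"] * 1000, batch_sizes=[50, 100], patiences=[2, 3])
```

Replaying stored traces keeps the comparison fair, because every configuration sees the same sequence of shots and differences in cost come only from the parameters.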

Circularity Check

0 steps flagged

No significant circularity; evaluation uses independent hold-outs

The paper defines StableShots as an online stopping rule that monitors repeated local stability in TVD between successive cumulative empirical distributions computed from small batches. Its headline performance claim (TVD <= 0.05 on all held-out evaluations, median 7,650 shots) is obtained after validation-only calibration across 100 repeated backend-holdout splits on 180 QSimBench traces. Because the test metric is measured on data never seen during calibration and the stopping rule itself contains no fitted parameters that are renamed as predictions, no equation or self-citation reduces the reported result to its inputs by construction. The derivation chain is therefore self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

2 free parameters · 1 axiom · 0 invented entities

The method depends on empirical calibration of batch size and stability thresholds on validation data; it assumes TVD is a suitable stability metric and that local stability in finite batches generalizes across circuit families and simulated backends. The abstract reports no numerical values for these parameters.

free parameters (2)

batch size
Size of each execution batch is a tunable parameter calibrated on validation data.
stability criterion
TVD threshold and number of consecutive stable batches required to stop are parameters chosen via validation-only calibration.

axioms (1)

domain assumption: Total variation distance between cumulative empirical distributions is an appropriate indicator of local stability for the purpose of deciding when to stop sampling.
Invoked as the core monitoring quantity without further justification in the abstract.

Reference graph

Works this paper leans on

11 extracted references

[1] M. A. Nielsen and I. L. Chuang, Quantum Computation and Quantum Information. Cambridge University Press, 2010.
[2] G. Bisicchia et al., “From quantum software handcrafting to quantum software engineering,” in Proc. IEEE SANER-C, 2024.
[3] W. Hoeffding, “Probability inequalities for sums of bounded random variables,” Journal of the American Statistical Association, 1963.
[4] T. Weissman et al., “Inequalities for the L1 deviation of the empirical distribution,” Hewlett-Packard Labs, Tech. Rep., 2003.
[5] J. Mardia et al., “Concentration inequalities for the empirical distribution,” arXiv:1809.06522, 2018.
[6] L. Zhu et al., “Optimizing shot assignment in variational quantum eigensolver measurement,” Journal of Chemical Theory and Computation, 2024.
[7] A. Gu et al., “Adaptive shot allocation for fast convergence in variational quantum algorithms,” arXiv:2108.10434, 2021.
[8] S. Liang et al., “Artificial-intelligence-driven shot reduction in quantum measurement,” Chemical Physics Reviews, 2024.
[9] G. Bisicchia et al., “Distributing quantum computations, shot-wise,” in Future Internet, 2025.
[10] J. Garcia-Alonso et al., “Rethinking Services in the Quantum Age: The SOQ Paradigm,” in ACM Transactions on Software Engineering and Methodology, 2026.
[11] G. Bisicchia et al., “QSimBench: An execution-level benchmark suite for quantum software engineering,” in Proc. IEEE QCE, 2025.
