Gradient-descent methods for scalable quantum detector tomography
Pith reviewed 2026-05-17 20:46 UTC · model grok-4.3
The pith
Gradient-descent optimization reconstructs the POVM of phase-insensitive quantum detectors faster than constrained convex optimization while matching or exceeding its reconstruction fidelity.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Gradient descent can be applied directly to the parameters of a POVM to solve the quantum detector tomography problem for phase-insensitive detectors. In numerical tests that include realistic noise levels and restricted probe resources, the resulting reconstruction of the detector response matches or surpasses the fidelity of constrained convex optimization while requiring far less runtime. The authors also propose extending the same framework to the phase-sensitive case through a manifold-constrained parametrization.
What carries the argument
Gradient descent optimization of POVM matrix elements (with Stiefel-manifold parametrization for the phase-sensitive extension) to minimize the mismatch between predicted and observed detection statistics.
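As an illustration of the mechanism, here is a minimal sketch of gradient descent on a phase-insensitive POVM. It assumes the least-squares loss and row-wise softmax parametrization quoted later on this page; the coherent-state probe model, the lossy photon-number-resolving test detector, the helper names (`poisson_probes`, `lossy_pnr_povm`, `fit_povm`), and the hyperparameters are illustrative assumptions, not the paper's implementation.

```python
import numpy as np
from math import comb, lgamma


def poisson_probes(mean_photons, n_max):
    """Probe matrix F: row d is the photon-number distribution of probe d,
    truncated at n_max and renormalized (coherent-state probes assumed)."""
    n = np.arange(n_max + 1)
    log_fact = np.array([lgamma(k + 1.0) for k in n])
    mu = np.asarray(mean_photons, dtype=float)[:, None]
    F = np.exp(n * np.log(mu) - mu - log_fact)
    return F / F.sum(axis=1, keepdims=True)


def lossy_pnr_povm(n_max, n_outcomes, eta):
    """Hypothetical test detector: binomial loss with efficiency eta;
    the last outcome collects every click count >= n_outcomes - 1."""
    Pi = np.zeros((n_max + 1, n_outcomes))
    for n in range(n_max + 1):
        for k in range(n + 1):
            Pi[n, min(k, n_outcomes - 1)] += comb(n, k) * eta**k * (1 - eta) ** (n - k)
    return Pi


def softmax_rows(theta):
    """Row-wise softmax: every row is non-negative and sums to 1."""
    z = theta - theta.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)


def fit_povm(F, P, steps=5000, lr=0.05, seed=0):
    """Plain gradient descent on the logits theta, with Pi = softmax_rows(theta)."""
    rng = np.random.default_rng(seed)
    theta = 0.01 * rng.standard_normal((F.shape[1], P.shape[1]))
    losses = []
    for _ in range(steps):
        Pi = softmax_rows(theta)
        R = F @ Pi - P                      # residual of predicted statistics
        losses.append(float((R**2).sum()))  # squared Frobenius norm
        G = 2.0 * F.T @ R                   # gradient with respect to Pi
        # chain rule through the row-wise softmax
        theta -= lr * Pi * (G - (G * Pi).sum(axis=1, keepdims=True))
    return softmax_rows(theta), np.array(losses)


F = poisson_probes(np.linspace(0.1, 6.0, 40), n_max=7)
Pi_true = lossy_pnr_povm(n_max=7, n_outcomes=4, eta=0.8)
P = F @ Pi_true                             # noiseless synthetic statistics
Pi_fit, losses = fit_povm(F, P)
print(f"loss: {losses[0]:.3e} -> {losses[-1]:.3e}")
print("max |Pi_fit - Pi_true| =", np.abs(Pi_fit - Pi_true).max())
```

Because the softmax keeps every iterate inside the constraint set, no projection or constrained solver is needed. A practical implementation would more likely use automatic differentiation and an adaptive optimizer.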
If this is right
- The method enables tomography of higher-dimensional or more complex detectors within practical laboratory time budgets.
- Reconstruction remains reliable when measurement noise is present and only limited probe states can be prepared.
- The Stiefel-manifold parametrization brings gradient-based optimization to phase-sensitive detectors without leaving the space of valid POVMs.
- Reduced computation time makes repeated calibrations feasible during long experimental runs.
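The Stiefel-manifold point above can be made concrete with a short sketch. It assumes one standard construction: stack matrices A_k into a tall matrix A with orthonormal columns, and set E_k = A_k^† A_k, so that the E_k are positive semi-definite and sum to the identity automatically. The function names and the QR-based retraction are illustrative choices; the paper's exact parametrization is not given on this page and may differ.

```python
import numpy as np


def qr_retraction(X):
    """Map a full-rank tall matrix X to a matrix with orthonormal columns."""
    Q, R = np.linalg.qr(X)       # reduced QR: Q has the same shape as X
    d = np.diag(R)
    return Q * (d / np.abs(d))   # fix column phases so the result is unique


def povm_from_stiefel(A, n_outcomes):
    """Split A into square blocks A_k and return the stack of E_k = A_k^dagger A_k."""
    dim = A.shape[1]
    blocks = A.reshape(n_outcomes, dim, dim)
    return np.einsum("kai,kaj->kij", blocks.conj(), blocks)


rng = np.random.default_rng(0)
n_outcomes, dim = 3, 4
shape = (n_outcomes * dim, dim)

# A random point on the complex Stiefel manifold (A^dagger A = identity).
A = qr_retraction(rng.standard_normal(shape) + 1j * rng.standard_normal(shape))
povm = povm_from_stiefel(A, n_outcomes)

# An arbitrary perturbation stands in for a gradient step here;
# the retraction brings the iterate back onto the manifold.
step = 0.1 * (rng.standard_normal(shape) + 1j * rng.standard_normal(shape))
A_next = qr_retraction(A + step)
povm_next = povm_from_stiefel(A_next, n_outcomes)

print("completeness error:", np.abs(povm_next.sum(axis=0) - np.eye(dim)).max())
```

A full Riemannian optimizer would also project the gradient onto the tangent space before retracting; this sketch only shows that every iterate remains a valid POVM.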
Where Pith is reading between the lines
- The same gradient machinery could be combined with adaptive probe-state selection to further reduce the total number of measurements needed.
- Integration into real-time control software might allow detectors to be recalibrated on the fly as experimental conditions drift.
- Because the approach is first-order and local, it may scale more gracefully than convex solvers when detector Hilbert spaces become large.
Load-bearing premise
Gradient descent will converge to a high-fidelity POVM without becoming trapped in poor local minima, and the added-noise simulations used for benchmarking will accurately reflect real experimental performance.
What would settle it
An experiment on a calibrated phase-insensitive detector where the gradient-descent reconstruction produces measurably lower fidelity than a constrained convex optimization run on the same data set, or where the gradient method fails to converge within the reported time advantage.
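Such a comparison needs a fidelity measure between reconstructed and reference POVM elements. The page does not state the paper's definition, so the sketch below assumes the normalized fidelity F(A, B) = [Tr sqrt(sqrt(A) B sqrt(A))]^2 / (Tr A Tr B), which for diagonal (phase-insensitive) elements reduces to a sum over Fock-basis entries. The function names are hypothetical.

```python
import numpy as np


def diagonal_povm_fidelity(a, b):
    """Normalized fidelity between two diagonal POVM elements,
    given as vectors of their (non-negative) Fock-basis diagonals."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return float(np.sqrt(a * b).sum() ** 2 / (a.sum() * b.sum()))


def mean_fidelity(Pi_a, Pi_b):
    """Average fidelity over outcomes; column k holds the diagonal of element k."""
    n_outcomes = Pi_a.shape[1]
    return float(np.mean([diagonal_povm_fidelity(Pi_a[:, k], Pi_b[:, k])
                          for k in range(n_outcomes)]))


same = diagonal_povm_fidelity([1.0, 0.5, 0.0], [1.0, 0.5, 0.0])
disjoint = diagonal_povm_fidelity([1.0, 0.5, 0.0], [0.0, 0.0, 1.0])
close = diagonal_povm_fidelity([1.0, 0.5, 0.1], [0.9, 0.6, 0.1])
print(same, disjoint, close)
```

By the Cauchy-Schwarz inequality this quantity lies between 0 and 1, reaching 1 only when the two elements are proportional.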
Original abstract
We present a technique for performing quantum detector tomography (QDT) of phase-insensitive quantum detectors, a category that covers many detectors of interest, using gradient-descent-based optimization to learn the positive operator-valued measure (POVM) that best describes the data collected using the detector under study. We numerically benchmark our method against constrained convex optimization (CCO) and show that it reaches higher or comparable reconstruction fidelity in much less time, even in the presence of noise and limited probe-state resources. We also present a possible extension of our approach to the phase-sensitive case via a parametrization of POVMs on the complex Stiefel manifold, which enables gradient-based optimization restricted to this manifold.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper introduces gradient descent methods for quantum detector tomography (QDT) of phase-insensitive detectors by optimizing the POVM parameters to fit experimental data. Numerical benchmarks against constrained convex optimization (CCO) demonstrate that the proposed method achieves higher or comparable reconstruction fidelity in significantly less time, even with noise and limited probe states. The work also proposes an extension to phase-sensitive detectors using a parametrization on the complex Stiefel manifold.
Significance. If the numerical advantages hold under broader conditions, this method could provide a more scalable alternative to convex optimization for characterizing quantum detectors, which is relevant for quantum information processing tasks involving higher-dimensional systems. The benchmarks offer initial support for practical efficiency gains, but the absence of scaling analysis limits the strength of the contribution.
major comments (2)
- [Numerical benchmarks] Numerical benchmarks section: The comparisons are performed only on low-dimensional phase-insensitive detectors with small Fock-space truncation levels. Since the title and abstract emphasize scalability, the manuscript requires explicit scaling studies (e.g., wall-clock time and fidelity versus Hilbert-space dimension or number of probe states) to confirm that the reported speed-up persists beyond the tested regime; otherwise the central claim of practical advantage over CCO is not yet load-bearing.
- [Method / Optimization details] Optimization and convergence: No empirical or theoretical analysis is provided on robustness to local minima, initialization dependence, or basin-hopping frequency for the gradient-descent procedure. Given that POVM fitting landscapes are generally non-convex, this omission directly affects the reliability of the fidelity claims under noise and limited probes.
minor comments (3)
- [Abstract] The abstract states 'much less time' without reporting concrete timing metrics, hardware specifications, or iteration counts used in the CCO versus GD comparison.
- [Method] Clarify the explicit parametrization chosen for the diagonal (phase-insensitive) POVM elements in the gradient-descent update rule.
- [Extension to phase-sensitive case] The Stiefel-manifold extension is described only as 'possible'; if it is intended as a contribution, a minimal numerical demonstration or convergence guarantee should be added or the claim should be toned down.
Simulated Author's Rebuttal
We are grateful to the referee for their thoughtful review and valuable suggestions. We have carefully considered the major comments and provide point-by-point responses below. Where appropriate, we will revise the manuscript to address the concerns raised.
- Referee: [Numerical benchmarks] Numerical benchmarks section: The comparisons are performed only on low-dimensional phase-insensitive detectors with small Fock-space truncation levels. Since the title and abstract emphasize scalability, the manuscript requires explicit scaling studies (e.g., wall-clock time and fidelity versus Hilbert-space dimension or number of probe states) to confirm that the reported speed-up persists beyond the tested regime; otherwise the central claim of practical advantage over CCO is not yet load-bearing.
  Authors: We agree with the referee that demonstrating scalability through explicit scaling studies is important to support the claims in the title and abstract. In the revised manuscript, we will include additional numerical results showing how the wall-clock time and reconstruction fidelity scale with increasing Hilbert-space dimension (e.g., larger Fock space truncations) and varying numbers of probe states. These studies will help confirm whether the observed computational advantages over constrained convex optimization persist in higher-dimensional regimes.
  Revision: yes
- Referee: [Method / Optimization details] Optimization and convergence: No empirical or theoretical analysis is provided on robustness to local minima, initialization dependence, or basin-hopping frequency for the gradient-descent procedure. Given that POVM fitting landscapes are generally non-convex, this omission directly affects the reliability of the fidelity claims under noise and limited probes.
  Authors: We acknowledge that the non-convex nature of the optimization landscape warrants further investigation into robustness and initialization effects. While our current implementation employs random initializations and reports the best outcome across trials, we did not include a systematic study of convergence behavior or sensitivity to starting points. In the revised manuscript, we will add empirical analysis, such as statistics over multiple initializations and observations on convergence under noisy conditions, to better substantiate the reliability of the reported fidelities.
  Revision: yes
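The multi-start study promised here could take roughly the following shape: run the same fit from several random initializations on one noisy data set and report the spread of final losses alongside the best one. This is a sketch under assumptions. The softmax parametrization, the coherent-state probes, the additive Gaussian noise model, the randomly drawn test detector, and all hyperparameters are illustrative and are not taken from the paper.

```python
import numpy as np
from math import lgamma


def softmax_rows(theta):
    """Row-wise softmax: every row is non-negative and sums to 1."""
    z = theta - theta.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)


def final_loss(F, P, seed, init_scale=2.0, steps=3000, lr=0.05):
    """Run gradient descent from one random initialization; return the final loss."""
    rng = np.random.default_rng(seed)
    theta = init_scale * rng.standard_normal((F.shape[1], P.shape[1]))
    for _ in range(steps):
        Pi = softmax_rows(theta)
        G = 2.0 * F.T @ (F @ Pi - P)
        theta -= lr * Pi * (G - (G * Pi).sum(axis=1, keepdims=True))
    Pi = softmax_rows(theta)
    return float(((F @ Pi - P) ** 2).sum())


# Synthetic problem: 40 coherent-state probes, Fock truncation at n = 7, 4 outcomes.
rng = np.random.default_rng(123)
n = np.arange(8)
mu = np.linspace(0.1, 6.0, 40)[:, None]
F = np.exp(n * np.log(mu) - mu - np.array([lgamma(k + 1.0) for k in n]))
F /= F.sum(axis=1, keepdims=True)
Pi_true = softmax_rows(rng.standard_normal((8, 4)))          # hypothetical detector
P_noisy = F @ Pi_true + 0.01 * rng.standard_normal((40, 4))  # assumed noise model

final_losses = np.array([final_loss(F, P_noisy, seed) for seed in range(10)])
print("best  :", final_losses.min())
print("mean  :", final_losses.mean())
print("spread:", final_losses.std())
```

A spread that is small relative to the best loss would suggest weak dependence on initialization for this instance; a large spread would indicate that reporting only the best trial hides failed runs.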
Circularity Check
No circularity: standard GD optimization applied to QDT data fitting with independent numerical benchmarks
The paper applies gradient descent to minimize a loss function for reconstructing phase-insensitive POVMs from tomography data and benchmarks runtime and fidelity against constrained convex optimization on simulated datasets with noise and limited probes. These are direct empirical comparisons of two standard optimization approaches on the same fitting problem; no derivation chain reduces the reported performance advantage to a fitted parameter, self-definition, or self-citation. The Stiefel-manifold extension is described as a possible future parametrization without any load-bearing uniqueness theorem or ansatz imported from prior self-work. The method is self-contained against external benchmarks and does not rename known results or smuggle assumptions via citation.
Axiom & Free-Parameter Ledger
free parameters (1)
- gradient descent hyperparameters (learning rate, iterations)
axioms (1)
- [standard math] Detector response is fully described by a POVM: positive semi-definite operators summing to the identity.
Lean theorems connected to this paper
- IndisputableMonolith/Cost/FunctionalEquation.lean, washburn_uniqueness_aczel
  Relation between the paper passage and the cited Recognition theorem: unclear
  Paper passage: $\min_{\Pi} \lVert P - F\Pi \rVert_F^2$ subject to $\Pi_{ij} \ge 0$, with the rows of $\Pi$ summing to 1; softmax applied to the rows of $\Pi$
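For readers who want the update rule implied by this passage, the gradient through a row-wise softmax can be written out as below. The shape conventions ($F$ of size $D \times N$ for $D$ probe states and $N$ Fock levels, $\Pi$ of size $N \times K$ for $K$ outcomes, $P$ of size $D \times K$), the logit matrix $\Theta$, and the step size $\eta$ are assumptions introduced for this derivation; the page gives only the objective and the constraint.

```latex
\begin{align}
  \Pi_{nk} &= \frac{e^{\Theta_{nk}}}{\sum_{j} e^{\Theta_{nj}}},
  &
  L(\Theta) &= \lVert P - F\Pi \rVert_F^2, \\
  G &\equiv \frac{\partial L}{\partial \Pi} = 2\,F^{\mathsf T}\,(F\Pi - P),
  &
  \frac{\partial L}{\partial \Theta_{nk}}
    &= \Pi_{nk}\Bigl(G_{nk} - \sum_{j} G_{nj}\,\Pi_{nj}\Bigr), \\
  \Theta &\leftarrow \Theta - \eta\,\nabla_{\Theta} L .
\end{align}
```

Under this parametrization the constraints $\Pi_{ij} \ge 0$ and unit row sums hold for every $\Theta$, so the constrained problem becomes an unconstrained one in $\Theta$, at the price of a non-convex landscape.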
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.