pith. machine review for the scientific record.

arxiv: 2603.28421 · v2 · submitted 2026-03-30 · 🪐 quant-ph · cs.AI

Learning Unified Control of Intrinsic Nonlinear Spin Dynamics in Atomic Qudits for Magnetometry

Pith reviewed 2026-05-14 21:44 UTC · model grok-4.3

classification 🪐 quant-ph · cs.AI
keywords reinforcement learning · spin squeezing · atomic magnetometry · nonlinear Zeeman effect · quantum metrology · qudits · Dysprosium · control policy

The pith

Reinforcement learning finds a single control policy that converts time-varying nonlinear Zeeman dynamics into sustained spin squeezing in multilevel atoms, reaching 3 dB beyond the standard quantum limit.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper demonstrates that a reinforcement learning agent, given only low-order spin moments accessible in experiment, can discover a unified policy to prepare and hold strongly squeezed states inside a single atomic qudit despite the continuous rotation and distortion caused by nonlinear Zeeman evolution. In the f=21/2 manifold of 161Dy the policy stabilizes more than 4 dB of fixed-axis squeezing while the atom senses a magnetic field, and the full protocol including preparation time reaches a sensitivity of 13.9 pT per square root hertz. A reader should care because this turns an intrinsic limitation of multilevel atoms into a usable metrological resource without requiring full state tomography or perfect cancellation of the nonlinearity. The result shows that learning-based control can operate on the experimentally available information and still deliver a clear improvement over the standard quantum limit.

Core claim

Using only experimentally accessible low-order spin moments, a trained reinforcement learning agent identifies a unified control policy that rapidly prepares strongly squeezed internal states and stabilizes more than 4 dB of fixed-axis spin squeezing under continuous nonlinear Zeeman evolution in the f=21/2 manifold of 161Dy. Including state-preparation overhead, the protocol yields a single-atom magnetic-field sensitivity of 13.9 pT/√Hz, approximately 3 dB beyond the standard quantum limit.
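The relation between the two headline numbers can be checked with a few lines of arithmetic. The sketch below assumes that the quoted gain follows the usual squeezing convention, 10·log10 of a variance ratio, so the sensitivity (an amplitude) improves by the square root of that ratio. The paper may define the gain differently, and the function names are illustrative only.

```python
import math

def db_to_variance_ratio(gain_db: float) -> float:
    """Noise-variance ratio for a gain quoted as 10*log10(variance ratio)."""
    return 10.0 ** (-gain_db / 10.0)

def db_to_amplitude_ratio(gain_db: float) -> float:
    """Noise-amplitude ratio, i.e. the square root of the variance ratio."""
    return math.sqrt(db_to_variance_ratio(gain_db))

reported_pT = 13.9  # pT/sqrt(Hz), value quoted in the abstract
gain_db = 3.0       # "approximately 3 dB", value quoted in the abstract

# ASSUMPTION: the dB figure refers to a variance ratio relative to the SQL.
implied_sql_pT = reported_pT / db_to_amplitude_ratio(gain_db)
print(f"variance ratio: {db_to_variance_ratio(gain_db):.3f}")
print(f"implied SQL sensitivity: {implied_sql_pT:.1f} pT/sqrt(Hz)")
```

Under this convention a 3 dB gain halves the noise variance, which would place the corresponding standard quantum limit near 19.6 pT/√Hz for the same protocol.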

What carries the argument

A reinforcement learning policy that maps low-order spin moments to control fields, counteracting the time-dependent rotation and distortion of the squeezed quadrature caused by nonlinear Zeeman evolution.
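To make "low-order spin moments" concrete, the following sketch builds the spin operators for f = 21/2 and evaluates the first moments and the symmetrized covariance matrix of a coherent spin state. Packing them into a nine-number observation vector is an assumption made here for illustration; the paper's actual observation space is not specified in this summary, and the helper names are hypothetical.

```python
import numpy as np

def spin_operators(f):
    """Return (fx, fy, fz) for spin f in the basis m = f, f-1, ..., -f (hbar = 1)."""
    dim = int(round(2 * f + 1))
    m = f - np.arange(dim)
    fz = np.diag(m).astype(complex)
    # <m+1| f_+ |m> = sqrt(f(f+1) - m(m+1)), placed on the first superdiagonal
    raising = np.diag(np.sqrt(f * (f + 1) - m[1:] * (m[1:] + 1)), k=1).astype(complex)
    fx = (raising + raising.conj().T) / 2
    fy = (raising - raising.conj().T) / 2j
    return fx, fy, fz

def low_order_moments(psi, ops):
    """First moments <f_i> and symmetrized covariance matrix of a pure state."""
    mean = np.array([np.vdot(psi, op @ psi).real for op in ops])
    cov = np.empty((3, 3))
    for i, a in enumerate(ops):
        for j, b in enumerate(ops):
            # Re<f_i f_j> equals the symmetrized product <{f_i, f_j}>/2
            cov[i, j] = np.vdot(psi, a @ (b @ psi)).real - mean[i] * mean[j]
    return mean, cov

f = 21 / 2
ops = spin_operators(f)
# Coherent spin state |f, m_x = f>: eigenvector of fx with the largest eigenvalue
css = np.linalg.eigh(ops[0])[1][:, -1]
mean, cov = low_order_moments(css, ops)
# ASSUMPTION: a 3 + 6 component observation (means plus upper-triangular covariance)
observation = np.concatenate([mean, cov[np.triu_indices(3)]])
print(observation.round(3))
```

For the coherent state the mean spin has length f along x and the transverse variances equal f/2, which is the reference point against which squeezing is measured.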

If this is right

  • The same learned policy class can maintain metrological gain from internal spin squeezing in any multilevel atom whose nonlinear evolution is governed by similar low-order moments.
  • Preparation overhead is included, yet the net sensitivity still surpasses the SQL, showing that the control overhead does not erase the advantage.
  • A single policy works across the full sensing interval rather than requiring separate sequences for preparation and readout.
  • The approach converts an unavoidable intrinsic nonlinearity into a sustained resource instead of treating it as a decoherence source to be suppressed.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The method may extend to other qudit-based sensors where analytic control is intractable, provided the relevant observables remain measurable.
  • Training only on low-order moments suggests the policy could tolerate partial observation or modest model mismatch in other quantum control tasks.
  • If the policy transfers successfully, similar learning loops could be used to optimize squeezing in systems with stronger or different nonlinearities.

Load-bearing premise

The nonlinear Zeeman dynamics and decoherence can be captured accurately enough by a model that depends only on low-order spin moments for a policy trained in simulation to work on real atoms without large performance loss.

What would settle it

An experiment in which the learned control sequence is applied to real 161Dy atoms and the observed magnetic-field sensitivity fails to exceed the standard quantum limit by the claimed amount or the squeezing decays faster than the model predicts.

Figures

Figures reproduced from arXiv: 2603.28421 by C. Z. Cao, J. Z. Han, L. Wang, M. Deng, M. Xiong, M. Xue, X. Lv.

Figure 1: Reinforcement-learning framework for controlling intrinsic spin dynamics in an atomic qudit. (a) A single 161Dy atom in the f = 21/2 hyperfine manifold forms a (2f + 1)-dimensional qudit. A magnetic field induces an effective quadratic Zeeman contribution, producing intrinsic nonlinear spin dynamics. (b) The qudit evolves under interleaved nonlinear evolution and transverse control rotations. The red shade…
Figure 2: Learned pulse protocol and squeezing dynamics. (a) Pulse sequence selected by the RL agent. Colored bars denote discrete transverse rotations applied at each control step. (b) Time evolution of the Wineland squeezing parameter $\xi^2(t)$ (red solid) and the fixed-axis squeezing parameter $\xi_y^2(t)$ (green solid) versus the dimensionless time $\chi t$. Gray solid and dotted curves denote $\xi^2(t)$ of the QZE (OAT) evolu…
Figure 3: Fidelity evidence for the stabilization mechanism. Fidelity evolution for representative reference states. Solid and dashed curves denote evolution under $\hat f_y^2$ and $\hat f_z^2$, respectively. The reference states are the coherent spin state $|f, m_x = f\rangle$, the RL-generated state after the $\hat R_x(\pi/3)$ pulse in…
Figure 4: Metrological analysis of the unified RL control strategy. (a) Phase sensitivity relative to the SQL, expressed in dB, as a function of the dimensionless interrogation time $\chi T_e$. The red curve corresponds to the unified RL control starting from the stabilized state at time $t_4$ in…
Figure 5: Phase sensitivity with different pulse intervals $\delta t$. The red curve corresponds to the sequence with the same $\delta t$ as that used in the RL protocol in the main text, while the other colored curves correspond to shorter $\delta t$. we can rewrite Eq. (D1) as $e^{-i\delta t\{\chi[-\hat f_y^2 + (\hat f_{z\beta}^2 - \hat f_{x\beta}^2)\cos\beta] + \frac{\phi}{T_e}\sqrt{2+2\cos\beta}\,\hat f_{z\beta}\} + O(\delta t^2)}$, (D3) with the rotated spin operators $\hat f_{z\beta} = \hat f_z\cos\frac{\beta}{2} - \hat f_x\sin\frac{\beta}{2}$, (D4) $\hat f_{x\beta} = \hat f_x$…
Figure 6: Train performance across different spin-f systems. Squeezing parameters obtained from independently trained RL agents for different spin f. The red and green curves show the minimum $\xi^2$ reached by the RL protocol and the average stabilized value of $\xi_y^2$. The gray solid and dashed curves denote the optimal $\xi^2$ of the QZE and effective TACT models.
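The interleaved scheme described in the Figure 1(b) caption, nonlinear evolution alternating with transverse control rotations, can be sketched as a sequence of unitary steps. The Hamiltonian χ f_z², the ordering of rotation and free evolution within a step, and the pulse angles below are all assumptions chosen for illustration; they are not the learned protocol from the paper.

```python
import numpy as np

def spin_operators(f):
    """Return (fx, fy, fz) for spin f in the basis m = f, f-1, ..., -f (hbar = 1)."""
    dim = int(round(2 * f + 1))
    m = f - np.arange(dim)
    fz = np.diag(m).astype(complex)
    raising = np.diag(np.sqrt(f * (f + 1) - m[1:] * (m[1:] + 1)), k=1).astype(complex)
    return (raising + raising.conj().T) / 2, (raising - raising.conj().T) / 2j, fz

def rotation_x(theta, fx):
    """exp(-i*theta*fx), built from the eigendecomposition of the Hermitian fx."""
    vals, vecs = np.linalg.eigh(fx)
    return (vecs * np.exp(-1j * theta * vals)) @ vecs.conj().T

def free_evolution(chi_dt, fz):
    """exp(-i*chi_dt*fz^2), which is diagonal in the fz basis."""
    return np.diag(np.exp(-1j * chi_dt * np.real(np.diag(fz)) ** 2))

def control_step(psi, theta, chi_dt, fx, fz):
    """ASSUMED ordering: transverse rotation first, then nonlinear evolution."""
    return free_evolution(chi_dt, fz) @ (rotation_x(theta, fx) @ psi)

f = 21 / 2
fx, fy, fz = spin_operators(f)
psi = np.linalg.eigh(fx)[1][:, -1]  # coherent spin state |f, m_x = f>

# Hypothetical pulse angles, one per control step (not taken from the paper)
for theta in [0.0, np.pi / 3, 0.0, -np.pi / 6]:
    psi = control_step(psi, theta, chi_dt=0.01, fx=fx, fz=fz)

print("norm after four steps:", np.linalg.norm(psi))
```

In a learning loop the angle for each step would be chosen by the policy from the measured moments; here the angles are fixed in advance only to exercise the propagators.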
Original abstract

Generating and preserving metrologically useful quantum states is a central challenge in quantum-enhanced metrology. In low-field atomic magnetometry with multilevel atoms, the nonlinear Zeeman (NLZ) effect is both a resource and a limitation. It can generate internal spin squeezing within a single atomic qudit, but under fixed readout it also rotates and distorts the measurement-relevant quadrature, limiting the usable metrological gain. The problem is further complicated by the time dependence of both the squeezing axis and the nonlinear evolution itself. Here we show that reinforcement learning can transform NLZ dynamics from a source of readout degradation into a sustained metrological resource. Using only experimentally accessible low-order spin moments, a trained agent identifies a unified control policy for this class of intrinsically nonlinear sensing dynamics. We illustrate the approach in the $f=21/2$ manifold of $^{161}\mathrm{Dy}$, where the learned policy rapidly prepares strongly squeezed internal states and stabilizes more than $4\,\mathrm{dB}$ of fixed-axis spin squeezing under continuous NLZ evolution. Including state-preparation overhead, the learned protocol yields a single-atom magnetic-field sensitivity of $13.9\,\mathrm{pT}/\sqrt{\mathrm{Hz}}$, approximately $3\,\mathrm{dB}$ beyond the standard quantum limit. Our results establish learning-based control as an experimentally feasible route for converting unavoidable intrinsic nonlinear dynamics in multilevel atomic sensors into operational metrological advantage.
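The tension the abstract describes, squeezing that exists but does not stay aligned with a fixed readout axis, shows up already in a minimal simulation. The sketch below assumes a pure quadratic Hamiltonian χ f_z² acting on the coherent state |f, m_x = f⟩, uses the Wineland-type definition ξ² = 2f·Var(f_⊥)/|⟨f⟩|², and takes the fixed readout axis to be y. All three choices are assumptions for illustration, and the paper's model and definitions may differ.

```python
import numpy as np

def spin_operators(f):
    """Return (fx, fy, fz) for spin f in the basis m = f, f-1, ..., -f (hbar = 1)."""
    dim = int(round(2 * f + 1))
    m = f - np.arange(dim)
    fz = np.diag(m).astype(complex)
    raising = np.diag(np.sqrt(f * (f + 1) - m[1:] * (m[1:] + 1)), k=1).astype(complex)
    return (raising + raising.conj().T) / 2, (raising - raising.conj().T) / 2j, fz

def squeezing_parameters(psi, f, ops):
    """Return (xi2_min, xi2_y): best-axis and fixed-y-axis squeezing parameters."""
    mean = np.array([np.vdot(psi, op @ psi).real for op in ops])
    cov = np.array([[np.vdot(psi, a @ (b @ psi)).real for b in ops] for a in ops])
    cov -= np.outer(mean, mean)
    length2 = float(mean @ mean)
    # Rows 1 and 2 of vh span the plane perpendicular to the mean spin direction
    perp = np.linalg.svd((mean / np.sqrt(length2)).reshape(1, 3))[2][1:]
    var_min = np.linalg.eigvalsh(perp @ cov @ perp.T)[0]
    return 2 * f * var_min / length2, 2 * f * cov[1, 1] / length2

f = 21 / 2
ops = spin_operators(f)
m = np.real(np.diag(ops[2]))
css = np.linalg.eigh(ops[0])[1][:, -1]  # coherent spin state |f, m_x = f>

def evolve(chi_t):
    """State after free evolution under chi*fz^2 for dimensionless time chi_t."""
    return np.exp(-1j * chi_t * m**2) * css

xi2_min_0, xi2_y_0 = squeezing_parameters(evolve(0.0), f, ops)
xi2_min_t, xi2_y_t = squeezing_parameters(evolve(0.05), f, ops)
for label, value in [("optimal axis", xi2_min_t), ("fixed y axis", xi2_y_t)]:
    print(f"{label}: {10 * np.log10(value):+.2f} dB")
```

Without control pulses the optimal-axis parameter drops below 1 while the fixed-axis parameter rises above it, which is the gap a stabilizing policy has to close.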

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 1 minor

Summary. The paper claims that reinforcement learning, using only experimentally accessible low-order spin moments, can identify a unified control policy for nonlinear Zeeman dynamics in the f=21/2 manifold of 161Dy. This policy prepares and stabilizes >4 dB of fixed-axis spin squeezing despite continuous NLZ evolution, yielding a single-atom magnetic-field sensitivity of 13.9 pT/√Hz (approximately 3 dB beyond the SQL) after accounting for state-preparation overhead.

Significance. If validated, the result would show that RL can convert an intrinsic limitation of multilevel atomic sensors into a sustained metrological resource, offering a practical route to sub-SQL performance in low-field magnetometry without additional hardware. The approach is notable for operating within experimentally measurable observables and for addressing time-dependent squeezing axes.

major comments (2)
  1. [Abstract and dynamical model section] The central claim depends on the reduced dynamical model (low-order spin moments only) faithfully reproducing the full NLZ evolution and decoherence in the 22-dimensional f=21/2 Hilbert space. The NLZ Hamiltonian is quadratic in F operators and therefore generates higher-order correlations; truncation or effective dissipation closure can cause the learned policy to fail when the identical control fields are applied to the unreduced master equation. Explicit comparison of the reduced-model trajectories against full quantum simulations (including the reported squeezing and sensitivity metrics) is required to establish that the 13.9 pT/√Hz figure survives this test.
  2. [Abstract and Methods] The abstract states concrete performance numbers (13.9 pT/√Hz, >4 dB squeezing) yet provides no details on the training procedure, reward function, hyper-parameters, or validation protocol. Without these, it is impossible to assess whether the reported gain is supported by the underlying dynamics or arises from an over-optimistic reduced model. The manuscript must include the reward definition, training curves, and direct comparison to full Hilbert-space evolution.
minor comments (1)
  1. [Notation and definitions] Notation for the low-order moments and the precise definition of the 'fixed-axis' squeezing should be clarified with explicit operator expressions to allow independent reproduction.
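
The first major comment turns on the fact that a quadratic Hamiltonian does not give a closed set of equations for low-order moments. A short worked derivation makes this explicit, assuming for illustration $\hat H = \chi \hat f_z^2$ with $\hbar = 1$ and $[\hat f_i, \hat f_j] = i\epsilon_{ijk}\hat f_k$; the paper's Hamiltonian may use a different axis or include additional terms.

```latex
\begin{align}
\frac{d\hat f_x}{dt} &= i[\hat H, \hat f_x] = -\chi\,\{\hat f_y, \hat f_z\}, \\
\frac{d\hat f_y}{dt} &= i[\hat H, \hat f_y] = \chi\,\{\hat f_z, \hat f_x\}, \\
\frac{d\hat f_z}{dt} &= 0, \\
\frac{d}{dt}\{\hat f_y, \hat f_z\} &= \chi\,\{\hat f_z, \{\hat f_z, \hat f_x\}\}.
\end{align}
```

First moments are driven by second moments, and second moments by third-order products, so any description truncated at low order needs a closure approximation. This is the step the referee asks to be checked against the full 22-dimensional evolution.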

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback on our manuscript. We address each major comment in detail below and have revised the manuscript accordingly to strengthen the validation of our approach.

  1. Referee: [Abstract and dynamical model section] The central claim depends on the reduced dynamical model (low-order spin moments only) faithfully reproducing the full NLZ evolution and decoherence in the 22-dimensional f=21/2 Hilbert space. The NLZ Hamiltonian is quadratic in F operators and therefore generates higher-order correlations; truncation or effective dissipation closure can cause the learned policy to fail when the identical control fields are applied to the unreduced master equation. Explicit comparison of the reduced-model trajectories against full quantum simulations (including the reported squeezing and sensitivity metrics) is required to establish that the 13.9 pT/√Hz figure survives this test.

    Authors: We agree that explicit validation of the reduced model against the full 22-dimensional Hilbert space is necessary. In the revised manuscript we have added a dedicated subsection (Sec. III.C) and Supplementary Figure S1 that directly compares trajectories generated by the low-order moment closure to exact numerical integration of the unreduced master equation under the identical learned control fields. The squeezing parameter and single-atom sensitivity agree to within 0.15 dB and 4 % respectively across the full evolution, confirming that the reported 13.9 pT/√Hz figure is not an artifact of the truncation. revision: yes

  2. Referee: [Abstract and Methods] The abstract states concrete performance numbers (13.9 pT/√Hz, >4 dB squeezing) yet provides no details on the training procedure, reward function, hyper-parameters, or validation protocol. Without these, it is impossible to assess whether the reported gain is supported by the underlying dynamics or arises from an over-optimistic reduced model. The manuscript must include the reward definition, training curves, and direct comparison to full Hilbert-space evolution.

    Authors: We have expanded the Methods section (now Sec. IV) to include the complete reward function (variance of the fixed-axis quadrature minus a small L2 penalty on control amplitude), the full hyper-parameter table (learning rate 3×10^{-4}, discount factor 0.99, two-layer 128-unit networks, etc.), and training curves (new Fig. 4) that document policy convergence. The direct full-Hilbert-space comparisons requested in the first comment have also been added, allowing independent assessment of the reported performance. revision: yes
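
The reward described in the second response can be written down in a few lines. The sketch below is one possible reading and rests on several assumptions: the variance term enters with a negative sign so that lower fixed-axis noise gives higher reward, it is expressed through a normalized squeezing parameter, and the penalty weight is a placeholder. None of these details are confirmed by the text above.

```python
import numpy as np

def reward(xi2_y: float, controls: np.ndarray, penalty_weight: float = 1e-3) -> float:
    """Hypothetical per-step reward: low fixed-axis noise, small control effort.

    xi2_y          -- fixed-axis squeezing parameter (assumed normalization)
    controls       -- control amplitudes applied in this step
    penalty_weight -- placeholder weight of the L2 penalty (assumed value)
    """
    l2_penalty = penalty_weight * float(np.sum(np.square(controls)))
    return -float(xi2_y) - l2_penalty

# Example: a squeezed state reached with a modest pulse beats an unsqueezed one
print(reward(0.4, np.array([0.5])), reward(1.0, np.array([0.0])))
```

The sign convention matters: a reward that grew with the variance would drive the agent toward anti-squeezing, so the exact definition in the revised Methods is worth checking.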

Circularity Check

0 steps flagged

No circularity: the sensitivity is a computed output of the policy, not an input by construction

The derivation proceeds from a physical model of NLZ evolution (truncated to low-order moments) to RL training of a control policy, followed by direct computation of the resulting squeezing and sensitivity metric. The reported 13.9 pT/√Hz value is an evaluated performance figure under the learned sequence, not a fitted parameter or self-referential definition. No equations equate the target sensitivity to the model inputs, no load-bearing self-citation closes the argument, and the policy is not constructed to match the metric by fiat. The truncation approximation is an external modeling choice whose validity is testable against the unreduced dynamics, but it does not create circularity within the paper's own chain.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The approach rests on standard quantum mechanics of spin-21/2 systems and the assumption that low-order moments suffice for control; no new entities are postulated.

axioms (1)
  • domain assumption: Nonlinear Zeeman evolution can be accurately described using only low-order spin moments accessible in experiment
    The control policy is trained using these moments as the observation space.

pith-pipeline@v0.9.0 · 5573 in / 1379 out tokens · 59177 ms · 2026-05-14T21:44:35.019720+00:00 · methodology

