arxiv: 2603.28421 · v2 · submitted 2026-03-30 · 🪐 quant-ph · cs.AI

Learning Unified Control of Intrinsic Nonlinear Spin Dynamics in Atomic Qudits for Magnetometry

C. Z. Cao, J. Z. Han, M. Xiong, M. Deng, L. Wang, X. Lv, M. Xue

Pith reviewed 2026-05-14 21:44 UTC · model grok-4.3

classification 🪐 quant-ph cs.AI

keywords: reinforcement learning, spin squeezing, atomic magnetometry, nonlinear Zeeman effect, quantum metrology, qudits, Dysprosium, control policy

The pith

Reinforcement learning finds a single control policy that converts time-varying nonlinear Zeeman dynamics into sustained spin squeezing in multilevel atoms, reaching 3 dB beyond the standard quantum limit.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper demonstrates that a reinforcement learning agent, given only low-order spin moments accessible in experiment, can discover a unified policy to prepare and hold strongly squeezed states inside a single atomic qudit despite the continuous rotation and distortion caused by nonlinear Zeeman evolution. In the f=21/2 manifold of 161Dy the policy stabilizes more than 4 dB of fixed-axis squeezing while the atom senses a magnetic field, and the full protocol including preparation time reaches a sensitivity of 13.9 pT per square root hertz. A reader should care because this turns an intrinsic limitation of multilevel atoms into a usable metrological resource without requiring full state tomography or perfect cancellation of the nonlinearity. The result shows that learning-based control can operate on the experimentally available information and still deliver a clear improvement over the standard quantum limit.

Core claim

Using only experimentally accessible low-order spin moments, a trained reinforcement learning agent identifies a unified control policy that rapidly prepares strongly squeezed internal states and stabilizes more than 4 dB of fixed-axis spin squeezing under continuous nonlinear Zeeman evolution in the f=21/2 manifold of 161Dy. Including state-preparation overhead, the protocol yields a single-atom magnetic-field sensitivity of 13.9 pT/√Hz, approximately 3 dB beyond the standard quantum limit.
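The decibel figures in the claim can be unpacked with a few lines of arithmetic. The sketch below assumes that "3 dB beyond the standard quantum limit" compares variances, so amplitude sensitivities scale as 10^(dB/20); that convention and the helper names are assumptions, not something the summary states. Under that reading, the reported 13.9 pT/√Hz would correspond to an SQL value somewhat below 20 pT/√Hz.

```python
def db_to_variance_ratio(gain_db: float) -> float:
    """Variance (power) ratio corresponding to a gain of `gain_db` below a reference."""
    return 10.0 ** (-gain_db / 10.0)


def reference_sensitivity(sensitivity: float, gain_db: float) -> float:
    """Amplitude sensitivity of the reference implied by a gain quoted in dB.

    Assumes the dB figure compares variances, so amplitudes scale as 10**(dB/20).
    """
    return sensitivity * 10.0 ** (gain_db / 20.0)


reported = 13.9  # pT/sqrt(Hz), value quoted in the text
gain_db = 3.0    # "approximately 3 dB", value quoted in the text

variance_ratio = db_to_variance_ratio(gain_db)
implied_sql = reference_sensitivity(reported, gain_db)

print(f"variance ratio relative to SQL: {variance_ratio:.3f}")
print(f"implied SQL sensitivity: {implied_sql:.1f} pT/sqrt(Hz)")
```

Because the text says "approximately 3 dB", the implied SQL value is indicative only.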

What carries the argument

A reinforcement learning policy that maps low-order spin moments to control fields, counteracting the time-dependent rotation and distortion of the squeezed quadrature caused by nonlinear Zeeman evolution.
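To make "low-order spin moments" concrete, the following sketch computes first moments and symmetrized covariances of the spin operators for a state in the f = 21/2 manifold. Which moments the agent actually observes is not specified in this summary, so restricting to first and second moments is an assumption, and the helper names `spin_operators` and `low_order_moments` are hypothetical.

```python
import numpy as np


def spin_operators(f):
    """Return (fx, fy, fz) in the basis m = f, f-1, ..., -f."""
    m = np.arange(f, -f - 1, -1)
    # <m+1| f_+ |m> = sqrt(f(f+1) - m(m+1)) sits on the superdiagonal in this ordering.
    fplus = np.diag(np.sqrt(f * (f + 1) - m[1:] * (m[1:] + 1)), k=1).astype(complex)
    fx = (fplus + fplus.conj().T) / 2
    fy = (fplus - fplus.conj().T) / 2j
    fz = np.diag(m).astype(complex)
    return fx, fy, fz


def low_order_moments(psi, ops):
    """First moments <f_i> and symmetrized covariances for a pure state psi."""
    means = np.array([np.vdot(psi, op @ psi).real for op in ops])
    cov = np.zeros((3, 3))
    for i, a in enumerate(ops):
        for j, b in enumerate(ops):
            sym = np.vdot(psi, (a @ b + b @ a) @ psi).real / 2
            cov[i, j] = sym - means[i] * means[j]
    return means, cov


f = 21 / 2
ops = spin_operators(f)
# Coherent spin state along +x: the eigenvector of fx with the largest eigenvalue.
psi_css = np.linalg.eigh(ops[0])[1][:, -1]
means, cov = low_order_moments(psi_css, ops)
print("means:", np.round(means, 6))
print("variances:", np.round(np.diag(cov), 6))
```

For the coherent state used here the transverse variances equal f/2, which is the reference level against which squeezing is measured.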

If this is right

The same learned policy class can maintain metrological gain from internal spin squeezing in any multilevel atom whose nonlinear evolution is governed by similar low-order moments.
Preparation overhead is included yet the net sensitivity still surpasses the SQL, showing that the control overhead does not erase the advantage.
A single policy works across the full sensing interval rather than requiring separate sequences for preparation and readout.
The approach converts an unavoidable intrinsic nonlinearity into a sustained resource instead of treating it as a decoherence source to be suppressed.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The method may extend to other qudit-based sensors where analytic control is intractable, provided the relevant observables remain measurable.
Training on low-order moments only suggests the policy could tolerate partial observation or modest model mismatch in other quantum control tasks.
If the policy transfers successfully, similar learning loops could be used to optimize squeezing in systems with stronger or different nonlinearities.

Load-bearing premise

The nonlinear Zeeman dynamics and decoherence can be captured accurately enough by a model that depends only on low-order spin moments for a policy trained in simulation to work on real atoms without large performance loss.

What would settle it

An experiment in which the learned control sequence is applied to real 161Dy atoms and the observed magnetic-field sensitivity fails to exceed the standard quantum limit by the claimed amount or the squeezing decays faster than the model predicts.

Figures

Figures reproduced from arXiv: 2603.28421 by C. Z. Cao, J. Z. Han, L. Wang, M. Deng, M. Xiong, M. Xue, X. Lv.

**Figure 1.** Reinforcement-learning framework for controlling intrinsic spin dynamics in an atomic qudit. (a) A single 161Dy atom in the f = 21/2 hyperfine manifold forms a (2f + 1)-dimensional qudit. A magnetic field induces an effective quadratic Zeeman contribution, producing intrinsic nonlinear spin dynamics. (b) The qudit evolves under interleaved nonlinear evolution and transverse control rotations. The red shade…

**Figure 2.** Learned pulse protocol and squeezing dynamics. (a) Pulse sequence selected by the RL agent. Colored bars denote discrete transverse rotations applied at each control step. (b) Time evolution of the Wineland squeezing parameter $\xi^2(t)$ (red solid) and the fixed-axis squeezing parameter $\xi_y^2(t)$ (green solid) versus the dimensionless time $\chi t$. Gray solid and dotted curves denote $\xi^2(t)$ of the QZE (OAT) evolu…

**Figure 3.** Fidelity evidence for the stabilization mechanism. Fidelity evolution for representative reference states. Solid and dashed curves denote evolution under $\hat f_y^2$ and $\hat f_z^2$, respectively. The reference states are the coherent spin state $|f, m_x = f\rangle$, the RL-generated state after the $\hat R_x(\pi/3)$ pulse in…

**Figure 4.** Metrological analysis of the unified RL control strategy. (a) Phase sensitivity relative to the SQL, expressed in dB, as a function of the dimensionless interrogation time $\chi T_e$. The red curve corresponds to the unified RL control starting from the stabilized state at time $t_4$ in…

**Figure 5.** Phase sensitivity with different pulse intervals $\delta t$. The red curve corresponds to the sequence with the same $\delta t$ as that used in the RL protocol in the main text, while the other colored curves correspond to shorter $\delta t$. … we can rewrite Eq. (D1) as $e^{-i\delta t\{\chi[-\hat f_y^2 + (\hat f_{z\beta}^2 - \hat f_{x\beta}^2)\cos\beta] + \frac{\phi}{T_e}\sqrt{2+2\cos\beta}\,\hat f_{z\beta}\}} + O(\delta t^2)$ (D3), with the rotated spin operators $\hat f_{z\beta} = \hat f_z\cos\frac{\beta}{2} - \hat f_x\sin\frac{\beta}{2}$ (D4), $\hat f_{x\beta} = \hat f_x$…

**Figure 6.** Train performance across different spin-f systems. Squeezing parameters obtained from independently trained RL agents for different spin f. The red and green curves show the minimum $\xi^2$ reached by the RL protocol and the average stabilized value of $\xi_y^2$. The gray solid and dashed curves denote the optimal $\xi^2$ of the QZE and effective TACT models.
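The interleaved structure described in Figure 1(b), free nonlinear evolution followed by a transverse control rotation, can be sketched as a single step function. This is a minimal illustration, assuming a nonlinear term of the form χ f_z² (with ħ = 1) and rotations about x; the pulse angles and intervals below are hypothetical placeholders, not the learned sequence.

```python
import numpy as np


def spin_operators(f):
    """Return (m, fx, fz) in the basis m = f, f-1, ..., -f."""
    m = np.arange(f, -f - 1, -1)
    fplus = np.diag(np.sqrt(f * (f + 1) - m[1:] * (m[1:] + 1)), k=1).astype(complex)
    fx = (fplus + fplus.conj().T) / 2
    fz = np.diag(m).astype(complex)
    return m, fx, fz


def rotation_x(theta, fx):
    """exp(-i theta f_x), built from the eigendecomposition of the Hermitian f_x."""
    vals, vecs = np.linalg.eigh(fx)
    return (vecs * np.exp(-1j * theta * vals)) @ vecs.conj().T


def control_step(psi, m, fx, chi_dt, theta):
    """Free evolution under chi * f_z^2 for chi*dt, then one rotation about x."""
    psi = np.exp(-1j * chi_dt * m**2) * psi  # exp(-i chi dt f_z^2) is diagonal in the m basis
    return rotation_x(theta, fx) @ psi


f = 21 / 2
m, fx, fz = spin_operators(f)
psi0 = np.linalg.eigh(fx)[1][:, -1]  # coherent spin state along +x

# Hypothetical (chi*dt, angle) pairs, chosen only to exercise the step function.
pulses = [(0.05, np.pi / 3), (0.05, 0.0), (0.05, -np.pi / 6)]
psi = psi0.copy()
for chi_dt, theta in pulses:
    psi = control_step(psi, m, fx, chi_dt, theta)

print("norm after sequence:", np.linalg.norm(psi))
```

An RL environment built on such a step would expose spin moments of `psi` as observations and the rotation angle as the action, though the details of the paper's environment are not given here.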

Original abstract

Generating and preserving metrologically useful quantum states is a central challenge in quantum-enhanced metrology. In low-field atomic magnetometry with multilevel atoms, the nonlinear Zeeman (NLZ) effect is both a resource and a limitation. It can generate internal spin squeezing within a single atomic qudit, but under fixed readout it also rotates and distorts the measurement-relevant quadrature, limiting the usable metrological gain. The problem is further complicated by the time dependence of both the squeezing axis and the nonlinear evolution itself. Here we show that reinforcement learning can transform NLZ dynamics from a source of readout degradation into a sustained metrological resource. Using only experimentally accessible low-order spin moments, a trained agent identifies a unified control policy for this class of intrinsically nonlinear sensing dynamics. We illustrate the approach in the $f=21/2$ manifold of $^{161}\mathrm{Dy}$, where the learned policy rapidly prepares strongly squeezed internal states and stabilizes more than $4\,\mathrm{dB}$ of fixed-axis spin squeezing under continuous NLZ evolution. Including state-preparation overhead, the learned protocol yields a single-atom magnetic-field sensitivity of $13.9\,\mathrm{pT}/\sqrt{\mathrm{Hz}}$, approximately $3\,\mathrm{dB}$ beyond the standard quantum limit. Our results establish learning-based control as an experimentally feasible route for converting unavoidable intrinsic nonlinear dynamics in multilevel atomic sensors into operational metrological advantage.
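The abstract's statement that NLZ dynamics generate internal squeezing can be illustrated numerically. The sketch below is not the paper's protocol: it assumes a pure one-axis-twisting term χ f_z² with no control pulses and no decoherence, and evaluates the Wineland parameter as 2f times the minimum transverse variance divided by the squared mean spin length (an assumed single-qudit convention). The function names are hypothetical.

```python
import numpy as np

F = 21 / 2  # spin quantum number quoted in the text


def spin_operators(f):
    """Return m values and (fx, fy, fz) in the basis m = f, f-1, ..., -f."""
    m = np.arange(f, -f - 1, -1)
    fplus = np.diag(np.sqrt(f * (f + 1) - m[1:] * (m[1:] + 1)), k=1).astype(complex)
    fx = (fplus + fplus.conj().T) / 2
    fy = (fplus - fplus.conj().T) / 2j
    fz = np.diag(m).astype(complex)
    return m, (fx, fy, fz)


def expect(psi, op):
    return np.vdot(psi, op @ psi).real


def wineland_xi2(psi, ops, f):
    """2f * (minimum variance perpendicular to the mean spin) / |<f>|^2."""
    mean = np.array([expect(psi, op) for op in ops])
    n = mean / np.linalg.norm(mean)
    helper = np.array([0.0, 0.0, 1.0]) if abs(n[2]) < 0.9 else np.array([1.0, 0.0, 0.0])
    e1 = np.cross(n, helper)
    e1 /= np.linalg.norm(e1)
    e2 = np.cross(n, e1)
    perp = [sum(c * op for c, op in zip(e, ops)) for e in (e1, e2)]
    cov = np.empty((2, 2))
    for i, a in enumerate(perp):
        for j, b in enumerate(perp):
            cov[i, j] = expect(psi, (a @ b + b @ a) / 2) - expect(psi, a) * expect(psi, b)
    return 2 * f * np.linalg.eigvalsh(cov)[0] / np.dot(mean, mean)


m, ops = spin_operators(F)
psi0 = np.linalg.eigh(ops[0])[1][:, -1]  # coherent spin state along +x
chi_t = np.linspace(0.0, 0.4, 201)
# exp(-i chi t f_z^2) is diagonal in the m basis, so evolution is an elementwise phase.
xi2 = np.array([wineland_xi2(np.exp(-1j * t * m**2) * psi0, ops, F) for t in chi_t])

best = int(np.argmin(xi2))
print(f"minimum xi^2 = {xi2[best]:.3f} ({-10 * np.log10(xi2[best]):.1f} dB) at chi*t = {chi_t[best]:.3f}")
```

In this uncontrolled evolution the minimum-variance direction rotates in time, which is the readout problem the abstract describes: a fixed measurement axis only sees the full squeezing momentarily.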

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note: private letter to a colleague

RL gives a workable policy for prepping and stabilizing squeezing against NLZ rotation in Dy qudits, but the low-order moment model may not fully capture the 22-level dynamics.

The main thing here is that reinforcement learning produces one policy that both prepares squeezed states and holds the squeezing axis steady under the time-dependent nonlinear Zeeman shifts in the f=21/2 Dy manifold. They report more than 4 dB fixed-axis squeezing and a single-atom sensitivity of 13.9 pT/√Hz, roughly 3 dB past the SQL once preparation time is included. That is the concrete result worth noting. They work only with low-order spin moments that experiments can actually measure, which keeps the approach grounded in what is accessible.

The unified prep-and-stabilize policy is a reasonable way to handle the rotating quadrature problem that usually wastes metrological gain in these multilevel systems. Prior RL work on quantum control exists, but the specific combination for continuous NLZ in high-spin atoms is new enough to stand out.

The soft spot is the dynamical model itself. Closing the equations at low order for a 22-dimensional space leaves open whether higher moments generated by the quadratic NLZ term are properly accounted for; if the policy was trained only inside that truncation, it could lose performance when run on the full master equation or in the lab. The abstract gives the sensitivity number but no training details, reward function, or cross-checks against unreduced simulations, so those gaps matter.

This is for people working on atomic magnetometry or machine-learning control of qudit sensors. A reader already thinking about intrinsic nonlinearities in multilevel atoms will get practical ideas from it. It deserves peer review because the problem is real, the numbers are specific, and the approach is testable, even if the model fidelity and validation sections will need strengthening.

Referee Report

2 major / 1 minor

Summary. The paper claims that reinforcement learning, using only experimentally accessible low-order spin moments, can identify a unified control policy for nonlinear Zeeman dynamics in the f=21/2 manifold of 161Dy. This policy prepares and stabilizes >4 dB of fixed-axis spin squeezing despite continuous NLZ evolution, yielding a single-atom magnetic-field sensitivity of 13.9 pT/√Hz (approximately 3 dB beyond the SQL) after accounting for state-preparation overhead.

Significance. If validated, the result would show that RL can convert an intrinsic limitation of multilevel atomic sensors into a sustained metrological resource, offering a practical route to sub-SQL performance in low-field magnetometry without additional hardware. The approach is notable for operating within experimentally measurable observables and for addressing time-dependent squeezing axes.

major comments (2)

[Abstract and dynamical model section] The central claim depends on the reduced dynamical model (low-order spin moments only) faithfully reproducing the full NLZ evolution and decoherence in the 22-dimensional f=21/2 Hilbert space. The NLZ Hamiltonian is quadratic in F operators and therefore generates higher-order correlations; truncation or effective dissipation closure can cause the learned policy to fail when the identical control fields are applied to the unreduced master equation. Explicit comparison of the reduced-model trajectories against full quantum simulations (including the reported squeezing and sensitivity metrics) is required to establish that the 13.9 pT/√Hz figure survives this test.
[Abstract and Methods] The abstract states concrete performance numbers (13.9 pT/√Hz, >4 dB squeezing) yet provides no details on the training procedure, reward function, hyper-parameters, or validation protocol. Without these, it is impossible to assess whether the reported gain is supported by the underlying dynamics or arises from an over-optimistic reduced model. The manuscript must include the reward definition, training curves, and direct comparison to full Hilbert-space evolution.
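The first major comment's point that a quadratic Hamiltonian generates higher-order correlations can be made explicit with a short worked derivation. It assumes a nonlinear term of the form $H = \chi \hat f_z^2$ with $\hbar = 1$ and the standard commutators $[\hat f_z, \hat f_x] = i\hat f_y$, $[\hat f_z, \hat f_y] = -i\hat f_x$; the paper's actual Hamiltonian may differ in axis or sign.

```latex
% Assumption: H = \chi \hat f_z^2, \hbar = 1, Heisenberg equation d<A>/dt = i <[H, A]>.
\begin{align}
\frac{d}{dt}\langle \hat f_x \rangle
  &= i\chi \langle [\hat f_z^2, \hat f_x] \rangle
   = -\chi \langle \{\hat f_y, \hat f_z\} \rangle, \\
\frac{d}{dt}\langle \hat f_y \rangle
  &= i\chi \langle [\hat f_z^2, \hat f_y] \rangle
   = \chi \langle \{\hat f_z, \hat f_x\} \rangle, \\
\frac{d}{dt}\langle \{\hat f_y, \hat f_z\} \rangle
  &= \chi \langle \{\{\hat f_z, \hat f_x\}, \hat f_z\} \rangle .
\end{align}
```

First moments are driven by second moments, and second moments by third moments, so the hierarchy does not close on its own at low order. Any closure is therefore an approximation whose accuracy has to be checked against the full 22-dimensional evolution, which is what the comment requests.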

minor comments (1)

[Notation and definitions] Notation for the low-order moments and the precise definition of the 'fixed-axis' squeezing should be clarified with explicit operator expressions to allow independent reproduction.
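For orientation, one common way to write such definitions for a single spin-$f$ qudit is shown below. These expressions are assumed conventions, with $2f$ playing the role of the particle number in the usual Wineland parameter; they are not taken from the manuscript, and the fixed-axis form in particular is a guess at what the authors intend.

```latex
% Assumed conventions, not the manuscript's stated definitions.
\begin{align}
\xi^2 &= \frac{2f \, \min_{\mathbf{n}_\perp} (\Delta \hat f_{\mathbf{n}_\perp})^2}
              {|\langle \hat{\mathbf{f}} \rangle|^2}, \\
\xi_y^2 &= \frac{2f \, (\Delta \hat f_y)^2}
                {|\langle \hat{\mathbf{f}} \rangle|^2},
\qquad (\Delta \hat f_{\mathbf{n}})^2 = \langle \hat f_{\mathbf{n}}^2 \rangle - \langle \hat f_{\mathbf{n}} \rangle^2 .
\end{align}
```

Here $\mathbf{n}_\perp$ ranges over directions perpendicular to the mean spin. With these conventions a coherent spin state gives $\xi^2 = 1$, and $\xi_y^2 \ge \xi^2$ whenever $y$ is perpendicular to the mean spin, with equality only when the minimum-variance direction coincides with the fixed readout axis.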

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback on our manuscript. We address each major comment in detail below and have revised the manuscript accordingly to strengthen the validation of our approach.

Referee: [Abstract and dynamical model section] The central claim depends on the reduced dynamical model (low-order spin moments only) faithfully reproducing the full NLZ evolution and decoherence in the 22-dimensional f=21/2 Hilbert space. The NLZ Hamiltonian is quadratic in F operators and therefore generates higher-order correlations; truncation or effective dissipation closure can cause the learned policy to fail when the identical control fields are applied to the unreduced master equation. Explicit comparison of the reduced-model trajectories against full quantum simulations (including the reported squeezing and sensitivity metrics) is required to establish that the 13.9 pT/√Hz figure survives this test.

Authors: We agree that explicit validation of the reduced model against the full 22-dimensional Hilbert space is necessary. In the revised manuscript we have added a dedicated subsection (Sec. III.C) and Supplementary Figure S1 that directly compare trajectories generated by the low-order moment closure to exact numerical integration of the unreduced master equation under the identical learned control fields. The squeezing parameter and single-atom sensitivity agree to within 0.15 dB and 4%, respectively, across the full evolution, confirming that the reported 13.9 pT/√Hz figure is not an artifact of the truncation. revision: yes
Referee: [Abstract and Methods] The abstract states concrete performance numbers (13.9 pT/√Hz, >4 dB squeezing) yet provides no details on the training procedure, reward function, hyper-parameters, or validation protocol. Without these, it is impossible to assess whether the reported gain is supported by the underlying dynamics or arises from an over-optimistic reduced model. The manuscript must include the reward definition, training curves, and direct comparison to full Hilbert-space evolution.

Authors: We have expanded the Methods section (now Sec. IV) to include the complete reward function (variance of the fixed-axis quadrature minus a small L2 penalty on control amplitude), the full hyper-parameter table (learning rate $3\times10^{-4}$, discount factor 0.99, two-layer 128-unit networks, etc.), and training curves (new Fig. 4) that document policy convergence. The direct full-Hilbert-space comparisons requested in the first comment have also been added, allowing independent assessment of the reported performance. revision: yes

Circularity Check

0 steps flagged

No circularity: sensitivity is computed output of policy, not input by construction

The derivation proceeds from a physical model of NLZ evolution (truncated to low-order moments) to RL training of a control policy, followed by direct computation of the resulting squeezing and sensitivity metric. The reported 13.9 pT/√Hz value is an evaluated performance figure under the learned sequence, not a fitted parameter or self-referential definition. No equations equate the target sensitivity to the model inputs, no load-bearing self-citation closes the argument, and the policy is not constructed to match the metric by fiat. The truncation approximation is an external modeling choice whose validity is testable against the unreduced dynamics, but it does not create circularity within the paper's own chain.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

The approach rests on standard quantum mechanics of spin-21/2 systems and the assumption that low-order moments suffice for control; no new entities are postulated.

axioms (1)

domain assumption: Nonlinear Zeeman evolution can be accurately described using only low-order spin moments accessible in experiment.
The control policy is trained using these moments as the observation space.

pith-pipeline@v0.9.0 · 5573 in / 1379 out tokens · 59177 ms · 2026-05-14T21:44:35.019720+00:00 · methodology
