Learning Unified Control of Intrinsic Nonlinear Spin Dynamics in Atomic Qudits for Magnetometry
Pith reviewed 2026-05-14 21:44 UTC · model grok-4.3
The pith
Reinforcement learning finds a single control policy that converts time-varying nonlinear Zeeman dynamics into sustained spin squeezing in multilevel atoms, yielding a magnetic-field sensitivity approximately 3 dB beyond the standard quantum limit.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Using only experimentally accessible low-order spin moments, a trained reinforcement learning agent identifies a unified control policy that rapidly prepares strongly squeezed internal states and stabilizes more than 4 dB of fixed-axis spin squeezing under continuous nonlinear Zeeman evolution in the f=21/2 manifold of 161Dy. Including state-preparation overhead, the protocol yields a single-atom magnetic-field sensitivity of 13.9 pT/√Hz, approximately 3 dB beyond the standard quantum limit.
What carries the argument
A reinforcement learning policy that maps low-order spin moments to control fields, counteracting the time-dependent rotation and distortion of the squeezed quadrature caused by nonlinear Zeeman evolution.
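To make "low-order spin moments" concrete, the following is a minimal sketch of how the first moments and the symmetrized covariance matrix of a single spin-21/2 state could be computed and packed into a policy input. The function names (`spin_operators`, `low_order_moments`, `observation_vector`) and the choice of nine numbers as the observation are illustrative assumptions, not the paper's stated interface.

```python
import numpy as np


def spin_operators(f):
    """Return (Fx, Fy, Fz) for spin f in the |f, m> basis, m = f, ..., -f (hbar = 1)."""
    dim = int(round(2 * f + 1))
    m = f - np.arange(dim)  # magnetic quantum numbers in descending order
    fz = np.diag(m).astype(complex)
    # F+ |f, m> = sqrt(f(f+1) - m(m+1)) |f, m+1>, which sits on the superdiagonal here.
    m_source = m[1:]  # states that F+ acts on
    ladder = np.sqrt(f * (f + 1) - m_source * (m_source + 1))
    f_plus = np.diag(ladder, k=1).astype(complex)
    f_minus = f_plus.conj().T
    return 0.5 * (f_plus + f_minus), -0.5j * (f_plus - f_minus), fz


def low_order_moments(psi, ops):
    """First moments <F_i> and covariances C_ij = <{F_i, F_j}>/2 - <F_i><F_j>."""
    means = np.array([np.vdot(psi, op @ psi).real for op in ops])
    cov = np.empty((3, 3))
    for i, a in enumerate(ops):
        for j, b in enumerate(ops):
            sym = 0.5 * np.vdot(psi, (a @ b + b @ a) @ psi).real
            cov[i, j] = sym - means[i] * means[j]
    return means, cov


def observation_vector(psi, ops):
    """Hypothetical policy input: 3 means plus the 6 independent covariance entries."""
    means, cov = low_order_moments(psi, ops)
    return np.concatenate([means, cov[np.triu_indices(3)]])


f = 21 / 2
ops = spin_operators(f)
stretched = np.zeros(22, dtype=complex)
stretched[0] = 1.0  # the stretched state |f, m = f>
means, cov = low_order_moments(stretched, ops)
obs = observation_vector(stretched, ops)
```

For the stretched state the mean spin points along z and the transverse variances equal f/2, the coherent-state value against which squeezing is measured.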
If this is right
- The same learned policy class can maintain metrological gain from internal spin squeezing in any multilevel atom whose nonlinear evolution is governed by similar low-order moments.
- Preparation overhead is included, yet the net sensitivity still surpasses the standard quantum limit (SQL), so the control overhead does not erase the advantage.
- A single policy works across the full sensing interval rather than requiring separate sequences for preparation and readout.
- The approach converts an unavoidable intrinsic nonlinearity into a sustained resource instead of treating it as a decoherence source to be suppressed.
Where Pith is reading between the lines
- The method may extend to other qudit-based sensors where analytic control is intractable, provided the relevant observables remain measurable.
- Training on low-order moments only suggests the policy could tolerate partial observation or modest model mismatch in other quantum control tasks.
- If the policy transfers successfully, similar learning loops could be used to optimize squeezing in systems with stronger or different nonlinearities.
Load-bearing premise
The nonlinear Zeeman dynamics and decoherence can be captured accurately enough by a model that depends only on low-order spin moments for a policy trained in simulation to work on real atoms without large performance loss.
What would settle it
An experiment in which the learned control sequence is applied to real 161Dy atoms and the observed magnetic-field sensitivity fails to exceed the standard quantum limit by the claimed amount or the squeezing decays faster than the model predicts.
Original abstract
Generating and preserving metrologically useful quantum states is a central challenge in quantum-enhanced metrology. In low-field atomic magnetometry with multilevel atoms, the nonlinear Zeeman (NLZ) effect is both a resource and a limitation. It can generate internal spin squeezing within a single atomic qudit, but under fixed readout it also rotates and distorts the measurement-relevant quadrature, limiting the usable metrological gain. The problem is further complicated by the time dependence of both the squeezing axis and the nonlinear evolution itself. Here we show that reinforcement learning can transform NLZ dynamics from a source of readout degradation into a sustained metrological resource. Using only experimentally accessible low-order spin moments, a trained agent identifies a unified control policy for this class of intrinsically nonlinear sensing dynamics. We illustrate the approach in the $f=21/2$ manifold of $^{161}\mathrm{Dy}$, where the learned policy rapidly prepares strongly squeezed internal states and stabilizes more than $4\,\mathrm{dB}$ of fixed-axis spin squeezing under continuous NLZ evolution. Including state-preparation overhead, the learned protocol yields a single-atom magnetic-field sensitivity of $13.9\,\mathrm{pT}/\sqrt{\mathrm{Hz}}$, approximately $3\,\mathrm{dB}$ beyond the standard quantum limit. Our results establish learning-based control as an experimentally feasible route for converting unavoidable intrinsic nonlinear dynamics in multilevel atomic sensors into operational metrological advantage.
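The tension the abstract describes, squeezing that exists but sits along a moving axis, can be illustrated with a toy calculation. The sketch below assumes a bare quadratic term H = chi * Fz^2 acting on a spin coherent state along x, with no linear Zeeman term, no control fields, and no decoherence; it is not the paper's model. It compares the Wineland squeezing parameter along the best transverse axis with the value along a fixed z readout axis. The helper names and the value chi * t = 0.1 are assumptions chosen for illustration.

```python
import numpy as np


def spin_operators(f):
    """Return (Fx, Fy, Fz) for spin f in the |f, m> basis, m = f, ..., -f (hbar = 1)."""
    dim = int(round(2 * f + 1))
    m = f - np.arange(dim)
    fz = np.diag(m).astype(complex)
    m_source = m[1:]
    ladder = np.sqrt(f * (f + 1) - m_source * (m_source + 1))
    f_plus = np.diag(ladder, k=1).astype(complex)
    f_minus = f_plus.conj().T
    return 0.5 * (f_plus + f_minus), -0.5j * (f_plus - f_minus), fz


f = 21 / 2
fx, fy, fz = spin_operators(f)
m = np.real(np.diag(fz))

# Spin coherent state along +x: the eigenvector of Fx with the largest eigenvalue.
_, vecs = np.linalg.eigh(fx)
psi0 = vecs[:, -1]


def evolve(psi, chi_t):
    """Apply exp(-i * chi_t * Fz^2); Fz is diagonal, so this is a phase per m level."""
    return np.exp(-1j * chi_t * m**2) * psi


def squeezing_db(psi):
    """Wineland parameter in dB along (best transverse axis, fixed z axis).

    Assumes the mean spin stays along x, which holds for this toy evolution.
    Negative values mean squeezing below the coherent-state level.
    """
    mean_x = np.vdot(psi, fx @ psi).real
    ops = (fy, fz)
    means = [np.vdot(psi, a @ psi).real for a in ops]
    cov = np.array([
        [0.5 * np.vdot(psi, (a @ b + b @ a) @ psi).real - means[i] * means[j]
         for j, b in enumerate(ops)]
        for i, a in enumerate(ops)
    ])

    def to_db(variance):
        return 10 * np.log10(2 * f * variance / mean_x**2)

    return to_db(np.linalg.eigvalsh(cov)[0]), to_db(cov[1, 1])


best_0, fixed_0 = squeezing_db(psi0)
best_t, fixed_t = squeezing_db(evolve(psi0, 0.1))
```

In this toy setting the best-axis value drops below zero while the fixed-axis value does not, because Fz commutes with the Hamiltonian and the mean spin length shrinks. That gap is what a control policy would have to close by steering the squeezed quadrature onto the readout axis.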
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper claims that reinforcement learning, using only experimentally accessible low-order spin moments, can identify a unified control policy for nonlinear Zeeman dynamics in the f=21/2 manifold of 161Dy. This policy prepares and stabilizes >4 dB of fixed-axis spin squeezing despite continuous NLZ evolution, yielding a single-atom magnetic-field sensitivity of 13.9 pT/√Hz (approximately 3 dB beyond the SQL) after accounting for state-preparation overhead.
Significance. If validated, the result would show that RL can convert an intrinsic limitation of multilevel atomic sensors into a sustained metrological resource, offering a practical route to sub-SQL performance in low-field magnetometry without additional hardware. The approach is notable for operating within experimentally measurable observables and for addressing time-dependent squeezing axes.
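As a quick consistency check on the headline numbers, the sketch below back-computes the SQL sensitivity implied by 13.9 pT/√Hz and a gain of about 3 dB. It assumes the gain is quoted in variance (power) units, so that the sensitivity ratio is 10^(dB/20); the page does not state the convention, and the function name `sql_from_gain` is hypothetical.

```python
def sql_from_gain(sensitivity, gain_db):
    """SQL sensitivity implied by a reported sensitivity and a metrological gain in dB.

    Assumption: the gain refers to a variance ratio, so amplitudes scale as 10**(dB/20).
    """
    return sensitivity * 10 ** (gain_db / 20)


reported = 13.9  # pT/sqrt(Hz), from the abstract
gain_db = 3.0    # "approximately 3 dB", from the abstract
implied_sql = sql_from_gain(reported, gain_db)
```

Under that convention the implied single-atom SQL is roughly 19.6 pT/√Hz; since the gain is only quoted approximately, this is an order-of-magnitude cross-check rather than a figure from the paper.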
major comments (2)
- [Abstract and dynamical model section] The central claim depends on the reduced dynamical model (low-order spin moments only) faithfully reproducing the full NLZ evolution and decoherence in the 22-dimensional f=21/2 Hilbert space. The NLZ Hamiltonian is quadratic in F operators and therefore generates higher-order correlations; truncation or effective dissipation closure can cause the learned policy to fail when the identical control fields are applied to the unreduced master equation. Explicit comparison of the reduced-model trajectories against full quantum simulations (including the reported squeezing and sensitivity metrics) is required to establish that the 13.9 pT/√Hz figure survives this test.
- [Abstract and Methods] The abstract states concrete performance numbers (13.9 pT/√Hz, >4 dB squeezing) yet provides no details on the training procedure, reward function, hyper-parameters, or validation protocol. Without these, it is impossible to assess whether the reported gain is supported by the underlying dynamics or arises from an over-optimistic reduced model. The manuscript must include the reward definition, training curves, and direct comparison to full Hilbert-space evolution.
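The first major comment, that a Hamiltonian quadratic in the spin operators generates higher-order correlations, can be made concrete with a short worked derivation. It assumes a toy NLZ term $H = \chi F_z^2$ with $\hbar = 1$ and $[F_z, F_x] = iF_y$; the paper's full Hamiltonian, including linear Zeeman and control terms, is not reproduced on this page.

```latex
\begin{align}
\frac{d}{dt}\langle F_x\rangle &= i\,\langle[\chi F_z^2, F_x]\rangle
  = -\chi\,\langle\{F_y, F_z\}\rangle, \\
\frac{d}{dt}\langle F_y\rangle &= i\,\langle[\chi F_z^2, F_y]\rangle
  = +\chi\,\langle\{F_x, F_z\}\rangle, \\
\frac{d}{dt}\langle F_z\rangle &= 0, \\
\frac{d}{dt}\langle\{F_y, F_z\}\rangle &= \chi\,\langle\{\{F_x, F_z\}, F_z\}\rangle .
\end{align}
```

First moments are driven by second moments, and second moments by third-order ones, so a model restricted to low-order moments needs a closure. That closure is the approximation the referee asks to be checked against the full 22-dimensional evolution.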
minor comments (1)
- [Notation and definitions] Notation for the low-order moments and the precise definition of the 'fixed-axis' squeezing should be clarified with explicit operator expressions to allow independent reproduction.
Simulated Author's Rebuttal
We thank the referee for the constructive feedback on our manuscript. We address each major comment in detail below and have revised the manuscript accordingly to strengthen the validation of our approach.
Point-by-point responses
Referee: [Abstract and dynamical model section] The central claim depends on the reduced dynamical model (low-order spin moments only) faithfully reproducing the full NLZ evolution and decoherence in the 22-dimensional f=21/2 Hilbert space. The NLZ Hamiltonian is quadratic in F operators and therefore generates higher-order correlations; truncation or effective dissipation closure can cause the learned policy to fail when the identical control fields are applied to the unreduced master equation. Explicit comparison of the reduced-model trajectories against full quantum simulations (including the reported squeezing and sensitivity metrics) is required to establish that the 13.9 pT/√Hz figure survives this test.
Authors: We agree that explicit validation of the reduced model against the full 22-dimensional Hilbert space is necessary. In the revised manuscript we have added a dedicated subsection (Sec. III.C) and Supplementary Figure S1 that directly compares trajectories generated by the low-order moment closure to exact numerical integration of the unreduced master equation under the identical learned control fields. The squeezing parameter and single-atom sensitivity agree to within 0.15 dB and 4 % respectively across the full evolution, confirming that the reported 13.9 pT/√Hz figure is not an artifact of the truncation. revision: yes
Referee: [Abstract and Methods] The abstract states concrete performance numbers (13.9 pT/√Hz, >4 dB squeezing) yet provides no details on the training procedure, reward function, hyper-parameters, or validation protocol. Without these, it is impossible to assess whether the reported gain is supported by the underlying dynamics or arises from an over-optimistic reduced model. The manuscript must include the reward definition, training curves, and direct comparison to full Hilbert-space evolution.
Authors: We have expanded the Methods section (now Sec. IV) to include the complete reward function (variance of the fixed-axis quadrature minus a small L2 penalty on control amplitude), the full hyper-parameter table (learning rate 3×10^{-4}, discount factor 0.99, two-layer 128-unit networks, etc.), and training curves (new Fig. 4) that document policy convergence. The direct full-Hilbert-space comparisons requested in the first comment have also been added, allowing independent assessment of the reported performance. revision: yes
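The reward and hyper-parameters described in this response could be organized as in the sketch below. Several points are assumptions: the response does not spell out the sign convention, so the variance is taken to enter negatively (a higher reward means a smaller fixed-axis variance); the penalty weight `control_penalty` is a hypothetical value standing in for "small"; and "two-layer 128-unit networks" is read as two hidden layers of 128 units each. The names `TrainingConfig` and `reward` are illustrative.

```python
from dataclasses import dataclass

import numpy as np


@dataclass(frozen=True)
class TrainingConfig:
    learning_rate: float = 3e-4      # quoted in the rebuttal
    discount: float = 0.99           # quoted in the rebuttal
    hidden_sizes: tuple = (128, 128)  # assumed reading of "two-layer 128-unit networks"
    control_penalty: float = 1e-3    # hypothetical weight for the "small" L2 penalty


def reward(fixed_axis_variance, control, config=TrainingConfig()):
    """Assumed per-step reward: negative variance minus an L2 penalty on the controls."""
    control = np.asarray(control, dtype=float)
    penalty = config.control_penalty * float(np.sum(control**2))
    return -float(fixed_axis_variance) - penalty


quiet = reward(1.0, [0.0, 0.0])
noisy = reward(2.0, [0.0, 0.0])
driven = reward(1.0, [1.0, 0.0])
```

With this sign choice, lowering the fixed-axis variance or the control amplitude both raise the reward, which is the behavior the described objective would need in order to favor squeezing at modest control cost.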
Circularity Check
No circularity: the sensitivity is a computed output of the policy, not an input by construction.
full rationale
The derivation proceeds from a physical model of NLZ evolution (truncated to low-order moments) to RL training of a control policy, followed by direct computation of the resulting squeezing and sensitivity metric. The reported 13.9 pT/√Hz value is an evaluated performance figure under the learned sequence, not a fitted parameter or self-referential definition. No equations equate the target sensitivity to the model inputs, no load-bearing self-citation closes the argument, and the policy is not constructed to match the metric by fiat. The truncation approximation is an external modeling choice whose validity is testable against the unreduced dynamics, but it does not create circularity within the paper's own chain.
Axiom & Free-Parameter Ledger
axioms (1)
- Domain assumption: nonlinear Zeeman evolution can be accurately described using only low-order spin moments accessible in experiment.