
arxiv: 2602.19315 · v2 · submitted 2026-02-22 · 💻 cs.RO · cs.AI

Online Navigation Planning for Long-term Autonomous Operation of Underwater Gliders

Pith reviewed 2026-05-15 20:12 UTC · model grok-4.3

classification 💻 cs.RO cs.AI
keywords underwater gliders · online planning · Monte Carlo Tree Search · Markov Decision Process · stochastic navigation · ocean currents · autonomous robotics · field deployment

The pith

A Monte Carlo Tree Search planner using a calibrated physics simulator lets underwater gliders achieve up to 16.51 percent shorter paths and 9.88 percent shorter dive durations than straight-to-goal navigation in real ocean conditions.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Underwater gliders collect ocean data over long periods but must cope with unpredictable currents and imperfect control execution to stay efficient without frequent human intervention. The paper models navigation as a stochastic shortest-path Markov Decision Process and solves it online with Monte Carlo Tree Search, drawing samples from a simulator calibrated on past glider data that represents uncertainty in both control execution and ocean current forecasts. The planner runs in closed loop, replanning each time the glider surfaces. Two North Sea field deployments, totalling approximately three months and 1000 km, show gains over standard straight-to-goal navigation, including a statistically significant 9.55 percent path-length reduction in one deployment.

Core claim

Formulating glider navigation as a stochastic shortest-path Markov Decision Process and solving it with a sample-based Monte Carlo Tree Search planner that draws from a physics-informed simulator calibrated on real glider data produces online policies that reduce dive duration and path length compared with straight-to-goal navigation when deployed in closed loop on Slocum gliders.
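For readers less familiar with the formalism, a stochastic shortest-path MDP is commonly written as below. This is standard textbook notation, not taken from the paper: the symbols S (states), A (actions), T (transition probabilities), C (per-step cost) and G (goal states) are assumptions, and the paper's own cost definition (for example dive duration or distance) may differ.

```latex
\mathcal{M} = \langle S, A, T, C, G \rangle, \qquad
V^{*}(s) =
\begin{cases}
0 & \text{if } s \in G,\\[4pt]
\min_{a \in A} \Big[\, C(s,a) + \sum_{s' \in S} T(s' \mid s, a)\, V^{*}(s') \Big] & \text{otherwise.}
\end{cases}
```

An online planner does not solve this recursion over the whole state space. It estimates the minimising action only for the state the glider is currently in, using sampled transitions in place of the explicit sum over T.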

What carries the argument

Sample-based online planner that solves a stochastic shortest-path Markov Decision Process via Monte Carlo Tree Search, using a physics-informed simulator to generate trajectories that incorporate uncertain control execution and ocean current forecasts.
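The idea can be sketched on a toy problem. The block below is a minimal illustration, not the paper's planner: the grid world, the drift probability, the unit cost per dive, the finite horizon with a dead-end penalty, and all names (`simulate`, `straight_to_goal`, `Planner`, `run_closed_loop`) are assumptions made for the example. It uses a UCB1-style selection rule adapted to cost minimisation, with a straight-to-goal heuristic as the rollout policy, and replans from the observed state after every simulated dive.

```python
import math
import random

# Toy stand-in for the glider problem; every constant here is an assumption.
GOAL = (4, 0)
ACTIONS = [(1, 0), (-1, 0), (0, 1), (0, -1)]  # east, west, north, south
DRIFT, DRIFT_PROB = (0, 1), 0.3               # hypothetical northward current
MAX_DEPTH, PENALTY = 12, 20.0                 # planning horizon and dead-end cost


def simulate(state, action, rng):
    """Generative model: sample (next_state, cost) for one dive."""
    x, y = state[0] + action[0], state[1] + action[1]
    if rng.random() < DRIFT_PROB:             # uncertain current pushes the glider
        x, y = x + DRIFT[0], y + DRIFT[1]
    return (x, y), 1.0                        # unit cost per dive


def straight_to_goal(state):
    """Baseline heuristic: the action that most reduces Manhattan distance."""
    def dist(a):
        return abs(GOAL[0] - state[0] - a[0]) + abs(GOAL[1] - state[1] - a[1])
    return min(ACTIONS, key=dist)


class Planner:
    def __init__(self, seed=0, c=2.0):
        self.rng = random.Random(seed)
        self.c = c                            # exploration constant
        self.n, self.na, self.q = {}, {}, {}  # node visits, action visits, mean cost

    def rollout(self, state, depth):
        cost = 0.0
        while state != GOAL and depth < MAX_DEPTH:
            state, step = simulate(state, straight_to_goal(state), self.rng)
            cost, depth = cost + step, depth + 1
        return cost if state == GOAL else cost + PENALTY

    def search(self, state, depth):
        if state == GOAL:
            return 0.0
        if depth >= MAX_DEPTH:
            return PENALTY
        node = (state, depth)
        if node not in self.n:                # expansion, then rollout
            self.n[node] = 0
            for a in ACTIONS:
                self.na[node, a], self.q[node, a] = 0, 0.0
            return self.rollout(state, depth)

        def lower_bound(a):                   # UCB1 adapted to cost minimisation
            if self.na[node, a] == 0:
                return -math.inf              # try every action at least once
            bonus = self.c * math.sqrt(math.log(self.n[node]) / self.na[node, a])
            return self.q[node, a] - bonus

        a = min(ACTIONS, key=lower_bound)
        nxt, step = simulate(state, a, self.rng)
        total = step + self.search(nxt, depth + 1)
        self.n[node] += 1
        self.na[node, a] += 1
        self.q[node, a] += (total - self.q[node, a]) / self.na[node, a]
        return total

    def plan(self, state, iterations=2000):
        for _ in range(iterations):
            self.search(state, 0)
        return min(ACTIONS, key=lambda a: self.q[(state, 0), a])


def run_closed_loop(start, seed=0, max_dives=30):
    """Replan from the observed state after every dive (each 'surfacing')."""
    env_rng, state, dives = random.Random(seed), start, 0
    while state != GOAL and dives < max_dives:
        action = Planner(seed=seed + dives).plan(state, iterations=500)
        state, _ = simulate(state, action, env_rng)
        dives += 1
    return dives
```

Calling `run_closed_loop((0, 0))` returns the number of dives used. A real system would swap `simulate` for the calibrated physics simulator and would need a way to handle continuous states and actions, which this discrete sketch sidesteps.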

Load-bearing premise

The simulator, tuned on earlier glider missions, accurately represents how controls are executed and how current forecasts behave under the conditions of the North Sea test sites.

What would settle it

The claim would be undermined if, in a new deployment, total path length or cumulative dive time under the planner showed no statistically significant improvement over the straight-to-goal baseline, or came out worse than the baseline.

Figures

Figures reproduced from arXiv: 2602.19315 by Alexandra Kokkinaki, Alvaro Lorenzo Lopez, Ashley Iceton-Morris, Benjamin Allsup, Bruno Lacerda, Charlotte Williams, Charlotte Z. Reed, Chloe Baker, Christopher D. J. Auckland, Dan Jones, Elizabeth Siddle, James Kirk, Jan Stratmann, Jeff Polton, Justin J. H. Buck, Kevin Chaplin, Nick Hawes, Owain Jones, Rob A. Hall, Ryan D. Patmore, Stephen Woodward, Tobias Ferreira, Trishna Saeharaseelan, Victor-Alexandru Darvariu.

Figure 1: Slocum glider in operation during the ...
Figure 2: An illustration of basic glider operation and ...
Figure 4: An illustration of the key inputs and outputs ...
Figure 5: An illustration of a search tree built by the planner. Circles represent state nodes and squares represent action ...
Figure 6: Illustration of the proposed action space formu...
Figure 7: Visualisation of waypoint list generation. The ...
Figure 8: Diagram illustrating the components and steps of the two system workflows. The Mission Configuration ...
Figure 9: Field deployment locations in the North Sea.
Figure 10: Learning curve showing the best value of ...
Figure 11: Progress of the proposed simulator parameter optimisation procedure (Section IV.D) on one of the dives in ...
Figure 12: Transects completed by the proposed GALE system in the MOGli field deployment using the Planner and ...
Figure 13: Proportions of actions chosen by the Planner ...
Original abstract

Underwater glider robots have become indispensable for ocean sampling, yet fully autonomous long-term operation remains rare in practice. Although stakeholders are calling for tools to manage increasingly large fleets of gliders, existing methods have seen limited adoption due to their inability to account for environmental uncertainty and operational constraints. In this work, we demonstrate that uncertainty-aware online navigation planning can be deployed in real-world glider missions at scale. We formulate the problem as a stochastic shortest-path Markov Decision Process and propose a sample-based online planner based on Monte Carlo Tree Search. Samples are generated by a physics-informed simulator calibrated on real-world glider data that captures uncertain execution of controls and ocean current forecasts while remaining computationally tractable. Our methodology is integrated into an autonomous system for Slocum gliders that performs closed-loop replanning at each surfacing. The system was validated in two North Sea deployments totalling approximately 3 months and 1000 km, representing the longest fully autonomous glider campaigns in the literature to date. Results demonstrate improvements of up to 9.88% in dive duration and 16.51% in path length compared to standard straight-to-goal navigation, including a statistically significant path length reduction of 9.55% in a field deployment.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

1 major / 3 minor

Summary. The paper formulates underwater glider navigation as a stochastic shortest-path MDP and solves it online via Monte Carlo Tree Search, generating samples from a physics-informed simulator calibrated on prior real glider data to model control uncertainty and ocean current forecast errors. The system performs closed-loop replanning at each surfacing and is evaluated in two North Sea deployments (total ~3 months, 1000 km). It reports improvements of up to 9.88% in dive duration and 16.51% in path length versus a straight-to-goal baseline, including a statistically significant 9.55% path-length reduction in one field trial.

Significance. If the empirical results hold, the work demonstrates the first large-scale, long-duration deployment of uncertainty-aware stochastic planning on real underwater gliders. The multi-month autonomous campaigns and quantitative metrics against a standard baseline provide concrete evidence that such methods can improve operational efficiency in uncertain marine environments, supporting the management of larger glider fleets for sustained ocean sampling.

major comments (1)
  1. [§5] §5 (field experiments): The central claim attributes the reported 9.55% path-length reduction and 9.88% dive-duration improvement specifically to the stochastic modeling of control and forecast uncertainty. However, the manuscript provides no direct validation (e.g., Kolmogorov-Smirnov tests or quantile-quantile plots) comparing the simulator's predicted current distributions and glider response statistics against the currents and trajectories actually observed during the North Sea deployments. Without this, it remains possible that the gains arise primarily from increased surfacing frequency rather than explicit uncertainty handling.
minor comments (3)
  1. [Figure 4] Figure 4 and Table 2: Axis labels and captions should explicitly state the number of independent runs or dives used to compute the reported means, standard deviations, and p-values.
  2. [§4.2] §4.2 (MCTS implementation): The description of the rollout policy and termination conditions is terse; adding pseudocode or a short algorithmic listing would improve reproducibility.
  3. [§2] Related work: The discussion of prior glider navigation methods omits several recent papers on current-aware path planning published after 2020; a brief update would strengthen context.
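The validation requested in the major comment can be sketched with a two-sample Kolmogorov-Smirnov test. The block below is a minimal standard-library illustration, not the authors' analysis: the function names are hypothetical, and the rejection threshold uses the usual large-sample approximation, which assumes independent samples. Consecutive current measurements from one deployment are typically autocorrelated, so the effective sample size may be smaller than the raw count.

```python
import bisect
import math


def ks_statistic(simulated, observed):
    """Two-sample KS statistic: largest gap between the two empirical CDFs."""
    a, b = sorted(simulated), sorted(observed)
    d = 0.0
    for x in a + b:                       # the gap can only peak at a sample point
        fa = bisect.bisect_right(a, x) / len(a)
        fb = bisect.bisect_right(b, x) / len(b)
        d = max(d, abs(fa - fb))
    return d


def ks_critical_value(n, m, alpha=0.05):
    """Large-sample rejection threshold for the two-sample test at level alpha."""
    return math.sqrt(-0.5 * math.log(alpha / 2)) * math.sqrt((n + m) / (n * m))
```

If `ks_statistic(sim, obs)` exceeds `ks_critical_value(len(sim), len(obs))`, the hypothesis that simulated and observed values share a distribution is rejected at the chosen level. A library routine such as SciPy's `ks_2samp` would normally be preferred in practice.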

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for the positive assessment of the work's significance and for the constructive feedback on the field experiments section. We address the major comment point by point below and will incorporate the requested validation in the revised manuscript.

Point-by-point responses
  1. Referee: [§5] §5 (field experiments): The central claim attributes the reported 9.55% path-length reduction and 9.88% dive-duration improvement specifically to the stochastic modeling of control and forecast uncertainty. However, the manuscript provides no direct validation (e.g., Kolmogorov-Smirnov tests or quantile-quantile plots) comparing the simulator's predicted current distributions and glider response statistics against the currents and trajectories actually observed during the North Sea deployments. Without this, it remains possible that the gains arise primarily from increased surfacing frequency rather than explicit uncertainty handling.

    Authors: We agree that direct validation of the simulator against the deployment data strengthens the attribution of gains to uncertainty-aware planning. The simulator was calibrated on historical glider data prior to the North Sea missions (Section 4), but we will add in the revision Kolmogorov-Smirnov tests and quantile-quantile plots comparing simulated versus observed current distributions and glider response statistics from the two deployments. On surfacing frequency: both the MCTS planner and the straight-to-goal baseline use the identical surfacing schedule dictated by the glider's battery and communication constraints; the planner differs only in selecting uncertainty-aware waypoints at each surfacing. The reported statistical significance (p < 0.05) of the 9.55% path-length reduction in one trial is therefore attributable to the stochastic policy rather than replanning cadence. revision: yes
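This page does not say which test produced the quoted p < 0.05. One distribution-free way to check a claim of this kind is a permutation test on per-transect path lengths, sketched below. The function name, the one-sided formulation, and the sample values are assumptions; the numbers are made up for illustration and are not data from the paper.

```python
import random
import statistics


def permutation_test(baseline, planner, n_resamples=10000, seed=0):
    """One-sided p-value for the hypothesis mean(planner) < mean(baseline)."""
    rng = random.Random(seed)
    observed = statistics.mean(baseline) - statistics.mean(planner)
    pooled = list(baseline) + list(planner)
    k = len(baseline)
    hits = 0
    for _ in range(n_resamples):
        rng.shuffle(pooled)               # random relabelling of the transects
        diff = statistics.mean(pooled[:k]) - statistics.mean(pooled[k:])
        if diff >= observed:
            hits += 1
    return (hits + 1) / (n_resamples + 1)  # add-one correction avoids p = 0


# Made-up path lengths in km, for illustration only.
baseline_km = [10.2, 9.8, 10.5, 10.1, 9.9, 10.4, 10.0, 10.3]
planner_km = [9.1, 8.9, 9.4, 9.0, 9.2, 8.8, 9.3, 9.1]
p_value = permutation_test(baseline_km, planner_km)
```

With the two groups fully separated, as in this synthetic example, the p-value is very small. With real deployments the number of transects per condition limits how small it can get.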

Circularity Check

0 steps flagged

No circularity: empirical field results independent of simulator calibration

Full rationale

The paper formulates the navigation task as a stochastic shortest-path MDP and deploys an MCTS planner whose samples come from a physics-informed simulator. However, the load-bearing claims are the measured improvements (up to 9.88% dive duration, 9.55% statistically significant path-length reduction) obtained by running the planner in two independent North Sea field deployments totaling 3 months and 1000 km, then comparing directly against a straight-to-goal baseline executed in the same missions. Simulator calibration uses prior glider data, but the reported metrics are real-world observational outcomes, not quantities recomputed from the calibration set or forced by the model equations. No self-citation, uniqueness theorem, or ansatz is invoked to derive the performance numbers; the derivation chain terminates in external, falsifiable deployment data.

Axiom & Free-Parameter Ledger

1 free parameter · 2 axioms · 0 invented entities

The approach rests on standard MDP assumptions plus the domain claim that a calibrated physics simulator adequately represents real uncertainties; no new entities are postulated.

free parameters (1)
  • Simulator calibration parameters
    Parameters fitted to match observed glider behavior and current forecasts from prior real-world data.
axioms (2)
  • domain assumption: Navigation under uncertainty can be modeled as a stochastic shortest-path Markov Decision Process.
    The problem is explicitly formulated this way in the abstract.
  • domain assumption: The physics-informed simulator captures the relevant uncertainties for planning.
    The planner relies on samples generated by this simulator.
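To make the free-parameter entry concrete, the sketch below fits a single simulator parameter to past dives by grid search over squared error. It is a stand-in only: this page does not detail the paper's fitting procedure, and the one-line "simulator", the parameter being fitted, the function names, and the synthetic history are all assumptions.

```python
def simulated_distance(speed, duration_s, current):
    """Hypothetical one-line simulator: along-track distance covered in one dive."""
    return (speed + current) * duration_s


def calibrate_speed(observations, candidates):
    """Pick the candidate speed that minimises squared error against past dives."""
    def sse(speed):
        return sum((simulated_distance(speed, d, c) - dist) ** 2
                   for d, c, dist in observations)
    return min(candidates, key=sse)


# Synthetic history: (duration in s, along-track current in m/s, distance in m),
# generated here from a known speed of 0.30 m/s. Not data from the paper.
history = [(d, c, simulated_distance(0.30, d, c))
           for d, c in [(3600, 0.05), (7200, -0.10), (5400, 0.00)]]
best = calibrate_speed(history, [0.20, 0.25, 0.30, 0.35])
```

Because the fitted values feed every sample the planner draws, any mismatch between calibration conditions and deployment conditions carries straight into the plan, which is why this entry is listed as the single free parameter.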


