arxiv: 2605.09811 · v1 · submitted 2026-05-10 · 💻 cs.RO

Above and Below: Heterogeneous Multi-robot SLAM Across Surface and Underwater Domains

John McConnell, Armon Shariati, Paul Szenher, Yaxuan Li

Pith reviewed 2026-05-12 02:22 UTC · model grok-4.3

classification 💻 cs.RO

keywords multi-robot SLAM · USV · AUV · loop closure · visual features · heterogeneous robots · maritime robotics

The pith

A multi-robot SLAM system merges USV and AUV trajectories by matching perceptual features of structures observable both above and below the water surface.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper develops a centralized multi-robot SLAM framework that fuses data from Uncrewed Surface Vessels and Autonomous Underwater Vehicles without acoustic range measurements. Each robot runs its own state estimation while inter-robot loop closures are detected by matching perceptual features that appear in both surface and underwater sensor streams. These closures are added to a shared graph to produce a single optimized trajectory history for the entire team. Tests with real data collected in three maritime environments show lower position errors for the AUVs than when each vehicle runs SLAM alone.

Core claim

The system detects loop closures between USV and AUV data streams using features observable across the air-water boundary, then inserts those closures into a centralized pose graph that merges every robot's individual state estimate into one consistent map covering the full mission duration of all vehicles.

What carries the argument

Centralized graph that incorporates detected inter-robot loop closures from shared perceptual features to merge separate USV and AUV state estimates.
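
The merge step described above can be illustrated with a small sketch. This is not the authors' implementation: it assumes translation-only 2D poses (no heading), a single prior on the first USV pose, and made-up weights and measurements. The names `solve_linear`, `optimize`, `USV_ODOM`, `AUV_ODOM`, and `LOOP_CLOSURES` are hypothetical. Under those assumptions the problem is linear least squares, so the normal equations can be solved directly.

```python
def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [A[i][:] + [b[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - tail) / M[r][r]
    return x


def optimize(num_nodes, anchor, edges):
    """Least-squares positions for a translation-only 2D pose graph.

    anchor: (node, x, y) prior that fixes the global frame.
    edges:  (i, j, dx, dy, weight), meaning p_j - p_i is measured as (dx, dy).
    """
    H = [[0.0] * num_nodes for _ in range(num_nodes)]
    bx = [0.0] * num_nodes
    by = [0.0] * num_nodes
    node, ax, ay = anchor
    prior_weight = 1e6  # assumption: the anchor pose is treated as known
    H[node][node] += prior_weight
    bx[node] += prior_weight * ax
    by[node] += prior_weight * ay
    for i, j, dx, dy, w in edges:
        # Normal-equation contribution of the residual (p_j - p_i - d).
        H[i][i] += w
        H[j][j] += w
        H[i][j] -= w
        H[j][i] -= w
        bx[i] -= w * dx
        bx[j] += w * dx
        by[i] -= w * dy
        by[j] += w * dy
    # x and y decouple because there is no rotation in this toy model.
    return list(zip(solve_linear(H, bx), solve_linear(H, by)))


# Toy values, not from the paper. Nodes 0-2 are USV poses, nodes 3-5 AUV poses.
USV_ODOM = [(0, 1, 10.0, 0.0, 100.0), (1, 2, 10.0, 0.0, 100.0)]
AUV_ODOM = [(3, 4, 11.0, 1.0, 1.0), (4, 5, 11.0, 1.0, 1.0)]  # drifting odometry
LOOP_CLOSURES = [(0, 3, 0.0, 5.0, 10.0), (2, 5, 0.0, 5.0, 10.0)]  # USV -> AUV

merged = optimize(6, (0, 0.0, 0.0), USV_ODOM + AUV_ODOM + LOOP_CLOSURES)
for idx in (3, 4, 5):
    print("AUV node", idx, "estimate:", merged[idx])
```

Without the two inter-robot edges the AUV nodes would be disconnected from the USV nodes and the system would be singular, which is one way to see why the loop closures are what ties the per-robot estimates into a single graph. A real system would optimize full poses with rotations and would need outlier rejection on the closures.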

If this is right

AUV localization errors decrease when surface observations are incorporated through inter-robot loop closures.
The team shares a single consistent map without requiring robots to be near each other for acoustic pings.
The approach remains functional when structures block acoustic signals but leave visible features intact.
All robots receive optimized estimates for their entire time histories in one centralized computation.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same visual-matching approach could anchor teams of several AUVs to a single USV acting as a moving surface reference.
In search-and-rescue or inspection tasks, surface and underwater views of the same structures could speed up coordinated coverage without constant acoustic links.
Performance would likely degrade in feature-poor open water where few repeatable landmarks exist above and below the surface.

Load-bearing premise

Perceptual features observable from both above and below the surface can be reliably detected and matched as loop closures between USV and AUV data streams in complex, cluttered maritime environments.

What would settle it

In a new cluttered environment, no matching features are found between surface and underwater streams, so the multi-robot graph produces no reduction in AUV trajectory error relative to single-robot SLAM on the same data.

Figures

Figures reproduced from arXiv: 2605.09811 by Armon Shariati, John McConnell, Paul Szenher, Yaxuan Li.

**Figure 1.** Multi Robot Map Result. A sample multi-robot mission from the harbor environment. The blue points show the 2D LiDAR map, with the blue line showing the USV trajectory. Orange points indicate the sonar map, with the orange line representing the AUV’s trajectory. Grid lines are shown on the satellite image at 10-meter cell size.

**Figure 2.** System Flow. At bottom left, we show N underwater vehicles with imaging sonar; each AUV compresses its features and computes a pose graph, sending results to the USV. The USV converts its LiDAR scans into 2D scans near the water’s surface. The USV performs a loop closure search between the AUV’s sonar data and the USV’s LiDAR data. Loop closures are used to merge USV and AUV pose graphs, resulting in a cen…

**Figure 3.** Sonar image compression. (a) shows an illustration of a sonar image with multiple contacts highlighted in red. (b) shows the results of reducing each of these patches to a rectangle represented by the top left and bottom right corners. (c) shows a real-world example sonar image of (b), where all sonar contacts have been grouped into rectangles, each with a different color.

**Figure 4.** Test Environments and real robot. (a) shows the bridge, (b) shows waterfront and (c) shows harbor. (d) shows our kingfisher robot with sonar and LiDAR used to collect data for the experiments in this work. Satellite images are from Mapbox [41].

$$k = \left(f(r_1), \ldots, f(r_N)\right), \quad f : r_i \to \mathbb{R} \tag{7}$$

where the contents of each discrete bin, $f(r_i)$, is the number of occurrences inside that range bin. Lastly, we normali…

**Figure 5.** Qualitative Results. Each plot shows a single AUV in the system. Black lines show the ground truth trajectory from RTK-GPS. The red lines indicate the baseline trajectory from single-robot underwater SLAM. Blue lines show the trajectory from the proposed method, centralized multi-robot SLAM. Results from bridge are shown in (a) and (b). Results from waterfront are shown in (c) and (d). Results from harbor …
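
The rectangle compression in the Figure 3 caption can be sketched as follows. The caption only says that each contact patch is reduced to its top-left and bottom-right corners; how pixels are grouped into patches is not stated here, so this sketch assumes a binary contact mask and 8-connected grouping. The function name `compress_contacts` and the toy mask are hypothetical.

```python
from collections import deque


def compress_contacts(mask):
    """Return one (top, left, bottom, right) box per 8-connected blob.

    mask is a list of rows; truthy cells mark sonar contacts.
    Coordinates are (row, column) indices, inclusive on both corners.
    """
    rows = len(mask)
    cols = len(mask[0]) if rows else 0
    seen = [[False] * cols for _ in range(rows)]
    boxes = []
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c] or seen[r][c]:
                continue
            # Breadth-first flood fill, tracking the blob's extent as we go.
            top, left, bottom, right = r, c, r, c
            queue = deque([(r, c)])
            seen[r][c] = True
            while queue:
                cr, cc = queue.popleft()
                top, bottom = min(top, cr), max(bottom, cr)
                left, right = min(left, cc), max(right, cc)
                for dr in (-1, 0, 1):
                    for dc in (-1, 0, 1):
                        nr, nc = cr + dr, cc + dc
                        inside = 0 <= nr < rows and 0 <= nc < cols
                        if inside and mask[nr][nc] and not seen[nr][nc]:
                            seen[nr][nc] = True
                            queue.append((nr, nc))
            boxes.append((top, left, bottom, right))
    return boxes


# Toy mask with three separate contacts (illustrative only).
SONAR_MASK = [
    [0, 1, 1, 0, 0, 0],
    [0, 1, 0, 0, 0, 1],
    [0, 0, 0, 0, 1, 1],
    [1, 0, 0, 0, 0, 0],
]
print(compress_contacts(SONAR_MASK))
# -> [(0, 1, 1, 2), (1, 4, 2, 5), (3, 0, 3, 0)]
```

Each blob collapses to four integers regardless of its pixel count, which is the kind of saving that makes the representation attractive when AUV data must be sent to the USV.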

Original abstract

Multi-robot simultaneous localization and mapping (SLAM) is a fundamental task in multi-robot operations. Robots must have a common understanding of their location and that of their team members to complete coordinated actions. However, multi-robot SLAM between Uncrewed Surface Vessels (USVs) and Autonomous Underwater Vehicles (AUVs) has primarily been achieved through acoustic pinging between robots to retrieve range measurements, a technique that requires robots to be in similar locations simultaneously and to have an uninterrupted path for signal propagation, and that may necessitate synchronized clocks. This is especially challenging in complex, cluttered maritime environments, where structures may impede signals. However, these same structures may be observable above and below the water's surface, presenting an opportunity for inter-robot SLAM loop closure between USV and AUV data streams. This work builds upon recent research on inter-robot SLAM loop closure between USV and AUV data, extending it to propose a centralized multi-robot SLAM system. Each robot performs its own state estimation, and we detect loop closures between each AUV and the USV data. These inter-robot loop closures are used to merge each robot's state estimate into a centralized graph, yielding estimates for the whole time history of the USV and all AUVs in the system. Validation is performed using real-world perceptual data in three different environments. Results show improved errors for AUVs in the multi-robot SLAM system compared to single-robot SLAM over the same trajectories. To our knowledge, this is the first instance of a multi-robot SLAM system with AUVs and USVs built on loop closures rather than acoustic distance measurements.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note: private letter to a colleague

The paper claims a first-of-its-kind perceptual loop-closure SLAM system for USV-AUV teams that improves AUV errors over single-robot baselines, but the abstract supplies no numbers or method details to back it up.

The core claim here is that you can fuse USV and AUV estimates into one graph by adding visual or perceptual loop closures across the surface boundary, and that this beats running SLAM independently on the AUVs. They describe each robot doing its local estimation first, then detecting inter-robot closures to merge everything centrally. The pitch is that this avoids the usual acoustic ranging problems in cluttered water where signals get blocked or timing fails, while the same structures can serve as shared landmarks from above and below. They ran it on real data from three environments and say the AUV errors dropped compared to the single-robot case on identical paths. It builds directly on some recent inter-robot closure work rather than starting fresh.

Referee Report

2 major / 0 minor

Summary. The manuscript proposes a centralized multi-robot SLAM system for heterogeneous USV-AUV teams. Each robot performs independent state estimation; perceptual loop closures are detected between USV and AUV data streams and used to merge the individual estimates into a single centralized graph. Validation is performed on real-world perceptual data collected in three maritime environments. The abstract claims that this yields lower AUV errors than single-robot SLAM on the same trajectories and states that the system is the first to rely on loop closures rather than acoustic ranging.

Significance. If the claimed error reductions are demonstrated with rigorous quantitative evidence, the approach would constitute a practical advance for multi-robot coordination in cluttered waters where acoustic methods are unreliable due to obstacles or range limits. The centralized formulation is a direct application of standard pose-graph optimization with added inter-robot constraints, so the primary novelty resides in the cross-domain perceptual matching.

major comments (2)

[Abstract] The central claim that 'Results show improved errors for AUVs in the multi-robot SLAM system compared to single-robot SLAM over the same trajectories' is unsupported by any quantitative metrics, ATE/RPE values, loop-closure counts, precision/recall figures, or error bars from the three environments. This absence is load-bearing because the improvement is the sole empirical validation offered for the proposed system.

[Abstract] No description is given of the loop-closure detection pipeline, including feature extraction, descriptors, matching criteria, or outlier rejection applied to perceptual data across the air-water interface. These details are required to assess whether the inter-robot constraints can be reliably formed in complex maritime scenes.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their thoughtful comments on our manuscript. We address the major comments below and plan to revise the abstract to better support our claims with quantitative evidence and method details.

Referee: [Abstract] The central claim that 'Results show improved errors for AUVs in the multi-robot SLAM system compared to single-robot SLAM over the same trajectories' is unsupported by any quantitative metrics, ATE/RPE values, loop-closure counts, precision/recall figures, or error bars from the three environments. This absence is load-bearing because the improvement is the sole empirical validation offered for the proposed system.

Authors: We agree that including specific quantitative metrics in the abstract would strengthen the presentation of our results. The full manuscript includes ATE and RPE comparisons between single-robot and multi-robot SLAM for the AUVs across the three environments, as well as counts of inter-robot loop closures. We will revise the abstract to incorporate representative values from these experiments, such as the percentage reduction in ATE and the number of loop closures used.
Revision: yes

Referee: [Abstract] No description is given of the loop-closure detection pipeline, including feature extraction, descriptors, matching criteria, or outlier rejection applied to perceptual data across the air-water interface. These details are required to assess whether the inter-robot constraints can be reliably formed in complex maritime scenes.

Authors: The loop-closure detection pipeline is described in detail in the methods section of the manuscript, where we explain the use of perceptual data from both domains, feature matching across the air-water interface, and robust outlier rejection. To make this accessible in the abstract, we will add a brief overview of the pipeline, highlighting the key components for cross-domain matching.
Revision: yes
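
For readers unfamiliar with the metric in the first exchange, the sketch below shows one common way to compute absolute trajectory error (ATE) as an RMSE and to express a percentage reduction. It assumes both trajectories are already time-associated and expressed in the same frame as the ground truth, so no alignment step is included. The function names and all trajectory values are hypothetical toy data, not results from the paper.

```python
import math


def ate_rmse(estimate, ground_truth):
    """Root-mean-square position error between paired 2D positions."""
    if not estimate or len(estimate) != len(ground_truth):
        raise ValueError("trajectories must be non-empty and equally long")
    squared = [
        (ex - gx) ** 2 + (ey - gy) ** 2
        for (ex, ey), (gx, gy) in zip(estimate, ground_truth)
    ]
    return math.sqrt(sum(squared) / len(squared))


def percent_reduction(baseline, proposed):
    """Relative improvement of `proposed` over `baseline`, in percent."""
    return 100.0 * (baseline - proposed) / baseline


# Toy trajectories (illustrative only).
truth = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
single_robot = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0)]
multi_robot = [(0.0, 0.0), (1.0, 0.5), (2.0, 1.0)]

baseline = ate_rmse(single_robot, truth)
proposed = ate_rmse(multi_robot, truth)
print(f"ATE single-robot: {baseline:.3f} m")
print(f"ATE multi-robot:  {proposed:.3f} m")
print(f"Reduction:        {percent_reduction(baseline, proposed):.1f} %")
```

Reporting the per-environment ATE alongside the percentage makes the claim checkable, since a large relative reduction on a small absolute error can still be within sensor noise.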

Circularity Check

0 steps flagged

No circularity: empirical system description with no derivations or equations

full rationale

The provided text consists solely of an abstract describing a multi-robot SLAM architecture that augments standard single-robot SLAM with inter-robot perceptual loop closures between USV and AUV data streams. No equations, state estimation formulations, optimization objectives, or parameter-fitting procedures are stated. The central claim (improved AUV errors versus single-robot baselines on identical trajectories) is presented as an empirical outcome from real-world validation in three environments, not as a mathematical derivation. Because no load-bearing step reduces by construction to its own inputs, self-citations, or fitted parameters, the paper exhibits no circularity.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

Abstract-only review yields no explicit free parameters, invented entities, or non-standard axioms; the approach implicitly rests on domain assumptions of standard SLAM.

axioms (1)

domain assumption: Individual robots can produce usable state estimates, and perceptual features can be matched across surface and underwater views as loop closures.
Required for merging per-robot estimates into a centralized graph as described.

Reference graph

Works this paper leans on

41 extracted references · 41 canonical work pages

[1] J. Vincent et al., “Supervised Multi-Agent Autonomy for Cost-Effective Subsea Operations,” Offshore Technology Conference, 2020
[2] L. Paull, G. Huang, M. Seto and J.J. Leonard, “Communication-constrained multi-AUV cooperative SLAM,” IEEE International Conference on Robotics and Automation, 2015
[3] L. Paull, M. Seto and J.J. Leonard, “Decentralized cooperative trajectory estimation for autonomous underwater vehicles,” IEEE/RSJ International Conference on Intelligent Robots and Systems, 2014
[4] J. McConnell, I. Collado-Gonzalez, P. Szenher and A. Shariati, “Maritime Scene Matching for Inter-Robot Localization Across Surface and Underwater Domains,” SSRN Preprint, 2025
[5] M. Machado Dos Santos, G. G. De Giacomo, P. L. J. Drews and S. S. C. Botelho, “Matching Color Aerial Images and Underwater Sonar Images Using Deep Learning for Underwater Localization,” in IEEE Robotics and Automation Letters, 2022
[6] M. M. D. Santos, G. G. De Giacomo, P. L. J. Drews-Jr and S. S. C. Botelho, “Cross-View and Cross-Domain Underwater Localization Based on Optical Aerial and Acoustic Underwater Images,” in IEEE Robotics and Automation Letters, 2022
[7] J. McConnell, F. Chen and B. Englot, “Overhead Image Factors for Underwater Sonar-Based SLAM,” in IEEE Robotics and Automation Letters, 2022

| Module | Mean Runtime (ms) | STD Runtime (ms) |
|---|---|---|
| PCM | 6.05 | 5.23 |
| Point Cloud Registration | 766.11 | 254.60 |
| Rectangle Compression | 115.64 | 40.0 |

TABLE III: Runtime statistics in ms. Point cloud registration is from Section IV-D4, PCM i...

[8] G. Kim and A. Kim, “Scan Context: Egocentric spatial descriptor for place recognition within 3D point cloud map,” IEEE/RSJ International Conference on Intelligent Robots and Systems, 2018
[9] H. Wang, C. Wang and L. Xie, “Intensity Scan Context: Coding Intensity and Geometry Relations for place recognition Detection,” IEEE International Conference on Robotics and Automation, 2020
[10] G. Kim, S. Choi and A. Kim, “Scan Context++: Structural Place Recognition Robust to Rotation and Lateral Variations in Urban Environments,” in IEEE Transactions on Robotics, 2021
[11] J. H. Friedman, J. L. Bentley, and R. A. Finkel, “An Algorithm for Finding Best Matches in Logarithmic Expected Time,” ACM Transactions on Mathematical Software, 1977
[12] H. Kim, G. Kang, S. Jeong, S. Ma and Y. Cho, “Robust Imaging Sonar-based Place Recognition and Localization in Underwater Environments,” IEEE International Conference on Robotics and Automation, 2023
[13] J. McConnell, Y. Huang, P. Szenher, I. Collado-Gonzalez and B. Englot, “DRACo-SLAM: Distributed Robust Acoustic Communication-efficient SLAM for Imaging Sonar Equipped Underwater Robot Teams,” IEEE/RSJ International Conference on Intelligent Robots and Systems, 2022
[14] M.M. Santos, G.B. Zaffari, P. Ribeiro, P.L.J. Drews-Jr. and S.S.C. Botelho, “Underwater place recognition using forward-looking sonar images. A topological approach,” in Journal of Field Robotics, 2018
[15] P.O.C.S. Ribeiro et al., “Underwater Place Recognition in Unknown Environments with Triplet Based Acoustic Image Retrieval,” IEEE International Conference on Machine Learning and Applications, 2018
[16] M. Larsson, N. Bore and J. Folkesson, “Latent Space Metric Learning For Sidescan Sonar Place Recognition,” IEEE/OES Autonomous Underwater Vehicles Symposium, 2020
[17] J. Wang, F. Chen, Y. Huang, J. McConnell, T. Shan and B. Englot, “Virtual Maps for Autonomous Exploration of Cluttered Underwater Environments,” in IEEE Journal of Oceanic Engineering, 2022
[18] P. J. Besl and N. D. McKay, “A method for registration of 3-D shapes,” in IEEE Transactions on Pattern Analysis and Machine Intelligence, 1992
[19] J.G. Mangelson, D. Dominic, R.M. Eustice and R. Vasudevan, “Pairwise Consistent Measurement Set Maximization for Robust Multi-Robot Map Merging,” IEEE International Conference on Robotics and Automation, 2018
[20] P. Agarwal, G. D. Tipaldi, L. Spinello, C. Stachniss and W. Burgard, “Robust map optimization using dynamic covariance scaling,” IEEE International Conference on Robotics and Automation, 2013
[21] N. Sünderhauf and P. Protzel, “Switchable constraints for robust pose graph SLAM,” IEEE/RSJ International Conference on Intelligent Robots and Systems, 2012
[22] M.V. Jakuba, C. N. Roman, H. Singh, C. Murphy, C. Kunz, C. Willis, T. Sato and R.A. Sato, “Long-baseline acoustic navigation for under-ice autonomous underwater vehicle operations,” Journal of Field Robotics, 2008
[23] N. H. Kussat, C. D. Chadwell and R. Zimmerman, “Absolute positioning of an autonomous underwater vehicle using GPS and acoustic measurements,” in IEEE Journal of Oceanic Engineering, 2005
[24] N. R. Rypkema, E. M. Fischell and H. Schmidt, “One-way travel-time inverted ultra-short baseline localization for low-cost autonomous underwater vehicles,” IEEE International Conference on Robotics and Automation, 2017
[25] Y. Wang et al., “Passive Inverted Ultra-Short Baseline Positioning for a Disc-Shaped Autonomous Underwater Vehicle: Design and Field Experiments,” in IEEE Robotics and Automation Letters, 2022
[26] E. Fischell, T. Schneider and H. Schmidt, “Design, Implementation, and Characterization of Precision Timing for Bistatic Acoustic Data Acquisition,” in IEEE Journal of Oceanic Engineering, 2016
[27] E.M. Fischell, N.R. Rypkema and H. Schmidt, “Relative Autonomy and Navigation for Command and Control of Low-Cost Autonomous Underwater Vehicles,” in IEEE Robotics and Automation Letters, 2019
[28] Y. Yao, D. Xu and W. Yan, “Cooperative localization with communication delays for MAUVs,” IEEE International Conference on Intelligent Computing and Intelligent Systems, 2009
[29] T. Halsted and M. Schwager, “Distributed multi-robot localization from acoustic pulses using Euclidean distance geometry,” International Symposium on Multi-Robot and Multi-Agent Systems, 2017
[30] Y. Li, Y. Wang, W. Yu and X. Guan, “Multiple Autonomous Underwater Vehicle Cooperative Localization in Anchor-Free Environments,” in IEEE Journal of Oceanic Engineering, 2019
[31] A. Bahr, J. Leonard and M. Fallon, “Cooperative Localization for Autonomous Underwater Vehicles,” in The International Journal of Robotics Research, 2009
[32] Y. Huang, T. Shan, F. Chen and B. Englot, “DiSCo-SLAM: Distributed Scan Context-Enabled Multi-Robot LiDAR SLAM With Two-Stage Global-Local Graph Optimization,” in IEEE Robotics and Automation Letters, 2022
[33] I. D. Gerg and V. Monga, “A Learnable Image Compression Scheme for Synthetic Aperture Sonar Imagery,” OCEANS, 2021
[34] I. Vizzo, T. Guadagnino, B. Mersch, L. Wiesmann, J. Behley and C. Stachniss, “KISS-ICP: In Defense of Point-to-Point ICP – Simple, Accurate, and Robust Registration If Done the Right Way,” in IEEE Robotics and Automation Letters, 2023
[35] T. Guadagnino, B. Mersch, S. Gupta, I. Vizzo, G. Grisetti and C. Stachniss, “KISS-SLAM: A Simple, Robust, and Accurate 3D LiDAR SLAM System With Enhanced Generalization Capabilities,” arXiv preprint, 2023
[36] B. A. McElhoe, “An Assessment of the Navigation and Course Corrections for a Manned Flyby of Mars or Venus,” IEEE Transactions on Aerospace and Electronic Systems, 1966
[37] F. Dellaert and GTSAM Contributors, “borglab-gtsam,” Georgia Tech Borg Lab, 2022
[38] M. Richards, Fundamentals of Radar Signal Processing, McGraw Hill, 2005
[39] J. Yang, H. Li and Y. Jia, “Go-ICP: Solving 3D Registration Efficiently and Globally Optimally,” IEEE International Conference on Computer Vision, 2013
[40] EvoLogics, https://www.evologics.com/acoustic-modem/hs
[41] Mapbox, https://mapbox.com