A Sonar-Visual Dataset for Cross-Modal Underwater Robot Perception
Pith reviewed 2026-06-28 16:45 UTC · model grok-4.3
The pith
The SOVIS dataset supplies over 76,000 synchronized sonar-visual frame pairs; in a proof-of-concept on a small labeled subset, cross-modal fish detection reaches 7x the mAP@0.10 of a monocular camera baseline.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
SOVIS comprises over 76,000 paired frames collected across 17 dives at six sites in the Trondheimfjord, supported by an end-to-end pipeline that cleans and synchronizes the cross-modal sensor data. An interactive annotation tool accelerates labeling of the paired data. A proof-of-concept cross-modal fish detection task using a small subset of labeled data achieves a 7x improvement in mAP@0.10 over a monocular camera baseline, positioning the dataset as the first step toward dense sonar prediction from monocular images.
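To make the metric concrete, here is a minimal sketch of the overlap test that sits underneath a score such as mAP@0.10. It assumes the suffix denotes an intersection-over-union (IoU) threshold of 0.10, which is the usual convention but is not spelled out in the text above. The box format `(x1, y1, x2, y2)` and the example coordinates are hypothetical.

```python
def iou(a, b):
    """Intersection over union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    # Clamp at zero so disjoint boxes give an empty intersection.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


# Hypothetical ground-truth and predicted boxes that overlap only loosely.
ground_truth = (0, 0, 10, 10)
prediction = (5, 5, 15, 15)

score = iou(ground_truth, prediction)
matches_at_010 = score >= 0.10  # counted as a hit under a 0.10 threshold
matches_at_050 = score >= 0.50  # would be a miss under the stricter 0.50
```

Under this reading, a threshold of 0.10 accepts coarse localization, so the reported gain says more about whether fish are found at all than about how tightly they are boxed.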
What carries the argument
The end-to-end pipeline that cleans and synchronizes cross-modal sensor streams to produce accurately paired sonar-visual frames for supervised learning.
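The review does not describe how the pipeline pairs frames, so the following is only an illustrative sketch of one common approach: nearest-timestamp matching with a tolerance. The function name `pair_by_timestamp`, the 50 ms tolerance, and the sample timestamps are all assumptions, not details from the paper.

```python
from bisect import bisect_left


def pair_by_timestamp(camera_ts, sonar_ts, max_offset_s=0.05):
    """Pair each camera timestamp with the nearest sonar timestamp.

    Both inputs are sorted lists of timestamps in seconds. Returns a list of
    (camera_index, sonar_index, offset_s) tuples. Camera frames with no sonar
    ping within max_offset_s are dropped.
    """
    pairs = []
    if not sonar_ts:
        return pairs
    for i, t in enumerate(camera_ts):
        k = bisect_left(sonar_ts, t)
        # The nearest ping is either just before or just after position k.
        candidates = [j for j in (k - 1, k) if 0 <= j < len(sonar_ts)]
        j = min(candidates, key=lambda idx: abs(sonar_ts[idx] - t))
        offset = sonar_ts[j] - t
        if abs(offset) <= max_offset_s:
            pairs.append((i, j, offset))
    return pairs


# Hypothetical timestamps: the camera frame at 0.20 s has no ping within 50 ms.
camera = [0.00, 0.10, 0.20, 0.30]
sonar = [0.02, 0.13, 0.29, 0.90]
pairs = pair_by_timestamp(camera, sonar)
paired_indices = [(i, j) for i, j, _ in pairs]
```

This sketch lets two camera frames share one sonar ping and ignores clock drift between sensors. A real pipeline would likely need to handle both.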
If this is right
- Models trained on the pairs can combine visual semantics with acoustic range to detect fish more reliably than vision alone.
- The dataset directly supports experiments that attempt to predict full sonar outputs from single monocular images.
- The synchronization pipeline supplies a reusable method for constructing additional multi-modal underwater collections.
- Larger volumes of paired examples lower the barrier to developing cross-modal algorithms for robot perception.
Where Pith is reading between the lines
- The same pairs could be used to train models that correct visibility loss in one modality by reference to the other.
- Performance improvements observed on fish may extend to detection of other objects such as structures or debris once more labels are added.
- Real-robot deployment would reveal whether the learned mappings remain stable under motion and lighting changes absent from the static dives.
- The six sites may or may not represent the full range of turbidity and bottom types encountered in open-ocean operations.
Load-bearing premise
The pipeline produces correctly paired frames without misalignment errors or biases that would distort what models learn from the data.
What would settle it
Retraining the fish detector after independently re-synchronizing the original raw sensor logs and finding that the reported sevenfold mAP gain disappears.
Original abstract
Underwater robots typically use both cameras and sonar for perception to leverage the rich semantic details of vision and the robust range measurements of acoustics. However, learning to map between these modalities via cross-modal prediction remains underexplored due to limited sonar-visual paired datasets. We present SOVIS, a sonar-visual dataset for cross-modal underwater perception. SOVIS comprises over 76,000 paired frames collected across 17 dives at six sites in the Trondheimfjord, supported by an end-to-end pipeline that cleans and synchronizes the cross-modal sensor data. We also introduce an interactive annotation tool designed to accelerate the labeling process for this paired data. Finally, we demonstrate a proof-of-concept cross-modal fish detection task using a small subset of labeled data, achieving a 7x improvement in mAP@0.10 over a monocular camera baseline. SOVIS serves as the first step toward advancing cross-modal underwater perception research, enabling research directions such as dense sonar prediction from monocular images.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper presents SOVIS, a sonar-visual dataset comprising over 76,000 paired frames collected across 17 dives at six sites in the Trondheimfjord. It describes an end-to-end pipeline for cleaning and synchronizing the cross-modal sensor data, introduces an interactive annotation tool, and demonstrates a proof-of-concept cross-modal fish detection task on a small labeled subset that achieves a 7x improvement in mAP@0.10 over a monocular camera baseline.
Significance. If the pairing quality holds, the dataset and annotation tool would provide a valuable empirical resource for cross-modal underwater perception, addressing the scarcity of large-scale paired sonar-visual data and enabling downstream tasks such as dense prediction and fish detection. The manuscript's strength lies in its scale of data collection and the provision of a practical pipeline and tool.
major comments (2)
- [§3] The end-to-end pipeline for timestamp matching, cleaning heuristics, and synchronization is described in detail, but the manuscript reports no quantitative validation metrics such as mean/max timestamp offset (in ms), fraction of pairs rejected due to misalignment, or statistics from manual spot-checks. This is load-bearing for the central claim that the >76k frames supply reliably paired data usable for cross-modal learning.
- [Proof-of-concept results] The reported 7x mAP@0.10 improvement is evaluated on a small labeled subset, yet the text provides no details on baseline implementation, train/test splits, error bars, or confirmation that synchronization errors were not a confounding factor in the cross-modal signal. This leaves the empirical demonstration weakly supported.
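The validation metrics requested in the first major comment are cheap to compute once per-pair offsets are logged. The sketch below shows one way to summarize them. The function name `offset_report`, the dictionary keys, and the sample offsets are hypothetical and are not values from the paper.

```python
from statistics import mean, pstdev


def offset_report(offsets_ms, n_candidates):
    """Summarize signed sonar-minus-camera offsets (ms) of accepted pairs.

    n_candidates is the number of candidate pairs before any rejection.
    """
    abs_offsets = [abs(o) for o in offsets_ms]
    return {
        "n_pairs": len(offsets_ms),
        "mean_abs_ms": mean(abs_offsets),
        "max_abs_ms": max(abs_offsets),
        # Population standard deviation of the signed offsets.
        "std_ms": pstdev(offsets_ms),
        "rejected_fraction": 1 - len(offsets_ms) / n_candidates,
    }


# Hypothetical example: 4 accepted pairs out of 5 candidates.
report = offset_report([20.0, -10.0, 30.0, -40.0], n_candidates=5)
```

Reporting the signed standard deviation next to the mean absolute offset helps separate a constant clock bias from random jitter.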
minor comments (1)
- The abstract states the 7x improvement but does not indicate the size of the labeled subset used, which would aid reader assessment of the result's scope.
Simulated Author's Rebuttal
We thank the referee for the constructive feedback. We address each major comment below and commit to revisions that strengthen the manuscript's claims regarding data quality and the proof-of-concept evaluation.
- Referee: [§3] The end-to-end pipeline for timestamp matching, cleaning heuristics, and synchronization is described in detail, but the manuscript reports no quantitative validation metrics such as mean/max timestamp offset (in ms), fraction of pairs rejected due to misalignment, or statistics from manual spot-checks. This is load-bearing for the central claim that the >76k frames supply reliably paired data usable for cross-modal learning.
  Authors: We agree that quantitative validation metrics are necessary to support the reliability of the paired data. In the revised manuscript we will augment Section 3 with the requested statistics: the distribution (mean, max, std) of timestamp offsets across all pairs, the fraction of candidate pairs rejected by each cleaning heuristic, and quantitative results from manual spot-checks performed on a random sample of 500 pairs. These additions will directly address the load-bearing concern.
  Revision: yes
- Referee: [Proof-of-concept results] The reported 7x mAP@0.10 improvement is evaluated on a small labeled subset, yet the text provides no details on baseline implementation, train/test splits, error bars, or confirmation that synchronization errors were not a confounding factor in the cross-modal signal. This leaves the empirical demonstration weakly supported.
  Authors: We acknowledge the need for greater transparency in the proof-of-concept. The revised text will specify the exact baseline architecture and training protocol, the train/validation/test split ratios and sizes for the labeled subset, standard deviations or error bars across repeated runs, and a targeted analysis (e.g., ablation on deliberately mis-synchronized pairs) confirming that the observed 7x gain is not an artifact of residual synchronization error. These details will be added to the relevant experimental section.
  Revision: yes
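The mis-synchronization ablation mentioned in the rebuttal could be set up by shifting one modality by a fixed number of frames before retraining. The sketch below shows only that re-pairing step. The helper name `shift_pairs` and the frame identifiers are hypothetical, and nothing here reflects the authors' actual protocol.

```python
def shift_pairs(pairs, shift):
    """Deliberately mis-synchronize a time-ordered list of (camera, sonar) pairs.

    Camera frame i is re-paired with the sonar frame that originally belonged
    to camera frame i + shift. Frames without a partner after the shift are
    dropped, so the result has len(pairs) - abs(shift) entries.
    """
    n = len(pairs)
    if abs(shift) >= n:
        return []
    cameras = [camera for camera, _ in pairs]
    sonars = [sonar for _, sonar in pairs]
    if shift >= 0:
        return list(zip(cameras[: n - shift], sonars[shift:]))
    return list(zip(cameras[-shift:], sonars[: n + shift]))


# Hypothetical frame identifiers from a single dive, in time order.
original = [(f"cam{i}", f"son{i}") for i in range(5)]
shifted = shift_pairs(original, 2)
```

If detection accuracy falls as the absolute shift grows, the cross-modal signal depends on correct pairing. A flat curve would suggest the gain comes from something other than frame-level correspondence. Shifts should be applied within a dive so that frames from different sites are never mixed.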
Circularity Check
No circularity: empirical dataset paper with no derivation chain
The manuscript describes collection of a real-world sonar-visual dataset across 17 dives, an end-to-end cleaning/synchronization pipeline, an annotation tool, and a small proof-of-concept detection demo. No equations, fitted parameters, uniqueness theorems, or self-citations are invoked to derive any result from prior outputs of the same work. The 7× mAP improvement is reported as an empirical observation on the collected data rather than a prediction forced by construction. The synchronization pipeline is described procedurally but is not presented as a mathematical derivation that reduces to its own inputs. This is a standard data-release contribution whose central claims rest on external measurement rather than internal self-reference.