Fixing the scale and shift in monocular depth for camera pose estimation. arXiv preprint arXiv:2501.07742, 2025
2 Pith papers cite this work. Polarity classification is still indexing.
Citing papers by field: cs.CV (2); by year: 2025 (2); by verdict: UNVERDICTED (2)
Citing papers
- Audio-Visual Camera Pose Estimation with Passive Scene Sounds and In-the-Wild Video
Integrating direction-of-arrival spectra and binaural embeddings from passive audio into vision models improves relative camera pose estimation in in-the-wild videos and adds robustness to visual corruption.
- Radar-Guided Polynomial Fitting for Metric Depth Estimation
POLAR converts scaleless monocular depth maps to metric scale via radar-guided polynomial fitting and first-derivative regularization, claiming 24.9% MAE and 33.2% RMSE gains over prior methods on three datasets.