Sky2Ground: A Benchmark for Site Modeling under Varying Altitude
Pith reviewed 2026-05-15 12:00 UTC · model grok-4.3
The pith
SkyNet improves multi-view alignment for altitude-varying scenes by training progressively on satellite imagery.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
SkyNet, a model trained with a curriculum strategy that progressively incorporates more satellite views, significantly strengthens multi-view alignment and outperforms existing methods by 9.6% on RRA@5 and 18.1% on RTA@5 in absolute performance.
What carries the argument
Curriculum-based training strategy that progressively incorporates satellite views to enhance cross-view consistency in SkyNet.
If this is right
- Satellite imagery degrades pose estimation performance in current models under large altitude changes.
- Reconstruction suffers from sparse geometric overlap, orthogonal angles, and noise in real images.
- The dataset supports evaluation from global satellite context down to local ground details.
- SkyNet supplies a practical baseline for generalizable localization across altitude levels.
Where Pith is reading between the lines
- The curriculum approach could transfer to other scale-varying problems such as combining street-level and overhead imagery for mapping.
- Similar benchmarks mixing synthetic and real data might help test robustness in related tasks like object detection across distances.
- Better handling of these view differences may improve downstream uses in urban modeling or navigation systems.
Load-bearing premise
The 51 sites and their mix of real and synthetic images represent the range of altitude variations and noise that future models will encounter.
What would settle it
SkyNet falling below baseline performance on a fresh set of sites whose altitude spans or noise levels differ markedly from the original 51.
Figures
read the original abstract
We introduce Sky2Ground, a three-view dataset designed for varying altitude camera localization, correspondence learning, and reconstruction. The dataset combines structured synthetic imagery with real, in-the-wild images, providing both controlled multi-view geometry and realistic scene noise. Each of the 51 sites contains thousands of satellite, aerial, and ground images spanning wide altitude ranges and nearly orthogonal viewing angles, enabling rigorous evaluation across global-to-local contexts. We benchmark state of the art pose estimation models, including MASt3R, DUSt3R, Map Anything, and VGGT, and observe that the use of satellite imagery often degrades performance, highlighting the challenges under large altitude variations. We also examine reconstruction methods, highlighting the challenges introduced by sparse geometric overlap, varying perspectives, and the use of real imagery, which often introduces noise and reduces rendering quality. To address some of these challenges, we propose SkyNet, a model which enhances cross-view consistency when incorporating satellite imagery with a curriculum-based training strategy to progressively incorporate more satellite views. SkyNet significantly strengthens multi-view alignment and outperforms existing methods by 9.6% on RRA@5 and 18.1% on RTA@5 in terms of absolute performance. Sky2Ground and SkyNet together establish a comprehensive testbed and baseline for advancing large-scale, multi-altitude 3D perception and generalizable camera localization. Code and models will be released publicly for future research.Project page: https://sky2ground2026.github.io/sky2ground/
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper introduces Sky2Ground, a benchmark dataset of 51 sites with thousands of synthetic and real satellite, aerial, and ground images spanning wide altitude ranges and orthogonal views. It evaluates pose estimation models (MASt3R, DUSt3R, Map Anything, VGGT) and reconstruction methods, documenting performance degradation from satellite imagery, and proposes SkyNet, which applies a curriculum training strategy to progressively add satellite views and reports absolute gains of 9.6% on RRA@5 and 18.1% on RTA@5 over baselines.
Significance. If the empirical gains and evaluation protocol hold under scrutiny, the work supplies a valuable public testbed for multi-altitude 3D perception and localization, an area where existing benchmarks lack coverage of extreme viewpoint and scale changes. The curriculum-based SkyNet offers a practical, reproducible baseline for cross-view consistency, and the planned code/model release will support follow-on research in generalizable camera pose estimation.
major comments (2)
- [§5.2] §5.2: The absolute gains of 9.6% on RRA@5 and 18.1% on RTA@5 are reported without standard deviations, multiple random seeds, or statistical significance tests. Because the curriculum progression schedule is an explicit free parameter, these omissions make it difficult to determine whether the outperformance is robust or sensitive to hyperparameter choices.
- [§4.1] §4.1: The manuscript provides insufficient detail on the train/validation/test splits across the 51 sites and on the precise balancing of real versus synthetic imagery within each altitude tier. This information is load-bearing for verifying that the reported degradation and SkyNet improvements are not artifacts of site-specific leakage or unbalanced noise distributions.
minor comments (2)
- [Figure 3] Figure 3: The caption and legend should explicitly label which rows correspond to satellite versus ground views to help readers immediately connect the visualizations to the altitude-variation challenge.
- [Related Work] Related Work section: A brief citation to recent remote-sensing cross-view matching papers (e.g., on satellite-to-ground registration) would strengthen the positioning of Sky2Ground relative to prior benchmarks.
Simulated Author's Rebuttal
We thank the referee for the positive assessment and constructive suggestions. We respond to each major comment below and will make the necessary revisions to the manuscript.
read point-by-point responses
-
Referee: [§5.2] §5.2: The absolute gains of 9.6% on RRA@5 and 18.1% on RTA@5 are reported without standard deviations, multiple random seeds, or statistical significance tests. Because the curriculum progression schedule is an explicit free parameter, these omissions make it difficult to determine whether the outperformance is robust or sensitive to hyperparameter choices.
Authors: We agree that including standard deviations from multiple random seeds and statistical significance tests would better demonstrate the robustness of the reported gains. Although the curriculum progression schedule was determined through preliminary validation, we will conduct additional experiments with multiple seeds in the revised manuscript and report the mean performance along with standard deviations and p-values where appropriate. revision: yes
-
Referee: [§4.1] §4.1: The manuscript provides insufficient detail on the train/validation/test splits across the 51 sites and on the precise balancing of real versus synthetic imagery within each altitude tier. This information is load-bearing for verifying that the reported degradation and SkyNet improvements are not artifacts of site-specific leakage or unbalanced noise distributions.
Authors: We appreciate this observation and will provide more detailed information in the revised manuscript. Specifically, we will clarify the site-level splits (ensuring no overlap between train, validation, and test sites), the allocation of images across altitude tiers, and the exact ratios of real to synthetic images within each tier to allow full verification of the experimental setup. revision: yes
Circularity Check
No significant circularity; empirical benchmarks on held-out data
full rationale
The paper introduces the Sky2Ground dataset (51 sites mixing synthetic and real imagery) and SkyNet model with a curriculum training procedure. Central claims are absolute performance gains (9.6% RRA@5, 18.1% RTA@5) measured on held-out test splits against external baselines (MASt3R, DUSt3R, etc.). No equations or derivations are present that reduce to fitted inputs by construction; the curriculum is a standard training schedule, not a self-definitional loop. Any self-citations are incidental and non-load-bearing for the empirical results. The evaluation protocol is self-contained against external benchmarks and does not import uniqueness theorems or ansatzes from prior author work.
Axiom & Free-Parameter Ledger
free parameters (1)
- curriculum progression schedule
axioms (1)
- domain assumption Standard epipolar and multi-view geometry constraints remain valid across large altitude differences
invented entities (1)
-
SkyNet
no independent evidence
Reference graph
Works this paper leans on
-
[1]
Javier Argota Sánchez-Vaquerizo. Urban digital twins and metaverses towards city multiplicities: uniting or dividing urban experiences? Ethics and Information Technology, 27 (1):4, 2025. 1
work page 2025
-
[2]
Implications of web mercator and its use in online mapping
Sarah E Battersby, Michael P Finn, E Lynn Usery, and K Yamamoto. Implications of web mercator and its use in online mapping. Cartographica, 49(2):85–101, 2014. 1
work page 2014
-
[3]
nuscenes: A multi- modal dataset for autonomous driving
Holger Caesar, Varun Bankiti, Alex H Lang, Sourabh V ora, Venice Erin Liong, Qiang Xu, Anush Krishnan, Yu Pan, Gi- ancarlo Baldan, and Oscar Beijbom. nuscenes: A multi- modal dataset for autonomous driving. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 11621–11631, 2020. 3
work page 2020
-
[4]
End- to-end object detection with transformers
Nicolas Carion, Francisco Massa, Gabriel Synnaeve, Nicolas Usunier, Alexander Kirillov, and Sergey Zagoruyko. End- to-end object detection with transformers. In European conference on computer vision, pages 213–229. Springer,
-
[5]
Shihan Chen, Zhaojin Li, Zeyu Chen, Qingsong Yan, Gaoyang Shen, and Ran Duan. 3d gaussian splatting for fine- detailed surface reconstruction in large-scale scene. arXiv preprint arXiv:2506.17636, 2025. 2
-
[6]
An integrated uav navigation system based on aerial image matching
Gianpaolo Conte and Patrick Doherty. An integrated uav navigation system based on aerial image matching. In 2008 IEEE Aerospace Conference, pages 1–10. IEEE, 2008. 1
work page 2008
-
[7]
Yuanyuan Gao, Hao Li, Jiaqi Chen, Zhengyu Zou, Zhihang Zhong, Dingwen Zhang, Xiao Sun, and Junwei Han. Citygs- x: A scalable architecture for efficient and geometrically accurate large-scale scene reconstruction. arXiv preprint arXiv:2503.23044, 2025. 2
- [8]
-
[9]
Dragon: Drone and ground gaussian splatting for 3d build- ing reconstruction
Yujin Ham, Mateusz Michalkiewicz, and Guha Balakrishnan. Dragon: Drone and ground gaussian splatting for 3d build- ing reconstruction. In 2024 IEEE International Conference on Computational Photography (ICCP), pages 1–12. IEEE,
work page 2024
-
[10]
Wenmiao Hu, Yichen Zhang, Yuxuan Liang, Yifang Yin, An- drei Georgescu, An Tran, Hannes Kruppa, See-Kiong Ng, and Roger Zimmermann. Beyond geo-localization: Fine-grained orientation of street-view images by cross-view matching with satellite imagery. In Proceedings of the 30th ACM international conference on multimedia, pages 6155–6164,
-
[11]
Horizon-gs: Unified 3d gaussian splatting for large-scale aerial-to-ground scenes
Lihan Jiang, Kerui Ren, Mulin Yu, Linning Xu, Junting Dong, Tao Lu, Feng Zhao, Dahua Lin, and Bo Dai. Horizon-gs: Unified 3d gaussian splatting for large-scale aerial-to-ground scenes. In Proceedings of the Computer Vision and Pattern Recognition Conference, pages 26789–26799, 2025. 2
work page 2025
-
[12]
Image Matching across Wide Baselines: From Paper to Practice
Yuhe Jin, Dmytro Mishkin, Anastasiia Mishchuk, Jiri Matas, Pascal Fua, Kwang Moo Yi, and Eduard Trulls. Image Matching across Wide Baselines: From Paper to Practice. International Journal of Computer Vision, 2020. 4
work page 2020
-
[13]
Unconstrained large-scale 3d re- construction and rendering across altitudes
Neil Joshi, Joshua Carney, Nathanael Kuo, Homer Li, Cheng Peng, and Myron Brown. Unconstrained large-scale 3d re- construction and rendering across altitudes. arXiv preprint arXiv:2505.00734, 2025. 2
-
[14]
3d gaussian splatting for real-time radiance field rendering
Bernhard Kerbl, Georgios Kopanas, Thomas Leimkühler, and George Drettakis. 3d gaussian splatting for real-time radiance field rendering. ACM Trans. Graph., 42(4):139–1, 2023. 2
work page 2023
-
[15]
Naoki Kikuchi, Tomohiro Fukuda, and Nobuyoshi Yabuki. Future landscape visualization using a city digital twin: In- tegration of augmented reality and drones with implemen- tation of 3d model-based occlusion handling. Journal of Computational Design and Engineering, 9(2):837–856, 2022. 2
work page 2022
-
[16]
Photogrammetry: Geometry from Images and Laser Scans
Karl Kraus. Photogrammetry: Geometry from Images and Laser Scans. De Gruyter, 2007. 1
work page 2007
-
[17]
Digital twin of a city: Review of technology serving city needs
Ville V Lehtola, Mila Koeva, Sander Oude Elberink, Paulo Raposo, Juho-Pekka Virtanen, Faridaddin Vahdatikhaki, and Simone Borsci. Digital twin of a city: Review of technology serving city needs. International Journal of Applied Earth Observation and Geoinformation, 114:102915, 2022. 1
work page 2022
-
[18]
Ground- ing image matching in 3d with mast3r, 2024
Vincent Leroy, Yohann Cabon, and Jerome Revaud. Ground- ing image matching in 3d with mast3r, 2024. 1, 2, 8, 5
work page 2024
-
[19]
Learning cross-view visual geo-localization without ground truth
Haoyuan Li, Chang Xu, Wen Yang, Huai Yu, and Gui-Song Xia. Learning cross-view visual geo-localization without ground truth. IEEE Transactions on Geoscience and Remote Sensing, 2024. 1
work page 2024
-
[20]
Matrixcity: A large-scale city dataset for city-scale neural rendering and beyond
Yixuan Li, Lihan Jiang, Linning Xu, Yuanbo Xiangli, Zhenzhi Wang, Dahua Lin, and Bo Dai. Matrixcity: A large-scale city dataset for city-scale neural rendering and beyond. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 3205–3215, 2023. 1, 3
work page 2023
-
[21]
Kitti-360: A novel dataset and benchmarks for urban scene understanding in 2d and 3d
Yiyi Liao, Jun Xie, and Andreas Geiger. Kitti-360: A novel dataset and benchmarks for urban scene understanding in 2d and 3d. IEEE Transactions on Pattern Analysis and Machine Intelligence, 45(3):3292–3310, 2022. 3
work page 2022
-
[22]
Citygaussian: Real-time high-quality large-scale scene rendering with gaussians
Yang Liu, Chuanchen Luo, Lue Fan, Naiyan Wang, Jun- ran Peng, and Zhaoxiang Zhang. Citygaussian: Real-time high-quality large-scale scene rendering with gaussians. In European Conference on Computer Vision, pages 265–282. Springer, 2024. 2
work page 2024
-
[23]
Yang Liu, Chuanchen Luo, Zhongkai Mao, Junran Peng, and Zhaoxiang Zhang. Citygaussianv2: Efficient and geometri- cally accurate reconstruction for large-scale scenes. arXiv preprint arXiv:2411.00771, 2024. 2
-
[24]
Roger Marí, Gabriele Facciolo, and Thibaud Ehret. Sat- NeRF: Learning multi-view satellite photogrammetry with transient objects and shadow modeling using RPC cam- eras. In 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), pages 1310– 1320, 2022. 1
work page 2022
-
[25]
Microsoft Corporation. Bing Maps Imagery Services. https://www.bing.com/maps. 1
-
[26]
Nerf: Representing scenes as neural radiance fields for view syn- thesis
Ben Mildenhall, Pratul P Srinivasan, Matthew Tancik, Jonathan T Barron, Ravi Ramamoorthi, and Ren Ng. Nerf: Representing scenes as neural radiance fields for view syn- thesis. Communications of the ACM, 65(1):99–106, 2021. 2
work page 2021
-
[27]
Cross-view visual geo-localization for outdoor augmented reality
Niluthpol Chowdhury Mithun, Kshitij S Minhas, Han-Pang Chiu, Taragay Oskiper, Mikhail Sizintsev, Supun Samarasek- era, and Rakesh Kumar. Cross-view visual geo-localization for outdoor augmented reality. In 2023 IEEE Conference Virtual Reality and 3D User Interfaces (VR), pages 493–502. IEEE, 2023. 1
work page 2023
-
[28]
On occlu- sions in video action detection: Benchmark datasets and training recipes
Rajat Modi, Vibhav Vineet, and Yogesh Rawat. On occlu- sions in video action detection: Benchmark datasets and training recipes. Advances in Neural Information Processing Systems, 36:57306–57335, 2023. 2
work page 2023
-
[29]
Visual localization with google earth images for robust global pose estimation of uavs
Bhavit Patel, Timothy D Barfoot, and Angela P Schoel- lig. Visual localization with google earth images for robust global pose estimation of uavs. In 2020 IEEE international conference on robotics and automation (ICRA), pages 6491–
work page 2020
-
[30]
Navigating urban complexity: The trans- formative role of digital twins in smart city development
Dechen Peldon, Saeed Banihashemi, Khuong LeNguyen, and Sybil Derrible. Navigating urban complexity: The trans- formative role of digital twins in smart city development. Sustainable Cities and Society, 2024. 1
work page 2024
-
[31]
Revealing scenes by inverting structure from motion reconstructions
Francesco Pittaluga, Sanjeev J Koppal, Sing Bing Kang, and Sudipta N Sinha. Revealing scenes by inverting structure from motion reconstructions. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 145–154, 2019. 1
work page 2019
-
[32]
Sat2map: Reconstructing 3d building roof from 2d satellite images
Yoones Rezaei and Stephen Lee. Sat2map: Reconstructing 3d building roof from 2d satellite images. ACM Transactions on Cyber-Physical Systems, 8(4):1–25, 2024. 1
work page 2024
-
[33]
Structure-from-motion revisited
Johannes Lutz Schönberger and Jan Michael Frahm. Structure-from-motion revisited. In Conference on Computer Vision and Pattern Recognition (CVPR), 2016. 2
work page 2016
-
[34]
A vote-and-verify strat- egy for fast spatial verification in image retrieval
Johannes Lutz Schönberger, True Price, Torsten Sattler, Jan- Michael Frahm, and Marc Pollefeys. A vote-and-verify strat- egy for fast spatial verification in image retrieval. In Asian Conference on Computer Vision (ACCV), 2016. 1
work page 2016
-
[35]
Pixelwise view selection for un- structured multi-view stereo
Johannes Lutz Schönberger, Enliang Zheng, Marc Pollefeys, and Jan-Michael Frahm. Pixelwise view selection for un- structured multi-view stereo. In European Conference on Computer Vision (ECCV), 2016. 1, 2
work page 2016
-
[36]
To- wards urban digital twins: A workflow for procedural visual- ization using geospatial data
Sanjay Somanath, Vasilis Naserentin, Orfeas Eleftheriou, Daniel Sjölie, Beata Stahre Wästberg, and Anders Logg. To- wards urban digital twins: A workflow for procedural visual- ization using geospatial data. Remote Sensing, 16(11):1939,
work page 1939
-
[37]
Dronesplat: 3d gaussian splatting for ro- bust 3d reconstruction from in-the-wild drone imagery
Jiadong Tang, Yu Gao, Dianyi Yang, Liqi Yan, Yufeng Yue, and Yi Yang. Dronesplat: 3d gaussian splatting for ro- bust 3d reconstruction from in-the-wild drone imagery. In Proceedings of the Computer Vision and Pattern Recognition Conference, pages 833–843, 2025. 2
work page 2025
-
[38]
Geometric processing of remote sensing im- ages: models, algorithms and methods
Thierry Toutin. Geometric processing of remote sensing im- ages: models, algorithms and methods. International Journal of Remote Sensing, 25(5):1893–1924, 2004. 1
work page 1924
-
[39]
Aerialmegadepth: Learning aerial-ground reconstruction and view synthesis
Khiem Vuong, Anurag Ghosh, Deva Ramanan, Srinivasa Narasimhan, and Shubham Tulsiani. Aerialmegadepth: Learning aerial-ground reconstruction and view synthesis. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025. 1, 3, 8
work page 2025
-
[40]
Vggt: Visual geometry grounded transformer
Jianyuan Wang, Minghao Chen, Nikita Karaev, Andrea Vedaldi, Christian Rupprecht, and David Novotny. Vggt: Visual geometry grounded transformer. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025. 1, 2, 3, 6, 7, 5
work page 2025
-
[41]
Dust3r: Geometric 3d vision made easy
Shuzhe Wang, Vincent Leroy, Yohann Cabon, Boris Chidlovskii, and Jerome Revaud. Dust3r: Geometric 3d vision made easy. In CVPR, 2024. 1, 2, 8, 4
work page 2024
-
[42]
Croco: Self-supervised pre-training for 3d vision tasks by cross-view completion
Philippe Weinzaepfel, Vincent Leroy, Thomas Lucas, Ro- main Brégier, Yohann Cabon, Vaibhav Arora, Leonid Ants- feld, Boris Chidlovskii, Gabriela Csurka, and Jérôme Re- vaud. Croco: Self-supervised pre-training for 3d vision tasks by cross-view completion. Advances in Neural Information Processing Systems, 35:3502–3516, 2022. 5
work page 2022
-
[43]
Croco v2: Improved cross-view completion pre-training for stereo matching and optical flow
Philippe Weinzaepfel, Thomas Lucas, Vincent Leroy, Yohann Cabon, Vaibhav Arora, Romain Brégier, Gabriela Csurka, Leonid Antsfeld, Boris Chidlovskii, and Jérôme Revaud. Croco v2: Improved cross-view completion pre-training for stereo matching and optical flow. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 17969–17980, 2023. 5
work page 2023
-
[44]
Wide- area image geolocalization with aerial reference imagery
Scott Workman, Richard Souvenir, and Nathan Jacobs. Wide- area image geolocalization with aerial reference imagery. In Proceedings of the IEEE International Conference on Computer Vision, pages 3961–3969, 2015. 1
work page 2015
-
[45]
3d gaussian splat- ting for large-scale surface reconstruction from aerial images
YuanZheng Wu, Jin Liu, and Shunping Ji. 3d gaussian splat- ting for large-scale surface reconstruction from aerial images. arXiv preprint arXiv:2409.00381, 2024. 2
-
[46]
Bungeenerf: Progressive neural radiance field for extreme multi-scale scene rendering
Yuanbo Xiangli, Linning Xu, Xingang Pan, Nanxuan Zhao, Anyi Rao, Christian Theobalt, Bo Dai, and Dahua Lin. Bungeenerf: Progressive neural radiance field for extreme multi-scale scene rendering. In European conference on computer vision, pages 106–122. Springer, 2022. 1, 2, 3
work page 2022
-
[47]
Butian Xiong, Zhuo Li, and Zhen Li. Gauu-scene: A scene reconstruction benchmark on large scale 3d recon- struction dataset using gaussian splatting. arXiv preprint arXiv:2401.14032, 2024. 3
-
[48]
Vr-nerf: High- fidelity virtualized walkable spaces
Linning Xu, Vasu Agrawal, William Laney, Tony Garcia, Aayush Bansal, Changil Kim, Samuel Rota Bulò, Lorenzo Porzi, Peter Kontschieder, Aljaž Božiˇc, et al. Vr-nerf: High- fidelity virtualized walkable spaces. In SIGGRAPH Asia 2023 Conference Papers, pages 1–12, 2023. 1
work page 2023
-
[49]
Robust and efficient 3d gaussian splatting for urban scene reconstruction
Zhensheng Yuan, Haozhi Huang, Zhen Xiong, Di Wang, and Guanghua Yang. Robust and efficient 3d gaussian splatting for urban scene reconstruction. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 26209– 26219, 2025. 2
work page 2025
-
[50]
Predicting ground-level scene layout from aerial imagery
Menghua Zhai, Zachary Bessinger, Scott Workman, and Nathan Jacobs. Predicting ground-level scene layout from aerial imagery. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 867–875,
-
[51]
Crossview- gs: Cross-view gaussian splatting for large-scale scene recon- struction
Chenhao Zhang, Yuanping Cao, and Lei Zhang. Crossview- gs: Cross-view gaussian splatting for large-scale scene recon- struction. arXiv preprint arXiv:2501.01695, 2025. 2
-
[52]
Bird- nerf: Fast neural reconstruction of large-scale scenes from aerial imagery
Huiqing Zhang, Yifei Xue, Ming Liao, and Yizhen Lao. Bird- nerf: Fast neural reconstruction of large-scale scenes from aerial imagery. Scientific Reports, 15(1):37295, 2025. 2
work page 2025
-
[53]
Drone-assisted road gaussian splatting with cross-view uncertainty.arXiv preprint arXiv:2408.15242,
Saining Zhang, Baijun Ye, Xiaoxue Chen, Yuantao Chen, Zongzheng Zhang, Cheng Peng, Yongliang Shi, and Hao Zhao. Drone-assisted road gaussian splatting with cross-view uncertainty. arXiv preprint arXiv:2408.15242, 2024. 3 Sky2Ground: A Benchmark for Site Modeling under Varying Altitude Supplementary Material This supplementary document provides additional ...
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.