CM-EVS: Sparse Panoramic RGB-D-Pose Data for Complete Scene Coverage

Haoran Huang; Jiahuan Zhang; Jiale Liu; Jieming Yu; Jungang Li; Kaifeng Ding; Lidong Chen; Mingjun Cheng; Shan Huang; Shunwen Bai

arXiv: 2605.15597 · v1 · pith:DGMQR6TDnew · submitted 2026-05-15 · cs.CV · cs.GR · cs.LG · cs.RO

Jiale Liu, Jungang Li, Jieming Yu, Xinglin Yu, Zihao Dongfang, Zongjian Ding, Kaifeng Ding, Yi Yang, Lidong Chen, Yang Zou, Shunwen Bai, Jiahuan Zhang, Haoran Huang, Shan Huang, Yudong Gao, Mingjun Cheng

Pith reviewed 2026-05-20 19:21 UTC · model grok-4.3

keywords: panoramic RGB-D, viewpoint curation, scene coverage, ERP warping, sparse 3D dataset, indoor scenes, geometry-consistent learning, depth conflict avoidance

The pith

COVER selects sparse panoramic RGB-D-pose frames from 3D assets that achieve complete scene coverage with low redundancy under bounded approximation error.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper addresses the lack of sparse, geometry-consistent panoramic training data for 3D visual learning by converting existing meshes and scans into curated equirectangular (ERP) views. It introduces COVER, a training-free method that projects geometry from already selected views into candidate probes, scores the added coverage, and penalizes depth conflicts. A sympathetic reader would care because dense trajectories waste computation on redundant nearby views, while sparse heuristics often leave gaps or produce inconsistent depths. If the approach holds, it yields an auditable dataset with a median of only 25 frames per indoor scene that still covers all 13 unified room types across 1,275 indoor scenes. The work also supplies the resulting CM-EVS collection, with full-sphere RGB, metric range depth, and pose for both indoor and outdoor assets.

Core claim

COVER (Coverage-Oriented Viewpoint curation with ERP Range-depth warping) projects observed geometry from selected views into candidate ERP probes, computes incremental coverage scores, and penalizes depth conflicts. Under the assumption of bounded proxy error, its greedy coverage proxy preserves the standard coverage-style approximation behavior up to an additive error term. Applying this curator to Blender indoor, HM3D, and ScanNet++ produces the 36,373 curated ERP frames from 1,275 indoor scenes that make up the core of CM-EVS; outdoor panoramas from TartanGround and OB3D are re-encoded into the same schema. Each frame supplies calibrated panoramic RGB, metric range depth, and pose, and the COVER-produced indoor frames also carry per-step provenance logs.

What carries the argument

COVER, the Coverage-Oriented Viewpoint curation with ERP Range-depth warping procedure, which works by projecting geometry observed from already selected views into candidate equirectangular probes to score incremental coverage while penalizing depth inconsistencies.
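
The selection loop described above can be sketched on a toy problem. This is a minimal illustration under stated assumptions, not the authors' implementation: each candidate probe is assumed to come with a precomputed set of surface-element ids it would observe and a scalar conflict value in [0, 1], and the score is assumed to be the incremental coverage fraction minus λ times the conflict. The function name `greedy_curate`, the score form, the stop rule (halt when the best candidate's gain falls below τ), the log fields, and the toy data are all hypothetical; λ = 0.35 and the 1% early-stop threshold are the defaults quoted in the paper's figure captions.

```python
from typing import Dict, List, Set, Tuple


def greedy_curate(
    visible: Dict[str, Set[int]],
    conflict: Dict[str, float],
    total_elements: int,
    lam: float = 0.35,
    tau: float = 0.01,
    max_views: int = 30,
) -> Tuple[List[str], List[dict]]:
    """Toy conflict-aware greedy selection (assumed score: gain - lam * conflict)."""
    covered: Set[int] = set()
    selected: List[str] = []
    log: List[dict] = []  # hypothetical stand-in for a per-step provenance log
    remaining = set(visible)
    while remaining and len(selected) < max_views:
        best_name, best_score, best_gain = None, float("-inf"), 0.0
        for name in sorted(remaining):  # sorted for deterministic tie-breaking
            gain = len(visible[name] - covered) / total_elements
            score = gain - lam * conflict[name]
            if score > best_score:
                best_name, best_score, best_gain = name, score, gain
        if best_gain < tau:  # assumed early stop: marginal gain below tau
            break
        covered |= visible[best_name]
        remaining.discard(best_name)
        selected.append(best_name)
        log.append({"step": len(selected), "view": best_name,
                    "gain": best_gain, "score": best_score})
    return selected, log


# Toy scene with 10 surface elements; "glass" sees a lot but conflicts heavily.
TOY_VISIBLE = {
    "hall": {0, 1, 2, 3},
    "hall_dup": {0, 1, 2},
    "kitchen": {4, 5, 6},
    "corner": {7},
    "glass": {3, 4, 5, 6, 7, 8, 9},
}
TOY_CONFLICT = {"hall": 0.0, "hall_dup": 0.0, "kitchen": 0.0,
                "corner": 0.0, "glass": 0.9}

if __name__ == "__main__":
    for lam in (0.0, 0.35):
        views, _ = greedy_curate(TOY_VISIBLE, TOY_CONFLICT, 10, lam=lam)
        print(lam, views)
```

On this toy input, coverage-only selection (λ = 0) takes the high-conflict view first, whereas λ = 0.35 skips it and also never picks the redundant near-duplicate, which is the qualitative trade-off the paper describes.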

If this is right

CM-EVS supplies 36,373 curated ERP frames from 1,275 indoor scenes, with a median of 25 frames per indoor scene.
The collection covers all 13 unified room types while keeping low redundancy.
Indoor frames include per-step provenance logs that record how each view was chosen.
Experiments demonstrate a better coverage-conflict trade-off than prior heuristics.
The same schema works for re-encoded outdoor panoramas from TartanGround and OB3D.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The curation approach could be tested on larger or dynamic scenes to check whether the bounded-error condition continues to hold.
If the proxy remains reliable, similar greedy selection might be applied to other spherical or multi-view representations beyond ERP.
The resulting compact sets could support ablation studies that isolate the effect of view density on downstream panoramic 3D tasks.

Load-bearing premise

The error introduced by projecting geometry from selected views into candidate ERP probes remains bounded.
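
To make the role of this premise concrete, the textbook greedy argument for monotone submodular maximization can be extended to a noisy gain oracle. The derivation below is a sketch under assumptions that are not taken from the paper: the true coverage function f is monotone submodular with f(∅) = 0, the proxy marginal gain is uniformly ε-accurate, v_t denotes the view that maximizes the proxy gain at step t, S_t is the greedy set after t steps, and S* is the best set of K views. The constants in the paper's own statement may differ, and the conflict penalty is ignored here.

```latex
% Assumptions (not from the paper): f monotone submodular, f(\emptyset) = 0,
% and the proxy gain \hat\Delta is uniformly \varepsilon-accurate.
\begin{align*}
&\bigl|\hat\Delta(v \mid S) - \Delta(v \mid S)\bigr| \le \varepsilon
  \quad\Longrightarrow\quad
  \Delta(v_t \mid S_{t-1}) \ge \max_{v}\Delta(v \mid S_{t-1}) - 2\varepsilon, \\
&f(S^\ast) - f(S_{t-1}) \le K \max_{v}\Delta(v \mid S_{t-1})
  \quad\text{(monotonicity and submodularity)}, \\
&\delta_t := f(S^\ast) - f(S_t)
  \;\le\; \Bigl(1 - \tfrac{1}{K}\Bigr)\delta_{t-1} + 2\varepsilon, \\
&f(S_K) \;\ge\; \Bigl(1 - \tfrac{1}{e}\Bigr) f(S^\ast) - 2K\varepsilon .
\end{align*}
```

In this worst-case form the additive term grows linearly with the budget K, so the guarantee is only informative when ε is small relative to f(S*)/K.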

What would settle it

A direct measurement on a held-out scene where the greedy coverage ratio deviates from the standard approximation guarantee by more than the stated additive error term after the proxy projections are applied.
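
A measurement of this kind can be prototyped on a synthetic instance small enough to brute-force. The sketch below assumes a plain set-coverage objective, a proxy gain equal to the true gain plus noise drawn uniformly from [-ε, ε], and the worst-case bound (1 - 1/e)·OPT - 2Kε from the standard noisy-greedy argument; the paper's own error term may differ. All function names and instance sizes are hypothetical.

```python
import itertools
import math
import random
from typing import List, Sequence, Set, Tuple


def coverage(sets: List[Set[int]], chosen: Sequence[int]) -> int:
    """Number of universe elements covered by the chosen set indices."""
    covered: Set[int] = set()
    for i in chosen:
        covered |= sets[i]
    return len(covered)


def noisy_greedy(sets: List[Set[int]], k: int, eps: float,
                 rng: random.Random) -> List[int]:
    """Greedy selection that only sees gains perturbed by bounded noise."""
    chosen: List[int] = []
    for _ in range(k):
        base = coverage(sets, chosen)
        best_i, best_val = -1, -math.inf
        for i in range(len(sets)):
            if i in chosen:
                continue
            true_gain = coverage(sets, chosen + [i]) - base
            proxy_gain = true_gain + rng.uniform(-eps, eps)  # |error| <= eps
            if proxy_gain > best_val:
                best_i, best_val = i, proxy_gain
        chosen.append(best_i)
    return chosen


def brute_force_opt(sets: List[Set[int]], k: int) -> int:
    """Exact optimum over all k-subsets (feasible only for tiny instances)."""
    return max(coverage(sets, c)
               for c in itertools.combinations(range(len(sets)), k))


def check_bound(seed: int = 0, n_sets: int = 10, universe: int = 40,
                k: int = 4, eps: float = 0.5) -> Tuple[int, int, float]:
    """Return (achieved coverage, optimal coverage, worst-case lower bound)."""
    rng = random.Random(seed)
    sets = [set(rng.sample(range(universe), rng.randint(3, 12)))
            for _ in range(n_sets)]
    achieved = coverage(sets, noisy_greedy(sets, k, eps, rng))
    opt = brute_force_opt(sets, k)
    bound = (1 - 1 / math.e) * opt - 2 * k * eps
    return achieved, opt, bound


if __name__ == "__main__":
    for seed in range(5):
        print(check_bound(seed))
```

A violation of `achieved >= bound` on real scenes would indicate that the actual proxy error is not uniformly bounded by the assumed ε, which is the failure mode the premise rules out.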

Figures

Figures reproduced from arXiv: 2605.15597 by Haoran Huang, Jiahuan Zhang, Jiale Liu, Jieming Yu, Jungang Li, Kaifeng Ding, Lidong Chen, Mingjun Cheng, Shan Huang, Shunwen Bai, Xinglin Yu, Yang Zou, Yi Yang, Yudong Gao, Zihao Dongfang, Zongjian Ding.

**Figure 2.** COVER’s three-phase per-scene pipeline (Algorithm 1). Each iteration emits one ERP RGB-depth-pose frame plus its per-step provenance log. Adjacent paper text (truncated): … source trajectories, not curator-selected subsets, so they do not carry the per-step provenance log. Per-source detail is in …

**Figure 3.** CM-EVS: (a) multi-view 4π coverage, (b) RGB + metric range depth + pose under one schema, (c) scene-type diversity across 13 unified buckets, and (d) low redundancy at scale.

**Figure 4.** Representative ERP frames per source. Each frame ships three modalities: RGB, range …

**Figure 5.** Room-type composition across CM-EVS and five baselines (13-bucket taxonomy). The …

**Figure 6.** Cross-source curator behavior (λ = 0.35, τ = 1% early stop). (a) Per-step coverage gain G_t. (b) Per-scene frame-count distribution per source. Adjacent paper text (§5, Curator analysis, truncated): We empirically study the curator’s behavior along three axes: how it compares to data-free and coverage-only baselines under a fixed budget (§5.1), how it responds to the conflict-weight λ (§5.2), and whether the same code path generalizes across o…

**Figure 7.** Selection diversity vs. λ on a Blender multi-scene pool (K = 30): (a) unique scene prefixes hit, (b) per-prefix allocation, (c) pairwise Jaccard similarity, (d) diversity vs. coverage. Adjacent paper text (truncated): … feasible candidates suffice and greedy selection saturates quickly. HM3D carries a substantially higher conflict prior (0.0713 versus 0.0175 on Blender indoor and 0.0103 on ScanNet++), consistent with noisier real-scan geome…

**Figure 8.** Selection geometry on Blender (K = 30). λ = 0 collapses to a tight off-centre cluster inside the candidate pool; λ = 0.2 partially spreads; COVER’s default λ = 0.35 covers the pool.

**Figure 9.** Expanded overview of COVER: concretisation of the conflict-aware warping oracle and the per-step state update, complementing …

**Figure 10.** Full selection geometry across three curator-source examples at …

**Figure 11.** Per-step marginal coverage gain G_t across selection steps. The curves saturate at scene-specific rates; the production threshold τ = 1% (dashed) defines the gain-gradient early stop.

**Figure 12.** Per-source range-depth distribution (violin) on released frames. Width reflects density at …

**Figure 13.** Six COVER-selected viewpoints on a Blender indoor residential scene, spanning three functional zones; positions overlaid on the scene’s accumulated point cloud (right).

**Figure 14.** Low-redundancy selection on an open-plan office.

**Figure 15.** Audited 50-bad-case gallery (Blender top, HM3D middle, ScanNet++ bottom). Each cell …
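
The Figure 7 caption lists pairwise Jaccard similarity as one of its redundancy measures. A minimal sketch of that statistic follows, assuming each selected view is summarized by the set of surface-element ids it observes; this page does not say which sets the paper actually compares, so the representation, the function names, and the toy data are hypothetical.

```python
from itertools import combinations
from typing import Dict, Set


def jaccard(a: Set[int], b: Set[int]) -> float:
    """Intersection over union of two sets; defined as 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def mean_pairwise_jaccard(views: Dict[str, Set[int]]) -> float:
    """Average Jaccard similarity over all unordered pairs of views."""
    pairs = list(combinations(sorted(views), 2))
    if not pairs:
        return 0.0
    return sum(jaccard(views[p], views[q]) for p, q in pairs) / len(pairs)


# Toy comparison: a dense trajectory with overlapping neighbours
# versus a sparse selection with almost disjoint views.
DENSE = {"a": {0, 1, 2, 3}, "b": {1, 2, 3, 4}, "c": {2, 3, 4, 5}}
SPARSE = {"a": {0, 1, 2, 3}, "b": {4, 5, 6, 7}, "c": {3, 8, 9}}

if __name__ == "__main__":
    print(mean_pairwise_jaccard(DENSE), mean_pairwise_jaccard(SPARSE))
```

Lower values indicate less overlap between views, which is the sense in which a sparse selection is "low redundancy".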

Original abstract

Modern 3D visual learning relies on observations sampled from metric 3D assets, yet existing scans, meshes, point clouds, simulations, and reconstructions do not directly provide a sparse, comparable, and geometry-consistent panoramic training interface. Dense trajectories duplicate nearby views, source-specific rendering policies yield heterogeneous annotations, and sparse heuristics may miss important regions or introduce depth-inconsistent observations. We study how to convert 3D assets into sparse panoramic RGB-D-pose data that preserves complete scene coverage with low redundancy and auditable provenance. We propose COVER (Coverage-Oriented Viewpoint curation with ERP Range-depth warping), a training-free ERP viewpoint curator that projects geometry observed from selected views into candidate ERP probes, scores incremental coverage, and penalizes depth conflicts. Under bounded proxy error, its greedy coverage proxy preserves the standard coverage-style approximation behavior up to an additive error term. Using COVER, we build CM-EVS (Coverage-curated Metric ERP View Set), a panoramic RGB-D-pose dataset with 36,373 curated ERP frames from 1,275 indoor scenes across Blender indoor, HM3D, and ScanNet++, complemented by outdoor panoramas from TartanGround and OB3D re-encoded into the same schema. Each frame provides full-sphere RGB, metric range depth, calibrated pose; COVER-produced indoor frames include per-step provenance logs. With a median of only 25 frames per indoor scene, CM-EVS covers all 13 unified room types while maintaining compact scene-level coverage. Experiments show that COVER improves the coverage-conflict trade-off, making CM-EVS a sparse, compact, and auditable RGB-D-pose resource for geometry-consistent panoramic 3D learning.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note: private letter to a colleague

The paper gives a new curation method and sparse panoramic dataset that could be handy for 3D learning, but the approximation guarantee rests on an unverified bounded proxy error.

The main takeaway is that this work supplies a training-free curator called COVER and the CM-EVS dataset of 36k sparse panoramic RGB-D-pose frames drawn from public indoor and outdoor sources. With a median of 25 frames per scene it claims complete coverage of 13 room types while keeping redundancy low and adding provenance logs for the indoor parts. Each frame includes full-sphere RGB, metric range depth, and calibrated pose, all re-encoded into one consistent schema.

That addresses a real pain point for people who need geometry-consistent panoramic training data instead of dense trajectories or mismatched renders. The specific mix of ERP range-depth warping, incremental coverage scoring, and depth-conflict penalization looks like the novel piece relative to earlier viewpoint heuristics. The dataset itself is presented as a new resource that could standardize experiments in panoramic 3D learning.

The practical side is done reasonably well. They pull from Blender indoor, HM3D, ScanNet++, TartanGround, and OB3D, then apply the same curation pipeline so the output is comparable across scenes. The compact size and coverage claims are concrete and could save others time if the frames hold up.

The soft spot is the central guarantee. The abstract says that under bounded proxy error the greedy selector preserves standard coverage approximation up to an additive term, yet there is no derivation of the bound, no check on how the error behaves with scene scale, depth jumps, or ERP distortion, and no ablation on the conflict penalty. Without those details the additive term could accumulate rather than stay controlled. The experiments report a better coverage-conflict trade-off, but the lack of error bars or sensitivity tests leaves the strength of that result unclear.

This paper is for vision researchers who train models on panoramic or 360 data and want a ready-made sparse, auditable set rather than raw scans. A reader who needs consistent indoor-outdoor frames for experiments would find it useful. It has enough concrete output to deserve a serious referee, even though the theory section needs more work to back the guarantee. I would recommend sending it to peer review.

Referee Report

2 major / 0 minor

Summary. The paper introduces COVER, a training-free ERP viewpoint curator that projects geometry from selected views into candidate ERP probes, scores incremental coverage, and penalizes depth conflicts. Under a bounded proxy error assumption, its greedy coverage proxy is claimed to preserve standard coverage-style approximation behavior up to an additive error term. Using COVER, the authors construct the CM-EVS dataset comprising 36,373 curated panoramic RGB-D-pose frames from 1,275 indoor scenes (Blender indoor, HM3D, ScanNet++) plus outdoor panoramas, achieving complete coverage of 13 room types with a median of 25 frames per scene while maintaining low redundancy and auditable provenance.

Significance. If the bounded proxy error holds and the approximation guarantee can be verified, the work supplies a sparse, compact, geometry-consistent, and provenance-auditable panoramic RGB-D-pose resource that directly addresses redundancy, heterogeneity, and coverage gaps in existing 3D assets. The training-free construction from public sources and explicit coverage-conflict trade-off experiments constitute clear strengths for downstream 3D visual learning tasks.

major comments (2)

[Abstract] The central claim that 'under bounded proxy error, its greedy coverage proxy preserves the standard coverage-style approximation behavior up to an additive error term' is load-bearing for the method's theoretical contribution, yet the manuscript provides no derivation of an explicit bound on the proxy error incurred when projecting geometry from selected views into candidate ERP probes, no error-bar analysis, and no ablation isolating the depth-conflict penalty term.
[Abstract] The assumption that proxy error remains bounded independently of scene scale, depth discontinuities, and ERP distortion is required for the approximation guarantee to hold, but the text does not supply a concrete test or worst-case analysis showing the additive term stays controlled rather than accumulating with the number of selected views.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the detailed and constructive review. We appreciate the recognition of the dataset's value for 3D visual learning and the identification of areas where the theoretical claims require stronger support. We address each major comment below and outline specific revisions to the manuscript.

Referee: [Abstract] The central claim that 'under bounded proxy error, its greedy coverage proxy preserves the standard coverage-style approximation behavior up to an additive error term' is load-bearing for the method's theoretical contribution, yet the manuscript provides no derivation of an explicit bound on the proxy error incurred when projecting geometry from selected views into candidate ERP probes, no error-bar analysis, and no ablation isolating the depth-conflict penalty term.

Authors: We agree that the central claim requires explicit support and that the abstract alone does not supply the requested derivation, error-bar analysis, or ablation. The full manuscript discusses the proxy error arising from ERP range-depth warping but does not isolate the bound or perform the suggested analyses. In the revised version we will add a new subsection deriving an explicit bound on the proxy error under the stated assumptions, include error-bar plots quantifying the additive term, and report an ablation that isolates the depth-conflict penalty's contribution to coverage scoring. (Revision: yes)
Referee: [Abstract] The assumption that proxy error remains bounded independently of scene scale, depth discontinuities, and ERP distortion is required for the approximation guarantee to hold, but the text does not supply a concrete test or worst-case analysis showing the additive term stays controlled rather than accumulating with the number of selected views.

Authors: We acknowledge that the boundedness assumption must be substantiated with concrete tests rather than left implicit. The current text states the assumption without worst-case analysis or scaling experiments. We will revise the manuscript to include a worst-case analysis of the additive error term under varying scene scales, depth discontinuities, and ERP distortion, together with empirical plots showing that the term remains controlled and does not accumulate unboundedly as the number of selected views increases. (Revision: yes)

Circularity Check

0 steps flagged

Coverage guarantee stated conditionally on unverified assumption; no reduction to inputs by construction

The paper presents COVER as a training-free method that projects geometry from selected views into ERP probes and uses a greedy coverage proxy. The key claim is that under bounded proxy error this proxy preserves standard coverage-style approximation up to an additive term. No equations in the provided text reduce this guarantee to a fitted parameter, self-citation chain, or definitional equivalence. The bounded-error assumption is explicitly stated as a precondition rather than derived from the method itself, and the dataset construction draws from public sources without introducing fitted predictions renamed as results. This keeps the derivation self-contained against external benchmarks, warranting only a minor score for the unproven bound rather than circularity.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

The central claim rests on the existence of accurate metric 3D assets from Blender, HM3D, ScanNet++, TartanGround and OB3D, plus the assumption that ERP projection preserves geometry sufficiently for coverage scoring. No free parameters or invented physical entities are introduced.

axioms (1)

Domain assumption: Existing 3D assets provide accurate metric geometry that can be projected into ERP without significant distortion for coverage purposes.
Invoked when describing how COVER projects observed geometry from selected views into candidate ERP probes.
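
The projection step this assumption refers to can be illustrated with a small equirectangular warp. The sketch below lifts a source pixel with range depth to a 3D point and reprojects it into a probe at another position. The conventions are assumptions, not taken from the paper: y is up, longitude 0 points along +z, pixel centres sit at half-integer offsets, and both cameras are axis-aligned (identity rotation). Function names are hypothetical, and a probe placed exactly on the lifted point (zero range) is not handled.

```python
import math
from typing import Tuple

Vec3 = Tuple[float, float, float]


def pixel_to_direction(u: float, v: float, width: int, height: int) -> Vec3:
    """Unit ray for ERP pixel (u, v) under the assumed y-up convention."""
    lon = (u + 0.5) / width * 2.0 * math.pi - math.pi
    lat = math.pi / 2.0 - (v + 0.5) / height * math.pi
    return (math.cos(lat) * math.sin(lon),
            math.sin(lat),
            math.cos(lat) * math.cos(lon))


def direction_to_pixel(d: Vec3, width: int, height: int) -> Tuple[float, float]:
    """Inverse of pixel_to_direction for any non-zero vector d."""
    x, y, z = d
    r = math.sqrt(x * x + y * y + z * z)
    lon = math.atan2(x, z)
    lat = math.asin(max(-1.0, min(1.0, y / r)))  # clamp against rounding
    u = (lon + math.pi) / (2.0 * math.pi) * width - 0.5
    v = (math.pi / 2.0 - lat) / math.pi * height - 0.5
    return u, v


def warp_pixel(u: float, v: float, range_m: float, src_pos: Vec3,
               dst_pos: Vec3, width: int, height: int
               ) -> Tuple[float, float, float]:
    """Lift a source pixel with range depth to 3D and reproject into a probe.

    Range depth is the Euclidean distance along the ray, so the 3D point is
    simply src_pos + range_m * direction. Returns (u, v, range) in the probe.
    """
    dx, dy, dz = pixel_to_direction(u, v, width, height)
    point = (src_pos[0] + range_m * dx,
             src_pos[1] + range_m * dy,
             src_pos[2] + range_m * dz)
    rel = (point[0] - dst_pos[0], point[1] - dst_pos[1], point[2] - dst_pos[2])
    new_range = math.sqrt(sum(c * c for c in rel))
    new_u, new_v = direction_to_pixel(rel, width, height)
    return new_u, new_v, new_range


if __name__ == "__main__":
    # A point 2 m straight ahead, seen from a probe 1 m closer along +z.
    print(warp_pixel(255.5, 127.5, 2.0, (0.0, 0.0, 0.0), (0.0, 0.0, 1.0), 512, 256))
```

A conflict test would then compare the warped range against the depth the probe itself observes at that pixel; the tolerance and the exact rule the paper uses are not given on this page.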

Reference graph

Works this paper leans on

56 extracted references · 56 canonical work pages · 2 internal anchors

[1]

Xu Zheng, Chenfei Liao, Ziqiao Weng, Kaiyu Lei, Zihao Dongfang, Haocong He, Yuanhuiyi Lyu, Lutao Jiang, Lu Qi, Li Chen, et al. Panorama: The rise of omnidirectional vision in the 9 1.0 0.5 0.0 0.5 1.0 (a) =0 (coverage-only greedy) 0.75 0.50 0.25 0.00 0.25 0.50 0.75 1.0 0.5 0.0 0.5 1.0 (b) =0.2 (coverage peak) 0.75 0.50 0.25 0.00 0.25 0.50 0.75 1.0 0.5 0.0...

work page arXiv 2025
[2]

PanoFormer: Panorama transformer for indoor 360 ◦ depth estimation

Zhijie Shen, Chunyu Lin, Kang Liao, Lang Nie, Zishuo Zheng, and Yao Zhao. PanoFormer: Panorama transformer for indoor 360 ◦ depth estimation. InProceedings of the European Conference on Computer Vision (ECCV), 2022

work page 2022
[3]

PERF: Panoramic neural radiance field from a single panorama.IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2024

Guangcong Wang, Peng Wang, Zhaoxi Chen, Wenping Wang, Chen Change Loy, and Ziwei Liu. PERF: Panoramic neural radiance field from a single panorama.IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2024

work page 2024
[4]

360DVD: Controllable panorama video generation with 360-degree video diffusion model

Qian Wang, Weiqi Li, Chong Mou, Xinhua Cheng, and Jian Zhang. 360DVD: Controllable panorama video generation with 360-degree video diffusion model. InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2024

work page 2024
[5]

Pano3D: A holis- tic benchmark and a solid baseline for 360◦ depth estimation

Georgios Albanis, Nikolaos Zioulis, Petros Drakoulis, Vasileios Gkitsas, Vladimiros Sterzentsenko, Federico Alvarez, Dimitrios Zarpalas, and Petros Daras. Pano3D: A holis- tic benchmark and a solid baseline for 360◦ depth estimation. InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), 2021

work page 2021
[6]

OmniPhotos: Casual 360◦ VR photography

Tobias Bertel, Mingze Yuan, Reuben Lindroos, and Christian Richardt. OmniPhotos: Casual 360◦ VR photography. InACM Transactions on Graphics (Proc. SIGGRAPH Asia), volume 39. ACM, 2020

work page 2020
[7]

Matrix-3D: Omnidirectional explorable 3D world generation

Zhongqi Yang, Wenhang Ge, Yuqi Li, Jiaqi Chen, Haoyuan Li, Mengyin An, Fei Kang, Hua Xue, Baixin Xu, Yuyang Yin, et al. Matrix-3D: Omnidirectional explorable 3D world generation. arXiv preprint, 2025. Skywork Matrix-3D

work page 2025
[8]

Susskind

Mike Roberts, Jason Ramapuram, Anurag Ranjan, Atulit Kumar, Miguel Angel Bautista, Nathan Paczan, Russ Webb, and Joshua M. Susskind. Hypersim: A photorealistic synthetic dataset for holistic indoor scene understanding. InProceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), 2021

work page 2021
[9]

Structured3D: A large photo-realistic dataset for structured 3D modeling

Jia Zheng, Junfei Zhang, Jing Li, Rui Tang, Shenghua Gao, and Zihan Zhou. Structured3D: A large photo-realistic dataset for structured 3D modeling. InProceedings of the European Conference on Computer Vision (ECCV), 2020

work page 2020
[10]

ScanNet++: A high-fidelity dataset of 3D indoor scenes

Chandan Yeshwanth, Yueh-Cheng Liu, Matthias Nießner, and Angela Dai. ScanNet++: A high-fidelity dataset of 3D indoor scenes. InProceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), 2023

work page 2023
[11]

Matterport3D: Learning from RGB-D data in indoor environments

Angel Chang, Angela Dai, Thomas Funkhouser, Maciej Halber, Matthias Nießner, Manolis Savva, Shuran Song, Andy Zeng, and Yinda Zhang. Matterport3D: Learning from RGB-D data in indoor environments. InInternational Conference on 3D Vision (3DV), 2017

work page 2017
[12]

Ramakrishnan, Aaron Gokaslan, Erik Wijmans, Oleksandr Maksymets, Alex Clegg, John Turner, Eric Undersander, Wojciech Galuba, Andrew Westbury, Angel X

Santhosh K. Ramakrishnan, Aaron Gokaslan, Erik Wijmans, Oleksandr Maksymets, Alex Clegg, John Turner, Eric Undersander, Wojciech Galuba, Andrew Westbury, Angel X. Chang, Manolis Savva, Yili Zhao, and Dhruv Batra. Habitat-matterport 3D dataset (HM3D): 1000 large-scale 3D environments for embodied AI. InProceedings of the NeurIPS Track on Datasets and Bench...

work page 2021
[13]

TartanGround: A large-scale dataset for ground robot perception and navigation

Manthan Patel, Fan Yang, Yuheng Qiu, Cesar Cadena, Sebastian Scherer, Marco Hutter, and Wenshan Wang. TartanGround: A large-scale dataset for ground robot perception and navigation. arXiv:2505.10696, 2025

work page arXiv 2025
[14]

OB3D: A new dataset for benchmarking omnidirectional 3D reconstruction using Blender

Shintaro Ito, Natsuki Takama, Toshiki Watanabe, Koichi Ito, Hwann-Tzong Chen, and Takafumi Aoki. OB3D: A new dataset for benchmarking omnidirectional 3D reconstruction using Blender. arXiv:2505.20126, 2025

work page arXiv 2025
[15]

Holo360D: A Large-Scale Real-World Dataset with Continuous Trajectories for Advancing Panoramic 3D Reconstruction and Beyond

Jing Ou, Zidong Cao, Yinrui Ren, Zhuoxiao Li, Jinjing Zhu, Tongyan Hua, Shuai Zhang, Hui Xiong, and Wufan Zhao. Holo360d: A large-scale real-world dataset with continuous trajectories for advancing panoramic 3d reconstruction and beyond.arXiv preprint arXiv:2604.22482, 2026

work page internal anchor Pith review Pith/arXiv arXiv 2026
[16]

PanoGRF: Generalizable spherical radiance fields for wide-baseline panoramas

Zheng Chen, Yan-Pei Cao, Yuan-Chen Guo, Chen Wang, Ying Shan, and Song-Hai Zhang. PanoGRF: Generalizable spherical radiance fields for wide-baseline panoramas. InAdvances in Neural Information Processing Systems (NeurIPS), 2023

work page 2023
[17]

DreamScene360: Unconstrained text-to-3D scene generation with panoramic Gaussian splatting

Shijie Zhou, Zhiwen Fan, Dejia Xu, Haoran Chang, Pradyumna Chari, Tejas Bharadwaj, Suya You, Zhangyang Wang, and Achuta Kadambi. DreamScene360: Unconstrained text-to-3D scene generation with panoramic Gaussian splatting. InProceedings of the European Conference on Computer Vision (ECCV), 2024

work page 2024
[18]

MVDiffu- sion: Enabling holistic multi-view image generation with correspondence-aware diffusion

Shitao Tang, Fuyang Zhang, Jiacheng Chen, Peng Wang, and Yasutaka Furukawa. MVDiffu- sion: Enabling holistic multi-view image generation with correspondence-aware diffusion. In Advances in Neural Information Processing Systems (NeurIPS), 2023

work page 2023
[19]

Irving Vasquez-Gomez, L

J. Irving Vasquez-Gomez, L. Enrique Sucar, Rafael Murrieta-Cid, and Efrain Lopez-Damian. V olumetric next-best-view planning for 3D object reconstruction with positioning error.Inter- national Journal of Advanced Robotic Systems, 11(10), 2014

work page 2014
[20]

SCVP: Learning one-shot view planning via set covering for unknown object reconstruction

Sicong Pan, Hao Hu, and Hui Wei. SCVP: Learning one-shot view planning via set covering for unknown object reconstruction. InIEEE Robotics and Automation Letters / Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2022

work page 2022
[21]

ActiveNeRF: Learning where to see with uncertainty estimation

Xuran Pan, Zihang Lai, Shiji Song, and Gao Huang. ActiveNeRF: Learning where to see with uncertainty estimation. InProceedings of the European Conference on Computer Vision (ECCV), 2022

work page 2022
[22]

NeurAR: Neural uncertainty for autonomous 3D reconstruction with implicit neural representations.IEEE Robotics and Automation Letters, 8(2):1125–1132, 2023

Yunlong Ran, Jing Zeng, Shibo He, Jiming Chen, Lincheng Li, Yingfeng Chen, Gimhee Lee, and Qi Ye. NeurAR: Neural uncertainty for autonomous 3D reconstruction with implicit neural representations.IEEE Robotics and Automation Letters, 8(2):1125–1132, 2023

work page 2023
[23]

GenNBV: Generalizable next-best-view policy for active 3D reconstruction

Xiao Chen, Quanyi Li, Tai Wang, Tianfan Xue, and Jiangmiao Pang. GenNBV: Generalizable next-best-view policy for active 3D reconstruction. InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2024

work page 2024
[24]

Datasheets for datasets.Communications of the ACM, 64(12):86–92, 2021

Timnit Gebru, Jamie Morgenstern, Briana Vecchione, Jennifer Wortman Vaughan, Hanna Wallach, Hal Daumé III, and Kate Crawford. Datasheets for datasets.Communications of the ACM, 64(12):86–92, 2021

work page 2021
[25]

Croissant: A metadata format for ML-ready datasets.Advances in Neural Information Processing Systems, 37:82133–82148, 2024

Mubashara Akhtar, Omar Benjelloun, Costanza Conforti, Luca Foschini, Pieter Gijsbers, Joan Giner-Miguelez, Sujata Goswami, Nitisha Jain, Michalis Karamousadakis, Satyapriya Krishna, et al. Croissant: A metadata format for ML-ready datasets.Advances in Neural Information Processing Systems, 37:82133–82148, 2024

work page 2024
[26]

Richard M. Karp. Reducibility among combinatorial problems. InComplexity of Computer Computations, pages 85–103. Plenum Press, 1972

work page 1972
[27]

A threshold oflnn for approximating set cover.Journal of the ACM, 45(4):634–652, 1998

Uriel Feige. A threshold oflnn for approximating set cover.Journal of the ACM, 45(4):634–652, 1998

work page 1998
[28]

Nemhauser, Laurence A

George L. Nemhauser, Laurence A. Wolsey, and Marshall L. Fisher. An analysis of approxima- tions for maximizing submodular set functions—I.Mathematical Programming, 14(1):265–294, 1978. 11

work page 1978
[29]

Submodular function maximization

Andreas Krause and Daniel Golovin. Submodular function maximization. In Lucas Bordeaux, Youssef Hamadi, and Pushmeet Kohli, editors,Tractability: Practical Approaches to Hard Problems, pages 71–104. Cambridge University Press, 2014

work page 2014
[30]

Submodular optimization under noise

Avinatan Hassidim and Yaron Singer. Submodular optimization under noise. InProceedings of the 2017 Conference on Learning Theory (COLT), volume 65 ofProceedings of Machine Learning Research, pages 1069–1122. PMLR, 2017

work page 2017
[31]

Streaming submodular maximization: Massive data summarization on the fly

Ashwinkumar Badanidiyuru, Baharan Mirzasoleiman, Amin Karbasi, and Andreas Krause. Streaming submodular maximization: Massive data summarization on the fly. InProceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD), 2014

work page 2014
[32]

Streaming non-monotone submodular maximization: Personalized video summarization on the fly

Baharan Mirzasoleiman, Stefanie Jegelka, and Andreas Krause. Streaming non-monotone submodular maximization: Personalized video summarization on the fly. InProceedings of the AAAI Conference on Artificial Intelligence, 2018

work page 2018
[33]

BlenderProc2: A procedural pipeline for photorealistic rendering.Journal of Open Source Software, 8(82):4901, 2023

Maximilian Denninger, Dominik Winkelbauer, Martin Sundermeyer, Wout Boerdijk, Markus Wendelin Knauer, Klaus H Strobl, Matthias Humt, and Rudolph Triebel. BlenderProc2: A procedural pipeline for photorealistic rendering.Journal of Open Source Software, 8(82):4901, 2023

work page 2023
[34]

Habitat: A platform for embodied AI research

Manolis Savva, Abhishek Kadian, Oleksandr Maksymets, Yili Zhao, Erik Wijmans, Bhavana Jain, Julian Straub, Jia Liu, Vladlen Koltun, Jitendra Malik, Devi Parikh, and Dhruv Batra. Habitat: A platform for embodied AI research. InProceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), 2019

work page 2019
[35]

Light field rendering

Marc Levoy and Pat Hanrahan. Light field rendering. InProceedings of the 23rd Annual Conference on Computer Graphics and Interactive Techniques (SIGGRAPH), pages 31–42, 1996

work page 1996
[36]

Chang, Manolis Savva, Maciej Halber, Thomas Funkhouser, and Matthias Nießner

Angela Dai, Angel X. Chang, Manolis Savva, Maciej Halber, Thomas Funkhouser, and Matthias Nießner. ScanNet: Richly-annotated 3D reconstructions of indoor scenes. InProceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017

work page 2017
[37]

ARKitScenes: A diverse real-world dataset for 3D indoor scene understanding using mobile RGB-D data

Gilad Baruch, Zhuoyuan Chen, Afshin Dehghan, Tal Dimry, Yuri Feigin, Peter Fu, Thomas Gebauer, Brandon Joffe, Daniel Kurz, Arik Schwartz, and Elad Shulman. ARKitScenes: A diverse real-world dataset for 3D indoor scene understanding using mobile RGB-D data. In Proceedings of the NeurIPS Track on Datasets and Benchmarks, 2021

work page 2021
[38]

Julian Straub, Thomas Whelan, Lingni Ma, Yufan Chen, Erik Wijmans, Simon Green, Jakob J. Engel, Raul Mur-Artal, Carl Ren, Shobhit Verma, Anton Clarkson, Mingfei Yan, Brian Budge, Yajie Yan, Xiaqing Pan, June Yon, Yuyang Zou, Kimberly Leon, Nigel Carter, Jesus Briales, Tyler Gillingham, Elias Mueggler, Luis Pesqueira, Manolis Savva, Dhruv Batra, Hauke M. S...

[39] Klaus Greff, Francois Belletti, Lucas Beyer, Carl Doersch, Yilun Du, Daniel Duckworth, David J. Fleet, Dan Gnanapragasam, Florian Golemo, Charles Herrmann, Thomas Kipf, Abhijit Kundu, Dmitry Lagun, Issam Laradji, Hsueh-Ti Derek Liu, Henning Meyer, Yishu Miao, Derek Nowrouzezahrai, Cengiz Oztireli, Etienne Pot, Noha Radwan, Daniel Rebain, Sara Sabour, Mehd...

[40] Alexander Raistrick, Lahav Lipson, Zeyu Ma, Lingjie Mei, Mingzhe Wang, Yiming Zuo, Karhan Kayan, Hongyu Wen, Beining Han, Yihan Wang, Alejandro Newell, Hei Law, Ankit Goyal, Kaiyu Yang, and Jia Deng. Infinite photorealistic worlds using procedural generation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2023.

[41] Huan Fu, Bowen Cai, Lin Gao, Ling-Xiao Zhang, Jiaming Wang, Cao Li, Qixun Zeng, Chengyue Sun, Rongfei Jia, Binqiang Zhao, and Hao Zhang. 3D-FRONT: 3D furnished rooms with layouts and semantics. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), 2021.

[42] Fei Xia, Amir R. Zamir, Zhi-Yang He, Alexander Sax, Jitendra Malik, and Silvio Savarese. Gibson Env: Real-world perception for embodied agents. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2018.

[43] Bokui Shen, Fei Xia, Chengshu Li, Roberto Martín-Martín, Linxi Fan, Guanzhi Wang, Claudia Pérez-D’Arpino, Shyamal Buch, Sanjana Srivastava, Lyne P. Tchapmi, Micael E. Tchapmi, Kent Vainio, Josiah Wong, Li Fei-Fei, and Silvio Savarese. iGibson 1.0: A simulation environment for interactive tasks in large realistic scenes. In Proceedings of the IEEE/RSJ Inter...

[44] Ben Mildenhall, Pratul P. Srinivasan, Matthew Tancik, Jonathan T. Barron, Ravi Ramamoorthi, and Ren Ng. NeRF: Representing scenes as neural radiance fields for view synthesis. In Proceedings of the European Conference on Computer Vision (ECCV), 2020.

[45] Bernhard Kerbl, Georgios Kopanas, Thomas Leimkühler, and George Drettakis. 3D Gaussian splatting for real-time radiance field rendering. ACM Transactions on Graphics (Proc. SIGGRAPH), 42(4), 2023.

[46] Jiayang Bai, Letian Huang, Jie Guo, Wen Gong, Yuanqi Li, and Yanwen Guo. 360-GS: Layout-guided panoramic gaussian splatting for indoor roaming. arXiv:2402.00763, 2024.

[47] Changwoon Choi, Sang Min Kim, and Young Min Kim. Balanced spherical grid for egocentric view synthesis. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2023.

[48] Kai Gu, Thomas Maugey, Sebastian Knorr, and Christine Guillemot. Omni-nerf: neural radiance field from 360 image captures. In Proceedings of the IEEE International Conference on Image Processing (ICIP), 2022.

[49] Huajian Huang, Yingshu Chen, Tianjia Zhang, and Sai-Kit Yeung. 360Roam: Real-time indoor roaming using geometry-aware 360° radiance fields. arXiv:2208.02705, 2022.

[50] Weicai Ye, Chenhao Ji, Zheng Chen, Junyao Gao, Xiaoshui Huang, Song-Hai Zhang, Wanli Ouyang, Tong He, Cairong Zhao, and Guofeng Zhang. DiffPano: Scalable and consistent text to panorama generation with spherical epipolar-aware diffusion. In Advances in Neural Information Processing Systems (NeurIPS), 2024.

[51] Zhaoxi Chen, Guangcong Wang, and Ziwei Liu. Text2Light: Zero-shot text-driven HDR panorama generation. In ACM Transactions on Graphics (Proc. SIGGRAPH Asia), 2022.

[52] Cheng Zhang, Qianyi Wu, Camilo Cruz Gambardella, Xiaoshui Huang, Dinh Phung, Wanli Ouyang, and Jianfei Cai. Taming stable diffusion for text to 360° panorama image generation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2024.

[53] Tianhao Wu, Chuanxia Zheng, and Tat-Jen Cham. PanoDiffusion: 360-degree panorama outpainting via diffusion. In International Conference on Learning Representations (ICLR), 2024.

[54] Cyrus I. Connolly. The determination of next best views. In Proceedings of the IEEE International Conference on Robotics and Automation (ICRA), pages 432–435, 1985.

[55] Nikolaos A. Massios and Robert B. Fisher. A best next view selection algorithm incorporating a quality criterion. In Proceedings of the British Machine Vision Conference (BMVC), 1998.

Appendix Contents

The page limit forces the main body to point to supporting evidence rather than reproduce it. The full Datasheet, the production hyperparameters and geom...

Institutional review board (IRB) approvals or equivalent for research with human subjects

Question: Does the paper describe potential risks incurred by study participants, whether such risks were disclosed to the subjects, and whether Institutional Review Board (IRB) approvals (or an equivalent approval/review based on the requirements of your country or ...

Guidelines: • The answer [N/A] means that the paper does not involve crowdsourcing nor research with human subjects.

[1] Xu Zheng, Chenfei Liao, Ziqiao Weng, Kaiyu Lei, Zihao Dongfang, Haocong He, Yuanhuiyi Lyu, Lutao Jiang, Lu Qi, Li Chen, et al. Panorama: The rise of omnidirectional vision in the ... arXiv, 2025.

[Figure: (a) =0 (coverage-only greedy); (b) =0.2 (coverage peak); ...]

[2] Zhijie Shen, Chunyu Lin, Kang Liao, Lang Nie, Zishuo Zheng, and Yao Zhao. PanoFormer: Panorama transformer for indoor 360° depth estimation. In Proceedings of the European Conference on Computer Vision (ECCV), 2022.

[3] Guangcong Wang, Peng Wang, Zhaoxi Chen, Wenping Wang, Chen Change Loy, and Ziwei Liu. PERF: Panoramic neural radiance field from a single panorama. IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2024.

[4] Qian Wang, Weiqi Li, Chong Mou, Xinhua Cheng, and Jian Zhang. 360DVD: Controllable panorama video generation with 360-degree video diffusion model. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2024.

[5] Georgios Albanis, Nikolaos Zioulis, Petros Drakoulis, Vasileios Gkitsas, Vladimiros Sterzentsenko, Federico Alvarez, Dimitrios Zarpalas, and Petros Daras. Pano3D: A holistic benchmark and a solid baseline for 360° depth estimation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), 2021.

[6] Tobias Bertel, Mingze Yuan, Reuben Lindroos, and Christian Richardt. OmniPhotos: Casual 360° VR photography. In ACM Transactions on Graphics (Proc. SIGGRAPH Asia), volume 39. ACM, 2020.

[7] Zhongqi Yang, Wenhang Ge, Yuqi Li, Jiaqi Chen, Haoyuan Li, Mengyin An, Fei Kang, Hua Xue, Baixin Xu, Yuyang Yin, et al. Matrix-3D: Omnidirectional explorable 3D world generation. arXiv preprint, 2025. Skywork Matrix-3D.

[8] Mike Roberts, Jason Ramapuram, Anurag Ranjan, Atulit Kumar, Miguel Angel Bautista, Nathan Paczan, Russ Webb, and Joshua M. Susskind. Hypersim: A photorealistic synthetic dataset for holistic indoor scene understanding. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), 2021.

[9] Jia Zheng, Junfei Zhang, Jing Li, Rui Tang, Shenghua Gao, and Zihan Zhou. Structured3D: A large photo-realistic dataset for structured 3D modeling. In Proceedings of the European Conference on Computer Vision (ECCV), 2020.

[10] Chandan Yeshwanth, Yueh-Cheng Liu, Matthias Nießner, and Angela Dai. ScanNet++: A high-fidelity dataset of 3D indoor scenes. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), 2023.

[11] Angel Chang, Angela Dai, Thomas Funkhouser, Maciej Halber, Matthias Nießner, Manolis Savva, Shuran Song, Andy Zeng, and Yinda Zhang. Matterport3D: Learning from RGB-D data in indoor environments. In International Conference on 3D Vision (3DV), 2017.

[12] Santhosh K. Ramakrishnan, Aaron Gokaslan, Erik Wijmans, Oleksandr Maksymets, Alex Clegg, John Turner, Eric Undersander, Wojciech Galuba, Andrew Westbury, Angel X. Chang, Manolis Savva, Yili Zhao, and Dhruv Batra. Habitat-Matterport 3D dataset (HM3D): 1000 large-scale 3D environments for embodied AI. In Proceedings of the NeurIPS Track on Datasets and Bench...

[13] Manthan Patel, Fan Yang, Yuheng Qiu, Cesar Cadena, Sebastian Scherer, Marco Hutter, and Wenshan Wang. TartanGround: A large-scale dataset for ground robot perception and navigation. arXiv:2505.10696, 2025.

[14] Shintaro Ito, Natsuki Takama, Toshiki Watanabe, Koichi Ito, Hwann-Tzong Chen, and Takafumi Aoki. OB3D: A new dataset for benchmarking omnidirectional 3D reconstruction using Blender. arXiv:2505.20126, 2025.

[15] Jing Ou, Zidong Cao, Yinrui Ren, Zhuoxiao Li, Jinjing Zhu, Tongyan Hua, Shuai Zhang, Hui Xiong, and Wufan Zhao. Holo360D: A large-scale real-world dataset with continuous trajectories for advancing panoramic 3D reconstruction and beyond. arXiv preprint arXiv:2604.22482, 2026.

[16] Zheng Chen, Yan-Pei Cao, Yuan-Chen Guo, Chen Wang, Ying Shan, and Song-Hai Zhang. PanoGRF: Generalizable spherical radiance fields for wide-baseline panoramas. In Advances in Neural Information Processing Systems (NeurIPS), 2023.

[17] Shijie Zhou, Zhiwen Fan, Dejia Xu, Haoran Chang, Pradyumna Chari, Tejas Bharadwaj, Suya You, Zhangyang Wang, and Achuta Kadambi. DreamScene360: Unconstrained text-to-3D scene generation with panoramic Gaussian splatting. In Proceedings of the European Conference on Computer Vision (ECCV), 2024.

[18] Shitao Tang, Fuyang Zhang, Jiacheng Chen, Peng Wang, and Yasutaka Furukawa. MVDiffusion: Enabling holistic multi-view image generation with correspondence-aware diffusion. In Advances in Neural Information Processing Systems (NeurIPS), 2023.

[19] J. Irving Vasquez-Gomez, L. Enrique Sucar, Rafael Murrieta-Cid, and Efrain Lopez-Damian. Volumetric next-best-view planning for 3D object reconstruction with positioning error. International Journal of Advanced Robotic Systems, 11(10), 2014.

[20] Sicong Pan, Hao Hu, and Hui Wei. SCVP: Learning one-shot view planning via set covering for unknown object reconstruction. In IEEE Robotics and Automation Letters / Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2022.

[21] Xuran Pan, Zihang Lai, Shiji Song, and Gao Huang. ActiveNeRF: Learning where to see with uncertainty estimation. In Proceedings of the European Conference on Computer Vision (ECCV), 2022.

[22] Yunlong Ran, Jing Zeng, Shibo He, Jiming Chen, Lincheng Li, Yingfeng Chen, Gimhee Lee, and Qi Ye. NeurAR: Neural uncertainty for autonomous 3D reconstruction with implicit neural representations. IEEE Robotics and Automation Letters, 8(2):1125–1132, 2023.

[23] Xiao Chen, Quanyi Li, Tai Wang, Tianfan Xue, and Jiangmiao Pang. GenNBV: Generalizable next-best-view policy for active 3D reconstruction. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2024.

[24] Timnit Gebru, Jamie Morgenstern, Briana Vecchione, Jennifer Wortman Vaughan, Hanna Wallach, Hal Daumé III, and Kate Crawford. Datasheets for datasets. Communications of the ACM, 64(12):86–92, 2021.

[25] Mubashara Akhtar, Omar Benjelloun, Costanza Conforti, Luca Foschini, Pieter Gijsbers, Joan Giner-Miguelez, Sujata Goswami, Nitisha Jain, Michalis Karamousadakis, Satyapriya Krishna, et al. Croissant: A metadata format for ML-ready datasets. Advances in Neural Information Processing Systems, 37:82133–82148, 2024.

[26] Richard M. Karp. Reducibility among combinatorial problems. In Complexity of Computer Computations, pages 85–103. Plenum Press, 1972.

[27] Uriel Feige. A threshold of ln n for approximating set cover. Journal of the ACM, 45(4):634–652, 1998.

[28] George L. Nemhauser, Laurence A. Wolsey, and Marshall L. Fisher. An analysis of approximations for maximizing submodular set functions–I. Mathematical Programming, 14(1):265–294, 1978.

[29] Andreas Krause and Daniel Golovin. Submodular function maximization. In Lucas Bordeaux, Youssef Hamadi, and Pushmeet Kohli, editors, Tractability: Practical Approaches to Hard Problems, pages 71–104. Cambridge University Press, 2014.

[30] Avinatan Hassidim and Yaron Singer. Submodular optimization under noise. In Proceedings of the 2017 Conference on Learning Theory (COLT), volume 65 of Proceedings of Machine Learning Research, pages 1069–1122. PMLR, 2017.

[31] Ashwinkumar Badanidiyuru, Baharan Mirzasoleiman, Amin Karbasi, and Andreas Krause. Streaming submodular maximization: Massive data summarization on the fly. In Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD), 2014.

[32] Baharan Mirzasoleiman, Stefanie Jegelka, and Andreas Krause. Streaming non-monotone submodular maximization: Personalized video summarization on the fly. In Proceedings of the AAAI Conference on Artificial Intelligence, 2018.
