ArchSym: Detecting 3D-Grounded Architectural Symmetries in the Wild

Hanyu Chen; Noah Snavely; Ruojin Cai; Steve Marschner

arxiv: 2604.22202 · v1 · submitted 2026-04-24 · 💻 cs.CV

Pith reviewed 2026-05-08 12:30 UTC · model grok-4.3

classification 💻 cs.CV

keywords 3D symmetry detection · architectural landmarks · reflectional symmetry · single-view detection · SfM annotation pipeline · signed distance maps · in-the-wild images · scene geometry

The pith

A single photo of a building can now yield a full 3D reflection symmetry plane through a detector trained on automatically labeled real-world data.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper sets out to create the first workable system for finding reflection symmetries that are grounded in actual 3D space when given only one ordinary photograph of an architectural scene. Earlier learning methods could not do this because they were trained almost entirely on centered objects or synthetic scenes, and because monocular scale ambiguity makes the position of a symmetry plane ill-posed, so many of them predicted orientation only. The authors solve the data problem with an automatic pipeline that harvests 3D symmetry labels from structure-from-motion reconstructions by matching images across views, producing the ArchSym collection. They then train a detector that outputs signed distance maps anchored to an estimated scene geometry, so the symmetry plane is recovered with both orientation and position in the frame of that predicted geometry rather than as an orientation alone. A reader should care because this removes a long-standing barrier to using symmetry as a reliable geometric prior on everyday photographs.

Core claim

The central discovery is that a scalable annotation pipeline based on cross-view matching in SfM reconstructions can produce reliable 3D symmetry labels at scale, and that a detector trained on those labels can localize reflectional symmetry planes in full 3D by regressing signed distance maps relative to a predicted scene geometry, thereby overcoming the orientation-only limitation of prior single-image methods and generalizing to in-the-wild architectural images.

What carries the argument

The single-view symmetry detector that parameterizes each symmetry as a signed distance map defined relative to a predicted scene geometry.
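To illustrate why a signed distance map pins down a plane once scene geometry is available, here is a minimal sketch. It assumes the map stores, per pixel, the signed distance from the predicted 3D point to the plane `n·x + d = 0`; the source text does not spell out the exact definition. The function names, the synthetic point map, and the least-squares recovery step are all illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def signed_distance_map(points, normal, offset):
    """Signed distance from each 3D point to the plane n.x + d = 0.

    points: (H, W, 3) per-pixel 3D points (a stand-in for predicted geometry).
    """
    normal = np.asarray(normal, dtype=float)
    scale = np.linalg.norm(normal)
    return (points @ normal + offset) / scale

def plane_from_signed_distances(points, sdm):
    """Least-squares plane (unit normal, offset) from points and their signed distances."""
    X = points.reshape(-1, 3)
    A = np.hstack([X, np.ones((X.shape[0], 1))])  # rows are [x, y, z, 1]
    sol, *_ = np.linalg.lstsq(A, sdm.reshape(-1), rcond=None)
    n, d = sol[:3], sol[3]
    norm = np.linalg.norm(n)
    return n / norm, d / norm

# Synthetic example (hypothetical data, not from the paper).
rng = np.random.default_rng(0)
points = rng.uniform(-5.0, 5.0, size=(8, 8, 3))
n_true = np.array([1.0, 0.0, 0.0])
d_true = -2.0  # the plane x = 2
sdm = signed_distance_map(points, n_true, d_true)
n_est, d_est = plane_from_signed_distances(points, sdm)
```

Because the distances are expressed relative to the same point map, the recovered plane lives in whatever frame and scale that geometry uses.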

If this is right

Symmetry can now serve as a 3D prior for single-image reconstruction and editing tasks on architectural scenes.
The detector supplies both orientation and position of the symmetry plane, resolving the scale ambiguity that limited earlier methods.
A new benchmark of real-world architectural images becomes available for evaluating 3D symmetry detectors.
The same automatic labeling approach can in principle be reused to create training data for other 3D geometric properties.
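As a rough illustration of the first point in this list, a detected plane lets visible geometry be mirrored to fill in unseen parts of a scene. The sketch below assumes the plane is given as `n·x + d = 0`; `reflect_points` and the sample points are hypothetical and only show the underlying reflection formula, not the paper's completion procedure.

```python
import numpy as np

def reflect_points(points, normal, offset):
    """Mirror (N, 3) points across the plane n.x + d = 0."""
    n = np.asarray(normal, dtype=float)
    norm = np.linalg.norm(n)
    n, d = n / norm, offset / norm
    signed = points @ n + d                 # signed distance of each point
    return points - 2.0 * signed[:, None] * n

# Two visible points and the plane x = 2 (illustrative values).
visible = np.array([[3.0, 1.0, 0.0], [2.5, -1.0, 4.0]])
mirrored = reflect_points(visible, normal=[1.0, 0.0, 0.0], offset=-2.0)
completed = np.vstack([visible, mirrored])  # visible geometry plus its mirror image
```

Reflecting twice returns the original points, which is a quick sanity check on any plane estimate.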

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

The signed-distance representation could be directly fed into downstream geometric algorithms such as plane fitting or symmetry-aware meshing without additional conversion steps.
The dataset curation technique might be extended to label other repeating structures such as facades or window grids once the core matching pipeline is in place.

Load-bearing premise

Cross-view image matching on SfM reconstructions can produce accurate 3D symmetry annotations without significant geometric errors or labeling mistakes in real scenes.

What would settle it

Manual inspection or additional LiDAR capture revealing that a large fraction of the automatically generated 3D symmetry planes deviate by more than a few degrees or meters from ground truth would undermine the data pipeline and the detector trained on it.
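One way such a check could be scored is sketched below, assuming both planes are written as `n·x + d = 0` in a shared coordinate frame. The function name and the choice of metrics (angle between normals, difference of offsets) are assumptions for illustration; the source text does not state which error measures the paper uses.

```python
import numpy as np

def plane_errors(n_pred, d_pred, n_gt, d_gt):
    """Angular error (degrees) and offset error between two planes n.x + d = 0.

    The sign of (n, d) is arbitrary for a plane, so the prediction is
    flipped when its normal points away from the ground-truth normal.
    """
    n_pred = np.asarray(n_pred, dtype=float)
    n_gt = np.asarray(n_gt, dtype=float)
    s_pred, s_gt = np.linalg.norm(n_pred), np.linalg.norm(n_gt)
    n_pred, d_pred = n_pred / s_pred, d_pred / s_pred
    n_gt, d_gt = n_gt / s_gt, d_gt / s_gt
    if n_pred @ n_gt < 0:
        n_pred, d_pred = -n_pred, -d_pred
    cos = np.clip(n_pred @ n_gt, -1.0, 1.0)
    return np.degrees(np.arccos(cos)), abs(d_pred - d_gt)

# Ground truth tilted by 5 degrees and shifted by 0.5 scene units (made-up values).
t = np.radians(5.0)
angle, offset_err = plane_errors([1.0, 0.0, 0.0], -2.0,
                                 [-np.cos(t), -np.sin(t), 0.0], 2.5)
```

The offset error is in the units of the reconstruction, so it reads as meters only if the reference geometry is metric.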

Figures

Figures reproduced from arXiv: 2604.22202 by Hanyu Chen, Noah Snavely, Ruojin Cai, Steve Marschner.

**Figure 1.** Our method robustly detects 3D-grounded symmetries in challenging, in-the-wild images. From a single RGB image (left in each pair), our model recovers dominant 3D symmetry planes (right) even when they are partially occluded or not directly visible. To train our model, we introduce a novel pipeline to automatically curate ArchSym, a large-scale dataset of landmark symmetries. The results above are on image…

**Figure 2.** Overview of our automated pipeline for extracting symmetry annotations. We visualize within-view (left) and cross-view (right) matching on image pairs sampled from an SfM reconstruction. For each pair, we horizontally flip one image, find dense matches with the other image via MASt3R [7], unproject matched pixels to 3D points using depth maps, and fit a plane to the resulting point pairs. The final symmetr…

**Figure 3.** Visualization of automated symmetry plane annotations. We run LANGEVIN [13] and our symmetry extraction pipeline on six scenes from MegaScenes. Extracted planes are visualized with a dense point cloud from COLMAP [33, 34]. For LANGEVIN, the dense point cloud from COLMAP is used as input. For OURS-ANNOTATION, sampled pairs of input images and depth maps are used as input. We observe that LANGEVIN, as a pure…

**Figure 4.** Statistics of the ArchSym dataset, showing the distribution of the number of images available (left) and the number of global symmetries annotated (right) in each scene. … and fails to identify symmetries where points to one side of the symmetry plane are largely missing (e.g., Isa Khan’s Tomb, Frauenkirche). It often prioritizes the coarse shape of the point cloud over the underlying architectural semantics…

**Figure 5.** Overview of our single-view symmetry detector architecture. A frozen VGGT [42] backbone first extracts features from a single input image. The features are processed by a transformer decoder with learnable instance queries to identify symmetries. A lightweight MLP generates conditioning parameters from the resulting instance features. Then, a DPT-style [29] prediction head fuses multi-layer features to gen…

**Figure 6.** Qualitative comparison of single-view symmetry detection results. Input images are sampled from eight different test scenes. Since REFLECT3D [19] does not predict plane offsets, for visualization purposes, we use the point closest to the center of the landmark on the corresponding ground truth plane as an anchor point for REFLECT3D’s predicted normals. Point clouds shown are predicted by VGGT [42]. We obse…

**Figure 7.** Single-view completion using detected symmetries. We…
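The plane-fitting step named in the Figure 2 caption can be sketched as follows. This is a plain least-squares illustration on synthetic point pairs, under the assumption that each match yields a 3D point and its mirror image; the matching (MASt3R), depth unprojection, and whatever robust estimation or aggregation the authors apply are not reproduced here, and `fit_symmetry_plane` is a hypothetical name.

```python
import numpy as np

def fit_symmetry_plane(p, q):
    """Fit a reflection plane n.x + d = 0 to matched 3D pairs (p_i, q_i).

    For an ideal mirror pair, p_i - q_i is parallel to the plane normal and
    the midpoint (p_i + q_i) / 2 lies on the plane.
    """
    diffs = p - q
    # Dominant direction of the difference vectors (sign is arbitrary).
    _, _, vt = np.linalg.svd(diffs, full_matrices=False)
    normal = vt[0]
    midpoints = 0.5 * (p + q)
    offset = -float(np.mean(midpoints @ normal))
    return normal, offset

# Synthetic pairs mirrored across the plane x = 2, with small noise.
rng = np.random.default_rng(0)
n_true, d_true = np.array([1.0, 0.0, 0.0]), -2.0
p = rng.uniform(-5.0, 5.0, size=(200, 3))
q = p - 2.0 * (p @ n_true + d_true)[:, None] * n_true
q += rng.normal(scale=0.01, size=q.shape)
n_fit, d_fit = fit_symmetry_plane(p, q)
```

Real matches on repeated facades would contain outliers, so a practical version would likely wrap this fit in a robust scheme; that choice is left open here.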

Original abstract

Symmetry detection is a fundamental problem in computer vision, and symmetries serve as powerful priors for downstream tasks. However, existing learning-based methods for detecting 3D symmetries from single images have been almost exclusively trained and evaluated on object-centric or synthetic datasets, and thus fail to generalize to real-world scenes. Furthermore, due to the inherent scale ambiguity of monocular inputs, which makes localizing the 3D plane an ill-posed problem, many existing works only predict the plane's orientation. In this paper, we address these limitations by presenting the first framework for detecting 3D-grounded reflectional symmetries from single, in-the-wild RGB images, focusing on architectural landmarks. We introduce two key innovations: (1) a scalable data annotation pipeline to automatically curate a large-scale dataset of architectural symmetries, ArchSym, from SfM reconstructions by leveraging cross-view image matching; and building on the dataset, (2) a single-view symmetry detector that accurately localizes symmetries in 3D by parameterizing them as signed distance maps defined relative to predicted scene geometry. We validate our symmetry annotation pipeline against geometry-based alternatives and demonstrate that our symmetry detector significantly outperforms state-of-the-art baselines on our new benchmark.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note: private letter to a colleague

ArchSym gives a workable pipeline for labeling 3D symmetries in real building photos via SfM and a single-view detector that uses signed-distance maps, but the evidence on label accuracy and detector gains stays thin.

The main advance is a dataset of 3D reflectional symmetries for architectural scenes collected from ordinary photos, plus a detector that predicts them from one image by outputting signed distance maps tied to estimated scene geometry. This is a direct response to the limits of prior work that stayed on objects or synthetic data and often skipped full 3D localization because of scale ambiguity. The annotation step that pulls symmetries out of SfM reconstructions through cross-view matching is the practical piece that lets them scale up beyond manual labeling. The modeling choice to ground the output in predicted geometry rather than just orientation is a clean way to make the task well-posed. Those two moves are what separate it from earlier symmetry papers.

The paper also sets up a new benchmark and reports that the learned detector outperforms state-of-the-art baselines on it. That is useful to know even though the abstract gives no numbers.

The soft spots sit in the validation. The abstract does not give error rates for the SfM labels, how often repeated facades cause bad matches, or ablation results on the detector. The concern that cross-view matching can drift or pick wrong planes in symmetric buildings is not obviously answered by the high-level description, so the claim that the labels are reliable enough for training needs more concrete checks. If those checks are missing or weak in the full text, the downstream detector results become harder to trust.

This paper is for people working on monocular scene understanding or reconstruction who want symmetry priors that actually apply to real buildings rather than clean objects. A reader who needs a starting point for 3D symmetry data or a baseline detector in this domain will find something concrete to build on. It is worth sending to peer review because the problem is real, the dataset construction is new, and the modeling idea is straightforward enough that referees can give targeted feedback on the missing quantitative pieces.

Referee Report

2 major / 2 minor

Summary. The paper presents the first framework for 3D-grounded reflectional symmetry detection from single in-the-wild RGB images of architectural scenes. It contributes (1) a scalable automatic annotation pipeline that leverages SfM reconstructions and cross-view image matching to curate ArchSym, a large dataset of 3D symmetries, without manual labeling, and (2) a monocular detector that outputs signed-distance symmetry maps defined relative to predicted scene geometry, thereby resolving scale ambiguity. The annotation pipeline is validated against geometry-based alternatives, and the detector is shown to outperform state-of-the-art baselines on the new ArchSym benchmark.

Significance. If the automatic 3D labels prove reliable, the work meaningfully extends symmetry detection beyond object-centric and synthetic regimes to real architectural scenes, where symmetries are both prevalent and useful for downstream tasks such as 3D reconstruction and facade parsing. The signed-distance-map parameterization is a technically sound way to make the 3D plane localization well-posed from monocular input. The SfM-based annotation strategy is a practical contribution for scalable dataset creation. These contributions would be more convincing with explicit quantification of label accuracy in the presence of repeated architectural structures.

major comments (2)

[§3] §3 (Annotation Pipeline): The central claim that the detector achieves accurate 3D-grounded detection rests on the quality of the automatically generated 3D symmetry labels. Architectural scenes frequently contain repeated facades and symmetric elements that induce SfM correspondence errors, scale drift, and erroneous plane hypotheses. While the manuscript states that the pipeline is validated against geometry-based alternatives, no quantitative metrics (e.g., plane-parameter error distributions, agreement with manual annotations on a held-out subset, or failure-case analysis) are reported in the validation section. Without these, it is impossible to determine whether label noise undermines the downstream detector training and the reported benchmark gains.
[§5] §5 (Experiments): The claim of significant outperformance over baselines is load-bearing for the paper's contribution. The manuscript should provide ablations isolating the effect of the signed-distance-map formulation versus simpler orientation-only predictions, as well as an analysis of how annotation noise from the SfM pipeline propagates to detector performance. Current results appear to lack these controls, making it difficult to attribute improvements specifically to the proposed 3D-grounded representation.

minor comments (2)

[§4] Figure captions and the method section would benefit from an explicit diagram showing how the predicted signed-distance map is converted back to a 3D plane equation and how this resolves scale ambiguity.
[Abstract] The abstract states that the detector 'significantly outperforms' baselines; adding the key quantitative deltas (e.g., mIoU or plane-error reductions) would improve readability.
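As a hedged illustration of the conversion the first minor comment asks about, suppose the map stores, for each pixel $u$, the signed distance $s(u)$ from the predicted 3D point $X(u)$ to the plane. This definition and the least-squares recovery are assumptions made for the sketch; the text above does not confirm the paper's exact formulation.

```latex
% Assumed definition of the signed distance map for a plane (n, d), with \|n\| = 1:
s(u) = n^{\top} X(u) + d

% Recovering the plane from the map and the predicted geometry (least squares):
(\hat{n}, \hat{d}) = \arg\min_{n,\, d} \sum_{u} \bigl( n^{\top} X(u) + d - s(u) \bigr)^{2},
\qquad \hat{n} \leftarrow \hat{n} / \lVert \hat{n} \rVert,\quad \hat{d} \leftarrow \hat{d} / \lVert \hat{n} \rVert

% Effect of an unknown global scale \alpha > 0 on the geometry:
X'(u) = \alpha X(u),\quad s'(u) = \alpha\, s(u)
\;\Longrightarrow\; n' = n,\quad d' = \alpha\, d
```

Under this reading, rescaling the scene rescales the plane offset by the same factor, so the plane stays attached to the same place on the building even though absolute scale is unknown.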

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback on our manuscript. We address the two major comments below and will revise the paper accordingly to strengthen the validation of the annotation pipeline and the experimental analysis.

Referee: [§3] §3 (Annotation Pipeline): The central claim that the detector achieves accurate 3D-grounded detection rests on the quality of the automatically generated 3D symmetry labels. Architectural scenes frequently contain repeated facades and symmetric elements that induce SfM correspondence errors, scale drift, and erroneous plane hypotheses. While the manuscript states that the pipeline is validated against geometry-based alternatives, no quantitative metrics (e.g., plane-parameter error distributions, agreement with manual annotations on a held-out subset, or failure-case analysis) are reported in the validation section. Without these, it is impossible to determine whether label noise undermines the downstream detector training and the reported benchmark gains.

Authors: We agree that explicit quantification of label accuracy is important, particularly given the challenges of repeated structures in architectural scenes. Our current validation compares the SfM-based pipeline to geometry-based alternatives, but we acknowledge this is insufficiently quantitative. In the revision we will add: (1) plane-parameter error distributions on a held-out subset of 200 images for which we obtain manual plane annotations, (2) agreement metrics (e.g., angular and distance errors) between automatic and manual labels, and (3) a targeted failure-case analysis on scenes with repeated facades. These additions will allow readers to assess label reliability directly.

revision: yes

Referee: [§5] §5 (Experiments): The claim of significant outperformance over baselines is load-bearing for the paper's contribution. The manuscript should provide ablations isolating the effect of the signed-distance-map formulation versus simpler orientation-only predictions, as well as an analysis of how annotation noise from the SfM pipeline propagates to detector performance. Current results appear to lack these controls, making it difficult to attribute improvements specifically to the proposed 3D-grounded representation.

Authors: We concur that isolating the contribution of the signed-distance-map representation and quantifying noise sensitivity would strengthen the experimental claims. In the revised manuscript we will add: (1) an ablation comparing the full signed-distance-map model against an orientation-only baseline (normal vector prediction without distance), and (2) a controlled noise-injection study that perturbs the training labels with increasing levels of plane-parameter noise and reports the resulting drop in detector metrics. These controls will clarify the benefit of the 3D-grounded formulation and the robustness to annotation noise.

revision: yes
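The label perturbation in such a noise-injection study might look like the sketch below. It assumes labels are planes `n·x + d = 0`, tilts the normal by a Gaussian-distributed angle toward a random perpendicular direction, and adds Gaussian noise to the offset. The function name, noise model, and noise levels are hypothetical choices for illustration, not a protocol described in the source.

```python
import numpy as np

def perturb_plane(normal, offset, angle_std_deg, offset_std, rng):
    """Return a noisy copy of the plane n.x + d = 0 (unit normal, offset)."""
    n = np.asarray(normal, dtype=float)
    norm = np.linalg.norm(n)
    n, d = n / norm, offset / norm
    # Random unit vector perpendicular to n, used as the tilt direction.
    v = rng.normal(size=3)
    v -= (v @ n) * n
    v /= np.linalg.norm(v)
    theta = np.radians(rng.normal(0.0, angle_std_deg))
    n_noisy = np.cos(theta) * n + np.sin(theta) * v  # unit length by construction
    d_noisy = d + rng.normal(0.0, offset_std)
    return n_noisy, d_noisy

# Increasing noise levels applied to one label (values are illustrative).
rng = np.random.default_rng(0)
noisy_labels = [
    perturb_plane([1.0, 0.0, 0.0], -2.0, angle_std_deg=s, offset_std=0.1 * s, rng=rng)
    for s in (0.0, 1.0, 5.0)
]
```

Note that tilting the normal while holding the offset fixed pivots the plane about the point nearest the origin, so a real study might prefer to pivot about a point on the landmark instead.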

Circularity Check

0 steps flagged

No significant circularity; dataset creation and model training are independent of fitted predictions

The paper's derivation chain begins with an external SfM-based annotation pipeline that generates the ArchSym dataset via cross-view matching on reconstructions; this process is not derived from or equivalent to the single-view detector's outputs. The detector is then trained to predict signed-distance symmetry maps relative to scene geometry estimated from the input image. No equations or steps reduce a claimed prediction to a fitted parameter by construction, and no self-citations are invoked as load-bearing uniqueness theorems or ansatzes. Validation against geometry-based alternatives is presented as an independent check. The approach remains self-contained against external benchmarks with no reduction of the central claim to its own inputs.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract-only review limits visibility into specific parameters or assumptions; the method appears to rest on standard SfM accuracy and the validity of signed distance map parameterization for symmetry localization.

pith-pipeline@v0.9.0 · 5523 in / 1026 out tokens · 45449 ms · 2026-05-08T12:30:52.828513+00:00 · methodology

Reference graph

Works this paper leans on

51 extracted references · 51 canonical work pages

[1] Mikhail J. Atallah. On symmetry detection. IEEE Transactions on Computers, 100(7):663–666, 1985.
[2] Ruojin Cai, Joseph Tung, Qianqian Wang, Hadar Averbuch-Elor, Bharath Hariharan, and Noah Snavely. Doppelgangers: Learning to disambiguate images of similar structures. In ICCV, 2023.
[3] Nicolas Carion, Francisco Massa, Gabriel Synnaeve, Nicolas Usunier, Alexander Kirillov, and Sergey Zagoruyko. End-to-end object detection with transformers. In ECCV, 2020.
[4] Angel X Chang, Thomas Funkhouser, Leonidas Guibas, Pat Hanrahan, Qixing Huang, Zimo Li, Silvio Savarese, Manolis Savva, Shuran Song, Hao Su, et al. Shapenet: An information-rich 3d model repository. arXiv preprint, 2015.
[5] Marcelo Cicconet, David GC Hildebrand, and Hunter Elliott. Finding mirror symmetry via registration and optimal symmetric pairwise assignment of curves: Algorithm and results. In ICCV Workshops, pages 1759–1763, 2017.
[6] Matt Deitke, Dustin Schwenk, Jordi Salvador, Luca Weihs, Oscar Michel, Eli VanderBilt, Ludwig Schmidt, Kiana Ehsani, Aniruddha Kembhavi, and Ali Farhadi. Objaverse: A universe of annotated 3d objects. In CVPR, 2023.
[7] Bardienus Duisterhof, Lojze Zust, Philippe Weinzaepfel, Vincent Leroy, Yohann Cabon, and Jerome Revaud. Mast3r-sfm: a fully-integrated solution for unconstrained structure-from-motion. arXiv preprint, 2024.
[8] Mohamed Elawady, Christophe Ducottet, Olivier Alata, Cécile Barat, and Philippe Colantoni. Wavelet-based reflection symmetry detection via textural and color histograms. In ICCV Workshops, pages 1725–1733, 2017.
[9] Martin Ester, Hans-Peter Kriegel, Jörg Sander, and Xiaowei Xu. A density-based algorithm for discovering clusters in large spatial databases with noise. In Proc. KDD, pages 226–231. AAAI Press, 1996.
[10] Christopher Funk and Yanxi Liu. Beyond planar symmetry: Modeling human perception of reflection and rotation symmetries in the wild. In ICCV, 2017.
[11] Christopher Funk, Seungkyu Lee, Martin R Oswald, Stavros Tsogkas, Wei Shen, Andrea Cohen, Sven Dickinson, and Yanxi Liu. 2017 iccv challenge: Detecting symmetry in the wild. In ICCV Workshops, 2017.
[12] Lin Gao, Ling-Xiao Zhang, Hsien-Yu Meng, Yi-Hui Ren, Yu-Kun Lai, and Leif Kobbelt. Prs-net: Planar reflective symmetry detection net for 3d models. IEEE TVCG, 27(6):3007–3018, 2020.
[13] Jihyeon Je, Jiayi Liu, Guandao Yang, Boyang Deng, Shengqu Cai, Gordon Wetzstein, Or Litany, and Leonidas Guibas. Robust symmetry detection via riemannian langevin dynamics. In SIGGRAPH Asia 2024 Conference Papers, pages 1–11,
[14] Nahum Kiryati and Yossi Gofman. Detecting symmetry in grey level images: The global optimization approach. International Journal of Computer Vision, 29(1):29–45, 1998.
[15] Kevin Köser, Christopher Zach, and Marc Pollefeys. Dense 3d reconstruction of symmetric scenes from a single image. In Joint Pattern Recognition Symposium, pages 266–275. Springer, 2011.
[16] Harold W Kuhn. The hungarian method for the assignment problem. Naval research logistics quarterly, 2(1-2):83–97,
[17] Vincent Leroy, Yohann Cabon, and Jérôme Revaud. Grounding image matching in 3d with mast3r. In ECCV, 2024.
[18] Ren-Wu Li, Ling-Xiao Zhang, Chunpeng Li, Yu-Kun Lai, and Lin Gao. E3sym: Leveraging E(3) invariance for unsupervised 3d planar reflective symmetry detection. In ICCV, 2023.
[19] Xiang Li, Zixuan Huang, Anh Thai, and James M Rehg. Symmetry strikes back: From single-image symmetry detection to 3d generation. In CVPR, 2025.
[20] Guosheng Lin, Anton Milan, Chunhua Shen, and Ian Reid. Refinenet: Multi-path refinement networks for high-resolution semantic segmentation. In CVPR, 2017.
[21] Yancong Lin, Silvia-Laura Pintea, and Jan van Gemert. Nerd++: Improved 3d-mirror symmetry learning from a single image. arXiv preprint, 2021.
[22] Gareth Loy and Jan-Olof Eklundh. Detecting symmetry and symmetric constellations of features. In ECCV, 2006.
[23] Nathaniel Merrill, Yuliang Guo, Xingxing Zuo, Xinyu Huang, Stefan Leutenegger, Xi Peng, Liu Ren, and Guoquan Huang. Symmetry and uncertainty-aware object slam for 6dof object pose estimation. In CVPR, 2022.
[24] Ben Mildenhall, Pratul P. Srinivasan, Matthew Tancik, Jonathan T. Barron, Ravi Ramamoorthi, and Ren Ng. Nerf: Representing scenes as neural radiance fields for view synthesis. In ECCV, 2020.
[25] Niloy J Mitra, Mark Pauly, Michael Wand, and Duygu Ceylan. Symmetry in 3d geometry: Extraction and applications. In Comput. Graph. Forum, pages 1–23. Wiley Online Library,
[26] Rajendra Nagar and Shanmuganathan Raman. Symmmap: Estimation of the 2-d reflection symmetry map and its applications. In ICCV Workshops, pages 1715–1724, 2017.
[27] William Peebles and Saining Xie. Scalable diffusion models with transformers. In ICCV, pages 4195–4205, 2023.
[28] Ethan Perez, Florian Strub, Harm De Vries, Vincent Dumoulin, and Aaron Courville. Film: Visual reasoning with a general conditioning layer. In AAAI, 2018.
[29] René Ranftl, Alexey Bochkovskiy, and Vladlen Koltun. Vision transformers for dense prediction. In ICCV, pages 12179–12188, 2021.
[30] Jeremy Reizenstein, Roman Shapovalov, Philipp Henzler, Luca Sbordone, Patrick Labatut, and David Novotny. Common objects in 3d: Large-scale learning and evaluation of real-life 3d category reconstruction. In Proceedings of the IEEE/CVF international conference on computer vision, pages 10901–10911, 2021.
[31] Jeremy Reizenstein, Roman Shapovalov, Philipp Henzler, Luca Sbordone, Patrick Labatut, and David Novotny. Common objects in 3d: Large-scale learning and evaluation of real-life 3d category reconstruction. In International Conference on Computer Vision, 2021.
[32] Tadamasa Sawada, Yunfeng Li, and Zygmunt Pizlo. Detecting 3-d mirror symmetry in a 2-d camera image for 3-d shape recovery. Proceedings of the IEEE, 102(10):1588–1606, 2014.
[33] Johannes L Schonberger and Jan-Michael Frahm. Structure-from-motion revisited. In CVPR, 2016.
[34] Johannes Lutz Schönberger, Enliang Zheng, Marc Pollefeys, and Jan-Michael Frahm. Pixelwise view selection for unstructured multi-view stereo. In European Conference on Computer Vision (ECCV), 2016.
[35] Yifei Shi, Junwen Huang, Hongjia Zhang, Xin Xu, Szymon Rusinkiewicz, and Kai Xu. Symmetrynet: Learning to predict reflectional and rotational symmetries of 3d shapes from single-view rgb-d images. ACM TOG, 39(6):1–14, 2020.
[36] Yifei Shi, Zixin Tang, Xiangting Cai, Hongjia Zhang, Dewen Hu, and Xin Xu. Symmetrygrasp: Symmetry-aware antipodal grasp detection from single-view rgb-d images. RA-L, 7(4):12235–12242, 2022.
[37] Yifei Shi, Xin Xu, Junhua Xi, Xiaochang Hu, Dewen Hu, and Kai Xu. Learning to detect 3d symmetry from single-view rgb-d images with weak supervision. IEEE TPAMI, 45(4):4882–4896, 2022.
[38] Giorgos Tolias, Yannis Avrithis, and Hervé Jégou. To aggregate or not to aggregate: Selective match kernels for image search. In ICCV, pages 1401–1408, 2013.
[39] Stavros Tsogkas and Iasonas Kokkinos. Learning-based symmetry detection in natural images. In ECCV, 2012.
[40] Joseph Tung, Gene Chou, Ruojin Cai, Guandao Yang, Kai Zhang, Gordon Wetzstein, Bharath Hariharan, and Noah Snavely. Megascenes: Scene-level view synthesis at scale. In ECCV, 2024.
[41] Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N Gomez, Łukasz Kaiser, and Illia Polosukhin. Attention is all you need. NeurIPS, 30, 2017.
[42] Jianyuan Wang, Minghao Chen, Nikita Karaev, Andrea Vedaldi, Christian Rupprecht, and David Novotny. Vggt: Visual geometry grounded transformer. In CVPR, 2025.
[43] Shuzhe Wang, Vincent Leroy, Yohann Cabon, Boris Chidlovskii, and Jerome Revaud. Dust3r: Geometric 3d vision made easy. In CVPR, pages 20697–20709, 2024.
[44] Jan D Wolter, Tony C Woo, and Richard A Volz. Optimal algorithms for symmetry detection in two and three dimensions. The Visual Computer, 1(1):37–48, 1985.
[45] Shangzhe Wu, Christian Rupprecht, and Andrea Vedaldi. Unsupervised learning of probably symmetric deformable 3d objects from images in the wild. In CVPR, 2020.
[46] Shangzhe Wu, Ameesh Makadia, Jiajun Wu, Noah Snavely, Richard Tucker, and Angjoo Kanazawa. De-rendering the world’s revolutionary artefacts. In CVPR, 2021.
[47] Yuanbo Xiangli, Ruojin Cai, Hanyu Chen, Jeffrey Byrne, and Noah Snavely. Doppelgangers++: Improved visual disambiguation with geometric 3d features. In CVPR, 2025.
[48] Yuan Yao, Nico Schertler, Enrique Rosales, Helge Rhodin, Leonid Sigal, and Alla Sheffer. Front2back: Single view 3d shape reconstruction via front to back prediction. In CVPR,
[49] Zhaoxuan Zhang, Bo Dong, Tong Li, Felix Heide, Pieter Peers, Baocai Yin, and Xin Yang. Single depth-image 3d reflection symmetry and shape prediction. In ICCV, 2023.
[50] Heng Zhao, Shenxing Wei, Dahu Shi, Wenming Tan, Zheyang Li, Ye Ren, Xing Wei, Yi Yang, and Shiliang Pu. Learning symmetry-aware geometry correspondences for 6d object pose estimation. In ICCV, 2023.
[51] Yichao Zhou, Shichen Liu, and Yi Ma. Nerd: Neural 3d reflection symmetry detector. In CVPR, 2021.

[1] Mikhail J. Atallah. On symmetry detection. IEEE Transactions on Computers, 100(7):663–666, 1985. 1

[2] Ruojin Cai, Joseph Tung, Qianqian Wang, Hadar Averbuch-Elor, Bharath Hariharan, and Noah Snavely. Doppelgangers: Learning to disambiguate images of similar structures. In ICCV, 2023. 2, 3

[3] Nicolas Carion, Francisco Massa, Gabriel Synnaeve, Nicolas Usunier, Alexander Kirillov, and Sergey Zagoruyko. End-to-end object detection with transformers. In ECCV, 2020. 6

[4] Angel X Chang, Thomas Funkhouser, Leonidas Guibas, Pat Hanrahan, Qixing Huang, Zimo Li, Silvio Savarese, Manolis Savva, Shuran Song, Hao Su, et al. Shapenet: An information-rich 3d model repository. arXiv preprint, 2015. 2

[5] Marcelo Cicconet, David GC Hildebrand, and Hunter Elliott. Finding mirror symmetry via registration and optimal symmetric pairwise assignment of curves: Algorithm and results. In ICCV Workshops, pages 1759–1763, 2017. 2

[6] Matt Deitke, Dustin Schwenk, Jordi Salvador, Luca Weihs, Oscar Michel, Eli VanderBilt, Ludwig Schmidt, Kiana Ehsani, Aniruddha Kembhavi, and Ali Farhadi. Objaverse: A universe of annotated 3d objects. In CVPR, 2023. 2

[7] Bardienus Duisterhof, Lojze Zust, Philippe Weinzaepfel, Vincent Leroy, Yohann Cabon, and Jerome Revaud. Mast3r-sfm: a fully-integrated solution for unconstrained structure-from-motion. arXiv preprint, 2024. 3, 11

[8] Mohamed Elawady, Christophe Ducottet, Olivier Alata, Cécile Barat, and Philippe Colantoni. Wavelet-based reflection symmetry detection via textural and color histograms. In ICCV Workshops, pages 1725–1733, 2017. 2

[9] Martin Ester, Hans-Peter Kriegel, Jörg Sander, and Xiaowei Xu. A density-based algorithm for discovering clusters in large spatial databases with noise. In Proc. KDD, pages 226–231. AAAI Press, 1996. 3, 4

[10] Christopher Funk and Yanxi Liu. Beyond planar symmetry: Modeling human perception of reflection and rotation symmetries in the wild. In ICCV, 2017. 2

[11] Christopher Funk, Seungkyu Lee, Martin R Oswald, Stavros Tsogkas, Wei Shen, Andrea Cohen, Sven Dickinson, and Yanxi Liu. 2017 iccv challenge: Detecting symmetry in the wild. In ICCV Workshops, 2017. 2

[12] Lin Gao, Ling-Xiao Zhang, Hsien-Yu Meng, Yi-Hui Ren, Yu-Kun Lai, and Leif Kobbelt. Prs-net: Planar reflective symmetry detection net for 3d models. IEEE TVCG, 27(6):3007–3018, 2020. 2

[13] Jihyeon Je, Jiayi Liu, Guandao Yang, Boyang Deng, Shengqu Cai, Gordon Wetzstein, Or Litany, and Leonidas Guibas. Robust symmetry detection via riemannian langevin dynamics. In SIGGRAPH Asia 2024 Conference Papers, pages 1–11, 2024.

[14] Nahum Kiryati and Yossi Gofman. Detecting symmetry in grey level images: The global optimization approach. International Journal of Computer Vision, 29(1):29–45, 1998. 2

[15] Kevin Köser, Christopher Zach, and Marc Pollefeys. Dense 3d reconstruction of symmetric scenes from a single image. In Joint Pattern Recognition Symposium, pages 266–275. Springer, 2011. 2, 3

[16] Harold W Kuhn. The hungarian method for the assignment problem. Naval research logistics quarterly, 2(1-2):83–97,

[17] Vincent Leroy, Yohann Cabon, and Jérôme Revaud. Grounding image matching in 3d with mast3r. In ECCV, 2024. 3

[18] Ren-Wu Li, Ling-Xiao Zhang, Chunpeng Li, Yu-Kun Lai, and Lin Gao. E3sym: Leveraging E(3) invariance for unsupervised 3d planar reflective symmetry detection. In ICCV, 2023. 2

[19] Xiang Li, Zixuan Huang, Anh Thai, and James M Rehg. Symmetry strikes back: From single-image symmetry detection to 3d generation. In CVPR, 2025. 1, 2, 5, 7, 8, 11, 12, 14

[20] Guosheng Lin, Anton Milan, Chunhua Shen, and Ian Reid. Refinenet: Multi-path refinement networks for high-resolution semantic segmentation. In CVPR, 2017. 6

[21] Yancong Lin, Silvia-Laura Pintea, and Jan van Gemert. Nerd++: Improved 3d-mirror symmetry learning from a single image. arXiv preprint, 2021. 2

[22] Gareth Loy and Jan-Olof Eklundh. Detecting symmetry and symmetric constellations of features. In ECCV, 2006. 2

[23] Nathaniel Merrill, Yuliang Guo, Xingxing Zuo, Xinyu Huang, Stefan Leutenegger, Xi Peng, Liu Ren, and Guoquan Huang. Symmetry and uncertainty-aware object slam for 6dof object pose estimation. In CVPR, 2022. 1

[24] Ben Mildenhall, Pratul P. Srinivasan, Matthew Tancik, Jonathan T. Barron, Ravi Ramamoorthi, and Ren Ng. Nerf: Representing scenes as neural radiance fields for view synthesis. In ECCV, 2020. 13

[25] Niloy J Mitra, Mark Pauly, Michael Wand, and Duygu Ceylan. Symmetry in 3d geometry: Extraction and applications. In Comput. Graph. Forum, pages 1–23. Wiley Online Library,

[26] Rajendra Nagar and Shanmuganathan Raman. Symmmap: Estimation of the 2-d reflection symmetry map and its applications. In ICCV Workshops, pages 1715–1724, 2017. 2

[27] William Peebles and Saining Xie. Scalable diffusion models with transformers. In ICCV, pages 4195–4205, 2023. 6

[28] Ethan Perez, Florian Strub, Harm De Vries, Vincent Dumoulin, and Aaron Courville. Film: Visual reasoning with a general conditioning layer. In AAAI, 2018. 6, 11, 12

[29] René Ranftl, Alexey Bochkovskiy, and Vladlen Koltun. Vision transformers for dense prediction. In ICCV, pages 12179–12188, 2021. 6, 12

[30] Jeremy Reizenstein, Roman Shapovalov, Philipp Henzler, Luca Sbordone, Patrick Labatut, and David Novotny. Common objects in 3d: Large-scale learning and evaluation of real-life 3d category reconstruction. In Proceedings of the IEEE/CVF international conference on computer vision, pages 10901–10911, 2021. 13

[31] Jeremy Reizenstein, Roman Shapovalov, Philipp Henzler, Luca Sbordone, Patrick Labatut, and David Novotny. Common objects in 3d: Large-scale learning and evaluation of real-life 3d category reconstruction. In International Conference on Computer Vision, 2021. 13

[32] Tadamasa Sawada, Yunfeng Li, and Zygmunt Pizlo. Detecting 3-d mirror symmetry in a 2-d camera image for 3-d shape recovery. Proceedings of the IEEE, 102(10):1588–1606, 2014. 2

[33] Johannes L Schonberger and Jan-Michael Frahm. Structure-from-motion revisited. In CVPR, 2016. 2, 4

[34] Johannes Lutz Schönberger, Enliang Zheng, Marc Pollefeys, and Jan-Michael Frahm. Pixelwise view selection for unstructured multi-view stereo. In European Conference on Computer Vision (ECCV), 2016. 4, 13

[35] Yifei Shi, Junwen Huang, Hongjia Zhang, Xin Xu, Szymon Rusinkiewicz, and Kai Xu. Symmetrynet: Learning to predict reflectional and rotational symmetries of 3d shapes from single-view rgb-d images. ACM TOG, 39(6):1–14, 2020. 1, 2, 5, 7

[36] Yifei Shi, Zixin Tang, Xiangting Cai, Hongjia Zhang, Dewen Hu, and Xin Xu. Symmetrygrasp: Symmetry-aware antipodal grasp detection from single-view rgb-d images. RA-L, 7(4):12235–12242, 2022. 1

[37] Yifei Shi, Xin Xu, Junhua Xi, Xiaochang Hu, Dewen Hu, and Kai Xu. Learning to detect 3d symmetry from single-view rgb-d images with weak supervision. IEEE TPAMI, 45(4):4882–4896, 2022. 2, 5

[38] Giorgos Tolias, Yannis Avrithis, and Hervé Jégou. To aggregate or not to aggregate: Selective match kernels for image search. In ICCV, pages 1401–1408, 2013. 3

[39] Stavros Tsogkas and Iasonas Kokkinos. Learning-based symmetry detection in natural images. In ECCV, 2012. 2

[40] Joseph Tung, Gene Chou, Ruojin Cai, Guandao Yang, Kai Zhang, Gordon Wetzstein, Bharath Hariharan, and Noah Snavely. Megascenes: Scene-level view synthesis at scale. In ECCV, 2024. 3, 13

[41] Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N Gomez, Łukasz Kaiser, and Illia Polosukhin. Attention is all you need. NeurIPS, 30, 2017. 5

[42] Jianyuan Wang, Minghao Chen, Nikita Karaev, Andrea Vedaldi, Christian Rupprecht, and David Novotny. Vggt: Visual geometry grounded transformer. In CVPR, 2025. 5, 6, 7, 8, 11

[43] Shuzhe Wang, Vincent Leroy, Yohann Cabon, Boris Chidlovskii, and Jerome Revaud. Dust3r: Geometric 3d vision made easy. In CVPR, pages 20697–20709, 2024. 6

[44] Jan D Wolter, Tony C Woo, and Richard A Volz. Optimal algorithms for symmetry detection in two and three dimensions. The Visual Computer, 1(1):37–48, 1985. 1
