
arxiv: 2606.31164 · v1 · pith:GNPAQWTVnew · submitted 2026-06-30 · 💻 cs.CV

Seeing Through the Weights: Privacy Leakage in Scene Coordinate Regression

Pith reviewed 2026-07-01 06:30 UTC · model grok-4.3

classification 💻 cs.CV
keywords scene coordinate regression · privacy leakage · 3D reconstruction · model inversion · visual localization · neural scene representation · coordinate regression attack

The pith

Scene coordinate regression models leak the 3D geometry and appearance of their training scenes through a query-based attack using unrelated proxy images.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper establishes that scene coordinate regression networks, which implicitly encode a scene by regressing 3D coordinates for image pixels, do not preserve privacy as often assumed. Instead, an attacker can query the model repeatedly with batches of unrelated images, identify points whose predicted coordinates remain stable under small perturbations, and accumulate those points across queries to recover the training scene's geometry. The recovered points also allow feature inversion to synthesize approximate scene appearance from new viewpoints. A sympathetic reader would care because this shows that implicit neural encodings of environments carry extractable information about private spaces, creating practical risks for deployment in homes or secure areas.

Core claim

The paper claims that a query-based attack reconstructs substantial portions of an SCR model's training environment. The attacker sends batches of proxy images unrelated to the target scene, keeps the 3D coordinate predictions that remain stable under input perturbations, accumulates these points across independent batches, and, in white-box settings, refines them to recover geometry. The same process also enables inversion of network features to synthesize recognizable scene appearance from arbitrary viewpoints.

What carries the argument

The query-based attack that selects stable 3D coordinate predictions from unrelated proxy images under small perturbations and accumulates them across batches to recover scene geometry.
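
To make the selection step concrete, here is a minimal Python sketch of perturbation-based stability filtering. It is an illustration under stated assumptions, not the authors' implementation: `toy_head` is a hypothetical stand-in for the scene-specific regression head, the score is assumed to be the sample variance of predictions around their mean under Gaussian feature noise (following the Figure 4 caption), and `sigma`, `n`, and `top_p` are illustrative values, with `top_p=0.2` echoing the default p = 20% mentioned in the Figure 14 caption.

```python
import math
import random


def toy_head(feature):
    """Hypothetical stand-in for an SCR head: maps a feature to a 3D point.

    The tanh saturates, so features far from zero give nearly constant
    (stable) outputs while features near zero are sensitive to noise.
    """
    return [math.tanh(5.0 * feature[0]), math.tanh(5.0 * feature[1]), 0.0]


def stability_score(head, feature, sigma=0.1, n=8, rng=None):
    """Sample variance of head outputs under Gaussian feature noise.

    Lower scores mean more stable predictions. Assumption: variance is
    taken around the mean perturbed prediction with a 1/(n-1) factor.
    """
    rng = rng or random.Random(0)
    preds = []
    for _ in range(n):
        noisy = [x + rng.gauss(0.0, sigma) for x in feature]
        preds.append(head(noisy))
    dim = len(preds[0])
    mean = [sum(p[d] for p in preds) / n for d in range(dim)]
    squared = sum((p[d] - mean[d]) ** 2 for p in preds for d in range(dim))
    return squared / (n - 1)


def keep_most_stable(head, features, top_p=0.2, **kwargs):
    """Return the top_p fraction of features with the lowest stability score."""
    ranked = sorted(features, key=lambda f: stability_score(head, f, **kwargs))
    keep = max(1, int(len(ranked) * top_p))
    return ranked[:keep]
```

With this toy head, `keep_most_stable(toy_head, [[0.0, 0.0], [3.0, 3.0]], top_p=0.5)` retains the saturated feature `[3.0, 3.0]` and discards the noise-sensitive one. Whether stable outputs of a real SCR head land on true scene points is exactly the premise the paper has to establish empirically.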

If this is right

  • Substantial portions of indoor and outdoor training scenes can be recovered with high geometric fidelity from the model alone.
  • Approximate color appearance of the scene can be synthesized from arbitrary viewpoints using the recovered representation.
  • Recognizable layout details and potentially sensitive scene elements become exposed through the attack.
  • SCR systems cannot be considered privacy-preserving by design when used in private or security-critical environments.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Other implicit neural representations of 3D scenes may carry similar extractable information about their training data.
  • Adding controlled noise to coordinate outputs or limiting query access could reduce leakage without full retraining.
  • The attack suggests that any regression-based encoding of spatial data may retain recoverable structure unless explicitly regularized against it.
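
The mitigation idea in the second bullet can be sketched as a thin wrapper around a model endpoint. This is a hypothetical illustration of the editorial suggestion only: `GuardedHead`, `noise_std`, and `max_queries` are names invented here, and neither the paper nor this review establishes how much such noise or rate limiting would reduce leakage, or what it would cost in localization accuracy.

```python
import random


class GuardedHead:
    """Hypothetical wrapper: adds output noise and enforces a query budget."""

    def __init__(self, head, noise_std=0.05, max_queries=1000, seed=None):
        self.head = head                # callable: feature -> list of 3 coordinates
        self.noise_std = noise_std      # std. dev. of Gaussian noise per coordinate
        self.max_queries = max_queries  # total queries allowed for this client
        self.queries = 0
        self.rng = random.Random(seed)

    def __call__(self, feature):
        if self.queries >= self.max_queries:
            raise RuntimeError("query budget exhausted")
        self.queries += 1
        coords = self.head(feature)
        # Perturb each predicted coordinate independently.
        return [c + self.rng.gauss(0.0, self.noise_std) for c in coords]
```

An attacker who averages repeated queries can partly cancel zero-mean noise, so the budget and the noise level would need to be chosen together; that trade-off is left open here.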

Load-bearing premise

Stable coordinate predictions obtained from unrelated proxy images correspond to actual points in the training scene and accumulate reliably into accurate geometry.

What would settle it

If querying an SCR model with batches of unrelated proxy images produces no stable points that accumulate into a 3D structure matching the actual training scene geometry, the leakage result would not hold.

Figures

Figures reproduced from arXiv: 2606.31164 by Byungkon Kang, Francois Rameau, Jaemin Cho, Oleksii Nasypanyi, Utku Ozbulak.

Figure 1: Overview of the proposed query-based attack. An SCR model trained on images of a private scene is queried using proxy images. By aggregating the returned 3D predictions, the attacker reconstructs a voxelized representation of the scene and synthesizes novel views, revealing both geometry and appearance. […] increasingly used in private or sensitive spaces like homes, offices, or medical facilities, privacy b…
Figure 2: Gray-box deployment scenario considered in this work. Feature extraction is performed on the client using a public encoder, while the scene-specific head is hosted remotely. The client transmits spatially shuffled feature maps to the server, which returns predicted 3D scene coordinates used locally for pose estimation. […] 3 Background. This section introduces SCR, the ACE framework as a representative SCR exa…
Figure 3: Overview of the proposed white-box attack on SCR models. (1) Proxy images are encoded using a public feature extractor to obtain dense feature maps. (2) Features are optionally refined via gradient descent to move them toward stable regions of the learned scene. (3) Unstable predictions are discarded by retaining only the top-p% most stable features. (4) Stable 3D predictions are accumulated into a voxel o…
Figure 4: Detailed modules of the attack: (a) Stability score s(f) computed from the variance of noise-perturbed predictions. (b) Training pipeline of the inversion network. […] measuring the resulting variation in predicted 3D space, as illustrated in Fig. 4a. Specifically, we draw perturbations ϵ_k ∼ N(0, σ²I) and compute s(\mathbf{f}) = \frac{1}{n-1}\sum_{k=1}^{n} \left\| g_{\theta_s}(\mathbf{f}+\boldsymbol{\…
Figure 5: (a) Effect of proxy image domain on query budget for two representative scenes, (top) 7-Scenes Stairs and (bottom) Cambridge Shop Facade, and (b) relationship between localization model accuracy (translation) and reconstruction quality.
Figure 6: Qualitative reconstruction on 7-Scenes. Free-view renderings are generated from viewpoints that are not included in either the training or test sets. Real query images are shown alongside renderings of the reconstructed scene from the same poses for visual comparison. We also show the recovered 3D geometry as an error heat map, with colors ranging from blue for 0 cm error to red for errors above 10 cm, to…
Figure 7: Cambridge Landmarks reconstruction results.
Figure 8: Qualitative comparison of reconstructed point clouds under different access levels. Colors indicate the distance to the ground-truth point cloud (blue: small error, red: large error ≥ 10 cm). (a) Pascal VOC (b) SUN RGB-D (c) Synthetic Images
Figure 9: Sample images from different proxy datasets. […] structural similarity with real-world imagery, ensuring essentially zero domain overlap with the target dataset. Example images from all datasets are shown in
Figure 10: Colmap, depth and ACE Pseudo GT comparison on 7-scenes, scene fire. D. Image Reconstruction Quality Evaluation. Upper/lower bounds. To jointly evaluate 3D and image reconstruction quality, we estimate lower and upper bounds for image reconstruction. For image inversion, the upper bound assumes near-perfect geometry. To simulate this, we keep only proxy-derived point-feature pairs within 2 cm of the pseudo-…
Figure 11: Example of YOLOv7 on generated images.
Figure 12: Whitebox attack on GLACE Aachen model. (a) Overview of private and recovered 3D on SCRNet.
Figure 13: (a) Recovered Scene Coordinates on X-ray-to-CT SCR model (b) Qualitative result of DSAC* and ACE-G. F. Metrics. Let P denote the reconstructed point cloud and G the pseudo ground-truth geometry. For each point p ∈ P, let d(p, G) = min_{g∈G} ∥p − g∥₂ be the nearest-neighbor distance, and define d(g, P) symmetrically. Chamfer Distance (CD). We use the symmetric half Chamfer Distance as our primary geometric erro…
Figure 14: Percentile ablation on stability-based filtering in the gray-box and white-box attacks on ACE [3]. The shaded region marks the default setting p = 20%, diversifying the retained percentile. J. Global Descriptors. As explained in Sec. G., GLACE uses global descriptors for coordinate prediction. However, since we assume white-box access and want to avoid predictions varying too strongly with the data distri…
Figure 15: Comparison of image reconstruction results across different access levels on the 7-Scenes Office scene.
Figure 16: Gray-Box (red) vs. White-Box (blue) convergence on Cambridge Landmarks. Shaded regions indicate the better-performing attack (blue = White-Box, red = Gray-Box). White-Box achieves lower Chamfer Distance in nearly all cases as it converges, while Gray-Box often tends to diverge as more images are processed.
Original abstract

Scene Coordinate Regression (SCR) methods are increasingly adopted for visual localization. In these approaches, the scene is implicitly encoded within a neural network that regresses a 3D world coordinate for each image pixel. Because the scene is represented only through the network parameters and not stored explicitly as images or maps, such methods are often assumed to be privacy-preserving. In this work, we show that this assumption is incorrect in practice. Specifically, we introduce a query-based attack that reconstructs the 3D geometry of the training environment from an SCR model under different levels of model access. To do so, we repeatedly query the model with batches of proxy images unrelated to the target scene to obtain dense pixel-wise 3D coordinates. Reliable points are identified through their stability under small input perturbations and can be further refined in a white-box setting. These stable points are accumulated across independent query batches to recover the scene geometry. From the recovered 3D representation, we also invert the network features to synthesize images from arbitrary viewpoints, revealing additional appearance information. Experiments on indoor and outdoor datasets demonstrate that substantial portions of training environments can be reconstructed with high geometric fidelity. Beyond geometry, we also recover an approximate color appearance, which exposes recognizable layout and potentially sensitive scene elements. This directly contradicts claims in the literature that SCR representations are privacy-preserving by design, and reveals a real risk when such systems are deployed in private or security-critical spaces. The project page is available at https://jaeminch0.github.io/seeing-through-the-weights-privacy-leakage-in-scene-coordinate-regression.
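
The accumulation step described in the abstract can be illustrated with a small voxel-counting sketch. This is a simplified reading, not the paper's code: `voxel_size` and `min_hits` are hypothetical parameters, and requiring agreement from at least `min_hits` independent batches is an assumption made here to show why accumulation can suppress one-off spurious predictions.

```python
import math
from collections import Counter


def voxel_key(point, voxel_size):
    """Quantize a 3D point (metres) to integer voxel indices."""
    return tuple(math.floor(c / voxel_size) for c in point)


def accumulate(batches, voxel_size=0.05, min_hits=2):
    """Keep voxels that receive stable points from at least min_hits batches.

    batches: iterable of lists of (x, y, z) points, one list per query batch.
    Each voxel is counted at most once per batch, so min_hits measures
    agreement across independent batches rather than density within one.
    """
    hits = Counter()
    for batch in batches:
        hits.update({voxel_key(p, voxel_size) for p in batch})
    return {voxel for voxel, count in hits.items() if count >= min_hits}
```

For example, with `batches = [[(0.01, 0.01, 0.01), (1.0, 1.0, 1.0)], [(0.02, 0.03, 0.04)], [(5.0, 5.0, 5.0)]]`, only the voxel at the origin is hit by two batches and survives the filter.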

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The manuscript claims that Scene Coordinate Regression (SCR) models, which regress 3D coordinates per pixel and implicitly encode scenes in network weights, are not privacy-preserving. It introduces a query-based attack that repeatedly queries the model with batches of unrelated proxy images, identifies reliable 3D points via stability under small input perturbations, accumulates these points across batches to reconstruct scene geometry, and optionally inverts features to synthesize appearance from novel viewpoints. Experiments on indoor and outdoor datasets are reported to recover substantial portions of the training scenes with high geometric fidelity, directly contradicting prior literature assumptions that SCR representations are privacy-preserving by design.

Significance. If the reported reconstructions hold with the claimed fidelity and generalize beyond the tested models, the work is significant because it provides the first concrete empirical demonstration of privacy leakage in SCR systems used for visual localization. The attack is falsifiable via perturbation stability and accumulation, and the project page aids reproducibility. This challenges an assumption in the visual localization literature and identifies deployment risks in private or security-critical settings.

major comments (2)
  1. [Abstract] Abstract: the claim that experiments 'demonstrate that substantial portions of training environments can be reconstructed with high geometric fidelity' is load-bearing for the central empirical claim but is unsupported by any quantitative metrics (e.g., point-cloud coverage, Chamfer distance, or IoU), baselines, or controls; this prevents verification that the attack exceeds chance-level accumulation of stable predictions.
  2. [Attack method] Attack method description: the assumption that points stable under perturbations on unrelated proxy images correspond to actual training-scene geometry (rather than spurious stable outputs) is central to the accumulation step, yet the manuscript provides no explicit verification such as overlap statistics against ground-truth 3D models or cross-batch consistency checks with known scene elements.
minor comments (2)
  1. [Introduction] Introduction: the statement that SCR methods are 'often assumed to be privacy-preserving' should include specific citations to prior works making that claim to ground the motivation.
  2. [Experiments] Figure captions and experimental results: ensure all reported reconstructions include error bars or variance across multiple runs and models to clarify robustness.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the positive recommendation of minor revision and the constructive comments. We address each major comment below and will revise the manuscript accordingly to strengthen the presentation of our empirical results.

  1. Referee: [Abstract] Abstract: the claim that experiments 'demonstrate that substantial portions of training environments can be reconstructed with high geometric fidelity' is load-bearing for the central empirical claim but is unsupported by any quantitative metrics (e.g., point-cloud coverage, Chamfer distance, or IoU), baselines, or controls; this prevents verification that the attack exceeds chance-level accumulation of stable predictions.

    Authors: We agree that the abstract claim would benefit from explicit quantitative support. In the revised version we will augment the abstract (and corresponding results section) with quantitative metrics including point-cloud coverage percentages relative to ground-truth scene models, Chamfer distances where available, and a random-accumulation baseline to demonstrate that the stable-point accumulation exceeds chance-level predictions. revision: yes

  2. Referee: [Attack method] Attack method description: the assumption that points stable under perturbations on unrelated proxy images correspond to actual training-scene geometry (rather than spurious stable outputs) is central to the accumulation step, yet the manuscript provides no explicit verification such as overlap statistics against ground-truth 3D models or cross-batch consistency checks with known scene elements.

    Authors: We acknowledge that additional explicit verification would improve clarity. In the revision we will include overlap statistics computed against available ground-truth 3D models on the evaluated datasets as well as cross-batch consistency metrics (e.g., fraction of points recovered in multiple independent query batches) to directly support that the stable points align with training-scene geometry rather than spurious outputs. revision: yes
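
The metrics discussed in this exchange can be sketched in a few lines. The following is a brute-force illustration under assumptions: the Chamfer distance is taken as the average of the two mean nearest-neighbor distances (one common symmetric form; the paper's exact definition may differ), and `coverage` with threshold `tau` is a hypothetical way to express "fraction of ground-truth points recovered".

```python
import math


def nn_dist(p, cloud):
    """Distance from point p to its nearest neighbor in cloud."""
    return min(math.dist(p, q) for q in cloud)


def chamfer(P, G):
    """Symmetric Chamfer distance: mean of the two directional averages."""
    p_to_g = sum(nn_dist(p, G) for p in P) / len(P)
    g_to_p = sum(nn_dist(g, P) for g in G) / len(G)
    return 0.5 * (p_to_g + g_to_p)


def coverage(P, G, tau=0.1):
    """Fraction of ground-truth points in G within tau of some point in P."""
    return sum(nn_dist(g, P) <= tau for g in G) / len(G)
```

A random-accumulation baseline of the kind the referee asks for could reuse the same functions: sample as many points uniformly inside the scene bounds as the attack recovered, then compare `chamfer` and `coverage` for both point sets against the same ground truth.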

Circularity Check

0 steps flagged

No significant circularity identified


The paper is an empirical demonstration of a query-based attack that extracts stable 3D points from an SCR model via perturbation stability and accumulates them into a scene reconstruction. No derivation chain, fitted parameters renamed as predictions, or self-citation load-bearing steps are present. The central claim is supported by experimental results on indoor/outdoor datasets rather than any reduction to inputs by construction.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

The paper introduces no free parameters, new axioms, or invented entities; it relies on standard assumptions about neural network query access and perturbation stability.

pith-pipeline@v0.9.1-grok · 5834 in / 1041 out tokens · 28511 ms · 2026-07-01T06:30:05.730170+00:00 · methodology


Reference graph

Works this paper leans on

52 extracted references · 2 canonical work pages · 1 internal anchor

  1. [1] Arandjelovic, R., Gronat, P., Torii, A., Pajdla, T., Sivic, J.: Netvlad: Cnn architecture for weakly supervised place recognition. In: Proceedings of the IEEE conference on computer vision and pattern recognition. pp. 5297–5307 (2016)

  2. [2] Athalye, A., Carlini, N., Wagner, D.: Obfuscated gradients give a false sense of security: Circumventing defenses to adversarial examples. In: International conference on machine learning. pp. 274–283. PMLR (2018)

  3. [3] Brachmann, E., Cavallari, T., Prisacariu, V.A.: Accelerated coordinate encoding: Learning to relocalize in minutes using rgb and poses. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. pp. 5044–5053 (2023)

  4. [4] Brachmann, E., Krull, A., Nowozin, S., Shotton, J., Michel, F., Gumhold, S., Rother, C.: Dsac-differentiable ransac for camera localization. In: CVPR (2017)

  5. [5] Brachmann, E., Wynn, J., Chen, S., Cavallari, T., Monszpart, A., Turmukhambetov, D., Prisacariu, V.A.: Scene coordinate reconstruction: Posing of image collections via incremental learning of a relocalizer. In: European Conference on Computer Vision. pp. 421–440. Springer (2024)

  6. [6] Bruns, L., Barroso-Laguna, A., Cavallari, T., Monszpart, A., Munukutla, S., Prisacariu, V.A., Brachmann, E.: Ace-g: Improving generalization of scene coordinate regression through query pre-training. In: Proceedings of the IEEE/CVF International Conference on Computer Vision. pp. 26751–26761 (2025)

  7. [7] Carlini, N., Wagner, D.: Towards evaluating the robustness of neural networks. In: 2017 ieee symposium on security and privacy (sp). pp. 39–57. IEEE (2017)

  8. [8] Chelani, K.: Privacy in Visual Localization. Chalmers Tekniska Hogskola (Sweden) (2025)

  9. [9] Chelani, K., Benbihi, A., Kahl, F., Sattler, T., Kukelova, Z.: Obfuscation based privacy preserving representations are recoverable using neighborhood information. In: 2025 International Conference on 3D Vision (3DV). pp. 189–199. IEEE (2025)

  10. [10] Chelani, K., Kahl, F., Sattler, T.: How privacy-preserving are line clouds? recovering scene details from 3d lines. In: CVPR (2021)

  11. [11] Chelani, K., Sattler, T., Kahl, F., Kukelova, Z.: Privacy-preserving representations are not enough: Recovering scene content from camera poses. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). pp. 13132–13141. IEEE (2023)

  12. [12] Clark, R., Wang, S., Markham, A., Trigoni, N., Wen, H.: Vidloc: A deep spatio-temporal model for 6-dof video-clip relocalization. In: Proceedings of the IEEE conference on computer vision and pattern recognition. pp. 6856–6864 (2017)

  13. [13] Clark, R., Wang, S., Wen, H., Markham, A., Trigoni, N.: Vinet: Visual-inertial odometry as a sequence-to-sequence learning problem. In: Proceedings of the AAAI conference on artificial intelligence (2017)

  14. [14] Dai, A., Chang, A.X., Savva, M., Halber, M., Funkhouser, T., Nießner, M.: Scannet: Richly-annotated 3d reconstructions of indoor scenes. In: Proceedings of the IEEE conference on computer vision and pattern recognition. pp. 5828–5839 (2017)

  15. [15] DeTone, D., Malisiewicz, T., Rabinovich, A.: Superpoint: Self-supervised interest point detection and description. In: Proceedings of the IEEE conference on computer vision and pattern recognition workshops. pp. 224–236 (2018)

  16. [16] Do, T., Miksik, O., DeGol, J., Park, H.S., Sinha, S.N.: Learning to detect scene landmarks for camera localization. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (June 2022)

  17. [17] Dosovitskiy, A., Brox, T.: Inverting visual representations with convolutional networks. In: CVPR (2016)

  18. [18] Dusmanu, M., Schonberger, J.L., Sinha, S.N., Pollefeys, M.: Privacy-preserving image features via adversarial affine subspace embeddings. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition. pp. 14267–14277 (2021)

  19. [19] Everingham, M., Van Gool, L., Williams, C.K.I., Winn, J., Zisserman, A.: The pascal visual object classes (voc) challenge. International Journal of Computer Vision 88(2), 303–338 (2010)

  20. [20] Geppert, M., Larsson, V., Speciale, P., Schönberger, J.L., Pollefeys, M.: Privacy preserving structure-from-motion. In: ECCV (2020)

  21. [21] Izquierdo, S., Civera, J.: Optimal transport aggregation for visual place recognition. In: Proceedings of the ieee/cvf conference on computer vision and pattern recognition. pp. 17658–17668 (2024)

  22. [22] Jeong, H., Shin, J., Rameau, F., Kum, D.: Multi-modal place recognition via vectorized hd maps and images fusion for autonomous driving. Robotics and Automation Letters 9(5), 4710–4717 (2024)

  23. [23] Kendall, A., Grimes, M., Cipolla, R.: Posenet: A convolutional network for real-time 6-dof camera relocalization. In: Proceedings of the IEEE international conference on computer vision. pp. 2938–2946 (2015)

  24. [24] Kerbl, B., Kopanas, G., Leimkühler, T., Drettakis, G.: 3d gaussian splatting for real-time radiance field rendering. ACM Trans. Graph. 42(4), 139–1 (2023)

  25. [25] Larsson, V., Kukelova, Z., Zheng, Y.: Making minimal solvers for absolute pose estimation compact and robust. In: Proceedings of the IEEE International Conference on Computer Vision. pp. 2316–2324 (2017)

  26. [26] Lee, C., Kim, J., Yun, C., Hong, J.H.: Paired-point lifting for enhanced privacy-preserving visual localization. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. pp. 17266–17275 (2023)

  27. [27] Lee, D., Ryu, S., Yeon, S., Lee, Y., Kim, D., Han, C., Cabon, Y., Weinzaepfel, P., Guérin, N., Csurka, G., et al.: Large-scale localization datasets in crowded indoor spaces. In: CVPR (2021)

  28. [28] Lowe, D.G.: Distinctive image features from scale-invariant keypoints. International journal of computer vision 60(2), 91–110 (2004)

  29. [29] Mahendran, A., Vedaldi, A.: Understanding deep image representations by inverting them. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR). pp. 5188–5196 (June 2015)

  30. [30] Mildenhall, B., Srinivasan, P.P., Tancik, M., Barron, J.T., Ramamoorthi, R., Ng, R.: Nerf: Representing scenes as neural radiance fields for view synthesis. Communications of the ACM 65(1), 99–106 (2021)

  31. [31] Moon, H., Lee, C., Hong, J.H.: Efficient privacy-preserving visual localization using 3d ray clouds. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. pp. 9773–9783 (2024)

  32. [32] Ng, T., Kim, H.J., Lee, V.T., DeTone, D., Yang, T.Y., Shen, T., Ilg, E., Balntas, V., Mikolajczyk, K., Sweeney, C.: Ninjadesc: content-concealing visual descriptors via adversarial learning. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition. pp. 12797–12807 (2022)

  33. [33] Novak, R., Bahri, Y., Abolafia, D.A., Pennington, J., Sohl-Dickstein, J.: Sensitivity and generalization in neural networks: an empirical study. arXiv preprint arXiv:1802.08760 (2018)

  34. [34] Pan, L., Schönberger, J.L., Larsson, V., Pollefeys, M.: Privacy preserving localization via coordinate permutations. In: Proceedings of the IEEE/CVF International Conference on Computer Vision. pp. 18174–18183 (2023)

  35. [35] Pietrantoni, M., Csurka, G., Sattler, T.: Gaussian splatting feature fields for (privacy-preserving) visual localization. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). pp. 1082–1092 (June 2025)

  36. [36] Pietrantoni, M., Humenberger, M., Sattler, T., Csurka, G.: Segloc: Learning segmentation-based representations for privacy-preserving visual localization. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. pp. 15380–15391 (2023)

  37. [37] Pietrantoni, M., Humenberger, M., Sattler, T., Csurka, G.: Can we make nerf-based visual localization privacy-preserving? arXiv preprint arXiv:2508.18971 (2025)

  38. [38] Pittaluga, F., Koppal, S.J., Kang, S.B., Sinha, S.N.: Revealing scenes by inverting structure from motion reconstructions. In: CVPR (2019)

  39. [39] Pittaluga, F., Zhuang, B.: Ldp-feat: Image features with local differential privacy. In: Proceedings of the IEEE/CVF International Conference on Computer Vision. pp. 17580–17590 (2023)

  40. [40] Sarlin, P.E., Dusmanu, M., Schönberger, J.L., Speciale, P., Gruber, L., Larsson, V., Miksik, O., Pollefeys, M.: Lamar: Benchmarking localization and mapping for augmented reality. In: ECCV (2022)

  41. [41] Shotton, J., Glocker, B., Zach, C., Izadi, S., Criminisi, A., Fitzgibbon, A.: Scene coordinate regression forests for camera relocalization in rgb-d images. In: Proceedings of the IEEE conference on computer vision and pattern recognition. pp. 2930–2937 (2013)

  42. [42] Shrestha, P., Xie, C., Shishido, H., Yoshii, Y., Kitahara, I.: X-ray to ct rigid registration using scene coordinate regression. In: MICCAI (2023)

  43. [43] Song, S., Lichtenberg, S.P., Xiao, J.: Sun rgb-d: A rgb-d scene understanding benchmark suite. In: Proceedings of the IEEE conference on computer vision and pattern recognition. pp. 567–576 (2015)

  44. [44] Song, X., Tang, J., Yang, K., Guo, W., Lu, X., Hei, X.: A method for absolute pose regression based on cascaded attention modules. Computer Vision and Image Understanding 259, 104440 (2025)

  45. [45] Speciale, P., Schonberger, J.L., Kang, S.B., Sinha, S.N., Pollefeys, M.: Privacy preserving image-based localization. In: CVPR (2019)

  46. [46] Speciale, P., Schonberger, J.L., Sinha, S.N., Pollefeys, M.: Privacy preserving image queries for camera localization. In: Proceedings of the IEEE/CVF International Conference on Computer Vision. pp. 1486–1496 (2019)

  47. [47] Tramèr, F., Zhang, F., Juels, A., Reiter, M.K., Ristenpart, T.: Stealing machine learning models via prediction APIs. In: 25th USENIX security symposium (USENIX Security 16). pp. 601–618 (2016)

  48. [48] Wang, F., Jiang, X., Galliani, S., Vogel, C., Pollefeys, M.: Glace: Global local accelerated coordinate encoding. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. pp. 21562–21571 (2024)

  49. [49] Zhou, L., Koppel, D., Kaess, M.: A complete, accurate and efficient solution for the perspective-n-line problem. IEEE Robotics and Automation Letters 6(2), 699–706 (2020)

  50. [50] Zhou, Q., Agostinho, S., Ošep, A., Leal-Taixé, L.: Is geometry enough for matching in visual localization? In: European Conference on Computer Vision. pp. 407–425. Springer (2022)

  51. [51] Zhu, S., Yang, L., Chen, C., Shah, M., Shen, X., Wang, H.: R2former: Unified retrieval and reranking transformer for place recognition. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. pp. 19370–19380 (2023)

  52. [52] (Fig. 11, Table 5), which is considered an upper bound for PPVL. InvSfM preserves sharper local details when the 3D geometry is accurate, but it can still obtain low PSNR because sparse points create missing regions and occlusion artifacts. Appearance leakage quality. To further evaluate the information revealed by the proposed method, we assess object rec...