arxiv: 2605.16925 · v1 · pith:RAVAFULXnew · submitted 2026-05-16 · 💻 cs.CV

P2GS: Physical Prior-guided Gaussian Splatting for Photometrically Consistent Urban Reconstruction

Pith reviewed 2026-05-19 20:54 UTC · model grok-4.3

classification 💻 cs.CV
keywords 3D Gaussian Splatting · Photometric Consistency · Urban Reconstruction · HDR Radiance Field · Autonomous Driving · Exposure Normalization · Tone Mapping · LDR Images

The pith

P2GS jointly decomposes a view-invariant linear HDR radiance field, per-view exposure scales, and tone-mapping functions from LDR images to fix photometric inconsistencies in 3D Gaussian Splatting for urban scenes.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Standard 3D Gaussian Splatting assumes uniform exposure and tone mapping across all views, but real driving footage violates this because of different cameras and changing outdoor light. P2GS addresses the problem by learning one underlying linear high-dynamic-range radiance field that stays constant no matter the viewpoint. It simultaneously recovers the unique exposure scale and tone-mapping curve for each input image using only ordinary low-dynamic-range photographs. The method grounds the entire process in the known physics of image formation and adds regularization in the HDR domain to keep the solution stable. This produces a radiance field that renders with consistent illumination across sparse views while keeping the fast real-time performance of the original Gaussian Splatting technique.

Core claim

By jointly optimizing a shared linear HDR radiance field together with per-view exposure scales and tone-mapping functions from LDR observations alone, and by enforcing relative-exposure consistency plus HDR-domain radiance regularization, P2GS recovers a view-invariant radiance field that is robust to inter-camera illumination differences without HDR supervision.

What carries the argument

Unified optimization strategy grounded in the physical image-formation process that jointly solves for the shared linear HDR radiance field, per-view exposure scales, and tone-mapping functions while enforcing relative-exposure consistency and HDR-domain regularization.
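
A minimal sketch of the forward image-formation model may make the decomposition concrete. It follows the equation that the circularity check on this page writes as LDR = tone_map(exposure * HDR_render). The clip-then-gamma curve, the function names, and every numeric value are assumptions chosen for illustration; the paper's actual tone-mapping parameterization is not stated on this page.

```python
def tone_map(x, gamma):
    """Hypothetical tone curve: clip to [0, 1], then apply a gamma."""
    clipped = min(max(x, 0.0), 1.0)
    return clipped ** (1.0 / gamma)


def ldr_pixel(hdr_render, exposure, gamma):
    """LDR = tone_map(exposure * HDR_render) for a single pixel."""
    return tone_map(exposure * hdr_render, gamma)


# One shared linear radiance signal, seen by two cameras (values made up).
hdr_render = [0.02, 0.10, 0.45]
front_cam = [ldr_pixel(r, exposure=1.0, gamma=2.2) for r in hdr_render]
side_cam = [ldr_pixel(r, exposure=2.0, gamma=2.4) for r in hdr_render]

for r, a, b in zip(hdr_render, front_cam, side_cam):
    print(f"radiance={r:.2f}  front={a:.3f}  side={b:.3f}")
```

Both cameras observe the same linear radiance; only the per-view exposure and curve differ, which is the kind of inconsistency the joint optimization is meant to factor out.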

If this is right

  • Matches or surpasses prior methods on standard LDR reconstruction quality in both real and simulated driving environments.
  • Yields substantially improved photometric consistency across heterogeneous camera pipelines.
  • Enables reliable exposure normalization that supports consistent rendering of static backgrounds.
  • Produces physically coherent illumination suitable for closed-loop autonomous driving simulators.
  • Preserves the real-time rendering speed of standard 3D Gaussian Splatting.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same decomposition strategy could be applied to other explicit 3D representations to achieve illumination-robust reconstruction without HDR capture hardware.
  • Normalized radiance fields may improve accuracy in downstream perception tasks that rely on consistent scene appearance.
  • Extending the framework to handle moving objects would allow full dynamic urban scene reconstruction under varying light.
  • The approach indicates that physical image-formation priors can substitute for multi-exposure or HDR supervision in large-scale mapping projects.

Load-bearing premise

The physical image-formation process can be accurately inverted by jointly optimizing a shared linear HDR radiance field with per-view exposure scales and tone-mapping functions from LDR observations alone when relative-exposure consistency and HDR-domain regularization are applied.

What would settle it

Check whether radiance values at the same 3D point, when rendered through two different recovered tone mappings scaled by their exposure ratios, match the observed LDR pixel intensities to within sensor noise levels across multiple view pairs.
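
The test described above can be sketched for a single 3D point as follows. This is an illustration under stated assumptions, not the paper's evaluation code: the clip-then-gamma tone curve, the `noise_tol` bound, and the helper names are all hypothetical, and the observations are synthetic.

```python
def predict_ldr(radiance, exposure, gamma):
    """Forward model under an assumed clip-then-gamma tone curve."""
    scaled = min(max(exposure * radiance, 0.0), 1.0)
    return scaled ** (1.0 / gamma)


def point_is_consistent(radiance, views, noise_tol=0.01):
    """Check one 3D point against every view that observes it.

    views: list of (exposure, gamma, observed_ldr) tuples.
    noise_tol: assumed sensor-noise bound in normalized pixel units.
    """
    return all(
        abs(predict_ldr(radiance, exposure, gamma) - observed) <= noise_tol
        for exposure, gamma, observed in views
    )


# Synthetic example: two views of a point with shared radiance 0.2.
radiance = 0.2
views = [
    (1.0, 2.2, predict_ldr(radiance, 1.0, 2.2) + 0.003),  # small noise
    (1.5, 2.4, predict_ldr(radiance, 1.5, 2.4) - 0.002),
]
print(point_is_consistent(radiance, views))
```

Running the same check over many view pairs and many points, with a tolerance taken from the real sensor's noise model, would be the full version of the experiment.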

Figures

Figures reproduced from arXiv: 2605.16925 by Hidehisa Arai, Hironobu Fujiyoshi, Kota Shimomura, Takayoshi Yamashita, Tsubasa Takahashi.

Figure 1: We propose P2GS, a physically grounded framework that reconstructs an exposure-invariant HDR radiance field from multi…
Figure 2: Overview of Our Method. (1) To decouple the per-view exposure scales and tone-mapping functions, we learn each Gaussian as a linear HDR radiance representation. The Gaussians are constrained by a relative exposure consistency loss, which enforces the linearity of relative exposures in HDR space. (2) LDR image rendering is then performed by applying the learned exposure and tone-mapping functions to the ren…
Figure 3: Qualitative results on the Waymo Open Dataset. Baseline methods struggle with inter-view exposure variation, producing visible seams and color inconsistencies, whereas our method maintains photometric consistency and largely suppresses seam visibility.
Figure 4: Effects of illuminance differences between cameras. Our…
Figure 5: Qualitative comparison on the CARLA dataset. Ground truth (GT) shows uniform illumination, while training views contain…
Figure 6: Gamma comparison on the Waymo Open Dataset.
Figure 7: Exposure scale and Tone-Mapping comparison on the Waymo Open Dataset.
Original abstract

3D Gaussian Splatting (3DGS) has recently emerged as a powerful explicit representation enabling fast, high-fidelity rendering, making it a promising foundation for closed-loop simulators and perception models in autonomous driving. However, conventional 3DGS implicitly assumes consistent exposure and tone mapping across views. Real driving data violates this assumption due to heterogeneous camera pipelines and dynamic outdoor illumination, baking exposure discrepancies and sensor noise into the radiance field and producing artifacts and inconsistent illumination especially in static backgrounds crucial for realistic simulation. These issues are amplified in autonomous driving, where sparse viewpoints, varying exposures, and outdoor lighting interact, while prior work mainly targets dynamic-object reconstruction and overlooks cross-view photometric consistency. To address this limitation, we introduce P2GS, a physically consistent Gaussian Splatting framework that jointly decomposes a view-invariant linear HDR radiance field, per-view exposure scales, and tone-mapping functions from only LDR images without HDR supervision. P2GS employs a unified optimization strategy grounded in the physical image-formation process, enforcing relative-exposure consistency and HDR-domain radiance regularization. This yields a radiance field robust to inter-camera illumination differences while preserving the real-time efficiency of standard 3DGS. Experiments across real and simulated driving environments show that P2GS matches or surpasses prior methods in LDR reconstruction while providing substantially improved photometric consistency, reliable exposure normalization, and physically coherent illumination across diverse scenes.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 2 minor

Summary. The paper introduces P2GS, a physically consistent extension of 3D Gaussian Splatting for urban driving scenes. It jointly optimizes a shared view-invariant linear HDR radiance field together with per-view exposure scales and tone-mapping functions directly from LDR images (no HDR ground truth), using a physical image-formation model, relative-exposure consistency constraints, and HDR-domain regularization to remove baked-in photometric inconsistencies while preserving real-time rendering.

Significance. If the decomposition is shown to be non-degenerate and the photometric-consistency gains are reproducible, the work would be significant for closed-loop autonomous-driving simulators and perception pipelines, where cross-camera illumination differences currently degrade static-background fidelity. The explicit physical modeling and retention of 3DGS efficiency are clear strengths.

major comments (3)
  1. [Abstract / §3] Abstract and §3 (optimization formulation): the claim that relative-exposure consistency plus HDR-domain regularization uniquely determines the shared linear HDR field is load-bearing for the central contribution, yet the manuscript provides no explicit analysis or proof that these priors eliminate the global scale ambiguity between radiance and exposure (or compensatory tone-map adjustments) that can produce identical LDR outputs.
  2. [Experiments] Experiments (quantitative tables): the reported gains in photometric consistency and exposure normalization must be supported by explicit cross-view metrics (e.g., variance of rendered linear radiance or normalized HDR values across held-out views) and ablations that isolate the contribution of each regularization term; without these, it is impossible to verify that the optimizer has not converged to a photometrically plausible but physically incorrect decomposition.
  3. [§4] §4 (scene-specific results): in outdoor driving sequences with sparse viewpoints and varying natural illumination, the paper should demonstrate that the recovered HDR field remains stable under small perturbations of the tone-mapping parameters; otherwise the robustness claim for heterogeneous camera pipelines is not yet substantiated.
minor comments (2)
  1. [Method] Notation for the tone-mapping function parameters and the exact form of the relative-exposure consistency loss should be introduced with a single equation block early in the method section for clarity.
  2. [Figures] Figure captions should explicitly state whether rendered images are shown in LDR or tone-mapped HDR space and which views are used for the consistency visualization.
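
The cross-view metric requested in major comment 2 could look roughly like the sketch below. It inverts an assumed gamma curve, divides out the exposure scale, and reports the variance of the recovered linear radiance for one 3D point across views. The curve, the helper names, and the synthetic values are assumptions, and clipping is ignored for simplicity.

```python
from statistics import pvariance


def recover_linear(ldr, exposure, gamma):
    """Invert an assumed gamma curve, then divide out the exposure scale."""
    return (ldr ** gamma) / exposure


def cross_view_variance(observations):
    """Variance of recovered linear radiance for one 3D point.

    observations: list of (ldr, exposure, gamma) tuples, one per view.
    """
    return pvariance(
        [recover_linear(ldr, exposure, gamma) for ldr, exposure, gamma in observations]
    )


# Synthetic point with true radiance 0.2 seen at three exposures.
true_radiance, gamma = 0.2, 2.2
exposures = [0.7, 1.0, 1.5]
ldr_values = [(e * true_radiance) ** (1.0 / gamma) for e in exposures]

with_exposure = cross_view_variance(
    [(ldr, e, gamma) for ldr, e in zip(ldr_values, exposures)]
)
ignoring_exposure = cross_view_variance([(ldr, 1.0, gamma) for ldr in ldr_values])
print(with_exposure, ignoring_exposure)
```

With the correct exposures the variance collapses to floating-point noise, while a model that ignores exposure leaves the differences baked into the recovered radiance. Averaging this quantity over many points and held-out views would give one candidate consistency score.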

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the constructive and detailed feedback. We appreciate the recognition of P2GS's potential significance for autonomous-driving simulators and perception pipelines. We address each major comment below with clarifications and planned revisions.

Point-by-point responses
  1. Referee: [Abstract / §3] Abstract and §3 (optimization formulation): the claim that relative-exposure consistency plus HDR-domain regularization uniquely determines the shared linear HDR field is load-bearing for the central contribution, yet the manuscript provides no explicit analysis or proof that these priors eliminate the global scale ambiguity between radiance and exposure (or compensatory tone-map adjustments) that can produce identical LDR outputs.

    Authors: We agree that an explicit analysis would strengthen the central claim. In the revised manuscript we will add a short derivation in §3 showing that the relative-exposure consistency constraints (which enforce consistent radiance ratios across views) together with the HDR-domain regularization (which penalizes non-physical radiance distributions) resolve the global scale ambiguity up to a single normalizable factor. Any compensatory tone-map adjustment that preserves LDR outputs would violate the cross-view consistency under the linear image-formation model, thereby substantiating uniqueness without changing the optimization procedure. revision: yes

  2. Referee: [Experiments] Experiments (quantitative tables): the reported gains in photometric consistency and exposure normalization must be supported by explicit cross-view metrics (e.g., variance of rendered linear radiance or normalized HDR values across held-out views) and ablations that isolate the contribution of each regularization term; without these, it is impossible to verify that the optimizer has not converged to a photometrically plausible but physically incorrect decomposition.

    Authors: We will augment the experimental section with the requested metrics. The revision will report variance of rendered linear radiance and normalized HDR values across held-out views, plus systematic ablations that disable each regularization term individually. These additions will demonstrate that the observed photometric-consistency gains arise from the physical priors rather than from a merely plausible but incorrect decomposition. revision: yes

  3. Referee: [§4] §4 (scene-specific results): in outdoor driving sequences with sparse viewpoints and varying natural illumination, the paper should demonstrate that the recovered HDR field remains stable under small perturbations of the tone-mapping parameters; otherwise the robustness claim for heterogeneous camera pipelines is not yet substantiated.

    Authors: We acknowledge the value of explicit stability verification for heterogeneous pipelines. In the revised §4 we will add experiments that apply small perturbations (±5% in key tone-mapping parameters) to the optimized functions and quantify the resulting variation in the recovered linear HDR radiance field on both real and simulated driving sequences. Quantitative stability metrics and qualitative renderings will be included to support the robustness claim. revision: yes
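
A simplified version of the ±5% perturbation test promised in response 3 is sketched below. It perturbs the exponent of an assumed gamma curve and measures the relative change in the linear radiance recovered by direct per-pixel inversion. The real experiment would presumably perturb the paper's own tone-mapping parameters and re-evaluate the optimized field; the curve, names, and values here are hypothetical.

```python
def recover_linear(ldr, exposure, gamma):
    """Invert an assumed gamma curve, then divide out the exposure scale."""
    return (ldr ** gamma) / exposure


def relative_change(ldr, exposure, gamma, perturbation):
    """Relative change in recovered radiance when gamma is scaled by (1 + perturbation)."""
    base = recover_linear(ldr, exposure, gamma)
    perturbed = recover_linear(ldr, exposure, gamma * (1.0 + perturbation))
    return abs(perturbed - base) / base


for ldr in (0.1, 0.5, 0.9):
    plus = relative_change(ldr, 1.0, 2.2, +0.05)
    minus = relative_change(ldr, 1.0, 2.2, -0.05)
    print(f"ldr={ldr:.1f}  +5%: {plus:.3f}  -5%: {minus:.3f}")
```

Under this toy curve, dark pixels react more strongly to a gamma perturbation than bright ones, so a stability report would likely need to be broken down by intensity range rather than given as a single number.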

Circularity Check

0 steps flagged

No significant circularity; derivation relies on explicit physical model and independent regularizations

full rationale

The paper formulates an inverse rendering problem by assuming the standard image formation equation LDR = tone_map(exposure * HDR_render) and jointly optimizes the shared HDR field, per-view exposures, and tone maps from LDR inputs. It adds relative-exposure consistency across views and HDR-domain radiance regularization as explicit priors. These steps are not self-definitional: the target photometric consistency is not defined in terms of the fitted parameters themselves, nor is any prediction reduced to a fitted input by construction. No load-bearing self-citation or uniqueness theorem imported from prior author work appears in the provided derivation chain. The approach is a standard under-constrained optimization with added constraints, self-contained against external benchmarks.
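
One way to see why the problem is under-constrained before the priors are added: scaling the radiance by a constant and dividing the exposure by the same constant leaves every LDR pixel unchanged. The sketch below demonstrates this for an assumed clip-then-gamma tone curve; the function name and numbers are illustrative only.

```python
import math


def ldr_pixel(radiance, exposure, gamma=2.2):
    """LDR = tone_map(exposure * radiance) with an assumed clip-then-gamma curve."""
    scaled = min(max(exposure * radiance, 0.0), 1.0)
    return scaled ** (1.0 / gamma)


radiance_field = [0.02, 0.10, 0.45]
exposure = 1.5
c = 4.0  # arbitrary global scale factor

original = [ldr_pixel(r, exposure) for r in radiance_field]
rescaled = [ldr_pixel(c * r, exposure / c) for r in radiance_field]

# The two decompositions are indistinguishable from LDR data alone.
indistinguishable = all(
    math.isclose(a, b, rel_tol=1e-12) for a, b in zip(original, rescaled)
)
print(indistinguishable)
```

This is the global scale ambiguity raised in the referee report's first major comment; any claim of a unique HDR field has to rest on the extra constraints or on fixing a reference scale.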

Axiom & Free-Parameter Ledger

2 free parameters · 1 axioms · 0 invented entities

The central claim rests on the assumption that the image formation model permits clean separation of a shared linear HDR field from per-view adjustments; the paper introduces no new physical constants or particles but treats exposure scales and tone-mapping curves as optimizable quantities.

free parameters (2)
  • per-view exposure scales
    Fitted parameters that absorb camera-specific brightness differences; their values are determined during the joint optimization rather than taken from external calibration.
  • tone-mapping function parameters
    Per-view parameters that model the non-linear mapping from linear radiance to LDR pixel values; chosen to fit the observed images.
axioms (1)
  • domain assumption: The physical image-formation process can be modeled as a linear HDR radiance field followed by per-view exposure scaling and tone mapping.
    Invoked in the abstract when stating that the method is 'grounded in the physical image-formation process' and enforces 'relative-exposure consistency and HDR-domain radiance regularization'.
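
In symbols, the axiom and the two free-parameter families can be written compactly. The notation is introduced here for illustration and is not taken from the paper: $L(p)$ is the shared linear HDR radiance at scene point $p$, $e_v$ the exposure scale of view $v$, $f_v$ its tone-mapping function, and $I_v(p)$ the observed LDR value.

```latex
% Forward model (the domain assumption)
I_v(p) = f_v\big(e_v \, L(p)\big), \qquad v = 1, \dots, V

% Consequence for two views a, b of the same point, wherever
% f_a and f_b are invertible (no clipping) and L(p) > 0
\frac{f_a^{-1}\big(I_a(p)\big)}{f_b^{-1}\big(I_b(p)\big)} = \frac{e_a}{e_b}
```

The second line only follows from the assumed model; whether the paper's relative-exposure consistency loss takes exactly this form is not stated on this page.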

pith-pipeline@v0.9.0 · 5796 in / 1446 out tokens · 27520 ms · 2026-05-19T20:54:31.149506+00:00 · methodology


Reference graph

Works this paper leans on

28 extracted references · 28 canonical work pages

  1. [1]

    Hdr-gs: Efficient high dynamic range novel view synthesis at 1000x speed via gaussian splatting

    Yuanhao Cai, Zihao Xiao, Yixun Liang, Minghan Qin, Yulun Zhang, Xiaokang Yang, Yaoyao Liu, and Alan Yuille. Hdr-gs: Efficient high dynamic range novel view synthesis at 1000x speed via gaussian splatting. In NeurIPS, 2024. 2

  2. [2]

    Pseudo-simulation for autonomous driving

    Wei Cao, Marcel Hallgarten, Tianyu Li, Daniel Dauner, Xunjiang Gu, Caojun Wang, Yakov Miron, Marco Aiello, Hongyang Li, Igor Gilitschenski, Boris Ivanovic, Marco Pavone, Andreas Geiger, and Kashyap Chitta. Pseudo-simulation for autonomous driving. In Conference on Robot Learning (CoRL), 2025. 2

  3. [3]

    Hallucinated neural radiance fields in the wild

    Xingyu Chen, Qi Zhang, Xiaoyu Li, Yue Chen, Ying Feng, Xuan Wang, and Jue Wang. Hallucinated neural radiance fields in the wild. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 12943–12952, 2022. 2, 4

  4. [4]

    Periodic vibration gaussian: Dynamic urban scene reconstruction and real-time rendering. arXiv preprint arXiv:2311.18561, 2023

    Yurui Chen, Chun Gu, Junzhe Jiang, Xiatian Zhu, and Li Zhang. Periodic vibration gaussian: Dynamic urban scene reconstruction and real-time rendering. arXiv:2311.18561,

  5. [5]

    Omnire: Omni urban scene reconstruction

    Ziyu Chen, Jiawei Yang, Jiahui Huang, Riccardo de Lutio, Janick Martinez Esturo, Boris Ivanovic, Or Litany, Zan Gojcic, Sanja Fidler, Marco Pavone, Li Song, and Yue Wang. Omnire: Omni urban scene reconstruction. In The Thirteenth International Conference on Learning Representations, 2025. 2, 3

  6. [6]

    Aleth-nerf: Illumination adaptive nerf with concealing field assumption

    Ziteng Cui, Lin Gu, Xiao Sun, Xianzheng Ma, Yu Qiao, and Tatsuya Harada. Aleth-nerf: Illumination adaptive nerf with concealing field assumption. In Proceedings of the AAAI Conference on Artificial Intelligence, 2024. 2

  7. [7]

    Luminance-gs: Adapting 3d gaussian splatting to challenging lighting conditions with view-adaptive curve adjustment

    Ziteng Cui, Xuangeng Chu, and Tatsuya Harada. Luminance-gs: Adapting 3d gaussian splatting to challenging lighting conditions with view-adaptive curve adjustment. In Proceedings of the Computer Vision and Pattern Recognition Conference (CVPR), 2025. 2, 5, 6, 7

  8. [8]

    Discovering an image-adaptive coordinate system for photography processing

    Ziteng Cui, Lin Gu, and Tatsuya Harada. Discovering an image-adaptive coordinate system for photography processing. arXiv preprint arXiv:2501.06448, 2025. 5, 6, 7

  9. [9]

    Navsim: Data-driven non-reactive autonomous vehicle simulation and benchmarking

    Daniel Dauner, Marcel Hallgarten, Tianyu Li, Xinshuo Weng, Zhiyu Huang, Zetong Yang, Hongyang Li, Igor Gilitschenski, Boris Ivanovic, Marco Pavone, Andreas Geiger, and Kashyap Chitta. Navsim: Data-driven non-reactive autonomous vehicle simulation and benchmarking. In Advances in Neural Information Processing Systems (NeurIPS), 2024. 2

  10. [10]

    CARLA: An open urban driving simulator

    Alexey Dosovitskiy, German Ros, Felipe Codevilla, Antonio Lopez, and Vladlen Koltun. CARLA: An open urban driving simulator. In Proceedings of the 1st Annual Conference on Robot Learning, pages 1–16, 2017. 5, 7, 1

  11. [11]

    3d gaussian splatting for real-time radiance field rendering

    Bernhard Kerbl, Georgios Kopanas, Thomas Leimkühler, and George Drettakis. 3d gaussian splatting for real-time radiance field rendering. ACM Transactions on Graphics, 42(4), 2023. 2, 5, 6, 7

  12. [12]

    Mtgs: Multi-traversal gaussian splatting. arXiv preprint arXiv:2503.12552, 2025

    Tianyu Li, Yihang Qiu, Zhenhua Wu, Carl Lindström, Peng Su, Matthias Nießner, and Hongyang Li. Mtgs: Multi-traversal gaussian splatting. arXiv preprint arXiv:2503.12552, 2025. 2

  13. [13]

    Yiyu Li, Haoyuan Wang, Ke Xu, Gerhard Petrus Hancke, and Rynson W.H. Lau. Sehdr: Single-exposure hdr novel view synthesis via 3d gaussian bracketing. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), 2025. 2

  14. [14]

    Gausshdr: High dynamic range gaussian splatting via learning unified 3d and 2d local tone mapping

    Jinfeng Liu, Lingtong Kong, Bo Li, and Dan Xu. Gausshdr: High dynamic range gaussian splatting via learning unified 3d and 2d local tone mapping. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2025. 2, 4

  15. [15]

    Gausshdr: High dynamic range gaussian splatting via learning unified 3d and 2d local tone mapping

    Jinfeng Liu, Lingtong Kong, Bo Li, and Dan Xu. Gausshdr: High dynamic range gaussian splatting via learning unified 3d and 2d local tone mapping. In CVPR, 2025. 2

  16. [16]

    Nerf: Representing scenes as neural radiance fields for view synthesis

    Ben Mildenhall, Pratul P. Srinivasan, Matthew Tancik, Jonathan T. Barron, Ravi Ramamoorthi, and Ren Ng. Nerf: Representing scenes as neural radiance fields for view synthesis. In ECCV, 2020. 2

  17. [17]

    Desire-gs: 4d street gaussians for static-dynamic decomposition and surface reconstruction for urban driving scenes

    Chensheng Peng, Chengwei Zhang, Yixiao Wang, Chenfeng Xu, Yichen Xie, Wenzhao Zheng, Kurt Keutzer, Masayoshi Tomizuka, and Wei Zhan. Desire-gs: 4d street gaussians for static-dynamic decomposition and surface reconstruction for urban driving scenes. In Proceedings of the Computer Vision and Pattern Recognition Conference (CVPR), 2025. 3, 7

  18. [18]

    Structure-from-motion revisited

    Johannes L. Schonberger and Jan-Michael Frahm. Structure-from-motion revisited. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016. 5

  19. [19]

    Scalability in perception for autonomous driving: Waymo open dataset

    Pei Sun, Henrik Kretzschmar, Xerxes Dotiwalla, Aurelien Chouard, Vijaysai Patnaik, Paul Tsui, James Guo, Yin Zhou, Yuning Chai, Benjamin Caine, Vijay Vasudevan, Wei Han, Jiquan Ngiam, Hang Zhao, Aleksei Timofeev, Scott Ettinger, Maxim Krivokon, Amy Gao, Aditya Joshi, Yu Zhang, Jonathon Shlens, Zhifeng Chen, and Dragomir Anguelov. Scalability in percepti...

  20. [20]

    Haoyuan Wang, Xiaogang Xu, Ke Xu, and Rynson W.H. Lau. Lighting up nerf via unsupervised decomposition and enhancement. In ICCV, 2023. 2

  21. [21]

    Bilateral guided radiance field processing

    Yuehao Wang, Chaoyi Wang, Bingchen Gong, and Tianfan Xue. Bilateral guided radiance field processing. ACM Transactions on Graphics (TOG), 43(4):1–13, 2024

  22. [22]

    Editable scene simulation for autonomous driving via collaborative llm-agents

    Yuxi Wei, Zi Wang, Yifan Lu, Chenxin Xu, Changxing Liu, Hao Zhao, Siheng Chen, and Yanfeng Wang. Editable scene simulation for autonomous driving via collaborative llm-agents. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2024. 2

  23. [23]

    Difix3d+: Improving 3d reconstructions with single-step diffusion models

    Jay Zhangjie Wu, Yuxuan Zhang, Haithem Turki, Xuanchi Ren, Jun Gao, Mike Zheng Shou, Sanja Fidler, Zan Gojcic, and Huan Ling. Difix3d+: Improving 3d reconstructions with single-step diffusion models. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2025. 5

  24. [24]

    Street gaussians: Modeling dynamic urban scenes with gaussian splatting

    Yunzhi Yan, Haotong Lin, Chenxu Zhou, Weijie Wang, Haiyang Sun, Kun Zhan, Xianpeng Lang, Xiaowei Zhou, and Sida Peng. Street gaussians: Modeling dynamic urban scenes with gaussian splatting. In ECCV, 2024. 2, 3

  25. [25]

    Hugsim: A real-time, photo-realistic and closed-loop simulator for autonomous driving

    Hongyu Zhou, Longzhong Lin, Jiabao Wang, Yichong Lu, Dongfeng Bai, Bingbing Liu, Yue Wang, Andreas Geiger, and Yiyi Liao. Hugsim: A real-time, photo-realistic and closed-loop simulator for autonomous driving. arXiv preprint arXiv:2412.01718, 2024. 2

  26. [26]

    Hugs: Holistic urban 3d scene understanding via gaussian splatting

    Hongyu Zhou, Jiahao Shao, Lu Xu, Dongfeng Bai, Weichao Qiu, Bingbing Liu, Yue Wang, Andreas Geiger, and Yiyi Liao. Hugs: Holistic urban 3d scene understanding via gaussian splatting. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2024. 2, 7

  27. [27]

    Drivinggaussian: Composite gaussian splatting for surrounding dynamic autonomous driving scenes

    Xiaoyu Zhou, Zhiwei Lin, Xiaojun Shan, Yongtao Wang, Deqing Sun, and Ming-Hsuan Yang. Drivinggaussian: Composite gaussian splatting for surrounding dynamic autonomous driving scenes. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition,

  28. [28]

    CARLA Dataset We build a synthetic dataset in CARLA 0.9.15 [10] to isolate exposure variation while fixing geometry, camera poses, and rendering pipeline

    P2GS: Physical Prior-guided Gaussian Splatting for Photometrically Consistent Urban Reconstruction. Supplementary Material. A. CARLA Dataset. We build a synthetic dataset in CARLA 0.9.15 [10] to isolate exposure variation while fixing geometry, camera poses, and rendering pipeline. This enables controlled evaluation of view-invariant HDR radiance recover...