SoLAR: Error-Resilient Streamable Long-Horizon Free-Viewpoint Video Reconstruction with Anchor Activation and Latent Recalibration
Pith reviewed 2026-05-11 00:59 UTC · model grok-4.3
The pith
SoLAR provides error-resilient reconstruction for long free-viewpoint videos without GOP partitioning.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
SoLAR is presented as the first error-resilient streamable framework for free-viewpoint video that maintains stable reconstruction quality on long sequences without requiring group-of-pictures partitioning. It achieves this through Anchor Activation Dynamics that dynamically model non-rigid transformations by activating informative anchors and suppressing redundant ones, along with Latent Discrepancy Aware Recalibration that identifies and corrects discrepancies in latent representations to stop error propagation.
What carries the argument
Anchor Activation Dynamics (AAD) for dynamic anchor activation to model non-rigid transformations, and Latent Discrepancy Aware Recalibration (LaDAR) for identifying and fixing latent discrepancies to mitigate error propagation.
If this is right
- Achieves state-of-the-art reconstruction performance on long sequences
- Maintains minimum storage overhead
- Preserves real-time performance
- Advances practical deployment of immersive media systems
Where Pith is reading between the lines
- This could allow for continuous streaming of volumetric content in applications like virtual reality without periodic quality resets.
- The mechanisms might be adaptable to other error-prone reconstruction tasks in 3D vision.
- Further work could test integration with existing video codecs for hybrid systems.
Load-bearing premise
The mechanisms of Anchor Activation Dynamics and Latent Discrepancy Aware Recalibration can effectively mitigate error propagation over long sequences while preserving real-time speed and compact storage.
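The premise can be made concrete with a toy model of error accumulation. The sketch below is not LaDAR and is not taken from the paper: it assumes, purely for illustration, that each frame inherits the previous frame's error and adds a fixed amount, and that a recalibration step discards the inherited error once it crosses a threshold. The function name `simulate_drift` and all of its parameters are hypothetical.

```python
def simulate_drift(num_frames, per_frame_error, threshold=None):
    """Return the accumulated error after each frame.

    Hypothetical toy model: every frame adds `per_frame_error` to the
    error inherited from the previous frame. If `threshold` is set, the
    accumulated error is reset to zero whenever it exceeds the threshold,
    standing in for a recalibration step.
    """
    accumulated = 0.0
    history = []
    for _ in range(num_frames):
        accumulated += per_frame_error
        if threshold is not None and accumulated > threshold:
            accumulated = 0.0  # recalibrate: discard inherited error
        history.append(accumulated)
    return history


# Without recalibration the error grows with sequence length;
# with it, the error stays bounded by the threshold.
no_recal = simulate_drift(300, 1.0)
with_recal = simulate_drift(300, 1.0, threshold=5.0)
```

In this toy setting the unbounded run ends at an error proportional to the number of frames, while the thresholded run never exceeds the threshold. Whether the real mechanism achieves a comparable bound at real-time speed is exactly what the premise leaves open.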
What would settle it
A demonstration that SoLAR shows significant quality degradation or increased storage requirements on long free-viewpoint video sequences, compared to short-sequence methods.
Original abstract
Free-Viewpoint Video (FVV) has emerged as a cornerstone of next-generation immersive media systems and attracted widespread attention. Previous methods primarily focus on short video sequences and suffer from significant performance degradation when processing long-horizon free-viewpoint video (LFVV). Motivated by bit allocation theory, we analyze dynamic-anchor-based volumetric video representation within a rate-distortion optimization framework and propose SoLAR, which is the first error-resilient streamable FVV framework that maintains stable reconstruction quality on long sequences without requiring group-of-pictures partitioning. We propose the Anchor Activation Dynamics (AAD), which enables dynamic anchors to model non-rigid transformations by dynamically activating informative anchors and suppressing redundant ones. Furthermore, we introduce Latent Discrepancy Aware Recalibration (LaDAR), which is a mechanism to identify discrepancies between latent representations and recalibrate the correspondences encoded in the network, effectively mitigating error propagation in LFVV without compromising real-time performance or storage compactness. Extensive experiments demonstrate that SoLAR achieves state-of-the-art reconstruction performance while maintaining minimum storage overhead, which provides a new direction for LFVV reconstruction and advances the practical deployment of immersive systems. Demo free-viewpoint videos are provided in the supplementary material.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper proposes SoLAR, the first error-resilient streamable framework for long-horizon free-viewpoint video (LFVV) reconstruction. Motivated by bit allocation theory and rate-distortion optimization of dynamic-anchor volumetric representations, it introduces Anchor Activation Dynamics (AAD) to dynamically activate informative anchors for non-rigid motion while suppressing redundant ones, and Latent Discrepancy Aware Recalibration (LaDAR) to detect latent discrepancies and recalibrate network correspondences, thereby mitigating error propagation without group-of-pictures partitioning. The work claims state-of-the-art reconstruction quality on long sequences, minimal storage overhead, real-time performance, and provides supplementary demo videos.
Significance. If validated, the result would be significant for immersive media and computer vision by enabling practical long-sequence FVV without the quality degradation typical of prior short-sequence methods. The AAD and LaDAR mechanisms, presented as additive to existing volumetric representations and grounded in rate-distortion analysis, offer a concrete direction for streamable error-resilient reconstruction. The explicit provision of demo videos in the supplementary material is a strength for assessing practical impact.
major comments (3)
- §3 (AAD description): the claim that AAD enables dynamic activation of informative anchors for non-rigid transformations while maintaining storage compactness is central to the no-GOP claim, yet the activation criterion and its rate-distortion cost are presented only at high level without an explicit equation or algorithm; this makes it impossible to verify the asserted minimal overhead or parameter-free character.
- §4 (LaDAR description): LaDAR is introduced to identify latent discrepancies and recalibrate correspondences to block error propagation in LFVV; the precise discrepancy metric, recalibration update rule, and proof that it preserves real-time performance are missing, which is load-bearing for the error-resilience claim.
- §5.1 (quantitative results): the central claim of SOTA performance and stable long-horizon quality without GOP partitioning requires specific tables reporting PSNR/SSIM (or equivalent) versus sequence length, with error bars, direct comparisons to GOP-based baselines, and ablations isolating AAD and LaDAR contributions; absence of these undermines assessment of the weakest assumption that the mechanisms actually mitigate propagation.
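The "PSNR versus sequence length" evidence requested in the last comment could be summarized with a per-frame PSNR curve and its trend. The following is a minimal sketch, assuming per-frame mean squared errors are already available and that pixel values are normalized to a peak of 1.0; the helper names `psnr` and `psnr_slope` are illustrative and do not come from the paper.

```python
import math


def psnr(mse, peak=1.0):
    """Peak signal-to-noise ratio in dB for a given mean squared error."""
    if mse <= 0:
        return math.inf  # identical images: PSNR is unbounded
    return 10.0 * math.log10(peak * peak / mse)


def psnr_slope(psnr_per_frame):
    """Least-squares slope of PSNR against frame index, in dB per frame.

    A slope near zero suggests stable quality over the sequence;
    a clearly negative slope suggests accumulating error.
    """
    n = len(psnr_per_frame)
    if n < 2:
        raise ValueError("need at least two frames")
    mean_x = (n - 1) / 2.0
    mean_y = sum(psnr_per_frame) / n
    cov = sum((i - mean_x) * (y - mean_y) for i, y in enumerate(psnr_per_frame))
    var = sum((i - mean_x) ** 2 for i in range(n))
    return cov / var
```

A single slope is a coarse summary: it would need to be reported alongside the error bars and GOP-based baselines the referee asks for, since a flat curve at low quality is not evidence of resilience.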
minor comments (2)
- Abstract: the acronym LFVV is introduced without prior expansion; define 'long-horizon free-viewpoint video' on first use.
- Method, notation: the terms 'dynamic anchors' and 'latent representations' are used repeatedly; a brief table or paragraph clarifying their relation to standard volumetric codecs would improve readability.
Simulated Author's Rebuttal
Thank you for the constructive feedback on our manuscript. We appreciate the referee's careful reading and will use these comments to improve the clarity and rigor of the presentation. We address each major comment below.
Point-by-point responses
- Referee: §3 (AAD description): the claim that AAD enables dynamic activation of informative anchors for non-rigid transformations while maintaining storage compactness is central to the no-GOP claim, yet the activation criterion and its rate-distortion cost are presented only at high level without an explicit equation or algorithm; this makes it impossible to verify the asserted minimal overhead or parameter-free character.
  Authors: We agree that the current high-level description of AAD limits verifiability. In the revised manuscript we will insert the explicit activation criterion (derived from the rate-distortion analysis in §3), the associated cost function, and a concise algorithm box that shows the dynamic activation/suppression logic. These additions will make the parameter-free property and storage overhead explicit while preserving the streamable, no-GOP design.
  Revision: yes
- Referee: §4 (LaDAR description): LaDAR is introduced to identify latent discrepancies and recalibrate correspondences to block error propagation in LFVV; the precise discrepancy metric, recalibration update rule, and proof that it preserves real-time performance are missing, which is load-bearing for the error-resilience claim.
  Authors: We will expand §4 with the exact latent discrepancy metric, the closed-form recalibration update rule, and a complexity analysis together with measured runtime figures on standard hardware. While a formal mathematical proof of real-time invariance is difficult because the overhead is data-dependent, the added analysis and empirical FPS numbers will substantiate that LaDAR does not compromise the real-time claim.
  Revision: partial
- Referee: §5.1 (quantitative results): the central claim of SOTA performance and stable long-horizon quality without GOP partitioning requires specific tables reporting PSNR/SSIM (or equivalent) versus sequence length, with error bars, direct comparisons to GOP-based baselines, and ablations isolating AAD and LaDAR contributions; absence of these undermines assessment of the weakest assumption that the mechanisms actually mitigate propagation.
  Authors: We will augment §5.1 with the requested tables: PSNR/SSIM versus sequence length (with error bars from repeated runs), side-by-side comparisons against representative GOP-based baselines, and dedicated ablation studies that isolate AAD and LaDAR. These additions will directly address the concern about error propagation and strengthen the long-horizon stability evidence.
  Revision: yes
Circularity Check
No significant circularity detected
Full rationale
The paper's derivation begins from bit allocation theory to motivate analysis of dynamic-anchor volumetric representations, then introduces AAD for dynamic anchor activation and LaDAR for latent recalibration as additive mechanisms. These are presented as novel proposals without reducing to self-definitions, fitted inputs renamed as predictions, or load-bearing self-citations. The central claim of error-resilient streamable LFVV without GOP partitioning is supported by the new components and asserted experimental results rather than any internal equivalence by construction. The framework remains self-contained against external benchmarks.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption: Bit allocation theory applies to dynamic-anchor-based volumetric video representation within a rate-distortion optimization framework
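To show what this axiom would imply for anchor activation, here is a sketch of a standard Lagrangian rate-distortion decision, J = D + λR, applied per anchor. This is not the paper's AAD criterion, which the referee report notes is not given explicitly. It assumes each anchor contributes independently, that a per-anchor distortion reduction and bit cost can be estimated, and that λ is supplied by the caller; `select_active_anchors` and its arguments are hypothetical names.

```python
def select_active_anchors(distortion_gain, rate_cost, lam):
    """Return indices of anchors whose activation lowers J = D + lam * R.

    distortion_gain[i]: assumed reduction in distortion if anchor i is active.
    rate_cost[i]: assumed number of bits needed to transmit anchor i's update.
    lam: Lagrange multiplier trading distortion against rate.

    Under the independence assumption, activating anchor i changes J by
    (-distortion_gain[i] + lam * rate_cost[i]), so it is worthwhile
    exactly when distortion_gain[i] > lam * rate_cost[i].
    """
    if len(distortion_gain) != len(rate_cost):
        raise ValueError("inputs must have the same length")
    return [
        i
        for i, (gain, rate) in enumerate(zip(distortion_gain, rate_cost))
        if gain > lam * rate
    ]


# A larger lam penalizes rate more heavily, so fewer anchors are activated.
active_strict = select_active_anchors([4.0, 0.5, 2.0], [1.0, 1.0, 4.0], lam=1.0)
active_loose = select_active_anchors([4.0, 0.5, 2.0], [1.0, 1.0, 4.0], lam=0.25)
```

With these made-up numbers, only the first anchor passes at λ = 1.0, while all three pass at λ = 0.25. The independence assumption is the weak point of such a sketch: anchors that model overlapping regions would interact, which is one reason an explicit criterion from the authors matters.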
invented entities (2)
- Anchor Activation Dynamics (AAD): no independent evidence
- Latent Discrepancy Aware Recalibration (LaDAR): no independent evidence