pith. machine review for the scientific record.

arxiv: 2604.02836 · v1 · submitted 2026-04-03 · 💻 cs.CV

Recognition: no theorem link

Factorized Multi-Resolution HashGrid for Efficient Neural Radiance Fields: Execution on Edge-Devices

Authors on Pith: no claims yet

Pith reviewed 2026-05-13 20:02 UTC · model grok-4.3

classification 💻 cs.CV
keywords neural radiance fields · hash encoding · tensor factorization · on-device training · memory efficiency · edge computing · 3D scene representation

The pith

Fact-Hash projects 3D coordinates to lower dimensions before hashing to cut NeRF memory use by over one third while preserving PSNR.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper establishes that combining tensor factorization with hash encoding allows neural radiance fields to run on memory-limited edge devices. By projecting each 3D point into several 2D or 1D planes, hashing those planes separately, and then aggregating the resulting features, Fact-Hash keeps high-resolution detail without storing a full 3D grid. This approach maintains rendering quality measured by PSNR while reducing memory footprint and energy draw compared with prior positional encodings. The on-device experiments show faster training and lower power consumption, opening the door to private, low-latency scene capture on phones and embedded hardware.

Core claim

Fact-Hash merges tensor factorization and hash encoding by first projecting 3D coordinates into multiple lower-dimensional (2D or 1D) forms, applying the hash function to each, and aggregating the results into a single feature vector. This yields rich high-resolution features with far fewer parameters than a direct 3D hash grid, delivering over one-third memory savings while holding PSNR values steady against previous encoding methods. On-device tests confirm gains in computational efficiency and reduced energy consumption relative to standard positional encodings.

What carries the argument

Fact-Hash: projection of each 3D coordinate into multiple lower-dimensional (2D or 1D) planes, independent hashing of those planes, and aggregation of the hashed features into one vector.
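The projection-hash-aggregate pipeline above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the paper's implementation: the iNGP-style hash primes, nearest-cell lookup, table sizes, feature width, and sum-aggregation are all placeholders chosen for clarity.

```python
# Illustrative sketch of the Fact-Hash idea: project each 3D point onto
# three 2D planes, hash each plane at several resolutions, and sum the
# looked-up features. All sizes and the sum-aggregation are assumptions.
import numpy as np

PRIMES = np.array([1, 2654435761], dtype=np.uint64)  # iNGP-style hash primes

def hash_2d(coords, table_size):
    """Spatial hash of integer 2D grid coordinates into [0, table_size)."""
    c = coords.astype(np.uint64)
    return ((c[..., 0] * PRIMES[0]) ^ (c[..., 1] * PRIMES[1])) % np.uint64(table_size)

def fact_hash_features(xyz, tables, resolutions):
    """xyz: (N, 3) points in [0, 1). tables: dict[(plane, level)] -> (T, F) array."""
    planes = [(0, 1), (0, 2), (1, 2)]  # xy, xz, yz projections
    feat = 0.0
    for level, res in enumerate(resolutions):
        for plane in planes:
            grid = np.floor(xyz[:, plane] * res)   # nearest-cell grid coordinate
            table = tables[(plane, level)]         # (T, F) learned feature table
            idx = hash_2d(grid, table.shape[0])
            feat = feat + table[idx]               # aggregate by summation
    return feat  # (N, F)

# Tiny usage example with random (untrained) tables.
rng = np.random.default_rng(0)
resolutions = [16, 32, 64]
tables = {(p, l): rng.standard_normal((512, 2)).astype(np.float32)
          for p in [(0, 1), (0, 2), (1, 2)] for l in range(len(resolutions))}
f = fact_hash_features(rng.random((5, 3)), tables, resolutions)
print(f.shape)  # (5, 2)
```

The point of the sketch is the parameter accounting: each level stores three small 2D tables instead of one table indexing the full 3D grid, which is where the claimed memory savings would come from.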

Load-bearing premise

Projecting 3D points to several lower-dimensional planes, hashing each plane, and summing the results still supplies enough high-resolution information to avoid quality loss in the final radiance field.

What would settle it

A side-by-side test on a standard NeRF benchmark scene under identical training budgets: if Fact-Hash produces PSNR at least 1 dB below the strongest baseline hash-grid method, the quality-preservation claim fails.
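A minimal sketch of how such a settling test could be scored. The 1 dB margin comes from the criterion above; the synthetic images stand in for actual renders and are purely illustrative.

```python
# Sketch of the settling test: does Fact-Hash trail the strongest hash-grid
# baseline by at least 1 dB PSNR on the same scene and training budget?
# The images below are synthetic stand-ins, not renders from either method.
import numpy as np

def psnr(pred, gt, max_val=1.0):
    """Peak signal-to-noise ratio in dB for images valued in [0, max_val]."""
    mse = np.mean((pred - gt) ** 2)
    return 10.0 * np.log10(max_val ** 2 / mse)

def quality_claim_refuted(fact_hash_img, baseline_img, gt, margin_db=1.0):
    """True if Fact-Hash trails the baseline by >= margin_db PSNR."""
    return psnr(baseline_img, gt) - psnr(fact_hash_img, gt) >= margin_db

rng = np.random.default_rng(1)
gt = rng.random((16, 16, 3))
good = np.clip(gt + 0.005 * rng.standard_normal(gt.shape), 0.0, 1.0)
bad = np.clip(gt + 0.05 * rng.standard_normal(gt.shape), 0.0, 1.0)
print(quality_claim_refuted(bad, good, gt))  # True: gap well over 1 dB
```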

Figures

Figures reproduced from arXiv: 2604.02836 by GeonU Kim, Jin-Hwa Kim, Kim Jun-Seong, Mingyu Kim, Tae-Hyun Oh.

Figure 1: Comparison of Instant-NGP [16], TensoRF [14], K-planes [15], and Ours in terms of PSNR and inference time on the edge device. Training is conducted on a standard GPU machine, whereas inference is performed on the edge device, aligning with standard edge-device utilization practices. The area of each circle represents the model size.

Figure 2: Conceptual illustration of parameter encoding; iNGP […]

Figure 3: Schematic of the proposed method, Fact-Hash. For a given point […]

Figure 4: Qualitative results of the 8-views case on the NeRF synthetic dataset.

Figure 5: PSNR (a) and SSIM (b) values according to the number of uniformly sampled inputs. All metrics are averaged over 7 NeRF synthetic scenes.

Figure 6: Visualization of the bitfield on the NeRF synthetic dataset. Each case is […]

Figure 8: Qualitative results on the San Francisco Mission Bay dataset. For 1/3 […]

Figure 7: Qualitative results on Tanks and Temples for 10% training inputs.

Figure 9: Comparison between iNGP and Fact-Hash at various collision rates. PSNR values are averaged over the scenes where iNGP succeeds in training, excluding hotdog, lego, and materials. We observe that as the collision rate decreases (with larger hash table size), the PSNR values degrade […]

Figure 10: Ablation study on encoding methods. To confirm the synergy between tensor factorization and hash encoding, we assessed PSNR across parameter counts on the chair scene, by compressing the factorized encoding, adding hash collisions (Fact-Hash), and reducing tensor rank (Tensor Factor.).
read the original abstract

We introduce Fact-Hash, a novel parameter-encoding method for training on-device neural radiance fields. Neural Radiance Fields (NeRF) have proven pivotal in 3D representations, but their applications are limited due to large computational resources. On-device training can open large application fields, providing strength in communication limitations, privacy concerns, and fast adaptation to a frequently changing scene. However, challenges such as limited resources (GPU memory, storage, and power) impede their deployment. To handle this, we introduce Fact-Hash, a novel parameter-encoding merging Tensor Factorization and Hash-encoding techniques. This integration offers two benefits: the use of rich high-resolution features and the few-shot robustness. In Fact-Hash, we project 3D coordinates into multiple lower-dimensional forms (2D or 1D) before applying the hash function and then aggregate them into a single feature. Comparative evaluations against state-of-the-art methods demonstrate Fact-Hash's superior memory efficiency, preserving quality and rendering speed. Fact-Hash saves memory usage by over one-third while maintaining the PSNR values compared to previous encoding methods. The on-device experiment validates the superiority of Fact-Hash compared to alternative positional encoding methods in computational efficiency and energy consumption. These findings highlight Fact-Hash as a promising solution to improve feature grid representation, address memory constraints, and improve quality in various applications. Project page: https://facthash.github.io/

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, and this is the friction.

Referee Report

3 major / 2 minor

Summary. The paper introduces Fact-Hash, a parameter-encoding technique for NeRF that merges tensor factorization with multi-resolution hash grids. 3D coordinates are projected onto multiple lower-dimensional (2D or 1D) planes, hashed independently, and aggregated into a single feature vector. The central empirical claim is that this factorization reduces memory footprint by more than one-third relative to prior hash encodings while preserving PSNR, rendering speed, and on-device efficiency (compute and energy).

Significance. If the memory and quality claims hold under rigorous controls, Fact-Hash would be a practical advance for on-device NeRF, directly addressing GPU-memory, storage, and power limits that currently restrict deployment. The combination of factorization and hashing is a natural direction for parameter-efficient volumetric representations and could support privacy-preserving or rapidly adapting scene models.

major comments (3)
  1. [Abstract, §3] Abstract and §3: the claim that Fact-Hash 'saves memory usage by over one-third while maintaining the PSNR values' is presented without the exact baseline configurations, hash-table sizes, multi-resolution schedules, or quantitative tables that would allow direct verification of the one-third figure. The central memory-saving assertion therefore cannot be assessed from the provided evidence.
  2. [Abstract] Abstract and on-device experiment: no error bars, data-split details, or ablation studies are reported for the PSNR comparisons or the energy/compute metrics. Without these, it is impossible to determine whether the observed gains are statistically reliable or sensitive to particular scenes or hardware.
  3. [§3] §3 (aggregation step): projecting 3D coordinates to independent 2D/1D planes and then aggregating (concatenation or summation) is asserted to preserve high-resolution features, yet no analysis or controlled experiment quantifies information loss for high-frequency structures or view-dependent effects. The 'maintaining PSNR' claim therefore rests on an unverified assumption about the sufficiency of the chosen aggregation and multi-resolution schedule.
minor comments (2)
  1. The manuscript would benefit from an explicit notation table or diagram clarifying the projection operators, hash functions per dimension, and the precise aggregation operator used to form the final feature vector.
  2. Device specifications (GPU/CPU model, memory capacity, power measurement method) for the on-device experiments should be stated in a dedicated subsection or table.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the constructive and detailed comments. We address each major point below and will incorporate revisions to improve the clarity and rigor of the claims regarding memory savings, statistical reporting, and analysis of the factorization approach.

read point-by-point responses
  1. Referee: [Abstract, §3] Abstract and §3: the claim that Fact-Hash 'saves memory usage by over one-third while maintaining the PSNR values' is presented without the exact baseline configurations, hash-table sizes, multi-resolution schedules, or quantitative tables that would allow direct verification of the one-third figure. The central memory-saving assertion therefore cannot be assessed from the provided evidence.

    Authors: We agree that the memory claim requires explicit supporting details for verification. The one-third reduction is computed from the total hash-table parameters in Fact-Hash (using independent 2D/1D projections per level) versus the baseline multi-resolution hash encoding with L=16 levels and T=2^19 entries per level. In the revised manuscript we will add a new table in Section 4 that lists the precise hash-table sizes, multi-resolution schedules, per-level memory footprints in MB, and the resulting percentage savings for Fact-Hash against Instant-NGP and other baselines, enabling direct numerical verification. revision: yes

  2. Referee: [Abstract] Abstract and on-device experiment: no error bars, data-split details, or ablation studies are reported for the PSNR comparisons or the energy/compute metrics. Without these, it is impossible to determine whether the observed gains are statistically reliable or sensitive to particular scenes or hardware.

    Authors: We accept that statistical reliability and experimental controls must be strengthened. In the revision we will report mean PSNR with standard deviation computed over three independent random seeds for all scenes. Data-split details (e.g., number of training views and the standard NeRF train/test partitioning) will be stated explicitly in Section 4. We will also add a dedicated ablation subsection that examines the impact of projection dimensionality (2D vs. 1D) and aggregation operator (sum vs. concatenation) on both PSNR and on-device energy metrics. revision: yes

  3. Referee: [§3] §3 (aggregation step): projecting 3D coordinates to independent 2D/1D planes and then aggregating (concatenation or summation) is asserted to preserve high-resolution features, yet no analysis or controlled experiment quantifies information loss for high-frequency structures or view-dependent effects. The 'maintaining PSNR' claim therefore rests on an unverified assumption about the sufficiency of the chosen aggregation and multi-resolution schedule.

    Authors: We acknowledge the absence of a controlled quantification of potential information loss. In the revised version we will insert a new experiment that directly compares PSNR on high-frequency-detail scenes (fine textures, sharp edges) between the factorized encoding and the full 3D hash baseline, and we will separately evaluate view-dependent performance on scenes containing specular surfaces. The multi-resolution schedule is intended to compensate for any loss by allocating finer grids to higher-frequency bands; the added experiment will measure the residual PSNR gap and discuss its magnitude. revision: yes
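The verification table promised in response 1 reduces to simple parameter accounting. A back-of-envelope sketch: L = 16 levels and T = 2^19 entries per level come from the rebuttal, but the feature width F, the three-plane layout, and the per-plane table size T_plane are hypothetical placeholders, not the paper's configuration.

```python
# Back-of-envelope check of a memory-savings figure from table sizes alone.
# L = 16 levels and T = 2^19 entries per level come from the rebuttal above;
# the feature width F, the three-plane (xy/xz/yz) layout, and the per-plane
# table size T_plane are hypothetical and must match the paper's real config.
L, T, F = 16, 2 ** 19, 2
n_planes = 3
T_plane = 2 ** 16  # hypothetical per-plane hash table size

baseline_params = L * T * F               # one full hash table per level
fact_params = L * n_planes * T_plane * F  # factorized per-plane tables
savings = 1.0 - fact_params / baseline_params
print(f"savings: {savings:.1%}")  # 62.5% with these placeholder sizes
```

Filling a table like this with the paper's actual sizes is exactly what would let a reader verify the "over one-third" claim numerically.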

Circularity Check

0 steps flagged

No circularity: Fact-Hash is an empirical encoding design with external validation

full rationale

The paper defines Fact-Hash as a direct design choice: project 3D coordinates to multiple lower-dimensional (2D/1D) forms, apply hashing independently, then aggregate the results into a feature vector for the NeRF MLP. This is motivated by combining tensor factorization ideas with hash grids to reduce table size, and the claimed memory savings (over one-third) and PSNR parity are asserted via comparative experiments on standard benchmarks plus on-device measurements. No equation chain, fitted parameter, or self-citation is shown that makes the performance numbers tautological by construction; the method remains falsifiable against independent baselines. The derivation is therefore self-contained and non-circular.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The approach rests on standard NeRF assumptions about feature grids and hash functions plus the unproven claim that lower-dimensional factorization preserves necessary detail; no explicit free parameters or new entities are named in the abstract.

axioms (1)
  • domain assumption Projecting 3D coordinates to lower-dimensional spaces before hashing and aggregating yields equivalent or superior feature richness at reduced memory cost.
    Core design premise stated in the method description.

pith-pipeline@v0.9.0 · 5571 in / 1197 out tokens · 57214 ms · 2026-05-13T20:02:58.412513+00:00 · methodology

discussion (0)


Reference graph

Works this paper leans on

32 extracted references · 32 canonical work pages

  1. [1]

    Nerf: Representing scenes as neural radiance fields for view synthesis,

B. Mildenhall, P. P. Srinivasan, M. Tancik, J. T. Barron, R. Ramamoorthi, and R. Ng, “Nerf: Representing scenes as neural radiance fields for view synthesis,” Communications of the ACM, 2021

  2. [2]

    Fov-nerf: Foveated neural radiance fields for virtual reality,

    N. Deng, Z. He, J. Ye, B. Duinkharjav, P. Chakravarthula, X. Yang, and Q. Sun, “Fov-nerf: Foveated neural radiance fields for virtual reality,” IEEE Transactions on Visualization and Comp. Graphics, 2021

  3. [3]

    Keynote-ashok elluswamy, tesla,

    A. Elluswamy, “Keynote-ashok elluswamy, tesla,” 2022, workshop on Autonomous Driving. [Online]. Available: https://bit.ly/3z6IRPR

  4. [4]

Mixture of volumetric primitives for efficient neural rendering,

    S. Lombardi, T. Simon, G. Schwartz, M. Zollhoefer, Y. Sheikh, and J. Saragih, “Mixture of volumetric primitives for efficient neural rendering,” ACM Trans. on Graphics, 2021

  5. [5]

    Mobilenerf: Exploiting the polygon rasterization pipeline for efficient neural field rendering on mobile architectures,

Z. Chen, T. Funkhouser, P. Hedman, and A. Tagliasacchi, “Mobilenerf: Exploiting the polygon rasterization pipeline for efficient neural field rendering on mobile architectures,” in IEEE Conf. on Comp. Vis. and Patt. Recog., 2023

  6. [6]

    Shacira: Scalable hash-grid compression for implicit neural representations,

S. Girish, A. Shrivastava, and K. Gupta, “Shacira: Scalable hash-grid compression for implicit neural representations,” in IEEE Int. Conf. on Comp. Vis., 2023

  7. [7]

    Compressing volumetric radiance fields to 1mb,

L. Li, Z. Shen, Z. Wang, L. Shen, and L. Bo, “Compressing volumetric radiance fields to 1mb,” in IEEE Conf. on Comp. Vis. and Patt. Recog., 2023

  8. [8]

    Compressible-composable nerf via rank-residual decomposition,

J. Tang, X. Chen, J. Wang, and G. Zeng, “Compressible-composable nerf via rank-residual decomposition,” in Advances in Neural Info. Processing Systems, 2022

  9. [9]

    Binary radiance fields,

S. Shin and J. Park, “Binary radiance fields,” in Advances in Neural Info. Processing Systems, 2023

  10. [10]

    Block-NeRF: Scalable large scene neural view synthesis,

M. Tancik, V. Casser, X. Yan, S. Pradhan, B. Mildenhall, P. Srinivasan, J. T. Barron, and H. Kretzschmar, “Block-NeRF: Scalable large scene neural view synthesis,” in IEEE Conf. on Comp. Vis. and Patt. Recog., 2022

  11. [11]

    Mega-nerf: Scalable construction of large-scale nerfs for virtual fly-throughs,

H. Turki, D. Ramanan, and M. Satyanarayanan, “Mega-nerf: Scalable construction of large-scale nerfs for virtual fly-throughs,” in IEEE Conf. on Comp. Vis. and Patt. Recog., 2022

  12. [12]

Improved direct voxel grid optimization for radiance fields reconstruction

    C. Sun, M. Sun, and H.-T. Chen, “Improved direct voxel grid optimization for radiance fields reconstruction.”

  13. [13]

    Plenoxels: Radiance fields without neural networks,

    S. Fridovich-Keil, A. Yu, M. Tancik, Q. Chen, B. Recht, and A. Kanazawa, “Plenoxels: Radiance fields without neural networks,” in IEEE Conf. on Comp. Vis. and Patt. Recog., 2022

  14. [14]

    Tensorf: Tensorial radiance fields,

A. Chen, Z. Xu, A. Geiger, J. Yu, and H. Su, “Tensorf: Tensorial radiance fields,” in Eur. Conf. on Comp. Vis., Springer, 2022

  15. [15]

    K-planes: Explicit radiance fields in space, time, and appearance,

    S. Fridovich-Keil, G. Meanti, F. R. Warburg, B. Recht, and A. Kanazawa, “K-planes: Explicit radiance fields in space, time, and appearance,” in IEEE Conf. on Comp. Vis. and Patt. Recog., 2023

  16. [16]

Instant neural graphics primitives with a multiresolution hash encoding,

    T. Müller, A. Evans, C. Schied, and A. Keller, “Instant neural graphics primitives with a multiresolution hash encoding,” ACM Trans. on Graphics, 2022

  17. [17]

    Synergistic integration of coordinate network and tensorial feature for improving nerfs from sparse inputs,

M. Kim, J. S. Kim, S. Y. Yun, and J. H. Kim, “Synergistic integration of coordinate network and tensorial feature for improving nerfs from sparse inputs,” in Int. Conf. on Machine Learning, 2024

  18. [18]

    Nerf in the wild: Neural radiance fields for unconstrained photo collections,

R. Martin-Brualla, N. Radwan, M. S. Sajjadi, J. T. Barron, A. Dosovitskiy, and D. Duckworth, “Nerf in the wild: Neural radiance fields for unconstrained photo collections,” in IEEE Conf. on Comp. Vis. and Patt. Recog., 2021

  19. [19]

Hdr-plenoxels: Self-calibrating high dynamic range radiance fields,

    K. Jun-Seong, K. Yu-Ji, M. Ye-Bin, and T.-H. Oh, “Hdr-plenoxels: Self-calibrating high dynamic range radiance fields,” in Eur. Conf. on Comp. Vis., 2022

  20. [20]

    FPRF: Feed-forward photorealistic style transfer of large-scale 3D neural radiance fields,

G. Kim, K. Youwang, and T.-H. Oh, “FPRF: Feed-forward photorealistic style transfer of large-scale 3D neural radiance fields,” in AAAI Conf. on Artificial Intelligence (AAAI), 2024

  21. [21]

    Multiscale tensor decomposition and rendering equation encoding for view synthesis,

K. Han and W. Xiang, “Multiscale tensor decomposition and rendering equation encoding for view synthesis,” in IEEE Conf. on Comp. Vis. and Patt. Recog., 2023

  22. [22]

    Rt-nerf: Real-time on-device neural radiance fields towards immersive ar/vr rendering,

C. Li, S. Li, Y. Zhao, W. Zhu, and Y. Lin, “Rt-nerf: Real-time on-device neural radiance fields towards immersive ar/vr rendering,” in IEEE/ACM Int. Conf. on Comp.-Aided Design, 2022

  23. [23]

    Instant-3d: Instant neural radiance field training towards on-device ar/vr 3d reconstruction,

S. Li, C. Li, W. Zhu, B. T. Yu, Y. K. Zhao, C. Wan, H. You, H. Shi, and Y. C. Lin, “Instant-3d: Instant neural radiance field training towards on-device ar/vr 3d reconstruction,” in Proceedings of the 50th Annual Int. Symposium on Comp. Architecture, 2023

  24. [24]

    Icarus: A specialized architecture for neural radiance fields rendering,

C. Rao, H. Yu, H. Wan, J. Zhou, Y. Zheng, M. Wu, Y. Ma, A. Chen, B. Yuan, P. Zhou, X. Lou, and J. Yu, “Icarus: A specialized architecture for neural radiance fields rendering,” ACM Trans. on Graphics, 2022

  25. [25]

    Reducing the memory footprint of 3d gaussian splatting,

P. Papantonakis, G. Kopanas, B. Kerbl, A. Lanvin, and G. Drettakis, “Reducing the memory footprint of 3d gaussian splatting,” Proceedings of the ACM on Comp. Graphics and Interactive Tech., 2024

  26. [26]

    Real-time neural light field on mobile devices,

J. Cao, H. Wang, P. Chemerys, V. Shakhrai, J. Hu, Y. Fu, D. Makoviichuk, S. Tulyakov, and J. Ren, “Real-time neural light field on mobile devices,” in IEEE Conf. on Comp. Vis. and Patt. Recog., 2023

  27. [27]

Compressing Explicit Voxel Grid Representations: fast NeRFs become also small,

    C. L. Deng and E. Tartaglione, “Compressing Explicit Voxel Grid Representations: fast NeRFs become also small,” in IEEE Winter Conf. on Applications of Comp. Vis. (WACV), 2023

  28. [28]

    Mip-nerf 360: Unbounded anti-aliased neural radiance fields,

J. T. Barron, B. Mildenhall, D. Verbin, P. P. Srinivasan, and P. Hedman, “Mip-nerf 360: Unbounded anti-aliased neural radiance fields,” in IEEE Conf. on Comp. Vis. and Patt. Recog., 2022

  29. [29]

N. LLC. (2023) Nvidia jetson xavier. [Online]. Available: https://bit.ly/3XujzUY

  30. [30]

    Freenerf: Improving few-shot neural rendering with free frequency regularization,

J. Yang, M. Pavone, and Y. Wang, “Freenerf: Improving few-shot neural rendering with free frequency regularization,” in IEEE Conf. on Comp. Vis. and Patt. Recog., 2023

  31. [31]

    Putting nerf on a diet: Semantically consistent few-shot view synthesis,

A. Jain, M. Tancik, and P. Abbeel, “Putting nerf on a diet: Semantically consistent few-shot view synthesis,” in IEEE Int. Conf. on Comp. Vis., 2021

  32. [32]

    Neural fourier filter bank,

Z. Wu, Y. Jin, and K. M. Yi, “Neural fourier filter bank,” in IEEE Conf. on Comp. Vis. and Patt. Recog., 2023