PEPS: Positional Encoding Projected Sampling -- Extended
Pith reviewed 2026-05-08 04:44 UTC · model grok-4.3
The pith
Positional encoding decomposes into projected points whose frequency motion patterns enable compact learned grid encodings that outperform prior methods with 25% fewer parameters.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
We propose the Positional Encoding Projected Sampling, where we treat the projection of the original coordinate at each frequency as a point of interest. We describe the motion of each point with respect to the frequencies and show that it follows a unique pattern. Finally, we use the unique motion of each point as a basis decomposition for doing learned positional encoding using grids. We prove, using three competitive applications—image representation, texture compression, and signed distance function—that the proposed approach outperforms the current state of the art methods, and often requires 25% less parameters for equivalent reconstruction error or rendering.
What carries the argument
Positional Encoding Projected Sampling (PEPS), which decomposes each frequency projection into a point and uses the resulting frequency-dependent motion trajectories as the basis for learned grid encodings.
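As a rough illustration of this machinery, the sketch below regroups a standard sinusoidal positional encoding of 2D coordinates into one 2D point per frequency and bilinearly samples a small feature grid at each point. The pairing of sin(2^k π x) with sin(2^k π y), the function names (`projected_points`, `bilinear_sample`, `encode`), and the grid shapes are all assumptions made for illustration; this page does not specify the paper's exact construction.

```python
import numpy as np

def projected_points(xy, num_freqs):
    """Return one 2D point per frequency for each input coordinate.

    xy: array of shape (N, 2) with coordinates in [0, 1].
    Output: array of shape (N, num_freqs, 2) with entries in [-1, 1].
    ASSUMPTION: the point at frequency k is (sin(2^k*pi*x), sin(2^k*pi*y)).
    """
    freqs = (2.0 ** np.arange(num_freqs)) * np.pi          # (K,)
    return np.sin(xy[:, None, :] * freqs[None, :, None])   # (N, K, 2)

def bilinear_sample(grid, pts):
    """Bilinearly interpolate grid (R, R, F), R >= 2, at pts (N, 2) in [-1, 1]."""
    res = grid.shape[0]
    uv = (pts + 1.0) * 0.5 * (res - 1)                      # map to [0, R-1]
    i0 = np.clip(np.floor(uv).astype(int), 0, res - 2)      # lower cell corner
    t = uv - i0                                             # weights in [0, 1]
    ix, iy = i0[..., 0], i0[..., 1]
    tx, ty = t[..., 0:1], t[..., 1:2]
    top = grid[ix, iy] * (1 - tx) + grid[ix + 1, iy] * tx
    bot = grid[ix, iy + 1] * (1 - tx) + grid[ix + 1, iy + 1] * tx
    return top * (1 - ty) + bot * ty

def encode(xy, grids):
    """Concatenate per-frequency grid features.

    grids: array of shape (K, R, R, F), one (hypothetical) learned grid
    per frequency. Output: array of shape (N, K * F).
    """
    pts = projected_points(xy, grids.shape[0])
    feats = [bilinear_sample(grids[k], pts[:, k]) for k in range(grids.shape[0])]
    return np.concatenate(feats, axis=-1)
```

In a training setup the `grids` array would be the learned parameters and the output of `encode` would feed a small MLP; both of those pieces are omitted here.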
If this is right
- Image representation reaches lower reconstruction error using grids whose resolution is determined by the motion patterns rather than by brute-force upsampling.
- Texture compression maintains visual quality with approximately 25 percent fewer network parameters than current grid or encoding baselines.
- Signed distance function rendering achieves equivalent or better accuracy without the fitting artifacts that high-resolution grids typically produce.
- Learned positional encodings become feasible on modest grids because the point-motion basis supplies the necessary high-frequency information.
- The same decomposition applies uniformly to the three tested domains, suggesting the motion patterns are not task-specific.
Where Pith is reading between the lines
- The motion-pattern analysis could be applied to other coordinate-to-signal mappings such as neural radiance fields, potentially lowering memory use in 3D scene representation.
- If the trajectories prove stable under input perturbations, the method may reduce sensitivity to coordinate scaling hyperparameters that plague many INR pipelines.
- Extending the point-motion basis to time-varying signals might enable compact representations of dynamic scenes without increasing grid resolution.
- Comparing the learned grids against analytically derived ratio-symmetric encodings could test whether the empirical patterns capture deeper geometric structure.
Load-bearing premise
The observed motion patterns of the projected points are consistent and general enough to form a reliable basis for grid-based learned encodings across different signals and without introducing new artifacts.
What would settle it
Run the three reported applications with the motion-pattern basis replaced by random trajectories or by standard positional encoding; if reconstruction error or parameter count is not worse than PEPS, the claim that the unique patterns are essential would be falsified.
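One way such a control could be set up is sketched below: two interchangeable point generators with identical output shapes, one following the power-of-two sinusoidal pattern and one using random frequencies and phases. The choice of random control, the frequency range, and the function names are assumptions for illustration only, not the paper's protocol.

```python
import numpy as np

def sinusoidal_points(xy, num_freqs):
    """Points (sin(2^k*pi*x), sin(2^k*pi*y)) for k = 0..num_freqs-1.

    xy: (N, 2) coordinates in [0, 1]. Output: (N, num_freqs, 2).
    """
    freqs = (2.0 ** np.arange(num_freqs)) * np.pi
    return np.sin(xy[:, None, :] * freqs[None, :, None])

def random_trajectory_points(xy, num_freqs, seed=0):
    """Control: same output shape and range, but no frequency-doubling pattern.

    ASSUMPTION: "random trajectories" is modeled as sinusoids with random
    per-axis frequencies and phases, fixed once by the seed.
    """
    rng = np.random.default_rng(seed)
    max_freq = (2.0 ** (num_freqs - 1)) * np.pi
    freqs = rng.uniform(np.pi, max_freq, size=(num_freqs, 2))
    phases = rng.uniform(0.0, 2.0 * np.pi, size=(num_freqs, 2))
    return np.sin(xy[:, None, :] * freqs[None] + phases[None])
```

Because both generators return arrays of shape (N, num_freqs, 2) in [-1, 1], the rest of a pipeline (grid lookup, MLP, loss) can stay unchanged while only the point source is swapped, which keeps the parameter count equal across the two arms.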
Original abstract
Implicit neural representations (INRs) are increasingly being used as tools to map coordinates to signals, encompassing applications from neural fields to texture compression, shape representations, and beyond. Most INR methods are based on using high-dimensional projections of the initial coordinates through encoders such as grid or positional encoding. Nevertheless, positional encoding is often insufficient and grids, as we show in this paper, require high resolution for being able to learn. In this paper, we demonstrate that positional encoding can be used not only as a high-dimensional embedding but also decomposed as a series of meaningful points. We propose the Positional Encoding Projected Sampling, where we treat the projection of the original coordinate at each frequency as a point of interest. We describe the motion of each point with respect to the frequencies and show that it follows a unique pattern. Finally, we use the unique motion of each point as a basis decomposition for doing learned positional encoding using grids. We prove, using three competitive applications; image representation, texture compression, and signed distance function; that the proposed approach outperforms the current state of the art methods, and often requires 25% less parameters for equivalent reconstruction error or rendering.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript introduces PEPS (Positional Encoding Projected Sampling) for implicit neural representations. It decomposes standard positional encodings by projecting input coordinates at each frequency, treating these as points whose motion across frequencies is claimed to follow a unique pattern. This pattern serves as a basis decomposition for learned positional encoding on grids. The approach is evaluated empirically on image representation, texture compression, and signed distance function tasks, with the central claim that it outperforms current state-of-the-art methods while often requiring 25% fewer parameters for equivalent reconstruction error or rendering quality.
Significance. If the projected-point motion patterns indeed supply a reliable, artifact-free basis that avoids the high-resolution grid requirement while delivering consistent gains, the method could meaningfully advance parameter-efficient INRs across computer vision and graphics. The multi-task empirical evaluation is a positive feature, as is the explicit framing of positional encoding as a decomposable point-motion process rather than a black-box embedding.
major comments (2)
- [Abstract and method description of projected sampling] The abstract asserts that the motion of projected points 'follows a unique pattern' usable for learned positional encoding on grids without high-resolution requirements or new artifacts, yet no derivation, uniqueness proof, or explicit construction of the decomposition is referenced. This is load-bearing for the parameter-reduction claim, as the pattern must demonstrably differ from standard sinusoidal embeddings or implicit dense sampling.
- [Experimental evaluation sections] The headline result (outperformance + 25% parameter savings across three tasks) is stated without reference to specific baselines, quantitative tables, error bars, or ablation controls in the provided text. If §4 or the experimental section contains these, they must be cross-referenced to the uniqueness claim; otherwise the empirical 'proof' reduces to unsupported assertion.
minor comments (2)
- [Method] Notation for the frequency-wise projections and point-motion descriptors should be formalized with equations early in the method section to improve readability.
- [Introduction] The introduction should explicitly contrast PEPS with prior grid-based INR works (e.g., those using multi-resolution or hash grids) to clarify the claimed novelty.
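One possible way to write down the notation requested in the first minor comment, assuming the standard power-of-two sinusoidal encoding of a 2D coordinate. The symbols $\gamma$, $p_k$, $G_k$, $e$ and $K$, and the pairing of the two sine terms into a point, are illustrative assumptions rather than the paper's own definitions:

```latex
% Standard positional encoding with K frequencies (per coordinate x)
\gamma(x) = \bigl(\sin(2^{k}\pi x),\ \cos(2^{k}\pi x)\bigr)_{k=0}^{K-1}

% Assumed projected point at frequency k for a 2D coordinate (x, y)
p_k(x, y) = \bigl(\sin(2^{k}\pi x),\ \sin(2^{k}\pi y)\bigr) \in [-1, 1]^2

% Motion from one frequency to the next (double-angle identity)
\sin(2^{k+1}\pi x) = 2\,\sin(2^{k}\pi x)\,\cos(2^{k}\pi x)

% Assumed learned encoding: interpolate a grid G_k at each point, then concatenate
e(x, y) = \bigoplus_{k=0}^{K-1} \operatorname{interp}\bigl(G_k,\ p_k(x, y)\bigr)
```

Under this assumed form, if the frequency is treated as a continuous variable $\omega$, the point $(\sin(\omega x), \sin(\omega y))$ traces a Lissajous curve whose shape is set by the ratio $x/y$. That would be consistent with the Lissajous references in the paper's bibliography, although the page itself does not confirm this reading.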
Simulated Author's Rebuttal
We thank the referee for the constructive comments. We address each major point below and describe the revisions that will be incorporated.
- Referee: [Abstract and method description of projected sampling] The abstract asserts that the motion of projected points 'follows a unique pattern' usable for learned positional encoding on grids without high-resolution requirements or new artifacts, yet no derivation, uniqueness proof, or explicit construction of the decomposition is referenced. This is load-bearing for the parameter-reduction claim, as the pattern must demonstrably differ from standard sinusoidal embeddings or implicit dense sampling.
  Authors: We agree that an explicit derivation strengthens the central claim. Section 3 already describes the projected-point trajectories and contrasts them visually with standard sinusoidal embeddings, but we will add a new subsection that derives the frequency-dependent motion analytically from the sinusoidal basis functions, shows the resulting unique trajectories, and formally distinguishes the decomposition from both dense sampling and conventional positional encodings. This addition will directly underpin the parameter-efficiency argument.
  Revision: yes
- Referee: [Experimental evaluation sections] The headline result (outperformance + 25% parameter savings across three tasks) is stated without reference to specific baselines, quantitative tables, error bars, or ablation controls in the provided text. If §4 or the experimental section contains these, they must be cross-referenced to the uniqueness claim; otherwise the empirical 'proof' reduces to unsupported assertion.
  Authors: Section 4 already contains the quantitative tables, baseline comparisons (including grid-based and positional-encoding methods), error metrics with standard deviations across runs, and ablations on grid resolution and parameter count. We will insert explicit forward references from the abstract and from the method description in Section 3 to the relevant tables and figures in Section 4, making the link between the uniqueness of the decomposition and the reported gains unambiguous.
  Revision: partial
Circularity Check
No significant circularity; derivation remains self-contained.
The abstract frames the method as an observation of projected-point motion patterns used to construct a basis for grid-based positional encoding, followed by empirical validation on three tasks. No equations, parameter-fitting steps, or self-citations are supplied in the given text that would reduce any claimed prediction or uniqueness result to the inputs by construction. The central claim of outperformance with fewer parameters is presented as an external empirical result rather than a definitional or fitted tautology. This satisfies the default expectation of non-circularity when no load-bearing reduction can be exhibited via direct quote.
Axiom & Free-Parameter Ledger
axioms (1)
- Domain assumption: Positional encodings can be decomposed into meaningful points whose motion with respect to frequencies follows a unique pattern usable as a basis decomposition for learned grid encodings.
invented entities (1)
- PEPS projected sampling: no independent evidence
Reference graph
Works this paper leans on
- [1] AMD. HIP: C++ Heterogeneous-Compute Interface for Portability. https://github.com/ROCm/hip, 2016. Accessed: 2025-07-31
- [2] AMD. RDNA4 Instruction Set Architecture: Reference Guide. https://www.amd.com/content/dam/amd/en/documents/radeon-tech-docs/instruction-set-architectures/rdna4-instruction-set-architecture.pdf, 2025. Accessed: 2025-07-31
- [3] Pontus Andersson, Jim Nilsson, Tomas Akenine-Möller, Magnus Oskarsson, Kalle Åström, and Mark D Fairchild. Flip: A difference evaluator for alternating images. Proc. ACM Comput. Graph. Interact. Tech., 3(2):15–1, 2020
- [4] Peter Dencker and Wolfgang Erb. Multivariate polynomial interpolation on lissajous–chebyshev nodes. Journal of Approximation Theory, 219:15–45, 2017
- [5] Wolfgang Erb, Christian Kaethner, Mandy Ahlborg, and Thorsten M Buzug. Bivariate lagrange interpolation at the node points of non-degenerate lissajous curves. Numerische Mathematik, 133(4):685–705, 2016
- [6] Farzad Farhadzadeh, Qiqi Hou, Hoang Le, Amir Said, Randall Rauwendaal, Alex Bourd, and Fatih Porikli. Neural graphics texture compression supporting random access. In European Conference on Computer Vision, pages 412–429. Springer, 2024
- [7] ffish.asia / floraZia.com. CT scan Pitted Stonefish, E. erosa,
- [8] https://sketchfab.com/3d-models/ct-scan-pitted-stonefish-e-erosa-0cdc3d1419384fd78fd952dc251a3169, Accessed: 2025-07-31, organization Sketchfab
- [9] David J Field. Relations between the statistics of natural images and the response properties of cortical cells. Journal of the Optical Society of America A, 4(12):2379–2394, 1987
- [10] Shin Fujieda and Takahiro Harada. Neural Texture Block Compression. In Jon Yngve Hardeberg and Holly Rushmeier, editors, Workshop on Material Appearance Modeling, Joint MAM - MANER Conference - Material Appearance Network for Education and Research. The Eurographics Association, 2024
- [11] Shin Fujieda, Atsushi Yoshimura, and Takahiro Harada. Local Positional Encoding for Multi-Layer Perceptrons. In Pacific Graphics Short Papers and Posters. The Eurographics Association, 2023
- [12] Martin Gardner. White and brown music, fractal curves and one-over-f fluctuations. Scientific American, 238(4):16–32, 1978
- [13] Jonas Gehring, Michael Auli, David Grangier, Denis Yarats, and Yann N Dauphin. Convolutional sequence to sequence learning. In International conference on machine learning, pages 1243–1252. PMLR, 2017
- [14] Augustine Gray and John Markel. Distance measures for speech processing. IEEE Transactions on Acoustics, Speech, and Signal Processing, 24(5):380–391, 2003
- [15] N Jeremy Kasdin. Discrete simulation of colored noise and stochastic processes and 1/f^α power law noise generation. Proceedings of the IEEE, 83(5):802–827, 1995
- [16] Marvin S Keshner. 1/f noise. Proceedings of the IEEE, 70(3):212–218, 1982
- [17] Eastman Kodak. Kodak lossless true color image suite (photocd pcd0992), 1993. URL http://r0k.us/graphics/kodak, 10, 2022
- [18] Belcour Laurent and Benyoub Anis. Hardware accelerated neural block texture compression with cooperative vectors. arXiv preprint arXiv:2506.06040, 2025
- [19] Jules-Antoine Lissajous. Mémoire sur l’étude optique des mouvements vibratoires. Mallet-Bachelier, 1857
- [20] Ben Mildenhall, Pratul P Srinivasan, Matthew Tancik, Jonathan T Barron, Ravi Ramamoorthi, and Ren Ng. Nerf: Representing scenes as neural radiance fields for view synthesis. Communications of the ACM, 65(1):99–106, 2021
- [21] Thomas Müller, Alex Evans, Christoph Schied, and Alexander Keller. Instant neural graphics primitives with a multiresolution hash encoding. ACM transactions on graphics (TOG), 41(4):1–15, 2022
- [22] Stanley Osher and Ronald Fedkiw. Signed distance functions. In Level set methods and dynamic implicit surfaces, pages 17–22. Springer, 2003
- [23] François Pachet, Pierre Roy, Alexandre Papadopoulos, and Jason Sakellariou. Generating 1/f noise sequences as constraint satisfaction: The voss constraint. In IJCAI, pages 2482–2488, 2015
- [24] Adam Paszke, Sam Gross, Soumith Chintala, Gregory Chanan, Edward Yang, Zachary DeVito, Zeming Lin, Alban Desmaison, Luca Antiga, and Adam Lerer. Automatic differentiation in pytorch. In NIPS-W, 2017
- [25] Guillaume Perez, Brendan Rappazzo, and Carla Gomes. Extending the capacity of 1/f noise generation. In International Conference on Principles and Practice of Constraint Programming, pages 601–610. Springer, 2018
- [26] Ken Perlin. An image synthesizer. ACM Siggraph Computer Graphics, 19(3):287–296, 1985
- [27] Lawrence Rabiner and Biing-Hwang Juang. Fundamentals of speech recognition. Prentice-Hall, Inc., 1993
- [28] Nasim Rahaman, Aristide Baratin, Devansh Arpit, Felix Draxler, Min Lin, Fred Hamprecht, Yoshua Bengio, and Aaron Courville. On the spectral bias of neural networks. In International conference on machine learning, pages 5301–5310. PMLR, 2019
- [29] Sameera Ramasinghe and Simon Lucey. Beyond periodicity: Towards a unifying framework for activations in coordinate-mlps. In European Conference on Computer Vision, pages 142–158. Springer, 2022
- [30] Vishwanath Saragadam, Daniel LeJeune, Jasper Tan, Guha Balakrishnan, Ashok Veeraraghavan, and Richard G Baraniuk. Wire: Wavelet implicit neural representations. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 18507–18516, 2023
- [31] Rui Shi, Yishun Dou, Zhong Zheng, Xiangzhong Fang, Wenjun Zhang, and Bingbing Ni. Neural block compression: Variable bitrates feature blocks for texture representation. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 39, pages 6878–6886, 2025
- [32] Vincent Sitzmann, Julien Martel, Alexander Bergman, David Lindell, and Gordon Wetzstein. Implicit neural representations with periodic activation functions. Advances in neural information processing systems, 33:7462–7473, 2020
- [33] J Josiah Steckenrider, Mitchell Miller, Rory Blankenship, Victor Trujillo, and James Bluman. Lissajous curves as aerial search patterns. Scientific Reports, 14(1):11144, 2024
- [34] P. Szendro, G. Vincze, and A. Szasz. Bio-response to white noise excitation. Electro- and Magnetobiology, 20(2):215–229, 2001
- [35] Towaki Takikawa, Joey Litalien, Kangxue Yin, Karsten Kreis, Charles Loop, Derek Nowrouzezahrai, Alec Jacobson, Morgan McGuire, and Sanja Fidler. Neural geometric level of detail: Real-time rendering with implicit 3d shapes. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 11358–11367, 2021
- [36] Matthew Tancik, Pratul Srinivasan, Ben Mildenhall, Sara Fridovich-Keil, Nithin Raghavan, Utkarsh Singhal, Ravi Ramamoorthi, Jonathan Barron, and Ren Ng. Fourier features let networks learn high frequency functions in low dimensional domains. Advances in neural information processing systems, 33:7537–7547, 2020
- [37] Karthik Vaidyanathan, Marco Salvi, Bartlomiej Wronski, Tomas Akenine-Moller, Pontus Ebelin, and Aaron Lefohn. Random-access neural compression of material textures. ACM Transactions on Graphics (TOG), 42:1–25, 2023
- [38] A van der Schaaf and JH van Hateren. Modelling the power spectra of natural images: statistics and information. Vision research, 36(17):2759–2770, 1996
- [39] Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N Gomez, Łukasz Kaiser, and Illia Polosukhin. Attention is all you need. Advances in neural information processing systems, 30, 2017
- [40] Delio Vicini, Sébastien Speierer, and Wenzel Jakob. Differentiable signed distance function rendering. ACM Transactions on Graphics (ToG), 41(4):1–18, 2022
- [41] Richard F Voss and John Clarke. 1/f noise in music and speech. Nature, 258(5533):317–318, 1975
- [42] Richard F Voss and John Clarke. “1/f noise” in music: Music from 1/f noise. The Journal of the Acoustical Society of America, 63(1):258–263, 1978
- [43] Yusong Wang, Shaoning Li, Tong Wang, Bin Shao, Nanning Zheng, and Tie-Yan Liu. Geometric transformer with interatomic positional encoding. Advances in Neural Information Processing Systems, 36:55981–55994, 2023
- [44] Zhou Wang, Alan C Bovik, Hamid R Sheikh, and Eero P Simoncelli. Image quality assessment: from error visibility to structural similarity. IEEE transactions on image processing, 13(4):600–612, 2004
- [45] Clément Weinreich, Louis De Oliveira, Antoine Houdard, and Georges Nader. Real-time neural materials using block-compressed features. In Computer Graphics Forum, volume 43, page e15013. Wiley Online Library, 2024
- [46] Richard Zhang, Phillip Isola, Alexei A Efros, Eli Shechtman, and Oliver Wang. The unreasonable effectiveness of deep features as a perceptual metric. In Proceedings of the IEEE conference on computer vision and pattern recognition, pages 586–595, 2018
- [47] Zelin Zhao, Fenglei Fan, Wenlong Liao, and Junchi Yan. Grounding and enhancing grid-based models for neural fields. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 19425–19435, 2024. A Global notes. Proof of proposition 1. Suppose there are two non-null points (x, y) and (x′, y′) having different ratio a/b and a′/b′ and...
- [48] The resulting graph is for example the one shown in Figure 3. Then the LPSD is simply the L1 distance between the original image and the learned one. More information about the PSD or those metrics can be found in [37,8,26,13]. D NTC Dataset. We used 18 instances from https://polyhaven.com/ and https://ambientcg.com/. We extracted the available one from [9,44] and...