Mapped Convolutions
Pith reviewed 2026-05-25 15:23 UTC · model grok-4.3
The pith
Mapped convolution decouples sampling from weighted summation to apply kernels to any structured data such as geodesic meshes.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Mapped convolutions are obtained by replacing the fixed grid sampling of standard convolution with an arbitrary sampling function that selects input values before the kernel performs its weighted sum. On spherical data this formulation supports both an improved sampling scheme for equirectangular images and a projection of the image onto a geodesic grid so that convolution occurs directly on the textured mesh. The resulting networks exceed prior spherical-convolution methods by nearly 17 percent on dense depth estimation.
What carries the argument
The mapped convolution, defined by an explicit sampling function that selects input locations before the kernel weights are applied.
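The two-step structure described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation: the names `grid_sampling_map` and `mapped_conv`, the single-channel input, the integer sampling locations, and the zero fill for out-of-bounds samples are all assumptions made for the example.

```python
import numpy as np

def grid_sampling_map(height, width, ksize):
    """Hypothetical helper: a sampling map that reproduces an ordinary
    ksize x ksize convolution (stride 1). Returns integer (row, col)
    locations with shape (height, width, ksize * ksize, 2)."""
    r = ksize // 2
    # Kernel offsets in row-major order, matching kernel.ravel().
    offsets = np.array([(dy, dx)
                        for dy in range(-r, r + 1)
                        for dx in range(-r, r + 1)])
    centers = np.stack(
        np.meshgrid(np.arange(height), np.arange(width), indexing="ij"),
        axis=-1)
    return centers[:, :, None, :] + offsets[None, None, :, :]

def mapped_conv(image, weights, sampling_map):
    """Step 1: gather the inputs named by the sampling map.
    Step 2: apply the kernel weights as a weighted sum.
    Samples that fall outside the image contribute zero (an assumption)."""
    h, w = image.shape
    rows = sampling_map[..., 0]
    cols = sampling_map[..., 1]
    valid = (rows >= 0) & (rows < h) & (cols >= 0) & (cols < w)
    samples = image[np.clip(rows, 0, h - 1), np.clip(cols, 0, w - 1)]
    samples = np.where(valid, samples, 0.0)
    return samples @ weights  # (H, W, K) @ (K,) -> (H, W)

# Usage: with the grid map, the result matches a zero-padded 3x3 filter.
image = np.arange(20, dtype=float).reshape(4, 5)
kernel = np.ones((3, 3)) / 9.0
output = mapped_conv(image, kernel.ravel(), grid_sampling_map(4, 5, 3))
```

Swapping `grid_sampling_map` for any other array of the same layout changes where the kernel looks without touching the weighted sum, which is the decoupling the section describes.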
If this is right
- Convolution kernels can be applied to any type of structured data once an appropriate sampling function is supplied.
- Equirectangular spherical images can be processed with a sampling method that reduces distortion compared with earlier approaches.
- Spherical images can be projected onto a geodesic grid so that convolution occurs directly on the mesh surface.
- Spherical depth estimation accuracy can exceed the previous state of the art by nearly 17 percent.
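To make the equirectangular bullet above concrete, here is one possible sampling function, sketched under stated assumptions. It is not the scheme proposed in the paper: it simply widens the horizontal kernel footprint by 1/cos(latitude), using the latitude of the center row only, rounds to the nearest pixel, wraps longitude, and clamps rows at the poles. The name `equirect_sampling_map` is hypothetical.

```python
import numpy as np

def equirect_sampling_map(height, width, ksize=3):
    """Illustrative sampling map for an equirectangular image.
    Returns integer (row, col) locations, shape (height, width, ksize**2, 2)."""
    r = ksize // 2
    # Latitude of each pixel-row center, strictly inside (-pi/2, pi/2).
    lat = (0.5 - (np.arange(height) + 0.5) / height) * np.pi
    # Horizontal stretch; finite because row centers never sit on a pole.
    stretch = 1.0 / np.cos(lat)
    dy, dx = np.meshgrid(np.arange(-r, r + 1), np.arange(-r, r + 1),
                         indexing="ij")
    dy, dx = dy.ravel(), dx.ravel()
    # Rows: plain vertical offsets, clamped at the poles (a simplification).
    rows = np.arange(height)[:, None, None] + dy[None, None, :]
    rows = np.clip(rows, 0, height - 1)
    # Columns: offsets scaled by the row's stretch, rounded, wrapped in longitude.
    cols = (np.arange(width)[None, :, None]
            + np.rint(dx[None, None, :] * stretch[:, None, None]))
    cols = np.mod(cols, width).astype(int)
    rows = np.broadcast_to(rows, cols.shape)
    return np.stack([rows, cols], axis=-1)

sampling_map = equirect_sampling_map(8, 16)
```

Near the equator the footprint stays close to the regular 3x3 grid, while rows near the poles reach several pixels to either side. A production version would likely interpolate at fractional locations rather than round.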
Where Pith is reading between the lines
- The same separation of sampling from summation could be applied to other grid-based operations such as pooling or transposed convolution on irregular domains.
- Geodesic meshes may serve as a more natural representation than equirectangular or cube-map formats for omnidirectional vision tasks.
- Performance differences among spherical CNN methods may largely trace to discretization choices rather than to the convolution operator itself.
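As a hedged illustration of the first speculation above, the same gather step can feed a max reduction instead of a weighted sum. The function name `mapped_max_pool`, the six-vertex ring standing in for a mesh, and the feature values are all invented for this sketch; nothing here comes from the paper.

```python
import numpy as np

def mapped_max_pool(values, sampling_map):
    """values: (V,) one feature per element (pixel, vertex, ...).
    sampling_map: (V_out, K) integer indices into values.
    Gathers the K samples for each output and takes their maximum."""
    return values[sampling_map].max(axis=1)

# Toy domain: six vertices on a ring, each pooled with its two neighbors.
values = np.array([0.1, 0.9, 0.4, 0.7, 0.2, 0.5])
idx = np.arange(6)
ring_map = np.stack([(idx - 1) % 6, idx, (idx + 1) % 6], axis=1)
pooled = mapped_max_pool(values, ring_map)
```

Because the neighborhood is just an index array, the same function would work on mesh vertices or image pixels, provided a suitable sampling map is available.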
Load-bearing premise
The proposed sampling improvements and geodesic-grid projection produce a fair comparison to earlier spherical convolution methods without introducing large discretization artifacts that inflate the reported performance gain.
What would settle it
Run the identical depth-estimation network on the same equirectangular test set using only prior spherical convolution operators, without the mapped sampling or geodesic projection, and check whether the nearly 17 percent margin disappears.
Original abstract
We present a versatile formulation of the convolution operation that we term a "mapped convolution." The standard convolution operation implicitly samples the pixel grid and computes a weighted sum. Our mapped convolution decouples these two components, freeing the operation from the confines of the image grid and allowing the kernel to process any type of structured data. As a test case, we demonstrate its use by applying it to dense inference on spherical data. We perform an in-depth study of existing spherical image convolution methods and propose an improved sampling method for equirectangular images. Then, we discuss the impact of data discretization when deriving a sampling function, highlighting drawbacks of the cube map representation for spherical data. Finally, we illustrate how mapped convolutions enable us to convolve directly on a mesh by projecting the spherical image onto a geodesic grid and training on the textured mesh. This method exceeds the state of the art for spherical depth estimation by nearly 17%. Our findings suggest that mapped convolutions can be instrumental in expanding the application scope of convolutional neural networks.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper introduces mapped convolutions as a formulation that decouples sampling from the weighted sum in standard convolution, enabling the kernel to operate on arbitrary structured data beyond image grids. As a test case on spherical data, the authors propose an improved equirectangular sampling method, critique cube-map discretization drawbacks, project spherical images onto a geodesic mesh, and report that this approach exceeds prior state-of-the-art spherical depth estimation performance by nearly 17%.
Significance. If the reported gains can be shown to arise specifically from the decoupling rather than from the accompanying sampling and projection choices, the approach would offer a general mechanism for extending CNNs to non-grid structured data such as meshes.
major comments (1)
- [Abstract / experimental results] The improvement of nearly 17% is reported only after combining mapped convolutions with a new equirectangular sampling method and a geodesic-mesh projection. No ablation is described that fixes the sampling function and varies only the convolution operator itself, leaving open the possibility that the gain is driven by discretization changes rather than the claimed decoupling. This directly affects the central claim that mapped convolutions are the enabling factor.
minor comments (1)
- [Abstract] The headline performance claim does not mention baselines, error bars, data splits, or evaluation protocol, which makes the empirical result harder to verify.
Simulated Author's Rebuttal
We thank the referee for the constructive feedback. The concern about isolating the contribution of mapped convolutions from sampling and discretization choices is valid and directly impacts the strength of our central claim. We address this below and commit to revisions that will clarify the role of the decoupling.
Point-by-point responses
- Referee: [Abstract / experimental results] The improvement of nearly 17% is reported only after combining mapped convolutions with a new equirectangular sampling method and a geodesic-mesh projection. No ablation is described that fixes the sampling function and varies only the convolution operator itself, leaving open the possibility that the gain is driven by discretization changes rather than the claimed decoupling. This directly affects the central claim that mapped convolutions are the enabling factor.
  Authors: We agree that the current experiments do not include an ablation that holds the sampling function fixed while varying only the convolution operator (standard vs. mapped). This omission leaves ambiguity about whether the reported gains stem primarily from the decoupling or from the accompanying sampling improvements and mesh projection. In the revised manuscript we will add such an ablation: we will fix the improved equirectangular sampling and compare a standard convolution baseline (adapted to the same sampling locations where feasible) against mapped convolution on the same data. We will also clarify that the geodesic-mesh experiment is only possible because mapped convolutions decouple sampling from the weighted sum, allowing direct operation on unstructured mesh vertices; standard convolutions cannot be applied in the same way without additional discretization steps. These additions will strengthen the evidence that the decoupling itself is the enabling mechanism.
  Revision: yes
Circularity Check
No circularity: mapped convolution is a direct definitional decoupling with independent empirical validation
The paper defines mapped convolution explicitly as the decoupling of sampling from the weighted sum operation, which is a constructive formulation rather than a derivation that reduces to its own inputs or prior self-citations. No equations or claims in the provided abstract or description show a self-definitional loop, a fitted parameter renamed as prediction, or a load-bearing uniqueness theorem imported from the authors' own prior work. The 17% gain is presented as an empirical outcome on spherical depth estimation after introducing sampling improvements and geodesic projection; these are separate methodological choices whose effects are not used to justify the core operator definition itself. The derivation chain remains self-contained against external benchmarks and does not rely on any of the enumerated circular patterns.