Submanifold Sparse Convolutional Networks

Benjamin Graham; Laurens van der Maaten

arxiv: 1706.01307 · v1 · pith:FDWMVI2Jnew · submitted 2017-06-05 · 💻 cs.NE · cs.CV

keywords: sparse, convolutional, data, networks, dense, network, standard, submanifold

Convolutional networks are the de-facto standard for analysing spatio-temporal data such as images, videos, and 3D shapes. Whilst some of this data is naturally dense (for instance, photos), many other data sources are inherently sparse. Examples include pen-strokes forming on a piece of paper, or (colored) 3D point clouds that were obtained using a LiDAR scanner or RGB-D camera. Standard "dense" implementations of convolutional networks are very inefficient when applied to such sparse data. We introduce a sparse convolutional operation tailored to processing sparse data that differs from prior work on sparse convolutional networks in that it operates strictly on submanifolds, rather than "dilating" the observation with every layer in the network. Our empirical analysis of the resulting submanifold sparse convolutional networks shows that they perform on par with state-of-the-art methods whilst requiring substantially less computation.
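
The contrast the abstract draws between "dilating" the observation and staying on the submanifold can be illustrated with a small sketch. It is not the authors' implementation: it assumes sparse data stored as a dictionary from grid coordinates to scalar features, and assumes that a submanifold convolution produces output only at sites that are already active in the input. The names `kernel_offsets`, `dilated_active_sites`, and `submanifold_conv` are hypothetical and chosen for illustration only.

```python
from itertools import product


def kernel_offsets(ndim, size=3):
    """All integer offsets covered by a size^ndim filter centred on a site."""
    r = size // 2
    return list(product(range(-r, r + 1), repeat=ndim))


def dilated_active_sites(active, size=3):
    """Active set after one 'dilating' sparse convolution (assumed behaviour):
    every site whose receptive field touches an active input becomes active."""
    ndim = len(next(iter(active)))
    return {
        tuple(c + o for c, o in zip(site, off))
        for site in active
        for off in kernel_offsets(ndim, size)
    }


def submanifold_conv(features, weights):
    """Sketch of a submanifold convolution on scalar features.

    features: dict mapping coordinate tuple -> float (active sites only)
    weights:  dict mapping offset tuple -> float (one weight per filter tap)
    Output exists only at input active sites, so the active set never grows.
    """
    out = {}
    for site in features:
        total = 0.0
        for off, w in weights.items():
            neighbour = tuple(c + o for c, o in zip(site, off))
            if neighbour in features:  # inactive sites contribute zero
                total += w * features[neighbour]
        out[site] = total
    return out


# A diagonal pen stroke of 5 pixels on a 2D grid.
stroke = {(i, i): 1.0 for i in range(5)}
weights = {off: 1.0 for off in kernel_offsets(2)}

dilated = dilated_active_sites(set(stroke))
result = submanifold_conv(stroke, weights)
print(len(stroke), len(dilated), len(result))
```

In this toy case a single dilating layer grows the active set from 5 to 29 sites, whereas the submanifold variant keeps it at 5. That fixed active set is where the sketch's computational saving comes from.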

Forward citations

Cited by 5 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

ShardTensor: Domain Parallelism for Scientific Machine Learning
cs.DC 2026-05 unverdicted novelty 6.0

ShardTensor is a domain-parallelism system for SciML that enables flexible scaling of extreme-resolution spatial datasets by removing the constraint of batch size one per device.

Native and Compact Structured Latents for 3D Generation
cs.CV 2025-12 unverdicted novelty 6.0

Introduces O-Voxel omni-voxel representation and Sparse Compression VAE for structured native 3D latents, enabling efficient training of large flow-matching models that produce higher-quality geometry and materials th...

SVL: Spike-based Vision-language Pretraining for Efficient 3D Open-world Understanding
cs.CV 2025-05 unverdicted novelty 6.0

SVL pretraining enables SNNs to reach 85.4% top-1 accuracy on zero-shot 3D classification while outperforming prior SNNs on detection, segmentation, and action recognition with added open-world QA capability.

Efficient Semantic Scene Completion Network with Spatial Group Convolution
cs.CV 2019-07 unverdicted novelty 6.0

Proposes Spatial Group Convolution to accelerate 3D semantic scene completion networks via grouped sparse operations, reporting state-of-the-art accuracy and speed on SUNCG.

Submanifold Sparse Convolutional Networks for Automated 3D Segmentation of Kidneys and Kidney Tumours in Computed Tomography
cs.CV 2025-11 conditional novelty 5.0

A two-stage sparse convolutional network pipeline for native high-resolution 3D kidney and tumor segmentation in CT that matches top Dice scores while reducing VRAM and runtime versus nnU-Net and SegVol.