Submanifold Sparse Convolutional Networks
Convolutional networks are the de facto standard for analysing spatio-temporal data such as images, videos, 3D shapes, etc. Whilst some of this data is naturally dense (for instance, photos), many other data sources are inherently sparse. Examples include pen-strokes forming on a piece of paper, or (colored) 3D point clouds that were obtained using a LiDAR scanner or RGB-D camera. Standard "dense" implementations of convolutional networks are very inefficient when applied to such sparse data. We introduce a sparse convolutional operation tailored to processing sparse data that differs from prior work on sparse convolutional networks in that it operates strictly on submanifolds, rather than "dilating" the observation with every layer in the network. Our empirical analysis of the resulting submanifold sparse convolutional networks shows that they perform on par with state-of-the-art methods whilst requiring substantially less computation.
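The contrast the abstract draws between "dilating" the observation and staying on the submanifold can be illustrated with a toy 2D sketch. This is not the authors' implementation: the function names (`neighbourhood`, `dilating_active_sites`, `submanifold_conv`), the dictionary-based feature storage, and the single-channel 3x3 filter are all assumptions made for illustration. It assumes that a regular sparse convolution marks an output site active whenever any input in its receptive field is active, and that the submanifold variant produces outputs only at sites that were already active, treating inactive neighbours as zero.

```python
from itertools import product


def neighbourhood(site, size=3):
    """All sites covered by a size x size filter centred on `site`."""
    r = size // 2
    x, y = site
    return {(x + dx, y + dy) for dx, dy in product(range(-r, r + 1), repeat=2)}


def dilating_active_sites(active, size=3):
    """Active set after one regular sparse convolution (illustrative).

    An output site becomes active if any input site in its receptive
    field is active, so the active set grows with every layer.
    """
    out = set()
    for site in active:
        out |= neighbourhood(site, size)
    return out


def submanifold_conv(features, weights, size=3):
    """Single-channel submanifold-style convolution (illustrative).

    `features` maps active sites to values; `weights` maps filter
    offsets (dx, dy) to values. Outputs are computed only at sites that
    are active in the input, and inactive neighbours contribute zero,
    so the active set is unchanged.
    """
    r = size // 2
    out = {}
    for (x, y) in features:
        acc = 0.0
        for dx, dy in product(range(-r, r + 1), repeat=2):
            acc += weights[(dx, dy)] * features.get((x + dx, y + dy), 0.0)
        out[(x, y)] = acc
    return out


# A diagonal "pen stroke" of five active sites, each with feature value 1.0.
stroke = {(i, i): 1.0 for i in range(5)}
ones = {offset: 1.0 for offset in product(range(-1, 2), repeat=2)}

dilated = dilating_active_sites(set(stroke))
result = submanifold_conv(stroke, ones)

print(len(stroke), len(dilated), len(result))
```

In this toy setting the dilating variant enlarges the active set after a single layer, while the submanifold variant returns values at exactly the original five sites. Stacking layers repeats the effect, which is the source of the computational savings the abstract refers to.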
Forward citations
Cited by 5 Pith papers
- ShardTensor: Domain Parallelism for Scientific Machine Learning
  ShardTensor is a domain-parallelism system for SciML that enables flexible scaling of extreme-resolution spatial datasets by removing the constraint of batch size one per device.
- Native and Compact Structured Latents for 3D Generation
  Introduces O-Voxel omni-voxel representation and Sparse Compression VAE for structured native 3D latents, enabling efficient training of large flow-matching models that produce higher-quality geometry and materials th...
- SVL: Spike-based Vision-language Pretraining for Efficient 3D Open-world Understanding
  SVL pretraining enables SNNs to reach 85.4% top-1 accuracy on zero-shot 3D classification while outperforming prior SNNs on detection, segmentation, and action recognition with added open-world QA capability.
- Efficient Semantic Scene Completion Network with Spatial Group Convolution
  Proposes Spatial Group Convolution to accelerate 3D semantic scene completion networks via grouped sparse operations, reporting state-of-the-art accuracy and speed on SUNCG.
- Submanifold Sparse Convolutional Networks for Automated 3D Segmentation of Kidneys and Kidney Tumours in Computed Tomography
  A two-stage sparse convolutional network pipeline for native high-resolution 3D kidney and tumor segmentation in CT that matches top Dice scores while reducing VRAM and runtime versus nnU-Net and SegVol.