PointCMP: Contrastive mask prediction for self-supervised learning on point cloud videos
1 Pith paper cites this work. Polarity classification is still indexing.
Fields: cs.CV (1)
Years: 2026 (1)
Verdicts: UNVERDICTED (1)
Representative citing papers
- STS-Mixer: Spatio-Temporal-Spectral Mixer for 4D Point Cloud Video Understanding
STS-Mixer decomposes 4D point cloud videos into multi-band spectral signals via graph transforms and mixes them with spatiotemporal representations to achieve better results on 3D action recognition and 4D semantic segmentation benchmarks.