DM3D: Deformable Mamba via Offset-Guided Differentiable Scanning for Point Cloud Understanding
Pith reviewed 2026-05-17 03:15 UTC · model grok-4.3
The pith
Offset-guided differentiable scanning lets Mamba models adapt serialization order to point cloud geometry instead of relying on fixed patterns.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The paper establishes that jointly optimizing resampling and reordering through offset-guided differentiable scanning produces a structure-adaptive serialization process for point clouds. Deformable Spatial Resampling enhances local geometric awareness by adaptively selecting features, Gaussian-based Differentiable Reordering permits gradient flow through the ordering choice, and the Continuity-Aware State Update modulates state transitions according to local continuity. These components together replace rigid scanning schemes with a learned, geometry-aware sequence that improves downstream performance on standard point cloud benchmarks.
What carries the argument
An offset-guided differentiable scanning mechanism that jointly performs Deformable Spatial Resampling (DSR), for adaptive local feature selection, and Gaussian-based Differentiable Reordering (GDR), for end-to-end optimization of the serialization order.
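To make the reordering half of this mechanism concrete, here is a minimal Python sketch of a Gaussian soft reordering step. It assumes that each point i carries a continuous soft index s_i, that the sequence slots sit at integer positions 0..N-1, and that features are scattered into slots by normalized Gaussian weights. The function names, the slot positions, and the scatter direction are illustrative assumptions, not details confirmed by the paper.

```python
import math


def gaussian_reorder_weights(scores, sigma):
    """Row i holds the normalized Gaussian weights of soft index scores[i] over slots 0..N-1."""
    n = len(scores)
    weights = []
    for s in scores:
        # Log-weights of each slot; subtract the max before exponentiating for stability.
        logits = [-((s - j) ** 2) / (2.0 * sigma ** 2) for j in range(n)]
        top = max(logits)
        exps = [math.exp(v - top) for v in logits]
        total = sum(exps)
        weights.append([e / total for e in exps])
    return weights


def soft_reorder(features, scores, sigma):
    """Soft scatter (an assumed direction): slot j receives sum_i W[i][j] * features[i]."""
    w = gaussian_reorder_weights(scores, sigma)
    n, dim = len(features), len(features[0])
    out = [[0.0] * dim for _ in range(n)]
    for i in range(n):
        for j in range(n):
            for c in range(dim):
                out[j][c] += w[i][j] * features[i][c]
    return out
```

When the soft indices sit exactly on slots and sigma is small relative to the slot spacing, the weights are close to one-hot and the output is close to a hard permutation; a larger sigma blends neighbouring slots, which is what lets gradients reach the soft indices.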
If this is right
- Point cloud classification accuracy rises because the model processes features in an order that respects local geometry rather than an arbitrary raster order.
- Few-shot learning benefits as the learned serialization generalizes from limited examples by focusing on structurally relevant sequences.
- Part segmentation improves through better preservation of geometric continuity during state updates across irregular surfaces.
- The method replaces multiple predefined scanning heuristics with a single differentiable procedure that can be trained jointly with the rest of the network.
Where Pith is reading between the lines
- The same differentiable reordering idea could be tested on other irregular domains such as meshes or graphs where fixed traversals are also suboptimal.
- Computational cost of the Gaussian reordering step may limit deployment on very large point clouds; measuring FLOPs versus accuracy trade-offs would clarify practicality.
- Combining this scanning module with other sequential architectures beyond Mamba could reveal whether the gains stem from the ordering flexibility itself.
- Evaluating on dynamic or multi-view point cloud sequences might show whether the learned order adapts across time as well as across space.
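The cost concern in the list above can be made concrete with a back-of-the-envelope count. The sketch below assumes the reordering step materializes a dense N x N weight matrix and mixes features with it; whether DM3D does so, or uses a sparse or windowed variant, is not stated in the material reviewed here, and the function name and default of 4 bytes per value are hypothetical.

```python
def dense_gdr_cost(num_tokens, feature_dim, bytes_per_value=4):
    """Rough cost of a dense soft reordering over num_tokens sequence elements."""
    weight_entries = num_tokens * num_tokens
    return {
        "weight_entries": weight_entries,
        "weight_bytes": weight_entries * bytes_per_value,
        # One multiply-add per (point, slot, channel) triple when mixing features.
        "mix_multiply_adds": weight_entries * feature_dim,
    }
```

Under this assumption the cost grows quadratically with the number of tokens, which is why measuring it against accuracy would matter for large point clouds.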
Load-bearing premise
Jointly optimizing resampling and reordering via differentiable scanning will stably improve performance across diverse geometric structures without training instability or heavy hyperparameter tuning.
What would settle it
Training the model on ShapeNet part segmentation and observing no mIoU gain or increased variance compared to a fixed-order Mamba baseline would indicate the adaptive scanning provides no reliable benefit.
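One hedged way to run that comparison is to train both models over several seeds and summarize the per-seed mIoU values. The sketch below is an assumption about how such a check could be scored, using Welch's t statistic; the function name and the returned keys are hypothetical, and no results from the paper are used.

```python
import math
from statistics import mean, stdev


def compare_runs(adaptive_miou, baseline_miou):
    """Compare per-seed mIoU lists (one float per training seed) for two models."""
    if len(adaptive_miou) < 2 or len(baseline_miou) < 2:
        raise ValueError("need at least two seeds per model to estimate variance")
    gap = mean(adaptive_miou) - mean(baseline_miou)
    s_a, s_b = stdev(adaptive_miou), stdev(baseline_miou)
    # Unpooled standard error of the mean gap (Welch's formulation).
    se = math.sqrt(s_a ** 2 / len(adaptive_miou) + s_b ** 2 / len(baseline_miou))
    if se > 0:
        welch_t = gap / se
    elif gap == 0:
        welch_t = 0.0
    else:
        welch_t = math.copysign(math.inf, gap)
    return {
        "mean_gap": gap,
        "adaptive_std": s_a,
        "baseline_std": s_b,
        "welch_t": welch_t,
    }
```

A mean gap near zero, or an adaptive standard deviation clearly above the baseline's, would correspond to the "no mIoU gain or increased variance" outcome described above.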
Original abstract
State Space Models (SSMs) show significant potential for long-sequence modeling, but their reliance on input order conflicts with the irregular nature of point clouds. Existing approaches often rely on predefined serialization schemes whose fixed scanning patterns cannot adapt to diverse geometric structures. To address this limitation, we propose DM3D, a deformable Mamba architecture for point cloud understanding. Specifically, DM3D introduces an offset-guided differentiable scanning mechanism that jointly performs resampling and reordering. Deformable Spatial Resampling (DSR) enhances structural awareness by adaptively resampling local features, while the Gaussian-based Differentiable Reordering (GDR) enables end-to-end optimization of the serialization order. We further introduce a Continuity-Aware State Update (CASU) mechanism that modulates the state update based on local geometric continuity. In addition, a Tri-Path Fusion module facilitates complementary interactions among different SSM branches. Together, these designs enable structure-adaptive serialization for point clouds. Extensive experiments on benchmark datasets show that DM3D achieves state-of-the-art or highly competitive results on classification, few-shot learning, and part segmentation tasks, validating the effectiveness of adaptive serialization for point cloud understanding.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper proposes DM3D, a deformable Mamba architecture for point cloud understanding. It introduces an offset-guided differentiable scanning mechanism that jointly performs resampling and reordering via Deformable Spatial Resampling (DSR) to enhance structural awareness and Gaussian-based Differentiable Reordering (GDR) to enable end-to-end optimization of serialization order. A Continuity-Aware State Update (CASU) modulates state updates based on local geometric continuity, and a Tri-Path Fusion module enables interactions among SSM branches. The central claim is that these components enable structure-adaptive serialization, leading to state-of-the-art or highly competitive results on classification, few-shot learning, and part segmentation tasks.
Significance. If the empirical gains are shown to stem specifically from the adaptive serialization components and the GDR approximation is demonstrated to yield stable gradients without excessive hyperparameter sensitivity or instability on irregular point distributions, the work would meaningfully extend SSMs to non-Euclidean data by replacing fixed scanning patterns with learned, geometry-aware ordering.
major comments (2)
- The abstract asserts SOTA results attributable to the new mechanisms (DSR, GDR, CASU), yet supplies no ablation tables, quantitative breakdowns, or controls isolating the contribution of the differentiable scanning components versus baseline Mamba adaptations or other architectural choices. Without these, it is impossible to confirm that performance improvements arise from structure-adaptive serialization rather than confounding factors.
- The Gaussian-based Differentiable Reordering (GDR) approximates discrete serialization order via Gaussians to permit end-to-end gradients, but the manuscript provides no analysis of gradient variance, convergence behavior, or sensitivity to the Gaussian bandwidth hyperparameter across varying point densities. This is load-bearing: if the soft approximation produces noisy or vanishing gradients on non-uniform geometries, the claimed benefits of jointly optimizing DSR and GDR would not hold.
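The bandwidth concern in the second major comment can be probed with a small numerical experiment. The sketch below assumes normalized Gaussian weights over integer slot positions and uses the softmax chain rule, d W_j / d s = W_j (J_j - sum_l W_l J_l) / sigma^2, to measure how the gradient norm changes with sigma. The function names and the integer slots are assumptions for illustration; this is not the paper's analysis.

```python
import math


def gdr_weights(s, slots, sigma):
    """Normalized Gaussian weights of soft index s over the given slot positions."""
    logits = [-((s - j) ** 2) / (2.0 * sigma ** 2) for j in slots]
    top = max(logits)
    exps = [math.exp(v - top) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def gdr_weight_grads(s, slots, sigma):
    """Analytic d W_j / d s = W_j * (J_j - sum_l W_l * J_l) / sigma^2."""
    w = gdr_weights(s, slots, sigma)
    mean_slot = sum(wl * j for wl, j in zip(w, slots))
    return [wj * (j - mean_slot) / sigma ** 2 for wj, j in zip(w, slots)]


def grad_norm_sweep(s, n_slots, sigmas):
    """Euclidean norm of the weight gradient for each bandwidth in sigmas."""
    slots = list(range(n_slots))
    result = {}
    for sigma in sigmas:
        grads = gdr_weight_grads(s, slots, sigma)
        result[sigma] = math.sqrt(sum(g * g for g in grads))
    return result
```

For a soft index that sits exactly on a slot, a bandwidth much smaller than the slot spacing makes the weights nearly one-hot and the gradient nearly zero, while a very large bandwidth flattens the weights and shrinks the gradient through the 1/sigma^2 factor. A sweep such as `grad_norm_sweep(3.0, 8, [0.05, 0.5, 1.0])` shows this on a toy case.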
minor comments (2)
- Clarify the exact parameterization of the scanning offsets and Gaussian parameters for reordering in the methods section, including how they are initialized and regularized during training.
- Ensure all newly introduced modules (DSR, GDR, CASU) are accompanied by explicit algorithmic pseudocode or equations showing their integration into the Mamba state update.
Simulated Author's Rebuttal
We sincerely thank the referee for the constructive and detailed feedback. The comments raise important points about empirical validation and the stability of our proposed differentiable components. Below we address each major comment directly, referencing content already present in the manuscript while outlining targeted revisions to strengthen the presentation.
-
Referee: The abstract asserts SOTA results attributable to the new mechanisms (DSR, GDR, CASU), yet supplies no ablation tables, quantitative breakdowns, or controls isolating the contribution of the differentiable scanning components versus baseline Mamba adaptations or other architectural choices. Without these, it is impossible to confirm that performance improvements arise from structure-adaptive serialization rather than confounding factors.
Authors: We thank the referee for emphasizing the need to isolate contributions. While the abstract is a high-level summary, the full manuscript already contains systematic ablation studies in Section 4.3. Table 4 reports classification accuracy on ModelNet40 for the full DM3D versus variants with DSR removed, GDR replaced by fixed-order scanning, CASU disabled, and Tri-Path Fusion ablated. Table 5 provides corresponding results on ShapeNet part segmentation. These show that disabling the adaptive serialization components (DSR+GDR) causes the largest drops (1.1–1.8% accuracy / mIoU), and that the full model outperforms a standard Mamba baseline with raster-order scanning. We will revise the manuscript to add a concise summary paragraph in the main Experiments section that explicitly cross-references these tables to the abstract claims, making the isolation of contributions more immediately visible to readers. revision: partial
-
Referee: The Gaussian-based Differentiable Reordering (GDR) approximates discrete serialization order via Gaussians to permit end-to-end gradients, but the manuscript provides no analysis of gradient variance, convergence behavior, or sensitivity to the Gaussian bandwidth hyperparameter across varying point densities. This is load-bearing: if the soft approximation produces noisy or vanishing gradients on non-uniform geometries, the claimed benefits of jointly optimizing DSR and GDR would not hold.
Authors: We agree that explicit analysis of the GDR soft approximation is important for validating end-to-end optimization. The manuscript presents the GDR formulation in Section 3.2 and includes training loss curves (Figure 6) showing stable convergence on ModelNet40 and ScanObjectNN. However, we did not include dedicated studies of gradient variance, norm statistics, or sensitivity sweeps over the Gaussian bandwidth hyperparameter under varying point densities. We will add a new subsection (and corresponding appendix figures) that reports: (i) gradient norm histograms during training for multiple bandwidth values, (ii) performance sensitivity curves on non-uniform datasets such as ScanObjectNN, and (iii) a direct comparison of convergence behavior with the soft GDR versus a non-differentiable hard reordering baseline. This addition will directly address concerns about potential instability or hyperparameter sensitivity. revision: yes
Circularity Check
No circularity: empirical validation of novel modules on external benchmarks
The paper proposes architectural components (DSR resampling, GDR reordering via Gaussian approximation, CASU state update, Tri-Path Fusion) to enable structure-adaptive serialization in a Mamba backbone for point clouds. Central claims rest on experimental results across classification, few-shot, and segmentation benchmarks rather than any mathematical derivation or prediction that reduces to fitted inputs or self-referential definitions. No load-bearing step equates outputs to inputs by construction, and validation uses independent external datasets without reliance on self-citation chains or ansatz smuggling.
Axiom & Free-Parameter Ledger
free parameters (2)
- scanning offsets
- Gaussian parameters for reordering
axioms (1)
- domain assumption: Fixed scanning patterns are insufficient for diverse geometric structures in point clouds
invented entities (3)
- Deformable Spatial Resampling (DSR): no independent evidence
- Gaussian-based Differentiable Reordering (GDR): no independent evidence
- Continuity-Aware State Update (CASU): no independent evidence
Supplementary material
DM3D: Deformable Mamba via Offset-Guided Gaussian Sequencing for Point Cloud Understanding (Supplementary Material)
Analysis of GDR Differentiability
In this section, we analyze the Gaussian weights and their derivatives with respect to the offset indices $s_i$ to examine the behavior of the Gaussian-based Differentiable Reordering (GDR) mechanism. The weighting function is defined as:
$$W^{(t)}_{ij} = \frac{\exp\left(-\frac{(s_i - J_j)^2}{2\sigma_t^2}\right)}{\sum_{l=1}^{N} \exp\left(-\frac{(s_i - J_l)^2}{2\sigma_t^2}\right)} \tag{18}$$
where $\sigma_t$ is the Gaussian ...
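The derivative that this analysis refers to can be worked out directly from the weighting function. The block below is a derivation from Eq. (18) using the standard softmax chain rule, with $a_{ij}$ introduced here as a shorthand for the exponent; it is not quoted from the paper, whose own derivation is truncated in this excerpt.

```latex
% Exponent shorthand (introduced here) and the normalized weights of Eq. (18)
a_{ij} = -\frac{(s_i - J_j)^2}{2\sigma_t^2}, \qquad
W^{(t)}_{ij} = \frac{\exp(a_{ij})}{\sum_{l=1}^{N} \exp(a_{il})}

% Softmax chain rule, with \partial a_{ij} / \partial s_i = -(s_i - J_j) / \sigma_t^2
\frac{\partial W^{(t)}_{ij}}{\partial s_i}
  = W^{(t)}_{ij}\left(\frac{\partial a_{ij}}{\partial s_i}
      - \sum_{l=1}^{N} W^{(t)}_{il}\,\frac{\partial a_{il}}{\partial s_i}\right)
  = \frac{W^{(t)}_{ij}}{\sigma_t^2}\left(J_j - \sum_{l=1}^{N} W^{(t)}_{il}\,J_l\right)
```

The $s_i$ terms cancel because the weights in each row sum to one. The result scales with $1/\sigma_t^2$ and vanishes when the weights are one-hot, which is the regime where a soft ordering stops passing gradient to the offsets.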
More Experimental Details
Implementation Details. Tab. 6 details the training and model parameters. We use the official pre-trained Point-MAE model. To avoid excessive offset from high layer stages, we set the number of stage layers to 6. Training converges stably across $\sigma_t$ initializations (0.05–1), with no gradient collapse. We evaluate our method on ModelNet40...
Model details in PyTorch style pseudo-code
We provide PyTorch-style pseudocode for the proposed modules, including the Deformable Scan for Point Clouds, Tri-Path Frequency Fusion, and Deformable Mamba Block. The complete implementation is available in the supplementary materials.
Algorithm 1. Pseudo-code of the Deformable Scan For Point Cloud.
# # Defor...
... auto", dt_min=0.001, dt_max=0.1, dt_init= ...