
arxiv: 2401.00436 · v6 · submitted 2023-12-31 · 💻 cs.CV

Diff-PCR: Diffusion-Based Correspondence Searching in Doubly Stochastic Matrix Space for Point Cloud Registration

Pith reviewed 2026-05-24 04:48 UTC · model grok-4.3

classification 💻 cs.CV
keywords point cloud registration · correspondence matching · diffusion model · doubly stochastic matrix · iterative refinement · 3DMatch · non-rigid registration

The pith

A diffusion model learns to iteratively search for optimal point cloud correspondences by denoising in doubly stochastic matrix space.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces a framework that trains a diffusion model to predict a denoising direction within the space of doubly stochastic matrices for point cloud registration. The reverse denoising process then follows this direction to refine the correspondence matrix step by step, aiming to reach solutions closer to the maximum-likelihood target matching matrix. This replaces one-shot normalization steps and fixed refinement trajectories with an explicit, learnable search that models the distribution of good matchings. The approach targets both rigid and non-rigid cases by making the refinement path more transparent and potentially closer to globally optimal correspondences. If the learned direction reliably approximates the desired trajectory, registration accuracy should improve, while a lightweight denoising module and accelerated DDIM sampling keep the added cost of iteration low.

Core claim

The diffusion model learns a denoising direction, and the reverse denoising process iteratively searches for improved solutions along this learned direction, which approximates the maximum-likelihood direction of the target matching matrix.
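For orientation, the generic DDIM reverse step from the cited DDIM reference can be written as below. This is the standard Gaussian-diffusion form in DDPM-style notation, with \bar\alpha_t the cumulative noise schedule and \epsilon_\theta the learned noise predictor. The abstract does not say how the paper adapts this step to matching matrices or how the iterates are kept doubly stochastic, so reading x_t as a noisy matching matrix is an assumption made here for illustration.

```latex
\hat{x}_0 = \frac{x_t - \sqrt{1-\bar\alpha_t}\,\epsilon_\theta(x_t, t)}{\sqrt{\bar\alpha_t}},
\qquad
x_{t-1} = \sqrt{\bar\alpha_{t-1}}\,\hat{x}_0
        + \sqrt{1-\bar\alpha_{t-1}-\sigma_t^{2}}\,\epsilon_\theta(x_t, t)
        + \sigma_t z,
\quad z \sim \mathcal{N}(0, I).
```

Setting \sigma_t = 0 gives the deterministic sampler, which can run on a shortened subsequence of timesteps; that is the usual source of DDIM's speed-up.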

What carries the argument

A denoising diffusion model that predicts search gradients in doubly stochastic matrix space to guide iterative refinement of the correspondence matrix toward the optimal matching.

If this is right

  • Correspondence matrices receive explicit iterative refinement before transformation estimation, avoiding reliance on single normalization steps.
  • The search trajectory becomes learnable and data-driven rather than fixed once refinement begins.
  • Modeling the distribution of target matchings allows the method to move beyond feature-space candidates projected only once.
  • Lightweight denoising combined with accelerated sampling keeps the iterative process efficient for both rigid and non-rigid registration.
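The "single normalization step" that the bullets above contrast against can be made concrete with a small Sinkhorn sketch. This is a generic illustration in pure Python, not the paper's code; the function name `sinkhorn`, the iteration count, and the toy score matrix are assumptions chosen for the example.

```python
import math


def sinkhorn(scores, n_iters=100):
    """Project a square score matrix toward a doubly stochastic matrix.

    Exponentiates the scores, then alternates row and column normalization.
    Hypothetical helper for illustration only.
    """
    # Subtract the global maximum before exponentiating for numerical stability.
    top = max(max(row) for row in scores)
    mat = [[math.exp(s - top) for s in row] for row in scores]
    n_rows, n_cols = len(mat), len(mat[0])
    for _ in range(n_iters):
        # Row step: make every row sum to 1.
        for i in range(n_rows):
            row_sum = sum(mat[i])
            mat[i] = [v / row_sum for v in mat[i]]
        # Column step: make every column sum to 1.
        for j in range(n_cols):
            col_sum = sum(mat[i][j] for i in range(n_rows))
            for i in range(n_rows):
                mat[i][j] /= col_sum
    return mat


# Toy feature-similarity scores between 3 source and 3 target points.
toy_scores = [[2.0, 0.5, 0.1],
              [0.3, 1.5, 0.2],
              [0.1, 0.4, 1.8]]
matching = sinkhorn(toy_scores)
```

In a one-shot pipeline, `matching` would be passed directly to transformation estimation. The framework described here instead treats such a matrix as a starting point for further learned refinement steps.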

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same diffusion-in-matrix-space idea could apply to other assignment problems where solutions must stay inside the doubly stochastic polytope.
  • Multiple sampled denoising paths from the same initial matrix might yield an ensemble of plausible correspondences for uncertainty estimation.
  • Joint training of the diffusion model with upstream feature extractors could produce end-to-end systems that optimize both descriptors and matchings together.

Load-bearing premise

A diffusion process operating in doubly stochastic matrix space can be trained to approximate the distribution of globally optimal correspondence matrices so that the learned denoising trajectory produces better registration than one-shot projections or fixed refinements.

What would settle it

A test in which the trained denoising steps, started from random matrices, fail to reach matchings with lower registration error than those obtained by repeated Sinkhorn projections or by standard gradient ascent under the same doubly stochastic constraint.

Figures

Figures reproduced from arXiv: 2401.00436 by Haihua Shi, Qianliang Wu.

Figure 1: The reverse sampling process for matching matrix on the doubly …
Figure 2: Overview of our matching matrix diffusion model.
Figure 3: Overview of our framework training. Our framework includes a KPConv [9] feature backbone optimization and a denoising module optimization.
Figure 4: The qualitative results of rigid registration in the …
Figure 5: The qualitative results of deformable matching in the 4DMatch/4DLoMatch benchmark. The top results are generated by Lepard [5]. The bottom …
Figure 6: An example of the reverse sampling process. Red and green denote matching errors in the two directions. Zoom in for details.
Figure 7: The qualitative results of non-rigid registration in the 4DMatch/4DLoMatch benchmark. The top four are generated by Lepard+NDP [50], while …
Original abstract

Efficiently identifying accurate correspondences between point clouds is crucial for both rigid and non-rigid point cloud registration. Existing methods usually rely on geometric or semantic feature embeddings to establish correspondences and then estimate transformations or flow fields. Recently, several state-of-the-art methods have adopted RAFT-like iterative updates to refine solutions. However, these methods still have two major limitations. First, their iterative refinement mechanism lacks transparency, and the update trajectory is largely fixed once the refinement starts, which may lead to suboptimal solutions. Second, they overlook the importance of explicitly refining the correspondence matrix before solving for transformations or flow fields. Most existing approaches compute candidate correspondences in feature space and project the resulting matching matrix only once by using Sinkhorn or dual-softmax normalization. Such a one-shot projection can be far from the globally optimal solution, and these methods usually do not model the distribution of the target matching matrix. In this paper, we propose a novel framework that exploits a denoising diffusion model to predict a search gradient for the optimal matching matrix in doubly stochastic matrix space. Specifically, the diffusion model learns a denoising direction, and the reverse denoising process iteratively searches for improved solutions along this learned direction, which approximates the maximum-likelihood direction of the target matching matrix. To improve efficiency, we design a lightweight denoising module and adopt the accelerated sampling strategy of the Denoising Diffusion Implicit Model (DDIM)\cite{song2020denoising}. Experimental results on 3DMatch/3DLoMatch and 4DMatch/4DLoMatch demonstrate the effectiveness of the proposed framework.
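The dual-softmax normalization named in the abstract can be sketched as follows. This is the commonly used form (elementwise product of a row-wise and a column-wise softmax) written in pure Python; the helper names and the toy scores are assumptions for illustration, and the paper's baselines may differ in details such as temperature scaling.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    top = max(values)
    exps = [math.exp(v - top) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def dual_softmax(scores):
    """Elementwise product of row-wise and column-wise softmax of `scores`."""
    row_sm = [softmax(row) for row in scores]               # row_sm[i][j]
    col_sm = [softmax(list(col)) for col in zip(*scores)]   # col_sm[j][i]
    n_rows, n_cols = len(scores), len(scores[0])
    return [[row_sm[i][j] * col_sm[j][i] for j in range(n_cols)]
            for i in range(n_rows)]


# Toy scores with a clear diagonal preference.
toy_scores = [[5.0, 0.0, 0.0],
              [0.0, 5.0, 0.0],
              [0.0, 0.0, 5.0]]
confidence = dual_softmax(toy_scores)
```

The output is a confidence matrix with entries in [0, 1], but its rows and columns generally do not sum to 1, so it is a single projection-like step rather than a point in doubly stochastic matrix space.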

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

0 major / 1 minor

Summary. The paper proposes Diff-PCR, a framework for point cloud registration (both rigid and non-rigid) that formulates correspondence search as iterative refinement of a matching matrix inside doubly stochastic matrix space. A denoising diffusion model is trained to predict a search gradient; the reverse denoising trajectory (accelerated via DDIM) is claimed to follow the maximum-likelihood direction toward globally optimal correspondences, addressing the fixed-trajectory limitation of RAFT-style methods and the one-shot Sinkhorn projection of prior approaches. A lightweight denoising module is introduced for efficiency, with experiments reported on 3DMatch/3DLoMatch and 4DMatch/4DLoMatch.

Significance. If the central claim holds, the work introduces a learned, distribution-aware refinement mechanism that operates directly on the space of doubly stochastic matrices, offering greater transparency than fixed-trajectory iterative methods and potentially higher accuracy by modeling the target matching distribution rather than relying on one-shot normalization. The use of diffusion models in this constrained matrix space is a distinctive technical contribution to the registration literature.

minor comments (1)
  1. [Abstract] The claim of effectiveness rests only on the statement that 'experimental results ... demonstrate the effectiveness'. The abstract gives no quantitative metrics, baselines, or ablation numbers; this is standard practice, but it leaves the magnitude of improvement unstated until the results section is examined.

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for their thoughtful summary of our manuscript and for recognizing the potential significance of formulating correspondence search as iterative refinement in doubly stochastic matrix space via a diffusion model. We note that the report lists no specific major comments or questions for us to address. We remain available to provide further details or clarifications on any aspect of the work if the referee has additional points.

Circularity Check

0 steps flagged

No significant circularity detected

full rationale

The paper's core claim is that a diffusion model, trained on data, learns a denoising direction in doubly stochastic matrix space whose reverse process approximates the maximum-likelihood trajectory toward optimal correspondence matrices. This is presented as an empirical learning procedure whose outputs are validated on external benchmarks (3DMatch/3DLoMatch, 4DMatch/4DLoMatch). The DDIM acceleration is cited to an external reference (song2020denoising). No equations or steps in the provided description reduce a claimed prediction or uniqueness result to a fitted parameter, self-definition, or self-citation chain. The framework is therefore self-contained against external data and does not exhibit any of the enumerated circularity patterns.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

Abstract-only review; the central addition is the diffusion model itself. No free parameters or invented entities are identifiable from the abstract.

axioms (1)
  • domain assumption: Valid correspondences can be represented as doubly stochastic matrices.
    Standard assumption in optimal transport formulations of matching problems.
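As a concrete reading of this axiom, a matrix is doubly stochastic when its entries are non-negative and every row and every column sums to 1. The check below is a minimal sketch in pure Python; the function name and tolerance are assumptions for illustration.

```python
def is_doubly_stochastic(mat, tol=1e-6):
    """Return True if `mat` is non-negative with all row and column sums near 1."""
    # Entries must be non-negative (up to tolerance).
    if any(v < -tol for row in mat for v in row):
        return False
    rows_ok = all(abs(sum(row) - 1.0) <= tol for row in mat)
    cols_ok = all(abs(sum(col) - 1.0) <= tol for col in zip(*mat))
    return rows_ok and cols_ok


# A hard one-to-one matching (permutation matrix) satisfies the constraint,
# and so does a fully uncertain soft matching.
hard_matching = [[0.0, 1.0], [1.0, 0.0]]
soft_matching = [[0.5, 0.5], [0.5, 0.5]]
```

This strict check applies to square matrices only. Partial-overlap settings, where some points have no counterpart, typically need relaxed or non-square variants that it does not cover.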


Reference graph

Works this paper leans on

59 extracted references · 59 canonical work pages · 3 internal anchors

  1. [1]

    A comprehensive survey on point cloud registration,

    X. Huang, G. Mei, J. Zhang, and R. Abbas, “A comprehensive survey on point cloud registration,” 2021

  2. [2]

    Self-supervised 3d scene flow estimation guided by superpoints,

    Y. Shen, L. Hui, J. Xie, and J. Yang, “Self-supervised 3d scene flow estimation guided by superpoints,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023, pp. 5271–5280

  3. [3]

    Loam: Lidar odometry and mapping in real-time

    J. Zhang and S. Singh, “Loam: Lidar odometry and mapping in real-time.” in Robotics: Science and systems, vol. 2, no. 9. Berkeley, CA, 2014, pp. 1–9

  4. [4]

    Geometric transformer for fast and robust point cloud registration,

    Z. Qin, H. Yu, C. Wang, Y. Guo, Y. Peng, and K. Xu, “Geometric transformer for fast and robust point cloud registration,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022, pp. 11 143–11 152

  5. [5]

    Lepard: Learning partial point cloud matching in rigid and deformable scenes,

    Y. Li and T. Harada, “Lepard: Learning partial point cloud matching in rigid and deformable scenes,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022, pp. 5554–5564

  6. [6]

    Regtr: End-to-end point cloud correspondences with transformers,

    Z. J. Yew and G. H. Lee, “Regtr: End-to-end point cloud correspondences with transformers,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022, pp. 6677–6686

  7. [7]

    Sgfeat: Salient geometric feature for point cloud registration,

    Q. Wu, Y. Ding, L. Luo, C. Zhou, J. Xie, and J. Yang, “Sgfeat: Salient geometric feature for point cloud registration,” arXiv preprint arXiv:2309.06207, 2023

  8. [8]

    Unsupervised deep probabilistic approach for partial point cloud registration,

    G. Mei, H. Tang, X. Huang, W. Wang, J. Liu, J. Zhang, L. Van Gool, and Q. Wu, “Unsupervised deep probabilistic approach for partial point cloud registration,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023, pp. 13 611–13 620

  9. [9]

    Kpconv: Flexible and deformable convolution for point clouds,

    H. Thomas, C. R. Qi, J.-E. Deschaud, B. Marcotegui, F. Goulette, and L. J. Guibas, “Kpconv: Flexible and deformable convolution for point clouds,” in Proceedings of the IEEE/CVF international conference on computer vision, 2019, pp. 6411–6420

  10. [10]

    Pointdsc: Robust point cloud registration using deep spatial consistency,

    X. Bai, Z. Luo, L. Zhou, H. Chen, L. Li, Z. Hu, H. Fu, and C.-L. Tai, “Pointdsc: Robust point cloud registration using deep spatial consistency,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2021, pp. 15 859–15 869

  11. [11]

    Sc2-pcr: A second order spatial compatibility for efficient and robust point cloud registration,

    Z. Chen, K. Sun, F. Yang, and W. Tao, “Sc2-pcr: A second order spatial compatibility for efficient and robust point cloud registration,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022, pp. 13 221–13 231

  12. [12]

    Deep hough voting for 3d object detection in point clouds,

    C. R. Qi, O. Litany, K. He, and L. J. Guibas, “Deep hough voting for 3d object detection in point clouds,” in Proceedings of the IEEE/CVF International Conference on Computer Vision, 2019, pp. 9277–9286

  13. [13]

    Deep graph-based spatial consistency for robust non-rigid point cloud registration,

    Z. Qin, H. Yu, C. Wang, Y. Peng, and K. Xu, “Deep graph-based spatial consistency for robust non-rigid point cloud registration,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023, pp. 5394–5403

  14. [14]

    Robust outlier rejection for 3d registration with variational bayes,

    H. Jiang, Z. Dang, Z. Wei, J. Xie, J. Yang, and M. Salzmann, “Robust outlier rejection for 3d registration with variational bayes,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023, pp. 1148–1157

  15. [15]

    3d registration with maximal cliques,

    X. Zhang, J. Yang, S. Zhang, and Y. Zhang, “3d registration with maximal cliques,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023, pp. 17 745–17 754

  16. [16]

    Peal: Prior-embedded explicit attention learning for low-overlap point cloud registration,

    J. Yu, L. Ren, Y. Zhang, W. Zhou, L. Lin, and G. Dai, “Peal: Prior-embedded explicit attention learning for low-overlap point cloud registration,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023, pp. 17 702–17 711

  17. [17]

    Rcp: Recurrent closest point for scene flow estimation on 3d point clouds,

    X. Gu, C. Tang, W. Yuan, Z. Dai, S. Zhu, and P. Tan, “Rcp: Recurrent closest point for scene flow estimation on 3d point clouds,” arXiv preprint arXiv:2205.11028, 2022

  18. [18]

    Raft: Recurrent all-pairs field transforms for optical flow,

    Z. Teed and J. Deng, “Raft: Recurrent all-pairs field transforms for optical flow,” in Computer Vision–ECCV 2020: 16th European Conference, Glasgow, UK, August 23–28, 2020, Proceedings, Part II. Springer, 2020, pp. 402–419

  20. [20]

    Cotreg: Coupled optimal transport based point cloud registration,

    G. Mei, X. Huang, L. Yu, J. Zhang, and M. Bennamoun, “Cotreg: Coupled optimal transport based point cloud registration,” arXiv preprint arXiv:2112.14381, 2021

  21. [21]

    Graph matching optimization network for point cloud registration

    Q. Wu, Y. Shen, H. Jiang, G. Mei, Y. Ding, L. Luo, J. Xie, and J. Yang, “Graph matching optimization network for point cloud registration.”

  22. [22]

    Denoising diffusion probabilistic models,

    J. Ho, A. Jain, and P. Abbeel, “Denoising diffusion probabilistic models,” Advances in neural information processing systems, vol. 33, pp. 6840–6851, 2020

  23. [23]

    Revisiting frank-wolfe: Projection-free sparse convex optimization,

    M. Jaggi, “Revisiting frank-wolfe: Projection-free sparse convex optimization,” in International Conference on Machine Learning. PMLR, 2013, pp. 427–435

  24. [24]

    Correlation functions and computer simulations,

    G. Parisi, “Correlation functions and computer simulations,” Nuclear Physics B, vol. 180, no. 3, pp. 378–384, 1981

  25. [25]

    Mcmc using hamiltonian dynamics,

    R. M. Neal et al., “Mcmc using hamiltonian dynamics,” Handbook of markov chain monte carlo, vol. 2, no. 11, p. 2, 2011

  26. [26]

    Nonsquare “doubly stochastic” matrices,

    R. M. Caron, X. Li, P. Mikusiński, H. Sherwood, and M. D. Taylor, “Nonsquare ‘doubly stochastic’ matrices,” Lecture Notes-Monograph Series, vol. 28, pp. 65–75, 1996. [Online]. Available: http://www.jstor.org/stable/4355884

  27. [27]

    Raft-3d: Scene flow using rigid-motion embeddings,

    Z. Teed and J. Deng, “Raft-3d: Scene flow using rigid-motion embeddings,” in Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, 2021, pp. 8375–8384

  28. [28]

    Denoising Diffusion Implicit Models

    J. Song, C. Meng, and S. Ermon, “Denoising diffusion implicit models,” arXiv preprint arXiv:2010.02502, 2020

  29. [29]

    D3feat: Joint learning of dense detection and description of 3d local features,

    X. Bai, Z. Luo, L. Zhou, H. Fu, L. Quan, and C.-L. Tai, “D3feat: Joint learning of dense detection and description of 3d local features,” in Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, 2020, pp. 6359–6367

  30. [30]

    Predator: Registration of 3d point clouds with low overlap,

    S. Huang, Z. Gojcic, M. Usvyatsov, A. Wieser, and K. Schindler, “Predator: Registration of 3d point clouds with low overlap,” in Proceedings of the IEEE/CVF Conference on computer vision and pattern recognition, 2021, pp. 4267–4276

  31. [31]

    Cofinet: Reliable coarse-to-fine correspondences for robust pointcloud registration,

    H. Yu, F. Li, M. Saleh, B. Busam, and S. Ilic, “Cofinet: Reliable coarse-to-fine correspondences for robust pointcloud registration,” Advances in Neural Information Processing Systems, vol. 34, pp. 23 872–23 884, 2021

  32. [32]

    Rotation-invariant transformer for point cloud matching,

    H. Yu, Z. Qin, J. Hou, M. Saleh, D. Li, B. Busam, and S. Ilic, “Rotation-invariant transformer for point cloud matching,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023, pp. 5384–5393

  33. [33]

    Riga: Rotation-invariant and globally-aware descriptors for point cloud registration,

    H. Yu, J. Hou, Z. Qin, M. Saleh, I. Shugurov, K. Wang, B. Busam, and S. Ilic, “Riga: Rotation-invariant and globally-aware descriptors for point cloud registration,” arXiv preprint arXiv:2209.13252, 2022

  34. [34]

    Ppf-foldnet: Unsupervised learning of rotation invariant 3d local descriptors,

    H. Deng, T. Birdal, and S. Ilic, “Ppf-foldnet: Unsupervised learning of rotation invariant 3d local descriptors,” in Proceedings of the European Conference on Computer Vision (ECCV), 2018, pp. 602–618

  35. [35]

    Diffusionpcr: Diffusion models for robust multi-step point cloud registration,

    Z. Chen, Y. Ren, T. Zhang, Z. Dang, W. Tao, S. Süsstrunk, and M. Salzmann, “Diffusionpcr: Diffusion models for robust multi-step point cloud registration,” arXiv preprint arXiv:2312.03053, 2023

  36. [36]

    Generative modeling by estimating gradients of the data distribution,

    Y. Song and S. Ermon, “Generative modeling by estimating gradients of the data distribution,” Advances in neural information processing systems, vol. 32, 2019

  37. [37]

    Diffusiondet: Diffusion model for object detection,

    S. Chen, P. Sun, Y. Song, and P. Luo, “Diffusiondet: Diffusion model for object detection,” in Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023, pp. 19 830–19 843

  38. [38]

    Structured denoising diffusion models in discrete state-spaces,

    J. Austin, D. D. Johnson, J. Ho, D. Tarlow, and R. Van Den Berg, “Structured denoising diffusion models in discrete state-spaces,” Advances in Neural Information Processing Systems, vol. 34, pp. 17 981–17 993, 2021

  39. [39]

    Vector quantized diffusion model for text-to-image synthesis,

    S. Gu, D. Chen, J. Bao, F. Wen, B. Zhang, D. Chen, L. Yuan, and B. Guo, “Vector quantized diffusion model for text-to-image synthesis,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022, pp. 10 696–10 706

  40. [40]

    Se(3)-diffusionfields: Learning smooth cost functions for joint grasp and motion optimization through diffusion,

    J. Urain, N. Funk, J. Peters, and G. Chalvatzaki, “Se(3)-diffusionfields: Learning smooth cost functions for joint grasp and motion optimization through diffusion,” in 2023 IEEE International Conference on Robotics and Automation (ICRA). IEEE, 2023, pp. 5923–5930

  41. [41]

    Se(3) diffusion model-based point cloud registration for robust 6d object pose estimation,

    H. Jiang, M. Salzmann, Z. Dang, J. Xie, and J. Yang, “Se(3) diffusion model-based point cloud registration for robust 6d object pose estimation,” arXiv preprint arXiv:2310.17359, 2023

  42. [42]

    Deep Unsupervised Learning using Nonequilibrium Thermodynamics

    J. N. Sohl-Dickstein, E. A. Weiss, N. Maheswaranathan, and S. Ganguli, “Deep unsupervised learning using nonequilibrium thermodynamics,” ArXiv, vol. abs/1503.03585, 2015. [Online]. Available: https://api.semanticscholar.org/CorpusID:14888175

  43. [43]

    Score-Based Generative Modeling through Stochastic Differential Equations

    Y. Song, J. Sohl-Dickstein, D. P. Kingma, A. Kumar, S. Ermon, and B. Poole, “Score-based generative modeling through stochastic differential equations,” arXiv preprint arXiv:2011.13456, 2020

  44. [44]

    Variational diffusion models,

    D. P. Kingma, T. Salimans, B. Poole, and J. Ho, “Variational diffusion models,” ArXiv, vol. abs/2107.00630, 2021. [Online]. Available: https://api.semanticscholar.org/CorpusID:235694314

  45. [45]

    Sinkhorn distances: Lightspeed computation of optimal transport,

    M. Cuturi, “Sinkhorn distances: Lightspeed computation of optimal transport,” Advances in neural information processing systems, vol. 26, 2013

  46. [46]

    Method for registration of 3-d shapes,

    P. J. Besl and N. D. McKay, “Method for registration of 3-d shapes,” in Sensor fusion IV: control paradigms and data structures , vol. 1611. Spie, 1992, pp. 586–606

  47. [47]

    Unleashing transformers: Parallel token prediction with discrete absorbing diffusion for fast high-resolution image generation from vector-quantized codes,

    S. Bond-Taylor, P. Hessey, H. Sasaki, T. P. Breckon, and C. G. Willcocks, “Unleashing transformers: Parallel token prediction with discrete absorbing diffusion for fast high-resolution image generation from vector-quantized codes,” in European Conference on Computer Vision. Springer, 2022, pp. 170–188

  48. [48]

    Least-squares fitting of two 3-d point sets,

    K. S. Arun, T. S. Huang, and S. D. Blostein, “Least-squares fitting of two 3-d point sets,” IEEE Transactions on pattern analysis and machine intelligence, no. 5, pp. 698–700, 1987

  49. [49]

    Embedded deformation for shape manipulation,

    R. W. Sumner, J. Schmid, and M. Pauly, “Embedded deformation for shape manipulation,” in ACM siggraph 2007 papers, 2007, pp. 80–es

  50. [50]

    As-rigid-as-possible shape manipulation,

    T. Igarashi, T. Moscovich, and J. F. Hughes, “As-rigid-as-possible shape manipulation,” ACM transactions on Graphics (TOG), vol. 24, no. 3, pp. 1134–1141, 2005

  51. [51]

    Non-rigid point cloud registration with neural deformation pyramid,

    Y. Li and T. Harada, “Non-rigid point cloud registration with neural deformation pyramid,” Advances in Neural Information Processing Systems, vol. 35, pp. 27 757–27 768, 2022

  52. [52]

    Attention is all you need,

    A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, Ł. Kaiser, and I. Polosukhin, “Attention is all you need,” Advances in neural information processing systems, vol. 30, 2017

  53. [53]

    3dmatch: Learning local geometric descriptors from rgb-d reconstructions,

    A. Zeng, S. Song, M. Nießner, M. Fisher, J. Xiao, and T. Funkhouser, “3dmatch: Learning local geometric descriptors from rgb-d reconstructions,” in Proceedings of the IEEE conference on computer vision and pattern recognition, 2017, pp. 1802–1811

  54. [54]

    Hyper-graph matching via reweighted random walks,

    J. Lee, M. Cho, and K. M. Lee, “Hyper-graph matching via reweighted random walks,” in CVPR 2011. IEEE, 2011, pp. 1633–1640

  55. [55]

    Fully convolutional geometric features,

    C. Choy, J. Park, and V. Koltun, “Fully convolutional geometric features,” in Proceedings of the IEEE/CVF International Conference on Computer Vision, 2019, pp. 8958–8966

  56. [56]

    Pointpwc-net: A coarse-to-fine network for supervised and self-supervised scene flow estimation on 3d point clouds,

    W. Wu, Z. Wang, Z. Li, W. Liu, and L. Fuxin, “Pointpwc-net: A coarse-to-fine network for supervised and self-supervised scene flow estimation on 3d point clouds,” arXiv preprint arXiv:1911.12408, 2019

  57. [57]

    Flot: Scene flow on point clouds guided by optimal transport,

    G. Puy, A. Boulch, and R. Marlet, “Flot: Scene flow on point clouds guided by optimal transport,” in European conference on computer vision. Springer, 2020, pp. 527–544

  58. [58]

    Neural scene flow prior,

    X. Li, J. Kaesemodel Pontes, and S. Lucey, “Neural scene flow prior,” Advances in Neural Information Processing Systems, vol. 34, pp. 7838–7851, 2021

  59. [59]

    4dcomplete: Non-rigid motion estimation beyond the observable surface,

    Y. Li, H. Takehara, T. Taketomi, B. Zheng, and M. Nießner, “4dcomplete: Non-rigid motion estimation beyond the observable surface,” in Proceedings of the IEEE/CVF International Conference on Computer Vision, 2021, pp. 12 706–12 716