Diff-PCR: Diffusion-Based Correspondence Searching in Doubly Stochastic Matrix Space for Point Cloud Registration

Haihua Shi; Qianliang Wu

arXiv: 2401.00436 · v6 · submitted 2023-12-31 · cs.CV

Pith reviewed 2026-05-24 04:48 UTC · model grok-4.3

keywords: point cloud registration, correspondence matching, diffusion model, doubly stochastic matrix, iterative refinement, 3DMatch, non-rigid registration

The pith

A diffusion model learns to iteratively search for optimal point cloud correspondences by denoising in doubly stochastic matrix space.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces a framework that trains a diffusion model to predict a denoising direction within the space of doubly stochastic matrices for point cloud registration. The reverse denoising process then follows this direction to refine the correspondence matrix step by step, aiming to reach solutions closer to the maximum-likelihood target matching matrix. This replaces one-shot normalization steps and fixed refinement trajectories with an explicit, learnable search that models the distribution of good matchings. The approach targets both rigid and non-rigid cases by making the refinement path more transparent and potentially closer to globally optimal correspondences. If the learned direction reliably approximates the desired trajectory, registration accuracy improves, while a lightweight denoising module and accelerated sampling keep the added computational cost low.

Core claim

The diffusion model learns a denoising direction, and the reverse denoising process iteratively searches for improved solutions along this learned direction, which approximates the maximum-likelihood direction of the target matching matrix.

What carries the argument

A denoising diffusion model that predicts search gradients in doubly stochastic matrix space to guide iterative refinement of the correspondence matrix toward the optimal matching.

If this is right

Correspondence matrices receive explicit iterative refinement before transformation estimation, avoiding reliance on single normalization steps.
The search trajectory becomes learnable and data-driven rather than fixed once refinement begins.
Modeling the distribution of target matchings allows the method to move beyond feature-space candidates projected only once.
Lightweight denoising combined with accelerated sampling keeps the iterative process efficient for both rigid and non-rigid registration.
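
For readers unfamiliar with the one-shot normalization that the list above contrasts against, here is a minimal NumPy sketch of Sinkhorn normalization. It is an illustration under stated assumptions, not the paper's code: the function name `sinkhorn`, the iteration count, and the random score matrix are all hypothetical stand-ins for a real feature-similarity matrix.

```python
import numpy as np

def sinkhorn(scores, n_iters=50, eps=1e-9):
    """Alternately normalize rows and columns of exp(scores)."""
    # Subtract the max for numerical stability before exponentiating.
    K = np.exp(scores - scores.max())
    for _ in range(n_iters):
        K = K / (K.sum(axis=1, keepdims=True) + eps)  # rows sum to 1
        K = K / (K.sum(axis=0, keepdims=True) + eps)  # columns sum to 1
    return K

rng = np.random.default_rng(0)
scores = rng.random((5, 5))   # hypothetical similarity scores in [0, 1)
P = sinkhorn(scores)          # approximately doubly stochastic
```

The output depends only on the input scores: once they are fixed, the projection is fixed. That is the sense in which the page calls it a single, non-learned step.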

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same diffusion-in-matrix-space idea could apply to other assignment problems where solutions must stay inside the doubly stochastic polytope.
Multiple sampled denoising paths from the same initial matrix might yield an ensemble of plausible correspondences for uncertainty estimation.
Joint training of the diffusion model with upstream feature extractors could produce end-to-end systems that optimize both descriptors and matchings together.
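
The ensemble idea in the list above can be sketched in a few lines. This is a hypothetical illustration rather than anything the paper proposes: `ensemble_stats` is an invented helper, and three hand-written permutation matrices stand in for matrices that would come from separately sampled denoising paths.

```python
import numpy as np

def ensemble_stats(samples):
    """Entrywise mean and standard deviation over a stack of matching matrices."""
    stack = np.stack(samples)            # shape (num_samples, n, n)
    return stack.mean(axis=0), stack.std(axis=0)

# Stand-ins for sampled results: three permutation matrices that agree on
# point 0 but disagree on how points 1 and 2 are matched.
samples = [np.eye(3)[[0, 1, 2]], np.eye(3)[[0, 2, 1]], np.eye(3)[[0, 1, 2]]]
mean, std = ensemble_stats(samples)
```

Because the set of doubly stochastic matrices is convex, the entrywise mean of doubly stochastic samples is itself doubly stochastic. The standard deviation is zero where all samples agree and positive where they differ, which gives a simple per-correspondence uncertainty signal.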

Load-bearing premise

A diffusion process operating in doubly stochastic matrix space can be trained to approximate the distribution of globally optimal correspondence matrices so that the learned denoising trajectory produces better registration than one-shot projections or fixed refinements.

What would settle it

A test in which the trained denoising steps, started from random matrices, fail to reach matchings with lower registration error than those obtained by repeated Sinkhorn projections or standard gradient ascent under the same doubly stochastic constraint.
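
Any such comparison needs a way to turn a matching matrix into a registration error. The sketch below shows one common choice for the rigid case: a weighted least-squares rigid fit via SVD, in the spirit of Arun et al., followed by a rotation error in degrees. The function names and the synthetic data are assumptions made for illustration; the paper's own evaluation protocol may differ.

```python
import numpy as np

def weighted_rigid_fit(src, tgt, W):
    """Estimate R, t minimizing sum_ij W_ij * ||R src_i + t - tgt_j||^2."""
    w = W / W.sum()
    mu_s = w.sum(axis=1) @ src                 # weighted source centroid
    mu_t = w.sum(axis=0) @ tgt                 # weighted target centroid
    H = (src - mu_s).T @ w @ (tgt - mu_t)      # 3x3 cross-covariance
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))     # guard against reflections
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = mu_t - R @ mu_s
    return R, t

def rotation_error_deg(R_est, R_gt):
    """Geodesic angle between two rotations, in degrees."""
    cos = (np.trace(R_gt.T @ R_est) - 1.0) / 2.0
    return np.degrees(np.arccos(np.clip(cos, -1.0, 1.0)))

# Synthetic check: a known rotation about z, a known shift, shuffled targets.
rng = np.random.default_rng(0)
n = 20
src = rng.normal(size=(n, 3))
angle = 0.5
R_gt = np.array([[np.cos(angle), -np.sin(angle), 0.0],
                 [np.sin(angle),  np.cos(angle), 0.0],
                 [0.0, 0.0, 1.0]])
t_gt = np.array([0.3, -0.2, 0.1])
perm = rng.permutation(n)
tgt = (src @ R_gt.T + t_gt)[perm]      # tgt[k] is the moved copy of src[perm[k]]
W = np.zeros((n, n))
W[perm, np.arange(n)] = 1.0            # ground-truth matching as a permutation matrix
R_est, t_est = weighted_rigid_fit(src, tgt, W)
err = rotation_error_deg(R_est, R_gt)
```

With the ground-truth matching as input, the fit recovers the true transform up to floating-point error. Feeding in matrices from different refinement schemes and comparing `err` is one way to run the test described above.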

Figures

Figures reproduced from arXiv: 2401.00436 by Haihua Shi, Qianliang Wu.

Figure 2: Overview of our matching matrix diffusion model.

Figure 3: Overview of our framework training. Our framework includes a KPConv [9] feature backbone optimization and a denoising module optimization.

Figure 4: The qualitative results of rigid registration in the

Figure 5: The qualitative results of deformable matching in the 4DMatch/4DLoMatch benchmark. The top results are generated by Lepard [5]. The bottom

Figure 6: An example of the reverse sampling process. The red/green denotes two directions matching errors. Zoom in for details.

Figure 7: The qualitative results of non-rigid registration in the 4DMatch/4DLoMatch benchmark. The top four are generated by Lepard+NDP [50], while

Original abstract

Efficiently identifying accurate correspondences between point clouds is crucial for both rigid and non-rigid point cloud registration. Existing methods usually rely on geometric or semantic feature embeddings to establish correspondences and then estimate transformations or flow fields. Recently, several state-of-the-art methods have adopted RAFT-like iterative updates to refine solutions. However, these methods still have two major limitations. First, their iterative refinement mechanism lacks transparency, and the update trajectory is largely fixed once the refinement starts, which may lead to suboptimal solutions. Second, they overlook the importance of explicitly refining the correspondence matrix before solving for transformations or flow fields. Most existing approaches compute candidate correspondences in feature space and project the resulting matching matrix only once by using Sinkhorn or dual-softmax normalization. Such a one-shot projection can be far from the globally optimal solution, and these methods usually do not model the distribution of the target matching matrix. In this paper, we propose a novel framework that exploits a denoising diffusion model to predict a search gradient for the optimal matching matrix in doubly stochastic matrix space. Specifically, the diffusion model learns a denoising direction, and the reverse denoising process iteratively searches for improved solutions along this learned direction, which approximates the maximum-likelihood direction of the target matching matrix. To improve efficiency, we design a lightweight denoising module and adopt the accelerated sampling strategy of the Denoising Diffusion Implicit Model (DDIM; Song et al., 2020). Experimental results on 3DMatch/3DLoMatch and 4DMatch/4DLoMatch demonstrate the effectiveness of the proposed framework.
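
To make the accelerated sampling mentioned in the abstract concrete, here is a minimal sketch of the standard deterministic DDIM update (eta = 0) run over a short subsequence of timesteps. It is generic DDIM in unconstrained Euclidean space, not the paper's doubly stochastic formulation. The noise schedule, the step spacing, and the oracle `predict_x0` that stands in for a trained denoising module are all assumptions.

```python
import numpy as np

def ddim_sample(x_T, predict_x0, alpha_bar, steps):
    """Deterministic DDIM (eta = 0) reverse pass over a subsequence of timesteps."""
    x = x_T
    for i, t in enumerate(steps):
        x0_hat = predict_x0(x, t)                      # denoiser's clean estimate
        # Noise implied by the current state and the clean estimate.
        eps_hat = (x - np.sqrt(alpha_bar[t]) * x0_hat) / np.sqrt(1.0 - alpha_bar[t])
        # Cumulative signal level of the next, less noisy step (1.0 at the end).
        a_prev = alpha_bar[steps[i + 1]] if i + 1 < len(steps) else 1.0
        x = np.sqrt(a_prev) * x0_hat + np.sqrt(1.0 - a_prev) * eps_hat
    return x

T = 1000
betas = np.linspace(1e-4, 0.02, T)            # assumed linear noise schedule
alpha_bar = np.cumprod(1.0 - betas)
steps = list(range(T - 1, -1, -200))          # 5 steps instead of 1000

target = np.eye(4)                            # hypothetical clean matching matrix
oracle = lambda x, t: target                  # stand-in for a trained denoiser
rng = np.random.default_rng(0)
out = ddim_sample(rng.normal(size=(4, 4)), oracle, alpha_bar, steps)
```

With an oracle denoiser the loop lands exactly on the target after five steps. A learned denoiser would only approximate it, and nothing in this sketch keeps intermediate iterates inside the doubly stochastic set.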

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note: private letter to a colleague

The paper trains a diffusion model to iteratively refine correspondence matrices by searching in doubly stochastic space for point cloud registration.

The main thing here is a diffusion model that learns a denoising direction to guide iterative search toward better matching matrices inside doubly stochastic space. This replaces the fixed trajectories of recent iterative methods and the single Sinkhorn step used in most others. The reverse process is meant to follow something close to the maximum-likelihood path to the target matrix, and they speed it up with DDIM plus a lightweight module. That is the actual novelty claimed in the abstract.

The approach directly calls out the lack of transparency in RAFT-like updates and the fact that one-shot projections often land far from the global optimum. Experiments are reported on the standard 3DMatch/3DLoMatch and 4DMatch/4DLoMatch sets, which is the right test bed. The framework is presented as addressing those two concrete limitations with a learned search mechanism.

The soft spots are limited. The abstract gives no numbers, ablations, or error breakdowns, so the size of the gain is not visible yet. The core assumption that the model can be trained to approximate the distribution of globally optimal matrices could still break on harder cases with heavy noise or partial overlap, even if the stress test found no internal contradictions. A minor practical question is whether the added diffusion steps stay cheap enough in real pipelines.

This is for people working on correspondence refinement inside 3D registration pipelines. A reader already following generative or iterative methods in geometric matching would see the most value. The paper is coherent on its own terms and uses relevant benchmarks, so it deserves peer review.

Referee Report

0 major / 1 minor

Summary. The paper proposes Diff-PCR, a framework for point cloud registration (both rigid and non-rigid) that formulates correspondence search as iterative refinement of a matching matrix inside doubly stochastic matrix space. A denoising diffusion model is trained to predict a search gradient; the reverse denoising trajectory (accelerated via DDIM) is claimed to follow the maximum-likelihood direction toward globally optimal correspondences, addressing the fixed-trajectory limitation of RAFT-style methods and the one-shot Sinkhorn projection of prior approaches. A lightweight denoising module is introduced for efficiency, with experiments reported on 3DMatch/3DLoMatch and 4DMatch/4DLoMatch.

Significance. If the central claim holds, the work introduces a learned, distribution-aware refinement mechanism that operates directly on the space of doubly stochastic matrices, offering greater transparency than fixed-trajectory iterative methods and potentially higher accuracy by modeling the target matching distribution rather than relying on one-shot normalization. The use of diffusion models in this constrained matrix space is a distinctive technical contribution to the registration literature.

minor comments (1)

[Abstract] The claim of effectiveness is supported only by the statement that 'experimental results ... demonstrate the effectiveness'; no quantitative metrics, baselines, or ablation numbers appear in the abstract. This is standard practice, but it leaves the magnitude of improvement unstated until the results section is examined.

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for their thoughtful summary of our manuscript and for recognizing the potential significance of formulating correspondence search as iterative refinement in doubly stochastic matrix space via a diffusion model. We note that the report lists no specific major comments or questions for us to address. We remain available to provide further details or clarifications on any aspect of the work if the referee has additional points.

Circularity Check

0 steps flagged

No significant circularity detected

The paper's core claim is that a diffusion model, trained on data, learns a denoising direction in doubly stochastic matrix space whose reverse process approximates the maximum-likelihood trajectory toward optimal correspondence matrices. This is presented as an empirical learning procedure whose outputs are validated on external benchmarks (3DMatch/3DLoMatch, 4DMatch/4DLoMatch). The DDIM acceleration is cited to an external reference (Song et al., 2020). No equations or steps in the provided description reduce a claimed prediction or uniqueness result to a fitted parameter, self-definition, or self-citation chain. The framework is therefore self-contained against external data and does not exhibit any of the enumerated circularity patterns.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

Abstract-only review; the central addition is the diffusion model itself. No free parameters or invented entities are identifiable from the abstract.

axioms (1)

Domain assumption: Valid correspondences can be represented as doubly stochastic matrices.
This is a standard assumption in optimal transport formulations of matching problems.
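
As a concrete reading of this assumption, the sketch below checks the defining properties of a doubly stochastic matrix: nonnegative entries, and rows and columns that each sum to one. The helper name `is_doubly_stochastic` and its tolerance are illustrative choices, not taken from the paper.

```python
import numpy as np

def is_doubly_stochastic(M, tol=1e-8):
    """True if M is nonnegative with every row and column summing to 1."""
    M = np.asarray(M, dtype=float)
    return bool(
        M.ndim == 2
        and (M >= -tol).all()
        and np.allclose(M.sum(axis=0), 1.0, rtol=0.0, atol=tol)
        and np.allclose(M.sum(axis=1), 1.0, rtol=0.0, atol=tol)
    )

perm = np.eye(3)[[2, 0, 1]]              # a hard one-to-one matching
soft = 0.5 * perm + 0.5 * np.eye(3)      # a convex mix of two matchings
```

Both `perm` and `soft` pass the check, while a matrix such as `np.ones((3, 3))` does not. Hard one-to-one matchings are the permutation matrices, and soft matchings are their convex combinations.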

Reference graph

Works this paper leans on

59 extracted references · 59 canonical work pages · 3 internal anchors

[1] X. Huang, G. Mei, J. Zhang, and R. Abbas, “A comprehensive survey on point cloud registration,” 2021.

[2] Y. Shen, L. Hui, J. Xie, and J. Yang, “Self-supervised 3d scene flow estimation guided by superpoints,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023, pp. 5271–5280.

[3] J. Zhang and S. Singh, “Loam: Lidar odometry and mapping in real-time,” in Robotics: Science and Systems, vol. 2, no. 9. Berkeley, CA, 2014, pp. 1–9.

[4] Z. Qin, H. Yu, C. Wang, Y. Guo, Y. Peng, and K. Xu, “Geometric transformer for fast and robust point cloud registration,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022, pp. 11143–11152.

[5] Y. Li and T. Harada, “Lepard: Learning partial point cloud matching in rigid and deformable scenes,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022, pp. 5554–5564.

[6] Z. J. Yew and G. H. Lee, “Regtr: End-to-end point cloud correspondences with transformers,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022, pp. 6677–6686.

[7] Q. Wu, Y. Ding, L. Luo, C. Zhou, J. Xie, and J. Yang, “Sgfeat: Salient geometric feature for point cloud registration,” arXiv preprint arXiv:2309.06207, 2023.

[8] G. Mei, H. Tang, X. Huang, W. Wang, J. Liu, J. Zhang, L. Van Gool, and Q. Wu, “Unsupervised deep probabilistic approach for partial point cloud registration,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023, pp. 13611–13620.

[9] H. Thomas, C. R. Qi, J.-E. Deschaud, B. Marcotegui, F. Goulette, and L. J. Guibas, “Kpconv: Flexible and deformable convolution for point clouds,” in Proceedings of the IEEE/CVF International Conference on Computer Vision, 2019, pp. 6411–6420.
[10] X. Bai, Z. Luo, L. Zhou, H. Chen, L. Li, Z. Hu, H. Fu, and C.-L. Tai, “Pointdsc: Robust point cloud registration using deep spatial consistency,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2021, pp. 15859–15869.

[11] Z. Chen, K. Sun, F. Yang, and W. Tao, “Sc2-pcr: A second order spatial compatibility for efficient and robust point cloud registration,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022, pp. 13221–13231.

[12] C. R. Qi, O. Litany, K. He, and L. J. Guibas, “Deep hough voting for 3d object detection in point clouds,” in Proceedings of the IEEE/CVF International Conference on Computer Vision, 2019, pp. 9277–9286.

[13] Z. Qin, H. Yu, C. Wang, Y. Peng, and K. Xu, “Deep graph-based spatial consistency for robust non-rigid point cloud registration,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023, pp. 5394–5403.

[14] H. Jiang, Z. Dang, Z. Wei, J. Xie, J. Yang, and M. Salzmann, “Robust outlier rejection for 3d registration with variational bayes,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023, pp. 1148–1157.

[15] X. Zhang, J. Yang, S. Zhang, and Y. Zhang, “3d registration with maximal cliques,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023, pp. 17745–17754.

[16] J. Yu, L. Ren, Y. Zhang, W. Zhou, L. Lin, and G. Dai, “Peal: Prior-embedded explicit attention learning for low-overlap point cloud registration,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023, pp. 17702–17711.

[17] X. Gu, C. Tang, W. Yuan, Z. Dai, S. Zhu, and P. Tan, “Rcp: Recurrent closest point for scene flow estimation on 3d point clouds,” arXiv preprint arXiv:2205.11028, 2022.

[18] Z. Teed and J. Deng, “Raft: Recurrent all-pairs field transforms for optical flow,” in Computer Vision–ECCV 2020: 16th European Conference, Glasgow, UK, August 23–28, 2020, Proceedings, Part II. Springer, 2020, pp. 402–419.
[20] G. Mei, X. Huang, L. Yu, J. Zhang, and M. Bennamoun, “Cotreg: Coupled optimal transport based point cloud registration,” arXiv preprint arXiv:2112.14381, 2021.

[21] Q. Wu, Y. Shen, H. Jiang, G. Mei, Y. Ding, L. Luo, J. Xie, and J. Yang, “Graph matching optimization network for point cloud registration.”

[22] J. Ho, A. Jain, and P. Abbeel, “Denoising diffusion probabilistic models,” Advances in Neural Information Processing Systems, vol. 33, pp. 6840–6851, 2020.

[23] M. Jaggi, “Revisiting frank-wolfe: Projection-free sparse convex optimization,” in International Conference on Machine Learning. PMLR, 2013, pp. 427–435.

[24] G. Parisi, “Correlation functions and computer simulations,” Nuclear Physics B, vol. 180, no. 3, pp. 378–384, 1981.

[25] R. M. Neal et al., “Mcmc using hamiltonian dynamics,” Handbook of Markov Chain Monte Carlo, vol. 2, no. 11, p. 2, 2011.

[26] R. M. Caron, X. Li, P. Mikusiński, H. Sherwood, and M. D. Taylor, “Nonsquare ‘doubly stochastic’ matrices,” Lecture Notes-Monograph Series, vol. 28, pp. 65–75, 1996. [Online]. Available: http://www.jstor.org/stable/4355884

[27] Z. Teed and J. Deng, “Raft-3d: Scene flow using rigid-motion embeddings,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2021, pp. 8375–8384.

[28] J. Song, C. Meng, and S. Ermon, “Denoising diffusion implicit models,” arXiv preprint arXiv:2010.02502, 2020.

[29] X. Bai, Z. Luo, L. Zhou, H. Fu, L. Quan, and C.-L. Tai, “D3feat: Joint learning of dense detection and description of 3d local features,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020, pp. 6359–6367.
[30] S. Huang, Z. Gojcic, M. Usvyatsov, A. Wieser, and K. Schindler, “Predator: Registration of 3d point clouds with low overlap,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2021, pp. 4267–4276.

[31] H. Yu, F. Li, M. Saleh, B. Busam, and S. Ilic, “Cofinet: Reliable coarse-to-fine correspondences for robust pointcloud registration,” Advances in Neural Information Processing Systems, vol. 34, pp. 23872–23884, 2021.

[32] H. Yu, Z. Qin, J. Hou, M. Saleh, D. Li, B. Busam, and S. Ilic, “Rotation-invariant transformer for point cloud matching,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023, pp. 5384–5393.

[33] H. Yu, J. Hou, Z. Qin, M. Saleh, I. Shugurov, K. Wang, B. Busam, and S. Ilic, “Riga: Rotation-invariant and globally-aware descriptors for point cloud registration,” arXiv preprint arXiv:2209.13252, 2022.

[34] H. Deng, T. Birdal, and S. Ilic, “Ppf-foldnet: Unsupervised learning of rotation invariant 3d local descriptors,” in Proceedings of the European Conference on Computer Vision (ECCV), 2018, pp. 602–618.

[35] Z. Chen, Y. Ren, T. Zhang, Z. Dang, W. Tao, S. Süsstrunk, and M. Salzmann, “Diffusionpcr: Diffusion models for robust multi-step point cloud registration,” arXiv preprint arXiv:2312.03053, 2023.

[36] Y. Song and S. Ermon, “Generative modeling by estimating gradients of the data distribution,” Advances in Neural Information Processing Systems, vol. 32, 2019.

[37] S. Chen, P. Sun, Y. Song, and P. Luo, “Diffusiondet: Diffusion model for object detection,” in Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023, pp. 19830–19843.

[38] J. Austin, D. D. Johnson, J. Ho, D. Tarlow, and R. Van Den Berg, “Structured denoising diffusion models in discrete state-spaces,” Advances in Neural Information Processing Systems, vol. 34, pp. 17981–17993, 2021.

[39] S. Gu, D. Chen, J. Bao, F. Wen, B. Zhang, D. Chen, L. Yuan, and B. Guo, “Vector quantized diffusion model for text-to-image synthesis,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022, pp. 10696–10706.
[40] J. Urain, N. Funk, J. Peters, and G. Chalvatzaki, “Se(3)-diffusionfields: Learning smooth cost functions for joint grasp and motion optimization through diffusion,” in 2023 IEEE International Conference on Robotics and Automation (ICRA). IEEE, 2023, pp. 5923–5930.

[41] H. Jiang, M. Salzmann, Z. Dang, J. Xie, and J. Yang, “Se(3) diffusion model-based point cloud registration for robust 6d object pose estimation,” arXiv preprint arXiv:2310.17359, 2023.

[42] J. N. Sohl-Dickstein, E. A. Weiss, N. Maheswaranathan, and S. Ganguli, “Deep unsupervised learning using nonequilibrium thermodynamics,” ArXiv, vol. abs/1503.03585, 2015. [Online]. Available: https://api.semanticscholar.org/CorpusID:14888175

[43] Y. Song, J. Sohl-Dickstein, D. P. Kingma, A. Kumar, S. Ermon, and B. Poole, “Score-based generative modeling through stochastic differential equations,” arXiv preprint arXiv:2011.13456, 2020.

[44] D. P. Kingma, T. Salimans, B. Poole, and J. Ho, “Variational diffusion models,” ArXiv, vol. abs/2107.00630, 2021. [Online]. Available: https://api.semanticscholar.org/CorpusID:235694314

[45] M. Cuturi, “Sinkhorn distances: Lightspeed computation of optimal transport,” Advances in Neural Information Processing Systems, vol. 26, 2013.

[46] P. J. Besl and N. D. McKay, “Method for registration of 3-d shapes,” in Sensor Fusion IV: Control Paradigms and Data Structures, vol. 1611. SPIE, 1992, pp. 586–606.

[47] S. Bond-Taylor, P. Hessey, H. Sasaki, T. P. Breckon, and C. G. Willcocks, “Unleashing transformers: Parallel token prediction with discrete absorbing diffusion for fast high-resolution image generation from vector-quantized codes,” in European Conference on Computer Vision. Springer, 2022, pp. 170–188.

[48] K. S. Arun, T. S. Huang, and S. D. Blostein, “Least-squares fitting of two 3-d point sets,” IEEE Transactions on Pattern Analysis and Machine Intelligence, no. 5, pp. 698–700, 1987.

[49] R. W. Sumner, J. Schmid, and M. Pauly, “Embedded deformation for shape manipulation,” in ACM SIGGRAPH 2007 Papers, 2007, pp. 80–es.
[50] T. Igarashi, T. Moscovich, and J. F. Hughes, “As-rigid-as-possible shape manipulation,” ACM Transactions on Graphics (TOG), vol. 24, no. 3, pp. 1134–1141, 2005.

[51] Y. Li and T. Harada, “Non-rigid point cloud registration with neural deformation pyramid,” Advances in Neural Information Processing Systems, vol. 35, pp. 27757–27768, 2022.

[52] A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, Ł. Kaiser, and I. Polosukhin, “Attention is all you need,” Advances in Neural Information Processing Systems, vol. 30, 2017.

[53] A. Zeng, S. Song, M. Nießner, M. Fisher, J. Xiao, and T. Funkhouser, “3dmatch: Learning local geometric descriptors from rgb-d reconstructions,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2017, pp. 1802–1811.

[54] J. Lee, M. Cho, and K. M. Lee, “Hyper-graph matching via reweighted random walks,” in CVPR 2011. IEEE, 2011, pp. 1633–1640.

[55] C. Choy, J. Park, and V. Koltun, “Fully convolutional geometric features,” in Proceedings of the IEEE/CVF International Conference on Computer Vision, 2019, pp. 8958–8966.

[56] W. Wu, Z. Wang, Z. Li, W. Liu, and L. Fuxin, “Pointpwc-net: A coarse-to-fine network for supervised and self-supervised scene flow estimation on 3d point clouds,” arXiv preprint arXiv:1911.12408, 2019.

[57] G. Puy, A. Boulch, and R. Marlet, “Flot: Scene flow on point clouds guided by optimal transport,” in European Conference on Computer Vision. Springer, 2020, pp. 527–544.

[58] X. Li, J. Kaesemodel Pontes, and S. Lucey, “Neural scene flow prior,” Advances in Neural Information Processing Systems, vol. 34, pp. 7838–7851, 2021.

[59] Y. Li, H. Takehara, T. Taketomi, B. Zheng, and M. Nießner, “4dcomplete: Non-rigid motion estimation beyond the observable surface,” in Proceedings of the IEEE/CVF International Conference on Computer Vision, 2021, pp. 12706–12716.

[1] [1]

A comprehensive survey on point cloud registration,

X. Huang, G. Mei, J. Zhang, and R. Abbas, “A comprehensive survey on point cloud registration,” 2021

work page 2021

[2] [2]

Self-supervised 3d scene flow estimation guided by superpoints,

Y . Shen, L. Hui, J. Xie, and J. Yang, “Self-supervised 3d scene flow estimation guided by superpoints,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition , 2023, pp. 5271–5280

work page 2023

[3] [3]

Loam: Lidar odometry and mapping in real- time

J. Zhang and S. Singh, “Loam: Lidar odometry and mapping in real- time.” in Robotics: Science and systems , vol. 2, no. 9. Berkeley, CA, 2014, pp. 1–9

work page 2014

[4] [4]

Geometric transformer for fast and robust point cloud registration,

Z. Qin, H. Yu, C. Wang, Y . Guo, Y . Peng, and K. Xu, “Geometric transformer for fast and robust point cloud registration,” in Proceed- ings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022, pp. 11 143–11 152

work page 2022

[5] [5]

Lepard: Learning partial point cloud matching in rigid and deformable scenes,

Y . Li and T. Harada, “Lepard: Learning partial point cloud matching in rigid and deformable scenes,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition , 2022, pp. 5554–5564

work page 2022

[6] [6]

Regtr: End-to-end point cloud cor- respondences with transformers,

Z. J. Yew and G. H. Lee, “Regtr: End-to-end point cloud cor- respondences with transformers,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition , 2022, pp. 6677–6686

work page 2022

[7] [7]

Sgfeat: Salient geometric feature for point cloud registration,

Q. Wu, Y . Ding, L. Luo, C. Zhou, J. Xie, and J. Yang, “Sgfeat: Salient geometric feature for point cloud registration,” arXiv preprint arXiv:2309.06207, 2023

work page arXiv 2023

[8] [8]

Unsupervised deep probabilistic approach for partial point cloud registration,

G. Mei, H. Tang, X. Huang, W. Wang, J. Liu, J. Zhang, L. Van Gool, and Q. Wu, “Unsupervised deep probabilistic approach for partial point cloud registration,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition , 2023, pp. 13 611–13 620

work page 2023

[9] [9]

Kpconv: Flexible and deformable convolution for point clouds,

H. Thomas, C. R. Qi, J.-E. Deschaud, B. Marcotegui, F. Goulette, and L. J. Guibas, “Kpconv: Flexible and deformable convolution for point clouds,” in Proceedings of the IEEE/CVF international conference on computer vision, 2019, pp. 6411–6420

work page 2019

[10] X. Bai, Z. Luo, L. Zhou, H. Chen, L. Li, Z. Hu, H. Fu, and C.-L. Tai, “Pointdsc: Robust point cloud registration using deep spatial consistency,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2021, pp. 15859–15869

[11] Z. Chen, K. Sun, F. Yang, and W. Tao, “Sc2-pcr: A second order spatial compatibility for efficient and robust point cloud registration,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022, pp. 13221–13231

[12] C. R. Qi, O. Litany, K. He, and L. J. Guibas, “Deep hough voting for 3d object detection in point clouds,” in Proceedings of the IEEE/CVF International Conference on Computer Vision, 2019, pp. 9277–9286

[13] Z. Qin, H. Yu, C. Wang, Y. Peng, and K. Xu, “Deep graph-based spatial consistency for robust non-rigid point cloud registration,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023, pp. 5394–5403

[14] H. Jiang, Z. Dang, Z. Wei, J. Xie, J. Yang, and M. Salzmann, “Robust outlier rejection for 3d registration with variational bayes,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023, pp. 1148–1157

[15] X. Zhang, J. Yang, S. Zhang, and Y. Zhang, “3d registration with maximal cliques,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023, pp. 17745–17754

[16] J. Yu, L. Ren, Y. Zhang, W. Zhou, L. Lin, and G. Dai, “Peal: Prior-embedded explicit attention learning for low-overlap point cloud registration,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023, pp. 17702–17711

[17] X. Gu, C. Tang, W. Yuan, Z. Dai, S. Zhu, and P. Tan, “Rcp: Recurrent closest point for scene flow estimation on 3d point clouds,” arXiv preprint arXiv:2205.11028, 2022

[18] Z. Teed and J. Deng, “Raft: Recurrent all-pairs field transforms for optical flow,” in Computer Vision–ECCV 2020: 16th European Conference, Glasgow, UK, August 23–28, 2020, Proceedings, Part II. Springer, 2020, pp. 402–419

[20] G. Mei, X. Huang, L. Yu, J. Zhang, and M. Bennamoun, “Cotreg: Coupled optimal transport based point cloud registration,” arXiv preprint arXiv:2112.14381, 2021

[21] Q. Wu, Y. Shen, H. Jiang, G. Mei, Y. Ding, L. Luo, J. Xie, and J. Yang, “Graph matching optimization network for point cloud registration.”

[22] J. Ho, A. Jain, and P. Abbeel, “Denoising diffusion probabilistic models,” Advances in Neural Information Processing Systems, vol. 33, pp. 6840–6851, 2020

[23] M. Jaggi, “Revisiting frank-wolfe: Projection-free sparse convex optimization,” in International Conference on Machine Learning. PMLR, 2013, pp. 427–435

[24] G. Parisi, “Correlation functions and computer simulations,” Nuclear Physics B, vol. 180, no. 3, pp. 378–384, 1981

[25] R. M. Neal et al., “Mcmc using hamiltonian dynamics,” Handbook of Markov Chain Monte Carlo, vol. 2, no. 11, p. 2, 2011

[26] R. M. Caron, X. Li, P. Mikusiński, H. Sherwood, and M. D. Taylor, “Nonsquare ‘doubly stochastic’ matrices,” Lecture Notes-Monograph Series, vol. 28, pp. 65–75, 1996. [Online]. Available: http://www.jstor.org/stable/4355884

[27] Z. Teed and J. Deng, “Raft-3d: Scene flow using rigid-motion embeddings,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2021, pp. 8375–8384

[28] J. Song, C. Meng, and S. Ermon, “Denoising diffusion implicit models,” arXiv preprint arXiv:2010.02502, 2020

[29] X. Bai, Z. Luo, L. Zhou, H. Fu, L. Quan, and C.-L. Tai, “D3feat: Joint learning of dense detection and description of 3d local features,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020, pp. 6359–6367

[30] S. Huang, Z. Gojcic, M. Usvyatsov, A. Wieser, and K. Schindler, “Predator: Registration of 3d point clouds with low overlap,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2021, pp. 4267–4276

[31] H. Yu, F. Li, M. Saleh, B. Busam, and S. Ilic, “Cofinet: Reliable coarse-to-fine correspondences for robust pointcloud registration,” Advances in Neural Information Processing Systems, vol. 34, pp. 23872–23884, 2021

[32] H. Yu, Z. Qin, J. Hou, M. Saleh, D. Li, B. Busam, and S. Ilic, “Rotation-invariant transformer for point cloud matching,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023, pp. 5384–5393

[33] H. Yu, J. Hou, Z. Qin, M. Saleh, I. Shugurov, K. Wang, B. Busam, and S. Ilic, “Riga: Rotation-invariant and globally-aware descriptors for point cloud registration,” arXiv preprint arXiv:2209.13252, 2022

[34] H. Deng, T. Birdal, and S. Ilic, “Ppf-foldnet: Unsupervised learning of rotation invariant 3d local descriptors,” in Proceedings of the European Conference on Computer Vision (ECCV), 2018, pp. 602–618

[35] Z. Chen, Y. Ren, T. Zhang, Z. Dang, W. Tao, S. Süsstrunk, and M. Salzmann, “Diffusionpcr: Diffusion models for robust multi-step point cloud registration,” arXiv preprint arXiv:2312.03053, 2023

[36] Y. Song and S. Ermon, “Generative modeling by estimating gradients of the data distribution,” Advances in Neural Information Processing Systems, vol. 32, 2019

[37] S. Chen, P. Sun, Y. Song, and P. Luo, “Diffusiondet: Diffusion model for object detection,” in Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023, pp. 19830–19843

[38] J. Austin, D. D. Johnson, J. Ho, D. Tarlow, and R. Van Den Berg, “Structured denoising diffusion models in discrete state-spaces,” Advances in Neural Information Processing Systems, vol. 34, pp. 17981–17993, 2021

[39] S. Gu, D. Chen, J. Bao, F. Wen, B. Zhang, D. Chen, L. Yuan, and B. Guo, “Vector quantized diffusion model for text-to-image synthesis,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022, pp. 10696–10706

[40] J. Urain, N. Funk, J. Peters, and G. Chalvatzaki, “Se(3)-diffusionfields: Learning smooth cost functions for joint grasp and motion optimization through diffusion,” in 2023 IEEE International Conference on Robotics and Automation (ICRA). IEEE, 2023, pp. 5923–5930

[41] H. Jiang, M. Salzmann, Z. Dang, J. Xie, and J. Yang, “Se(3) diffusion model-based point cloud registration for robust 6d object pose estimation,” arXiv preprint arXiv:2310.17359, 2023

[42] J. N. Sohl-Dickstein, E. A. Weiss, N. Maheswaranathan, and S. Ganguli, “Deep unsupervised learning using nonequilibrium thermodynamics,” ArXiv, vol. abs/1503.03585, 2015. [Online]. Available: https://api.semanticscholar.org/CorpusID:14888175

[43] Y. Song, J. Sohl-Dickstein, D. P. Kingma, A. Kumar, S. Ermon, and B. Poole, “Score-based generative modeling through stochastic differential equations,” arXiv preprint arXiv:2011.13456, 2020

[44] D. P. Kingma, T. Salimans, B. Poole, and J. Ho, “Variational diffusion models,” ArXiv, vol. abs/2107.00630, 2021. [Online]. Available: https://api.semanticscholar.org/CorpusID:235694314

[45] M. Cuturi, “Sinkhorn distances: Lightspeed computation of optimal transport,” Advances in Neural Information Processing Systems, vol. 26, 2013

[46] P. J. Besl and N. D. McKay, “Method for registration of 3-d shapes,” in Sensor Fusion IV: Control Paradigms and Data Structures, vol. 1611. SPIE, 1992, pp. 586–606

[47] S. Bond-Taylor, P. Hessey, H. Sasaki, T. P. Breckon, and C. G. Willcocks, “Unleashing transformers: Parallel token prediction with discrete absorbing diffusion for fast high-resolution image generation from vector-quantized codes,” in European Conference on Computer Vision. Springer, 2022, pp. 170–188

[48] K. S. Arun, T. S. Huang, and S. D. Blostein, “Least-squares fitting of two 3-d point sets,” IEEE Transactions on Pattern Analysis and Machine Intelligence, no. 5, pp. 698–700, 1987

[49] R. W. Sumner, J. Schmid, and M. Pauly, “Embedded deformation for shape manipulation,” in ACM SIGGRAPH 2007 Papers, 2007, pp. 80–es

[50] T. Igarashi, T. Moscovich, and J. F. Hughes, “As-rigid-as-possible shape manipulation,” ACM Transactions on Graphics (TOG), vol. 24, no. 3, pp. 1134–1141, 2005

[51] Y. Li and T. Harada, “Non-rigid point cloud registration with neural deformation pyramid,” Advances in Neural Information Processing Systems, vol. 35, pp. 27757–27768, 2022

[52] A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, Ł. Kaiser, and I. Polosukhin, “Attention is all you need,” Advances in Neural Information Processing Systems, vol. 30, 2017

[53] A. Zeng, S. Song, M. Nießner, M. Fisher, J. Xiao, and T. Funkhouser, “3dmatch: Learning local geometric descriptors from rgb-d reconstructions,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2017, pp. 1802–1811

[54] J. Lee, M. Cho, and K. M. Lee, “Hyper-graph matching via reweighted random walks,” in CVPR 2011. IEEE, 2011, pp. 1633–1640

[55] C. Choy, J. Park, and V. Koltun, “Fully convolutional geometric features,” in Proceedings of the IEEE/CVF International Conference on Computer Vision, 2019, pp. 8958–8966

[56] W. Wu, Z. Wang, Z. Li, W. Liu, and L. Fuxin, “Pointpwc-net: A coarse-to-fine network for supervised and self-supervised scene flow estimation on 3d point clouds,” arXiv preprint arXiv:1911.12408, 2019

[57] G. Puy, A. Boulch, and R. Marlet, “Flot: Scene flow on point clouds guided by optimal transport,” in European Conference on Computer Vision. Springer, 2020, pp. 527–544

[58] X. Li, J. Kaesemodel Pontes, and S. Lucey, “Neural scene flow prior,” Advances in Neural Information Processing Systems, vol. 34, pp. 7838–7851, 2021

[59] Y. Li, H. Takehara, T. Taketomi, B. Zheng, and M. Nießner, “4dcomplete: Non-rigid motion estimation beyond the observable surface,” in Proceedings of the IEEE/CVF International Conference on Computer Vision, 2021, pp. 12706–12716