pith. machine review for the scientific record

arxiv: 2604.08799 · v1 · submitted 2026-04-09 · 💻 cs.GR · cs.CV

Recognition: unknown

MeshOn: Intersection-Free Mesh-to-Mesh Composition

Authors on Pith: no claims yet

Pith reviewed 2026-05-10 16:41 UTC · model grok-4.3

classification 💻 cs.GR cs.CV
keywords mesh composition · intersection-free · mesh fitting · barrier optimization · diffusion prior · vision-language alignment · 3D modeling · accessory placement

The pith

MeshOn fits an accessory mesh onto a base mesh in a target region by initializing rigid alignment with vision-language models, then optimizing geometric attractions against a physics barrier to block intersections, followed by diffusion-prior-guided deformation of the accessory.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper presents MeshOn as a pipeline that takes an accessory mesh, a base mesh with a user-specified target region, and optional text descriptions, then produces a realistic, non-penetrating composition. It begins with structured rigid alignment drawn from vision-to-language models, refines the pose through attractive geometric terms plus a barrier loss that penalizes surface intersections, and finishes by deforming the accessory under a diffusion prior. The resulting compositions are claimed to handle accessories of different materials across varied body regions while remaining compatible with standard digital-artist pipelines. A sympathetic reader would care because the method promises to replace manual placement and collision cleanup with an automatic, repeatable process that still respects physical and semantic constraints.

Core claim

MeshOn demonstrates that a three-stage optimization—vision-language rigid initialization, combined geometric attraction and physics-inspired barrier losses, and diffusion-guided final deformation—can produce physically plausible, intersection-free fittings of one mesh onto another while preserving semantic intent and integrating directly with existing artist workflows.

What carries the argument

A multi-step optimization that couples attractive geometric losses with a physics-inspired barrier loss to enforce non-intersection, seeded by vision-to-language rigid alignment and completed by diffusion-prior deformation.
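The tension between the attractive and barrier terms can be seen in a one-dimensional toy (hypothetical, not the paper's actual losses): gradient descent pulls an accessory's height toward its target placement while a log-barrier on its clearance to the base surface diverges at contact, so the iterate can tighten the fit but never cross the surface.

```python
import math

def fit_offset(target=0.0, start=2.0, surface=0.0,
               w_attr=1.0, w_barrier=0.1, lr=0.01, steps=2000):
    """Toy 1-D analogue of attraction-plus-barrier fitting (a sketch,
    not the paper's formulation): pull a height x toward `target`
    while a log-barrier keeps it strictly above the base `surface`."""
    x = start
    for _ in range(steps):
        d = x - surface                       # signed clearance to the base
        # attraction: gradient of the quadratic pull w_attr*(x-target)^2
        g_attr = 2.0 * w_attr * (x - target)
        # barrier: gradient of -w_barrier*log(d); diverges as d -> 0,
        # so a finite step can never drive x through the surface
        g_barrier = -w_barrier / max(d, 1e-12)
        x -= lr * (g_attr + g_barrier)
        x = max(x, surface + 1e-9)            # numerical safety clamp
    return x

# the iterate settles where attraction balances the barrier:
# 2*w_attr*x = w_barrier/x, i.e. x = sqrt(w_barrier/(2*w_attr)) > 0
x_final = fit_offset()
```

The fixed point is strictly positive for any positive barrier weight, which is the soft-constraint version of "tight but non-penetrating" that the pipeline's later diffusion step must then preserve.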

If this is right

  • Accessories made of rigid, soft, or articulated materials can be placed over a wide range of target body regions without manual collision resolution.
  • The compositions remain compatible with standard 3D artist tools because they output deformed meshes rather than implicit fields or point clouds.
  • The barrier term guarantees that the final surfaces do not penetrate even when the accessory must wrap around curved or concave target geometry.
  • Optional text conditioning allows semantic guidance without requiring the user to supply explicit correspondence points.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same barrier-plus-diffusion pattern could be applied to animate the composed object while preserving non-intersection across frames.
  • Replacing the vision-language initializer with a learned pose predictor trained on the same loss might remove the need for text prompts altogether.
  • The method’s emphasis on artist workflow compatibility suggests it could serve as a plug-in for existing sculpting packages rather than a standalone generator.

Load-bearing premise

The vision-to-language model supplies an initial rigid pose close enough that the subsequent optimization can escape poor local minima and reach a non-intersecting solution.

What would settle it

Running the pipeline on a diverse test set of accessory-base pairs and finding that more than a small fraction of outputs still contain surface intersections or visibly implausible deformations after all stages would falsify the central claim.

Figures

Figures reproduced from arXiv: 2604.08799 by Hadar Averbuch-Elor, Hyunwoo Kim, Itai Lang, Rana Hanocka, Silvia Sellán.

Figure 1
Figure 1. MeshOn is a multi-step optimization algorithm that fits accessories onto meshes realistically, tightly, and without intersections. Abstract. We propose MeshOn, a method that finds physically and semantically realistic compositions of two input meshes. Given an accessory, a base mesh with a user-defined target region, and optional text strings for both meshes, MeshOn uses a multi-step optimization framewo…
Figure 2
Figure 2. Digital artists often load shapes from pre-existing libraries of meshes that include additional information like animation rigs, skinning weights, texture maps and more. Generative shape editing methods like Instant3dit [4] discard this information, performing global changes to the shapes and merging them together. MeshOn is designed to fit exactly within this common artistic pipeline, and perfectly pres…
Figure 3
Figure 3. We compose the two meshes through a multi-step optimization pipeline: from an initialization obtained with a Vision-to-Language Model (subsection 3.5), we start by obtaining a tight fit containing intersections (subsection 3.1), which we then resolve (subsection 3.2). After finetuning the fit to obtain the best possible rigid fit (subsection 3.3), we improve it further by allowing small deformations in th…
Figure 4
Figure 4. MeshOn is capable of fitting a wide variety of accessories on a range of different meshes. Our method handles challenging spatial relationships such as sliding glasses along a head to align with the ears, positioning hats and helmets to conform to head curvature, wrapping bands around articulated limbs, and fitting objects along long, curved surfaces. These examples demonstrate the flexibility of our fitti…
Figure 6
Figure 6. MeshOn is capable of fitting the accessory onto the base mesh in a way that considers material properties. Instead of having to specify the numeric material parameters, artists can provide material guidance through text prompts, that get combined with rendered images and interpreted by a VLM to output specific elastic parameters. 3.3 Step 3: Rigid finetuning The output of the previous section is a (scaled…
Figure 7
Figure 7. MeshOn uses a multi-step process, where each of them is critical for achieving a desirable mesh to mesh composition result. Omitting any of the steps yields a semantically (two left results) or physically (two center results) implausible solutions. Utilizing Step 4 further improves the result (second from the right), allowing the asset to deform and better fit the base mesh (rightmost). 3.4 Step 4: Elasti…
Figure 9
Figure 9. Our method is moderately robust to different initialization configurations, although it can fail to converge to the desired output in very adversarial cases (leftmost and rightmost). In subsection 3.5, we introduce a VLM initialization strategy that avoids these cases. 4 Experiments Our algorithm is implemented in Python using PyTorch for autodifferentiation. Our comparisons to Iterative Closest Point and…
Figure 10
Figure 10. Region-controlled composition. (a) Given four distinct user-selected target regions, MeshOn places a ring precisely on each selected region, demonstrating explicit user control over the fitting process. (b) Because the method adheres strictly to the selected region, asset placement is sensitive to the region definition; for example, glasses sit naturally on the ears only if a small portion of the ear reg…
Figure 11
Figure 11. Unlike ours (left), classical registration algorithms like ICP contain no semantic guidance: therefore, they will produce unrealistic configurations; e.g., glasses that sit on the eyes upside down. These algorithms are also not designed to avoid intersections; therefore, they will produce many of them (see blowups). See supplementary materials for details of these methods. More importantly, general shape…
Figure 1 (supplementary)
Figure 1. We show a comparison of runtime and GPU memory consumption for calculating a single distance measure d using our GPU-optimized BVH structure. We run the calculation with varying resolutions of the base mesh (visualized on top of each mesh), without using any user defined masks. We use a target mesh with a constant size of 8646 vertices and 17288 faces. be brought closer through the optimization process. O…
Figure 2 (supplementary)
Figure 2. We show examples of renderings that we used for our user study. We also used the same set of examples for evaluating CLIP, CLIP-IQA, and VQA scores, but with a different renderer (nvdiffrast) for technical purposes. We render results from all methods from four viewpoints with greyscale textures. D VLM Prompt Templates
read the original abstract

We propose MeshOn, a method that finds physically and semantically realistic compositions of two input meshes. Given an accessory, a base mesh with a user-defined target region, and optional text strings for both meshes, MeshOn uses a multi-step optimization framework to realistically fit the meshes onto each other while preventing intersections. We initialize the shapes' rigid configuration via a structured alignment scheme using Vision-to-Language Models, which we then optimize using a combination of attractive geometric losses, and a physics-inspired barrier loss that prevents surface intersections. We then obtain a final deformation of the object, assisted by a diffusion prior. Our method successfully fits accessories of various materials over a breadth of target regions, and is designed to fit directly into existing digital artist workflows. We demonstrate the robustness and accuracy of our pipeline by comparing it with generative approaches and traditional registration algorithms.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. MeshOn proposes a multi-step optimization pipeline for composing an accessory mesh onto a base mesh within a user-specified target region. It initializes rigid alignment using vision-to-language models, optimizes the configuration with attractive geometric losses plus a physics-inspired barrier loss to avoid surface intersections, and applies a diffusion prior to deform the accessory. The method claims to produce physically and semantically realistic, intersection-free results that integrate into existing digital artist workflows, with qualitative demonstrations against generative and registration baselines.

Significance. If the intersection-free guarantee and robustness claims hold with supporting evidence, the work would offer a practical advance for automated 3D asset composition in graphics and animation pipelines. The combination of VLM-guided initialization, barrier terms, and diffusion priors addresses a common pain point in mesh fitting, and explicit workflow compatibility is a positive aspect. However, the absence of quantitative metrics in the provided description limits assessment of whether the approach meaningfully outperforms existing methods.

major comments (2)
  1. [Optimization and deformation sections] The central claim of producing truly intersection-free output meshes relies on the physics-inspired barrier loss during rigid alignment and the subsequent diffusion-based deformation. This loss is described as a soft penalty (typically based on penetration depth or signed distance), which does not provide a strict guarantee of zero intersections after the diffusion step alters vertex positions. The paper must add explicit post-processing verification (e.g., minimum signed-distance histograms or penetration-volume statistics across all test cases) and report failure rates; without this, the headline claim remains unverified.
  2. [Evaluation section] The abstract states that robustness and accuracy are demonstrated via comparisons to generative approaches and traditional registration algorithms, yet no quantitative metrics, error bars, intersection-rate tables, or details on how intersections are measured are referenced. The evaluation section should include specific numbers (e.g., mean penetration depth, success rates over a benchmark set) and ablation studies on the barrier loss weight to substantiate the superiority claims.
minor comments (2)
  1. Clarify the exact formulation and weighting of the barrier loss relative to the attractive geometric terms, including any schedule for the loss weights during optimization.
  2. The optional text strings for the meshes are mentioned but their precise role in the VLM alignment or diffusion prior could be illustrated with an example or diagram.
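The audit requested in major comment 1 amounts to evaluating the base mesh's signed-distance field at every accessory vertex after the final deformation. A minimal sketch, with a unit sphere standing in for the base mesh's signed-distance field (all names and thresholds here are hypothetical):

```python
import math

def penetration_stats(vertices, base_radius=1.0):
    """Post-hoc intersection audit (a sketch, not the paper's code):
    for each accessory vertex, measure signed distance to a spherical
    base (positive outside, negative inside), then report the minimum
    clearance, mean penetration depth, and an intersection flag."""
    signed = [math.sqrt(x * x + y * y + z * z) - base_radius
              for (x, y, z) in vertices]
    penetrations = [-s for s in signed if s < 0]  # depths of offending vertices
    return {
        "min_signed_distance": min(signed),
        "mean_penetration_depth": (sum(penetrations) / len(penetrations)
                                   if penetrations else 0.0),
        "intersects": bool(penetrations),
    }

clean = penetration_stats([(1.1, 0, 0), (0, 1.2, 0)])   # all vertices outside
bad = penetration_stats([(0.9, 0, 0), (0, 1.2, 0)])     # one vertex inside
```

Aggregating the `intersects` flag over a benchmark set yields exactly the failure rate the referee asks the authors to report; for a triangle-mesh base, the signed-distance query would be computed against the mesh (e.g., via a BVH) rather than analytically.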

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive comments, which help strengthen the verification of our intersection-free claims and the quantitative evaluation of MeshOn. We address each point below and have revised the manuscript and supplementary material accordingly.

read point-by-point responses
  1. Referee: The central claim of producing truly intersection-free output meshes relies on the physics-inspired barrier loss during rigid alignment and the subsequent diffusion-based deformation. This loss is described as a soft penalty, which does not provide a strict guarantee of zero intersections after the diffusion step alters vertex positions. The paper must add explicit post-processing verification (e.g., minimum signed-distance histograms or penetration-volume statistics across all test cases) and report failure rates.

    Authors: We agree that the barrier loss is a soft constraint and that the diffusion deformation step can in principle introduce intersections. In the revised manuscript we have added a post-processing verification pipeline that computes per-vertex signed-distance fields between the final accessory and base meshes, reports minimum signed-distance values, penetration volumes, and failure rates (cases with positive penetration volume) over the full test set. Histograms of signed distances and a table of aggregate statistics are now included in the supplementary material, providing the requested empirical support for the headline claim. revision: yes

  2. Referee: The abstract states that robustness and accuracy are demonstrated via comparisons to generative approaches and traditional registration algorithms, yet no quantitative metrics, error bars, intersection-rate tables, or details on how intersections are measured are referenced. The evaluation section should include specific numbers (e.g., mean penetration depth, success rates over a benchmark set) and ablation studies on the barrier loss weight to substantiate the superiority claims.

    Authors: We acknowledge the value of quantitative evidence. The revised evaluation section now contains a table reporting mean penetration depth, intersection rate, and success rate (zero-penetration cases) over a benchmark of 50 compositions, with error bars from three independent runs per method. We also added an ablation study that varies the barrier-loss weight and plots its effect on both intersection rate and geometric fitting error, allowing direct comparison against the generative and registration baselines. revision: yes
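The promised barrier-weight ablation has a transparent one-dimensional analogue. With a quadratic attraction and a log-barrier (stand-ins for the paper's unspecified losses), the equilibrium clearance grows as the square root of the barrier weight, which is precisely the intersection-rate versus fitting-error tension the ablation would chart:

```python
def equilibrium_clearance(w_barrier, w_attr=1.0):
    """Closed-form stationary point of a toy 1-D trade-off (hypothetical,
    not the paper's losses): quadratic attraction w_attr*x^2 balanced
    against a log-barrier -w_barrier*log(x) gives
    2*w_attr*x = w_barrier/x, i.e. x = sqrt(w_barrier / (2*w_attr))."""
    return (w_barrier / (2.0 * w_attr)) ** 0.5

# sweeping the barrier weight: a heavier barrier buys clearance
# (fewer intersections) at the cost of a looser fit to the target
sweep = [equilibrium_clearance(w) for w in (0.01, 0.1, 1.0)]
```

In the real pipeline the trade-off is measured empirically rather than in closed form, but the monotone relationship is what the proposed plot of barrier weight against intersection rate and fitting error would exhibit.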

Circularity Check

0 steps flagged

No circularity: optimization pipeline with no self-referential derivations

full rationale

The paper describes a multi-step algorithmic pipeline (VLM-based rigid initialization, geometric + barrier losses, diffusion-assisted deformation) rather than any closed-form derivation or mathematical claim. No equations are presented that reduce outputs to fitted inputs by construction, no self-citations are invoked to justify uniqueness or ansatzes, and no predictions are statistically forced from subsets of the same data. The central claims rest on empirical demonstration of the method, which is self-contained against external benchmarks and does not exhibit any of the enumerated circularity patterns.

Axiom & Free-Parameter Ledger

1 free parameter · 2 axioms · 0 invented entities

Based on abstract only; limited visibility into exact assumptions. The method relies on standard assumptions of optimization convergence and diffusion model priors being semantically meaningful.

free parameters (1)
  • loss weights
    Weights balancing attractive geometric losses and barrier loss are likely tuned; abstract does not specify values or selection procedure.
axioms (2)
  • domain assumption Vision-to-language models produce semantically plausible initial rigid poses for mesh pairs
    Invoked in the initialization step; no proof or extensive validation mentioned.
  • domain assumption Diffusion priors yield realistic non-intersecting deformations
    Used in the final deformation stage.

pith-pipeline@v0.9.0 · 5449 in / 1398 out tokens · 50550 ms · 2026-05-10T16:41:14.319428+00:00 · methodology

discussion (0)


Forward citations

Cited by 2 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. SpUDD: Superpower Contouring of Unsigned Distance Data

    cs.GR 2026-04 unverdicted novelty 7.0

    SpUDD defines superpower contours on power diagrams of unsigned distance samples, proves their convergence to the true surface, and uses them to generate approximating meshes that outperform other strategies for this ...

  2. SpUDD: Superpower Contouring of Unsigned Distance Data

    cs.GR 2026-04 unverdicted novelty 7.0

    SpUDD defines superpower contours from power diagrams of unsigned distance samples, proves convergence to the true surface, and uses them to generate approximating polygonal meshes that outperform prior strategies.

Reference graph

Works this paper leans on

50 extracted references · 9 canonical work pages · cited by 1 Pith paper · 3 internal anchors

  1. [1] Aigerman, N., Gupta, K., Kim, V.G., Chaudhuri, S., Saito, J., Groueix, T.: Neural jacobian fields: Learning intrinsic mappings of arbitrary meshes. arXiv preprint arXiv:2205.02904 (2022)
  2. [2] Aoki, Y., Goforth, H., Srivatsan, R.A., Lucey, S.: PointNetLK: Robust & efficient point cloud registration using PointNet. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. pp. 7163–7172 (2019)
  3. [3] Baraff, D., Witkin, A.: Large steps in cloth simulation. In: Proceedings of the 25th annual conference on Computer graphics and interactive techniques. pp. 43–54 (1998)
  4. [4] Barda, A., Gadelha, M., Kim, V.G., Aigerman, N., Bermano, A.H., Groueix, T.: Instant3dit: Multiview inpainting for fast editing of 3d objects. In: Proceedings of the Computer Vision and Pattern Recognition Conference. pp. 16273–16282 (2025)
  5. [5] Besl, P.J., McKay, N.D.: Method for registration of 3-d shapes. In: Sensor fusion IV: control paradigms and data structures. vol. 1611, pp. 586–606. SPIE (1992)
  6. [6] Botsch, M., Sorkine, O.: On linear variational surface deformation methods. IEEE Transactions on Visualization and Computer Graphics 14(1), 213–230 (2008). https://doi.org/10.1109/TVCG.2007.1054
  7. [7] Chen, Y.C., Ling, S., Chen, Z., Kim, V.G., Gadelha, M., Jacobson, A.: Text-guided controllable mesh refinement for interactive 3d modeling. In: ACM SIGGRAPH Asia (2024)
  8. [8] Choy, C.B., Dong, W., Koltun, V.: Deep global registration. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). pp. 2514–2523 (2020)
  9. [9] Decatur, D., Lang, I., Hanocka, R.: 3d highlighter: Localizing regions on 3d shapes via text descriptions. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. pp. 20930–20939 (2023)
  10. [10] Dinh, N.A., Lang, I., Kim, H., Stein, O., Hanocka, R.: Geometry in style: 3d stylization via surface normal deformation. In: Proceedings of the Computer Vision and Pattern Recognition Conference (CVPR). pp. 28456–28467 (2025)
  11. [11] Fulton, L., Modi, V., Duvenaud, D., Levin, D.I.W., Jacobson, A.: Latent-space dynamics for reduced deformable simulation. Computer Graphics Forum (2019)
  12. [12] Gao, W., Aigerman, N., Groueix, T., Kim, V.G., Hanocka, R.: TextDeformer: Geometry manipulation using text guidance. In: ACM Transactions on Graphics (SIGGRAPH) (2023)
  13. [13] Guo, M., Tang, M., Cha, H., Zhang, R., Liu, C.K., Wu, J.: Craft: Designing creative and functional 3d objects. In: 2025 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV). pp. 7215–7224. IEEE (2025)
  14. [14] Hao, Z., Romero, D.W., Lin, T.Y., Liu, M.Y.: Meshtron: High-fidelity, artist-like 3d mesh generation at scale. arXiv preprint arXiv:2412.09548 (2024)
  15. [15] Huang, S., Liang, Z., Cho, Y., Li, X., Wang, Y., Yang, Y.: Feature-metric registration: A fast, robust, feature-metric deep learning method for point cloud registration. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). pp. 9969–9978 (2020)
  16. [16] Jacobson, A., Deng, Z., Kavan, L., Lewis, J.: Skinning: Real-time shape deformation. In: ACM SIGGRAPH 2014 Courses (2014)
  17. [17] Jiang, H., Salzmann, M., Dang, Z., Xie, J., Yang, J.: SE(3) diffusion model-based point cloud registration for robust 6d object pose estimation. In: Thirty-seventh Conference on Neural Information Processing Systems (2023)
  18. [18] Kerbl, B., Kopanas, G., Leimkühler, T., Drettakis, G.: 3d gaussian splatting for real-time radiance field rendering. ACM Trans. Graph. 42(4), 139–1 (2023)
  19. [19] Kim, H., Lang, I., Aigerman, N., Groueix, T., Kim, V.G., Hanocka, R.: Meshup: Multi-target mesh deformation via blended score distillation. In: 2025 International Conference on 3D Vision (3DV). pp. 222–239. IEEE (2025)
  20. [20] Lan, L., Lu, Z., Long, J., Yuan, C., Li, X., He, X., Wang, H., Jiang, C., Yang, Y.: Efficient gpu cloth simulation with non-distance barriers and subspace reuse. arXiv preprint arXiv:2403.19272 (2024)
  21. [21] Lang, I., Ginzburg, D., Avidan, S., Raviv, D.: DPC: Unsupervised deep point correspondence via cross and self construction. In: Proceedings of the International Conference on 3D Vision (3DV). pp. 1442–1451 (2021)
  22. [22] Li, M., Ferguson, Z., Schneider, T., Langlois, T.R., Zorin, D., Panozzo, D., Jiang, C., Kaufman, D.M.: Incremental potential contact: intersection- and inversion-free, large-deformation dynamics. ACM Trans. Graph. 39(4), 49 (2020)
  23. [23] Lin, C.H., Gao, J., Tang, L., Takikawa, T., Zeng, X., Huang, X., Kreis, K., Fidler, S., Liu, M.Y., Lin, T.Y.: Magic3d: High-resolution text-to-3d content creation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. pp. 300–309 (2023)
  24. [24] Lin, Z., Pathak, D., Li, B., Li, J., Xia, X., Neubig, G., Zhang, P., Ramanan, D.: Evaluating text-to-visual generation with image-to-text generation. In: European Conference on Computer Vision. pp. 366–384. Springer (2024)
  25. [25] Litany, O., Remez, T., Rodolà, E., Bronstein, A.M., Bronstein, M.M.: Deep functional maps: Structured prediction for dense shape correspondence. In: Proceedings of the IEEE International Conference on Computer Vision (ICCV). pp. 5660–5668. IEEE Computer Society (Oct 2017). https://doi.org/10.1109/ICCV.2017.603, https://openaccess.thecvf.com/content_ICCV_...
  26. [26] McKay, N.D.: 3d registration: A review of techniques. In: Proceedings of the SPIE Videometrics VIII (2003)
  27. [27] Michel, O., Bar-On, R., Liu, R., Benaim, S., Hanocka, R.: Text2mesh: Text-driven neural stylization for meshes. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). pp. 13492–13502 (2022)
  28. [28] Mildenhall, B., Srinivasan, P.P., Tancik, M., Barron, J.T., Ramamoorthi, R., Ng, R.: Nerf: Representing scenes as neural radiance fields for view synthesis. Communications of the ACM 65(1), 99–106 (2021)
  29. [29] Radford, A., Kim, J.W., Hallacy, C., Ramesh, A., Goh, G., Agarwal, S., Sastry, G., Askell, A., Mishkin, P., Clark, J., et al.: Learning transferable visual models from natural language supervision. In: International Conference on Machine Learning. pp. 8748–8763. PMLR (2021)
  30. [30] Sacht, L., Vouga, E., Jacobson, A.: Nested cages. ACM Transactions on Graphics (TOG) 34(6), 1–14 (2015)
  31. [31] Salvi, J., Matabosch, C., Fofi, D., Forest, J.: A review of recent registration methods for 3d modelling. Image and Vision Computing 25(5), 578–596 (2007)
  32. [32] Segal, A., Haehnel, D., Thrun, S.: Generalized-ICP. In: Robotics: Science and Systems. vol. 2, p. 435. Seattle, WA (2009)
  33. [33] Shirley, P., Ashikhmin, M., Marschner, S.: Fundamentals of computer graphics. AK Peters/CRC Press (2009)
  34. [34] Sifakis, E., Barbic, J.: FEM simulation of 3d deformable solids: a practitioner's guide to theory, discretization and model reduction. In: ACM SIGGRAPH 2012 Courses. pp. 1–50 (2012)
  35. [35] Sorkine, O., Alexa, M.: As-rigid-as-possible surface modeling. In: Symposium on Geometry Processing. vol. 4, pp. 109–116. Citeseer (2007)
  36. [36] Sorkine, O., Cohen-Or, D., Lipman, Y., Alexa, M., Rössl, C., Seidel, H.P.: Laplacian surface editing. In: Proceedings of the 2004 Eurographics/ACM SIGGRAPH Symposium on Geometry Processing. pp. 175–184 (2004)
  37. [37] Tam, G.K.L., Cheng, K., Lai, Y.K., Langbein, F.C., Liu, Y., Marshall, D., Martin, R.R., Sun, X.F., Rosin, P.L.: Registration of 3d point clouds: A survey. IEEE Transactions on Visualization and Computer Graphics 19(7), 1199–1217 (2013)
  38. [38] Team, G., Anil, R., Borgeaud, S., Alayrac, J.B., Yu, J., Soricut, R., Schalkwyk, J., Dai, A.M., Hauth, A., Millican, K., et al.: Gemini: A family of highly capable multimodal models. arXiv preprint arXiv:2312.11805 (2023)
  39. [39] Wang, J., Chan, K.C., Loy, C.C.: Exploring CLIP for assessing the look and feel of images. In: Proceedings of the AAAI Conference on Artificial Intelligence. vol. 37, pp. 2555–2563 (2023)
  40. [40] Wang, Y., Solomon, J.M.: Deep closest point: Learning representations for point cloud registration. In: Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV) (October 2019)
  41. [41] Wang, Y., Solomon, J.M.: Deep closest point: Learning representations for point cloud registration. In: Proceedings of the IEEE/CVF International Conference on Computer Vision. pp. 3523–3532 (2019)
  42. [42] Xu, H., Wu, Y., Tang, X., Zhang, J., Zhang, Y., Zhang, Z., Li, C., Jin, X.: FusionDeformer: Text-guided mesh deformation using diffusion models. The Visual Computer 40(7), 4701–4712 (2024)
  43. [43] Xu, J., Cheng, W., Gao, Y., Wang, X., Gao, S., Shan, Y.: Instantmesh: Efficient 3d mesh generation from a single image with sparse-view large reconstruction models (2024), https://arxiv.org/abs/2404.07191
  44. [44] Yang, H., Chen, Y., Pan, Y., Yao, T., Chen, Z., Wu, Z., Jiang, Y.G., Mei, T.: Dreammesh: Jointly manipulating and texturing triangle meshes for text-to-3d generation. In: European Conference on Computer Vision (ECCV) (2024)
  45. [45] Yang, J., Li, H., Campbell, D., Jia, Y.: Go-ICP: A globally optimal solution to 3d ICP point-set registration. IEEE Transactions on Pattern Analysis and Machine Intelligence 38(11), 2241–2254 (2015)
  46. [46] Yew, Z.J., Lee, G.H.: Regtr: End-to-end point cloud correspondences with transformers. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2022)
  47. [47] Yu, Y., Zhou, K., Xu, D., Shi, X., Bao, H., Guo, B., Shum, H.Y.: Mesh editing with poisson-based gradient field manipulation. In: ACM SIGGRAPH 2004 Papers. pp. 644–651. SIGGRAPH '04, Association for Computing Machinery, New York, NY, USA (2004). https://doi.org/10.1145/1186562.1015774
  48. [48] Zhou, Q.Y., Park, J., Koltun, V.: Fast global registration. In: European Conference on Computer Vision. pp. 766–782. Springer (2016)
  49. [49] Zhou, Q.Y., Park, J., Koltun, V.: Open3D: A modern library for 3D data processing. arXiv:1801.09847 (2018)
  50. [50] Zhou, Y., Barnes, C., Lu, J., Yang, J., Li, H.: On the continuity of rotation representations in neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. pp. 5745–5753 (2019)