Skinned Motion Retargeting with Spatially Adaptive Interaction Guidance
Pith reviewed 2026-05-20 02:27 UTC · model grok-4.3
The pith
Motion retargeting preserves self-contacts using dynamically repositioned anchors
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The method preserves interaction semantics, such as self-contact and near-body proximity, across characters with exaggerated body proportions by performing proximity matching over spatially adaptive anchors. A Transformer-based refinement strategy with differentiable soft projection repositions these anchors dynamically, and the paper reports that this outperforms state-of-the-art approaches that rely on static correspondences.
What carries the argument
Spatially adaptive anchors, repositioned by a Transformer-based strategy with differentiable soft projection, supply pose-dependent guidance to a graph autoencoder.
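The review does not state how the paper defines its soft projection. As a minimal sketch of one common construction (a temperature-controlled softmax over the k nearest surface samples, similar in spirit to soft projection in differentiable point-cloud sampling), the function below shows how a displaced anchor could be pulled back toward the target geometry. The name `soft_project`, the parameters `k` and `temperature`, and the use of sampled surface points in place of a mesh are assumptions made for illustration, not the paper's formulation.

```python
import math


def soft_project(point, surface_points, k=3, temperature=0.1):
    """Softly project `point` onto a sampled surface (illustrative sketch).

    Returns the weighted average of the k nearest surface samples, with
    weights proportional to exp(-d^2 / temperature^2).
    """
    # Squared distance from the query point to every surface sample.
    scored = [
        (sum((p - s) ** 2 for p, s in zip(point, sample)), sample)
        for sample in surface_points
    ]
    nearest = sorted(scored, key=lambda pair: pair[0])[:k]

    # Subtract the smallest squared distance for numerical stability;
    # this does not change the normalized weights.
    d_min = nearest[0][0]
    weights = [math.exp(-(d2 - d_min) / temperature ** 2) for d2, _ in nearest]
    total = sum(weights)

    # Weighted average of the selected samples, one coordinate at a time.
    return tuple(
        sum(w * sample[i] for w, (_, sample) in zip(weights, nearest)) / total
        for i in range(len(point))
    )


# Hypothetical flat patch of target surface samples (z = 0 plane).
patch = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (1.0, 1.0, 0.0)]
displaced_anchor = (0.1, 0.1, 0.5)
sharp = soft_project(displaced_anchor, patch, temperature=0.01)
smooth = soft_project(displaced_anchor, patch, temperature=100.0)
```

With a small temperature the result approaches the single nearest sample, and with a large one it approaches the plain average of the k neighbours. The output lies in the convex hull of the selected samples, so it sits on the surface only approximately when the surface is curved.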
If this is right
- The approach handles exaggerated body proportions where static-correspondence methods fail.
- Alternating optimization aligns the anchor adaptation and motion retargeting tasks.
- A graph autoencoder uses the adapted anchors to predict target motion while preserving spatial configurations.
- Evaluations show improved interaction fidelity on diverse character geometries.
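The alternating scheme mentioned in the list above can be illustrated with a toy problem. This is a sketch under stated assumptions: two scalar parameters stand in for the anchor module and the motion module, and `joint_loss` is an invented quadratic, not the paper's objective. It shows only the control flow of updating one module while the other is frozen.

```python
def joint_loss(anchor_param, motion_param):
    # Toy stand-in loss: the first term pulls anchor_param toward 2.0, and
    # the coupling term asks motion_param to agree with anchor_param.
    return (anchor_param - 2.0) ** 2 + (motion_param - anchor_param) ** 2


def alternate(rounds=200, lr=0.1):
    anchor_param, motion_param = 0.0, 0.0
    for _ in range(rounds):
        # Phase 1: gradient step on the anchor parameter, motion parameter frozen.
        grad_anchor = 2.0 * (anchor_param - 2.0) - 2.0 * (motion_param - anchor_param)
        anchor_param -= lr * grad_anchor

        # Phase 2: gradient step on the motion parameter, anchor parameter frozen.
        grad_motion = 2.0 * (motion_param - anchor_param)
        motion_param -= lr * grad_motion
    return anchor_param, motion_param


anchor_final, motion_final = alternate()
```

Both parameters converge to the shared minimiser at 2.0 here because the toy loss is convex. Real networks carry no such guarantee, which is one reason the schedule and step sizes of an alternating scheme usually need tuning.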
Where Pith is reading between the lines
- This could simplify animation workflows for games featuring characters of varying scales and shapes.
- The refinement strategy might extend to other tasks involving dynamic spatial relationships in 3D models.
- Integrating the method with physics constraints could further improve realism in contact handling.
Load-bearing premise
The displacements predicted by the Transformer, once softly projected onto the target geometry, still capture the source pose's spatial structures effectively for guiding the retargeting.
What would settle it
Observe whether hand-to-head or hand-to-hand contacts in a source motion with a tall thin character are preserved when retargeted to a short stocky character, without new intersections or lost proximities.
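One hedged way to turn this check into a number is to compare height-normalised distances between body-part anchors before and after retargeting. The helper names, the 5% contact threshold, and the example coordinates below are all hypothetical. The sketch covers only lost or spurious contacts, not mesh interpenetration, which would need a separate collision test.

```python
import math


def normalized_distance(p, q, body_height):
    # Distance between two anchors, expressed as a fraction of body height
    # so that tall and short characters are comparable.
    return math.dist(p, q) / body_height


def contact_preserved(src_pair, tgt_pair, src_height, tgt_height, threshold=0.05):
    """True when the pair has the same contact state on source and target."""
    src_contact = normalized_distance(*src_pair, src_height) < threshold
    tgt_contact = normalized_distance(*tgt_pair, tgt_height) < threshold
    return src_contact == tgt_contact


# Hypothetical hand/head anchor positions in metres.
tall_thin = ((0.0, 1.70, 0.1), (0.0, 1.75, 0.1))        # source: hand touches head
stocky_good = ((0.0, 1.00, 0.1), (0.0, 1.03, 0.1))      # target: contact kept
stocky_bad = ((0.0, 0.70, 0.1), (0.0, 1.03, 0.1))       # target: hand falls short

kept = contact_preserved(tall_thin, stocky_good, src_height=1.9, tgt_height=1.1)
lost = contact_preserved(tall_thin, stocky_bad, src_height=1.9, tgt_height=1.1)
```

Here `kept` is true and `lost` is false: in the second target pose the hand ends up about 30% of body height away from the head, so the source contact is not reproduced.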
Original abstract
Retargeting motion across characters with varying body shapes while preserving interaction semantics, such as self-contact and near-body proximity, remains a challenging problem. While recent geometry-aware approaches address this by maintaining spatial relationships between predefined corresponding regions, their reliance on static correspondences often struggles when the target character exhibits exaggerated body proportions. In this paper, we present a geometry-aware motion retargeting framework that preserves interaction semantics by performing proximity matching over spatially adaptive anchors. Unlike prior methods with static anchor definitions, the proposed method dynamically repositions anchors to reachable regions on the target character. This is achieved via a Transformer-based anchor refinement strategy that predicts anchor displacements and constrains the translated anchors to remain on the target character geometry through differentiable soft projection. By incorporating pose-dependent spatial structures from the source character, the adapted anchors provide structurally coherent guidance for interaction-aware retargeting. Conditioned on these anchors, a graph-based autoencoder predicts target skeletal motion that preserves the spatial configuration of the source. To encourage task-aligned optimization between anchor adaptation and motion retargeting, we adopt an alternating training scheme in which each module is optimized in turn. Through extensive evaluations, we demonstrate that our method outperforms state-of-the-art approaches in preserving interaction fidelity across diverse character geometries.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper claims to introduce a geometry-aware skinned motion retargeting method that dynamically repositions anchors on the target character using a Transformer to predict displacements, followed by differentiable soft projection to constrain anchors to the target geometry. These adapted anchors, which incorporate pose-dependent spatial structures from the source, then condition a graph autoencoder to predict target skeletal motion while preserving interaction semantics such as self-contact and near-body proximity. An alternating optimization scheme aligns the anchor adaptation and retargeting modules, with the abstract asserting outperformance over state-of-the-art methods across diverse character geometries based on extensive evaluations.
Significance. If the central claims hold with supporting quantitative evidence, the work would represent a meaningful advance in computer graphics for interaction-preserving retargeting, particularly for characters with exaggerated proportions where static correspondence methods fail. The dynamic anchor approach combined with alternating training could influence downstream applications in animation, games, and virtual reality by better maintaining spatial interaction fidelity.
major comments (2)
- [§3.2] Transformer-based anchor refinement and soft projection: The claim that adapted anchors 'provide structurally coherent guidance' for the graph autoencoder rests on the assumption that differentiable soft projection preserves source pose-dependent distances and contact semantics. However, when the target has exaggerated proportions, a small source displacement can map to collapsed or stretched configurations on the target mesh, potentially breaking the proximity matching that underpins the interaction preservation claim. This is load-bearing for the central contribution and requires either a formal analysis of distance preservation or targeted ablations.
- [Evaluation section (likely §5)] The abstract states that the method 'outperforms state-of-the-art approaches in preserving interaction fidelity' via 'extensive evaluations,' yet the provided text contains no quantitative metrics, error bars, ablation results on the soft-projection module, or dataset details. Without these, the outperformance claim cannot be assessed; specific tables or figures reporting interaction-error reductions (e.g., self-contact distance or proximity metrics) are needed to substantiate the result.
minor comments (2)
- [Method] Clarify the exact mathematical definition of the differentiable soft projection (e.g., whether it is barycentric interpolation or a learned weighted sum) and how gradients flow through it during alternating optimization.
- [Discussion or Conclusion] Add a brief discussion of failure cases or limitations when the target geometry deviates extremely from the source, to contextualize the scope of the spatially adaptive anchors.
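For reference, the two readings of the soft projection named in the first minor comment can be written out as follows. The notation is assumed for illustration and does not come from the paper: $a$ is an anchor, $\Delta$ its predicted displacement, $q = a + \Delta$ the displaced point, $v_j$ the vertices of a target triangle, $s_i$ sampled target surface points, and $\tau$ a temperature that may be fixed or learned.

```latex
% Reading 1: barycentric interpolation inside one target triangle (v_1, v_2, v_3).
\tilde{a} = \sum_{j=1}^{3} \lambda_j \, v_j,
\qquad \lambda_j \ge 0, \quad \sum_{j=1}^{3} \lambda_j = 1

% Reading 2: softmax-weighted sum over the k nearest surface samples of q.
\tilde{a} = \sum_{i \in \mathcal{N}_k(q)} w_i \, s_i,
\qquad
w_i = \frac{\exp\!\left(-\lVert q - s_i \rVert^2 / \tau^2\right)}
           {\sum_{j \in \mathcal{N}_k(q)} \exp\!\left(-\lVert q - s_j \rVert^2 / \tau^2\right)}
```

Under the second reading the weights $w_i$ are smooth functions of $q$, so gradients can reach the predicted displacement $\Delta$ as long as the neighbour set $\mathcal{N}_k(q)$ does not change. Under the first reading, differentiability depends on how the coefficients $\lambda_j$ are obtained from $q$, which is the detail the comment asks the authors to spell out.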
Simulated Author's Rebuttal
We thank the referee for the constructive feedback. We address each major comment below, indicating where revisions have been made to strengthen the manuscript.
- Referee: [§3.2] Transformer-based anchor refinement and soft projection: The claim that adapted anchors 'provide structurally coherent guidance' for the graph autoencoder rests on the assumption that differentiable soft projection preserves source pose-dependent distances and contact semantics. However, when the target has exaggerated proportions, a small source displacement can map to collapsed or stretched configurations on the target mesh, potentially breaking the proximity matching that underpins the interaction preservation claim. This is load-bearing for the central contribution and requires either a formal analysis of distance preservation or targeted ablations.
  Authors: We agree this is a load-bearing assumption and thank the referee for identifying the need for stronger justification. The soft projection is formulated to map source displacements onto the target surface while approximately preserving relative distances for small motions. In the revision, we have added a formal analysis in §3.2 bounding the distortion introduced by the projection operator in terms of target mesh curvature. We have also included targeted ablations in §5 that measure self-contact and proximity errors with and without soft projection on characters with exaggerated proportions, confirming improved preservation of interaction semantics.
  revision: yes
- Referee: Evaluation section (likely §5): The abstract states that the method 'outperforms state-of-the-art approaches in preserving interaction fidelity' via 'extensive evaluations,' yet the provided text contains no quantitative metrics, error bars, ablation results on the soft-projection module, or dataset details. Without these, the outperformance claim cannot be assessed; specific tables or figures reporting interaction-error reductions (e.g., self-contact distance or proximity metrics) are needed to substantiate the result.
  Authors: The full manuscript contains quantitative results in Section 5, including tables with self-contact distance and proximity errors (with standard deviations as error bars) and dataset details in Section 4. However, to directly address the referee's request for explicit support of the outperformance claim, we have added a new ablation subsection (§5.4) isolating the soft-projection module and additional figures comparing interaction-error reductions against baselines. These changes make the evidence more prominent and accessible.
  revision: partial
Circularity Check
No circularity: standard neural pipeline with independent components
The derivation chain relies on a Transformer predicting anchor displacements, followed by differentiable soft projection onto target geometry, then conditioning a graph autoencoder on the resulting anchors for skeletal motion prediction, with alternating optimization. None of these steps reduce by construction to fitted parameters or self-citations within the paper's own equations; the components are standard architectures whose outputs are not tautologically defined by the inputs. The central claim of preserving interaction semantics is supported by external evaluations rather than internal redefinition.