
arxiv: 2503.12868 · v2 · pith:LBSF3TGGnew · submitted 2025-03-17 · 💻 cs.CV

UniReg: A Universal Model for Controllable CT Image Registration

Pith reviewed 2026-05-25 08:23 UTC · model grok-4.3

classification 💻 cs.CV
keywords medical image registration · CT image registration · unified registration model · conditional deformation estimation · cross-scenario generalization · inter-subject registration · intra-subject registration

The pith

A single conditional model registers CT images across multiple clinical scenarios with higher accuracy than task-specific networks.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper seeks to establish that one model can replace separate networks for different CT registration tasks by adapting its deformation estimates to three conditioning inputs: supplied anatomical priors, the registration type (between or within subjects), and features unique to each image pair. If correct, this removes the need to train and maintain isolated models for each clinical use case while preserving or improving alignment precision. The approach aims to combine the accuracy of specialized learning methods with the flexibility of traditional optimization techniques. The authors report experiments on several CT and MR datasets in which the unified model delivers better average accuracy and generalizes to new scenarios without retraining.

Core claim

UniReg is a conditional unified model for multi-scenario CT image registration. It adaptively estimates deformation fields by conditioning on three inputs: anatomical structure priors, registration type constraints (inter/intra-subject), and instance-specific features. This single model produces optimal alignments across heterogeneous clinical scenarios, achieving superior average registration accuracy over current state-of-the-art learning-based methods and demonstrating strong cross-scenario generalization. Replacing multiple isolated task-specific models with this compact unified model also reduces overall training cost and model redundancy.

What carries the argument

The unified registration framework that adaptively estimates deformation fields conditioned on anatomical structure priors, registration type constraints, and instance-specific features.

If this is right

  • One compact model can replace multiple isolated task-specific networks for inter-subject, intra-subject, and region-specific registration.
  • Overall training burden decreases through reduced total compute and elimination of model redundancy.
  • The model maintains or improves registration accuracy on average while generalizing to new scenarios without task-specific retraining.
  • Deformation field estimation becomes controllable through explicit conditioning rather than implicit task specialization.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same conditioning strategy might reduce the number of models needed for multi-modal registration pipelines that mix CT with other modalities.
  • Clinical deployment could become simpler if hospitals maintain only one registration network instead of a suite of specialized ones.
  • Adding further conditioning signals, such as patient metadata, could be tested to handle edge-case anatomies not covered in current experiments.

Load-bearing premise

That supplying anatomical structure priors, inter/intra-subject constraints, and instance features as conditioning inputs is enough for one model to produce optimal alignments in every heterogeneous clinical scenario.

What would settle it

A previously unseen CT registration dataset from a new clinical scenario on which the single UniReg model produces lower average accuracy than separately trained task-specific models for the same tasks.
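
One way to make this test concrete is sketched below. It assumes Dice overlap between warped and fixed segmentation masks as the accuracy metric (a common choice in registration work; the summary above does not name the metric) and uses made-up masks. The function names `dice` and `mean_dice` and all mask values are illustrative only.

```python
def dice(mask_a, mask_b):
    """Dice overlap between two binary masks given as flat sequences of 0/1."""
    a = [bool(v) for v in mask_a]
    b = [bool(v) for v in mask_b]
    intersection = sum(1 for x, y in zip(a, b) if x and y)
    total = sum(a) + sum(b)
    # Two empty masks agree perfectly by convention.
    return 1.0 if total == 0 else 2.0 * intersection / total


def mean_dice(pairs):
    """Average Dice over (warped_mask, fixed_mask) pairs."""
    scores = [dice(warped, fixed) for warped, fixed in pairs]
    return sum(scores) / len(scores)


# Toy masks only; real inputs would be warped and fixed organ segmentations.
fixed = [0, 1, 1, 1, 0, 0]
unified_pairs = [([0, 1, 1, 0, 0, 0], fixed)]        # hypothetical unified-model output
task_specific_pairs = [([0, 1, 1, 1, 0, 0], fixed)]  # hypothetical task-specific output

# The claim is challenged on this dataset if the unified model scores lower.
claim_challenged = mean_dice(unified_pairs) < mean_dice(task_specific_pairs)
print(mean_dice(unified_pairs), mean_dice(task_specific_pairs), claim_challenged)
```

A real comparison would also need matched data splits and a significance test, since a small difference in mean Dice on one dataset would not settle the question by itself.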

Figures

Figures reproduced from arXiv: 2503.12868 by Cheng Chen, Dakai Jin, Jianpeng Zhang, Le Lu, Tai Ma, Tony C. W. Mok, Xianghua Ye, Yan-Jie Zhou, Zeli Chen, Zi Li.

Figure 1. [image; no caption extracted]
Figure 2. Overview of the proposed universal registration model. Our framework comprises a backbone architecture and Dynamic Registration. Leveraging our dynamic controller mechanism, we can leverage human expertise and prior knowledge of diverse registration tasks throughout the training process. At each training iteration, our framework selects task-specific hyperparameters and segmentation masks corresponding to …
Figure 3. Example slices of top two registration methods. The warped anatomical segmentations are overlaid and major registration artifacts …
Figure 4. Network Architecture of the Shared Backbone. Each blue rectangle represents a convolutional layer, with the number of channels …
Figure 5. Examples of three-dimensional rendering volume maps derived from the warped segmentation images of the HeadNeck, Chest, …
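
The per-iteration selection described in the Figure 2 caption can be pictured roughly as follows. This is only an illustration: the task table, the hyperparameter name `smoothness_weight`, its values, the placeholder mask labels, and the function `sample_training_condition` are all hypothetical, and the caption is truncated before it says what the selection corresponds to.

```python
import random

# Hypothetical task table: region names are borrowed from the Figure 5 caption,
# but the hyperparameter, its values, and the mask labels are invented.
TASKS = {
    "HeadNeck": {"smoothness_weight": 1.0, "mask_labels": ["label_a", "label_b"]},
    "Chest": {"smoothness_weight": 0.5, "mask_labels": ["label_c"]},
}


def sample_training_condition(rng):
    """Pick one task for this iteration and return its conditioning bundle."""
    task = rng.choice(sorted(TASKS))
    config = TASKS[task]
    return {
        "task": task,
        "smoothness_weight": config["smoothness_weight"],
        "mask_labels": list(config["mask_labels"]),
    }


rng = random.Random(0)  # seeded so the sketch is repeatable
for step in range(3):
    condition = sample_training_condition(rng)
    # A real training loop would load an image pair and masks for this task,
    # then run a forward/backward pass conditioned on `condition`.
    print(step, condition["task"], condition["smoothness_weight"])
```

Sampling the task inside the loop, rather than fixing it per model, is what would let a single set of weights see every scenario during training.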
Original abstract

Learning-based medical image registration has matched the accuracy of conventional methods while offering superior computational efficiency. However, existing approaches suffer from poor generalization across diverse clinical scenarios, requiring the laborious development of multiple isolated networks for specific registration tasks, e.g., inter-/intra-subject registration or anatomical region-specific alignment, leading to cumbersome development pipelines. To overcome this limitation, we propose UniReg, the first conditional unified model for multi-scenario CT image registration, which combines the precision advantages of task-specific learning methods with the generalization of traditional optimization methods. Our key innovation is a unified registration framework that adaptively estimates deformation fields conditioned on: (1) anatomical structure priors, (2) registration type constraints (inter/intra-subject), and (3) instance-specific features, enabling optimal alignment across heterogeneous scenarios within a single model. Through comprehensive experiments on multiple CT/MR registration datasets, UniReg achieves superior average registration accuracy compared with current state-of-the-art learning-based methods while exhibiting strong cross-scenario generalization. Moreover, by replacing multiple isolated task-specific models with a compact unified model, UniReg substantially reduces the overall training burden in terms of total training cost and model redundancy.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 0 minor

Summary. The manuscript proposes UniReg, the first conditional unified model for multi-scenario CT image registration. It adaptively estimates deformation fields by conditioning on anatomical structure priors, registration type constraints (inter/intra-subject), and instance-specific features, claiming this enables optimal alignment across heterogeneous clinical scenarios within a single model. The work asserts superior average registration accuracy over current state-of-the-art learning-based methods, strong cross-scenario generalization on multiple CT/MR datasets, and substantial reduction in training burden by replacing multiple task-specific models.

Significance. If the empirical claims hold with rigorous validation, the approach could meaningfully reduce model redundancy and development overhead in medical image registration by demonstrating that a single conditioned network can match or exceed task-specific models across diverse scenarios without hidden trade-offs.

major comments (3)
  1. [Abstract] Abstract: the assertion of 'superior average registration accuracy' and 'strong cross-scenario generalization' is presented without any quantitative metrics, baseline comparisons, statistical tests, data-split details, or exclusion criteria, leaving the central empirical claim unsupported and impossible to evaluate.
  2. [Method] Method section: the fusion mechanism for integrating the three conditioning inputs (anatomical priors, registration type flags, instance-specific features) is unspecified (concatenation, modulation, attention, etc.), which is load-bearing for assessing whether the model achieves true disentanglement or merely memorizes scenario-specific behaviors.
  3. [Experiments] Experiments: no information is given on whether anatomical structure priors are assumed perfect or estimated, how instance-specific features are derived to ensure generalization beyond the training distribution, or whether performance trade-offs exist across scenarios, directly undermining the 'optimal alignment' claim.
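
To make the second comment concrete, two of the fusion options the referee lists, concatenation and modulation, are sketched below on plain Python lists. Neither is claimed to be what UniReg does; the function names, the feature and condition values, and the scale and shift parameters are hypothetical.

```python
def fuse_by_concatenation(features, condition):
    """Append the condition vector to the feature vector; later layers see both."""
    return list(features) + list(condition)


def fuse_by_modulation(features, scale, shift):
    """Feature-wise affine modulation: each feature is scaled and shifted.

    In a real network `scale` and `shift` would be predicted from the condition;
    here they are passed in directly to keep the sketch dependency-free.
    """
    if not (len(features) == len(scale) == len(shift)):
        raise ValueError("features, scale and shift must have the same length")
    return [s * f + b for f, s, b in zip(features, scale, shift)]


features = [0.5, -1.0, 2.0]  # stand-in for image-pair features
condition = [1.0, 0.0]       # stand-in flag, e.g. inter- vs intra-subject

concatenated = fuse_by_concatenation(features, condition)
modulated = fuse_by_modulation(features, scale=[2.0, 1.0, 0.5], shift=[0.0, 1.0, 0.0])
print(concatenated)  # feature length grows from 3 to 5
print(modulated)     # feature length stays 3
```

The distinction matters for the referee's question: concatenation leaves it to later layers to decide how the condition is used, while modulation forces every feature to be rescaled by the condition, which changes how easily scenario-specific behavior can be separated from shared behavior.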

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the constructive comments, which highlight areas where additional clarity will strengthen the manuscript. We address each major comment below and commit to revisions that provide the requested details without altering the core claims.

Point-by-point responses
  1. Referee: [Abstract] Abstract: the assertion of 'superior average registration accuracy' and 'strong cross-scenario generalization' is presented without any quantitative metrics, baseline comparisons, statistical tests, data-split details, or exclusion criteria, leaving the central empirical claim unsupported and impossible to evaluate.

    Authors: We agree that the abstract would benefit from quantitative support. The Experiments section contains the supporting results (including average metrics, baselines, and dataset details), but these were omitted from the abstract for brevity. In the revision we will incorporate concise quantitative statements, such as key accuracy improvements and generalization metrics, while remaining within length constraints. revision: yes

  2. Referee: [Method] Method section: the fusion mechanism for integrating the three conditioning inputs (anatomical priors, registration type flags, instance-specific features) is unspecified (concatenation, modulation, attention, etc.), which is load-bearing for assessing whether the model achieves true disentanglement or merely memorizes scenario-specific behaviors.

    Authors: The current Method section describes the conditioning inputs but does not detail their integration. We will revise this section to explicitly specify the fusion mechanism, provide the relevant equations or pseudocode, and add a brief discussion of how the design supports disentanglement rather than scenario memorization. revision: yes

  3. Referee: [Experiments] Experiments: no information is given on whether anatomical structure priors are assumed perfect or estimated, how instance-specific features are derived to ensure generalization beyond the training distribution, or whether performance trade-offs exist across scenarios, directly undermining the 'optimal alignment' claim.

    Authors: We acknowledge these details are missing from the Experiments section. We will add explicit statements clarifying that priors are estimated via a separate network, describe the instance-feature extraction process and any generalization techniques employed, and include a new analysis or table examining performance trade-offs across scenarios. revision: yes

Circularity Check

0 steps flagged

No significant circularity; empirical ML evaluation

full rationale

The paper frames UniReg as an empirical conditional neural network trained on CT/MR datasets with inputs (anatomical priors, inter/intra flags, instance features) and evaluated via standard registration metrics against baselines. No equations, derivations, or self-citations are presented that reduce claimed accuracy or generalization to quantities defined by the model's own fitted parameters or prior author results. The work is self-contained as standard supervised learning plus ablation experiments, with performance claims resting on external test data rather than internal redefinitions.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

The central claim rests on standard deep-learning assumptions for deformation field estimation plus the untested premise that the three conditioning signals suffice for cross-scenario optimality; no free parameters or invented entities are named in the abstract.

axioms (1)
  • domain assumption Deep neural networks can learn accurate deformation fields when supplied appropriate conditioning signals about anatomy and task type.
    Core premise of all learning-based registration methods invoked by the proposal.
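
For readers unfamiliar with the term, a deformation field assigns each output position an offset that says where to sample the moving image. The sketch below is a one-dimensional, dependency-free illustration with linear interpolation; real CT registration uses dense 3D fields, and `warp_1d` is a hypothetical helper rather than anything taken from the paper.

```python
def warp_1d(moving, displacement):
    """Resample `moving` at positions x + displacement[x] with linear interpolation."""
    n = len(moving)
    if len(displacement) != n:
        raise ValueError("displacement must have one offset per sample")
    warped = []
    for x, u in enumerate(displacement):
        pos = min(max(x + u, 0.0), n - 1.0)  # clamp to the signal bounds
        lo = int(pos)                        # floor, since pos is non-negative
        hi = min(lo + 1, n - 1)
        frac = pos - lo
        warped.append((1.0 - frac) * moving[lo] + frac * moving[hi])
    return warped


moving = [0.0, 0.0, 1.0, 0.0, 0.0]

# Each output sample reads from one position to its left,
# so the peak moves one step to the right.
shifted = warp_1d(moving, [-1.0] * 5)

# A half-sample offset blends the two neighbouring values.
blended = warp_1d(moving, [-0.5] * 5)

print(shifted)
print(blended)
```

A learning-based method, in these terms, is a network that predicts the `displacement` array from the fixed and moving images; the conditioning signals discussed above would be extra inputs to that prediction.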



