Investigating Robot Control Policy Learning for Autonomous X-ray-guided Spine Procedures

Axel Krieger; Benjamin D. Killeen; Blanca Inigo; Florence Klitzner; Lalithkumar Seenivasan; Mathias Unberath; Michelle Song

arxiv: 2511.03882 · v2 · pith:Q6O5SNASnew · submitted 2025-11-05 · cs.CV · cs.AI · cs.LG · cs.RO

Florence Klitzner, Blanca Inigo, Benjamin D. Killeen, Lalithkumar Seenivasan, Michelle Song, Axel Krieger, Mathias Unberath

Pith reviewed 2026-05-25 07:24 UTC · model grok-4.3

classification cs.CV · cs.AI · cs.LG · cs.RO

keywords imitation learning · X-ray guided procedures · robot control policy · spine instrumentation · sim-to-real transfer · vertebroplasty · cannula insertion · bi-planar imaging

The pith

Imitation learning policies trained on simulated bi-planar X-rays insert cannulas along safe intra-pedicular trajectories on the first attempt in 68.5 percent of simulated spine cases.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper examines whether imitation learning can create robot control policies for X-ray-guided spine procedures that rely on sparse bi-planar images instead of dense video. It builds a simulation environment that generates realistic X-ray sequences paired with correct insertion trajectories, then trains policies to plan and execute cannula alignment step by step from visual input alone. These policies reach first-attempt success in 68.5 percent of trials while keeping trajectories inside the pedicle across many vertebral levels and some complex cases such as fractures. The work also shows partial transfer when the same policies are tested on real X-ray images. This approach matters because it tests a route to robotic assistance that could reduce dependence on pre-operative CT scans during spinal instrumentation.

Core claim

An in silico sandbox generates datasets of correct cannula trajectories and corresponding bi-planar X-ray sequences that emulate clinical stepwise alignment. Imitation learning policies trained on this data plan and control open-loop insertion using only visual information, achieving 68.5 percent first-attempt success with safe intra-pedicular paths across diverse vertebral levels. The policies transfer to fractured and varied anatomies as well as different initializations, and rollouts on real X-ray sequences produce plausible trajectories, establishing a benchmark while noting limits in entry-point precision.

What carries the argument

An imitation learning policy that iteratively aligns the cannula using only visual input from bi-planar X-ray images generated in simulation.

If this is right

The policy maintains safe intra-pedicular trajectories across diverse vertebral levels on first attempt in 68.5 percent of cases.
The policy transfers to complex anatomy including fractures and to varied anatomies and initializations.
Partial sim-to-real transfer produces plausible trajectories when the policy is rolled out on real X-ray sequences.
Entry point precision remains a limitation of the current visual policy.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Improving the match between simulated and real X-ray physics could allow these policies to support fully CT-free robotic spinal navigation.
The same simulation-plus-imitation method could be tested on other X-ray-guided interventions such as biopsies or joint injections.
Adding explicit geometric priors for entry points might close the remaining precision gap without increasing model size.

Load-bearing premise

The simulation environment models X-ray imaging physics, vertebral anatomy, and procedural variations closely enough that policies trained inside it will produce safe behavior on real clinical X-ray sequences.

What would settle it

Deploy the trained policy on a collection of real patient bi-planar X-ray sequences from actual spine procedures and count how often the resulting trajectories exit the pedicle or fail to reach the target.
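
The count described above can be sketched in a few lines. This is a minimal illustration, not the paper's evaluation code: the `PedicleCorridor` class, the cylindrical approximation of the pedicle, the millimetre units, and the function names are all assumptions introduced here. A real pedicle is not a cylinder, and the safety criterion the authors use is not specified on this page.

```python
import math
from dataclasses import dataclass
from typing import List, Tuple

Vec = Tuple[float, float, float]


@dataclass
class PedicleCorridor:
    """Hypothetical stand-in for a pedicle: a cylinder around a straight axis."""
    start: Vec        # axis start point, in mm (assumed units)
    end: Vec          # axis end point, in mm
    radius_mm: float  # assumed safe radius around the axis


def distance_to_axis(point: Vec, corridor: PedicleCorridor) -> float:
    """Distance from a point to the corridor's axis segment."""
    axis = tuple(e - s for s, e in zip(corridor.start, corridor.end))
    rel = tuple(p - s for s, p in zip(corridor.start, point))
    axis_len_sq = sum(a * a for a in axis)
    if axis_len_sq == 0.0:
        raise ValueError("corridor axis must have non-zero length")
    # Project onto the axis and clamp to the segment ends.
    t = sum(r * a for r, a in zip(rel, axis)) / axis_len_sq
    t = max(0.0, min(1.0, t))
    closest = tuple(s + t * a for s, a in zip(corridor.start, axis))
    return math.dist(point, closest)


def breaches(entry: Vec, target: Vec, corridor: PedicleCorridor,
             samples: int = 50) -> bool:
    """True if any sampled point on the straight entry-to-target path
    lies farther from the axis than the corridor radius."""
    for i in range(samples):
        t = i / (samples - 1)
        point = tuple(a + t * (b - a) for a, b in zip(entry, target))
        if distance_to_axis(point, corridor) > corridor.radius_mm:
            return True
    return False


def breach_rate(cases: List[Tuple[Vec, Vec, PedicleCorridor]]) -> float:
    """Fraction of (entry, target, corridor) cases whose path exits the corridor."""
    if not cases:
        raise ValueError("no cases to evaluate")
    return sum(breaches(e, t, c) for e, t, c in cases) / len(cases)
```

Under these assumptions, the policy's predicted entry and target points for each real sequence would be paired with a corridor derived from reference annotations, and `breach_rate` would report the fraction of unsafe trajectories.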

Original abstract

Imitation learning-based robot control policies are enjoying renewed interest in video-based robotics. However, it remains unclear whether this approach applies to X-ray-guided procedures, such as spine instrumentation, with sparse inputs. We examine the feasibility, opportunities and challenges for imitation policy learning in bi-plane-guided cannula insertion. We develop an in silico sandbox for scalable, automated simulation of X-ray-guided spine procedures with a high degree of realism. We curate a dataset of correct trajectories and corresponding bi-planar X-ray sequences that emulate the stepwise alignment of providers. We then train imitation learning policies for planning and open-loop control that iteratively align a cannula in a vertebroplasty setting solely based on visual information. This precisely controlled setup offers insights into limitations and capabilities of this method. Our policy succeeded on the first attempt in 68.5% of cases, maintaining safe intra-pedicular trajectories across diverse vertebral levels. The policy transferred to complex anatomy, including fractures, as well as varied anatomies and initializations. Rollouts on real X-ray indicate that partial sim-to-real transfer with plausible trajectories is possible. While these preliminary results are promising, we also identify limitations, especially in entry point precision. The current results present a clear benchmark for future efforts, while with more robust priors and domain knowledge, such models may provide a foundation for future efforts toward lightweight and CT-free robotic intra-operative spinal navigation.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note · private letter to a colleague

A preliminary feasibility study that gets 68.5% first-try success in sim for X-ray cannula insertion and shows partial real-image transfer, but stays limited by simulator assumptions and thin evaluation details.

This paper applies imitation learning to bi-plane X-ray-guided cannula insertion for vertebroplasty in a custom simulator. The headline result is 68.5% first-attempt success with safe intra-pedicular paths across vertebral levels, plus some handling of fractures and varied starts. Real X-ray rollouts produce plausible trajectories but show only partial transfer.

The authors built an automated in silico sandbox that generates stepwise X-ray sequences from expert trajectories, which lets them train open-loop policies from visual input alone. That setup is the main concrete contribution and gives a usable benchmark for this narrow task. They are upfront about entry-point precision problems and frame the whole thing as preliminary rather than ready for the clinic.

The soft spots are straightforward. All the quantitative success numbers sit inside simulation, with no baselines, error bars, or statistical tests reported in the abstract. The real-X-ray part is qualitative only, so it is hard to judge how much progress the 68.5% actually represents. The central assumption, that the simulator's X-ray physics and anatomy variations are close enough to reality, remains untested at scale, and the authors flag it themselves. No load-bearing circularity or invented metrics appear.

This is useful reading for people in medical robotics who work on vision-based control for interventional procedures and want to see whether imitation learning can reduce CT dependence. A reader already in that niche would pick up the sandbox design and the reported limitations as starting points for their own experiments. The paper is coherent on its own terms and shows honest engagement with the constraints of the domain, so it deserves a serious referee to pressure-test the simulator fidelity and ask for more comparative results.

Referee Report

0 major / 1 minor

Summary. The manuscript presents a feasibility study on imitation learning for robot control policies in X-ray-guided spine procedures, focusing on bi-plane-guided cannula insertion for vertebroplasty. The authors develop an in silico sandbox to simulate procedures with high realism, curate a dataset of correct trajectories paired with emulated bi-planar X-ray sequences, train imitation policies for visual-only planning and open-loop control, and report a 68.5% first-attempt success rate in simulation that maintains safe intra-pedicular trajectories across diverse vertebral levels. The policy transfers to complex cases including fractures and varied anatomies/initializations; real X-ray rollouts show partial sim-to-real transfer yielding plausible trajectories. The work explicitly notes limitations in entry-point precision and frames the results as a benchmark for future CT-free robotic navigation efforts.

Significance. If the results hold, the work is significant as an early demonstration that imitation learning can be applied to X-ray-guided medical robotics despite sparse visual inputs, providing a controlled, scalable testbed via the in silico sandbox and curated dataset. These elements enable reproducible experimentation and establish a concrete numerical benchmark (68.5% success). The authors' explicit qualification of the study as preliminary, their identification of specific limitations, and the focus on partial rather than complete transfer strengthen its utility for the community.

minor comments (1)

[Abstract] The success-rate claim would be easier to contextualize if the abstract briefly noted the total number of evaluated cases or the definition of 'first attempt.'
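
To illustrate why the number of evaluated cases matters, the sketch below computes a Wilson score interval for a 68.5% success rate at several sample sizes. The sample sizes are hypothetical placeholders chosen for illustration only; the abstract does not report the actual number of trials, and the function name `wilson_interval` is introduced here.

```python
import math
from typing import Tuple


def wilson_interval(successes: int, trials: int, z: float = 1.96) -> Tuple[float, float]:
    """Wilson score interval for a binomial proportion (z=1.96 gives ~95%)."""
    if trials <= 0:
        raise ValueError("trials must be positive")
    if not 0 <= successes <= trials:
        raise ValueError("successes must be between 0 and trials")
    p = successes / trials
    z2 = z * z
    denom = 1.0 + z2 / trials
    center = (p + z2 / (2.0 * trials)) / denom
    half = z * math.sqrt(p * (1.0 - p) / trials + z2 / (4.0 * trials * trials)) / denom
    return center - half, center + half


if __name__ == "__main__":
    # Hypothetical trial counts: the true number is not stated in the abstract.
    for n in (200, 1000):
        k = round(0.685 * n)
        low, high = wilson_interval(k, n)
        print(f"n={n}: {k}/{n} successes, 95% interval [{low:.3f}, {high:.3f}]")
```

The interval narrows as the trial count grows, which is why the same headline percentage can support quite different conclusions depending on how many cases were evaluated.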

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for the positive summary, recognition of the work's significance as an early demonstration of imitation learning for X-ray-guided robotics, and the recommendation of minor revision. No major comments were provided in the report.

Circularity Check

0 steps flagged

No significant circularity detected

The paper describes an empirical feasibility study: creation of an in-silico X-ray simulation sandbox, curation of a trajectory dataset, training of imitation-learning policies, and reporting of success rates (68.5% first-attempt) plus qualitative real-X-ray rollouts. No equations, fitted parameters, or derivation steps are present that reduce by construction to the inputs. Claims rest on experimental outcomes in simulation and limited real-data testing rather than any self-referential mathematical or citation chain. The central assumption (simulator realism) is explicitly flagged by the authors as a limitation, not smuggled in as a proven result.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract-only review yields no explicit free parameters, axioms, or invented entities; the work implicitly assumes standard imitation learning transfer from simulation to real images.

pith-pipeline@v0.9.0 · 5806 in / 1095 out tokens · 21782 ms · 2026-05-25T07:24:32.506861+00:00 · methodology

Reference graph

Works this paper leans on

20 extracted references · 20 canonical work pages

[1] Chen, C., Fan, P., Xie, X. & Wang, Y. Risk Factors for Cement Leakage and Adjacent Vertebral Fractures in Kyphoplasty for Osteoporotic Vertebral Fractures. Clinical Spine Surgery 33, E251 (2020). URL https://journals.lww.com/jspinaldisorders/fulltext/2020/07000/Risk_Factors_for_Cement_Leakage_and_Adjacent.8.aspx
[2] Tomé-Bermejo, F. et al. Identification of Risk Factors for the Occurrence of Cement Leakage During Percutaneous Vertebroplasty for Painful Osteoporotic or Malignant Vertebral Fracture. Spine 39, E693 (2014). URL https://journals.lww.com/spinejournal/fulltext/2014/05150/identification_of_risk_factors_for_the_occurrence.9.aspx
[3] Rampersaud, Y. R., Foley, K. T., Shen, A. C., Williams, S. & Solomito, M. Radiation Exposure to the Spine Surgeon During Fluoroscopically Assisted Pedicle Screw Insertion. Spine 25, 2637 (2000). URL https://journals.lww.com/spinejournal/fulltext/2000/10150/radiation_exposure_to_the_spine_surgeon_during.16.aspx
[4] Vijayan, R. et al. Automatic pedicle screw planning using atlas-based registration of anatomy and reference trajectories. Physics in Medicine & Biology 64, 165020 (2019). URL https://iopscience.iop.org/article/10.1088/1361-6560/ab2d66
[5] Siemionow, K. B., Forsthoefel, C. W., Foy, M. P., Gawel, D. & Luciano, C. J. Autonomous lumbar spine pedicle screw planning using machine learning: A validation study. Journal of Craniovertebral Junction and Spine 12, 223 (2021). URL https://journals.lww.com/jcjs/fulltext/2021/12030/autonomous_lumbar_spine_pedicle_screw_planning.3.aspx
[6] Huang, M., Tetreault, T. A., Vaishnav, A., York, P. J. & Staub, B. N. The current state of navigation in robotic spine surgery. Annals of Translational Medicine 9, 86 (2021). URL https://atm.amegroups.com/article/view/46418/html
[7] Zhao, T. Z., Kumar, V., Levine, S. & Finn, C. Learning Fine-Grained Bimanual Manipulation with Low-Cost Hardware (2023). URL http://arxiv.org/abs/2304.13705
[8] Kim, J. W. et al. Surgical Robot Transformer (SRT): Imitation Learning for Surgical Tasks (2024). URL http://arxiv.org/abs/2407.12998. ArXiv:2407.12998 [cs]
[9] Kim, J. W. B. et al. SRT-H: A hierarchical framework for autonomous surgery via language-conditioned imitation learning. Science Robotics 10, eadt5254 (2025). URL https://www.science.org/doi/full/10.1126/scirobotics.adt5254. Publisher: American Association for the Advancement of Science
[10] Unberath, M. et al. Deepdrr–a catalyst for machine learning in fluoroscopy-guided procedures, 98–106 (Springer, 2018)
[11] Gao, C. et al. Synthetic data accelerates the development of generalizable learning-based algorithms for X-ray image analysis. Nature Machine Intelligence 5, 294–308 (2023). URL https://www.nature.com/articles/s42256-023-00629-1. Publisher: Nature Publishing Group
[12] Killeen, B. D., Cho, S. M., Armand, M., Taylor, R. H. & Unberath, M. In silico simulation: a key enabling technology for next-generation intelligent surgical systems. Progress in Biomedical Engineering 5, 032001 (2023)
[13] Killeen, B. D. et al. Pelphix: Surgical phase recognition from x-ray images in percutaneous pelvic fixation, 133–143 (Springer, 2023)
[14] Killeen, B. D. et al. Fluorosam: A language-aligned foundation model for x-ray image segmentation. arXiv preprint arXiv:2403.08059 (2024)
[15] Edgar, H. et al. New mexico decedent image database. Office of the Medical Investigator (2020)
[16] Wasserthal, J. et al. TotalSegmentator: robust segmentation of 104 anatomical structures in CT images. Radiology: Artificial Intelligence 5, e230024 (2023). URL http://arxiv.org/abs/2208.05868
[17] Shaker, K. et al. Synthesizing high-resolution dual-energy radiographs from coronary artery calcium ct images. Proceedings of SPIE–the International Society for Optical Engineering 12925, 129253T (2024)
[18] Löffler, M. T. et al. A Vertebral Segmentation Dataset with Fracture Grading. Radiology: Artificial Intelligence 2, e190138 (2020). URL https://pubs.rsna.org/doi/10.1148/ryai.2020190138
[19] Gertzbein, S. D. & Robbins, S. E. Accuracy of pedicular screw placement in vivo. Spine 15, 11–14 (1990)
[20] Klinwichit, P. et al. Buu-lspine: A thai open lumbar spine dataset for spondylolisthesis detection. Applied Sciences 13, 8646 (2023)
