arXiv: 2511.03882 · v2 · pith:Q6O5SNAS · submitted 2025-11-05 · cs.CV · cs.AI · cs.LG · cs.RO

Investigating Robot Control Policy Learning for Autonomous X-ray-guided Spine Procedures

Pith reviewed 2026-05-25 07:24 UTC · model grok-4.3

keywords: imitation learning · X-ray guided procedures · robot control policy · spine instrumentation · sim-to-real transfer · vertebroplasty · cannula insertion · bi-planar imaging

The pith

Imitation learning from simulated bi-planar X-rays produces policies that insert cannulas safely on the first attempt in 68.5 percent of simulated spine procedures.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper examines whether imitation learning can create robot control policies for X-ray-guided spine procedures that rely on sparse bi-planar images instead of dense video. It builds a simulation environment that generates realistic X-ray sequences paired with correct insertion trajectories, then trains policies to plan and execute cannula alignment step by step from visual input alone. These policies reach first-attempt success in 68.5 percent of trials while keeping trajectories inside the pedicle across many vertebral levels and some complex cases such as fractures. The work also shows partial transfer when the same policies are tested on real X-ray images. This approach matters because it tests a route to robotic assistance that could reduce dependence on pre-operative CT scans during spinal instrumentation.
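To make the idea of a bi-planar X-ray pair concrete, the following is a minimal sketch of how two orthogonal projections can be rendered from an attenuation volume using the Beer-Lambert law. It is not the paper's simulator: the parallel-beam geometry, the toy phantom, and the names `make_phantom` and `project` are all assumptions chosen for illustration.

```python
import math

def make_phantom(n=16):
    """Toy attenuation volume mu[z][y][x] in 1/cm: soft tissue with a denser block.

    The values 0.2 and 0.5 are illustrative, not calibrated tissue coefficients.
    """
    vol = [[[0.2 for _ in range(n)] for _ in range(n)] for _ in range(n)]
    for z in range(n // 4, 3 * n // 4):
        for y in range(n // 4, 3 * n // 4):
            for x in range(n // 3, 2 * n // 3):
                vol[z][y][x] = 0.5
    return vol

def project(vol, axis, voxel_cm=0.1, i0=1.0):
    """Parallel-beam Beer-Lambert projection: I = I0 * exp(-sum(mu) * dx).

    axis="y" integrates front to back (an AP-like view);
    axis="x" integrates side to side (a lateral-like view).
    """
    n = len(vol)
    image = [[0.0] * n for _ in range(n)]
    for z in range(n):
        for u in range(n):
            if axis == "y":
                path = sum(vol[z][y][u] for y in range(n))
            elif axis == "x":
                path = sum(vol[z][u][x] for x in range(n))
            else:
                raise ValueError("axis must be 'x' or 'y'")
            image[z][u] = i0 * math.exp(-path * voxel_cm)
    return image

phantom = make_phantom()
ap_view = project(phantom, "y")       # hypothetical anterior-posterior view
lateral_view = project(phantom, "x")  # hypothetical lateral view
```

Pixels behind the dense block come out darker than pixels that only cross soft tissue. A realistic simulator would additionally need cone-beam geometry, energy-dependent attenuation, scatter, and noise, none of which this sketch models.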

Core claim

An in silico sandbox generates datasets of correct cannula trajectories and corresponding bi-planar X-ray sequences that emulate clinical stepwise alignment. Imitation learning policies trained on this data plan and control open-loop insertion using only visual information, achieving 68.5 percent first-attempt success with safe intra-pedicular paths across diverse vertebral levels. The policies transfer to fractured and varied anatomies as well as different initializations, and rollouts on real X-ray sequences produce plausible trajectories, establishing a benchmark while noting limits in entry-point precision.

What carries the argument

An imitation learning policy that iteratively aligns the cannula using only visual input from bi-planar X-ray images generated in simulation.
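The iterative alignment loop can be illustrated with a deliberately tiny behavior-cloning sketch. Everything here is an assumption made for illustration: the observation is a two-number alignment error standing in for features extracted from the two X-ray views, the expert is a hypothetical rule that corrects half of the remaining error, and the policy is a one-parameter linear model rather than the network used in the paper.

```python
import random

def expert_action(error, gain=0.5):
    """Hypothetical expert: correct a fixed fraction of the remaining error."""
    return [-gain * e for e in error]

def collect_demonstrations(n_samples, rng, noise=0.01):
    """Build (noisy observation, expert correction) pairs from random poses."""
    demos = []
    for _ in range(n_samples):
        # Angular error in degrees, one component per view (illustrative range).
        error = [rng.uniform(-30.0, 30.0), rng.uniform(-30.0, 30.0)]
        obs = [e + rng.gauss(0.0, noise) for e in error]
        demos.append((obs, expert_action(error)))
    return demos

def fit_policy(demos):
    """Behavior cloning: least-squares fit of the linear policy a = w * o."""
    num = sum(o * a for obs, act in demos for o, a in zip(obs, act))
    den = sum(o * o for obs, _ in demos for o in obs)
    return num / den

def rollout(w, start_error, steps=8):
    """Apply the learned correction repeatedly, one step per image acquisition."""
    error = list(start_error)
    history = [error]
    for _ in range(steps):
        error = [e + w * e for e in error]
        history.append(error)
    return history

rng = random.Random(0)
w = fit_policy(collect_demonstrations(200, rng))
trajectory = rollout(w, [20.0, -12.0])
final_error = max(abs(e) for e in trajectory[-1])
```

The fitted weight recovers the expert's gain, and the rollout shrinks the alignment error geometrically. The hard part in the actual setting is what this sketch skips: inferring the error from sparse image pairs rather than being handed it directly.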

If this is right

  • The policy succeeds on the first attempt in 68.5 percent of cases, keeping trajectories safely inside the pedicle across diverse vertebral levels.
  • The policy transfers to complex anatomy, including fractures, and to varied anatomies and initializations.
  • Partial sim-to-real transfer produces plausible trajectories when the policy is rolled out on real X-ray sequences.
  • Entry point precision remains a limitation of the current visual policy.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Improving the match between simulated and real X-ray physics could allow these policies to support fully CT-free robotic spinal navigation.
  • The same simulation-plus-imitation method could be tested on other X-ray-guided interventions such as biopsies or joint injections.
  • Adding explicit geometric priors for entry points might close the remaining precision gap without increasing model size.

Load-bearing premise

The simulation environment models X-ray imaging physics, vertebral anatomy, and procedural variations closely enough that policies trained inside it will produce safe behavior on real clinical X-ray sequences.

What would settle it

Deploy the trained policy on a collection of real patient bi-planar X-ray sequences from actual spine procedures and count how often the resulting trajectories exit the pedicle or fail to reach the target.
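The counting described above could be sketched as follows, under a strong simplifying assumption: the pedicle is modeled as a straight cylinder, and a trajectory counts as a breach if any sampled point within the cylinder's length lies farther from the axis than its radius. The cylinder model, the dimensions, the tolerance, and the function names are hypothetical and are not taken from the paper, which may use a different safety criterion.

```python
import math
from collections import Counter

def straight_line(start, end, n=20):
    """Sample n + 1 evenly spaced 3D points from start to end (in mm)."""
    return [[start[i] + (end[i] - start[i]) * k / n for i in range(3)]
            for k in range(n + 1)]

def classify(trajectory, ped_start, ped_end, radius_mm, target, tol_mm):
    """Label a trajectory as 'breach', 'missed_target', or 'safe'."""
    axis = [ped_end[i] - ped_start[i] for i in range(3)]
    axis_len2 = sum(c * c for c in axis)
    for p in trajectory:
        rel = [p[i] - ped_start[i] for i in range(3)]
        # Normalized position of p along the pedicle axis (0 = start, 1 = end).
        t = sum(rel[i] * axis[i] for i in range(3)) / axis_len2
        if 0.0 <= t <= 1.0:
            closest = [ped_start[i] + t * axis[i] for i in range(3)]
            if math.dist(p, closest) > radius_mm:
                return "breach"
    if math.dist(trajectory[-1], target) > tol_mm:
        return "missed_target"
    return "safe"

# Hypothetical geometry: 20 mm long pedicle, 3 mm radius, target 15 mm beyond it.
ped_start, ped_end = [0.0, 0.0, 0.0], [0.0, 0.0, 20.0]
target = [0.0, 0.0, 35.0]
trajectories = [
    straight_line([0.0, 0.0, -10.0], [0.0, 0.0, 35.0]),  # stays on the axis
    straight_line([0.0, 0.0, -10.0], [8.0, 0.0, 35.0]),  # drifts sideways
    straight_line([0.0, 0.0, -10.0], [0.0, 0.0, 25.0]),  # stops short
]
outcomes = Counter(
    classify(t, ped_start, ped_end, 3.0, target, 5.0) for t in trajectories
)
```

On real data the pedicle boundary would come from segmented anatomy rather than a cylinder, and breach severity is usually graded rather than treated as a binary outcome.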

Original abstract

Imitation learning-based robot control policies are enjoying renewed interest in video-based robotics. However, it remains unclear whether this approach applies to X-ray-guided procedures, such as spine instrumentation, with sparse inputs. We examine the feasibility, opportunities and challenges for imitation policy learning in bi-plane-guided cannula insertion. We develop an in silico sandbox for scalable, automated simulation of X-ray-guided spine procedures with a high degree of realism. We curate a dataset of correct trajectories and corresponding bi-planar X-ray sequences that emulate the stepwise alignment of providers. We then train imitation learning policies for planning and open-loop control that iteratively align a cannula in a vertebroplasty setting solely based on visual information. This precisely controlled setup offers insights into limitations and capabilities of this method. Our policy succeeded on the first attempt in 68.5% of cases, maintaining safe intra-pedicular trajectories across diverse vertebral levels. The policy transferred to complex anatomy, including fractures, as well as varied anatomies and initializations. Rollouts on real X-ray indicate that partial sim-to-real transfer with plausible trajectories is possible. While these preliminary results are promising, we also identify limitations, especially in entry point precision. The current results present a clear benchmark for future efforts, while with more robust priors and domain knowledge, such models may provide a foundation for future efforts toward lightweight and CT-free robotic intra-operative spinal navigation.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

0 major / 1 minor

Summary. The manuscript presents a feasibility study on imitation learning for robot control policies in X-ray-guided spine procedures, focusing on bi-plane-guided cannula insertion for vertebroplasty. The authors develop an in silico sandbox to simulate procedures with high realism, curate a dataset of correct trajectories paired with emulated bi-planar X-ray sequences, train imitation policies for visual-only planning and open-loop control, and report a 68.5% first-attempt success rate in simulation that maintains safe intra-pedicular trajectories across diverse vertebral levels. The policy transfers to complex cases including fractures and varied anatomies/initializations; real X-ray rollouts show partial sim-to-real transfer yielding plausible trajectories. The work explicitly notes limitations in entry-point precision and frames the results as a benchmark for future CT-free robotic navigation efforts.

Significance. If the results hold, the work is significant as an early demonstration that imitation learning can be applied to X-ray-guided medical robotics despite sparse visual inputs, providing a controlled, scalable testbed via the in silico sandbox and curated dataset. These elements enable reproducible experimentation and establish a concrete numerical benchmark (68.5% success). The authors' explicit qualification of the study as preliminary, their identification of specific limitations, and the focus on partial rather than complete transfer strengthen its utility for the community.

minor comments (1)
  1. [Abstract] The success-rate claim would be easier to contextualize if the abstract briefly noted the total number of evaluated cases or defined 'first attempt.'
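The referee's point about the number of evaluated cases can be illustrated with a Wilson score interval, which shows how much the uncertainty around a 68.5 percent rate depends on sample size. The trial counts below are hypothetical and chosen only for illustration; the abstract does not report how many cases were evaluated.

```python
import math

def wilson_interval(successes, trials, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 gives about 95%)."""
    p = successes / trials
    denom = 1.0 + z * z / trials
    center = (p + z * z / (2.0 * trials)) / denom
    half = z * math.sqrt(p * (1.0 - p) / trials
                         + z * z / (4.0 * trials * trials)) / denom
    return center - half, center + half

# Hypothetical trial counts, each with roughly 68.5 percent successes.
intervals = []
for n in (54, 200, 1000):
    k = round(0.685 * n)
    low, high = wilson_interval(k, n)
    intervals.append((n, k, low, high))
    print(f"n={n:4d}  successes={k:4d}  95% CI = [{low:.3f}, {high:.3f}]")
```

The interval narrows as the number of trials grows, so the same headline rate supports much stronger conclusions at a thousand trials than at a few dozen.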

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for the positive summary, recognition of the work's significance as an early demonstration of imitation learning for X-ray-guided robotics, and the recommendation of minor revision. No major comments were provided in the report.

Circularity Check

0 steps flagged

No significant circularity detected

The paper describes an empirical feasibility study: creation of an in-silico X-ray simulation sandbox, curation of a trajectory dataset, training of imitation-learning policies, and reporting of success rates (68.5% first-attempt) plus qualitative real-X-ray rollouts. No equations, fitted parameters, or derivation steps are present that reduce by construction to the inputs. Claims rest on experimental outcomes in simulation and limited real-data testing rather than any self-referential mathematical or citation chain. The central assumption (simulator realism) is explicitly flagged by the authors as a limitation, not smuggled in as a proven result.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract-only review yields no explicit free parameters, axioms, or invented entities; the work implicitly assumes standard imitation learning transfer from simulation to real images.

pith-pipeline@v0.9.0 · 5806 in / 1095 out tokens · 21782 ms · 2026-05-25T07:24:32.506861+00:00 · methodology
