Vertebra partitioning with thin-plate spline surfaces steered by a convolutional neural network

Bram van Ginneken; Ivana Išgum; Jelmer M. Wolterink; Majd Zreik; Max A. Viergever; Nikolas Lessmann

arXiv: 1907.10978 · v1 · pith:OOVLSNI5new · submitted 2019-07-25 · eess.IV

Pith reviewed 2026-05-24 16:08 UTC · model grok-4.3

classification eess.IV

keywords vertebra partitioning · thin-plate spline · convolutional neural network · unpaired data · autoencoder · segmentation · medical imaging · posterior elements

The pith

A convolutional neural network predicts control points for a thin-plate spline surface that partitions vertebra segmentation masks into vertebral body and posterior elements using unpaired data.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper establishes a method to divide vertebra segmentation masks into the vertebral body and the posterior elements by modeling their boundary as a thin-plate spline surface. A convolutional neural network predicts the control points that define this surface. Training relies on the reconstruction error from a convolutional autoencoder, which permits the use of unpaired data where only the full vertebra masks are available without corresponding partitioned examples. This matters because it allows leveraging existing large datasets of vertebra segmentations without the need for additional manual partitioning annotations.

Core claim

The boundary between the vertebral body and posterior elements is modeled as a thin-plate spline surface defined by a set of control points predicted by the network. The neural network is trained using the reconstruction error of a convolutional autoencoder to enable the use of unpaired data.

What carries the argument

Thin-plate spline surface defined by control points predicted by the convolutional neural network, trained via autoencoder reconstruction error on unpaired masks.

Load-bearing premise

The dividing boundary between vertebral body and posterior elements can be accurately and sufficiently represented by a thin-plate spline surface whose shape is fully determined by a finite set of control points predicted by the CNN.
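
For reference, the standard thin-plate spline form makes this premise concrete. The notation below ($n$ control points $(x_i, y_i, z_i)$, weights $w_i$, affine coefficients $a_0, a_1, a_2$) and the choice to treat the surface as a height field over a plane are assumptions made for illustration; the abstract does not say how the surface is parameterized.

```latex
\begin{align}
  f(x, y) &= a_0 + a_1 x + a_2 y
            + \sum_{i=1}^{n} w_i \, U\!\left(\lVert (x, y) - (x_i, y_i) \rVert\right),
  \qquad U(r) = r^2 \log r, \\
  f(x_i, y_i) &= z_i \quad (i = 1, \dots, n), \\
  \sum_{i=1}^{n} w_i &= 0, \qquad
  \sum_{i=1}^{n} w_i x_i = 0, \qquad
  \sum_{i=1}^{n} w_i y_i = 0 .
\end{align}
```

In this form there are $n + 3$ unknown coefficients and $n + 3$ linear conditions, so with fixed control-point locations the whole surface is determined by the $n$ heights $z_i$. Any boundary detail finer than the control-point spacing cannot be represented, which is exactly what the premise rules out.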

What would settle it

A collection of vertebra segmentations where no thin-plate spline surface defined by the network's predicted control points matches the true anatomical boundary within the reconstruction tolerance of the autoencoder.

Original abstract

Thin-plate splines can be used for interpolation of image values, but can also be used to represent a smooth surface, such as the boundary between two structures. We present a method for partitioning vertebra segmentation masks into two substructures, the vertebral body and the posterior elements, using a convolutional neural network that predicts the boundary between the two structures. This boundary is modeled as a thin-plate spline surface defined by a set of control points predicted by the network. The neural network is trained using the reconstruction error of a convolutional autoencoder to enable the use of unpaired data.
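
How a set of control points defines a smooth surface can be sketched in a few lines of NumPy. This is a minimal illustration of standard thin-plate spline interpolation, not the authors' implementation: the height-field parameterization and the function names `tps_kernel`, `fit_tps`, and `eval_tps` are assumptions introduced here.

```python
import numpy as np


def tps_kernel(r):
    """Thin-plate spline radial basis U(r) = r^2 log r, with U(0) = 0."""
    r = np.asarray(r, dtype=float)
    out = np.zeros_like(r)
    nz = r > 0
    out[nz] = r[nz] ** 2 * np.log(r[nz])
    return out


def fit_tps(ctrl_xy, ctrl_z):
    """Solve for TPS weights w (n,) and affine part a (3,) that interpolate
    heights ctrl_z (n,) at control locations ctrl_xy (n, 2).

    Assumes the control locations are distinct and not all collinear.
    """
    n = ctrl_xy.shape[0]
    dists = np.linalg.norm(ctrl_xy[:, None, :] - ctrl_xy[None, :, :], axis=-1)
    K = tps_kernel(dists)
    P = np.hstack([np.ones((n, 1)), ctrl_xy])  # columns: 1, x, y
    A = np.zeros((n + 3, n + 3))
    A[:n, :n] = K
    A[:n, n:] = P
    A[n:, :n] = P.T
    rhs = np.concatenate([ctrl_z, np.zeros(3)])
    sol = np.linalg.solve(A, rhs)
    return sol[:n], sol[n:]


def eval_tps(ctrl_xy, w, a, query_xy):
    """Evaluate the fitted surface height at query locations (m, 2)."""
    dists = np.linalg.norm(query_xy[:, None, :] - ctrl_xy[None, :, :], axis=-1)
    return a[0] + query_xy @ a[1:] + tps_kernel(dists) @ w


# Usage: nine control points on a 3x3 grid (hypothetical values).
grid = np.array([[x, y] for x in (0.0, 1.0, 2.0) for y in (0.0, 1.0, 2.0)])
heights = np.sin(grid[:, 0]) + 0.5 * grid[:, 1] ** 2
w, a = fit_tps(grid, heights)
surface_at_ctrl = eval_tps(grid, w, a, grid)  # matches `heights` up to round-off
```

In the method described above, the control-point values would come from the network rather than being fixed by hand; the linear solve itself is differentiable, which is what lets a loss on the resulting partition reach the network.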

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note

private letter to a colleague

The paper gives a clean hybrid method for vertebra partitioning via CNN-predicted TPS control points plus autoencoder loss for unpaired training, but the abstract shows no results and the finite-control-point assumption needs checking against real boundary complexity.

The core idea is a CNN that outputs control points defining a thin-plate spline surface that separates the vertebral body from the posterior elements in a segmentation mask. Training is driven by the reconstruction error of a convolutional autoencoder, so paired data is not required. That combination is the actual novelty: TPS surfaces steered by a network for this substructure split, together with an unpaired training signal. It is a targeted, practical move for medical segmentation, where masks with matching substructure annotations are expensive to obtain. The approach is described clearly, and the geometric parameterization makes sense for a relatively smooth anatomical interface. Credit to the authors for making the training objective explicit and for avoiding direct reliance on target labels during learning.

The main limitation visible from the abstract is the complete absence of quantitative results, a validation protocol, or a comparison. Without those, it is impossible to judge whether the method improves on standard approaches or whether a TPS surface with a modest number of control points captures the real boundary geometry. The stress-test concern about high-curvature features or end-plate detail is reasonable to raise; nothing in the given description shows that the chosen control-point density is sufficient or that the reconstruction loss enforces fidelity at those scales. If the full paper contains experiments that measure partition accuracy on held-out data and include an ablation on control-point count, the work becomes more convincing.

This is aimed at researchers in spine analysis and hybrid geometric learning for medical images. A reader already working on vertebra segmentation or unpaired domain adaptation could extract the technique and test it. The paper deserves peer review because the method is coherent and the unpaired-training angle is worth referee scrutiny, even if the current write-up is light on evidence.

Referee Report

2 major / 1 minor

Summary. The manuscript proposes partitioning vertebral segmentation masks into the vertebral body and posterior elements by using a CNN to predict a set of control points that define a thin-plate spline (TPS) surface as the dividing boundary. Training relies on the reconstruction error of a convolutional autoencoder applied to unpaired masks, avoiding the need for paired substructure labels.

Significance. If the geometric assumption holds and the method is validated, the approach would offer a way to obtain substructure partitions from unpaired data, which is valuable in medical image analysis where detailed annotations are costly. The TPS parameterization provides an explicit, differentiable boundary model steered by the network, and the autoencoder loss is a creative way to supervise without direct labels. These elements could influence future work on shape-constrained segmentation if quantitative evidence demonstrates accuracy on real vertebral anatomy.

major comments (2)

[Abstract and §3, method description] The central claim that the boundary is accurately modeled by a TPS surface determined by a finite set of CNN-predicted control points lacks any supporting analysis or experiment showing that the chosen control-point density suffices for anatomical features such as end-plate undulations, foramina, or high-curvature ridges. If the true interface contains spatial frequencies above what the control-point spacing can represent, the partition will be systematically biased regardless of how well the reconstruction loss is minimized.
[Abstract and results section] No quantitative validation, error metrics, or comparison against ground-truth partitions is supplied. The soundness of the method cannot be assessed without Dice scores, surface distances, or cross-validation on held-out data that directly measure partition fidelity rather than only autoencoder reconstruction.
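
The Dice score requested in the second comment is simple to compute once a reference partition exists. The sketch below uses the standard definition, 2|A ∩ B| / (|A| + |B|), on binary masks; the function name `dice_score` and the convention of returning 1.0 when both masks are empty are choices made here for illustration.

```python
import numpy as np


def dice_score(pred, ref):
    """Dice overlap between two binary masks of the same shape."""
    pred = np.asarray(pred, dtype=bool)
    ref = np.asarray(ref, dtype=bool)
    denom = pred.sum() + ref.sum()
    if denom == 0:
        # Both masks empty: treated as perfect agreement (a convention).
        return 1.0
    return 2.0 * np.logical_and(pred, ref).sum() / denom


# Usage on a toy pair of masks: one overlapping voxel, sizes 2 and 1.
score = dice_score([1, 1, 0, 0], [1, 0, 0, 0])  # 2 * 1 / (2 + 1)
```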

minor comments (1)

[Abstract] The abstract states the training objective but does not specify the number of control points, the autoencoder architecture, or how the TPS surface is rasterized into a partition mask; these details are needed for reproducibility.
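
One plausible way to turn a surface into a partition mask is sketched below. It is an assumption for illustration, not a description of the paper: the surface is taken to be a height field along the third array axis, the split is softened with a sigmoid so that it stays differentiable, and the names `partition_mask` and `sharpness` are hypothetical. Which of the two parts corresponds to the vertebral body depends on image orientation.

```python
import numpy as np


def partition_mask(mask, surface_height, sharpness=4.0):
    """Split a vertebra mask into two parts along a height-field surface.

    mask:           (X, Y, Z) array with values in [0, 1].
    surface_height: (X, Y) array, boundary position along the third axis
                    in voxel units.
    sharpness:      slope of the sigmoid; larger values give a harder split.
    """
    z = np.arange(mask.shape[2], dtype=float)[None, None, :]
    # Signed distance along z: positive for voxels below the surface.
    signed = surface_height[:, :, None] - z
    # Clip the exponent to avoid overflow for voxels far from the surface.
    side = 1.0 / (1.0 + np.exp(-np.clip(sharpness * signed, -50.0, 50.0)))
    part_a = mask * side          # voxels below the surface
    part_b = mask * (1.0 - side)  # voxels above the surface
    return part_a, part_b


# Usage: a solid block split by a flat surface between slices 2 and 3.
block = np.ones((4, 4, 6))
flat = np.full((4, 4), 2.5)
part_a, part_b = partition_mask(block, flat)
```

By construction the two parts sum to the input mask, so every voxel of the vertebra is assigned to one side or shared between them near the surface.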

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback. We respond to each major comment below, indicating planned changes to the manuscript where appropriate.

Referee: [Abstract and §3, method description] The central claim that the boundary is accurately modeled by a TPS surface determined by a finite set of CNN-predicted control points lacks any supporting analysis or experiment showing that the chosen control-point density suffices for anatomical features such as end-plate undulations, foramina, or high-curvature ridges. If the true interface contains spatial frequencies above what the control-point spacing can represent, the partition will be systematically biased regardless of how well the reconstruction loss is minimized.

Authors: The manuscript does not include an explicit frequency-domain analysis or an ablation on control-point density for features such as end-plate undulations or foramina. The TPS parameterization was selected for its smoothness and differentiability, with a modest number of points chosen via preliminary stability tests. We will add a dedicated paragraph in the revised method section justifying the control-point count and a limitations subsection noting the potential for bias on high-curvature anatomy.
revision: partial

Referee: [Abstract and results section] No quantitative validation, error metrics, or comparison against ground-truth partitions is supplied. The soundness of the method cannot be assessed without Dice scores, surface distances, or cross-validation on held-out data that directly measure partition fidelity rather than only autoencoder reconstruction.

Authors: Direct metrics such as Dice or surface distance require paired substructure labels, which are unavailable by design in the unpaired training regime that the method targets. The autoencoder reconstruction serves as the supervision signal. We will expand the results section with additional qualitative examples on held-out masks and, where a small amount of paired data can be obtained, include limited quantitative partition metrics to illustrate fidelity.
revision: partial

Circularity Check

0 steps flagged

No significant circularity; derivation relies on independent autoencoder objective

The core construction defines a CNN that outputs control points for a TPS surface used to partition input masks; the training objective is the reconstruction error of a separate convolutional autoencoder applied to the resulting substructure masks. This loss is external to any target partition labels and does not reduce the predicted control points or surface to a tautological fit of the inputs. No load-bearing self-citation, uniqueness theorem, or ansatz smuggling is present in the described chain, and the TPS modeling choice is an explicit representational assumption rather than a result derived from the data by construction. The autoencoder training signal therefore supplies content that is independent of the quantities the network predicts.
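
The training signal described here can be written down in a few lines. This is a hedged sketch only: `autoencoder` is a hypothetical callable standing in for a trained convolutional autoencoder, and the use of mean squared error and the choice of which substructure mask is passed in are assumptions, since the abstract specifies neither.

```python
import numpy as np


def reconstruction_loss(part, autoencoder):
    """Mean squared error between a substructure mask and its reconstruction.

    part:        array holding one of the partitioned substructure masks.
    autoencoder: hypothetical callable mapping a mask to its reconstruction.
    """
    recon = autoencoder(part)
    return float(np.mean((recon - part) ** 2))


# Usage with stand-in "autoencoders" (placeholders, not trained models).
toy_part = np.ones((4, 4, 4))
loss_perfect = reconstruction_loss(toy_part, lambda m: m)         # reproduces input
loss_blank = reconstruction_loss(toy_part, lambda m: 0.0 * m)     # outputs zeros
```

The loss is low when the autoencoder reproduces the partitioned mask well and high otherwise, and it never compares against a reference partition, which is why no circularity arises. In practice this would be computed in an automatic-differentiation framework so that gradients reach the control-point predictor.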

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The method rests on the domain assumption that a thin-plate spline is an adequate model for the anatomical boundary and that reconstruction error provides a sufficient supervisory signal without paired labels. No free parameters or invented entities are explicitly introduced in the abstract.

axioms (2)

domain assumption: The boundary between vertebral body and posterior elements can be represented by a thin-plate spline surface.
Invoked as the core modeling choice in the abstract.

domain assumption: Reconstruction error from a convolutional autoencoder is a valid training objective for learning the control-point predictor without paired boundary labels.
Used to enable unpaired data training.

pith-pipeline@v0.9.0 · 5644 in / 1253 out tokens · 24894 ms · 2026-05-24T16:08:27.034525+00:00 · methodology
