arXiv: 2410.05882 · v3 · submitted 2024-10-08 · 📡 eess.IV · cs.CV · cs.LG · cs.NE

Frame forecasting in cine MRI using the PCA respiratory motion model: comparing recurrent neural networks trained online and transformers

Pith reviewed 2026-05-23 19:40 UTC · model grok-4.3

classification: 📡 eess.IV · cs.CV · cs.LG · cs.NE
keywords: cine MRI, respiratory motion, frame forecasting, PCA model, recurrent neural networks, online learning, transformers, radiotherapy

The pith

Online RNNs trained with RTRL and SnAp-1 outperform transformers at medium-to-long horizons when forecasting respiratory motion in cine MRI.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper evaluates algorithms that forecast future frames in sagittal chest and liver cine-MRI sequences to reduce target uncertainty caused by system latency in radiotherapy. Motion is first reduced to low-dimensional time-dependent weights via PCA applied to Lucas-Kanade optical-flow fields; these weights are then predicted by linear filters, sequence-specific or population transformers, and RNNs updated via several online learning rules. Linear regression works best at the shortest horizons while RTRL and SnAp-1 trained RNNs deliver the lowest geometrical errors at medium and long horizons, staying below 1.4 mm on the ETH Zürich sequences and 2.8 mm on the more variable OvGU sequences. A sympathetic reader cares because accurate on-the-fly forecasts could directly shrink the safety margins needed around moving tumors.

Core claim

Recurrent neural networks trained with real-time recurrent learning (RTRL) and sparse one-step approximation (SnAp-1) forecast the low-dimensional PCA weights of respiratory motion, extracted from Lucas-Kanade optical flow on sagittal cine-MRI sequences, more accurately at medium-to-long horizons than linear regression, RNNs trained with unbiased online recurrent optimization or decoupled neural interfaces, and both population and sequence-specific transformer encoders. At those horizons the predicted frames have geometrical errors below 1.4 mm on the ETH Zürich dataset and below 2.8 mm on the OvGU dataset.

What carries the argument

The PCA respiratory motion model, which decomposes Lucas-Kanade optical-flow fields into static deformation modes and low-dimensional time-dependent weights. The weights are forecast, and the resulting displacement field warps a reference frame into future images.
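A minimal sketch of that decomposition, assuming each displacement field is flattened into a row vector and the modes come from an SVD of the mean-centered data. The function names, the mean subtraction, and the synthetic two-mode motion are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def fit_pca_motion_model(dvfs, n_components):
    """Fit static deformation modes to displacement fields of shape (T, H, W, 2)."""
    flat = dvfs.reshape(dvfs.shape[0], -1)       # one flattened field per row
    mean = flat.mean(axis=0)
    # Rows of vt are orthonormal directions of decreasing variance.
    _, _, vt = np.linalg.svd(flat - mean, full_matrices=False)
    return mean, vt[:n_components]

def project(dvf, mean, modes):
    """Time-dependent weights of one displacement field."""
    return modes @ (dvf.reshape(-1) - mean)

def reconstruct(weights, mean, modes, shape):
    """Displacement field rebuilt from (possibly forecast) weights."""
    return (mean + weights @ modes).reshape(shape)

# Synthetic stand-in for optical-flow output: two fixed spatial patterns
# modulated by periodic signals (hypothetical data, not from the paper).
rng = np.random.default_rng(0)
pattern_a, pattern_b = rng.normal(size=(2, 8, 8, 2))
t = np.arange(40)
dvfs = (np.sin(2 * np.pi * t / 10)[:, None, None, None] * pattern_a
        + 0.3 * np.cos(2 * np.pi * t / 10)[:, None, None, None] * pattern_b)

mean, modes = fit_pca_motion_model(dvfs[:30], n_components=2)   # "training" frames
weights = project(dvfs[35], mean, modes)                        # held-out frame
recon = reconstruct(weights, mean, modes, dvfs[35].shape)
max_error = np.abs(recon - dvfs[35]).max()
```

Because the synthetic motion has exactly two modes, two components reconstruct the held-out field almost exactly. In a forecasting pipeline, `weights` would be replaced by predicted values before calling `reconstruct`.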

If this is right

  • Prediction accuracy falls steadily as the forecast horizon lengthens.
  • Sequence-specific transformers remain competitive only up to medium horizons; overall transformers suffer from data scarcity and dataset shift.
  • Linear regression is superior solely at the shortest horizons around 0.32 s.
  • Generated frames match ground truth visually except near the diaphragm at end-inspiration and in regions dominated by out-of-plane motion.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Because the online RNNs update parameters on the fly, they could support fully real-time adaptation inside a treatment session without offline retraining.
  • The observed domain shift between the two datasets implies that patient-specific or site-specific fine-tuning will be necessary before clinical deployment.
  • Extending the same PCA-plus-online-RNN pipeline to 3-D or multi-slice acquisitions would test whether the current sagittal-only representation remains sufficient.

Load-bearing premise

The low-dimensional PCA weights derived from Lucas-Kanade optical flow on sagittal cine-MRI sequences capture the full respiratory motion that must be forecasted.

What would settle it

New cine-MRI sequences in which the geometrical error of RTRL or SnAp-1 forecasts at a 1-second horizon exceeds 2 mm while linear regression stays lower, or in which out-of-plane motion produces visible discrepancies larger than the reported in-plane errors.

Figures

Figures reproduced from arXiv: 2410.05882 by Hiroyuki Takahashi, Kazuyuki Demachi, Michel Pohl, Mitsuru Uesaka, Ritu Bhusal Chhatkuli.

Figure 2: The chest image at time $t$ is estimated by warping the initial image (at $t = t_1$).
Figure 3: Warping the first image (at time $t = t_1$) using Nadaraya-Watson regression with a Gaussian kernel. The closer a point in the source image lands to the square point in the target image (at time $t$), the more it influences the intensity of that target pixel at $t$.
Figure 4: Overall experimental setting. We set $M_{train} = 90$ for each forecasting method, except linear regression, for which $M_{train} = 160$. The parameters of the optical flow algorithm used for DIR are optimized using the first 90 images. The cross-validation set ends at $t_{M_{cv}}$ with $M_{cv} = 180$.
Figure 5: DVF in sequence 1 during inspiration (left) and expiration (right). The origins of each of the displayed 2D displacement vectors are separated from each other by 6 pixels (6 mm). Best viewed with zoom-in on a digital display.
Figure 6: First two principal components and first four time-varying PCA coefficients associated with sequence 1. The first $M_{train} = 90$ images are used to compute the principal DVFs. The time-dependent weights are computed by projecting the DVF at time $t$ onto the subspace spanned by the principal components (Eq. 3). The origins of each of the displayed 2D displacement vectors are separated from each other by 6 pixel…
Figure 7: First two principal components and first four PCA coefficients associated with sequence 4 (for $M_{train} = 90$). The origins of each of the displayed 2D displacement vectors are separated from each other by 6 pixels. The norm of each principal DVF was multiplied by a coefficient equal to 500 to ease visualization. The image at $t = t_1$ is displayed in the background. Best viewed with zoom-in on a digital display…
Figure 8: nRMSE of the test set associated with the prediction of the first three time-dependent PCA weights (Eq. 7 with $k_{min} = M_{cv} + 1$, $k_{max} = 200$, and $n_{cp} = 3$) for each algorithm as a function of the look-ahead time $h$. Each point represents the nRMSE averaged over the four sequences and $n_{test}^{PCA} = 250$ runs in the case of RNNs (except RTRL, for which we set $n_{test}^{PCA} = 10$ given its higher processing time and l…
Figure 9: Prediction of the time-dependent PCA weights associated with sequence 1 for different horizon values using SnAp-1. The hyper-parameters are selected by cross-validation for each horizon value, as described in Section 2.3.
Figure 10: nRMSE between the predicted and ground-truth first three time-dependent PCA weights of the cross-validation set (Eq. 7 with $k_{min} = M_{train} + 1$, $k_{max} = M_{cv}$, and $n_{cp} = 3$). For a given hyper-parameter and specific value of $h$, each colored point in the corresponding graph represents the nRMSE minimum across all combinations of the other hyper-parameters within the cross-validation range (…
Figure 11: Normalized root-mean-square registration error $E_{pred}(n_{cp})$ using the predicted DVF of the cross-validation set (Eq. 10) as a function of the number of principal components $n_{cp}$ for different forecasting algorithms and horizon values $h$. Each color point corresponds to the error average over the four sequences and $n_{warp} = 25$ runs in the case of RNNs (we set $n_{warp} = 5$ for RTRL as a specific case) for a given …
Figure 12: Future frame forecasting accuracy/errors corresponding to each algorithm as a function of the look-ahead time interval. Each point represents the average of a given performance metric of the test set over the four image sequences and $n_{warp} = 25$ runs in the case of RNNs (we set $n_{warp} = 5$ for RTRL as a specific case). We plotted the 95% confidence interval associated with the mean of each RNN performance me…
Figure 13: Predicted images in sequence 1 using RTRL at $h = 0.31$ s and LMS at $h = 1.88$ s at the end of expiration (top line, frame 188) and inspiration phases (bottom line, frame 196), along with the corresponding pixel-wise intensity and Euclidean deformation error maps. The predictions were performed with $n_{cp} = 3$ principal components, as that choice led to the best cross-validation accuracy for both methods. Among t…
Figure 14: Spatial SnAp-1 prediction errors averaged over the test set and 5 evaluation runs with $h = 1.88$ s, along with the mean image of the test set, for the four MRI sequences.
Figure 15: Relative influence of the iterative and pyramidal Lucas-Kanade optical flow algorithm parameters on the (minimum) registration error $E_{gt}$. Each bar corresponds to the standard deviation associated with one curve in …
Figure 16: Registration error $E_{gt}$ as a function of the parameters of the iterative and pyramidal version of the Lucas-Kanade optical flow algorithm. Given one parameter, each point in the associated graph corresponds to the minimum of $E_{gt}$ over every possible combination of the other parameters in the grid (…
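
The forward warping described in the Figure 3 caption can be sketched roughly as follows: every source pixel is moved by its displacement, and each target pixel takes a Gaussian-weighted average of the source intensities that land near it. The function name, the (row, column) ordering of the displacement components, the kernel width, and the dense all-pairs evaluation are assumptions made for illustration; the paper's own implementation may differ.

```python
import numpy as np

def forward_warp_nw(image, dvf, sigma=0.5):
    """Forward-warp `image` with Nadaraya-Watson regression and a Gaussian kernel.

    image: (H, W) intensities; dvf: (H, W, 2) displacements as (row, column) offsets.
    """
    h, w = image.shape
    rows, cols = np.mgrid[0:h, 0:w]
    targets = np.stack([rows.ravel(), cols.ravel()], axis=1).astype(float)
    landed = targets + dvf.reshape(-1, 2)        # where each source pixel ends up
    # Squared distance from every target pixel to every landed source point.
    d2 = ((targets[:, None, :] - landed[None, :, :]) ** 2).sum(axis=2)
    kernel = np.exp(-d2 / (2.0 * sigma ** 2))
    # Kernel-weighted average of source intensities; epsilon avoids division by zero.
    warped = (kernel @ image.ravel()) / (kernel.sum(axis=1) + 1e-12)
    return warped.reshape(h, w)

rng = np.random.default_rng(1)
image = rng.random((12, 12))                     # hypothetical test image
shift = np.zeros((12, 12, 2))
shift[..., 1] = 1.0                              # move everything one column right
warped = forward_warp_nw(image, shift, sigma=0.2)
```

With a narrow kernel and a one-column shift, `warped[:, 1:]` closely matches `image[:, :-1]`. The all-pairs distance matrix scales quadratically with the number of pixels, so this form only suits small images.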
Original abstract

Respiratory motion complicates accurate irradiation of thoraco-abdominal tumors during radiotherapy, as treatment-system latency entails target-location uncertainties. This work addresses frame forecasting in chest and liver cine MRI to compensate for such delays. We investigate RNNs trained with online learning algorithms, enabling adaptation to changing respiratory patterns via on-the-fly parameter updates, and transformers, increasingly common in time-series forecasting for their ability to capture long-term dependencies. Experiments used 12 sagittal thoracic and upper-abdominal cine-MRI sequences from ETH Zürich and OvGU; the OvGU data exhibited higher motion variability, noise, and lower contrast. PCA decomposes the Lucas-Kanade optical-flow field into static deformation modes and low-dimensional, time-dependent weights. We compare various methods for forecasting these weights: linear filters, population and sequence-specific transformer encoders, and RNNs trained with real-time recurrent learning (RTRL), unbiased online recurrent optimization, decoupled neural interfaces, and sparse one-step approximation (SnAp-1). Predicted displacements were used to warp the reference frame and generate future images. Prediction accuracy decreased with the horizon h. Linear regression performed best at short horizons (1.3 mm geometrical error at h = 0.32 s, ETH Zürich dataset), while RTRL and SnAp-1 outperformed the other algorithms at medium-to-long horizons, with geometrical errors below 1.4 mm and 2.8 mm on the sequences from ETH Zürich and OvGU, respectively. The sequence-specific transformer was competitive for low-to-medium horizons, but transformers remained overall limited by data scarcity and domain shift between datasets. Predicted frames visually resembled the ground truth, with notable errors occurring near the diaphragm at end-inspiration and regions affected by out-of-plane motion.
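
As a rough illustration of the linear-regression baseline named in the abstract, the sketch below fits a least-squares map from the most recent weight vectors to the weight vector a fixed number of steps ahead. The function names, the lag, the horizon, and the noise-free synthetic weights are all assumptions; the split at 160 samples only loosely mirrors the value of M_train quoted for linear regression in the Figure 4 caption.

```python
import numpy as np

def fit_linear_forecaster(weights, lag, horizon):
    """Least-squares map from the last `lag` weight vectors to the one `horizon` steps ahead.

    weights: (T, n_cp) array of time-dependent PCA weights.
    """
    inputs, targets = [], []
    for t in range(lag - 1, len(weights) - horizon):
        window = weights[t - lag + 1 : t + 1].ravel()
        inputs.append(np.concatenate(([1.0], window)))   # leading 1 is a bias term
        targets.append(weights[t + horizon])
    coeffs, *_ = np.linalg.lstsq(np.array(inputs), np.array(targets), rcond=None)
    return coeffs

def forecast(coeffs, window):
    """Predict the weight vector `horizon` steps after the end of `window`."""
    return np.concatenate(([1.0], window.ravel())) @ coeffs

# Hypothetical noise-free weights: two components with different periods.
t = np.arange(200)
weights = np.stack([np.sin(2 * np.pi * t / 12.5),
                    0.4 * np.sin(2 * np.pi * t / 31.0 + 0.7)], axis=1)

lag, horizon = 2, 3
coeffs = fit_linear_forecaster(weights[:160], lag, horizon)   # fit on early samples only
errors = [np.abs(forecast(coeffs, weights[k - lag + 1 : k + 1]) - weights[k + horizon]).max()
          for k in range(160, 200 - horizon)]
max_error = max(errors)
```

Pure sinusoids obey an exact linear recurrence, so the held-out error here is at the level of floating-point noise. Real respiratory weights drift and change amplitude, which is where the abstract reports that online-trained RNNs gain an advantage at longer horizons.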

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 2 minor

Summary. The paper claims that PCA decomposition of Lucas-Kanade optical flow fields from sagittal cine-MRI sequences yields low-dimensional weights that can be forecasted to predict future frames for radiotherapy latency compensation. It compares linear filters, sequence-specific and population transformers, and RNNs trained with online algorithms (RTRL, UORO, DNI, SnAp-1) on 12 sequences from two datasets, reporting that RTRL and SnAp-1 achieve the lowest geometrical errors at medium-to-long horizons (<1.4 mm on ETH Zürich data, <2.8 mm on OvGU data) while transformers are limited by data scarcity.

Significance. If the central empirical claims hold after addressing the noted gaps, the work would provide concrete evidence that certain online RNN training methods can adaptively forecast respiratory motion weights better than transformers or linear baselines in data-limited medical imaging settings. This could inform latency compensation strategies, with the explicit comparison of online learning algorithms being a useful contribution. The reproducible experimental setup on public datasets is a strength, though the absence of statistical tests and representation validation limits immediate applicability.

major comments (3)
  1. [Abstract] Abstract: The headline claim that RTRL and SnAp-1 outperform other methods at medium-to-long horizons with geometrical errors below 1.4 mm / 2.8 mm is presented without error bars, standard deviations across sequences, or any statistical significance tests (e.g., paired t-tests or Wilcoxon tests) against the linear regression or transformer baselines. This directly affects verifiability of the outperformance assertion.
  2. [Abstract] Abstract (PCA decomposition paragraph): The central pipeline assumes that the low-dimensional PCA weights from 2D sagittal Lucas-Kanade flow sufficiently represent the respiratory motion to be forecasted, yet the abstract itself flags 'notable errors occurring near the diaphragm at end-inspiration and regions affected by out-of-plane motion' without any ablation, sensitivity analysis, or quantification of how out-of-plane components bias the reported 2D geometrical errors. This is load-bearing for the utility claim in radiotherapy.
  3. [Experiments] Experiments (implied in abstract results): The comparison relies on post-hoc identification of best methods after evaluating multiple algorithms on two heterogeneous datasets (ETH Zürich vs. OvGU, differing in variability, noise, and contrast) without pre-specified primary endpoints, hyper-parameter reporting, or cross-validation details. This setup risks overfitting to the specific sequences and undermines generalizability of the medium-to-long horizon superiority claim.
minor comments (2)
  1. [Abstract] The abstract mentions 'population and sequence-specific transformer encoders' but provides no details on architecture, training regime, or how domain shift between datasets was handled; this should be expanded for reproducibility.
  2. [Methods] No mention of the exact number of PCA modes retained or the criterion used for truncation; this parameter choice affects the dimensionality of the forecasting task and should be stated explicitly.
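
A paired test of the kind requested in major comment 1 could look like the following sketch, which computes an exact two-sided Wilcoxon signed-rank p-value by enumerating all sign patterns. The per-sequence error values are hypothetical placeholders, not results from the paper, and the function name is an assumption; a statistics library routine would normally be used instead of this hand-rolled version.

```python
from itertools import product

def wilcoxon_signed_rank_exact(a, b):
    """Exact two-sided Wilcoxon signed-rank test for paired samples a and b."""
    d = [x - y for x, y in zip(a, b) if x != y]      # drop zero differences
    n = len(d)
    order = sorted(range(n), key=lambda i: abs(d[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(d[order[j + 1]]) == abs(d[order[i]]):
            j += 1
        avg = (i + j) / 2 + 1                        # average rank shared by ties
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    w_plus = sum(r for r, di in zip(ranks, d) if di > 0)
    centre = sum(ranks) / 2                          # null expectation of w_plus
    observed = abs(w_plus - centre)
    extreme = 0
    for signs in product((0, 1), repeat=n):          # 2**n equally likely patterns
        w = sum(r for r, s in zip(ranks, signs) if s)
        if abs(w - centre) >= observed - 1e-12:
            extreme += 1
    return w_plus, extreme / 2 ** n

# Hypothetical geometrical errors (mm) on 12 sequences, NOT taken from the paper.
rnn_err = [1.10, 1.25, 0.98, 1.40, 1.05, 1.32, 2.10, 2.45, 1.95, 2.60, 2.20, 2.75]
baseline_err = [e + 0.05 * (k + 1) for k, e in enumerate(rnn_err)]
w_plus, p_value = wilcoxon_signed_rank_exact(rnn_err, baseline_err)
```

In this placeholder data the first method is better on every sequence, so the statistic sits at its extreme and the p-value is 2 / 4096. With 12 sequences the enumeration covers only 4096 sign patterns, so the exact test is cheap.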

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the detailed and constructive report. We address each major comment point by point below, indicating where revisions to the manuscript are planned. The work remains exploratory given the small number of sequences, and we have aimed to be transparent in our responses.

point-by-point responses
  1. Referee: [Abstract] Abstract: The headline claim that RTRL and SnAp-1 outperform other methods at medium-to-long horizons with geometrical errors below 1.4 mm / 2.8 mm is presented without error bars, standard deviations across sequences, or any statistical significance tests (e.g., paired t-tests or Wilcoxon tests) against the linear regression or transformer baselines. This directly affects verifiability of the outperformance assertion.

    Authors: We agree that the abstract would benefit from variability measures to support the headline claims. In the revised manuscript we will add standard deviations across sequences to the reported geometrical errors in the abstract (subject to length constraints) and include paired statistical tests (e.g., Wilcoxon signed-rank) comparing RTRL/SnAp-1 against baselines in the results section, with a brief reference in the abstract. revision: partial

  2. Referee: [Abstract] Abstract (PCA decomposition paragraph): The central pipeline assumes that the low-dimensional PCA weights from 2D sagittal Lucas-Kanade flow sufficiently represent the respiratory motion to be forecasted, yet the abstract itself flags 'notable errors occurring near the diaphragm at end-inspiration and regions affected by out-of-plane motion' without any ablation, sensitivity analysis, or quantification of how out-of-plane components bias the reported 2D geometrical errors. This is load-bearing for the utility claim in radiotherapy.

    Authors: The abstract already notes these limitations as inherent to 2D cine-MRI. We will expand the discussion section with additional qualitative error maps and per-region error breakdowns from the existing data to better quantify the impact of out-of-plane motion and diaphragm errors. A full quantitative ablation would require 3D ground truth, which is unavailable in the current datasets; we will therefore frame this explicitly as a study limitation. revision: partial

  3. Referee: [Experiments] Experiments (implied in abstract results): The comparison relies on post-hoc identification of best methods after evaluating multiple algorithms on two heterogeneous datasets (ETH Zürich vs. OvGU, differing in variability, noise, and contrast) without pre-specified primary endpoints, hyper-parameter reporting, or cross-validation details. This setup risks overfitting to the specific sequences and undermines generalizability of the medium-to-long horizon superiority claim.

    Authors: All algorithms were evaluated under an identical protocol on the same 12 sequences; we report full performance tables rather than only the best performers. In revision we will add explicit hyper-parameter search details, the leave-one-sequence-out evaluation scheme used, and a statement that the study is exploratory rather than confirmatory. Pre-specification of a single primary endpoint was not performed because the goal was comparative evaluation of several online RNN variants; this will be clarified. revision: yes

Circularity Check

0 steps flagged

No significant circularity; empirical results on held-out external sequences

full rationale

The paper decomposes optical-flow fields via PCA, then trains and evaluates forecasting models (RTRL, SnAp-1, transformers, linear filters) on the resulting time-dependent weights. Geometrical errors are computed by warping reference frames and comparing to ground-truth held-out sequences from two independent datasets. No equation reduces a reported prediction to a quantity fitted on the same test data, no self-citation supplies a load-bearing uniqueness theorem, and no ansatz is smuggled via prior work by the same authors. The derivation chain is therefore self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The central claim rests on the adequacy of the PCA motion model and on the two small datasets being representative of clinical variability; no free parameters or invented entities are introduced beyond standard ML training.

axioms (1)
  • Domain assumption: Lucas-Kanade optical flow on sagittal cine-MRI yields a motion field whose dominant PCA modes capture the relevant respiratory dynamics.
    Invoked in the paragraph describing decomposition of the optical-flow field into static modes and time-dependent weights.

pith-pipeline@v0.9.0 · 5889 in / 1353 out tokens · 38338 ms · 2026-05-23T19:40:39.628438+00:00 · methodology


Reference graph

Works this paper leans on

68 extracted references · 68 canonical work pages · 8 internal anchors

  1. [1]

    E. Huynh, A. Hosny, C. Guthier, D. S. Bitterman, S. F. Petit, D. A. Haas-Kogan, B. Kann, H. J. Aerts, R. H. Mak, Artificial intelligence in radiation oncology, Nature Reviews Clinical Oncology 17 (2020) 771–781

  2. [2]

    S. Sarudis, A. Karlsson Hauer, J. Nyman, A. Bäck, Systematic evaluation of lung tumor motion using four-dimensional computed tomography, Acta Oncologica 56 (2017) 525–530

  3. [3]

    S. Takao, N. Miyamoto, T. Matsuura, R. Onimaru, N. Katoh, T. Inoue, K. L. Sutherland, R. Suzuki, H. Shirato, S. Shimizu, Intrafractional baseline shift or drift of lung tumor motion during gated radiation therapy with a real-time tumor-tracking system, International Journal of Radiation Oncology - Biology - Physics 94 (2016) 172–180

  4. [4]

    P. Verma, H. Wu, M. Langer, I. Das, G. Sandison, Survey: real-time tumor motion prediction for image-guided radiation treatment, Computing in Science & Engineering 13 (2010) 24–35

  5. [5]

    G. Wang, Z. Li, G. Li, G. Dai, Q. Xiao, L. Bai, Y. He, Y. Liu, S. Bai, Real-time liver tracking algorithm based on LSTM and SVR networks for use in surface-guided radiation therapy, Radiation Oncology 16 (2021) 1–12

  6. [6]

    J. Ehrhardt, C. Lorenz, et al., 4D modeling and estimation of respiratory motion for radiation therapy, volume 10, Springer, 2013

  7. [7]

    Q. Zhang, A. Pevsner, A. Hertanto, Y.-C. Hu, K. E. Rosenzweig, C. C. Ling, G. S. Mageras, A patient-specific respiratory model of anatomical motion for radiation treatment planning, Medical physics 34 (2007) 4772–4781

  8. [8]

    H. Chen, Z. Zhong, Y. Yang, J. Chen, L. Zhou, X. Zhen, X. Gu, Internal motion estimation by internal-external motion modeling for lung cancer radiotherapy, Scientific reports 8 (2018) 1–14

  9. [9]

    B. Stemkens, R. H. Tijssen, B. D. De Senneville, J. J. Lagendijk, C. A. Van Den Berg, Image-driven, model-based 3D abdominal motion estimation for MR-guided radiotherapy, Physics in Medicine & Biology 61 (2016) 5335

  10. [10]

    W. Harris, L. Ren, J. Cai, Y. Zhang, Z. Chang, F.-F. Yin, A technique for generating volumetric cine-magnetic resonance imaging, International Journal of Radiation Oncology - Biology - Physics 95 (2016) 844–853

  11. [11]

    L. V. Romaguera, T. Mezheritsky, R. Mansour, W. Tanguay, S. Kadoury, Predictive online 3D target tracking with population-based generative networks for image-guided radiotherapy, Interna-
    Pohl, Uesaka, Demachi and Chhatkuli: Preprint submitted to Elsevier. Future frame prediction in chest cine MR imaging using the PCA respiratory motion model...

  12. [12]

    H. Lin, C. Shi, B. Wang, M. F. Chan, X. Tang, W. Ji, Towards real-time respiratory motion prediction based on long short-term memory neural networks, Physics in Medicine & Biology 64 (2019) 085010

  13. [13]

    R. Wang, X. Liang, X. Zhu, Y. Xie, A feasibility of respiration prediction based on deep bi-LSTM for real-time tumor tracking, IEEE Access 6 (2018) 51262–51268

  14. [14]

    S. Yu, J. Wang, J. Liu, R. Sun, S. Kuang, L. Sun, Rapid prediction of respiratory motion based on bidirectional gated recurrent unit network, IEEE Access 8 (2020) 49424–49435

  15. [15]

    P. Samadi Miandoab, S. Saramad, S. Setayeshi, Respiratory motion prediction based on deep artificial neural networks in CyberKnife system: A comparative study, Journal of Applied Clinical Medical Physics 24 (2023) e13854

  16. [16]

    E. Lombardo, M. Rabe, Y. Xiong, L. Nierer, D. Cusumano, L. Placidi, L. Boldrini, S. Corradini, M. Niyazi, C. Belka, et al., Offline and online LSTM networks for respiratory motion prediction in MR-guided radiotherapy, Physics in Medicine & Biology 67 (2022) 095006

  17. [17]

    H. Jaeger, Tutorial on training recurrent neural networks, covering BPPT, RTRL, EKF and the "echo state network" approach, volume 5, GMD-Forschungszentrum Informationstechnik Bonn, 2002

  18. [18]

    R. J. Williams, D. Zipser, A learning algorithm for continually running fully recurrent neural networks, Neural computation 1 (1989) 270–280

  19. [19]

    K. Jiang, F. Fujii, T. Shiinoki, Prediction of lung tumor motion using nonlinear autoregressive model with exogenous input, Physics in Medicine & Biology 64 (2019) 21NT02

  20. [20]

    M. Mafi, S. M. Moghadam, Real-time prediction of tumor motion using a dynamic neural network, Medical & biological engineering & computing 58 (2020) 529–539

  21. [21]

    M. Pohl, M. Uesaka, K. Demachi, R. B. Chhatkuli, Prediction of the motion of chest internal points using a recurrent neural network trained with real-time recurrent learning for latency compensation in lung cancer radiotherapy, Computerized Medical Imaging and Graphics (2021) 101941

  22. [22]

    M. Pohl, M. Uesaka, H. Takahashi, K. Demachi, R. B. Chhatkuli, Prediction of the position of external markers using a recurrent neural network trained with unbiased online recurrent optimization for safe lung cancer radiotherapy, Computer Methods and Programs in Biomedicine 222 (2022) 106908

  23. [23]

    M. Pohl, M. Uesaka, H. Takahashi, K. Demachi, R. B. Chhatkuli, Respiratory motion forecasting with online learning of recurrent neural networks for safety enhancement in externally guided radiotherapy, arXiv preprint arXiv:2403.01607 (2024)

  24. [24]

    O. Marschall, K. Cho, C. Savin, A unified framework of online learning algorithms for training recurrent neural networks, Journal of Machine Learning Research 21 (2020) 1–34

  25. [25]

    C. Tallec, Y. Ollivier, Unbiased online recurrent optimization, arXiv preprint arXiv:1702.05043 (2017)

  26. [26]

    J. Menick, E. Elsen, U. Evci, S. Osindero, K. Simonyan, A. Graves, A practical sparse approximation for real time recurrent learning, arXiv preprint arXiv:2006.07232 (2020)

  27. [27]

    S. Hochreiter, J. Schmidhuber, Long short-term memory, Neural computation 9 (1997) 1735–1780

  28. [28]

    M. Minderer, C. Sun, R. Villegas, F. Cole, K. P. Murphy, H. Lee, Unsupervised learning of object structure and dynamics from videos, Advances in Neural Information Processing Systems 32 (2019)

  29. [29]

    X. Jin, H. Xiao, X. Shen, J. Yang, Z. Lin, Y. Chen, Z. Jie, J. Feng, S. Yan, Predicting scene parsing and motion dynamics in the future, arXiv preprint arXiv:1711.03270 (2017)

  30. [30]

    W. Lotter, G. Kreiman, D. Cox, Deep predictive coding networks for video prediction and unsupervised learning, 2017. arXiv:1605.08104

  31. [31]

    P. Luc, C. Couprie, Y. Lecun, J. Verbeek, Predicting future instance segmentation by forecasting convolutional features, in: Proceedings of the European Conference on Computer Vision (ECCV), 2018, pp. 584–599

  32. [32]

    X. Shi, Z. Chen, H. Wang, D.-Y. Yeung, W.-K. Wong, W.-c. Woo, Convolutional LSTM network: A machine learning approach for precipitation nowcasting, arXiv preprint arXiv:1506.04214 (2015)

  33. [33]

    I. Sutskever, O. Vinyals, Q. V. Le, Sequence to sequence learning with neural networks, Advances in neural information processing systems 27 (2014)

  34. [34]

    N. Srivastava, E. Mansimov, R. Salakhudinov, Unsupervised learning of video representations using LSTMs, in: International conference on machine learning, PMLR, 2015, pp. 843–852

  35. [35]

    S. Oprea, P. Martinez-Gonzalez, A. Garcia-Garcia, J. A. Castro-Vargas, S. Orts-Escolano, J. Garcia-Rodriguez, A. Argyros, A review on deep learning techniques for video prediction, IEEE Transactions on Pattern Analysis and Machine Intelligence (2020)

  36. [36]

    R. Villegas, J. Yang, S. Hong, X. Lin, H. Lee, Decomposing motion and content for natural video sequence prediction, arXiv preprint arXiv:1706.08033 (2017)

  37. [37]

    C. Finn, I. Goodfellow, S. Levine, Unsupervised learning for physical interaction through video prediction, arXiv preprint arXiv:1605.07157 (2016)

  38. [38]

    Z. Liu, R. A. Yeh, X. Tang, Y. Liu, A. Agarwala, Video frame synthesis using deep voxel flow, in: Proceedings of the IEEE International Conference on Computer Vision, 2017, pp. 4463–4471

  39. [39]

    M. Jaderberg, K. Simonyan, A. Zisserman, K. Kavukcuoglu, Spatial transformer networks, arXiv preprint arXiv:1506.02025 (2015)

  40. [40]

    M. Babaeizadeh, C. Finn, D. Erhan, R. H. Campbell, S. Levine, Stochastic variational video prediction, arXiv preprint arXiv:1710.11252 (2017)

  41. [41]

    E. Denton, R. Fergus, Stochastic video generation with a learned prior, in: International conference on machine learning, PMLR, 2018, pp. 1174–1183

  42. [42]

    R. B. Chhatkuli, K. Demachi, N. Miyamoto, M. Uesaka, A. Haga, et al., Dynamic image prediction using principal component and multi-channel singular spectral analysis: a feasibility study, Open Journal of Medical Imaging 5 (2015) 133

  43. [43]

    J. Pham, W. Harris, W. Sun, Z. Yang, F.-F. Yin, L. Ren, Predicting real-time 3D deformation field maps (DFM) based on volumetric cine MRI (VC-MRI) and artificial neural networks for on-board 4D target tracking: a feasibility study, Physics in Medicine & Biology 64 (2019) 165016

  44. [44]

    W. Liu, A. Sawant, D. Ruan, Prediction of high-dimensional states subject to respiratory motion: a manifold learning approach, Physics in Medicine & Biology 61 (2016) 4989

  45. [45]

    S. Nabavi, M. Abdoos, M. E. Moghaddam, M. Mohammadi, Respiratory motion prediction using deep convolutional long short-term memory network, Journal of Medical Signals and Sensors 10 (2020) 69

  46. [46]

    L. V. Romaguera, R. Plantefève, F. P. Romero, F. Hébert, J.-F. Carrier, S. Kadoury, Prediction of in-plane organ deformation during free-breathing radiotherapy via discriminative spatial transformer networks, Medical image analysis 64 (2020) 101754

  47. [47]

    L. V. Romaguera, T. Mezheritsky, R. Mansour, J.-F. Carrier, S. Kadoury, Probabilistic 4D predictive model from in-room surrogates using conditional generative networks for image-guided radiotherapy, Medical image analysis 74 (2021) 102250

  48. [48]

    L. V. Romaguera, S. Alley, J.-F. Carrier, S. Kadoury, Conditional-based transformer network with learnable queries for 4D deformation forecasting and tracking, IEEE Transactions on Medical Imaging 42 (2023) 1603–1618

  49. [49]

    ETH Zürich, Biomedical Image Computing, Datasets, 4D MRI lung data, https://bmic.ee.ethz.ch/research/datasets.html, 2021. [Online; accessed 26-May-2021]

  50. [50]

    M. von Siebenthal, G. Szekely, U. Gamper, P. Boesiger, A. Lomax, P. Cattin, 4D MR imaging of respiratory organ motion and its variability, Physics in Medicine & Biology 52 (2007) 1547

  51. [51]

    D. Boye, G. Samei, J. Schmidt, G. Székely, C. Tanner, Population based modeling of respiratory lung motion and prediction from partial information, in: Medical Imaging 2013: Image Processing, volume 8669, International Society for Optics and Photonics, 2013, p. 86690U

  52. [52]

    B. D. Lucas, T. Kanade, et al., An iterative image registration technique with an application to stereo vision (1981)

  53. [53]

    J.-Y. Bouguet, et al., Pyramidal implementation of the affine Lucas Kanade feature tracker, description of the algorithm, Intel Corporation 5 (2001) 4

  54. [54]

    R. Pascanu, T. Mikolov, Y. Bengio, On the difficulty of training recurrent neural networks, in: International conference on machine learning, 2013, pp. 1310–1318

  55. [55]

    Z. Wang, A. C. Bovik, H. R. Sheikh, E. P. Simoncelli, Image quality assessment: from error visibility to structural similarity, IEEE transactions on image processing 13 (2004) 600–612

  56. [56]

    R. B. Chhatkuli, Development of a markerless tumor prediction system using principal component analysis and multi-channel singular spectral analysis with real-time respiratory phase recognition in radiation therapy, Ph.D. thesis, The University of Tokyo, 2016. URL: https://repository.dl.itc.u-tokyo.ac.jp/record/48459/files/A32580_summary.pdf

  57. [57]

    J. Vandemeulebroucke, D. Sarrut, P. Clarysse, et al., The POPI-model, a point-validated pixel-based breathing thorax model, in: XVth international conference on the use of computers in radiation therapy (ICCR), volume 2, 2007, pp. 195–199

  58. [58]

    E. Castillo, R. Castillo, J. Martinez, M. Shenoy, T. Guerrero, Four-dimensional deformable image registration using trajectory modeling, Physics in Medicine & Biology 55 (2009) 305

  59. [59]

    Y. Li, Z. Li, J. Zhu, B. Li, H. Shu, D. Ge, Online prediction for respiratory movement compensation: a patient-specific gating control for MRI-guided radiotherapy, Radiation Oncology 18 (2023) 149

  60. [60]

    R. Li, J. H. Lewis, X. Jia, T. Zhao, W. Liu, S. Wuenschel, J. Lamb, D. Yang, D. A. Low, S. B. Jiang, On a PCA-based lung motion model, Physics in Medicine & Biology 56 (2011) 6009

  61. [61]

    M. Pohl, pohl-michel/2d-mr-image-prediction: First release, 2024. URL: https://doi.org/10.5281/zenodo.13896202. doi: 10.5281/zenodo.13896202

  62. [62]

    Q. Xu, R. J. Hamilton, R. A. Schowengerdt, B. Alexander, S. B. Jiang, Lung tumor tracking in fluoroscopic video based on optical flow, Medical physics 35 (2008) 5351–5359

  63. [63]

    Y. Akino, R.-J. Oh, N. Masai, H. Shiomi, T. Inoue, Evaluation of potential internal target volume of liver tumors using cine-MRI, Medical physics 41 (2014) 111704

  64. [64]

    J. Dhont, J. Vandemeulebroucke, D. Cusumano, L. Boldrini, F. Cellini, V. Valentini, D. Verellen, Multi-object tracking in MRI-guided radiotherapy using the tracking-learning-detection framework, Radiotherapy and Oncology 138 (2019) 25–29

  65. [65]

    G. Balakrishnan, A. Zhao, M. R. Sabuncu, J. Guttag, A. V. Dalca, Voxelmorph: a learning framework for deformable medical image registration, IEEE transactions on medical imaging 38 (2019) 1788–1800

  66. [66]

    M. J. Murphy, Tracking moving organs in real time, in: Seminars in radiation oncology, volume 14, Elsevier, 2004, pp. 91–100

  67. [67]

    D. Fleet, Y. Weiss, Optical flow estimation, in: Handbook of mathematical models in computer vision, Springer, 2006, pp. 237–257

  68. [68]

    G. Zhang, T.-C. Huang, T. Guerrero, K.-P. Lin, C. Stevens, G. Starkschall, K. Forster, Use of three-dimensional (3D) optical flow method in mapping 3D anatomic structure and tumor contours across four-dimensional computed tomography data, Journal of applied clinical medical physics 9 (2008) 59–69.

A. Appendix: Optimization of the image registration parameters

In this section, w...