TimeSynth: A Temporal Fidelity Framework for Health Signal Digital Twins
Pith reviewed 2026-07-02 16:33 UTC · model grok-4.3
The pith
Standard pointwise metrics fail to detect phase and frequency errors in health-signal forecasting models: models with comparable pointwise error differ by up to 53° in phase accuracy, so the metrics misrank them.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Across 11 architectures, models with comparable pointwise error diverge by up to 53° in phase accuracy, equivalent to roughly 123 ms for a 1.2 Hz cardiac rhythm and invisible to standard metrics. Linear and full-sequence attention models systematically lose frequency and phase information despite acceptable amplitude error, whereas architectures with localized temporal structure better preserve dynamical fidelity and adapt to observable state transitions; none, however, reliably preserves stochastic switching. Because the dominant determinant of fidelity is architectural, model choice becomes a principled, use-case-driven decision rather than a search for a single winner.
What carries the argument
TimeSynth's physiologically grounded generator that produces signals with analytically known ground-truth dynamics from parametric models fitted to real EEG, ECG, and PPG signals, paired with diagnostics that quantify amplitude, frequency, phase, and state-transition fidelity.
If this is right
- Architectures with localized temporal structure preserve frequency and phase better than linear or full-sequence attention models.
- No tested architecture reliably preserves stochastic switching between states.
- Model selection for health-signal digital twins should be driven by the specific dynamical properties required rather than overall error scores.
- Development of such models can rely on controlled preclinical tests with known dynamics before coupling to patient data.
Where Pith is reading between the lines
- The same generator and diagnostics could be applied to other domains that rely on oscillatory time series, such as climate or financial forecasting.
- The diagnostics might serve as an online monitoring tool to trigger model retraining when phase drift is detected in live digital twins.
- Hybrid architectures that combine localized structure with mechanisms for stochastic switching could be tested as a direct extension of the reported architectural comparisons.
Load-bearing premise
The generator produces signals with analytically known ground-truth dynamics from parametric models fitted to real electroencephalography, electrocardiography and photoplethysmogram signals.
What would settle it
A controlled test across many more architectures in which phase accuracy correlates strongly with pointwise error, and in which no divergences on the order of 53° appear, would falsify the claim that pointwise metrics create a blind spot.
Original abstract
Forecasting models for health-signal digital twins must preserve the oscillatory, frequency, phase, and state-transition dynamics of physiological signals, yet the pointwise metrics used to benchmark them cannot detect when these fundamental properties are lost. We show that this blind spot misranks models: across 11 architectures, models with comparable pointwise error diverge by up to 53° in phase accuracy, equivalent to roughly 123 ms for a 1.2 Hz cardiac rhythm and invisible to standard metrics. To enable development of models that escape such failures, we introduce TimeSynth, a controlled benchmarking framework with two reusable components: a physiologically grounded generator producing signals with analytically known ground-truth dynamics from parametric models fitted to real electroencephalography, electrocardiography and photoplethysmogram signals, along with diagnostics quantifying amplitude, frequency, phase, and state-transition fidelity. Linear and full-sequence attention models systematically lose frequency and phase information despite acceptable amplitude error, whereas architectures with localized temporal structure better preserve dynamical fidelity and adapt to observable state transitions; none, however, reliably preserves stochastic switching. Because the dominant determinant of fidelity is architectural, model choice becomes a principled, use-case-driven decision rather than a search for a single winner. TimeSynth thus supplies the controlled preclinical stress test missing before models are coupled to patient data, with a reusable generator and diagnostics for fidelity-aware development.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript claims that pointwise metrics are blind to losses in oscillatory dynamics when forecasting health signals for digital twins. Across 11 architectures, models with comparable pointwise error exhibit up to 53° phase divergence (roughly 123 ms at 1.2 Hz). To address this, the authors introduce TimeSynth: a generator that produces EEG/ECG/PPG signals from parametric models fitted to real data and is claimed to supply analytically known ground-truth dynamics, together with diagnostics that quantify amplitude, frequency, phase, and state-transition fidelity. Architectures with localized temporal structure preserve fidelity better than linear or full-sequence attention models, but none reliably captures stochastic switching. Because fidelity is determined mainly by architecture, the authors frame model choice as a use-case-driven decision rather than a search for a universal winner.
Significance. If the generator truly supplies phase and frequency ground truth that is analytically independent of the diagnostic extraction pipeline, the work would be significant for exposing a systematic blind spot in current benchmarking of physiological signal models and for supplying reusable components that enable fidelity-aware development before models are deployed on patient data.
major comments (2)
- [Abstract / Generator component] Abstract and generator description: the headline result (53° phase divergence invisible to pointwise metrics) rests on the generator supplying signals whose phase/frequency/state transitions are 'analytically known' from parametric models fitted to real signals. Standard parametric forms for ECG (e.g., McSharry-style ODE systems) and similar oscillators yield phase only after numerical integration; the manuscript must specify exactly how ground-truth phase is obtained and demonstrate that the extraction is independent of (or identically matched to) the diagnostic pipeline, otherwise the reported divergence risks being partly an artifact of integrator tolerance, unwrapping convention, or state-transition detection.
- [Results (phase accuracy across architectures)] Results on phase accuracy: the equivalence '53° ≈ 123 ms at 1.2 Hz' is presented as a practical illustration. The manuscript should state the precise formula used for this conversion and confirm it is applied uniformly to the tested cardiac and neural signals rather than derived from a single nominal frequency.
minor comments (2)
- [Abstract] The abstract refers to '11 architectures' without enumeration or pointer to a table; adding a concise list or reference would aid readability.
- [Generator description] All parametric models used in the generator should be named with their governing equations (or explicit citations) to support reproducibility.
Simulated Author's Rebuttal
We thank the referee for the constructive comments on our manuscript. These points help improve the clarity of the generator description and the interpretation of phase metrics. We address each major comment below and will incorporate revisions to strengthen the paper.
Point-by-point responses
-
Referee: [Abstract / Generator component] Abstract and generator description: the headline result (53° phase divergence invisible to pointwise metrics) rests on the generator supplying signals whose phase/frequency/state transitions are 'analytically known' from parametric models fitted to real signals. Standard parametric forms for ECG (e.g., McSharry-style ODE systems) and similar oscillators yield phase only after numerical integration; the manuscript must specify exactly how ground-truth phase is obtained and demonstrate that the extraction is independent of (or identically matched to) the diagnostic pipeline, otherwise the reported divergence risks being partly an artifact of integrator tolerance, unwrapping convention, or state-transition detection.
Authors: We agree that explicit specification of the ground-truth extraction is required for reproducibility and to rule out artifacts. The TimeSynth generator derives phase, frequency, and state transitions directly from the closed-form parametric equations (e.g., instantaneous phase from the analytic oscillator state variables in the McSharry-style ECG model and analogous forms for EEG and PPG), without post-hoc numerical integration for the ground truth itself. The diagnostic pipeline applies identical extraction operators to both generated and reference signals. In the revised manuscript we will add a dedicated Methods subsection that (i) states the exact analytic expressions used for each signal modality, (ii) provides the verification that the same operators are used in diagnostics, and (iii) reports a numerical check confirming that integrator tolerance and unwrapping conventions do not affect the reported phase divergence.
Revision: yes
-
Referee: [Results (phase accuracy across architectures)] Results on phase accuracy: the equivalence '53° ≈ 123 ms at 1.2 Hz' is presented as a practical illustration. The manuscript should state the precise formula used for this conversion and confirm it is applied uniformly to the tested cardiac and neural signals rather than derived from a single nominal frequency.
Authors: We accept the need for an explicit formula and uniform application. The conversion is time_delay = (phase_error_deg / 360) × (1 / f_dom), where f_dom is the dominant frequency of the specific signal under test. For the cardiac example this yields ~123 ms at 1.2 Hz; for neural signals the corresponding band-limited dominant frequency is substituted. The revised Results section will state the formula verbatim and confirm that each architecture comparison uses the per-signal dominant frequency extracted from the same parametric model, ensuring the illustration is not based on a single nominal value.
Revision: yes
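The conversion quoted in the rebuttal can be checked with a few lines of Python. This is a minimal sketch, not the authors' code: the function name `phase_error_to_delay_ms` is hypothetical, and the 10 Hz value is an assumed illustrative neural rhythm that does not come from the paper.

```python
def phase_error_to_delay_ms(phase_error_deg: float, f_dom_hz: float) -> float:
    """Convert a phase error in degrees into a time delay in milliseconds.

    Implements time_delay = (phase_error_deg / 360) * (1 / f_dom),
    then scales seconds to milliseconds.
    """
    if f_dom_hz <= 0:
        raise ValueError("dominant frequency must be positive")
    period_s = 1.0 / f_dom_hz  # duration of one oscillation cycle
    return (phase_error_deg / 360.0) * period_s * 1000.0


# Cardiac example from the text: 53 degrees at 1.2 Hz.
cardiac_ms = phase_error_to_delay_ms(53.0, 1.2)

# Assumed 10 Hz rhythm (illustrative only): the same phase error
# corresponds to a much shorter delay.
neural_ms = phase_error_to_delay_ms(53.0, 10.0)

print(f"cardiac: {cardiac_ms:.1f} ms, 10 Hz example: {neural_ms:.1f} ms")
```

Because the delay scales with 1 / f_dom, the same phase error in degrees corresponds to very different delays across modalities, which is why the referee asks for the per-signal dominant frequency rather than a single nominal value.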
Circularity Check
No significant circularity; framework and diagnostics are independent of target claims
Full rationale
The paper introduces TimeSynth as a new generator (parametric models fitted to real EEG/ECG/PPG) plus new diagnostics for amplitude/frequency/phase/state-transition fidelity. The headline result (up to 53° phase divergence invisible to pointwise metrics) is obtained by running 11 external architectures on signals from this generator and comparing their outputs against the generator's stated ground-truth dynamics. No equation or claim reduces by construction to a fitted parameter renamed as prediction, no self-citation chain is load-bearing for the central result, and the generator/diagnostics are presented as newly introduced components rather than derived from the evaluated models. The derivation chain is therefore self-contained against external benchmarks.
Axiom & Free-Parameter Ledger
free parameters (1)
- parameters of the parametric models
axioms (1)
- domain assumption: Parametric models fitted to real physiological signals accurately capture their oscillatory, frequency, phase, and state-transition dynamics
Reference graph
Works this paper leans on
- [1] Li, H., Zhang, J., Zhang, N. & Zhu, B. Advancing emergency care with digital twins. JMIR aging 8, e71777 (2025)
- [2] Sarani Rad, F., Bitaraf, E., Jafarpour, M. & Li, J. Technologies, clinical applications, and implementation barriers of digital twins in precision cardiology: Systematic review. JMIR cardio 10, e78499 (2026)
- [3] Sadée, C. et al. Medical digital twins: enabling precision medicine and medical artificial intelligence. The Lancet Digital Health 7 (2025)
- [4] of Engineering, N. A. et al. Foundational research gaps and future directions for digital twins (2024)
- [5] Clifford, G. D., Azuaje, F., McSharry, P. et al. Advanced methods and tools for ECG data analysis, Vol. 10 (Artech house Boston, 2006)
- [6] Tong, H. Non-linear time series: a dynamical system approach (Oxford university press, 1990)
- [7] Sörnmo, L. & Laguna, P. Bioelectrical signal processing in cardiac and neurological applications (Academic press, 2005)
- [8] Zeng, A., Chen, M., Zhang, L. & Xu, Q. Are transformers effective for time series forecasting? Proceedings of the AAAI Conference on Artificial Intelligence 37, 11121–11128 (2023)
- [9] Kim, J., Kim, H., Kim, H., Lee, D. & Yoon, S. A comprehensive survey of deep learning for time series forecasting: architectural diversity and open challenges. Artificial Intelligence Review 58, 216 (2025)
- [10] Wang, Y. et al. Deep time series models: A comprehensive survey and benchmark. arXiv preprint arXiv:2407.13278 (2024)
- [11] Tudor, B. H. et al. A scoping review of human digital twins in healthcare applications and usage patterns. npj Digital Medicine 8, 587 (2025)
- [12] Sel, K. et al. Survey and perspective on verification, validation, and uncertainty quantification of digital twins for precision medicine. npj Digital Medicine 8, 40 (2025)
- [13] Goldberger, A. L. et al. Physiobank, physiotoolkit, and physionet: components of a new research resource for complex physiologic signals. Circulation 101, e215–e220 (2000)
- [14] Obermeyer, Z. & Emanuel, E. J. Predicting the future—big data, machine learning, and clinical medicine. The New England journal of medicine 375, 1216 (2016)
- [15] Maksymenko, K., Clarke, A. K., Mendez Guerra, I., Deslauriers-Gauthier, S. & Farina, D. A myoelectric digital twin for fast and realistic modelling in deep learning. Nature Communications 14, 1600 (2023)
- [16] Bernett, J. et al. Critical evaluation of drug response prediction models with DrEval. Nature Communications (2026)
- [17] Yan, C. et al. A multifaceted benchmarking of synthetic electronic health record generation models. Nature Communications 13, 7609 (2022)
- [18] Nie, Y., Nguyen, N. H., Sinthong, P. & Kalagnanam, J. A time series is worth 64 words: Long-term forecasting with transformers. arXiv preprint arXiv:2211.14730 (2022)
- [19] Wang, H. et al. MICN: Multi-scale local and global context modeling for long-term series forecasting. The Eleventh International Conference on Learning Representations (2023)
- [20] Luo, D. & Wang, X. ModernTCN: A modern pure convolution structure for general time series analysis. The Twelfth International Conference on Learning Representations (2024)
- [21] Reiss, A., Indlekofer, I. & Schmidt, P. PPG-DaLiA. UCI Machine Learning Repository (2019). DOI: https://doi.org/10.24432/C53890
- [22] Moody, G. B. & Mark, R. G. The impact of the MIT-BIH Arrhythmia Database. IEEE Engineering in Medicine and Biology Magazine 20, 45–50 (2001)
- [23] Shoeb, A. H. Application of machine learning to epileptic seizure onset detection and treatment (2009)
- [24] Wu, H., Xu, J., Wang, J. & Long, M. Autoformer: Decomposition transformers with auto-correlation for long-term series forecasting. Advances in neural information processing systems 34, 22419–22430 (2021)
- [25] Taga, E. O., Ildiz, M. E. & Oymak, S. TimePFN: Effective multivariate time series forecasting with synthetic data. Proceedings of the AAAI Conference on Artificial Intelligence 39, 20761–20769 (2025)
- [26]
- [27] Oreshkin, B. N., Carpov, D., Chapados, N. & Bengio, Y. N-beats: Neural basis expansion analysis for interpretable time series forecasting. arXiv preprint arXiv:1905.10437 (2019)
- [28] Yi, K. et al. Frequency-domain mlps are more effective learners in time series forecasting. Advances in Neural Information Processing Systems 36, 76656–76679 (2023)
  Author Contributions: M.R.H. conceived the TimeSynth framework, designed the synthetic signal generation pipeline, implemented the parametric models fitted to real biosignals (ECG, PPG, EEG), dev...
- [29] Zero-pad the signal to length $N_{\text{fft}} = 2N$ (pad factor = 2) to reduce circular convolution edge effects
- [30] Compute the FFT: $X(k) = \mathrm{FFT}(x_{\text{padded}})$
- [31] Construct the one-sided spectral mask:
  $$H(k) = \begin{cases} 1 & k = 0 \\ 2 & 1 \le k < N_{\text{fft}}/2 \\ 1 & k = N_{\text{fft}}/2 \\ 0 & k > N_{\text{fft}}/2 \end{cases} \tag{A14}$$
- [32] Inverse transform and crop to the original length: $z(t) = \mathrm{IFFT}(X \cdot H)$ for $t = 0, \ldots, N-1$. The real part of $z(t)$ approximates the original signal, and the imaginary part is its Hilbert transform. This implementation is equivalent to scipy.signal.hilbert but provides explicit control over the padding factor. The pad factor of 2 was chosen to minimize edge artifacts; increa...
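The four steps above (zero-pad, FFT, one-sided mask, inverse transform and crop) can be sketched in NumPy as follows. This is an illustrative reconstruction from the listed steps, not the paper's code: the function name `analytic_signal`, the 5 Hz test tone, and the 100 Hz sampling rate are all assumptions chosen for the example.

```python
import numpy as np


def analytic_signal(x, pad_factor=2):
    """Analytic signal via a zero-padded FFT and a one-sided spectral mask."""
    x = np.asarray(x, dtype=float)
    n = x.size
    if n == 0:
        raise ValueError("signal must not be empty")
    n_fft = pad_factor * n

    # Steps 1-2: np.fft.fft zero-pads the input up to n_fft samples.
    spectrum = np.fft.fft(x, n=n_fft)

    # Step 3: one-sided mask (keep DC, double positive frequencies,
    # keep the Nyquist bin once, zero the negative frequencies).
    mask = np.zeros(n_fft)
    mask[0] = 1.0
    if n_fft % 2 == 0:
        mask[1:n_fft // 2] = 2.0
        mask[n_fft // 2] = 1.0
    else:
        # Odd length (only possible with an odd pad factor): no Nyquist bin.
        mask[1:(n_fft + 1) // 2] = 2.0

    # Step 4: inverse transform and crop back to the original length.
    return np.fft.ifft(spectrum * mask)[:n]


# Assumed test signal: a 5 Hz cosine sampled at 100 Hz for 10 s.
fs = 100.0
t = np.arange(1000) / fs
x = np.cos(2 * np.pi * 5.0 * t)

z = analytic_signal(x)
inst_phase = np.unwrap(np.angle(z))  # instantaneous phase in radians
```

Away from the edges, the imaginary part of `z` should track the sine at the same frequency, so `inst_phase` grows roughly linearly at 2π · 5 rad/s. Near the signal boundaries the estimate degrades, which is the edge effect the padding is meant to reduce.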
- [33] Sort the $m$ raw p-values in ascending order: $p_{(1)} \le p_{(2)} \le \cdots \le p_{(m)}$.
- [34] Multiply each by its rank-dependent factor: $\tilde{p}_{(k)} = (m - k + 1) \cdot p_{(k)}$
- [35] Enforce monotonicity: $p^{\text{Holm}}_{(k)} = \max\bigl(\tilde{p}_{(k)},\, p^{\text{Holm}}_{(k-1)}\bigr)$, capped at 1.0. The Holm procedure controls the family-wise error rate at $\alpha = 0.05$ while providing uniformly greater power than the classical Bonferroni correction.
  A5.3 Intersection-valid masking. For frequency and phase error, spectral reliability filtering (§A4.2, §A4.3) can produce NaN values for ...
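The three Holm steps listed above (sort, scale by rank, enforce monotonicity with a cap at 1.0) can be sketched in plain Python. The function name `holm_adjust` and the example p-values are hypothetical and serve only to illustrate the procedure; they are not taken from the paper.

```python
def holm_adjust(p_values):
    """Return Holm-adjusted p-values in the same order as the input."""
    m = len(p_values)
    # Step 1: indices of the raw p-values in ascending order.
    order = sorted(range(m), key=lambda i: p_values[i])

    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order, start=1):
        # Step 2: rank-dependent factor (m - k + 1).
        scaled = (m - rank + 1) * p_values[idx]
        # Step 3: adjusted values may not decrease with rank; cap at 1.0.
        running_max = max(running_max, scaled)
        adjusted[idx] = min(running_max, 1.0)
    return adjusted


# Made-up raw p-values for three comparisons.
raw = [0.01, 0.04, 0.03]
adj = holm_adjust(raw)
significant = [p <= 0.05 for p in adj]
```

In this made-up example the sorted values 0.01, 0.03, 0.04 are scaled to 0.03, 0.06, 0.04, and the monotonicity step lifts the last one to 0.06, so only the first comparison remains below 0.05.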