arxiv: 2605.17804 · v2 · pith:5GYHENSFnew · submitted 2026-05-18 · 💻 cs.LG · eess.SP

GenTS: A Comprehensive Benchmark Library for Generative Time Series Models

Pith reviewed 2026-05-20 13:11 UTC · model grok-4.3

classification 💻 cs.LG eess.SP
keywords generative models, time series, benchmark library, model evaluation, data synthesis, forecasting, imputation, machine learning

The pith

GenTS provides a benchmark library built specifically for generative time series models rather than discriminative ones.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper sets out to address the mismatch between existing time series libraries, which optimize for direct input-output mappings and metrics like mean squared error, and generative models that instead learn full data distributions through processes such as adversarial training or diffusion. It introduces GenTS to supply a single preprocessing pipeline, a set of ready models, and broad evaluation metrics that work across synthesis, forecasting, imputation, and similar tasks. The modular structure is presented as the feature that lets users add their own datasets and models without rewriting the core workflow. Benchmarking runs on diverse tasks then produce concrete comparisons that guide which models suit which settings and highlight open research questions.

Core claim

GenTS is a benchmark library that supplies a unified data preprocessing pipeline, a collection of versatile generative models, panoramic evaluation metrics, and a modular architecture so that researchers can run systematic assessments of models that learn time series distributions rather than direct mappings.

What carries the argument

The modular design that links a shared preprocessing stage, interchangeable generative models, and a wide set of distribution-aware metrics into one extensible workflow.
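As an illustration of what such a modular workflow can look like, here is a minimal sketch in plain Python. It is not the GenTS API: the registries `MODELS` and `METRICS`, the decorators, the toy `GaussianPerStep` model, and the `mean_gap` metric are all hypothetical names invented for this example. It assumes equal-length univariate series.

```python
import random
import statistics
from typing import Callable, Dict, List, Sequence

Series = List[float]

# Hypothetical registries: names map to interchangeable components.
MODELS: Dict[str, Callable[[], "GenerativeModel"]] = {}
METRICS: Dict[str, Callable[[Sequence[Series], Sequence[Series]], float]] = {}


def register_model(name):
    def wrap(cls):
        MODELS[name] = cls
        return cls
    return wrap


def register_metric(name):
    def wrap(fn):
        METRICS[name] = fn
        return fn
    return wrap


def preprocess(raw: Sequence[Series]) -> List[Series]:
    """Shared stage: z-score every value using global statistics."""
    flat = [x for s in raw for x in s]
    mu = statistics.fmean(flat)
    sd = statistics.pstdev(flat) or 1.0  # guard against constant data
    return [[(x - mu) / sd for x in s] for s in raw]


class GenerativeModel:
    """Template every model follows: learn from data, then draw samples."""
    def fit(self, data):
        raise NotImplementedError

    def sample(self, n, rng):
        raise NotImplementedError


@register_model("gaussian_per_step")
class GaussianPerStep(GenerativeModel):
    """Toy model: an independent Gaussian at each time step."""
    def fit(self, data):
        cols = list(zip(*data))
        self.mu = [statistics.fmean(c) for c in cols]
        self.sd = [statistics.pstdev(c) for c in cols]
        return self

    def sample(self, n, rng):
        return [[rng.gauss(m, s) for m, s in zip(self.mu, self.sd)]
                for _ in range(n)]


@register_metric("mean_gap")
def mean_gap(real, fake):
    """Absolute gap between per-step means, averaged over time."""
    r = [statistics.fmean(c) for c in zip(*real)]
    f = [statistics.fmean(c) for c in zip(*fake)]
    return statistics.fmean([abs(a - b) for a, b in zip(r, f)])


def run_benchmark(raw, seed=0):
    """Run every registered model through every registered metric."""
    data = preprocess(raw)
    rng = random.Random(seed)
    results = {}
    for model_name, factory in MODELS.items():
        model = factory().fit(data)
        fake = model.sample(len(data), rng)
        results[model_name] = {k: fn(data, fake) for k, fn in METRICS.items()}
    return results
```

In this sketch, adding a model or a metric means writing one decorated class or function; `run_benchmark` itself does not change. Whether the real library achieves the same low friction is exactly the premise the page flags as load-bearing.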

If this is right

  • Evaluations of generative time series models become reproducible across synthesis, forecasting, and imputation tasks.
  • Model selection decisions can rest on side-by-side results instead of isolated published numbers.
  • Gaps in current generative approaches become visible when the same metrics are applied to many models.
  • New models can be inserted into the pipeline without rebuilding the evaluation stack.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

  • The same library structure could be adapted to other sequential data types such as event streams or spatial-temporal records.
  • Standardized generative benchmarks may shorten the cycle from new model proposal to comparative testing.
  • Industry teams working with irregular time series could adopt the preprocessing layer as a common starting point.

Load-bearing premise

That the modular structure will let outside researchers add new datasets and models without friction, and that the reported experiments will give stable guidance on model choice.

What would settle it

A follow-up study that adds several new datasets and models and finds the library requires major code changes or that the original benchmark rankings reverse under different random seeds or task definitions.
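One way to make the "rankings reverse under different random seeds" test concrete is to rank models per seed and measure how well the rankings agree. The sketch below uses the Kendall rank correlation, computed in plain Python. The function names and the scores in `example_scores` are made up for illustration and are not results from the paper. It assumes lower scores are better and that no two models tie within a seed.

```python
from itertools import combinations


def ranking(scores):
    """Model names ordered from best (lowest score) to worst."""
    return sorted(scores, key=scores.get)


def kendall_tau(rank_a, rank_b):
    """Kendall rank correlation between two orderings of the same items."""
    pos_a = {m: i for i, m in enumerate(rank_a)}
    pos_b = {m: i for i, m in enumerate(rank_b)}
    concordant = discordant = 0
    for x, y in combinations(rank_a, 2):
        s = (pos_a[x] - pos_a[y]) * (pos_b[x] - pos_b[y])
        if s > 0:
            concordant += 1
        elif s < 0:
            discordant += 1
    n_pairs = len(rank_a) * (len(rank_a) - 1) // 2
    return (concordant - discordant) / n_pairs


def rank_stability(scores_by_seed):
    """Mean pairwise Kendall tau over the per-seed rankings."""
    ranks = [ranking(s) for s in scores_by_seed.values()]
    taus = [kendall_tau(a, b) for a, b in combinations(ranks, 2)]
    return sum(taus) / len(taus)


# Made-up placeholder scores (seed -> model -> metric value).
example_scores = {
    0: {"model_a": 0.10, "model_b": 0.20, "model_c": 0.30},
    1: {"model_a": 0.12, "model_b": 0.19, "model_c": 0.33},
    2: {"model_a": 0.25, "model_b": 0.18, "model_c": 0.31},
}
```

A value of 1 means every seed produces the same ordering, and values near 0 or below mean the ordering is largely seed-dependent. For the placeholder scores above, two seeds agree fully and the third swaps the top two models, so `rank_stability(example_scores)` evaluates to 5/9.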

Figures

Figures reproduced from arXiv: 2605.17804 by Chenxi Wang, Peiyang Li, Xiaorong Wang, Yi Wang.

Figure 1: Left: The overview of GenTS framework. Right: A code snippet showing the neat usage of GenTS in less than 30 lines. Customizable Base Model Module. Similar to BaseDataModule, a base model class BaseModel is designed as a template for all models. In this framework, we inherit LightningModule and then respectively implement training_step and validation_step for each model. Since some GAN-based models, like …
Figure 2: Main Results of Time Series Synthesis Benchmarking.
Figure 3: t-SNE Visualization of Class Label-guided Time Series Generation. In ECG dataset, we selected first two classes from
Figure 7: Training and inference time of different models in
Figure 4: t-SNE Visualization of multivariate synthesis
Figure 5: Training and inference time of different models in
Figure 6: Training and inference time of different models in
Original abstract

Generative models have demonstrated remarkable potential in time series analysis tasks, like synthesis, forecasting, imputation, etc. However, offering limited coverage for generative models, existing time series libraries are mainly engineered for discriminative models, with standardized workflows for specific tasks, such as optimizing Mean Squared Errors for time series forecasting. This rigid structure is fundamentally incompatible with the distinct and often complex paradigms of generative models (e.g., adversarial training, diffusion processes), which learn the underlying data distribution rather than a direct input-output mapping. To this end, we proposed GenTS, a comprehensive and extensible benchmark library designed for systematic assessment on generative time series models. GenTS features a unified data preprocessing pipeline, a collection of versatile models, and panoramic evaluation metrics. Its modular design also enables the researchers to flexibly customize beyond our built-in datasets and models. Based on GenTS, we conducted benchmarking experiments under diverse tasks, accordingly offering suggestions for model selection and identifying potential directions for future research. Our codes are open-source at https://github.com/WillWang1113/GenTS. The official tutorials and document are available at https://willwang1113.github.io/GenTS/.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper introduces GenTS, a comprehensive and extensible benchmark library for generative time series models. It features a unified data preprocessing pipeline, a collection of versatile models supporting paradigms such as adversarial training and diffusion processes, panoramic evaluation metrics, and a modular design for customization beyond built-in datasets and models. Based on the library, the authors conduct benchmarking experiments under diverse tasks (synthesis, forecasting, imputation) and provide model selection suggestions, with open-source code and tutorials available.

Significance. If the library's modular components are correctly implemented and the benchmarking experiments are executed with consistent training protocols and statistical rigor, GenTS could address a clear gap in existing time series libraries that focus primarily on discriminative models. It would offer a standardized, extensible framework for evaluating generative models that learn data distributions rather than direct mappings, potentially enabling more reliable comparisons and future research directions. The open-source release strengthens reproducibility.

major comments (2)
  1. [Benchmarking Experiments] Benchmarking experiments section: The manuscript describes the experimental setup at a high level but supplies no quantitative results, error analysis, ablation studies on hyperparameter sensitivity, multiple random seeds with significance testing, or explicit verification that the modular components preserve generative training dynamics (e.g., adversarial or diffusion processes) without hidden task-specific tweaks. This directly undermines the central claim that the experiments yield reliable suggestions for model selection, as identical data splits, comparable training budgets, and appropriate distribution-matching metrics are not demonstrated.
  2. [Library Design] Library design and evaluation metrics: While the unified preprocessing pipeline and panoramic metrics are presented as compatible with generative paradigms, the manuscript does not include concrete examples or validation showing that these metrics (rather than point-wise losses) are applied consistently across synthesis, imputation, and forecasting tasks, which is load-bearing for claims of systematic assessment.
minor comments (2)
  1. [Abstract] Abstract: The claim of 'panoramic evaluation metrics' would benefit from a brief enumeration of the specific metrics used (e.g., distribution divergence measures) to clarify their suitability for generative tasks.
  2. [Introduction] The GitHub link and documentation URL are provided but should be verified for accessibility and accompanied by a brief description of the repository structure in the main text for reader convenience.
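To make the first minor comment concrete, here is a sketch of one common distribution-level measure, a Monte Carlo estimate of the sliced Wasserstein distance, written in plain Python. It is offered only as an example of the kind of metric the comment asks the authors to enumerate. Whether GenTS implements this metric, and under what name, is not stated on this page. The function names are hypothetical, and the sketch assumes both sample sets have the same size and the same series length.

```python
import math
import random


def wasserstein_1d(a, b):
    """1-Wasserstein distance between two equal-sized 1-D samples."""
    if len(a) != len(b):
        raise ValueError("this sketch assumes equal sample sizes")
    return sum(abs(x - y) for x, y in zip(sorted(a), sorted(b))) / len(a)


def random_direction(dim, rng):
    """Unit vector drawn uniformly from the sphere via normalized Gaussians."""
    v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / norm for x in v]


def sliced_wasserstein(real, fake, n_projections=64, seed=0):
    """Average 1-D Wasserstein distance over random projections."""
    rng = random.Random(seed)
    dim = len(real[0])
    total = 0.0
    for _ in range(n_projections):
        d = random_direction(dim, rng)
        # Project every series (treated as a vector) onto the direction.
        proj_real = [sum(w * x for w, x in zip(d, s)) for s in real]
        proj_fake = [sum(w * x for w, x in zip(d, s)) for s in fake]
        total += wasserstein_1d(proj_real, proj_fake)
    return total / n_projections
```

Unlike mean squared error, this score compares two sets of series as distributions and needs no pairing between individual real and generated samples, which is the property the referee's comments on distribution-matching metrics turn on.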

Simulated Authors' Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive and detailed feedback on our manuscript. We have carefully reviewed the major comments and provide point-by-point responses below. Where the comments identify areas needing additional rigor or clarity, we have incorporated revisions into the next version of the manuscript.

Point-by-point responses
  1. Referee: [Benchmarking Experiments] Benchmarking experiments section: The manuscript describes the experimental setup at a high level but supplies no quantitative results, error analysis, ablation studies on hyperparameter sensitivity, multiple random seeds with significance testing, or explicit verification that the modular components preserve generative training dynamics (e.g., adversarial or diffusion processes) without hidden task-specific tweaks. This directly undermines the central claim that the experiments yield reliable suggestions for model selection, as identical data splits, comparable training budgets, and appropriate distribution-matching metrics are not demonstrated.

    Authors: We acknowledge that the current manuscript presents the benchmarking experiments primarily at a descriptive level to emphasize the library's modularity and extensibility. To strengthen the empirical foundation for our model selection suggestions, the revised manuscript will include comprehensive quantitative results across synthesis, forecasting, and imputation tasks. These additions will feature performance tables with distribution-matching metrics, error analysis, ablation studies on hyperparameter sensitivity, results aggregated over multiple random seeds with statistical significance testing, and explicit documentation of consistent data splits and training budgets. We will also add verification examples (including code references) confirming that the modular components preserve the core dynamics of adversarial and diffusion-based training without unintended task-specific alterations. These changes directly address the concerns about reliability and comparability.

    revision: yes

  2. Referee: [Library Design] Library design and evaluation metrics: While the unified preprocessing pipeline and panoramic metrics are presented as compatible with generative paradigms, the manuscript does not include concrete examples or validation showing that these metrics (rather than point-wise losses) are applied consistently across synthesis, imputation, and forecasting tasks, which is load-bearing for claims of systematic assessment.

    Authors: We agree that explicit validation is essential to substantiate the compatibility claims. The revised manuscript will include new concrete examples and validation subsections. These will demonstrate, with sample code snippets and illustrative outputs, how the panoramic metrics (focused on distribution matching) are applied uniformly across the three tasks, in contrast to point-wise losses. We will also show that the unified preprocessing pipeline supports generative paradigms without introducing inconsistencies. This addition will provide the necessary evidence for the systematic assessment framework.

    revision: yes

Circularity Check

0 steps flagged

No circularity: library and benchmarking claims are directly verifiable

Full rationale

The paper introduces GenTS as an open-source benchmark library with unified preprocessing, models, metrics, and modular design, plus experimental results from diverse tasks. No mathematical derivations, equations, predictions, or first-principles results exist that could reduce to inputs by construction. Claims rest on the released code and described setup rather than self-referential definitions or self-citation chains. This matches the default non-circular case for software/benchmark contributions.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

This is a software and benchmarking contribution rather than a theoretical paper, so it introduces no free parameters, mathematical axioms, or invented entities; it relies on standard machine learning assumptions about generative models learning data distributions.

pith-pipeline@v0.9.0 · 5739 in / 1037 out tokens · 55102 ms · 2026-05-20T13:11:41.190556+00:00 · methodology


