GenTS: A Comprehensive Benchmark Library for Generative Time Series Models

Chenxi Wang; Peiyang Li; Xiaorong Wang; Yi Wang

arXiv: 2605.17804 · v2 · pith:5GYHENSFnew · submitted 2026-05-18 · cs.LG · eess.SP

Pith reviewed 2026-05-20 13:11 UTC · model grok-4.3

keywords: generative models, time series, benchmark library, model evaluation, data synthesis, forecasting, imputation, machine learning

The pith

GenTS provides a benchmark library built specifically for generative time series models rather than discriminative ones.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper sets out to address the mismatch between existing time series libraries, which optimize for direct input-output mappings and metrics like mean squared error, and generative models that instead learn full data distributions through processes such as adversarial training or diffusion. It introduces GenTS to supply a single preprocessing pipeline, a set of ready models, and broad evaluation metrics that work across synthesis, forecasting, imputation, and similar tasks. The modular structure is presented as the feature that lets users add their own datasets and models without rewriting the core workflow. Benchmarking runs on diverse tasks then produce concrete comparisons that guide which models suit which settings and highlight open research questions.

Core claim

GenTS is a benchmark library that supplies a unified data preprocessing pipeline, a collection of versatile generative models, panoramic evaluation metrics, and a modular architecture so that researchers can run systematic assessments of models that learn time series distributions rather than direct mappings.

What carries the argument

The modular design that links a shared preprocessing stage, interchangeable generative models, and a wide set of distribution-aware metrics into one extensible workflow.
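
The workflow described here can be pictured with a small registry pattern. The sketch below is illustrative only: `MODEL_REGISTRY`, `register_model`, `standardize`, `mean_std_gap`, `run_benchmark`, and the two toy generators are hypothetical names invented for this example, not the GenTS API. It only shows how one shared preprocessing step, interchangeable models, and a common distribution-level score can be wired together so that adding a model does not touch the evaluation loop.

```python
"""Hypothetical sketch of a modular benchmark loop (not the GenTS API)."""
import random
import statistics
from typing import Callable, Dict, List

Series = List[float]
# A "model" is any callable: (training series, sample count, rng) -> synthetic series.
Model = Callable[[Series, int, random.Random], Series]

MODEL_REGISTRY: Dict[str, Model] = {}


def register_model(name: str) -> Callable[[Model], Model]:
    """Decorator that adds a model to the registry under `name`."""
    def decorator(fn: Model) -> Model:
        MODEL_REGISTRY[name] = fn
        return fn
    return decorator


def standardize(series: Series) -> Series:
    """Shared preprocessing: zero mean, unit (population) standard deviation."""
    mean = statistics.fmean(series)
    std = statistics.pstdev(series) or 1.0
    return [(x - mean) / std for x in series]


@register_model("gaussian")
def gaussian_model(train, n, rng):
    """Toy generator: i.i.d. samples from a Gaussian fitted to the data."""
    mu, sigma = statistics.fmean(train), statistics.pstdev(train)
    return [rng.gauss(mu, sigma) for _ in range(n)]


@register_model("bootstrap")
def bootstrap_model(train, n, rng):
    """Toy generator: resamples training values with replacement."""
    return [rng.choice(train) for _ in range(n)]


def mean_std_gap(real: Series, fake: Series) -> float:
    """Crude distribution-level score: gap in mean plus gap in std (lower is better)."""
    return (abs(statistics.fmean(real) - statistics.fmean(fake))
            + abs(statistics.pstdev(real) - statistics.pstdev(fake)))


def run_benchmark(raw: Series, seed: int = 0) -> Dict[str, float]:
    """Apply the same preprocessing and the same metric to every registered model."""
    data = standardize(raw)
    results = {}
    for name, model in MODEL_REGISTRY.items():
        rng = random.Random(seed)  # same seed for every model
        results[name] = mean_std_gap(data, model(data, len(data), rng))
    return results


raw_series = [float(i % 7) for i in range(200)]
scores = run_benchmark(raw_series)
for name, score in sorted(scores.items()):
    print(f"{name}: {score:.3f}")
```

Under these assumptions, a new generator only needs the `@register_model` decorator; `run_benchmark` stays unchanged.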

If this is right

Evaluations of generative time series models become reproducible across synthesis, forecasting, and imputation tasks.
Model selection decisions can rest on side-by-side results instead of isolated published numbers.
Gaps in current generative approaches become visible when the same metrics are applied to many models.
New models can be inserted into the pipeline without rebuilding the evaluation stack.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same library structure could be adapted to other sequential data types such as event streams or spatial-temporal records.
Standardized generative benchmarks may shorten the cycle from new model proposal to comparative testing.
Industry teams working with irregular time series could adopt the preprocessing layer as a common starting point.

Load-bearing premise

The premise is twofold: that the modular structure will let outside researchers add new datasets and models without friction, and that the reported experiments will give stable guidance on model choice.

What would settle it

A follow-up study that adds several new datasets and models and finds the library requires major code changes or that the original benchmark rankings reverse under different random seeds or task definitions.
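
One way to make the seed-sensitivity part of this test concrete is to compare the model rankings obtained under different seeds. The sketch below computes Kendall's tau between two rankings. The model names and scores are invented placeholders, not results from the paper, and the helper names `ranking` and `kendall_tau` are introduced only for this example.

```python
"""Rank-stability check across seeds; all scores below are invented placeholders."""
from itertools import combinations
from typing import Dict, List


def ranking(scores: Dict[str, float]) -> List[str]:
    """Order models from best to worst, assuming a lower score is better."""
    return sorted(scores, key=lambda name: scores[name])


def kendall_tau(rank_a: List[str], rank_b: List[str]) -> float:
    """Kendall's tau between two rankings of the same items (no ties).

    +1 means identical order, -1 means fully reversed order.
    """
    pos_a = {name: i for i, name in enumerate(rank_a)}
    pos_b = {name: i for i, name in enumerate(rank_b)}
    concordant = discordant = 0
    for x, y in combinations(rank_a, 2):
        # Positive product: both rankings order the pair the same way.
        agree = (pos_a[x] - pos_a[y]) * (pos_b[x] - pos_b[y])
        if agree > 0:
            concordant += 1
        else:
            discordant += 1
    return (concordant - discordant) / (concordant + discordant)


# Hypothetical per-seed scores for three hypothetical models.
seed_0 = {"model_a": 0.10, "model_b": 0.20, "model_c": 0.30}
seed_1 = {"model_a": 0.12, "model_b": 0.35, "model_c": 0.25}

tau = kendall_tau(ranking(seed_0), ranking(seed_1))
print(ranking(seed_0), ranking(seed_1), round(tau, 3))
```

A tau near 1 across many seed pairs would suggest the rankings are stable; values near 0 or below would indicate the kind of reversal this section describes.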

Figures

Figures reproduced from arXiv: 2605.17804 by Chenxi Wang, Peiyang Li, Xiaorong Wang, Yi Wang.

**Figure 1.** Left: The overview of GenTS framework. Right: A code snippet showing the neat usage of GenTS in less than 30 lines.

Customizable Base Model Module: Similar to BaseDataModule, a base model class BaseModel is designed as a template for all models. In this framework, we inherit LightningModule and then respectively implement training_step and validation_step for each model. Since some GAN-based models, like …
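
The base-model template mentioned in the Figure 1 excerpt follows a familiar pattern: a base class owns the shared loop, and each model fills in `training_step` and `validation_step`. The sketch below is a pure-Python stand-in for that pattern. It does not import Lightning and does not reproduce the real GenTS `BaseModel`, whose signature is not shown in the source; `fit`, `RunningMeanModel`, and the toy data are assumptions made for illustration.

```python
"""Template-method sketch of a base model class; a stand-in, not GenTS or Lightning code."""
from abc import ABC, abstractmethod
from typing import List, Sequence

Batch = Sequence[float]


class BaseModel(ABC):
    """Hypothetical template: subclasses supply the per-batch steps."""

    @abstractmethod
    def training_step(self, batch: Batch) -> float:
        """Update the model on one batch and return the training loss."""

    @abstractmethod
    def validation_step(self, batch: Batch) -> float:
        """Return the validation loss on one batch without updating."""

    def fit(self, train: List[Batch], val: List[Batch], epochs: int = 1) -> List[float]:
        """Shared loop (the part a trainer would own); returns mean val loss per epoch."""
        history = []
        for _ in range(epochs):
            for batch in train:
                self.training_step(batch)
            losses = [self.validation_step(batch) for batch in val]
            history.append(sum(losses) / len(losses))
        return history


class RunningMeanModel(BaseModel):
    """Toy model: tracks a running mean of all values seen during training."""

    def __init__(self) -> None:
        self.mean = 0.0
        self.count = 0

    def _loss(self, batch: Batch) -> float:
        # Squared error of the batch against the current running mean.
        return sum((x - self.mean) ** 2 for x in batch) / len(batch)

    def training_step(self, batch: Batch) -> float:
        for x in batch:
            self.count += 1
            self.mean += (x - self.mean) / self.count
        return self._loss(batch)

    def validation_step(self, batch: Batch) -> float:
        return self._loss(batch)


model = RunningMeanModel()
history = model.fit(train=[[1.0, 3.0], [2.0, 2.0]], val=[[2.0, 2.0]], epochs=2)
print(model.mean, history)
```

Because the two steps are abstract, `BaseModel` itself cannot be instantiated; only subclasses that implement both can be trained by the shared loop.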

**Figure 2.** Main Results of Time Series Synthesis Benchmarking.

**Figure 3.** t-SNE Visualization of Class Label-guided Time Series Generation. In ECG dataset, we selected first two classes from …

**Figure 4.** t-SNE Visualization of multivariate synthesis

**Figure 5.** Training and inference time of different models in …

**Figure 6.** Training and inference time of different models in …

**Figure 7.** Training and inference time of different models in …

Original abstract

Generative models have demonstrated remarkable potential in time series analysis tasks, like synthesis, forecasting, imputation, etc. However, offering limited coverage for generative models, existing time series libraries are mainly engineered for discriminative models, with standardized workflows for specific tasks, such as optimizing Mean Squared Errors for time series forecasting. This rigid structure is fundamentally incompatible with the distinct and often complex paradigms of generative models (e.g., adversarial training, diffusion processes), which learn the underlying data distribution rather than a direct input-output mapping. To this end, we proposed GenTS, a comprehensive and extensible benchmark library designed for systematic assessment on generative time series models. GenTS features a unified data preprocessing pipeline, a collection of versatile models, and panoramic evaluation metrics. Its modular design also enables the researchers to flexibly customize beyond our built-in datasets and models. Based on GenTS, we conducted benchmarking experiments under diverse tasks, accordingly offering suggestions for model selection and identifying potential directions for future research. Our codes are open-source at https://github.com/WillWang1113/GenTS. The official tutorials and document are available at https://willwang1113.github.io/GenTS/.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

GenTS is a new open-source library filling a gap for generative time series benchmarking, but its model selection suggestions rest on details that are not yet shown in depth.

The main point is that GenTS targets a real coverage gap: most time series libraries focus on discriminative models with fixed losses like MSE, while generative ones rely on distribution learning through adversarial or diffusion training. The library supplies a unified preprocessing pipeline, a set of models, panoramic metrics suited to synthesis and imputation, and a modular structure so users can plug in their own data or models. The authors ran benchmarking across tasks and drew some selection suggestions from the results, with code and tutorials released on GitHub. That combination is the concrete contribution here and could help standardize comparisons in this corner of the field.

The experiments are the softer spot. The stress-test concern holds because reliable suggestions require consistent data splits, comparable training budgets, and metrics that actually match generative objectives rather than point-wise errors. The abstract stays high-level and shows no quantitative tables, ablations, or multi-seed statistics, so it is not yet clear whether the modular components preserve training dynamics without hidden tweaks. If the full paper supplies those checks, the suggestions gain weight; otherwise they remain preliminary.

This work is aimed at researchers already working on generative time series who need a shared evaluation harness rather than a broad audience. A reader who wants to run fair head-to-head tests or extend the framework would get direct value from the code. It is coherent enough on its own terms to deserve peer review, mainly to pressure the experimental section for more transparency on consistency and statistical support.

Referee Report

2 major / 2 minor

Summary. The paper introduces GenTS, a comprehensive and extensible benchmark library for generative time series models. It features a unified data preprocessing pipeline, a collection of versatile models supporting paradigms such as adversarial training and diffusion processes, panoramic evaluation metrics, and a modular design for customization beyond built-in datasets and models. Based on the library, the authors conduct benchmarking experiments under diverse tasks (synthesis, forecasting, imputation) and provide model selection suggestions, with open-source code and tutorials available.

Significance. If the library's modular components are correctly implemented and the benchmarking experiments are executed with consistent training protocols and statistical rigor, GenTS could address a clear gap in existing time series libraries that focus primarily on discriminative models. It would offer a standardized, extensible framework for evaluating generative models that learn data distributions rather than direct mappings, potentially enabling more reliable comparisons and future research directions. The open-source release strengthens reproducibility.

major comments (2)

[Benchmarking Experiments] Benchmarking experiments section: The manuscript describes the experimental setup at a high level but supplies no quantitative results, error analysis, ablation studies on hyperparameter sensitivity, multiple random seeds with significance testing, or explicit verification that the modular components preserve generative training dynamics (e.g., adversarial or diffusion processes) without hidden task-specific tweaks. This directly undermines the central claim that the experiments yield reliable suggestions for model selection, as identical data splits, comparable training budgets, and appropriate distribution-matching metrics are not demonstrated.
[Library Design] Library design and evaluation metrics: While the unified preprocessing pipeline and panoramic metrics are presented as compatible with generative paradigms, the manuscript does not include concrete examples or validation showing that these metrics (rather than point-wise losses) are applied consistently across synthesis, imputation, and forecasting tasks, which is load-bearing for claims of systematic assessment.

minor comments (2)

[Abstract] Abstract: The claim of 'panoramic evaluation metrics' would benefit from a brief enumeration of the specific metrics used (e.g., distribution divergence measures) to clarify their suitability for generative tasks.
[Introduction] The GitHub link and documentation URL are provided but should be verified for accessibility and accompanied by a brief description of the repository structure in the main text for reader convenience.
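
To illustrate the kind of distribution divergence measure the abstract comment asks for, the sketch below contrasts a point-wise error with a distribution-level distance on synthetic data. It uses the 1-D Wasserstein-1 distance between equal-size empirical samples (the mean absolute difference of sorted values) purely as an example; it is not claimed to be one of the metrics GenTS ships, and the helper names are assumptions for this example.

```python
"""Point-wise error vs. a distribution-level distance on toy data (illustrative only)."""
import random
import statistics
from typing import List


def mse(a: List[float], b: List[float]) -> float:
    """Mean squared error between aligned sequences."""
    return statistics.fmean((x - y) ** 2 for x, y in zip(a, b))


def wasserstein_1d(a: List[float], b: List[float]) -> float:
    """Wasserstein-1 distance between two equal-size empirical samples."""
    if len(a) != len(b):
        raise ValueError("this simple form needs equal sample sizes")
    return statistics.fmean(abs(x - y) for x, y in zip(sorted(a), sorted(b)))


rng = random.Random(0)
real = [rng.gauss(0.0, 1.0) for _ in range(2000)]

# Candidate 1: always output the sample mean (what minimizing MSE favors
# when the target is unpredictable noise).
mean_only = [statistics.fmean(real)] * len(real)
# Candidate 2: an independent draw from the same distribution.
resampled = [rng.gauss(0.0, 1.0) for _ in range(2000)]

print("MSE         mean-only:", round(mse(real, mean_only), 3),
      " resampled:", round(mse(real, resampled), 3))
print("Wasserstein mean-only:", round(wasserstein_1d(real, mean_only), 3),
      " resampled:", round(wasserstein_1d(real, resampled), 3))
```

On this toy data the point-wise error should favor the constant output, while the distribution-level distance should favor the resampled series, which is the mismatch between discriminative and generative evaluation that the review describes.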

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive and detailed feedback on our manuscript. We have carefully reviewed the major comments and provide point-by-point responses below. Where the comments identify areas needing additional rigor or clarity, we have incorporated revisions into the next version of the manuscript.

Referee: [Benchmarking Experiments] Benchmarking experiments section: The manuscript describes the experimental setup at a high level but supplies no quantitative results, error analysis, ablation studies on hyperparameter sensitivity, multiple random seeds with significance testing, or explicit verification that the modular components preserve generative training dynamics (e.g., adversarial or diffusion processes) without hidden task-specific tweaks. This directly undermines the central claim that the experiments yield reliable suggestions for model selection, as identical data splits, comparable training budgets, and appropriate distribution-matching metrics are not demonstrated.

Authors: We acknowledge that the current manuscript presents the benchmarking experiments primarily at a descriptive level to emphasize the library's modularity and extensibility. To strengthen the empirical foundation for our model selection suggestions, the revised manuscript will include comprehensive quantitative results across synthesis, forecasting, and imputation tasks. These additions will feature performance tables with distribution-matching metrics, error analysis, ablation studies on hyperparameter sensitivity, results aggregated over multiple random seeds with statistical significance testing, and explicit documentation of consistent data splits and training budgets. We will also add verification examples (including code references) confirming that the modular components preserve the core dynamics of adversarial and diffusion-based training without unintended task-specific alterations. These changes directly address the concerns about reliability and comparability. revision: yes
Referee: [Library Design] Library design and evaluation metrics: While the unified preprocessing pipeline and panoramic metrics are presented as compatible with generative paradigms, the manuscript does not include concrete examples or validation showing that these metrics (rather than point-wise losses) are applied consistently across synthesis, imputation, and forecasting tasks, which is load-bearing for claims of systematic assessment.

Authors: We agree that explicit validation is essential to substantiate the compatibility claims. The revised manuscript will include new concrete examples and validation subsections. These will demonstrate, with sample code snippets and illustrative outputs, how the panoramic metrics (focused on distribution matching) are applied uniformly across the three tasks, in contrast to point-wise losses. We will also show that the unified preprocessing pipeline supports generative paradigms without introducing inconsistencies. This addition will provide the necessary evidence for the systematic assessment framework. revision: yes

Circularity Check

0 steps flagged

No circularity: library and benchmarking claims are directly verifiable

full rationale

The paper introduces GenTS as an open-source benchmark library with unified preprocessing, models, metrics, and modular design, plus experimental results from diverse tasks. No mathematical derivations, equations, predictions, or first-principles results exist that could reduce to inputs by construction. Claims rest on the released code and described setup rather than self-referential definitions or self-citation chains. This matches the default non-circular case for software/benchmark contributions.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

This is a software and benchmarking contribution rather than a theoretical paper, so it introduces no free parameters, mathematical axioms, or invented entities; it relies on standard machine learning assumptions about generative models learning data distributions.

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel · unclear
Paper passage: GenTS features a unified data preprocessing pipeline, a collection of versatile models, and panoramic evaluation metrics. Its modular design also enables the researchers to flexibly customize beyond our built-in datasets and models.

IndisputableMonolith/Foundation/ArithmeticFromLogic.lean · LogicNat · unclear
Paper passage: we focus on three major and common time series generation tasks... Time Series Synthesis... Forecasting... Imputation

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

57 extracted references · 57 canonical work pages · 7 internal anchors

[1]

Josh Achiam, Steven Adler, Sandhini Agarwal, Lama Ahmad, Ilge Akkaya, Flo- rencia Leoni Aleman, Diogo Almeida, Janko Altenschmidt, Sam Altman, Shya- mal Anadkat, et al . 2024. Gpt-4 technical report. arXiv:2303.08774 [cs.CL] https://arxiv.org/abs/2303.08774

work page internal anchor Pith review Pith/arXiv arXiv 2024
[2]

Ahmed Alaa, Alex James Chan, and Mihaela van der Schaar. 2021. Genera- tive Time-series Modeling with Fourier Flows. In International Conference on Learning Representations. https://openreview.net/forum?id=PpshD0AXfA

work page 2021
[3]

Maddix, Syama Ran- gapuram, David Salinas, Jasper Schulz, Lorenzo Stella, Ali Caner Türkmen, and Yuyang Wang

Alexander Alexandrov, Konstantinos Benidis, Michael Bohlke-Schneider, Valentin Flunkert, Jan Gasthaus, Tim Januschowski, Danielle C. Maddix, Syama Ran- gapuram, David Salinas, Jasper Schulz, Lorenzo Stella, Ali Caner Türkmen, and Yuyang Wang. 2020. GluonTS: Probabilistic and Neural Time Series Mod- eling in Python. Journal of Machine Learning Research 21,...

work page 2020
[4]

Yihao Ang, Qiang Huang, Yifan Bao, Anthony K. H. Tung, and Zhiyong Huang

work page
[5]

TSGBench: Time Series Generation Benchmark. Proc. VLDB Endow. 17, 3 (Nov. 2023), 305–318. doi:10.14778/3632093.3632097

work page doi:10.14778/3632093.3632097 2023
[6]

Yihao Ang, Qiang Wang, Qiang Huang, Yifan Bao, Xinyu Xi, Anthony K. H. Tung, Chen Jin, and Zhiyong Huang. 2025. CTBench: Cryptocurrency Time Series Generation Benchmark. arXiv:2508.02758 [q-fin.ST] https://arxiv.org/abs/2508. 02758

work page arXiv 2025
[7]

Nicolas Bonneel, Julien Rabin, Gabriel Peyré, and Hanspeter Pfister. 2015. Sliced and radon wasserstein barycenters of measures.Journal of Mathematical Imaging and Vision 51, 1 (2015), 22–45

work page 2015
[8]

Andrew Brock, Jeff Donahue, and Karen Simonyan. 2019. Large Scale GAN Training for High Fidelity Natural Image Synthesis. In International Conference on Learning Representations. https://openreview.net/forum?id=B1xsqj09Fm

work page 2019
[9]

Ricky T. Q. Chen, Yulia Rubanova, Jesse Bettencourt, and David Duvenaud

work page
[10]

In Proceedings of the 32nd International Conference on Neural Information Processing Systems (Montréal, Canada) (NIPS’18)

Neural ordinary differential equations. In Proceedings of the 32nd International Conference on Neural Information Processing Systems (Montréal, Canada) (NIPS’18). Curran Associates Inc., Red Hook, NY, USA, 6572–6583

work page
[11]

Jonathan Crabbé, Nicolas Huynh, Jan Stanczuk, and Mihaela Van Der Schaar

work page
[12]

In Proceedings of the 41st International Conference on Machine Learning (Vienna, Austria) (ICML’24)

Time series diffusion in the frequency domain. In Proceedings of the 41st International Conference on Machine Learning (Vienna, Austria) (ICML’24). JMLR.org, Article 374, 32 pages

work page
[13]

Abhyuday Desai, Cynthia Freeman, Zuhui Wang, and Ian Beaver. 2021. TimeVAE: A Variational Auto-Encoder for Multivariate Time Series Generation. arXiv:2111.08095 [cs.LG] https://arxiv.org/abs/2111.08095

work page arXiv 2021
[14]

Laurent Dinh, Jascha Sohl-Dickstein, and Samy Bengio. 2017. Density estimation using Real NVP. InInternational Conference on Learning Representations. https: //openreview.net/forum?id=HkpbnH9lx

work page 2017
[15]

Real-valued (Medical) Time Series Generation with Recurrent Conditional GANs

Cristóbal Esteban, Stephanie L. Hyland, and Gunnar Rätsch. 2017. Real- valued (Medical) Time Series Generation with Recurrent Conditional GANs. arXiv:1706.02633 [stat.ML] https://arxiv.org/abs/1706.02633

work page internal anchor Pith review Pith/arXiv arXiv 2017
[16]

2019.Unsupervised scalable representation learning for multivariate time series

Jean-Yves Franceschi, Aymeric Dieuleveut, and Martin Jaggi. 2019.Unsupervised scalable representation learning for multivariate time series. Curran Associates Inc., Red Hook, NY, USA

work page 2019
[17]

Asadullah Hill Galib, Pang-Ning Tan, and Lifeng Luo. 2024. Fide: Frequency- inflated conditional diffusion model for extreme-aware time series generation. Advances in Neural Information Processing Systems 37 (2024), 114434–114457

work page 2024
[18]

Mathieu Germain, Karol Gregor, Iain Murray, and Hugo Larochelle

work page
[19]

In Proceedings of the 32nd International Conference on Machine Learning (Proceedings of Machine Learning Research, Vol

MADE: Masked Autoencoder for Distribution Estimation. In Proceedings of the 32nd International Conference on Machine Learning (Proceedings of Machine Learning Research, Vol. 37), Fran- cis Bach and David Blei (Eds.). PMLR, Lille, France, 881–889. https: //proceedings.mlr.press/v37/germain15.html

work page
[20]

Goodfellow, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde- Farley, Sherjil Ozair, Aaron Courville, and Yoshua Bengio

Ian J. Goodfellow, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde- Farley, Sherjil Ozair, Aaron Courville, and Yoshua Bengio. 2014. Generative adversarial nets. In Proceedings of the 28th International Conference on Neural Information Processing Systems - Volume 2 (Montreal, Canada) (NIPS’14). MIT Press, Cambridge, MA, USA, 2672–2680

work page 2014
[21]

Daya Guo, Dejian Yang, Haowei Zhang, Junxiao Song, Ruoyu Zhang, Runxin Xu, Qihao Zhu, Shirong Ma, Peiyi Wang, Xiao Bi, et al . 2025. DeepSeek-R1 incentivizes reasoning in LLMs through reinforcement learning. Nature 645, 8081 (Sept. 2025), 633–638. doi:10.1038/s41586-025-09422-z

work page doi:10.1038/s41586-025-09422-z 2025
[22]

Jonathan Ho, Ajay Jain, and Pieter Abbeel. 2020. Denoising diffusion probabilistic models. Advances in neural information processing systems 33 (2020), 6840– 6851

work page 2020
[23]

Paul Jeha, Michael Bohlke-Schneider, Pedro Mercado, Shubham Kapoor, Ra- jbir Singh Nirwan, Valentin Flunkert, Jan Gasthaus, and Tim Januschowski. 2022. PSA-GAN: Progressive self attention GANs for synthetic time series. InThe tenth international conference on learning representations

work page 2022
[24]

Jinsung Jeon, Jeonghak Kim, Haryong Song, Seunghyeon Cho, and Noseong Park

work page
[25]

Advances in Neural Information Processing Systems 35 (2022), 36999– 37010

Gt-gan: General purpose time series synthesis with generative adversarial networks. Advances in Neural Information Processing Systems 35 (2022), 36999– 37010

work page 2022
[26]

Patrick Kidger, James Foster, Xuechen Li, and Terry Lyons. 2021. Efficient and accurate gradients for neural SDEs. In Proceedings of the 35th International Conference on Neural Information Processing Systems (NIPS ’21). Curran Asso- ciates Inc., Red Hook, NY, USA, Article 1433, 15 pages

work page 2021
[27]

Patrick Kidger, James Foster, Xuechen Li, and Terry J Lyons. 2021. Neural SDEs as Infinite-Dimensional GANs. In Proceedings of the 38th International Conference on Machine Learning (Proceedings of Machine Learning Research, Vol. 139), Marina Meila and Tong Zhang (Eds.). PMLR, 5453–5463. https: //proceedings.mlr.press/v139/kidger21b.html

work page 2021
[28]

Kingma, Tim Salimans, Rafal Jozefowicz, Xi Chen, Ilya Sutskever, and Max Welling

Diederik P. Kingma, Tim Salimans, Rafal Jozefowicz, Xi Chen, Ilya Sutskever, and Max Welling. 2016. Improved variational inference with inverse autoregressive flow. InProceedings of the 30th International Conference on Neural Information Processing Systems (Barcelona, Spain) (NIPS’16). Curran Associates Inc., Red Hook, NY, USA, 4743–4751

work page 2016
[29]

Diederik P Kingma and Max Welling. 2022. Auto-Encoding Variational Bayes. arXiv:1312.6114 [stat.ML] https://arxiv.org/abs/1312.6114

work page internal anchor Pith review Pith/arXiv arXiv 2022
[30]

Marcel Kollovieh, Abdul Fatir Ansari, Michael Bohlke-Schneider, Jasper Zschieg- ner, Hao Wang, and Yuyang Bernie Wang. 2023. Predict, refine, synthesize: Self-guiding diffusion models for probabilistic time series forecasting. Advances in Neural Information Processing Systems 36 (2023), 28341–28364

work page 2023
[31]

Daesoo Lee, Sara Malacarne, and Erlend Aune. 2023. Vector Quantized Time Series Generation with a Bidirectional Prior Model. In Proceedings of The 26th International Conference on Artificial Intelligence and Statistics (Proceedings of Machine Learning Research, Vol. 206), Francisco Ruiz, Jennifer Dy, and Jan-Willem van de Meent (Eds.). PMLR, 7665–7693. ht...

work page 2023
[32]

Xuechen Li, Ting-Kam Leonard Wong, Ricky T. Q. Chen, and David Duvenaud

work page
[33]

In Proceedings of the Twenty Third International Conference on Artificial Intelligence and Statistics (Proceedings of Machine Learning Research, Vol

Scalable Gradients for Stochastic Differential Equations. In Proceedings of the Twenty Third International Conference on Artificial Intelligence and Statistics (Proceedings of Machine Learning Research, Vol. 108), Silvia Chiappa and Roberto Calandra (Eds.). PMLR, 3870–3882. https://proceedings.mlr.press/ v108/li20i.html

work page
[34]

Yuxin Li, Wenchao Chen, Xinyue Hu, Bo Chen, Baolin Sun, and Mingyuan Zhou

work page
[35]

In The Twelfth International Conference on Learning Representations

Transformer-modulated diffusion models for probabilistic multivariate time series forecasting. In The Twelfth International Conference on Learning Representations

work page
[36]

Yong Liu, Haixu Wu, Jianmin Wang, and Mingsheng Long. 2022. Non-stationary transformers: Exploring the stationarity in time series forecasting. Advances in neural information processing systems 35 (2022), 9881–9893

work page 2022
[37]

Calvin Luo. 2022. Understanding Diffusion Models: A Unified Perspective. arXiv:2208.11970 [cs.LG] https://arxiv.org/abs/2208.11970

work page arXiv 2022
[38]

Ilan Naiman, Nimrod Berman, Itai Pemper, Idan Arbiv, Gal Fadlon, and Omri Azencot. 2024. Utilizing image transforms and diffusion models for generative modeling of short and long time series. In Proceedings of the 38th International Conference on Neural Information Processing Systems (Vancouver, BC, Canada) (NIPS ’24). Curran Associates Inc., Red Hook, NY...

work page 2024
[39]

Benjamin Erichson, Pu Ren, Michael W

Ilan Naiman, N. Benjamin Erichson, Pu Ren, Michael W. Mahoney, and Omri Azencot. 2024. Generative Modeling of Regular and Irregular Time Series Data via Koopman VAEs. In The Twelfth International Conference on Learning Representations. https://openreview.net/forum?id=eY7sLb0dVF

work page 2024
[40]

Alexander Nikitin, Letizia Iannucci, and Samuel Kaski. 2024. TSGM: a flexible framework for generative modeling of synthetic time series. Advances in Neural Information Processing Systems 37 (2024), 129042–129061

work page 2024
[41]

George Papamakarios, Theo Pavlakou, and Iain Murray. 2017. Masked autore- gressive flow for density estimation. In Proceedings of the 31st International Conference on Neural Information Processing Systems (Long Beach, California, USA) (NIPS’17). Curran Associates Inc., Red Hook, NY, USA, 2335–2344

work page 2017
[42]

William Peebles and Saining Xie. 2023. Scalable Diffusion Models with Transform- ers. In 2023 IEEE/CVF International Conference on Computer Vision (ICCV). 4172–4182. doi:10.1109/ICCV51070.2023.00387

work page doi:10.1109/iccv51070.2023.00387 2023
[43]

Zhaozhi Qian, Bogdan-Constantin Cebere, and Mihaela van der Schaar. 2023. Synthcity: facilitating innovative use cases of synthetic data in different data modalities. arXiv:2301.07573 [cs.LG] https://arxiv.org/abs/2301.07573

work page arXiv 2023
[44]

Kashif Rasul, Calvin Seward, Ingmar Schuster, and Roland Vollgraf. 2021. Autore- gressive Denoising Diffusion Models for Multivariate Probabilistic Time Series Forecasting. In Proceedings of the 38th International Conference on Machine Learning (Proceedings of Machine Learning Research, Vol. 139), Marina Meila and Tong Zhang (Eds.). PMLR, 8857–8868. https...

work page 2021
[45]

Ali Razavi, Aäron van den Oord, and Oriol Vinyals. 2019. Generating diverse high-fidelity images with VQ-VAE-2. Curran Associates Inc., Red Hook, NY, USA. Conference’17, July 2017, Washington, DC, USA Trovato et al

work page 2019
[46]

Yulia Rubanova, Ricky T. Q. Chen, and David Duvenaud. 2019. Latent ODEs for irregularly-sampled time series. Curran Associates Inc., Red Hook, NY, USA

work page 2019
[47]

Ali Seyfi, Jean-Francois Rajotte, and Raymond Ng. 2022. Generating multivariate time series with COmmon Source CoordInated GAN (COSCI-GAN). Advances in neural information processing systems 35 (2022), 32777–32788

work page 2022
[48]

Yusuke Tashiro, Jiaming Song, Yang Song, and Stefano Ermon. 2021. Csdi: Con- ditional score-based diffusion models for probabilistic time series imputation. Advances in neural information processing systems 34 (2021), 24804–24816

work page 2021
[49]

Gemini Team, Rohan Anil, Sebastian Borgeaud, Jean-Baptiste Alayrac, Jiahui Yu, Radu Soricut, Johan Schalkwyk, Andrew M Dai, Anja Hauth, Katie Mil- lican, et al . 2025. Gemini: A Family of Highly Capable Multimodal Models. arXiv:2312.11805 [cs.CL] https://arxiv.org/abs/2312.11805

work page internal anchor Pith review Pith/arXiv arXiv 2025
[50]

Aaron Van Den Oord, Sander Dieleman, Heiga Zen, Karen Simonyan, Oriol Vinyals, Alex Graves, Nal Kalchbrenner, Andrew Senior, Koray Kavukcuoglu, et al . 2016. Wavenet: A generative model for raw audio. arXiv preprint arXiv:1609.03499 12 (2016), 1

work page internal anchor Pith review Pith/arXiv arXiv 2016
[51]

Chenxi Wang, Linxiao Yang, Zhixian Wang, Liang Sun, and Yi Wang. 2025. A Non-isotropic Time Series Diffusion Model with Moving Average Transi- tions. In Proceedings of the 42nd International Conference on Machine Learning (Proceedings of Machine Learning Research, Vol. 267), Aarti Singh, Maryam Fazel, Daniel Hsu, Simon Lacoste-Julien, Felix Berkenkamp, Te...

work page 2025
[52]

Yuxuan Wang, Haixu Wu, Jiaxiang Dong, Yong Liu, Chen Wang, Mingsheng Long, and Jianmin Wang. 2025. Deep Time Series Models: A Comprehensive Survey and Benchmark. arXiv:2407.13278 [cs.LG] https://arxiv.org/abs/2407.13278

work page internal anchor Pith review Pith/arXiv arXiv 2025
[53]

Jinsung Yoon, Daniel Jarrett, and Mihaela van der Schaar. 2019. Time-series Generative Adversarial Networks. InAdvances in Neural Information Processing Systems, H. Wallach, H. Larochelle, A. Beygelzimer, F. d'Alché-Buc, E. Fox, and R. Garnett (Eds.), Vol. 32. Curran Associates, Inc. https://proceedings.neurips.cc/ paper_files/paper/2019/file/c9efe5f26cd1...

work page 2019
[54]

Xinyu Yuan and Yan Qiao. 2024. Diffusion-TS: Interpretable Diffusion for General Time Series Generation. In The Twelfth International Conference on Learning Representations. https://openreview.net/forum?id=4h1apFjO99

work page 2024
[55]

Jiawen Zhang, Xumeng Wen, Zhenwei Zhang, Shun Zheng, Jia Li, and Jiang Bian

work page
[56]

In NeurIPS Datasets and Benchmarks Track

ProbTS: Benchmarking Point and Distributional Forecasting across Diverse Prediction Horizons. In NeurIPS Datasets and Benchmarks Track

[57]

Linqi Zhou, Michael Poli, Winnie Xu, Stefano Massaroli, and Stefano Ermon. 2023. Deep Latent State Space Models for Time-Series Generation. In Proceedings of the 40th International Conference on Machine Learning (Proceedings of Machine Learning Research, Vol. 202), Andreas Krause, Emma Brunskill, Kyunghyun Cho, Barbara Engelhardt, Sivan Sabato, and Jona...

[1]

Josh Achiam, Steven Adler, Sandhini Agarwal, Lama Ahmad, Ilge Akkaya, Florencia Leoni Aleman, Diogo Almeida, Janko Altenschmidt, Sam Altman, Shyamal Anadkat, et al. 2024. GPT-4 Technical Report. arXiv:2303.08774 [cs.CL] https://arxiv.org/abs/2303.08774

[2]

Ahmed Alaa, Alex James Chan, and Mihaela van der Schaar. 2021. Generative Time-series Modeling with Fourier Flows. In International Conference on Learning Representations. https://openreview.net/forum?id=PpshD0AXfA

[3]

Alexander Alexandrov, Konstantinos Benidis, Michael Bohlke-Schneider, Valentin Flunkert, Jan Gasthaus, Tim Januschowski, Danielle C. Maddix, Syama Rangapuram, David Salinas, Jasper Schulz, Lorenzo Stella, Ali Caner Türkmen, and Yuyang Wang. 2020. GluonTS: Probabilistic and Neural Time Series Modeling in Python. Journal of Machine Learning Research 21,...

[4]

Yihao Ang, Qiang Huang, Yifan Bao, Anthony K. H. Tung, and Zhiyong Huang. 2023. TSGBench: Time Series Generation Benchmark. Proc. VLDB Endow. 17, 3 (Nov. 2023), 305–318. doi:10.14778/3632093.3632097

[6]

Yihao Ang, Qiang Wang, Qiang Huang, Yifan Bao, Xinyu Xi, Anthony K. H. Tung, Chen Jin, and Zhiyong Huang. 2025. CTBench: Cryptocurrency Time Series Generation Benchmark. arXiv:2508.02758 [q-fin.ST] https://arxiv.org/abs/2508.02758

[7]

Nicolas Bonneel, Julien Rabin, Gabriel Peyré, and Hanspeter Pfister. 2015. Sliced and Radon Wasserstein barycenters of measures. Journal of Mathematical Imaging and Vision 51, 1 (2015), 22–45

[8]

Andrew Brock, Jeff Donahue, and Karen Simonyan. 2019. Large Scale GAN Training for High Fidelity Natural Image Synthesis. In International Conference on Learning Representations. https://openreview.net/forum?id=B1xsqj09Fm

[9]

Ricky T. Q. Chen, Yulia Rubanova, Jesse Bettencourt, and David Duvenaud. 2018. Neural ordinary differential equations. In Proceedings of the 32nd International Conference on Neural Information Processing Systems (Montréal, Canada) (NIPS’18). Curran Associates Inc., Red Hook, NY, USA, 6572–6583

[11]

Jonathan Crabbé, Nicolas Huynh, Jan Stanczuk, and Mihaela van der Schaar. 2024. Time series diffusion in the frequency domain. In Proceedings of the 41st International Conference on Machine Learning (Vienna, Austria) (ICML’24). JMLR.org, Article 374, 32 pages

[13]

Abhyuday Desai, Cynthia Freeman, Zuhui Wang, and Ian Beaver. 2021. TimeVAE: A Variational Auto-Encoder for Multivariate Time Series Generation. arXiv:2111.08095 [cs.LG] https://arxiv.org/abs/2111.08095

[14]

Laurent Dinh, Jascha Sohl-Dickstein, and Samy Bengio. 2017. Density estimation using Real NVP. In International Conference on Learning Representations. https://openreview.net/forum?id=HkpbnH9lx

[15]

Cristóbal Esteban, Stephanie L. Hyland, and Gunnar Rätsch. 2017. Real-valued (Medical) Time Series Generation with Recurrent Conditional GANs. arXiv:1706.02633 [stat.ML] https://arxiv.org/abs/1706.02633

[16]

Jean-Yves Franceschi, Aymeric Dieuleveut, and Martin Jaggi. 2019. Unsupervised scalable representation learning for multivariate time series. Curran Associates Inc., Red Hook, NY, USA

[17]

Asadullah Hill Galib, Pang-Ning Tan, and Lifeng Luo. 2024. Fide: Frequency-inflated conditional diffusion model for extreme-aware time series generation. Advances in Neural Information Processing Systems 37 (2024), 114434–114457

[18]

Mathieu Germain, Karol Gregor, Iain Murray, and Hugo Larochelle. 2015. MADE: Masked Autoencoder for Distribution Estimation. In Proceedings of the 32nd International Conference on Machine Learning (Proceedings of Machine Learning Research, Vol. 37), Francis Bach and David Blei (Eds.). PMLR, Lille, France, 881–889. https://proceedings.mlr.press/v37/germain15.html

[20]

Ian J. Goodfellow, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil Ozair, Aaron Courville, and Yoshua Bengio. 2014. Generative adversarial nets. In Proceedings of the 28th International Conference on Neural Information Processing Systems - Volume 2 (Montreal, Canada) (NIPS’14). MIT Press, Cambridge, MA, USA, 2672–2680

[21]

Daya Guo, Dejian Yang, Haowei Zhang, Junxiao Song, Ruoyu Zhang, Runxin Xu, Qihao Zhu, Shirong Ma, Peiyi Wang, Xiao Bi, et al. 2025. DeepSeek-R1 incentivizes reasoning in LLMs through reinforcement learning. Nature 645, 8081 (Sept. 2025), 633–638. doi:10.1038/s41586-025-09422-z

[22]

Jonathan Ho, Ajay Jain, and Pieter Abbeel. 2020. Denoising diffusion probabilistic models. Advances in Neural Information Processing Systems 33 (2020), 6840–6851

[23]

Paul Jeha, Michael Bohlke-Schneider, Pedro Mercado, Shubham Kapoor, Rajbir Singh Nirwan, Valentin Flunkert, Jan Gasthaus, and Tim Januschowski. 2022. PSA-GAN: Progressive self attention GANs for synthetic time series. In The Tenth International Conference on Learning Representations

[24]

Jinsung Jeon, Jeonghak Kim, Haryong Song, Seunghyeon Cho, and Noseong Park. 2022. GT-GAN: General purpose time series synthesis with generative adversarial networks. Advances in Neural Information Processing Systems 35 (2022), 36999–37010

[26]

Patrick Kidger, James Foster, Xuechen Li, and Terry Lyons. 2021. Efficient and accurate gradients for neural SDEs. In Proceedings of the 35th International Conference on Neural Information Processing Systems (NIPS ’21). Curran Associates Inc., Red Hook, NY, USA, Article 1433, 15 pages

[27]

Patrick Kidger, James Foster, Xuechen Li, and Terry J Lyons. 2021. Neural SDEs as Infinite-Dimensional GANs. In Proceedings of the 38th International Conference on Machine Learning (Proceedings of Machine Learning Research, Vol. 139), Marina Meila and Tong Zhang (Eds.). PMLR, 5453–5463. https://proceedings.mlr.press/v139/kidger21b.html

[28]

Diederik P. Kingma, Tim Salimans, Rafal Jozefowicz, Xi Chen, Ilya Sutskever, and Max Welling. 2016. Improved variational inference with inverse autoregressive flow. In Proceedings of the 30th International Conference on Neural Information Processing Systems (Barcelona, Spain) (NIPS’16). Curran Associates Inc., Red Hook, NY, USA, 4743–4751

[29]

Diederik P Kingma and Max Welling. 2022. Auto-Encoding Variational Bayes. arXiv:1312.6114 [stat.ML] https://arxiv.org/abs/1312.6114

[30]

Marcel Kollovieh, Abdul Fatir Ansari, Michael Bohlke-Schneider, Jasper Zschiegner, Hao Wang, and Yuyang Bernie Wang. 2023. Predict, refine, synthesize: Self-guiding diffusion models for probabilistic time series forecasting. Advances in Neural Information Processing Systems 36 (2023), 28341–28364

[31]

Daesoo Lee, Sara Malacarne, and Erlend Aune. 2023. Vector Quantized Time Series Generation with a Bidirectional Prior Model. In Proceedings of The 26th International Conference on Artificial Intelligence and Statistics (Proceedings of Machine Learning Research, Vol. 206), Francisco Ruiz, Jennifer Dy, and Jan-Willem van de Meent (Eds.). PMLR, 7665–7693. ht...

[32]

Xuechen Li, Ting-Kam Leonard Wong, Ricky T. Q. Chen, and David Duvenaud. 2020. Scalable Gradients for Stochastic Differential Equations. In Proceedings of the Twenty Third International Conference on Artificial Intelligence and Statistics (Proceedings of Machine Learning Research, Vol. 108), Silvia Chiappa and Roberto Calandra (Eds.). PMLR, 3870–3882. https://proceedings.mlr.press/v108/li20i.html

[34]

Yuxin Li, Wenchao Chen, Xinyue Hu, Bo Chen, Baolin Sun, and Mingyuan Zhou. 2024. Transformer-modulated diffusion models for probabilistic multivariate time series forecasting. In The Twelfth International Conference on Learning Representations

[36]

Yong Liu, Haixu Wu, Jianmin Wang, and Mingsheng Long. 2022. Non-stationary transformers: Exploring the stationarity in time series forecasting. Advances in neural information processing systems 35 (2022), 9881–9893

[37]

Calvin Luo. 2022. Understanding Diffusion Models: A Unified Perspective. arXiv:2208.11970 [cs.LG] https://arxiv.org/abs/2208.11970

[38]

Ilan Naiman, Nimrod Berman, Itai Pemper, Idan Arbiv, Gal Fadlon, and Omri Azencot. 2024. Utilizing image transforms and diffusion models for generative modeling of short and long time series. In Proceedings of the 38th International Conference on Neural Information Processing Systems (Vancouver, BC, Canada) (NIPS ’24). Curran Associates Inc., Red Hook, NY...

[39]

Ilan Naiman, N. Benjamin Erichson, Pu Ren, Michael W. Mahoney, and Omri Azencot. 2024. Generative Modeling of Regular and Irregular Time Series Data via Koopman VAEs. In The Twelfth International Conference on Learning Representations. https://openreview.net/forum?id=eY7sLb0dVF

[40]

Alexander Nikitin, Letizia Iannucci, and Samuel Kaski. 2024. TSGM: a flexible framework for generative modeling of synthetic time series. Advances in Neural Information Processing Systems 37 (2024), 129042–129061

[41]

George Papamakarios, Theo Pavlakou, and Iain Murray. 2017. Masked autoregressive flow for density estimation. In Proceedings of the 31st International Conference on Neural Information Processing Systems (Long Beach, California, USA) (NIPS’17). Curran Associates Inc., Red Hook, NY, USA, 2335–2344

[42]

William Peebles and Saining Xie. 2023. Scalable Diffusion Models with Transformers. In 2023 IEEE/CVF International Conference on Computer Vision (ICCV). 4172–4182. doi:10.1109/ICCV51070.2023.00387

[43]

Zhaozhi Qian, Bogdan-Constantin Cebere, and Mihaela van der Schaar. 2023. Synthcity: facilitating innovative use cases of synthetic data in different data modalities. arXiv:2301.07573 [cs.LG] https://arxiv.org/abs/2301.07573

[44]

Kashif Rasul, Calvin Seward, Ingmar Schuster, and Roland Vollgraf. 2021. Autoregressive Denoising Diffusion Models for Multivariate Probabilistic Time Series Forecasting. In Proceedings of the 38th International Conference on Machine Learning (Proceedings of Machine Learning Research, Vol. 139), Marina Meila and Tong Zhang (Eds.). PMLR, 8857–8868. https...

[45]

Ali Razavi, Aäron van den Oord, and Oriol Vinyals. 2019. Generating diverse high-fidelity images with VQ-VAE-2. Curran Associates Inc., Red Hook, NY, USA

[46]

Yulia Rubanova, Ricky T. Q. Chen, and David Duvenaud. 2019. Latent ODEs for irregularly-sampled time series. Curran Associates Inc., Red Hook, NY, USA

[47]

Ali Seyfi, Jean-Francois Rajotte, and Raymond Ng. 2022. Generating multivariate time series with COmmon Source CoordInated GAN (COSCI-GAN). Advances in neural information processing systems 35 (2022), 32777–32788

[48]

Yusuke Tashiro, Jiaming Song, Yang Song, and Stefano Ermon. 2021. CSDI: Conditional score-based diffusion models for probabilistic time series imputation. Advances in Neural Information Processing Systems 34 (2021), 24804–24816
