Multivariate Time Series Data Imputation via Distributionally Robust Regularization

Che-Yi Liao; Gian-Gabriel Garcia; Kamran Paynabar; Zheng Dong

arXiv: 2602.00844 · v2 · submitted 2026-01-31 · stat.ML · cs.LG · stat.AP

Pith reviewed 2026-05-16 08:39 UTC · model grok-4.3

keywords: multivariate time series · data imputation · distributionally robust optimization · Wasserstein distance · missing data · non-stationary data · deep learning

The pith

A Wasserstein-based robust objective reduces overfitting in multivariate time series imputation caused by non-stationarity and systematic missingness.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Standard imputation methods overfit to the biased observed data because non-stationary patterns and missing values create a mismatch with the true distribution. The paper introduces the Distributionally Robust Regularized Imputer Objective that adds a worst-case divergence penalty inside a Wasserstein ambiguity set. This forces the model to perform well even under the most adverse plausible shift in the data distribution. A tractable surrogate converts the problem to an adversarial search over trajectories, solved by alternating optimization that fits modern neural network backbones. Experiments across real datasets indicate the approach delivers steadier imputations and supports better forecasting on the completed series.

Core claim

The Distributionally Robust Regularized Imputer Objective jointly minimizes reconstruction error and the worst-case divergence between the imputer distribution and data distributions within a Wasserstein ambiguity set. A tractable upper-bound surrogate reduces the infinite-dimensional optimization over measures to an adversarial search over sample trajectories, solved by an alternating learning algorithm compatible with deep learning architectures.

What carries the argument

The DRIO objective, which augments standard reconstruction loss with a worst-case Wasserstein divergence term over an ambiguity set around the observed data.
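
Written out, the objective can be sketched as follows. The form is the one quoted in the theorem-link section of this page, reconstructed from flattened text, so the hats and the reading of each symbol are assumptions: $R_\theta$ is taken to be the reconstruction loss, $\widehat{P}_N$ the empirical measure of the observed data, $\widehat{P}_\theta$ the empirical distribution of the imputer's outputs, $S_{\varepsilon,\tau}$ the Sinkhorn divergence used as the surrogate, and $\alpha \in [0,1]$ a trade-off weight. The definition of the ball $B_\rho$ is the usual one and is likewise an assumption here.

```latex
\min_{\theta}\; \alpha\, R_\theta
  \;+\; (1-\alpha) \sup_{Q \in B_\rho(\widehat{P}_N)} S_{\varepsilon,\tau}\bigl(Q, \widehat{P}_\theta\bigr),
\qquad
B_\rho(\widehat{P}_N) = \bigl\{\, Q : W(Q, \widehat{P}_N) \le \rho \,\bigr\}
```

Setting $\alpha = 1$ recovers plain reconstruction training, while smaller $\alpha$ puts more weight on the worst-case term.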

If this is right

DRIO yields more stable imputation accuracy across varied missingness scenarios in real multivariate time series.
Completed series produced by DRIO support improved performance in downstream forecasting tasks.
The method integrates directly with existing deep learning time-series models without changing their architecture.
The surrogate bound converts the robust objective into a practical min-max training loop.
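
The min-max training loop named in the last point can be illustrated with a deliberately tiny sketch. It is not the paper's algorithm: the imputer is a single constant `theta`, the Sinkhorn divergence is swapped for a squared distance, and the hard Wasserstein ball is swapped for a quadratic penalty with a hypothetical multiplier `lam`. Only the alternating structure (inner ascent over adversarial samples, outer descent over the imputer) is meant to carry over.

```python
def train_minmax(xs, alpha=0.5, lam=3.0, outer_steps=200, inner_steps=20,
                 lr_theta=0.1, lr_z=0.1):
    """Toy alternating min-max loop (illustrative only).

    Outer objective: alpha * mean((x - theta)^2) + (1 - alpha) * mean((z - theta)^2)
    Inner objective per sample: (z - theta)^2 - lam * (z - x)^2, maximized over z.
    With lam > 1 the inner problem is concave, so gradient ascent converges.
    """
    theta = 0.0
    zs = list(xs)  # adversarial samples start at the observed samples
    n = len(xs)
    for _ in range(outer_steps):
        # Inner step: gradient ascent on each adversarial sample z_i.
        for _ in range(inner_steps):
            zs = [z + lr_z * (2.0 * (z - theta) - 2.0 * lam * (z - x))
                  for z, x in zip(zs, xs)]
        # Outer step: gradient descent on theta with the z_i held fixed.
        grad = (-2.0 * alpha * sum(x - theta for x in xs) / n
                - 2.0 * (1.0 - alpha) * sum(z - theta for z in zs) / n)
        theta -= lr_theta * grad
    return theta, zs


theta, zs = train_minmax([1.0, 2.0, 4.0, 5.0])
print(round(theta, 4), [round(z, 4) for z in zs])
```

In this toy the inner maximizer has the closed form `z = (lam * x - theta) / (lam - 1)`, and by symmetry the outer minimizer stays at the sample mean, which makes the loop easy to check by hand. A real backbone would replace `theta` with network weights and the two gradient formulas with automatic differentiation.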

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

The same worst-case regularization idea could extend to other sequential data tasks where partial observations induce distribution shift.
Synthetic benchmarks with explicitly constructed non-stationary shifts would isolate whether the Wasserstein set choice drives the gains.
Combining DRIO with uncertainty-aware forecasting models might produce end-to-end pipelines that remain reliable under heavy missingness.

Load-bearing premise

The chosen Wasserstein ambiguity set must correctly describe the distribution mismatch created by non-stationarity and missing values.
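
A small numeric sketch shows how this premise can fail. It is a scalar, one-dimensional simplification (the paper works with multivariate trajectories), and the trending series, the cutoff at 60, and the radius `rho` are all hypothetical. The function computes the 1-Wasserstein distance between two empirical distributions from the area between their CDFs.

```python
def wasserstein_1d(a, b):
    """W1 between two 1-D empirical distributions via the area between CDFs."""
    a, b = sorted(a), sorted(b)
    grid = sorted(set(a) | set(b))
    total, i, j = 0.0, 0, 0
    for left, right in zip(grid, grid[1:]):
        # Advance counters so i and j count the points <= left in each sample.
        while i < len(a) and a[i] <= left:
            i += 1
        while j < len(b) and b[j] <= left:
            j += 1
        # Both CDFs are constant on [left, right).
        total += abs(i / len(a) - j / len(b)) * (right - left)
    return total


# Hypothetical non-stationary series: a steady upward trend.
complete = [float(t) for t in range(100)]
# Systematic missingness: readings at or above 60 are never recorded.
observed = [v for v in complete if v < 60.0]

rho = 5.0  # hypothetical ambiguity-set radius
gap = wasserstein_1d(observed, complete)
inside_ball = gap <= rho
print(f"W1(observed, complete) = {gap:.2f}, inside ball: {inside_ball}")
```

Here the distance between the observed and complete distributions works out to 20, four times the assumed radius, so a ball of that size centered on the observed data would not contain the complete-data distribution.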

What would settle it

If, on a controlled dataset where the true complete distribution is known, standard point-wise or alignment-based imputers achieve lower error than DRIO under the same missingness patterns, the claimed advantage does not hold.

Figures

Figures reproduced from arXiv: 2602.00844 by Che-Yi Liao, Gian-Gabriel Garcia, Kamran Paynabar, Zheng Dong.

**Figure 2.** Downstream forecasting MSE using imputed data. Each box aggregates scenario-level …

**Figure 3.** Test-MSE gap between the deployable validation-MSE pick and the oracle test-MSE …

Original abstract

Multivariate time series imputation is often compromised by mismatch between the observed and true data distributions, a bias induced by the combined effects of time-series non-stationarity and systematic missingness. Standard methods that encourage point-wise reconstruction or direct distributional alignment may overfit these biased observations. We propose the Distributionally Robust Regularized Imputer Objective (DRIO), which jointly minimizes reconstruction error and the worst-case divergence between the imputer distribution and data distributions within a Wasserstein ambiguity set. We derive a tractable upper-bound surrogate that reduces infinite-dimensional optimization over measures to adversarial search over sample trajectories, and develop an alternating learning algorithm compatible with modern deep learning backbones. Comprehensive experiments on diverse real-world datasets show that DRIO consistently provides robust imputation and suggests improved downstream forecasting under various missingness scenarios.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note: private letter to a colleague

DRIO adds a Wasserstein-robust term to time-series imputation with a derived surrogate and alternating algorithm, but the guarantee depends on the ambiguity set actually containing the true distribution under systematic missingness.

The paper's main move is to frame multivariate time series imputation as a distributionally robust optimization problem. DRIO minimizes reconstruction error while also minimizing the worst-case divergence to any distribution inside a Wasserstein ball around the observed data. They derive a tractable upper-bound surrogate that converts the min-max over measures into an adversarial search over sample trajectories, then solve it with an alternating algorithm that works with standard deep learning models. This is a concrete step beyond plain reconstruction losses or direct alignment methods. The experiments on several real-world datasets report better imputation accuracy and downstream forecasting under different missingness patterns, which gives the work applied relevance. The formulation and algorithm are the clearest new pieces.

The soft spot sits in the robustness claim itself. The Wasserstein ball is centered on the empirical measure from the observed samples, yet the motivating mismatch comes from non-stationarity plus systematic missingness. If that missingness mechanism pushes the true distribution outside the ball, the min-max no longer delivers the advertised protection and the reported gains could be tied to the particular datasets and missingness patterns tested. The abstract gives no details on radius selection, bound tightness, or ablations that isolate the regularization strength, so those checks matter.

This paper is aimed at researchers and engineers who build imputation pipelines for incomplete, non-stationary multivariate series in domains like sensor networks or clinical data. A reader who wants a new regularization approach that stays compatible with neural networks will find usable material here. It deserves peer review because the problem is practical, the derivation reaches an implementable algorithm, and the experiments point to real-world utility, even though the central assumption about the ambiguity set needs closer examination.

Referee Report

2 major / 2 minor

Summary. The paper proposes the Distributionally Robust Regularized Imputer Objective (DRIO) for multivariate time series imputation to address distribution mismatch induced by non-stationarity and systematic missingness. It jointly minimizes reconstruction error and worst-case divergence within a Wasserstein ambiguity set, derives a tractable upper-bound surrogate that reduces the infinite-dimensional problem to adversarial search over sample trajectories, and develops an alternating optimization algorithm compatible with deep learning models. Experiments on diverse real-world datasets are reported to show consistent robustness in imputation and improved downstream forecasting under various missingness scenarios.

Significance. If the central claims hold, the work offers a principled DRO-based approach to robust imputation that could mitigate overfitting to biased observations in non-stationary time series, with the tractable surrogate and deep-learning compatibility as notable practical strengths. This could influence methods for handling incomplete data in forecasting pipelines, provided the ambiguity-set construction and bound quality are validated.

major comments (2)

[Derivation of surrogate] The derivation of the tractable upper-bound surrogate (as stated in the abstract) lacks explicit verification of bound tightness or empirical checks on approximation quality, which is load-bearing for the claimed robustness guarantees.
[Ambiguity set construction] The Wasserstein ambiguity set is constructed around the observed empirical measure without an explicit correction for selection bias induced by the missingness mechanism in non-stationary series; this risks the true data-generating distribution lying outside the ball, undermining the min-max guarantee.

minor comments (2)

[Experiments] Experimental results report gains without error bars, standard deviations, or multiple-run statistics, which would be needed to substantiate consistency claims.
[Experiments] No ablation on the ambiguity-set radius is presented, leaving sensitivity to this key hyperparameter unexamined.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive comments, which help clarify key aspects of our work. We address each major comment below and indicate the revisions planned for the next version of the manuscript.

Referee: [Derivation of surrogate] The derivation of the tractable upper-bound surrogate (as stated in the abstract) lacks explicit verification of bound tightness or empirical checks on approximation quality, which is load-bearing for the claimed robustness guarantees.

Authors: We agree that the manuscript would benefit from explicit verification of the surrogate bound. In the revision we will add a dedicated subsection deriving conditions under which the upper bound is tight (under Lipschitz continuity of the loss and bounded support assumptions) and include new synthetic-data experiments that compare the surrogate objective value against a Monte-Carlo estimate of the true min-max objective, thereby quantifying approximation error across different missingness rates. revision: yes
Referee: [Ambiguity set construction] The Wasserstein ambiguity set is constructed around the observed empirical measure without an explicit correction for selection bias induced by the missingness mechanism in non-stationary series; this risks the true data-generating distribution lying outside the ball, undermining the min-max guarantee.

Authors: The concern is well-taken. Our current construction follows the standard empirical-measure centering used in most DRO imputation literature, but it does not explicitly adjust for selection bias. In the revision we will (i) add a paragraph in Section 3.2 acknowledging this limitation and (ii) propose a simple re-weighting scheme based on an estimated missingness probability to recenter the empirical measure. We will also report additional experiments that vary the missingness mechanism (MCAR vs. MNAR) to illustrate the practical effect of this bias. revision: partial
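
The rebuttal does not spell out the re-weighting scheme. One common reading is inverse-probability weighting, sketched below under that assumption: each observed sample is weighted by the reciprocal of its probability of being observed, so under-sampled regions count for more in the recentered empirical measure. The observation probabilities are taken as known here, whereas in practice they would have to be estimated, and all numbers are hypothetical.

```python
def ipw_weights(obs_probs):
    """Normalized inverse-probability weights for the observed samples."""
    raw = [1.0 / p for p in obs_probs]
    total = sum(raw)
    return [w / total for w in raw]


# Hypothetical population: 10 low readings (1.0) and 10 high readings (5.0),
# so the complete-data mean is 3.0. Low readings are observed with
# probability 0.8 and high readings with probability 0.2, which leaves
# 8 low and 2 high readings in the observed sample.
values = [1.0] * 8 + [5.0] * 2
probs = [0.8] * 8 + [0.2] * 2

weights = ipw_weights(probs)
naive_mean = sum(values) / len(values)                   # uniform empirical measure
ipw_mean = sum(w * v for w, v in zip(weights, values))   # recentered measure
print(f"naive mean = {naive_mean:.2f}, reweighted mean = {ipw_mean:.2f}")
```

The uniform empirical measure is pulled toward the over-observed low readings, while the reweighted measure recovers the complete-data mean in this idealized case. When the probabilities are estimated poorly, or are close to zero for some regions, the weights become unstable, which is one reason the fix is described as partial.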

Circularity Check

0 steps flagged

No significant circularity; derivation is self-contained mathematical reduction

The paper defines DRIO as a joint min-max objective over reconstruction error and worst-case Wasserstein divergence, then derives a tractable upper-bound surrogate that converts the infinite-dimensional problem into adversarial search over sample trajectories. This reduction is presented as a direct consequence of the Wasserstein ball construction and duality arguments, without any fitted parameters being renamed as predictions or any load-bearing step collapsing to a self-citation. Experiments are reported separately on real datasets and do not feed back into the objective definition. No self-definitional loops, uniqueness theorems imported from prior author work, or ansatz smuggling are present in the derivation chain.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

The central claim rests on standard properties of the Wasserstein metric and the existence of a tractable upper bound; no new free parameters or invented entities are introduced in the abstract.

axioms (1)

[standard math] Wasserstein distance admits a tractable dual or adversarial representation that yields a finite-dimensional surrogate.
Invoked when reducing the infinite-dimensional DRO problem to adversarial search over trajectories.
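
For orientation, one standard duality of this kind from the Wasserstein DRO literature is sketched below. It is not taken from the paper, whose surrogate is built on a Sinkhorn divergence and may differ in form. The loss $f$, the transport cost $c$, the multiplier $\lambda$, and the observed samples $x_i$ are illustrative symbols, $W_c$ denotes the optimal transport cost induced by $c$, and the identity holds under mild regularity conditions on $f$ and $c$.

```latex
\sup_{Q:\, W_c(Q, \widehat{P}_N) \le \rho} \mathbb{E}_{Z \sim Q}\bigl[ f(Z) \bigr]
  \;=\;
\inf_{\lambda \ge 0} \Bigl\{ \lambda \rho
  + \frac{1}{N} \sum_{i=1}^{N} \sup_{z} \bigl[ f(z) - \lambda\, c(z, x_i) \bigr] \Bigr\}
```

The left side is a supremum over probability measures, while the right side only involves a scalar $\lambda$ and one adversarial point $z$ per observed sample, which is the kind of finite-dimensional structure the axiom refers to.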

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel
Relation between the paper passage and the cited Recognition theorem: unclear
Paper passage: DRIO objective $\min_\theta \alpha R_\theta + (1-\alpha) \sup_{Q \in B_\rho(\widehat{P}_N)} S_{\varepsilon,\tau}(Q, \widehat{P}_\theta)$, with Wasserstein ball and Sinkhorn surrogate

IndisputableMonolith/Foundation/AbsoluteFloorClosure.lean · reality_from_one_distinction
Relation between the paper passage and the cited Recognition theorem: unclear
Paper passage: Ambiguity set $B_\rho(\widehat{P}_N)$ and adversarial trajectories $Z$

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

56 extracted references · 56 canonical work pages

[1]

Multivariate short-term traffic flow forecasting using time-series analysis.IEEE transactions on intelligent transportation systems, 10(2):246–254, 2009

Bidisha Ghosh, Biswajit Basu, and Margaret O’Mahony. Multivariate short-term traffic flow forecasting using time-series analysis.IEEE transactions on intelligent transportation systems, 10(2):246–254, 2009

work page 2009
[2]

Springer Science & Business Media, 2012

Gebhard Kirchgässner, Jürgen Wolters, and Uwe Hassler.Introduction to modern time series analysis. Springer Science & Business Media, 2012

work page 2012
[3]

Princeton university press, 2020

James D Hamilton.Time series analysis. Princeton university press, 2020

work page 2020
[4]

Yang Yang, Che-Yi Liao, Esmaeil Keyvanshokooh, Hui Shao, Mary Beth Weber, Francisco J Pasquel, and Gian-Gabriel P Garcia. A responsible framework for assessing, selecting, and explaining machine learning models in cardiovascular disease outcomes among people with type 2 diabetes: Methodology and validation study.JMIR Medical Informatics, 13:e66200, 2025

work page 2025
[5]

Development and evaluation of cardiovascular disease risk prediction models for patients with type 2 diabetes.Scientific Reports, 2026

Yang Yang, Tian Liu, Che-Yi Liao, Sun Ju Lee, Esmaeil Keyvanshokooh, Hui Shao, Mary Beth Weber, Francisco J Pasquel, and Gian-Gabriel P Garcia. Development and evaluation of cardiovascular disease risk prediction models for patients with type 2 diabetes.Scientific Reports, 2026

work page 2026
[6]

Constraint-aware self- improving large language model for clinical role model generation.Available at SSRN 5642250, 2025

Che-Yi Liao, Esmaeil Keyvanshokooh, and Gian-Gabriel Garcia. Constraint-aware self- improving large language model for clinical role model generation.Available at SSRN 5642250, 2025

work page 2025
[7]

A spatiotemporal approach for traffic data imputation with complicated missing patterns.Transportation research part C: emerging technologies, 119:102730, 2020

Huiping Li, Meng Li, Xi Lin, Fang He, and Yinhai Wang. A spatiotemporal approach for traffic data imputation with complicated missing patterns.Transportation research part C: emerging technologies, 119:102730, 2020

work page 2020
[8]

Racial disparities in opioid overdose deaths in massachusetts.JAMA Network Open, 5(4):e229081, 2022

Che-Yi Liao, Gian-Gabriel P Garcia, Catherine DiGennaro, and Mohammad S Jalali. Racial disparities in opioid overdose deaths in massachusetts.JAMA Network Open, 5(4):e229081, 2022

work page 2022
[9]

A survey on missing data in machine learning.Journal of Big data, 8(1):140, 2021

Tlamelo Emmanuel, Thabiso Maupong, Dimane Mpoeleng, Thabo Semong, Banyatsang Mphago, and Oteng Tabona. A survey on missing data in machine learning.Journal of Big data, 8(1):140, 2021

work page 2021
[10]

Estimating hidden epidemic: A bayesian spatiotemporal compartmental modeling approach.INFORMS Journal on Data Science, 4(3):230–247, 2025

Che-Yi Liao, Peiliang Bai, Lance A Waller, and Kamran Paynabar. Estimating hidden epidemic: A bayesian spatiotemporal compartmental modeling approach.INFORMS Journal on Data Science, 4(3):230–247, 2025

work page 2025
[11]

Missing data in non-stationary multivariate time series from digital studies in psychiatry.arXiv preprint arXiv:2506.14946, 2025

Xiaoxuan Cai, Charlotte R Fowler, Li Zeng, Habiballah Rahimi Eichi, Dost Ongur, Lisa Dixon, Justin T Baker, Jukka-Pekka Onnela, and Linda Valeri. Missing data in non-stationary multivariate time series from digital studies in psychiatry.arXiv preprint arXiv:2506.14946, 2025

work page arXiv 2025
[12]

Springer Science & Business Media, 2012

Moamar Sayed-Mouchaweh and Edwin Lughofer.Learning in non-stationary environments: methods and applications. Springer Science & Business Media, 2012

work page 2012
[13]

Time series forecasting for nonlinear and non-stationary processes: a review and comparative study.Iie Transactions, 47(10):1053–1071, 2015

Changqing Cheng, Akkarapol Sa-Ngasoongsong, Omer Beyca, Trung Le, Hui Yang, Zhenyu Kong, and Satish TS Bukkapatnam. Time series forecasting for nonlinear and non-stationary processes: a review and comparative study.Iie Transactions, 47(10):1053–1071, 2015

work page 2015
[14]

Learning in nonstationary environments: A survey.IEEE Computational intelligence magazine, 10(4):12–25, 2015

Gregory Ditzler, Manuel Roveri, Cesare Alippi, and Robi Polikar. Learning in nonstationary environments: A survey.IEEE Computational intelligence magazine, 10(4):12–25, 2015

work page 2015
[15]

Che-Yi Liao, Zheng Dong, Gian-Gabriel P Garcia, Kamran Paynabar, Yao Xie, and Moham- mad S Jalali. Tides need stemmed: A locally operating spatiotemporal mutually exciting point process with dynamic network for improving opioid overdose death prediction.Manufacturing & Service Operations Management, 28(2):577–593, 2026

work page 2026
[16]

Ankit Dixit and Shikha Jain. Contemporary approaches to analyze non-stationary time-series: Some solutions and challenges.Recent Advances in Computer Science and Communications (Formerly: Recent Patents on Computer Science), 16(2):61–80, 2023. 10

work page 2023
[17]

On the sample complexity of graphical model selection from non-stationary samples.IEEE Transactions on Signal Processing, 68: 17–32, 2019

Nguyen Tran, Oleksii Abramenko, and Alexander Jung. On the sample complexity of graphical model selection from non-stationary samples.IEEE Transactions on Signal Processing, 68: 17–32, 2019

work page 2019
[18]

Balancing access, precision, and equity in adaptive test site allocation with an application to covid-19 in atlanta, georgia.Scientific Reports, 15(1):32775, 2025

Thomas W Hsiao, Che-Yi Liao, Lance A Waller, and Kamran Paynabar. Balancing access, precision, and equity in adaptive test site allocation with an application to covid-19 in atlanta, georgia.Scientific Reports, 15(1):32775, 2025

work page 2025
[19]

Trouble in the tails: Earnings non-response and response bias across the distribution

Christopher R Bollinger, Barry T Hirsch, Charles M Hokayem, and James P Ziliak. Trouble in the tails: Earnings non-response and response bias across the distribution. InAnnual Meeting of the Society of Labor Economists. http://citeseerx. ist. psu. edu/viewdoc/download, 2014

work page 2014
[20]

Missing data, part 2

Tra My Pham, Nikolaos Pandis, and Ian R White. Missing data, part 2. missing data mechanisms: Missing completely at random, missing at random, missing not at random, and why they matter. American journal of orthodontics and dentofacial orthopedics, 162(1):138–139, 2022

work page 2022
[21]

Missing data recovery methods on multivariate time series in iot: A comprehensive survey.IEEE Communications Surveys & Tutorials, 2025

Kai Zhang, Qinmin Yang, Chao Li, Xin Sun, and Jiming Chen. Missing data recovery methods on multivariate time series in iot: A comprehensive survey.IEEE Communications Surveys & Tutorials, 2025

work page 2025
[22]

Missing data imputation of high-resolution temporal climate time series data.Meteorological Applications, 27(1):e1873, 2020

Eben Afrifa-Yamoah, Ute A Mueller, Stephen M Taylor, and Aiden J Fisher. Missing data imputation of high-resolution temporal climate time series data.Meteorological Applications, 27(1):e1873, 2020

work page 2020
[23]

Brits: Bidirectional recurrent imputation for time series.Advances in neural information processing systems, 31, 2018

Wei Cao, Dong Wang, Jian Li, Hao Zhou, Lei Li, and Yitan Li. Brits: Bidirectional recurrent imputation for time series.Advances in neural information processing systems, 31, 2018

work page 2018
[24]

Bidi- rectional spatial–temporal traffic data imputation via graph attention recurrent neural network

Guojiang Shen, Wenfeng Zhou, Wenyi Zhang, Nali Liu, Zhi Liu, and Xiangjie Kong. Bidi- rectional spatial–temporal traffic data imputation via graph attention recurrent neural network. Neurocomputing, 531:151–162, 2023

work page 2023
[25]

Miwae: Deep generative modelling and imputation of incomplete data sets

Pierre-Alexandre Mattei and Jes Frellsen. Miwae: Deep generative modelling and imputation of incomplete data sets. InInternational conference on machine learning, pages 4413–4423. PMLR, 2019

work page 2019
[26]

not-miwae: Deep generative modelling with missing not at random data

Niels Bruun Ipsen, Pierre-Alexandre Mattei, and Jes Frellsen. not-miwae: Deep generative modelling with missing not at random data.arXiv preprint arXiv:2006.12871, 2020

work page arXiv 2006
[27]

Missing data imputation using optimal transport

Boris Muzellec, Julie Josse, Claire Boyer, and Marco Cuturi. Missing data imputation using optimal transport. InInternational Conference on Machine Learning, pages 7130–7140. PMLR, 2020

work page 2020
[28]

Glima: Global and local time series imputation with multi-directional attention learning

Qiuling Suo, Weida Zhong, Guangxu Xun, Jianhui Sun, Changyou Chen, and Aidong Zhang. Glima: Global and local time series imputation with multi-directional attention learning. In 2020 IEEE International Conference on Big Data (Big Data), pages 798–807. IEEE, 2020

work page 2020
[29]

Remian: real-time and error-tolerant missing value imputation.ACM Transactions on Knowledge Discovery from Data (TKDD), 14(6):1–38, 2020

Qian Ma, Yu Gu, Wang-Chien Lee, Ge Yu, Hongbo Liu, and Xindong Wu. Remian: real-time and error-tolerant missing value imputation.ACM Transactions on Knowledge Discovery from Data (TKDD), 14(6):1–38, 2020

work page 2020
[30]

Spatial-temporal traffic data imputation via graph attention convolutional network

Yongchao Ye, Shiyao Zhang, and James JQ Yu. Spatial-temporal traffic data imputation via graph attention convolutional network. InInternational Conference on artificial neural networks, pages 241–252. Springer, 2021

work page 2021
[31]

Learning to reconstruct missing data from spatiotemporal graphs with sparse observations.Advances in neural information processing systems, 35:32069–32082, 2022

Ivan Marisca, Andrea Cini, and Cesare Alippi. Learning to reconstruct missing data from spatiotemporal graphs with sparse observations.Advances in neural information processing systems, 35:32069–32082, 2022

work page 2022
[32]

Saits: Self-attention-based imputation for time series

Wenjie Du, David Côté, and Yan Liu. Saits: Self-attention-based imputation for time series. Expert Systems with Applications, 219:119619, 2023

work page 2023
[33]

Imputeformer: Low rankness- induced transformers for generalizable spatiotemporal imputation

Tong Nie, Guoyang Qin, Wei Ma, Yuewen Mei, and Jian Sun. Imputeformer: Low rankness- induced transformers for generalizable spatiotemporal imputation. InProceedings of the 30th ACM SIGKDD conference on knowledge discovery and data mining, pages 2260–2271, 2024. 11

work page 2024
[34]

Gain: Missing data imputation using generative adversarial nets

Jinsung Yoon, James Jordon, and Mihaela Schaar. Gain: Missing data imputation using generative adversarial nets. InInternational conference on machine learning, pages 5689–5698. PMLR, 2018

work page 2018
[35]

Variational auto-encoders based on the shift correction for imputation of specific missing in multivariate time series.Measurement, 186:110055, 2021

Jie Li, Weijie Ren, and Min Han. Variational auto-encoders based on the shift correction for imputation of specific missing in multivariate time series.Measurement, 186:110055, 2021. doi: 10.1016/j.measurement.2021.110055. URL https://www.sciencedirect.com/scie nce/article/pii/S0263224121009805

work page doi:10.1016/j.measurement.2021.110055 2021
[36]

Csdi: Conditional score-based diffusion models for probabilistic time series imputation.Advances in neural information processing systems, 34:24804–24816, 2021

Yusuke Tashiro, Jiaming Song, Yang Song, and Stefano Ermon. Csdi: Conditional score-based diffusion models for probabilistic time series imputation.Advances in neural information processing systems, 34:24804–24816, 2021

work page 2021
[37]

Diffusion-based time series imputa- tion and forecasting with structured state space models,

Juan Miguel Lopez Alcaraz and Nils Strodthoff. Diffusion-based time series imputation and forecasting with structured state space models.arXiv preprint arXiv:2208.09399, 2022

work page arXiv 2022
[38]

Diffimp: Efficient diffusion model for probabilistic time series imputation with bidirectional mamba backbone.arXiv preprint arXiv:2410.13338, 2024

Hongfan Gao, Wangmeng Shen, Xiangfei Qiu, Ronghui Xu, Jilin Hu, and Bin Yang. Diffimp: Efficient diffusion model for probabilistic time series imputation with bidirectional mamba backbone.arXiv preprint arXiv:2410.13338, 2024

work page arXiv 2024
[39]

Shuo-Chieh Huang, Tengyuan Liang, and Ruey S. Tsay. Temporal wasserstein imputation: A versatile method for time series imputation, 2025. URL https://arxiv.org/abs/2411.0 2811

work page 2025
[40]

Optimal transport for time series imputation

Hao Wang, Haoxuan Li, Xu Chen, Mingming Gong, Zhichao Chen, et al. Optimal transport for time series imputation. InThe Thirteenth International Conference on Learning Representations, 2025

work page 2025
[41]

Sinkhorn divergences for unbalanced optimal transport.arXiv preprint arXiv:1910.12958, 2019

Thibault Séjourné, Jean Feydy, François-Xavier Vialard, Alain Trouvé, and Gabriel Peyré. Sinkhorn divergences for unbalanced optimal transport.arXiv preprint arXiv:1910.12958, 2019

work page arXiv 1910
[42]

Generative modeling through the semi- dual formulation of unbalanced optimal transport.Advances in Neural Information Processing Systems, 36:42433–42455, 2023

Jaemoo Choi, Jaewoong Choi, and Myungjoo Kang. Generative modeling through the semi- dual formulation of unbalanced optimal transport.Advances in Neural Information Processing Systems, 36:42433–42455, 2023

work page 2023
[43]

Springer, 2008

Cédric Villani et al.Optimal transport: old and new, volume 338. Springer, 2008

work page 2008
[44]

Sinkhorn distributionally robust optimization.Operations Research, 2025

Jie Wang, Rui Gao, and Yao Xie. Sinkhorn distributionally robust optimization.Operations Research, 2025

work page 2025
[45]

Miscellaneous notes on optimization theory and related topics.Report, Cal- tech.[0915], 2015

Kim C Border. Miscellaneous notes on optimization theory and related topics.Report, Cal- tech.[0915], 2015

work page 2015
[46]

Cnnpred: Cnn-based stock market prediction using a diverse set of variables.Expert Systems with Applications, 129:273–285, 2019

Ehsan Hoseinzade and Saman Haratizadeh. Cnnpred: Cnn-based stock market prediction using a diverse set of variables.Expert Systems with Applications, 129:273–285, 2019

work page 2019
[47]

Attention based spatial- temporal graph convolutional networks for traffic flow forecasting.Proceedings of the AAAI Conference on Artificial Intelligence, 33(01):922–929, Jul

Shengnan Guo, Youfang Lin, Ning Feng, Chao Song, and Huaiyu Wan. Attention based spatial- temporal graph convolutional networks for traffic flow forecasting.Proceedings of the AAAI Conference on Artificial Intelligence, 33(01):922–929, Jul. 2019. doi: 10.1609/aaai.v33i01.33 01922. URLhttps://ojs.aaai.org/index.php/AAAI/article/view/3881

work page doi:10.1609/aaai.v33i01.33 2019
[48]

Assessing beijing’s pm2

Xuan Liang, Tao Zou, Bin Guo, Shuo Li, Haozhe Zhang, Shuyi Zhang, Hui Huang, and Song Xi Chen. Assessing beijing’s pm2. 5 pollution: severity, weather impact, apec and winter heating. Proceedings of the Royal Society A: Mathematical, Physical and Engineering Sciences, 471 (2182):20150257, 2015

work page 2015
[49]

Bioinspired early detection through gas flow modulation in chemo-sensory systems.Sensors and Actuators B: Chemical, 206:538–547, 2015

Andrey Ziyatdinov, Jordi Fonollosa, Luis Fernández, Agustín Gutierrez-Gálvez, Santiago Marco, and Alexandre Perera. Bioinspired early detection through gas flow modulation in chemo-sensory systems.Sensors and Actuators B: Chemical, 206:538–547, 2015

work page 2015
[50]

Anguita, Alessandro Ghio, L

D. Anguita, Alessandro Ghio, L. Oneto, Xavier Parra, and Jorge Luis Reyes-Ortiz. A public domain dataset for human activity recognition using smartphones. InThe European Symposium on Artificial Neural Networks, 2013. URL https://api.semanticscholar.org/Corpus ID:6975432. 12

work page 2013
[51]

A kernel two-sample test.The journal of machine learning research, 13(1):723–773, 2012

Arthur Gretton, Karsten M Borgwardt, Malte J Rasch, Bernhard Schölkopf, and Alexander Smola. A kernel two-sample test.The journal of machine learning research, 13(1):723–773, 2012

work page 2012
[52]

Generative moment matching networks

Yujia Li, Kevin Swersky, and Rich Zemel. Generative moment matching networks. InInterna- tional conference on machine learning, pages 1718–1727. PMLR, 2015

work page 2015
[53]

Danica J Sutherland, Hsiao-Yu Tung, Heiko Strathmann, Soumyajit De, Aaditya Ramdas, Alex Smola, and Arthur Gretton. Generative models and model criticism via optimized maximum mean discrepancy. arXiv preprint arXiv:1611.04488, 2016

[54]

Elsa Cazelles, Arnaud Robert, and Felipe Tobar. The wasserstein-fourier distance for stationary time series. IEEE Transactions on Signal Processing, 69:709–721, 2020

[55]

Aman Sinha, Hongseok Namkoong, Riccardo Volpi, and John Duchi. Certifying some distributional robustness with principled adversarial training. arXiv preprint arXiv:1710.10571, 2017.

A Loss Function Construction

We provide a comprehensive discussion on the Sinkhorn divergence used in our formulation (2). Let Z = R^{D×T} denote the feature-temporal space o...

[56]

Balanced

and structure all datasets as three-dimensional tensors of shape (N, T, D) representing samples, time steps, and features, respectively. Note that the exchange of temporal and feature dimensions does not affect our theory and algorithm as one just needs to swap the indices during computation. CNNpred [46]. UCI stock market data combining 5 US indices (S&P ...

work page 2010
