Parametric Prior Mapping Framework for Non-stationary Probabilistic Time Series Forecasting

Jinglin Li; Jun Tan; Ning Gui; Qi Fang

arxiv: 2605.23402 · v1 · pith:7SXZJN2Gnew · submitted 2026-05-22 · 💻 cs.LG · cs.AI

Pith reviewed 2026-05-25 05:07 UTC · model grok-4.3

classification 💻 cs.LG cs.AI

keywords parametric prior mapping · non-stationary time series · probabilistic forecasting · generative models · hybrid models · multivariate time series · adaptive priors

The pith

PPM uses a parametric estimator to derive a dynamic prior mapped into a generative model for non-stationary time series forecasts.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper establishes the Parametric Prior Mapping framework to address the trade-off between expressiveness and robustness when modeling non-stationary dynamics in probabilistic multivariate time series forecasting. It derives a dynamic adaptive prior from a parametric estimator and injects this prior into a generative model through a learnable mapping. The hybrid design aims to preserve the efficiency and inductive biases of parametric methods while gaining the flexibility of generative approaches. Training occurs via a hybrid objective that produces precise point forecasts along with well-calibrated uncertainty. Empirical comparisons indicate the method outperforms existing baselines on non-stationary data while maintaining a favorable accuracy-computation balance.

Core claim

PPM injects parametric structural priors into a generative modeling process. Specifically, PPM utilizes a parametric estimator to derive a dynamic, adaptive prior that guides the learning of a complex predictive distribution via a learnable mapping. This design allows the model to retain the efficiency of parametric methods while exploiting the expressive power of generative models. Trained with a hybrid objective, PPM yields precise forecasts with well-calibrated uncertainty estimates and outperforms existing baselines in handling non-stationary data.

What carries the argument

The Parametric Prior Mapping (PPM) framework, which derives a dynamic adaptive prior from a parametric estimator and injects it into a generative model through a learnable mapping.
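The mechanism can be written out in standard push-forward notation. The block below is a sketch for orientation, not a derivation taken from the paper: the relation $q_\phi = (g_\phi)_{\#} p_\theta$ is quoted from the paper passage shown further down this page, while the diagonal Gaussian form of the prior and the symbols $\mu_\theta$, $\sigma_\theta$, $\varepsilon$ are assumptions made here for concreteness.

```latex
\begin{aligned}
% Prior produced by the parametric estimator (Gaussian form assumed here)
z &\sim p_\theta(z \mid x)
   = \mathcal{N}\bigl(\mu_\theta(x),\ \operatorname{diag}(\sigma_\theta^2(x))\bigr), \\
% Reparameterized draw, so gradients can reach \theta
z &= \mu_\theta(x) + \sigma_\theta(x) \odot \varepsilon,
   \qquad \varepsilon \sim \mathcal{N}(0, I), \\
% Learnable mapping applied to the prior sample
y &= g_\phi(z), \\
% Push-forward: the predictive law is the image of the prior under g_\phi
q_\phi(A \mid x) &= \bigl((g_\phi)_{\#}\, p_\theta\bigr)(A \mid x)
   = p_\theta\bigl(g_\phi^{-1}(A) \mid x\bigr)
   \quad \text{for every measurable set } A .
\end{aligned}
```

Under this reading, all input dependence enters through the prior parameters, and $g_\phi$ only reshapes the prior into a more flexible predictive distribution.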

If this is right

Forecasts on non-stationary multivariate time series achieve higher accuracy than pure parametric or pure generative baselines.
The resulting predictive distributions carry well-calibrated uncertainty estimates.
Computational cost remains closer to parametric methods than to full generative training.
The hybrid objective enables retention of parametric inductive biases without sacrificing generative flexibility.
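One simple way to probe the calibration claim in the list above is to measure how often the realized values fall inside a central prediction interval built from forecast samples. The sketch below is illustrative only: the function name `empirical_coverage` and the nearest-rank quantile rule are assumptions made here, not the paper's evaluation protocol.

```python
def empirical_coverage(sample_sets, targets, level=0.9):
    """Fraction of targets inside the central `level` interval of each sample set.

    sample_sets: one list of forecast samples per target.
    Quantiles use a crude nearest-rank rule (an assumption of this sketch).
    """
    tail = (1.0 - level) / 2.0
    hits = 0
    for samples, y in zip(sample_sets, targets):
        s = sorted(samples)
        lo = s[round(tail * (len(s) - 1))]          # lower quantile
        hi = s[round((1.0 - tail) * (len(s) - 1))]  # upper quantile
        hits += lo <= y <= hi
    return hits / len(targets)
```

For a well-calibrated forecaster the returned fraction should sit close to `level` when computed over many targets; values far below it suggest intervals that are too narrow.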

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

The same mapping mechanism could be tested on other sequential tasks that exhibit distribution shifts, such as streaming sensor data.
Different families of parametric estimators might be swapped in without altering the generative backbone, allowing domain-specific prior choices.
If the mapping learns to translate priors effectively, the approach may reduce the data volume needed for reliable generative training on drifting series.

Load-bearing premise

A parametric estimator can reliably produce a useful dynamic prior that, when mapped into a generative model, simultaneously retains parametric efficiency and delivers superior performance on non-stationary data.

What would settle it

On standard non-stationary multivariate time series benchmarks, PPM shows no gain in forecast accuracy or uncertainty calibration compared with strong baselines while using comparable computation.

Figures

Figures reproduced from arXiv: 2605.23402 by Jinglin Li, Jun Tan, Ning Gui, Qi Fang.

**Figure 2.** PPM is trained in three stages: (1) encode the historical context to estimate the prior's parameters and resample a sample-based prior; (2) push the prior forward to obtain the predictive output distribution; (3) use KDE to estimate the conditional predictive density, minimizing the averaged NLL (MLE) over the horizon with an auxiliary averaged MSE term.
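The three stages in the Figure 2 caption can be sketched in a few lines of plain Python. This is a toy, scalar-per-step illustration under stated assumptions, not the authors' implementation: the Gaussian prior, the tanh-residual stand-in for the learnable mapping (`push_forward`, with hypothetical parameters `w` and `b`), and the way `alpha` weights the MSE term are all choices made here. The names `k`, `alpha`, and `h` mirror the sampling count K, weight coefficient α, and bandwidth h analysed in Figures 7 to 9.

```python
import math
import random

def sample_prior(mu, sigma, k, rng):
    """Stage 1: draw k reparameterized samples z = mu + sigma * eps."""
    return [mu + sigma * rng.gauss(0.0, 1.0) for _ in range(k)]

def push_forward(z, w, b):
    """Stage 2: hypothetical stand-in for the learnable mapping g_phi."""
    return z + w * math.tanh(z) + b

def kde_log_density(y, samples, h):
    """Stage 3: log of a Gaussian-kernel density estimate at y (log-sum-exp)."""
    logs = [-0.5 * ((y - s) / h) ** 2 for s in samples]
    m = max(logs)
    lse = m + math.log(sum(math.exp(v - m) for v in logs))
    return lse - math.log(len(samples) * h * math.sqrt(2.0 * math.pi))

def hybrid_loss(y_true, mus, sigmas, w, b, k=64, h=0.3, alpha=0.5, seed=0):
    """Averaged NLL over the horizon plus alpha times the averaged MSE
    of the sample mean (the combination rule is an assumption)."""
    rng = random.Random(seed)
    nll, mse = 0.0, 0.0
    for y, mu, sigma in zip(y_true, mus, sigmas):
        ys = [push_forward(z, w, b) for z in sample_prior(mu, sigma, k, rng)]
        nll -= kde_log_density(y, ys, h)   # negative log-likelihood at the target
        mse += (sum(ys) / k - y) ** 2      # point-forecast error of the sample mean
    n = len(y_true)
    return nll / n + alpha * mse / n
```

In a real model `mus` and `sigmas` would come from an encoder over the history window and the loss would be differentiated with respect to both the estimator and the mapping; here they are passed in directly to keep the sketch self-contained.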

**Figure 3.** Probabilistic prediction interval comparisons with NsDiff on the ETTm1 and Traffic datasets.

**Figure 5.** Mutual-information lower bound between the input x and the prior latent variable z on the test sets of ETTh1, ETTm1, and Traffic (batch size = 256).
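A widely used mutual-information lower bound is the InfoNCE estimator from contrastive predictive coding (entry [49] in the reference list). Whether the paper uses exactly this estimator is an assumption; the sketch below only shows how such a bound is computed from a matrix of critic scores, with `infonce_lower_bound` and the score-matrix layout being illustrative choices.

```python
import math

def infonce_lower_bound(scores):
    """InfoNCE estimate of a lower bound on I(x; z).

    scores[i][j] is a critic score f(x_i, z_j). Diagonal entries are the true
    (x_i, z_i) pairs; off-diagonal entries in each row act as negatives.
    """
    n = len(scores)
    total = 0.0
    for i, row in enumerate(scores):
        m = max(row)
        log_norm = m + math.log(sum(math.exp(s - m) for s in row))
        total += row[i] - log_norm  # log-softmax of the positive pair
    return math.log(n) + total / n
```

The estimate can never exceed the logarithm of the batch size, so with a batch of 256 it saturates at ln 256 ≈ 5.55 nats; values near that ceiling should be read as "at least this much" rather than as the true mutual information.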

**Figure 6.** Inference time comparison across models on the Traffic dataset, with history window H = 96 and future length L = 192. The number of samples is set to 100.

**Figure 7.** Analysis for sampling count K.

**Figure 8.** Analysis for weight coefficient α.

**Figure 9.** Analysis for bandwidth h.

**Figure 10.** Visualization of the ETT datasets.

**Figure 11.** Visualization of the Weather and Traffic datasets.

Original abstract

Effectively modeling non-stationary dynamics in probabilistic multivariate time series (MTS) forecasting requires balancing expressiveness with robustness. Existing parametric approaches benefit from strong inductive biases but lack flexibility, whereas deep generative models struggle to capture complex temporal dependencies without extensive data and computation. We introduce Parametric Prior Mapping (PPM), a framework that injects parametric structural priors into a generative modeling process. Specifically, PPM utilizes a parametric estimator to derive a dynamic, adaptive prior that guides the learning of a complex predictive distribution via a learnable mapping. This design allows the model to retain the efficiency of parametric methods while exploiting the expressive power of generative models. Trained with a hybrid objective, PPM yields precise forecasts with well-calibrated uncertainty estimates. Empirical results show that PPM outperforms existing baselines in handling non-stationary data, offering a superior trade-off between accuracy and computational efficiency. The code is available at https://github.com/ljl8336/PPM.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

PPM is a reasonable hybrid parametric-generative idea for non-stationary MTS forecasting, but the abstract gives almost no concrete mechanism or evidence to judge whether the mapping actually delivers the claimed gains.

The paper introduces Parametric Prior Mapping as a way to feed a dynamic prior from a parametric estimator into a generative model via a learnable mapping, trained with a hybrid objective. The goal is to keep parametric efficiency while gaining the flexibility needed for non-stationary multivariate time series. That framing is straightforward and addresses a real tension in the field. Releasing the code is also a concrete positive step that lets others test the claims directly. The central design choice—using the parametric part to guide rather than replace the generative part—could be useful if the mapping stays lightweight and the prior actually adapts without extra data cost. The main limitation is that the abstract supplies no equations, no description of the estimator or mapping function, and no dataset or metric details. Without those, it is impossible to tell whether the approach reduces to existing hybrid tricks or introduces something that holds up under scrutiny. The reported outperformance and calibration are stated but cannot be evaluated from what is here. This work is mainly for people already working on probabilistic time-series models who want to try a parametric-generative blend. It is coherent enough on its own terms to merit peer review so the details and experiments can be checked properly.

Referee Report

1 major / 0 minor

Summary. The manuscript introduces Parametric Prior Mapping (PPM), a framework for non-stationary probabilistic multivariate time series forecasting. PPM employs a parametric estimator to produce a dynamic adaptive prior that is injected into a generative model through a learnable mapping; the model is trained with a hybrid objective. The authors claim that this yields precise forecasts with well-calibrated uncertainty estimates, outperforms existing baselines on non-stationary data, and provides a superior accuracy-efficiency trade-off. Code is released at https://github.com/ljl8336/PPM.

Significance. If the empirical claims hold under rigorous evaluation, PPM would represent a practical compromise between the inductive biases of parametric methods and the flexibility of deep generative models for handling non-stationarity in MTS forecasting. The public code release is a clear strength that supports reproducibility.

major comments (1)

Abstract: The abstract asserts empirical outperformance and well-calibrated uncertainty but supplies no equations, metrics, baselines, dataset descriptions, or ablation results; therefore the data and derivations cannot be checked against the stated claims.

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for their review. We address the single major comment below.

Referee: Abstract: The abstract asserts empirical outperformance and well-calibrated uncertainty but supplies no equations, metrics, baselines, dataset descriptions, or ablation results; therefore the data and derivations cannot be checked against the stated claims.

Authors: We agree that the abstract contains no equations, metrics, baselines, dataset descriptions, or ablation results. This is by design, as abstracts are required to be concise high-level summaries (typically under 200 words). All supporting details (including the hybrid objective, the evaluation metrics CRPS and NLL, baselines, non-stationary MTS datasets, and ablation studies) are provided in the Experiments section and supplementary material. The abstract's claims are therefore directly verifiable against the quantitative results reported in the body of the manuscript.

Revision: no
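For readers who want to check the CRPS figures mentioned in the response against the released code, the score can be estimated directly from forecast samples with the standard identity CRPS = E|X − y| − ½·E|X − X′|. The function below is a generic sketch of that estimator; the name `crps_from_samples` is hypothetical and the paper's own evaluation code may differ, for example in normalization or aggregation across series.

```python
def crps_from_samples(samples, y):
    """Sample-based CRPS estimate: E|X - y| - 0.5 * E|X - X'| (lower is better)."""
    n = len(samples)
    # Mean absolute error of the samples against the observation.
    term1 = sum(abs(s - y) for s in samples) / n
    # Mean absolute difference between all ordered pairs of samples (spread).
    term2 = sum(abs(a - b) for a in samples for b in samples) / (n * n)
    return term1 - 0.5 * term2
```

When every sample equals the same point forecast, the spread term vanishes and the score reduces to the absolute error, which is why CRPS is often described as a probabilistic generalization of MAE.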

Circularity Check

0 steps flagged

No significant circularity detected

The abstract and visible description present PPM as an empirical framework combining a parametric estimator, learnable mapping, and hybrid objective for non-stationary MTS forecasting. No equations, derivations, first-principles predictions, or load-bearing self-citations are stated that could reduce to fitted inputs or self-definitional constructs by construction. Claims of outperformance are presented as empirical results supported by released code, with no internal reduction of the central mechanism to its own training parameters. The derivation chain is therefore self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract-only input supplies no explicit free parameters, axioms, or invented entities; the framework description implies an unspecified parametric estimator and learnable mapping whose internal details are not visible.

pith-pipeline@v0.9.0 · 5687 in / 1222 out tokens · 23899 ms · 2026-05-25T05:07:27.618843+00:00 · methodology

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel · unclear

Paper passage: "PPM utilizes a parametric estimator to derive a dynamic, adaptive prior that guides the learning of a complex predictive distribution via a learnable mapping... q_ϕ(y|x) = (g_ϕ)_# p_θ(z|x)"

IndisputableMonolith/Foundation/RealityFromDistinction.lean · reality_from_one_distinction · unclear

Paper passage: "We employ... reparameterization trick... push the resampled prior through a learnable non-linear mapping"

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

60 extracted references · 60 canonical work pages · 8 internal anchors

[1] Probabilistic weather forecasting with machine learning. Nature, 2025.
[2] Recent advances in deep learning for traffic probabilistic prediction. Transport Reviews, 2024.
[3] Probabilistic forecasts of stock prices and earnings: The hazards of nascent expertise. Organizational Behavior and Human Decision Processes, 1991.
[4] Stochastic multiple choice learning for training diverse deep ensembles. Advances in Neural Information Processing Systems.
[5] Annealed Multiple Choice Learning: Overcoming limitations of Winner-takes-all with annealing. Advances in Neural Information Processing Systems.
[6] Winner-takes-all learners are geometry-aware conditional density estimators. arXiv preprint arXiv:2406.04706.
[7] Are transformers effective for time series forecasting? Proceedings of the AAAI Conference on Artificial Intelligence.
[8] SparseTSF: Modeling long-term time series forecasting with 1k parameters. arXiv preprint arXiv:2405.00946.
[9] Reversible instance normalization for accurate time-series forecasting against distribution shift. International Conference on Learning Representations.
[10] Auto-Encoding Variational Bayes. arXiv preprint arXiv:1312.6114.
[11] CARD: Classification and regression diffusion models. Advances in Neural Information Processing Systems.
[12] Scoring rules for continuous probability distributions. Management Science, 1976.
[13] DeepAR: Probabilistic forecasting with autoregressive recurrent networks. International Journal of Forecasting, 2020.
[14] Better batch for deep probabilistic time series forecasting. International Conference on Artificial Intelligence and Statistics, 2024.
[15] Flow matching with Gaussian process priors for probabilistic time series forecasting. arXiv preprint arXiv:2410.03024, 2024.
[16] Transformer-modulated diffusion models for probabilistic multivariate time series forecasting. The Twelfth International Conference on Learning Representations.
[17] Non-autoregressive conditional diffusion models for time series prediction. International Conference on Machine Learning, 2023.
[18] Autoregressive denoising diffusion models for multivariate probabilistic time series forecasting. International Conference on Machine Learning, 2021.
[19] Diffusion-TS: Interpretable diffusion for general time series generation. arXiv preprint arXiv:2403.01742.
[20] Generative time series forecasting with diffusion, denoise, and disentanglement. Advances in Neural Information Processing Systems.
[21] Non-stationary Diffusion For Probabilistic Time Series Forecasting. arXiv preprint arXiv:2505.04278.
[22] Winner-takes-all for Multivariate Probabilistic Time Series Forecasting. arXiv preprint arXiv:2506.05515.
[23] Multiple choice learning: Learning to produce multiple structured outputs. Advances in Neural Information Processing Systems.
[24] Versatile multiple choice learning and its application to vision computing. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[25] Confident multiple choice learning. International Conference on Machine Learning, 2017.
[26] Learning in an uncertain world: Representing ambiguity through multiple hypotheses. Proceedings of the IEEE International Conference on Computer Vision.
[27] Probabilistic forecasting. Annual Review of Statistics and Its Application, 2014.
[28] Unveil conditional diffusion models with classifier-free guidance: A sharp statistical theory. arXiv preprint arXiv:2403.11968.
[29] Diffusion models are minimax optimal distribution estimators. International Conference on Machine Learning, 2023.
[30] Autoformer: Decomposition transformers with auto-correlation for long-term series forecasting. Advances in Neural Information Processing Systems.
[31] Informer: Beyond efficient transformer for long sequence time-series forecasting. Proceedings of the AAAI Conference on Artificial Intelligence.
[32] A Time Series is Worth 64 Words: Long-term Forecasting with Transformers. arXiv preprint arXiv:2211.14730.
[33] Non-stationary transformers: Exploring the stationarity in time series forecasting. Advances in Neural Information Processing Systems.
[34] iTransformer: Inverted Transformers Are Effective for Time Series Forecasting. arXiv preprint arXiv:2310.06625.
[35] Density estimation for statistics and data analysis. 1987.
[36] Kernel smoothing. 1994.
[37] A Multi-Horizon Quantile Recurrent Forecaster. arXiv preprint arXiv:1711.11053.
[38] Conformalized quantile regression. Advances in Neural Information Processing Systems.
[39] GluonTS: Probabilistic Time Series Models in Python. arXiv preprint arXiv:1906.05264.
[40] ANT: Adaptive noise schedule for time series diffusion models. Advances in Neural Information Processing Systems.
[41] A universal approximation theorem of deep neural networks for expressing probability distributions. Advances in Neural Information Processing Systems.
[42] Multivariate probabilistic time series forecasting with correlated errors. Advances in Neural Information Processing Systems.
[43] Deep state space models for time series forecasting. Advances in Neural Information Processing Systems.
[44] Deep factors for forecasting. International Conference on Machine Learning, 2019.
[45] Temporal fusion transformers for interpretable multi-horizon time series forecasting. International Journal of Forecasting, 2021.
[46] Any-quantile probabilistic forecasting of short-term electricity demand. arXiv preprint arXiv:2404.17451.
[47] From Entropy to Epiplexity: Rethinking Information for Computationally Bounded Intelligence. arXiv preprint arXiv:2601.03220.
[48] A theory of usable information under computational constraints. arXiv preprint arXiv:2002.10689.
[49] Representation Learning with Contrastive Predictive Coding. arXiv preprint arXiv:1807.03748.
[50] Optimal transport: old and new. 2008.
[51] Computational optimal transport. 2019.
[52] Flow Matching for Generative Modeling. The Eleventh International Conference on Learning Representations.
[53] Improving and generalizing flow-based generative models with minibatch optimal transport. Transactions on Machine Learning Research.
[54] De Bortoli, Valentin; Thornton, James; Heng, Jeremy; Doucet, Arnaud. Diffusion schr…
[55] Normalizing flows for probabilistic modeling and inference. Journal of Machine Learning Research.
[56] Denoising diffusion probabilistic models. Advances in Neural Information Processing Systems.
[57] Flow Matching for Generative Modeling. arXiv preprint arXiv:2210.02747.
[58] Frequency adaptive normalization for non-stationary time series forecasting. Advances in Neural Information Processing Systems.
[59] Diffusion-based decoupled deterministic and uncertain framework for probabilistic multivariate time series forecasting. The Thirteenth International Conference on Learning Representations.
[60] MPFT: Multi-perspective Frequency Learning for Non-Stationary Time Series Forecasting. International Conference on Neural Information Processing, 2025.


work page

[31] [31]

Proceedings of the AAAI conference on artificial intelligence , volume=

Informer: Beyond efficient transformer for long sequence time-series forecasting , author=. Proceedings of the AAAI conference on artificial intelligence , volume=

work page

[32] [32]

A Time Series is Worth 64 Words: Long-term Forecasting with Transformers

A time series is worth 64 words: Long-term forecasting with transformers , author=. arXiv preprint arXiv:2211.14730 , year=

work page internal anchor Pith review Pith/arXiv arXiv

[33] [33]

Advances in neural information processing systems , volume=

Non-stationary transformers: Exploring the stationarity in time series forecasting , author=. Advances in neural information processing systems , volume=

work page

[34] [34]

iTransformer: Inverted Transformers Are Effective for Time Series Forecasting

itransformer: Inverted transformers are effective for time series forecasting , author=. arXiv preprint arXiv:2310.06625 , year=

work page internal anchor Pith review Pith/arXiv arXiv

[35] [35]

1987 , publisher=

Density estimation for statistics and data analysis , author=. 1987 , publisher=

work page 1987

[36] [36]

1994 , publisher=

Kernel smoothing , author=. 1994 , publisher=

work page 1994

[37] [37]

A Multi-Horizon Quantile Recurrent Forecaster

A multi-horizon quantile recurrent forecaster , author=. arXiv preprint arXiv:1711.11053 , year=

work page internal anchor Pith review Pith/arXiv arXiv

[38] [38]

Advances in neural information processing systems , volume=

Conformalized quantile regression , author=. Advances in neural information processing systems , volume=

work page

[39] [39]

GluonTS: Probabilistic Time Series Models in Python

Gluonts: Probabilistic time series models in python , author=. arXiv preprint arXiv:1906.05264 , year=

work page internal anchor Pith review Pith/arXiv arXiv 1906

[40] [40]

Advances in Neural Information Processing Systems , volume=

Ant: Adaptive noise schedule for time series diffusion models , author=. Advances in Neural Information Processing Systems , volume=

work page

[41] [41]

Advances in neural information processing systems , volume=

A universal approximation theorem of deep neural networks for expressing probability distributions , author=. Advances in neural information processing systems , volume=

work page

[42] [42]

Advances in Neural Information Processing Systems , volume=

Multivariate probabilistic time series forecasting with correlated errors , author=. Advances in Neural Information Processing Systems , volume=

work page

[43] [43]

Advances in neural information processing systems , volume=

Deep state space models for time series forecasting , author=. Advances in neural information processing systems , volume=

work page

[44] [44]

International conference on machine learning , pages=

Deep factors for forecasting , author=. International conference on machine learning , pages=. 2019 , organization=

work page 2019

[45] [45]

International journal of forecasting , volume=

Temporal fusion transformers for interpretable multi-horizon time series forecasting , author=. International journal of forecasting , volume=. 2021 , publisher=

work page 2021

[46] [46]

arXiv preprint arXiv:2404.17451 , year=

Any-quantile probabilistic forecasting of short-term electricity demand , author=. arXiv preprint arXiv:2404.17451 , year=

work page arXiv

[47] [47]

arXiv preprint arXiv:2601.03220 , year=

From Entropy to Epiplexity: Rethinking Information for Computationally Bounded Intelligence , author=. arXiv preprint arXiv:2601.03220 , year=

work page arXiv

[48] [48]

arXiv preprint arXiv:2002.10689 , year=

A theory of usable information under computational constraints , author=. arXiv preprint arXiv:2002.10689 , year=

work page arXiv 2002

[49] [49]

Representation Learning with Contrastive Predictive Coding

Representation learning with contrastive predictive coding , author=. arXiv preprint arXiv:1807.03748 , year=

work page internal anchor Pith review Pith/arXiv arXiv

[50] [50]

2008 , publisher=

Optimal transport: old and new , author=. 2008 , publisher=

work page 2008

[51] [51]

2019 , publisher=

Computational optimal transport , author=. 2019 , publisher=

work page 2019

[52] [52]

The Eleventh International Conference on Learning Representations , year=

Flow Matching for Generative Modeling , author=. The Eleventh International Conference on Learning Representations , year=

work page

[53] [53]

Transactions on Machine Learning Research , year=

Improving and generalizing flow-based generative models with minibatch optimal transport , author=. Transactions on Machine Learning Research , year=

work page

[54] [54]

Diffusion schr

De Bortoli, Valentin and Thornton, James and Heng, Jeremy and Doucet, Arnaud , journal=. Diffusion schr

work page

[55] [55]

Journal of Machine Learning Research , volume=

Normalizing flows for probabilistic modeling and inference , author=. Journal of Machine Learning Research , volume=

work page

[56] [56]

Advances in neural information processing systems , volume=

Denoising diffusion probabilistic models , author=. Advances in neural information processing systems , volume=

work page

[57] [57]

Flow Matching for Generative Modeling

Flow matching for generative modeling , author=. arXiv preprint arXiv:2210.02747 , year=

work page internal anchor Pith review Pith/arXiv arXiv

[58] [58]

Advances in Neural Information Processing Systems , volume=

Frequency adaptive normalization for non-stationary time series forecasting , author=. Advances in Neural Information Processing Systems , volume=

work page

[59] [59]

The Thirteenth International Conference on Learning Representations , year=

Diffusion-based decoupled deterministic and uncertain framework for probabilistic multivariate time series forecasting , author=. The Thirteenth International Conference on Learning Representations , year=

work page

[60] [60]

International Conference on Neural Information Processing , pages=

MPFT: Multi-perspective Frequency Learning for Non-Stationary Time Series Forecasting , author=. International Conference on Neural Information Processing , pages=. 2025 , organization=

work page 2025