arxiv: 2605.05359 · v1 · submitted 2026-05-06 · 📊 stat.ME

Bayesian inference of sparsity in stable vector autoregressive processes

Pith reviewed 2026-05-08 16:24 UTC · model grok-4.3

classification 📊 stat.ME
keywords Bayesian inference · vector autoregression · sparsity · stationarity · spike-and-slab prior · graphical models · time series · Granger causality

The pith

Bayesian priors enforce stationarity and sparsity in vector autoregressive processes using parameter expansion.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper develops a Bayesian prior for vector autoregressive models that enforces both stationarity and sparsity in the autoregressive coefficients. This is achieved through parameter expansion to constrain the prior support to the stationary region while using a spike-and-slab mechanism for sparsity. Such a prior addresses the need to learn directed relationships from high-dimensional time series data in fields like neuroscience and macroeconomics, where assuming stability can be beneficial. Inference proceeds via a Metropolis-within-Gibbs sampler incorporating the No-U-Turn Sampler and reversible-jump steps. A mixture of G-Wishart distributions is used for the sparse prior on the error precision matrix.

Core claim

Through parameter expansion, a spike-and-slab prior is constructed for the autoregressive coefficients with support constrained exactly to the stationary region, allowing simultaneous enforcement of stationarity and sparsity in graphical vector autoregressive processes.

What carries the argument

The parameter-expanded spike-and-slab prior for autoregressive coefficients that restricts the prior support to the stationary region.
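
To make concrete what "spike-and-slab with support on the stationary region" means, here is a minimal sketch for a VAR(1). It is not the paper's parameter-expanded construction: it swaps in naive rejection sampling, which simply discards draws whose spectral radius is 1 or more. The function name `draw_spike_and_slab_stable`, the inclusion probability, and the slab standard deviation are all hypothetical choices made for illustration.

```python
import numpy as np

def draw_spike_and_slab_stable(m, incl_prob=0.3, slab_sd=0.5, rng=None, max_tries=10_000):
    """Draw an m x m VAR(1) coefficient matrix from a spike-and-slab prior
    truncated to the stationary region by rejection (illustrative only)."""
    rng = np.random.default_rng() if rng is None else rng
    for _ in range(max_tries):
        # Inclusion indicators: True means the coefficient comes from the slab.
        gamma = rng.random((m, m)) < incl_prob
        # Slab: Gaussian; spike: exact zero.
        phi = np.where(gamma, rng.normal(0.0, slab_sd, size=(m, m)), 0.0)
        # For a VAR(1) the companion matrix is phi itself.
        if np.max(np.abs(np.linalg.eigvals(phi))) < 1.0:
            return phi, gamma
    raise RuntimeError("no stationary draw found; rejection becomes inefficient as m grows")

phi, gamma = draw_spike_and_slab_stable(3, rng=np.random.default_rng(0))
```

Rejection of this kind wastes more and more draws as the dimension grows, which is one reason a direct construction of the constrained prior is attractive.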

If this is right

  • The approach enables learning of Granger non-causal relationships under the constraint of process stability.
  • Inference and prediction improve in applications to macroeconomic and neuroscience data.
  • The method scales to moderate-to-high dimensions via specialized MCMC techniques.
  • Sparsity in the error precision matrix is handled jointly through G-Wishart mixtures.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • This construction could be adapted to other constrained parameter spaces in multivariate time series models.
  • Applications might extend to identifying stable causal structures in financial or biological networks.
  • Further work could test the method's performance in very high dimensions where the stationary region geometry becomes more complex.

Load-bearing premise

The parameter expansion produces a prior whose support exactly matches the stationary region without introducing bias or making the Metropolis-within-Gibbs sampler computationally intractable.

What would settle it

Run the sampler on simulated data from a known sparse stable VAR process and check whether the posterior support remains within the stationary region and accurately recovers the zero patterns in the coefficients.
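
The check described above could be set up along the following lines. This is a sketch under stated assumptions: a VAR(1) with a hand-picked sparse, stable coefficient matrix, and two hypothetical helper functions (`fraction_stationary`, `misclassification_rate`) that would be applied to the output of whatever sampler is being tested. The sampler itself is not shown.

```python
import numpy as np

def simulate_var1(phi, sigma, n_obs, rng, burn_in=200):
    """Simulate y_t = phi @ y_{t-1} + e_t with e_t ~ N(0, sigma)."""
    m = phi.shape[0]
    chol = np.linalg.cholesky(sigma)
    y = np.zeros(m)
    out = np.empty((n_obs, m))
    for t in range(burn_in + n_obs):
        y = phi @ y + chol @ rng.standard_normal(m)
        if t >= burn_in:  # discard the burn-in so the start value is forgotten
            out[t - burn_in] = y
    return out

def fraction_stationary(phi_draws):
    """Share of posterior draws of phi whose spectral radius is below 1."""
    radii = np.array([np.max(np.abs(np.linalg.eigvals(p))) for p in phi_draws])
    return float(np.mean(radii < 1.0))

def misclassification_rate(true_nonzero, inclusion_prob, threshold=0.5):
    """Share of coefficients whose thresholded inclusion probability is wrong."""
    estimated = np.asarray(inclusion_prob) > threshold
    return float(np.mean(estimated != np.asarray(true_nonzero, dtype=bool)))

# Lower-triangular, so the eigenvalues are the diagonal entries (all inside the unit circle).
phi_true = np.array([[0.5, 0.0, 0.0],
                     [0.3, 0.4, 0.0],
                     [0.0, 0.0, -0.6]])
y = simulate_var1(phi_true, np.eye(3), n_obs=500, rng=np.random.default_rng(1))
```

A sampler that respects the constraint should give `fraction_stationary` equal to 1 on its draws, and a low `misclassification_rate` against `phi_true != 0` would indicate that the zero pattern is being recovered.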

Figures

Figures reproduced from arXiv: 2605.05359 by Darren J. Wilkinson, Ian H. Jermyn, Sarah E. Heaps, Yujiang Wang.

Figure 1: Boxplots summarising misclassification rates for the indicators (a) …
Figure 2: For each forecast horizon h and each prior: h-step ahead CRPS and logarithmic score for variable k (CRPSk, LogSk) and h-step ahead energy score (ES). Variables k = 1, 2, 3 correspond to GDP251, CPIAUCSL and FYFF, respectively. The priors are represented through (i) ▲ (ii) ■ (iii) • (iv) ⊞ (v) ⊕.
  … (consumer price index) to FYFF, with posterior probability 0.9993, and from FYFF to CPIAUCSL, with posterior pro…
Figure 3: Mixed graph G whose edges have posterior probability greater than 0.5 for the U.S. macroeconomic data. The graph shows directed and undirected edges. The full variable names for the vertex labels are provided in Supplementary Table S7.
  … lack of convergence. A plot showing the locations of the brain regions for individual C is provided in Supplementary Figure S7; there are six regions in the left hem…
Figure 4: For data in the beta band for individual C, posterior probabilities of the off-diagonal …
Figure 5: For data in the beta band for individual C, boxplots summarising posterior proba…
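
The Figure 2 caption refers to the CRPS and the energy score. As a reminder of what these measure, the sketch below gives the standard sample-based estimator of the energy score, which reduces to the CRPS for a single variable. The function name `energy_score` is hypothetical, and this is not claimed to be the estimator or software the authors used.

```python
import numpy as np

def energy_score(samples, obs):
    """Sample-based energy score: E||X - y|| - 0.5 * E||X - X'||.

    samples: (n,) or (n, d) array of draws from the predictive distribution.
    obs:     scalar or (d,) array holding the realised value.
    With d = 1 this is the CRPS of the empirical predictive distribution.
    Lower scores indicate better forecasts.
    """
    x = np.asarray(samples, dtype=float)
    if x.ndim == 1:
        x = x[:, None]
    y = np.atleast_1d(np.asarray(obs, dtype=float))
    # Average distance between predictive draws and the observation.
    term1 = np.linalg.norm(x - y, axis=1).mean()
    # Average distance between all pairs of draws (including identical pairs).
    diffs = x[:, None, :] - x[None, :, :]
    term2 = np.linalg.norm(diffs, axis=2).mean()
    return float(term1 - 0.5 * term2)
```

The pairwise term costs O(n²) memory in the number of draws, so for long MCMC output one would thin the draws before scoring.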
Abstract

Advances in sensing technology have made it possible to collect large volumes of high-dimensional time-series data. In fields like genetics and neuroscience, key questions concern whether directed relationships between variables can be learned from these data. To this end, graphical vector autoregressions are a popular tool because zeros among the autoregressive coefficients and error precision matrix have natural interpretations in terms of Granger non-causality and contemporaneous conditional independence. In applications where system dynamics are subject to functional or structural constraints, assuming the process is stable can be advantageous. However, enforcing stability demands restricting the autoregressive coefficients to lie in a constrained space with a complex geometry called the stationary region. The resulting inferential challenges are compounded when sparsity is also a requirement. Working in the Bayesian paradigm, we tackle the problem of developing a prior that simultaneously enforces stationarity and sparsity through parameter expansion, constructing a spike-and-slab prior with support constrained to the stationary region. A mixture of G-Wishart distributions provides a sparse prior for the error precision matrix. Computational inference is carried out using Metropolis-within-Gibbs, exploiting the No-U-Turn Sampler and reversible-jump steps. We demonstrate the inferential and predictive benefits of our approach through simulations and applications in macroeconomics and neuroscience.
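
The abstract's link between zeros in the error precision matrix and contemporaneous conditional independence can be illustrated numerically. The sketch below uses a hand-picked 3 × 3 precision matrix (an assumption for illustration, not data from the paper) and the standard identity that the partial correlation between errors i and j is −ω_ij / √(ω_ii ω_jj).

```python
import numpy as np

def partial_correlations(omega):
    """Partial correlations implied by a precision matrix omega."""
    d = np.sqrt(np.diag(omega))
    pcor = -omega / np.outer(d, d)
    np.fill_diagonal(pcor, 1.0)
    return pcor

# Tridiagonal precision matrix: no direct link between components 0 and 2.
omega = np.array([[ 2.0, -1.0,  0.0],
                  [-1.0,  2.0, -1.0],
                  [ 0.0, -1.0,  2.0]])

pcor = partial_correlations(omega)
sigma = np.linalg.inv(omega)  # the implied covariance matrix is dense
```

Here `pcor[0, 2]` is zero while `sigma[0, 2]` is not: components 0 and 2 are marginally correlated but conditionally independent given component 1, which is exactly the structure a sparse prior on the precision matrix is designed to find.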

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The manuscript presents a Bayesian method for performing inference in sparse stable vector autoregressive (VAR) models. It constructs a spike-and-slab prior for the autoregressive coefficients with support restricted to the stationary region using parameter expansion, uses a mixture of G-Wishart priors for the error precision matrix to promote sparsity, and employs a Metropolis-within-Gibbs MCMC algorithm utilizing the No-U-Turn Sampler and reversible-jump moves for posterior sampling. The approach is validated through simulation studies and applied to real datasets in macroeconomics and neuroscience, claiming benefits in inferential accuracy and predictive performance.

Significance. Should the proposed parameter expansion yield a prior whose support precisely coincides with the stationary region without introducing bias, this contribution would be significant for the field of Bayesian time series analysis. It would enable joint modeling of sparsity (interpretable as Granger non-causality) and stability in high-dimensional settings, which is valuable in applications such as genetics, neuroscience, and economics. The reliance on well-established components like G-Wishart priors and NUTS sampling enhances the potential for the method to be adopted and extended. The simulations and applications provide a starting point for assessing practical utility, though the overall impact depends on confirming the correctness of the constrained prior construction.

major comments (2)
  1. Abstract: the central claim is that parameter expansion constructs a spike-and-slab prior 'with support constrained to the stationary region'. The stationary region is the non-convex set where all eigenvalues of the companion matrix lie inside the unit circle. The manuscript must provide an explicit derivation showing that the auxiliary-variable construction induces precisely this marginal prior on the autoregressive coefficients (including any necessary Jacobian adjustment) rather than an approximation or biased truncation. This is load-bearing for the methodological contribution and the assertion that the prior 'enforces stationarity'.
  2. §4 (Computational Inference): the Metropolis-within-Gibbs scheme combines NUTS and reversible-jump steps to sample the constrained posterior. The paper should report acceptance rates, effective sample sizes, and mixing diagnostics from the simulation experiments to demonstrate that the sampler remains tractable for moderate-to-high dimensions; otherwise the claimed computational feasibility and practical benefits cannot be verified.
minor comments (2)
  1. The abstract refers to 'simulations and applications' but omits the specific dimensions (p, T) of the VAR processes and the number of variables in the macroeconomic and neuroscience examples; adding these details would clarify the scale at which the method operates.
  2. Notation for the companion matrix, its eigenvalues, and the precise definition of the stationary region should be introduced in a dedicated preliminary section before the prior construction is presented.
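
The definition of the stationary region used in major comment 1 can be written down directly. The following sketch builds the companion matrix of a VAR(p) and checks whether all its eigenvalues lie strictly inside the unit circle; the function names are hypothetical. It also shows why the region is non-convex: the midpoint of two stable coefficient matrices need not be stable.

```python
import numpy as np

def companion_matrix(phis):
    """Companion matrix of a VAR(p) with coefficient matrices phis = [phi_1, ..., phi_p]."""
    p = len(phis)
    m = phis[0].shape[0]
    top = np.hstack(phis)  # first block row: [phi_1 ... phi_p]
    if p == 1:
        return top
    # Remaining block rows shift the lagged state down by one position.
    bottom = np.hstack([np.eye(m * (p - 1)), np.zeros((m * (p - 1), m))])
    return np.vstack([top, bottom])

def is_stationary(phis):
    """True if every eigenvalue of the companion matrix has modulus below 1."""
    eigenvalues = np.linalg.eigvals(companion_matrix(phis))
    return bool(np.max(np.abs(eigenvalues)) < 1.0)

# Two nilpotent (hence stable) VAR(1) matrices whose midpoint has eigenvalues +2 and -2.
a = np.array([[0.0, 4.0], [0.0, 0.0]])
b = np.array([[0.0, 0.0], [4.0, 0.0]])
midpoint_is_stable = is_stationary([0.5 * (a + b)])
```

Because `a` and `b` both pass the check while their midpoint fails it, a prior cannot be restricted to this region by simple box or convex constraints on the coefficients.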

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their detailed and constructive comments on our manuscript. We have carefully considered each point and provide our responses below. We will make revisions to address the concerns raised.

Point-by-point responses
  1. Referee: Abstract: the central claim is that parameter expansion constructs a spike-and-slab prior 'with support constrained to the stationary region'. The stationary region is the non-convex set where all eigenvalues of the companion matrix lie inside the unit circle. The manuscript must provide an explicit derivation showing that the auxiliary-variable construction induces precisely this marginal prior on the autoregressive coefficients (including any necessary Jacobian adjustment) rather than an approximation or biased truncation. This is load-bearing for the methodological contribution and the assertion that the prior 'enforces stationarity'.

    Authors: We agree that an explicit derivation is essential to substantiate the claim that the parameter-expanded prior has support precisely on the stationary region. In the revised manuscript, we will add a detailed derivation in the methodology section, including the Jacobian adjustment for the transformation induced by the auxiliary variables, to demonstrate that the marginal prior on the autoregressive coefficients is exactly supported on the stationary region without approximation or bias.
    revision: yes

  2. Referee: §4 (Computational Inference): the Metropolis-within-Gibbs scheme combines NUTS and reversible-jump steps to sample the constrained posterior. The paper should report acceptance rates, effective sample sizes, and mixing diagnostics from the simulation experiments to demonstrate that the sampler remains tractable for moderate-to-high dimensions; otherwise the claimed computational feasibility and practical benefits cannot be verified.

    Authors: We acknowledge the importance of providing quantitative evidence of the sampler's performance. In the revised manuscript, we will include tables or figures reporting acceptance rates, effective sample sizes (ESS), and other mixing diagnostics (such as R-hat statistics) from the simulation experiments across different dimensions to confirm the tractability of the Metropolis-within-Gibbs algorithm.
    revision: yes
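
One of the diagnostics named in this exchange, R-hat, is simple enough to sketch. The version below is the classic split R-hat (each chain is cut in half and between-half variance is compared with within-half variance), not the rank-normalised variant that some software reports. The function name `split_rhat` is hypothetical and the code is not taken from the paper.

```python
import numpy as np

def split_rhat(chains):
    """Classic split R-hat for one scalar parameter.

    chains: array of shape (n_chains, n_draws), post-warm-up draws.
    Values close to 1 are consistent with convergence; values well above 1
    indicate that the chains disagree with each other or drift over time.
    """
    chains = np.asarray(chains, dtype=float)
    half = chains.shape[1] // 2
    # Treat the first and second half of every chain as separate chains.
    split = np.concatenate([chains[:, :half], chains[:, half:2 * half]], axis=0)
    n = split.shape[1]
    between = n * split.mean(axis=1).var(ddof=1)  # between-chain variance B
    within = split.var(axis=1, ddof=1).mean()     # within-chain variance W
    var_plus = (n - 1) / n * within + between / n
    return float(np.sqrt(var_plus / within))

rng = np.random.default_rng(0)
good = rng.standard_normal((4, 1000))  # four well-mixed chains
bad = good.copy()
bad[0] += 5.0                          # one chain stuck in a different region
```

Applied per coefficient across dimensions, a table of such values (together with ESS and acceptance rates) is the kind of evidence the referee asks for.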

Circularity Check

0 steps flagged

No significant circularity in prior construction or inference

Full rationale

The paper constructs a spike-and-slab prior with stationary-region support via parameter expansion and pairs it with a G-Wishart mixture for the precision matrix. This is a direct modeling choice whose support and density are defined by the expansion itself rather than recovered from data or prior results. The subsequent Metropolis-within-Gibbs sampler (NUTS + reversible-jump) is a standard computational device applied to the newly defined prior; no equation shows a fitted parameter or self-cited uniqueness theorem being renamed as a prediction. The derivation therefore remains self-contained and does not reduce to its inputs by construction.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The central claim rests on the existence of a parameter-expansion mapping that exactly covers the stationary region and on standard interpretations of zeros in the autoregressive matrix and precision matrix.

axioms (2)
  • [domain assumption] The vector autoregressive process is assumed to be stable (stationary).
    Explicitly stated as advantageous when system dynamics are subject to functional or structural constraints.
  • [domain assumption] Zeros in the autoregressive coefficient matrix correspond to Granger non-causality.
    Standard graphical VAR interpretation invoked in the abstract.
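
The second axiom can be made concrete with a short sketch that reads directed edges off the zero pattern of the coefficient matrices. It assumes the convention y_t = Σ_k φ_k y_{t−k} + e_t, so that a non-zero φ_k[i, j] at any lag k gives an edge from variable j to variable i. The variable names are borrowed from the Figure 2 caption, but the coefficient values are invented for illustration and `granger_edges` is a hypothetical helper.

```python
import numpy as np

def granger_edges(phis, names, tol=0.0):
    """Directed edges (source, target) implied by non-zero VAR coefficients.

    phis:  list of (m x m) coefficient matrices, one per lag.
    names: list of m variable names.
    An edge j -> i is present if |phi_k[i, j]| > tol for at least one lag k.
    """
    nonzero = np.any([np.abs(phi) > tol for phi in phis], axis=0)
    m = len(names)
    return [(names[j], names[i])
            for i in range(m) for j in range(m)
            if i != j and nonzero[i, j]]

names = ["GDP251", "CPIAUCSL", "FYFF"]
phi_1 = np.array([[0.5, 0.0, 0.0],   # illustrative values only
                  [0.0, 0.4, 0.2],
                  [0.0, 0.3, 0.6]])
edges = granger_edges([phi_1], names)
```

In a Bayesian analysis the same idea is usually applied to posterior inclusion probabilities rather than to a single matrix, keeping the edges whose probability exceeds a threshold such as 0.5.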

pith-pipeline@v0.9.0 · 5528 in / 1395 out tokens · 22005 ms · 2026-05-08T16:24:35.136288+00:00 · methodology
