A multivariate Birnbaum-Saunders autoregressive moving average model with application to air pollution concentration data
Pith reviewed 2026-05-08 17:21 UTC · model grok-4.3
The pith
The MBSARMA model combines the multivariate Birnbaum-Saunders distribution with ARMA dynamics on the conditional location parameter to jointly model correlated positive asymmetric time series such as PM2.5 concentrations.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The proposed MBSARMA model combines the multivariate log-linear BS framework with dynamic autoregressive moving average components on the conditional location parameter of each response and shows good performance in Monte Carlo simulations and real PM2.5 data.
What carries the argument
Multivariate Birnbaum-Saunders distribution with ARMA dynamics applied to the conditional location parameters, enabling joint modeling of temporal dependence and cross-response correlations in positive asymmetric series.
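For orientation, one way to write such a structure is sketched below. It uses the standard log-linear (sinh-normal) representation of the BS distribution together with a generic ARMA recursion on the location. The symbols $\alpha_j$, $\boldsymbol{\beta}_j$, $\phi_{ji}$, $\theta_{jk}$, $\boldsymbol{\Gamma}$ and the exact form of the recursion are assumptions made for illustration, since this page does not reproduce the paper's equations and the authors' parametrization may differ.

```latex
\begin{align*}
Y_{jt} &= \log T_{jt} = \mu_{jt} + 2\,\operatorname{arcsinh}\!\left(\tfrac{\alpha_j}{2}\, Z_{jt}\right),
\qquad (Z_{1t},\dots,Z_{mt})^\top \sim \mathrm{N}_m(\mathbf{0}, \boldsymbol{\Gamma}),\\
\mu_{jt} &= \mathbf{x}_t^\top \boldsymbol{\beta}_j
  + \sum_{i=1}^{p} \phi_{ji}\left(y_{j,t-i} - \mathbf{x}_{t-i}^\top \boldsymbol{\beta}_j\right)
  + \sum_{k=1}^{q} \theta_{jk}\, r_{j,t-k},
\qquad r_{jt} = y_{jt} - \mu_{jt}.
\end{align*}
```

In this sketch, $T_{jt}$ is the positive observation for response $j$ at time $t$, cross-response dependence enters only through the correlation matrix $\boldsymbol{\Gamma}$, and temporal dependence enters only through the recursion for $\mu_{jt}$.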
If this is right
- Joint forecasting of pollution levels across monitoring stations becomes possible while preserving the skewed marginal distributions.
- Exogenous terms can be included to account for external factors such as weather variables in the multivariate setting.
- The EM estimation procedure maintains accuracy for moderate sample sizes and varying correlation strengths according to the simulation results.
- The model supports environmental applications by handling the positive asymmetric nature of concentration data without transformation.
Where Pith is reading between the lines
- The same structure could be applied to other positive skewed multivariate series in fields such as reliability engineering or insurance claim modeling.
- Adding nonlinear or higher-order dependence might further improve fit for series with complex seasonal patterns.
- Distribution-specific multivariate time series models may reduce misspecification errors compared with standard approaches that assume normality after transformation.
Load-bearing premise
The observed series follow the multivariate Birnbaum-Saunders distribution with the specified ARMA structure on the location parameters, and the EM algorithm recovers the parameters reliably under the correlation structures in the data.
What would settle it
Generate synthetic multivariate series from a different distribution such as multivariate lognormal with ARMA dependence and fit the MBSARMA model to check whether parameter estimates show large bias or whether out-of-sample predictions degrade sharply.
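The data-generating side of this check could be sketched as follows. The example simulates a bivariate lognormal series with AR(1) dependence on the log scale, which is a special case of the ARMA dependence mentioned above. The function name `simulate_lognormal_var1` and all parameter values are hypothetical, and fitting the MBSARMA model to the output is not shown because no implementation is available on this page.

```python
import math
import random


def simulate_lognormal_var1(n, phi, rho, sigma=0.5, seed=0):
    """Simulate n observations of a bivariate positive series T_t = exp(Y_t).

    Each log-series Y_j follows a Gaussian AR(1) with coefficient phi[j];
    the two innovations have correlation rho. All values are illustrative
    assumptions, not settings taken from the paper.
    """
    rng = random.Random(seed)
    y = [0.0, 0.0]
    series = []
    for _ in range(n):
        # Correlated standard normal innovations via a 2x2 Cholesky factor.
        z1 = rng.gauss(0.0, 1.0)
        z2 = rho * z1 + math.sqrt(1.0 - rho ** 2) * rng.gauss(0.0, 1.0)
        y = [phi[0] * y[0] + sigma * z1, phi[1] * y[1] + sigma * z2]
        series.append((math.exp(y[0]), math.exp(y[1])))
    return series


# Example: 200 time points with strong cross-correlation between the two series.
data = simulate_lognormal_var1(200, phi=(0.5, 0.3), rho=0.8)
```

Repeating this over many seeds and comparing the fitted ARMA coefficients with `phi` would give the bias estimate the check calls for.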
Original abstract
Fine particulate matter (PM$_{2.5}$) concentration data are positive, right-skewed series that arise naturally in environmental monitoring and are well described by the Birnbaum-Saunders (BS) distribution. In this paper, we propose a multivariate BS autoregressive moving average (MBSARMA) model with exogenous terms for the joint analysis of correlated positive asymmetric time series. The proposed model combines the multivariate log-linear BS framework with dynamic autoregressive moving average components on the conditional location parameter of each response. We estimate the model parameters by means of the Expectation-Maximisation (EM) algorithm. The performance of the proposed conditional likelihood estimators is evaluated by means of a Monte Carlo simulation study under several correlation levels and sample sizes. An application to weekly PM$_{2.5}$ pollution concentration data recorded at three monitoring stations in Santiago, Chile, obtained from the National Air Quality Information System of Chile (SINCA), is presented. The results show the good performance of the proposed methodology.
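To make the distributional building block concrete, here is a minimal univariate sketch. It draws BS variates through the well-known normal representation and simulates a positive series whose log-scale location follows an AR(1) recursion. The function names `rbs` and `simulate_log_bs_ar1` and the parameter values are hypothetical. The AR(1) location is an assumed simplification of the ARMA structure described in the abstract, not the authors' specification.

```python
import math
import random
import statistics


def rbs(alpha, beta, rng):
    """Draw one Birnbaum-Saunders(alpha, beta) variate.

    Uses T = beta * (w + sqrt(w^2 + 1))^2 with w = alpha * Z / 2, Z ~ N(0, 1).
    """
    w = alpha * rng.gauss(0.0, 1.0) / 2.0
    return beta * (w + math.sqrt(w * w + 1.0)) ** 2


def simulate_log_bs_ar1(n, alpha, mu0, phi, seed=0):
    """Simulate a positive series T_t = exp(Y_t) with

    Y_t = mu_t + 2 * asinh(alpha * Z_t / 2),  mu_t = mu0 + phi * (Y_{t-1} - mu0).
    The AR(1) location is an illustrative assumption.
    """
    rng = random.Random(seed)
    y_prev = mu0
    out = []
    for _ in range(n):
        mu_t = mu0 + phi * (y_prev - mu0)
        y_t = mu_t + 2.0 * math.asinh(alpha * rng.gauss(0.0, 1.0) / 2.0)
        out.append(math.exp(y_t))
        y_prev = y_t
    return out


# The scale parameter beta is the median of the BS distribution,
# so the sample median of many draws should be close to beta.
rng = random.Random(1)
draws = [rbs(0.5, 2.0, rng) for _ in range(20000)]
sample_median = statistics.median(draws)

# A positive, right-skewed series with temporal dependence.
path = simulate_log_bs_ar1(100, alpha=0.5, mu0=math.log(20.0), phi=0.5)
```

Working on the log scale keeps the simulated series positive without any truncation, which is the property the abstract relies on for concentration data.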
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper proposes a multivariate Birnbaum-Saunders autoregressive moving average (MBSARMA) model for joint analysis of correlated positive asymmetric time series such as PM2.5 concentrations. It combines the multivariate log-linear BS framework with ARMA dynamics on the conditional location parameter of each response, estimates parameters via the EM algorithm, evaluates the conditional likelihood estimators through Monte Carlo simulations under several correlation levels and sample sizes, and applies the model to weekly PM2.5 data from three Santiago monitoring stations.
Significance. If the central claims hold, the MBSARMA model supplies a flexible parametric framework for multivariate skewed positive series with temporal dependence, which is relevant for environmental statistics. The EM estimation approach and the real-data illustration are standard strengths; the Monte Carlo design under varying correlations is also a positive feature when fully documented.
major comments (3)
- [§3] §3 (EM algorithm): The E-step requires the conditional expectation of the latent variables from the multivariate BS representation given the full observed vector and history. The manuscript must specify whether this expectation is obtained exactly from the joint distribution or via an approximation or marginalization; under the high cross-correlations typical of nearby PM2.5 stations, any approximation risks biasing the ARMA coefficient updates in the M-step.
- [§4] §4 (Monte Carlo study): The study reports 'good performance' under several correlation levels, yet provides neither the explicit correlation matrices tested nor a comparison of those levels to the empirical cross-correlations in the Santiago data. This omission prevents verification that the EM estimators remain reliable at the dependence strengths encountered in the application.
- [§5] §5 (Application): The fitted model is presented without the estimated correlation matrix or the selected ARMA orders (p, q); these quantities are needed to assess whether the dynamics and dependence structure are adequately captured and to judge the practical utility of the results.
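The comparison requested in the §4 comment could be sketched as follows: compute the sample cross-correlation matrix of the observed series and check whether each off-diagonal entry lies between the weakest and strongest simulated scenarios. The helper names `corr_matrix` and `bracketed` are hypothetical, the numbers are toy placeholders rather than values from the paper or from SINCA, and the use of Pearson correlation is an assumption.

```python
import math


def corr_matrix(series):
    """Pearson correlation matrix for a list of equal-length numeric series."""
    n = len(series[0])
    means = [sum(s) / n for s in series]
    centred = [[v - m for v in s] for s, m in zip(series, means)]
    norms = [math.sqrt(sum(v * v for v in c)) for c in centred]
    m = len(series)
    return [
        [
            sum(a * b for a, b in zip(centred[i], centred[j])) / (norms[i] * norms[j])
            for j in range(m)
        ]
        for i in range(m)
    ]


def bracketed(empirical, low, high):
    """True if every off-diagonal empirical correlation lies between the
    corresponding entries of the low and high scenario matrices."""
    m = len(empirical)
    return all(
        low[i][j] <= empirical[i][j] <= high[i][j]
        for i in range(m)
        for j in range(m)
        if i != j
    )


# Toy placeholder series (not real PM2.5 data).
toy = [[1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0, 8.0], [4.0, 3.0, 2.0, 1.0]]
r = corr_matrix(toy)

# Toy scenario matrices standing in for the simulated low and high settings.
ok = bracketed([[1.0, 0.5], [0.5, 1.0]],
               [[1.0, 0.2], [0.2, 1.0]],
               [[1.0, 0.8], [0.8, 1.0]])
```

For positive skewed data the correlations might instead be computed on the log scale; the same functions apply after transforming each series.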
minor comments (2)
- [Abstract] The abstract refers to 'conditional likelihood estimators' while the body uses EM; a brief clarification of the relationship would improve readability.
- [Notation] Notation for the BS shape parameters and the ARMA orders should be checked for consistency between the model definition and the simulation/application sections.
Simulated Author's Rebuttal
We thank the referee for the careful reading and constructive comments on our manuscript. We address each major comment point by point below and will incorporate the suggested clarifications and additions in the revised version.
- Referee: [§3] §3 (EM algorithm): The E-step requires the conditional expectation of the latent variables from the multivariate BS representation given the full observed vector and history. The manuscript must specify whether this expectation is obtained exactly from the joint distribution or via an approximation or marginalization; under the high cross-correlations typical of nearby PM2.5 stations, any approximation risks biasing the ARMA coefficient updates in the M-step.
  Authors: We appreciate the referee drawing attention to this point. In our EM algorithm the conditional expectations of the latent variables are obtained exactly from the joint multivariate Birnbaum-Saunders distribution; no approximation or marginalization is employed. We will add an explicit statement and the relevant conditional-expectation formulas to Section 3 of the revised manuscript to make this clear. The simulation results already indicate that the M-step updates remain stable at the correlation levels examined, including those comparable to the application.
  Revision: yes
- Referee: [§4] §4 (Monte Carlo study): The study reports 'good performance' under several correlation levels, yet provides neither the explicit correlation matrices tested nor a comparison of those levels to the empirical cross-correlations in the Santiago data. This omission prevents verification that the EM estimators remain reliable at the dependence strengths encountered in the application.
  Authors: We agree that the simulation design would be more transparent with this information. In the revised manuscript we will report the explicit correlation matrices used for the low-, moderate-, and high-correlation scenarios and will add a direct comparison with the sample cross-correlation matrix computed from the three Santiago PM2.5 series. This will allow readers to confirm that the simulated dependence structures bracket the empirical dependence observed in the data.
  Revision: yes
- Referee: [§5] §5 (Application): The fitted model is presented without the estimated correlation matrix or the selected ARMA orders (p, q); these quantities are needed to assess whether the dynamics and dependence structure are adequately captured and to judge the practical utility of the results.
  Authors: We thank the referee for noting this omission. The revised version will include the estimated correlation matrix obtained from the fitted MBSARMA model, the selected ARMA orders (p, q) for each of the three series, and a brief description of the model-selection criterion employed. These additions will enable readers to evaluate the fitted dynamics and dependence structure directly.
  Revision: yes
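The order selection mentioned in the last response could look like the sketch below if an information criterion such as AIC or BIC is used. That choice is an assumption, since this page does not state which criterion the authors employ. The function names and the log-likelihood values are hypothetical placeholders, not results from the paper.

```python
import math


def aic(loglik, k):
    """Akaike information criterion for k free parameters."""
    return 2.0 * k - 2.0 * loglik


def bic(loglik, k, n):
    """Bayesian (Schwarz) information criterion for k parameters and n observations."""
    return k * math.log(n) - 2.0 * loglik


def select_order(candidates, n, criterion="bic"):
    """Return the (p, q) pair with the smallest criterion value.

    candidates maps (p, q) -> (maximised log-likelihood, number of parameters).
    """
    def score(item):
        _, (loglik, k) = item
        return bic(loglik, k, n) if criterion == "bic" else aic(loglik, k)

    return min(candidates.items(), key=score)[0]


# Placeholder fits: made-up log-likelihoods for three candidate orders.
fits = {(1, 0): (-100.0, 5), (1, 1): (-99.5, 6), (2, 1): (-99.4, 7)}
best = select_order(fits, n=200, criterion="bic")
```

Because BIC penalises each extra parameter by log(n) rather than 2, it tends to favour smaller (p, q) than AIC at the sample sizes typical of weekly series.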
Circularity Check
No circularity: model specification, EM estimation, and validation are independent of fitted outputs
The MBSARMA model is constructed by extending the multivariate log-linear Birnbaum-Saunders distribution with exogenous ARMA components on the conditional location parameters; parameters are recovered via a standard EM algorithm whose E-step and M-step follow from the joint distribution and the ARMA recursion. Monte Carlo evaluation is performed on simulated data generated from the model under controlled correlation levels, and the real-data application compares fitted values against held-out observations without re-using the same quantities as both input and output. No equation equates a derived quantity to a fitted parameter by definition, no uniqueness theorem is imported from self-citation, and no ansatz is smuggled through prior work. The derivation chain therefore remains self-contained against external benchmarks.
Axiom & Free-Parameter Ledger
free parameters (3)
- ARMA orders p and q
- Correlation matrix parameters
- BS shape parameters
axioms (1)
- Domain assumption: the joint conditional distribution of the responses given past values and covariates is multivariate Birnbaum-Saunders.