Modeling Sparse and Bursty Vulnerability Sightings: Forecasting Under Data Constraints
Pith reviewed 2026-05-10 08:14 UTC · model grok-4.3
The pith
Poisson regression models offer more stable forecasts than SARIMAX for sparse and bursty vulnerability sightings.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Vulnerability sightings exhibit sparse and bursty patterns that standard autoregressive models such as SARIMAX cannot adequately capture, leading to wide confidence intervals and invalid negative predictions. Poisson regression models, when applied to weekly aggregated data and augmented with severity scores derived from textual descriptions, yield more stable and interpretable forecasts. Simpler methods such as exponential decay functions also offer practical alternatives for short-term horizons without needing extensive historical data.
What carries the argument
Comparison of SARIMAX time-series models and Poisson regression for modeling sighting counts, using VLAI-derived severity scores as exogenous variables.
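As a rough illustration of the count-based side of this comparison, the following is a minimal sketch of Poisson regression with a log link, fitted by Newton-Raphson in plain Python. It is not the paper's implementation: the data values, the choice of covariates (weeks since first sighting plus a severity score), and every function name are assumptions made for illustration only.

```python
import math

def solve(a, b):
    """Solve a x = b for a small dense system (Gaussian elimination, partial pivoting)."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x

def linear(beta, row):
    """Linear predictor row . beta (the log of the expected count)."""
    return sum(bj * xj for bj, xj in zip(beta, row))

def loglik(beta, X, y):
    """Poisson log-likelihood, dropping the term that does not depend on beta."""
    return sum(yi * linear(beta, row) - math.exp(linear(beta, row))
               for row, yi in zip(X, y))

def fit_poisson(X, y, iters=100, tol=1e-10):
    """Fit log E[y] = X beta. Column 0 of X must be the intercept; y needs a nonzero count."""
    p = len(X[0])
    beta = [math.log(sum(y) / len(y))] + [0.0] * (p - 1)
    for _ in range(iters):
        mu = [math.exp(linear(beta, row)) for row in X]
        grad = [sum((yi - mi) * row[j] for yi, mi, row in zip(y, mu, X))
                for j in range(p)]
        hess = [[sum(mi * row[j] * row[k] for mi, row in zip(mu, X))
                 for k in range(p)] for j in range(p)]
        step = solve(hess, grad)
        scale, current = 1.0, loglik(beta, X, y)
        # Halve the step while it would lower the log-likelihood.
        while scale > 1e-8 and loglik(
                [b + scale * s for b, s in zip(beta, step)], X, y) < current - 1e-12:
            scale /= 2
        beta = [b + scale * s for b, s in zip(beta, step)]
        if max(abs(scale * s) for s in step) < tol:
            break
    return beta

# Hypothetical data: (weeks since first sighting, severity score, weekly sighting count).
observations = [
    (0, 9.8, 12), (1, 9.8, 5), (2, 9.8, 2), (3, 9.8, 1), (4, 9.8, 0),
    (0, 5.3, 3), (1, 5.3, 1), (2, 5.3, 0), (3, 5.3, 0), (4, 5.3, 0),
    (0, 7.5, 6), (1, 7.5, 3), (2, 7.5, 1), (3, 7.5, 0), (4, 7.5, 1),
]
X = [[1.0, age, sev] for age, sev, _ in observations]
y = [count for _, _, count in observations]

beta = fit_poisson(X, y)
fitted = [math.exp(linear(beta, row)) for row in X]
# Forecast for week 5 of the hypothetical severity-9.8 vulnerability.
next_week = math.exp(linear(beta, [1.0, 5, 9.8]))
print("coefficients:", beta)
print("week-5 forecast:", next_week)
```

Because the forecast is the exponential of a linear predictor, it can never be negative, which is the structural property that separates this model family from SARIMAX on sparse counts. In practice a library implementation would normally replace the hand-written fitting loop.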
If this is right
- Aggregating to weekly counts improves stability for bursty sighting data.
- Severity scores serve as useful exogenous inputs for better forecasts.
- Exponential decay functions enable short-horizon estimates without long histories.
- Predictive models can support vulnerability intelligence workflows under data constraints.
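The weekly aggregation and exponential decay points above can be sketched in a few lines. This is an illustrative sketch only: the sighting dates, the one-week half-life, and the function names are assumptions, and the paper does not specify how its decay function is parameterized.

```python
import math
from collections import Counter
from datetime import date

def weekly_counts(sighting_dates, start, n_weeks):
    """Count sightings in consecutive 7-day bins from `start`, keeping empty weeks as 0."""
    bins = Counter((d - start).days // 7 for d in sighting_dates if d >= start)
    return [bins.get(week, 0) for week in range(n_weeks)]

def decay_forecast(level, horizon, half_life_weeks):
    """Project `level` forward `horizon` weeks, halving every `half_life_weeks`."""
    rate = math.log(2) / half_life_weeks
    return [level * math.exp(-rate * h) for h in range(1, horizon + 1)]

# Hypothetical sighting dates for a single vulnerability.
sightings = [
    date(2025, 1, 6), date(2025, 1, 6), date(2025, 1, 7), date(2025, 1, 9),
    date(2025, 1, 14), date(2025, 1, 15), date(2025, 1, 28),
]
weekly = weekly_counts(sightings, start=date(2025, 1, 6), n_weeks=4)

# Assumed half-life of one week; a real value would have to be estimated or chosen.
forecast = decay_forecast(weekly[-1], horizon=3, half_life_weeks=1.0)
print(weekly, forecast)
```

Keeping the empty weeks as explicit zeros matters: dropping them would hide exactly the sparsity that the models are supposed to handle.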
Where Pith is reading between the lines
- Zero-inflated Poisson models could handle the high number of zero sightings more effectively.
- Improved forecasts might allow security teams to prioritize patching for vulnerabilities likely to see activity soon.
- The preference for count models suggests cyber events follow discrete rather than continuous dynamics.
- Testing on multi-year data could show whether these patterns persist across different threat landscapes.
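One quick way to judge whether a zero-inflated model is worth trying is to compare the observed share of zero counts with the share a plain Poisson model with the same mean would expect. The sketch below assumes a made-up weekly series and a hypothetical helper name; it is a screening heuristic, not a formal test.

```python
import math

def zero_share(counts):
    """Return (observed share of zeros, share expected under Poisson with the same mean)."""
    lam = sum(counts) / len(counts)
    observed = counts.count(0) / len(counts)
    expected = math.exp(-lam)  # P(Y = 0) for a Poisson variable with mean lam
    return observed, expected

# Hypothetical weekly sighting counts for one vulnerability.
counts = [0, 0, 12, 1, 0, 0, 3, 0]
observed, expected = zero_share(counts)
print(f"observed zeros: {observed:.3f}, Poisson-expected zeros: {expected:.3f}")
```

An observed share far above the expected one suggests more zeros than a single Poisson rate can explain, although overdispersion alone can produce the same symptom.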
Load-bearing premise
That severity scores derived from textual descriptions can serve as useful exogenous variables and that sighting counts are adequately described by standard Poisson or time-series assumptions despite sparsity.
What would settle it
Running both models on a new set of vulnerabilities and comparing mean absolute error of forecasts against actual sighting counts, particularly checking for negative predictions in SARIMAX.
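A minimal sketch of that check follows, assuming forecasts from two models are already available as lists. The numbers are placeholders chosen for illustration, not results from the paper, and the names `forecast_a` and `forecast_b` are hypothetical.

```python
def mean_absolute_error(actual, predicted):
    """Average absolute difference between observed and forecast counts."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def count_negative(predicted):
    """Number of forecasts below zero, which are invalid for count data."""
    return sum(1 for p in predicted if p < 0)

# Placeholder values only: observed weekly counts and two competing forecasts.
actual = [3, 0, 1, 0]
forecast_a = [2.0, -0.5, 1.5, -1.0]  # stands in for an unconstrained model
forecast_b = [2.5, 0.5, 0.5, 0.5]    # stands in for a count model

mae_a = mean_absolute_error(actual, forecast_a)
mae_b = mean_absolute_error(actual, forecast_b)
negatives_a = count_negative(forecast_a)
negatives_b = count_negative(forecast_b)
print(mae_a, mae_b, negatives_a, negatives_b)
```

Reporting the negative-forecast count alongside MAE keeps the two failure modes separate: a model can have a tolerable average error and still emit values that cannot be interpreted as counts.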
Original abstract
Understanding and anticipating vulnerability-related activity is a major challenge in cyber threat intelligence. This work investigates whether vulnerability sightings, such as proof-of-concept releases, detection templates, or online discussions, can be forecast over time. Building on our earlier work on VLAI, a transformer-based model that predicts vulnerability severity from textual descriptions, we examine whether severity scores can improve time-series forecasting as exogenous variables. We evaluate several approaches for short-term forecasting of sightings per vulnerability. First, we test SARIMAX models with and without log(x+1) transformations and VLAI-derived severity inputs. Although these adjustments provide limited improvements, SARIMAX remains poorly suited to sparse, short, and bursty vulnerability data. In practice, forecasts often produce overly wide confidence intervals and sometimes unrealistic negative values. To better capture the discrete and event-driven nature of sightings, we then explore count-based methods such as Poisson regression. Early results show that these models produce more stable and interpretable forecasts, especially when sightings are aggregated weekly. We also discuss simpler operational alternatives, including exponential decay functions for short forecasting horizons, to estimate future activity without requiring long historical series. Overall, this study highlights both the potential and the limitations of forecasting rare and bursty cyber events, and provides practical guidance for integrating predictive analytics into vulnerability intelligence workflows.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper investigates forecasting sparse, bursty vulnerability sightings (e.g., PoC releases, discussions) using SARIMAX models (with/without log(x+1) transforms and VLAI severity scores as exogenous inputs) and count-based alternatives like Poisson regression. It claims SARIMAX is poorly suited, often yielding negative values and wide intervals, while Poisson regression (especially on weekly aggregates) produces more stable, interpretable forecasts; simpler exponential decay is also discussed for short horizons. The work builds on prior VLAI severity prediction and aims to provide practical guidance for cyber threat intelligence under data constraints.
Significance. If the empirical comparison holds after adding quantitative validation, the result would be useful for practitioners by demonstrating the mismatch between standard time-series tools and rare-event cyber data, while showing how severity scores from text models can be integrated as covariates. The emphasis on operational simplicity (e.g., decay functions) is a strength for real-world deployment where long histories are unavailable.
major comments (2)
- Abstract: the central claim that 'Poisson regression models produce more stable and interpretable forecasts' is unsupported by any reported metrics (MAE, RMSE, coverage, dispersion statistic, or comparison to negative binomial), error bars, or full experimental details, leaving the superiority over SARIMAX unverified.
- Abstract: no overdispersion diagnostics (e.g., variance-to-mean ratio or likelihood ratio test against negative binomial) are described despite the explicitly bursty nature of the counts; violation of the Poisson equidispersion assumption would bias standard errors and miscalibrate forecast intervals, directly undermining the stability claim.
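The variance-to-mean ratio mentioned in the second comment is simple to compute. The sketch below uses a made-up weekly series and a hypothetical function name; under the Poisson equidispersion assumption the ratio should be close to 1, and values well above 1 point toward a negative binomial or similar alternative.

```python
from statistics import mean, variance

def variance_to_mean_ratio(counts):
    """Sample variance divided by sample mean (index of dispersion)."""
    m = mean(counts)
    if m == 0:
        raise ValueError("ratio is undefined for an all-zero series")
    return variance(counts) / m

# Hypothetical bursty weekly sighting counts for one vulnerability.
counts = [0, 0, 12, 1, 0, 0, 3, 0]
ratio = variance_to_mean_ratio(counts)
print(f"variance-to-mean ratio: {ratio:.2f}")
```

With short series the ratio is itself noisy, so it is better read as a screening statistic than as a decision rule.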
minor comments (1)
- Abstract: the phrase 'limited improvements' from SARIMAX adjustments is imprecise; specify which performance aspect (bias, interval width, or predictive accuracy) was evaluated.
Simulated Author's Rebuttal
We thank the referee for the constructive feedback emphasizing the need for quantitative validation and overdispersion checks. We agree that the original claims in the abstract required stronger empirical support and have revised the manuscript accordingly by adding the requested metrics, diagnostics, and model comparisons.
Point-by-point responses
- Referee (Abstract): the central claim that 'Poisson regression models produce more stable and interpretable forecasts' is unsupported by any reported metrics (MAE, RMSE, coverage, dispersion statistic, or comparison to negative binomial), error bars, or full experimental details, leaving the superiority over SARIMAX unverified.
  Authors: We agree that the abstract's reference to 'early results' was insufficiently supported by quantitative evidence in the initial submission. In the revised manuscript we have expanded the experimental evaluation to report MAE, RMSE, and interval coverage for SARIMAX versus Poisson regression on both daily and weekly aggregates. We also include bootstrap-derived error bars, a direct comparison to negative binomial regression, and explicit counts of invalid negative forecasts produced by SARIMAX. These additions confirm that Poisson (and negative binomial) models yield lower error and more stable intervals, particularly after weekly aggregation, while SARIMAX frequently produces negative values and overly wide intervals.
  Revision: yes
- Referee (Abstract): no overdispersion diagnostics (e.g., variance-to-mean ratio or likelihood ratio test against negative binomial) are described despite the explicitly bursty nature of the counts; violation of the Poisson equidispersion assumption would bias standard errors and miscalibrate forecast intervals, directly undermining the stability claim.
  Authors: We acknowledge the omission of formal overdispersion diagnostics. The revised version now includes variance-to-mean ratios computed for each vulnerability sighting series and likelihood-ratio tests comparing Poisson to negative binomial specifications. Where moderate overdispersion is detected, we report that negative binomial regression further improves interval calibration without altering the overall finding that count-based models remain more stable and interpretable than SARIMAX. These diagnostics refine rather than contradict our recommendation for count-based approaches under data constraints.
  Revision: yes
Circularity Check
No circularity: empirical comparison of standard models
The manuscript is an empirical study comparing off-the-shelf SARIMAX and Poisson regression models on sparse vulnerability sighting counts, with VLAI severity scores used only as exogenous inputs. No equations, derivations, or 'predictions' are presented that reduce by construction to fitted parameters or self-citations. The reference to prior VLAI work supplies an auxiliary feature and does not carry the central claim about model stability. The analysis therefore contains no load-bearing self-definition, fitted-input renaming, or uniqueness theorem imported from the authors' own prior results.
Axiom & Free-Parameter Ledger
free parameters (2)
- SARIMAX order parameters
- VLAI severity scores
axioms (1)
- domain assumption: Vulnerability sighting counts follow distributions amenable to SARIMAX or Poisson modeling after possible transformation
Reference graph
Works this paper leans on
- [1] Cédric Bonhomme and Alexandre Dulaunoy. Scoring vulnerabilities by leveraging activity data from the fediverse. In Cyber Threat Intelligence Conference, 2025.
- [2] Cédric Bonhomme and Alexandre Dulaunoy. VLAI: A RoBERTa-based model for automated vulnerability severity classification, 2025.
- [3] Jay Jacobs, Sasha Romanosky, Octavian Suciu, Benjamin Edwards, and Armin Sarabi. Enhancing vulnerability prioritization: Data-driven exploit predictions with community-driven insights, 2023.
- [4] Éireann Leverett, Matilda Rhode, and Adam Wedgbury. Vulnerability forecasting: Theory and practice. Digital Threats, 3(4), March 2022.