A Review of Statistical Methods for Spontaneous Reporting System Data Mining: Signal Detection and Beyond
Pith reviewed 2026-05-10 02:43 UTC · model grok-4.3
The pith
Contemporary statistical methods for spontaneous reporting system data support both binary signal detection and estimation of signal strength with uncertainty for drug safety.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The paper claims that a review of contemporary spontaneous reporting system (SRS) data mining methods and their statistical underpinnings, paired with explicit guidance on constructing contingency tables from aggregated adverse event (AE)-drug counts, supplies a usable foundation for safety assessment across major pharmacovigilance databases.
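To make the contingency-table step concrete, here is a minimal sketch of how a 2x2 table for one AE-drug pair could be recovered from aggregated counts alone. It is not the paper's procedure, which is not reproduced on this page. The function name, argument names, and example numbers are hypothetical, and the sketch assumes the four aggregates are report counts drawn from the same database snapshot.

```python
from typing import NamedTuple


class Table2x2(NamedTuple):
    a: int  # reports mentioning both the drug and the AE
    b: int  # reports mentioning the drug but not the AE
    c: int  # reports mentioning the AE but not the drug
    d: int  # reports mentioning neither


def table_from_aggregates(n_pair: int, n_drug: int, n_ae: int, n_total: int) -> Table2x2:
    """Build a 2x2 table from aggregated counts (hypothetical helper).

    n_pair  : reports with both the drug and the AE
    n_drug  : all reports with the drug
    n_ae    : all reports with the AE
    n_total : all reports in the database
    """
    b = n_drug - n_pair
    c = n_ae - n_pair
    d = n_total - n_drug - n_ae + n_pair
    if min(n_pair, b, c, d) < 0:
        raise ValueError("aggregated counts are mutually inconsistent")
    return Table2x2(n_pair, b, c, d)


# Illustrative numbers only, not taken from FAERS or VigiBase.
table = table_from_aggregates(n_pair=20, n_drug=100, n_ae=120, n_total=10_000)
print(table)
```

The consistency check matters in practice because aggregates pulled from different snapshots or counted in different units (reports versus drug-event mentions) can yield negative cells.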
What carries the argument
Statistical signal detection methods (including disproportionality analyses) together with the preprocessing step of building SRS contingency tables from publicly available aggregated counts.
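As an illustration of what a disproportionality analysis computes, the sketch below evaluates two standard measures, the proportional reporting ratio (PRR) and the reporting odds ratio (ROR), with Wald-type 95% intervals on the log scale. The function name and the example counts are assumptions made for illustration; the paper may present these measures differently.

```python
import math


def prr_ror(a: int, b: int, c: int, d: int, z: float = 1.96) -> dict:
    """PRR and ROR with log-scale Wald intervals for a 2x2 table.

    a: drug & AE, b: drug & other AEs, c: other drugs & AE, d: other drugs & other AEs.
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive for these formulas")

    prr = (a / (a + b)) / (c / (c + d))
    ror = (a * d) / (b * c)

    # Standard errors of log(PRR) and log(ROR).
    se_log_prr = math.sqrt(1 / a - 1 / (a + b) + 1 / c - 1 / (c + d))
    se_log_ror = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)

    def interval(estimate: float, se: float) -> tuple:
        return (estimate * math.exp(-z * se), estimate * math.exp(z * se))

    return {
        "prr": prr,
        "prr_ci": interval(prr, se_log_prr),
        "ror": ror,
        "ror_ci": interval(ror, se_log_ror),
    }


# Illustrative counts only.
result = prr_ror(a=20, b=80, c=100, d=9800)
print(result)
```

A binary signal rule would threshold these quantities, for example by flagging pairs whose lower interval bound exceeds 1, while the interval itself conveys the uncertainty that the newer estimation-oriented methods emphasize.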
Load-bearing premise
The selected contemporary methods and preprocessing steps using aggregated counts adequately represent current best practice and can be applied directly without further validation or dataset-specific adjustments.
What would settle it
An analysis of a confirmed drug-adverse event pair that produces materially weaker or stronger signals when the recommended preprocessing steps are omitted.
Original abstract
Postmarketing safety surveillance relies on data from spontaneous reporting systems (SRS) such as FAERS, EudraVigilance and VigiBase, and commonly uses SRS data mining methods to assess the associations between drugs and adverse events (AEs). Traditionally, these analyses have focused on signal detection framed as a binary decision problem, whereas more recent work has emphasized more nuanced inference involving signal strength estimation and uncertainty quantification. In this paper, we review contemporary SRS data mining approaches and their statistical underpinnings for safety assessment using data from major pharmacovigilance databases worldwide. In addition to methodological review, we provide practical guidance on data preprocessing for such analysis, including construction of SRS contingency tables using only aggregated AE-drug counts, as are publicly available from databases such as VigiBase and EudraVigilance. We illustrate the guidance via opioid-related datasets obtained from FAERS and VigiBase, complied with subsequent downstream SRS data analyses.
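The contrast the abstract draws between a binary decision and signal strength estimation with uncertainty can be sketched with a deliberately simplified gamma-Poisson model. This is an assumption-laden illustration, not one of the methods reviewed in the paper: it uses a single gamma prior with fixed, hypothetical hyperparameters, whereas empirical Bayes approaches estimate the prior from the data and often use mixture priors. The posterior interval is approximated by Monte Carlo sampling from the standard library.

```python
import random


def gamma_poisson_posterior(n_obs: int, expected: float,
                            alpha: float = 1.0, beta: float = 1.0,
                            draws: int = 20_000, seed: int = 0) -> dict:
    """Posterior summary of the relative reporting rate lambda.

    Model (simplifying assumption): n_obs ~ Poisson(lambda * expected),
    lambda ~ Gamma(shape=alpha, rate=beta), so the posterior is
    Gamma(shape=alpha + n_obs, rate=beta + expected).
    """
    shape = alpha + n_obs
    rate = beta + expected
    rng = random.Random(seed)
    # random.gammavariate takes (shape, scale), so pass scale = 1 / rate.
    samples = sorted(rng.gammavariate(shape, 1.0 / rate) for _ in range(draws))
    lower = samples[int(0.025 * draws)]
    upper = samples[int(0.975 * draws) - 1]
    return {
        "naive_ratio": n_obs / expected,  # observed / expected, no shrinkage
        "mean": shape / rate,             # exact posterior mean
        "interval": (lower, upper),       # approximate 95% credible interval
    }


# Expected count under independence: n_drug * n_ae / n_total (illustrative numbers).
expected = 100 * 120 / 10_000
post = gamma_poisson_posterior(n_obs=20, expected=expected)
print(post)
```

Here the posterior mean is pulled below the raw observed-to-expected ratio, toward the prior mean of 1. A binary rule would report only whether the lower bound clears a threshold, while the estimate and interval together describe how strong the signal is and how well it is determined.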
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript reviews contemporary statistical methods for mining spontaneous reporting system (SRS) data from databases such as FAERS, EudraVigilance, and VigiBase, covering traditional signal detection framed as binary decisions as well as more recent approaches to signal strength estimation and uncertainty quantification. It also supplies practical guidance on preprocessing steps to construct contingency tables from publicly available aggregated AE-drug counts and illustrates the guidance with opioid-related datasets from FAERS and VigiBase.
Significance. If the summaries of methods are accurate and the preprocessing guidance is internally consistent with the stated scope of public aggregated tables, the paper would serve as a useful reference for pharmacovigilance researchers seeking to move beyond binary signal detection toward nuanced inference while working with readily accessible data sources.
minor comments (1)
- [Abstract] Final sentence: the word 'complied' is almost certainly a typographical error and should read 'combined' to make the intended meaning clear.
Simulated Author's Rebuttal
We thank the referee for their positive summary of our manuscript, accurate characterization of its scope, and recommendation for minor revision. The referee's assessment aligns well with our intent to provide both a methodological review and practical preprocessing guidance for SRS data mining.
Circularity Check
No significant circularity in this review paper
full rationale
This is a review paper summarizing existing SRS data mining methods from external literature and offering practical preprocessing guidance for aggregated counts from public databases. No new derivations, predictions, or equations are introduced that could reduce to the paper's own inputs by construction. The claims are descriptive and illustrative (e.g., opioid example as demonstration, not proof), with methods attributed to cited sources rather than self-referential fits or definitions. Any self-citations are incidental and non-load-bearing for novel results.