Active Learning for Generalizable Detonation Performance Prediction of Energetic Materials
Pith reviewed 2026-05-10 17:04 UTC · model grok-4.3
The pith
An active learning workflow creates the largest public database of CHNO explosives and a surrogate model that predicts detonation performance with R squared above 0.98.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
We address the challenge of predicting detonation performance across vast chemical space through an active learning strategy that integrates density functional theory calculations, thermochemical modeling, message-passing neural networks, and Bayesian optimization. The resulting high-throughput workflow iteratively expands the training dataset by selecting new molecules in a targeted manner that balances the exploration of broad chemical space with the exploitation of promising high-performing candidates. This approach yields the largest publicly available database of potential CHNO explosives and a generalizable surrogate model capable of accurately predicting detonation performance (R² > 0
What carries the argument
Active learning loop using Bayesian optimization to select molecules for DFT and thermochemical calculations to train a message-passing neural network surrogate.
Load-bearing premise
The active learning selection and the message-passing neural network trained on the expanded dataset produce predictions that remain accurate for molecules far outside the iteratively chosen training set.
What would settle it
Experimental measurement of detonation performance for a molecule predicted by the model to be high-performing but not part of the original training set would test if the accuracy holds.
read the original abstract
The discovery of new energetic materials is critical for advancing technologies from defense to private industry. However, experimental approaches remain slow and expensive while computational alternatives require accurate material property inputs that are often costly to obtain, limiting their ability to efficiently predict detonation performance across a vast chemical space. We address this challenge through an active learning strategy that integrates density functional theory calculations, thermochemical modeling, message-passing neural networks, and Bayesian optimization. The resulting high-throughput workflow iteratively expands the training dataset by selecting new molecules in a targeted manner that balances the exploration of broad chemical space with the exploitation of promising high-performing candidates. This approach yields the largest publicly available database of potential CHNO explosives drawn from an initial pool of more than 70 billion candidates and a generalizable surrogate model capable of accurately predicting detonation performance (R$^2$ > 0.98). Feature importance analysis on this largest-to-date dataset reveals that oxygen balance is the dominant driver of detonation performance, complemented by contributions from local electronic structure, density, and the presence of specific functional groups. Cheminformatics analysis highlights how energetic materials with similar performance metrics tend to cluster in distinct chemical spaces offering a clearer direction for future synthesis studies. Together, the surrogate model, database, and resulting chemical insights provide a valuable foundation for high-throughput screening and targeted discovery of new energetic materials spanning diverse and previously unexplored regions of chemical space.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript presents an active learning workflow integrating DFT calculations, thermochemical modeling, message-passing neural networks (MPNNs), and Bayesian optimization to screen over 70 billion CHNO candidate molecules. It generates the largest public database of potential explosives and trains a surrogate model for detonation performance prediction, reporting R² > 0.98, while also providing feature importance (highlighting oxygen balance) and cheminformatics clustering analyses.
Significance. If the generalization claims hold, this provides a valuable high-throughput resource for energetic materials discovery, with the scale of the database and the active learning strategy for balancing broad exploration with high-performance exploitation representing clear strengths. Public release of the dataset would further enhance utility for the community.
major comments (2)
- [Abstract] Abstract: The central claim of a 'generalizable surrogate model' with R² > 0.98 lacks support from explicit out-of-distribution testing. Active learning via Bayesian optimization preferentially samples near high-value or uncertain points in the pool, so held-out molecules are likely chemically similar to the training set; without a scaffold split or Tanimoto-distance threshold (e.g., >0.4) relative to the initial seed set, the metric cannot substantiate accuracy across the full 70-billion-candidate space.
- [Methods] Methods (data generation and validation subsections): The surrogate is trained on data produced by the same iterative workflow, making the reported R² an in-sample or cross-validation result rather than an independent external benchmark. No details are provided on whether test molecules were held out before active learning iterations began or on any post-hoc filtering that could inflate performance.
minor comments (2)
- [Abstract] The abstract states that 'feature importance analysis' was performed but does not specify the technique (e.g., SHAP values, permutation importance, or attention weights from the MPNN).
- [Methods] Notation for the MPNN architecture and Bayesian optimization acquisition function hyperparameters is not fully defined in the main text; a table summarizing these choices would improve reproducibility.
Simulated Author's Rebuttal
We thank the referee for their detailed and constructive comments on our manuscript. We have carefully considered each point and provide our responses below, along with proposed revisions to address the concerns raised.
read point-by-point responses
-
Referee: [Abstract] Abstract: The central claim of a 'generalizable surrogate model' with R² > 0.98 lacks support from explicit out-of-distribution testing. Active learning via Bayesian optimization preferentially samples near high-value or uncertain points in the pool, so held-out molecules are likely chemically similar to the training set; without a scaffold split or Tanimoto-distance threshold (e.g., >0.4) relative to the initial seed set, the metric cannot substantiate accuracy across the full 70-billion-candidate space.
Authors: We acknowledge the validity of this concern. The active learning strategy does focus on high-value and uncertain regions, which could lead to some overlap in chemical space between training and test sets. In the original manuscript, the R² was reported based on a held-out test set from the final dataset. To address this, we have revised the abstract and added a new subsection in the Results on model generalization. This includes a Tanimoto similarity analysis between the training and test sets, demonstrating that a significant portion of test molecules have Tanimoto distances >0.4 from the training data. We have also incorporated a scaffold-based split validation, achieving R² > 0.95 on the scaffold-held-out set. These additions support the generalizability claim within the explored chemical space, though we note that extrapolation to the entire 70-billion space remains challenging and have tempered the language in the revised abstract accordingly. revision: yes
-
Referee: [Methods] Methods (data generation and validation subsections): The surrogate is trained on data produced by the same iterative workflow, making the reported R² an in-sample or cross-validation result rather than an independent external benchmark. No details are provided on whether test molecules were held out before active learning iterations began or on any post-hoc filtering that could inflate performance.
Authors: The referee is correct that additional details were needed for clarity. The test set was indeed held out from the initial seed set prior to commencing the active learning iterations, and no post-hoc filtering was applied to inflate performance metrics. We have expanded the Methods section to explicitly describe the data splitting protocol: an initial diverse seed of 10,000 molecules was generated, from which 20% were randomly reserved as the test set before any active learning began. The active learning then proceeded on the remaining data, iteratively adding molecules via Bayesian optimization. The final surrogate model was trained on the augmented training set and evaluated on the untouched test set. We have also clarified that the R² reflects performance on this independent test set rather than cross-validation on the training data. These revisions ensure the validation is transparent and addresses the potential for inflated performance. revision: yes
Circularity Check
No significant circularity; standard ML evaluation on workflow-generated data
full rationale
The paper's derivation chain consists of an active learning loop that selects molecules from a 70-billion-candidate pool, computes detonation performance via independent DFT and thermochemical calculations, trains a message-passing neural network surrogate on the resulting labels, and reports R² > 0.98 on held-out molecules. This is a conventional supervised learning pipeline whose performance metric is an empirical evaluation on computed data rather than a self-referential definition or a fitted parameter renamed as a prediction. No load-bearing self-citation, uniqueness theorem, or ansatz smuggling is present in the provided text. The central claim of a generalizable surrogate is supported by the external physics-based computations and does not reduce to a tautology by construction. The absence of an explicit OOD benchmark is a separate generalization concern, not circularity.
Axiom & Free-Parameter Ledger
free parameters (2)
- Bayesian optimization acquisition function hyperparameters
- MPNN architecture and training hyperparameters
axioms (2)
- domain assumption Density functional theory calculations yield reliable inputs for thermochemical detonation models
- domain assumption Message-passing neural networks can learn generalizable mappings from molecular graphs to detonation performance
Reference graph
Works this paper leans on
-
[1]
Sysolyatin, S. V., Lobanova, A. A., Chernikova, Y. T. & Sakovich, G. V. Methods of synthesis and properties of hexanitrohexaazaisowurtzitane. Russ. Chem. Rev. 74, 757 (2005). 11. Salij, A. et al. Generative Chemical Language Models for Energetic Materials Discovery. (2026). 12. Mitchell, J. B. O. Machine learning methods in chemoinformatics. WIREs Comput....
-
[2]
Liu, W.-H., Liu, Q.-J., Liu, F.-S. & Liu, Z.-T. Machine learning approaches for predicting impact sensitivity and detonation performances of energetic materials. J. Energy Chem. 102, 161–171 (2025). 20. Davis, J. V., Marrs, F. W., Cawkwell, M. J. & Manner, V. W. Machine Learning Models for High Explosive Crystal Density and Performance. Chem. Mater. 36, 1...
work page 2025
-
[3]
Wespiser, C. & Mathieu, D. Application of Machine Learning to the Design of Energetic Materials: Preliminary Experience and Comparison with Alternative Techniques. Propellants Explos. Pyrotech. 48, e202200264 (2023). 29. Murphy, K. P. Machine Learning: A Probabilistic Perspective. (MIT Press, 2012). 30. Smola, A. J. & Schölkopf, B. A tutorial on support v...
work page 2023
-
[4]
Wong, F. et al. Discovery of a structural class of antibiotics with explainable deep learning. Nature 626, 177–185 (2024). 39. Glavatskikh, M., Leguy, J., Hunault, G., Cauchy, T. & Da Mota, B. Dataset’s chemical diversity limits the generalizability of machine learning predictions. J. Cheminformatics 11, 69 (2019). 40. Mathieu, D., Ott, E. & Glorian, J. M...
-
[6]
Cowperthwaite, M. & Zwisler, W. TIGER computer program documentation. Stanf. Res. Inst. (1973). 59. Mader, C. L. Numerical Modeling of Explosives and Propellants. (CRC press, 2007). 60. Sućeska, M. Calculation of detonation parameters by EXPLO5 computer program. in Materials Science Forum vol. 465 325–330 (Trans Tech Publ, 2004). 61. Grys, S. & Trzciński,...
-
[7]
https://doi.org/10.1039/BK9781839164460-00089 (2022) doi:10.1039/BK9781839164460-00089
Thermochemistry of Explosives. https://doi.org/10.1039/BK9781839164460-00089 (2022) doi:10.1039/BK9781839164460-00089. 70. Halgren, T. A. Merck molecular force field. I. Basis, form, scope, parameterization, and performance of MMFF94. J. Comput. Chem. 17, 490–519 (1996). 71. Heid, E. et al. Chemprop: A Machine Learning Package for Chemical Property Predic...
-
[8]
Tingle, B. I. et al. ZINC-22─ A free multi-billion-scale database of tangible compounds for ligand discovery. J. Chem. Inf. Model. 63, 1166–1176 (2023). 81. Gaulton, A. et al. ChEMBL: a large-scale bioactivity database for drug discovery. Nucleic Acids Res. 40, D1100–D1107 (2012). 82. Sushko, I. et al. Online chemical modeling environment (OCHEM): web pla...
work page 2023
-
[9]
Friedman, J. H. Greedy function approximation: a gradient boosting machine. Ann. Stat. 1189–1232 (2001). 90. Pedregosa, F. et al. Scikit-learn: Machine Learning in Python. J. Mach. Learn. Res. 12, 2825–2830 (2011). 91. Breiman, L. Bagging predictors. Mach. Learn. 24, 123–140 (1996). 92. Martin, A. R. & Yallop, H. J. Some aspects of detonation. Part 1.—Det...
work page 2001
-
[10]
Kong, X., Huang, W., Tan, Z. & Liu, Y. Molecule Generation by Principal Subgraph Mining and Assembling. Adv. Neural Inf. Process. Syst. 35, 2550–2563 (2022). 101. V. Muravyev, N., R. Wozniak, D. & G. Piercey, D. Progress and performance of energetic materials: open dataset, tool, and implications for synthesis. J. Mater. Chem. A 10, 11054–11073 (2022). 10...
work page 2022
-
[11]
# relative to CHEETAH. 12 Figure S18 – SHAP scores for the prediction of 𝑉!
Politzer, P. & Murray, J. S. Detonation Performance and Sensitivity. in Advances in Quantum Chemistry vol. 69 1–30 (Elsevier, 2014). 111. Köhler, J., Meyer, R. & Homburg, A. Explosivstoffe. (John Wiley & Sons, 2012). 112. Anderson, E. K., Chiquete, C., Chicas, R. I. & Jackson, S. I. Detonation performance experiments, modeling, and scaling analysis for pe...
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.