AI-Driven Expansion and Application of the Alexandria Database
Pith reviewed 2026-05-16 23:14 UTC · model grok-4.3
The pith
A multi-stage AI workflow expands the ALEXANDRIA database by 1.3 million DFT-validated compounds, with a 99% success rate for candidates within 100 meV/atom of thermodynamic stability.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
By chaining the Matra-Genoa generative model with Orb-v2 interatomic potentials and ALIGNN energy predictions, the workflow filters candidates so that 99% of those sent to full DFT calculations lie within 100 meV/atom of thermodynamic stability. This yields 1.3 million newly validated entries and 74 thousand new stable compounds, bringing the ALEXANDRIA database to 5.8 million total structures and 175 thousand on the convex hull. The same pipeline produces a 14-million-structure out-of-equilibrium dataset that, when used to fine-tune a GRACE model, improves benchmark performance. Structural disorder statistics in the new data match experimental databases, and analysis of the hull reveals sub-linear scaling of convex-hull connectivity with database size.
What carries the argument
The multi-stage filtering pipeline that runs Matra-Genoa generation, Orb-v2 relaxation, and ALIGNN energy ranking before DFT validation.
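The staged cascade described here can be sketched in a few lines. This is a toy illustration only: Matra-Genoa, Orb-v2, and ALIGNN are replaced by random stand-ins, and the cutoffs are hypothetical, not the paper's actual thresholds.

```python
import numpy as np

def generate_candidates(n):
    """Stand-in for Matra-Genoa: emit n candidate 'structures' (here just ids)."""
    return np.arange(n)

def relax_and_energy(ids, rng):
    """Stand-in for Orb-v2 relaxation: a coarse energy above hull (eV/atom)."""
    return rng.exponential(scale=0.3, size=ids.size)

def refine_energy(e_coarse, rng):
    """Stand-in for ALIGNN re-ranking: refine coarse energies with small noise."""
    return e_coarse + rng.normal(0.0, 0.02, size=e_coarse.size)

def cascade_filter(n, cutoff_ev=0.1, seed=0):
    """Cascade: generate -> relax -> re-rank -> keep candidates below cutoff.

    Only the survivors would be sent on to full DFT validation.
    """
    rng = np.random.default_rng(seed)
    ids = generate_candidates(n)
    e1 = relax_and_energy(ids, rng)
    keep = e1 < 3 * cutoff_ev            # loose first cut to save DFT budget
    e2 = refine_energy(e1[keep], rng)
    return ids[keep][e2 < cutoff_ev]     # final DFT shortlist
```

The point of the two-stage cut is economy: the cheap potential prunes most of the 119 million candidates before the more accurate (but costlier) ranking model runs.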
If this is right
- The larger set of stable and near-stable compounds supplies more training data for machine-learning potentials, as demonstrated by the improved GRACE benchmark scores.
- Space-group and coordination-environment statistics extracted from the expanded hull can be used to test theories of phase stability networks.
- The released 14 million out-of-equilibrium structures with forces and stresses enable training of universal force fields that capture dynamic behavior beyond the convex hull.
- Sub-linear growth of convex-hull connectivity with database size implies that exhaustive enumeration of all stable phases may remain computationally tractable.
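The sub-linear connectivity point is, in effect, a claim about a scaling exponent, which a log-log fit on the released hull data could test directly. The sketch below uses synthetic counts with an arbitrarily chosen exponent of 0.8, not the paper's data.

```python
import numpy as np

# Synthetic database sizes and hull-edge counts following edges ~ size**alpha.
sizes = np.array([1e4, 3e4, 1e5, 3e5, 1e6])
alpha_true = 0.8
edges = 5.0 * sizes**alpha_true

# The slope of log(edges) vs log(sizes) estimates the scaling exponent;
# a value below 1 indicates sub-linear growth of connectivity.
slope, _ = np.polyfit(np.log(sizes), np.log(edges), 1)
print(f"fitted exponent: {slope:.2f}")
```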
Where Pith is reading between the lines
- The workflow could be applied to targeted searches for materials with specific functional properties by adding property filters after the stability stage.
- Matching experimental disorder rates suggests the generated structures can serve as realistic starting points for finite-temperature simulations.
- Releasing the full dataset under open licenses removes a common barrier for smaller research groups that lack access to large-scale DFT resources.
Load-bearing premise
The combined Matra-Genoa, Orb-v2, and ALIGNN filters select candidates without systematic bias, so that 99% of those reaching DFT truly lie within 100 meV/atom of the hull.
What would settle it
Perform independent DFT relaxations on a random subset of the 1.3 million newly added compounds and measure whether the fraction within 100 meV/atom of the hull remains at or above 99 percent.
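A back-of-the-envelope version of this check: simulate the per-compound success indicator as a Bernoulli draw in place of an actual DFT re-relaxation, and attach a confidence interval to the observed fraction. All numbers here are illustrative.

```python
import math
import random

def validate_success_rate(n_sample=10_000, p_true=0.99, seed=1):
    """Simulate re-validating n_sample compounds, each passing with prob p_true.

    Returns the observed fraction and a normal-approximation 95% CI.
    In the real check, the Bernoulli draw would be replaced by an
    independent DFT relaxation and a hull-distance test.
    """
    rng = random.Random(seed)
    hits = sum(rng.random() < p_true for _ in range(n_sample))
    p_hat = hits / n_sample
    half = 1.96 * math.sqrt(p_hat * (1 - p_hat) / n_sample)
    return p_hat, (p_hat - half, p_hat + half)

p_hat, (lo, hi) = validate_success_rate()
print(f"observed {p_hat:.3f}, 95% CI [{lo:.3f}, {hi:.3f}]")
```

At a true rate of 99%, a random subset of ~10,000 compounds pins the fraction to within a few tenths of a percent, so the proposed audit is cheap relative to the 1.3 million validated entries.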
Original abstract
We present a novel multi-stage workflow for computational materials discovery that achieves a 99% success rate in identifying compounds within 100 meV/atom of thermodynamic stability, with a threefold improvement over previous approaches. By combining the Matra-Genoa generative model, Orb-v2 universal machine learning interatomic potential, and ALIGNN graph neural network for energy prediction, we generated 119 million candidate structures and added 1.3 million DFT-validated compounds to the ALEXANDRIA database, including 74 thousand new stable materials. The expanded ALEXANDRIA database now contains 5.8 million structures with 175 thousand compounds on the convex hull. Predicted structural disorder rates (37-43%) match experimental databases, unlike other recent AI-generated datasets. Analysis reveals fundamental patterns in space group distributions, coordination environments, and phase stability networks, including sub-linear scaling of convex hull connectivity. We release the complete dataset, including sAlex25 with 14 million out-of-equilibrium structures containing forces and stresses for training universal force fields. We demonstrate that fine-tuning a GRACE model on this data improves benchmark accuracy. All data, models, and workflows are freely available under Creative Commons licenses.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper presents a multi-stage AI workflow combining the Matra-Genoa generative model, Orb-v2 universal ML interatomic potential, and ALIGNN graph neural network to generate 119 million candidate structures, perform DFT validation on selected compounds, and expand the ALEXANDRIA database by 1.3 million entries (including 74 thousand new stable materials). It claims a 99% success rate for compounds within 100 meV/atom of thermodynamic stability (threefold improvement over prior methods), reports that predicted structural disorder rates (37-43%) match experimental databases, analyzes space-group and coordination patterns, and releases the full dataset plus sAlex25 (14 million out-of-equilibrium structures with forces/stresses) for training universal force fields, demonstrating improved GRACE model performance after fine-tuning.
Significance. If the 99% success rate and lack of systematic ML bias hold, the work provides a major expansion of a validated materials database (now 5.8 million structures, 175 thousand on the convex hull) with open data and models, plus large-scale training data that improves downstream ML accuracy. The explicit match to experimental disorder statistics and sub-linear hull-connectivity scaling are strengths that distinguish it from other AI-generated datasets and support broader use in materials discovery.
major comments (2)
- [Abstract and Results] The central 99% success-rate claim for structures within 100 meV/atom of the hull rests on the unverified assumption that the Matra-Genoa + Orb-v2 + ALIGNN cascade introduces no net energy bias favoring near-hull candidates. No error histogram, calibration curve on a near-hull test set, or direct ML-vs-DFT energy distribution for the final 1.3 M compounds is provided, leaving open the possibility that model under-prediction in selected space groups or coordinations inflates the observed rate.
- [Methods] The multi-stage filtering procedure (Matra-Genoa generative step followed by Orb-v2 and ALIGNN surrogates) requires explicit quantification of how selection thresholds and post-hoc rules affect the final DFT-validated set; without this, it is impossible to confirm that the threefold improvement is not partly an artifact of the filtering rather than intrinsic model performance.
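The bias diagnostic requested in the first major comment reduces to comparing surrogate and DFT energies on a held-out near-hull set. The sketch below uses synthetic energies, with an artificial -5 meV/atom bias injected so the statistic has something to detect.

```python
import numpy as np

rng = np.random.default_rng(7)
# Synthetic DFT ground-truth energies above the hull (eV/atom), near-hull set.
e_dft = rng.uniform(0.0, 0.2, size=5000)
# Synthetic surrogate predictions with an injected -5 meV/atom bias.
e_ml = e_dft + rng.normal(-0.005, 0.03, size=5000)

err = e_ml - e_dft
bias = float(np.mean(err))            # negative => systematic under-prediction
mae = float(np.mean(np.abs(err)))
hist, bin_edges = np.histogram(err, bins=20)  # the error histogram the referee asks for
print(f"bias = {bias*1000:.1f} meV/atom, MAE = {mae*1000:.1f} meV/atom")
```

A mean signed error significantly below zero on such a set would be exactly the kind of under-prediction that could inflate an apparent success rate.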
minor comments (2)
- [Results] The description of sAlex25 and its use for fine-tuning GRACE would benefit from a table summarizing benchmark improvements (e.g., force MAE before/after) to make the training-data value concrete.
- [Figures] Figure captions for space-group and coordination-environment distributions should explicitly state the binning or normalization used so that sub-linear hull-connectivity scaling can be directly compared to prior literature.
Simulated Author's Rebuttal
We thank the referee for the constructive comments and the opportunity to improve the clarity of our work. We address the major comments point by point below, with revisions planned to strengthen the supporting evidence for our claims.
Point-by-point responses
-
Referee: [Abstract and Results] The central 99% success-rate claim for structures within 100 meV/atom of the hull rests on the unverified assumption that the Matra-Genoa + Orb-v2 + ALIGNN cascade introduces no net energy bias favoring near-hull candidates. No error histogram, calibration curve on a near-hull test set, or direct ML-vs-DFT energy distribution for the final 1.3 M compounds is provided, leaving open the possibility that model under-prediction in selected space groups or coordinations inflates the observed rate.
Authors: We appreciate the referee's emphasis on rigorously validating the absence of systematic bias in the ML cascade. The reported 99% success rate is computed from direct DFT energies on the final selected compounds, which serve as the ground truth. To address the concern directly, the revised manuscript will include (i) an error histogram of ML-predicted versus DFT energies on a held-out near-hull validation set, (ii) a calibration curve for the ALIGNN surrogate restricted to low-energy candidates, and (iii) a side-by-side ML-versus-DFT energy distribution for a statistically representative subset of the 1.3 million DFT-validated entries. These additions will allow readers to quantify any residual bias and confirm that the threefold improvement is not an artifact of under-prediction in particular space groups or coordinations. revision: yes
-
Referee: [Methods] The multi-stage filtering procedure (Matra-Genoa generative step followed by Orb-v2 and ALIGNN surrogates) requires explicit quantification of how selection thresholds and post-hoc rules affect the final DFT-validated set; without this, it is impossible to confirm that the threefold improvement is not partly an artifact of the filtering rather than intrinsic model performance.
Authors: We agree that greater transparency on the filtering pipeline is necessary. The revised Methods section will be expanded to report retention fractions after each stage (Matra-Genoa generation, Orb-v2 energy cutoff, ALIGNN ranking, and post-hoc stability rules), together with the precise numerical thresholds employed. We will also include a supplementary table showing how varying these thresholds alters the final DFT-validated composition and the resulting success rate. This quantification will demonstrate that the observed improvement arises primarily from the generative model's ability to produce high-quality candidates rather than from overly aggressive filtering alone. revision: yes
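The promised retention table is straightforward to assemble once per-stage counts are known. In this sketch only the first and last counts (119 million generated, 1.3 million DFT-validated) come from the abstract; the two intermediate counts are placeholders.

```python
def retention_table(stages):
    """Return (name, count, fraction-of-previous-stage) rows for a filter cascade."""
    rows, prev = [], None
    for name, n in stages:
        frac = None if prev is None else n / prev
        rows.append((name, n, frac))
        prev = n
    return rows

stages = [
    ("generated (Matra-Genoa)", 119_000_000),  # from the abstract
    ("after Orb-v2 cutoff", 8_000_000),        # placeholder count
    ("after ALIGNN ranking", 1_600_000),       # placeholder count
    ("DFT-validated", 1_300_000),              # from the abstract
]
for name, n, frac in retention_table(stages):
    tail = "" if frac is None else f"  ({frac:.1%} of previous stage)"
    print(f"{name:26s}{n:>12,d}{tail}")
```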
Circularity Check
No circularity in derivation chain
Full rationale
The paper's central result—a 99% success rate for near-hull compounds—is an empirical count obtained by running independent DFT calculations on structures proposed by pre-trained generative and surrogate models (Matra-Genoa, Orb-v2, ALIGNN). No parameters are fitted inside the target dataset and then reused to generate the reported success metric; the DFT validation step lies outside the ML pipeline. No self-citation chains, uniqueness theorems, or ansatzes are invoked to justify the workflow or the reported statistics. The derivation therefore remains self-contained against external benchmarks.
Axiom & Free-Parameter Ledger
axioms (1)
- Domain assumption: DFT calculations supply reliable ground-truth energies for thermodynamic stability assessment.
Forward citations
Cited by 1 Pith paper
- OptiMat Alloys: a FAIR, living database of multi-principal element alloys enabled by a conversational agent. OptiMat Alloys is a conversational AI system that maintains a living FAIR database of multi-principal element alloy calculations and enables natural-language, on-demand computations with built-in uncertainty checks.
Reference graph
Works this paper leans on
- [1] We used the M3GNet energy to compute directly the distance to the DFT convex hull. We then selected the 88 thousand structures predicted to have the smallest distance to the hull for further DFT relaxation. This dataset is labeled m3gnet/m3gnet in Table I and Figure 2.
- [2] We used the FAENet model to predict the distance to the convex hull and selected the 69 thousand compounds closest to the hull for further DFT relaxation. This dataset is labeled m3gnet/faenet in Table I and Figure 2.
- [3] We used the ALIGNN model to predict the distance to the convex hull and selected the 143 thousand compounds closest to the hull for further DFT relaxation. This dataset is labeled m3gnet/alignn in Table I and Figure 2. The analysis of the DFT energies showed clearly that the distance to the hull obtained directly from M3GNet was not a good estimator of...
- [4] Provides a rigorous mathematical foundation for machine learning interatomic potentials, unifying both local and message-passing graph neural network (GNN) architectures in a single framework. GRACE generalizes the Atomic Cluster Expansion (ACE) [67], which builds on a complete basis for local, star graphs, by introducing a complete set of tree graph cluste... (2022)
- [5] A. Merchant, S. Batzner, S. S. Schoenholz, M. Aykol, G. Cheon, and E. D. Cubuk, Nature 624, 80–85 (2023)
- [6] S. D. Griesemer, B. Baldassarri, R. Zhu, J. Shen, K. Pal, C. W. Park, and C. Wolverton, Science Advances 11, 10.1126/sciadv.adq1431 (2025)
- [7] J. Schmidt, T. F. Cerqueira, A. H. Romero, A. Loew, F. Jäger, H.-C. Wang, S. Botti, and M. A. Marques, Materials Today Physics 48, 101560 (2024)
- [8] J. Schmidt, N. Hoffmann, H. Wang, P. Borlido, P. J. M. A. Carriço, T. F. T. Cerqueira, S. Botti, and M. A. L. Marques, Adv. Mater. 35, 2210788 (2023)
- [9] H. J. Kulik, T. Hammerschmidt, J. Schmidt, S. Botti, M. A. L. Marques, M. Boley, M. Scheffler, M. Todorović, P. Rinke, C. Oses, A. Smolyanyuk, S. Curtarolo, A. Tkatchenko, A. P. Bartók, S. Manzhos, M. Ihara, T. Carrington, J. Behler, O. Isayev, M. Veit, A. Grisafi, J. Nigam, M. Ceriotti, K. T. Schütt, J. Westermayr, M. Gastegger, R. J. Maurer, B. Kalita, ... (2022)
- [10] Z. W. Ulissi, K. Tran, J. Yoon, M. S. Shuaibi, L. Mingjie, N. Zhan, K. Broderick, and J. R. Kitchin, Computational Catalysis (Royal Society of Chemistry, 2024) pp. 224–279
- [11] J. Abed, J. Kim, M. Shuaibi, B. Wander, B. Duijf, S. Mahesh, H. Lee, V. Gharakhanyan, S. Hoogland, E. Irtem, J. Lan, N. Schouten, A. U. Vijayakumar, J. Hattrick-Simpers, J. R. Kitchin, Z. W. Ulissi, A. van Vugt, E. H. Sargent, D. Sinton, and C. L. Zitnick, Open Catalyst Experiments 2024 (OCx24): Bridging Experiments and Computational Models (2024)
- [12]
- [13] J. Schmidt, L. Pettersson, C. Verdozzi, S. Botti, and M. A. L. Marques, Sci. Adv. 7, eabi7948 (2021)
- [14] M. Neumann, J. Gin, B. Rhodes, S. Bennett, Z. Li, H. Choubisa, A. Hussey, and J. Godwin, Orb: A Fast, Scalable Neural Network Potential (2024)
- [15] I. Batatia, D. P. Kovacs, G. N. C. Simm, C. Ortner, and G. Csanyi, in Adv. Neural Inf. Process. Syst., edited by A. H. Oh, A. Agarwal, D. Belgrave, and K. Cho (2022)
- [16]
- [17] A. Bochkarev, Y. Lysogorskiy, and R. Drautz, Physical Review X 14, 021036 (2024)
- [18] L. Barroso-Luque, M. Shuaibi, X. Fu, B. M. Wood, M. Dzamba, M. Gao, A. Rizvi, C. L. Zitnick, and Z. W. Ulissi, Open Materials 2024 (OMat24) Inorganic Materials Dataset and Models, arXiv preprint arXiv:2410.12771, 10.48550/arXiv.2410.12771 (2024)
- [19]
- [20]
- [21] J. Zeng, D. Zhang, A. Peng, X. Zhang, S. He, Y. Wang, X. Liu, H. Bi, Y. Li, C. Cai, C. Zhang, Y. Du, J.-X. Zhu, P. Mo, Z. Huang, Q. Zeng, S. Shi, X. Qin, Z. Yu, C. Luo, Y. Ding, Y.-P. Liu, R. Shi, Z. Wang, S. L. Bore, J. Chang, Z. Deng, Z. Ding, S. Han, W. Jiang, G. Ke, Z. Liu, D. Lu, K. Muraoka, H. Oliaei, A. K. Singh, H. Que, W. Xu, Z. Xu, Y.-B. Zhuang, J. Dai...
- [22] Y. Lysogorskiy, A. Bochkarev, and R. Drautz, Graph Atomic Cluster Expansion for Foundational Machine Learning Interatomic Potentials (2025), arXiv:2508.17936 [cond-mat]
- [23] J. Riebesell, R. E. A. Goodall, P. Benner, Y. Chiang, B. Deng, G. Ceder, M. Asta, A. A. Lee, A. Jain, and K. A. Persson, arXiv, 10.48550/ARXIV.2308.14920 (2023)
- [24] A. Loew, D. Sun, H.-C. Wang, S. Botti, and M. A. L. Marques, npj Computational Materials 11, 178 (2025)
- [25] P. Eastman, R. Galvelis, R. P. Peláez, C. R. A. Abreu, S. E. Farr, E. Gallicchio, A. Gorenko, M. M. Henry, F. Hu, J. Huang, A. Krämer, J. Michel, J. A. Mitchell, V. S. Pande, J. P. Rodrigues, J. Rodriguez-Guerra, A. C. Simmonett, S. Singh, J. Swails, P. Turner, Y. Wang, I. Zhang, J. D. Chodera, G. D. Fabritiis, and T. E. Markland, OpenMM 8: Molecular Dyn...
- [26]
- [27]
- [28]
- [29]
- [30] H.-C. Wang, J. Schmidt, M. A. L. Marques, L. Wirtz, and A. H. Romero, 2D Mater. 10, 035007 (2023)
- [31] A. Vishina, O. Eriksson, and H. C. Herper, Acta Materialia 261, 119348 (2023)
- [32] T. F. T. Cerqueira, A. Sanna, and M. A. L. Marques, Advanced Materials 36, 10.1002/adma.202307085 (2023)
- [33] T. F. T. Cerqueira, Y. Fang, I. Errea, A. Sanna, and M. A. L. Marques, Advanced Functional Materials 34, 10.1002/adfm.202404043 (2024)
- [34] T. H. B. da Silva, T. Cavignac, T. F. T. Cerqueira, H.-C. Wang, and M. A. L. Marques, Materials Horizons, 10.1039/d4mh01753f (2025)
- [35] J. Schmidt, J. Shi, P. Borlido, L. Chen, S. Botti, and M. A. L. Marques, Chem. Mater. 29, 5090–5103 (2017)
- [36] H.-C. Wang, S. Botti, and M. A. L. Marques, npj Comput. Mater. 7, 12 (2021)
- [37] A. Merchant, S. Batzner, S. S. Schoenholz, M. Aykol, G. Cheon, and E. D. Cubuk, Nature 624, 80 (2023)
- [38] K. Jakob, A. Walsh, K. Reuter, and J. T. Margraf, ChemRxiv, 10.26434/chemrxiv-2025-f52qs (2025)
- [39] T. F. T. Cerqueira, A. Sanna, and M. A. L. Marques, Adv. Mater. 36, 2307085 (2024)
- [40]
- [41] S. Fredericks, K. Parrish, D. Sayre, and Q. Zhu, Comput. Phys. Commun. 261, 107810 (2021)
- [42] A. R. Oganov, ed., Modern Methods of Crystal Structure Prediction (Wiley-VCH, Weinheim, Germany, 2011)
- [43] A. R. Oganov, A. O. Lyakhov, and M. Valle, Accounts of Chemical Research 44, 227 (2011)
- [44] Y. Wang, J. Lv, L. Zhu, and Y. Ma, Computer Physics Communications 183, 2063 (2012)
- [45] S. Goedecker, The Journal of Chemical Physics 120, 9911 (2004)
- [46]
- [47] P.-P. D. Breuck, H. A. Piracha, G.-M. Rignanese, and M. A. L. Marques, A Generative Material Transformer Using Wyckoff Representation (2025)
- [48]
- [49] M. Neumann, J. Gin, B. Rhodes, S. Bennett, Z. Li, H. Choubisa, A. Hussey, and J. Godwin, Orb: A Fast, Scalable Neural Network Potential (2024)
- [50] J. Schmidt, L. Chen, S. Botti, and M. A. L. Marques, J. Chem. Phys. 148, 241728 (2018)
- [51] D. Zagorac, H. Müller, S. Ruehl, J. Zagorac, and S. Rehme, Journal of Applied Crystallography 52, 918 (2019)
- [52]
- [53] A. J. C. Wilson, Acta Crystallographica Section A: Foundations of Crystallography 44, 715 (1988)
- [54] A. J. C. Wilson, Acta Crystallographica Section A: Foundations of Crystallography 46, 742 (1990)
- [55] S. W. Wukovitz and T. O. Yeates, Nature Structural & Molecular Biology 2, 1062 (1995)
- [56] H. Pan, A. M. Ganose, M. Horton, M. Aykol, K. A. Persson, N. E. R. Zimmermann, and A. Jain, Inorganic Chemistry 60, 1590 (2021)
- [57] S. P. Ong, W. D. Richards, A. Jain, G. Hautier, M. Kocher, S. Cholia, D. Gunter, V. L. Chevrier, K. A. Persson, and G. Ceder, Computational Materials Science 68, 314 (2013)
- [58] S. V. Krivovichev, Coordination Chemistry Reviews 498, 215484 (2024)
- [59]
- [60] V. I. Hegde, M. Aykol, S. Kirklin, and C. Wolverton, Science Advances 6, 10.1126/sciadv.aay5606 (2020)
- [61] G. Benedini, A. Loew, M. Hellstrom, S. Botti, and M. A. L. Marques, Universal Machine Learning Potential for Systems with Reduced Dimensionality (2025), arXiv:2508.15614 [cond-mat]
- [62]
- [63]
- [64] G. Kresse and J. Furthmüller, Comp. Mater. Sci. 6, 15–50 (1996)
- [65]
- [66] P. E. Blöchl, Phys. Rev. B 50, 17953–17979 (1994)
- [67] J. P. Perdew, K. Burke, and M. Ernzerhof, Phys. Rev. Lett. 77, 3865–3868 (1996)
- [68]
- [69] P. Ramachandran, B. Zoph, and Q. V. Le, Searching for Activation Functions (2017)
- [70] I. Loshchilov and F. Hutter, Decoupled Weight Decay Regularization (2017)
- [71]
- [72] J. Riebesell, H. D. Yang, R. Goodall, S. G. Baird, M.-H. Chiu, B. B. Maranville, Colin, J. George, J. Wang, and T. Keane, janosh/pymatviz: v0.17.3 (2025)