Scalable Data-Driven Basis Selection for Linear Machine Learning Interatomic Potentials
Pith reviewed 2026-05-22 19:13 UTC · model grok-4.3
The pith
Active set algorithms produce sparse ACE models that run faster and generalize better than dense models.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The authors show that active set algorithms for automated, data-driven feature selection within the Atomic Cluster Expansion yield sparse linear models. These sparse models deliver consistent gains in computational efficiency, generalization accuracy, and interpretability over dense ACE models on multiple benchmark datasets. The algorithms further generate full paths of models that span a range of cost-to-accuracy ratios.
What carries the argument
Active set algorithms that iteratively select or discard basis functions in the Atomic Cluster Expansion according to their contribution to the training loss, thereby constructing sparse linear models from data.
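The idea of building a sparse linear model by repeatedly picking the basis function that most reduces the training loss can be illustrated with a minimal sketch. It uses orthogonal matching pursuit, a simpler greedy relative of active set methods, and is not the algorithm from the paper: it only adds columns and never discards them. The function name `greedy_basis_path`, the synthetic design matrix, and all sizes are assumptions chosen for illustration.

```python
import numpy as np


def greedy_basis_path(A, y, max_features):
    """Greedy forward selection (orthogonal matching pursuit).

    A : (n_samples, n_features) design matrix, one column per basis function.
    y : (n_samples,) target values.
    Returns a list of (active_indices, coefficients, residual_norm),
    one entry per model size, i.e. a path of increasingly dense models.
    """
    n_features = A.shape[1]
    col_norms = np.linalg.norm(A, axis=0)
    col_norms[col_norms == 0.0] = 1.0  # guard against all-zero columns
    in_model = np.zeros(n_features, dtype=bool)
    active = []
    residual = y.astype(float).copy()
    path = []
    for _ in range(min(max_features, n_features)):
        # Score each unused column by its normalized correlation with the residual.
        score = np.abs(A.T @ residual) / col_norms
        score[in_model] = -np.inf
        j = int(np.argmax(score))
        in_model[j] = True
        active.append(j)
        # Refit all active coefficients by least squares on the selected columns.
        coef, *_ = np.linalg.lstsq(A[:, active], y, rcond=None)
        residual = y - A[:, active] @ coef
        path.append((list(active), coef, float(np.linalg.norm(residual))))
    return path


# Synthetic data (hypothetical): 50 candidate basis functions, 3 truly relevant.
rng = np.random.default_rng(0)
A = rng.standard_normal((200, 50))
true_coef = np.zeros(50)
true_coef[[3, 17, 42]] = [2.0, -3.0, 1.5]
y = A @ true_coef + 0.01 * rng.standard_normal(200)

path = greedy_basis_path(A, y, max_features=5)
for active, coef, res in path:
    print(len(active), sorted(active), round(res, 4))
```

Each entry of `path` is a complete model of a given size, so a single run yields a sequence of models from cheap and coarse to more expensive and more accurate.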
If this is right
- Atomistic simulations become cheaper to run at fixed accuracy because fewer basis functions are evaluated per atom.
- Generalization to new configurations improves, lowering the risk of unphysical behavior outside the training set.
- Model interpretability rises because only the retained basis functions need to be examined.
- Development time drops since entire cost-accuracy curves are obtained from one run instead of repeated hyperparameter searches.
Where Pith is reading between the lines
- The same selection procedure could be applied to other linear regression models for interatomic potentials to obtain similar sparsity benefits.
- Production codes could expose the cost-accuracy path as a user option so that end users pick the operating point best suited to their scale and accuracy needs.
- Transfer tests on datasets drawn from different chemical elements or extreme conditions would directly check whether the observed gains survive outside the original benchmarks.
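Picking an operating point from a cost-accuracy path could look like the following minimal sketch. The helper `pick_operating_point`, the tolerance rule, and the numbers in the example path are all hypothetical and are not taken from the paper; model cost is approximated here by the number of retained basis functions.

```python
def pick_operating_point(path, tolerance=0.10):
    """Return the cheapest model whose error is close to the best on the path.

    path      : list of (n_basis_functions, validation_error) pairs.
    tolerance : accepted relative error increase over the best model
                (0.10 means "within 10 percent of the lowest error").
    """
    if not path:
        raise ValueError("path must contain at least one model")
    best_error = min(err for _, err in path)
    threshold = best_error * (1.0 + tolerance)
    admissible = [(n, err) for n, err in path if err <= threshold]
    # Among admissible models, prefer the one with the fewest basis functions.
    return min(admissible, key=lambda entry: entry[0])


# Made-up path for illustration only: (basis size, validation error).
example_path = [(10, 0.050), (20, 0.021), (40, 0.012), (80, 0.011), (160, 0.0105)]

print(pick_operating_point(example_path, tolerance=0.10))
print(pick_operating_point(example_path, tolerance=0.20))
```

A looser tolerance trades accuracy for a smaller basis, which is the kind of choice the path makes available without refitting.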
Load-bearing premise
The benchmark datasets used in the tests are representative of the atomic configurations that appear in actual production simulations.
What would settle it
A simulation on atomic configurations absent from the original benchmarks in which either the sparse model’s prediction error exceeds that of the corresponding dense model or the measured wall-clock speedup disappears.
Original abstract
Machine learning interatomic potentials (MLIPs) provide an effective approach for accurately and efficiently modeling atomic interactions, expanding the capabilities of atomistic simulations to complex systems. However, a priori feature selection leads to high complexity, which can be detrimental to both computational cost and generalization, resulting in a need for hyperparameter tuning. We demonstrate the benefits of active set algorithms for automated data-driven feature selection. The proposed methods are implemented within the Atomic Cluster Expansion (ACE) framework. Computational tests conducted on a variety of benchmark datasets indicate that sparse ACE models consistently enhance computational efficiency, generalization accuracy and interpretability over dense ACE models. An added benefit of the proposed algorithms is that they produce entire paths of models with varying cost/accuracy ratio.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript proposes active-set algorithms for automated, data-driven feature selection within the Atomic Cluster Expansion (ACE) framework for linear machine learning interatomic potentials. It claims that the resulting sparse ACE models deliver consistent gains in computational efficiency, generalization accuracy, and interpretability relative to dense ACE models, as shown on a variety of benchmark datasets, while also generating entire paths of models with tunable cost/accuracy ratios.
Significance. If the empirical results are robust, the work offers a practical route to reducing manual hyperparameter tuning and a priori basis complexity in MLIPs. The production of model paths and the emphasis on interpretability are concrete strengths that could aid adoption in large-scale atomistic simulations.
Major comments (1)
- [Computational tests / benchmark results] The central empirical claim of consistent generalization improvements rests on results from 'a variety of benchmark datasets' (abstract and computational tests section). The manuscript does not demonstrate that these datasets include sufficient structural, compositional, and thermodynamic diversity (e.g., defects, surfaces, or disordered phases) to ensure the selected sparse bases remain optimal outside the training distribution; this is load-bearing for the transferability of the reported accuracy and efficiency gains.
Minor comments (2)
- [Methods] Clarify the precise active-set algorithm variant employed and any regularization choices in the methods section to aid reproducibility.
- Ensure all benchmark dataset references and preprocessing steps are explicitly cited or described.
Simulated Author's Rebuttal
We thank the referee for their constructive feedback, which helps clarify the scope and limitations of our empirical claims. We address the major comment point by point below.
- Referee: The central empirical claim of consistent generalization improvements rests on results from 'a variety of benchmark datasets' (abstract and computational tests section). The manuscript does not demonstrate that these datasets include sufficient structural, compositional, and thermodynamic diversity (e.g., defects, surfaces, or disordered phases) to ensure the selected sparse bases remain optimal outside the training distribution; this is load-bearing for the transferability of the reported accuracy and efficiency gains.
  Authors: We agree that explicit documentation of dataset diversity is necessary to support claims of transferability. The benchmarks employed are standard datasets from the ACE and MLIP literature (e.g., elemental and alloy systems with varying degrees of structural complexity). In the revised manuscript we will add a dedicated paragraph and accompanying table in the Computational Tests section that quantifies the structural (defects, surfaces, grain boundaries), compositional, and thermodynamic coverage of each dataset. We will also include a brief discussion of how the active-set selection procedure adapts to these variations. Because the current results already span multiple chemistries and phases, we view this as a clarification rather than a fundamental change to the conclusions.
  Revision: partial
Circularity Check
No significant circularity detected
The paper proposes and empirically validates active-set algorithms for data-driven feature selection inside the existing ACE framework. All load-bearing claims (improved efficiency, generalization, and interpretability of sparse versus dense models) rest on direct numerical comparisons across benchmark datasets rather than any derivation, uniqueness theorem, or fitted parameter that is redefined as a prediction. No self-definitional steps, fitted-input predictions, or load-bearing self-citations appear in the reported chain; the results are therefore independent of the inputs they are tested against.
Axiom & Free-Parameter Ledger
Lean theorems connected to this paper
- IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel
  Relation between the paper passage and the cited Recognition theorem: unclear
  Paper passage: "We demonstrate the benefits of active set algorithms for automated data-driven feature selection. The proposed methods are implemented within the Atomic Cluster Expansion (ACE) framework."
- IndisputableMonolith/Foundation/BranchSelection.lean · branch_selection
  Relation between the paper passage and the cited Recognition theorem: unclear
  Paper passage: "sparse ACE models consistently enhance computational efficiency, generalization accuracy and interpretability over dense ACE models"
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.