Open Materials 2024 (OMat24) Inorganic Materials Dataset and Models
Mixed citation behavior. Most common role is background (56%).
abstract
The ability to discover new materials with desirable properties is critical for numerous applications from helping mitigate climate change to advances in next generation computing hardware. AI has the potential to accelerate materials discovery and design by more effectively exploring the chemical space compared to other computational methods or by trial-and-error. While substantial progress has been made on AI for materials data, benchmarks, and models, a barrier that has emerged is the lack of publicly available training data and open pre-trained models. To address this, we present a Meta FAIR release of the Open Materials 2024 (OMat24) large-scale open dataset and an accompanying set of pre-trained models. OMat24 contains over 110 million density functional theory (DFT) calculations focused on structural and compositional diversity. Our EquiformerV2 models achieve state-of-the-art performance on the Matbench Discovery leaderboard and are capable of predicting ground-state stability and formation energies to an F1 score above 0.9 and an accuracy of 20 meV/atom, respectively. We explore the impact of model size, auxiliary denoising objectives, and fine-tuning on performance across a range of datasets including OMat24, MPtraj, and Alexandria. The open release of the OMat24 dataset and models enables the research community to build upon our efforts and drive further advancements in AI-assisted materials science.
representative citing papers
JanusPipe introduces SymFold and WaveK to enable efficient 3D-parallel training for conservative MLIPs, reporting 1.51x and 1.45x average throughput gains over 1F1B and Hanayo baselines on 32 GPUs.
Lang2MLIP is an LLM multi-agent framework that automates end-to-end development of machine learning interatomic potentials from natural language input for heterogeneous materials systems.
MatRIS-MoE and Janus enable efficient exascale training of billion-parameter universal interatomic potentials by addressing second-order derivative computation and communication overheads.
CarNet develops irreducible Cartesian natural tensors and an equivariant model that matches leading spherical-tensor performance for ML interatomic potentials and high-rank tensor predictions like elastic constants.
Pre-training ML interaction potentials on classical force fields followed by ab initio fine-tuning produces stable and accurate molecular dynamics simulations for gas-phase molecules, liquid water, and hydrogen combustion.
A system-specific MACE-sqs model trained on spin-polarized PBE DFT data for Fe-Ni SQS structures outperforms foundation models for equations of state, volumes, elastic constants and thermal expansion but all models incorrectly increase bcc-to-hcp transition pressure with Ni content.
CrystalREPA closes the representation gap between crystal generators and universal MLIPs via contrastive alignment, yielding more stable and valid generated crystals while revealing that MLIP teacher quality is better predicted by representation distinguishability than by leaderboard accuracy.
Structural pruning of SO(3) equivariant atomistic models from large checkpoints yields 1.5-4x fewer parameters and 2.5-4x less pre-training compute than small models trained from scratch, while outperforming them on most Matbench Discovery metrics and downstream tasks.
Density diversity in training data is the key factor for making machine learning interatomic potentials transferable across thermodynamic states, outperforming temperature diversity.
VibroML automates remediation of dynamic instabilities in crystalline materials by combining MLIPs with genetic algorithms for polymorph search, finite-temperature MD validation, and compositional alloying to yield stable structures from databases like Alexandria.
PET-UAFD ensemble of ML potentials, calibrated on experimental cohesive energies and moduli, matches experimental accuracy on liquid properties and supplies uncertainty estimates via the PET-EXP protocol.
An agentic framework fusing large atomic and language models rediscovers 66 known superconductors and guides experimental verification of four new ones with transition temperatures from 2.5 K to 6.5 K.
A combined generative model, ML potential, and graph neural network pipeline expands the Alexandria database by 1.3 million DFT-validated compounds with 99% success near the convex hull and releases training data for universal force fields.
An end-to-end framework combining domain separation, lightweight ML potentials, and de novo in silico synthesis enables quantitative atomistic modeling of mesoporous metallosilicates that matches experimental densities, pair distribution functions, IR spectra, and hydroxyl densities.
Machine learning interatomic potentials fine-tuned on first-principles relaxation data accurately reproduce phonon spectra and optical lineshapes for defects, matching explicit calculations and experiments.
Universal MLIPs serve as configuration generators whose DFT-relabeled subsamples enable one-shot or iterative training of material-specific MLIPs that recover accurate reactive energy profiles with 600-2000 DFT calculations.
Fine-tuned MACE MLIPs achieve lower mean absolute errors on catalytic reaction energies and barriers than from-scratch models, with a large fine-tuned model performing best on both metallic and oxide systems including out-of-distribution cases.
MatterSim-MT is a multi-task ML foundation model pretrained on 35M+ structures for in silico materials property prediction and complex simulations.
OptiMat Alloys is a conversational AI system that maintains a living FAIR database of multi-principal element alloy calculations and enables natural-language, on-demand computations with built-in uncertainty checks.
Benchmarks of 15 MLIPs show parameter count and training set size correlate with accuracy, architecture drives speed and memory, and explicit Coulomb terms provide no benefit.
Different uMLIPs encode chemical space in distinct ways, with high cross-model feature reconstruction errors, and fine-tuning preserves strong pre-training bias in the latent features.
Centimeter-scale epitaxial growth of phase-pure crystalline 2D CrCl3 films achieved on mica via controlled physical vapor transport with innovations in light management, high carrier-gas flow, and moisture control.
Pretrained UMA model reproduces chemisorbed S and O coverage under 15 eV O+ and O2+ bombardment on WS2 without fine-tuning; fine-tuning lowers energy MAE to 4.5e-3 eV/atom and force MAE to 0.076 eV/Å.
citing papers explorer
- SLayerGen: a Crystal Generative Model for all Space and Layer Groups
SLayerGen generates crystals invariant to any space or layer group via autoregressive lattice and Wyckoff sampling plus equivariant diffusion, achieving gains over bulk models on diperiodic materials after correcting a prior loss inconsistency for hexagonal groups.
- Atomistic Machine Learning with Irreducible Cartesian Natural Tensors
CarNet develops irreducible Cartesian natural tensors and an equivariant model that matches leading spherical-tensor performance for ML interatomic potentials and high-rank tensor predictions like elastic constants.
- Can MACE Potentials Accurately Describe Magnetism and Phase Stability in Fe-Ni Alloys? A Systematic Benchmark
A system-specific MACE-sqs model trained on spin-polarized PBE DFT data for Fe-Ni SQS structures outperforms foundation models for equations of state, volumes, elastic constants and thermal expansion but all models incorrectly increase bcc-to-hcp transition pressure with Ni content.
- CrystalREPA: Transferring Physical Priors from Universal MLIPs to Crystal Generative Models
CrystalREPA closes the representation gap between crystal generators and universal MLIPs via contrastive alignment, yielding more stable and valid generated crystals while revealing that MLIP teacher quality is better predicted by representation distinguishability than by leaderboard accuracy.
- VibroML: an automated toolkit for high-throughput vibrational analysis and dynamic instability remediation of crystalline materials using machine-learned potentials
VibroML automates remediation of dynamic instabilities in crystalline materials by combining MLIPs with genetic algorithms for polymorph search, finite-temperature MD validation, and compositional alloying to yield stable structures from databases like Alexandria.
- AI-Driven Expansion and Application of the Alexandria Database
A combined generative model, ML potential, and graph neural network pipeline expands the Alexandria database by 1.3 million DFT-validated compounds with 99% success near the convex hull and releases training data for universal force fields.
- An experimentally validated end-to-end framework for operando modeling of intrinsically complex metallosilicates
An end-to-end framework combining domain separation, lightweight ML potentials, and de novo in silico synthesis enables quantitative atomistic modeling of mesoporous metallosilicates that matches experimental densities, pair distribution functions, IR spectra, and hydroxyl densities.
- Machine Learning Phonon Spectra for Fast and Accurate Optical Lineshapes of Defects
Machine learning interatomic potentials fine-tuned on first-principles relaxation data accurately reproduce phonon spectra and optical lineshapes for defects, matching explicit calculations and experiments.
- Universal Interatomic Potentials as Configuration-Space Generators for One-Shot and Iterative Fine-Tuning of Ab Initio-Accurate Material-Specific Models
Universal MLIPs serve as configuration generators whose DFT-relabeled subsamples enable one-shot or iterative training of material-specific MLIPs that recover accurate reactive energy profiles with 600-2000 DFT calculations.
- MatterSim-MT: A multi-task foundation model for in silico materials characterization
MatterSim-MT is a multi-task ML foundation model pretrained on 35M+ structures for in silico materials property prediction and complex simulations.
- OptiMat Alloys: a FAIR, living database of multi-principal element alloys enabled by a conversational agent
OptiMat Alloys is a conversational AI system that maintains a living FAIR database of multi-principal element alloy calculations and enables natural-language, on-demand computations with built-in uncertainty checks.
- Tailored Vapor Deposition Unlocks Large-Grain, Wafer-Scale Epitaxial Growth of 2D Magnetic CrCl3
Centimeter-scale epitaxial growth of phase-pure crystalline 2D CrCl3 films achieved on mica via controlled physical vapor transport with innovations in light management, high carrier-gas flow, and moisture control.
- Fine-Tuning a Universal Machine-Learned Interatomic Potential for Oxygen Plasma Interactions with WS$_2$
Pretrained UMA model reproduces chemisorbed S and O coverage under 15 eV O+ and O2+ bombardment on WS2 without fine-tuning; fine-tuning lowers energy MAE to 4.5e-3 eV/atom and force MAE to 0.076 eV/Å.
- Accurate and Efficient Interatomic Potentials for Dislocations in InP
New ACE and MACE potentials for InP achieve at most 4% error on partial dislocation formation energies versus DFT, outperforming literature models by factors of 4-12 while being computationally faster.
- Comparing fine-tuning strategies of MACE machine learning force field for modeling Li-ion diffusion in LiF for batteries
MACE-MPA-0 predicts a Li diffusion activation energy (Ea) of 0.22 eV in LiF, and a version fine-tuned on 300 points gives 0.20 eV, close to the DeePMD reference of 0.24 eV while using far less training data.
- Atomistic Modeling of Chemical Disorder in Materials: Bridging Classical Methods and AI-Assisted Approaches
A review of classical and AI-assisted methods for modeling chemical disorder in atomistic simulations of alloys and complex materials.
- Six Open Questions in Machine-Learned Interatomic Potential Foundation Models
This perspective article develops a definition of foundational MLIPs and poses six open questions that the authors believe will define future research in machine-learned interatomic potentials.
- Benchmarking Chemically Scalable Machine-Learning Interatomic Potentials for Large-Scale Simulations of Multicomponent Alloys