SLayerGen generates crystals invariant to any space or layer group via autoregressive lattice and Wyckoff sampling plus equivariant diffusion, achieving gains over bulk models on diperiodic materials after correcting a prior loss inconsistency for hexagonal groups.
Open Materials 2024 (OMat24) Inorganic Materials Dataset and Models
Mixed citation behavior. Most common role is background (56%).
abstract
The ability to discover new materials with desirable properties is critical for numerous applications from helping mitigate climate change to advances in next generation computing hardware. AI has the potential to accelerate materials discovery and design by more effectively exploring the chemical space compared to other computational methods or by trial-and-error. While substantial progress has been made on AI for materials data, benchmarks, and models, a barrier that has emerged is the lack of publicly available training data and open pre-trained models. To address this, we present a Meta FAIR release of the Open Materials 2024 (OMat24) large-scale open dataset and an accompanying set of pre-trained models. OMat24 contains over 110 million density functional theory (DFT) calculations focused on structural and compositional diversity. Our EquiformerV2 models achieve state-of-the-art performance on the Matbench Discovery leaderboard and are capable of predicting ground-state stability and formation energies to an F1 score above 0.9 and an accuracy of 20 meV/atom, respectively. We explore the impact of model size, auxiliary denoising objectives, and fine-tuning on performance across a range of datasets including OMat24, MPtraj, and Alexandria. The open release of the OMat24 dataset and models enables the research community to build upon our efforts and drive further advancements in AI-assisted materials science.
representative citing papers
JanusPipe introduces SymFold and WaveK to enable efficient 3D-parallel training for conservative MLIPs, reporting 1.51x and 1.45x average throughput gains over 1F1B and Hanayo baselines on 32 GPUs.
Lang2MLIP is an LLM multi-agent framework that automates end-to-end development of machine learning interatomic potentials from natural language input for heterogeneous materials systems.
MatRIS-MoE and Janus enable efficient exascale training of billion-parameter universal interatomic potentials by addressing second-order derivative computation and communication overheads.
CarNet develops irreducible Cartesian natural tensors and an equivariant model that matches leading spherical-tensor performance for ML interatomic potentials and high-rank tensor predictions like elastic constants.
Pre-training ML interaction potentials on classical force fields followed by ab initio fine-tuning produces stable and accurate molecular dynamics simulations for gas-phase molecules, liquid water, and hydrogen combustion.
A system-specific MACE-sqs model trained on spin-polarized PBE DFT data for Fe-Ni SQS structures outperforms foundation models for equations of state, volumes, elastic constants, and thermal expansion, but all models incorrectly predict that the bcc-to-hcp transition pressure increases with Ni content.
CrystalREPA closes the representation gap between crystal generators and universal MLIPs via contrastive alignment, yielding more stable and valid generated crystals while revealing that MLIP teacher quality is better predicted by representation distinguishability than by leaderboard accuracy.
Structural pruning of SO(3) equivariant atomistic models from large checkpoints yields 1.5-4x fewer parameters and 2.5-4x less pre-training compute than small models trained from scratch, while outperforming them on most Matbench Discovery metrics and downstream tasks.
Density diversity in training data is the key factor for making machine learning interatomic potentials transferable across thermodynamic states, outperforming temperature diversity.
VibroML automates remediation of dynamic instabilities in crystalline materials by combining MLIPs with genetic algorithms for polymorph search, finite-temperature MD validation, and compositional alloying to yield stable structures from databases like Alexandria.
The PET-UAFD ensemble of ML potentials, calibrated on experimental cohesive energies and moduli, matches experimental accuracy on liquid properties and supplies uncertainty estimates via the PET-EXP protocol.
An agentic framework fusing large atomic and language models rediscovers 66 known superconductors and guides experimental verification of four new ones with transition temperatures from 2.5 K to 6.5 K.
A combined generative model, ML potential, and graph neural network pipeline expands the Alexandria database by 1.3 million DFT-validated compounds with 99% success near the convex hull and releases training data for universal force fields.
An end-to-end framework combining domain separation, lightweight ML potentials, and de novo in silico synthesis enables quantitative atomistic modeling of mesoporous metallosilicates that matches experimental densities, pair distribution functions, IR spectra, and hydroxyl densities.
Machine learning interatomic potentials fine-tuned on first-principles relaxation data accurately reproduce phonon spectra and optical lineshapes for defects, matching explicit calculations and experiments.
Universal MLIPs serve as configuration generators whose DFT-relabeled subsamples enable one-shot or iterative training of material-specific MLIPs that recover accurate reactive energy profiles with 600-2000 DFT calculations.
Fine-tuned MACE MLIPs achieve lower mean absolute errors on catalytic reaction energies and barriers than from-scratch models, with a large fine-tuned model performing best on both metallic and oxide systems including out-of-distribution cases.
MatterSim-MT is a multi-task ML foundation model pretrained on 35M+ structures for in silico materials property prediction and complex simulations.
OptiMat Alloys is a conversational AI system that maintains a living FAIR database of multi-principal element alloy calculations and enables natural-language, on-demand computations with built-in uncertainty checks.
Benchmarks of 15 MLIPs show parameter count and training set size correlate with accuracy, architecture drives speed and memory, and explicit Coulomb terms provide no benefit.
Different uMLIPs encode chemical space in distinct ways, with high cross-model feature reconstruction errors, and fine-tuning preserves strong pre-training bias in the latent features.
Centimeter-scale epitaxial growth of phase-pure crystalline 2D CrCl3 films is achieved on mica via controlled physical vapor transport, with innovations in light management, high carrier-gas flow, and moisture control.
A pretrained UMA model reproduces chemisorbed S and O coverage under 15 eV O+ and O2+ bombardment on WS2 without fine-tuning; fine-tuning lowers the energy MAE to 4.5e-3 eV/atom and the force MAE to 0.076 eV/Å.
citing papers explorer
- Lang2MLIP: End-to-End Language-to-Machine Learning Interatomic Potential Development with Autonomous Agentic Workflows
  Lang2MLIP is an LLM multi-agent framework that automates end-to-end development of machine learning interatomic potentials from natural language input for heterogeneous materials systems.
- Compact SO(3) Equivariant Atomistic Foundation Models via Structural Pruning
  Structural pruning of SO(3) equivariant atomistic models from large checkpoints yields 1.5-4x fewer parameters and 2.5-4x less pre-training compute than small models trained from scratch, while outperforming them on most Matbench Discovery metrics and downstream tasks.
- Agentic Fusion of Large Atomic and Language Models to Accelerate Superconductor Discovery
  An agentic framework fusing large atomic and language models rediscovers 66 known superconductors and guides experimental verification of four new ones with transition temperatures from 2.5 K to 6.5 K.
- An experimentally validated end-to-end framework for operando modeling of intrinsically complex metallosilicates
  An end-to-end framework combining domain separation, lightweight ML potentials, and de novo in silico synthesis enables quantitative atomistic modeling of mesoporous metallosilicates that matches experimental densities, pair distribution functions, IR spectra, and hydroxyl densities.
- MatterSim-MT: A multi-task foundation model for in silico materials characterization
  MatterSim-MT is a multi-task ML foundation model pretrained on 35M+ structures for in silico materials property prediction and complex simulations.