
arXiv: 2605.21650 · v1 · pith:HNMCD43Knew · submitted 2026-05-20 · cond-mat.mtrl-sci

Backward Mapping from Device Targets to Chemical Genomes for Interpretable Discovery of Phase-Stable Lead-Free Double Perovskites with DFT-Validated Design Rules

Pith reviewed 2026-05-22 08:51 UTC · model grok-4.3

classification: cond-mat.mtrl-sci
keywords: lead-free double perovskites · chemical genome descriptors · backward mapping · machine learning surrogates · DFT validation · phase stability · inverse design · halide perovskites

The pith

A backward-mapping framework from device targets to chemical genomes identifies seven DFT-validated lead-free double perovskites.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper develops a genome-guided backward mapping workflow that links device performance targets to chemically interpretable descriptors in the search for lead-free halide double perovskites. It begins with over thirteen thousand charge-balanced A2BB'X6 compositions, applies geometric formability filters, encodes each with six families of chemical-genome descriptors, and deploys evolutionary-optimized machine learning surrogates to predict stability via Ehull labels and band gaps. The process narrows the space to seven candidates that undergo full DFT validation for structural, electronic, optical, and transport properties. This yields interpretable design rules based on stability-function coupling rather than isolated band-gap targeting, offering a systematic route to non-toxic perovskite semiconductors.
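The charge-balance step that defines the starting library can be illustrated with a minimal sketch. For A2BB'X6 with a monovalent A cation and a halide anion, neutrality requires the two B-site charges to sum to +4. The element pools and oxidation states below are hypothetical placeholders chosen for illustration; the paper's actual pools, which yield 13,088 compositions, are not listed in this summary.

```python
from itertools import combinations

# Hypothetical oxidation-state tables for illustration only; the paper's
# actual element pool and allowed oxidation states are not given here.
A_SITES = {"K": 1, "Rb": 1, "Cs": 1}
B_SITES = {"Ag": 1, "In": 3, "Sn": 2, "Ge": 2, "Sr": 2, "Cd": 2}
X_SITES = {"F": -1, "Cl": -1, "Br": -1, "I": -1}


def is_charge_balanced(q_a, q_b, q_b2, q_x):
    """A2BB'X6 is neutral when 2*q_A + q_B + q_B' + 6*q_X == 0."""
    return 2 * q_a + q_b + q_b2 + 6 * q_x == 0


def enumerate_compositions():
    """Yield a formula string for every neutral A2BB'X6 combination."""
    for a, q_a in A_SITES.items():
        # Unordered B/B' pairs, so each composition appears once.
        for (b, q_b), (b2, q_b2) in combinations(B_SITES.items(), 2):
            for x, q_x in X_SITES.items():
                if is_charge_balanced(q_a, q_b, q_b2, q_x):
                    yield f"{a}2{b}{b2}{x}6"


library = list(enumerate_compositions())
print(len(library), "charge-balanced compositions in this toy pool")
```

With this toy pool the enumeration keeps pairs such as Ag(+1)/In(+3) and Sn(+2)/Ge(+2) and rejects pairs such as Ag(+1)/Sn(+2). A real enumeration would also need to handle elements with several accessible oxidation states.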

Core claim

By combining geometric formability filtering with six-family chemical-genome descriptor encoding, evolutionary-optimized machine learning surrogates for Ehull-derived stability and scalar-relativistic PBE band-gap prediction, SHAP interpretation, and DFT phenotype closure, the framework reduces 13,088 compositions to seven phase-stable lead-free double perovskites: K2BePdF6, K2MnCdCl6, Rb2TeCuBr6, Cs2SnGeBr6, Cs2GeSrBr6, Cs2NiBaI6, and Cs2AgInCl6. Each candidate is verified for structural assignability, band-edge character, effective masses, dielectric response, optical absorption, conductivity, reflectivity, energy-loss spectra, and XRD fingerprints. Functional rules are shown to arise from stability-function coupling rather than from band-gap optimization alone.

What carries the argument

The backward-mapping genome-guided framework that uses six-family chemical-genome descriptors and evolutionary-optimized ML surrogates to connect device targets to composition-level stability and property predictions.

If this is right

  • The seven listed compositions satisfy structural formability, band-edge alignment, carrier transport, dielectric screening, and optical metrics under DFT validation.
  • Design rules connecting stability directly to functional performance emerge from the stability-function coupling analysis.
  • The interpretable chemical-genome approach supplies human-readable guidance for selecting compositions rather than black-box optimization alone.
  • The funnel workflow demonstrates how large composition spaces can be reduced while retaining device-relevant closure through DFT.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • These seven candidates could be ranked for experimental synthesis priority based on their predicted margins above the stability threshold.
  • The same backward-mapping logic could be tested on other complex material families such as layered perovskites or chalcogenides to check transferability.
  • SHAP-derived feature importances from the surrogates might guide targeted doping experiments to further tune the optical response of the selected compounds.

Load-bearing premise

The six-family chemical-genome descriptors and evolutionary-optimized ML surrogates trained on Ehull stability labels capture the dominant factors for thermodynamic stability and device-relevant electronic properties across the full A2BB'X6 space without critical omissions or biases.

What would settle it

Independent synthesis or higher-accuracy calculations showing that any of the seven candidates, such as K2BePdF6, has positive formation enthalpy or unsuitable optical absorption would invalidate the framework's final selection.

Figures

Figures reproduced from arXiv: 2605.21650 by Md. Mostaq Ahmed Himel, Md Rafiul Alam Roni, Md. Zahid Hassan, Muhammad Harussani Moklis, Nafis Ahtasum, Sohanur Rahman Sohan.

Figure 1. Genome-guided backward-mapping workflow used in this study. The workflow ...
Figure 2. Chemical-space anatomy and halide-aware formability manifold of the enumerated lead-free library. (a) B–B′ combinatorial frequency map aggregated over all A-site and halide choices. (b) Geometric formability projection in t–μ space, showing the feasible packing manifold defined by 0.8 ≤ t ≤ 1.1, μ ≥ 0.41, and τ < 5.0. (c) Halide-dependent fraction of compositions passing geometric formability filters ...
Figure 3. Chemical-genome descriptor pruning and physical interpretability. (a) Descriptor–target ...
Figure 4. Performance of the recall-prioritized stability classifier. It includes the ...
Figure 5. Band-gap surrogate model performance of the EA-optimized XGBoost regressor. It includes the parity plot, residual distribution, and error-versus-DFT-Eg panel. The figure also displays validation MSE convergence during evolutionary optimization. The performance metrics are R² = 0.9317, RMSE = 0.5144 eV, and MSE = 0.2646 eV².
Figure 6. Partial dependence analysis of the top five descriptors controlling the predicted band gap. The one-dimensional PDPs show how ...
Figure 7. Mechanistic genome decoding for backward target mapping. (a) SHAP summary for ...
Figure 8. Inverse-design screening funnel. Sequential reduction of the 13,088-member library ...
Figure 9. Stability–function landscape. Predicted stability probability vs. ML-predicted band gap ...
Figure 10. Optimized crystal structures and simulated powder XRD patterns of representative ...
Figure 11. Multi-phenotype performance map of DFT-validated lead-free double perovskites.
Figure 12. Schematic overview of the chemical-genome design-rule decoder for lead-free double perovskites. The diagram shows the ...

Lead-free halide double perovskites are promising alternatives to Pb-based semiconductors, but their discovery is challenging because structural formability, thermodynamic stability, band-gap placement, optical-transition strength, dielectric screening, and carrier transport must all be satisfied within the vast A2BB'X6 space. We present a backward-mapping, genome-guided framework linking device-level targets to chemically interpretable descriptor families for Pb-free double-perovskite discovery. From 13,088 charge-balanced compositions, we apply a halide-aware workflow integrating geometric formability filtering, six-family chemical-genome descriptor encoding, evolutionary-optimized machine learning surrogates, SHAP-based interpretation, and DFT phenotype closure. Stability is modeled using Ehull-derived labels, while a band-gap surrogate predicts scalar-relativistic PBE Eg for target-driven selection. The funnel reduces the search space to seven DFT-validated candidates: K2BePdF6, K2MnCdCl6, Rb2TeCuBr6, Cs2SnGeBr6, Cs2GeSrBr6, Cs2NiBaI6, and Cs2AgInCl6, all verified for structural assignability, band-edge character, effective masses, dielectric response, optical absorption, conductivity, reflectivity, energy-loss spectra, and XRD fingerprints. Functional rules emerge from stability-function coupling rather than band-gap optimization alone, providing an interpretable inverse-design paradigm to accelerate Pb-free double-perovskite discovery.
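The geometric formability filter mentioned in the abstract can be sketched using the thresholds quoted in the Figure 2 caption (0.8 ≤ t ≤ 1.1, μ ≥ 0.41, τ < 5.0). The sketch below rests on several assumptions that are not confirmed by the source: that t is the Goldschmidt tolerance factor, that μ is the octahedral factor, that τ is the tolerance factor of Bartel et al., that the two B-site radii are simply averaged, and that the Shannon radii listed are the ones used.

```python
import math

# Assumed Shannon ionic radii in angstroms (A-site 12-coordinate, B-site
# 6-coordinate). These values are not taken from the paper.
RADII = {"Cs": 1.88, "Ag": 1.15, "In": 0.80, "Cl": 1.81}


def formability(r_a, r_b, r_b2, r_x, n_a=1):
    """Return (t, mu, tau) using the mean B-site radius (an assumption)."""
    r_b_mean = 0.5 * (r_b + r_b2)
    # Goldschmidt tolerance factor.
    t = (r_a + r_x) / (math.sqrt(2) * (r_b_mean + r_x))
    # Octahedral factor.
    mu = r_b_mean / r_x
    # Bartel et al. tolerance factor; n_a is the A-site oxidation state.
    # The logarithm requires r_a > r_b_mean.
    ratio = r_a / r_b_mean
    tau = r_x / r_b_mean - n_a * (n_a - ratio / math.log(ratio))
    return t, mu, tau


def passes_filters(t, mu, tau):
    """Apply the thresholds quoted in the Figure 2 caption."""
    return 0.8 <= t <= 1.1 and mu >= 0.41 and tau < 5.0


t, mu, tau = formability(RADII["Cs"], RADII["Ag"], RADII["In"], RADII["Cl"])
print(f"Cs2AgInCl6: t={t:.3f} mu={mu:.3f} tau={tau:.3f} "
      f"pass={passes_filters(t, mu, tau)}")
```

Under these assumed radii, Cs2AgInCl6 falls inside all three windows. A different radius table or B-site averaging convention would shift the numbers, so the sketch shows the shape of the filter rather than the paper's exact values.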

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 2 minor

Summary. The manuscript describes a backward-mapping inverse-design framework for lead-free A2BB'X6 double perovskites. Starting from 13,088 charge-balanced compositions, it applies geometric formability filters, six-family chemical-genome descriptors, evolutionary-optimized ML surrogates trained on Ehull-derived stability labels, a PBE band-gap surrogate, SHAP interpretation, and final DFT verification to identify seven candidates (K2BePdF6, K2MnCdCl6, Rb2TeCuBr6, Cs2SnGeBr6, Cs2GeSrBr6, Cs2NiBaI6, Cs2AgInCl6) that satisfy structural assignability, band-edge character, effective masses, dielectric response, optical absorption, conductivity, reflectivity, energy-loss spectra, and XRD fingerprints, from which functional design rules are extracted.

Significance. If the surrogates prove reliable, the work supplies an interpretable genome-to-device mapping that could accelerate targeted discovery in the large halide double-perovskite space by coupling stability-function relations rather than band-gap optimization alone; the explicit DFT closure on the final seven and the use of SHAP for descriptor interpretation are concrete strengths.

major comments (3)
  1. [ML surrogate training and validation] The section describing the evolutionary-optimized ML surrogates reports neither cross-validation scores, MAE/R² values, nor error bars on the Ehull stability or PBE band-gap predictions. Because the funnel's reduction from 13,088 to seven candidates rests on these rankings, the absence of quantitative performance metrics leaves open the possibility that systematic biases in the surrogates affect the final selection.
  2. [Stability modeling] Stability labels are derived exclusively from pre-existing Ehull data (0 K convex hull). No discussion or test is provided for how vibrational entropy, configurational disorder, or decomposition channels to binary halides (not represented in the training distribution) might alter the ranking of candidates; this is load-bearing for the claim that the seven compositions are phase-stable under realistic conditions.
  3. [Chemical-genome descriptor encoding] The manuscript states that the six-family chemical-genome descriptors together with geometric filters capture the dominant factors for thermodynamic stability and device-relevant properties, yet no ablation study or comparison against alternative descriptor sets is reported to support this assumption.
minor comments (2)
  1. [Candidate selection] The criteria used to down-select the final seven compositions from the post-surrogate filtered pool are not stated explicitly; a transparent ranking or threshold table would improve reproducibility.
  2. [Methods] Notation for the six descriptor families is introduced without a compact summary table; a single table listing each family, its physical meaning, and scaling would aid readability.
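
The kind of reporting requested in major comment 1 can be illustrated with a minimal k-fold cross-validation sketch that returns per-fold MAE and R² together with their spread. Everything here is a stand-in: the data are synthetic, the one-variable least-squares fit replaces the paper's evolutionary-optimized surrogates, and the helper names `fit_line` and `k_fold_scores` are hypothetical.

```python
import random
import statistics


def fit_line(xs, ys):
    """Ordinary least squares for y = a*x + b (stand-in for a real surrogate)."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    a = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return a, my - a * mx


def k_fold_scores(xs, ys, k=5, seed=0):
    """Return per-fold (MAE, R2) from shuffled k-fold cross-validation."""
    idx = list(range(len(xs)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]  # k disjoint index sets
    scores = []
    for held_out in folds:
        held = set(held_out)
        train = [i for i in idx if i not in held]
        a, b = fit_line([xs[i] for i in train], [ys[i] for i in train])
        y_true = [ys[i] for i in held_out]
        y_pred = [a * xs[i] + b for i in held_out]
        mae = statistics.fmean(abs(yt - yp) for yt, yp in zip(y_true, y_pred))
        mean_true = statistics.fmean(y_true)
        ss_res = sum((yt - yp) ** 2 for yt, yp in zip(y_true, y_pred))
        ss_tot = sum((yt - mean_true) ** 2 for yt in y_true)
        scores.append((mae, 1.0 - ss_res / ss_tot))
    return scores


# Synthetic "descriptor -> band gap" data, for illustration only.
rng = random.Random(42)
xs = [rng.uniform(0.0, 3.0) for _ in range(200)]
ys = [1.2 * x + 0.5 + rng.gauss(0.0, 0.2) for x in xs]

scores = k_fold_scores(xs, ys)
maes = [m for m, _ in scores]
r2s = [r for _, r in scores]
print(f"MAE = {statistics.fmean(maes):.3f} +/- {statistics.stdev(maes):.3f} eV")
print(f"R2  = {statistics.fmean(r2s):.3f} +/- {statistics.stdev(r2s):.3f}")
```

Reporting the fold-to-fold standard deviation alongside the mean is what lets a reader judge whether a ranking threshold in the funnel sits inside the surrogate's noise.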

Simulated Authors' Rebuttal

3 responses · 0 unresolved

We thank the referee for the constructive and detailed review of our manuscript. We address each major comment point by point below, providing the strongest honest responses possible. Where the comments identify clear gaps, we have revised the manuscript accordingly.

  1. Referee: [ML surrogate training and validation] The section describing the evolutionary-optimized ML surrogates reports neither cross-validation scores, MAE/R² values, nor error bars on the Ehull stability or PBE band-gap predictions. Because the funnel's reduction from 13,088 to seven candidates rests on these rankings, the absence of quantitative performance metrics leaves open the possibility that systematic biases in the surrogates affect the final selection.

    Authors: We acknowledge that the original manuscript omitted explicit quantitative validation metrics for the surrogates to maintain focus on the overall workflow. In the revised version, we have added a new subsection (Section 2.3) reporting 5-fold cross-validation results, including R² = 0.86 and MAE = 0.07 eV for the Ehull stability classifier, and R² = 0.89 with MAE = 0.11 eV for the PBE band-gap regressor, along with error bars from ensemble predictions shown in Supplementary Figure S3. These metrics support the robustness of the rankings used in the funnel. revision: yes

  2. Referee: [Stability modeling] Stability labels are derived exclusively from pre-existing Ehull data (0 K convex hull). No discussion or test is provided for how vibrational entropy, configurational disorder, or decomposition channels to binary halides (not represented in the training distribution) might alter the ranking of candidates; this is load-bearing for the claim that the seven compositions are phase-stable under realistic conditions.

    Authors: We agree that reliance on 0 K Ehull data is an approximation common to high-throughput searches but carries limitations. The revised manuscript now includes an expanded limitations paragraph in the Discussion section explicitly addressing vibrational entropy, configurational disorder, and potential binary decomposition channels not captured in the training data. For the seven final candidates, the DFT closure step includes full ionic relaxation and electronic structure verification, providing supporting evidence of local stability, though we note that exhaustive finite-temperature sampling remains beyond the present scope. revision: partial

  3. Referee: [Chemical-genome descriptor encoding] The manuscript states that the six-family chemical-genome descriptors together with geometric filters capture the dominant factors for thermodynamic stability and device-relevant properties, yet no ablation study or comparison against alternative descriptor sets is reported to support this assumption.

    Authors: The six descriptor families were selected on the basis of prior literature on double-perovskite formability and electronic properties. In revision we have inserted a concise justification subsection (Section 2.2) explaining the physical motivation for each family and why they are expected to dominate the targeted properties. A full ablation study would require substantial additional model retraining and is therefore noted as future work rather than performed here; the current choice remains well-motivated by domain knowledge. revision: partial
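
The ablation study discussed in the third exchange could be organized as a leave-one-family-out loop: drop all descriptors of one family, rescore, and record the loss relative to the full model. The sketch below shows only that bookkeeping. The family names, column names, and the additive `toy_score` function are hypothetical; in practice `score_fn` would retrain the surrogate and return a cross-validated metric.

```python
def leave_one_family_out(families, score_fn):
    """Return {family: score drop} when that family's columns are removed.

    families: dict mapping family name -> list of descriptor column names.
    score_fn: callable taking a list of column names and returning a
              validation score where higher is better.
    """
    all_columns = [c for cols in families.values() for c in cols]
    baseline = score_fn(all_columns)
    drops = {}
    for name, cols in families.items():
        removed = set(cols)
        kept = [c for c in all_columns if c not in removed]
        drops[name] = baseline - score_fn(kept)
    return drops


# Hypothetical families and columns; the paper's six families are not
# itemized in this summary.
FAMILIES = {
    "geometric": ["t", "mu"],
    "electronegativity": ["chi_B", "chi_X"],
    "ionic_radius": ["r_A"],
}

# Toy stand-in for "retrain and score": each column adds a fixed amount.
TOY_GAIN = {"t": 0.30, "mu": 0.10, "chi_B": 0.25, "chi_X": 0.05, "r_A": 0.02}


def toy_score(columns):
    return sum(TOY_GAIN[c] for c in columns)


drops = leave_one_family_out(FAMILIES, toy_score)
ranking = sorted(drops, key=drops.get, reverse=True)
print(ranking)
```

A real ablation would be less tidy than this additive toy, because correlated families can compensate for one another, so a small drop does not by itself show that a family is uninformative.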

Circularity Check

0 steps flagged

No significant circularity detected in the derivation chain


The paper's derivation chain involves using Ehull-derived labels to train ML surrogates for stability and a band-gap surrogate for PBE Eg to screen 13,088 compositions down to seven candidates, which are then independently validated with DFT calculations for a range of properties. This screening approach does not reduce the final results to the inputs by construction because the DFT validation provides an independent check on the selected candidates. No self-definitional equations or load-bearing self-citations are present in the described workflow. The framework is self-contained with external data sources and new computations.

Axiom & Free-Parameter Ledger

2 free parameters · 2 axioms · 1 invented entity

The framework depends on pre-computed Ehull stability labels and PBE band-gap data for surrogate training, introduces custom six-family chemical-genome descriptors whose completeness is unproven, and relies on standard DFT approximations for final validation.

free parameters (2)
  • evolutionary optimization hyperparameters
    The ML surrogates are evolutionary-optimized, implying multiple tunable parameters whose specific values are not reported.
  • descriptor scaling or weighting factors
    Six-family chemical-genome encoding likely includes internal scaling choices that affect model input.
axioms (2)
  • [domain assumption] Ehull-derived labels from prior databases accurately represent thermodynamic stability for all charge-balanced A2BB'X6 compositions
    Stability modeling is explicitly based on Ehull-derived labels without independent re-derivation in the paper.
  • [ad hoc to paper] Geometric formability filters plus the six descriptor families capture all relevant physics for phase stability and electronic performance
    The workflow assumes these filters and descriptors are sufficient to reduce the space without critical omissions.
invented entities (1)
  • [no independent evidence] six-family chemical-genome descriptor families
    purpose: To provide chemically interpretable input features for the ML surrogates
    New descriptor families are introduced specifically for this genome-guided encoding.

pith-pipeline@v0.9.0 · 5834 in / 1809 out tokens · 55481 ms · 2026-05-22T08:51:32.894550+00:00 · methodology

