
arxiv: 2605.22887 · v1 · pith:ZZWQKGEFnew · submitted 2026-05-21 · ❄️ cond-mat.mtrl-sci

Genome-Guided Interpretable Screening of Phase-Stable, Lead-Free Double Perovskite Absorbers for All-Inorganic Semiconductors, Sensors, and Photovoltaics with DFT-Validated Design Rules

Pith reviewed 2026-05-25 05:53 UTC · model grok-4.3

classification ❄️ cond-mat.mtrl-sci
keywords lead-free double perovskites · machine learning screening · DFT validation · thermodynamic stability · optical absorption · inverse design · phase stability · perovskite absorbers

The pith

A genome-guided ML framework narrows 13,088 lead-free compositions to five DFT-validated phase-stable double perovskites.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper trains stability and band-gap models on 1,221 DFT-calculated A2BB'X6 compounds using four descriptor families for packing, bonding, polarization, and electronic identity. These surrogates then apply a staged constraint stack that reduces the pool of charge-balanced, lead-free candidates to five compounds confirmed by further DFT to lie on the convex hull with ordered structures and peak absorption near 10^5 cm^-1. A sympathetic reader would care because the work supplies explicit, interpretable design rules that link structural descriptors to optoelectronic performance for lead-free photovoltaics, sensors, and semiconductors.

Core claim

Applying the staged inverse-design constraint stack to 13,088 charge-balanced, lead-free compositions reduces the search space to five DFT-validated, phase-stable semiconductors—Rb2SnMnBr6, Cs2CdSnBr6, Cs2CdSnI6, Cs2KGaI6, and Cs2AgAlBr6—that lie on the convex hull (E_hull <= 0 meV/atom), preserve ordered double-perovskite structures, and exhibit strong optical absorption (alpha peak ~1e5 cm^-1).

What carries the argument

The staged inverse-design constraint stack that sequentially applies a recall-optimized stability classifier, an XGBoost band-gap regressor, and structural validation to the four descriptor families of packing, bonding, polarization, and electronic identity.
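A staged stack of this kind can be sketched as an ordered list of filters with a count recorded after each stage. The sketch below is illustrative only: the `Candidate` fields, the stage names, the thresholds (0.30 for stability probability, 0.9 to 2.2 eV for the band-gap window), and the four toy candidates are all assumptions, not values taken from the paper.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Candidate:
    """Hypothetical record holding surrogate outputs for one composition."""
    label: str
    p_stable: float   # stability-classifier probability (toy value)
    gap_ev: float     # regressor band gap in eV (toy value)
    ordered: bool     # result of a structural check (toy value)


Stage = tuple[str, Callable[[Candidate], bool]]


def run_stack(pool, stages):
    """Apply the stages in order and record how many candidates survive each."""
    survivors = list(pool)
    funnel = [("start", len(survivors))]
    for name, keep in stages:
        survivors = [c for c in survivors if keep(c)]
        funnel.append((name, len(survivors)))
    return survivors, funnel


# Placeholder thresholds; the paper's actual cutoffs are not given on this page.
STAGES: list[Stage] = [
    ("stability classifier", lambda c: c.p_stable >= 0.30),
    ("band-gap window", lambda c: 0.9 <= c.gap_ev <= 2.2),
    ("structural validation", lambda c: c.ordered),
]

pool = [
    Candidate("cand-A", 0.85, 1.4, True),
    Candidate("cand-B", 0.10, 1.5, True),   # fails the stability stage
    Candidate("cand-C", 0.70, 3.1, True),   # fails the band-gap stage
    Candidate("cand-D", 0.90, 1.2, False),  # fails the structural stage
]

survivors, funnel = run_stack(pool, STAGES)
for name, count in funnel:
    print(f"{name:>22}: {count}")
```

Recording the funnel alongside the survivors makes the per-stage attrition explicit, which is the information the referee's second minor comment asks the figures to show.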

If this is right

  • The five candidates are thermodynamically stable and maintain ordered double-perovskite structures suitable for all-inorganic devices.
  • Packing descriptors control structural formability while bonding descriptors govern near-edge optical transitions.
  • Optoelectronic response descriptors regulate dielectric constants in the range 4.6-8.2 and exciton screening.
  • The genotype-phenotype analysis supplies concrete design rules for further lead-free double perovskite discovery.
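The paper's figure captions name the packing descriptors as the formability metrics t, μ, and τ, but this page does not reproduce their definitions. The sketch below assumes the standard forms: the Goldschmidt tolerance factor, the octahedral factor, and the tolerance factor commonly attributed to Bartel and co-workers, with the B-site radius taken as the mean of B and B′. The radii are approximate Shannon values for Cs₂AgBiBr₆, a well-known double perovskite that is not one of the paper's candidates, used purely as an illustration.

```python
import math


def goldschmidt_t(r_a: float, r_b: float, r_x: float) -> float:
    """Goldschmidt tolerance factor t = (r_A + r_X) / (sqrt(2) * (r_B + r_X))."""
    return (r_a + r_x) / (math.sqrt(2.0) * (r_b + r_x))


def octahedral_mu(r_b: float, r_x: float) -> float:
    """Octahedral factor mu = r_B / r_X."""
    return r_b / r_x


def bartel_tau(r_a: float, r_b: float, r_x: float, n_a: int = 1) -> float:
    """tau = r_X/r_B - n_A * (n_A - (r_A/r_B) / ln(r_A/r_B)).

    Assumes r_A > r_B; the logarithm vanishes or changes sign otherwise.
    """
    ratio = r_a / r_b
    if ratio <= 1.0:
        raise ValueError("tau is defined here only for r_A > r_B")
    return r_x / r_b - n_a * (n_a - ratio / math.log(ratio))


# Approximate Shannon radii in angstroms (illustrative, not from the paper).
r_cs, r_ag, r_bi, r_br = 1.88, 1.15, 1.03, 1.96
r_b_mean = 0.5 * (r_ag + r_bi)  # assumption: average the B and B' radii

t = goldschmidt_t(r_cs, r_b_mean, r_br)
mu = octahedral_mu(r_b_mean, r_br)
tau = bartel_tau(r_cs, r_b_mean, r_br, n_a=1)
print(f"t = {t:.3f}, mu = {mu:.3f}, tau = {tau:.3f}")
```

Whether the paper averages the two B-site radii or treats them separately is not stated here, so the averaging step should be read as one plausible convention rather than the authors' choice.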

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same staged stack could be tested on single perovskites or other halide families to check transferability.
  • Thin-film growth and measured absorption spectra on any of the five candidates would provide an external check on the predicted alpha values.
  • If the hierarchical descriptor ranking holds, future searches can safely de-emphasize polarization descriptors until packing and bonding are satisfied.

Load-bearing premise

The machine learning models trained on the 1,221 DFT compounds generalize accurately to the 13,088 screened compositions without significant errors from extrapolation or descriptor limitations.

What would settle it

A new DFT calculation placing any one of the five listed candidates above the convex hull (E_hull > 0 meV/atom) would falsify the claim that the screening produced phase-stable materials.

Figures

Figures reproduced from arXiv: 2605.22887 by AKM Kamrul Islam, Hasan Jamil, Masud Rana Rashel, Md. Mostaq Ahmed Himel, Md. Zahid Hassan, Mouhaydine Tlemcani, Muhammad Harussani Moklis, Nafis Ahtasu, Sohanur Rahman Sohan.

Figure 1. Genome-guided interpretable screening workflow used in this study. 2.2 Chemical-space construction and structural filtering: A lead-free, all-inorganic A₂BB′X₆ double-perovskite chemical space was constructed by combining explicit charge neutrality with structural rules consistent with corner-sharing BX₆ octahedra and rock-salt ordering of the B/B′ sites. Two valence families were considered: heterovalent A…
Figure 2. Chemical-space anatomy. Element pools and oxidation-state families used to enumerate the A₂BB′X₆ library. Summary of the oxidation-state families, element pools, duplicate-handling rules, and geometric cutoffs used to construct the lead-free A₂BB′X₆ chemical library.
Figure 3. Chemical-space formability statistics. (a) B–B′ combinatorial density map aggregated over all A and X choices. (b) Distributions of t, μ, and τ across the enumerated space, showing the formability manifold used as a physical screening gate. (c) Fraction of compositions retained after the geometry-based filters. Overall, these results define the chemical-space anatomy of the present discovery problem. Althou…
Figure 4. Schematic illustration of the descriptor space clustered into packing, bonding, and optoelectronic-response categories. The first group, termed packing descriptors, describes structural existence through ionic radii and the formability metrics t, μ, and τ. These variables directly indicate whether a given composition is geometrically compatible. The second group, bonding genes (descriptors), is con…
Figure 5. Construction of the stability genome and feature selection process. (a) Correlation matrix of the descriptors used to identify highly related variables. (b) Final descriptor set with low collinearity selected for machine-learning model training.
Figure 6. Performance of the recall-optimized stability classifier. (a) Confusion matrix, (b) precision–recall curve, (c) receiver operating characteristic (ROC) curve, (d) calibration behavior, (e) selected operating point, and (f) recall score for high stable-class recall during inverse-design screening.
Figure 7. Performance of the evolutionary-optimized band-gap regressor. The figure summarizes (a) parity between predicted and DFT-calculated E_g, (b) residual statistics, (c) error distribution across the band-gap range, and (d) evolutionary convergence during hyperparameter optimization. The interpretability analysis further supports the physical meaning of these mappings.
Figure 8. Partial dependence analysis of the top five ranked descriptors controlling the predicted band gap. The one-dimensional PDPs show how each feature affects the predicted band gap when considered separately. The features are ranked by importance. BE(B1–X) is the most important feature. P–B1 comes next. X–B1, R–B1, and BE(B′–X) follow in order. Figure (f) shows the interaction between the two most important fe…
Figure 9. Mechanistic genome decoding from feature attribution and response analysis. (a) SHAP summary for thermodynamic stability. (b) SHAP summary for band gap. (c) Stability probability as a function of τ. (d) Stability response versus B–X bond dissociation energy. (e) Band-gap response versus B-site electronegativity.
Figure 10. Inverse-design constraint stack and candidate funnel. The panels summarize candidate reduction across screening stages, the predicted stability–band-gap landscape, and the application windows used to identify the final DFT-validation set. This narrow overlap is also visible in …
Figure 11. Predicted stability probability versus predicted band gap. Summary of the staged workflow used to translate descriptor-genome rules into composition-space candidates shown in …
Figure 13. Electronic band structures and projected density of states (PDOS) of the DFT-validated double perovskites. Panels (a–e) show Rb₂SnMnBr₆, Cs₂CdSnBr₆, Cs₂CdSnI₆, Cs₂KGaI₆, and Cs₂AgAlBr₆, respectively. The band structure and PDOS results show that all compounds have indirect band gaps. The valence bands are mainly formed by halide p states. The conduction band edges are mainly formed by cation states. The M…
Figure 14. Frequency-dependent complex dielectric function, ε(ω) = ε₁(ω) + iε₂(ω), of the studied lead-free double perovskites: (a) Rb₂SnMnBr₆, (b) Cs₂CdSnBr₆, (c) Cs₂CdSnI₆, (d) Cs₂KGaI₆, and (e) Cs₂AgAlBr₆. Strong low-energy ε₂ peaks indicate dominant inter-band transitions, whereas the low-energy ε₁ baseline reflects dielectric constant strength.
Figure 15. Absorption spectra of the DFT-validated lead-free double perovskites. Calculated absorption coefficient as a function of photon energy is shown for the shortlisted compounds: (a) Cs₂CdSnBr₆, (b) Cs₂CdSnI₆, (c) Rb₂SnMnBr₆, (d) Cs₂KGaI₆, and (e) …
Figure 16. Refractive index and extinction coefficient spectra of the studied double perovskites. This figure shows the real refractive …
Figure 17. Reflectivity spectra of the DFT-validated lead-free double perovskites. Calculated …
Figure 17. Frequency-dependent energy-loss spectra of the shortlisted lead-free double perovskites. The dominant plasmon peaks provide excitation-energy fingerprints that complement the dielectric and absorption analyses.
Figure 18. Optical conductivity spectra of the studied double perovskites. This figure shows the optical conductivity as a function …
Figure 18. Genotype–phenotype coupling maps linking interpretable machine-learning gene clusters to DFT-derived optical properties. (a) Polarization gene score versus dielectric constant (ε₀) and refractive index n(0). (b) Bonding gene score versus near-edge transition strength, represented by α(E_g + 0.5 eV) and σ₁(E_g + 0.5 eV). (c) Packing descriptor τ versus absorption-related phenotypes, highlighting structural…
Figure 19. Application-oriented map of the DFT-validated candidates. The x-axis shows the PBE band gap, the y-axis shows α(E_g + 0.5 eV), and bubble size scales with ε₀. The Cd/Sn systems define the absorber window, whereas Cs₂KGaI₆ occupies the near-IR edge.
Original abstract

The discovery of stable, lead-free halide perovskites for optoelectronic applications is constrained by vast compositional space and limited interpretability of conventional screening approaches. We present a genome-guided, physics-informed framework that decodes thermodynamic stability and optoelectronic behavior through four physically interpretable descriptor families: packing, bonding, polarization, and electronic identity. Trained on 1,221 DFT-calculated A2BB'X6 compounds, machine-learning surrogates achieve robust predictive performance, with a recall-optimized stability classifier (ROC-AUC = 0.92) and an XGBoost regressor for band-gap prediction (R2 = 0.93 on held-out data). Applying a staged inverse-design constraint stack to 13,088 charge-balanced, lead-free compositions reduces the search space to five DFT-validated, phase-stable semiconductors: Rb2SnMnBr6, Cs2CdSnBr6, Cs2CdSnI6, Cs2KGaI6, and Cs2AgAlBr6. These candidates lie on the convex hull (E_hull <= 0 meV/atom), preserve ordered double-perovskite structures, and exhibit strong optical absorption (alpha peak ~1e5 cm^-1). Genotype-phenotype coupling analysis reveals a hierarchical control mechanism: packing genes define structural formability, bonding genes govern near-edge optical transitions and conductivity, and optoelectronic response genes regulate dielectric response and exciton screening (epsilon0 = 4.6-8.2). This work establishes a generalizable paradigm for interpretable inverse design, linking descriptor-level genomics to experimentally relevant optoelectronic phenotypes and providing design rules for discovering stable, lead-free double perovskites for photovoltaics, sensing, and transparent electronic applications.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 2 minor

Summary. The manuscript presents a genome-guided ML framework using four descriptor families (packing, bonding, polarization, electronic identity) trained on 1,221 DFT A2BB'X6 compounds. A recall-optimized stability classifier (ROC-AUC 0.92) and XGBoost band-gap regressor (R2 0.93 on held-out data) are applied via a staged inverse-design constraint stack to filter 13,088 charge-balanced lead-free compositions, yielding five DFT-validated candidates (Rb2SnMnBr6, Cs2CdSnBr6, Cs2CdSnI6, Cs2KGaI6, Cs2AgAlBr6) that lie on the convex hull (E_hull <= 0 meV/atom), retain ordered double-perovskite structures, and show strong absorption (~1e5 cm^-1). The work also derives hierarchical design rules linking descriptors to optoelectronic phenotypes.

Significance. If the ML generalization holds, the paper offers a concrete, interpretable inverse-design pipeline that reduces a large compositional space to experimentally relevant candidates while providing physics-based design rules. The combination of ML surrogates with final DFT validation on survivors and the explicit genotype-phenotype mapping are strengths that could aid discovery in halide perovskites for PV and sensors. The significance is limited by the absence of evidence that the screening process itself is reliable outside the training manifold.

major comments (3)
  1. [Results (screening pipeline)] Results section on screening pipeline: The central claim that the staged constraint stack correctly reduces 13,088 compositions to five phase-stable candidates rests on the assumption that the stability classifier and band-gap regressor generalize to the full screened space, yet no out-of-distribution test set, descriptor-space coverage analysis, or uncertainty quantification is reported comparing the 13,088 to the 1,221 training compounds.
  2. [Methods (ML model training)] Methods (ML model training) and Abstract: The reported ROC-AUC 0.92 and R2 0.93 are obtained on held-out data drawn from the same 1,221 DFT set; without an independent test on compositions far from this manifold, the reduction step cannot be confirmed as complete or free of systematic false negatives/positives.
  3. [DFT validation] DFT validation paragraph: Validation that the five survivors satisfy E_hull <= 0 and ordered structures confirms only those specific compounds, not that the ML-driven filtering correctly identified all (or the only) stable candidates from the 13,088.
minor comments (2)
  1. [Abstract] Abstract: The phrase 'genome-guided' is used without a clear definition of what constitutes the 'genome' versus the four descriptor families; a brief clarification would improve readability.
  2. [Figures] Figure captions (assumed from typical structure): Ensure all figures showing the constraint stack explicitly label the number of compositions remaining after each stage.
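The descriptor-space coverage analysis requested in major comment 1 could take many forms. One minimal version, sketched below, standardizes descriptors with training-set statistics and flags any screened composition whose nearest training neighbor is farther away than is typical within the training set itself. The function names, the 0.95 quantile, and the toy two-descriptor data are assumptions for illustration; nothing here reflects an analysis the authors performed.

```python
import math
from statistics import mean, pstdev


def standardize(train, rows):
    """Z-score rows using the training-set mean and standard deviation."""
    cols = list(zip(*train))
    mu = [mean(c) for c in cols]
    sd = [pstdev(c) or 1.0 for c in cols]  # guard against constant columns
    return [[(v - m) / s for v, m, s in zip(r, mu, sd)] for r in rows]


def nn_distance(point, refs):
    """Euclidean distance from point to its nearest reference row."""
    return min(math.dist(point, r) for r in refs)


def flag_out_of_coverage(train, screened, quantile=0.95):
    """Flag screened rows that sit farther from the training set than the
    chosen quantile of leave-one-out nearest-neighbor distances in training."""
    zt = standardize(train, train)
    zs = standardize(train, screened)
    loo = sorted(nn_distance(p, zt[:i] + zt[i + 1:]) for i, p in enumerate(zt))
    cutoff = loo[min(len(loo) - 1, int(quantile * len(loo)))]
    flags = [nn_distance(p, zt) > cutoff for p in zs]
    return flags, cutoff


# Toy descriptor vectors (hypothetical, two descriptors per composition).
train = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
screened = [[0.5, 0.5], [10.0, 10.0]]

flags, cutoff = flag_out_of_coverage(train, screened)
print(flags, round(cutoff, 3))
```

This brute-force search is quadratic in the training-set size, which is manageable for roughly a thousand training compounds. A tree-based neighbor search would be the natural replacement at larger scale.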

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the detailed and constructive report. The comments highlight important considerations regarding generalization of the ML models. We respond to each major comment below and indicate planned revisions where appropriate.

Point-by-point responses
  1. Referee: Results section on screening pipeline: The central claim that the staged constraint stack correctly reduces 13,088 compositions to five phase-stable candidates rests on the assumption that the stability classifier and band-gap regressor generalize to the full screened space, yet no out-of-distribution test set, descriptor-space coverage analysis, or uncertainty quantification is reported comparing the 13,088 to the 1,221 training compounds.

    Authors: We agree that explicit OOD testing, coverage analysis, and uncertainty quantification would provide stronger evidence of generalization. The four descriptor families were selected for physical transferability across A2BB'X6 chemistries, and the 1,221 training compounds were chosen to span diverse elemental combinations. The final DFT validation of the five candidates on the convex hull offers supporting evidence that the pipeline identified viable compounds. We will add a new subsection discussing descriptor-space overlap between training and screened sets along with a limitations paragraph on the absence of formal OOD metrics. revision: partial

  2. Referee: Methods (ML model training) and Abstract: The reported ROC-AUC 0.92 and R2 0.93 are obtained on held-out data drawn from the same 1,221 DFT set; without an independent test on compositions far from this manifold, the reduction step cannot be confirmed as complete or free of systematic false negatives/positives.

    Authors: The performance figures are from stratified cross-validation within the 1,221-compound DFT dataset. The stability classifier was explicitly recall-optimized to reduce the risk of discarding stable phases. While an external far-manifold test set is absent, the genome-guided descriptors encode packing, bonding, polarization, and electronic features expected to remain relevant beyond the training distribution. We will revise the abstract and methods sections to explicitly state that metrics reflect in-distribution performance and to note the reliance on final DFT validation for the survivors. revision: yes

  3. Referee: DFT validation paragraph: Validation that the five survivors satisfy E_hull <= 0 and ordered structures confirms only those specific compounds, not that the ML-driven filtering correctly identified all (or the only) stable candidates from the 13,088.

    Authors: We concur that DFT validation establishes the stability of the five reported candidates but does not demonstrate that the pipeline recovered every stable composition or that no other stable phases exist among the 13,088. The objective is to deliver an interpretable, efficient inverse-design workflow that yields experimentally actionable leads together with genotype-phenotype design rules, rather than an exhaustive enumeration. The staged stack was constructed to be conservative via high-recall filtering. We will expand the discussion to clarify this scope and emphasize that the five compounds constitute validated, promising starting points. revision: partial
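The rebuttal describes the stability classifier as recall-optimized without saying how the operating point was chosen. One common approach, sketched here as an assumption rather than the authors' procedure, is to pick the highest probability threshold that still reaches a target recall on the stable class, then report the precision paid for it. The validation probabilities and labels below are toy values.

```python
def recall_precision(probs, labels, threshold):
    """Recall and precision for the positive (stable) class at a threshold."""
    tp = sum(1 for p, y in zip(probs, labels) if y == 1 and p >= threshold)
    fp = sum(1 for p, y in zip(probs, labels) if y == 0 and p >= threshold)
    n_pos = sum(labels)
    recall = tp / n_pos if n_pos else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    return recall, precision


def threshold_for_recall(probs, labels, target_recall=0.95):
    """Highest threshold whose stable-class recall meets the target.

    Recall can only grow as the threshold drops, so scanning the observed
    probabilities from high to low and stopping at the first hit suffices.
    """
    if sum(labels) == 0:
        raise ValueError("need at least one positive label")
    for t in sorted(set(probs), reverse=True):
        recall, _ = recall_precision(probs, labels, t)
        if recall >= target_recall:
            return t
    return 0.0  # reached only if target_recall exceeds 1


# Toy validation data (hypothetical): 1 = stable, 0 = unstable.
probs = [0.9, 0.8, 0.6, 0.4, 0.3, 0.1]
labels = [1, 1, 0, 1, 0, 0]

t_star = threshold_for_recall(probs, labels, target_recall=1.0)
rec, prec = recall_precision(probs, labels, t_star)
print(t_star, rec, prec)
```

A threshold selected this way reduces false negatives only on data resembling the validation set, which is the referee's point: the recall guarantee does not automatically carry over to compositions far from the training manifold.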

Circularity Check

0 steps flagged

No significant circularity; ML screening is a surrogate step with external DFT grounding on final candidates

full rationale

The derivation trains ML surrogates (stability classifier, band-gap regressor) on 1,221 DFT compounds, applies them to filter 13,088 compositions, and then performs independent DFT validation on the five survivors to confirm E_hull <= 0, structure, and absorption. This is a standard surrogate-assisted discovery workflow whose central claims rest on the external DFT benchmarks rather than on the ML outputs alone. No self-definitional steps, no fitted inputs renamed as predictions, and no load-bearing self-citations appear in the provided text. The approach is self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The central claim rests on standard assumptions from density functional theory for computing stability and electronic properties, plus machine learning generalization from a limited training set; no new physical entities are postulated and no free parameters are explicitly named in the abstract.

axioms (2)
  • domain assumption: Density functional theory calculations provide sufficiently accurate ground-truth labels for thermodynamic stability (E_hull) and band gaps in double perovskites.
    All training data and final validation of the five candidates rely on DFT results as the reference.
  • domain assumption: The four descriptor families (packing, bonding, polarization, electronic identity) capture the dominant physics controlling phase stability and optoelectronic behavior.
    The genome-guided framework and hierarchical control mechanism are built on these descriptors.

pith-pipeline@v0.9.0 · 5921 in / 1684 out tokens · 25313 ms · 2026-05-25T05:53:02.925568+00:00 · methodology


Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

  • IndisputableMonolith/Cost/FunctionalEquation.lean washburn_uniqueness_aczel unclear

    Relation between the paper passage and the cited Recognition theorem.

    genome-guided, physics-informed screening framework that decodes thermodynamic stability and optoelectronic behavior through four physically interpretable descriptor families–packing, bonding, polarization, and electronic identity. Trained on 1,221 DFT-calculated A₂BB′X₆ compounds, machine-learning surrogates achieve robust predictive performance, with a recall-optimized stability classifier (ROC–AUC = 0.92) and an XGBoost regressor for band-gap prediction (R² = 0.93 on held-out test data).

  • IndisputableMonolith/Foundation/RealityFromDistinction.lean reality_from_one_distinction unclear

    Relation between the paper passage and the cited Recognition theorem.

    Applying a staged inverse-design constraint stack to 13,088 charge-balanced, lead-free compositions reduces the search space to five DFT-validated, phase-stable semiconductors

What do these tags mean?

  • matches: The paper's claim is directly supported by a theorem in the formal canon.
  • supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
  • extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
  • uses: The paper appears to rely on the theorem as machinery.
  • contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
  • unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
