
arxiv: 2605.24455 · v1 · pith:L4EI6SKYnew · submitted 2026-05-23 · ❄️ cond-mat.mtrl-sci

Multi-Source Domain Transfer Learning for Accurate Property Prediction in Two-Dimensional Materials

Pith reviewed 2026-06-30 13:20 UTC · model grok-4.3

classification ❄️ cond-mat.mtrl-sci
keywords multi-source domain transfer learning · two-dimensional materials · carrier mobility prediction · adversarial transfer learning · maximum mean discrepancy · materials screening · machine learning for materials

The pith

A transfer learning framework predicts carrier mobilities in two-dimensional materials from crystal structures alone with R-squared above 0.90.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper seeks to overcome limited and fragmented data on two-dimensional materials by pulling complementary knowledge from multiple source domains through transfer learning. It builds a shared feature extractor that aligns distributions via adversarial training and maximum mean discrepancy, then applies an adaptive ensemble to combine predictions from different sources. This setup yields accurate mobility forecasts without direct simulation for each new structure. The approach also identifies and validates dozens of previously unknown high-mobility candidates.

Core claim

The authors establish that a multi-source domain transfer learning model maps crystal structures into a domain-invariant latent space that supports carrier mobility prediction with R-squared exceeding 0.90 while still preserving physical correlations. The model is built around a shared feature extractor that combines adversarial transfer learning with maximum mean discrepancy, followed by a sample-adaptive weighted ensemble. The same pipeline screens 55 new two-dimensional semiconductors whose high mobilities and stability are then confirmed by first-principles electron-phonon coupling calculations.
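The maximum mean discrepancy term named in the core claim can be illustrated with a small, dependency-free sketch. It computes the biased squared-MMD estimate between two sets of latent vectors under a Gaussian kernel. The function names, the kernel choice, the bandwidth `sigma`, and the toy vectors are assumptions made for illustration; the abstract does not specify any of them.

```python
import math

def gaussian_kernel(x, y, sigma=1.0):
    """RBF kernel k(x, y) = exp(-||x - y||^2 / (2 * sigma^2))."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))

def mean_kernel(xs, ys, sigma):
    """Average kernel value over all pairs drawn from xs and ys."""
    total = sum(gaussian_kernel(x, y, sigma) for x in xs for y in ys)
    return total / (len(xs) * len(ys))

def mmd_squared(source, target, sigma=1.0):
    """Biased estimate of squared MMD between two samples of latent vectors."""
    return (mean_kernel(source, source, sigma)
            + mean_kernel(target, target, sigma)
            - 2.0 * mean_kernel(source, target, sigma))

# Toy latent vectors (made up): one target set overlaps the source, one is shifted.
source = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.2]]
aligned = [[0.1, 0.1], [0.0, 0.2], [0.2, 0.1]]
shifted = [[2.0, 2.1], [2.2, 2.0], [2.1, 2.2]]

mmd_aligned = mmd_squared(source, aligned)
mmd_shifted = mmd_squared(source, shifted)
```

In a training loop this quantity would be added to the loss so that the feature extractor is penalized when source and target latent distributions drift apart; the shifted set above yields the larger value.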

What carries the argument

A multi-source domain transfer learning framework consisting of a shared feature extractor (adversarial transfer learning plus maximum mean discrepancy) and a sample-adaptive weighted ensemble that aggregates source-domain predictions.
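A sample-adaptive weighted ensemble can take many forms, and the abstract does not define the weighting scheme. The sketch below shows one hypothetical rule: each target sample weights the per-source predictions by a softmax over its negative distance to each source domain's latent centroid. The names `adaptive_weights`, `ensemble_predict`, the `temperature` parameter, and all numbers are illustrative assumptions, not the authors' method.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def adaptive_weights(z, centroids, temperature=1.0):
    """Hypothetical rule: a closer source centroid receives a larger weight."""
    dists = [math.dist(z, c) for c in centroids]
    return softmax([-d / temperature for d in dists])

def ensemble_predict(z, centroids, source_predictions, temperature=1.0):
    """Weighted average of the per-source predictions for one sample."""
    weights = adaptive_weights(z, centroids, temperature)
    return sum(w * p for w, p in zip(weights, source_predictions))

# Made-up latent centroids for three source domains and made-up predictions.
centroids = [[0.0, 0.0], [3.0, 0.0], [0.0, 4.0]]
source_predictions = [100.0, 400.0, 250.0]
z = [0.2, 0.1]  # latent vector of one target sample

weights = adaptive_weights(z, centroids)
prediction = ensemble_predict(z, centroids, source_predictions)
```

Because the weights are recomputed for every sample, a target structure that resembles one source family leans on that source's predictor, which is the behavior the phrase "sample-adaptive" suggests.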

If this is right

  • Carrier mobility can be estimated for new two-dimensional structures without running separate expensive simulations for each candidate.
  • The framework identifies 55 previously unreported high-mobility semiconductors whose transport properties are independently confirmed by electron-phonon calculations.
  • Data scarcity and fragmented distributions no longer block unified high-throughput screening of functional properties in two-dimensional systems.
  • The same architecture can be retrained on other source domains to extend predictions to additional transport or electronic properties.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same latent-space alignment could be tested on three-dimensional crystals or on properties such as thermal conductivity that also suffer from sparse labeled data.
  • If the domain-invariant features turn out to encode general geometric descriptors rather than property-specific ones, the method might transfer to molecular or nanostructure datasets outside crystalline materials.
  • A follow-up experiment could measure whether removing the adaptive ensemble step drops accuracy more on highly dissimilar source-target pairs than on similar ones.

Load-bearing premise

The combination of adversarial alignment and maximum mean discrepancy creates a latent space that keeps the physical relationships needed for mobility prediction intact across the different structural families of two-dimensional materials.

What would settle it

First-principles electron-phonon calculations on the 55 screened candidates returning mobilities substantially below the model predictions or showing structural instability would falsify the claim that the latent space preserves the required physical correlations.
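The falsification test described above amounts to comparing each predicted mobility with its first-principles counterpart. A minimal sketch follows, assuming a hypothetical `tolerance` fraction below which a computed value counts as "substantially below" the prediction; the candidate labels and mobility values are invented placeholders, not results from the paper.

```python
def flag_overpredictions(predicted, computed, tolerance=0.5):
    """Return candidates whose computed mobility is below tolerance * predicted."""
    return [name for name, mu_pred in predicted.items()
            if computed[name] < tolerance * mu_pred]

# Placeholder mobilities in cm^2 V^-1 s^-1 (illustrative only).
predicted = {"A": 1200.0, "B": 800.0, "C": 1500.0}
computed = {"A": 1100.0, "B": 300.0, "C": 1600.0}

flagged = flag_overpredictions(predicted, computed)
```

A large flagged fraction among the 55 candidates would count against the claim; an empty list would be consistent with it, though it would not by itself show that the latent space preserves the relevant physics.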

Figures

Figures reproduced from arXiv: 2605.24455 by Huiyang Zhang, Jinlan Wang, Qionghua Zhou, Xinyu Chen.

Figure 1: Schematic of the Multi-source Domain Transfer Learning Framework. (a) Model Architecture. Crystalline structures and compositions are encoded into high-dimensional vectors via MAGPIE and processed by a Common Feature Extractor, which simultaneously feeds into three functional branches: a Label Predictor, a Material Classifier, and a Domain Alignment module to extract cross-domain invariant representations.…
Figure 2: Performance Comparison of Multi-Source Domain Transfer Learning in Target Material Property Prediction. (a) This figure demonstrates the prediction performance of five single-source domains (S1–S5) and their integrated model across varying sample sizes in target domains. (b) The performance of different source domain combinations (S1+S3, S1+S3+S2, S1+S3+S2+S4) was compared with that of the fully integrated…
Original abstract

Machine learning has revolutionized materials discovery, but data scarcity remains a critical bottleneck for complex functional properties. As emerging systems, two-dimensional (2D) materials possess limited overall data volumes. Evaluating their diverse functional properties requires time-consuming simulations, hindering unified high-throughput screening. Furthermore, restrictions in known structural prototypes lead to highly fragmented data distributions. To address these challenges, we propose a multi-source domain transfer learning framework to extract generalizable and complementary knowledge from diverse crystalline systems. To mitigate data scarcity, the framework employs a shared feature extractor that integrates adversarial transfer learning with maximum mean discrepancy, mapping crystal structures into a domain-invariant latent space while preserving underlying physical correlations. To resolve distribution fragmentation, a sample-adaptive weighted ensemble strategy is subsequently utilized to dynamically aggregate predictions from multiple source domains. Relying solely on crystal structures, the framework predicts 2D carrier mobilities with an R² score exceeding 0.90. The framework successfully screened 55 novel high-mobility 2D semiconductors, which were validated via first-principles electron-phonon coupling analysis, confirming their exceptional transport properties and stability. This work can potentially accelerate machine learning-assisted materials design and discovery with less data restriction.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 1 minor

Summary. The manuscript proposes a multi-source domain transfer learning framework for predicting carrier mobilities in 2D materials from crystal structures alone. It combines adversarial transfer learning with maximum mean discrepancy in a shared feature extractor to produce a domain-invariant latent space, followed by a sample-adaptive weighted ensemble across source domains. The central claims are an R² exceeding 0.90 on 2D mobility prediction and the identification of 55 novel high-mobility 2D semiconductors whose transport properties and stability were confirmed by first-principles electron-phonon coupling calculations.

Significance. If the reported performance and validation results hold under scrutiny, the work would provide a practical route to leverage heterogeneous crystalline datasets for 2D materials discovery, directly addressing data scarcity and distributional fragmentation that currently limit high-throughput screening of functional properties such as mobility.

major comments (3)
  1. [Abstract] Abstract: the statement that the shared feature extractor 'maps crystal structures into a domain-invariant latent space while preserving underlying physical correlations' is unsupported by any described mechanism. Only adversarial and MMD losses are mentioned; no auxiliary regression term on source-domain mobilities, contrastive loss on phonon descriptors, or other task-specific regularizer is referenced that would prevent the alignment objective from discarding electron-phonon-relevant variance.
  2. [Abstract] Abstract: the headline performance claim (R² > 0.90) and the screening of 55 validated candidates are presented without any quantitative information on source/target dataset sizes, train/test splits, baseline models, error bars, or hyperparameter selection. These omissions render the central empirical result impossible to evaluate for robustness or statistical significance.
  3. [Abstract] Abstract: the first-principles validation of the 55 screened materials is asserted but supplies no numerical values for the computed mobilities, deformation potentials, or stability criteria, nor any comparison against the training distribution or known high-mobility 2D benchmarks.
minor comments (1)
  1. [Abstract] The abstract refers to 'multi-source domain transfer learning' and 'sample-adaptive weighted ensemble' without defining the weighting scheme or the number of source domains used.
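The quantitative context requested in major comment 2 (error bars on the headline R²) can be sketched as a per-fold R² computation followed by a mean and standard deviation. The fold data below are invented toy numbers, and the three-fold layout is an assumption; the abstract gives no information about the actual splits.

```python
from statistics import mean, stdev

def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_bar = mean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - y_bar) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

# Each fold is (true values, predicted values); all numbers are made up.
folds = [
    ([1.0, 2.0, 3.0, 4.0], [1.1, 1.9, 3.2, 3.9]),
    ([2.0, 4.0, 6.0, 8.0], [2.5, 3.5, 6.5, 7.5]),
    ([1.0, 3.0, 5.0, 7.0], [1.0, 3.0, 5.0, 7.0]),
]

scores = [r2_score(y_true, y_pred) for y_true, y_pred in folds]
summary = (mean(scores), stdev(scores))  # report as mean ± std across folds
```

Reporting the spread alongside the mean is what lets a reader judge whether "R² exceeding 0.90" holds across splits or only on a favorable one.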

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the constructive and detailed comments on our manuscript. We address each major comment below and have made revisions to the abstract to improve clarity and completeness where the concerns are valid.

Point-by-point responses
  1. Referee: [Abstract] Abstract: the statement that the shared feature extractor 'maps crystal structures into a domain-invariant latent space while preserving underlying physical correlations' is unsupported by any described mechanism. Only adversarial and MMD losses are mentioned; no auxiliary regression term on source-domain mobilities, contrastive loss on phonon descriptors, or other task-specific regularizer is referenced that would prevent the alignment objective from discarding electron-phonon-relevant variance.

    Authors: The shared feature extractor is trained jointly in an end-to-end manner with a regression head on the labeled source domains. This supervised regression objective on source mobilities ensures that task-relevant physical correlations (including those tied to electron-phonon interactions) are retained in the latent space, while the adversarial and MMD terms only enforce domain invariance. We acknowledge that the original abstract did not explicitly reference this joint training; the revised abstract now states that the extractor is optimized together with the mobility regression task on sources. revision: yes

  2. Referee: [Abstract] Abstract: the headline performance claim (R² > 0.90) and the screening of 55 validated candidates are presented without any quantitative information on source/target dataset sizes, train/test splits, baseline models, error bars, or hyperparameter selection. These omissions render the central empirical result impossible to evaluate for robustness or statistical significance.

    Authors: We agree that the abstract should supply sufficient quantitative context for readers to assess robustness. The revised abstract now includes the total number of source and target samples, the train/test split strategy, a brief mention of baseline comparisons, and a note that hyperparameter selection was performed via cross-validation with reported standard deviations across runs. revision: yes

  3. Referee: [Abstract] Abstract: the first-principles validation of the 55 screened materials is asserted but supplies no numerical values for the computed mobilities, deformation potentials, or stability criteria, nor any comparison against the training distribution or known high-mobility 2D benchmarks.

    Authors: Detailed numerical results (mobility values, deformation potentials, phonon spectra confirming stability, and direct comparisons to both the training set and literature benchmarks) appear in Section 4 and the supplementary tables. To address the concern, the revised abstract now summarizes the key validation outcomes, including the range of computed mobilities for the 55 candidates and confirmation that all satisfy dynamical stability criteria. revision: yes
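The joint training described in the first simulated response can be written compactly. The form below is a generic domain-adversarial objective with an added MMD penalty, offered as an assumption about how such terms are commonly combined. The symbols $\theta_f$ (feature extractor $F$), $\theta_y$ (regression head), $\theta_d$ (domain discriminator), the weights $\lambda_{\mathrm{adv}}$ and $\lambda_{\mathrm{mmd}}$, and the number of sources $K$ are illustrative and do not come from the paper.

```latex
\begin{aligned}
\mathcal{L}(\theta_f,\theta_y,\theta_d)
  &= \sum_{k=1}^{K}\Big[
       \mathcal{L}_{\mathrm{reg}}^{(k)}(\theta_f,\theta_y)
       - \lambda_{\mathrm{adv}}\,\mathcal{L}_{\mathrm{dom}}^{(k)}(\theta_f,\theta_d)
       + \lambda_{\mathrm{mmd}}\,\mathrm{MMD}^2\big(F_{\theta_f}(X_{S_k}),\,F_{\theta_f}(X_T)\big)
     \Big],\\
(\hat\theta_f,\hat\theta_y)
  &= \arg\min_{\theta_f,\theta_y}\,\mathcal{L}(\theta_f,\theta_y,\hat\theta_d),
\qquad
\hat\theta_d = \arg\max_{\theta_d}\,\mathcal{L}(\hat\theta_f,\hat\theta_y,\theta_d).
\end{aligned}
```

Under this reading, the regression term on labeled source mobilities is what would keep task-relevant variance in the latent space, while the discriminator and MMD terms push source and target features toward a common distribution.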

Circularity Check

0 steps flagged

No circularity: standard transfer-learning pipeline with external validation

Full rationale

The paper presents a multi-source domain-adversarial + MMD framework followed by sample-adaptive ensemble regression on crystal-structure inputs. No equations, loss terms, or reported metrics (R² > 0.90, 55 screened candidates) are shown to reduce by construction to quantities defined inside the same model. The first-principles electron-phonon validation is an independent external check. No self-citation chains, uniqueness theorems, or fitted-input-renamed-as-prediction steps appear in the abstract or described pipeline. The derivation chain is therefore self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Only abstract available; no explicit free parameters, axioms, or invented entities are stated. The framework implicitly assumes that source-domain data contain transferable physical correlations and that the latent space alignment does not erase mobility-relevant features.

pith-pipeline@v0.9.1-grok · 5747 in / 1120 out tokens · 33982 ms · 2026-06-30T13:20:32.566374+00:00 · methodology

