arXiv: 2605.19802 · v1 · pith:EBX5SY3Pnew · submitted 2026-05-19 · ❄️ cond-mat.mtrl-sci · physics.app-ph · physics.comp-ph

Building a Regional Data-Centric Materials Science Ecosystem for Processing-Rich Materials Innovation in the Great Plains

Pith reviewed 2026-05-20 04:21 UTC · model grok-4.3

classification ❄️ cond-mat.mtrl-sci · physics.app-ph · physics.comp-ph
keywords regional materials data ecosystem · Great Plains · FAIR metadata · provenance tracking · high-purity germanium · experimental data · workforce training · data sharing barriers

The pith

The Great Plains can lead in data-centric materials science by organizing its scattered labs into a trusted regional data ecosystem.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Data-centric materials science still struggles to capture experimental, processing-rich, and field-relevant data that standard databases overlook. The paper argues that the Great Plains and nearby interior corridor can make a national contribution by linking distributed experimental assets into one coordinated ecosystem built on FAIR metadata, provenance tracking, persistent sample identifiers, uncertainty-aware modeling, semi-closed-loop workflows, and tiered governance for different data users. A high-purity germanium pilot shows how regional strengths can be turned into reusable datasets, benchmark models, and cross-trained staff. Readers care because the approach offers a path for non-coastal regions to participate without needing massive centralized facilities. The claim is that trustworthy data practices and application-driven challenges matter more than geographic concentration.

Core claim

The central claim is that the Great Plains and adjacent interior research corridor can make a distinctive national contribution by organizing distributed experimental assets into a trusted regional materials-data ecosystem that uses FAIR metadata, provenance, persistent sample identifiers, uncertainty-aware modeling, semi-closed-loop workflows, stackable workforce training, and tiered governance for academic, public, controlled-access, and industry-protected data; this model addresses five coupled barriers through a staged roadmap, with the high-purity germanium pilot demonstrating conversion of regional strengths into reusable datasets, benchmark models, trained personnel, and decision-improving workflows.

What carries the argument

The trusted regional materials-data ecosystem that integrates distributed experimental assets via FAIR metadata, provenance tracking, persistent identifiers, and tiered governance while using a staged roadmap to overcome five barriers.
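
As a rough illustration of how persistent sample identifiers, provenance tracking, and tiered governance might fit together in one record, here is a minimal Python sketch. Nothing in it comes from the paper: the class names, the `sample:` identifier scheme, the ordering of the four access tiers, and the `can_read` rule are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from enum import Enum
import uuid


class AccessTier(Enum):
    # The four tier names follow the paper; their ordering is an assumption.
    PUBLIC = 0
    ACADEMIC = 1
    CONTROLLED = 2
    INDUSTRY_PROTECTED = 3


@dataclass
class ProvenanceStep:
    action: str       # what was done to the sample (illustrative free text)
    lab: str          # hypothetical lab label
    parameters: dict  # processing conditions, ideally with units in the keys


@dataclass
class SampleRecord:
    material: str
    tier: AccessTier
    # Hypothetical identifier scheme; a real ecosystem would pick its own.
    sample_id: str = field(default_factory=lambda: f"sample:{uuid.uuid4()}")
    provenance: list = field(default_factory=list)

    def add_step(self, step: ProvenanceStep) -> None:
        """Append one processing or measurement step to the history."""
        self.provenance.append(step)


def can_read(user_tier: AccessTier, record: SampleRecord) -> bool:
    """Assumed rule: a user may read records at or below their own tier."""
    return user_tier.value >= record.tier.value
```

A record created this way keeps the same `sample_id` as steps from different labs are appended, which is the property a persistent identifier is meant to provide. A real governance model would likely need per-dataset agreements rather than a single linear ordering of tiers.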

If this is right

  • Experimental, processing-rich, device-level, and field-relevant data become reusable across labs.
  • Benchmark models improve optimization, manufacturing, and qualification of materials.
  • Workforce gaps at the materials-data interface shrink through stackable training.
  • Decision-improving workflows become available for both academic and industry partners.
  • Regional leadership in data-centric materials science grows from trustworthy practices rather than physical concentration of resources.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Other interior or rural research corridors could adopt the same tiered-governance model to expand national capacity without new megacenters.
  • The approach might scale to adjacent fields such as energy or environmental sensing if the same provenance and access rules prove portable.
  • If the pilot succeeds, smaller labs gain a route to contribute to national databases while retaining control over proprietary or sensitive data.

Load-bearing premise

That the proposed mix of FAIR metadata, provenance tracking, persistent identifiers, uncertainty-aware modeling, semi-closed-loop workflows, and tiered governance will overcome the five barriers and actually produce reusable datasets and better materials decisions.
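
Two ingredients of this premise, uncertainty-aware modeling and semi-closed-loop workflows, can be made concrete with a small Python sketch. The paper does not define either term operationally, so everything here is an assumption: candidates are plain dictionaries with hypothetical `mean` and `std` fields, selection uses a simple upper-confidence-bound score, and "semi-closed" is read as "a human must approve each proposed experiment before it runs".

```python
def propose_next(candidates, kappa=1.0):
    """Return the candidate with the highest mean + kappa * std.

    Each candidate is a dict with hypothetical keys "mean" (predicted
    property value) and "std" (model uncertainty). A larger kappa favors
    exploring uncertain candidates over exploiting confident ones.
    """
    return max(candidates, key=lambda c: c["mean"] + kappa * c["std"])


def semi_closed_loop_step(candidates, approve, kappa=1.0):
    """Propose one experiment and return it only if a reviewer approves.

    `approve` is any callable that takes the proposal and returns a bool;
    it stands in for the human decision point in the loop (an assumption).
    """
    proposal = propose_next(candidates, kappa)
    return proposal if approve(proposal) else None
```

With `kappa=0` the selection ignores uncertainty entirely, which shows what the uncertainty term adds. Returning `None` on rejection leaves it to the caller to decide whether to propose again or stop.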

What would settle it

If the high-purity germanium pilot fails to generate reusable datasets, benchmark models, or trained personnel that produce measurable improvements in materials decisions, the central claim would be falsified.

Figures

Figures reproduced from arXiv: 2605.19802 by A. Ghasemi, A. Mahat, A. Majeed, A. M. Castillo, A. Nawaz, A. Prem, B. Cui, B. D. S. Gurung, B. Lama, B. V. Benson, C. M. Adhikari, C. S. Tadi, C.-X. Yu, D. Chakraborty, D. Kim, D.-M. Mei, D. Zeng, E. Z. Gnimpieba, G.-L. Yin, H. A. Hashim, H. Oli, J. Mammo, K. Acharya, K. Bhatta, K.-C. Kong, K.-E. Hasin, K.-M. Dong, K. Rana Magar, K. S. Moore, L. Pandey, L.-W. Wang, M. Adhikari, M. K. Hassanzadeh, M. K. Jha, M. M. Masud, M. M. Rana, M. Zhou, N. Budhathoki, N. Maharjan, Q. Zhou, R. D. Cruz, R. Gapuz, R. I. Harry, R. Pandey, R. Rizk, S. A. Panamaldeniya, S. Aryal, S. Bhattarai, S. Chhetri, S. Choudhury, S. Dhital, T. A. Chowdhury, T. Mukherjee, Y. Yang, Z. Peng.

Figure 1: Conceptual architecture for a regional data-centric materials science ecosystem.
Figure 2: Conceptual closed-loop workflow for data-centric materials innovation.
Figure 3: Networked landscape for data-centric materials science in the Great Plains and adjacent ...
Figure 4: Regional asset-to-data-commons strategy.
Figure 5: Example high-purity germanium closed-loop pilot workflow for data-centric materials ...
Figure 6: Representative use cases for a Great Plains data-centric materials science ecosystem.

Original abstract

Data-centric materials science is changing how materials are discovered, optimized, manufactured, and qualified, yet many deployment-limiting materials problems still depend on experimental, processing-rich, device-level, and field-relevant data that are difficult to capture in conventional materials databases. This perspective argues that the Great Plains and adjacent interior research corridor can make a distinctive national contribution by organizing distributed experimental assets into a trusted regional materials-data ecosystem. The proposed model emphasizes FAIR metadata, provenance, persistent sample identifiers, uncertainty-aware modeling, semi-closed-loop workflows, stackable workforce training, and tiered governance for academic, public, controlled-access, and industry-protected data. We identify five coupled barriers (fragmented data, weak algorithm-laboratory translation, uneven access to cyberinfrastructure and technical staff, workforce gaps at the materials-data interface, and insufficient incentives for sharing and reuse) and propose a staged roadmap for addressing them. A high-purity germanium pilot illustrates how regional strengths can be converted into reusable datasets, benchmark models, trained personnel, and decision-improving workflows. The broader message is that regional leadership in data-centric materials science will depend less on geographic concentration than on trustworthy data practices, interoperable infrastructure, cross-trained people, and application-driven materials challenges.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. This perspective article claims that the Great Plains and adjacent interior research corridor can make a distinctive national contribution to data-centric materials science by organizing distributed experimental assets into a trusted regional materials-data ecosystem. It identifies five barriers (fragmented data, weak algorithm-laboratory translation, uneven access to cyberinfrastructure and staff, workforce gaps at the materials-data interface, and insufficient incentives for sharing) and proposes a model using FAIR metadata, provenance tracking, persistent sample identifiers, uncertainty-aware modeling, semi-closed-loop workflows, stackable workforce training, and tiered governance across academic, public, controlled-access, and industry-protected data. A high-purity germanium pilot serves as an illustrative example, accompanied by a staged roadmap for implementation.

Significance. If implemented, the proposed regional ecosystem could advance materials innovation by improving capture and reuse of processing-rich, experimental, and device-level data that conventional databases struggle to accommodate. The regional, distributed approach offers a practical alternative to centralized efforts, potentially leveraging existing experimental assets in the corridor while addressing trust and access through tiered governance and uncertainty-aware methods. The framework contributes conceptually by linking specific barriers to actionable infrastructure and training elements.

major comments (2)
  1. [High-purity germanium pilot illustration] The high-purity germanium pilot is presented as converting regional strengths into reusable datasets, benchmark models, and decision-improving workflows, yet the manuscript supplies no outcomes, metrics, error analysis, or even preliminary results from this pilot. This is load-bearing for the central claim that the ecosystem model will produce reusable datasets and better materials decisions.
  2. [Identification of barriers and proposed roadmap] The assertion that the proposed combination of FAIR metadata, provenance tracking, persistent identifiers, uncertainty-aware modeling, semi-closed-loop workflows, and tiered governance will overcome the five identified barriers rests entirely on conceptual logic without references to prior implementations, quantitative projections, or risk assessments. This undermines the persuasiveness of the staged roadmap as a solution.
minor comments (2)
  1. [Abstract and workforce training discussion] The abstract introduces 'stackable workforce training' but the main text provides limited elaboration on its structure or integration with the other elements; ensure this is expanded for consistency.
  2. [Overall manuscript] Terms such as 'semi-closed-loop workflows' and 'tiered governance' would benefit from a short definition or citation on first use to aid readers outside the immediate subfield.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their constructive review and for recognizing the potential value of a regional, distributed approach to data-centric materials science. We address the two major comments below, clarifying the scope of our perspective article while committing to targeted revisions that improve transparency without altering its conceptual nature.

Point-by-point responses
  1. Referee: [High-purity germanium pilot illustration] The high-purity germanium pilot is presented as converting regional strengths into reusable datasets, benchmark models, and decision-improving workflows, yet the manuscript supplies no outcomes, metrics, error analysis, or even preliminary results from this pilot. This is load-bearing for the central claim that the ecosystem model will produce reusable datasets and better materials decisions.

    Authors: We agree that the high-purity germanium example is presented without empirical outcomes or quantitative metrics. As a perspective article, the pilot functions as an illustrative case study to show how existing regional experimental assets (e.g., detector-grade crystal growth and characterization capabilities) could map onto the proposed ecosystem elements such as persistent identifiers and uncertainty-aware workflows. We do not claim completed results from an operational pilot. In revision we will (1) explicitly label the example as conceptual and forward-looking, (2) add a short paragraph describing the specific regional strengths that motivate the choice of germanium, and (3) note the absence of pilot-scale data as a limitation that future implementation work would need to address. This clarification removes any implication that the manuscript contains empirical validation.

    Revision: partial

  2. Referee: [Identification of barriers and proposed roadmap] The assertion that the proposed combination of FAIR metadata, provenance tracking, persistent identifiers, uncertainty-aware modeling, semi-closed-loop workflows, and tiered governance will overcome the five identified barriers rests entirely on conceptual logic without references to prior implementations, quantitative projections, or risk assessments. This undermines the persuasiveness of the staged roadmap as a solution.

    Authors: The referee correctly notes that the linkage between the proposed technical and governance elements and the five barriers is primarily conceptual. While the manuscript draws on established FAIR principles and provenance standards, it does not cite specific prior deployments of the full combination in materials science. In the revised version we will add citations to documented regional or distributed data-sharing efforts in adjacent domains (e.g., environmental sensor networks and high-energy physics collaborations) and include a brief risk-assessment subsection within the roadmap that identifies key adoption risks and corresponding mitigation steps. We maintain that a perspective article is an appropriate venue for outlining such a framework, but we accept that additional grounding references and risk discussion will strengthen the presentation.

    Revision: yes

Circularity Check

0 steps flagged

No significant circularity

full rationale

The manuscript is a perspective and roadmap proposal that identifies five barriers to data-centric materials science and outlines a regional ecosystem model using FAIR principles, provenance tracking, and tiered governance. It contains no equations, derivations, fitted parameters, or quantitative predictions that could reduce to their own inputs by construction. The high-purity germanium example is presented as an illustration rather than a validated result or self-referential fit. No load-bearing self-citations or uniqueness theorems are invoked to force the central claim; the argument rests on stated assumptions about implementation that are explicitly noted as unproven in the reader's take. The derivation chain is therefore self-contained as a forward-looking proposal without circular reduction.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The proposal rests on domain assumptions about the superiority of regional trusted data ecosystems over existing national approaches, without new evidence or derivations supplied in the text.

axioms (1)
  • domain assumption: Organizing distributed experimental assets with FAIR metadata, provenance, and tiered governance will overcome fragmented data and weak algorithm-laboratory translation barriers.
    This premise is invoked to justify the entire regional ecosystem model and roadmap.

pith-pipeline@v0.9.0 · 6061 in / 1459 out tokens · 65389 ms · 2026-05-20T04:21:24.602201+00:00 · methodology

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

  • IndisputableMonolith/Foundation/AbsoluteFloorClosure.lean · reality_from_one_distinction
    Relation between the paper passage and the cited Recognition theorem: unclear

    The proposed model emphasizes FAIR metadata, provenance, persistent sample identifiers, uncertainty-aware modeling, semi-closed-loop workflows, stackable workforce training, and tiered governance for academic, public, controlled-access, and industry-protected data.

  • IndisputableMonolith/Foundation/BranchSelection.lean · branch_selection
    Relation between the paper passage and the cited Recognition theorem: unclear

    We identify five coupled barriers—fragmented data, weak algorithm–laboratory translation, uneven access to cyberinfrastructure and technical staff, workforce gaps at the materials–data interface, and insufficient incentives for sharing and reuse—and propose a staged roadmap for addressing them.

What do these tags mean?

  • matches: The paper's claim is directly supported by a theorem in the formal canon.
  • supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
  • extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
  • uses: The paper appears to rely on the theorem as machinery.
  • contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
  • unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
