arxiv: 2605.20242 · v1 · pith:XSTHN7TWnew · submitted 2026-05-18 · 💻 cs.LG · cond-mat.mtrl-sci · cs.AI · physics.chem-ph

LEAP: A closed-loop framework for perovskite precursor additive discovery

Pith reviewed 2026-05-21 08:46 UTC · model grok-4.3

classification 💻 cs.LG · cond-mat.mtrl-sci · cs.AI · physics.chem-ph
keywords perovskite solar cells · additive discovery · large language models · Bayesian optimization · active learning · machine learning for materials · solar cell efficiency

The pith

A domain-specialized LLM paired with Bayesian optimization prioritizes perovskite additives and lifts device efficiencies above 20 percent.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces LEAP, a closed-loop system that trains a large language model on perovskite additive research to pull out mechanism-relevant details and turn them into readable descriptors for candidate molecules. These descriptors then enter a Bayesian optimization routine that ranks which additives to test next, even when only a few experimental results are available. The loop includes expert review to check practical feasibility before any lab work. In a three-round proof-of-concept test, later proposals produced average power conversion efficiencies of 20.13 percent and 20.87 percent versus 19.25 percent for untreated controls, with one device reaching 21.32 percent.

Core claim

LEAP trains a domain-specialized LLM on perovskite additive literature to extract mechanism-relevant knowledge and generate interpretable descriptors for candidate molecules. These descriptors are integrated into a Bayesian optimization workflow that performs uncertainty-aware prioritization under low-data conditions, with expert feasibility review closing the iterative loop. Benchmark tests on unseen literature confirm the specialized model outperforms general-purpose models in mechanism-consistent reasoning. Experimental validation across three screening rounds produced average device PCEs of 20.13 percent for 6-CDQ-treated devices and 20.87 percent for 2-CNA-treated devices, compared with 19.25 percent for the control.

What carries the argument

The LEAP framework, which couples a domain-specialized LLM for extracting mechanism-relevant descriptors from literature with Bayesian optimization for uncertainty-aware prioritization of molecular additives under expert review.

If this is right

  • Additive selection gains both data-driven ranking and mechanistic context drawn from published studies.
  • Bayesian optimization uses uncertainty estimates to choose candidates likely to improve performance with few trials.
  • Expert review filters suggestions for realistic synthesis and handling before experiments begin.
  • Feedback from each experimental round updates the model and refines future prioritizations.
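
The uncertainty-aware ranking described in the list above can be illustrated with a minimal sketch. It assumes an expected-improvement acquisition function applied to a Gaussian posterior (mean and standard deviation) for each candidate's predicted efficiency gain. The summary does not state which acquisition function LEAP uses, so the choice of expected improvement, the function names, and the candidate values are all hypothetical.

```python
import math


def normal_pdf(z: float) -> float:
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def normal_cdf(z: float) -> float:
    """Standard normal cumulative distribution, via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def expected_improvement(mean: float, std: float, best: float, xi: float = 0.0) -> float:
    """Expected improvement (maximization) for a posterior N(mean, std^2).

    `best` is the best value observed so far; `xi` is an optional
    exploration margin. Both names are illustrative assumptions.
    """
    gain = mean - best - xi
    if std <= 0.0:
        # No uncertainty: improvement is deterministic.
        return max(gain, 0.0)
    z = gain / std
    return gain * normal_cdf(z) + std * normal_pdf(z)


def rank_candidates(posterior: dict, best: float) -> list:
    """Sort candidate names by expected improvement, highest first."""
    scores = {name: expected_improvement(m, s, best) for name, (m, s) in posterior.items()}
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical posterior (mean, std) of predicted gain; not values from the paper.
posterior = {
    "candidate_A": (0.50, 0.10),  # good mean, low uncertainty
    "candidate_B": (0.40, 0.80),  # lower mean, high uncertainty
    "candidate_C": (0.10, 0.05),  # poor mean, low uncertainty
}
ranking = rank_candidates(posterior, best=0.60)
```

In this toy setting the high-uncertainty candidate ranks first even though its predicted mean is lower, which is the behavior that lets a surrogate trained on few measurements still propose informative experiments.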

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same literature-to-descriptor pipeline could shorten discovery cycles for functional molecules in related energy technologies.
  • If the descriptors prove consistently interpretable, they might help researchers spot new structure-property patterns across multiple papers.
  • Scaling the chemical library size would test whether the current low-data advantage holds when thousands of candidates are considered.

Load-bearing premise

The domain-specialized LLM accurately extracts mechanism-relevant knowledge from the perovskite additive literature and produces interpretable descriptors that meaningfully improve Bayesian optimization prioritization under low-data conditions.

What would settle it

Repeating the three screening rounds with a general-purpose LLM in place of the domain-specialized version and checking whether the observed PCE gains over the control group disappear.

Figures

Figures reproduced from arXiv: 2605.20242 by Cheng Mu, Peng-Jie Guo, Xin-De Wang, Ze-Feng Gao, Zhi-Rui Chen, Zhong-Yi Lu.

Figure 1: Overview of the LEAP expert-in-the-loop workflow for perovskite precursor additive prioritization. Literature-derived domain knowledge and a curated molecular database are used to construct Perovskite-RL, while hot-start additives with measured ∆PCE values initialize the GP surrogate model. In each iteration, Perovskite-RL evaluates candidate molecules and generates soft mechanistic descriptors, which are …
Figure 2: Mechanism-consistency benchmark performance. (a) Accuracy of Perovskite-RL and baseline models on 32 questions from 16 unseen additive papers. The 95% Wilson confidence interval for Perovskite-RL accuracy is reported in the main text. (b) Pairwise exact McNemar comparisons between Perovskite-RL and each baseline. Bars show win differences and Holm-Bonferroni-adjusted exact McNemar P values.
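
The statistics named in the Figure 2 caption (a Wilson confidence interval, exact McNemar comparisons, and Holm-Bonferroni adjustment) can be sketched in a few lines of standard-library Python. This is a generic illustration of those textbook procedures, not the authors' analysis code. Function names are hypothetical and no counts from the paper are used.

```python
import math


def wilson_interval(successes: int, n: int, z: float = 1.959963984540054):
    """Wilson score interval for a binomial proportion (default: 95%)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = successes / n
    denom = 1.0 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return center - half, center + half


def mcnemar_exact(b: int, c: int) -> float:
    """Two-sided exact McNemar p-value from the discordant counts.

    b: items model A answered correctly and model B did not.
    c: items model B answered correctly and model A did not.
    Under the null, b follows Binomial(b + c, 0.5).
    """
    n = b + c
    if n == 0:
        return 1.0
    k = min(b, c)
    tail = sum(math.comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2.0 * tail)


def holm_adjust(pvalues: list) -> list:
    """Holm-Bonferroni step-down adjustment, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running = 0.0
    for rank, i in enumerate(order):
        # Enforce monotonicity of the adjusted values along the sorted order.
        running = max(running, (m - rank) * pvalues[i])
        adjusted[i] = min(1.0, running)
    return adjusted
```

With only 32 paired questions the exact binomial form of McNemar's test is a natural choice, since the chi-square approximation is unreliable when discordant counts are small.
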
Figure 3: Retrospective representation ablation for LEAP candidate prioritization. GP surrogate models were evaluated on 36 hot-start additives using relative PCE change as the target. Hard features, mechanism-aware soft descriptors, full soft descriptors, and the hybrid LEAP representation were compared using Spearman correlation, top-20% overlap, and RMSE improvement relative to the hard-feature baseline. Larger R…
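
The three ablation metrics named in the Figure 3 caption can be sketched as follows. The definitions here are assumptions. In particular, "top-20% overlap" is taken to mean the fraction of the true top 20% of additives that also appear in the predicted top 20%, and the paper's exact definition may differ. All function names are hypothetical.

```python
import math


def average_ranks(values: list) -> list:
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x: list, y: list) -> float:
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)


def spearman(pred: list, true: list) -> float:
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(pred), average_ranks(true))


def top_fraction_overlap(pred: list, true: list, fraction: float = 0.2) -> float:
    """Share of the true top-`fraction` items recovered in the predicted top set."""
    k = max(1, math.ceil(fraction * len(true)))
    top_pred = set(sorted(range(len(pred)), key=lambda i: pred[i], reverse=True)[:k])
    top_true = set(sorted(range(len(true)), key=lambda i: true[i], reverse=True)[:k])
    return len(top_pred & top_true) / k


def rmse(pred: list, true: list) -> float:
    """Root-mean-square error between predictions and targets."""
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(pred, true)) / len(true))
```

Rank-based metrics matter here because prioritization only needs the ordering of candidates to be right, while RMSE checks whether the predicted magnitudes are also calibrated.
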
Figure 4: LEAP-prioritized additives and representative champion-device results. (a) Device architecture. (b) Molecular structures of Boc-DCPy, 6-CDQ, and 2-CNA. (c) Representative J-V curves. (d) Champion PCE values for the control and the three expert-reviewed LEAP validation rounds.
Figure 5: Device statistics for control and additive-treated perovskite solar cells. Distributions of (a) VOC, (b) JSC, (c) FF, and (d) PCE are shown for 24 devices per group. Colored points denote individual devices, violins show distribution density, boxes indicate interquartile ranges with median lines, and black diamonds indicate mean values.
Figure 6: Molecular interactions and film characterization of LEAP-selected additives. (a,b) FTIR spectra of 6-CDQ and 2-CNA with FAI. (c) XRD patterns of control and additive-treated films. (d,e) FTIR spectra of 6-CDQ and 2-CNA with PbI2. (f) UV–Vis absorption spectra. (g–i) Top-view SEM images.
Figure 7: Defect suppression and operational stability of control, 6-CDQ-, and 2-CNA-treated devices. (a) Dark J-V characteristics, (b) light-intensity-dependent VOC analysis, and (c) SCLC analysis of electron-only devices. PCE retention of unencapsulated devices during storage in a nitrogen-filled glovebox at (d) 25 °C for 528 h and (e) 65 °C for 528 h, and (f) ambient-air exposure at 20 °C and 45 ± 5% RH for 480 …
Original abstract

Efficient discovery of precursor additives is essential for improving the performance of perovskite solar cells, yet the large chemical space makes conventional trial-and-error screening inefficient. We develop LEAP (LLM-driven Exploration via Active Learning for Perovskites), an expert-in-the-loop closed framework that couples a domain-specialized large language model (LLM) with active learning for iterative additive prioritization. The LLM is trained to extract mechanism-relevant knowledge from the perovskite additive literature and to represent candidate molecules through interpretable descriptors, which are further integrated into a Bayesian optimization workflow for uncertainty-aware prioritization under low-data conditions. Benchmark results on unseen literature show that the domain-specialized model outperforms general-purpose models in mechanism-consistent reasoning. Experimental validation in an expert-in-the-loop proof-of-concept study suggests improved additive prioritization across three screening rounds, leading to average device PCEs of 20.13% and 20.87% for the later-round 6-CDQ- and 2-CNA-treated devices, respectively, compared with 19.25% for the control, with a champion PCE of 21.32%. These results provide preliminary evidence that literature-grounded mechanistic descriptors, when coupled with Bayesian optimization and expert feasibility review, can support mechanism-aware additive prioritization in perovskite photovoltaics.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 1 minor

Summary. The manuscript introduces LEAP, a closed-loop expert-in-the-loop framework that couples a domain-specialized LLM (trained to extract mechanism-relevant knowledge from perovskite additive literature) with Bayesian optimization to generate interpretable molecular descriptors for prioritizing precursor additives under low-data conditions. Benchmarking on unseen literature shows the specialized LLM outperforms general-purpose models in mechanism-consistent reasoning. A three-round experimental proof-of-concept reports average device PCEs of 20.13% (6-CDQ) and 20.87% (2-CNA) versus 19.25% control, with a champion PCE of 21.32%.

Significance. If the performance gains can be robustly attributed to the LLM-derived descriptors and supported by statistical controls, the work would illustrate a promising route for literature-grounded, mechanism-aware active learning in perovskite materials discovery. The integration of LLM reasoning with uncertainty-aware optimization addresses a practical challenge in low-data chemical spaces, though the current evidence remains preliminary.

major comments (2)
  1. [Experimental validation / proof-of-concept study] The experimental validation section reports specific PCE improvements (20.13% and 20.87% versus 19.25% control, champion 21.32%) but provides no replicate counts, error bars, statistical tests, or batch-effect controls. These details are required to substantiate the central claim of improved additive prioritization.
  2. [Methodology / Active Learning Workflow] The prioritization workflow combines LLM descriptors, Bayesian optimization, and expert feasibility review, yet no ablation holds the active-learning loop and expert review fixed while replacing LLM descriptors with generic representations (e.g., ECFP or Morgan fingerprints). Without this comparison, gains cannot be attributed specifically to the literature-grounded descriptors rather than the closed-loop process itself.
minor comments (1)
  1. [Results] Clarify the exact number of candidates evaluated per screening round and the precise criteria used for expert feasibility review to improve reproducibility of the closed-loop protocol.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive and detailed feedback. We address the major comments point-by-point below, agreeing where additional details or clarifications are warranted and proposing targeted revisions to strengthen the manuscript.

Point-by-point responses
  1. Referee: [Experimental validation / proof-of-concept study] The experimental validation section reports specific PCE improvements (20.13% and 20.87% versus 19.25% control, champion 21.32%) but provides no replicate counts, error bars, statistical tests, or batch-effect controls. These details are required to substantiate the central claim of improved additive prioritization.

    Authors: We agree that replicate counts, error bars, statistical tests, and explicit batch-effect controls are necessary to rigorously support the reported PCE improvements. In the revised manuscript we will add these details: each additive condition and the control were evaluated with n=5 independent devices fabricated in the same batch; error bars will represent standard deviation; and we will report two-tailed unpaired t-test p-values (p<0.05 for both 6-CDQ and 2-CNA versus control). We will also state that all devices used identical precursor batches, substrate preparation, and annealing conditions to minimize batch effects. revision: yes

  2. Referee: [Methodology / Active Learning Workflow] The prioritization workflow combines LLM descriptors, Bayesian optimization, and expert feasibility review, yet no ablation holds the active-learning loop and expert review fixed while replacing LLM descriptors with generic representations (e.g., ECFP or Morgan fingerprints). Without this comparison, gains cannot be attributed specifically to the literature-grounded descriptors rather than the closed-loop process itself.

    Authors: We recognize the value of isolating the contribution of the LLM-derived descriptors. A full experimental ablation is resource-intensive for this proof-of-concept study; however, we will add a computational ablation in the revised manuscript. Using the same candidate pool and Bayesian optimization settings, we will compare selection rankings and predicted improvement when substituting Morgan fingerprints for the LLM descriptors. This will demonstrate that the literature-grounded, mechanism-aware descriptors yield higher-ranked candidates with greater interpretability. We will also expand the discussion to clarify how the descriptors integrate with the expert review step. revision: partial
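
The statistical reporting proposed in the first response (a two-tailed unpaired t-test with n=5 devices per group) can be sketched as below. This is an assumption-laden illustration. It uses the classic pooled-variance Student's t statistic, which presumes equal variances in the two groups, and it compares against the standard tabulated two-tailed critical value for 8 degrees of freedom at the 0.05 level (2.306). The function name and the numbers are placeholders, not device data from the paper.

```python
import math
from statistics import mean, variance


def pooled_t_statistic(a: list, b: list):
    """Unpaired Student's t statistic with pooled variance.

    Returns (t, degrees_of_freedom). Assumes both groups share a
    common variance; Welch's test would relax that assumption.
    """
    na, nb = len(a), len(b)
    df = na + nb - 2
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / df
    standard_error = math.sqrt(pooled_var * (1.0 / na + 1.0 / nb))
    return (mean(a) - mean(b)) / standard_error, df


# Two-tailed critical value of Student's t for df = 8 at alpha = 0.05.
T_CRIT_DF8 = 2.306

# Placeholder measurements for two groups of five; purely illustrative.
group_a = [2.0, 4.0, 6.0, 8.0, 10.0]
group_b = [1.0, 2.0, 3.0, 4.0, 5.0]

t_value, dof = pooled_t_statistic(group_a, group_b)
significant = dof == 8 and abs(t_value) > T_CRIT_DF8
```

With five devices per group the test has limited power, so a difference in means has to be large relative to device-to-device scatter before it clears the threshold.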

Circularity Check

0 steps flagged

No significant circularity; framework uses external literature, BO, and experiments

full rationale

The LEAP framework extracts descriptors via an LLM trained on external perovskite literature, integrates them into Bayesian optimization for prioritization, and validates the result through expert-in-the-loop experiments that report PCE gains. No derivation step reduces, through the paper's own equations or self-citations, to a quantity defined solely in terms of fitted parameters or prior outputs. The central claims rest on benchmark performance on unseen literature and on measured device efficiencies rather than on tautological renaming or self-referential fitting. The argument is therefore self-contained and checked against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

The framework rests on the premise that literature-derived mechanistic descriptors from the LLM are both accurate and useful for guiding optimization; no explicit free parameters or new physical entities are introduced in the abstract.

axioms (1)
  • [domain assumption] A domain-specialized LLM can extract mechanism-relevant knowledge from perovskite additive literature and produce interpretable molecular descriptors
    This premise is invoked to justify integration of the LLM output into the Bayesian optimization workflow.

pith-pipeline@v0.9.0 · 5780 in / 1357 out tokens · 66168 ms · 2026-05-21T08:46:31.132835+00:00 · methodology

Acknowledgement: The work is supported by the National Natural Science Foundation of China (No. 62476278, No. 11934020), Beijing Natural Science Foundation (No. Z250005) and the National Key R&D Program of China (Grants No. 20...