An Unsupervised Machine Learning-based Framework for Wafer Scale Variability Analysis and Performance Prediction of Ferroelectric Hf0.5Zr0.5O2 Thin Film Capacitors
Pith reviewed 2026-05-09 18:41 UTC · model grok-4.3
The pith
Unsupervised machine learning using PCA and K-Means predicts ferroelectric HZO capacitor performance on untested wafer dies with 5-10% error.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
By reducing multidimensional performance data with PCA and applying K-Means to form clusters, the framework separates dies into performance categories defined by key parameters like remanent polarization (Pr) and coercive voltage (Vc), then uses data from measured dies to predict the same parameters on untested dies, achieving a mean absolute percentage error in the 5-10% range.
What carries the argument
Principal Component Analysis combined with K-Means clustering applied to electrical test data, which reduces dimensionality and defines performance categories for virtual metrology predictions.
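A minimal sketch of the PCA-plus-K-Means step, assuming each device is summarized by one row of electrical parameters such as Pr and Vc. The function name `pca_kmeans`, the standardization step, the deterministic seeding along the first principal component, and the synthetic numbers are illustrative assumptions, not details taken from the paper.

```python
import numpy as np

def pca_kmeans(features, n_components=2, k=2, n_iter=50):
    """Standardize, project onto leading principal components, then cluster.

    features: (n_devices, n_features) array, e.g. columns [Pr, Vc].
    Returns (scores, labels, centroids), all in principal-component space.
    Assumes no feature column is constant (nonzero standard deviation).
    """
    X = np.asarray(features, dtype=float)
    # Standardize so parameters with large numeric ranges do not dominate.
    Z = (X - X.mean(axis=0)) / X.std(axis=0)
    # PCA via SVD of the standardized data: rows of Vt are the principal axes.
    _, _, Vt = np.linalg.svd(Z, full_matrices=False)
    scores = Z @ Vt[:n_components].T

    # Lloyd's algorithm with a deterministic start (an assumption of this
    # sketch): seed centroids at points evenly spaced along sorted PC1.
    order = np.argsort(scores[:, 0])
    centroids = scores[order[np.linspace(0, len(order) - 1, k).astype(int)]].copy()
    for _ in range(n_iter):
        dists = np.linalg.norm(scores[:, None] - centroids[None], axis=2)
        labels = dists.argmin(axis=1)
        updated = np.array([
            scores[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        if np.allclose(updated, centroids):
            break
        centroids = updated
    return scores, labels, centroids

# Synthetic stand-in data (made up for illustration): columns are [Pr, Vc].
rng = np.random.default_rng(0)
high = rng.normal([20.0, 1.0], [0.5, 0.05], size=(20, 2))
low = rng.normal([12.0, 1.4], [0.5, 0.05], size=(20, 2))
scores, labels, centroids = pca_kmeans(np.vstack([high, low]), k=2)
```

The cluster labels returned here play the role of the performance categories; in practice the number of components and K would have to be chosen from the measured data.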
If this is right
- Performance categories allow classification of high- and low-performing regions without exhaustive testing.
- Low-error predictions reduce the volume of physical measurements required for wafer-scale quality control.
- Uniformity comparisons between dies become possible from partial sampling data.
- The virtual metrology approach supports faster iteration on fabrication processes to improve overall yield.
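The page does not say which uniformity metric the paper uses. As one hedged illustration, dies could be compared by the coefficient of variation (CV) of a parameter such as Pr. The function names and the sample values below are hypothetical.

```python
from statistics import mean, pstdev

def coefficient_of_variation(values):
    """Population standard deviation divided by |mean|, in percent."""
    m = mean(values)
    return 100.0 * pstdev(values) / abs(m)

def rank_dies_by_uniformity(values_by_die):
    """Return (die_id, CV%) pairs sorted from most to least uniform."""
    cvs = {die: coefficient_of_variation(v) for die, v in values_by_die.items()}
    return sorted(cvs.items(), key=lambda item: item[1])

# Hypothetical Pr readings per die, for illustration only.
pr_by_die = {
    "die_A": [20.0, 20.0, 20.0],
    "die_B": [10.0, 20.0, 30.0],
}
ranking = rank_dies_by_uniformity(pr_by_die)
```

A lower CV means the devices on that die behave more alike, so `die_A` ranks ahead of `die_B` in this toy input.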
Where Pith is reading between the lines
- The method could be paired with real-time process sensors to adjust deposition conditions mid-run based on early cluster assignments.
- Similar clustering on other variable thin-film systems might reduce testing needs in related memory or sensor technologies.
- Determining the minimal number of initial dies required for stable clusters would make the framework more practical for production lines.
Load-bearing premise
The clusters identified from a limited sample of measured dies are representative of variability patterns across the full wafer without needing further physical calibration.
What would settle it
Applying the trained model to a hold-out set of measured dies or a new wafer and finding average prediction errors above 10% for remanent polarization or coercive voltage on those locations.
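The test above reduces to computing MAPE on hold-out locations and comparing it with the 10% bound. A minimal sketch follows; the function names `mape` and `claim_survives` and the default threshold argument are illustrative choices, not the paper's code.

```python
def mape(measured, predicted):
    """Mean absolute percentage error, in percent.

    Measured values must be nonzero, since each error is divided by them.
    """
    if len(measured) != len(predicted) or not measured:
        raise ValueError("need two equal-length, non-empty sequences")
    total = sum(abs((m - p) / m) for m, p in zip(measured, predicted))
    return 100.0 * total / len(measured)

def claim_survives(measured, predicted, threshold_pct=10.0):
    """True if the hold-out MAPE stays at or below the threshold."""
    return mape(measured, predicted) <= threshold_pct
```

The check would be run separately for Pr and for Vc, since the falsification condition names either parameter.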
Original abstract
Fabrication process-induced performance variability remains a formidable barrier in the high-volume manufacturing of semiconductor chips. With skyrocketing Artificial Intelligence (AI) workloads, demand for non-volatile and computational memories is growing exponentially. As an embedded non-volatile memory, ferroelectric Hf0.5Zr0.5O2 (HZO) has emerged as a strong candidate due to its CMOS back-end-of-line (BEOL) compatibility, scalability and high performance. However, its sensitive crystallization kinetics leads to significant device-to-device (D2D) non-uniformity, which makes performance unpredictable at wafer scale. In this work, we demonstrate that unsupervised machine learning can analyze intra-die D2D variations and predict the performance of "unseen" dies efficiently. We present a framework utilizing Principal Component Analysis (PCA) and K-Means clustering to analyze D2D performance variations in HZO capacitors. Building on data from multiple dies, we move beyond traditional descriptive statistics to a predictive "Virtual Metrology" approach that separates performance categories defined by key parameters like remanent polarization (Pr) and coercive voltage (Vc). The analysis further extends to comparing uniformity across different dies on the wafer, showing that the proposed methodology can accurately predict device performance on untested dies with a low Mean Absolute Percentage Error (MAPE) in the range of 5-10%, suggesting a robust path for accelerated yield improvement and reduced metrology overhead.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript proposes an unsupervised machine learning framework that applies Principal Component Analysis (PCA) and K-Means clustering to electrical data from ferroelectric Hf0.5Zr0.5O2 (HZO) thin-film capacitors. It aims to characterize intra-die device-to-device variability, identify performance categories based on parameters such as remanent polarization (Pr) and coercive voltage (Vc), and enable predictive 'Virtual Metrology' for performance on untested dies, reporting a Mean Absolute Percentage Error (MAPE) of 5-10%.
Significance. If the prediction generalizes reliably, the approach could reduce physical metrology overhead in wafer-scale manufacturing of BEOL-compatible ferroelectric memories by enabling data-driven yield analysis. The unsupervised clustering extends beyond traditional statistics for variability assessment. However, the absence of key methodological details limits the ability to evaluate whether the 5-10% MAPE reflects genuine predictive power or untested assumptions about cluster representativeness.
major comments (3)
- [Abstract] Abstract: The claim of accurate prediction on unseen dies with 5-10% MAPE provides no information on dataset size (number of dies or measurements), die sampling strategy across the wafer, or the validation procedure (e.g., whether hold-out was by die, random split, or leave-one-die-out). This information is load-bearing for assessing if the learned clusters capture full wafer variability including spatial gradients.
- [Methods] Methods/Results: It is not specified how cluster assignments or centroids from the unsupervised PCA+K-Means step are converted into quantitative predictions of Pr and Vc values for unseen dies. If this involves additional fitting or supervised steps, the reported MAPE and the 'unsupervised' framing require clarification to avoid circularity in the error metric.
- [Results] Results: No details are given on how ground-truth Pr/Vc values were obtained for the held-out dies used in the MAPE calculation, nor is there validation that the clusters correspond to physical mechanisms (e.g., crystallization kinetics) rather than data artifacts.
minor comments (2)
- [Abstract] The term 'Virtual Metrology' is used without a brief definition or reference on first appearance.
- [Figures] Figure captions should explicitly state the input features to PCA, the number of retained components, and the value of K used in clustering for reproducibility.
Simulated Author's Rebuttal
We thank the referee for their constructive and detailed comments, which highlight important areas for improving the clarity and completeness of our manuscript. We address each major comment point-by-point below, providing clarifications and indicating where revisions will be made to strengthen the presentation of our unsupervised framework.
-
Referee: [Abstract] Abstract: The claim of accurate prediction on unseen dies with 5-10% MAPE provides no information on dataset size (number of dies or measurements), die sampling strategy across the wafer, or the validation procedure (e.g., whether hold-out was by die, random split, or leave-one-die-out). This information is load-bearing for assessing if the learned clusters capture full wafer variability including spatial gradients.
Authors: We agree these details are critical for assessing the framework's ability to capture wafer-scale variability. In the revised manuscript, we will expand both the abstract and the Methods section to report the dataset size (number of dies measured and total device measurements), the die sampling strategy (selection across the wafer to include spatial gradients), and the validation procedure (hold-out of entire dies to simulate unseen dies, rather than random splits). This leave-one-die-out approach directly tests generalization to untested dies while preserving the unsupervised nature of the PCA and K-Means steps. revision: yes
-
Referee: [Methods] Methods/Results: It is not specified how cluster assignments or centroids from the unsupervised PCA+K-Means step are converted into quantitative predictions of Pr and Vc values for unseen dies. If this involves additional fitting or supervised steps, the reported MAPE and the 'unsupervised' framing require clarification to avoid circularity in the error metric.
Authors: The conversion is performed without any supervised fitting or labeled data. After fitting PCA and K-Means on the training dies, an unseen die's electrical measurements are projected into the PCA space and assigned to the nearest cluster centroid. The predicted Pr and Vc values are then taken directly from the centroid coordinates of that cluster (computed as the mean of the training devices assigned to it). The MAPE is calculated by comparing these centroid-derived predictions against the independently measured values on the held-out die. This process remains fully unsupervised, as cluster assignment uses only the learned structure from training data. We will revise the Methods section to explicitly describe this procedure and clarify that the MAPE evaluates predictive utility of the clusters rather than introducing circularity. revision: yes
-
Referee: [Results] Results: No details are given on how ground-truth Pr/Vc values were obtained for the held-out dies used in the MAPE calculation, nor is there validation that the clusters correspond to physical mechanisms (e.g., crystallization kinetics) rather than data artifacts.
Authors: Ground-truth Pr and Vc values for held-out dies were obtained through the same direct electrical characterization measurements used for the training dies, as detailed in the experimental methods. The clusters are statistical groupings derived from the electrical parameter space and are not claimed to map directly to specific physical mechanisms such as crystallization kinetics. We will add explicit text in the Results section stating that the framework is data-driven for variability analysis and prediction, with clusters reflecting performance similarity rather than proven physical causation. We note this as a scope limitation and suggest physical interpretation as a direction for future work, without altering the current claims. revision: partial
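The procedure described in the responses above (fit on training dies, project a held-out die, assign each device to the nearest centroid, predict the cluster mean, score with MAPE) can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the function names, the two-column [Pr, Vc] layout, the deterministic K-Means seeding, and the synthetic data are all hypothetical. Note that the held-out devices' own measurements drive the cluster assignment in this sketch, as the responses describe.

```python
import numpy as np

def fit_clusters(X_train, n_components=2, k=2, n_iter=50):
    """Fit scaling, PCA axes and K-Means centroids on training devices only."""
    mu, sigma = X_train.mean(axis=0), X_train.std(axis=0)
    Z = (X_train - mu) / sigma
    _, _, Vt = np.linalg.svd(Z, full_matrices=False)
    axes = Vt[:n_components]
    scores = Z @ axes.T
    # Deterministic seeding along PC1 (an assumption of this sketch).
    order = np.argsort(scores[:, 0])
    centroids = scores[order[np.linspace(0, len(order) - 1, k).astype(int)]].copy()
    for _ in range(n_iter):
        labels = np.linalg.norm(scores[:, None] - centroids[None], axis=2).argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centroids[j] = scores[labels == j].mean(axis=0)
    labels = np.linalg.norm(scores[:, None] - centroids[None], axis=2).argmin(axis=1)
    # Cluster means in original units; assumes no cluster ended up empty.
    cluster_means = np.array([X_train[labels == j].mean(axis=0) for j in range(k)])
    return mu, sigma, axes, centroids, cluster_means

def predict(model, X_new):
    """Assign each device to its nearest centroid and return that cluster's mean."""
    mu, sigma, axes, centroids, cluster_means = model
    scores = ((X_new - mu) / sigma) @ axes.T
    labels = np.linalg.norm(scores[:, None] - centroids[None], axis=2).argmin(axis=1)
    return cluster_means[labels]

def leave_one_die_out(data_by_die, **fit_kwargs):
    """Return {held_out_die: per-column MAPE in percent}."""
    results = {}
    for held_out, X_test in data_by_die.items():
        X_train = np.vstack([X for die, X in data_by_die.items() if die != held_out])
        pred = predict(fit_clusters(X_train, **fit_kwargs), X_test)
        results[held_out] = 100.0 * np.mean(np.abs((X_test - pred) / X_test), axis=0)
    return results

# Synthetic dies (made-up numbers): columns are [Pr, Vc].
rng = np.random.default_rng(0)

def synthetic_die():
    high = rng.normal([20.0, 1.0], [0.5, 0.05], size=(10, 2))
    low = rng.normal([12.0, 1.4], [0.5, 0.05], size=(10, 2))
    return np.vstack([high, low])

dies = {name: synthetic_die() for name in ("die_A", "die_B", "die_C")}
errors = leave_one_die_out(dies, n_components=2, k=2)
```

Holding out whole dies, rather than random devices, keeps every device of the test die out of the fit, which is the property the referee asked to have stated.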
Circularity Check
No circularity in derivation chain
The framework applies PCA and K-Means to measured Pr/Vc data from sampled dies to form clusters, then reports MAPE on held-out untested dies. This is a standard data-driven clustering-plus-prediction pipeline whose accuracy metric is computed on independent test data rather than reducing to a fitted parameter or self-definition. No self-citations, ansatz smuggling, or uniqueness theorems are invoked as load-bearing steps; the central claim rests on empirical validation rather than construction.
Axiom & Free-Parameter Ledger
free parameters (2)
- Number of principal components retained
- Number of clusters (K) in K-Means
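For the first of these free parameters, one common heuristic is to retain enough principal components to reach a target share of the variance. The sketch below assumes that approach; the 95% default and the function names are illustrative, and the page does not say how the paper chose the number of components. K would need a separate scan.

```python
import numpy as np

def explained_variance_ratio(features):
    """Share of total variance carried by each principal component."""
    X = np.asarray(features, dtype=float)
    Z = (X - X.mean(axis=0)) / X.std(axis=0)
    # Squared singular values of the standardized data are proportional
    # to the variance along each principal axis (sorted descending).
    s = np.linalg.svd(Z, compute_uv=False)
    var = s ** 2
    return var / var.sum()

def components_for(features, target=0.95):
    """Smallest number of components whose cumulative share reaches target."""
    ratio = explained_variance_ratio(features)
    n = int(np.searchsorted(np.cumsum(ratio), target)) + 1
    return min(n, len(ratio))
```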
axioms (1)
- domain assumption: Electrical measurements (Pr, Vc, etc.) from sampled dies are statistically representative of the full wafer.
Reference graph
Works this paper leans on
- [1] Methods. In this section, we first define the main concept for this work (Figure 1). The detailed process flow for this work is given in Supplementary information Figure S1. Next, we describe individual processes like device fabrication and characterization results and finally utilization of this data in developing the framework. Figure 1: Schematic diag...
- [2] Unsupervised Analysis: In the first stage, Principal Component Analysis (PCA) was used for dimensionality reduction to visualize the measured P-V dataset and identify the main sources of variation. Then the K-Means clustering technique was used to automatically group devices into performance categories, with the optimal cluster count (k) determined by the "e...
- [3] Predictive Modeling: A predictive model was built based on the unsupervised clustering. The model was trained on data from a set of measured dies as the "training" dies. Its purpose is to predict the performance of the cluster and key electrical parameters (Pr, Vc) of devices on separate, "uncharacterized" dies. The model's accuracy was then verified ...
- [4] Results and Discussion. 3.1. Ferroelectric properties of HZO FeCaps. To verify the basic electrical behavior of our fabricated HZO capacitors (Figure 2(a)), we examined their polarization vs. voltage (P-V) loops using dynamic hysteresis measurement (DHM). Voltage was applied on the top Au electrode keeping the bottom TiN electrode grounded. Physical location ...
- [5] Conclusion. In conclusion, a combined hardware-machine-learning based framework to assess chip-scale and wafer-scale variability of ferroelectric HZO capacitors for ferroelectric random-access memories is presented in this work for the first time. It is found that unsupervised machine learning techniques, specifically Principal Component Analysis (PCA) a...
- [6]
- [7] M. Ghahramani, Y. Qiao, M. Zhou, A. O'Hagan, J. Sweeney, IEEE/CAA J. Autom. Sin. 2020, 7, 1026
- [8] J. Wang, C. Xu, Z. Yang, J. Zhang, X. Li, IEEE Trans. Semicond. Manuf. 2020, 33, 587
- [9] W. Wan, R. Kubendran, C. Muzio, S. Zhou, P. Chang, R. Huang, C. Wu, N. Liao, P.-N. Chiu, S. Spetalnick, et al., Nature 2022, 608, 504
- [10] C. Wolters, X. Yang, U. Schlichtmann, T. Suzumura, ACM Trans. Des. Autom. Electron. Syst. 2024, DOI: 10.48550/arXiv.2406.08413
- [11] L. Zhu, S. K. Lim, in Proc. Int. Symp. Phys. Des., 2021, 154
- [12]
- [13]
- [14] E. Yu, G. Kumar K., U. Saxena, K. Roy, Sci. Rep. 2024, 14, 9426
- [15] K. Bhardwaj, E. Paasio, S. Majumdar, Adv. Intell. Discov. 2025, e202500143
- [16]
- [17]
- [18] T. S. Böscke, J. Müller, D. Bräuhaus, U. Schröder, U. Böttger, Appl. Phys. Lett. 2011, 99, 102903
- [19]
- [20] A. G. Chernikova, M. G. Kozodaev, A. M. Markeev, D. V. Negrov, M. Spiridonov, S. Zarubin, V. Deshevoi, O. M. Orlov, T. V. Bakasova, A. A. Bagatur’yants, et al., ACS Appl. Mater. Interfaces 2016, 8, 7232
- [21] X. Li, P. Srivari, E. Paasio, S. Majumdar, Nanoscale 2025, 17, 6058
- [22] X. Li, P. Srivari, M. Honkanen, T. Salminen, S. Majumdar, Adv. Mater. Technol. 2026, e202502027
- [23] L. Alrifai, E. V. Skopin, N. Guillaume, P. Gonon, A. Bsiesy, Appl. Phys. Lett. 2023, 123, DOI: 10.1063/5.0151257
- [24] M. G. Kozodaev, A. G. Chernikova, E. V. Korostylev, M. H. Park, R. R. Khakimov, C. S. Hwang, A. M. Markeev, J. Appl. Phys. 2019, 125, DOI: 10.1063/1.5050700
- [25] Y. Liu, H. Zhang, J. Yang, D. A. Golosov, X. Wu, C. Gu, S. Ding, W. Liu, Chip 2025, 4, 100120
- [26]
- [27] Y. Luo, Z. Tang, X. Yin, C. Chen, Z. Fan, M. Qin, M. Zeng, G. Zhou, X. Gao, X. Lu, J. Dai, D. Chen, J. Liu, J. Materiomics 2022, 8, 311
- [28] S. S. Cheema et al., Nature 2020, 580, 478–482
- [29] P. Srivari, X. Li, E. Paasio, S. Majumdar, Mater. Sci. Semicond. Process. 2026, in review, arXiv:2412.11288
- [30] M. H. Park, Y. H. Lee, H. J. Kim, Y. J. Kim, T. Moon, K. D. Kim, J. Müller, A. Kersch, U. Schroeder, T. Mikolajick, C. S. Hwang, Adv. Mater. 2015, 27, 1811
- [31]
- [32]
- [33] Z. Xu, J. H. Saleh, Reliab. Eng. Syst. Saf. 2021, 211, 107530
- [34] M. Saqlain, Q. Abbas, J. Y. Lee, IEEE Trans. Semicond. Manuf. 2020, 33, 436
- [35] T. Theodosiou, et al., Procedia Comput. Sci. 2023, 217, 570
- [36] T. Kim, K. Behdinan, J. Intell. Manuf. 2023, 34, 3215
- [37] J. Lee, Y. Ju, J. Lim, S. Hong, S.-W. Baek, J. Lee, Micromachines 2025, 16, 1057
- [38] Y. Kim, D. Cho, J.-H. Lee, Microelectron. Reliab. 2021, 122, 114157
- [39] M. Hou, P. Li, S. Cheng, J. Yv, Adv. Control Ind. Process. 2024, DOI: 10.1002/adc2.196
- [40] Y.-F. Yang, M. Sun, in Proc. IEEE/CVF Conf. Comput. Vis. Pattern Recognit., 2022, DOI: 10.1109/cvpr52688.2022.00236
- [41] S. M. Miraftabzadeh, C. G. Colombo, M. Longo, F. Foiadelli, IEEE Access 2023, 11, 119596
- [42] J. S. Armstrong, Long-Range Forecasting: From Crystal Ball to Computer, Wiley, New York 1985
- [43]
- [44] A. G. R. Balan, Z. Guo, X. Li, IEEE Trans. Semicond. Manuf. 2021, 34, 165
- [45] C. M. Bishop, Pattern Recognition and Machine Learning, Springer, New York 2006
- [46] T. Mikolajick, S. Slesazeck, H. Mulaosmanovic, M. H. Park, S. Fichtner, P. D. Lomenzo, M. Hoffmann, U. Schroeder, J. Appl. Phys. 2021, 129, 100901
- [47] P. Srivari, E. Paasio, A. Anu, M. A. H. Nehal, S. Majumdar, in Proc. IEEE Int. Integr. Reliab. Workshop (IIRW), South Lake Tahoe, USA 2025
Supplementary information: An Unsupervised Machine Learning-based Framework for Wafer Scale Variability Analysis and Performance Prediction of Ferroelectric Hf0.5Zr0.5O2 Thin Film Capacitors. Anika Anu, Sayani Majum...