
arxiv: 2509.01769 · v2 · pith:WWWQHQ3Xnew · submitted 2025-09-01 · ❄️ cond-mat.mes-hall

AM-DefectNet: Additive Manufacturing Defect Classification Using Machine Learning -- A comparative Study

Pith reviewed 2026-05-18 19:27 UTC · model grok-4.3

classification ❄️ cond-mat.mes-hall
keywords: additive manufacturing · defect classification · machine learning · melt pool characterization · CatBoost · benchmarking · process monitoring

The pith

Non-linear tree-based machine learning models classify additive manufacturing defects with up to 92.47 percent accuracy.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces AM-DefectNet as a framework to benchmark machine learning models for identifying defects during additive manufacturing through melt pool characterization. It tests fifteen models on a dataset of 1514 training samples and 505 test samples using ten performance metrics. Non-linear tree-based algorithms stand out, with CatBoost reaching the highest accuracy at 92.47 percent along with strong scores in precision, recall, and F1. These results suggest machine learning can support better monitoring to improve part quality and reduce defects in production. Learning curves from the tests also show how model performance changes with more data.

Core claim

In this study, we benchmark 15 ML models for melt pool characterization in additive manufacturing using 1514 training and 505 test datasets across 10 metrics. Non-linear tree-based algorithms, particularly CatBoost, LGBM, and XGBoost, outperform other models, achieving accuracies of 92.47%, 91.08%, and 90.89%, respectively. CatBoost emerges as the top-performing algorithm, exhibiting superior performance in precision, recall, F1-score, and overall accuracy for defect classification tasks.
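The four metrics named in the claim can all be computed from the true and predicted labels. The sketch below is illustrative only: it assumes macro averaging across classes (the summary does not say which averaging the paper uses), and the function name `macro_report` and the toy labels are hypothetical.

```python
from collections import Counter


def macro_report(y_true, y_pred):
    """Return accuracy plus macro-averaged precision, recall, and F1.

    Macro averaging is an assumption made for this sketch; the paper may
    use weighted or micro averaging instead.
    """
    labels = sorted(set(y_true) | set(y_pred))
    pairs = Counter(zip(y_true, y_pred))  # (true, predicted) -> count
    precisions, recalls, f1s = [], [], []
    for c in labels:
        tp = pairs[(c, c)]
        fp = sum(pairs[(t, c)] for t in labels if t != c)
        fn = sum(pairs[(c, p)] for p in labels if p != c)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)
    n = len(labels)
    return {
        "accuracy": sum(pairs[(c, c)] for c in labels) / len(y_true),
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }


# Toy labels, not taken from the paper's dataset.
y_true = ["keyhole", "keyhole", "LOF", "LOF", "balling", "desirable"]
y_pred = ["keyhole", "LOF", "LOF", "LOF", "balling", "desirable"]
print(macro_report(y_true, y_pred))
```

Reporting per-class values alongside the averages would show whether a high overall accuracy hides a weak minority class.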

What carries the argument

AM-DefectNet, a benchmarking framework that evaluates 15 ML models on melt pool data for defect classification using ten metrics.

If this is right

  • Tree-based models can reach over 90 percent accuracy for defect detection in AM melt pools.
  • CatBoost provides the best balance of precision, recall, and overall accuracy among the tested approaches.
  • Learning curves indicate the amount of data needed to train effective models for this task.
  • Deep neural networks perform competitively but fall short of the leading tree-based methods.
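As a rough illustration of what a learning curve measures, the sketch below trains on progressively larger prefixes of a training set and records training and test accuracy at each size. Everything in it is an assumption for demonstration: the data are synthetic Gaussian blobs rather than melt pool features, and a nearest-centroid classifier stands in for the paper's models. Only the 1514/505 sizes are borrowed from the paper.

```python
import random


def make_data(n, rng):
    """Synthetic two-class data: class k is a 2-D Gaussian blob centred at (2k, 2k)."""
    data = []
    for _ in range(n):
        label = rng.randrange(2)
        point = (rng.gauss(2 * label, 1.0), rng.gauss(2 * label, 1.0))
        data.append((point, label))
    return data


def fit_centroids(samples):
    """Compute the mean point of each class."""
    sums = {}
    for (x, y), label in samples:
        sx, sy, n = sums.get(label, (0.0, 0.0, 0))
        sums[label] = (sx + x, sy + y, n + 1)
    return {label: (sx / n, sy / n) for label, (sx, sy, n) in sums.items()}


def predict(centroids, point):
    """Return the label of the closest class centroid."""
    return min(
        centroids,
        key=lambda c: (point[0] - centroids[c][0]) ** 2
        + (point[1] - centroids[c][1]) ** 2,
    )


def accuracy(centroids, samples):
    return sum(predict(centroids, p) == label for p, label in samples) / len(samples)


def learning_curve(train, test, sizes):
    """Return (size, train accuracy, test accuracy) for each training-set size."""
    curve = []
    for size in sizes:
        subset = train[:size]
        model = fit_centroids(subset)
        curve.append((size, accuracy(model, subset), accuracy(model, test)))
    return curve


rng = random.Random(0)
train, test = make_data(1514, rng), make_data(505, rng)
for size, train_acc, test_acc in learning_curve(train, test, [50, 200, 800, 1514]):
    print(f"n={size:5d}  train={train_acc:.3f}  test={test_acc:.3f}")
```

A test curve that is still rising at the largest size suggests more data would help; a persistent gap between the two curves points to overfitting.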

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Integrating these classifiers into real-time monitoring systems could allow automatic adjustments during printing to reduce defects.
  • The benchmarking approach might apply to sensor data from other manufacturing processes beyond additive manufacturing.
  • Larger datasets from varied production conditions would likely improve generalization of the top models.

Load-bearing premise

The 1514 training and 505 test samples are representative of real-world additive manufacturing conditions, and the ten metrics adequately reflect practical utility for process monitoring.

What would settle it

Running the same models on fresh melt pool data collected from a different machine, material, or process parameter set and finding accuracies below 80 percent would challenge the reported performance.

Original abstract

Additive Manufacturing (AM) processes present challenges in monitoring and controlling material properties and process parameters, affecting production quality and defect detection. Machine Learning (ML) techniques offer a promising solution for addressing these challenges. In this study, we introduce a comprehensive framework, AM-DefectNet, for benchmarking ML models in melt pool characterization, a critical aspect of AM. We evaluate 15 ML models across 10 metrics using 1514 training and 505 test datasets. Our benchmarking reveals that non-linear tree-based algorithms, particularly CatBoost, LGBM, and XGBoost, outperform other models, achieving accuracies of 92.47%, 91.08%, and 90.89%, respectively. Notably, the Deep Neural Network (DNN) also demonstrates competitive performance with an accuracy of 88.55%. CatBoost emerges as the top-performing algorithm, exhibiting superior performance in precision, recall, F1-score, and overall accuracy for defect classification tasks. Learning curves provide insights into model performance and data requirements, indicating potential areas for improvement. Our study highlights the effectiveness of ML models in melt pool characterization and defect detection, laying the groundwork for process optimization in AM.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The manuscript introduces AM-DefectNet, a benchmarking framework evaluating 15 machine learning models for defect classification in additive manufacturing melt pool characterization. Using a dataset of 1514 training and 505 test samples, the authors compare models across 10 metrics and report that tree-based algorithms outperform others, with CatBoost achieving 92.47% accuracy, LGBM 91.08%, XGBoost 90.89%, and a DNN reaching 88.55%; learning curves are also presented to assess data requirements.

Significance. If the performance ordering holds under improved validation, the study supplies a useful empirical benchmark that can inform algorithm selection for real-time defect monitoring in AM processes. The multi-metric evaluation and inclusion of learning curves provide practical guidance on model reliability and data efficiency for process optimization.

major comments (2)
  1. [Methods] Methods section: No details are given on feature extraction from the melt pool data, the hyperparameter tuning protocol for the 15 models, or handling of possible class imbalance. These omissions prevent reproduction of the reported accuracies and make it impossible to assess why CatBoost, LGBM, and XGBoost rank highest.
  2. [Results] Results section (performance table and abstract claims): All accuracy figures (92.47%, 91.08%, 90.89%) derive from a single fixed 1514/505 train-test split with no k-fold cross-validation, multiple random seeds, or reported standard deviations. In melt-pool data, process-parameter correlations can make a single partition non-representative, undermining the load-bearing claim that these specific models and values are reliably superior.
minor comments (2)
  1. [Abstract] Abstract: Adding one sentence on whether the melt-pool data are experimental or simulated would improve context for readers.
  2. [Figures] Learning-curve figures: Verify that training and validation curves are distinctly labeled and that the x-axis scale (number of samples) is clearly indicated.
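The second major comment can be made concrete with a binomial confidence interval on a single test-set accuracy. The sketch below uses the standard Wilson score interval and assumes the 505 test samples are independent and that each reported accuracy is a simple proportion of them; neither assumption is confirmed by the paper. Under those assumptions, the 95% interval for CatBoost's 92.47% runs from roughly 0.90 to 0.945, which contains the LGBM and XGBoost point estimates.

```python
import math


def wilson_interval(accuracy, n, z=1.96):
    """Wilson score interval for a proportion observed on n independent samples.

    z=1.96 corresponds to an approximate 95% confidence level.
    """
    z2 = z * z
    denom = 1 + z2 / n
    centre = (accuracy + z2 / (2 * n)) / denom
    half_width = z * math.sqrt(accuracy * (1 - accuracy) / n + z2 / (4 * n * n)) / denom
    return centre - half_width, centre + half_width


# Accuracies as reported in the abstract; n is the reported test-set size.
reported = {"CatBoost": 0.9247, "LGBM": 0.9108, "XGBoost": 0.9089, "DNN": 0.8855}
for name, acc in reported.items():
    low, high = wilson_interval(acc, 505)
    print(f"{name:9s} {acc:.4f}  95% interval: [{low:.4f}, {high:.4f}]")
```

Overlapping intervals do not prove the models are equivalent, but they indicate that a single 505-sample split cannot by itself establish the ranking among the top three.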

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive and detailed comments, which have helped us improve the clarity, reproducibility, and robustness of the manuscript. We address each major comment below and have revised the manuscript accordingly.

Point-by-point responses
  1. Referee: [Methods] Methods section: No details are given on feature extraction from the melt pool data, the hyperparameter tuning protocol for the 15 models, or handling of possible class imbalance. These omissions prevent reproduction of the reported accuracies and make it impossible to assess why CatBoost, LGBM, and XGBoost rank highest.

    Authors: We agree that the original Methods section was insufficiently detailed for full reproducibility. In the revised manuscript we have added a dedicated subsection describing the feature extraction pipeline applied to the melt-pool images, including the image-processing steps and the specific numerical features derived. We have also documented the hyperparameter tuning protocol (grid search combined with internal validation) used for all 15 models and clarified how class imbalance was handled through class-weighting in the tree-based models and appropriate loss weighting in the neural network. These additions directly address the referee’s concerns and allow independent reproduction of the reported results. revision: yes

  2. Referee: [Results] Results section (performance table and abstract claims): All accuracy figures (92.47%, 91.08%, 90.89%) derive from a single fixed 1514/505 train-test split with no k-fold cross-validation, multiple random seeds, or reported standard deviations. In melt-pool data, process-parameter correlations can make a single partition non-representative, undermining the load-bearing claim that these specific models and values are reliably superior.

    Authors: The referee correctly identifies a limitation of the original evaluation. While a single fixed split was initially chosen to enable direct head-to-head comparison under identical conditions, we acknowledge that it does not quantify variability. In the revised manuscript we have performed additional 5-fold cross-validation experiments using multiple random seeds and now report mean accuracies together with standard deviations for every model. The updated results preserve the original ranking, with CatBoost remaining the strongest performer; the added statistics strengthen rather than weaken the central claim. revision: yes
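The evaluation protocol described in the second response (5-fold cross-validation repeated over several seeds, reported as mean and standard deviation) can be sketched as follows. This is not the authors' code: stratifying the folds by class, the seed values, and the names `stratified_folds`, `repeated_cv`, and `majority_baseline` are all assumptions made for illustration, and the majority-class baseline is only a placeholder for a real model.

```python
import random
import statistics
from collections import Counter, defaultdict


def stratified_folds(labels, k, seed):
    """Assign sample indices to k folds, keeping class proportions similar per fold."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for i, label in enumerate(labels):
        by_class[label].append(i)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        for j, i in enumerate(indices):
            folds[j % k].append(i)
    return folds


def repeated_cv(features, labels, fit_predict, k=5, seeds=(0, 1, 2)):
    """Return (mean, standard deviation) of accuracy over k folds times len(seeds) repeats."""
    scores = []
    for seed in seeds:
        for held_out in stratified_folds(labels, k, seed):
            held = set(held_out)
            train_idx = [i for i in range(len(labels)) if i not in held]
            predictions = fit_predict(
                [features[i] for i in train_idx],
                [labels[i] for i in train_idx],
                [features[i] for i in held_out],
            )
            correct = sum(p == labels[i] for p, i in zip(predictions, held_out))
            scores.append(correct / len(held_out))
    return statistics.mean(scores), statistics.stdev(scores)


def majority_baseline(train_x, train_y, test_x):
    """Placeholder model: always predict the most frequent training label."""
    most_common = Counter(train_y).most_common(1)[0][0]
    return [most_common] * len(test_x)


# Toy data: 80 defect-free and 20 defective samples with dummy features.
labels = ["ok"] * 80 + ["defect"] * 20
features = list(range(100))
mean_acc, std_acc = repeated_cv(features, labels, majority_baseline)
print(f"accuracy = {mean_acc:.3f} +/- {std_acc:.3f}")
```

Any classifier can be plugged in through `fit_predict`, so all models are scored on the same folds and their standard deviations are directly comparable.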

Circularity Check

0 steps flagged

No circularity: empirical accuracies are direct held-out measurements

The paper performs standard supervised classification benchmarking of 15 ML models on a fixed 1514/505 train-test split and reports point-estimate accuracies, precision, recall, and F1 scores across 10 metrics. These numbers are obtained by applying each model to the held-out test set and counting correct predictions; they do not arise from any equation, fitted parameter, or self-referential derivation that reduces to the input data by construction. No self-citation load-bearing steps, uniqueness theorems, or ansatz smuggling appear in the reported workflow. The results therefore remain independent empirical observations rather than tautological restatements of the training procedure.

Axiom & Free-Parameter Ledger

2 free parameters · 2 axioms · 0 invented entities

The study rests on standard supervised learning assumptions and the quality of the collected AM dataset; no new physical entities or ad-hoc constants are introduced beyond routine model hyperparameters.

free parameters (2)
  • Train/test split sizes
    Specific choice of 1514 training and 505 test samples is a modeling decision that can affect reported accuracies.
  • Model hyperparameters
    Each of the 15 algorithms requires hyperparameter selection that is not detailed in the abstract.
axioms (2)
  • domain assumption Labels in the dataset correctly indicate the presence or absence of defects in the melt pool.
    Required for any supervised classification claim to be meaningful.
  • domain assumption The input features derived from melt pool data are informative enough to distinguish defect classes.
    Implicit assumption underlying all model comparisons.


