A multi-task spatiotemporal deep neural network for predicting penetration depth and morphology in laser welding

Chendong Shao; Haichao Cui; Sen Li; Xinhua Tang; Yaqi Wang

arxiv: 2606.26260 · v1 · pith:XTIFEFFLnew · submitted 2026-06-24 · 💻 cs.CV · cs.AI

Pith reviewed 2026-06-26 01:35 UTC · model grok-4.3

keywords laser welding · penetration prediction · weld morphology · multi-task learning · spatiotemporal model · weld pool imaging · deep neural network · in-situ monitoring

The pith

A multi-task spatiotemporal neural network predicts laser weld penetration state, depth, and cross-section morphology from top-view pool images plus process parameters.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The authors develop a deep learning model that processes sequences of weld pool images captured by a CMOS camera together with welding parameters. The network uses convolutional layers and state space models to extract spatial-temporal features and outputs three predictions at once: whether the weld has penetrated fully, the numerical depth value, and the reconstructed weld cross-section shape. The work also describes a dataset-construction procedure intended to make the training examples more representative. On a held-out test set the model reaches 99.35 percent accuracy on penetration state, 1.79 mm mean error on depth, and 95.65 percent accuracy on cross-section reconstruction. If the reported numbers generalize, the approach supplies an in-situ, non-destructive way to monitor weld quality during laser penetration welding.

Core claim

The authors present a multi-task model that integrates spatiotemporal features extracted from top weld pool images along with welding parameters, using a convolutional neural network and state space model architecture, together with a dataset-construction method, and report validation performance of 99.35 percent accuracy for penetration state, 1.79 mm error for penetration depth, and 95.65 percent accuracy for weld cross-section reconstruction.

What carries the argument

The multi-task spatiotemporal deep neural network that fuses convolutional and state-space processing of weld-pool image sequences with welding parameters to produce simultaneous predictions of penetration state, depth, and morphology.
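To make the data flow concrete, here is a minimal sketch in plain Python of the general shape such a model could take. Per-frame feature vectors (stand-ins for CNN outputs) pass through a diagonal linear state-space recurrence, the final state is concatenated with the process parameters, and three linear heads read it out. The function names `ssm_scan` and `multi_task_heads`, the diagonal recurrence, the concatenation-based fusion, and all dimensions and weights are illustrative assumptions; the abstract does not specify the authors' layers.

```python
import math

def ssm_scan(frames, decay, gain):
    """Diagonal linear state-space recurrence: h_t = decay * h_{t-1} + gain * x_t.

    frames: list of per-frame feature vectors (stand-ins for CNN features).
    decay, gain: per-channel coefficients (hypothetical, not from the paper).
    Returns the final hidden state, one value per channel.
    """
    state = [0.0] * len(decay)
    for x in frames:
        state = [a * h + b * xi for a, b, h, xi in zip(decay, gain, state, x)]
    return state

def linear(weights, bias, vec):
    """Dense layer: one output per weight row."""
    return [sum(w * v for w, v in zip(row, vec)) + b for row, b in zip(weights, bias)]

def multi_task_heads(state, params, heads):
    """Fuse the temporal state with welding parameters and apply three heads."""
    fused = state + params  # plain concatenation (an assumed fusion strategy)
    logit = linear(*heads["state"], fused)[0]
    return {
        "penetrated_prob": 1.0 / (1.0 + math.exp(-logit)),  # classification head
        "depth_mm": linear(*heads["depth"], fused)[0],       # regression head
        "profile": linear(*heads["profile"], fused),         # coarse shape vector
    }

# Toy inputs: 3 frames with 2 feature channels, plus 2 process parameters.
frames = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
params = [0.5, 0.2]  # e.g. normalised laser power and travel speed (hypothetical)
state = ssm_scan(frames, decay=[0.5, 0.5], gain=[1.0, 1.0])
heads = {
    "state": ([[1.0, 1.0, 0.0, 0.0]], [0.0]),
    "depth": ([[0.5, 0.5, 1.0, 1.0]], [0.0]),
    "profile": ([[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0]], [0.0, 0.0]),
}
out = multi_task_heads(state, params, heads)
```

The point of the sketch is only that one shared temporal state can feed a classifier, a regressor, and a shape decoder at once; a real implementation would learn the coefficients and use a far richer decoder for the cross-section.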

If this is right

The model supplies simultaneous, real-time estimates of three weld-quality metrics from a single camera view.
The dataset-construction procedure is presented as a way to improve robustness and generalization of similar image-based welding monitors.
The approach is positioned as a component of in-situ quality control strategies for laser penetration welding systems.
High test-set numbers on state, depth, and morphology are offered as evidence that image-plus-parameter inputs suffice for these three tasks.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

If the model runs fast enough on embedded hardware, it could close the loop for automatic adjustment of laser power or speed during welding.
The same image-sequence plus parameter format might be reused for related processes such as laser cladding if comparable labeled data can be collected.
Adding a second synchronized camera angle or acoustic emission signals could be tested as a way to reduce the remaining depth error without changing the network architecture.

Load-bearing premise

The dataset-construction method produces training examples whose distribution matches the distribution of future production welds closely enough for the reported test-set numbers to generalize.

What would settle it

Running the trained model on a new collection of welds made under production conditions with different material batches or parameter ranges and observing penetration-state accuracy below 90 percent or depth error above 3 mm would falsify the generalization claim.
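The test described above reduces to two numbers and two thresholds, which can be sketched as follows. The function name `generalization_check` is hypothetical, the depth error is assumed to be a mean absolute error, and the five welds in the example are made-up values for illustration, not results from the paper.

```python
def generalization_check(pred_state, true_state, pred_depth_mm, true_depth_mm,
                         min_accuracy=0.90, max_depth_error_mm=3.0):
    """Compare results on new production welds against the two thresholds."""
    n = len(true_state)
    accuracy = sum(p == t for p, t in zip(pred_state, true_state)) / n
    # Depth error taken as mean absolute error (an assumption about the metric).
    mae = sum(abs(p - t) for p, t in zip(pred_depth_mm, true_depth_mm)) / n
    falsified = accuracy < min_accuracy or mae > max_depth_error_mm
    return {"accuracy": accuracy, "depth_mae_mm": mae, "falsified": falsified}

# Hypothetical results for five new welds (made-up numbers, illustration only).
report = generalization_check(
    pred_state=[1, 1, 0, 1, 0],
    true_state=[1, 1, 0, 0, 0],
    pred_depth_mm=[4.0, 5.5, 2.0, 3.0, 1.5],
    true_depth_mm=[4.5, 5.0, 2.5, 6.0, 1.0],
)
```

In this toy case the depth error stays under 3 mm, but one misclassified weld out of five drops accuracy below 90 percent, so the check flags the claim as falsified.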

Original abstract

In laser penetration welding, the assessment of penetration state and weld seam morphology plays a crucial role in determining the weld quality. This paper presents a comprehensive introduction of the innovative muti-task deep learning model that has the capability to predict penetration state, depth, and weld seam morphology with high accuracy. The monitoring platform relies on weld pool images captured during the laser welding process using a complementary metal-oxide-semiconductor camera. The proposed model integrates spatiotemporal features extracted from top weld pool images along with welding parameters, establishing a deep learning framework based on convolutional neural networks and state space models for more efficient extraction and processing of spatial-temporal information. Furthermore, a reliable method for constructing the dataset is proposed to enhance both robustness and generalization capability of the developed model. Validation results on the test set demonstrate that prediction accuracy for penetration state can reach 99.35%, while prediction error for penetration depth is 1.79 millimeter, and accuracy of reconstructing the weld cross-section is 95.65%. This study provides new insights and methodologies for in-situ quality control strategies in laser penetration welding systems.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note: private letter to a colleague

The paper applies a multi-task CNN plus state-space model to laser welding images and parameters and reports high test-set numbers, but provides no checks that the data matches real production distributions.

The main thing here is a multi-task spatiotemporal network that takes weld pool images and process parameters and outputs penetration state, depth, and cross-section morphology. On their test set it hits 99.35% state accuracy, 1.79 mm depth error, and 95.65% morphology accuracy. They also describe a dataset construction procedure meant to increase robustness.

The architecture choice makes sense for the task. Combining CNN spatial features with state-space temporal modeling and feeding in the welding parameters lets one model handle the three outputs together. That is a reasonable way to build an in-situ monitoring system, and the dataset method shows they thought about making the training examples more representative than a naive collection would be.

The evaluation is the weak part. No ablation results show what the state-space component or the multi-task structure actually contributes. There are no error bars, no description of how the test split was made, and no indication whether the test images come from the same welds or different runs. Most critically, the paper states that the dataset method improves generalization but supplies no quantitative support: no statistical distance between their constructed data and actual production welds, and no results on an external set collected on different equipment. The stress-test note is correct on this point: the headline numbers rest on an unverified assumption about distribution match. If that assumption does not hold, the numbers will not translate to the factory.

This is for people working on applied computer vision in manufacturing, particularly process monitoring for welding. A reader in that area might borrow the multi-task framing or the data-handling idea.

I would send it for peer review. The application is concrete enough that referees can check the missing validation pieces and decide whether the claims are supported.
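The split concern raised in the letter (whether test images come from the same welds as training images) can be illustrated with a trial-level hold-out. This is a minimal sketch, assuming each image window carries a `trial_id` label; the function name `split_by_trial`, the identifiers, and the 20 percent test fraction are all hypothetical and not taken from the paper.

```python
import random

def split_by_trial(samples, test_fraction=0.2, seed=0):
    """Hold out whole welding trials so no trial contributes to both splits.

    samples: list of (trial_id, payload) pairs; trial_id is a hypothetical label
    identifying the weld run each image window was cut from.
    """
    trials = sorted({trial for trial, _ in samples})
    rng = random.Random(seed)
    rng.shuffle(trials)
    n_test = max(1, round(len(trials) * test_fraction))
    test_trials = set(trials[:n_test])
    train = [s for s in samples if s[0] not in test_trials]
    test = [s for s in samples if s[0] in test_trials]
    return train, test

# Toy data: 5 trials with 4 image windows each (identifiers are made up).
samples = [(f"trial_{t}", f"window_{w}") for t in range(5) for w in range(4)]
train, test = split_by_trial(samples)
overlap = {t for t, _ in train} & {t for t, _ in test}
```

Splitting at the window level instead would let neighbouring frames of one weld land on both sides of the split, which tends to inflate test accuracy.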

Referee Report

4 major / 1 minor

Summary. The paper introduces a multi-task deep neural network for in-situ monitoring in laser penetration welding. It processes top-view weld pool images captured by a CMOS camera together with process parameters using a combination of convolutional networks and state-space models to extract spatiotemporal features. The model simultaneously predicts penetration state (binary classification), penetration depth (regression), and weld cross-section morphology (reconstruction). A custom dataset-construction procedure is proposed to improve robustness and generalization. On a held-out test set the authors report 99.35 % accuracy for penetration-state prediction, 1.79 mm mean error for depth, and 95.65 % accuracy for cross-section reconstruction.

Significance. If the reported test-set metrics prove reliable under independent validation, the work would offer a practical multi-task framework for real-time weld-quality assessment that integrates visual and parametric inputs. Such a system could support closed-loop control in laser welding, reducing defects in high-value manufacturing. The emphasis on spatiotemporal modeling and an explicit dataset-construction method addresses domain-specific challenges, though the absence of ablations and distribution-shift checks limits immediate deployability claims.

major comments (4)

[Abstract] Abstract: the headline metrics (99.35 % state accuracy, 1.79 mm depth error, 95.65 % morphology accuracy) are presented as single aggregate values with no error bars, confidence intervals, or description of the loss function and aggregation method used to obtain the depth error. Without these details it is impossible to assess whether the numbers reflect stable performance or are sensitive to a few outliers.
[Abstract] Abstract (final paragraph) and dataset-construction section: the claim that the proposed dataset-construction method improves generalization rests on the unverified assumption that the constructed training and test distributions match future production welds. No statistical distance metrics, covariate-shift tests, or external validation welds collected on different equipment or parameter regimes are reported to support this assumption.
[Abstract] Abstract: no ablation studies, baseline comparisons, or component-wise analysis are described to justify the multi-task formulation, the choice of state-space model, or the spatiotemporal fusion strategy. Consequently the contribution of each architectural element to the reported numbers cannot be isolated.
[Abstract] Abstract: the test-set construction is described only at high level; it is unclear whether the held-out examples are temporally or spatially independent of the training data or whether they were collected under identical process conditions, which directly affects the validity of the generalization claim.

minor comments (1)

[Abstract] Abstract contains the typo "muti-task" (should be "multi-task").
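One of the checks requested in the major comments, a statistical distance between the training distribution and production welds, could start from something as simple as a two-sample Kolmogorov-Smirnov statistic on a single process parameter. The sketch below is a plain-Python illustration, not a test the authors ran; the function name `ks_statistic` and the laser-power values are made up for the example.

```python
def ks_statistic(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov statistic: largest gap between empirical CDFs."""
    a, b = sorted(sample_a), sorted(sample_b)
    i = j = 0
    gap = 0.0
    while i < len(a) and j < len(b):
        x = min(a[i], b[j])
        # Step both empirical CDFs past every observation equal to x.
        while i < len(a) and a[i] <= x:
            i += 1
        while j < len(b) and b[j] <= x:
            j += 1
        gap = max(gap, abs(i / len(a) - j / len(b)))
    return gap

# Hypothetical laser-power settings in kW (illustration only).
train_power = [2.0, 2.5, 3.0, 3.5]
production_power = [3.0, 3.5, 4.0, 4.5]
shift = ks_statistic(train_power, production_power)
```

A value near 0 means the two samples cover the parameter similarly; a value near 1 means they barely overlap. A full covariate-shift analysis would repeat this per parameter and also look at image statistics.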

Simulated Author's Rebuttal

4 responses · 0 unresolved

We thank the referee for the constructive and detailed comments on our manuscript. We address each major comment point by point below, indicating revisions made to the manuscript where appropriate.

Referee: [Abstract] Abstract: the headline metrics (99.35 % state accuracy, 1.79 mm depth error, 95.65 % morphology accuracy) are presented as single aggregate values with no error bars, confidence intervals, or description of the loss function and aggregation method used to obtain the depth error. Without these details it is impossible to assess whether the numbers reflect stable performance or are sensitive to a few outliers.

Authors: We agree that reporting variability and methodological details strengthens the presentation of results. In the revised manuscript we now include the mean and standard deviation of each metric computed across five independent training runs with different random seeds. The depth error is explicitly defined as mean absolute error (MAE) aggregated over the test samples, and the methods section now details the loss functions (binary cross-entropy for state classification, mean-squared error for depth regression, and a weighted combination of MSE and perceptual loss for morphology reconstruction). revision: yes

Referee: [Abstract] Abstract (final paragraph) and dataset-construction section: the claim that the proposed dataset-construction method improves generalization rests on the unverified assumption that the constructed training and test distributions match future production welds. No statistical distance metrics, covariate-shift tests, or external validation welds collected on different equipment or parameter regimes are reported to support this assumption.

Authors: The referee correctly notes the lack of quantitative support for the generalization claim. We have revised the abstract and dataset-construction section to present the method more cautiously as a procedure intended to increase robustness within the collected data regime, and we added qualitative comparisons of image and parameter distributions between training and test splits. Formal statistical distance metrics and external validation on different equipment were not performed; we now explicitly list this as a limitation and direction for future work. revision: partial

Referee: [Abstract] Abstract: no ablation studies, baseline comparisons, or component-wise analysis are described to justify the multi-task formulation, the choice of state-space model, or the spatiotemporal fusion strategy. Consequently the contribution of each architectural element to the reported numbers cannot be isolated.

Authors: We acknowledge that the original submission did not isolate component contributions. In the revised manuscript we have added an ablation study subsection that compares the full multi-task model against single-task variants, a version without the state-space model, and alternative spatiotemporal fusion strategies. The new results are summarized in an additional table and support the design decisions. revision: yes

Referee: [Abstract] Abstract: the test-set construction is described only at high level; it is unclear whether the held-out examples are temporally or spatially independent of the training data or whether they were collected under identical process conditions, which directly affects the validity of the generalization claim.

Authors: We have expanded the dataset section to clarify that the test set comprises complete, temporally disjoint welding trials collected on separate days using the same equipment and overlapping but not identical parameter settings. This ensures temporal independence while maintaining comparable process conditions; the revised text now states this explicitly. revision: yes
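The quantities named in the first response (a combined loss over the three tasks, MAE for depth, and mean and standard deviation across seeds) can be sketched in plain Python as below. This is an illustration under stated assumptions: the task weights are hypothetical, the perceptual term of the morphology loss is omitted and replaced by MSE alone, and every number in the example is made up rather than taken from the paper.

```python
import math
import statistics

def binary_cross_entropy(prob, label, eps=1e-12):
    """BCE for one sample; prob is clipped to avoid log(0)."""
    prob = min(max(prob, eps), 1.0 - eps)
    return -(label * math.log(prob) + (1 - label) * math.log(1.0 - prob))

def mean_squared_error(pred, target):
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)

def mean_absolute_error(pred, target):
    return sum(abs(p - t) for p, t in zip(pred, target)) / len(pred)

def multi_task_loss(state_prob, state_label, depth_pred, depth_true,
                    profile_pred, profile_true, weights=(1.0, 1.0, 1.0)):
    """Weighted sum of the three task losses (weights are hypothetical).

    The morphology term uses MSE only; the perceptual component mentioned
    in the response is left out of this sketch.
    """
    w_state, w_depth, w_profile = weights
    return (w_state * binary_cross_entropy(state_prob, state_label)
            + w_depth * mean_squared_error([depth_pred], [depth_true])
            + w_profile * mean_squared_error(profile_pred, profile_true))

def summarize_runs(metric_per_seed):
    """Mean and sample standard deviation across independent training runs."""
    return statistics.mean(metric_per_seed), statistics.stdev(metric_per_seed)

# Made-up values for illustration only.
loss = multi_task_loss(0.5, 1, 2.0, 3.0, [1.0, 2.0], [1.0, 4.0])
depth_mae = mean_absolute_error([1.0, 2.0], [2.0, 4.0])
mean_metric, std_metric = summarize_runs([1.0, 2.0, 3.0, 4.0, 5.0])
```

Training on MSE while reporting MAE, as the response describes, is a common pairing: the squared loss penalises large depth misses during optimisation, while MAE reads directly in millimetres.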

Circularity Check

0 steps flagged

No circularity; empirical ML performance on held-out test data with no derivations or self-referential reductions

The paper introduces a multi-task CNN+state-space model for predicting weld penetration state, depth, and morphology from images and parameters, then reports standard empirical metrics on a held-out test set (99.35% state accuracy, 1.79 mm depth error, 95.65% morphology accuracy). No equations, first-principles derivations, fitted parameters renamed as predictions, or self-citation chains exist. The dataset-construction procedure is described at a high level to support robustness, but the reported numbers are ordinary train/test evaluation and do not reduce to inputs by construction under any of the enumerated circularity patterns. The work is self-contained as empirical validation.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Only abstract available; no equations, no model diagram, no dataset statistics, and no cited prior results are visible. Consequently the ledger cannot be populated beyond the generic observation that any deep-learning claim rests on the unstated assumption that the training distribution matches deployment.

