pith. machine review for the scientific record.

arxiv: 2604.16082 · v1 · submitted 2026-04-17 · 💻 cs.CV · cs.AI · cs.LG

Recognition: unknown

Early Detection of Acute Myeloid Leukemia (AML) Using YOLOv12 Deep Learning Model

Authors on Pith · no claims yet

Pith reviewed 2026-05-10 08:29 UTC · model grok-4.3

classification 💻 cs.CV · cs.AI · cs.LG
keywords classification · yolov12 · acute · cell · deep learning · leukemia · model

The pith

YOLOv12 with Otsu thresholding on cell-based segmentation classifies AML cells at 99.3% validation and test accuracy.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Acute myeloid leukemia cells look very similar under a microscope, making manual classification slow and error-prone. The authors preprocess blood-cell images by segmenting either whole cells or nuclei using two simple techniques: one based on color hue and one using Otsu automatic thresholding. They then feed the segmented images into YOLOv12, a modern object-detection network, and train it to label the different AML subtypes. The best result came from Otsu thresholding applied to whole-cell segmentation, reaching 99.3 percent accuracy on both the validation and test sets.
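To make the preprocessing step concrete, here is a minimal pure-NumPy sketch of Otsu's automatic threshold, the classical technique the authors apply before classification. The function name and histogram loop are illustrative, not the paper's code; in practice one would use a library implementation such as OpenCV's.

```python
import numpy as np

def otsu_threshold(gray):
    """Return Otsu's threshold for a uint8 grayscale image (sketch)."""
    hist = np.bincount(gray.ravel(), minlength=256).astype(float)
    total = hist.sum()
    sum_all = np.dot(np.arange(256), hist)
    best_t, best_var = 0, -1.0
    w0, sum0 = 0.0, 0.0
    for t in range(256):
        w0 += hist[t]                       # foreground weight so far
        if w0 == 0 or w0 == total:
            continue
        sum0 += t * hist[t]
        w1 = total - w0                     # background weight
        m0, m1 = sum0 / w0, (sum_all - sum0) / w1
        between = w0 * w1 * (m0 - m1) ** 2  # between-class variance
        if between > best_var:              # Otsu maximizes this quantity
            best_var, best_t = between, t
    return best_t
```

Pixels below the returned threshold would form the cell (or nucleus) mask that is then passed to YOLOv12.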

Core claim

Our experiments demonstrate that YOLOv12 with Otsu thresholding on cell-based segmentation achieved the highest level of validation and test accuracy, both reaching 99.3%.

Load-bearing premise

The assumption that the (unspecified) dataset is representative of real-world patient variability and that the reported test accuracy reflects generalization rather than overfitting to the particular images used.

Figures

Figures reproduced from arXiv: 2604.16082 by Enas E. Ahmed, Mayar Moner, Salah A. Aly.

Figure 1: The implementation process of our proposed framework, showing the key stages from data preprocessing to final classification.
Figure 2: Area Attention uses a simple method to split the feature
Figure 3: Training and validation loss curves for YOLO on hue
Figure 4: Confusion matrix of YOLO for hue-segmented cell image
Figure 5: Training and validation loss curves for YOLO on Otsu
Figure 6: Confusion matrix of YOLO for Otsu-thresholded cell image
Figure 8: Confusion matrix of YOLO for hue-segmented nucleus image
Figure 9: Training and validation loss curves for YOLO on Otsu
Figure 10: Confusion matrix of YOLO for Otsu-thresholded nucleus
Original abstract

Acute Myeloid Leukemia (AML) is one of the most life-threatening type of blood cancers, and its accurate classification is considered and remains a challenging task due to the visual similarity between various cell types. This study addresses the classification of the multiclasses of AML cells Utilizing YOLOv12 deep learning model. We applied two segmentation approaches based on cell and nucleus features, using Hue channel and Otsu thresholding techniques to preprocess the images prior to classification. Our experiments demonstrate that YOLOv12 with Otsu thresholding on cell-based segmentation achieved the highest level of validation and test accuracy, both reaching 99.3%.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, and this is the friction.

Referee Report

2 major / 1 minor

Summary. The manuscript applies the YOLOv12 deep learning model to multiclass classification of Acute Myeloid Leukemia (AML) cells in blood-smear images. Two segmentation strategies (cell-based and nucleus-based) are tested with Hue-channel and Otsu-thresholding preprocessing; the authors report that YOLOv12 plus Otsu thresholding on cell-based segmentation reaches 99.3% accuracy on both validation and test sets.

Significance. If the accuracy claim is shown to be robust, reproducible, and free of leakage, the work could provide a practical high-accuracy pipeline for automated AML screening. The combination of a recent detection architecture with classical thresholding is a straightforward engineering contribution. However, the complete absence of dataset statistics, patient counts, split protocols, baselines, and error analysis prevents any assessment of whether the result reflects genuine generalization.

major comments (2)
  1. Abstract: The 99.3% validation and test accuracy is stated without any information on total images, number of patients, class distribution, dataset source, or train/validation/test partitioning strategy. Because multiple cells are routinely extracted from the same slide or patient, an image-level split risks leakage; the reported figure cannot be interpreted as evidence of reliable early detection until patient-stratified splits or external validation are demonstrated.
  2. Experiments/Results section: No baseline comparisons (other YOLO versions, standard CNN classifiers, or non-deep-learning methods) are provided, nor is any confusion matrix, per-class accuracy, or error analysis reported. For a multiclass problem whose difficulty stems from visual similarity between cell types, these omissions leave the superiority claim unsupported.
minor comments (1)
  1. Abstract: The number of AML cell classes and their identities are not stated, making it impossible to judge the difficulty of the reported multiclass task.
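The leakage concern in major comment 1 comes down to how image indices are partitioned. A minimal stdlib sketch of a patient-stratified split, with a hypothetical helper name and toy patient IDs (the paper does not report its own protocol):

```python
import random

def patient_stratified_split(patient_ids, test_frac=0.25, seed=0):
    """Split image indices so no patient appears in both train and test."""
    patients = sorted(set(patient_ids))
    rng = random.Random(seed)
    rng.shuffle(patients)
    n_test = max(1, int(len(patients) * test_frac))
    test_patients = set(patients[:n_test])
    # Every cell image from a held-out patient goes to the test side.
    train = [i for i, p in enumerate(patient_ids) if p not in test_patients]
    test = [i for i, p in enumerate(patient_ids) if p in test_patients]
    return train, test
```

An image-level random split, by contrast, can place cells from the same slide in both sets, which is exactly the leakage route the referee flags.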

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the detailed and constructive review. The comments correctly identify omissions in dataset reporting and experimental comparisons that limit interpretability of the results. We will revise the manuscript to address both major points.

Point-by-point responses
  1. Referee: Abstract: The 99.3% validation and test accuracy is stated without any information on total images, number of patients, class distribution, dataset source, or train/validation/test partitioning strategy. Because multiple cells are routinely extracted from the same slide or patient, an image-level split risks leakage; the reported figure cannot be interpreted as evidence of reliable early detection until patient-stratified splits or external validation are demonstrated.

    Authors: We agree that these details are essential and were omitted from the abstract and main text. In the revised manuscript we will add a dedicated 'Dataset' subsection in Methods that reports the total number of images, number of patients, class distribution, dataset source, and the precise train/validation/test partitioning protocol. We will also update the abstract to summarize these elements. To address leakage, we will re-process the data using patient-stratified splits (all cells from one patient assigned to only one subset) and report the resulting accuracies; if the original experiments used image-level splits, the revised results will reflect the corrected protocol. revision: yes

  2. Referee: Experiments/Results section: No baseline comparisons (other YOLO versions, standard CNN classifiers, or non-deep-learning methods) are provided, nor is any confusion matrix, per-class accuracy, or error analysis reported. For a multiclass problem whose difficulty stems from visual similarity between cell types, these omissions leave the superiority claim unsupported.

    Authors: We acknowledge that the absence of baselines and error analysis weakens the claims. In the revised manuscript we will expand the Experiments section to include direct comparisons with YOLOv8, YOLOv11, ResNet50, EfficientNet-B0, and a traditional SVM baseline using hand-crafted features. We will also add a confusion matrix, per-class precision/recall/F1 scores, and a short error-analysis paragraph discussing misclassifications between visually similar cell types. These additions will be based on re-running the experiments under the same preprocessing and patient-stratified splits. revision: yes
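The per-class reporting the rebuttal promises can be sketched with a small NumPy helper; the function name and toy labels below are illustrative, not the paper's evaluation code:

```python
import numpy as np

def per_class_report(y_true, y_pred, n_classes):
    """Confusion matrix plus per-class precision/recall/F1 (sketch)."""
    cm = np.zeros((n_classes, n_classes), dtype=int)
    for t, p in zip(y_true, y_pred):
        cm[t, p] += 1                       # rows = true class, cols = predicted
    tp = np.diag(cm).astype(float)
    precision = tp / np.maximum(cm.sum(axis=0), 1)   # per predicted class
    recall = tp / np.maximum(cm.sum(axis=1), 1)      # per true class
    f1 = 2 * precision * recall / np.maximum(precision + recall, 1e-12)
    return cm, precision, recall, f1
```

For visually similar AML subtypes, the off-diagonal entries of the confusion matrix are precisely where an error analysis would start.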

Circularity Check

0 steps flagged

No circularity; empirical accuracy on held-out test set with no derivation chain

Full rationale

The paper reports an experimental result: YOLOv12 combined with Otsu thresholding on cell-based segmentation yields 99.3% validation and test accuracy for AML cell classification. No equations, fitted parameters renamed as predictions, self-definitional constructs, or load-bearing self-citations appear in the abstract or described claims. The central claim is a direct empirical measurement on a test set rather than a quantity derived by construction from its own inputs. Data-split details are unspecified, but that is a generalization risk, not a circularity in any derivation. The result is self-contained against external benchmarks and does not reduce to tautology.

Axiom & Free-Parameter Ledger

1 free parameter · 1 axiom · 0 invented entities

The central claim rests on the effectiveness of standard deep-learning image classification after simple thresholding segmentation; no new entities or parameters are introduced beyond typical model training choices.

free parameters (1)
  • YOLOv12 training hyperparameters
    Learning rate, batch size, epochs and augmentation settings are not reported and must have been chosen or tuned.
axioms (1)
  • domain assumption YOLOv12 architecture can learn discriminative features from pre-segmented blood-cell images
    Invoked by applying the off-the-shelf model to the preprocessed AML dataset.
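For context on what the ledger flags as unreported, a hedged sketch of the kind of training configuration involved; every value below is an illustrative placeholder, not one of the authors' settings.

```python
# Hypothetical YOLO-style training configuration (illustrative values only;
# the paper reports none of these choices).
train_config = {
    "epochs": 100,    # passes over the training set
    "batch": 16,      # images per gradient step
    "lr0": 0.01,      # initial learning rate
    "imgsz": 640,     # input resolution after segmentation
    "augment": True,  # flips, HSV jitter, etc.
}
```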

pith-pipeline@v0.9.0 · 5408 in / 1288 out tokens · 65337 ms · 2026-05-10T08:29:56.960827+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Reference graph

Works this paper leans on

12 extracted references · 1 canonical work page · 1 internal anchor

  1. [1]

Automatic detection of acute leukemia (ALL and AML) utilizing customized deep graph convolutional neural networks,

L. Zare, M. Rahmani, N. Khaleghi, S. Sheykhivand, and S. Danishvar, “Automatic detection of acute leukemia (ALL and AML) utilizing customized deep graph convolutional neural networks,” Bioengineering, vol. 11, no. 7, p. 644, 2024

  2. [2]

    Acute myeloid leukemia - cancer stat facts

SEER Program, National Cancer Institute, “Acute myeloid leukemia - cancer stat facts.” https://seer.cancer.gov/statfacts/html/amyl.html, n.d

  3. [3]

    Acute myeloid leukemia (nursing),

A. Vakiti, S. B. Reynolds, P. Mewawalla, and S. Coleman, “Acute myeloid leukemia (nursing),” in StatPearls [Internet], StatPearls Publishing, 2024

  4. [4]

    Leukemia: an overview for primary care,

A. S. Davis, A. J. Viera, and M. D. Mead, “Leukemia: an overview for primary care,” American Family Physician, vol. 89, no. 9, pp. 731–738, 2014

  5. [5]

    Chapter 18 - acute myeloid leukemia—overview,

F. Naeim, P. Nagesh Rao, S. X. Song, and R. T. Phan, “Chapter 18 - acute myeloid leukemia—overview,” in Atlas of Hematopathology (Second Edition) (F. Naeim, P. Nagesh Rao, S. X. Song, and R. T. Phan, eds.), pp. 293–302, Academic Press, second ed., 2018

  6. [6]

Chapter 27 - granulocytopoiesis and monocytopoiesis,

A. Khanna-Gupta and N. Berliner, “Chapter 27 - granulocytopoiesis and monocytopoiesis,” in Hematology (Seventh Edition) (R. Hoffman, E. J. Benz, L. E. Silberstein, H. E. Heslop, J. I. Weitz, J. Anastasi, M. E. Salama, and S. A. Abutalib, eds.), pp. 321–333.e1, Elsevier, seventh ed., 2018

  7. [7]

    Approach to acute myeloid leukemia with increased eosinophils and basophils,

S. Papadakis, I. Liapis, S. I. Papadhimitriou, E. Spanoudakis, I. Kotsianidis, and K. Liapis, “Approach to acute myeloid leukemia with increased eosinophils and basophils,” Journal of Clinical Medicine, vol. 13, no. 3, p. 876, 2024

  8. [8]

    Detecting malignant leukemia cells using microscopic blood smear images: a deep learning approach,

R. Baig, A. Rehman, A. Almuhaimeed, A. Alzahrani, and H. T. Rauf, “Detecting malignant leukemia cells using microscopic blood smear images: a deep learning approach,” Applied Sciences, vol. 12, no. 13, p. 6317, 2022

  9. [9]

Ghost-ResNeXt: An effective deep learning based on mature and immature WBC classification,

S. S. R. Bairaboina and S. R. Battula, “Ghost-ResNeXt: An effective deep learning based on mature and immature WBC classification,” Applied Sciences, vol. 13, no. 6, p. 4054, 2023

  10. [10]

CAE-ResVGG FusionNet: A feature extraction framework integrating convolutional autoencoders and transfer learning for immature white blood cells in acute myeloid leukemia,

T. Elhassan, A. H. Osman, M. S. M. Rahim, S. Z. M. Hashim, A. Ali, E. Elhassan, Y. Elkamali, and M. Aljurf, “CAE-ResVGG FusionNet: A feature extraction framework integrating convolutional autoencoders and transfer learning for immature white blood cells in acute myeloid leukemia,” Heliyon, vol. 10, no. 19, 2024

  11. [11]

    YOLOv12: Attention-Centric Real-Time Object Detectors

Y. Tian, Q. Ye, and D. Doermann, “YOLOv12: Attention-centric real-time object detectors,” arXiv preprint arXiv:2502.12524, 2025

  12. [12]

    You only look once: Unified, real-time object detection,

J. Redmon, S. Divvala, R. Girshick, and A. Farhadi, “You only look once: Unified, real-time object detection,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 779–788, 2016