arxiv: 2605.18058 · v1 · pith:MQLM6CAVnew · submitted 2026-05-18 · 💻 cs.CV

Threats to Arabic Handwriting Recognition: Investigating Black-Box Adversarial Attacks on embedded ConvNet models

Pith reviewed 2026-05-20 11:30 UTC · model grok-4.3

classification 💻 cs.CV
keywords Arabic handwriting recognition · black-box adversarial attacks · ConvNet models · Pixle attack · model vulnerability · adversarial examples · handwritten character recognition · security threats

The pith

Arabic handwriting recognition ConvNets are vulnerable to black-box attacks that succeed at 99-100% while appearing normal to humans.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The authors investigate the security of deep learning models for Arabic handwriting recognition by applying black-box adversarial attacks that require no knowledge of the model internals. They evaluate several such attacks on two standard benchmark datasets using embedded convolutional networks. The Pixle attack in particular reaches 99-100% success on most models. These changes keep the structural form of the characters intact so the writing still looks natural to human observers. The findings show these models can be manipulated without obvious signs, raising questions about their use in applications that need reliable recognition.

Core claim

Embedded ConvNet models for Arabic handwriting recognition can be successfully attacked using black-box methods, with the Pixle attack reaching 99-100% success rates on most models across two benchmark datasets of handwritten Arabic characters, while the modified images retain structural integrity and remain nearly invisible to human observers.
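The success rates quoted above depend on how attack success is counted, and this page does not reproduce the paper's exact definition. A minimal Python sketch of one common convention follows, assuming success is measured only over samples the model classified correctly before the attack. The function names and the toy label lists are hypothetical.

```python
def accuracy(true_labels, preds):
    """Share of predictions that match the true labels."""
    return sum(y == p for y, p in zip(true_labels, preds)) / len(true_labels)


def attack_success_rate(true_labels, clean_preds, adv_preds):
    """Fraction of initially correct samples that are misclassified after the attack.

    Assumption: samples the model already got wrong are excluded, because
    there is nothing left to break for them.
    """
    eligible = [
        (y, a)
        for y, c, a in zip(true_labels, clean_preds, adv_preds)
        if c == y
    ]
    if not eligible:
        raise ValueError("no correctly classified clean samples to attack")
    fooled = sum(1 for y, a in eligible if a != y)
    return fooled / len(eligible)


# Toy data (made up for illustration): five samples, four classified correctly.
true_labels = [0, 1, 2, 3, 4]
clean_preds = [0, 1, 2, 3, 0]
adv_preds = [1, 1, 0, 2, 0]

asr = attack_success_rate(true_labels, clean_preds, adv_preds)
accuracy_drop = accuracy(true_labels, clean_preds) - accuracy(true_labels, adv_preds)
print(f"ASR = {asr:.2f}, accuracy drop = {accuracy_drop:.2f}")
```

On the toy data this prints an ASR of 0.75 and an accuracy drop of 0.60. The two numbers differ because the drop is computed over all samples while the ASR is computed only over the initially correct ones.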

What carries the argument

Black-box adversarial attack methods, especially the Pixle perturbation technique, applied directly to input images fed into ConvNet-based Arabic handwriting recognition systems.
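To make the black-box setting concrete, here is a minimal Python sketch of a pixel-rearrangement random search in the spirit of Pixle. It is not the authors' code and not the reference Pixle implementation: the classifier `toy_model`, the 8x8 image, and `rearrangement_attack` are hypothetical stand-ins, and the search copies one pixel per query. The attacker only reads model scores, never weights or gradients.

```python
import random


def toy_model(image):
    """Hypothetical stand-in classifier returning [score_class0, score_class1].

    Class 1 wins when the left half of the image is brighter than the right half.
    """
    width = len(image[0])
    left = sum(sum(row[: width // 2]) for row in image)
    right = sum(sum(row[width // 2:]) for row in image)
    total = (left + right) or 1.0
    return [right / total, left / total]


def argmax(scores):
    return max(range(len(scores)), key=scores.__getitem__)


def rearrangement_attack(model, image, true_label, max_queries=500, seed=0):
    """Random search that copies existing pixel values to new positions.

    A candidate is kept only if it lowers the score of the true label.
    Returns (adversarial_image, queries_used, success).
    """
    rng = random.Random(seed)
    height, width = len(image), len(image[0])
    adv = [row[:] for row in image]  # work on a copy, leave the input intact
    scores = model(adv)
    queries = 1
    best = scores[true_label]
    while queries < max_queries and argmax(scores) == true_label:
        src_y, src_x = rng.randrange(height), rng.randrange(width)
        dst_y, dst_x = rng.randrange(height), rng.randrange(width)
        candidate = [row[:] for row in adv]
        candidate[dst_y][dst_x] = adv[src_y][src_x]
        cand_scores = model(candidate)
        queries += 1
        if cand_scores[true_label] < best:
            adv, scores, best = candidate, cand_scores, cand_scores[true_label]
    return adv, queries, argmax(scores) != true_label


# Toy input: bright left half, dark right half, so the clean label is 1.
image = [[1.0] * 4 + [0.0] * 4 for _ in range(8)]
adv, queries, success = rearrangement_attack(toy_model, image, true_label=1)
print("success:", success, "queries:", queries)
```

Because the search only reuses values already present in the image, no new intensities are introduced. The published method differs in detail, so this sketch illustrates the query-only loop rather than reproducing the attack evaluated in the paper.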

If this is right

  • The studied models exhibit clear vulnerability to adversarial manipulation even without attacker access to architecture details.
  • Imperceptible attacks that preserve character appearance could interfere with real-world AHR applications without detection.
  • High success rates in controlled black-box settings indicate similar risks for other embedded recognition pipelines.
  • Strengthening security measures becomes necessary to maintain reliability in deployed AHR systems.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Comparable vulnerabilities may appear in recognition systems for other scripts with complex character structures.
  • Document processing pipelines that rely on these models could experience undetected accuracy loss if exposed to such attacks.
  • Direct evaluation of defense strategies against Pixle-style perturbations on Arabic handwriting data would be a logical follow-up.

Load-bearing premise

The two benchmark datasets and the tested ConvNet models are representative of the high-performing Arabic handwriting recognition systems used in real applications.

What would settle it

Running the Pixle attack on a production Arabic handwriting recognition system in actual use and observing either low success rates or clearly visible changes to the characters would falsify the practical threat.
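Whether changes are "clearly visible" can be approximated numerically before any human study. The Python sketch below computes a few crude proxies (how many pixels changed and by how much). These are assumptions chosen for illustration, not perceptual metrics and not measurements reported by the paper; `perturbation_stats` and the 4x4 images are hypothetical.

```python
def perturbation_stats(original, adversarial):
    """Compare two equally sized grayscale images given as lists of rows."""
    if len(original) != len(adversarial) or any(
        len(ro) != len(ra) for ro, ra in zip(original, adversarial)
    ):
        raise ValueError("images must have the same shape")
    diffs = [
        abs(a - o)
        for row_o, row_a in zip(original, adversarial)
        for o, a in zip(row_o, row_a)
    ]
    changed = sum(1 for d in diffs if d > 0)
    return {
        "changed_pixels": changed,                # L0-style count
        "changed_fraction": changed / len(diffs),
        "max_abs_change": max(diffs),             # L-infinity-style magnitude
        "mean_abs_change": sum(diffs) / len(diffs),
    }


# Toy 4x4 image with 8-bit intensities; two pixels are overwritten.
original = [[0, 0, 255, 255] for _ in range(4)]
adversarial = [row[:] for row in original]
adversarial[0][0] = 255
adversarial[3][3] = 0

stats = perturbation_stats(original, adversarial)
print(stats)
```

A pixel-rearrangement attack can score low on the changed-pixel count while scoring high on per-pixel magnitude, which is one reason a single norm does not settle the visibility question.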

Figures

Figures reproduced from arXiv: 2605.18058 by Abdelaziz Courr, Abdelillah Semma, Mohsine EL Khayati, Rachid Elouahbi.

Figure 1
Figure 1. Flowchart of the proposed method. Adjacent paper text captured with the caption: B. Models. Four models were selected for their lightweight architectures: MobileNet [26], MnasNet [27], ShuffleNet [29], and SqueezeNet [28]. These models have minimal parameter counts and numbers of floating-point operations (FLOPs), making them suitable for mobile and IoT devices. They were originally designed to operate efficiently in resource-constrained environments wh…
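The small parameter and operation counts mentioned in the excerpt can be illustrated with the depthwise separable convolution used by MobileNet-style networks. The Python sketch below compares it with a standard convolution. The layer sizes are hypothetical, biases are ignored, stride 1 is assumed, and multiply-accumulate operations (MACs) stand in for FLOPs; none of these numbers come from the paper.

```python
def standard_conv_cost(kernel, c_in, c_out, h_out, w_out):
    """Parameters and multiply-accumulates of a standard 2D convolution."""
    params = kernel * kernel * c_in * c_out
    macs = params * h_out * w_out
    return params, macs


def depthwise_separable_cost(kernel, c_in, c_out, h_out, w_out):
    """Depthwise (one filter per input channel) followed by a 1x1 pointwise conv."""
    depthwise_params = kernel * kernel * c_in
    pointwise_params = c_in * c_out
    params = depthwise_params + pointwise_params
    macs = params * h_out * w_out
    return params, macs


# Hypothetical layer: 3x3 kernel, 32 -> 64 channels, 16x16 output feature map.
std_params, std_macs = standard_conv_cost(3, 32, 64, 16, 16)
sep_params, sep_macs = depthwise_separable_cost(3, 32, 64, 16, 16)

print("standard :", std_params, "params,", std_macs, "MACs")
print("separable:", sep_params, "params,", sep_macs, "MACs")
print("reduction factor:", round(std_params / sep_params, 2))
```

For this layer the separable version needs 2,336 parameters instead of 18,432, a reduction of roughly 7.9 times, which is the kind of saving that makes such models attractive on mobile and IoT hardware.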
Figure 2
Figure 2. The accuracy drop of the studied models across attack types and datasets.
Original abstract

Arabic handwriting recognition (AHR) has made significant progress with deep learning models. AHR research has largely focused on performance, with security receiving little attention. This study provides what appears to be a new line of inquiry by demonstrating the vulnerability of high-performing models to adversarial black-box attacks. The focus on black-box attacks reflects real-world scenarios where the attacker has no prior knowledge of the model architecture. Extensive experiments were conducted on two benchmark AHR datasets containing Arabic handwritten characters. Results demonstrated the effectiveness of the attacks, with the Pixle attack achieving an attack success rate of 99-100% on most models. Other, less aggressive attacks achieved success rates of 50-96% across most experiments. Despite the higher attack success rate, the attacks maintain the structural integrity of the characters, rendering them almost imperceptible to the human eye. The findings indicate the higher vulnerability of the studied models to adversarial manipulation. This underscores the need to strengthen efforts to secure these models and ensure their reliability in AHR real-world applications.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper examines the security of Arabic handwriting recognition (AHR) systems by testing black-box adversarial attacks on ConvNet models using two standard benchmark datasets of Arabic handwritten characters. It reports that the Pixle attack achieves 99-100% attack success rate on most models while preserving character structure and remaining nearly imperceptible, with other attacks reaching 50-96% success; the work concludes that these models are vulnerable and calls for greater focus on robustness in real-world AHR applications.

Significance. If the reported attack success rates prove reproducible and the tested models are representative, the work is significant for highlighting an understudied security dimension in AHR, a task with practical importance in document analysis and embedded systems. The empirical demonstration of high-success, low-visibility perturbations on standard benchmarks provides a concrete starting point for robustness research in non-Latin script recognition, analogous to established adversarial studies in general computer vision.

major comments (2)
  1. [§4 (Experiments)] §4 (Experiments) and abstract: The reported 99-100% ASR for Pixle and 50-96% for other attacks lack any description of model architectures (layer counts, parameter sizes), exact attack hyperparameters (query budgets, perturbation limits), number of trials, or error bars/statistical tests, which are required to assess whether the central vulnerability claim is reliably supported rather than an artifact of specific unstated choices.
  2. [§5 (Discussion)] §5 (Discussion) or conclusion: The claim that the attacks demonstrate practical threats to embedded AHR systems rests on results from two benchmark datasets and selected ConvNets, but the manuscript contains no transfer experiments, no evaluation on models trained on in-the-wild Arabic handwriting, and no tests under realistic embedded constraints (preprocessing pipelines, hardware quantization, or limited API access), leaving the generalization to deployed systems unsupported.
minor comments (2)
  1. [Abstract] Abstract and §3: The term 'embedded ConvNet models' is used without clarifying whether the models were actually quantized or deployed on embedded hardware versus simply trained as standard ConvNets; a brief statement on this distinction would improve clarity.
  2. [References] References: The manuscript would benefit from citing recent surveys on adversarial attacks in handwriting recognition or Arabic OCR to better situate the novelty of applying black-box methods to AHR.
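The first major comment asks for error bars on the reported success rates. One standard option for a proportion is the Wilson score interval, sketched below in Python. The counts are hypothetical, since this page does not state how many images were attacked, and the paper may use a different procedure.

```python
import math


def wilson_interval(successes, trials, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 is about 95%)."""
    if trials <= 0:
        raise ValueError("trials must be positive")
    if not 0 <= successes <= trials:
        raise ValueError("successes must lie between 0 and trials")
    p_hat = successes / trials
    z2 = z * z
    denom = 1 + z2 / trials
    centre = (p_hat + z2 / (2 * trials)) / denom
    half_width = (
        z * math.sqrt(p_hat * (1 - p_hat) / trials + z2 / (4 * trials * trials)) / denom
    )
    # Clamp to [0, 1] to absorb floating-point overshoot at the extremes.
    return max(0.0, centre - half_width), min(1.0, centre + half_width)


# Hypothetical example: 99 successful attacks out of 100 attempts.
low, high = wilson_interval(99, 100)
print(f"ASR 0.99, 95% interval: [{low:.3f}, {high:.3f}]")
```

With 100 hypothetical trials, an observed 99% success rate is compatible with a true rate of about 0.946 to 0.998. The interval narrows as the number of attacked images grows, which is why the sample size matters for judging the 99-100% claim.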

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the thoughtful and constructive comments on our manuscript. We address each major comment below, indicating the revisions we will make to improve clarity, reproducibility, and the appropriate scope of our claims.

Point-by-point responses
  1. Referee: [§4 (Experiments)] §4 (Experiments) and abstract: The reported 99-100% ASR for Pixle and 50-96% for other attacks lack any description of model architectures (layer counts, parameter sizes), exact attack hyperparameters (query budgets, perturbation limits), number of trials, or error bars/statistical tests, which are required to assess whether the central vulnerability claim is reliably supported rather than an artifact of specific unstated choices.

    Authors: We agree that these experimental details are necessary to support reproducibility and to allow readers to evaluate the reliability of the reported attack success rates. In the revised manuscript we will expand Section 4 to include: (i) full specifications of the ConvNet architectures (layer counts, filter sizes, and parameter counts), (ii) the precise hyperparameter settings used for each attack (including query budgets, perturbation budgets, and any other relevant controls), (iii) the number of independent trials or images evaluated, and (iv) error bars or results of appropriate statistical tests. Corresponding clarifications will also be added to the abstract. revision: yes

  2. Referee: [§5 (Discussion)] §5 (Discussion) or conclusion: The claim that the attacks demonstrate practical threats to embedded AHR systems rests on results from two benchmark datasets and selected ConvNets, but the manuscript contains no transfer experiments, no evaluation on models trained on in-the-wild Arabic handwriting, and no tests under realistic embedded constraints (preprocessing pipelines, hardware quantization, or limited API access), leaving the generalization to deployed systems unsupported.

    Authors: We acknowledge that the present experiments are confined to two standard benchmark datasets and do not include transferability tests, in-the-wild data, or evaluations under realistic embedded constraints such as quantization or restricted API access. Our primary goal was to demonstrate vulnerability on established benchmarks as a first step. In the revised version we will (a) revise the language in Section 5 and the conclusion to more precisely delimit the scope of our claims, (b) add an explicit limitations paragraph discussing the absence of these additional evaluations, and (c) outline concrete directions for future work on transferability and real-world deployment scenarios. No new experiments will be added at this stage. revision: partial

Circularity Check

0 steps flagged

No circularity: purely empirical evaluation of known attacks on benchmarks

full rationale

The manuscript reports direct experimental results from applying standard black-box adversarial attacks (including Pixle) to ConvNet models trained on two public Arabic handwriting benchmark datasets. No derivations, equations, fitted parameters renamed as predictions, or self-citation chains appear in the provided abstract or described content. All performance numbers (e.g., 99-100% ASR) are measured outcomes rather than outputs forced by construction from the inputs. The work is therefore self-contained against external benchmarks and receives the default non-circularity finding.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

The paper is an empirical security evaluation that relies on standard machine-learning assumptions about model behavior and benchmark datasets rather than introducing new free parameters, axioms, or invented entities.

Reference graph

Works this paper leans on

34 extracted references · 34 canonical work pages · 4 internal anchors

  1. [1]

    Online Arabic handwriting recognition: a survey,

    N. Tagougui, A. M. Alimi, and M. Kherallah, “Online Arabic handwriting recognition: a survey,” International Journal on Document Analysis and Recognition (IJDAR), vol. 16, May 2012

  2. [2]

    Arabic handwriting recognition: Between handcrafted methods and deep learning techniques,

    A. Korichi, S. Slatnia, O. Aiadi, N. Tagougui, and M. Kherallah, “Arabic handwriting recognition: Between handcrafted methods and deep learning techniques,” in 2020 21st International Arab Conference on Information Technology (ACIT). Giza, Egypt: IEEE, Nov. 2020, pp. 1–6

  3. [3]

    Convolutional Arabic handwriting recognition system based BLSTM-CTC using WBS decoder,

    M. Rabi, “Convolutional Arabic handwriting recognition system based BLSTM-CTC using WBS decoder,” International Journal of Advanced Science and Computer Applications, vol. 4, no. 1, Jan. 2024

  4. [4]

    CNN-based Methods for Offline Arabic Handwriting Recognition: A Review,

    M. El Khayati, I. Kich, and Y. Taouil, “CNN-based Methods for Offline Arabic Handwriting Recognition: A Review,” Neural Processing Letters, vol. 56, no. 2, p. 115, Mar. 2024

  5. [5]

    End-to-End Machine Learning Solution for Recognizing Handwritten Arabic Documents,

    R. E. Shtaiwi, G. A. Abandah, and S. A. Sawalhah, “End-to-End Machine Learning Solution for Recognizing Handwritten Arabic Documents,” in 2022 13th International Conference on Information and Communication Systems (ICICS). Irbid, Jordan: IEEE, Jun. 2022, pp. 180–185

  6. [6]

    Novel Deep Convolutional Neural Network-Based Contextual Recognition of Arabic Handwritten Scripts,

    R. Ahmed, M. Gogate, A. Tahir, K. Dashtipour, B. Al-tamimi, A. Hawalah, M. A. El-Affendi, and A. Hussain, “Novel Deep Convolutional Neural Network-Based Contextual Recognition of Arabic Handwritten Scripts,” Entropy, vol. 23, no. 3, p. 340, Mar. 2021

  7. [7]

    A novel architecture of CNN based on SVM classifier for recognising Arabic handwritten script,

    M. Elleuch, N. Tagougui, and M. Kherallah, “A novel architecture of CNN based on SVM classifier for recognising Arabic handwritten script,” International Journal of Intelligent Systems Technologies and Applications, vol. 15, no. 4, p. 323, 2016

  8. [8]

    Leveraging transfer learning and mobile-enabled convolutional neural networks for improved arabic handwritten character recognition,

    M. El Khayati, A. Maafiri, Y. Himeur, H. Ali Alkhazaleh, S. Atalla, and W. Mansoor, “Leveraging transfer learning and mobile-enabled convolutional neural networks for improved arabic handwritten character recognition,” IEEE Access, vol. 13, pp. 166104–166126, 2025

  9. [9]

    A survey of safety and trustworthiness of deep neural networks: Verification, testing, adversarial attack and defence, and interpretability,

    X. Huang, D. Kroening, W. Ruan, J. Sharp, Y. Sun, E. Thamo, M. Wu, and X. Yi, “A survey of safety and trustworthiness of deep neural networks: Verification, testing, adversarial attack and defence, and interpretability,” Computer Science Review, vol. 37, p. 100270, Aug. 2020

  10. [10]

    Threat of Adversarial Attacks on Deep Learning in Computer Vision: A Survey,

    N. Akhtar and A. Mian, “Threat of Adversarial Attacks on Deep Learning in Computer Vision: A Survey,” IEEE Access, vol. 6, pp. 14410–14430, 2018

  11. [11]

    FAWA: Fast Adversarial Watermark Attack on Optical Character Recognition (OCR) Systems,

    L. Chen, J. Sun, and W. Xu, “FAWA: Fast Adversarial Watermark Attack on Optical Character Recognition (OCR) Systems,” in Machine Learning and Knowledge Discovery in Databases, F. Hutter, K. Kersting, J. Lijffijt, and I. Valera, Eds. Cham: Springer International Publishing, 2021, vol. 12459, pp. 547–563, Lecture Notes in Computer Science

  12. [12]

    Improvement Optical Character Recognition for Structured Documents using Generative Adversarial Networks,

    J. D. B. Castro, S. W. A. Canchumuni, C. E. M. Villalobos, F. C. Cordeiro, A. M. A. Alexandre, and M. A. C. Pacheco, “Improvement Optical Character Recognition for Structured Documents using Generative Adversarial Networks,” in 2021 21st International Conference on Computational Science and Its Applications (ICCSA). Cagliari, Italy: IEEE, Sep. 2021, pp. 285–292

  13. [13]

    One-Word Answer Correction using Deep Learning Models and OCR,

    K. P. K. Devan, S. Prabakaran, S. Tamizhazhagan, and S. Vaishnavi, “One-Word Answer Correction using Deep Learning Models and OCR,” International Journal of Recent Technology and Engineering (IJRTE), vol. 9, no. 2, pp. 679–682, Jul. 2020

  14. [14]

    Quranic Optical Text Recognition Using Deep Learning Models,

    M. Mohd, F. Qamar, I. Al-Sheikh, and R. Salah, “Quranic Optical Text Recognition Using Deep Learning Models,” IEEE Access, vol. 9, pp. 38318–38330, 2021

  15. [15]

    Intriguing properties of neural networks

    C. Szegedy, W. Zaremba, I. Sutskever, J. Bruna, D. Erhan, I. Goodfellow, and R. Fergus, “Intriguing properties of neural networks,” Feb. 2014, arXiv:1312.6199

  16. [16]

    Explaining and Harnessing Adversarial Examples

    I. J. Goodfellow, J. Shlens, and C. Szegedy, “Explaining and Harnessing Adversarial Examples,” Mar. 2015, arXiv:1412.6572 [cs, stat]

  17. [17]

    Towards Deep Learning Models Resistant to Adversarial Attacks

    A. Madry, A. Makelov, L. Schmidt, D. Tsipras, and A. Vladu, “Towards Deep Learning Models Resistant to Adversarial Attacks,” Sep. 2019, arXiv:1706.06083 [cs, stat]

  18. [18]

    Towards Deep Learning: A Review On Adversarial Attacks,

    M. M. Irfan, S. Ali, I. Yaqoob, and N. Zafar, “Towards Deep Learning: A Review On Adversarial Attacks,” in 2021 International Conference on Artificial Intelligence (ICAI). Islamabad, Pakistan: IEEE, Apr. 2021, pp. 91–96

  19. [19]

    Pixle: a fast and effective black-box attack based on rearranging pixels,

    J. Pomponi, S. Scardapane, and A. Uncini, “Pixle: a fast and effective black-box attack based on rearranging pixels,” in 2022 International Joint Conference on Neural Networks (IJCNN), Jul. 2022, pp. 1–7, arXiv:2202.02236 [cs, stat]

  20. [20]

    Square Attack: A Query-Efficient Black-Box Adversarial Attack via Random Search,

    M. Andriushchenko, F. Croce, N. Flammarion, and M. Hein, “Square Attack: A Query-Efficient Black-Box Adversarial Attack via Random Search,” in Computer Vision – ECCV 2020, A. Vedaldi, H. Bischof, T. Brox, and J.-M. Frahm, Eds. Cham: Springer International Publishing, 2020, pp. 484–501

  21. [21]

    Impact of Attention on Adversarial Robustness of Image Classification Models,

    P. Agrawal, N. S. Punn, S. Kumar Sonbhadra, and S. Agarwal, “Impact of Attention on Adversarial Robustness of Image Classification Models,” in 2021 IEEE International Conference on Big Data (Big Data). Orlando, FL, USA: IEEE, Dec. 2021, pp. 3013–3019

  22. [22]

    Adversarial Attacks on Image Classification Models: FGSM and Patch Attacks and Their Impact,

    J. Sen and S. Dasgupta, “Adversarial Attacks on Image Classification Models: FGSM and Patch Attacks and Their Impact,” in Information Security and Privacy in the Digital World - Some Selected Topics, J. Sen and J. Mayer, Eds. IntechOpen, Sep. 2023

  23. [23]

    Adversarial Attacks on Neural Network Policies

    S. Huang, N. Papernot, I. Goodfellow, Y. Duan, and P. Abbeel, “Adversarial Attacks on Neural Network Policies,” Feb. 2017, arXiv:1702.02284 [cs]

  24. [24]

    The Limitations of Deep Learning in Adversarial Settings,

    N. Papernot, P. McDaniel, S. Jha, M. Fredrikson, Z. B. Celik, and A. Swami, “The Limitations of Deep Learning in Adversarial Settings,” in 2016 IEEE European Symposium on Security and Privacy (EuroS&P), Mar. 2016, pp. 372–387

  25. [25]

    Metamorphic filtering of black-box adversarial attacks on multi-network face recognition models,

    R. R. Mekala, A. Porter, and M. Lindvall, “Metamorphic filtering of black-box adversarial attacks on multi-network face recognition models,” in Proceedings of the IEEE/ACM 42nd International Conference on Software Engineering Workshops. Seoul, Republic of Korea: ACM, Jun. 2020, pp. 410–417

  26. [26]

    Searching for mobilenetv3,

    A. Howard, M. Sandler, B. Chen, W. Wang, L.-C. Chen, M. Tan, G. Chu, V. Vasudevan, Y. Zhu, R. Pang, H. Adam, and Q. Le, “Searching for mobilenetv3,” in 2019 IEEE/CVF International Conference on Computer Vision (ICCV). IEEE, Oct. 2019, pp. 1314–1324

  27. [27]

    Mnasnet: Platform-aware neural architecture search for mobile,

    M. Tan, B. Chen, R. Pang, V. Vasudevan, M. Sandler, A. Howard, and Q. V. Le, “Mnasnet: Platform-aware neural architecture search for mobile,” in 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). IEEE, Jun. 2019, pp. 2815–2823

  28. [28]

    Squeezenet: Alexnet-level accuracy with 50x fewer parameters and <0.5mb model size,

    F. N. Iandola, S. Han, M. W. Moskewicz, K. Ashraf, W. J. Dally, and K. Keutzer, “Squeezenet: Alexnet-level accuracy with 50x fewer parameters and <0.5mb model size,” 2016

  29. [29]

    N. Ma, X. Zhang, H.-T. Zheng, and J. Sun, ShuffleNet V2: Practical Guidelines for Efficient CNN Architecture Design. Springer International Publishing, 2018, pp. 122–138

  30. [30]

    A comprehensive isolated farsi/arabic character database for handwritten ocr research,

    S. Mozaffari, K. Faez, F. Faradji, M. Ziaratban, and S. M. Golzan, “A comprehensive isolated farsi/arabic character database for handwritten ocr research,” in Proceedings of the 10th International Workshop on Frontiers in Handwriting Recognition, La Baule, France, 2006, pp. 385–389

  31. [31]

    Arabic handwritten characters recognition using convolutional neural network,

    A. Elsawy, M. Loey, and H. El-Bakry, “Arabic handwritten characters recognition using convolutional neural network,” WSEAS Transactions on Computer Research, vol. 5, pp. 11–19, 2017

  32. [32]

    Coatnet: Marrying convolution and attention for all data sizes,

    Z. Dai, H. Liu, Q. V. Le, and M. Tan, “Coatnet: Marrying convolution and attention for all data sizes,” 2021

  33. [33]

    Scaling vision transformers,

    X. Zhai, A. Kolesnikov, N. Houlsby, and L. Beyer, “Scaling vision transformers,” arXiv:2106.04560 [cs], 2021

  34. [34]

    Davit: Dual attention vision transformers,

    M. Ding, B. Xiao, N. Codella, P. Luo, J. Wang, and L. Yuan, “Davit: Dual attention vision transformers,” in Proceedings of the Conference, 2022