arxiv: 2605.07816 · v2 · pith:VMYVR744new · submitted 2026-05-08 · 💻 cs.CV

ICDAR 2026 Competition on Writer Identification and Pen Classification from Hand-Drawn Circles

Pith reviewed 2026-05-21 08:04 UTC · model grok-4.3

classification 💻 cs.CV
keywords writer identification · pen classification · hand-drawn circles · open-set recognition · biometric features · competition dataset · feature disentanglement · minimal traces

The pith

Hand-drawn circles alone carry enough information to identify writers at about 65 percent accuracy in an open-set setting and to classify pens at about 93 percent accuracy across seen and unseen writers.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper presents CircleID, a competition built around a new dataset of more than 46,000 scanned hand-drawn circles produced by 66 writers with eight different pens. It sets up two tasks that test whether writer-specific traits and pen-specific properties can be separated from these very simple, static marks: one task requires open-set identification of known writers while rejecting unknowns, and the other requires classifying the pen used even when the writer is unseen. The work supplies a ResNet baseline and reports that the top private-leaderboard entries reached 64.801 percent accuracy on writer identification and 92.726 percent on pen classification. A reader would care because the results supply a controlled, large-scale measurement of how much biometric and material signal survives in minimal traces, offering a concrete starting point for studying feature entanglement without complex drawings or dynamic data.

Core claim

CircleID supplies a controlled dataset of 46,155 tightly cropped circle images digitized at 400 DPI from 44 known and 22 unknown writers using eight pens. The competition defines an open-set writer identification task and a cross-writer pen classification task whose goal is to measure how biometric writer characteristics and physical pen features become entangled inside these minimal static traces. The strongest private-leaderboard submissions achieved 64.801 percent top-1 accuracy for writer identification and 92.726 percent for pen classification, establishing an initial quantitative baseline for minimal-trace analysis.
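The open-set task described above requires a model to name a known writer or reject the sample as unknown. One common way to do this is to threshold the maximum softmax probability. The sketch below illustrates that idea only. It is not the competition baseline or any winning method, and the `UNKNOWN` label, the `predict_open_set` name, and the 0.5 threshold are all assumptions made for illustration.

```python
import numpy as np

UNKNOWN = -1  # hypothetical label meaning "reject as unknown writer"

def softmax(logits):
    """Row-wise softmax, subtracting the row maximum for numerical stability."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

def predict_open_set(logits, threshold=0.5):
    """Return the arg-max known-writer index, or UNKNOWN when confidence is low."""
    probs = softmax(np.asarray(logits, dtype=float))
    best = probs.argmax(axis=1)
    confident = probs.max(axis=1) >= threshold
    return np.where(confident, best, UNKNOWN)

# Toy logits over three known writers (the competition has 44).
toy_logits = np.array([
    [4.0, 0.5, 0.2],  # one class dominates, so it is accepted
    [1.0, 1.1, 0.9],  # nearly flat scores, so it is rejected
])
open_set_preds = predict_open_set(toy_logits, threshold=0.5)
```

In practice the threshold would be tuned on held-out data that contains unknown writers, since a fixed value trades known-writer accuracy against unknown-writer recall.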

What carries the argument

The CircleID dataset together with its two evaluation tracks that force models to separate writer identity from pen type on single static circles.

If this is right

  • Writer identification from circles alone reaches roughly 65 percent Top-1 accuracy (64.801 percent for the best private-leaderboard entry) in open-set conditions, which is informative but still limited.
  • Pen type can be recovered at over 92 percent accuracy across both seen and unseen writers, showing strong separability of physical properties.
  • Separate known and unknown writer groups provide a direct test of how out-of-distribution individuals affect model behavior.
  • The reported numbers supply a reproducible reference point against which future minimal-trace methods can be compared.
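The Figure 6 caption names four quantities for the writer task: all-writer Top-1 accuracy, known-writer Top-1 accuracy, unknown-writer recall, and binary known-versus-unknown accuracy. A minimal sketch of how they could be computed is given here. It assumes all unknown writers share a single hypothetical label `UNKNOWN = -1`, and it is not the official Kaggle scoring code.

```python
import numpy as np

UNKNOWN = -1  # hypothetical label shared by all unknown writers

def open_set_report(y_true, y_pred):
    """Compute the four open-set metrics from true and predicted labels."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    known = y_true != UNKNOWN
    return {
        # Exact label match over every sample, unknowns included.
        "all_top1": float(np.mean(y_pred == y_true)),
        # Exact label match restricted to samples from known writers.
        "known_top1": float(np.mean(y_pred[known] == y_true[known])),
        # Fraction of unknown-writer samples that were rejected.
        "unknown_recall": float(np.mean(y_pred[~known] == UNKNOWN)),
        # Correct if the known/unknown decision is right, whatever the identity.
        "binary_acc": float(np.mean((y_pred == UNKNOWN) == (y_true == UNKNOWN))),
    }

y_true = [0, 1, 2, UNKNOWN, UNKNOWN, 1]
y_pred = [0, 2, 2, UNKNOWN, 1, 1]
report = open_set_report(y_true, y_pred)
```

The sketch assumes both known and unknown samples are present; with an empty subset the corresponding mean is undefined.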

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same circle-based protocol could be extended to other elementary shapes to test whether the observed disentanglement holds for different geometric constraints.
  • Adding temporal stroke information to the static images might raise both accuracies without changing the core minimal-trace premise.
  • The large gap between pen-classification and writer-identification performance suggests that material cues are easier to isolate than individual motor habits in these drawings.

Load-bearing premise

The dataset split into known and unknown writer groups measures true generalization and feature disentanglement without selection bias or distribution shifts that would appear in real-world use.
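One simple way to probe this premise is to compare how the eight pens are distributed in the known and unknown writer groups. The sketch below uses total variation distance for that comparison. The metric, the function names, and the toy pen lists are all assumptions for illustration, and nothing here is taken from the paper.

```python
from collections import Counter

def pen_distribution(pens, n_pens=8):
    """Relative frequency of each pen id 1..n_pens in a list of samples."""
    counts = Counter(pens)
    total = len(pens)
    return [counts.get(p, 0) / total for p in range(1, n_pens + 1)]

def total_variation(p, q):
    """Total variation distance between two discrete distributions (0 = identical)."""
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))

# Toy pen ids per sample, one list per writer group (illustrative values only).
known_pens = [1, 1, 2, 3, 4, 5, 6, 7, 8, 8]
unknown_pens = [1, 2, 2, 3, 4, 5, 6, 7, 7, 8]
tv = total_variation(pen_distribution(known_pens), pen_distribution(unknown_pens))
```

A value near zero would suggest the two groups are matched on pen usage. It says nothing about unrecorded covariates such as age or handedness.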

What would settle it

If the top-performing models drop well below 50 percent accuracy when tested on circles drawn by a fresh group of writers using the same pens, the claimed level of generalization would be falsified.

Figures

Figures reproduced from arXiv: 2605.07816 by Fei Wu, Janne van der Loop, Linda-Sophie Schneider, Lukas Hüttner, Mathias Seuret, Thomas Gorges, Vincent Christlein.

Figure 1: Examples of morphological and textural variations in hand-drawn circles from randomly selected writers across the eight pens, all using black ink. The dataset consists of a newly assembled set of hand-drawn circles, drawn under controlled conditions. Data collection involved a cohort of 66 healthy adults from diverse cultural backgrounds, representing various age and gender groups. Participants received st…
Figure 2: Overview of the CircleID dataset splits and evaluation set distributions. (a) Sizes of the train set, additional train set, and evaluation set. The additional train set is subdivided into samples with known and unknown writer annotations. (b) Sizes of the public and private evaluation sets, further broken down by Part A and Part B. (c) Distribution of samples across the eight pens in the evaluation set, sh…
Figure 3: Performance analysis of the top-3 teams on different subsets of the private leaderboard data. (a) Writer identification task. Accuracies represent Top-1 accuracies, F1 scores for all categories, and macro-F1 scores for known categories. Recall and F1 for unknown writers are recall and binary F1 for unknown-writer detection on the unknown-writer subset. (b) Pen classification task. Top-1 accuracies and macr…
Figure 4: Rank-shift analysis between the public and private leaderboards. (a) Writer identification task. (b) Pen classification task. Both models were initialized from ImageNet pretraining to support fine-grained texture modeling. Training was carried out in multiple stages. In the first stage, the models were optimized on labeled data using cross-entropy loss and label smoothing, with strong augmentations includi…
Figure 5: Empirical cumulative distribution functions of Top-1 accuracy for both tasks, shown separately for the public and private leaderboards. (a) Writer identification leaderboard. (b) Pen classification leaderboard. … classification in (b). The Spearman rank correlations are high for both tasks, with ρ = 0.939 for writer identification and ρ = 0.938 for pen classification. This indicates that the rankings are rel…
Figure 6: Writer identification top-30 team performance on the private leaderboard evaluation set. (a) Violin plots showing the distribution across the all-writer Top-1 accuracy, known-writer Top-1 accuracy, unknown-writer recall, and binary known vs. unknown accuracy. In this binary setting, a prediction counts as correct when the model correctly distinguishes between any known writer and the unknown class, ind…
Figure 7: Pen classification top-30 team performance on the private leaderboard evaluation set. (a) Per-pen F1 distribution for pens 1, 2, 4, 5, and 6, shown separately for all, known, and unknown writers. (b) Per-pen F1 distribution for pens 3, 7, and 8 under the same subset split. (c) Top-1 accuracy distribution of teams across all, known, and unknown writer subsets. … pens 1, 2, 4, 5, and 6 consistently with strong…
Figure 8: Samples from the private leaderboard set that were most frequently misclassified by the top-30 teams in the pen classification task. Ground truth of all samples is shown. 6 Discussion. Pen classification proved easier, while writer identification from isolated circles was considerably more difficult. This difference suggests that pen-specific stroke texture and material properties can be captured more readi…
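The text accompanying Figure 5 reports Spearman rank correlations between the public and private leaderboards. As a rough illustration of what that statistic measures, the sketch below computes Spearman's ρ from two score lists using the classic rank-difference formula. It assumes no tied scores, and the toy scores are invented and are not competition data.

```python
import numpy as np

def ranks(scores):
    """Rank 1 = highest score. Assumes no tied scores."""
    order = np.argsort(-np.asarray(scores, dtype=float))
    r = np.empty(len(order), dtype=int)
    r[order] = np.arange(1, len(order) + 1)
    return r

def spearman_rho(public_scores, private_scores):
    """Spearman's rho via 1 - 6 * sum(d^2) / (n * (n^2 - 1)), valid without ties."""
    d = ranks(public_scores) - ranks(private_scores)
    n = len(d)
    return 1.0 - 6.0 * float(np.sum(d ** 2)) / (n * (n ** 2 - 1))

# Toy Top-1 accuracies for five hypothetical teams.
public = [0.91, 0.88, 0.85, 0.80, 0.78]
private = [0.90, 0.86, 0.87, 0.79, 0.75]
rho = spearman_rho(public, private)
```

Here two neighbouring teams swap places between the leaderboards, so ρ stays close to 1. Real leaderboards often contain ties, which need average ranks and a correlation computed on the ranks instead.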

This paper presents CircleID, a large-scale ICDAR 2026 competition on writer identification and pen classification from scanned hand-drawn circles. The primary objective is to investigate how biometric writer characteristics and physical pen features naturally entangle within minimal, static traces. CircleID comprises two distinct tasks: (1) open-set writer identification, requiring models to recognize known writers while explicitly rejecting unknown ones, and (2) cross-writer pen classification, evaluated across both seen and unseen writers. Participants were provided with a new, controlled dataset of 46,155 tightly cropped circle images, digitized at 400 DPI and annotated for writer identity and pen type. The dataset comprises samples from 44 known and 22 unknown writers using eight different pens. Hosted on Kaggle as two separate tracks with public and private leaderboards, the competition provided participants with a ResNet baseline. In total, 389 teams (436 participants) made 3,185 submissions for the pen classification task, and 113 teams (141 participants) made 1,737 submissions for the writer identification track. The best-performing private leaderboard submissions achieved a Top-1 accuracy of 64.801% for writer identification and 92.726% for pen classification. This paper details the dataset, evaluates the winning methodologies, and analyzes the impact of out-of-distribution writers on model generalization and feature disentanglement. In this large-scale competition, CircleID establishes a new baseline for minimal-trace analysis.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

1 major / 1 minor

Summary. This paper presents the CircleID competition for ICDAR 2026 on writer identification and pen classification from hand-drawn circles. It introduces a dataset of 46,155 tightly cropped circle images from 66 writers (44 known, 22 unknown) using 8 pens, scanned at 400 DPI. Two tasks are defined: open-set writer identification (recognize known writers, reject unknowns) and cross-writer pen classification (across seen and unseen writers). Hosted on Kaggle with public/private leaderboards and a ResNet baseline, the competition received 3,185 submissions for pen classification and 1,737 for writer identification. Best private leaderboard results are 64.801% Top-1 accuracy for writer ID and 92.726% for pen classification. The paper evaluates winning methods and analyzes OOD writer impact on generalization and feature disentanglement, claiming a new baseline for minimal-trace analysis.

Significance. If the results hold, this establishes a useful large-scale empirical benchmark for disentangling biometric writer traits from physical pen features in minimal static traces, supported by substantial community participation and dual public/private leaderboards that provide held-out evaluation. The dual-task design and controlled dataset enable targeted study of generalization to unknown writers, with potential relevance to forensic document analysis and biometric systems. Provision of baseline code and open data further aids reproducibility.

major comments (1)
  1. [Dataset description] Dataset description: the known/unknown writer partition (44 known, 22 unknown) is stated without any details on construction criteria, exclusion rules, or checks for confounding covariates such as age, handedness, drawing speed, or demographics. This partition is load-bearing for the central claim that private-leaderboard accuracies (64.801% writer ID, 92.726% pen classification) measure genuine open-set generalization and feature disentanglement rather than extraneous distribution shifts.
minor comments (1)
  1. [Abstract] Abstract: the claim that the competition 'analyzes the impact of out-of-distribution writers on model generalization and feature disentanglement' would benefit from a brief mention of the specific metrics or qualitative methods used for this analysis.

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for their constructive feedback on the manuscript. We address the major comment point by point below and will revise the paper accordingly where appropriate.

  1. Referee: Dataset description: the known/unknown writer partition (44 known, 22 unknown) is stated without any details on construction criteria, exclusion rules, or checks for confounding covariates such as age, handedness, drawing speed, or demographics. This partition is load-bearing for the central claim that private-leaderboard accuracies (64.801% writer ID, 92.726% pen classification) measure genuine open-set generalization and feature disentanglement rather than extraneous distribution shifts.

    Authors: We agree that additional details on the known/unknown partition would strengthen the manuscript. The 66 writers were recruited from a single university community with the goal of obtaining a controlled set of hand-drawn circles. The partition into 44 known and 22 unknown writers was performed by randomly assigning writers to the two groups while balancing the total number of circles contributed per writer and ensuring each of the eight pens was represented proportionally in both sets. No writers were excluded on the basis of age, handedness, or drawing speed, as these variables were not recorded. Demographic covariates were deliberately omitted from the data collection protocol to protect participant privacy and because the study focus was limited to the visual properties of the static traces. We will add a new subsection to the dataset description that explicitly states the random assignment procedure, the balancing criteria used, and the absence of collected covariates. This revision will clarify the basis for the open-set evaluation without overstating the controls. revision: yes
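The random assignment described in this response can be sketched as a seeded shuffle of writer IDs. The example below covers only the plain 44/22 random split. The balancing of circles per writer and of pen proportions is not implemented, because the response does not specify how it was done. The function name, the integer writer IDs, and the seed are assumptions.

```python
import random

def split_writers(writer_ids, n_known, seed=0):
    """Randomly assign writers to a known group and an unknown group."""
    rng = random.Random(seed)  # local generator so the split is reproducible
    shuffled = list(writer_ids)
    rng.shuffle(shuffled)
    return sorted(shuffled[:n_known]), sorted(shuffled[n_known:])

# Hypothetical integer IDs for the 66 writers.
writer_ids = list(range(1, 67))
known_writers, unknown_writers = split_writers(writer_ids, n_known=44, seed=0)
```

Recording the seed alongside the split would let readers reproduce the partition and rerun it to see how sensitive the reported accuracies are to the choice of unknown writers.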

Circularity Check

0 steps flagged

No circularity: competition results from independent held-out evaluations

The paper describes an ICDAR competition with a new dataset, participant submissions, and private leaderboard results for writer identification and pen classification. No derivations, equations, or load-bearing steps are present that reduce claims to self-definitions, fitted inputs renamed as predictions, or self-citation chains. Reported accuracies (64.801% writer ID, 92.726% pen classification) are empirical outcomes from external teams evaluated on held-out private test data, with the analysis remaining self-contained against external benchmarks rather than internally forced.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The central claim rests on the empirical competition outcomes and the domain assumption that controlled circle drawings allow measurement of entangled writer and pen features; no free parameters or invented entities are introduced.

axioms (1)
  • domain assumption: Writer identity and pen type features can be studied through disentanglement analysis in static circle images.
    This premise underpins the two tasks and the analysis of out-of-distribution writers.

pith-pipeline@v0.9.0 · 5818 in / 1242 out tokens · 53648 ms · 2026-05-21T08:04:52.341510+00:00 · methodology
