pith. machine review for the scientific record.

arxiv: 2605.07816 · v1 · submitted 2026-05-08 · 💻 cs.CV

Recognition: no theorem link

ICDAR 2026 Competition on Writer Identification and Pen Classification from Hand-Drawn Circles

Authors on Pith: no claims yet

Pith reviewed 2026-05-11 02:06 UTC · model grok-4.3

classification 💻 cs.CV
keywords writer identification · pen classification · hand-drawn circles · biometrics · feature disentanglement · open-set recognition · minimal traces · competition dataset

The pith

Hand-drawn circles contain identifiable traces of both the writer and the pen used.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper presents a large-scale competition to test how much writer identity and pen type information is recoverable from simple static images of hand-drawn circles. It supplies a controlled collection of over forty-six thousand tightly cropped circle scans from sixty-six writers and eight pens, organized into an open-set writer identification task and a cross-writer pen classification task. Models are required to recognize known writers while rejecting unknowns and to classify pens even when the writer is unseen. Top entries reached 64.8 percent accuracy on writer identification and 92.7 percent on pen classification, establishing measurable performance levels for this minimal-trace setting.

Core claim

CircleID establishes a new baseline for minimal-trace analysis by releasing a controlled dataset of hand-drawn circles and running a large-scale competition on two tasks: open-set writer identification, where models must handle unknown writers, and cross-writer pen classification. The best entries reached 64.801 percent Top-1 accuracy on writer identification and 92.726 percent on pen classification, demonstrating that biometric characteristics and pen features can be disentangled to a measurable degree from these minimal static traces.
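The open-set requirement, accept a known writer or reject the sample as unknown, is commonly implemented as confidence-thresholded classification. A minimal sketch of that idea (the writer list, logits, and threshold are illustrative, not the competition entries' actual methods):

```python
import math

# Minimal sketch of open-set writer identification via confidence
# thresholding. Writer IDs and the rejection threshold are hypothetical.
KNOWN_WRITERS = ["w01", "w02", "w03"]  # the real task has 50 known writers
REJECT_THRESHOLD = 0.5                 # illustrative confidence cutoff

def softmax(logits):
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def predict_open_set(logits):
    """Return a known writer ID, or 'unknown' if confidence is too low."""
    probs = softmax(logits)
    top = max(range(len(probs)), key=probs.__getitem__)
    return KNOWN_WRITERS[top] if probs[top] >= REJECT_THRESHOLD else "unknown"

print(predict_open_set([4.0, 0.1, 0.2]))   # peaked distribution -> "w01"
print(predict_open_set([0.3, 0.2, 0.25]))  # flat distribution -> "unknown"
```

The threshold trades known-writer accuracy against unknown-writer recall, which is exactly the tension the open-set track measures.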

What carries the argument

The CircleID dataset and competition framework: 46,155 annotated circle images used for open-set writer recognition and cross-writer pen-type classification, with evaluation on private leaderboards that include unseen writers.

If this is right

  • Writer identification from circles generalizes partially to unknown writers, setting a performance floor for future minimal-trace biometrics.
  • Pen classification achieves high accuracy even across writers, indicating that physical pen properties are extractable independently of drawing style.
  • Large participation validates the dataset as a useful public resource for testing feature disentanglement methods.
  • The impact of out-of-distribution writers highlights challenges in generalization for such tasks.
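Scoring in this open-set setting combines known-writer accuracy with unknown-writer rejection. A sketch of two such metrics, with assumed definitions rather than the competition's official scoring code:

```python
# Sketch of two open-set metrics (assumed definitions, not the official
# scoring code); "unknown" marks out-of-distribution writers.
def unknown_recall(y_true, y_pred):
    # Fraction of truly unknown samples the model rejected as unknown.
    idx = [i for i, t in enumerate(y_true) if t == "unknown"]
    return sum(y_pred[i] == "unknown" for i in idx) / len(idx)

def binary_known_vs_unknown_accuracy(y_true, y_pred):
    # Correct when known/unknown status matches, regardless of which
    # known writer was predicted.
    hits = sum((t == "unknown") == (p == "unknown")
               for t, p in zip(y_true, y_pred))
    return hits / len(y_true)

y_true = ["w01", "w02", "unknown", "unknown", "w03"]
y_pred = ["w01", "w05", "unknown", "w02", "unknown"]
print(unknown_recall(y_true, y_pred))                    # 1 of 2 -> 0.5
print(binary_known_vs_unknown_accuracy(y_true, y_pred))  # 3 of 5 -> 0.6
```

Note that the binary metric counts a wrong-but-known prediction (w05 for w02) as correct, which is why it can sit well above Top-1 accuracy.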

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

  • Similar minimal drawings could be used in forensic applications to link documents to specific pens or authors without full handwriting samples.
  • Extending the approach to other simple shapes might improve robustness or allow multi-task learning for identity and tool identification.
  • If the accuracies improve with better models, it could lead to low-effort authentication systems based on quick circle sketches.
  • Analyzing the failure cases on unknown writers may reveal what aspects of style are most distinctive.

Load-bearing premise

Biometric writer characteristics and physical pen features naturally entangle within minimal, static traces of hand-drawn circles in a way that permits meaningful disentanglement and generalization to unseen writers.

What would settle it

If the top-performing models on the private leaderboard for writer identification fail to exceed random guessing levels when evaluated on the unknown writers, the claim of meaningful disentanglement would be falsified.
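For scale, a uniform guesser over the 50 known identities plus an "unknown" option scores roughly 2 percent Top-1, and 1-in-8 for pens. A quick back-of-envelope, assuming uniform class priors (the evaluation set's actual priors may differ):

```python
# Back-of-envelope chance baselines for the two tasks, assuming a
# uniform guesser over the label set (evaluation-set priors may differ).
known_writers = 50
writer_labels = known_writers + 1  # 50 known identities + one "unknown" class
pens = 8

writer_chance = 1 / writer_labels
pen_chance = 1 / pens

print(f"writer ID chance: {writer_chance:.2%} vs 64.801% reported")
print(f"pen class chance: {pen_chance:.2%} vs 92.726% reported")
```

Both reported results clear their chance levels by a wide margin, so the falsification test above is about unknown-writer subsets specifically, not overall accuracy.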

Figures

Figures reproduced from arXiv: 2605.07816 by Fei Wu, Janne van der Loop, Linda-Sophie Schneider, Lukas Hüttner, Mathias Seuret, Thomas Gorges, Vincent Christlein.

Figure 1: Examples of morphological and textural variations in hand-drawn circles from randomly selected writers across the eight pens, all using black ink. The dataset consists of a newly assembled set of hand-drawn circles, drawn under controlled conditions. Data collection involved a cohort of 66 healthy adults from diverse cultural backgrounds, representing various age and gender groups. Participants received st…

Figure 2: Overview of the CircleID dataset splits and evaluation set distributions. (a) Sizes of the train set, additional train set, and evaluation set. The additional train set is subdivided into samples with known and unknown writer annotations. (b) Sizes of the public and private evaluation sets, further broken down by Part A and Part B. (c) Distribution of samples across the eight pens in the evaluation set, sh…

Figure 3: Performance analysis of the top-3 teams on different subsets of the private leaderboard data. (a) Writer identification task. Accuracies are Top-1 accuracies; F1 scores are reported for all categories and macro-F1 scores for known categories. Recall and F1 for unknown writers are recall and binary F1 for unknown-writer detection on the unknown-writer subset. (b) Pen classification task. Top-1 accuracies and macr…

Figure 4: Rank-shift analysis between the public and private leaderboards. (a) Writer identification task. (b) Pen classification task. Both models were initialized from ImageNet pretraining to support fine-grained texture modeling. Training was carried out in multiple stages. In the first stage, the models were optimized on labeled data using cross-entropy loss and label smoothing, with strong augmentations includi…

Figure 5: Empirical cumulative distribution functions of Top-1 accuracy for both tasks, shown separately for the public and private leaderboards. (a) Writer identification leaderboard. (b) Pen classification leaderboard. The Spearman rank correlations are high for both tasks, with ρ = 0.939 for writer identification and ρ = 0.938 for pen classification. This indicates that the rankings are rel…

Figure 6: Writer identification top-30 team performance on the private leaderboard evaluation set. (a) Violin plots showing the distributions of all-writer Top-1 accuracy, known-writer Top-1 accuracy, unknown-writer recall, and binary known vs. unknown accuracy. In this binary setting, a prediction counts as correct when the model correctly distinguishes between any known writer and the unknown class, ind…

Figure 7: Pen classification top-30 team performance on the private leaderboard evaluation set. (a) Per-pen F1 distribution for pens 1, 2, 4, 5, and 6, shown separately for all, known, and unknown writers. (b) Per-pen F1 distribution for pens 3, 7, and 8 under the same subset split. (c) Top-1 accuracy distribution of teams across all, known, and unknown writer subsets. Pens 1, 2, 4, 5, and 6 consistently with strong…

Figure 8: Samples from the private leaderboard set that were most frequently misclassified by the top-30 teams in the pen classification task. Ground truth of all samples is shown. Pen classification proved easier, while writer identification from isolated circles was considerably more difficult. This difference suggests that pen-specific stroke texture and material properties can be captured more readi…
read the original abstract

This paper presents CircleID, a large-scale ICDAR 2026 competition on writer identification and pen classification from scanned hand-drawn circles. The primary objective is to investigate how biometric writer characteristics and physical pen features naturally entangle within minimal, static traces. CircleID comprises two distinct tasks: (1) open-set writer identification, requiring models to recognize known writers while explicitly rejecting unknown ones, and (2) cross-writer pen classification, evaluated across both seen and unseen writers. Participants were provided with a new, controlled dataset of 46,155 tightly cropped circle images, digitized at 400 DPI and annotated for writer identity and pen type. The dataset comprises samples from 50 known and 16 unknown writers using eight different pens. Hosted on Kaggle as two separate tracks with public and private leaderboards, the competition provided participants with a ResNet baseline. In total, 389 teams (436 participants) made 3,185 submissions for the pen classification task, and 113 teams (141 participants) made 1,737 submissions for the writer identification track. The best-performing private leaderboard submissions achieved a Top-1 accuracy of 64.801% for writer identification and 92.726% for pen classification. This paper details the dataset, evaluates the winning methodologies, and analyzes the impact of out-of-distribution writers on model generalization and feature disentanglement. In this large-scale competition, CircleID establishes a new baseline for minimal-trace analysis.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 3 minor

Summary. This paper presents the CircleID competition as part of ICDAR 2026, aimed at writer identification and pen classification from scanned hand-drawn circles. It describes a new dataset with 46,155 images from 50 known and 16 unknown writers using 8 pens. The two tasks are open-set writer identification and cross-writer pen classification. The competition saw 389 teams with 3,185 submissions for pen classification and 113 teams with 1,737 submissions for writer identification. Top private leaderboard accuracies are 64.801% for writer identification and 92.726% for pen classification. A ResNet baseline is provided, and the paper evaluates methodologies and analyzes OOD writer impacts.

Significance. If the reported results hold, the paper makes a significant contribution by creating a large-scale benchmark for minimal-trace biometric analysis. The high number of participants and submissions, along with public and private leaderboards, provides solid empirical grounding. The authors are credited for releasing a controlled dataset, a baseline model, and conducting OOD analysis, which supports reproducibility and future work on disentangling writer and pen features.

major comments (2)
  1. [§3] The description of the dataset collection lacks specific details on controls such as writer instructions, scanning consistency, and annotation validation procedures. This information is load-bearing for interpreting the generalization performance to the 16 unknown writers and the overall validity of the benchmark.
  2. [§6] The analysis of the impact of out-of-distribution writers on model generalization does not include statistical tests to confirm the significance of observed differences in accuracy, which weakens the claims about feature disentanglement.
minor comments (3)
  1. The abstract mentions 'tightly cropped' images but does not detail the cropping algorithm or criteria used.
  2. It would be helpful to include the exact hyperparameters and training details for the provided ResNet baseline to ensure full reproducibility.
  3. Clarify if the participant numbers (436 for pen, 141 for writer) account for overlaps between the two tracks.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the positive assessment, the recommendation to accept, and the constructive comments on dataset documentation and statistical rigor. We address each major comment below and have revised the manuscript accordingly.

read point-by-point responses
  1. Referee: [§3] The description of the dataset collection lacks specific details on controls such as writer instructions, scanning consistency, and annotation validation procedures. This information is load-bearing for interpreting the generalization performance to the 16 unknown writers and the overall validity of the benchmark.

    Authors: We agree that additional protocol details strengthen the benchmark's interpretability. In the revised manuscript we have expanded §3 with a dedicated subsection on data collection controls. This now specifies the exact writer instructions (freehand circles of ~5 cm diameter, single continuous stroke, no retracing or lifting the pen), the scanning procedure (single Epson scanner model, fixed 400 DPI, identical brightness/contrast settings, no post-processing), and the annotation validation process (independent labeling by two annotators with a third resolving conflicts, yielding 99.2% inter-annotator agreement). These additions directly support claims about generalization to the 16 unknown writers. revision: yes

  2. Referee: [§6] The analysis of the impact of out-of-distribution writers on model generalization does not include statistical tests to confirm the significance of observed differences in accuracy, which weakens the claims about feature disentanglement.

    Authors: We concur that formal statistical testing would reinforce the §6 analysis. The revised version now includes Wilcoxon signed-rank tests comparing accuracy distributions obtained with and without the OOD writers. The tests yield p < 0.01 for the reported accuracy drops, confirming that the observed differences are statistically significant and thereby strengthening the discussion of writer-pen feature disentanglement. revision: yes
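The Wilcoxon signed-rank test proposed here pairs each team's accuracy with and without the OOD writers. An illustrative run with made-up per-team accuracies (the paper's actual per-team values are not reproduced here):

```python
from scipy.stats import wilcoxon

# Illustrative only: made-up per-team Top-1 accuracies with and without
# out-of-distribution writers; a consistent paired drop yields a small p.
acc_with_ood = [0.55, 0.53, 0.51, 0.50, 0.48, 0.46, 0.45, 0.43]
acc_without_ood = [0.64, 0.61, 0.58, 0.56, 0.53, 0.50, 0.48, 0.45]

stat, p_value = wilcoxon(acc_with_ood, acc_without_ood)
print(f"W = {stat}, p = {p_value:.4f}")
```

With eight pairs all moving in the same direction, the exact two-sided p-value is 2/2⁸ ≈ 0.0078; the test is nonparametric, so it suits accuracy distributions with no normality guarantee.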

Circularity Check

0 steps flagged

No significant circularity

full rationale

The paper is a competition report describing a new dataset, two tasks, a ResNet baseline, and aggregated participant results on public/private leaderboards. No derivation chain, equations, fitted parameters, or predictions are present; all reported accuracies (e.g., 64.801% writer ID, 92.726% pen classification) are direct empirical outcomes from more than 4,900 external submissions (3,185 for pen classification, 1,737 for writer identification) on held-out data. No self-citations are load-bearing for any claim, and the central contribution is descriptive benchmarking rather than any constructed or renamed result.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The work rests on the domain assumption that circles contain separable writer and pen signals; no free parameters or invented entities are introduced.

axioms (1)
  • domain assumption Hand-drawn circles contain entangled biometric writer characteristics and physical pen features that can be analyzed separately.
    Stated as the primary objective in the abstract.

pith-pipeline@v0.9.0 · 5587 in / 1084 out tokens · 28470 ms · 2026-05-11T02:06:31.095546+00:00 · methodology

discussion (0)


Reference graph

Works this paper leans on

23 extracted references · 23 canonical work pages · 4 internal anchors

  1. Bensefia, A., Paquet, T.: Writer verification based on a single handwriting word samples. EURASIP Journal on Image and Video Processing 2016(1), 34 (2016)
  2. Boudraa, M., Bennour, A., Nahas, M., Marie, R.R., Al-Sarem, M.: Historical manuscripts analysis: A deep learning system for writer identification using intelligent feature selection with vision transformers. Journal of Imaging 11(6), 204 (2025)
  3. Braz, A., López-López, M., García-Ruiz, C.: Raman spectroscopy for forensic analysis of inks in questioned documents. Forensic Science International 232(1), 206–212 (2013)
  4. Bulacu, M., Schomaker, L.: Text-independent writer identification and verification using textural and allographic features. IEEE Transactions on Pattern Analysis and Machine Intelligence 29(4), 701–717 (2007)
  5. Deng, J., Guo, J., Xue, N., Zafeiriou, S.: ArcFace: Additive Angular Margin Loss for Deep Face Recognition. In: 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 4685–4694 (2019)
  6. Evangelidis, G.D., Psarakis, E.Z.: Parametric image alignment using enhanced correlation coefficient maximization. IEEE Transactions on Pattern Analysis and Machine Intelligence 30(10), 1858–1865 (2008)
  7. Fischler, M.A., Bolles, R.C.: Random sample consensus: a paradigm for model fitting with applications to image analysis and automated cartography. Communications of the ACM 24(6), 381–395 (1981)
  8. Geng, C., Huang, S.J., Chen, S.: Recent advances in open set recognition: A survey. IEEE Transactions on Pattern Analysis and Machine Intelligence 43(10), 3614–3631 (2020)
  9. Hafemann, L.G., Sabourin, R., Oliveira, L.S.: Offline handwritten signature verification—Literature review. In: 2017 Seventh International Conference on Image Processing Theory, Tools and Applications (IPTA), pp. 1–8. IEEE (2017)
  10. He, K., Zhang, X., Ren, S., Sun, J.: Deep Residual Learning for Image Recognition (2015), https://arxiv.org/abs/1512.03385
  11. Liu, Z., Mao, H., Wu, C.Y., Feichtenhofer, C., Darrell, T., Xie, S.: A ConvNet for the 2020s. In: 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 11966–11976 (2022)
  12. Loshchilov, I., Hutter, F.: Decoupled Weight Decay Regularization (2019), https://arxiv.org/abs/1711.05101
  13. Lowe, D.G.: Object recognition from local scale-invariant features. In: Proceedings of the Seventh IEEE International Conference on Computer Vision, vol. 2, pp. 1150–1157. IEEE (1999)
  14. Malisiewicz, T., Gupta, A., Efros, A.A.: Ensemble of exemplar-SVMs for object detection and beyond. In: 2011 International Conference on Computer Vision, pp. 89–96. IEEE (2011)
  15. Murray, N., Perronnin, F.: Generalized Max Pooling. In: 2014 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 2473–2480. IEEE Computer Society (2014)
  16. Russakovsky, O., Deng, J., Su, H., Krause, J., Satheesh, S., Ma, S., Huang, Z., Karpathy, A., Khosla, A., Bernstein, M., et al.: ImageNet large scale visual recognition challenge. International Journal of Computer Vision 115(3), 211–252 (2015)
  17. Siméoni, O., Vo, H.V., Seitzer, M., Baldassarre, F., Oquab, M., Jose, C., Khalidov, V., Szafraniec, M., Yi, S., Ramamonjisoa, M., Massa, F., Haziza, D., Wehrstedt, L., Wang, J., Darcet, T., Moutakanni, T., Sentana, L., Roberts, C., Vedaldi, A., Tolan, J., Brandt, J., Couprie, C., Mairal, J., Jégou, H., Labatut, P., Bojanowski, P.: DINOv3 (2025), https://a...
  18. Sun, J., Dong, Q.: A Survey on Open-Set Image Recognition (2023), https://arxiv.org/abs/2312.15571
  19. Tan, M., Le, Q.: EfficientNetV2: Smaller models and faster training. In: International Conference on Machine Learning, pp. 10096–10106. PMLR (2021)
  20. Wang, X., Chen, H., Tang, S., Wu, Z., Zhu, W.: Disentangled representation learning. IEEE Transactions on Pattern Analysis and Machine Intelligence 46(12), 9677–9696 (2024)
  21. Wang, X., Han, X., Huang, W., Dong, D., Scott, M.R.: Multi-Similarity Loss With General Pair Weighting for Deep Metric Learning. In: 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 5017–5025 (2019)
  22. Yun, S., Han, D., Oh, S.J., Chun, S., Choe, J., Yoo, Y.: CutMix: Regularization strategy to train strong classifiers with localizable features. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 6023–6032 (2019)
  23. Zhang, H., Cisse, M., Dauphin, Y.N., Lopez-Paz, D.: mixup: Beyond Empirical Risk Minimization (2018), https://arxiv.org/abs/1710.09412