BADAM: A Public Dataset for Baseline Detection in Arabic-script Manuscripts

Benjamin Kiessling; Daniel Stökl Ben Ezra; Matthew Thomas Miller

arxiv: 1907.04041 · v1 · pith:2QHVSE73new · submitted 2019-07-09 · 💻 cs.CV

Pith reviewed 2026-05-25 00:46 UTC · model grok-4.3

classification 💻 cs.CV

keywords baseline detection · Arabic manuscripts · document layout analysis · handwritten text recognition · public dataset · convolutional neural network · text line extraction

The pith

BADAM supplies 400 annotated Arabic manuscript images to train baseline detectors for text line extraction.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces a public dataset called BADAM consisting of 400 annotated document images drawn from varied domains and historical periods in Arabic script. It notes that progress on layout analysis for non-Latin scripts has been limited by the absence of established datasets, which in turn slows handwritten text recognition on historical works. The authors also describe a fully convolutional encoder-decoder network designed to pull out text lines of arbitrary shape from these manuscripts. Accurate baseline detection serves as a prerequisite step for recognition systems, so the new resource is positioned to support method development in this area.

Core claim

We present a dataset of 400 annotated document images from different domains and time periods. A short elaboration on the particular challenges posed by handwriting in Arabic script for layout analysis and subsequent processing steps is given. Lastly, we propose a method based on a fully convolutional encoder-decoder network to extract arbitrarily shaped text line images from manuscripts.

What carries the argument

The BADAM dataset of 400 annotated Arabic-script manuscript images together with a fully convolutional encoder-decoder network for extracting text lines.

If this is right

Baseline detection systems can now be trained and evaluated on a public Arabic-script resource instead of private collections.
Layout analysis pipelines for historical documents gain a concrete starting point for handling non-Latin scripts.
The encoder-decoder architecture provides one concrete technique that future work can compare against when processing arbitrarily shaped lines.
Researchers gain a shared benchmark that makes incremental improvements in Arabic manuscript processing measurable.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

If the dataset proves representative, similar annotation efforts could be applied to other under-resourced scripts to accelerate recognition research.
Performance gaps on new manuscripts would indicate the need for additional diversity in future releases of the dataset.
The network architecture could be adapted to related tasks such as word segmentation once line images are reliably extracted.

Load-bearing premise

The 400 selected images capture enough of the variation in Arabic handwriting across domains and eras for models trained on them to work on unseen manuscripts.

What would settle it

Train the proposed network on the BADAM training split and measure its text-line extraction accuracy on a fresh collection of Arabic manuscripts drawn from periods or domains absent from the 400-image set.
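
A minimal sketch of the held-out protocol described above, assuming each page carries a label for its domain and period. The `group` field, the page identifiers, and the group names are hypothetical; the text above does not describe BADAM's metadata.

```python
def leave_group_out(pages, held_out):
    """Split page records so that no held-out group appears in the training set."""
    train, test = [], []
    for page in pages:
        if page["group"] in held_out:
            test.append(page)
        else:
            train.append(page)
    return train, test


# Hypothetical page records: ids and group labels are invented for illustration.
pages = [
    {"id": "p001", "group": "legal-17c"},
    {"id": "p002", "group": "legal-17c"},
    {"id": "p003", "group": "poetry-15c"},
    {"id": "p004", "group": "science-13c"},
]

train, test = leave_group_out(pages, held_out={"science-13c"})
print([p["id"] for p in train])  # pages used for training
print([p["id"] for p in test])   # pages from the unseen group
```

Splitting by group rather than by page keeps pages from the same source out of both sets at once, which is the condition the test above depends on.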

Figures

Figures reproduced from arXiv: 1907.04041 by Benjamin Kiessling, Daniel Stökl Ben Ezra, Matthew Thomas Miller.

**Figure 1.** Aspects of Arabic-script handwriting. While many Arabic handwritten texts present only a single baseline per logical text line, a large number of documents, especially calligraphic works in Thuluth and Nastaliq style, display per-word slanted baselines (Fig. 1b), multiple baseline levels, and dislocation of fragments into the margins or above other text in the line (heaping) (Fig. 1c and 1a). Most of these…

**Figure 2.** Examples of annotation guideline application (baseline in…

**Figure 3.** Architecture of the baseline labelling network. Dropout and batch/group normalization layers are omitted. (beige: convolutional…

**Figure 4.** Four sample pages from the corpus.
Adjacent paper text captured with this caption: The backbone model consists of the first 3 blocks of a 34-layer ResNet in the contracting path followed by four 3×3 convolution-transposed convolution blocks in the expanding path, with group normalization [25] (G = 32) and dropout (p = 0.1) employed after each layer and block respectively. A final 1×1 convolutional layer reduces the dimensionality of the input-sized 64-channel…
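
The text captured with Figure 4 mentions group normalization with G = 32. As a rough illustration of what that operation computes, the sketch below normalizes one sample per channel group in plain Python. It is not the authors' code: the toy input uses 4 channels in 2 groups so the numbers stay readable, and the learned per-channel scale and shift that implementations usually add are left out.

```python
import math


def group_norm(x, num_groups, eps=1e-5):
    """Normalize one sample x[channel][row][col] within each channel group.

    Simplified sketch: no learned scale or shift is applied.
    """
    channels = len(x)
    if channels % num_groups != 0:
        raise ValueError("channel count must be divisible by num_groups")
    per_group = channels // num_groups
    out = []
    for g in range(num_groups):
        group = x[g * per_group:(g + 1) * per_group]
        # Statistics are shared by all channels and pixels of the group.
        values = [v for channel in group for row in channel for v in row]
        mean = sum(values) / len(values)
        var = sum((v - mean) ** 2 for v in values) / len(values)
        scale = 1.0 / math.sqrt(var + eps)
        for channel in group:
            out.append([[(v - mean) * scale for v in row] for row in channel])
    return out


# Toy input (illustrative only): 4 channels of 1x2 pixels, split into 2 groups.
x = [[[1.0, 3.0]], [[5.0, 7.0]], [[10.0, 10.0]], [[20.0, 20.0]]]
y = group_norm(x, num_groups=2)
```

Because the statistics come from a single sample, the result does not depend on batch size, which is one common reason to prefer it over batch normalization when large page images force small batches.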

Original abstract

The application of handwritten text recognition to historical works is highly dependent on accurate text line retrieval. A number of systems utilizing a robust baseline detection paradigm have emerged recently, but the advancement of layout analysis methods for challenging scripts is held back by the lack of well-established datasets including works in non-Latin scripts. We present a dataset of 400 annotated document images from different domains and time periods. A short elaboration on the particular challenges posed by handwriting in Arabic script for layout analysis and subsequent processing steps is given. Lastly, we propose a method based on a fully convolutional encoder-decoder network to extract arbitrarily shaped text line images from manuscripts.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

The paper ships a new public dataset of 400 Arabic-script manuscript images that fills a stated gap, but gives almost no numbers on coverage or method performance.

The main thing here is a dataset release. They put out 400 annotated images from Arabic manuscripts across domains and periods, plus a short note on script-specific layout issues and a standard fully convolutional encoder-decoder baseline. That is the concrete addition: prior work on baseline detection has mostly stayed with Latin scripts, so a public Arabic set is a practical step for people who need it. The abstract is clear that this is the point, and the claim of addressing a missing resource holds up on its own terms. No circular math or invented predictions to worry about. The method description is brief and uses an off-the-shelf architecture, which is fine for a dataset paper.

The soft spot is the lack of supporting detail. There are no tables, no error rates, no breakdown of the 400 images by century, script style, or degradation type, and no hold-out test that would show whether models trained on this set actually generalize. The representativeness claim rests on the phrase “different domains and time periods” without counts or examples, so a reader cannot yet judge how broad the coverage really is. That is the main limitation in the current text.

This is for researchers in historical document analysis who work with non-Latin scripts and need training data or a starting baseline. It is not aimed at general computer vision. The work shows straightforward engagement with the literature on layout analysis and does not overclaim. It deserves a serious referee because dataset releases in under-served scripts are worth checking for annotation quality and release details, even if the paper will need more evaluation numbers to be fully useful.

Referee Report

2 major / 1 minor

Summary. The paper introduces the BADAM dataset of 400 annotated Arabic-script manuscript images drawn from different domains and time periods, briefly discusses challenges specific to Arabic handwriting for layout analysis, and proposes a fully convolutional encoder-decoder network for extracting arbitrarily shaped text lines.

Significance. A well-curated public dataset for baseline detection in Arabic manuscripts would address a documented scarcity of resources for non-Latin scripts and could support reproducible progress in historical document analysis. The proposed network architecture is a standard choice whose utility would be strengthened by empirical validation on the released data.

major comments (2)

[Abstract] The statement that the 400 images come 'from different domains and time periods' supplies no quantitative breakdown (counts per century, per script subtype, or per degradation class), which is load-bearing for the claim that models trained on BADAM will generalize to unseen manuscripts.
[Abstract / Method section] The fully convolutional encoder-decoder is described only at a high level, with no training protocol, loss function, hyper-parameters, or quantitative results (precision, recall, or pixel-level metrics) on the 400 images, leaving unsupported the assertion that the method handles Arabic-specific challenges.
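
The breakdown requested in the first major comment could be produced with a few lines of tabulation. The sketch below assumes one metadata record per image; the field names and every value are invented for illustration and do not describe BADAM's actual contents.

```python
from collections import Counter


def breakdown(records, field):
    """Count records per value of one metadata field, sorted by value."""
    counts = Counter(record[field] for record in records)
    return sorted(counts.items())


# Hypothetical metadata: fields and values are made up for this example.
records = [
    {"century": 13, "style": "naskh"},
    {"century": 13, "style": "thuluth"},
    {"century": 17, "style": "nastaliq"},
    {"century": 17, "style": "naskh"},
    {"century": 17, "style": "naskh"},
]

by_century = breakdown(records, "century")
by_style = breakdown(records, "style")
print(by_century)
print(by_style)
```

A table built this way would let a reader see at a glance which centuries or styles are thinly covered.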

minor comments (1)

The abstract would be clearer if it stated the annotation format (e.g., polygon coordinates, pixel masks) and the tool or protocol used to produce the ground truth.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive comments. We address each major point below and will revise the manuscript to incorporate the suggested improvements.

Referee: [Abstract] The statement that the 400 images come 'from different domains and time periods' supplies no quantitative breakdown (counts per century, per script subtype, or per degradation class), which is load-bearing for the claim that models trained on BADAM will generalize to unseen manuscripts.

Authors: We agree that a quantitative breakdown is necessary to substantiate the generalization claim. While the manuscript provides qualitative descriptions of the sources, we will add a table or dedicated subsection in the revised version detailing the distribution of the 400 images by century, script subtype, and degradation class.
Revision: yes

Referee: [Abstract / Method section] The fully convolutional encoder-decoder is described only at a high level, with no training protocol, loss function, hyper-parameters, or quantitative results (precision, recall, or pixel-level metrics) on the 400 images, leaving unsupported the assertion that the method handles Arabic-specific challenges.

Authors: The primary contribution is the BADAM dataset; the FCN serves as an illustrative baseline. We acknowledge that the current description lacks sufficient detail to support claims about handling Arabic-specific challenges. In the revision we will expand the method section with the training protocol, loss function, hyperparameters, and report quantitative metrics (e.g., precision, recall, pixel-level IoU) evaluated on the released data.
Revision: yes
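
For reference, the pixel-level metrics named in this exchange can be computed from a predicted and a ground-truth binary mask as sketched below. The masks are toy data, and this is a generic formulation rather than the paper's evaluation protocol. Baseline benchmarks often score with a tolerance around each line instead of strict pixel overlap, which is not modeled here.

```python
def pixel_metrics(pred, truth):
    """Return (precision, recall, iou) for two equally sized binary masks.

    Masks are lists of rows holding 0 (background) or 1 (baseline pixel).
    """
    tp = fp = fn = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                tp += 1   # predicted and annotated
            elif p:
                fp += 1   # predicted but not annotated
            elif t:
                fn += 1   # annotated but missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    iou = tp / (tp + fp + fn) if tp + fp + fn else 0.0
    return precision, recall, iou


# Toy masks: one annotated 4-pixel line; the prediction misplaces one pixel.
truth = [[0, 0, 0, 0],
         [1, 1, 1, 1],
         [0, 0, 0, 0]]
pred = [[0, 0, 0, 1],
        [1, 1, 1, 0],
        [0, 0, 0, 0]]

precision, recall, iou = pixel_metrics(pred, truth)
print(precision, recall, iou)  # 0.75 0.75 0.6
```

The example shows why strict overlap is harsh on thin structures: a single pixel shifted by one row costs both a false positive and a false negative.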

Circularity Check

0 steps flagged

No circularity: dataset release with independent method description

The paper presents a new public dataset of 400 images and describes a fully convolutional encoder-decoder network for baseline detection. No equations, parameter fits, predictions, or self-citations appear in the provided text that reduce any claimed result to the inputs by construction. The contribution is self-contained as a data release plus high-level architecture outline; representativeness claims are empirical assumptions, not circular derivations.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

The central claim rests on the creation and utility of a new annotated image collection plus the applicability of an existing fully convolutional architecture; no free parameters, domain-specific axioms, or new entities are introduced in the abstract.

pith-pipeline@v0.9.0 · 5633 in / 1055 out tokens · 40357 ms · 2026-05-25T00:46:38.613897+00:00 · methodology

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Foundation/RealityFromDistinction.lean · reality_from_one_distinction
Relation between the paper passage and the cited Recognition theorem: unclear
Paper passage: "We present a dataset of 400 annotated document images... propose a method based on a fully convolutional encoder-decoder network"

IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel
Relation between the paper passage and the cited Recognition theorem: unclear
Paper passage: "U-Net architecture... ResNet blocks... group normalization"

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

27 extracted references · 27 canonical work pages · 2 internal anchors

[1] Apostolos Antonacopoulos, Christian Clausner, Christos Papadopoulos, and Stefan Pletschacher. 2011. Historical document layout analysis competition. In Document Analysis and Recognition (ICDAR), 2011 11th International Conference on. IEEE, 1516–1520.

[2] Apostolos Antonacopoulos, Christian Clausner, Christos Papadopoulos, and Stefan Pletschacher. 2013. ICDAR 2013 competition on historical newspaper layout analysis (HNLA 2013). In Document Analysis and Recognition (ICDAR), 2013 12th International Conference on. IEEE, 1454–1458.

[3] Apostolos Antonacopoulos, Christian Clausner, Christos Papadopoulos, and Stefan Pletschacher. 2015. ICDAR2015 competition on recognition of documents with complex layouts-RDCL2015. In Document Analysis and Recognition (ICDAR), 2015 13th International Conference on. IEEE, 1151–1155.

[4] Apostolos Antonacopoulos, Stefan Pletschacher, David Bridson, and Christos Papadopoulos. 2009. ICDAR 2009 page segmentation competition. In Document Analysis and Recognition, 2009. ICDAR’09. 10th International Conference on. IEEE, 1370–1374.

[5] Berat Barakat, Ahmad Droby, Majeed Kassis, and Jihad El-Sana. 2018. Text Line Segmentation for Challenging Handwritten Document Images using Fully Convolutional Network. In 2018 16th International Conference on Frontiers in Handwriting Recognition (ICFHR). IEEE, 374–379.

[6] Jean-Christophe Burie, Mickaël Coustaty, Setiawan Hadi, Made Windu Antara Kesiman, Jean-Marc Ogier, Erick Paulus, Kimheng Sok, I Made Gede Sunarya, and Dona Valy. 2016. ICFHR2016 competition on the analysis of handwritten text in images of balinese palm leaf manuscripts. In Frontiers in Handwriting Recognition (ICFHR), 2016 15th International Conference o...

[7] Christian Clausner, Apostolos Antonacopoulos, Nora Mcgregor, and Daniel Wilson-Nunn. 2018. ICFHR 2018 Competition on Recognition of Historical Arabic Scientific Manuscripts–RASM2018. In 2018 16th International Conference on Frontiers in Handwriting Recognition (ICFHR). IEEE, 471–476.

[8] Markus Diem, Florian Kleber, Stefan Fiel, Tobias Grüning, and Basilis Gatos. 2017. cBAD: ICDAR2017 competition on baseline detection. In Document Analysis and Recognition (ICDAR), 2017 14th IAPR International Conference on, Vol. 1. IEEE, 1355–1360.

[9] David H Douglas and Thomas K Peucker. 1973. Algorithms for the reduction of the number of points required to represent a digitized line or its caricature. Cartographica: The International Journal for Geographic Information and Geovisualization 10, 2 (1973), 112–122.

[10] Michael Fink, Thomas Layer, Georg Mackenbrock, and Michael Sprinzl. 2018. Baseline Detection in Historical Documents using Convolutional U-Nets. In 2018 13th IAPR International Workshop on Document Analysis Systems (DAS). IEEE, 37–42.

[11] Andreas Fischer, Volkmar Frinken, Alicia Fornés, and Horst Bunke. 2011. Transcription alignment of Latin manuscripts using hidden Markov models. In Proceedings of the 2011 Workshop on Historical Document Imaging and Processing. ACM, 29–36.

[12] Basilis Gatos, Nikolaos Stamatopoulos, and Georgios Louloudis. 2010. ICFHR 2010 handwriting segmentation contest. In 2010 11th International Conference on Frontiers in Handwriting Recognition (ICFHR). IEEE, 737–742.

[13] Basilios Gatos, Nikolaos Stamatopoulos, and Georgios Louloudis. 2011. ICDAR2009 handwriting segmentation contest. International Journal on Document Analysis and Recognition (IJDAR) 14, 1 (2011), 25–33.

[14] Tobias Grüning, Roger Labahn, Markus Diem, Florian Kleber, and Stefan Fiel. 2018. READ-BAD: A new dataset and evaluation scheme for baseline detection in archival documents. In 2018 13th IAPR International Workshop on Document Analysis Systems (DAS). IEEE, 351–356.

[15] Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. 2015. Delving deep into rectifiers: Surpassing human-level performance on imagenet classification. In Proceedings of the IEEE international conference on computer vision. 1026–1034.

[16] Majeed Kassis, Alaa Abdalhaleem, Ahmad Droby, Reem Alaasam, and Jihad El-Sana. 2017. VML-HD: The historical Arabic documents dataset for recognition systems. In Arabic Script Analysis and Recognition (ASAR), 2017 1st International Workshop on. IEEE, 11–14.

[17] Ta-Chih Lee, Rangasami L Kashyap, and Chong-Nam Chu. 1994. Building skeleton models via 3-D medial surface axis thinning algorithms. CVGIP: Graphical Models and Image Processing 56, 6 (1994), 462–478.

[18] Jonathan Long, Evan Shelhamer, and Trevor Darrell. 2015. Fully convolutional networks for semantic segmentation. In Proceedings of the IEEE conference on computer vision and pattern recognition. 3431–3440.

[19] Michael Murdock, Shawn Reid, Blaine Hamilton, and Jackson Reese. 2015. ICDAR 2015 competition on text line detection in historical documents. In Document Analysis and Recognition (ICDAR), 2015 13th International Conference on. IEEE, 1171–1175.

[20] Lorenzo Quirós. 2018. Multi-Task Handwritten Document Layout Analysis. arXiv preprint arXiv:1806.08852 (2018).

[21] Veronica Romero, Joan Andreu Sanchez, Vicente Bosch, Katrien Depuydt, and Jesse de Does. 2015. Influence of text line segmentation in handwritten text recognition. In Document Analysis and Recognition (ICDAR), 2015 13th International Conference on. IEEE, 536–540.

[22] Olaf Ronneberger, Philipp Fischer, and Thomas Brox. 2015. U-net: Convolutional networks for biomedical image segmentation. In International Conference on Medical image computing and computer-assisted intervention. Springer, 234–241.

[23] Jaakko Sauvola and Matti Pietikäinen. 2000. Adaptive document image binarization. Pattern recognition 33, 2 (2000), 225–236.

[24] Foteini Simistira, Mathias Seuret, Nicole Eichenberger, Angelika Garz, Marcus Liwicki, and Rolf Ingold. 2016. DIVA-HisDB: A precisely annotated large dataset of challenging medieval manuscripts. In Frontiers in Handwriting Recognition (ICFHR), 2016 15th International Conference on. IEEE, 471–476.

[25] Yuxin Wu and Kaiming He. 2018. Group Normalization. CoRR abs/1803.08494 (2018). arXiv: 1803.08494 http://arxiv.org/abs/1803.08494
