Full-Page Text Recognition: Learning Where to Start and When to Stop

Bastien Moysset, Christopher Kermorvant, Christian Wolf

classification cs.CV

keywords text, full, localization, page, recognition, lines, method, analysis

Text line detection and localization is a crucial step for full-page document analysis, but it still suffers from the heterogeneity of real-life documents. In this paper, we present a new approach for full-page text recognition. Localization of the text lines is based on regressions with Fully Convolutional Neural Networks and Multidimensional Long Short-Term Memory as contextual layers. To increase the efficiency of this localization method, only the position of the left side of each text line is predicted. The text recognizer is then in charge of predicting the end of the text to recognize. This method has shown good results for full-page text recognition on the highly heterogeneous Maurdor dataset.
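
The division of labor described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the names `LineStart`, `crop_to_page_edge`, `truncate_at_eol`, and the `<eol>` token are hypothetical, and the abstract does not say how the recognizer signals the end of a line. The sketch assumes the localizer returns only the left-side position and height of a line, that the crop runs from that position to the right edge of the page, and that the recognizer emits an end marker after which its output is discarded.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical end-of-line marker; the abstract does not name one.
EOL = "<eol>"


@dataclass
class LineStart:
    """Left side of a text line as a localizer might predict it (assumed layout)."""
    x: int       # column of the left edge, in pixels
    y: int       # row of the top of the line, in pixels
    height: int  # line height, in pixels


def crop_to_page_edge(page: List[List[int]], start: LineStart) -> List[List[int]]:
    """Crop from the predicted left side to the right edge of the page.

    The right boundary is unknown at this stage, so the crop keeps
    everything to the right of the predicted start.
    """
    rows = page[start.y:start.y + start.height]
    return [row[start.x:] for row in rows]


def truncate_at_eol(tokens: List[str]) -> str:
    """Keep recognizer output up to, but not including, the first EOL token."""
    kept = []
    for token in tokens:
        if token == EOL:
            break
        kept.append(token)
    return "".join(kept)


# Toy page: 4 rows by 6 columns of pixel values.
page = [[r * 10 + c for c in range(6)] for r in range(4)]
crop = crop_to_page_edge(page, LineStart(x=2, y=1, height=2))

# Stand-in for recognizer output on that crop; a real recognizer would be a trained model.
text = truncate_at_eol(["a", "b", EOL, "c"])
```

Under these assumptions the localizer never has to estimate the line's right boundary: anything the recognizer reads past the end marker, such as text from a neighboring column, is dropped.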

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Nougat: Neural Optical Understanding for Academic Documents
cs.LG 2023-08 conditional novelty 6.0

Nougat applies a visual transformer to convert academic PDFs into markup language while accurately handling mathematical content on a new scientific document dataset.