DeepText: A Unified Framework for Text Proposal Generation and Text Detection in Natural Images
read the original abstract
In this paper, we develop a novel unified framework called DeepText for text region proposal generation and text detection in natural images via a fully convolutional neural network (CNN). First, we propose the inception region proposal network (Inception-RPN) and design a set of text characteristic prior bounding boxes to achieve high word recall with only hundred level candidate proposals. Next, we present a powerful textdetection network that embeds ambiguous text category (ATC) information and multilevel region-of-interest pooling (MLRP) for text and non-text classification and accurate localization. Finally, we apply an iterative bounding box voting scheme to pursue high recall in a complementary manner and introduce a filtering algorithm to retain the most suitable bounding box, while removing redundant inner and outer boxes for each text instance. Our approach achieves an F-measure of 0.83 and 0.85 on the ICDAR 2011 and 2013 robust text detection benchmarks, outperforming previous state-of-the-art results.
This paper has not been read by Pith yet.
Forward citations
Cited by 1 Pith paper
-
RFBTD: RFB Text Detector
RFBTD applies Receptive Field Blocks to scene text detection for arbitrary orientations and dense text, reporting an F-score of 47.09 on ICDAR2015 at 720p resolution.
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.