pith. sign in

arxiv: 1805.01167 · v2 · pith:EN5P7SK2new · submitted 2018-05-03 · 💻 cs.CV

IncepText: A New Inception-Text Module with Deformable PSROI Pooling for Multi-Oriented Scene Text Detection

classification 💻 cs.CV
keywords textdetectionscenemulti-orienteddatasetsdeformableicdar2015inceptext
0
0 comments X
read the original abstract

Incidental scene text detection, especially for multi-oriented text regions, is one of the most challenging tasks in many computer vision applications. Different from the common object detection task, scene text often suffers from a large variance of aspect ratio, scale, and orientation. To solve this problem, we propose a novel end-to-end scene text detector IncepText from an instance-aware segmentation perspective. We design a novel Inception-Text module and introduce deformable PSROI pooling to deal with multi-oriented text detection. Extensive experiments on ICDAR2015, RCTW-17, and MSRA-TD500 datasets demonstrate our method's superiority in terms of both effectiveness and efficiency. Our proposed method achieves 1st place result on ICDAR2015 challenge and the state-of-the-art performance on other datasets. Moreover, we have released our implementation as an OCR product which is available for public access.

This paper has not been read by Pith yet.

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Document Parsing Unveiled: Techniques, Challenges, and Prospects for Structured Information Extraction

    cs.MM 2024-10 unverdicted novelty 3.0

    Survey proposing a taxonomy for document parsing into pipeline-based systems and VLM-driven unified models, reviewing components, metrics, benchmarks, and challenges.