PGNet: Real-time Arbitrarily-Shaped Text Spotting with Point Gathering Network

Chengquan Zhang; Errui Ding; Fei Qi; Guangming Shi; Jingtuo Liu; Junyu Han; Pengfei Wang; Pengyuan Lyu; Shanshan Liu; Xiaoqiang Zhang

arxiv: 2104.05458 · v1 · pith:FGQH3BOGnew · submitted 2021-04-12 · 💻 cs.CV

keywords text · arbitrarily-shaped · character · pgnet · proposed · annotations · character-level · classification

Reading arbitrarily-shaped text has received increasing research attention. However, existing text spotters are mostly built on two-stage frameworks or character-based methods, which are burdened by Non-Maximum Suppression (NMS) and Region-of-Interest (RoI) operations or by the need for character-level annotations. To address these problems, we propose a novel fully convolutional Point Gathering Network (PGNet) for reading arbitrarily-shaped text in real time. PGNet is a single-shot text spotter in which the pixel-level character classification map is learned with the proposed PG-CTC loss, avoiding the need for character-level annotations. With a PG-CTC decoder, we gather high-level character classification vectors from two-dimensional space and decode them into text symbols without NMS or RoI operations, which keeps inference efficient. Additionally, a graph refinement module (GRM) reasons about the relations between each character and its neighbors to refine the coarse recognition and improve end-to-end performance. Experiments show that the proposed method achieves competitive accuracy while significantly improving running speed. In particular, on Total-Text it runs at 46.7 FPS, surpassing previous spotters by a large margin.
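The point-gathering idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes the ordered points of one text instance are already known, uses a toy score map, and substitutes plain greedy CTC decoding (argmax per point, collapse repeats, drop blanks) for the PG-CTC decoder. The names `gather_points`, `ctc_greedy_decode`, `BLANK`, and the three-symbol alphabet are hypothetical.

```python
BLANK = 0  # assumption: class index 0 is the CTC blank symbol


def gather_points(score_map, points):
    """Collect the per-pixel class-score vector at each (row, col) point.

    score_map is an H x W grid of score vectors; points is an ordered
    list of pixel coordinates tracing one text instance.
    """
    return [score_map[r][c] for r, c in points]


def ctc_greedy_decode(vectors, alphabet, blank=BLANK):
    """Greedy CTC decoding: argmax per step, collapse repeats, drop blanks."""
    chars = []
    prev = blank
    for vec in vectors:
        best = max(range(len(vec)), key=vec.__getitem__)
        if best != blank and best != prev:
            chars.append(alphabet[best])
        prev = best
    return "".join(chars)


def one_hot(k, n=3):
    """Toy score vector with all mass on class k."""
    return [1.0 if i == k else 0.0 for i in range(n)]


# Toy alphabet: index 0 is the blank, then two characters.
alphabet = ["-", "h", "i"]

# 3 x 5 score map, blank everywhere except along a curved path.
H, W = 3, 5
score_map = [[one_hot(0) for _ in range(W)] for _ in range(H)]
path = [(2, 0), (1, 1), (0, 2), (1, 3), (2, 4)]  # hypothetical curved text line
for (r, c), k in zip(path, [1, 1, 0, 2, 2]):
    score_map[r][c] = one_hot(k)

decoded = ctc_greedy_decode(gather_points(score_map, path), alphabet)
print(decoded)
```

Because the vectors are read directly from the dense map along the path, the sketch needs no region cropping or box suppression step, which is the property the abstract attributes to the PG-CTC decoder.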

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Do You Need Text Rectification? Soft Attention Mask Embedding for Rectification-Free Scene Text Spotting
cs.CV · 2026-05 · unverdicted · novelty 6.0

SAME-Net adds a differentiable soft attention mask embedding module to achieve rectification-free end-to-end scene text spotting with 84.02% H-mean on Total-Text.