
arxiv: 1711.07245 · v2 · pith:2ZP3TCU6new · submitted 2017-11-20 · 💻 cs.CV

Optical Character Recognition (OCR) for Telugu: Database, Algorithm and Application

classification 💻 cs.CV
keywords: telugu, algorithm, character, database, germanic, github, learning, make

Telugu is a Dravidian language spoken by more than 80 million people worldwide. Optical character recognition (OCR) of the Telugu script has wide-ranging applications, including education, healthcare, and administration. The beautiful Telugu script, however, is very different from the script used to write Germanic languages such as English and German. This makes transfer learning from OCR solutions built for Germanic languages to Telugu a non-trivial task. To address the challenge of OCR for Telugu, we make three contributions in this work: (i) a database of Telugu characters, (ii) a deep-learning-based OCR algorithm, and (iii) a client-server solution for the online deployment of the algorithm. For the benefit of the Telugu people and the research community, we will make our code freely available at https://gayamtrishal.github.io/OCR_Telugu.github.io/
