Embedded Textual Content for Document Image Classification with CNNs
1 Pith paper cites this work. Polarity classification is still indexing.
fields: cs.CV (1)
years: 2019 (1)
verdicts: UNVERDICTED (1)
representative citing papers
Multimodal deep networks for text and image-based document classification
Multimodal deep network using image features and OCR word embeddings boosts document classification accuracy by 3% over image-only baselines on Tobacco3482 and RVL-CDIP with a new QS-OCR dataset.