Nougat applies a visual transformer to convert academic PDFs into markup language while accurately handling mathematical content on a new scientific document dataset.
CoRR abs/2104.07787 (2021), https://arxiv.org/abs/2104.07787
2 Pith papers cite this work. Polarity classification is still indexing.
verdicts: CONDITIONAL 2
representative citing papers
A new dataset and open-source OCR pipeline transcribes medieval English legal manuscripts at up to 88% word accuracy using CNN+LSTM and language model correction.
citing papers explorer
- Nougat: Neural Optical Understanding for Academic Documents
  Nougat applies a visual transformer to convert academic PDFs into markup language while accurately handling mathematical content on a new scientific document dataset.
- Democratizing the medieval English legal tradition
  A new dataset and open-source OCR pipeline transcribes medieval English legal manuscripts at up to 88% word accuracy using CNN+LSTM and language model correction.