Pith Number

pith:UH6KCTBG

pith:2023:UH6KCTBGAPVRTMHOHFR4JRRUZU
not attested · not anchored · not stored · refs resolved

Nougat: Neural Optical Understanding for Academic Documents

Guillem Cucurull, Lukas Blecher, Robert Stojnic, Thomas Scialom

A visual transformer model converts images of scientific document pages into accurate semantic markup.

arxiv:2308.13418 v1 · 2023-08-25 · cs.LG · cs.CV

Add to your LaTeX paper
\usepackage{pith}
\pithnumber{UH6KCTBGAPVRTMHOHFR4JRRUZU}

Prints a linked badge after your title and injects PDF metadata. Compiles on arXiv.

Record completeness

1 Bitcoin timestamp
2 Internet Archive
3 Author claim open
4 Citations open
5 Replications open
Portable graph bundle live
The bundle contains the canonical record plus signed events. A mirror can host it anywhere and recompute the same current state with the deterministic merge algorithm.

Claims

C1 strongest claim

We propose Nougat, a Visual Transformer model that performs an Optical Character Recognition (OCR) task for processing scientific documents into a markup language, and demonstrate the effectiveness of our model on a new dataset of scientific documents.

C2 weakest assumption

That visual processing of page images is sufficient to recover accurate semantic markup for complex layouts and nested mathematical expressions without systematic errors on unseen document styles.

C3 one line summary

Nougat applies a visual transformer to convert academic PDFs into markup language while accurately handling mathematical content on a new scientific document dataset.

References

54 extracted · 54 resolved · 12 Pith anchors

[1] Statistics of the Common Crawl Corpus 2012, June 2013 2012
[2] An Overview of the Tesseract OCR Engine 2007 · doi:10.1109/icdar.2007.4376991
[3] S2ORC: The Semantic Scholar Open Research Corpus 2020 · doi:10.18653/v1/2020.acl-main
[4] URL https://aclanthology.org/2020.acl-main.447 2020
[5] Patrice Lopez. GROBID, February 2023. URL https://github.com/kermitt2/grobid. original-date: 2012-09-13T15:48:54Z 2023

Formal links

3 machine-checked theorem links

Cited by

29 papers in Pith

Receipt and verification
First computed: 2026-05-17T23:38:48.310568Z
Builder: pith-number-builder-2026-05-17-v1
Signature: Pith Ed25519 (pith-v1-2026-05)
Schema: pith-number/v1.0

Canonical hash

a1fca14c2603eb19b0ee3963c4c634cd0423b37f3abb26167b7df8e1f92d61d5

Aliases

arxiv: 2308.13418 · arxiv_version: 2308.13418v1 · doi: 10.48550/arxiv.2308.13418 · pith_short_12: UH6KCTBGAPVR · pith_short_16: UH6KCTBGAPVRTMHO · pith_short_8: UH6KCTBG
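
For this record, the Pith Number and its short aliases appear to follow from the canonical hash. Base32-encoding the first 16 bytes of the SHA-256 digest and dropping the padding reproduces the identifier, and the short aliases are its 8-, 12-, and 16-character prefixes. The sketch below checks that relationship. The function names are hypothetical, and the derivation is an observation from the values on this page, not a documented specification.

```python
import base64

# Canonical hash as listed on this record.
CANONICAL_HASH = "a1fca14c2603eb19b0ee3963c4c634cd0423b37f3abb26167b7df8e1f92d61d5"


def pith_number_from_hash(hex_digest: str) -> str:
    """Assumed derivation: Base32 of the first 16 digest bytes, padding removed."""
    first_16 = bytes.fromhex(hex_digest)[:16]
    return base64.b32encode(first_16).decode("ascii").rstrip("=")


def short_aliases(pith_number: str) -> dict:
    """Short aliases taken as plain prefixes of the full identifier."""
    return {f"pith_short_{n}": pith_number[:n] for n in (8, 12, 16)}


number = pith_number_from_hash(CANONICAL_HASH)
print(number)  # UH6KCTBGAPVRTMHOHFR4JRRUZU
print(short_aliases(number))
```

If the relationship holds in general, a mirror could recompute the identifier from the canonical record alone without trusting the page that displays it.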
Agent API
Verify this Pith Number yourself
curl -sH 'Accept: application/ld+json' https://pith.science/pith/UH6KCTBGAPVRTMHOHFR4JRRUZU \
  | jq -c '.canonical_record' \
  | python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: a1fca14c2603eb19b0ee3963c4c634cd0423b37f3abb26167b7df8e1f92d61d5
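
The hashing step of the command above can also be written as a small standalone script. This is a minimal sketch that mirrors the inline Python: it serializes the record with sorted keys, compact separators, and unescaped non-ASCII characters, then hashes the UTF-8 bytes. The function names `canonical_bytes` and `canonical_hash` are illustrative, and the sample record is a fragment used only to show that key order does not affect the result.

```python
import hashlib
import json


def canonical_bytes(record: dict) -> bytes:
    """Serialize with sorted keys and no extra whitespace, then encode as UTF-8."""
    text = json.dumps(
        record, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    )
    return text.encode("utf-8")


def canonical_hash(record: dict) -> str:
    """SHA-256 hex digest of the canonical serialization."""
    return hashlib.sha256(canonical_bytes(record)).hexdigest()


# Two fragments with the same content but different key order.
a = {"source": {"kind": "arxiv", "id": "2308.13418", "version": 1}}
b = {"source": {"version": 1, "id": "2308.13418", "kind": "arxiv"}}

print(canonical_bytes(a).decode("utf-8"))
print(canonical_hash(a) == canonical_hash(b))  # True
```

Sorting keys and fixing the separators is what makes the digest reproducible: any JSON parser that loads the same record yields the same bytes before hashing.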
Canonical record JSON
{
  "metadata": {
    "abstract_canon_sha256": "ddd44285883ee14065ff5c96ebc36c6944cd462b44dcbbd5714ecfcbc789d290",
    "cross_cats_sorted": [
      "cs.CV"
    ],
    "license": "http://creativecommons.org/licenses/by-sa/4.0/",
    "primary_cat": "cs.LG",
    "submitted_at": "2023-08-25T15:03:36Z",
    "title_canon_sha256": "4905edfdc5dec5ebbf769ed2070459ab1c3e8110a692956bb41ed686d1244b05"
  },
  "schema_version": "1.0",
  "source": {
    "id": "2308.13418",
    "kind": "arxiv",
    "version": 1
  }
}