LayoutLM: Pre-training of text and layout for document image understanding

Yiheng Xu, Minghao Li, Lei Cui, Shaohan Huang, Furu Wei, Ming Zhou · December 2019 · arXiv 1912.13318

6 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

background 2

citation-polarity summary

background 2

representative citing papers

Improving Layout Representation Learning Across Inconsistently Annotated Datasets via Agentic Harmonization

cs.CV · 2026-04-13 · unverdicted · novelty 6.0

VLM-based harmonization of inconsistent annotations across two document layout corpora raises detection F-score from 0.860 to 0.883 and table TEDS from 0.750 to 0.814 while tightening embedding clusters.

Web Retrieval-Aware Chunking (W-RAC) for Efficient and Cost-Effective Retrieval-Augmented Generation Systems

cs.IR · 2026-01-08 · unverdicted · novelty 6.0

W-RAC decouples extraction from semantic planning via structured units and LLM grouping to match traditional retrieval performance at roughly 10x lower LLM token cost.

Structure-Preserving Document Translation via Multi-Stage LLM Pipeline: A Case Study in Marathi

cs.CL · 2026-06-27 · unverdicted · novelty 4.0

A multi-stage LLM pipeline for structure-preserving Marathi-to-English translation of government PDFs using layout-aware OCR and HTML reconstruction.

MADP: A Multi-Agent Pipeline for Sustainable Document Processing with Human-in-the-Loop

cs.AI · 2026-05-16 · conditional · novelty 4.0

MADP multi-agent pipeline with human-in-the-loop achieves 97% full automation on 955 real documents, 98.5% accuracy on ablation set, and 69-70% reductions in FTE, energy, and emissions versus manual processing.

From Handwriting to Structured Data: Benchmarking AI Digitisation of Handwritten Forms

cs.CV · 2026-04-14 · unverdicted · novelty 4.0

Frontier multimodal LLMs achieve ~85% accuracy and ~90% weighted F1 on digitizing complex handwritten medical forms, with Gemini 3.1 strongest overall and prompt optimization lifting macro metrics over 60%.

Information Extraction from Electricity Invoices with General-Purpose Large Language Models

cs.CL · 2026-04-01 · unverdicted · novelty 4.0

Few-shot prompting lifts F1 scores above 96 percent on electricity-invoice extraction for Gemini 1.5 Pro and Mistral-small, while hyperparameter changes produce only marginal gains.

citing papers explorer

Showing 2 of 2 citing papers after filters.

Improving Layout Representation Learning Across Inconsistently Annotated Datasets via Agentic Harmonization

cs.CV · 2026-04-13 · unverdicted · none · ref 38

VLM-based harmonization of inconsistent annotations across two document layout corpora raises detection F-score from 0.860 to 0.883 and table TEDS from 0.750 to 0.814 while tightening embedding clusters.

From Handwriting to Structured Data: Benchmarking AI Digitisation of Handwritten Forms

cs.CV · 2026-04-14 · unverdicted · none · ref 7

Frontier multimodal LLMs achieve ~85% accuracy and ~90% weighted F1 on digitizing complex handwritten medical forms, with Gemini 3.1 strongest overall and prompt optimization lifting macro metrics over 60%.
