DocMath-Eval: Evaluating Math Reasoning Capabilities of LLMs in Understanding Long and Specialized Documents

Zhao, Yilun, Long, Yitao, Liu, Hongjun, Kamoi, Ryo, Nan, Linyong, Chen, Lyuhao · 2024 · DOI 10.18653/v1/2024.acl-long.852

2 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

TableVista: Benchmarking Multimodal Table Reasoning under Visual and Structural Complexity

cs.CL · 2026-05-07 · unverdicted · novelty 7.0

TableVista benchmark finds foundation models maintain performance across visual styles but degrade sharply on complex table structures and vision-only settings.

How Do Document Parsers Break? Auditing Structural Vulnerability in Document Intelligence

cs.CL · 2026-05-19

citing papers explorer

Showing 2 of 2 citing papers.

TableVista: Benchmarking Multimodal Table Reasoning under Visual and Structural Complexity

cs.CL · 2026-05-07 · unverdicted · none · ref 34

TableVista benchmark finds foundation models maintain performance across visual styles but degrade sharply on complex table structures and vision-only settings.

How Do Document Parsers Break? Auditing Structural Vulnerability in Document Intelligence

cs.CL · 2026-05-19 · unreviewed · ref 29
