ChartX & ChartVLM: A Versatile Benchmark and Foundation Model for Complicated Chart Reasoning

Xia, R · 2024 · arXiv 2402.12185

9 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

background 1 · dataset 1 · other 1

citation-polarity summary

background 2 · unclear 1

representative citing papers

PlotChain: Deterministic Checkpointed Evaluation of Multimodal LLMs on Engineering Plot Reading

cs.AI · 2026-01-29 · conditional · novelty 7.0

PlotChain benchmark reports top MLLMs reaching ~80% field-level accuracy on engineering plot reading under human-like tolerances, but with persistent failures on frequency-domain tasks like bandpass and FFT spectra.

OCRBench v2: An Improved Benchmark for Evaluating Large Multimodal Models on Visual Text Localization and Reasoning

cs.CV · 2024-12-31 · accept · novelty 7.0

OCRBench v2 is a new benchmark with four times more tasks than prior versions that reveals most large multimodal models score below 50 out of 100 on visual text tasks and share five specific weaknesses.

CrystalXRD-Bench: Benchmarking Vision-Language Models for XRD Peak Indexing Across Diverse Crystalline Materials

cs.AI · 2026-05-28 · unverdicted · novelty 6.0

CrystalXRD-Bench is a new 250-sample benchmark for VLMs on XRD peak indexing, where the best model (GPT-5.4) reaches Jaccard 0.5888 and 37.6% exact match while most stay below 0.50, showing the task remains unsolved.

Chart-RL: Policy Optimization Reinforcement Learning for Enhanced Visual Reasoning in Chart Question Answering with Vision Language Models

cs.AI · 2026-04-03 · unverdicted · novelty 6.0

Chart-RL uses RL policy optimization and LoRA to boost VLM chart reasoning, enabling a 4B model to reach 0.634 accuracy versus 0.580 for an 8B model with lower latency.

ChartVerse: Scaling Chart Reasoning via Reliable Programmatic Synthesis from Scratch

cs.CV · 2026-01-20 · conditional · novelty 6.0

ChartVerse uses Rollout Posterior Entropy and truth-anchored inverse QA synthesis to produce 640K high-quality chart reasoning samples, training an 8B model that surpasses its 30B teacher.

PlotPick: AI-powered batch extraction of numerical data from scientific figures

cs.CV · 2026-05-07 · conditional · novelty 5.0

PlotPick shows that general vision-language models outperform the dedicated DePlot model on chart-to-table benchmarks, with the largest gains on box plots and other chart types absent from specialized training data.

Large language model-enabled automated data extraction for concrete materials informatics

cond-mat.mtrl-sci · 2026-04-24 · conditional · novelty 5.0 · 2 refs

An LLM-powered agent pipeline extracts ~9,000 structured concrete materials records from 278 publications with F1 scores up to 0.97, creating the largest open blended cement concrete database and demonstrating that larger, richer datasets improve ML prediction and generalization.

PDF-WuKong: A Large Multimodal Model for Efficient Long PDF Reading with End-to-End Sparse Sampling

cs.CV · 2024-10-08 · unverdicted · novelty 5.0

PDF-WuKong adds a sparse sampler to an MLLM for efficient long-PDF multimodal QA and reports an 8.6% F1 gain over proprietary models on a new 1.1M-pair academic-paper dataset.

Document Parsing Unveiled: Techniques, Challenges, and Prospects for Structured Information Extraction

cs.MM · 2024-10-28 · unverdicted · novelty 3.0

Survey proposing a taxonomy for document parsing into pipeline-based systems and VLM-driven unified models, reviewing components, metrics, benchmarks, and challenges.

citing papers explorer

PlotChain: Deterministic Checkpointed Evaluation of Multimodal LLMs on Engineering Plot Reading · cs.AI · 2026-01-29 · conditional · none · ref 21
OCRBench v2: An Improved Benchmark for Evaluating Large Multimodal Models on Visual Text Localization and Reasoning · cs.CV · 2024-12-31 · accept · none · ref 20
CrystalXRD-Bench: Benchmarking Vision-Language Models for XRD Peak Indexing Across Diverse Crystalline Materials · cs.AI · 2026-05-28 · unverdicted · none · ref 8
Chart-RL: Policy Optimization Reinforcement Learning for Enhanced Visual Reasoning in Chart Question Answering with Vision Language Models · cs.AI · 2026-04-03 · unverdicted · none · ref 25
ChartVerse: Scaling Chart Reasoning via Reliable Programmatic Synthesis from Scratch · cs.CV · 2026-01-20 · conditional · none · ref 39
PlotPick: AI-powered batch extraction of numerical data from scientific figures · cs.CV · 2026-05-07 · conditional · none · ref 4
Large language model-enabled automated data extraction for concrete materials informatics · cond-mat.mtrl-sci · 2026-04-24 · conditional · none · ref 97 · 2 links
PDF-WuKong: A Large Multimodal Model for Efficient Long PDF Reading with End-to-End Sparse Sampling · cs.CV · 2024-10-08 · unverdicted · none · ref 38
Document Parsing Unveiled: Techniques, Challenges, and Prospects for Structured Information Extraction · cs.MM · 2024-10-28 · unverdicted · none · ref 262
