ReVis parses image-based visualizations into a reusable DSL via an MLLM pipeline and supports reproduction, data updates, and customization through an interactive interface.
Chartmimic: Evaluating lmm’s cross-modal reasoning capability via chart-to-code generation
5 Pith papers cite this work. Polarity classification is still indexing.
citation-role summary
citation-polarity summary
verdicts
UNVERDICTED 5roles
background 1polarities
background 1representative citing papers
Visual-ERM is a new multimodal reward model that supplies fine-grained visual feedback for training vision-language models on chart-to-code, table, and SVG tasks, yielding measurable gains over prior rewards.
CharTide decouples chart-to-code data into three perspectives and uses inquiry-driven RL with atomic QA verification to let smaller VLMs surpass GPT-4o on chart-to-code tasks.
SciTikZer-8B uses a new dataset, benchmark, and dual self-consistency RL to generate TikZ code for scientific graphics, outperforming much larger models like Gemini-2.5-Pro.
Introduces a 93-question multimodal RAG benchmark with phrase-level recall and embedding-based hallucination metrics, finding closed-source pipelines outperform open-source ones especially on cross-modal and cross-document tasks.
citing papers explorer
-
ReVis: Towards Reusable Image-Based Visualizations with MLLMs
ReVis parses image-based visualizations into a reusable DSL via an MLLM pipeline and supports reproduction, data updates, and customization through an interactive interface.
-
Visual-ERM: Reward Modeling for Visual Equivalence
Visual-ERM is a new multimodal reward model that supplies fine-grained visual feedback for training vision-language models on chart-to-code, table, and SVG tasks, yielding measurable gains over prior rewards.
-
CharTide: Data-Centric Chart-to-Code Generation via Tri-Perspective Tuning and Inquiry-Driven Evolution
CharTide decouples chart-to-code data into three perspectives and uses inquiry-driven RL with atomic QA verification to let smaller VLMs surpass GPT-4o on chart-to-code tasks.
-
Scientific Graphics Program Synthesis via Dual Self-Consistency Reinforcement Learning
SciTikZer-8B uses a new dataset, benchmark, and dual self-consistency RL to generate TikZ code for scientific graphics, outperforming much larger models like Gemini-2.5-Pro.
-
FATHOMS-RAG: A Framework for the Assessment of Thinking and Observation in Multimodal Systems that use Retrieval Augmented Generation
Introduces a 93-question multimodal RAG benchmark with phrase-level recall and embedding-based hallucination metrics, finding closed-source pipelines outperform open-source ones especially on cross-modal and cross-document tasks.