CycleChart: A Unified Consistency-Based Learning Framework for Bidirectional Chart Understanding and Generation
Pith reviewed 2026-05-16 20:51 UTC · model grok-4.3
The pith
Enforcing generate-parse consistency on aligned chart data improves cross-task performance and generalization.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
CycleChart organizes all tasks around each single data instance in a lifecycle from source table and query through chart generation and rendering to schema and data parsing, with a generate-parse consistency objective that enforces semantic alignment between the forward and reverse directions, yielding strong results on four tasks and improved transfer to external benchmarks.
What carries the argument
The per-instance lifecycle design together with the generate-parse consistency objective that links generation from data to recovery from the rendered image.
If this is right
- The model captures the full chain of transformations from raw data through visual encoding to structured recovery.
- Performance improves simultaneously on NL2Chart generation, schema parsing, data parsing, and ChartQA.
- The approach transfers effectively to unseen external benchmarks.
- Cross-task generalization increases relative to conventional multi-task training that samples tasks independently.
Where Pith is reading between the lines
- The same consistency cycle could be applied to other paired generation-understanding tasks such as diagram or map creation.
- Training in this manner might reduce errors when charts are later edited or queried in new ways.
- Extending the framework to charts with interactive elements or multiple linked views remains open for testing.
Load-bearing premise
That enforcing generate-parse consistency on the authors' lifecycle-aligned benchmark will produce models whose improvements generalize beyond the specific chart rendering pipeline and annotation style used.
What would settle it
Evaluating the trained model on charts rendered with a different library or on real-world charts that lack the aligned annotations from CycleChart-Bench.
Original abstract
Current chart-related tasks, such as chart generation (NL2Chart), chart schema parsing, chart data parsing, and chart question answering (ChartQA), are typically studied in isolation, preventing models from learning the shared semantics that link chart creation and interpretation. We introduce CycleChart, a consistency-based learning framework for bidirectional chart understanding and generation. Unlike conventional multi-task approaches that draw training samples independently across tasks, CycleChart organizes all tasks around each single data instance. From a source table and natural-language query, the model generates a chart specification, renders and executes it, then learns to recover the schema and underlying data from the resulting chart image. This per-instance lifecycle design lets the model capture the full chain of transformations, from raw data through visual encoding to structured recovery, and a generate-parse consistency objective enforces semantic alignment between the forward generation and reverse parsing directions. To support this framework, we construct CycleChart-Bench, a lifecycle-aligned benchmark where every chart sample carries aligned annotations for generation, schema parsing, data parsing, and question answering. CycleChart achieves strong results across all four tasks and transfers effectively to unseen external benchmarks, demonstrating improved cross-task generalization and marking a step toward more general chart understanding models.
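The lifecycle described in the abstract can be illustrated with a toy sketch. Everything in it is hypothetical: the function names (`generate_spec`, `render`, `parse_schema`, `parse_data`, `consistency_score`) are invented for illustration, the "render" step is a stand-in that packages the spec and plotted rows instead of producing pixels, and the score is a simple overlap measure rather than the paper's objective.

```python
from typing import Any

Row = dict[str, Any]
Spec = dict[str, Any]


def generate_spec(query: str, table: list[Row]) -> Spec:
    """Forward step (stand-in for the model): build a Vega-Lite-like spec."""
    # Hypothetical rule: first column goes on x, second column on y.
    x_field, y_field = list(table[0].keys())[:2]
    return {"mark": "bar", "encoding": {"x": x_field, "y": y_field}}


def render(spec: Spec, table: list[Row]) -> dict[str, Any]:
    """Stand-in for rendering; a real pipeline would return an image."""
    fields = list(spec["encoding"].values())
    return {
        "mark": spec["mark"],
        "axes": dict(spec["encoding"]),
        "points": [{f: row[f] for f in fields} for row in table],
    }


def parse_schema(chart: dict[str, Any]) -> Spec:
    """Reverse step 1: recover the specification from the rendered chart."""
    return {"mark": chart["mark"], "encoding": dict(chart["axes"])}


def parse_data(chart: dict[str, Any]) -> list[Row]:
    """Reverse step 2: recover the plotted data values."""
    return [dict(point) for point in chart["points"]]


def consistency_score(generated: Spec, recovered: Spec) -> float:
    """Jaccard overlap of (channel, field) pairs plus the mark type."""
    gen = set(generated["encoding"].items()) | {("mark", generated["mark"])}
    rec = set(recovered["encoding"].items()) | {("mark", recovered["mark"])}
    return len(gen & rec) / len(gen | rec)


# One data instance carried through the whole lifecycle.
table = [
    {"year": 2020, "sales": 10, "region": "EU"},
    {"year": 2021, "sales": 14, "region": "EU"},
]
spec = generate_spec("sales per year", table)
chart = render(spec, table)
score = consistency_score(spec, parse_schema(chart))
recovered_rows = parse_data(chart)
print(score, recovered_rows)
```

Because every step here operates on the same instance, a disagreement between the generated and recovered specification is attributable to that instance alone, which is the property the per-instance design is meant to exploit.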
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript introduces CycleChart, a consistency-based framework for joint chart generation (NL2Chart), schema parsing, data parsing, and ChartQA. Tasks are organized around per-instance lifecycles on a newly constructed CycleChart-Bench benchmark: from table and query the model generates a chart specification, renders it, and must recover schema and data from the image, with a generate-parse consistency objective enforcing semantic alignment. The central claims are strong performance across the four tasks and effective transfer to unseen external benchmarks, demonstrating improved cross-task generalization over independent multi-task training.
Significance. If the consistency objective is shown to produce representations that capture invariant data-to-visual semantics rather than pipeline-specific regularities, the work could advance unified chart models that learn bidirectional mappings more robustly than isolated task training. The lifecycle-aligned benchmark construction is a constructive contribution, but its purpose-built nature makes the transfer claims load-bearing and in need of stronger controls.
major comments (2)
- [§4.2] §4.2 (Benchmark Construction): CycleChart-Bench is built around a single table→spec→render→image→parse pipeline. The generate-parse consistency objective could therefore be satisfied by learning pipeline-specific artifacts (exact encoding choices, color mappings, annotation conventions) rather than generalizable chart semantics. The transfer experiments to external benchmarks must include explicit ablations or controls for rendering and annotation differences to attribute gains to the proposed mechanism rather than distributional overlap.
- [§5.1] §5.1 (Experimental Results): The reported strong results and cross-task generalization claims lack ablations that isolate the contribution of the consistency loss from standard multi-task training on the same CycleChart-Bench data. Without these comparisons (and without quantitative metrics, error analysis, or statistical significance in the main results), it is difficult to confirm that the per-instance lifecycle design drives the improvements.
minor comments (2)
- [§3] Notation for the consistency objective (e.g., the exact formulation of the cycle loss) should be clarified with an explicit equation in §3 to avoid ambiguity when comparing to standard reconstruction losses.
- [Figure 2] Figure 2 (lifecycle diagram) would benefit from explicit arrows showing the forward generation and reverse parsing paths with the consistency term highlighted.
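The request for an explicit equation in §3 can be made concrete with a generic template. The block below is only one plausible form of a generate-parse consistency objective, written in notation introduced here for illustration (table $T$, query $q$, reference specification $s$, generated specification $\hat{s}$, renderer $R$, image $I$, plotted data $D$, weight $\lambda$); it is not taken from the paper.

```latex
\begin{aligned}
\hat{s} &= G_\theta(T, q), \qquad I = R(\hat{s}, T), \\
\mathcal{L}_{\text{gen}} &= -\log p_\theta(s \mid T, q), \\
\mathcal{L}_{\text{parse}} &= -\log p_\theta(\hat{s} \mid I) - \log p_\theta(D \mid I, \hat{s}), \\
\mathcal{L} &= \mathcal{L}_{\text{gen}} + \lambda\, \mathcal{L}_{\text{parse}}.
\end{aligned}
```

Under this assumed form, the reverse terms score structured outputs (a specification and a data table) rather than pixels, which is the distinction from a standard image reconstruction loss that the comment asks the authors to spell out.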
Simulated Author's Rebuttal
We thank the referee for the constructive feedback on our manuscript. We address each major comment below, providing our response and indicating planned revisions to strengthen the presentation of the consistency objective and experimental claims.
- Referee: [§4.2] §4.2 (Benchmark Construction): CycleChart-Bench is built around a single table→spec→render→image→parse pipeline. The generate-parse consistency objective could therefore be satisfied by learning pipeline-specific artifacts (exact encoding choices, color mappings, annotation conventions) rather than generalizable chart semantics. The transfer experiments to external benchmarks must include explicit ablations or controls for rendering and annotation differences to attribute gains to the proposed mechanism rather than distributional overlap.
Authors: We agree that the single-pipeline construction of CycleChart-Bench introduces a risk that the consistency objective could exploit rendering-specific regularities rather than invariant data-to-visual semantics. Our transfer results to external benchmarks (which use different rendering libraries and annotation conventions) provide supporting evidence for generalization, but we acknowledge that explicit controls would make this attribution more robust. In the revised manuscript we will add an ablation that systematically varies rendering parameters and annotation styles between CycleChart-Bench and the target external benchmarks while measuring the resulting performance delta.
revision: yes
- Referee: [§5.1] §5.1 (Experimental Results): The reported strong results and cross-task generalization claims lack ablations that isolate the contribution of the consistency loss from standard multi-task training on the same CycleChart-Bench data. Without these comparisons (and without quantitative metrics, error analysis, or statistical significance in the main results), it is difficult to confirm that the per-instance lifecycle design drives the improvements.
Authors: We have included an ablation comparing the full CycleChart framework against a standard multi-task baseline trained on identical CycleChart-Bench data; these results appear in Section 5.2 and the appendix and show additional gains attributable to the consistency loss. We agree that the main results would benefit from expanded quantitative support. In the revision we will add statistical significance testing (paired t-tests with bootstrap confidence intervals), a concise error analysis of failure modes, and additional metrics to the primary experimental tables.
revision: partial
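The significance testing promised in the rebuttal could look roughly like the following sketch. It computes a percentile bootstrap confidence interval for the mean per-example score difference between two systems evaluated on the same examples; the paired t-test is not shown. The scores are made-up placeholders, and the function name `paired_bootstrap_ci` and its defaults are assumptions, not the authors' protocol.

```python
import random
from statistics import mean


def paired_bootstrap_ci(scores_a, scores_b, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for the mean paired difference (a - b)."""
    if len(scores_a) != len(scores_b):
        raise ValueError("paired scores must have equal length")
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(diffs)
    # Resample the paired differences with replacement and sort the means.
    resampled_means = sorted(
        mean(rng.choices(diffs, k=n)) for _ in range(n_resamples)
    )
    lo_idx = int((alpha / 2) * n_resamples)
    hi_idx = int((1 - alpha / 2) * n_resamples) - 1
    return mean(diffs), resampled_means[lo_idx], resampled_means[hi_idx]


# Placeholder per-example scores (NOT results from the paper).
with_consistency = [0.82, 0.75, 0.91, 0.66, 0.88, 0.79, 0.93, 0.71]
multitask_only = [0.78, 0.70, 0.90, 0.60, 0.85, 0.74, 0.89, 0.69]
observed, ci_low, ci_high = paired_bootstrap_ci(with_consistency, multitask_only)
print(f"mean diff = {observed:.3f}, 95% CI = [{ci_low:.3f}, {ci_high:.3f}]")
```

Resampling the differences rather than the two score lists separately keeps each example paired with itself, which matters when both systems are evaluated on identical CycleChart-Bench instances.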
Circularity Check
No significant circularity in the derivation chain
The paper introduces a consistency-based training framework organized around per-instance lifecycles and constructs CycleChart-Bench to support it, but the central claims rest on empirical results across tasks plus explicit transfer evaluation on unseen external benchmarks. No equations or self-citations are shown that reduce the consistency objective or generalization claim to a fitted parameter or input by construction; the method is a standard bidirectional alignment technique whose outputs are not definitionally equivalent to its training signals. This is the common honest case of a self-contained empirical contribution.
Axiom & Free-Parameter Ledger
axioms (2)
- domain assumption: Chart rendering from specification to image is a deterministic, lossless-enough mapping for the reverse parsing task to be well-defined.
- ad hoc to paper: Joint training on the closed generation-parsing loop improves cross-task generalization beyond independent multi-task training.
invented entities (1)
- CycleChart-Bench: no independent evidence
Reference graph
Works this paper leans on
- [1] Aayush Bansal, Shugao Ma, Deva Ramanan, and Yaser Sheikh. Recycle-gan: Unsupervised video retargeting. In Proceedings of the European conference on computer vision (ECCV), pages 119–135, 2018.
- [2] Jinyue Chen, Lingyu Kong, Haoran Wei, Chenglong Liu, Zheng Ge, Liang Zhao, Jianjian Sun, Chunrui Han, and Xiangyu Zhang. Onechart: Purify the chart structural extraction via one auxiliary token. In Proceedings of the 32nd ACM International Conference on Multimedia, pages 147–155, 2024.
- [3] Gheorghe Comanici, Eric Bieber, Mike Schaekermann, Ice Pasupat, Noveen Sachdeva, Inderjit Dhillon, Marcel Blistein, Ori Ram, Dan Zhang, Evan Rosen, et al. Gemini 2.5: Pushing the frontier with advanced reasoning, multimodality, long context, and next generation agentic capabilities. arXiv preprint arXiv:2507.06261, 2025.
- [4] Marcella Cornia, Lorenzo Baraldi, Hamed R Tavakoli, and Rita Cucchiara. Towards cycle-consistent models for text and image retrieval. In Proceedings of the European Conference on Computer Vision (ECCV) Workshops, pages 0–0,
- [5] Dazhen Deng, Aoyu Wu, Huamin Qu, and Yingcai Wu. DashBot: Insight-driven dashboard generation based on deep reinforcement learning. IEEE Transactions on Visualization and Computer Graphics, 29(1):690–700, 2023.
- [6] Yucheng Han, Chi Zhang, Xin Chen, Xu Yang, Zhibin Wang, Gang Yu, Bin Fu, and Hanwang Zhang. Chartllama: A multimodal llm for chart understanding and generation. arXiv preprint arXiv:2311.16483, 2023.
- [7] Di He, Yingce Xia, Tao Qin, Liwei Wang, Nenghai Yu, Tie-Yan Liu, and Wei-Ying Ma. Dual learning for machine translation. Advances in neural information processing systems, 29, 2016.
- [8] Wenyi Hong, Wenmeng Yu, Xiaotao Gu, Guo Wang, Guobing Gan, Haomiao Tang, Jiale Cheng, Ji Qi, Junhui Ji, Lihang Pan, et al. Glm-4.5v and glm-4.1v-thinking: Towards versatile multimodal reasoning with scalable reinforcement learning, 2025.
- [9] Enamul Hoque and M Saidul Islam. Natural language generation for visualizations: State of the art, challenges and future directions. In Computer Graphics Forum, page e15266. Wiley Online Library, 2025.
- [10] Alain Horé and Djemel Ziou. Image quality metrics: Psnr vs. ssim. In 2010 20th International Conference on Pattern Recognition, pages 2366–2369, 2010.
- [11] Ting-Yao Hsu, C Lee Giles, and Ting-Hao Huang. Scicap: Generating captions for scientific figures. In Findings of the Association for Computational Linguistics: EMNLP 2021, pages 3258–3264, 2021.
- [12] Kung-Hsiang Huang, Hou Pong Chan, Yi R Fung, Haoyi Qiu, Mingyang Zhou, Shafiq Joty, Shih-Fu Chang, and Heng Ji. From pixels to insights: A survey on automatic chart understanding in the era of large foundation models. IEEE Transactions on Knowledge and Data Engineering, 2024.
- [13] Kung-Hsiang Huang, Mingyang Zhou, Hou Pong Chan, Yi Fung, Zhenhailong Wang, Lingyu Zhang, Shih-Fu Chang, and Heng Ji. Do LVLMs understand charts? analyzing and correcting factual errors in chart captioning. In Findings of the Association for Computational Linguistics: ACL 2024, pages 730–749, Bangkok, Thailand, 2024. Association for Computational Linguistics.
- [14] Anthropic Inc. Claude 3.5 sonnet news. https://www.anthropic.com/news/claude-3-5-sonnet, 2024.
- [15] OpenAI Inc. Introducing gpt-4.1 in the api. https://openai.com/index/gpt-4-1/, 2025.
- [16] Kushal Kafle, Brian Price, Scott Cohen, and Christopher Kanan. Dvqa: Understanding data visualizations via question answering. In Proceedings of the IEEE conference on computer vision and pattern recognition, pages 5648–5656,
- [17] Chin-Yew Lin. Rouge: A package for automatic evaluation of summaries. In Text summarization branches out, pages 74–81, 2004.
- [18] Tianqi Luo, Chuhan Huang, Leixian Shen, Boyan Li, Shuyu Shen, Wei Zeng, Nan Tang, and Yuyu Luo. nvbench 2.0: A benchmark for natural language to visualization under ambiguity. arXiv preprint arXiv:2503.12880, 2025.
- [19] Yuyu Luo, Jiawei Tang, and Guoliang Li. nvbench: A large-scale synthesized dataset for cross-domain natural language to visualization task. arXiv preprint arXiv:2112.12926,
- [20] Ahmed Masry, Xuan Long Do, Jia Qing Tan, Shafiq Joty, and Enamul Hoque. Chartqa: A benchmark for question answering about charts with visual and logical reasoning. In Findings of the association for computational linguistics: ACL 2022, pages 2263–2279, 2022.
- [21] Ahmed Masry, Parsa Kavehzadeh, Xuan Long Do, Enamul Hoque, and Shafiq Joty. Unichart: A universal vision-language pretrained model for chart comprehension and reasoning. arXiv preprint arXiv:2305.14761, 2023.
- [22] Ahmed Masry, Mehrad Shahmohammadi, Md Rizwan Parvez, Enamul Hoque, and Shafiq Joty. Chartinstruct: Instruction tuning for chart comprehension and reasoning. arXiv preprint arXiv:2403.09028, 2024.
- [23] Ahmed Masry, Mohammed Saidul Islam, Mahir Ahmed, Aayush Bajaj, Firoz Kabir, Aaryaman Kartha, Md Tahmid Rahman Laskar, Mizanur Rahman, Shadikur Rahman, Mehrad Shahmohammadi, et al. Chartqapro: A more diverse and challenging benchmark for chart question answering. arXiv preprint arXiv:2504.05506, 2025.
- [24] Fanqing Meng, Wenqi Shao, Quanfeng Lu, Peng Gao, Kaipeng Zhang, Yu Qiao, and Ping Luo. Chartassisstant: A universal chart multimodal language model via chart-to-table pre-training and multitask instruction tuning. arXiv preprint arXiv:2401.02384, 2024.
- [25] Nitesh Methani, Pritha Ganguly, Mitesh M Khapra, and Pratyush Kumar. Plotqa: Reasoning over scientific plots. In Proceedings of the ieee/cvf winter conference on applications of computer vision, pages 1527–1536, 2020.
- [26] Arpit Narechania, Arjun Srinivasan, and John Stasko. Nl4dv: A toolkit for generating analytic specifications for data visualization from natural language queries. IEEE Transactions on Visualization and Computer Graphics, 27(2):369–379, 2020.
- [27] Jorge Poco and Jeffrey Heer. Reverse-engineering visualizations: Recovering visual encodings from chart images. In Computer graphics forum, pages 353–363. Wiley Online Library, 2017.
- [28] Alec Radford, Jong Wook Kim, Chris Hallacy, Aditya Ramesh, Gabriel Goh, Sandhini Agarwal, Girish Sastry, Amanda Askell, Pamela Mishkin, Jack Clark, Gretchen Krueger, and Ilya Sutskever. Learning transferable visual models from natural language supervision, 2021.
- [29] Zhihao Shuai, Boyan Li, Siyu Yan, Yuyu Luo, and Weikai Yang. Deepvis: Bridging natural language and data visualization through step-wise reasoning. arXiv preprint arXiv:2508.01700, 2025.
- [30] Benny Tang, Angie Boggust, and Arvind Satyanarayan. Vistext: A benchmark for semantically rich chart captioning. In Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 7268–7298, 2023.
- [31] Yuan Tian, Weiwei Cui, Dazhen Deng, Xinjing Yi, Yurun Yang, Haidong Zhang, and Yingcai Wu. ChartGPT: Leveraging llms to generate charts from abstract natural language. IEEE Transactions on Visualization and Computer Graphics, 31(3):1731–1745, 2024.
- [32] Alexander Vogel, Omar Moured, Yufan Chen, Jiaming Zhang, and Rainer Stiefelhagen. Refchartqa: Grounding visual answer on chart images through instruction tuning. In International Conference on Document Analysis and Recognition, pages 523–537. Springer, 2025.
- [33] Ning Wang, Jiajun Deng, and Mingbo Jia. Cycle-consistency learning for captioning and grounding. In Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence and Thirty-Sixth Conference on Innovative Applications of Artificial Intelligence and Fourteenth Symposium on Educational Advances in Artificial Intelligence. AAAI Press, 2024.
- [34]
- [35] Zirui Wang, Mengzhou Xia, Luxi He, Howard Chen, Yitao Liu, Richard Zhu, Kaiqu Liang, Xindi Wu, Haotian Liu, Sadhika Malladi, et al. Charxiv: Charting gaps in realistic chart understanding in multimodal llms. Advances in Neural Information Processing Systems, 37:113569–113697, 2024.
- [36] Jingxuan Wei, Nan Xu, Junnan Zhu, Gaowei Wu, Qi Chen, Bihui Yu, Lei Wang, et al. Chartmind: A comprehensive benchmark for complex real-world multimodal chart question answering. In Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, pages 4555–4569, 2025.
- [37] Leland Wilkinson. The grammar of graphics. In Handbook of computational statistics: Concepts and methods, pages 375–414. Springer, 2011.
- [38] Kanit Wongsuphasawat, Dominik Moritz, Anushka Anand, Jock Mackinlay, Bill Howe, and Jeffrey Heer. Voyager: Exploratory analysis via faceted browsing of visualization recommendations. IEEE transactions on visualization and computer graphics, 22(1):649–658, 2015.
- [39] Kanit Wongsuphasawat, Zening Qu, Dominik Moritz, Riley Chang, Felix Ouk, Anushka Anand, Jock Mackinlay, Bill Howe, and Jeffrey Heer. Voyager 2: Augmenting visual analysis with partial view specifications. In Proceedings of the 2017 chi conference on human factors in computing systems, pages 2648–2659, 2017.
- [40] Yifan Wu, Lutao Yan, Leixian Shen, Yunhai Wang, Nan Tang, and Yuyu Luo. Chartinsights: Evaluating multimodal large language models for low-level chart question answering. arXiv preprint arXiv:2405.07001, 2024.
- [41] Renqiu Xia, Hancheng Ye, Xiangchao Yan, Qi Liu, Hongbin Zhou, Zijun Chen, Botian Shi, Junchi Yan, and Bo Zhang. Chartx & chartvlm: A versatile benchmark and foundation model for complicated chart reasoning. IEEE Transactions on Image Processing, 2025.
- [42] Zhengzhuo Xu, Sinan Du, Yiyan Qi, Chengjin Xu, Chun Yuan, and Jian Guo. Chartbench: A benchmark for complex visual reasoning in charts. arXiv preprint arXiv:2312.15915,
- [43] Zhengzhuo Xu, Bowen Qu, Yiyan Qi, Sinan Du, Chengjin Xu, Chun Yuan, and Jian Guo. Chartmoe: Mixture of diversely aligned expert connector for chart understanding. arXiv preprint arXiv:2409.03277, 2024.
- [44] Zhengzhuo Xu, SiNan Du, Yiyan Qi, Siwen Lu, Chengjin Xu, Chun Yuan, and Jian Guo. Chartpoint: Guiding mllms with grounding reflection for chart reasoning. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), pages 426–436, 2025.
- [45] Yuwei Yang, Zeyu Zhang, Yunzhong Hou, Zhuowan Li, Gaowen Liu, Ali Payani, Yuan-Sen Ting, and Liang Zheng. Effective training data synthesis for improving mllm chart understanding. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 2653–2663,
- [46] Yilin Ye, Jianing Hao, Yihan Hou, Zhan Wang, Shishi Xiao, Yuyu Luo, and Wei Zeng. Generative ai for visualization: State of the art and future directions. Visual Informatics, 8(2):43–66, 2024.
- [47] Zili Yi, Hao Zhang, Ping Tan, and Minglun Gong. Dualgan: Unsupervised dual learning for image-to-image translation. In Proceedings of the IEEE international conference on computer vision, pages 2849–2857, 2017.
- [48] Liang Zhang, Anwen Hu, Haiyang Xu, Ming Yan, Yichen Xu, Qin Jin, Ji Zhang, and Fei Huang. Tinychart: Efficient chart understanding with visual token merging and program-of-thoughts learning. arXiv preprint arXiv:2404.16635,
- [49] Hanwen Zheng, Sijia Wang, Chris Thomas, and Lifu Huang. Advancing chart question answering with robust chart component recognition. In 2025 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), pages 5741–
- [50] Jun-Yan Zhu, Taesung Park, Phillip Isola, and Alexei A Efros. Unpaired image-to-image translation using cycle-consistent adversarial networks. In Proceedings of the IEEE international conference on computer vision, pages 2223–2232, 2017.
CycleChart: A Unified Consistency-Based Learning Framework for Bidirectional Chart Understanding and Generation Supp...
- [51] Details of Experiment Setting. 8.1. Hyperparameters. We report the full hyperparameter configuration used for CycleChart-3B and CycleChart-7B fine-tuning: optimizer: AdamW; learning rate: 1×10⁻⁵; LoRA: rank = 16, α = 32, dropout = 0.05; batch size: 2; training steps: 2000; scheduler: constant; frozen components: vision encoder (only projector + LLM updated). These ...
- [52] A natural language query describing a charting intent
- [53] A table schema with data types and example rows. Your task is to generate a valid Vega-Lite specification in JSON format that visualizes the requested information. If any filtering or aggregation is implied in the query, include it using the transform field. Input format: • Natural language query: str • Table info: { "columns_with_type": {}, "column_exampl...
- [54] A table schema with data types and example rows. Your task is to generate a valid Vega-Lite specification in JSON format. Do not extract or infer any data values from the image; only describe its visual structure and encodings using the provided table schema. Input format: • Chart image: img • Table info: { "columns_with_type": {}, "column_examples": [...
- [55] Its corresponding Vega-Lite specification. Your task is to extract all visible data values from the chart into a clean CSV table. Only include columns that are visually encoded in the chart (from: x, y, color, size, theta, percentage). If the chart contains subplots (using row or column encodings), include these fields as additional columns in the output. ...
- [56] A natural language question about the chart. Your task is to answer the question using ONLY information that is visible in the chart. Answer rules: • number → digits only; include unit ONLY if shown (e.g., %, $); no commas. • boolean → exactly “yes” or “no”. • category/text → must be a label that appears in the chart. • if not answerable → “unanswerable”. Input...
- [57] Dataset Construction Details. CycleChart-Bench is constructed on top of nvBench 2.0, which provides natural-language queries, raw tables, and Vega-Lite specifications for NL2Chart. However, nvBench 2.0 contains only single-view charts and offers a limited variety of visualization types. To better support our unified generate–parse–reason framework, we su...
- [58] Qualitative Analysis. 10.1. Successful Reasoning Cases (Ours vs. Baseline). Table 4 presents representative ChartQA examples that reveal how generate–parse consistency improves CycleChart-7B’s reasoning behavior. All examples are taken from the CharXiv [35] benchmark, whose figures are considerably more complex than those in our training corpus. While...
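The answer rules quoted in the ChartQA prompt (entry [56] above) lend themselves to a small format check. The sketch below is an assumption-laden illustration, not the authors' evaluation code: the function name `follows_answer_rules`, the three answer kinds, and the acceptance of a leading minus sign are all choices made here. It checks format only and cannot verify the "include unit ONLY if shown" rule, which needs access to the chart itself.

```python
import re

# Digits with an optional decimal part, optional leading "$" or trailing "%".
# Accepting a leading "-" is an assumption; the quoted rules do not mention it.
_NUMBER = re.compile(r"^-?\$?\d+(\.\d+)?%?$")


def follows_answer_rules(answer: str, kind: str, chart_labels: set[str]) -> bool:
    """Check an answer string against the format rules quoted in the prompt."""
    if answer == "unanswerable":
        return True
    if kind == "number":
        # No thousands separators; digits plus an optional unit symbol.
        return "," not in answer and bool(_NUMBER.match(answer))
    if kind == "boolean":
        # Must be exactly "yes" or "no" (case-sensitive).
        return answer in {"yes", "no"}
    if kind == "text":
        # Must be a label that actually appears in the chart.
        return answer in chart_labels
    raise ValueError(f"unknown answer kind: {kind}")


labels = {"Europe", "Asia"}  # hypothetical labels read from a chart
checks = [
    follows_answer_rules("1250", "number", labels),
    follows_answer_rules("1,250", "number", labels),
    follows_answer_rules("45%", "number", labels),
    follows_answer_rules("Yes", "boolean", labels),
    follows_answer_rules("Asia", "text", labels),
]
print(checks)
```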