arxiv: 2606.09174 · v2 · pith:7SYPYFRFnew · submitted 2026-06-08 · 💻 cs.HC

Demonstrating chart-plot: Closing the Last Mile of Academic Chart Generation

Pith reviewed 2026-06-27 15:22 UTC · model grok-4.3

classification 💻 cs.HC
keywords academic chart generation · LLM agents · LaTeX rendering · style distillation · data visualization · publication workflow · agentic systems · matplotlib

The pith

chart-plot turns researcher intent into LaTeX-ready academic charts that match venue style and survive layout constraints.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Large language models already translate intent into matplotlib code, yet the resulting charts rarely land in a paper without repeated manual fixes. The authors argue that the remaining bottleneck is publication rather than generation: the chart must match the style of accepted figures at the target venue, fit the final layout, and accept precise edits. chart-plot addresses this with a style-aware code generator conditioned on a textual style skill distilled from accepted figures at the target venue, an iterative render loop that compiles the chart inside the target LaTeX document and revises it until layout constraints are satisfied, and a structured edit layer that makes every visual element directly manipulable. Early case studies on grouped bars, scaling lines, and paired distributions, plus a small user study, provide initial support. If the approach works, researchers could move from description to publication-ready figure in one pass.

Core claim

The paper presents chart-plot as an agentic harness that closes the last mile of academic chart generation. It consists of a style-aware code generator conditioned on a textual style skill distilled from accepted figures at the target venue, a deployment-aware render loop that compiles the chart inside the target LaTeX context and revises until layout constraints are met, and a structured edit layer that exposes every chart element as a directly manipulable handle. Early results are reported on three chart-type case studies and a small user study.
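The deployment-aware render loop can be pictured as a check-and-revise cycle. The sketch below is not the authors' implementation: `ChartSpec`, `LayoutReport`, `compile_and_measure`, the 3.33 in column width, and the 7 pt font floor are all hypothetical, and the measuring step is a stand-in for actually compiling the chart inside the target LaTeX document.

```python
from dataclasses import dataclass, replace
from typing import List, Tuple


@dataclass(frozen=True)
class ChartSpec:
    width_in: float  # requested figure width, in inches (hypothetical field)
    font_pt: float   # smallest text size in the chart, in points (hypothetical field)


@dataclass(frozen=True)
class LayoutReport:
    overflow_in: float  # how far the figure exceeds the column width
    font_pt: float      # smallest text size as rendered

    def ok(self, min_font_pt: float) -> bool:
        return self.overflow_in == 0.0 and self.font_pt >= min_font_pt


def compile_and_measure(spec: ChartSpec, column_in: float) -> LayoutReport:
    # Stand-in only: a real loop would compile the LaTeX document and
    # measure the placed figure instead of reading the spec back.
    return LayoutReport(max(0.0, spec.width_in - column_in), spec.font_pt)


def revise(spec: ChartSpec, report: LayoutReport,
           column_in: float, min_font_pt: float) -> ChartSpec:
    # Fix one violation per round: width first, then text size.
    if report.overflow_in > 0.0:
        return replace(spec, width_in=column_in)
    return replace(spec, font_pt=max(spec.font_pt, min_font_pt))


def render_loop(spec: ChartSpec, column_in: float = 3.33,
                min_font_pt: float = 7.0,
                max_iters: int = 5) -> Tuple[ChartSpec, List[LayoutReport]]:
    history: List[LayoutReport] = []
    for _ in range(max_iters):
        report = compile_and_measure(spec, column_in)
        history.append(report)
        if report.ok(min_font_pt):
            return spec, history
        spec = revise(spec, report, column_in, min_font_pt)
    raise RuntimeError("layout constraints not met within max_iters")


final_spec, reports = render_loop(ChartSpec(width_in=6.0, font_pt=5.0))
```

Keeping the full report history is a deliberate choice in this sketch: the number of rounds per figure is the kind of evidence the referee report further down asks for.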

What carries the argument

chart-plot, an agentic harness with a style-aware code generator, a deployment-aware LaTeX render loop, and a structured edit layer for direct element manipulation

If this is right

  • Generated charts match the visual style of previously accepted figures at the target venue.
  • The render loop produces charts that survive the target LaTeX layout without manual fixes.
  • Authors gain direct handles to edit individual chart elements rather than rewriting code.
  • The system works on grouped bar charts, scaling line charts, and paired distribution charts.
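To make the third point concrete, one way to expose chart elements as handles is a flat registry of `element.property` paths. This is a minimal sketch under that assumption; `ChartModel`, the handle naming scheme, and the example elements are hypothetical and not taken from the paper.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class ChartModel:
    # Hypothetical structure: element name -> {property name -> value}.
    elements: Dict[str, Dict[str, Any]] = field(default_factory=dict)

    def handles(self) -> List[str]:
        """List every editable handle as 'element.property'."""
        return sorted(
            f"{name}.{prop}"
            for name, props in self.elements.items()
            for prop in props
        )

    def edit(self, handle: str, value: Any) -> None:
        """Change one property through its handle, without touching plot code."""
        name, _, prop = handle.rpartition(".")
        if name not in self.elements or prop not in self.elements[name]:
            raise KeyError(f"unknown handle: {handle}")
        self.elements[name][prop] = value


chart = ChartModel({
    "legend": {"loc": "upper right", "fontsize": 8},
    "xaxis": {"label": "Model size"},
})
chart.edit("legend.loc", "lower left")
```

Rejecting unknown handles, rather than silently creating them, keeps an edit from drifting away from what the chart actually contains.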

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same three-component structure could be applied to other output formats such as HTML or Word documents.
  • Combining the edit layer with existing paper-writing agents might allow end-to-end figure refinement inside a single workflow.
  • The reliance on venue-specific style distillation raises the question of how quickly the system adapts when a venue changes its figure guidelines.

Load-bearing premise

Distilling a textual style skill from accepted venue figures and pairing it with iterative LaTeX rendering and structured edits will reliably produce figures that match top-venue output and meet layout constraints without further manual work.
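The abstract does not say what a textual style skill looks like. Purely as an illustration, the sketch below assumes it can be reduced to `key: value` lines that use matplotlib rcParams names, and parses them into a dictionary. The skill text, `parse_style_skill`, and the chosen keys are assumptions for the example, not the paper's format.

```python
from typing import Any, Dict

# Hypothetical skill text; the real distilled skill may be free-form prose.
EXAMPLE_SKILL = """
# style notes distilled from accepted figures (illustrative only)
font.family: serif
font.size: 8
axes.spines.top: false
axes.spines.right: false
"""


def _coerce(raw: str) -> Any:
    """Turn 'true'/'false' into bools and numerals into floats; keep other text."""
    lowered = raw.lower()
    if lowered in ("true", "false"):
        return lowered == "true"
    try:
        return float(raw)
    except ValueError:
        return raw


def parse_style_skill(text: str) -> Dict[str, Any]:
    """Collect 'key: value' lines, skipping blanks and '#' comments."""
    overrides: Dict[str, Any] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition(":")
        if sep:
            overrides[key.strip()] = _coerce(value.strip())
    return overrides


style_overrides = parse_style_skill(EXAMPLE_SKILL)
```

With matplotlib installed, a dictionary like this could be applied through `matplotlib.rc_context(rc=style_overrides)` around the plotting code, which scopes the venue style to a single figure.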

What would settle it

A test set of new chart requests where the generated figures still require more than one round of manual revision to pass venue style and layout checks.

Figures

Figures reproduced from arXiv: 2606.09174 by Jiale Lao, Tingfeng Lan, Wei Chen, Yingchaojie Feng, Yinghao Tang, Yupeng Xie.

Figure 1: The last mile. A generated chart becomes …
Figure 2: The chart-plot architecture. The author specifies a goal and data, selects a target venue and an optional reference style, …
Figure 3: Three style cases (rows, top to bottom: Case 1 grouped-bar ablation; Case 2 scaling line chart; Case 3 paired-condition …
Figure 5: The edit layer, captured from the live web interface.
Figure 6: User study results (N=4 computer science re …
Original abstract

Large language models can translate a researcher's intent into runnable matplotlib code, yet the resulting chart rarely lands in a paper without multiple rounds of manual revision. We argue that the open problem is not chart code generation but chart publication: making the output look like a top-venue figure, survive the target layout, and respond to precise author edits. We present chart-plot, an agentic harness that closes this last mile through three components: (1) a style-aware code generator conditioned on a textual style skill distilled from accepted figures at the target venue, (2) a deployment-aware render loop that compiles the chart inside the target LaTeX context and revises until layout constraints are met, and (3) a structured edit layer that exposes every chart element as a directly manipulable handle. We report early results on three chart-type case studies (grouped bar, scaling line, paired distributions) and a small user study.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 0 minor

Summary. The manuscript presents chart-plot, an agentic harness for generating publication-ready academic charts. It argues that the remaining challenge after LLM-based matplotlib code generation is achieving top-venue appearance, surviving target LaTeX layout constraints, and supporting precise edits. The system comprises three components: (1) a style-aware code generator conditioned on a textual style skill distilled from accepted figures at the target venue, (2) a deployment-aware render loop that compiles the chart inside the target LaTeX context and iterates until layout constraints are satisfied, and (3) a structured edit layer exposing every chart element as a manipulable handle. Early results are reported on three chart-type case studies (grouped bar, scaling line, paired distributions) plus a small user study.

Significance. If the components reliably produce figures meeting publication criteria without repeated manual intervention, the work would address a common practical bottleneck in academic workflows, particularly in HCI and related fields where figure quality affects acceptance. The combination of venue-specific style distillation with iterative LaTeX rendering and structured editing offers a concrete integration not previously demonstrated at this granularity.

major comments (2)
  1. Abstract: The central claim that the three components together 'close the last mile' by producing top-venue figures that survive layout constraints without multiple rounds of manual revision rests on unverified assertions. The abstract reports only 'early results' on three case studies and a small user study, with no quantitative metrics (e.g., revision counts, success rates, inter-venue generalization, or statistical tests) or baselines provided. This leaves the weakest assumption—that the system will reliably meet publication criteria—untested at the scale needed to substantiate the claim.
  2. Evaluation section (implied by the reported results): The manuscript provides no details on the user study's scale, methodology, tasks, or outcome measures. Without participant numbers, quantitative scores, or comparison conditions, it is impossible to determine whether the structured edit layer or other components deliver measurable improvements over existing manual or LLM-only workflows.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive comments on the evaluation and claims. We address each major point below, clarifying the scope of our demonstration paper while agreeing to strengthen the reported evidence where possible.

Point-by-point responses
  1. Referee: Abstract: The central claim that the three components together 'close the last mile' by producing top-venue figures that survive layout constraints without multiple rounds of manual revision rests on unverified assertions. The abstract reports only 'early results' on three case studies and a small user study, with no quantitative metrics (e.g., revision counts, success rates, inter-venue generalization, or statistical tests) or baselines provided. This leaves the weakest assumption—that the system will reliably meet publication criteria—untested at the scale needed to substantiate the claim.

    Authors: The manuscript is explicitly framed as a demonstration of the integrated approach rather than a large-scale empirical study; the phrase 'early results' signals this scope. The case studies show the components functioning on representative academic chart types (grouped bar, scaling line, paired distributions), and the render loop is designed to iterate until LaTeX constraints are met. We agree, however, that the abstract's phrasing could overstate reliability. In revision we will add concrete metrics drawn from the case studies, such as the number of render-loop iterations required per figure and the fraction of outputs that satisfied venue layout rules without further manual changes. revision: yes

  2. Referee: Evaluation section (implied by the reported results): The manuscript provides no details on the user study's scale, methodology, tasks, or outcome measures. Without participant numbers, quantitative scores, or comparison conditions, it is impossible to determine whether the structured edit layer or other components deliver measurable improvements over existing manual or LLM-only workflows.

    Authors: We acknowledge that the current description of the user study is too terse. The full manuscript contains a dedicated evaluation subsection, but it does not yet report participant count, exact tasks, or quantitative outcome measures with sufficient clarity. We will revise this section to specify the study scale, the editing tasks performed by participants, the metrics collected (e.g., time to achieve desired edits, number of handle operations required), and any direct comparisons to baseline workflows. This will allow readers to assess the practical benefit of the structured edit layer. revision: yes
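The measures named in the two responses (render-loop iterations per figure, the fraction of outputs that pass without manual changes, time to achieve an edit, and handle operations per edit) could be tabulated along the following lines. The record types and field names are hypothetical, and the sample values are placeholders for illustration, not results from the paper.

```python
from dataclasses import dataclass
from statistics import mean, median
from typing import Dict, Sequence


@dataclass
class FigureRun:
    loop_iterations: int             # render-loop rounds before constraints held
    passed_without_manual_fix: bool  # met layout rules with no hand edits


@dataclass
class EditTrial:
    seconds_to_target: float  # time a participant needed to reach the desired edit
    handle_operations: int    # number of handle manipulations used


def summarize(runs: Sequence[FigureRun],
              trials: Sequence[EditTrial]) -> Dict[str, float]:
    if not runs or not trials:
        raise ValueError("need at least one run and one trial")
    return {
        "mean_loop_iterations": mean(r.loop_iterations for r in runs),
        "pass_rate": sum(r.passed_without_manual_fix for r in runs) / len(runs),
        "median_edit_seconds": median(t.seconds_to_target for t in trials),
        "mean_handle_operations": mean(t.handle_operations for t in trials),
    }


# Placeholder data only.
sample_runs = [FigureRun(2, True), FigureRun(4, False), FigureRun(3, True)]
sample_trials = [EditTrial(30.0, 2), EditTrial(50.0, 4), EditTrial(40.0, 3)]
summary = summarize(sample_runs, sample_trials)
```

Reporting the same summary for a manual or LLM-only baseline would give the comparison condition the referee asks for.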

Circularity Check

0 steps flagged

No circularity; system description with case-study support

full rationale

The paper presents chart-plot as an agentic system with three explicitly described components (style-aware generator, deployment-aware render loop, structured edit layer) and supports the claim via early results on three chart types plus a user study. No equations, parameters, predictions, or derivations appear. No self-citations, fitted inputs, or ansatzes are invoked. The central claim is a system proposal whose validity rests on the reported demonstrations rather than any reduction to its own inputs by construction.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract-only review provides no information on free parameters, axioms, or invented entities.

pith-pipeline@v0.9.1-grok · 5700 in / 1197 out tokens · 22370 ms · 2026-06-27T15:22:35.297769+00:00 · methodology


Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. DataMagic: Transforming Tabular Data into Data Insight Video

    cs.HC · 2026-06 · unverdicted · novelty 5.0

    DataMagic generates narrative data videos from tabular data and queries via DVSpec declarative bindings and a Generate-then-Orchestrate multi-agent pipeline.
