Explainable Iterative Data Visualisation Refinement via an LLM Agent

Burak Susam; Tingting Mu

arxiv: 2604.15319 · v2 · submitted 2026-03-02 · 💻 cs.HC · cs.AI


Pith reviewed 2026-05-15 17:49 UTC · model grok-4.3

classification 💻 cs.HC cs.AI

keywords LLM agent · data visualization · hyperparameter optimization · iterative refinement · explainable AI · high-dimensional data · embedding visualization · automated analysis


The pith

An LLM agent automates high-quality data visualization by iteratively refining algorithm hyperparameters using both metrics and semantic insights.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces an agentic AI system that employs a large language model to evaluate and improve data visualization plots for high-dimensional data. It bridges quantitative metrics with qualitative descriptions to suggest better hyperparameter settings for embedding algorithms. Through an iterative loop, the system refines the visualization until it faithfully represents the data's structures. This automation addresses the challenge of manually tuning parameters to uncover meaningful patterns. A sympathetic reader would care because effective visualizations are crucial for exploratory analysis, and this reduces the expertise barrier.

Core claim

The central claim is that by treating visualization evaluation and hyperparameter optimization as a semantic task, an LLM can generate multi-faceted reports that contextualize hard metrics with descriptive summaries and provide actionable recommendations for refining data visualizations. Implementing this in an iterative optimization loop allows the system to rapidly produce high-quality visualization plots in full automation.

What carries the argument

The iterative optimization loop driven by an LLM agent that translates quantitative visualization metrics into qualitative judgments and hyperparameter recommendations.
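A minimal sketch of what such a loop could look like, under stated assumptions. The abstract does not give the prompts, metrics, or stopping rule, so everything here is hypothetical: `refine`, the `advise` callable (a stand-in for the LLM agent), the tolerance-based stop, and the toy objective that peaks at a perplexity of 30.

```python
from typing import Callable, Dict, List, Tuple

Config = Dict[str, float]
History = List[Tuple[Config, float]]


def refine(
    config: Config,
    evaluate: Callable[[Config], float],
    advise: Callable[[Config, float, History], Config],
    max_iters: int = 10,
    tol: float = 1e-3,
) -> Tuple[Config, float, History]:
    """Refine a configuration until the advisor stops improving the score.

    evaluate: maps a configuration to a quality score (higher is better).
    advise:   proposes the next configuration; in the paper's setting this
              role is played by the LLM agent (assumed interface).
    """
    best_config, best_score = dict(config), evaluate(config)
    history: History = [(dict(config), best_score)]
    for _ in range(max_iters):
        candidate = advise(best_config, best_score, history)
        score = evaluate(candidate)
        history.append((dict(candidate), score))
        if score > best_score + tol:
            best_config, best_score = dict(candidate), score
        else:
            break  # no meaningful improvement: stop (assumed stopping rule)
    return best_config, best_score, history


def toy_quality(config: Config) -> float:
    """Toy objective peaking at perplexity 30 (illustrative only)."""
    return 1.0 - abs(config["perplexity"] - 30.0) / 100.0


def toy_advisor(config: Config, score: float, history: History) -> Config:
    """Hypothetical stand-in for the LLM: always suggests a larger perplexity."""
    return {**config, "perplexity": config["perplexity"] + 5.0}


if __name__ == "__main__":
    best, score, history = refine({"perplexity": 5.0}, toy_quality, toy_advisor)
    print(best, score, len(history))
```

Keeping the history of every (configuration, score) pair is a deliberate choice: it is the raw material for the per-iteration explainable reports the paper describes, and it lets a reader audit each suggestion against the metric change it produced.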

If this is right

Visualization plots can be generated without expert manual tuning of hyperparameters.
Each iteration provides explainable reports linking metrics to insights.
The process works across different embedding algorithms for high-dimensional data.
High-quality plots that encourage pattern discovery are achieved rapidly.
Full automation reduces the time and expertise needed for exploratory data analysis.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Similar LLM agents could be applied to refine other types of visualizations or machine learning models.
Combining this with user feedback might further improve the accuracy of recommendations.
Testing on datasets with known ground truth structures would validate the system's effectiveness.
The approach might generalize to other optimization tasks where qualitative judgment is needed.

Load-bearing premise

An LLM can reliably translate quantitative visualization metrics into accurate qualitative judgments and actionable hyperparameter recommendations without systematic bias or hallucination.

What would settle it

A direct comparison where the automated visualizations are evaluated against those chosen by human experts on benchmark datasets with known cluster structures, checking if the LLM iterations match or exceed expert quality.

Original abstract

Exploratory analysis of high-dimensional data relies on embedding the data into a low-dimensional space (typically 2D or 3D), based on which visualization plot is produced to uncover meaningful structures and to communicate geometric and distributional data characteristics. However, finding a suitable algorithm configuration, particularly hyperparameter setting, to produce a visualization plot that faithfully represents the underlying reality and encourages pattern discovery remains challenging. To address this challenge, we propose an agentic AI pipleline that leverages a large language model (LLM) to bridge the gap between rigorous quantitative assessment and qualitative human insight. By treating visualization evaluation and hyperparameter optimization as a semantic task, our system generates a multi-faceted report that contextualizes hard metrics with descriptive summaries, and suggests actionable recommendation of algorithm configuration for refining data visualization. By implementing an iterative optimization loop of this process, the system is able to produce rapidly a high-quality visualization plot, in full automation.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

This is a system proposal for an LLM agent that turns visualization metrics into semantic reports and config tweaks in an iterative loop, but it supplies no experiments or evidence that the loop actually improves the plots.

The paper proposes an LLM-based agent that reads quantitative metrics from embeddings such as t-SNE or UMAP, writes a natural-language report, and suggests hyperparameter changes to refine the visualization automatically. It repeats this cycle until the output is judged good enough. The idea targets a real, everyday headache in exploratory data analysis: choosing the right perplexity or neighbor count still relies on manual trial and error.

Referee Report

2 major / 1 minor

Summary. The paper proposes an agentic AI pipeline that uses a large language model (LLM) to bridge quantitative visualization metrics and qualitative human insight for high-dimensional data embeddings. The system generates multi-faceted reports contextualizing metrics with descriptive summaries and provides actionable hyperparameter recommendations; an iterative optimization loop is claimed to automate rapid production of high-quality visualization plots.

Significance. If the iterative LLM-driven refinement loop can be empirically shown to produce measurable improvements in visualization quality (e.g., trustworthiness, continuity, or neighborhood preservation metrics), the work would offer a novel approach to automating hyperparameter tuning in dimensionality reduction and visualization, potentially lowering the barrier for exploratory analysis in data science and HCI.
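For readers unfamiliar with the metrics named above, the following is a small pure-Python sketch of trustworthiness and continuity in the sense of Venna and Kaski. It is an illustration only, not the paper's evaluation code: the function names are chosen here, ties are broken by index, and the quadratic cost makes it suitable only for small samples. For real data, scikit-learn provides `sklearn.manifold.trustworthiness`.

```python
import math
from typing import List, Sequence

Point = Sequence[float]


def _ranks(points: List[Point]) -> List[List[int]]:
    """ranks[i][j] is the neighbour rank of j as seen from i (1 = nearest)."""
    n = len(points)
    ranks = []
    for i in range(n):
        # Sort the other points by distance to i; ties are broken by index.
        order = sorted(
            (j for j in range(n) if j != i),
            key=lambda j: (math.dist(points[i], points[j]), j),
        )
        row = [0] * n
        for rank, j in enumerate(order, start=1):
            row[j] = rank
        ranks.append(row)
    return ranks


def trustworthiness(high: List[Point], low: List[Point], k: int) -> float:
    """Penalise points that are k-neighbours in `low` but not in `high`.

    T(k) = 1 - 2 / (n k (2n - 3k - 1)) * sum_i sum_{j in U_k(i)} (r(i, j) - k),
    where U_k(i) holds the false neighbours of i in the embedding and r(i, j)
    is the rank of j from i in the original space. Requires k < n / 2.
    """
    n = len(high)
    if len(low) != n or not 0 < k < n / 2:
        raise ValueError("need equally sized inputs and 0 < k < n / 2")
    r_high, r_low = _ranks(high), _ranks(low)
    penalty = 0
    for i in range(n):
        for j in range(n):
            if j != i and r_low[i][j] <= k and r_high[i][j] > k:
                penalty += r_high[i][j] - k
    return 1.0 - 2.0 * penalty / (n * k * (2 * n - 3 * k - 1))


def continuity(high: List[Point], low: List[Point], k: int) -> float:
    """Continuity is trustworthiness with the two spaces swapped."""
    return trustworthiness(low, high, k)
```

A score of 1.0 means no false neighbours were introduced (trustworthiness) or no true neighbours were lost (continuity); a before/after table of these two numbers per iteration is the kind of evidence the report asks for.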

major comments (2)

[Abstract] Abstract: The central claim that 'by implementing an iterative optimization loop of this process, the system is able to produce rapidly a high-quality visualization plot, in full automation' is unsupported by any implementation details, experimental results, before/after metric tables, ablation studies, or baseline comparisons. Without such evidence the automation assertion cannot be evaluated.
The manuscript provides no analysis of LLM reliability in this domain, such as whether quantitative metrics are translated into recommendations that actually improve objective embedding quality rather than merely rephrasing inputs or introducing bias/hallucination.

minor comments (1)

[Abstract] Typo in Abstract: 'pipleline' should read 'pipeline'.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their constructive review and for highlighting key areas where the manuscript requires strengthening. We address each major comment below and commit to revisions that will incorporate the requested empirical evidence and analysis.

Referee: [Abstract] Abstract: The central claim that 'by implementing an iterative optimization loop of this process, the system is able to produce rapidly a high-quality visualization plot, in full automation' is unsupported by any implementation details, experimental results, before/after metric tables, ablation studies, or baseline comparisons. Without such evidence the automation assertion cannot be evaluated.

Authors: We agree that the abstract claim requires empirical substantiation. The current manuscript focuses on describing the agentic pipeline architecture, the LLM's role in generating semantic reports, and the iterative refinement mechanism. In the revised version we will add a dedicated experimental section that includes: (1) full implementation details of the iterative loop (prompt templates, metric-to-text translation, and stopping criteria); (2) before-and-after quantitative results on standard high-dimensional datasets using trustworthiness, continuity, and neighborhood-preservation metrics; (3) ablation studies isolating the contribution of the LLM agent; and (4) baseline comparisons against conventional hyperparameter search methods. These additions will directly support the automation claim.

revision: yes

Referee: [—] The manuscript provides no analysis of LLM reliability in this domain, such as whether quantitative metrics are translated into recommendations that actually improve objective embedding quality rather than merely rephrasing inputs or introducing bias/hallucination.

Authors: We acknowledge this is a valid and important gap. The manuscript currently presents the system design without systematic validation of the LLM's recommendation quality. In the revision we will introduce a new subsection that evaluates LLM reliability through: repeated runs on the same inputs to measure output consistency, tracking of objective metric changes after applying the suggested configuration updates, and qualitative review of whether recommendations go beyond rephrasing to produce verifiable embedding improvements. This analysis will be supported by concrete examples and quantitative deltas.

revision: yes
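The consistency and metric-delta checks proposed in this response could be summarised roughly as follows. This is a sketch under assumptions: the function names, the choice of agreement rate and coefficient of variation as consistency measures, and the example values are all illustrative and do not come from the manuscript.

```python
from collections import Counter
from statistics import mean, pstdev
from typing import Dict, List


def recommendation_consistency(runs: List[Dict[str, float]], key: str) -> Dict[str, float]:
    """Summarise how stable one recommended hyperparameter is across repeated runs.

    runs: one recommended configuration per repeated LLM call on the same input.
    key:  the hyperparameter to inspect, e.g. "perplexity".
    """
    values = [run[key] for run in runs]
    top_count = Counter(values).most_common(1)[0][1]
    mu = mean(values)
    return {
        "agreement": top_count / len(values),  # share of runs giving the modal value
        "mean": mu,
        "cv": pstdev(values) / abs(mu) if mu else float("inf"),  # relative spread
    }


def improvement_rate(before: List[float], after: List[float]) -> float:
    """Fraction of applied recommendations that raised the objective metric."""
    if len(before) != len(after) or not before:
        raise ValueError("need two non-empty score lists of equal length")
    return sum(a > b for b, a in zip(before, after)) / len(before)


if __name__ == "__main__":
    runs = [{"perplexity": 30}, {"perplexity": 30}, {"perplexity": 40}, {"perplexity": 30}]
    print(recommendation_consistency(runs, "perplexity"))
    print(improvement_rate([0.80, 0.90, 0.85], [0.85, 0.88, 0.90]))
```

A low agreement rate on identical inputs, or an improvement rate near one half, would suggest the recommendations are closer to noise than to informed guidance, which is the concern the referee raises.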

Circularity Check

0 steps flagged

No circularity: system proposal with no derivations or self-referential reductions

The manuscript is a system description of an LLM-based iterative pipeline for visualization hyperparameter refinement. No equations, fitted parameters, uniqueness theorems, or self-citations are invoked in a load-bearing way that would reduce any claim to its own inputs by construction. The iterative loop is presented as an engineering process rather than a closed-form result equivalent to its assumptions, so the derivation chain contains no circular steps.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 1 invented entity

The proposal rests on the untested premise that current LLMs can serve as reliable judges of visualization fidelity; no free parameters are introduced because no quantitative model is fitted.

axioms (1)

domain assumption: Large language models can produce accurate qualitative assessments of data visualization quality that align with human expert judgment.
The entire pipeline depends on the LLM generating trustworthy multi-faceted reports and recommendations.

invented entities (1)

LLM visualization refinement agent (no independent evidence)
purpose: To bridge quantitative metrics and qualitative insight in an automated iterative loop
The agent is introduced as the core new component without prior independent validation in the abstract.

Reference graph

Works this paper leans on

20 extracted references · 20 canonical work pages · 1 internal anchor

[1] L. van der Maaten and G. Hinton, “Visualizing data using t-SNE,” J. Mach. Learn. Res., vol. 9, no. 86, pp. 2579–2605, Nov. 2008. [Online]. Available: http://jmlr.org/papers/v9/vandermaaten08a.html

[2] L. McInnes, J. Healy, and J. Melville, “UMAP: Uniform manifold approximation and projection for dimension reduction,” arXiv preprint arXiv:1802.03426, 2020. [Online]. Available: https://arxiv.org/abs/1802.03426

[3] O. Arbelaitz, I. Gurrutxaga, J. Muguerza, J. M. Pérez, and I. Perona, “An extensive comparative study of cluster validity indices,” Pattern Recognit., vol. 46, no. 1, pp. 243–256, Jan. 2013. [Online]. Available: https://doi.org/10.1016/j.patcog.2012.07.021

[4] T. Chari and L. Pachter, “The specious art of single-cell genomics,” PLoS Comput. Biol., vol. 19, no. 10, p. e1011288, Oct. 2023. [Online]. Available: https://doi.org/10.1371/journal.pcbi.1011288

[5] Y. Wang, H. Huang, C. Rudin, and Y. Shaposhnik, “Understanding how dimension reduction tools work: An empirical approach to deciphering t-SNE, UMAP, TriMAP, and PaCMAP for data visualization,” J. Mach. Learn. Res., vol. 22, no. 1, pp. 9129–9201, 2021. [Online]. Available: https://arxiv.org/abs/2012.04456

[6] M. Wattenberg, F. Viégas, and I. Johnson, “How to use t-SNE effectively,” Distill, 2016. [Online]. Available: https://doi.org/10.23915/distill.00002

[7] J. Venna and S. Kaski, “Local multidimensional scaling,” Neural Netw., vol. 19, no. 8, pp. 1189–1199, Oct. 2006. [Online]. Available: https://doi.org/10.1016/j.neunet.2006.06.014

[8] J. Bergstra and Y. Bengio, “Random search for hyper-parameter optimization,” J. Mach. Learn. Res., vol. 13, no. 10, pp. 281–305, Feb. 2012. [Online]. Available: http://jmlr.org/papers/v13/bergstra12a.html

[9] J. Snoek, H. Larochelle, and R. P. Adams, “Practical Bayesian optimization of machine learning algorithms,” in Adv. Neural Inf. Process. Syst. (NeurIPS), vol. 25, 2012, pp. 2951–2959.

[10] L. Wang et al., “A survey on large language model based autonomous agents,” Front. Comput. Sci., vol. 18, no. 6, pp. 1–26, Dec. 2024. [Online]. Available: https://doi.org/10.1007/s11704-024-40231-1

[11] Z. Yang et al., “MatPlotAgent: Method and evaluation for LLM-based agentic scientific data visualization,” in Findings of the Assoc. for Comput. Linguistics: ACL 2024, Bangkok, Thailand, Aug. 2024, pp. 11789–11804. [Online]. Available: https://aclanthology.org/2024.findings-acl.701/

[12] Z. Chen et al., “Coda: Agentic systems for collaborative data visualization,” arXiv preprint arXiv:2510.03194, 2025. [Online]. Available: https://arxiv.org/abs/2510.03194

[13] S. Liu, C. Gao, and Y. Li, “AgentHPO: Large language model agent for hyper-parameter optimization,” in Proc. Conf. Parsimony and Learn. (CPAL), vol. 280, 2025, pp. 1146–1169. [Online]. Available: https://proceedings.mlr.press/v280/liu25c.html

[14] R. R. Sokal and C. D. Michener, “A statistical method for evaluating systematic relationships,” Univ. Kans. Sci. Bull., vol. 38, pp. 1409–1438, 1958.

[15] P. J. Rousseeuw, “Silhouettes: A graphical aid to the interpretation and validation of cluster analysis,” J. Comput. Appl. Math., vol. 20, pp. 53–65, Nov. 1987. [Online]. Available: https://doi.org/10.1016/0377-0427(87)90125-7

[16] M. M. Breunig, H.-P. Kriegel, R. T. Ng, and J. Sander, “LOF: Identifying density-based local outliers,” SIGMOD Rec., vol. 29, no. 2, pp. 93–104, May 2000. [Online]. Available: https://doi.org/10.1145/335191.335388

[17] C. Spearman, “The proof and measurement of association between two things,” Amer. J. Psychol., vol. 15, no. 1, pp. 72–101, Jan. 1904. [Online]. Available: https://doi.org/10.2307/1412159

[18] C. M. McEvoy et al., “Single-cell profiling of healthy human kidney reveals features of sex-based transcriptional programs and tissue-specific immunity,” Nat. Commun., vol. 13, no. 1, p. 7634, Dec. 2022. [Online]. Available: https://doi.org/10.1038/s41467-022-35297-z

[19] B. J. Stewart et al., “Spatiotemporal immune zonation of the human kidney,” Science, vol. 365, no. 6459, pp. 1461–1466, Sep. 2019. [Online]. Available: https://doi.org/10.1126/science.aat5031

[20] M. L. Speir et al., “UCSC Cell Browser: Visualize your single-cell data,” Bioinformatics, vol. 37, no. 23, pp. 4578–4580, Dec. 2021. [Online]. Available: https://doi.org/10.1093/bioinformatics/btab503

Burak Susam is a post-graduate researcher at the University of Manchester, UK. His current research interests include data visualization, agentic AI and di...
