Sycamore: Characterizing Synthetic Personas for Evaluating Genomics Visualization Retrieval

Astrid van den Brandt; Huyen N. Nguyen; Nils Gehlenborg

arxiv: 2605.08630 · v2 · pith:NLWJL6XVnew · submitted 2026-05-09 · 💻 cs.HC

Pith reviewed 2026-06-30 23:22 UTC · model grok-4.3

keywords: synthetic personas · LLM-based evaluation · genomics visualization · user evaluation · visualization systems · domain experts · feedback alignment · multimodal retrieval

The pith

Grounding synthetic personas in user study artifacts aligns their feedback with real expert concerns in genomics visualization retrieval, though grounded and ungrounded personas alike miss the experts' image-modality preference.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper investigates how synthetic personas perform when evaluating a genomics visualization search engine compared to real domain experts. It uses a three-condition design: ungrounded LLM personas, personas grounded in prior interview data, and a real expert baseline. Grounding makes synthetic feedback match user language and concerns better, while ungrounded personas highlight operational issues that the experts did not raise. Both synthetic conditions settle on a find-and-adapt frame and overlook the experts' preference for the image modality. This setup helps clarify the role of synthetic evaluators alongside scarce expert studies in specialized domains.

Core claim

Using Sycamore's three-condition probe on Geranium, the study finds that grounding synthetic personas with voice-of-customer artifacts from prior interviews shifts their feedback toward the language and concerns of documented users, ungrounded personas drift toward operational specifics not mentioned by real participants, and both synthetic conditions converge on a find-and-adapt frame while missing the image-modality preference observed in the expert study.

What carries the argument

The three-condition probe design that compares outputs from ungrounded synthetic personas, grounded synthetic personas constrained by voice-of-customer artifacts, and real expert baselines to characterize differences in evaluation feedback.
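
The three-condition comparison described above could be made concrete along the following lines. This is a minimal sketch and not the authors' method: the paper reports qualitative thematic comparisons, and the `Condition` class, the `jaccard` helper, and the `theme_*` codes are hypothetical names invented purely for illustration. It only shows how coded feedback themes from each synthetic condition might be set against the expert baseline to surface overlap and missed themes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Condition:
    """One arm of a three-condition probe (hypothetical structure)."""
    name: str
    grounding: str          # what constrains the evaluator
    themes: frozenset[str]  # coded feedback themes (placeholder labels)


def jaccard(a: frozenset[str], b: frozenset[str]) -> float:
    """Share of themes two conditions have in common (0.0 to 1.0)."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


# Placeholder theme codes, invented for illustration only; they are not
# findings from the paper or from the baseline study.
expert = Condition("expert baseline", "published user study",
                   frozenset({"theme_a", "theme_b", "theme_c"}))
grounded = Condition("grounded persona", "voice-of-customer artifacts",
                     frozenset({"theme_a", "theme_b", "theme_d"}))
ungrounded = Condition("ungrounded persona", "generic LLM priors",
                       frozenset({"theme_a", "theme_d", "theme_e"}))

for synthetic in (grounded, ungrounded):
    overlap = jaccard(synthetic.themes, expert.themes)
    missed = sorted(expert.themes - synthetic.themes)
    print(f"{synthetic.name}: overlap={overlap:.2f}, missed={missed}")
```

Set overlap is only one possible choice here; it treats every theme as equally important and says nothing about wording, so it would complement a qualitative reading rather than replace it. The numbers it prints for the placeholder sets carry no empirical meaning.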

If this is right

Grounding synthetic personas improves their alignment with real user concerns and language.
Synthetic personas tend to converge on a find-and-adapt frame regardless of grounding.
Real expert studies reveal preferences like image-modality that synthetic evaluators miss.
Voice-of-customer artifacts from interviews can constrain synthetic personas effectively.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Synthetic personas could reduce the need for initial expert recruitment in visualization system evaluation by providing preliminary insights.
Future designs might combine synthetic and real feedback to capture both common frames and modality-specific preferences.
The convergence on find-and-adapt suggests this is a robust user need in genomics visualization retrieval worth prioritizing in system design.

Load-bearing premise

The published baseline study of real domain experts provides an unbiased and complete reference standard against which synthetic outputs can be compared.

What would settle it

Replicating the expert study with a new cohort of genomics domain experts and finding that their concerns differ substantially from the published baseline or that synthetic personas also identify image-modality preferences.

Figures

Figures reproduced from arXiv: 2605.08630 by Astrid van den Brandt, Huyen N. Nguyen, Nils Gehlenborg.

**Figure 1.** Diagram of the Sycamore system. Left: three-condition evaluation on a common visualization retrieval system to address …

**Figure 2.** The object of evaluation, Geranium [16] multimodal retrieval system. Users can search with text, image, or Gosling specification and will rank their modality preferences at the end of the evaluation. 3.2 Evaluation Protocol: Sycamore adopts the published Geranium user study [16] in two roles. First, its protocol provides the shared procedure that all three conditions follow. Second, its reported findings pr…

**Figure 3.** Four-step pipeline for instantiating synthetic evaluators: …

**Figure 4.** The Sycamore session viewer streaming a grounded evaluator (CB1, Computational Biologist) through the Geranium protocol. (1) …

Original abstract

Evaluating visualization systems in niche domains such as genomics is challenging due to scarcity of domain experts and difficulty recruiting a representative user base. While LLM-based synthetic personas are increasingly used to ease evaluation bottlenecks, they face well-founded skepticism. Rather than weighing synthetic personas as substitutes for real users, we ask a fundamental open question: when synthetic personas evaluate a real visualization system, what do they actually produce, and how does that output change when grounded in documented human contexts? We present Sycamore, an exploratory three-condition probe design using Geranium, a search engine for multimodal genomics visualization, as a case study. Sycamore evaluates Geranium using: (1) ungrounded synthetic personas from generic LLM priors; (2) grounded synthetic personas constrained by voice-of-customer artifacts from a prior interview study; and (3) a published baseline study of real domain experts. We observe that grounding shifts synthetic feedback toward the language and concerns of documented users, while ungrounded evaluators drift toward operational specifics that real participants did not raise; both synthetic conditions, however, converge on a find-and-adapt frame and miss the image-modality preference observed in the expert study. We discuss what these observations imply for where synthetic personas might fit alongside expert studies in domain-specific visualization evaluation. All supplemental materials are available at https://osf.io/kdfr3/.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

The three-condition probe comparing ungrounded and grounded synthetic personas against a real-expert baseline in genomics visualization retrieval is the concrete new piece, though the observations stay directional and rest on an unvalidated baseline.

The paper's main contribution is running the same evaluation task on Geranium through three setups: generic LLM personas, personas constrained by prior interview artifacts, and the published expert study. The directional observation that grounding moves output toward documented user language while both synthetic versions miss the image-modality preference and converge on a find-and-adapt frame is a usable data point for this narrow setting.

They handle the framing reasonably by treating the work as a probe rather than a replacement claim and by releasing supplemental materials. The design itself is simple and directly contrasts the conditions against an external reference, which avoids some of the usual circularity in synthetic-user papers.

The soft spots are straightforward. No quantitative metrics, prompt templates, inter-rater details, or statistical comparisons appear in the abstract, so the strength of the "shift" and "miss" claims cannot be checked. More importantly, the central comparison treats the prior expert study as a fixed, representative standard without any sensitivity check on its coverage or recruitment. If that study under-sampled certain contexts, the reported differences are relative to an incomplete reference rather than to domain experts more broadly. The exploratory nature is acknowledged, but that also means the results function more as a case illustration than as settled evidence.

This is for HCI researchers already working on evaluation methods for scientific visualization or on LLM synthetic users in specialized domains. A reader already thinking about these bottlenecks could extract a practical design pattern and the caution about modality preferences. It does not move the broader genomics or visualization literature.

I would send it to peer review. The question is relevant and the three-way setup is replicable; referees can ask for the missing methods details and a clearer discussion of baseline limitations.

Referee Report

2 major / 1 minor

Summary. The paper presents Sycamore, an exploratory three-condition probe design that evaluates the Geranium genomics visualization search engine using (1) ungrounded synthetic personas from generic LLM priors, (2) grounded synthetic personas constrained by voice-of-customer artifacts from a prior interview study, and (3) a published baseline study of real domain experts. It reports that grounding shifts synthetic feedback toward the language and concerns of documented users, ungrounded evaluators drift toward operational specifics not raised by real participants, and both synthetic conditions converge on a find-and-adapt frame while missing the image-modality preference observed in the expert study. Supplemental materials are provided at OSF.

Significance. If the observations hold under more rigorous quantification, the work offers a useful framing for the appropriate role of synthetic personas alongside expert studies in niche-domain visualization evaluation, where expert recruitment is difficult. The open release of supplemental materials supports reproducibility and is a clear strength.

major comments (2)

[Abstract] Abstract and probe description: the central observational claims (shifts in language/concerns, convergence on find-and-adapt, and missing image-modality preference) are presented as directional findings without quantitative metrics, inter-rater reliability details, prompt templates, or statistical comparisons, leaving the support for these claims unverifiable from the reported evidence.
[Results and Discussion] Baseline comparison (throughout results and discussion): the claims of 'drift,' 'convergence,' and 'miss' treat the published expert study as a complete, unbiased reference standard, but the manuscript provides no sensitivity check, coverage validation, or discussion of potential selection/reporting biases in the baseline's participant pool or protocol, which is load-bearing for the comparative conclusions.

minor comments (1)

[Methods] The three conditions could be summarized in a table for clearer side-by-side comparison of inputs, outputs, and observed differences.

Simulated Author's Rebuttal

2 responses · 1 unresolved

We thank the referee for their constructive review and for recognizing the potential value of this work in framing the role of synthetic personas in niche-domain visualization evaluation. We address each major comment below, indicating where revisions will be made.

Referee: [Abstract] Abstract and probe description: the central observational claims (shifts in language/concerns, convergence on find-and-adapt, and missing image-modality preference) are presented as directional findings without quantitative metrics, inter-rater reliability details, prompt templates, or statistical comparisons, leaving the support for these claims unverifiable from the reported evidence.

Authors: The study is explicitly framed as an exploratory three-condition probe rather than a confirmatory experiment, so the claims are intentionally directional and observational. Quantitative metrics, inter-rater reliability, and statistical comparisons are not applicable to this design and were not performed. To improve verifiability, we will expand the methods section to include the exact prompt templates used for both synthetic conditions and will ensure all analysis materials (including any coding schemes) are fully documented in the OSF supplement. We will also revise the abstract and discussion to more explicitly characterize the findings as qualitative observations.
Revision: partial
Referee: [Results and Discussion] Baseline comparison (throughout results and discussion): the claims of 'drift,' 'convergence,' and 'miss' treat the published expert study as a complete, unbiased reference standard, but the manuscript provides no sensitivity check, coverage validation, or discussion of potential selection/reporting biases in the baseline's participant pool or protocol, which is load-bearing for the comparative conclusions.

Authors: We agree that the comparative claims rest on the published expert study serving as a reference point and that potential biases in its participant pool or protocol should be addressed. Because the baseline is a previously published study, we lack access to its raw data and therefore cannot conduct new sensitivity or coverage analyses. We will add a dedicated limitations paragraph in the discussion that explicitly discusses possible selection and reporting biases in the baseline and how they could influence the observed differences. This will qualify the language around 'drift,' 'convergence,' and 'miss' to reflect the exploratory nature of the comparison.
Revision: partial

standing simulated objections not resolved

Access to the raw participant data and protocol details from the published baseline expert study, which would be required to perform sensitivity checks or coverage validation.

Circularity Check

0 steps flagged

No circularity: exploratory comparison to external published baseline

The paper conducts an empirical, qualitative comparison across three conditions (ungrounded synthetic personas, grounded synthetic personas using prior interview artifacts, and a published real-expert baseline study). There are no equations, fitted parameters, predictions, or first-principles derivations. The central observations (shifts in language/concerns, convergence on find-and-adapt frame, missing image-modality preference) are direct thematic comparisons of outputs against the external baseline, not reductions of any result to the paper's own inputs by construction. The reference to prior work functions as an independent benchmark rather than a self-definitional or load-bearing premise that forces the outcome. This matches the default case of a self-contained empirical probe with external anchors.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

This is an empirical HCI probe study whose central observations rest on the assumption that LLM outputs can be meaningfully compared to human expert feedback and that the prior interview artifacts are representative.

axioms (1)

domain assumption: LLM-generated personas can produce evaluable feedback on visualization systems when prompted appropriately.
Foundational premise enabling the entire three-condition comparison.

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Through the WordStream Glass: Revisiting Quantitative Encoding for Qualitative Learning Analytics
cs.CY 2026-06 unverdicted novelty 4.0

A study of 10 experts reveals disagreement on whether frequency visualizations aid or hinder qualitative analysis of student responses in learning analytics tools.

Reference graph

Works this paper leans on

26 extracted references · 12 canonical work pages · cited by 1 Pith paper · 1 internal anchor

[1] ATLAS.ti Scientific Software Development GmbH. ATLAS.ti Mac. https://atlasti.com, 2024. Version 24.0.1.

[2] T. B. Brown, B. Mann, N. Ryder, M. Subbiah, J. Kaplan, P. Dhariwal, A. Neelakantan, P. Shyam, G. Sastry, A. Askell, S. Agarwal, A. Herbert-Voss, G. Krueger, T. Henighan, R. Child, A. Ramesh, D. M. Ziegler, J. Wu, C. Winter, C. Hesse, M. Chen, E. Sigler, M. Litwin, S. Gray, B. Chess, J. Clark, C. Berner, S. McCandlish, A. Radford, I. Sutskever, and D. A... (2020)

[3] A. Cooper et al. The inmates are running the asylum: Why high-tech products drive us crazy and how to restore the sanity, vol. 2. Sams Indianapolis, 2004.

[4] A. Crisan, B. Fiore-Gartland, and M. Tory. Passing the data baton: A retrospective analysis on data science work and workers. IEEE Transactions on Visualization and Computer Graphics, 27(2):1860–1870, 2021. doi: 10.1109/TVCG.2020.3030340

[5] B. Gao, Z. Zeng, Y. Yu, I. P. Werry, C. L. Chan, M. Chen, H. Zhang, B. Huang, J. Ji, C. Leung, and C. Miao. “it seems to understand my heart”: An empirical study of persona-driven persuasive ai agent for aging-in-place in singapore. In Proceedings of the 2026 CHI Conference on Human Factors in Computing Systems, CHI ’26. Association for Computing Machin... (2026)

[6] S. Jain, C. Park, M. Viana, A. Wilson, and D. Calacci. Interaction context often increases sycophancy in llms. In Proceedings of the 2026 CHI Conference on Human Factors in Computing Systems, CHI ’26. Association for Computing Machinery, New York, NY, USA, 2026. doi: 10.1145/3772318.3791915

[7] James Newhook, Interaction Design Foundation. Are AI-Generated Synthetic Users Replacing Personas? What UX Designers Need to Know, 2024.

[8] I. Kaate, J. Salminen, S.-G. Jung, T. T. T. Xuan, E. Häyhänen, J. Y. Azem, and B. J. Jansen. “you always get an answer”: Analyzing users’ interaction with ai-generated personas given unanswerable questions and risk of hallucination. In Proceedings of the 30th International Conference on Intelligent User Interfaces, pp. 1624–1638, 2025.

[9] A. B. Kocaballi, M. Prpa, J. Salminen, D. Amin, and B. J. Jansen. From generation to simulation: Responsible use of ai personas in human-centered design and research. In Proceedings of the Extended Abstracts of the 2026 CHI Conference on Human Factors in Computing Systems, CHI EA ’26. Association for Computing Machinery, New York, NY, USA, 2026. doi: 10... (work page doi: 10.1145/3772363.3778745)

[10] M. Krzywinski, J. Schein, I. Birol, J. Connors, R. Gascoyne, D. Horsman, S. J. Jones, and M. A. Marra. Circos: an information aesthetic for comparative genomics. Genome research, 19(9):1639–1645, 2009.

[11] S. L’Yi, Q. Wang, F. Lekschas, and N. Gehlenborg. Gosling: A grammar-based toolkit for scalable and interactive genomics data visualization. IEEE Transactions on Visualization and Computer Graphics, 28(1):140–150, 2021.

[12] S. L’Yi, A. van den Brandt, E. Adams, H. N. Nguyen, and N. Gehlenborg. Learnable and expressive visualization authoring through blended interfaces. IEEE Transactions on Visualization and Computer Graphics, 31(1):459–469, 2025. doi: 10.1109/TVCG.2024.3456598

[13] S. L’Yi, Q. Wang, and N. Gehlenborg. The role of visualization in genomics data analysis workflows: The interviews. In 2023 IEEE Visualization and Visual Analytics (VIS), pp. 101–105. IEEE, 2023.

[14] H. N. Nguyen and N. Gehlenborg. Safire: Similarity framework for visualization retrieval. In 2025 IEEE Visualization and Visual Analytics (VIS), pp. 246–250, 2025. doi: 10.1109/VIS60296.2025.00055

[15] H. N. Nguyen and N. Gehlenborg. Visualization retrieval for data literacy: Position paper. CHI 2026 Workshop on Data Literacy, Mar. 2026. doi: 10.48550/arXiv.2604.09598

[16] H. N. Nguyen, S. L’Yi, T. C. Smits, S. Gao, M. Zitnik, and N. Gehlenborg. Geranium: Multimodal retrieval of genomics data visualizations. IEEE Transactions on Visualization and Computer Graphics, pp. 1–17, 2026. doi: 10.1109/TVCG.2026.3683429

[17] A. Pandey, S. L’Yi, Q. Wang, M. A. Borkin, and N. Gehlenborg. Genorec: A recommendation system for interactive genomics data visualization. IEEE Transactions on Visualization and Computer Graphics, 29(1):570–580, 2023. doi: 10.1109/TVCG.2022.3209407

[18] J. S. Park, J. O’Brien, C. J. Cai, M. R. Morris, P. Liang, and M. S. Bernstein. Generative agents: Interactive simulacra of human behavior. UIST ’23. Association for Computing Machinery, New York, NY, USA, 2023. doi: 10.1145/3586183.3606763

[19] J. S. Park, C. Q. Zou, J. Kamphorst, N. Egan, A. Shaw, B. M. Hill, C. Cai, M. R. Morris, P. Liang, R. Willer, and M. S. Bernstein. Llm agents grounded in self-reports enable general-purpose simulation of individuals, 2026.

[20] J. Salminen, C. Liu, W. Pian, J. Chi, E. Häyhänen, and B. J. Jansen. Deus ex machina and personas from large language models: Investigating the composition of ai-generated persona descriptions. In Proceedings of the 2024 CHI Conference on Human Factors in Computing Systems, CHI ’24. Association for Computing Machinery, New York, NY, USA, 2024. do... (work page doi: 10.1145/3613904.3642036)

[21] H. Thorvaldsdóttir, J. T. Robinson, and J. P. Mesirov. Integrative genomics viewer (igv): high-performance genomics data visualization and exploration. Briefings in bioinformatics, 14(2):178–192, 2013.

[22] M. Truss. Personacite: Voc-grounded interviewable agentic synthetic ai personas for verifiable user and design research. In Proceedings of the Extended Abstracts of the 2026 CHI Conference on Human Factors in Computing Systems, pp. 1–7, 2026.

[23] A. van den Brandt, S. L’Yi, H. N. Nguyen, A. Vilanova, and N. Gehlenborg. Understanding visualization authoring techniques for genomics data in the context of personas and tasks. IEEE Transactions on Visualization and Computer Graphics, 31(1):1180–1190, 2025. doi: 10.1109/TVCG.2024.3456298

[24] L. Welch, F. Lewitter, R. Schwartz, C. Brooksbank, P. Radivojac, B. Gaeta, and M. V. Schneider. Bioinformatics curriculum guidelines: toward a definition of core competencies. PLOS computational biology, 10(3):e1003496, 2014.
