
arxiv: 2304.04083 · v2 · pith:XY24ZWA2new · submitted 2023-04-08 · 💻 cs.HC · cs.GR

VOICE: Visual Oracle for Interaction, Conversation, and Explanation

Pith reviewed 2026-05-24 09:09 UTC · model grok-4.3

classification 💻 cs.HC cs.GR
keywords conversational visualization, voice interaction, molecular visualization, large language models, interactive 3D, science communication, pack-of-bots

The pith

The VOICE framework pairs arbitrary voice commands with real-time verbal responses and matching 3D visual flythroughs.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper presents VOICE as a system that links large language models to interactive 3D visualization for science communication. It uses a pack-of-bots architecture with fine-tuning and prompt engineering to interpret voice inputs, assign tasks, generate explanations, and produce synchronized visual sequences. The approach allows users to navigate and manipulate models through natural language while maintaining low latency and high accuracy in the coupling of speech and visuals. The system is applied to and tested on molecular models with multi-scale and multi-instance features, and educational experts provide input on its value for teaching.

Core claim

VOICE relies on a pack-of-bots that performs distinct roles such as task assignment, instruction extraction, and content generation; after fine-tuning and prompt engineering, these bots enable the system to accept arbitrary voice commands, deliver verbal responses, and generate tightly coupled visual representations including flythrough sequences for 3D molecular models.

What carries the argument

A pack-of-bots architecture in which specialized bots handle task assignment, instruction extraction, and coherent content generation, customized via fine-tuning and prompt engineering.
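As a rough illustration of the pack-of-bots idea, the sketch below routes a transcribed query through a task-assignment step to one of several specialized handlers. Everything in it is hypothetical: the names `manager_bot`, `explainer_bot`, and `navigator_bot`, the keyword rule that stands in for an LLM call, and the instruction dictionary format are assumptions made for illustration and are not taken from the paper.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class BotReply:
    bot: str           # which bot handled the query
    speech: str        # text that would be spoken back to the user
    instruction: dict  # hypothetical command for the visualization system


def explainer_bot(query: str) -> BotReply:
    # Stand-in for an LLM that generates explanatory content.
    return BotReply("explainer", f"Here is an explanation for: {query}", {})


def navigator_bot(query: str) -> BotReply:
    # Stand-in for an LLM that extracts a navigation instruction.
    target = query.lower().split("to the")[-1].strip(" .?!")
    return BotReply(
        "navigator",
        f"Moving to the {target}.",
        {"action": "focus", "target": target},
    )


def manager_bot(query: str) -> str:
    # Stand-in for the task-assignment bot: a keyword rule replaces the LLM call.
    navigation_words = ("go to", "zoom", "show me", "rotate")
    text = query.lower()
    return "navigator" if any(w in text for w in navigation_words) else "explainer"


# Registry that maps a task label to the bot responsible for it.
BOTS: Dict[str, Callable[[str], BotReply]] = {
    "explainer": explainer_bot,
    "navigator": navigator_bot,
}


def handle_query(query: str) -> BotReply:
    # Assign the task first, then let the chosen bot produce speech and instruction.
    return BOTS[manager_bot(query)](query)
```

Keeping task assignment separate from the handlers means each role could be prompted or fine-tuned on its own, which is the property the summary above attributes to the architecture.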

If this is right

  • Natural language inputs allow real-time navigation and manipulation of 3D models.
  • Text-to-visualization produces flythrough sequences that match the verbal explanation content.
  • The framework maintains low latency and high accuracy when coupling voice responses to visual changes.
  • The method applies to molecular models that include multi-scale and multi-instance attributes.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same bot-pack structure could be adapted to visualization domains other than molecules by changing the fine-tuning data.
  • Educational settings might benefit from voice-driven sessions that reduce reliance on mouse-and-keyboard controls.
  • Direct comparison of latency and error rates against existing visualization interfaces would quantify the claimed gains.
  • Extending the system to handle multi-user conversations could support group explanations without additional engineering.

Load-bearing premise

Fine-tuning and prompt engineering of the bots will produce accurate, coherent responses to arbitrary user queries in molecular visualization without hallucinations or task failures.

What would settle it

Run user tests with ambiguous, complex, or out-of-domain voice commands on the molecular models and check whether responses remain accurate, coherent, and free of hallucinations or failures.
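One way such a test could be tallied is sketched below: each command is sent to the system, the response time is recorded, and the returned action is compared with an expected label. The `respond` callable, the action labels, and the sample commands are placeholders invented for this sketch; a real study would call the actual system and would need human judgment for coherence and hallucination.

```python
import time
from statistics import median
from typing import Callable, List, Tuple


def evaluate(respond: Callable[[str], str], cases: List[Tuple[str, str]]) -> dict:
    """Run every (command, expected_action) pair and summarize errors and latency."""
    latencies = []
    failures = []
    for command, expected in cases:
        start = time.perf_counter()
        answer = respond(command)
        latencies.append(time.perf_counter() - start)
        if answer != expected:
            failures.append(command)
    return {
        "n": len(cases),
        "error_rate": len(failures) / len(cases),
        "median_latency_s": median(latencies),
        "failures": failures,
    }


def toy_respond(command: str) -> str:
    # Hypothetical stand-in for the system under test.
    text = command.lower()
    if "cut" in text:
        return "cutting_plane"
    if "zoom" in text:
        return "focus"
    return "clarify"  # ask the user to rephrase instead of guessing


# Hypothetical test set mixing in-domain, paraphrased, and out-of-domain commands.
CASES = [
    ("Cut the model in half", "cutting_plane"),
    ("Zoom in on the membrane", "focus"),
    ("What is the weather today?", "clarify"),
    ("Slice it open", "cutting_plane"),  # paraphrase the toy rule does not cover
]

report = evaluate(toy_respond, CASES)
```

Reporting the failing commands alongside the aggregate rate makes it easier to see whether errors cluster on paraphrases, ambiguous requests, or out-of-domain input.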

Figures

Figures reproduced from arXiv: 2304.04083 by Alexandra Irger, Anders Ynnerman, Deng Luo, Donggang Jia, Ivan Viola, Johanna Bjorklund, Lonni Besancon, Ondrej Strnad.

Figure 1: VOICE’s initial screen. VOICE can process an arbitrary speech request to answer a question, return a corresponding animation, or conversationally explore the model. The complexity and non-linearity of learning processes are well documented [45] and not to be underestimated. Excessive exploratory freedom can lead to engagement thresholds for science center visitors that are hard to overcome. Several mitigat…
Figure 2: Overview of the VOICE framework. The dialogue system begins with a user’s speech query. It uses a “pack-of-bots” architecture to process this query. The system either answers questions or follows instructions, which are then given to a visualization system. … translate them into our internal instruction syntax, which is then processed by the visualization system. The Cutting Plane process is triggered direct…
Figure 3: Few-shot prompt engineering and prompt-based fine-tuning. Few-shot prompt engineering enables direct output acquisition without altering the model. Conversely, prompt-based fine-tuning updates the model through multiple steps.
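To make the few-shot side of this contrast concrete, the sketch below assembles a prompt from a task description, a handful of worked examples, and the new query, leaving the model itself untouched. The function name `build_few_shot_prompt`, the `User:`/`Output:` layout, and the instruction strings are illustrative assumptions; the paper's actual prompts and instruction syntax are not reproduced here.

```python
from typing import List, Tuple


def build_few_shot_prompt(task: str, examples: List[Tuple[str, str]], query: str) -> str:
    """Concatenate a task description, worked examples, and the new query."""
    lines = [task, ""]
    for user_text, output in examples:
        lines += [f"User: {user_text}", f"Output: {output}", ""]
    # The final block is left open so the model completes the output.
    lines += [f"User: {query}", "Output:"]
    return "\n".join(lines)


# Hypothetical examples; the instruction format is invented for illustration.
EXAMPLES = [
    ("Take me to the spike protein", "FOCUS spike_protein"),
    ("Show the inside of the virus", "CUT virus"),
]

prompt = build_few_shot_prompt(
    "Translate the user's request into one visualization instruction.",
    EXAMPLES,
    "Go to the membrane",
)
```

Prompt-based fine-tuning would instead use pairs like these as training data to update the model's weights over multiple steps, so the examples would no longer need to be sent with every request.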
Figure 4: Overview scene (a), focus scene (b), and cutting plane scene (c). The overview scene shows the external component labels and spatial information, while the focus scene illustrates the structural details. The cutting plane scene displays the internal components.
Figure 5: The interactive text-to-visualization method is demonstrated in the scene tree. Minimum index values are updated based on the index value of each node. Then, a traversal list is generated based on the minimum index values. We have implemented transitional sentences within a dedicated Speech Only scene to enhance user guidance and maintain a coherent and fluid conversation. We provide a subtle rotation anim…
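The caption of Figure 5 gives only a two-step outline, so the sketch below is one possible reading rather than the authors' algorithm. It assumes that each node's index is the position at which its component is first mentioned in the generated explanation (infinity if it is never mentioned), that a node's minimum index is the smallest index in its subtree, and that the flythrough visits children in ascending order of that minimum. The `SceneNode` class and the example tree are hypothetical.

```python
from dataclasses import dataclass, field
from math import inf
from typing import List


@dataclass
class SceneNode:
    name: str
    index: float = inf       # assumed: position of first mention; inf if unmentioned
    children: List["SceneNode"] = field(default_factory=list)
    min_index: float = inf   # smallest index found in this node's subtree


def update_min_index(node: SceneNode) -> float:
    # Post-order pass: a node's minimum is the smallest of its own index
    # and the minima of all its children.
    node.min_index = min([node.index] + [update_min_index(c) for c in node.children])
    return node.min_index


def traversal_list(node: SceneNode) -> List[str]:
    # Skip subtrees in which nothing is mentioned.
    if node.min_index == inf:
        return []
    order = [node.name]
    # Visit children whose subtree is mentioned earliest first.
    for child in sorted(node.children, key=lambda c: c.min_index):
        order += traversal_list(child)
    return order


# Hypothetical scene tree; the indices mimic the order of mention in an explanation.
root = SceneNode("virus", children=[
    SceneNode("capsid", index=1, children=[SceneNode("genome")]),
    SceneNode("envelope", index=2, children=[SceneNode("spike", index=0)]),
])

update_min_index(root)
order = traversal_list(root)
```

Under these assumptions an unmentioned parent is still visited when one of its descendants is mentioned, which keeps the camera path connected through the hierarchy.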
Figure 6: The initial view of three molecular models and demonstration of different commands. For each model, we demonstrate a multi-turn conversation scenario including either Explorer, Pilot, or cutting actions. Based on our comparison, small LLMs like Alpaca-lora-7b cannot correctly complete our Explorer and Pilot bots’ instruction extraction tasks and cannot generate accurate content. Open-sourced LLM with top p…
Original abstract

We present VOICE, a novel approach to science communication that connects large language models' (LLM) conversational capabilities with interactive exploratory visualization. VOICE introduces several innovative technical contributions that drive our conversational visualization framework. Our foundation is a pack-of-bots that can perform specific tasks, such as assigning tasks, extracting instructions, and generating coherent content. We employ fine-tuning and prompt engineering techniques to tailor bots' performance to their specific roles and accurately respond to user queries. Our interactive text-to-visualization method generates a flythrough sequence matching the content explanation. Besides, natural language interaction provides capabilities to navigate and manipulate the 3D models in real-time. The VOICE framework can receive arbitrary voice commands from the user and respond verbally, tightly coupled with corresponding visual representation with low latency and high accuracy. We demonstrate the effectiveness of our approach by applying it to the molecular visualization domain: analyzing three 3D molecular models with multi-scale and multi-instance attributes. We finally evaluate VOICE with the identified educational experts to show the potential of our approach. All supplemental materials are available at https://osf.io/g7fbr.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 1 minor

Summary. The paper presents VOICE, a novel framework for science communication that integrates large language models with interactive exploratory visualization. It introduces a pack-of-bots for task-specific roles using fine-tuning and prompt engineering, a text-to-visualization method for flythrough sequences, and natural language interaction for real-time 3D model navigation and manipulation. The system is demonstrated on three 3D molecular models with multi-scale and multi-instance attributes, evaluated by educational experts, and claims to handle arbitrary voice commands with low latency and high accuracy.

Significance. If the performance claims hold, VOICE could advance conversational interfaces for visualization in education and science communication by enabling natural voice-driven exploration of complex 3D models. The integration of LLMs with visualization is timely, but the lack of quantitative benchmarks makes it difficult to assess its novelty or superiority over prior systems.

major comments (2)
  1. [Abstract] The central claim that 'the VOICE framework can receive arbitrary voice commands from the user and respond verbally, tightly coupled with corresponding visual representation with low latency and high accuracy' is unsupported by any quantitative metrics, error rates, latency measurements, hallucination rates, or baseline comparisons. The expert evaluation is mentioned but supplies no details on methodology, tasks, or outcomes.
  2. [Evaluation, implied by the expert review description] The assumption that fine-tuning and prompt engineering of the pack-of-bots will yield coherent, hallucination-free responses to arbitrary queries in the molecular visualization domain is load-bearing for the contribution but receives no empirical validation or test coverage description.
minor comments (1)
  1. The manuscript should specify what supplemental materials (e.g., code, prompts, or evaluation data) are provided at the OSF link to support reproducibility of the pack-of-bots implementation.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive comments on our manuscript. We address each major point below and indicate where revisions will be made to strengthen the presentation of our claims and evaluation.

  1. Referee: [Abstract] The central claim that 'the VOICE framework can receive arbitrary voice commands from the user and respond verbally, tightly coupled with corresponding visual representation with low latency and high accuracy' is unsupported by any quantitative metrics, error rates, latency measurements, hallucination rates, or baseline comparisons. The expert evaluation is mentioned but supplies no details on methodology, tasks, or outcomes.

    Authors: We agree that the abstract asserts low latency and high accuracy without accompanying quantitative evidence in the current manuscript. In revision we will qualify this claim to describe observed behavior from our implementation and expand the evaluation section with details on the expert review protocol, specific tasks, participant feedback, and any available latency or accuracy observations from the molecular model demonstrations.
    Revision: yes

  2. Referee: [Evaluation, implied by the expert review description] The assumption that fine-tuning and prompt engineering of the pack-of-bots will yield coherent, hallucination-free responses to arbitrary queries in the molecular visualization domain is load-bearing for the contribution but receives no empirical validation or test coverage description.

    Authors: The manuscript relies on fine-tuning and prompt engineering for the pack-of-bots but does not supply a dedicated description of test coverage or hallucination mitigation results. We will add a short subsection outlining the validation steps performed during development, including example query sets used for the molecular domain and observed coherence outcomes, to make this aspect more transparent.
    Revision: yes

Circularity Check

0 steps flagged

No circularity: system description with no derivations or self-referential reductions


The paper is a descriptive account of an implemented conversational visualization system (pack-of-bots, fine-tuning, prompt engineering, text-to-visualization flythroughs, real-time 3D manipulation). No equations, first-principles derivations, fitted parameters presented as predictions, or load-bearing self-citations appear in the provided text. Claims rest on system construction and expert evaluation rather than any chain that reduces to its own inputs by definition. This matches the default case of a non-circular engineering paper.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

No mathematical modeling or parameter fitting is present; the paper is a systems and HCI demonstration.



Reference graph

Works this paper leans on

66 extracted references · 66 canonical work pages

  1. [1]

    Hybrid Tactile/Tangible Interaction for 3D Data Exploration

    Lonni Besançon, Paul Issartel, Mehdi Ammi, and Tobias Isenberg. Hybrid Tactile/Tangible Interaction for 3D Data Exploration. IEEE Transactions on Visualization and Computer Graphics, 23(1):881–890, 2017

  2. [2]

    Mouse, Tactile, and Tangible Input for 3D Manipulation

    Lonni Besançon, Paul Issartel, Mehdi Ammi, and Tobias Isenberg. Mouse, Tactile, and Tangible Input for 3D Manipulation. In Proc. CHI, pages 4727–4740, Denver, United States, May 2017

  3. [3]

    Exploring and explaining climate change: Exploranation as a visualization pedagogy for societal action

    Lonni Besançon, Konrad Schönborn, Erik Sundén, He Yin, Samuel Rising, Peter Westerdahl, Patric Ljung, Josef Wideström, Charles Hansen, and Anders Ynnerman. Exploring and explaining climate change: Exploranation as a visualization pedagogy for societal action. In VIS4GOOD, a workshop on Visualization for Social Good, held as part of IEEE VIS 2022, 2022

  4. [4]

    Reducing Affective Responses to Surgical Images and Videos Through Stylization

    Lonni Besançon, Amir Semmo, David J. Biau, Bruno Frachet, Virginie Pineau, El Hadi Sariali, Marc Soubeyrand, Rabah Taouachi, Tobias Isenberg, and Pierre Dragicevic. Reducing Affective Responses to Surgical Images and Videos Through Stylization. Computer Graphics Forum, 39(1):462–483, January 2020

  5. [5]

    The State of the Art of Spatial Interfaces for 3D Visualization

    Lonni Besançon, Anders Ynnerman, Daniel F Keefe, Lingyun Yu, and Tobias Isenberg. The State of the Art of Spatial Interfaces for 3D Visualization. Computer Graphics Forum, 40(1):293–326, February 2021

  6. [6]

    Social interaction and learning among family groups visiting a museum

    Linda M Blud. Social interaction and learning among family groups visiting a museum. Museum Management and Curatorship, 9(1):43–51, 2009

  7. [7]

    Openspace: A system for astrographics

    Alexander Bock, Emil Axelsson, Jonathas Costa, Gene Payne, Micah Acinapura, Vivian Trakinski, Carter Emmart, Cláudio Silva, Charles Hansen, and Anders Ynnerman. Openspace: A system for astrographics. IEEE Transactions on Visualization and Computer Graphics, 26(1):633–642, 2020

  8. [8]

    Openspace: Changing the narrative of public dissemination in astronomical visualization from what to how

    Alexander Bock, Emil Axelsson, Carter Emmart, Masha Kuznetsova, Charles Hansen, and Anders Ynnerman. Openspace: Changing the narrative of public dissemination in astronomical visualization from what to how. IEEE computer graphics and applications, 38(3):44–57, 2018

  9. [9]

    Reflections on Visualization for Broad Audiences

    Michael Böttinger, Helen-Nicole Kostis, Maria Velez-Rojas, Penny Rheingans, and Anders Ynnerman. Reflections on Visualization for Broad Audiences, pages 297–305. Springer International Publishing, Cham, 2020

  10. [10]

    Moliverse: Contextually embedding the microcosm into the universe

    Mathis Brossier, Robin Skånberg, Lonni Besançon, Mathieu Linares, Tobias Isenberg, Anders Ynnerman, and Alexander Bock. Moliverse: Contextually embedding the microcosm into the universe. Computers and Graphics, 112:22–30, May 2023

  11. [11]

    Local standards for sample size at chi

    Kelly Caine. Local standards for sample size at chi. In Proc. CHI, CHI ’16, pages 981–992, New York, NY, USA,

  12. [12]

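    Human vs. ai: Understanding the impact of anthropomorphism on consumer response to chatbots from the perspective of trust and relationship norms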

    Xusen Cheng, Xiaoping Zhang, Jason Cohen, and Jian Mou. Human vs. ai: Understanding the impact of anthropomorphism on consumer response to chatbots from the perspective of trust and relationship norms. Information Processing & Management, 59(3):102940, 2022

  13. [13]

    Teaching biochemistry and molecular biology with virtual reality—lesson creation and student response

    Heather A Coan, Geoff Goehle, and Robert T Youker. Teaching biochemistry and molecular biology with virtual reality—lesson creation and student response. Journal of Teaching and Learning, 14(1):71–92, 2020

  14. [14]

    Marianna J Coulentianos, Ilka Rodriguez-Calero, Shanna R Daly, and Kathleen H Sienko. Stakeholder engagement with prototypes during front-end medical device design: Who is engaged with what prototype? In Frontiers in Biomedical Devices, volume 83549, page V001T08A001. American Society of Mechanical Engineers, 2020

  15. [15]

    Nano for the public: An exploranation perspective

    Gunnar Höst, Karljohan Palmerius, and Konrad Schönborn. Nano for the public: An exploranation perspective. IEEE Computer Graphics and Applications, 40(2):32–42, 2020

  16. [16]

    A systematic review on the practice of evaluating visualization

    T. Isenberg, P. Isenberg, J. Chen, M. Sedlmair, and T. Möller. A systematic review on the practice of evaluating visualization. IEEE Transactions on Visualization and Computer Graphics, 19(12):2818–2827, Dec 2013

  17. [17]

    Brief exposure increases mind perception to chatgpt and is moderated by the individual propensity to anthropomorphize

    Oliver Jacobs, Farid Pazhoohi, and Alan Kingstone. Brief exposure increases mind perception to chatgpt and is moderated by the individual propensity to anthropomorphize. 2023

  18. [18]

    Teaching and learning chemistry via augmented and immersive virtual reality

    Zulma A Jiménez. Teaching and learning chemistry via augmented and immersive virtual reality. In Technology Integration in Chemistry Education and Research (TICER), pages 31–52. ACS Publications, 2019

  19. [19]

    Understanding parents’ roles in children’s learning and engagement in informal science learning sites

    Angelina Joy, Fidelia Law, Luke McGuire, Channing Mathews, Adam Hartstone-Rose, Mark Winterbottom, Adam Rutland, Grace E Fields, and Kelly Lynn Mulvey. Understanding parents’ roles in children’s learning and engagement in informal science learning sites. Frontiers in Psychology, 12:635839, 2021

  20. [20]

    Reimagining the scientific visualization interaction paradigm

    Daniel F. Keefe and Tobias Isenberg. Reimagining the scientific visualization interaction paradigm. IEEE Computer, 46(5):51–57, May 2013

  21. [21]

    How many participants do researchers recruit? a look at 678 ux/hci studies

    Lisa Koeman. How many participants do researchers recruit? a look at 678 ux/hci studies. Online. Last visited 06 January 2019, 2018

  22. [22]

    Hyperlabels: Browsing of dense and hierarchical molecular 3d models

    David Kouřil, Tobias Isenberg, Barbora Kozlíková, Miriah Meyer, M Eduard Gröller, and Ivan Viola. Hyperlabels: Browsing of dense and hierarchical molecular 3d models. IEEE Transactions on Visualization and Computer Graphics, 27(8):3493–3504, 2020

  23. [23]

    Molecumentary: Adaptable narrated documentaries using molecular visualization

    David Kouril, Ondrej Strnad, Peter Mindek, Sarkis Halladjian, Tobias Isenberg, Eduard Groeller, and Ivan Viola. Molecumentary: Adaptable narrated documentaries using molecular visualization. IEEE Transactions on Visualization & Computer Graphics, (01):1–1, 2021

  24. [24]

    Peers, teachers and guides: A study of three conditions for scaffolding conceptual learning in science centers

    Ingeborg Krange, Kenneth Silseth, and Palmyre Pierroux. Peers, teachers and guides: A study of three conditions for scaffolding conceptual learning in science centers. Cultural Studies of Science Education, 15(1):241–263, 2020

  25. [25]

    Peers, teachers and guides: A study of three conditions for scaffolding conceptual learning in science centers

    Ingeborg Krange, Kenneth Silseth, and Palmyre Pierroux. Peers, teachers and guides: A study of three conditions for scaffolding conceptual learning in science centers. Cultural Studies of Science Education, 15:241–263, 2020

  26. [26]

    Cellview: a tool for illustrative and multi-scale rendering of large biomolecular datasets

    Mathieu Le Muzic, Ludovic Autin, Julius Parulek, and Ivan Viola. Cellview: a tool for illustrative and multi-scale rendering of large biomolecular datasets. In Eurographics Workshop on Visual Computing for Biomedicine, volume 2015, page 61. NIH Public Access, 2015

  27. [27]

    Advisor: Automatic visualization answer for natural-language question on tabular data

    Can Liu, Yun Han, Ruike Jiang, and Xiaoru Yuan. Advisor: Automatic visualization answer for natural-language question on tabular data. In 2021 IEEE 14th Pacific Visualization Symposium (PacificVis), pages 11–20, 2021

  28. [28]

    Touching the cloud: Bimanual annotation of immersive point clouds

    P. Lubos, R. Beimler, M. Lammers, and F. Steinicke. Touching the cloud: Bimanual annotation of immersive point clouds. In Proc. 3DUI, pages 191–192, Los Alamitos, 2014. IEEE Computer Society

  29. [29]

    Synthesizing natural language to visualization (nl2vis) benchmarks from nl2sql benchmarks

    Yuyu Luo, Nan Tang, Guoliang Li, Chengliang Chai, Wenbo Li, and Xuedi Qin. Synthesizing natural language to visualization (nl2vis) benchmarks from nl2sql benchmarks. In Proceedings of the 2021 International Conference on Management of Data, SIGMOD ’21, pages 1235–1247, New York, NY, USA, 2021. Association for Computing Machinery

  30. [30]

    Natural language to visualization by neural machine translation

    Yuyu Luo, Nan Tang, Guoliang Li, Jiawei Tang, Chengliang Chai, and Xuedi Qin. Natural language to visualization by neural machine translation. IEEE Transactions on Visualization and Computer Graphics, 28(1):217–226, 2022

  31. [31]

    Living liquid: Design and evaluation of an exploratory visualization tool for museum visitors

    Joyce Ma, Isaac Liao, Kwan-Liu Ma, and Jennifer Frazier. Living liquid: Design and evaluation of an exploratory visualization tool for museum visitors. IEEE Transactions on Visualization and Computer Graphics, 18(12):2799–2808, 2012

  32. [32]

    Decoding a complex visualization in a science museum – an empirical study

    Joyce Ma, Kwan-Liu Ma, and Jennifer Frazier. Decoding a complex visualization in a science museum – an empirical study. IEEE Transactions on Visualization and Computer Graphics, 26(1):472–481, 2020

  33. [33]

    Are users willing to embrace chatgpt? exploring the factors on the acceptance of chatbots from the perspective of aidua framework

    Xiaoyue Ma and Yudi Huo. Are users willing to embrace chatgpt? exploring the factors on the acceptance of chatbots from the perspective of aidua framework. Technology in Society, 75:102362, 2023

  34. [34]

    Chat2vis: Generating data visualisations via natural language using chatgpt, codex and gpt-3 large language models

    Paula Maddigan and Teo Susnjak. Chat2vis: Generating data visualisations via natural language using chatgpt, codex and gpt-3 large language models. IEEE Access, 2023

  35. [35]

    Luisa Massarani, Rosicler Neves, Graziele Scalfi, Antero Vin´ıcius Portela Firmino Pinto, Carla Almeida, Luis Amorim, Marina Ramalho, Luiz Bento, Monica Santos Dahmouche, Renata Fontanetto, et al. The role of mediators in science museums: An analysis of conversations and interactions of brazilian families in free and mediated visits to an interactive exhi...

  36. [36]

    Visualization multi-pipeline for communicating biology

    Peter Mindek, David Kouřil, Johannes Sorger, Daniel Toloudis, Blair Lyons, Graham Johnson, M Eduard Gröller, and Ivan Viola. Visualization multi-pipeline for communicating biology. IEEE Transactions on Visualization and Computer Graphics, 24(1):883–892, 2017

  37. [37]

    Facilitating conversational interaction in natural language interfaces for visualization

    Rishab Mitra, Arpit Narechania, Alex Endert, and John Stasko. Facilitating conversational interaction in natural language interfaces for visualization. In 2022 IEEE Visualization and Visual Analytics (VIS), pages 6–10, 2022

  38. [38]

    Nl4dv: A toolkit for generating analytic specifications for data visualization from natural language queries

    Arpit Narechania, Arjun Srinivasan, and John Stasko. Nl4dv: A toolkit for generating analytic specifications for data visualization from natural language queries. IEEE Transactions on Visualization and Computer Graphics, 27(2):369–379, 2020

  39. [39]

    Modeling in the time of covid-19: Statistical and rule-based mesoscale models

    Ngan Nguyen, Ondřej Strnad, Tobias Klein, Deng Luo, Ruwayda Alharbi, Peter Wonka, Martina Maritan, Peter Mindek, Ludovic Autin, David S Goodsell, et al. Modeling in the time of covid-19: Statistical and rule-based mesoscale models. IEEE transactions on visualization and computer graphics, 27(2):722, 2021

  40. [40]

    Openai: Introducing chatgpt

    OpenAI. Openai: Introducing chatgpt. https://openai.com/blog/chatgpt, 2022. Accessed: March 27, 2023

  41. [41]

    Gpt-4 technical report, 2023

    OpenAI. Gpt-4 technical report, 2023

  42. [42]

    Working with forensic practitioners to understand the opportunities and challenges for mixed-reality digital autopsy

    Vahid Pooryousef, Maxime Cordeil, Lonni Besançon, Christophe Hurter, Tim Dwyer, and Richard Bassed. Working with forensic practitioners to understand the opportunities and challenges for mixed-reality digital autopsy. In Proc. CHI, CHI ’23, New York, NY, USA, 2023. Association for Computing Machinery

  43. [43]

    Virtual reality for surgical planning–evaluation based on two liver tumor resections

    Anke V Reinschluessel, Thomas Muender, Daniela Salzmann, Tanja Doering, Rainer Malaka, and Dirk Weyhe. Virtual reality for surgical planning–evaluation based on two liver tumor resections. Frontiers in Surgery, 9:821060, 2022

  44. [44]

    Reaching Broad Audiences in an Educational Setting

    Penny Rheingans, Helen-Nicole Kostis, Paulo A. Oemig, Geraldine B. Robbins, and Anders Ynnerman. Reaching Broad Audiences in an Educational Setting, pages 365–380. Springer International Publishing, Cham, 2020

  45. [45]

    How technology resources can be used to represent personal inquiry and support students’ understanding of it across contexts

    Eileen Scanlon, Stamatina Anastopoulou, Lucinda Kerawalla, and Paul Mulholland. How technology resources can be used to represent personal inquiry and support students’ understanding of it across contexts. Journal of Computer Assisted Learning, 27(6):516–529, 2011

  46. [46]

    Science museums and centres: evolution and contemporary trends

    Bernard Schiele. Science museums and centres: evolution and contemporary trends. In Routledge handbook of public communication of science and technology, pages 53–76. Routledge, 2021

  47. [47]

    Bridging the educational research-teaching practice gap: Foundations for assessing and developing biochemistry students’ visual literacy

    Konrad J Schönborn and Trevor R Anderson. Bridging the educational research-teaching practice gap: Foundations for assessing and developing biochemistry students’ visual literacy. Biochemistry and molecular biology education, 38(5):347–354, 2010

  48. [48]

    Education, entertainment, and engagement in museums in the digital age

    Nellie Seale. Education, entertainment, and engagement in museums in the digital age. In Companion Proceedings of the Annual Symposium on Computer-Human Interaction in Play, pages 326–329, 2023

  49. [49]

    Engagement in a science museum–the role of social interactions

    Neta Shaby, Orit Ben-Zvi Assaraf, and Tali Tal. Engagement in a science museum–the role of social interactions. Visitor Studies, 22(1):1–20, 2019

  50. [50]

    An examination of the interactions between museum educators and students on a school visit to science museum

    Neta Shaby, Orit Ben-Zvi Assaraf, and Tali Tal. An examination of the interactions between museum educators and students on a school visit to science museum. Journal of Research in Science Teaching, 56(2):211–239, 2019

  51. [51]

    Virtual reality in museums: does it promote visitor enjoyment and learning?

    Hamza Shahab, Mozard Mohtar, Ezlika Ghazali, Philipp A Rauschnabel, and Andrea Geipel. Virtual reality in museums: does it promote visitor enjoyment and learning? International Journal of Human–Computer Interaction, 39(18):3586–3603, 2023

  52. [52]

    Relations between parent–child interaction and children’s engagement and learning at a museum exhibit about electric circuits

    David M Sobel, Susan M Letourneau, Cristine H Legare, and Maureen Callanan. Relations between parent–child interaction and children’s engagement and learning at a museum exhibit about electric circuits. Developmental Science, 24(3):e13057, 2021

  53. [53]

    Inchorus: Designing consistent multimodal interactions for data visualization on tablet devices

    Arjun Srinivasan, Bongshin Lee, Nathalie Henry Riche, Steven M. Drucker, and Ken Hinckley. Inchorus: Designing consistent multimodal interactions for data visualization on tablet devices. In Proc. CHI, CHI ’20, pages 1–13, New York, NY, USA, 2020. Association for Computing Machinery

  54. [54]

    Stanford alpaca: An instruction-following llama model

    Rohan Taori, Ishaan Gulrajani, Tianyi Zhang, Yann Dubois, Xuechen Li, Carlos Guestrin, Percy Liang, and Tatsunori B. Hashimoto. Stanford alpaca: An instruction-following llama model. https://github.com/tatsu-lab/stanford_alpaca, 2023

  55. [55]

    Alpaca lora library

    Tloen. Alpaca lora library. https://github.com/tloen/alpaca-lora, 2023. 08-August-2023

  56. [56]

    Boom chameleon: Simultaneous capture of 3D viewpoint, voice and gesture annotations on a spatially-aware display

    Michael Tsang, George W Fitzmaurice, Gordon Kurtenbach, Azam Khan, and Bill Buxton. Boom chameleon: Simultaneous capture of 3D viewpoint, voice and gesture annotations on a spatially-aware display. In Proc. UIST, pages 111–120, New York, 2002. ACM

  57. [57]

    Llama-2-70b-instruct-v2

    Upstage. Llama-2-70b-instruct-v2. https://huggingface.co/upstage/Llama-2-70b-instruct-v2 ,

  58. [58]

    Llama-30b-instruct

    Upstage. Llama-30b-instruct. https://huggingface.co/upstage/llama-30b-instruct, 2023. 10-August-2023

  59. [59]

    Towards natural language-based visualization authoring

    Yun Wang, Zhitao Hou, Leixian Shen, Tongshuang Wu, Jiaqi Wang, He Huang, Haidong Zhang, and Dongmei Zhang. Towards natural language-based visualization authoring. IEEE Transactions on Visualization and Computer Graphics, 29(1):1222–1232, 2022

  60. [60]

    T4 model

    John Winfer, Aeliya Syed, Leon Thistle Paul Ekers, Ngan Nguyen, Ondrej Strnad, David Goodsell, Ivan Viola, and Deng Luo. T4 model. https://www.nanovis.org/T4-model.html. (Accessed on 09/07/2021)

  61. [61]

    User retention of mobile augmented reality for cultural heritage learning

    Ningning Xu, Yue Li, Jie Lin, Lingyun Yu, and Hai-Ning Liang. User retention of mobile augmented reality for cultural heritage learning. In 2022 IEEE International Symposium on Mixed and Augmented Reality Adjunct (ISMAR-Adjunct), pages 447–452, 2022

  62. [62]

    Molecular architecture of the sars-cov-2 virus

    Hangping Yao, Yutong Song, Yong Chen, Nanping Wu, Jialu Xu, Chujie Sun, Jiaxing Zhang, Tianhao Weng, Zheyuan Zhang, Zhigang Wu, et al. Molecular architecture of the sars-cov-2 virus. Cell, 183(3):730–738, 2020

  63. [63]

    Structure and function of bacteriophage t4

    Moh Lan Yap and Michael G Rossmann. Structure and function of bacteriophage t4. Future microbiology, 9(12):1319–1327, 2014

  64. [64]

    Reaching Broad Audiences from a Science Center or Museum Setting

    Anders Ynnerman, Patric Ljung, and Alexander Bock. Reaching Broad Audiences from a Science Center or Museum Setting, pages 341–364. Springer International Publishing, Cham, 2020

  65. [65]

    Exploranation: A new science communication paradigm

    Anders Ynnerman, Jonas Löwgren, and Lena Tibell. Exploranation: A new science communication paradigm. IEEE computer graphics and applications, 38(3):13–20, 2018

  66. [66]

    Interactive visualization of 3D scanned mummies at public venues

    Anders Ynnerman, Thomas Rydell, Daniel Antoine, David Hughes, Anders Persson, and Patric Ljung. Interactive visualization of 3D scanned mummies at public venues. Commun. ACM, 59(12):72–81, December 2016