BONSAI: A Mixed-Initiative Workspace for Human-AI Co-Development of Visual Analytics Applications
Pith reviewed 2026-05-10 02:21 UTC · model grok-4.3
The pith
BONSAI uses a four-layer modular architecture and four-phase process to let humans and AI co-develop reusable visual analytics applications with full provenance.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
BONSAI utilizes a modular four-layer architecture (hardware, services, orchestration, application) that allows human and AI developers to independently contribute reusable components. The workspace incorporates this architecture into a structured four-phase development process (plan, design, monitor, and review), ensuring distributed agency and full provenance, where all human and AI contributions are structurally bounded and tracked. Case studies demonstrate the efficient creation of novel tools and the rapid reconstruction of complex VA applications directly from research paper descriptions.
What carries the argument
The four-layer modular architecture (hardware, services, orchestration, application) combined with a four-phase development process (plan, design, monitor, review) that structurally bounds and tracks all human and AI contributions.
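The summary gives no interface specification, so the following is only a minimal sketch of how a contribution might be bound to one layer and recorded per phase. All names here (`Layer`, `Phase`, `Contribution`, `ProvenanceLog`) and the author labels are hypothetical illustrations, not BONSAI's API. The rule that a component stays in a single layer is an assumption made for the example.

```python
from dataclasses import dataclass, field
from enum import Enum


class Layer(Enum):
    HARDWARE = 1
    SERVICES = 2
    ORCHESTRATION = 3
    APPLICATION = 4


class Phase(Enum):
    PLAN = "plan"
    DESIGN = "design"
    MONITOR = "monitor"
    REVIEW = "review"


@dataclass(frozen=True)
class Contribution:
    component: str
    layer: Layer
    phase: Phase
    author: str   # hypothetical labels such as "human:alice" or "ai:agent-1"
    summary: str


@dataclass
class ProvenanceLog:
    entries: list = field(default_factory=list)

    def record(self, contribution: Contribution) -> None:
        # Assumption: a component is bound to exactly one layer for its whole life.
        for earlier in self.entries:
            if (earlier.component == contribution.component
                    and earlier.layer != contribution.layer):
                raise ValueError(
                    f"{contribution.component} is already bound to layer "
                    f"{earlier.layer.name}"
                )
        self.entries.append(contribution)

    def history(self, component: str) -> list:
        # Return every recorded contribution to one component, in order.
        return [c for c in self.entries if c.component == component]


log = ProvenanceLog()
log.record(Contribution("topic-model", Layer.SERVICES, Phase.PLAN,
                        "human:alice", "specify inputs and outputs"))
log.record(Contribution("topic-model", Layer.SERVICES, Phase.DESIGN,
                        "ai:agent-1", "generate service implementation"))
authors = [c.author for c in log.history("topic-model")]
```

In this sketch the log is append-only and rejects a contribution that would move a component to another layer, which is one simple way to read "structurally bounded and tracked".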
If this is right
- Human and AI developers can contribute independently to the same VA project while maintaining reusability across different applications.
- All contributions carry complete provenance records that support auditing and debugging of the final system.
- Complex VA applications described in research papers can be rebuilt rapidly by following the structured phases.
- The approach avoids both the fragility of tightly coupled monoliths and the restrictions of simplistic frameworks.
- Distributed agency between humans and AI becomes feasible without sacrificing structural integrity in the developed tools.
Where Pith is reading between the lines
- The same bounded co-development model could apply to building interactive data tools in scientific domains beyond visual analytics.
- Provenance tracking might enable easier compliance checks when AI assists in creating analytics systems used for decision-making.
- The architecture could support incremental updates where only one layer is revised without affecting the others.
- Teams might test whether the four phases reduce the time to first working prototype compared to unconstrained AI coding.
Load-bearing premise
The four-layer architecture and four-phase process can constrain AI generation enough to produce reusable and auditable components without losing the expressiveness needed for complex visual analytics or introducing hidden dependencies.
What would settle it
An experiment showing either of two failures: an AI-generated component produced inside BONSAI creates an interdependency that breaks the application when the component is reused in a new project, or BONSAI fails to accurately reconstruct a visual analytics tool described in a research paper.
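A reuse experiment of this kind could start with a static check for hidden dependencies. The sketch below assumes two rules that the summary does not state: a component may only use dependencies it declares, and it may not depend on a component in a higher layer. The function name `find_violations`, the dictionary layout, and the component names are hypothetical.

```python
# Assumed ordering of the four layers, lowest first.
LAYER_ORDER = {"hardware": 0, "services": 1, "orchestration": 2, "application": 3}


def find_violations(components):
    """Return (component, dependency, kind) tuples for rule violations.

    components maps a name to {"layer": str, "declared": set, "used": set}.
    """
    violations = []
    for name, spec in sorted(components.items()):
        for dep in sorted(spec["used"]):
            if dep not in spec["declared"]:
                # Used but never declared: a hidden dependency.
                violations.append((name, dep, "undeclared"))
            elif (dep in components and
                  LAYER_ORDER[components[dep]["layer"]] > LAYER_ORDER[spec["layer"]]):
                # Declared, but points at a higher layer.
                violations.append((name, dep, "upward"))
    return violations


components = {
    "embedding-service": {
        "layer": "services",
        "declared": set(),
        "used": {"scatterplot-view"},  # hidden use of an application component
    },
    "scatterplot-view": {
        "layer": "application",
        "declared": {"embedding-service"},
        "used": {"embedding-service"},
    },
}
violations = find_violations(components)
```

An empty result would not prove reusability; it only rules out the two violation kinds this sketch looks for.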
Original abstract
Developing Visual Analytics (VA) applications requires integrating complex machine learning models with expressive interactive interfaces. Developers face a stark trade-off: building tightly-coupled monoliths plagued by fragile interdependencies, or relying on restrictive, simplistic frameworks. Meanwhile, unconstrained, single-shot AI code generation promises speed but yields unstructured, unauditable chaos. The core challenge is combining the control and expressiveness of custom development with the efficiency of AI generation under strict constraints. To address this, we introduce BONSAI, a mixed-initiative workspace for the multi-agent co-development of VA applications. BONSAI utilizes a modular four-layer architecture (hardware, services, orchestration, application) that allows human and AI developers to independently contribute reusable components. The workspace incorporates this architecture into a structured four-phase development process (plan, design, monitor, and review), ensuring distributed agency and full provenance, where all human and AI contributions are structurally bounded and tracked. We evaluate BONSAI through case studies demonstrating the efficient creation of novel tools and the rapid reconstruction of complex VA applications directly from research paper descriptions. Ultimately, this paper contributes a conceptual workflow, a scalable architecture, and an integrated system that successfully balances AI's generative speed with the structural rigor required for complex VA development.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper introduces BONSAI, a mixed-initiative workspace for human-AI co-development of Visual Analytics (VA) applications. It proposes a modular four-layer architecture (hardware, services, orchestration, application) that supports independent contributions of reusable components by humans and AI, integrated into a structured four-phase process (plan, design, monitor, review) to ensure distributed agency and full provenance tracking. The core contribution is a conceptual workflow and integrated system claimed to balance AI generative efficiency with structural rigor, avoiding both monolithic fragility and unstructured chaos. Evaluation is presented via case studies demonstrating efficient novel tool creation and rapid reconstruction of complex VA applications directly from research paper descriptions.
Significance. If the central claims hold, BONSAI would offer a valuable advance in human-computer interaction and visual analytics by providing a scalable architecture and workflow that enables controlled AI assistance while preserving auditability and reusability. The modular design, which allows independent component contributions and full provenance tracking, is a strength, as is the focus on the trade-off between tightly coupled monoliths and restrictive frameworks. However, because the evaluation offers no quantitative support, the impact is so far conceptual rather than demonstrated.
major comments (2)
- [Evaluation section] Evaluation section (as summarized in the abstract): the claims that BONSAI enables 'efficient creation of novel tools' and 'rapid reconstruction of complex VA applications' rest on case studies, yet no metrics (e.g., development time, component reuse rates, interdependency counts, provenance audit success), baselines, or error analysis are reported. This leaves the assertion that the four-layer architecture and four-phase process 'effectively constrain AI generation' as an untested qualitative description rather than an evidenced outcome.
- [Architecture and process description] Architecture and process description (abstract and system sections): while the four-layer model and plan/design/monitor/review phases are presented as ensuring reusable, auditable components with distributed agency, no concrete mechanisms, interface specifications, or examples are given showing how the structure prevents fragile interdependencies or monoliths in practice. This is load-bearing for the central claim of balancing expressiveness with constraints.
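Two of the metrics named in the first major comment could be derived from provenance records along the following lines. This is a sketch under assumptions: the record fields (`component`, `project`, `phase`, `minutes`) and the function names are hypothetical, and the sample records are made-up placeholders, not results from the paper.

```python
from collections import defaultdict


def reuse_rate(records):
    """Fraction of components that appear in more than one project."""
    projects_by_component = defaultdict(set)
    for r in records:
        projects_by_component[r["component"]].add(r["project"])
    if not projects_by_component:
        return 0.0
    reused = sum(1 for p in projects_by_component.values() if len(p) > 1)
    return reused / len(projects_by_component)


def minutes_per_phase(records):
    """Total recorded minutes for each development phase."""
    totals = defaultdict(int)
    for r in records:
        totals[r["phase"]] += r["minutes"]
    return dict(totals)


# Placeholder records for illustration only.
records = [
    {"component": "loader", "project": "A", "phase": "plan", "minutes": 10},
    {"component": "loader", "project": "B", "phase": "design", "minutes": 5},
    {"component": "view", "project": "A", "phase": "design", "minutes": 20},
    {"component": "view", "project": "A", "phase": "review", "minutes": 5},
]
rate = reuse_rate(records)            # "loader" is reused, "view" is not
totals = minutes_per_phase(records)
```

Reporting such numbers alongside a baseline, for example the same task done with unconstrained AI coding, would address the comment more directly than the numbers alone.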
minor comments (2)
- [Introduction] The abstract and introduction could more explicitly reference prior work on mixed-initiative systems in VA or AI-assisted development to better situate the novelty of the four-phase process.
- Notation for the four layers and phases is clear at a high level but would benefit from a diagram or table summarizing component interfaces and provenance tracking points.
Simulated Author's Rebuttal
We thank the referee for their constructive and detailed feedback. We address each major comment point by point below, indicating where revisions will be made to improve the manuscript.
- Referee: [Evaluation section] Evaluation section (as summarized in the abstract): the claims that BONSAI enables 'efficient creation of novel tools' and 'rapid reconstruction of complex VA applications' rest on case studies, yet no metrics (e.g., development time, component reuse rates, interdependency counts, provenance audit success), baselines, or error analysis are reported. This leaves the assertion that the four-layer architecture and four-phase process 'effectively constrain AI generation' as an untested qualitative description rather than an evidenced outcome.
  Authors: We agree that the current evaluation, which relies on qualitative case studies, would be strengthened by the addition of concrete metrics. The case studies demonstrate the workflow through specific instances of novel tool creation and reconstruction from paper descriptions, but we did not report numerical data such as phase durations or reuse counts. In the revised manuscript, we will expand the evaluation section to include observed metrics from the case studies (e.g., approximate time spent in each phase, number of independently contributed components, and provenance tracking examples) to provide more direct evidence for the claims about efficiency and constraint effectiveness.
  Revision: yes
- Referee: [Architecture and process description] Architecture and process description (abstract and system sections): while the four-layer model and plan/design/monitor/review phases are presented as ensuring reusable, auditable components with distributed agency, no concrete mechanisms, interface specifications, or examples are given showing how the structure prevents fragile interdependencies or monoliths in practice. This is load-bearing for the central claim of balancing expressiveness with constraints.
  Authors: The system section describes the four-layer architecture and four-phase process, including how modularity enables independent contributions and full provenance tracking. We acknowledge, however, that additional concrete mechanisms and examples are needed to explicitly show how the structure avoids fragile interdependencies and monoliths. In the revision, we will add specific examples drawn from the case studies illustrating layer interfaces and phase transitions, along with details on how components were bounded to maintain reusability and auditability.
  Revision: yes
Circularity Check
No circularity: descriptive system design with no derivations or self-referential reductions
The paper presents BONSAI as a proposed mixed-initiative workspace featuring a four-layer architecture (hardware, services, orchestration, application) and a four-phase process (plan, design, monitor, review). These are introduced as design choices to enable reusable components, distributed agency, and provenance tracking. No mathematical equations, fitted parameters, predictions, or first-principles derivations appear in the provided text. The evaluation consists of qualitative case studies showing tool creation and reconstruction from paper descriptions, without any reduction of outputs to inputs by construction, self-citation chains, or ansatz smuggling. The architecture is asserted to constrain AI generation, but this is a design claim evaluated externally via cases rather than a tautological loop. The contribution remains self-contained as a conceptual and system proposal.
Axiom & Free-Parameter Ledger
axioms (2)
- domain assumption: A modular four-layer architecture allows human and AI developers to independently contribute reusable components without fragile interdependencies
- domain assumption: A structured four-phase process (plan, design, monitor, review) ensures distributed agency and full provenance for all contributions
invented entities (1)
- BONSAI mixed-initiative workspace: no independent evidence