Visus: An Interactive System for Automatic Machine Learning Model Building and Curation

Aécio Santos; Bowen Yu; Cláudio T. Silva; Cristian Felix; Enrico Bertini; Jorge Piazentin Ono; Juliana Freire; Sonia Castelo; Sungsoo Hong

arxiv: 1907.02889 · v1 · pith:XS2OMCZ6new · submitted 2019-07-05 · 💻 cs.LG · cs.HC


Pith reviewed 2026-05-25 02:17 UTC · model grok-4.3

classification 💻 cs.LG cs.HC

keywords AutoML · interactive visualization · pipeline curation · machine learning interfaces · domain expert tools · model refinement


The pith

Visus is an interactive system that supports domain experts in building and curating AutoML-generated machine learning pipelines.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper presents Visus as a response to the scarcity of data scientists by giving domain experts tools to refine the end-to-end pipelines that AutoML systems produce. It grounds design choices in an explicit framework, shows a concrete usage scenario, and reports feedback from testing sessions with domain experts. The central claim is that such an interface makes the model-building process accessible when users have little machine-learning background. If the claim holds, AutoML outputs move from best-effort artifacts to objects that non-experts can actively improve.

Core claim

Visus is a system designed to support the model building process and curation of ML data processing pipelines generated by AutoML systems. The work describes the framework used to ground design choices, illustrates a usage scenario enabled by the system, and discusses feedback received in user testing sessions with domain experts.

What carries the argument

Visus, the interactive interface that guides users through inspection, modification, and refinement of AutoML pipelines.

Load-bearing premise

Domain experts lack machine-learning expertise and therefore need dedicated interactive interfaces to curate AutoML outputs effectively.

What would settle it

A controlled comparison in which domain experts using Visus produce no measurable improvement in pipeline quality or usability over experts working without the system.

Figures

Figures reproduced from arXiv: 1907.02889 by Aécio Santos, Bowen Yu, Cláudio T. Silva, Cristian Felix, Enrico Bertini, Jorge Piazentin Ono, Juliana Freire, Sonia Castelo, Sungsoo Hong.

**Figure 2.** Visus’s data selection and problem definition screens: (A) select or load dataset view, (B) select task view (create a …

**Figure 3.** Explore models view. Examine Model Explanations: To better understand how a given pipeline performs, Visus generates more detailed explanations. For classification problems, Visus currently supports two visualizations: a standard confusion matrix and rule matrix [15]. The confusion matrix shows the predicted classes as columns and the true classes as rows, as shown in …
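
The row and column convention described in the Figure 3 caption can be illustrated with a minimal sketch. This is not code from Visus; the function name `confusion_matrix`, the class labels, and the sample predictions are hypothetical and serve only to show true classes as rows and predicted classes as columns.

```python
from collections import Counter


def confusion_matrix(y_true, y_pred, labels):
    """Count (true, predicted) pairs.

    Rows follow the true class and columns follow the predicted class,
    matching the orientation described in the caption.
    """
    counts = Counter(zip(y_true, y_pred))
    return [[counts[(t, p)] for p in labels] for t in labels]


# Hypothetical toy data, not taken from the paper.
labels = ["cat", "dog"]
y_true = ["cat", "cat", "cat", "dog", "dog"]
y_pred = ["cat", "dog", "dog", "dog", "dog"]

matrix = confusion_matrix(y_true, y_pred, labels)
print(matrix)  # [[1, 2], [0, 2]]
```

Reading the first row: of the three true "cat" examples, one was predicted "cat" and two were predicted "dog". Off-diagonal cells therefore show which true class is being confused with which predicted class.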

**Figure 5.** Data augmentation support in Visus. (G) Dataset …

**Figure 6.** User-generated visualizations: (1) initial confusion …

Original abstract

While the demand for machine learning (ML) applications is booming, there is a scarcity of data scientists capable of building such models. Automatic machine learning (AutoML) approaches have been proposed that help with this problem by synthesizing end-to-end ML data processing pipelines. However, these follow a best-effort approach and a user in the loop is necessary to curate and refine the derived pipelines. Since domain experts often have little or no expertise in machine learning, easy-to-use interactive interfaces that guide them throughout the model building process are necessary. In this paper, we present Visus, a system designed to support the model building process and curation of ML data processing pipelines generated by AutoML systems. We describe the framework used to ground our design choices and a usage scenario enabled by Visus. Finally, we discuss the feedback received in user testing sessions with domain experts.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note: private letter to a colleague

Visus presents a new interactive system for AutoML curation with a design framework and qualitative user feedback, but the evaluation lacks quantitative support.


Visus is a new system for interactive curation of AutoML pipelines, grounded in a design framework and tested with domain experts through usage scenarios and feedback sessions. The central idea is that AutoML generates pipelines but users, especially domain experts without ML background, need help to curate them.

The paper does a solid job laying out the problem of AutoML outputs needing human refinement and showing how Visus supports that process. The framework for design choices is a useful addition, and including a concrete usage scenario helps make the system tangible. Reporting feedback from domain experts aligns with the goal of making tools accessible to non-ML users. This approach shows honest engagement with the practical challenges in the field.

The main limitation is the evaluation. It relies on qualitative descriptions of user testing without quantitative metrics, comparisons, or detailed analysis of what worked or didn't. This is common in system papers but leaves the effectiveness claims somewhat open. The abstract doesn't provide specifics on the number of participants or the exact nature of the feedback, which makes it harder to gauge the impact. No issues with circularity or invented math here, as it's all descriptive of the system.

This work is for people interested in building or studying interactive systems for machine learning pipelines, particularly in HCI or visualization communities. A reader working on similar tools would get practical ideas from the framework and scenario.

It should go to peer review. The system is clearly described and the user input provides some grounding, so referees can assess and suggest improvements on the evaluation side.

Referee Report

1 major / 2 minor

Summary. The paper presents Visus, an interactive system to support domain experts (with limited ML expertise) in curating and refining end-to-end ML data processing pipelines generated by AutoML systems. It grounds the system design in a stated framework, describes a usage scenario, and reports qualitative feedback from user testing sessions with domain experts.

Significance. If the system and its design choices function as described, the work addresses a practical gap in making AutoML outputs usable by non-experts through interactive curation interfaces. The explicit design framework and reported user sessions provide concrete examples of interface features for pipeline inspection and refinement, which could inform future HCI-for-AutoML efforts. The contribution is primarily descriptive and system-oriented rather than a new algorithmic or theoretical result.

major comments (1)

[user testing / evaluation] User testing section: The validation rests entirely on qualitative feedback from domain-expert sessions, with no reported participant count, task protocol, success metrics, or comparison to a baseline interface. This leaves the central claim that Visus 'supports the model building process and curation' supported only by narrative description rather than observable outcomes.

minor comments (2)

[abstract] Abstract: The motivation sentence on domain experts having 'little or no expertise in machine learning' is repeated from the introduction without additional grounding; a brief citation to prior studies on AutoML user barriers would strengthen it.
[usage scenario] The usage scenario is presented narratively; adding a short table or figure summarizing the sequence of user actions and system responses would improve clarity and reproducibility of the demonstrated workflow.

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for the detailed review and constructive suggestion regarding the user testing section. We agree that additional specifics will strengthen the manuscript and will revise accordingly.


Referee: [user testing / evaluation] User testing section: The validation rests entirely on qualitative feedback from domain-expert sessions, with no reported participant count, task protocol, success metrics, or comparison to a baseline interface. This leaves the central claim that Visus 'supports the model building process and curation' supported only by narrative description rather than observable outcomes.

Authors: We acknowledge that the current user testing description is primarily narrative. In the revised version we will explicitly report the number of domain-expert participants, outline the session protocol (including tasks performed and questions asked), and provide more concrete examples of the feedback received and how it informed design decisions. Because the study was designed as a qualitative validation of the proposed design framework rather than a controlled experiment, we did not collect quantitative success metrics or run a baseline comparison; we will make this scope explicit so readers understand the nature of the evidence.

revision: partial

Circularity Check

0 steps flagged

No significant circularity: system description with independent user validation


The paper presents a descriptive system (Visus) for curating AutoML pipelines, grounded in an explicitly stated design framework, illustrated via a usage scenario, and evaluated through reported user testing sessions with domain experts. No equations, fitted parameters, predictions, or derivations exist that could reduce to inputs by construction. No self-citation chains are invoked as load-bearing uniqueness theorems or ansatzes. The central claims are self-contained in the paper's own construction and external user feedback, qualifying for the default non-circularity outcome.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

No mathematical free parameters or invented physical entities. The design framework is presented as grounding the choices, but its specific axioms are not enumerated in the abstract. The central assumption that interactive interfaces are necessary for domain experts is treated as a domain_assumption rather than derived.

axioms (1)

domain assumption: Domain experts require guided interactive interfaces because they lack ML expertise
This premise is stated directly in the abstract as the reason an interactive system is needed.



Reference graph

Works this paper leans on

19 extracted references · 19 canonical work pages · 2 internal anchors

[1] Dylan Cashman, Shah Rukh Humayoun, Florian Heimerl, Kendall Park, Subhajit Das, John Thompson, Bahador Saket, Abigail Mosca, John T. Stasko, Alex Endert, Michael Gleicher, and Remco Chang. 2018. Visual Analytics for Automated Model Discovery. CoRR abs/1809.10782 (2018). arXiv:1809.10782 http://arxiv.org/abs/1809.10782

[2] Adriane Chapman, Elena Simperl, Laura Koesten, George Konstantinidis, Luis Daniel Ibáñez-Gonzalez, Emilia Kacprzak, and Paul T. Groth. 2019. Dataset search: a survey. CoRR abs/1901.00735 (2019). arXiv:1901.00735 http://arxiv.org/abs/1901.00735

[3] Louis Columbus. [n. d.]. IBM Predicts Demand For Data Scientists Will Soar 28% By 2020. https://www.forbes.com/sites/louiscolumbus/2017/05/13/ibm-predicts-demand-for-data-scientists-will-soar-28-by-2020/

[4] Iddo Drori, Yamuna Krishnamurthy, Remi Rampin, Raoni Lourenço, Jorge Ono, Kyunghyun Cho, Claudio Silva, and Juliana Freire. 2018. AlphaD3M: Machine Learning Pipeline Synthesis. In Proceedings of Machine Learning Research, ICML 2018 AutoML Workshop.

[5] Matthias Feurer, Aaron Klein, Katharina Eggensperger, Jost Springenberg, Manuel Blum, and Frank Hutter. 2015. Efficient and robust automated machine learning. In Advances in neural information processing systems. 2962–2970.

[6] Yolanda Gil, James Honaker, Shikhar Gupta, Yibo Ma, Vito D’Orazio, Daniel Garijo, Shruti Gadewar, Qifan Yang, and Neda Jahanshad. 2019. Towards Human-guided Machine Learning. In Proceedings of the 24th International Conference on Intelligent User Interfaces (IUI ’19). ACM, New York, NY, USA, 614–624. https://doi.org/10.1145/3301275.3302324

[7] Yolanda Gil, Ke-Thia Yao, Varun Ratnakar, Daniel Garijo, Greg Ver Steeg, Rob Brekelmans, Mayank Kejriwal, Fanghao Luo, and I-De Huang. 2018. P4ML: A Phased Performance-Based Pipeline Planner for Automated Machine Learning. In Proceedings of Machine Learning Research, ICML 2018 AutoML Workshop.

[8] Moritz Hardt, Eric Price, and Nati Srebro. 2016. Equality of Opportunity in Supervised Learning. In Advances in Neural Information Processing Systems 29, D. D. Lee, M. Sugiyama, U. V. Luxburg, I. Guyon, and R. Garnett (Eds.). Curran Associates, Inc., 3315–3323. http://papers.nips.cc/paper/6374-equality-of-opportunity-in-supervised-learning.pdf

[9] T. Hastie, R. Tibshirani, and J.H. Friedman. 2009. The Elements of Statistical Learning: Data Mining, Inference, and Prediction. Springer. https://books.google.com/books?id=eBSgoAEACAAJ

[10] James Honaker and Vito D’Orazio. 2014. Statistical Modeling by Gesture: A graphical, Browser-based Statistical Interface for Data Repositories. In HT (Doctoral Consortium/Late-breaking Results/Workshops).

[11] Frank Hutter, Lars Kotthoff, and Joaquin Vanschoren. 2019. Automatic machine learning: methods, systems, challenges. Challenges in Machine Learning (2019).

[12] J J Thomas, K A Cook, Institute Electrical, and Electronics Engineers. 2005. Illuminating the path: The research and development agenda for visual analytics.

[13] Fei-Fei Li and Jia Li. [n. d.]. Cloud AutoML: Making AI accessible to every business. https://www.blog.google/products/google-cloud/cloud-automl-making-ai-accessible-every-business/

[14] Microsoft. [n. d.]. Microsoft Azure Machine Learning Studio. https://studio.azureml.net/

[15] Y. Ming, H. Qu, and E. Bertini. 2019. RuleMatrix: Visualizing and Understanding Classifiers with Rules. IEEE Transactions on Visualization and Computer Graphics 25, 1 (Jan 2019), 342–352. https://doi.org/10.1109/TVCG.2018.2864812

[16] nyc-vision-zero [n. d.]. NYC Vision Zero. https://www1.nyc.gov/site/visionzero/index.page

[17] Randal S. Olson, Nathan Bartley, Ryan J. Urbanowicz, and Jason H. Moore. 2016. Evaluation of a Tree-based Pipeline Optimization Tool for Automating Data Science. In Proceedings of the Genetic and Evolutionary Computation Conference 2016 (GECCO ’16). ACM, New York, NY, USA, 485–492. https://doi.org/10.1145/2908812.2908918

[18] Muhammad Bilal Zafar, Isabel Valera, Manuel Gomez Rodriguez, and Krishna P. Gummadi. 2017. Fairness Beyond Disparate Treatment & Disparate Impact: Learning Classification Without Disparate Mistreatment. In Proceedings of the 26th International Conference on World Wide Web (WWW ’17). International World Wide Web Conferences Steering Committee, Republic... doi:10.1145/3038912.3052660

[19] Indre Zliobaite. 2015. A survey on measuring indirect discrimination in machine learning. CoRR abs/1511.00148 (2015). arXiv:1511.00148 http://arxiv.org/abs/1511.00148
