From Intent to AI Pipelines: A Controlled Agentic Framework for Non-AI Expert Scientists

Houari Sahraoui; Hyacinth Ali; Jessie Galasso-Carbonnel

arxiv: 2605.18764 · v1 · pith:6ZYHJKG5new · submitted 2026-04-10 · 💻 cs.IR · cs.AI

Pith reviewed 2026-05-21 09:22 UTC · model grok-4.3

keywords AI pipelines · agentic frameworks · large language models · non-expert users · human-in-the-loop · pipeline generation · domain adaptation · AI for scientists

The pith

A four-stage framework lets non-AI scientists build competitive pipelines from their own intent using large language models.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces DDAP, a controlled human-in-the-loop agentic framework that breaks AI pipeline construction into four stages: problem definition, compute environment specification, pipeline generation, and code generation. The approach adapts to a user's domain context, expertise level, and resource limits while keeping the user in charge of key choices. Experiments on datasets from business, biology, and health science domains show the generated models reach performance levels close to those created by AI experts in many tasks. Readers in applied fields would care because the method could let domain scientists run their own large-scale analyses without first becoming AI specialists.

Core claim

DDAP structures the development process into four stages of guided interaction that adapt to domain context, user expertise, and resource constraints while maintaining user control over key decisions. When evaluated across multiple datasets spanning business, biology, and health science domains by comparing its AI models against expert-developed models, the framework achieves competitive results in several tasks, although performance varies across problem types, particularly for text-based clustering tasks.

What carries the argument

The four-stage controlled agentic process of DDAP, which uses large language models to interpret user intent and generate pipeline structures and code while preserving human oversight at each stage.
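The staged, user-gated process can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the four stage names come from the paper, while `propose`, `review`, the attempt limit, and the record classes are hypothetical placeholders. `propose` stands in for whatever LLM call the framework makes, and `review` stands in for the user's accept or reject decision.

```python
from dataclasses import dataclass, field
from typing import Callable, List

# Stage names follow the paper; every other name below is an illustrative assumption.
STAGES = (
    "problem_definition",
    "compute_environment",
    "pipeline_generation",
    "code_generation",
)


@dataclass
class StageRecord:
    stage: str
    attempt: int
    proposal: str
    approved: bool


@dataclass
class Session:
    intent: str
    records: List[StageRecord] = field(default_factory=list)
    completed: bool = False


def run_stages(
    intent: str,
    propose: Callable[[str, str, int], str],
    review: Callable[[str, str], bool],
    max_attempts: int = 3,
) -> Session:
    """Run the four stages in order, asking the user to approve each one.

    propose(stage, context, attempt) is a stand-in for an LLM call (hypothetical).
    review(stage, proposal) is a stand-in for the user's decision (hypothetical).
    """
    session = Session(intent)
    context = intent
    for stage in STAGES:
        for attempt in range(1, max_attempts + 1):
            proposal = propose(stage, context, attempt)
            approved = review(stage, proposal)
            # Every proposal and decision is logged, approved or not.
            session.records.append(StageRecord(stage, attempt, proposal, approved))
            if approved:
                context = proposal  # the approved output feeds the next stage
                break
        else:
            # No approval within the attempt limit: stop and leave the session incomplete.
            return session
    session.completed = True
    return session
```

Because every proposal is recorded whether or not it was approved, the resulting `Session.records` list doubles as a decision log for a run.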

If this is right

Domain scientists in medicine, agriculture, and social sciences can create and run their own predictive models and data analyses without hiring AI specialists.
The staged structure improves reproducibility by logging each decision and adaptation step.
Performance remains competitive for many supervised and regression tasks but shows clear gaps on unsupervised text clustering.
Resource constraints can be incorporated early so the generated code respects available compute limits.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

If language-model reliability improves on intent interpretation, the framework could reduce the remaining performance gaps without changing its core structure.
The same staged, controllable approach might transfer to other technical workflows such as experimental protocol design or simulation setup.
Adding explicit validation checks after each stage could make the system more robust for production use by non-experts.

Load-bearing premise

Large language models can reliably interpret user-provided domain context and intent to produce correct pipeline structures and code without introducing systematic errors that would require extensive human debugging.

What would settle it

A direct head-to-head comparison on new text-clustering datasets where non-expert users run DDAP to completion and the resulting model accuracy is measured against expert baselines on identical data splits.
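One way to guarantee identical data splits in such a comparison is to derive the split from a fixed seed and hand the same indices to both the DDAP-generated pipeline and the expert baseline. The helper below is a hedged sketch; its name, the default test fraction, and the seed are assumptions, not details from the paper.

```python
import random
from typing import List, Tuple


def make_split(
    n_samples: int, test_fraction: float = 0.2, seed: int = 0
) -> Tuple[List[int], List[int]]:
    """Return (train_indices, test_indices) that depend only on the arguments."""
    if n_samples < 2:
        raise ValueError("need at least two samples to split")
    if not 0.0 < test_fraction < 1.0:
        raise ValueError("test_fraction must be strictly between 0 and 1")
    indices = list(range(n_samples))
    # A private generator keeps the split reproducible and leaves global RNG state alone.
    random.Random(seed).shuffle(indices)
    n_test = min(n_samples - 1, max(1, round(n_samples * test_fraction)))
    return sorted(indices[n_test:]), sorted(indices[:n_test])
```

Calling `make_split` once and storing the two index lists next to the dataset lets both sides of the comparison train and evaluate on exactly the same rows.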

Figures

Figures reproduced from arXiv: 2605.18764 by Houari Sahraoui, Hyacinth Ali, Jessie Galasso-Carbonnel.

**Figure 1.** Domain-Driven AI Pipelines Architecture

Adjacent text from the paper (Section 2.5.5, Mean Absolute Error): Mean Absolute Error (MAE) is widely used in regression tasks to measure the average magnitude of prediction errors by averaging the absolute differences between predicted and true values. A perfect model will have an MAE of zero (0). It provides an intuitive interpretation of how far predictions deviate from actual values, and the lower th…
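The MAE definition quoted with Figure 1 corresponds to MAE = (1/n) Σ |y_i − ŷ_i|. A small stdlib-only sketch of that formula follows; the function name and the sample values are illustrative and are not taken from the paper's experiments.

```python
from typing import Sequence


def mean_absolute_error(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Average of the absolute differences between true and predicted values."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if len(y_true) == 0:
        raise ValueError("at least one value is required")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# Toy values chosen for easy arithmetic: (0.5 + 0.5 + 0.0 + 1.0) / 4
print(mean_absolute_error([3.0, -0.5, 2.0, 7.0], [2.5, 0.0, 2.0, 8.0]))  # 0.5
```

A model whose predictions match the targets exactly yields 0.0, which is the "perfect model" case mentioned in the excerpt.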

**Figure 2.** DDAP Workflow (Stage 1 and 2)

**Figure 3.** Problem Definition System Message (Excerpt)

**Figure 5.** Preprocessing Generation System Message (Excerpt)

**Figure 6.** DDAP Workflow (Stage 3 and 4)

Adjacent text from the paper: … preprocessing techniques generation and (2) pipeline specification generation. This decomposition aims to improve more focused reasoning at each step and reduce the likelihood of errors in the final pipeline design. In the first step, a one-shot interaction is used to generate candidate preprocessing strategies tailored to the task and data characteristics, and the system m…

**Figure 9.** Code Repair System Message (Excerpt)

Adjacent text from the paper: … the overall orchestrator process can be viewed as a staged composition of transformations: $A_4 = f_{\alpha_4} \circ f_{\alpha_3} \circ f_{\alpha_2} \circ f_{\alpha_1}(U_0)$ (10), where $U_0$ is the initial user intent. At this stage, code-oriented LLMs, such as Code Llama [36], are then employed to generate executable code tailored to the defined compute environment and preferred platform. The orchestrator manages a contr…
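The staged composition in Equation (10) reads right to left: the first stage is applied to the user intent, and each later stage consumes the previous stage's output. The sketch below illustrates only that ordering. The `compose` and `make_stage` helpers, the dictionary artifact, and the sample intent are assumptions made for illustration; the real stage functions in DDAP are LLM-driven and are not reproduced here.

```python
from functools import reduce
from typing import Any, Callable, Dict

Artifact = Dict[str, Any]
Stage = Callable[[Artifact], Artifact]


def compose(*stages: Stage) -> Stage:
    """compose(f4, f3, f2, f1)(u) equals f4(f3(f2(f1(u)))), as with the circle operator."""
    def composed(artifact: Artifact) -> Artifact:
        # Apply the rightmost function first by walking the stages in reverse.
        return reduce(lambda acc, stage: stage(acc), reversed(stages), artifact)
    return composed


def make_stage(name: str) -> Stage:
    """Hypothetical stage that only records its own name in the artifact's trace."""
    def stage(artifact: Artifact) -> Artifact:
        return {**artifact, "trace": artifact.get("trace", []) + [name]}
    return stage


f1 = make_stage("problem_definition")
f2 = make_stage("compute_environment")
f3 = make_stage("pipeline_generation")
f4 = make_stage("code_generation")

orchestrate = compose(f4, f3, f2, f1)
a4 = orchestrate({"intent": "forecast a stock index"})  # placeholder intent
print(a4["trace"])
```

The printed trace lists the stages in execution order, from problem definition through code generation, which matches the order in which the paper describes the four stages.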

**Figure 8.** Code Generation System Message (Excerpt)

**Figure 10.** Jute Pests Classification Model Performance

**Figure 11.** Parkinsons Telemonitoring Model Performance

**Figure 12.** Product Classification Model Performance

**Figure 13.** Waste Material Classification Model Performance

**Figure 14.** Stock Market Forecast Model Performance

Adjacent text from the paper: … accuracy, our framework provides a more comprehensive evaluation using precision, recall, and F1-score, offering deeper insight into model behavior. In conclusion, the experimental results summarized in …

Original abstract

Artificial Intelligence (AI) pipelines have become integral to modern research, supporting fields such as Medical Sciences, Agriculture, and Social Sciences, and enabling large-scale data analysis, predictive modeling, and the automation of complex tasks. However, designing and implementing AI solutions remains challenging for many researchers due to the expertise required in the design and development of end-to-end AI systems. To address this gap, we present Domain-Driven Adaptable AI Pipelines (DDAP), a controlled, human-in-the-loop, agentic framework that leverages large language models to guide users in a systematic construction of AI pipelines and their corresponding implementation code. DDAP structures the development process into four stages: problem definition, compute environment specification, pipeline generation, and code generation. Through this staged interaction, the framework adapts to domain context, user expertise, and resource constraints, while maintaining user control over key decisions. We evaluate DDAP across multiple datasets spanning business, biology, and health science domains by comparing its AI models against expert-developed models. The experimental results show that DDAP achieves competitive results in several tasks compared to expert baselines, although performance varies across problem types, particularly for text-based clustering tasks. By combining guided interaction, adaptability, and reproducibility, DDAP demonstrates that a controlled agentic framework can generate competitive AI pipelines for non-expert users.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note · private letter to a colleague

DDAP gives non-experts a four-stage human-controlled process for LLM-assisted AI pipelines, but the competitive results are stated without metrics or details on how much user correction was needed.

DDAP structures the work into problem definition, compute environment specification, pipeline generation, and code generation, with user control and domain adaptation at each step. This staged human-in-the-loop design is the main concrete contribution for helping scientists outside AI fields build pipelines without starting from scratch. It addresses a real need in areas like biology and health science where researchers want to apply AI but lack the technical background. The emphasis on reproducibility and adaptability to user expertise and resources is a practical strength that builds sensibly on existing AutoML and LLM coding tools. The paper shows clear thinking about keeping the human in charge to avoid unchecked LLM errors.

The evaluation section is the clear weak point. The abstract says the results are competitive with expert baselines on business, biology, and health tasks, yet it supplies no numbers, no dataset details, no error bars, and no account of how many overrides or debugging steps the users performed. Because the framework is explicitly designed for human intervention, the lack of that information makes it hard to judge how much the automated stages actually deliver. The variation in performance, especially weaker results on text-based clustering, is noted but left unexplained.

This paper is aimed at domain scientists and tool developers who want structured ways to lower the AI expertise barrier. Readers working on human-AI collaboration in scientific workflows could extract the staging idea for their own use. It deserves peer review because the core framing is coherent and the target problem matters, even though the current evidence is limited. I would send it to referees with a request to add quantitative results and analysis of the human effort required at each stage.

Referee Report

2 major / 1 minor

Summary. The paper presents Domain-Driven Adaptable AI Pipelines (DDAP), a controlled human-in-the-loop agentic framework that uses large language models to help non-AI expert scientists build AI pipelines and implementation code. The process is divided into four stages: problem definition, compute environment specification, pipeline generation, and code generation. The framework is evaluated on datasets from business, biology, and health science domains, where it is reported to achieve competitive results compared to expert baselines, with noted variations in performance across different problem types, especially text-based clustering tasks.

Significance. If the results hold under rigorous quantitative evaluation, DDAP could be significant in lowering barriers for domain scientists to adopt AI techniques, fostering greater reproducibility and adaptability in research across various fields. The human-in-the-loop aspect ensures user control, which is a positive design choice for practical usability.

major comments (2)

[Evaluation] The manuscript claims that 'the experimental results show that DDAP achieves competitive results in several tasks compared to expert baselines' but does not include any quantitative metrics, error bars, dataset details, or statistical tests. This is a load-bearing issue for the central claim as it prevents verification of competitiveness.
[Human-in-the-Loop Aspects] Given that DDAP is explicitly a human-in-the-loop framework allowing user control at each stage, the paper should detail the extent of human interventions, such as corrections to LLM-generated pipelines or code, required to reach the reported performance levels. Without this, the attribution of results to the agentic component remains unclear.

minor comments (1)

[Abstract] The abstract mentions performance variation by task type but does not explain the reasons for underperformance in text-based clustering tasks.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their constructive feedback on our manuscript describing the DDAP framework. The comments identify important areas for strengthening the evaluation and clarifying the human-in-the-loop contributions. We address each point below and will revise the manuscript to incorporate the requested details.

Referee: [Evaluation] The manuscript claims that 'the experimental results show that DDAP achieves competitive results in several tasks compared to expert baselines' but does not include any quantitative metrics, error bars, dataset details, or statistical tests. This is a load-bearing issue for the central claim as it prevents verification of competitiveness.

Authors: We acknowledge that the current version of the manuscript presents the competitiveness claim at a high level without sufficient supporting quantitative evidence. The full experimental section compares DDAP outputs to expert baselines across business, biology, and health domains but omits explicit metrics, variability measures, dataset specifications, and statistical analysis. We will add a revised evaluation section that includes performance tables with metrics such as accuracy, precision, recall, or clustering scores as appropriate; error bars or standard deviations from repeated runs where applicable; complete dataset descriptions including sizes, sources, and preprocessing steps; and statistical tests (e.g., paired t-tests or non-parametric equivalents) to substantiate the 'competitive' characterization. This revision will enable independent verification of the results. revision: yes
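As an illustration of the "non-parametric equivalents" mentioned in this response, the sketch below runs an exact paired sign-flip permutation test on per-dataset score pairs. This is one possible choice, not a test the authors commit to. The function name and the limit on the number of pairs are assumptions, and the scores passed in would have to come from the actual experiments.

```python
from itertools import product
from typing import Sequence


def paired_sign_flip_test(a: Sequence[float], b: Sequence[float]) -> float:
    """Exact two-sided p-value for the mean paired difference between a and b.

    Under the null hypothesis each paired difference is equally likely to have
    either sign, so every sign assignment is enumerated and compared with the
    observed mean difference.
    """
    if len(a) != len(b) or len(a) == 0:
        raise ValueError("a and b must be non-empty and of equal length")
    diffs = [x - y for x, y in zip(a, b)]
    n = len(diffs)
    if n > 20:
        raise ValueError("exact enumeration is only practical for small n")
    observed = abs(sum(diffs)) / n
    extreme = 0
    for signs in product((1.0, -1.0), repeat=n):
        stat = abs(sum(s * d for s, d in zip(signs, diffs))) / n
        # Small tolerance so floating-point ties count as "at least as extreme".
        if stat >= observed - 1e-12:
            extreme += 1
    return extreme / 2 ** n
```

With n pairs, the smallest attainable two-sided p-value is 2 / 2^n, so a handful of datasets cannot yield a very small p-value no matter how large the gap is.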
Referee: [Human-in-the-Loop Aspects] Given that DDAP is explicitly a human-in-the-loop framework allowing user control at each stage, the paper should detail the extent of human interventions, such as corrections to LLM-generated pipelines or code, required to reach the reported performance levels. Without this, the attribution of results to the agentic component remains unclear.

Authors: We agree that quantifying human interventions is necessary to properly attribute performance to the agentic framework versus user guidance. The manuscript describes the four-stage process and user control but does not report the frequency or nature of interventions observed during evaluation. In the revision, we will include a new subsection on human-in-the-loop usage that reports, based on our experimental logs, the average number of user corrections or refinements per stage, representative examples of interventions (such as adjusting problem definitions or validating generated code), and an analysis of how these interventions influenced final pipeline performance. This will provide clearer insight into the balance between automation and human oversight. revision: yes
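The "average number of user corrections or refinements per stage" promised here could be computed from a simple event log. The sketch below assumes a hypothetical log format of `(run_id, stage, kind)` tuples. The stage names come from the paper, while the log schema, the function name, and the sample events are made up for illustration.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple

# Stage names follow the paper; the log schema is an assumption.
STAGES = (
    "problem_definition",
    "compute_environment",
    "pipeline_generation",
    "code_generation",
)


def mean_interventions_per_stage(
    log: Iterable[Tuple[str, str, str]], n_runs: int
) -> Dict[str, float]:
    """Average number of logged user interventions per stage across n_runs runs.

    n_runs is passed explicitly because a run with no interventions leaves
    no entries in the log but still counts in the average.
    """
    if n_runs <= 0:
        raise ValueError("n_runs must be positive")
    counts: Counter = Counter()
    for _run_id, stage, _kind in log:
        if stage not in STAGES:
            raise ValueError(f"unknown stage: {stage}")
        counts[stage] += 1
    return {stage: counts[stage] / n_runs for stage in STAGES}


# Made-up events for two runs, purely to show the input shape.
example_log = [
    ("run-1", "problem_definition", "edited task description"),
    ("run-1", "code_generation", "requested code repair"),
    ("run-2", "code_generation", "requested code repair"),
]
print(mean_interventions_per_stage(example_log, n_runs=2))
```

Reporting these averages next to the performance figures would show how much of a result came from the automated stages and how much from user corrections.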

Circularity Check

0 steps flagged

No significant circularity: evaluation uses external expert baselines

The paper presents DDAP as a four-stage human-in-the-loop framework and supports its claims through direct empirical comparison of generated pipelines against independently developed expert baselines on external datasets from business, biology, and health-science domains. No mathematical derivations, equations, fitted parameters, or predictions are defined; the performance claims rest on external benchmarks rather than any self-referential construction or self-citation chain. The evaluation is therefore self-contained against independent references.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The central claim rests on the assumption that current LLMs possess sufficient reasoning capability to translate user intent into valid pipelines and code across domains; no free parameters or new entities are introduced.

axioms (1)

Domain assumption: Large language models can accurately interpret domain-specific user intent and generate appropriate AI pipeline structures and code.
Invoked throughout the description of the four-stage interaction process.

Reference graph

Works this paper leans on

67 extracted references · 67 canonical work pages · 3 internal anchors

[1]

Oguz Akbilgic. 2013. ISTANBUL STOCK EXCHANGE. UCI Machine Learning Repository. DOI: https://doi.org/10.24432/C54P4J

work page doi:10.24432/c54p4j 2013
[2]

Oguz Akbilgic, Hamparsum Bozdogan, and M Erdal Balaban. 2014. A novel hybrid RBF neural networks model as a forecaster.Statistics and Computing24, 3 (2014), 365–375

work page 2014
[3]

Leonidas Akritidis. 2020. Product Classification and Clustering. UCI Machine Learning Repository. DOI: https://doi.org/10.24432/C5M91Z

work page doi:10.24432/c5m91z 2020
[4]

Leonidas Akritidis, Athanasios Fevgas, and Panayiotis Bozanis. 2018. Effective products categorization with importance scores and morphological analysis of the titles. In2018 IEEE 30th international conference on tools with artificial intelligence (ICTAI). IEEE, 213–220

work page 2018
[5]

Leonidas Akritidis, Athanasios Fevgas, Panayiotis Bozanis, and Christos Makris

work page
[6]

A self-verifying clustering approach to unsupervised matching of product titles.Artificial Intelligence Review53, 7 (2020), 4777–4820

work page 2020
[7]

Răzvan Daniel Albu and Florin Lucian Morgoş. 2025. AI-Assisted Low-Code Plat- forms in Modern Research. In2025 18th International Conference on Engineering of Modern Electric Systems (EMES). IEEE, 1–4

work page 2025
[8]

Anonymous, Anonymous, and Anonymous. 2026. AI Pipeline Generation for Sci- entists Without AI Expertise Using a Controlled Agentic Framework (Replication Package). doi:10.5281/zenodo.19241799

work page doi:10.5281/zenodo.19241799 2026
[9]

Muhammad Arslan, Hussam Ghanem, Saba Munawar, and Christophe Cruz. 2024. A Survey on RAG with LLMs.Procedia computer science246 (2024), 3781–3790

work page 2024
[10]

Suriya Ganesh Ayyamperumal and Limin Ge. 2024. Current state of LLM Risks and AI Guardrails.arXiv preprint arXiv:2406.12934(2024)

work page arXiv 2024
[11]

James Bergstra, Rémi Bardenet, Yoshua Bengio, and Balázs Kégl. 2011. Algorithms for hyper-parameter optimization.Advances in neural information processing systems24 (2011)

work page 2011
[12]

Alexander C Bock and Ulrich Frank. 2021. Low-code platform.Business & Information Systems Engineering63, 6 (2021), 733–740

work page 2021
[13]

Rishi Bommasani, Drew A Hudson, Ehsan Adeli, Russ Altman, Simran Arora, Sydney von Arx, Michael S Bernstein, Jeannette Bohg, Antoine Bosselut, Emma Brunskill, et al. 2021. On the opportunities and risks of foundation models.arXiv preprint arXiv:2108.07258(2021)

work page internal anchor Pith review Pith/arXiv arXiv 2021
[14]

Browne, Edward Powley, Daniel Whitehouse, Simon M

Cameron B. Browne, Edward Powley, Daniel Whitehouse, Simon M. Lucas, Pe- ter I. Cowling, Philipp Rohlfshagen, Stephen Tavener, Diego Perez, Spyridon Samothrakis, and Simon Colton. 2012. A survey of Monte Carlo tree search methods.IEEE Transactions on Computational Intelligence and AI in Games4, 1 (2012), 1–43. doi:10.1109/TCIAIG.2012.2186810

work page doi:10.1109/tciaig.2012.2186810 2012
[15]

Daqing Chen. 2015. Online Retail. UCI Machine Learning Repository. DOI: https://doi.org/10.24432/C5BW33

work page doi:10.24432/c5bw33 2015
[16]

Daqing Chen, Sai Laing Sain, and Kun Guo. 2012. Data mining for the online retail industry: A case study of RFM model-based customer segmentation using data mining.Journal of Database Marketing & Customer Strategy Management 19, 3 (2012), 197–208

work page 2012
[17]

2025.Built to make you extraordinarily productive, Cursor is the best way to code with AI.Retrieved September 29, 2025 from https://cursor.com/

Cursor. 2025.Built to make you extraordinarily productive, Cursor is the best way to code with AI.Retrieved September 29, 2025 from https://cursor.com/

work page 2025
[18]

Stefano D’Urso, Barbara Martini, and Filippo Sciarrone. 2024. A Novel LLM Architecture for Intelligent System Configuration. In2024 28th International Conference Information Visualisation (IV). IEEE, 326–331

work page 2024
[19]

Unai Garciarena, Roberto Santana, and Alexander Mendiburu. 2018. Analysis of the complexity of the automatic pipeline generation problem. In2018 IEEE Congress on Evolutionary Computation (CEC). IEEE, 1–8

work page 2018
[20]

M Ghanem, AK Ghaith, VG El-Hajj, A Bhandarkar, A de Giorgio, A Elmi-Terander, et al. [n. d.]. Limitations in evaluating machine learning models for imbalanced binary outcome classification in spine surgery: A systematic review. Brain Sci. 2023; 13 (12): 1723

work page 2023
[21]

2025.AI that builds with you

GitHub. 2025.AI that builds with you. Retrieved September 29, 2025 from https://github.com/features/copilot

work page 2025
[22]

Manuel Goyanes, Carlos Lopezosa, and Valeriano Piñeiro-Naval. 2025. The use of artificial intelligence (AI) in research: a review of author guidelines in leading journals across eight social science disciplines.Scientometrics(2025), 1–17

work page 2025
[23]

Yang Gu, Hengyu You, Jian Cao, Muran Yu, Haoran Fan, and Shiyou Qian. 2025. Large language models for constructing and optimizing machine learning work- flows: A survey.ACM Transactions on Software Engineering and Methodology (2025)

work page 2025
[24]

Yuval Heffetz, Roman Vainshtein, Gilad Katz, and Lior Rokach. 2020. Deepline: Automl tool for pipelines generation using deep reinforcement learning and hierarchical actions filtering. InProceedings of the 26th ACM SIGKDD international conference on knowledge discovery & data mining. 2103–2113

work page 2020
[25]

2026.UCI Machine Learning Repository

University of California Irvine. 2026.UCI Machine Learning Repository. Retrieved March 16, 2026 from https://archive.ics.uci.edu/

work page 2026
[26]

Muhammad Tanvirul Islam. 2024. Jute Pest. UCI Machine Learning Repository. DOI: https://doi.org/10.24432/C5289P

work page doi:10.24432/c5289p 2024
[27]

Muhammad Tanvirul Islam and Md Sadekur Rahman. 2024. An efficient deep learning approach for jute pest classification using transfer learning. In2024 6th International Conference on Electrical Engineering and Information & Communica- tion Technology (ICEEICT). IEEE, 1473–1478

work page 2024
[28]

Sathvik Joel, Jie Wu, and Fatemeh Fard. 2024. A survey on llm-based code generation for low-resource and domain-specific programming languages.ACM Transactions on Software Engineering and Methodology(2024)

work page 2024
[29]

Osama Khan, Mohd Parvez, Pratibha Kumari, Samia Parvez, and Shadab Ahmad

work page
[30]

The future of pharmacy: how AI is revolutionizing the industry.Intelligent Pharmacy1, 1 (2023), 32–40

work page 2023
[31]

2011.The 80/20 Principle: The Secret of Achieving More with Less: Updated 20th anniversary edition of the productivity and business classic

Richard Koch. 2011.The 80/20 Principle: The Secret of Achieving More with Less: Updated 20th anniversary edition of the productivity and business classic. Hachette UK

work page 2011
[32]

Yulia Kumar, Wenxiao Li, Kuan Huang, Michael Thompson, and Brendan Hannon

work page
[33]

In2023 IEEE 23rd International Conference on Software Quality, Reliability, and Security Companion (QRS-C)

Natural Language Coding (NLC) for Autonomous Stock Trading: A New Dimension in No-Code/Low-Code (NCLC) AI. In2023 IEEE 23rd International Conference on Software Quality, Reliability, and Security Companion (QRS-C). IEEE, 873–874

work page
[34]

DonHee Lee and Seong No Yoon. 2021. Application of artificial intelligence- based technologies in the healthcare industry: Opportunities and challenges. International journal of environmental research and public health18, 1 (2021), 271

work page 2021
[35]

Patrick Lewis, Ethan Perez, Aleksandra Piktus, Fabio Petroni, Vladimir Karpukhin, Naman Goyal, Heinrich Küttler, Mike Lewis, Wen-tau Yih, Tim Rocktäschel, et al. 2020. Retrieval-augmented generation for knowledge-intensive nlp tasks. Advances in neural information processing systems33 (2020), 9459–9474

work page 2020
[36]

Pengfei Liu, Weizhe Yuan, Jinlan Fu, Zhengbao Jiang, Hiroaki Hayashi, and Graham Neubig. 2023. Pre-train, prompt, and predict: A systematic survey of prompting methods in natural language processing.ACM computing surveys55, 9 (2023), 1–35

work page 2023
[37]

Kritin Maddireddy, Santhosh Kotekal Methukula, Chandrasekar Sridhar, and Karthik Vaidhyanathan. 2025. LoCoML: A Framework for Real-World ML Infer- ence Pipelines. In2025 IEEE/ACM 4th International Conference on AI Engineering– Software Engineering for AI (CAIN). IEEE, 83–88

work page 2025
[38]

Eder Martinez and Diego Cisterna. 2023. Using low-code and artificial intelligence to support continuous improvement in the construction industry. InProceedings of the 31st Annual Conference of the International Group for Lean Construction (IGLC31). 197–207

work page 2023
[39]

2025.Introducing Code Llama, a state-of-the-art large language model for coding

Meta. 2025.Introducing Code Llama, a state-of-the-art large language model for coding. Retrieved September 16, 2025 from https://ai.meta.com/blog/code-llama- large-language-model-coding/

work page 2025
[40]

Bonan Min, Hayley Ross, Elior Sulem, Amir Pouran Ben Veyseh, Thien Huu Nguyen, Oscar Sainz, Eneko Agirre, Ilana Heintz, and Dan Roth. 2023. Recent advances in natural language processing via large pre-trained language models: A survey.Comput. Surveys56, 2 (2023), 1–40

work page 2023
[41]

Shervin Minaee, Tomas Mikolov, Narjes Nikzad, Meysam Chenaghlu, Richard Socher, Xavier Amatriain, and Jianfeng Gao. 2024. Large language models: A survey.arXiv preprint arXiv:2402.06196(2024)

work page internal anchor Pith review Pith/arXiv arXiv 2024
[42]

Vinith Kumar Nair, R Harikrishnan, S Anjali, Kavya Gopan, et al. 2024. Barriers to AI Adoption in Sales: Challenges and Implications for Sales Professionals Using the Total Interpretive Structural Modelling (TISM) Approach. In2024 IEEE 4th International Conference on ICT in Business Industry & Government (ICTBIG). IEEE, 1–5

work page 2024
[43]

Bentley James Oakes, Michalis Famelis, and Houari Sahraoui. 2024. Building domain-specific machine learning workflows: A conceptual framework for the state of the practice.ACM Transactions on Software Engineering and Methodology 33, 4 (2024), 1–50

work page 2024
[44]

Cecilia B Öman and Christian Junestedt. 2008. Chemical characterization of landfill leachates–400 parameters and compounds.Waste management28, 10 (2008), 1876–1891

work page 2008
[45]

Gunjan Paliwal, Anujkumarsinh Donvir, Praveen Gujar, and Sriram Panyam

work page
[46]

In 2024 IEEE Eighth Ecuador Technical Chapters Meeting (ETCM)

Low-code/no-code meets GenAI: A new era in product development. In 2024 IEEE Eighth Ecuador Technical Chapters Meeting (ETCM). IEEE, 1–9

work page 2024
[47]

Deven Panchal, Isilay Baran, Dan Musgrove, and David Lu. 2023. MLOps: Creat- ing powerful AI pipelines by stitching together heterogeneous Machine Learning models. In2023 IEEE International Conference on Technology Management, Opera- tions and Decisions (ICTMOD). IEEE, 1–6

work page 2023
[48]

Traian Rebedea, Razvan Dinu, Makesh Sreedhar, Christopher Parisien, and Jonathan Cohen. 2023. Nemo guardrails: A toolkit for controllable and safe llm applications with programmable rails.arXiv preprint arXiv:2310.10501(2023)

work page arXiv 2023
[49]

K Satyanarayan Reddy, Rajesh Gotur, and Vandana Bhat. 2025. Generative AI Adoption in Enterprise: A Comprehensive Case Study Analysis of Implementa- tion Strategies and Outcomes Across Diverse Sectors. In2025 6th International Conference on Recent Advances in Information Technology (RAIT). IEEE, 1–6

work page 2025
[50]

Zhao Ru-tao, Wang Jing, Chen Gao-jian, Li Qian-wen, and Yuan Yun-jing. 2020. A Machine learning pipeline generation approach for data analysis. In2020 IEEE 6th International Conference on Computer and Communications (ICCC). IEEE, 1488–1493

work page 2020
[51]

Ripon K Saha, Akira Ura, Sonal Mahajan, Chenguang Zhu, Linyi Li, Yang Hu, Hiroaki Yoshida, Sarfraz Khurshid, and Mukul R Prasad. 2022. SapientML: syn- thesizing machine learning pipelines by learning from human-writen solutions. 11 InProceedings of the 44th international conference on software engineering. 1932– 1944

work page 2022
[52]

Murray Shanahan, Kyle McDonell, and Laria Reynolds. 2023. Role play with large language models.Nature623, 7987 (2023), 493–498

work page 2023
[53]

Zeyuan Shang, Emanuel Zgraggen, Benedetto Buratti, Ferdinand Kossmann, Philipp Eichmann, Yeounoh Chung, Carsten Binnig, Eli Upfal, and Tim Kraska

work page
[54]

InProceedings of the 2019 international conference on management of data

Democratizing data science through interactive curation of ml pipelines. InProceedings of the 2019 international conference on management of data. 1171– 1188

work page 2019
[55]

Jayasankar Shyam, Cyril K Sony, Aswin Jeev Johny, Basil Siby, and Jacob Thomas

work page
[56]

In 2025 Emerging Technologies for Intelligent Systems (ETIS)

Bridging the Gap for Non-Programmers with No-Code ML Solutions. In 2025 Emerging Technologies for Intelligent Systems (ETIS). IEEE, 1–5

work page 2025
[57]

Sam Single, Saeid Iranmanesh, and Raad Raad. 2023. RealWaste. UCI Machine Learning Repository. DOI: https://doi.org/10.24432/C5SS4G

work page doi:10.24432/c5ss4g 2023
[58]

Sam Single, Saeid Iranmanesh, and Raad Raad. 2023. Realwaste: a novel real-life data set for landfill waste classification using deep learning.Information14, 12 (2023), 633

work page 2023
[59]

Jasper Snoek, Hugo Larochelle, and Ryan P Adams. 2012. Practical bayesian optimization of machine learning algorithms.Advances in neural information processing systems25 (2012)

work page 2012
[60]

Vladimir Sonkin and Cătălin Tudose. 2025. Beyond Snippet Assistance: A Workflow-Centric Framework for End-to-End AI-Driven Code Generation.Com- puters14, 3 (2025), 94

work page 2025
[61]

Maojun Sun, Ruijian Han, Binyan Jiang, Houduo Qi, Defeng Sun, Yancheng Yuan, and Jian Huang. 2025. Lambda: A large model based data agent.J. Amer. Statist. Assoc.(2025), 1–13

work page 2025
[62]

Athanasios Tsanas and Max Little. 2009. Parkinsons Telemonitoring. UCI Machine Learning Repository. DOI: https://doi.org/10.24432/C5ZS3N

work page doi:10.24432/c5zs3n 2009
[63]

Athanasios Tsanas, Max Little, Patrick McSharry, and Lorraine Ramig. 2009. Accurate telemonitoring of Parkinson’s disease progression by non-invasive speech tests.Nature Precedings(2009), 1–1

work page 2009
[64]

Jason Wei, Xuezhi Wang, Dale Schuurmans, Maarten Bosma, Fei Xia, Ed Chi, Quoc V Le, Denny Zhou, et al. 2022. Chain-of-thought prompting elicits reasoning in large language models.Advances in neural information processing systems35 (2022), 24824–24837

work page 2022
[65]

Yujing Yang, Boqi Chen, Kua Chen, Gunter Mussbacher, and Dániel Varró. 2024. Multi-step iterative automated domain modeling with large language models. InProceedings of the ACM/IEEE 27th International Conference on Model Driven Engineering Languages and Systems. 587–595

work page 2024
[66]

Shunyu Yao, Jeffrey Zhao, Dian Yu, Nan Du, Izhak Shafran, Karthik R Narasimhan, and Yuan Cao. 2022. React: Synergizing reasoning and acting in language models. InThe eleventh international conference on learning representations

[67]

Penghao Zhao, Hailin Zhang, Qinhan Yu, Zhengren Wang, Yunteng Geng, Fangcheng Fu, Ling Yang, Wentao Zhang, Jie Jiang, and Bin Cui. 2024. Retrieval-augmented generation for AI-generated content: A survey. arXiv preprint arXiv:2402.19473 (2024)

