Industry Practitioners Perspectives on AI Model Quality: Perceptions, Challenges, and Solutions

Chenyu Wang; Daniela Damian; David Lo; Yunbo Lyu; Ze Shi Li; Zhou Yang

arxiv: 2402.16391 · v4 · submitted 2024-02-26 · 💻 cs.SE

Pith reviewed 2026-05-24 04:20 UTC · model grok-4.3

classification 💻 cs.SE

keywords AI model quality · practitioner perspectives · quality attributes · data imbalance · active learning · software engineering · AI deployment

The pith

AI model quality priorities shift by application context according to industry interviews.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper establishes that industry practitioners do not treat all quality attributes equally; their priorities depend on the specific use case, such as favoring efficiency over correctness in real-time systems. Data imbalance is identified as a key challenge to correctness and robustness, with active learning as a common mitigation. The findings from 15 interviews are validated by a survey of 50 practitioners, suggesting these views are common. A sympathetic reader would care because this can guide AI research to address the attributes that matter most in practice rather than assuming universal importance of correctness.

Core claim

Through interviews with 15 AI practitioners, the paper finds that practitioners prioritize quality attributes differently depending on context. For instance, efficiency can be more important than correctness in real-time applications, while scalability and deployability are no longer primary concerns. Data imbalance is a major obstacle to maintaining model correctness and robustness, and practitioners often use strategies like active learning to mitigate it. These findings are largely confirmed by a survey of 50 practitioners.
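The paper names active learning as a mitigation for data imbalance but, as summarized here, does not specify a strategy. As a rough illustration only, the sketch below uses uncertainty sampling, one common active-learning strategy: train on the labeled set, then ask an annotator to label the pool items the model is least sure about. The logistic model, the `toy_oracle` labeling rule, the seed examples, and all parameter values are assumptions made for this example, not details from the paper.

```python
import math
import random


def predict_proba(weights, x):
    """Logistic model: estimated probability that x is in the rare (positive) class."""
    z = weights[0] + sum(w * xi for w, xi in zip(weights[1:], x))
    z = max(-30.0, min(30.0, z))  # clamp to avoid overflow in exp
    return 1.0 / (1.0 + math.exp(-z))


def fit(labeled, epochs=200, lr=0.1):
    """Train a small logistic regression with plain stochastic gradient descent."""
    dim = len(labeled[0][0])
    weights = [0.0] * (dim + 1)  # bias followed by one weight per feature
    for _ in range(epochs):
        for x, y in labeled:
            err = predict_proba(weights, x) - y
            weights[0] -= lr * err
            for i, xi in enumerate(x):
                weights[i + 1] -= lr * err * xi
    return weights


def select_most_uncertain(weights, pool, k):
    """Return indices of the k pool items whose predicted probability is closest to 0.5."""
    ranked = sorted(range(len(pool)),
                    key=lambda i: abs(predict_proba(weights, pool[i]) - 0.5))
    return ranked[:k]


def active_learning_loop(pool, labeled, oracle, rounds=5, batch_size=10):
    """Repeatedly retrain, query the oracle for the most uncertain items, and grow the labeled set."""
    pool, labeled = list(pool), list(labeled)
    for _ in range(rounds):
        weights = fit(labeled)
        chosen = set(select_most_uncertain(weights, pool, batch_size))
        labeled += [(pool[i], oracle(pool[i])) for i in sorted(chosen)]
        pool = [x for i, x in enumerate(pool) if i not in chosen]
    return labeled, pool


def toy_oracle(x):
    """Hypothetical annotator: the positive class is rare (only the top corner of the unit square)."""
    return 1 if x[0] + x[1] > 1.6 else 0


rng = random.Random(0)
unlabeled = [(rng.random(), rng.random()) for _ in range(500)]
seed = [((0.95, 0.95), 1), ((0.10, 0.10), 0)]  # one example per class to start
labeled, remaining = active_learning_loop(unlabeled, seed, toy_oracle)
print(len(labeled), "labeled;", len(remaining), "still unlabeled")
```

The usual rationale is that querying near the decision boundary spends the labeling budget where rare-class examples are most informative, rather than on the abundant majority class. Whether this matches what the interviewed practitioners actually do is not stated in the text above.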

What carries the argument

Context-dependent prioritization of nine key quality attributes, revealed through practitioner interviews and validated by survey.

If this is right

Researchers should focus on attributes practitioners value most, such as efficiency in certain contexts.
Improving one attribute should not come at the expense of others considered more critical.
Data imbalance mitigation techniques like active learning should be further developed.
Scalability and deployability may receive less attention in future AI development.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Quality assessment frameworks for AI may need to be customizable based on application domain.
This suggests potential trade-offs in model development that current benchmarks do not capture.
Future studies could observe actual deployed models to verify self-reported practices.
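A context-customizable quality assessment could, for instance, score the same candidate models under different attribute weights. The sketch below is purely illustrative: the two contexts, the weights, the two attributes chosen from the nine, and the model scores are all hypothetical values picked to show how a ranking can flip between contexts. None of them come from the paper.

```python
# Hypothetical per-context weights over two of the quality attributes.
CONTEXT_WEIGHTS = {
    "real_time": {"efficiency": 0.6, "correctness": 0.4},
    "offline_batch": {"efficiency": 0.1, "correctness": 0.9},
}

# Hypothetical normalized scores (0 to 1, higher is better) for two candidate models.
MODEL_SCORES = {
    "large": {"efficiency": 0.40, "correctness": 0.95},
    "small": {"efficiency": 0.90, "correctness": 0.85},
}


def weighted_quality(scores, weights):
    """Weighted average of attribute scores under one context's weights."""
    total = sum(weights.values())
    return sum(scores[attr] * w for attr, w in weights.items()) / total


def best_model(context):
    """Name of the model with the highest weighted quality in the given context."""
    weights = CONTEXT_WEIGHTS[context]
    return max(MODEL_SCORES, key=lambda name: weighted_quality(MODEL_SCORES[name], weights))


for context in CONTEXT_WEIGHTS:
    print(context, "->", best_model(context))
```

With these made-up numbers the smaller, faster model ranks first in the real-time context and the larger, more accurate model ranks first in the offline context, which is the kind of trade-off a single fixed benchmark would not surface.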

Load-bearing premise

The 15 interviewed and 50 surveyed practitioners represent the broader population of industry AI practitioners and their self-reports match actual practices.

What would settle it

A study finding that a majority of practitioners still consider scalability a primary concern across contexts would contradict the claim that scalability and deployability are no longer primary concerns.

Figures

Figures reproduced from arXiv: 2402.16391 by Chenyu Wang, Daniela Damian, David Lo, Yunbo Lyu, Ze Shi Li, Zhou Yang.

**Figure 1.** The workflow of our study.

**Figure 2.** The ranking result of each QA4AI property.

Original abstract

Artificial Intelligence (AI) is now used across nearly every industry, making AI model quality essential for building reliable and trustworthy systems. Historically, correctness has been the main focus, but industry AI models must also satisfy many other important quality attributes. To understand how these attributes are perceived, the challenges they create, and the solutions used in practice, we identify nine key quality attributes and interview 15 AI practitioners from diverse backgrounds. The interviews show that practitioners prioritize attributes differently depending on context. For example, efficiency can matter more than correctness in real-time applications, while scalability and deployability are no longer seen as primary concerns. Data imbalance emerges as a major obstacle to maintaining model correctness and robustness, and practitioners commonly use mitigation strategies such as active learning. We validate our main findings with a survey of 50 practitioners, which shows that most of the findings are widely recognized. These results can help researchers focus on the attributes practitioners value most and avoid improving one attribute at the expense of others that are considered more critical.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

Practitioner interviews surface context-dependent quality priorities and data imbalance issues, but the small convenience sample undercuts broad claims.

The main point is that this paper gathers interview data from 15 AI practitioners and follows up with a 50-person survey to map what they care about in model quality. It reports that priorities shift by context, efficiency can outweigh correctness in real-time settings, scalability and deployability are not always central, and data imbalance stands out as a practical obstacle that teams address with active learning. The survey suggests these observations are recognized more widely.

That is the core contribution: direct input from people building and deploying models rather than just academic assumptions about what should matter. The work is straightforward empirical software engineering and adds a practitioner lens to the nine attributes without claiming a new theory. The sampling and analysis details are thin in the abstract, and the full paper would need to show exactly how the attributes were derived and how quotes or counts support the data imbalance finding.

The bigger limitation is representativeness. Fifteen interviews plus fifty survey responses is a modest base, and without clear information on selection, company types, roles, or response rates it is difficult to separate real patterns from who happened to respond. Self-reported strategies also sit one step removed from observed practice.

This paper is aimed at researchers in AI engineering and empirical SE who want to know what current practitioners say they value. A reader working on model quality trade-offs or tooling could pick up useful signals here. It deserves peer review because the topic is relevant and the method is standard for this kind of study, though referees would likely press for stronger sampling description and more transparent analysis steps.

Referee Report

3 major / 0 minor

Summary. The paper claims that AI model quality involves nine key attributes beyond correctness; interviews with 15 practitioners reveal context-dependent prioritization (e.g., efficiency over correctness in real-time settings; scalability/deployability no longer primary), with data imbalance as a major obstacle to correctness/robustness and active learning as a common mitigation; a follow-up survey of 50 practitioners validates that most findings are widely recognized.

Significance. If the empirical claims hold, the work could usefully redirect research attention toward practitioner-valued attributes and trade-offs. The mixed-methods design (interviews plus validation survey) is a strength when methods are transparent.

major comments (3)

[Abstract, §3] Abstract and §3 (Methods): the central generalization claims (context-dependent prioritization, data imbalance as 'major obstacle', active learning as 'commonly used') rest on the untested representativeness of the convenience sample of 15 interviewees plus the 50 survey respondents. No information is supplied on recruitment method, response rate, stratification by role/company size/domain, or external validation of self-reports against observed practice; this directly undermines the load-bearing premise identified above.
[Abstract] Abstract: the process for identifying the nine quality attributes is not described (e.g., whether derived from prior literature, pilot interviews, or thematic analysis of the 15 transcripts). Without this, it is impossible to assess whether the attribute set is exhaustive or biased toward the sampled practitioners.
[Abstract] Abstract and validation paragraph: the survey is said to show that 'most of the findings are widely recognized,' yet no quantitative results, response distributions, or statistical tests are referenced; this leaves the validation claim unsupported.

Simulated Author's Rebuttal

3 responses · 1 unresolved

We thank the referee for the constructive feedback. We address each major comment below, agreeing where revisions are needed to improve transparency while defending the exploratory nature of the mixed-methods design.

Referee: [Abstract, §3] Abstract and §3 (Methods): the central generalization claims rest on the untested representativeness of the 15-interviewee convenience sample plus 50-survey respondents. No information is supplied on recruitment method, response rate, stratification by role/company size/domain, or external validation of self-reports against observed practice.

Authors: We agree the manuscript should provide more methodological transparency. The sample is a convenience sample recruited via professional networks and LinkedIn, which is standard for qualitative SE studies; we do not claim statistical representativeness but present context-specific insights. We will revise §3 to detail recruitment, participant roles/domains, and add a limitations paragraph on generalizability and self-report nature. Response rate is not applicable as it was not a closed survey. revision: yes
Referee: [Abstract] Abstract: the process for identifying the nine quality attributes is not described (e.g., whether derived from prior literature, pilot interviews, or thematic analysis of the 15 transcripts).

Authors: The nine attributes emerged from thematic analysis of the interview data, cross-referenced with prior literature on software quality attributes (e.g., ISO 25010 extensions for ML). We will revise the abstract and §3 to explicitly describe the identification process, including coding approach and how saturation was assessed. revision: yes
Referee: [Abstract] Abstract and validation paragraph: the survey is said to show that 'most of the findings are widely recognized,' yet no quantitative results, response distributions, or statistical tests are referenced.

Authors: We agree this claim requires supporting data. The survey used Likert-scale items; we will add response distributions (e.g., % agreement per finding) and any relevant descriptive statistics in the revised validation section. revision: yes
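The per-finding summary promised in this response could be computed along the following lines. This is a minimal sketch under stated assumptions: the text says only that the survey used Likert-scale items, so the 5-point scale, the rule that ratings of 4 or 5 count as agreement, and the example responses are all hypothetical and are not data from the paper.

```python
from collections import Counter

# Assumed 5-point scale; the source does not state the number of points.
LIKERT_LABELS = {
    1: "strongly disagree",
    2: "disagree",
    3: "neutral",
    4: "agree",
    5: "strongly agree",
}


def summarize_likert(responses):
    """Return (count per label, percent of responses rated 4 or 5) for one survey item."""
    if not responses:
        raise ValueError("at least one response is required")
    counts = Counter(responses)
    distribution = {label: counts.get(score, 0) for score, label in LIKERT_LABELS.items()}
    agreement = 100.0 * (counts.get(4, 0) + counts.get(5, 0)) / len(responses)
    return distribution, agreement


# Invented responses for a single finding, for illustration only.
example = [5, 4, 4, 3, 2, 5, 4, 1, 4, 5]
distribution, agreement = summarize_likert(example)
print(distribution)
print(f"{agreement:.1f}% agreement")
```

Reporting the full distribution alongside the agreement percentage lets a reader see whether "widely recognized" reflects strong agreement or a large neutral group.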

standing simulated objections not resolved

External validation of self-reports against observed practice is unavailable given the interview/survey design.

Circularity Check

0 steps flagged

No circularity: empirical interview/survey study with no derivation chain

The paper reports practitioner perspectives obtained through 15 interviews and a follow-up survey of 50 respondents. No equations, fitted parameters, predictions, or mathematical derivations appear in the provided text. Claims about attribute prioritization, data imbalance, and mitigation strategies are presented as direct outputs of the collected responses rather than as restatements of prior inputs by construction. There is no load-bearing self-citation, ansatz smuggling, or renaming of known results. Representativeness of the sample is a validity issue for generalization but does not create circularity in any derivation.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The study depends on standard assumptions of qualitative research methods without new free parameters or invented entities.

axioms (1)

Domain assumption: Self-reported data from interviews and surveys accurately capture practitioners' perceptions and practices. This assumption is central to interpreting the findings as reflective of industry realities.

pith-pipeline@v0.9.0 · 5713 in / 1135 out tokens · 28299 ms · 2026-05-24T04:20:21.016709+00:00 · methodology

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Results-Actionability Gap: Understanding How Practitioners Evaluate LLM Products in the Wild
cs.SE 2026-01 conditional novelty 7.0

Qualitative study of 19 practitioners reveals ten LLM product evaluation practices and introduces the results-actionability gap as a key barrier to turning findings into improvements.

Reference graph

Works this paper leans on

144 extracted references · 144 canonical work pages · cited by 1 Pith paper · 12 internal anchors

[1]

[n. d.]. Apache Ignite. https://ignite.apache.org/

work page
[2]

[n. d.]. Apache Spark. https://spark.apache.org/

work page
[3]

[n. d.]. ChatGPT is easily abused, and that’s a big problem. https://adguard.com/en/blog/chatgpt-dan-prompt-abuse.html

work page
[4]

[n. d.]. Kubernetes. https://kubernetes.io/

work page
[5]

[n. d.]. NVIDIA CUDA toolkit. https://developer.nvidia.com/cuda-toolkit

work page
[6]

[n. d.]. NVIDIA TensorRT. https://developer.nvidia.com/tensorrt

work page
[7]

[n. d.]. NVIDIA Triton Inference Server. https://developer.nvidia.com/nvidia-triton-inference-server

work page
[8]

[n. d.]. Personal Data Protection Act. https://www.pdpc.gov.sg/Overview-of-PDPA/The-Legislation/Personal-Data-Protection-Act

work page
[9]

[n. d.]. Pinecone. https://www.pinecone.io/

work page
[10]

[n. d.]. PyTorch. https://pytorch.org/

work page
[11]

[n. d.]. Seldon. https://www.seldon.io/

work page
[12]

[n. d.]. TensorFlow. https://www.tensorflow.org/

work page
[13]

History of the Basel Committee

2014. History of the Basel Committee. https://www.bis.org/bcbs/history.htm

work page 2014
[14]

ISO 9001:2015

2015. ISO 9001:2015. https://www.iso.org/standard/62085.html

work page 2015
[15]

General Data Protection Regulation (GDPR)

2022. General Data Protection Regulation (GDPR). https://gdpr-info.eu/

work page 2022
[16]

Martin Abadi, Andy Chu, Ian Goodfellow, H Brendan McMahan, Ilya Mironov, Kunal Talwar, and Li Zhang. 2016. Deep learning with differential privacy. In Proceedings of the 2016 ACM SIGSAC conference on computer and communications security. 308–318

work page 2016
[17]

Ibrahim M Ahmed and Manar Younis Kashmoola. 2021. Threats on machine learning technique by data poisoning attack: A survey. In Advances in Cyber Security: Third International Conference, ACeS 2021, Penang, Malaysia, August 24–25, 2021, Revised Selected Papers 3 . Springer, 586–600

work page 2021
[19]

Saleema Amershi, Andrew Begel, Christian Bird, Robert DeLine, Harald Gall, Ece Kamar, Nachiappan Nagappan, Besmira Nushi, and Thomas Zimmermann. 2019. Software Engineering for Machine Learning: A Case Study. In 2019 IEEE/ACM 41st International Conference on Software Engineering: Software Engineering in Practice (ICSE-SEIP) . 291–300. https://doi.org/10.11...

work page doi:10.1109/icse-seip.2019.00042 2019
[20]

Shin Ando and Chun-Yuan Huang. 2017. Deep Over-sampling Framework for Classifying Imbalanced Data. arXiv:1704.07515 [cs.LG]

work page internal anchor Pith review Pith/arXiv arXiv 2017
[21]

Alejandro Barredo Arrieta, Natalia Díaz-Rodríguez, Javier Del Ser, Adrien Bennetot, Siham Tabik, Alberto Barbado, Salvador García, Sergio Gil-López, Daniel Molina, Richard Benjamins, Raja Chatila, and Francisco Herrera. 2019. Explainable Artificial Intelligence (XAI): Concepts, Taxonomies, Opportunities and Challenges toward Responsible AI. arXiv:1910.100...

work page arXiv 2019
[22]

Muhammad Hilmi Asyrofi, Zhou Yang, Imam Nur Bani Yusuf, Hong Jin Kang, Ferdian Thung, and David Lo. 2022. BiasFinder: Metamorphic Test Generation to Uncover Bias for Sentiment Analysis Systems. IEEE Transactions on Software Engineering 48, 12 (2022), 5087–5101. https://doi.org/10.1109/TSE.2021.3136169

work page doi:10.1109/tse.2021.3136169 2022
[23]

Yang Bao, Gilles Hilary, and Bin Ke. 2022. Artificial intelligence and fraud detection. Innovative Technology at the Interface of Finance and Operations: Volume I (2022), 223–247

work page 2022
[24]

Hollen Barmer, Rachel Dzombak, Matthew Gaston, Vijaykumar Palat, Frank Redner, Tanisha Smith, and John Wohlbier. Scalable AI. (9 2021). https://doi.org/10.1184/R1/16560273.v1

work page doi:10.1184/r1/16560273.v1 2021
[26]

Mohammad Riyaz Belgaum, Zainab Alansari, Shahrulniza Musa, Muhammad Mansoor Alam, and MS Mazliham. 2021. Role of artificial intelligence in cloud computing, IoT and SDN: Reliability and scalability issues. International Journal of Electrical and Computer Engineering 11, 5 (2021), 4458

work page 2021
[27]

Kartikeya Bhardwaj, Naveen Suda, and Radu Marculescu. 2019. Dream Distillation: A Data-Independent Model Compression Framework. arXiv:1905.07072 [stat.ML]

work page internal anchor Pith review Pith/arXiv arXiv 2019
[28]

Eric Breck, Shanqing Cai, Eric Nielsen, Michael Salib, and D. Sculley. 2017. The ML Test Score: A Rubric for ML Production Readiness and Technical Debt Reduction. In Proceedings of IEEE Big Data

work page 2017
[29]

Christian Cabrera, Andrei Paleyes, Pierre Thodoroff, and Neil D. Lawrence. 2023. Real-world Machine Learning Systems: A survey from a Data-Oriented Architecture Perspective. arXiv:2302.04810 [cs.SE]

work page arXiv 2023
[30]

Longbing Cao. 2021. AI in Finance: Challenges, Techniques and Opportunities. arXiv:2107.09051 [q-fin.CP]

work page arXiv 2021
[31]

Longbing Cao. 2022. AI in Finance: Challenges, Techniques, and Opportunities. ACM Comput. Surv. 55, 3, Article 64 (feb 2022), 38 pages. https://doi.org/10.1145/3502289

work page doi:10.1145/3502289 2022
[32]

Nicholas Carlini, Florian Tramer, Eric Wallace, Matthew Jagielski, Ariel Herbert-Voss, Katherine Lee, Adam Roberts, Tom Brown, Dawn Song, Ulfar Erlingsson, Alina Oprea, and Colin Raffel. 2021. Extracting Training Data from Large Language Models. arXiv:2012.07805 [cs.CR]

work page arXiv 2021
[33]

N. V. Chawla, K. W. Bowyer, L. O. Hall, and W. P. Kegelmeyer. 2002. SMOTE: Synthetic Minority Over-sampling Technique. Journal of Artificial Intelligence Research 16 (jun 2002), 321–357. https://doi.org/10.1613/jair.953

work page doi:10.1613/jair.953 2002
[34]

Karel Crombecq, Luciano De Tommasi, Dirk Gorissen, and Tom Dhaene. 2009. A novel sequential design strategy for global surrogate modeling. In Proceedings of the 2009 Winter Simulation Conference (WSC). 731–742. https://doi.org/10.1109/WSC.2009.5429687

work page doi:10.1109/wsc.2009.5429687 2009
[35]

Daniela S. Cruzes and Tore Dyba. 2011. Recommended Steps for Thematic Synthesis in Software Engineering. In Proceedings of the 2011 International Symposium on Empirical Software Engineering and Measurement (ESEM ’11) . IEEE Computer Society, USA, 275–284. https://doi.org/10.1109/ESEM.2011.36

work page doi:10.1109/esem.2011.36 2011
[36]

Arun Das and Paul Rad. 2020. Opportunities and challenges in explainable artificial intelligence (xai): A survey. arXiv preprint arXiv:2006.11371 (2020)

work page arXiv 2020
[37]

Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. 2019. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. arXiv:1810.04805 [cs.CL]

work page internal anchor Pith review Pith/arXiv arXiv 2019
[38]

Yuanrui Fan, Xin Xia, David Lo, Ahmed E Hassan, and Shanping Li. 2021. What makes a popular academic AI repository? Empirical Software Engineering 26, 1 (2021), 1–35

work page 2021
[39]

Michael Felderer and Rudolf Ramler. 2021. Quality Assurance for AI-Based Systems: Overview and Challenges (Introduction to Interactive Session). In Software Quality: Future Perspectives on Software Engineering Quality . Springer International Publishing, 33–42. https://doi.org/10.1007/978-3-030-65854-0_3

work page doi:10.1007/978-3-030-65854-0_3 2021
[40]

Yang Feng, Qingkai Shi, Xinyu Gao, Jun Wan, Chunrong Fang, and Zhenyu Chen. 2020. DeepGini: Prioritizing Massive Tests to Enhance the Robustness of Deep Neural Networks. In Proceedings of the 29th ACM SIGSOFT International Symposium on Software Testing and Analysis (Virtual Event, USA) (ISSTA 2020). Association for Computing Machinery, New York, NY, USA, ...

work page doi:10.1145/3395363.3397357 2020
[41]

Stefan Feuerriegel, Mateusz Dolata, and Gerhard Schwabe. 2020. Fair AI: Challenges and opportunities. Business & information systems engineering 62 (2020), 379–384

work page 2020
[42]

Aaron Fisher, Cynthia Rudin, and Francesca Dominici. 2019. All Models are Wrong, but Many are Useful: Learning a Variable’s Importance by Studying an Entire Class of Prediction Models Simultaneously. arXiv:1801.01489 [stat.ME]

work page arXiv 2019
[43]

Matt Fredrikson, Somesh Jha, and Thomas Ristenpart. 2015. Model inversion attacks that exploit confidence information and basic countermeasures. In Proceedings of the 22nd ACM SIGSAC conference on computer and communications security. 1322–1333

work page 2015
[44]

Jerome Friedman. 2000. Greedy Function Approximation: A Gradient Boosting Machine. The Annals of Statistics 29 (11 2000). https://doi.org/10.1214/aos/1013203451

work page doi:10.1214/aos/1013203451 2000
[45]

Shipeng Fu, Zhen Li, Kai Liu, Sadia Din, Muhammad Imran, and Xiaomin Yang. 2020. Model Compression for IoT Applications in Industry 4.0 via Multiscale Knowledge Transfer. IEEE Transactions on Industrial Informatics 16, 9 (2020), 6013–6022. https://doi.org/10.1109/TII.2019.2953106

work page doi:10.1109/tii.2019.2953106 2020
[46]

Zhe Fu, Jingyu Yang, Changming Bai, Xiao Chen, Cun Zhang, Yanlin Zhang, and Dongsheng Wang. 2020. Astraea: Deploy AI Services at the Edge in Elegant Ways. In 2020 IEEE International Conference on Edge Computing (EDGE) . 49–53. https://doi.org/10.1109/EDGE50951.2020.00015

work page doi:10.1109/edge50951.2020.00015 2020
[47]

Amin Ghadesi, Maxime Lamothe, and Heng Li. 2023. What Causes Exceptions in Machine Learning Applications? Mining Machine Learning-Related Stack Traces on Stack Overflow. arXiv:2304.12857 [cs.LG]

work page arXiv 2023
[48]

Alex Goldstein, Adam Kapelner, Justin Bleich, and Emil Pitkin. 2014. Peeking Inside the Black Box: Visualizing Statistical Learning with Plots of Individual Conditional Expectation. arXiv:1309.6392 [stat.AP]

work page internal anchor Pith review Pith/arXiv arXiv 2014
[49]

Chen Gong, Zhou Yang, Yunpeng Bai, Jieke Shi, Arunesh Sinha, Bowen Xu, David Lo, Xinwen Hou, and Guoliang Fan. 2022. Curiosity-Driven and Victim-Aware Adversarial Policies. In Proceedings of the 38th Annual Computer Security Applications Conference (Austin, TX, USA) (ACSAC ’22). Association for Computing Machinery, New York, NY, USA, 186–200. https://doi.org/10.1145/3564625.3564636

work page doi:10.1145/3564625.3564636
[51]

Generative Adversarial Networks

Ian J. Goodfellow, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil Ozair, Aaron Courville, and Yoshua Bengio. 2014. Generative Adversarial Networks. arXiv:1406.2661 [stat.ML]

work page internal anchor Pith review Pith/arXiv arXiv 2014
[52]

Leo Goodman. 1961. Snowball Sampling. Ann Math Stat 32 (03 1961). https://doi.org/10.1214/aoms/1177705148

work page doi:10.1214/aoms/1177705148 1961
[53]

Serge Gorbunov and Arnold Rosenbloom. 2010. Autofuzz: Automated network protocol fuzzing framework. Ijcsns 10, 8 (2010), 239

work page 2010
[54]

Philip Gross, Albert Boulanger, Marta Arias, David L. Waltz, Philip M. Long, Charles Lawson, Roger Anderson, Matthew Koenig, Mark Mastrocinque, William Fairechio, John A. Johnson, Serena Lee, Frank Doherty, and Arthur Kressner. 2006. Predicting Electricity Distribution Feeder Failures Using Machine Learning Susceptibility Analysis. In IAAI. http://www.phi...

work page 2006
[55]

Greg Guest, Arwen Bunce, and Laura Johnson. 2006. How Many Interviews Are Enough?: An Experiment with Data Saturation and Variability. Field Methods 18, 1 (Feb. 2006), 59–82. https://doi.org/10.1177/1525822X05279903 Publisher: SAGE Publications Inc

work page doi:10.1177/1525822x05279903 2006
[56]

Michelle Guo, Albert Haque, De-An Huang, Serena Yeung, and Li Fei-Fei. 2018. Dynamic Task Prioritization for Multitask Learning. In Proceedings of the European Conference on Computer Vision (ECCV).

work page 2018
[57]

Ronan Hamon, Henrik Junklewitz, Ignacio Sanchez, et al. 2020. Robustness and explainability of artificial intelligence. Publications Office of the European Union 207 (2020)

work page 2020
[58]

Hui Han, Wen-Yuan Wang, and Bing-Huan Mao. 2005. Borderline-SMOTE: a new over-sampling method in imbalanced data sets learning. In Proceedings of the 2005 International Conference on Advances in Intelligent Computing - Volume Part I (Hefei, China) (ICIC’05). Springer-Verlag, Berlin, Heidelberg, 878–887. https://doi.org/10.1007/11538059_91

work page doi:10.1007/11538059_91 2005
[59]

Miriam Harris, Amy Qi, Luke Jeagal, Nazi Torabi, Dick Menzies, Alexei Korobitsyn, Madhukar Pai, Ruvandhi R Nathavitharana, and Faiz Ahmad Khan. 2019. A systematic review of the diagnostic accuracy of artificial intelligence-based computer programs to analyze chest x-rays for pulmonary tuberculosis. PloS one 14, 9 (2019), e0221339

work page 2019
[60]

Mardhiya Hayati, Siti Mutmainah, and Syed Ghufran. 2021. Random and Synthetic Over-Sampling Approach to Resolve Data Imbalance in Classification. International Journal of Artificial Intelligence Research 4 (01 2021), 86. https://doi.org/10.29099/ijair.v4i2.152

work page doi:10.29099/ijair.v4i2.152 2021
[61]

Zecheng He, Tianwei Zhang, and Ruby B Lee. 2019. Model inversion attacks against collaborative inference. In Proceedings of the 35th Annual Computer Security Applications Conference . 148–162

work page 2019
[62]

M.A. Hearst, S.T. Dumais, E. Osuna, J. Platt, and B. Scholkopf. 1998. Support vector machines. IEEE Intelligent Systems and their Applications 13, 4 (1998), 18–28. https://doi.org/10.1109/5254.708428

work page doi:10.1109/5254.708428 1998
[63]

Lukas Heiland, Marius Hauser, and Justus Bogner. 2023. Design Patterns for AI-based Systems: A Multivocal Literature Review and Pattern Repository. arXiv:2303.13173 [cs.SE]

work page arXiv 2023
[64]

Henrik Heymann, Hendrik Mende, Maik Frye, and Robert H. Schmitt. 2023. Assessment Framework for Deployability of Machine Learning Models in Production. Procedia CIRP 118 (2023), 32–37. https://doi.org/10.1016/j.procir.2023.06.007 16th CIRP Conference on Intelligent Computation in Manufacturing Engineering

work page doi:10.1016/j.procir.2023.06.007 2023
[65]

Hans-Martin Heyn, Eric Knauss, Amna Pir Muhammad, Olof Eriksson, Jennifer Linder, Padmini Subbiah, Shameer Kumar Pradhan, and Sagar Tungal. 2021. Requirement Engineering Challenges for AI-intense Systems Development. In 2021 IEEE/ACM 1st Workshop on AI Engineering - Software Engineering for AI (WAIN). 89–96. https://doi.org/10.1109/WAIN52551.2021.00020

work page doi:10.1109/wain52551.2021.00020 2021
[66]

Geoffrey Hinton, Oriol Vinyals, and Jeff Dean. 2015. Distilling the Knowledge in a Neural Network. arXiv:1503.02531 [stat.ML]

work page internal anchor Pith review Pith/arXiv arXiv 2015
[67]

Carrie Howell, Wei Su, Ariann Nassel, April Agne, and Andrea Cherrington. 2020. Area based stratified random sampling using geospatial technology in a community-based survey. BMC Public Health 20 (11 2020). https://doi.org/10.1186/s12889-020-09793-0

work page doi:10.1186/s12889-020-09793-0 2020
[68]

Krystal Hu. 2023. CHATGPT sets record for fastest-growing user base - analyst note. https://www.reuters.com/technology/chatgpt-sets-record-fastest-growing-user-base-analyst-note-2023-02-01/

work page 2023
[69]

Shotaro Ishihara. 2023. Training Data Extraction From Pre-trained Language Models: A Survey. arXiv:2305.16157 [cs.CL]

work page arXiv 2023
[70]

Benoit Jacob, Skirmantas Kligys, Bo Chen, Menglong Zhu, Matthew Tang, Andrew Howard, Hartwig Adam, and Dmitry Kalenichenko. 2017. Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference. arXiv:1712.05877 [cs.LG]

work page internal anchor Pith review Pith/arXiv arXiv 2017
[71]

Jean-Marie John-Mathews, Dominique Cardon, and Christine Balagué. 2022. From reality to world. A critical perspective on AI fairness. Journal of Business Ethics 178, 4 (July 2022), 945–959. https://doi.org/10.1007/s10551-022-05055-8

work page doi:10.1007/s10551-022- 2022
[72]

Milan Jovic, Andrea Adamoli, and Matthias Hauswirth. 2011. Catch me if you can: performance bug detection in the wild. In Proceedings of the 2011 ACM international conference on Object oriented programming systems languages and applications. 155–170

work page 2011
[73]

Reza karemi and mohammadreza nasiri. 2023. Identifying and Prioritizing Factors Affecting Knowledge Sharing in Software Companies. Sciences and Techniques of Information Management (2023), –. https://doi.org/10.22091/stim.2023.10146.2043

work page doi:10.22091/stim 2023
[74]

Sanjay Kariyappa and Moinuddin K Qureshi. 2019. Defending Against Model Stealing Attacks with Adaptive Misinformation. arXiv:1911.07100 [stat.ML]

work page arXiv 2019
[76]

Jinhan Kim, Robert Feldt, and Shin Yoo. 2019. Guiding Deep Learning System Testing Using Surprise Adequacy. In 2019 IEEE/ACM 41st International Conference on Software Engineering (ICSE). IEEE. https://doi.org/10.1109/icse.2019.00108

work page doi:10.1109/icse.2019.00108 2019
[77]

Segment Anything

Alexander Kirillov, Eric Mintun, Nikhila Ravi, Hanzi Mao, Chloe Rolland, Laura Gustafson, Tete Xiao, Spencer Whitehead, Alexander C. Berg, Wan-Yen Lo, Piotr Dollár, and Ross Girshick. 2023. Segment Anything. arXiv:2304.02643 [cs.CV]

work page internal anchor Pith review Pith/arXiv arXiv 2023
[78]

Pavneet Singh Kochhar, Xin Xia, and David Lo. 2019. Practitioners’ Views on Good Software Testing Practices. In 2019 IEEE/ACM 41st International Conference on Software Engineering: Software Engineering in Practice (ICSE-SEIP). 61...

work page doi:10.1109/icse-seip.2019.00015 2019
[79]

Taesung Lee, Benjamin Edwards, Ian Molloy, and Dong Su. 2018. Defending Against Machine Learning Model Stealing Attacks Using Deceptive Perturbations. arXiv:1806.00054 [cs.LG]

[80]

Jing Li, Aixin Sun, Jianglei Han, and Chenliang Li. 2022. A Survey on Deep Learning for Named Entity Recognition. IEEE Transactions on Knowledge and Data Engineering 34, 1 (jan 2022), 50–70. https://doi.org/10.1109/tkde.2020.2981314
[81]

Jenny T. Liang, Maryam Arab, Minhyuk Ko, Amy J. Ko, and Thomas D. LaToza. 2023. A Qualitative Study on the Implementation Design Decisions of Developers. arXiv:2301.09789 [cs.SE]
[82]

Bowen Liu, Boao Xiao, Xutong Jiang, Siyuan Cen, Xin He, Wanchun Dou, and Huaming Chen. 2023. Adversarial Attacks on Large Language Model-Based System and Mitigating Strategies: A Case Study on ChatGPT. Sec. and Commun. Netw. 2023 (jan 2023), 10 pages. https://doi.org/10.1155/2023/8691095



[1]

[n. d.]. Apache Ignite. https://ignite.apache.org/

[2]

[n. d.]. Apache Spark. https://spark.apache.org/

[3]

[n. d.]. ChatGPT is easily abused, and that’s a big problem. https://adguard.com/en/blog/chatgpt-dan-prompt-abuse.html

[4]

[n. d.]. Kubernetes. https://kubernetes.io/

[5]

[n. d.]. NVIDIA CUDA toolkit. https://developer.nvidia.com/cuda-toolkit

[6]

[n. d.]. NVIDIA TensorRT. https://developer.nvidia.com/tensorrt

[7]

[n. d.]. NVIDIA Triton Inference Server. https://developer.nvidia.com/nvidia-triton-inference-server

[8]

[n. d.]. Personal Data Protection Act. https://www.pdpc.gov.sg/Overview-of-PDPA/The-Legislation/Personal-Data-Protection-Act

[9]

[n. d.]. Pinecone. https://www.pinecone.io/

[10]

[n. d.]. PyTorch. https://pytorch.org/

[11]

[n. d.]. Seldon. https://www.seldon.io/

[12]

[n. d.]. TensorFlow. https://www.tensorflow.org/

[13]

2014. History of the Basel Committee. https://www.bis.org/bcbs/history.htm

[14]

2015. ISO 9001:2015. https://www.iso.org/standard/62085.html

[15]

2022. General Data Protection Regulation (GDPR). https://gdpr-info.eu/

[16]

Martin Abadi, Andy Chu, Ian Goodfellow, H Brendan McMahan, Ilya Mironov, Kunal Talwar, and Li Zhang. 2016. Deep learning with differential privacy. In Proceedings of the 2016 ACM SIGSAC conference on computer and communications security. 308–318

[17]

Ibrahim M Ahmed and Manar Younis Kashmoola. 2021. Threats on machine learning technique by data poisoning attack: A survey. In Advances in Cyber Security: Third International Conference, ACeS 2021, Penang, Malaysia, August 24–25, 2021, Revised Selected Papers 3. Springer, 586–600

[19]

Saleema Amershi, Andrew Begel, Christian Bird, Robert DeLine, Harald Gall, Ece Kamar, Nachiappan Nagappan, Besmira Nushi, and Thomas Zimmermann. 2019. Software Engineering for Machine Learning: A Case Study. In 2019 IEEE/ACM 41st International Conference on Software Engineering: Software Engineering in Practice (ICSE-SEIP). 291–300. https://doi.org/10.1109/icse-seip.2019.00042

[20]

Shin Ando and Chun-Yuan Huang. 2017. Deep Over-sampling Framework for Classifying Imbalanced Data. arXiv:1704.07515 [cs.LG]

[21]

Alejandro Barredo Arrieta, Natalia Díaz-Rodríguez, Javier Del Ser, Adrien Bennetot, Siham Tabik, Alberto Barbado, Salvador García, Sergio Gil-López, Daniel Molina, Richard Benjamins, Raja Chatila, and Francisco Herrera. 2019. Explainable Artificial Intelligence (XAI): Concepts, Taxonomies, Opportunities and Challenges toward Responsible AI. arXiv:1910.100...

[22]

Muhammad Hilmi Asyrofi, Zhou Yang, Imam Nur Bani Yusuf, Hong Jin Kang, Ferdian Thung, and David Lo. 2022. BiasFinder: Metamorphic Test Generation to Uncover Bias for Sentiment Analysis Systems. IEEE Transactions on Software Engineering 48, 12 (2022), 5087–5101. https://doi.org/10.1109/TSE.2021.3136169

[23]

Yang Bao, Gilles Hilary, and Bin Ke. 2022. Artificial intelligence and fraud detection. Innovative Technology at the Interface of Finance and Operations: Volume I (2022), 223–247

[24]

Hollen Barmer, Rachel Dzombak, Matthew Gaston, Vijaykumar Palat, Frank Redner, Tanisha Smith, and John Wohlbier. 2021. Scalable AI. (9 2021). https://doi.org/10.1184/R1/16560273.v1

[26]

Mohammad Riyaz Belgaum, Zainab Alansari, Shahrulniza Musa, Muhammad Mansoor Alam, and MS Mazliham. 2021. Role of artificial intelligence in cloud computing, IoT and SDN: Reliability and scalability issues. International Journal of Electrical and Computer Engineering 11, 5 (2021), 4458

[27]

Kartikeya Bhardwaj, Naveen Suda, and Radu Marculescu. 2019. Dream Distillation: A Data-Independent Model Compression Framework. arXiv:1905.07072 [stat.ML]

[28]

Eric Breck, Shanqing Cai, Eric Nielsen, Michael Salib, and D. Sculley. 2017. The ML Test Score: A Rubric for ML Production Readiness and Technical Debt Reduction. In Proceedings of IEEE Big Data

[29]

Christian Cabrera, Andrei Paleyes, Pierre Thodoroff, and Neil D. Lawrence. 2023. Real-world Machine Learning Systems: A survey from a Data-Oriented Architecture Perspective. arXiv:2302.04810 [cs.SE]

[30]

Longbing Cao. 2021. AI in Finance: Challenges, Techniques and Opportunities. arXiv:2107.09051 [q-fin.CP]

[31]

Longbing Cao. 2022. AI in Finance: Challenges, Techniques, and Opportunities. ACM Comput. Surv. 55, 3, Article 64 (feb 2022), 38 pages. https://doi.org/10.1145/3502289

[32]

Nicholas Carlini, Florian Tramer, Eric Wallace, Matthew Jagielski, Ariel Herbert-Voss, Katherine Lee, Adam Roberts, Tom Brown, Dawn Song, Ulfar Erlingsson, Alina Oprea, and Colin Raffel. 2021. Extracting Training Data from Large Language Models. arXiv:2012.07805 [cs.CR]

[33]

N. V. Chawla, K. W. Bowyer, L. O. Hall, and W. P. Kegelmeyer. 2002. SMOTE: Synthetic Minority Over-sampling Technique. Journal of Artificial Intelligence Research 16 (jun 2002), 321–357. https://doi.org/10.1613/jair.953

[34]

Karel Crombecq, Luciano De Tommasi, Dirk Gorissen, and Tom Dhaene. 2009. A novel sequential design strategy for global surrogate modeling. In Proceedings of the 2009 Winter Simulation Conference (WSC). 731–742. https://doi.org/10.1109/WSC.2009.5429687

[35]

Daniela S. Cruzes and Tore Dyba. 2011. Recommended Steps for Thematic Synthesis in Software Engineering. In Proceedings of the 2011 International Symposium on Empirical Software Engineering and Measurement (ESEM ’11). IEEE Computer Society, USA, 275–284. https://doi.org/10.1109/ESEM.2011.36

[36]

Arun Das and Paul Rad. 2020. Opportunities and challenges in explainable artificial intelligence (xai): A survey. arXiv preprint arXiv:2006.11371 (2020)

[37]

Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. 2019. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. arXiv:1810.04805 [cs.CL]

[38]

Yuanrui Fan, Xin Xia, David Lo, Ahmed E Hassan, and Shanping Li. 2021. What makes a popular academic AI repository? Empirical Software Engineering 26, 1 (2021), 1–35

[39]

Michael Felderer and Rudolf Ramler. 2021. Quality Assurance for AI-Based Systems: Overview and Challenges (Introduction to Interactive Session). In Software Quality: Future Perspectives on Software Engineering Quality. Springer International Publishing, 33–42. https://doi.org/10.1007/978-3-030-65854-0_3

[40]

Yang Feng, Qingkai Shi, Xinyu Gao, Jun Wan, Chunrong Fang, and Zhenyu Chen. 2020. DeepGini: Prioritizing Massive Tests to Enhance the Robustness of Deep Neural Networks. In Proceedings of the 29th ACM SIGSOFT International Symposium on Software Testing and Analysis (Virtual Event, USA) (ISSTA 2020). Association for Computing Machinery, New York, NY, USA, ... https://doi.org/10.1145/3395363.3397357

[41]

Stefan Feuerriegel, Mateusz Dolata, and Gerhard Schwabe. 2020. Fair AI: Challenges and opportunities. Business & information systems engineering 62 (2020), 379–384

[42]

Aaron Fisher, Cynthia Rudin, and Francesca Dominici. 2019. All Models are Wrong, but Many are Useful: Learning a Variable’s Importance by Studying an Entire Class of Prediction Models Simultaneously. arXiv:1801.01489 [stat.ME]

[43]

Matt Fredrikson, Somesh Jha, and Thomas Ristenpart. 2015. Model inversion attacks that exploit confidence information and basic countermeasures. In Proceedings of the 22nd ACM SIGSAC conference on computer and communications security. 1322–1333

[44]

Jerome Friedman. 2000. Greedy Function Approximation: A Gradient Boosting Machine. The Annals of Statistics 29 (11 2000). https://doi.org/10.1214/aos/1013203451

[45]

Shipeng Fu, Zhen Li, Kai Liu, Sadia Din, Muhammad Imran, and Xiaomin Yang. 2020. Model Compression for IoT Applications in Industry 4.0 via Multiscale Knowledge Transfer. IEEE Transactions on Industrial Informatics 16, 9 (2020), 6013–6022. https://doi.org/10.1109/TII.2019.2953106

[46]

Zhe Fu, Jingyu Yang, Changming Bai, Xiao Chen, Cun Zhang, Yanlin Zhang, and Dongsheng Wang. 2020. Astraea: Deploy AI Services at the Edge in Elegant Ways. In 2020 IEEE International Conference on Edge Computing (EDGE). 49–53. https://doi.org/10.1109/EDGE50951.2020.00015

[47]

Amin Ghadesi, Maxime Lamothe, and Heng Li. 2023. What Causes Exceptions in Machine Learning Applications? Mining Machine Learning-Related Stack Traces on Stack Overflow. arXiv:2304.12857 [cs.LG]

[48]

Alex Goldstein, Adam Kapelner, Justin Bleich, and Emil Pitkin. 2014. Peeking Inside the Black Box: Visualizing Statistical Learning with Plots of Individual Conditional Expectation. arXiv:1309.6392 [stat.AP]

[49]

Chen Gong, Zhou Yang, Yunpeng Bai, Jieke Shi, Arunesh Sinha, Bowen Xu, David Lo, Xinwen Hou, and Guoliang Fan. 2022. Curiosity-Driven and Victim-Aware Adversarial Policies. In Proceedings of the 38th Annual Computer Security Applications Conference (Austin, TX, USA) (ACSAC ’22). Association for Computing Machinery, New York, NY, USA, 186–200. https://doi.org/10.1145/3564625.3564636

[51]

Ian J. Goodfellow, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil Ozair, Aaron Courville, and Yoshua Bengio. 2014. Generative Adversarial Networks. arXiv:1406.2661 [stat.ML]

[52]

Leo Goodman. 1961. Snowball Sampling. Ann Math Stat 32 (03 1961). https://doi.org/10.1214/aoms/1177705148

[53]

Serge Gorbunov and Arnold Rosenbloom. 2010. Autofuzz: Automated network protocol fuzzing framework. IJCSNS 10, 8 (2010), 239

[54]

Philip Gross, Albert Boulanger, Marta Arias, David L. Waltz, Philip M. Long, Charles Lawson, Roger Anderson, Matthew Koenig, Mark Mastrocinque, William Fairechio, John A. Johnson, Serena Lee, Frank Doherty, and Arthur Kressner. 2006. Predicting Electricity Distribution Feeder Failures Using Machine Learning Susceptibility Analysis. In IAAI. http://www.phi...

[55]

Greg Guest, Arwen Bunce, and Laura Johnson. 2006. How Many Interviews Are Enough?: An Experiment with Data Saturation and Variability. Field Methods 18, 1 (Feb. 2006), 59–82. https://doi.org/10.1177/1525822X05279903 Publisher: SAGE Publications Inc

[56]

Michelle Guo, Albert Haque, De-An Huang, Serena Yeung, and Li Fei-Fei. 2018. Dynamic Task Prioritization for Multitask Learning. In Proceedings of the European Conference on Computer Vision (ECCV).

[57]

Ronan Hamon, Henrik Junklewitz, Ignacio Sanchez, et al. 2020. Robustness and explainability of artificial intelligence. Publications Office of the European Union 207 (2020)

[58]

Hui Han, Wen-Yuan Wang, and Bing-Huan Mao. 2005. Borderline-SMOTE: a new over-sampling method in imbalanced data sets learning. In Proceedings of the 2005 International Conference on Advances in Intelligent Computing - Volume Part I (Hefei, China) (ICIC’05). Springer-Verlag, Berlin, Heidelberg, 878–887. https://doi.org/10.1007/11538059_91

[59]

Miriam Harris, Amy Qi, Luke Jeagal, Nazi Torabi, Dick Menzies, Alexei Korobitsyn, Madhukar Pai, Ruvandhi R Nathavitharana, and Faiz Ahmad Khan. 2019. A systematic review of the diagnostic accuracy of artificial intelligence-based computer programs to analyze chest x-rays for pulmonary tuberculosis. PloS one 14, 9 (2019), e0221339

[60]

Mardhiya Hayati, Siti Mutmainah, and Syed Ghufran. 2021. Random and Synthetic Over-Sampling Approach to Resolve Data Imbalance in Classification. International Journal of Artificial Intelligence Research 4 (01 2021), 86. https://doi.org/10.29099/ijair.v4i2.152

[61]

Zecheng He, Tianwei Zhang, and Ruby B Lee. 2019. Model inversion attacks against collaborative inference. In Proceedings of the 35th Annual Computer Security Applications Conference. 148–162

[62]

M.A. Hearst, S.T. Dumais, E. Osuna, J. Platt, and B. Scholkopf. 1998. Support vector machines. IEEE Intelligent Systems and their Applications 13, 4 (1998), 18–28. https://doi.org/10.1109/5254.708428

[63]

Lukas Heiland, Marius Hauser, and Justus Bogner. 2023. Design Patterns for AI-based Systems: A Multivocal Literature Review and Pattern Repository. arXiv:2303.13173 [cs.SE]

[64]

Henrik Heymann, Hendrik Mende, Maik Frye, and Robert H. Schmitt. 2023. Assessment Framework for Deployability of Machine Learning Models in Production. Procedia CIRP 118 (2023), 32–37. https://doi.org/10.1016/j.procir.2023.06.007 16th CIRP Conference on Intelligent Computation in Manufacturing Engineering

[65]

Hans-Martin Heyn, Eric Knauss, Amna Pir Muhammad, Olof Eriksson, Jennifer Linder, Padmini Subbiah, Shameer Kumar Pradhan, and Sagar Tungal. 2021. Requirement Engineering Challenges for AI-intense Systems Development. In 2021 IEEE/ACM 1st Workshop on AI Engineering - Software Engineering for AI (WAIN). 89–96. https://doi.org/10.1109/WAIN52551.2021.00020

[66]

Geoffrey Hinton, Oriol Vinyals, and Jeff Dean. 2015. Distilling the Knowledge in a Neural Network. arXiv:1503.02531 [stat.ML]

[67]

Carrie Howell, Wei Su, Ariann Nassel, April Agne, and Andrea Cherrington. 2020. Area based stratified random sampling using geospatial technology in a community-based survey. BMC Public Health 20 (11 2020). https://doi.org/10.1186/s12889-020-09793-0

[68]

Krystal Hu. 2023. ChatGPT sets record for fastest-growing user base - analyst note. https://www.reuters.com/technology/chatgpt-sets-record-fastest-growing-user-base-analyst-note-2023-02-01/

[69]

Shotaro Ishihara. 2023. Training Data Extraction From Pre-trained Language Models: A Survey. arXiv:2305.16157 [cs.CL]

[70]

Benoit Jacob, Skirmantas Kligys, Bo Chen, Menglong Zhu, Matthew Tang, Andrew Howard, Hartwig Adam, and Dmitry Kalenichenko. 2017. Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference. arXiv:1712.05877 [cs.LG]
