arxiv: 2506.23234 · v2 · submitted 2025-06-29 · 💻 cs.SE

From Release to Adoption: Challenges in Reusing Pre-trained AI Models for Downstream Developers

Pith reviewed 2026-05-19 07:55 UTC · model grok-4.3

classification 💻 cs.SE
keywords: pre-trained models · PTM reuse · software engineering · GitHub issues · challenge taxonomy · issue resolution · open source software · model adoption

The pith

Developers reusing pre-trained models encounter seven main challenge categories, and PTM-related issues take significantly longer to resolve than issues unrelated to PTMs.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper examines the specific obstacles that arise when downstream developers incorporate pre-trained AI models into existing software systems. The authors collected and coded 840 issue reports drawn from 31 open-source GitHub projects to build a taxonomy that groups the problems into seven categories such as model usage, performance, and output quality. They also compared resolution times and found that PTM-linked issues persist longer before closure, with measurable differences across the categories. A sympathetic reader cares because pre-trained models are now widely adopted yet their practical integration hurdles remain poorly catalogued, directly affecting development speed and maintenance effort in real projects.

Core claim

Through qualitative analysis of 840 PTM-related issue reports from 31 OSS GitHub projects, the authors develop a taxonomy of seven challenge categories that downstream developers face when reusing pre-trained models and show via statistical tests that these issues take significantly longer to resolve than non-PTM issues, with variation across categories.
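As a rough illustration of the resolution-time measure behind this claim, the Python sketch below computes time-to-close per issue and compares group medians. It assumes resolution time means the interval from an issue's creation to its closure, which the summary above does not spell out. The `created_at` and `closed_at` field names follow GitHub's issue timestamps, while the `ptm_related` flag, the helper names, and all sample records are hypothetical.

```python
from datetime import datetime
from statistics import median


def resolution_days(created_at, closed_at):
    """Days between two ISO 8601 timestamps such as '2024-01-15T10:00:00Z'."""
    # Replace the trailing 'Z' so fromisoformat also works on older Python versions.
    opened = datetime.fromisoformat(created_at.replace("Z", "+00:00"))
    closed = datetime.fromisoformat(closed_at.replace("Z", "+00:00"))
    return (closed - opened).total_seconds() / 86400


def median_resolution(issues, ptm_related):
    """Median resolution time in days for one group of closed issues."""
    days = [
        resolution_days(issue["created_at"], issue["closed_at"])
        for issue in issues
        if issue["ptm_related"] is ptm_related
    ]
    return median(days)


# Hypothetical records; not taken from the paper's dataset.
issues = [
    {"ptm_related": True, "created_at": "2024-01-01T00:00:00Z", "closed_at": "2024-01-11T00:00:00Z"},
    {"ptm_related": True, "created_at": "2024-02-01T00:00:00Z", "closed_at": "2024-02-05T00:00:00Z"},
    {"ptm_related": False, "created_at": "2024-01-01T00:00:00Z", "closed_at": "2024-01-02T12:00:00Z"},
    {"ptm_related": False, "created_at": "2024-03-01T00:00:00Z", "closed_at": "2024-03-03T00:00:00Z"},
]

ptm_median = median_resolution(issues, True)
other_median = median_resolution(issues, False)
```

Medians are used here because issue lifetimes are typically skewed; whether the authors summarise their data this way is not stated in the text above.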

What carries the argument

A taxonomy of seven PTM challenge categories built from qualitative coding of GitHub issue reports.

Load-bearing premise

The 31 selected open-source projects and their 840 issue reports represent the broader population of pre-trained model reuse scenarios and the qualitative coding process captured all relevant challenge types without systematic bias.

What would settle it

A larger study sampling hundreds of additional projects and re-running the resolution-time comparison finds either a substantially different set of challenge categories or no statistically significant difference in time-to-resolution between PTM and non-PTM issues.
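One way such a re-run could be set up is sketched below in Python. The paper's abstract does not name its statistical test, so the Mann-Whitney U test used here is an assumption, chosen because it is a common rank-based option for skewed durations. The function name and the sample resolution times are hypothetical, and the p-value uses the normal approximation without continuity correction, which is crude for samples this small.

```python
import math


def mann_whitney_u(sample_x, sample_y):
    """Return (U statistic for sample_x, two-sided p-value by normal approximation)."""
    n_x, n_y = len(sample_x), len(sample_y)
    if n_x == 0 or n_y == 0:
        raise ValueError("both samples must be non-empty")

    # Pool the observations, remembering which group each one came from.
    pooled = sorted([(v, 0) for v in sample_x] + [(v, 1) for v in sample_y])

    # Assign average ranks to tied values and accumulate the tie correction term.
    ranks = [0.0] * len(pooled)
    tie_term = 0
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        average_rank = (i + 1 + j) / 2  # mean of the 1-based ranks i+1 .. j
        for k in range(i, j):
            ranks[k] = average_rank
        ties = j - i
        tie_term += ties ** 3 - ties
        i = j

    rank_sum_x = sum(r for r, (_, group) in zip(ranks, pooled) if group == 0)
    u_x = rank_sum_x - n_x * (n_x + 1) / 2

    n = n_x + n_y
    mean_u = n_x * n_y / 2
    var_u = n_x * n_y / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0:
        return u_x, 1.0  # every observation is identical, so no evidence of a shift
    z = (u_x - mean_u) / math.sqrt(var_u)
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided tail of the standard normal
    return u_x, p_value


# Hypothetical resolution times in days; not the paper's data.
ptm_days = [12, 30, 45, 60, 90]
other_days = [1, 2, 3, 5, 8]
u_stat, p_value = mann_whitney_u(ptm_days, other_days)
```

A replication that found a large p-value on a much bigger sample would count against the resolution-time claim, whereas a small one would only show a difference in distribution, not its cause.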

Figures

Figures reproduced from arXiv: 2506.23234 by Christoph Treude, Haoyu Gao, Mansooreh Zahedi, Patanamon Thongtanunam, Peerachai Banyongrakkul.

Figure 1: A motivating example illustrating a PTM-related chal…
Figure 3: An example of the data analysis process for developing the taxonomy of PTM-related challenges.
Figure 4: The top two levels of our taxonomy of PTM-related challenges in downstream software projects, where each challenge…
Figure 5: Resolution time analysis of PTM-related issues.
Original abstract

Pre-trained models (PTMs) have gained widespread popularity and achieved remarkable success across various fields, driven by their groundbreaking performance and easy accessibility through hosting providers. However, the challenges faced by downstream developers in reusing PTMs in software systems are less explored. To bridge this knowledge gap, we qualitatively created and analyzed a dataset of 840 PTM-related issue reports from 31 OSS GitHub projects. We systematically developed a comprehensive taxonomy of PTM-related challenges that developers face in downstream projects. Our study identifies seven key categories of challenges that downstream developers face in reusing PTMs, such as model usage, model performance, and output quality. We also compared our findings with existing taxonomies. Additionally, we conducted a resolution time analysis and, based on statistical tests, found that PTM-related issues take significantly longer to be resolved than issues unrelated to PTMs, with significant variation across challenge categories. We discuss the implications of our findings for practitioners and possibilities for future research.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The manuscript reports a qualitative study analyzing 840 PTM-related GitHub issues from 31 open-source projects to develop a taxonomy of seven challenge categories for downstream developers reusing pre-trained AI models. It also includes a statistical analysis comparing resolution times of PTM-related versus unrelated issues.

Significance. If the sampling and coding procedures are robust, the work offers a valuable empirical taxonomy of real-world challenges in PTM adoption, extending prior taxonomies and highlighting practical implications for longer resolution times. The use of a large dataset of actual developer issues and statistical tests strengthens the contribution to software engineering research on AI model reuse.

major comments (2)
  1. The selection criteria for the 31 OSS GitHub projects and the process for identifying and extracting the 840 PTM-related issue reports (including inclusion/exclusion rules and sampling frame) are not sufficiently detailed. This is load-bearing for claims of representativeness in the seven-category taxonomy and the resolution-time statistical comparisons.
  2. The qualitative coding process that produced the seven challenge categories lacks any reporting of inter-rater agreement or reliability metrics (e.g., Cohen's kappa or percentage agreement). This directly affects the validity of the taxonomy and the subsequent category-specific resolution time analysis.
minor comments (2)
  1. The abstract states that PTM-related issues take significantly longer to resolve but does not name the specific statistical test(s) used; add this detail for precision.
  2. A table comparing the new seven-category taxonomy with prior work (mentioned in the abstract) would improve clarity and allow readers to see overlaps and gaps at a glance.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback on our manuscript. The comments highlight important aspects of methodological transparency that we will address in the revision to strengthen the presentation of our qualitative taxonomy and statistical findings. We respond to each major comment below.

Point-by-point responses
  1. Referee: The selection criteria for the 31 OSS GitHub projects and the process for identifying and extracting the 840 PTM-related issue reports (including inclusion/exclusion rules and sampling frame) are not sufficiently detailed. This is load-bearing for claims of representativeness in the seven-category taxonomy and the resolution-time statistical comparisons.

    Authors: We agree that the current description of project selection and issue extraction could be expanded for greater clarity and to better support claims about the taxonomy and statistical comparisons. In the revised manuscript, we will add a dedicated subsection under Methodology that details the criteria for selecting the 31 projects (e.g., minimum stars, activity level, and evidence of PTM usage), the overall sampling frame from GitHub, and the precise inclusion/exclusion rules applied to filter the 840 PTM-related issues from the initial set of reports.

    Revision: yes

  2. Referee: The qualitative coding process that produced the seven challenge categories lacks any reporting of inter-rater agreement or reliability metrics (e.g., Cohen's kappa or percentage agreement). This directly affects the validity of the taxonomy and the subsequent category-specific resolution time analysis.

    Authors: We acknowledge that explicit reporting of inter-rater reliability would improve the perceived rigor of the taxonomy development. The coding was performed iteratively by two authors with regular consensus discussions to resolve disagreements. In the revision, we will describe this process in more detail within the Data Analysis section and report agreement metrics (including Cohen's kappa calculated on a randomly sampled subset of issues) to address concerns about validity.

    Revision: yes
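
The agreement metric promised in the second response can be sketched as follows. This is a minimal Python illustration of Cohen's kappa for two raters; the category labels and the two coding vectors are hypothetical and are not taken from the paper's data.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two raters who labelled the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("both raters must label the same non-empty set of items")
    n = len(labels_a)

    # Observed agreement: share of items where both raters chose the same label.
    p_observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n

    # Chance agreement: product of the raters' marginal label frequencies.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    p_expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)

    if p_expected == 1.0:
        return 1.0  # both raters used one identical label for every item
    return (p_observed - p_expected) / (1 - p_expected)


# Hypothetical codes assigned to six issues by two raters.
rater_1 = ["usage", "usage", "performance", "output", "usage", "performance"]
rater_2 = ["usage", "performance", "performance", "output", "usage", "performance"]
kappa = cohens_kappa(rater_1, rater_2)
```

With these made-up labels the raters agree on five of six issues, and kappa works out to 17/23, about 0.74. Raw agreement alone (5/6) would overstate reliability because it ignores agreement expected by chance.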

Circularity Check

0 steps flagged

Empirical study reports observations from external GitHub data with no derivation chain

This is a qualitative empirical study that extracts 840 PTM-related issue reports from 31 external OSS GitHub projects, performs qualitative coding to produce a seven-category taxonomy, compares the taxonomy to prior work, and runs statistical tests on resolution times. No equations, fitted parameters, predictions, or self-referential derivations are present. All claims rest on direct analysis of independently sourced external data rather than reducing to the paper's own inputs by construction. The analysis is therefore self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

The study relies on standard qualitative research practices in empirical software engineering without introducing new mathematical parameters or postulated entities.

axioms (1)
  • domain assumption: Qualitative coding of issue reports can produce a reliable and comprehensive taxonomy of developer challenges.
    Core premise of the taxonomy construction step described in the abstract.
