arxiv: 2604.25363 · v1 · submitted 2026-04-28 · 💻 cs.SE

Commit-Aware Learning-Based Test Case Prioritization for Continuous Integration

Pith reviewed 2026-05-07 16:07 UTC · model grok-4.3

classification 💻 cs.SE
keywords test case prioritization · continuous integration · regression testing · commit-aware prediction · learning-based TCP · cross-project validation · version control diffs · fault detection

The pith

A learning model that adds structural details from code commits to coverage and history data improves test prioritization in continuous integration pipelines.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces a commit-aware method that builds a predictive model for reordering tests when a new code change arrives. It extracts structural properties from version-control diffs, links them to the tests that cover the changed code, and combines both with past execution records to estimate which tests are most likely to fail. The model is trained and tested across five Defects4J projects using leave-one-project-out validation, so that no project-specific tuning occurs. Results indicate that the added commit information improves both the accuracy of identifying failing tests and the speed at which faults surface when tests run in the suggested order. This matters because regression testing in frequent CI builds consumes substantial resources, and earlier fault detection can cut that cost.
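The leave-one-project-out protocol described above can be sketched as follows. This is a minimal illustration, not the authors' code: the record layout and the project names (`proj_a`, `proj_b`, `proj_c`) are hypothetical placeholders.

```python
from collections import defaultdict


def leave_one_project_out(samples):
    """Yield (held_out, train, test) once per project.

    `samples` is a list of dicts that each carry a "project" key.
    The held-out project's samples never appear in the training split.
    """
    by_project = defaultdict(list)
    for sample in samples:
        by_project[sample["project"]].append(sample)

    for held_out in sorted(by_project):
        test = by_project[held_out]
        train = [s for name, group in by_project.items()
                 if name != held_out for s in group]
        yield held_out, train, test


# Hypothetical toy records; project names are placeholders, not Defects4J projects.
samples = [
    {"project": "proj_a", "test": "t1", "failed": 1},
    {"project": "proj_a", "test": "t2", "failed": 0},
    {"project": "proj_b", "test": "t3", "failed": 0},
    {"project": "proj_c", "test": "t4", "failed": 1},
    {"project": "proj_c", "test": "t5", "failed": 0},
]

for held_out, train, test in leave_one_project_out(samples):
    print(held_out, len(train), len(test))
```

Splitting by project rather than by random row keeps every sample of the evaluated project out of training, which is what makes the reported numbers a cross-project measurement.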

Core claim

Given a new commit, the method estimates for each test the probability it will reveal at least one failure by fusing structural properties extracted from version-control diffs, test coverage relations, and historical execution behavior into one predictive model, then reorders the test suite accordingly; when evaluated on five Defects4J projects under leave-one-project-out cross-project validation, this commit-aware approach significantly outperforms non-commit-aware baselines in both classification and prioritization effectiveness.

What carries the argument

The unified predictive model that combines structural properties of version-control diffs with test coverage relations and historical execution behavior to output per-test failure probabilities.
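A minimal sketch of the fuse-then-rank idea, under stated assumptions. The feature names, the weights, and the logistic scorer are all hypothetical stand-ins chosen to keep the example self-contained; the rebuttal further down this page mentions a gradient-boosted tree learner, which this sketch does not reproduce.

```python
import math
from dataclasses import dataclass
from typing import List


@dataclass
class CandidateFeatures:
    name: str
    diff: List[float]      # hypothetical, e.g. normalized size of the change
    coverage: List[float]  # hypothetical, e.g. share of changed code the test covers
    history: List[float]   # hypothetical, e.g. recent failure rate of the test


def fuse(features):
    """Concatenate the three feature groups into one input vector."""
    return features.diff + features.coverage + features.history


def failure_probability(x, weights, bias):
    """Stand-in scorer (logistic model), NOT the paper's learner."""
    z = bias + sum(w * v for w, v in zip(weights, x))
    return 1.0 / (1.0 + math.exp(-z))


def prioritize(tests, weights, bias):
    """Return test names ordered by descending estimated failure probability."""
    scored = [(failure_probability(fuse(t), weights, bias), t.name) for t in tests]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [name for _, name in scored]


# Illustrative values only; none of these numbers come from the paper.
weights, bias = [0.8, 1.5, 2.0], -2.0
tests = [
    CandidateFeatures("t_utils", diff=[1.0], coverage=[0.0], history=[0.1]),
    CandidateFeatures("t_parser", diff=[1.0], coverage=[0.9], history=[0.5]),
    CandidateFeatures("t_io", diff=[1.0], coverage=[0.4], history=[0.0]),
]
print(prioritize(tests, weights, bias))
```

The point of the sketch is the data flow: one vector per (commit, test) pair, one probability per vector, one sort per commit.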

If this is right

  • Tests can be reordered so that regression faults appear earlier in each CI run.
  • The learned model works across projects without needing project-specific retraining.
  • Both the classification of tests expected to fail and the quality of the resulting ranking improve.
  • CI pipelines can expose more faults while executing fewer tests overall.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Commit structural features could be added to other test-selection or bug-localization tools that already use coverage and history.
  • The performance lift suggests that finer-grained analysis of which parts of a diff matter most might yield still better predictors.
  • Teams running large suites in frequent builds could reduce total test time by adopting similar change-aware ranking.

Load-bearing premise

That structural properties extracted from version-control diffs supply predictive value beyond what test coverage and historical execution data already provide, and that the resulting model generalizes across projects without per-project tuning.

What would settle it

A new set of projects or live CI traces where the commit-aware model shows no measurable gain in fault-detection rate or average percentage of faults detected compared with the coverage-and-history baselines.
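For readers unfamiliar with the metric, the average percentage of faults detected (APFD) is conventionally defined as APFD = 1 - (TF_1 + ... + TF_m) / (n * m) + 1 / (2 * n), where n is the number of tests, m the number of faults, and TF_i the 1-based position of the first test that reveals fault i. The sketch below computes it for two orderings; the test names and fault labels are hypothetical, and the function assumes every fault is revealed by at least one test in the order.

```python
def apfd(order, detects):
    """Compute APFD for a test ordering.

    order:   list of test names in execution order
    detects: dict mapping a test name to the set of faults it reveals
    """
    n = len(order)
    faults = set().union(*detects.values())
    m = len(faults)
    if n == 0 or m == 0:
        raise ValueError("APFD needs at least one test and one fault")

    # Record the 1-based position of the first test that reveals each fault.
    first_position = {}
    for position, test in enumerate(order, start=1):
        for fault in detects.get(test, ()):
            first_position.setdefault(fault, position)

    if faults - set(first_position):
        raise ValueError("some faults are never revealed by the given order")

    return 1 - sum(first_position.values()) / (n * m) + 1 / (2 * n)


# Hypothetical suite: two faults, four tests.
detects = {"t3": {"f1"}, "t4": {"f1", "f2"}}
early = ["t4", "t3", "t1", "t2"]   # fault-revealing tests first
late = ["t1", "t2", "t3", "t4"]    # fault-revealing tests last

print(apfd(early, detects))  # 0.875
print(apfd(late, detects))   # 0.25
```

A settling experiment of the kind described above would compare such values, per commit, between the commit-aware model and the coverage-and-history baselines.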

Figures

Figures reproduced from arXiv: 2604.25363 by Gerardo Canfora, Lorenzo Abbondante.

Figure 1: XGBoost feature importance on the Lang test set. The diff-based features are the ones that most influence the decision.

Text captured alongside the figure (Section 5.2, RQ2: improved prediction effectiveness on fault detection): "The analysis of the prioritization quartiles reveals a distinct behavioral shift compared to the total collapse observed in the classification task, as shown in table 3. The APFD Gain demonstrates a surprising resilience in the absence of diff-based features. This indicates that the models retain a goo…"

Original abstract

Regression testing in Continuous Integration (CI) pipelines is increasingly costly due to the growing size and execution frequency of test suites. Test Case Prioritization (TCP) mitigates this problem by reordering tests to expose faults earlier. However, most existing techniques rely primarily on historical execution data and coverage metrics, neglecting the rich structural information contained in code changes. This paper proposes a commit-aware, learning-based TCP method that combines structural properties of version-control diffs, test coverage relations, and historical execution behavior into a unified predictive model. Given a new commit, the method estimates the probability that each test suite will reveal at least one failure and prioritizes test execution accordingly. We evaluate our method on five Defects4J projects using a leave-one-project-out cross-project validation setting. Results show that the commit-aware TCP significantly outperform non-commit-aware-baselines in both classification and prioritization effectiveness. Our findings show that including commit structural semantics substantially enhances regression fault detection and enables robust, generalizable learning-based TCP in CI environments.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 2 minor

Summary. The manuscript proposes a commit-aware learning-based test case prioritization (TCP) technique for continuous integration that fuses structural features extracted from version-control diffs with coverage relations and historical execution data. A predictive model estimates per-test failure probability for a new commit and reorders the suite accordingly. Evaluation uses leave-one-project-out cross-validation across five Defects4J projects and reports that the commit-aware approach significantly outperforms non-commit-aware baselines on both classification and prioritization metrics.

Significance. If the empirical gains prove robust and the diff-derived features demonstrably add signal beyond coverage and history, the work would strengthen learning-based TCP by showing that structural change semantics improve fault detection effectiveness and cross-project transfer in CI settings. The leave-one-project-out (LOPO) protocol, if validated on a broader corpus, would support claims of generalizability.

major comments (3)
  1. [Evaluation] Evaluation section (LOPO protocol): The use of only five Defects4J projects under leave-one-project-out provides insufficient evidence for the claim of 'robust, generalizable' learning-based TCP. All projects share the same benchmark ecosystem (Java, comparable test-suite sizes, real faults curated and isolated by the benchmark), so observed gains may reflect dataset artifacts rather than true cross-project transfer of the commit-aware model.
  2. [Results] Results and claims (abstract and §5): The headline assertion that commit-aware TCP 'significantly outperform[s]' baselines lacks reported quantitative metrics, statistical tests (p-values, effect sizes), confidence intervals, or ablation results that isolate the contribution of structural diff features. Without an ablation removing the diff-derived inputs while retaining coverage and history, it is impossible to confirm that the commit-aware component supplies additive predictive value.
  3. [Method] Method and evaluation (class imbalance): The binary classification task (test reveals at least one failure) is inherently imbalanced, yet the manuscript supplies no description of imbalance handling (class weighting, oversampling, threshold tuning, or appropriate metrics such as AUC-PR). This omission undermines the reliability of the reported classification effectiveness.
minor comments (2)
  1. [Abstract] Abstract: The phrase 'non-commit-aware-baselines' is inconsistently hyphenated and should be defined or replaced with a clearer term such as 'coverage-and-history baselines'.
  2. [Method] Notation: The description of the unified predictive model would benefit from an explicit equation or pseudocode showing how diff features, coverage, and history are combined into the failure-probability estimate.
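The effect sizes and confidence intervals requested in major comment 2 could be computed along the following lines. This is a sketch under stated assumptions: Cliff's delta is one common non-parametric effect size, the interval is a plain percentile bootstrap on paired differences, and the per-commit APFD values are invented for illustration. A signed-rank p-value would come from a statistics library such as SciPy (`scipy.stats.wilcoxon`), which is left out here to keep the block dependency-free.

```python
import random


def cliffs_delta(xs, ys):
    """Cliff's delta: P(x > y) - P(x < y) over all cross pairs."""
    greater = sum(1 for x in xs for y in ys if x > y)
    less = sum(1 for x in xs for y in ys if x < y)
    return (greater - less) / (len(xs) * len(ys))


def bootstrap_ci_mean_diff(xs, ys, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the mean paired difference xs[i] - ys[i]."""
    rng = random.Random(seed)
    diffs = [x - y for x, y in zip(xs, ys)]
    means = []
    for _ in range(n_resamples):
        resample = [rng.choice(diffs) for _ in diffs]
        means.append(sum(resample) / len(resample))
    means.sort()
    lower = means[int((alpha / 2) * n_resamples)]
    upper = means[int((1 - alpha / 2) * n_resamples) - 1]
    return lower, upper


# Hypothetical per-commit APFD values; NOT results from the paper.
commit_aware = [0.82, 0.75, 0.90, 0.68, 0.88]
baseline = [0.70, 0.72, 0.81, 0.66, 0.79]

delta = cliffs_delta(commit_aware, baseline)
lower, upper = bootstrap_ci_mean_diff(commit_aware, baseline)
print(delta, lower, upper)
```

Pairing the differences by commit matters: both techniques are evaluated on the same commits, so resampling the two lists independently would discard that structure.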

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the thoughtful and constructive comments. We address each major point below and indicate the revisions planned for the next version of the manuscript.

Point-by-point responses
  1. Referee: [Evaluation] The use of only five Defects4J projects under leave-one-project-out provides insufficient evidence for the claim of 'robust, generalizable' learning-based TCP. All projects share the same benchmark ecosystem (Java, comparable test-suite sizes, real faults curated and isolated by the benchmark), so observed gains may reflect dataset artifacts rather than true cross-project transfer of the commit-aware model.

    Authors: We agree that five projects constitute a modest corpus and that the shared Defects4J ecosystem limits the strength of generalizability claims. In the revision we will (i) explicitly list this as a threat to external validity, (ii) soften the language in the abstract and conclusion from 'robust, generalizable' to 'promising cross-project transfer within the Defects4J corpus', and (iii) add a dedicated paragraph outlining concrete plans for future multi-language and larger-scale evaluation. The LOPO protocol itself remains a standard and rigorous design for the available data. revision: partial

  2. Referee: [Results] The headline assertion that commit-aware TCP 'significantly outperform[s]' baselines lacks reported quantitative metrics, statistical tests (p-values, effect sizes), confidence intervals, or ablation results that isolate the contribution of structural diff features. Without an ablation removing the diff-derived inputs while retaining coverage and history, it is impossible to confirm that the commit-aware component supplies additive predictive value.

    Authors: The current manuscript reports raw performance numbers but omits the requested statistical apparatus and ablation. We will add: (a) Wilcoxon signed-rank tests with p-values and effect sizes (Cliff's delta) for all pairwise comparisons, (b) 95% confidence intervals obtained via bootstrap, and (c) a new ablation table that trains identical models with and without the diff-derived feature set while keeping coverage and history features fixed. These additions will appear in Section 5 and the supplementary material. revision: yes

  3. Referee: [Method] The binary classification task (test reveals at least one failure) is inherently imbalanced, yet the manuscript supplies no description of imbalance handling (class weighting, oversampling, threshold tuning, or appropriate metrics such as AUC-PR). This omission undermines the reliability of the reported classification effectiveness.

    Authors: Class weighting was applied during training, but the description was inadvertently omitted. The revised method section will explicitly state that we used inverse class-frequency weighting inside the gradient-boosted tree learner and that the decision threshold was chosen to maximize F1 on the validation fold. We will also report AUC-PR (and average precision) alongside AUC-ROC and accuracy to give a balanced view of performance under imbalance. revision: yes
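The two imbalance measures named in the last response, inverse class-frequency weighting and an F1-maximizing decision threshold, can be sketched as below. The normalization n / (number of classes * class count) is an assumption (the response does not give the exact formula), and the labels and scores are invented validation-fold values, not data from the paper.

```python
def inverse_frequency_weights(labels):
    """Weight each class by n / (n_classes * class_count); normalization is assumed."""
    n = len(labels)
    counts = {c: labels.count(c) for c in set(labels)}
    return {c: n / (len(counts) * k) for c, k in counts.items()}


def f1_score(labels, predictions):
    """F1 for the positive class (label 1); returns 0.0 when there are no true positives."""
    tp = sum(1 for y, p in zip(labels, predictions) if y == 1 and p == 1)
    fp = sum(1 for y, p in zip(labels, predictions) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(labels, predictions) if y == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def best_f1_threshold(labels, scores):
    """Try every observed score as a threshold and keep the one with the highest F1."""
    best_t, best_f1 = None, -1.0
    for t in sorted(set(scores)):
        predictions = [1 if s >= t else 0 for s in scores]
        f1 = f1_score(labels, predictions)
        if f1 > best_f1:
            best_t, best_f1 = t, f1
    return best_t, best_f1


# Hypothetical validation fold: 2 failing tests out of 10.
labels = [0, 0, 0, 0, 0, 0, 1, 0, 1, 0]
scores = [0.05, 0.10, 0.20, 0.15, 0.30, 0.25, 0.80, 0.40, 0.35, 0.10]

print(inverse_frequency_weights(labels))  # {0: 0.625, 1: 2.5}
print(best_f1_threshold(labels, scores))
```

On this toy fold the default cut-off of 0.5 would miss one of the two failing tests, which is the kind of effect threshold tuning is meant to counter.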

Circularity Check

0 steps flagged

No significant circularity; empirical claims rest on external benchmarks and cross-validation.

Full rationale

The paper proposes a commit-aware learning-based TCP method combining diff structural properties, coverage, and history, then evaluates it empirically via leave-one-project-out cross-validation on five Defects4J projects. No equations, derivations, or first-principles results are presented that reduce to fitted parameters or self-definitions by construction. Central claims of significant outperformance are supported by direct comparisons to non-commit-aware baselines on a public benchmark dataset, without load-bearing self-citations, ansatz smuggling, or renaming of known results. The LOPO protocol and external baselines render the evaluation self-contained and falsifiable outside any internal fitting loop.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Central claim rests on the assumption that commit diffs contain useful structural signals for failure prediction and that cross-project generalization is feasible; no explicit free parameters, axioms, or invented entities are stated in the abstract.
