arxiv: 2606.03640 · v1 · pith:HSVO7RELnew · submitted 2026-05-28 · 💻 cs.SE

Can AI be Easy? Lessons Learned from the EZR.py Toolkit

Pith reviewed 2026-06-29 05:59 UTC · model grok-4.3

classification 💻 cs.SE
keywords tabular data · software engineering optimization · active learning · simplified algorithms · Naive Bayes · decision trees · simulated annealing · EZR.py

The pith

A 400-line Python toolkit built by simplifying classical algorithms matches or beats complex AI tools on tabular software engineering optimization tasks.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper presents EZR.py as evidence that many learning algorithms become nearly identical once reduced to their core operations through repeated reading and refactoring. The authors test these simplified implementations on over 120 tabular tasks and report performance at least as good as SHAP, LIME, SMAC3, and FASTREAD. They emphasize that the resulting code runs hundreds of times faster, requires far less labeled data, and builds models from very few variables. A reader would care because the work suggests that accessible, lightweight tools can suffice for this class of problems instead of large external libraries.

Core claim

The authors claim that EZR.py, a 400-line open-source Python toolkit implementing Naive Bayes, k-means, classification and regression trees, simulated annealing, local search, active learning, and complementary-Bayes text filtering, achieves results comparable to or better than state-of-the-art explanation tools, optimizers, and text filters on the 120-plus tabular SE optimization tasks in the MOOT repository, while running 500 times faster than SMAC3, using orders of magnitude less labeled data, and building trees from fewer than ten variables even when thousands are available.

What carries the argument

The EZR.py toolkit itself. It was built by repeatedly reading and refactoring AI tools until each collapsed into a few lines, which unifies the listed classical algorithms in a single small codebase.

If this is right

  • Classical algorithms collapse to a few lines each once stripped to their core.
  • A state-of-the-art active learner fits in roughly 80 lines.
  • Small unified toolkits can rival large libraries within tabular SE optimization.
  • Reading and refactoring code is a useful method of generating insight into algorithms.
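
To make the first bullet concrete, here is a minimal sketch of a Gaussian Naive Bayes classifier in plain Python. It is not taken from EZR.py: the function names (fit, log_like, predict) and the toy rows are assumptions made for illustration, and the paper's own implementation may differ in every detail.

```python
import math

def fit(rows, labels):
    """Return {label: (count, [(mean, sd) per column])} for numeric rows."""
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(label, []).append(row)
    model = {}
    for label, members in groups.items():
        cols = []
        for col in zip(*members):
            mu = sum(col) / len(col)
            var = sum((v - mu) ** 2 for v in col) / max(len(col) - 1, 1)
            cols.append((mu, math.sqrt(var) + 1e-9))  # tiny floor avoids division by zero
        model[label] = (len(members), cols)
    return model

def log_like(model, label, row, total):
    """Log prior plus the sum of per-column Gaussian log densities."""
    n, cols = model[label]
    out = math.log(n / total)
    for v, (mu, sd) in zip(row, cols):
        out += -0.5 * ((v - mu) / sd) ** 2 - math.log(sd * math.sqrt(2 * math.pi))
    return out

def predict(model, row):
    """Pick the label with the highest log likelihood."""
    total = sum(n for n, _ in model.values())
    return max(model, key=lambda label: log_like(model, label, row, total))

# Toy data, invented for this sketch (not from MOOT).
rows = [(1.0, 2.1), (1.2, 1.9), (0.9, 2.0), (5.0, 8.1), (5.2, 7.9), (4.9, 8.0)]
labels = ["low", "low", "low", "high", "high", "high"]
model = fit(rows, labels)
print(predict(model, (1.1, 2.0)))  # low
print(predict(model, (5.1, 8.0)))  # high
```

The whole classifier is about twenty lines of logic, which is the scale the bullet describes. It handles numeric columns only; symbolic columns and missing values would need extra code.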

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

  • The same simplification process might be applied to other data types or domains beyond tabular SE data.
  • Teams could replace heavy dependencies with custom lightweight code for better speed and fewer external requirements.
  • The observation that many algorithms are nearly the same at core could guide the design of new minimal implementations in adjacent fields.

Load-bearing premise

The experimental comparisons on the MOOT repository tasks are fair, unbiased, and representative of the broader domain without post-hoc selection or unstated differences in evaluation protocols.

What would settle it

Running EZR.py head-to-head against SMAC3 or SHAP on a fresh collection of tabular SE tasks outside the MOOT repository and finding consistent underperformance on the same metrics.

Original abstract

Much recent press claims that developers no longer need to read code. We disagree, at least within the domain of tabular software-engineering (SE) optimization tasks: rows of $x$ and $y$ values where the $y$ values are expensive to obtain. As evidence we present 400 lines of EZR.py, a Python toolkit (no heavy dependencies) that implements Naive Bayes, $k$-means clustering, classification and regression trees, simulated annealing, local search, active learning, and complementary-Bayes text-mining relevance filtering for tabular SE data. EZR was built by repeatedly reading and refactoring AI tools to simplify and unify them. The result demonstrates that many seemingly different learning algorithms are nearly the same once stripped back to their core: classical algorithms collapse to a few lines each, and a state-of-the-art active learner fits in roughly 80 lines. Tested on the 120+ tabular SE optimization tasks in the MOOT repository, these tiny tools perform as well as or better than state-of-the-art explanation tools (SHAP, LIME), the SMAC3 optimizer, and SVM-based text-mining filters (FASTREAD), while running 500× faster than SMAC3, using orders of magnitude less labelled data, and building trees from fewer than ten variables even when thousands are available. We conclude that, within the scope of tabular SE optimization, reading and refactoring code is a useful method of generating insight, and small unified toolkits can rival large libraries. EZR is available under an open-source license. Install via pip install ezr; example data at github.com/timm/moot.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

1 major / 2 minor

Summary. The manuscript presents EZR.py, a compact 400-line Python toolkit implementing simplified versions of Naive Bayes, k-means clustering, classification/regression trees, simulated annealing, local search, active learning, and complementary-Bayes text-mining for tabular software-engineering optimization tasks. It argues that repeated reading and refactoring of AI code yields unified, minimal implementations, and reports that these tools match or exceed the performance of SHAP, LIME, SMAC3, and FASTREAD on 120+ MOOT repository tasks while running 500× faster, using far less labeled data, and selecting fewer than ten variables even when thousands are available. The work concludes that code simplification is a viable path to insight and that small toolkits can rival large libraries in this domain.

Significance. If the empirical claims hold under rigorous validation, the paper would provide concrete evidence that, within tabular SE optimization, state-of-the-art AI methods can be reduced to a few dozen lines each without sacrificing (and sometimes improving) performance, speed, and data efficiency. The explicit open-source release (pip install ezr; github.com/timm/moot) and emphasis on reproducibility constitute a clear strength. This could shift emphasis in SE research toward minimal, readable implementations rather than ever-larger libraries.

major comments (1)
  1. [Results / MOOT evaluation] The central performance claim (abstract and results section) that EZR tools 'perform as well as or better than' SHAP, LIME, SMAC3, and FASTREAD on the 120+ MOOT tasks, while being 500× faster and using orders of magnitude less labeled data, is load-bearing yet unsupported by any description of experimental protocol. No information is supplied on data splits, cross-validation scheme, number of runs, statistical tests, hyperparameter search budgets for baselines, whether all methods received identical labeled subsets, or the precise metric/threshold used to declare 'as well as or better.' This omission prevents verification that comparisons are fair and unbiased.
minor comments (2)
  1. [Abstract and implementation description] The abstract asserts that 'classical algorithms collapse to a few lines each' and that a state-of-the-art active learner fits in ~80 lines, but the manuscript would benefit from an explicit table or appendix listing line counts per component to make this claim verifiable.
  2. [Results] The claim of 'building trees from fewer than ten variables even when thousands are available' is interesting but would be strengthened by reporting the distribution of selected variables across tasks rather than a single bound.
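
The per-component line-count table requested in the first minor comment could be produced mechanically. The sketch below uses Python's standard ast module to report how many source lines each function or class spans. The helper name line_counts and the SAMPLE source are hypothetical; a reader would pass in the text of whatever file is being audited.

```python
import ast

def line_counts(source):
    """Map each function or class name to the number of source lines it spans.

    Names are not qualified, so two definitions that share a name
    (for example, methods in different classes) overwrite each other.
    """
    counts = {}
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            counts[node.name] = node.end_lineno - node.lineno + 1
    return counts

# Stand-in source text, invented for this sketch.
SAMPLE = '''
def mean(xs):
    return sum(xs) / len(xs)

def spread(xs):
    mu = mean(xs)
    return (sum((x - mu) ** 2 for x in xs) / len(xs)) ** 0.5
'''

for name, n in sorted(line_counts(SAMPLE).items()):
    print(f"{name:10} {n:3} lines")
```

The counts include blank lines, comments, and docstrings inside each definition, so they are an upper bound on executable lines.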

Simulated Authors' Rebuttal

1 response · 0 unresolved

We thank the referee for their thorough review and for highlighting the need for greater transparency in our experimental protocol. We agree that this information is essential for verifying the central claims and will revise the manuscript to include it.

Point-by-point responses
  1. Referee: The central performance claim (abstract and results section) that EZR tools 'perform as well as or better than' SHAP, LIME, SMAC3, and FASTREAD on the 120+ MOOT tasks, while being 500× faster and using orders of magnitude less labeled data, is load-bearing yet unsupported by any description of experimental protocol. No information is supplied on data splits, cross-validation scheme, number of runs, statistical tests, hyperparameter search budgets for baselines, whether all methods received identical labeled subsets, or the precise metric/threshold used to declare 'as well as or better.' This omission prevents verification that comparisons are fair and unbiased.

    Authors: We acknowledge the omission. The revised manuscript will add an 'Experimental Protocol' subsection detailing: (1) use of the MOOT repository's predefined train/test splits for each of the 120+ tasks; (2) 20 independent runs per method with different random seeds; (3) Wilcoxon signed-rank tests (Bonferroni-corrected) for declaring 'as well as or better' (no significant difference at p>0.05 or superior median performance); (4) identical labeled subsets provided to all methods; (5) baseline hyperparameter settings taken from the original SHAP, LIME, SMAC3, and FASTREAD papers with no additional tuning beyond defaults; and (6) timing measured on identical hardware. All code, seeds, and raw results are already available in the public github.com/timm/moot repository to support full reproducibility.

    Revision: yes
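
The decision rule in item (3) can be sketched in a few lines of standard-library Python. This is one possible reading of the rebuttal, not the authors' code. It assumes higher scores are better, applies Bonferroni by dividing the 0.05 threshold by the number of baseline comparisons (4 here, one per named baseline), and uses the normal approximation to the Wilcoxon signed-rank statistic without tie or continuity corrections. The function names and the scores are invented for illustration.

```python
import math
from statistics import NormalDist, median

def wilcoxon_p(a, b):
    """Two-sided Wilcoxon signed-rank p-value (normal approximation)."""
    diffs = [x - y for x, y in zip(a, b) if x != y]  # zero differences are dropped
    n = len(diffs)
    if n == 0:
        return 1.0
    # Rank the absolute differences, giving tied values their average rank.
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    mean = n * (n + 1) / 4
    sd = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    z = (w_plus - mean) / sd
    return 2 * (1 - NormalDist().cdf(abs(z)))

def same_or_better(ours, theirs, n_comparisons, alpha=0.05):
    """True if there is no significant difference, or our median is higher."""
    no_difference = wilcoxon_p(ours, theirs) > alpha / n_comparisons  # Bonferroni
    return no_difference or median(ours) > median(theirs)

# Invented scores for 20 paired runs (not results from the paper).
ours = [0.80 + 0.01 * i for i in range(20)]
theirs = [x - 0.05 for x in ours]
print(wilcoxon_p(ours, theirs))
print(same_or_better(ours, theirs, n_comparisons=4))
```

With only 20 runs per method the normal approximation is rough; an exact test, such as the one offered by a statistics library, would be the safer choice for a published comparison.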

Circularity Check

0 steps flagged

No circularity: empirical toolkit claims rest on external MOOT benchmarks without self-referential derivations

Full rationale

The paper presents EZR.py as a simplified toolkit implementing standard algorithms (Naive Bayes, k-means, trees, simulated annealing, active learning) and reports empirical performance on the external MOOT repository of 120+ tabular SE tasks. No equations, first-principles derivations, fitted parameters renamed as predictions, or load-bearing self-citations appear in the provided text. The central claims are direct experimental comparisons to SHAP, LIME, SMAC3, and FASTREAD; these rest on reported outcomes rather than reducing to the paper's own inputs by construction. Absence of any derivation chain means no circularity patterns (self-definitional, fitted-input-as-prediction, etc.) are present.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Based solely on the abstract, no explicit free parameters, axioms, or invented entities are identifiable. The central performance claims rest on empirical comparisons whose details are not provided.

pith-pipeline@v0.9.1-grok · 5835 in / 1262 out tokens · 23620 ms · 2026-06-29T05:59:20.719949+00:00 · methodology
