
arxiv: 1907.04736 · v1 · pith:QFQSYUFQnew · submitted 2019-07-10 · 💻 cs.NE

Lexicase selection in Learning Classifier Systems

Pith reviewed 2026-05-24 23:25 UTC · model grok-4.3

classification 💻 cs.NE
keywords: lexicase selection, learning classifier systems, batch-lexicase, generalization, parent selection, evolutionary machine learning, supervised classification, rule-based systems

The pith

Batch-lexicase selection in learning classifier systems produces more generic rules that generalize better than tournament or fitness proportionate selection, including on partial data.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper applies lexicase parent selection, which evaluates performance on individual data points in random order, to Learning Classifier Systems for supervised binary classification. It introduces batch-lexicase selection as a variant that processes data in batches to allow tuning of selection pressure. Comparisons show that this approach yields rules with greater generality than standard methods. The result matters because more generic rules improve performance on unseen future data and remain effective when some data points are missing. A reader would care if the goal is to strengthen generalization in evolutionary rule-based learners without relying on aggregated fitness scores.
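The filtering step described above can be sketched in a few lines. This is a minimal illustration of standard lexicase selection, not the authors' implementation; the function name `lexicase_select` and the `errors` matrix layout are assumptions made for the example.

```python
import random

def lexicase_select(population, errors, rng=random):
    """Pick one parent with lexicase selection.

    errors[i][j] is the error of individual i on data point j (0 = correct).
    The matrix layout is an assumption made for this sketch.
    """
    candidates = list(range(len(population)))
    cases = list(range(len(errors[0])))
    rng.shuffle(cases)  # consider data points in random order
    for case in cases:
        # Keep only the candidates that are best on this single data point.
        best = min(errors[i][case] for i in candidates)
        candidates = [i for i in candidates if errors[i][case] == best]
        if len(candidates) == 1:
            break
    # Any remaining ties are broken at random.
    return population[rng.choice(candidates)]
```

No aggregated fitness appears anywhere in the loop: an individual that is correct on a data point others get wrong can be selected even if its overall accuracy is low.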

Core claim

Lexicase selection and its batch variant, when used for parent selection in Learning Classifier Systems, result in the evolution of more generic rules. On binary classification problems this leads to stronger generalization on future data than tournament or fitness proportionate selection, and the advantage persists when data is partial or missing.

What carries the argument

Batch-lexicase selection, a variant of lexicase that groups data points into batches to modulate selection pressure during parent selection for LCS rule populations.
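One way to picture the batching idea is the sketch below. It is an illustration under stated assumptions, not the paper's procedure: the score on a batch is taken to be the number of correctly classified points in it, and only candidates with the best batch score survive. The paper may use a different survival rule (for example, a threshold), and the name `batch_lexicase_select` is hypothetical.

```python
import random

def batch_lexicase_select(population, correct, batch_size, rng=random):
    """Pick one parent by filtering on batches of data points.

    correct[i][j] is 1 if individual i classifies data point j correctly, else 0.
    Assumption: a candidate survives a batch only if it has the best batch score.
    """
    n_cases = len(correct[0])
    order = list(range(n_cases))
    rng.shuffle(order)
    # Split the shuffled data points into consecutive batches.
    batches = [order[k:k + batch_size] for k in range(0, n_cases, batch_size)]
    candidates = list(range(len(population)))
    for batch in batches:
        scores = {i: sum(correct[i][j] for j in batch) for i in candidates}
        best = max(scores.values())
        candidates = [i for i in candidates if scores[i] == best]
        if len(candidates) == 1:
            break
    return population[rng.choice(candidates)]
```

In this sketch a batch size of 1 reduces to lexicase selection on single data points, while a batch size equal to the number of data points reduces to picking the individuals with the highest aggregated accuracy. Intermediate batch sizes sit between the two, which is one reading of how batch size tunes selection pressure.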

If this is right

  • More generic rules are created that favor generalization on future data.
  • Better generalization performance is observed compared to tournament and fitness proportionate selection.
  • The advantages hold in situations of partial or missing data.
  • Selection pressure can be tuned by adjusting batch size in the lexicase procedure.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same batching idea might be applied to other evolutionary algorithms that already use lexicase to test whether generalization gains transfer outside LCS.
  • Rule sets evolved this way could be examined for interpretability on noisy industrial datasets where missing values are common.
  • If the pattern holds, practitioners might replace aggregated fitness entirely in LCS pipelines when the priority is robustness to incomplete inputs.

Load-bearing premise

The binary classification problems and LCS configurations used allow the observed differences in generalization to be attributed primarily to the choice of parent selection method rather than other implementation details or dataset characteristics.

What would settle it

Experiments on the same binary classification tasks but with a different LCS implementation or additional datasets where batch-lexicase produces no measurable gain in accuracy on held-out test sets or on versions with artificially removed data points.
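A check of this kind needs a way to withhold data points. How the paper builds its partial-data condition is not described in this summary; the split below is one hypothetical way to do it for a Boolean problem, with `partial_split` and its parameters invented for the example.

```python
import itertools
import random

def partial_split(n_bits, train_fraction, seed=0):
    """Split all n_bits-long binary inputs into a training part and a held-out part.

    Assumption: data points are removed uniformly at random; other removal
    schemes would need a different function.
    """
    rng = random.Random(seed)
    inputs = list(itertools.product([0, 1], repeat=n_bits))
    rng.shuffle(inputs)
    cut = int(len(inputs) * train_fraction)
    return inputs[:cut], inputs[cut:]

train, held_out = partial_split(6, 0.25)
print(len(train), len(held_out))  # 16 48
```

Running every selection method on the same `train` set with the same seed, then scoring on `held_out`, keeps the comparison uniform across methods.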

Figures

Figures reproduced from arXiv: 1907.04736 by Lee Spector, Sneha Aenugu.

Figure 1: Results for a 11-bit multiplexer averaged over 50 independent runs. Standard deviation bars report fluctuations over …
Figure 2: Results for a 20-bit multiplexer partial data averaged over 50 independent runs. Batch-lexicase has the highest accu…
Figure 3: Histogram of the rule distribution for the 11 bit …
Figure 4: Results for the parity problem averaged over 50 independent runs. Batch-lexicase selection converges to perfect test …
Figure 5: Results for the LED problem averaged over 50 independent runs. Batch-lexicase gives better generalization in the …
Figure 6: Histogram of the rule distribution based on the …
Figure 7: Results for the Car Evaluation problem averaged over 50 independent runs. Batch-lexicase gives better generalization …
Figure 8: Parameter tuning results for the 20-bit multiplexer partial data averaged over 50 independent runs. Setting batch …
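For readers unfamiliar with the multiplexer benchmark named in the captions, the target function can be sketched as below. The bit ordering (address bits first, most significant bit first) is a convention assumed for this example and may differ from the paper's.

```python
def multiplexer(bits, address_width):
    """Boolean multiplexer: the address bits select which data bit is the output.

    An 11-bit multiplexer has 3 address bits and 8 data bits;
    a 20-bit multiplexer has 4 address bits and 16 data bits.
    """
    if len(bits) != address_width + 2 ** address_width:
        raise ValueError("input length must be address_width + 2**address_width")
    # Assumption: address bits come first, most significant bit first.
    address = int("".join(str(b) for b in bits[:address_width]), 2)
    return bits[address_width + address]

# Address 010 (= 2) selects the third data bit, which is 1 here.
print(multiplexer([0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0], 3))  # 1
```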
Original abstract

The lexicase parent selection method selects parents by considering performance on individual data points in random order instead of using a fitness function based on an aggregated data accuracy. While the method has demonstrated promise in genetic programming and more recently in genetic algorithms, its applications in other forms of evolutionary machine learning have not been explored. In this paper, we investigate the use of lexicase parent selection in Learning Classifier Systems (LCS) and study its effect on classification problems in a supervised setting. We further introduce a new variant of lexicase selection, called batch-lexicase selection, which allows for the tuning of selection pressure. We compare the two lexicase selection methods with tournament and fitness proportionate selection methods on binary classification problems. We show that batch-lexicase selection results in the creation of more generic rules which is favorable for generalization on future data. We further show that batch-lexicase selection results in better generalization in situations of partial or missing data.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 1 minor

Summary. The paper investigates lexicase parent selection and introduces batch-lexicase selection (with tunable pressure) in Learning Classifier Systems for supervised binary classification. It compares these against tournament and fitness-proportionate selection, claiming that batch-lexicase produces more generic rules and yields superior generalization, especially on partial or missing data.

Significance. If the attribution of gains to the selection method can be isolated, the work would add a practical parent-selection option for LCS that favors generality. The batch variant's pressure control is a modest but useful extension of lexicase ideas from GP/GA into LCS.

major comments (3)
  1. [Methods / Experimental setup] Experimental setup (likely §4 or Methods): the manuscript does not state whether covering, deletion, subsumption, and other LCS hyperparameters were held fixed across the four selection methods. Without explicit controls or ablations, observed differences in rule generality cannot be attributed primarily to lexicase ordering rather than incidental interactions with the rest of the LCS loop.
  2. [Results] Results section: the abstract and claims assert improved generalization and more generic rules, yet no information is supplied on number of independent runs, statistical tests, confidence intervals, or exact performance metrics. This prevents assessment of whether the reported advantages are reliable or could be due to variance.
  3. [Experiments on partial/missing data] Evaluation on partial/missing data: the paper claims better generalization under missing data but does not describe how missingness was simulated, what fraction of data was removed, or whether the same missing-data handling was applied uniformly to all selection methods.
minor comments (1)
  1. Notation for batch size and lexicase ordering should be defined once in a table or early section rather than repeated inline.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for their constructive comments. We address each major comment below and indicate where the manuscript will be revised to improve clarity and completeness.

  1. Referee: Experimental setup (likely §4 or Methods): the manuscript does not state whether covering, deletion, subsumption, and other LCS hyperparameters were held fixed across the four selection methods. Without explicit controls or ablations, observed differences in rule generality cannot be attributed primarily to lexicase ordering rather than incidental interactions with the rest of the LCS loop.

    Authors: All other LCS components (covering, deletion, subsumption, and remaining hyperparameters) were held fixed across the four selection methods to isolate the effect of parent selection. We will revise the Methods section to state this explicitly.
    Revision: yes

  2. Referee: Results section: the abstract and claims assert improved generalization and more generic rules, yet no information is supplied on number of independent runs, statistical tests, confidence intervals, or exact performance metrics. This prevents assessment of whether the reported advantages are reliable or could be due to variance.

    Authors: We will update the Results section to report the number of independent runs, the statistical tests performed, confidence intervals, and more precise performance metrics.
    Revision: yes

  3. Referee: Evaluation on partial/missing data: the paper claims better generalization under missing data but does not describe how missingness was simulated, what fraction of data was removed, or whether the same missing-data handling was applied uniformly to all selection methods.

    Authors: We will add a description of the missing-data simulation procedure, the fraction of data removed, and confirmation that identical conditions were used for all selection methods.
    Revision: yes

Circularity Check

0 steps flagged

No circularity; purely empirical comparisons with no derivations or self-referential reductions


The paper contains no equations, derivations, fitted parameters presented as predictions, or load-bearing self-citations of uniqueness theorems. Its claims rest on direct experimental comparisons of selection methods (tournament, fitness-proportionate, lexicase, batch-lexicase) on binary classification tasks in LCS. Batch-lexicase is introduced as a tunable variant by definition, and results on rule generality and generalization are reported as observed outcomes rather than reductions to prior inputs. This is self-contained empirical work; no step reduces by construction to the paper's own definitions or citations.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

The central claim rests on standard domain assumptions of LCS as a supervised classifier and the premise that selection method differences drive generalization; no free parameters, ad-hoc axioms, or invented entities are introduced in the abstract.


