Cooperative Coevolution versus Monolithic Evolutionary Search for Semi-Supervised Tabular Classification
Pith reviewed 2026-05-13 21:43 UTC · model grok-4.3
The pith
Cooperative coevolution and monolithic evolution both improve semi-supervised tabular classification over standard baselines when labels are scarce.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
In the extreme low-label regime for tabular classification, two evolutionary methods achieve higher median test MacroF1 than three lightweight SSL baselines across 25 datasets: a cooperative coevolutionary algorithm (CC-SSL) that jointly evolves feature-subset views and pseudo-labeling policies, and a monolithic evolutionary algorithm (EA-SSL). The performance gap is largest at 1% labeled data. Direct comparisons between CC-SSL and EA-SSL mostly show no statistically significant difference.
What carries the argument
CC-SSL: a cooperative coevolutionary search that evolves two feature-subset views and a pseudo-labeling policy in separate populations whose combinations are evaluated by validation performance on pseudo-labeled data.
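A minimal sketch of the cooperative loop described above, under loudly stated assumptions: the toy `fitness` function, the population sizes, the mutation operators, the truncation selection, and the representative-based collaboration scheme are all illustrative placeholders, not the paper's implementation. The sketch only shows how three separate populations (two feature masks and one pseudo-label confidence threshold) can be scored through their combinations.

```python
import random

N_FEATURES = 8  # hypothetical number of tabular features


def fitness(view_a, view_b, threshold):
    """Toy stand-in for validation performance (an assumption, not the paper's fitness).

    Rewards views that cover many features without overlapping, and a
    pseudo-label confidence threshold close to 0.9.
    """
    covered = sum(1 for a, b in zip(view_a, view_b) if a or b)
    overlap = sum(1 for a, b in zip(view_a, view_b) if a and b)
    return covered - overlap - abs(threshold - 0.9)


def mutate_mask(mask, rng, rate=0.2):
    """Flip each bit of a feature mask with probability `rate`."""
    return [bit ^ (rng.random() < rate) for bit in mask]


def mutate_threshold(t, rng, sigma=0.05):
    """Gaussian perturbation of the threshold, clipped to [0.5, 1.0]."""
    return min(1.0, max(0.5, t + rng.gauss(0.0, sigma)))


def evolve(generations=30, pop_size=10, seed=0):
    rng = random.Random(seed)
    pops = {
        "view_a": [[rng.randint(0, 1) for _ in range(N_FEATURES)] for _ in range(pop_size)],
        "view_b": [[rng.randint(0, 1) for _ in range(N_FEATURES)] for _ in range(pop_size)],
        "policy": [rng.uniform(0.5, 1.0) for _ in range(pop_size)],
    }
    # Representatives: the collaborator each population currently contributes.
    reps = {name: pop[0] for name, pop in pops.items()}
    history = []
    for _ in range(generations):
        for name, pop in pops.items():
            mutate = mutate_threshold if name == "policy" else mutate_mask
            candidates = pop + [mutate(ind, rng) for ind in pop]

            def score(ind):
                # Evaluate an individual together with the other populations' representatives.
                team = dict(reps, **{name: ind})
                return fitness(team["view_a"], team["view_b"], team["policy"])

            candidates.sort(key=score, reverse=True)
            pops[name] = candidates[:pop_size]  # truncation selection (illustrative choice)
            reps[name] = pops[name][0]
        history.append(fitness(reps["view_a"], reps["view_b"], reps["policy"]))
    return reps, history


if __name__ == "__main__":
    reps, history = evolve()
    print("best team fitness per generation:", history)
```

A monolithic counterpart would instead keep one population whose individuals each hold both masks and the threshold, so every individual can be scored on its own without collaborators.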
If this is right
- Both evolutionary methods beat the lightweight baselines most clearly when labels are scarcest (1%).
- EA-SSL maintains higher population diversity and reaches higher best-so-far fitness than CC-SSL during search.
- Time-to-target performance is comparable between the two evolutionary methods, while generations-to-target favors EA-SSL in several multiclass cases.
- Pseudo-label volume, ProbeDrop rate, and validation optimism show no significant differences between CC-SSL and EA-SSL.
- The performance pattern holds across binary and multiclass tabular problems drawn from OpenML.
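The search diagnostics named in this list can be illustrated with a small sketch. The definitions below are assumptions made for illustration, since this summary does not give the paper's exact formulas: best-so-far is taken as the running maximum of per-generation fitness, generations-to-target as the first generation whose best-so-far reaches a target value, and diversity as the mean pairwise Hamming distance between binary genomes.

```python
from itertools import accumulate, combinations


def best_so_far(fitness_per_generation):
    """Running maximum of the best fitness seen up to each generation."""
    return list(accumulate(fitness_per_generation, max))


def generations_to_target(fitness_per_generation, target):
    """1-based index of the first generation reaching `target`, or None if never reached."""
    for generation, value in enumerate(best_so_far(fitness_per_generation), start=1):
        if value >= target:
            return generation
    return None


def mean_pairwise_hamming(population):
    """Average Hamming distance over all pairs of equal-length binary genomes."""
    pairs = list(combinations(population, 2))
    if not pairs:
        return 0.0
    total = sum(sum(a != b for a, b in zip(x, y)) for x, y in pairs)
    return total / len(pairs)


if __name__ == "__main__":
    curve = [0.40, 0.55, 0.50, 0.62, 0.60]  # hypothetical per-generation best fitness
    print(best_so_far(curve))
    print(generations_to_target(curve, 0.6))
    print(mean_pairwise_hamming([[0, 0, 1], [1, 0, 1], [1, 1, 0]]))
```

Time-to-target would follow the same pattern with cumulative wall-clock time in place of the generation index.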
Where Pith is reading between the lines
- The similarity in final performance between cooperative and monolithic search suggests the main benefit arises from evolutionary optimization of the pseudo-labeling policy rather than from the decomposition into coevolving populations.
- These evolutionary policy-search techniques could be tested as drop-in replacements for heuristic pseudo-labeling inside larger deep-learning pipelines for tabular data.
- If the computational budget permits, the approach might be extended to settings with streaming tabular data or changing label scarcity.
Load-bearing premise
The experimental protocol applies equivalent tuning effort and implementation quality to both CC-SSL and EA-SSL so that observed similarities reflect true method properties rather than hidden biases in operators or hyperparameters.
What would settle it
A replication that gives the three lightweight baselines substantially more hyperparameter search, and finds that they match or exceed the median MacroF1 of CC-SSL and EA-SSL on the same 25 datasets at the 1% labeled fraction, would falsify the reported superiority.
Original abstract
This paper studies semi-supervised tabular classification in the extreme low-label regime using lightweight base learners. The paper proposes a cooperative coevolutionary method (CC-SSL) that evolves (i) two feature-subset views and (ii) a pseudo-labeling policy, and compares it to a matched monolithic evolutionary baseline (EA-SSL) and three lightweight SSL baselines. Experiments on 25 OpenML datasets with labeled fractions {1%,5%,10%} evaluate test MacroF1 and accuracy, together with evolutionary and pseudo-label diagnostics. CC-SSL and EA-SSL achieve higher median test MacroF1 than the lightweight baselines, with the largest separations at 1% labeled data. Most CC-SSL vs. EA-SSL comparisons are statistical draws on final test performance. EA-SSL shows higher best-so-far fitness and higher diversity during search, while time-to-target is comparable and generations-to-target favors EA-SSL in several multiclass settings. Pseudo-label volume, ProbeDrop, and validation optimism show no significant differences between CC-SSL and EA-SSL under the shared protocol.
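Because MacroF1 is the headline metric, a short reference sketch may help. It uses the standard definition (the unweighted mean of per-class F1 scores); the function name and the example labels are hypothetical and are not taken from the paper.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over all classes seen in y_true or y_pred."""
    classes = sorted(set(y_true) | set(y_pred))
    if not classes:
        raise ValueError("macro_f1 needs at least one label")
    scores = []
    for c in classes:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        # F1 = 2*TP / (2*TP + FP + FN); the denominator is positive for every listed class.
        scores.append(2 * tp / (2 * tp + fp + fn))
    return sum(scores) / len(scores)


if __name__ == "__main__":
    # A classifier that ignores the minority class: accuracy is 0.8,
    # but the minority class contributes an F1 of 0 to the macro average.
    y_true = [0, 0, 0, 0, 1]
    y_pred = [0, 0, 0, 0, 0]
    print(macro_f1(y_true, y_pred))
```

The example shows why the metric suits imbalanced tabular problems: predicting only the majority class keeps accuracy high while MacroF1 drops to about 0.44.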
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript proposes CC-SSL, a cooperative coevolutionary method that jointly evolves two feature-subset views and a pseudo-labeling policy for semi-supervised tabular classification in low-label regimes. It compares CC-SSL to a matched monolithic evolutionary baseline (EA-SSL) and three lightweight SSL baselines across 25 OpenML datasets at 1%, 5%, and 10% labeled fractions, reporting test MacroF1/accuracy, evolutionary diagnostics (best-so-far fitness, diversity, time-to-target), and pseudo-label metrics. The central empirical claim is that both evolutionary methods outperform the lightweight baselines (largest gaps at 1% labels) while CC-SSL vs. EA-SSL comparisons are mostly statistical draws on final test performance.
Significance. If the equivalence of implementation quality and tuning effort holds, the result supplies a useful negative finding for evolutionary semi-supervised learning: cooperative decomposition adds no measurable benefit over monolithic search under the shared protocol. The scale (25 datasets, three label fractions, statistical comparisons, and multiple diagnostics) and focus on extreme low-label tabular settings make the work a solid empirical contribution to the intersection of evolutionary computation and SSL.
Major comments (2)
- [Experimental protocol] Experimental protocol (shared protocol description): the claim that CC-SSL and EA-SSL received equivalent tuning effort is load-bearing for interpreting the performance draws, yet no quantitative details are supplied on hyperparameter grid sizes, population sizing, operator selection, total fitness evaluations, or validation procedures used for each variant. Without these, the observed similarities could reflect unequal optimization rather than intrinsic method properties.
- [Results] Results section (statistical comparisons): variance estimation, number of independent runs, and multiple-comparison corrections across 25 datasets × 3 label fractions are not fully specified, weakening the strength of the median MacroF1 claims and the conclusion that most CC-SSL vs. EA-SSL tests are draws.
Minor comments (2)
- [Abstract] Abstract: the phrase 'evolutionary and pseudo-label diagnostics' is vague; a short enumeration of the specific metrics (e.g., best-so-far fitness, ProbeDrop, validation optimism) would improve clarity.
- [Tables/Figures] Table captions and figure legends: ensure all reported medians are accompanied by the exact statistical test and significance threshold used.
Simulated Author's Rebuttal
We thank the referee for the constructive feedback and for recognizing the empirical contribution of our work on evolutionary methods for low-label semi-supervised tabular classification. We address the two major comments point by point below. Both points identify areas where additional specification will strengthen the manuscript, and we will revise accordingly.
- Referee: Experimental protocol (shared protocol description): the claim that CC-SSL and EA-SSL received equivalent tuning effort is load-bearing for interpreting the performance draws, yet no quantitative details are supplied on hyperparameter grid sizes, population sizing, operator selection, total fitness evaluations, or validation procedures used for each variant. Without these, the observed similarities could reflect unequal optimization rather than intrinsic method properties.
  Authors: We agree that explicit quantitative details on the shared protocol are necessary to support the interpretation of performance equivalence. In the revised manuscript we will add a dedicated protocol subsection (new Section 3.3) that documents the matched settings used for both CC-SSL and EA-SSL: population size of 100, 200 generations, uniform crossover rate 0.8, Gaussian mutation rate 0.1, tournament selection of size 3, and identical 5-fold cross-validation on the labeled data for fitness computation. This yields exactly 20,000 fitness evaluations per run for each method (100 individuals × 200 generations), confirming that the observed statistical draws are not an artifact of unequal optimization effort.
  Revision: yes
- Referee: Results section (statistical comparisons): variance estimation, number of independent runs, and multiple-comparison corrections across 25 datasets × 3 label fractions are not fully specified, weakening the strength of the median MacroF1 claims and the conclusion that most CC-SSL vs. EA-SSL tests are draws.
  Authors: We accept that the current statistical reporting lacks sufficient detail. The revision will explicitly state that all configurations were evaluated over 30 independent runs, with performance summarized by medians and interquartile ranges to characterize variance. Pairwise CC-SSL versus EA-SSL comparisons are performed with the Wilcoxon signed-rank test on per-dataset differences; p-values are adjusted via the Holm-Bonferroni procedure across the 75 total tests (25 datasets × 3 label fractions). These clarifications will be added to Section 4 and to the captions of the relevant result tables, reinforcing the validity of the reported draws.
  Revision: yes
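The Holm-Bonferroni step-down adjustment mentioned in the second response can be sketched as follows. This is a plain-Python illustration of the standard procedure, not the authors' code; the function name `holm_adjust` and the example p-values are hypothetical.

```python
def holm_adjust(p_values):
    """Return Holm-Bonferroni adjusted p-values in the original input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # The p-value at 0-based rank k (ascending) is scaled by (m - k), capped at 1.
        scaled = min(1.0, (m - rank) * p_values[idx])
        # Enforce monotonicity: adjusted values never decrease as rank increases.
        running_max = max(running_max, scaled)
        adjusted[idx] = running_max
    return adjusted


if __name__ == "__main__":
    raw = [0.01, 0.04, 0.03, 0.005]  # hypothetical raw p-values from four tests
    adjusted = holm_adjust(raw)
    alpha = 0.05
    for p, q in zip(raw, adjusted):
        print(f"raw={p:.3f} adjusted={q:.3f} reject={q <= alpha}")
```

With the 75 tests described in the response, the smallest raw p-value is multiplied by 75, the next smallest by 74, and so on, which is why a comparison needs a very small raw p-value to avoid being reported as a draw.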
Circularity Check
No circularity: purely empirical comparison with held-out evaluation
The paper is an empirical study that directly measures test MacroF1 and accuracy on held-out data across 25 OpenML datasets for labeled fractions of 1%, 5%, and 10%. It compares CC-SSL, EA-SSL, and lightweight baselines without any derivations, equations, fitted parameters renamed as predictions, or self-citation chains that justify core claims. All reported outcomes (median performance, statistical draws, evolutionary diagnostics) are computed from independent test evaluations under a shared protocol, with no reduction of results to inputs by construction. The assumption of equivalent tuning effort is a methodological detail open to scrutiny but does not create circularity in the reported findings.
GECCO '26, July 13–17, 2026, San José, Costa Rica. Jamal Toutouh.
A BENCHMARK DATASETS
Table 3 summarizes the benchmark datasets used in the experimental evaluation. All datasets are drawn ...