A Study and Analysis of a Feature Subset Selection Technique using Penguin Search Optimization Algorithm (FS-PeSOA)
Pith reviewed 2026-05-24 22:08 UTC · model grok-4.3
The pith
FS-PeSOA adapts penguin hunting jumps to find small feature subsets that improve accuracy in standard classifiers.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The central claim is that translating the group hunting strategy of penguins into an optimization loop produces an effective search over feature-subset candidates, and that the subsets found by this loop yield higher classification accuracy than state-of-the-art methods when evaluated with Random Forest, Nearest Neighbour and SVM on UCI data.
What carries the argument
Penguin Search optimization algorithm: a population-based procedure that generates trial feature subsets by simulating random-depth dives and information sharing, then scores each subset by its classification performance under three fixed evaluators.
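A minimal sketch of how such a loop might look, assuming a binary-mask encoding and bit-flip dives. None of this comes from the paper, which gives no update equations. The names `dive`, `share` and `penguin_style_search`, and the parameters `max_depth` and `p_share`, are hypothetical. The fitness function is left as a plug-in so that any classifier-based score can be supplied.

```python
import random
from typing import Callable, Tuple

Subset = Tuple[int, ...]  # binary mask: 1 keeps the feature, 0 drops it


def dive(mask: Subset, depth: int, rng: random.Random) -> Subset:
    """Flip `depth` distinct, randomly chosen bits (assumed stand-in for a random-depth dive)."""
    flipped = list(mask)
    for i in rng.sample(range(len(mask)), depth):
        flipped[i] = 1 - flipped[i]
    return tuple(flipped)


def share(mask: Subset, best: Subset, p_share: float, rng: random.Random) -> Subset:
    """Copy each bit from the best-known mask with probability `p_share` (assumed sharing rule)."""
    return tuple(b if rng.random() < p_share else m for m, b in zip(mask, best))


def penguin_style_search(
    n_features: int,
    fitness: Callable[[Subset], float],
    n_penguins: int = 10,
    n_iters: int = 50,
    max_depth: int = 3,
    p_share: float = 0.3,
    seed: int = 0,
) -> Tuple[Subset, float]:
    """Return the best mask found and its fitness (higher is better)."""
    rng = random.Random(seed)
    population = [
        tuple(rng.randint(0, 1) for _ in range(n_features)) for _ in range(n_penguins)
    ]
    scores = [fitness(mask) for mask in population]
    best_idx = max(range(n_penguins), key=scores.__getitem__)
    best, best_score = population[best_idx], scores[best_idx]

    for _ in range(n_iters):
        for i in range(n_penguins):
            depth = rng.randint(1, min(max_depth, n_features))
            # Pull the penguin toward the shared best, then perturb it with a dive.
            trial = dive(share(population[i], best, p_share, rng), depth, rng)
            trial_score = fitness(trial)
            if trial_score >= scores[i]:  # keep the move only if it did not hurt
                population[i], scores[i] = trial, trial_score
            if trial_score > best_score:
                best, best_score = trial, trial_score
    return best, best_score
```

In this sketch the dive depth controls how far a trial subset moves from its starting point, and `p_share` controls how strongly the population converges on the best mask seen so far. Whether the authors intend anything similar is not stated in the abstract.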
If this is right
- FS-PeSOA will generate trial subsets whose fitness is scored by Random Forest, Nearest Neighbour and SVM.
- The algorithm will be tested on standard UCI benchmark datasets.
- Classification accuracy obtained with FS-PeSOA will be compared directly with state-of-the-art feature-selection methods.
- The approach is expected to identify smaller feature sets that still support accurate class prediction.
Where Pith is reading between the lines
- If the penguin model works for feature selection, the same information-sharing loop could be reused for other combinatorial search problems in machine learning.
- Success on UCI data would motivate testing the method on high-dimensional real-world collections such as gene-expression or image-feature sets.
- The three-classifier fitness step could be replaced by a single faster evaluator in resource-constrained settings without changing the core search mechanism.
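The last point can be illustrated with a small sketch in which the search only ever sees one fitness callable, so the number of evaluators behind it can change freely. The names `make_fitness`, `Evaluator` and `Subset` are hypothetical, and the plain mean is an assumed combination rule, since the paper does not say how the classifier scores are merged.

```python
from typing import Callable, Sequence, Tuple

Subset = Tuple[int, ...]               # binary feature mask
Evaluator = Callable[[Subset], float]  # e.g. cross-validated accuracy of one classifier


def make_fitness(evaluators: Sequence[Evaluator]) -> Evaluator:
    """Wrap one or more evaluators into a single fitness function (unweighted mean)."""
    if not evaluators:
        raise ValueError("at least one evaluator is required")
    evaluators = list(evaluators)

    def fitness(mask: Subset) -> float:
        scores = [evaluate(mask) for evaluate in evaluators]
        return sum(scores) / len(scores)

    return fitness
```

Passing three evaluators reproduces the three-classifier setting, while passing a single cheap evaluator gives the faster variant. The search procedure itself would not need to change in either case.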
Load-bearing premise
The natural hunting strategy of penguins can be translated into a search procedure that reliably finds feature subsets giving higher classification accuracy than existing methods.
What would settle it
Run FS-PeSOA on the planned UCI datasets and compare the accuracy of its selected subsets under Random Forest, Nearest Neighbour and SVM against current state-of-the-art feature-selection algorithms. Accuracy that is lower than or equal to the baselines would count against the claim.
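One way to organise that comparison is a win/tie/loss tally over every (dataset, classifier) pair. The sketch below assumes accuracies have already been measured and stored in dictionaries. The function name `tally`, the key layout and the tolerance `tol` are all hypothetical choices, not something the paper specifies.

```python
from typing import Dict, Tuple

Key = Tuple[str, str]  # (dataset name, classifier name)


def tally(
    candidate: Dict[Key, float],
    baseline: Dict[Key, float],
    tol: float = 1e-9,
) -> Dict[str, int]:
    """Count where the candidate method beats, ties or loses to the baseline.

    Only keys present in both dictionaries are compared.
    """
    counts = {"win": 0, "tie": 0, "loss": 0}
    for key in candidate.keys() & baseline.keys():
        diff = candidate[key] - baseline[key]
        if diff > tol:
            counts["win"] += 1
        elif diff < -tol:
            counts["loss"] += 1
        else:
            counts["tie"] += 1
    return counts
```

A tally dominated by ties and losses would be evidence against the accuracy claim. A fuller study would probably add a paired statistical test across datasets rather than rely on raw counts.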
Original abstract
In today world of enormous amounts of data, it is very important to extract useful knowledge from it. This can be accomplished by feature subset selection. Feature subset selection is a method of selecting a minimum number of features with the help of which our machine can learn and predict which class a particular data belongs to. We will introduce a new adaptive algorithm called Feature selection Penguin Search optimization algorithm which is a metaheuristic approach. It is adapted from the natural hunting strategy of penguins in which a group of penguins take jumps at random depths and come back and share the status of food availability with other penguins and in this way, the global optimum solution is found. In order to explore the feature subset candidates, the bioinspired approach Penguin Search optimization algorithm generates during the process a trial feature subset and estimates its fitness value by using three different classifiers for each case: Random Forest, Nearest Neighbour and Support Vector Machines. However, we are planning to implement our proposed approach Feature selection Penguin Search optimization algorithm on some well known benchmark datasets collected from the UCI repository and also try to evaluate and compare its classification accuracy with some state of art algorithms.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript proposes a new metaheuristic for feature subset selection called FS-PeSOA, adapted from the hunting behavior of penguins (random-depth jumps and food-status sharing to locate a global optimum). It states that the approach will generate trial feature subsets and evaluate their fitness using Random Forest, Nearest Neighbour, and SVM classifiers, and that the authors plan to implement the method and compare it against state-of-the-art algorithms on UCI benchmark datasets.
Significance. A fully specified and empirically validated penguin-inspired feature-selection algorithm could add a new bio-inspired optimizer to the feature-selection literature. However, because the manuscript supplies neither a formal algorithm definition nor any experimental results, its significance cannot be assessed from the current text.
major comments (3)
- [Abstract] The central claim is the introduction of a new adaptive algorithm, FS-PeSOA, yet the text supplies only a high-level biological analogy and states that the authors 'are planning to implement' the method; no encoding of feature subsets (binary vector, subset size, etc.), position-update equations, control parameters, or pseudocode is provided.
- [Abstract] The title promises 'a study and analysis,' but the manuscript contains no implementation, no UCI dataset results, no accuracy numbers, and no comparison tables, leaving the empirical claims without support.
- [Abstract] The fitness-evaluation procedure is described only as 'estimates its fitness value by using three different classifiers'; no details are given on how the three classifier outputs are combined into a single fitness score or on how the search balances exploration and exploitation.
minor comments (1)
- [Abstract] 'today world' should read 'today's world'; the spelling of 'Neighbour' should be consistent with the journal's preferred variant.
Simulated Author's Rebuttal
We thank the referee for the constructive feedback. The comments correctly identify that the submitted manuscript is a high-level proposal rather than a fully implemented and evaluated study. We will revise the manuscript to provide the requested algorithmic details and to adjust the title and claims to match the current scope. Full experimental results would require additional implementation work beyond the present draft.
Point-by-point responses
- Referee: [Abstract] The central claim is the introduction of a new adaptive algorithm, FS-PeSOA, yet the text supplies only a high-level biological analogy and states that the authors 'are planning to implement' the method; no encoding of feature subsets (binary vector, subset size, etc.), position-update equations, control parameters, or pseudocode is provided.
  Authors: We agree the current text is limited to a biological analogy. In revision we will add the binary encoding of feature subsets, the position-update equations derived from penguin depth jumps and food-status sharing, the control parameters (e.g., maximum jump depth, sharing probability), and pseudocode for the complete FS-PeSOA procedure.
  Revision: yes
- Referee: [Abstract] The title promises 'a study and analysis,' but the manuscript contains no implementation, no UCI dataset results, no accuracy numbers, and no comparison tables, leaving the empirical claims without support.
  Authors: The title is indeed broader than the content delivered. We will change the title to reflect a proposed method (e.g., 'A Proposed Feature Subset Selection Technique using Penguin Search Optimization Algorithm (FS-PeSOA)') and will remove or qualify any statements implying completed experiments. Full empirical validation on UCI datasets will be reserved for a subsequent extended manuscript.
  Revision: partial
- Referee: [Abstract] The fitness-evaluation procedure is described only as 'estimates its fitness value by using three different classifiers'; no details are given on how the three classifier outputs are combined into a single fitness score or on how the search balances exploration and exploitation.
  Authors: We will expand the fitness section to define an explicit aggregation rule (e.g., weighted average of classification accuracies from Random Forest, Nearest Neighbour, and SVM) and will describe how the penguin-inspired operators control the exploration-exploitation trade-off through random-depth jumps and information sharing.
  Revision: yes
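The weighted-average rule mentioned in the last response could be written as follows. This is only one plausible form: the symbols $F$, $w_k$ and $\mathrm{Acc}_k$ are notation introduced here, and the manuscript does not commit to any particular weights.

```latex
F(S) \;=\; \sum_{k \in \{\mathrm{RF},\,\mathrm{NN},\,\mathrm{SVM}\}} w_k \,\mathrm{Acc}_k(S),
\qquad w_k \ge 0, \quad \sum_{k} w_k = 1 .
```

As a purely illustrative calculation with made-up values, weights $(0.5, 0.25, 0.25)$ and accuracies $(0.90, 0.80, 0.85)$ for a subset $S$ give $F(S) = 0.45 + 0.20 + 0.2125 = 0.8625$. Setting all three weights to $1/3$ reduces the rule to a plain mean.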
Circularity Check
No derivation chain exists: the paper announces a planned algorithm without equations or a formal definition.
Full rationale
The manuscript provides only a high-level biological analogy for FS-PeSOA and states an intention to implement and compare on UCI data. No position-update rules, feature-subset encoding, fitness functions, or any equations appear, so no load-bearing step can reduce to its own inputs by construction. The central claim is an unexecuted proposal rather than a delivered derivation.