Cluster-Adaptive Feature Extraction and its Theoretical Foundation with Minkowski Weighted k-Means

Renato Cordeiro de Amorim; Vladimir Makarenkov

arxiv: 2603.25958 · v2 · pith:3OWMTJABnew · submitted 2026-03-26 · 💻 cs.LG

Pith reviewed 2026-05-22 11:10 UTC · model grok-4.3

classification 💻 cs.LG

keywords: Minkowski weighted k-means · Cluster-Adaptive Feature Extraction · feature weighting · unsupervised feature extraction · within-cluster dispersion · power-mean aggregation · feature selection

The pith

Minkowski weighted k-means feature weights rescale data to reverse within-cluster dispersion ordering and suppress noisy features.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper develops a theoretical basis for the Minkowski weighted k-means algorithm by showing that its objective function is a power-mean aggregation of within-cluster dispersions, where the Minkowski exponent p determines how selectively features are used. From this, feature weights are shown to depend solely on relative dispersions and to follow a power-law relation that guarantees high-dispersion features are suppressed. The authors then propose Cluster-Adaptive Feature Extraction, which applies these weights to rescale the input data before performing standard unsupervised feature extraction. They prove that the rescaling inverts the dispersion ordering, reducing the influence of noisy features while boosting informative ones. Experiments with controlled noise levels confirm that this leads to better performance than traditional feature extraction alone.

Core claim

Expressing the mwk-means objective as a power-mean aggregation of within-cluster dispersions, with the order determined by the Minkowski exponent p, shows that the feature weights depend only on relative dispersion ratios through a power law. This gives explicit guarantees that high-dispersion features are suppressed, and the algorithm is shown to converge. On this foundation the authors build CAFE: rescaling the data with these weights reverses the within-cluster dispersion ordering, suppressing noisy features and amplifying informative ones for improved unsupervised feature extraction.
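One way to see where a power mean can come from is sketched below. It is a reconstruction under the standard mwk-means setup (weights non-negative and summing to one within each cluster, $p > 1$, centroids and partition held fixed), not the paper's own derivation, and the paper's normalisation may differ. The symbols $S_l$ (members of cluster $l$), $c_{lv}$ (centroid coordinate), $V$ (number of features), $W_l$ and $M_r$ are notation assumed for this sketch.

```latex
\begin{align}
% Per-feature dispersion and weighted objective of cluster l
D_{lv} &= \sum_{i \in S_l} |x_{iv} - c_{lv}|^{p}, \qquad
W_l(w) = \sum_{v=1}^{V} w_{lv}^{p}\, D_{lv}, \qquad
\sum_{v=1}^{V} w_{lv} = 1 \\
% Lagrange condition p w_{lv}^{p-1} D_{lv} = \lambda for every v gives
w_{lv} &= \frac{D_{lv}^{-1/(p-1)}}{\sum_{u=1}^{V} D_{lu}^{-1/(p-1)}}
\quad\Longrightarrow\quad
\frac{w_{lv}}{w_{lu}} = \left(\frac{D_{lu}}{D_{lv}}\right)^{1/(p-1)} \\
% Substituting the optimal weights back into W_l
\min_{w} W_l(w) &= \Bigl(\sum_{v=1}^{V} D_{lv}^{\,r}\Bigr)^{1/r}
= V^{1/r}\, M_r(D_{l1},\dots,D_{lV}),
\qquad r = -\frac{1}{p-1}, \\
M_r(a_1,\dots,a_V) &= \Bigl(\tfrac{1}{V}\sum_{v=1}^{V} a_v^{\,r}\Bigr)^{1/r}
\end{align}
```

Under these assumptions the order $r$ is negative: it tends to $-\infty$ as $p \to 1^{+}$, where $M_r$ approaches the smallest dispersion, and to $0^{-}$ as $p \to \infty$, where $M_r$ approaches the geometric mean. That is consistent with the selective-to-uniform transition described above.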

What carries the argument

The power-mean aggregation representation of the mwk-means objective function, which determines the feature weights' power-law dependence on dispersion ratios and enables the dispersion-order reversal in CAFE.

If this is right

The choice of the Minkowski exponent p controls the transition between selective and uniform feature weighting.
Feature weights in mwk-means are independent of absolute dispersion values and depend only on relative ratios.
CAFE consistently improves the performance of traditional unsupervised feature extraction methods when within-cluster noise is present.
The mwk-means algorithm converges to a local minimum under the derived bounds on the objective.
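The first two consequences can be illustrated numerically. The sketch below assumes the normalised weight form $w_v \propto D_v^{-1/(p-1)}$ with $p > 1$ and strictly positive dispersions, which matches the power-law ratio between weights and dispersions; the function name `mwk_weights` and the dispersion values are hypothetical and not taken from the paper.

```python
import numpy as np


def mwk_weights(dispersions, p):
    """Feature weights for one cluster from per-feature dispersions (assumed > 0), p > 1.

    Assumes w_v proportional to D_v ** (-1 / (p - 1)), normalised to sum to 1,
    so that w_v / w_u = (D_u / D_v) ** (1 / (p - 1)).
    """
    d = np.asarray(dispersions, dtype=float)
    # Only ratios matter, so divide by the smallest dispersion first;
    # this also keeps the powers in a numerically safe range.
    raw = (d / d.min()) ** (-1.0 / (p - 1.0))
    return raw / raw.sum()


# Hypothetical dispersions: feature 0 is the tightest, feature 2 the noisiest.
D = np.array([1.0, 2.0, 8.0])

w_selective = mwk_weights(D, p=1.1)  # p near 1: weight concentrates on feature 0
w_moderate = mwk_weights(D, p=2.0)   # p = 2: weights proportional to 1 / D
w_uniform = mwk_weights(D, p=50.0)   # large p: weights approach 1 / V

# Multiplying every dispersion by the same constant leaves the weights unchanged.
w_scaled = mwk_weights(1000.0 * D, p=2.0)

for name, w in [("p=1.1", w_selective), ("p=2", w_moderate), ("p=50", w_uniform)]:
    print(name, np.round(w, 4))
```

In this sketch the same dispersions give an almost one-hot weight vector at p = 1.1 and an almost flat one at p = 50, which is the transition the first bullet refers to.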

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

This rescaling technique could be applied to other clustering algorithms that produce feature weights to enhance feature extraction in noisy high-dimensional data.
Choosing p based on the expected level of feature noise might optimize the suppression effect in practice.
CAFE might integrate with supervised methods by using the weights for dimensionality reduction prior to classification.
Similar power-mean reformulations could be explored for other distance-based clustering objectives to derive weighting schemes.

Load-bearing premise

The derivation of the power-law relationship for feature weights assumes that the mwk-means objective can be precisely expressed as a power-mean aggregation of the within-cluster dispersions for the given Minkowski exponent p.

What would settle it

A counterexample where applying the CAFE rescaling to data with high-dispersion noisy features does not reverse the dispersion ordering or fails to improve extraction results would falsify the central claim.
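A small check of the ordering claim can be written directly on dispersions. This is a sketch under stated assumptions, not the paper's experiment: weights take the form $w_v \propto D_v^{-1/(p-1)}$, each feature of a cluster is multiplied by its weight, and dispersion is measured with the same exponent $p$, so the rescaled dispersion is $w_v^{p} D_v$. The helper names and the dispersion values are hypothetical.

```python
import numpy as np


def mwk_weights(dispersions, p):
    """Assumed weight form: w_v proportional to D_v ** (-1 / (p - 1)), summing to 1."""
    d = np.asarray(dispersions, dtype=float)
    raw = (d / d.min()) ** (-1.0 / (p - 1.0))
    return raw / raw.sum()


def rescaled_dispersions(dispersions, p):
    """Dispersions after multiplying each feature by its weight.

    Scaling a feature (and its centroid coordinate) by w_v multiplies
    sum_i |x_iv - c_v| ** p by w_v ** p.
    """
    d = np.asarray(dispersions, dtype=float)
    return mwk_weights(d, p) ** p * d


# Hypothetical dispersions, listed from least to most dispersed.
D = np.array([0.5, 1.0, 3.0, 9.0])
order_before = np.argsort(D)

reversed_for_all_p = True
for p in (1.5, 2.0, 3.0, 6.0):
    order_after = np.argsort(rescaled_dispersions(D, p))
    # Reversal means the most dispersed feature becomes the least dispersed.
    reversed_for_all_p &= bool(np.array_equal(order_after, order_before[::-1]))

print("ordering reversed for every tested p:", reversed_for_all_p)
```

A dataset on which a check of this kind fails for weights actually produced by mwk-means would be the sort of counterexample described above.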

Figures

Figures reproduced from arXiv: 2603.25958 by Renato Cordeiro de Amorim, Vladimir Makarenkov.

**Figure 2.** Values of the normalised objective across datasets and runs for dif…

Original abstract

The Minkowski weighted $k$-means ($mwk$-means) algorithm extends classical $k$-means by incorporating feature weights and a Minkowski distance. We first show that the $mwk$-means objective can be expressed as a power-mean aggregation of within-cluster dispersions, with the order determined by the Minkowski exponent $p$. This formulation reveals how $p$ controls the transition between selective and uniform use of features. Using this representation, we derive bounds for the objective function and characterise the structure of the feature weights, showing that they depend only on relative dispersion and follow a power-law relationship with dispersion ratios. This leads to explicit guarantees on the suppression of high-dispersion features, and we establish convergence of the algorithm. Building on these theoretical results, we introduce Cluster-Adaptive Feature Extraction (CAFE), a method that uses the $mwk$-means feature weights to rescale the data prior to unsupervised feature extraction. We prove that this rescaling reverses the within-cluster dispersion ordering, suppressing noisy features and amplifying informative ones. Numerous experiments conducted under controlled within-cluster noise show that CAFE consistently improves the results of traditional feature extraction methods.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

The paper reformulates mwk-means as a power-mean to derive weight bounds and a reversal proof for their new CAFE rescaling step.

The central contribution here is the power-mean rewrite of the Minkowski weighted k-means objective. This step makes the role of the exponent p explicit in how features get weighted, and it leads to a clean characterization that the weights depend only on relative within-cluster dispersions and follow a power-law relation to their ratios. From there the authors prove explicit suppression bounds for high-dispersion features, show convergence, and introduce CAFE, which rescales the data by those weights before ordinary feature extraction. They also prove that the rescaling reverses the dispersion ordering, which is the main theoretical payoff. The controlled-noise experiments then show that standard extraction methods improve after this step. That combination of reformulation, bounds, and reversal guarantee is not just a restatement of earlier weighted k-means papers. The algebra is presented directly from the objective, so the claims are at least internally consistent on paper. The experiments are narrow but honest about the setting they test.

The main soft spot is exactly the one the stress-test flags: the power-mean aggregation has to match the actual weighting dynamics for the chosen p, otherwise the power-law relation and the reversal proof do not go through. The abstract does not display the full derivation steps, so it is hard to see whether any approximation or extra assumption slips in when they move from the objective to the weight formula. After rescaling, cluster structure could shift, and it is not obvious that the same weights remain valid without re-running the procedure. The experiments stay inside synthetic controlled noise, so we still lack evidence on whether the gains survive real data or different cluster shapes.

This is the kind of paper that matters to people building unsupervised pipelines who want a principled way to down-weight noisy dimensions before extraction. A reader already working with Minkowski distances or feature weighting will get the most out of it. The theoretical framing is grounded enough that a serious referee should look at the derivations and ask for more extensive tests. I would send it to review rather than desk-reject.

Referee Report

2 major / 2 minor

Summary. The paper reformulates the Minkowski weighted k-means (mwk-means) objective as a power-mean aggregation of within-cluster dispersions (order set by Minkowski exponent p), derives that feature weights depend only on relative dispersions and obey a power-law relationship with dispersion ratios, establishes suppression guarantees for high-dispersion features plus algorithm convergence, and introduces Cluster-Adaptive Feature Extraction (CAFE) that rescales data by these weights to reverse within-cluster dispersion ordering. Experiments under controlled within-cluster noise claim consistent improvements over standard feature extraction methods.

Significance. If the power-mean reformulation and reversal proof hold without hidden dependencies, the work supplies a principled theoretical basis for feature weighting in clustering and a practical pre-processing step that could improve unsupervised feature extraction in noisy data. The explicit bounds, weight characterization, and convergence result are positive contributions; the controlled experiments provide initial evidence but leave generalizability open.

major comments (2)

[Theoretical foundation / power-mean representation] Theoretical foundation section (power-mean reformulation of mwk-means objective): the claim that this aggregation exactly yields feature weights depending only on relative dispersion and following a power-law with dispersion ratios is load-bearing for both the suppression guarantees and the CAFE reversal proof. The joint optimization of weights and centroids may introduce dependencies not captured by a static power-mean of dispersions, and the manuscript does not provide an explicit verification that the reformulation preserves the original stationary-point conditions for arbitrary p.
[CAFE definition and reversal proof] CAFE reversal proof: the argument that rescaling by mwk-means weights reverses within-cluster dispersion ordering and suppresses noisy features assumes post-rescaling stability of the weights and cluster structure. No analysis is given of whether the rescaled data requires re-optimization of weights or whether the reversal holds after the first iteration, which directly affects whether the claimed suppression is guaranteed in the subsequent feature-extraction stage.
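The stationarity question in the first major comment can be probed numerically for the weight step alone. The sketch below fixes the dispersions (that is, the centroids and the partition) and checks that the assumed closed-form weights $w_v \propto D_v^{-1/(p-1)}$ satisfy the Lagrange condition and are not beaten by random feasible weights. It says nothing about the joint optimisation over centroids and assignments, which is the part the comment actually questions. Function names and the dispersion values are hypothetical.

```python
import numpy as np


def mwk_weights(dispersions, p):
    """Assumed closed-form weights: w_v proportional to D_v ** (-1 / (p - 1))."""
    d = np.asarray(dispersions, dtype=float)
    raw = (d / d.min()) ** (-1.0 / (p - 1.0))
    return raw / raw.sum()


def weighted_dispersion(w, dispersions, p):
    """Cluster objective for fixed dispersions: sum_v w_v ** p * D_v."""
    return float(np.sum(w ** p * dispersions))


rng = np.random.default_rng(0)
D = np.array([0.7, 1.3, 4.0, 10.0])  # hypothetical dispersions
p = 2.5

w_star = mwk_weights(D, p)
f_star = weighted_dispersion(w_star, D, p)

# Lagrange condition on the simplex: p * w_v ** (p - 1) * D_v is the same for all v.
gradient = p * w_star ** (p - 1.0) * D

# Random feasible competitors (non-negative, summing to 1).
candidates = rng.dirichlet(np.ones(D.size), size=2000)
f_candidates = np.array([weighted_dispersion(w, D, p) for w in candidates])

# Value predicted by substituting the closed form back into the objective.
f_closed_form = np.sum(D ** (-1.0 / (p - 1.0))) ** (-(p - 1.0))

print("gradient components:", np.round(gradient, 6))
print("best random value  :", f_candidates.min())
print("closed-form value  :", f_star)
```

Because the objective is convex in the weights for p > 1, equal gradient components on the simplex identify the global minimiser of this sub-problem, which is why no random candidate should score lower.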

minor comments (2)

Notation for the power-mean order p and the resulting weight formula should be stated explicitly with an equation number immediately after the reformulation, to avoid ambiguity when referring to the power-law relationship later.
The experimental section would benefit from reporting the exact Minkowski exponent p values used and whether they were fixed or tuned, as p controls the selective-to-uniform transition highlighted in the theory.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive and detailed comments. We address each major comment point by point below, offering clarifications based on the manuscript's derivations and indicating revisions to enhance clarity and completeness where appropriate.

Referee: Theoretical foundation section (power-mean reformulation of mwk-means objective): the claim that this aggregation exactly yields feature weights depending only on relative dispersion and following a power-law with dispersion ratios is load-bearing for both the suppression guarantees and the CAFE reversal proof. The joint optimization of weights and centroids may introduce dependencies not captured by a static power-mean of dispersions, and the manuscript does not provide an explicit verification that the reformulation preserves the original stationary-point conditions for arbitrary p.

Authors: We appreciate the referee highlighting the need for explicit verification on this load-bearing claim. The power-mean reformulation follows directly from rewriting the mwk-means objective as an aggregation of within-cluster dispersions raised to the Minkowski exponent p; the closed-form weight update (for fixed centroids) then depends only on relative dispersions via the stated power-law relationship. Because the reformulation is algebraically equivalent to the original objective, the alternating optimization procedure yields stationary points of both. To address the concern directly, we will revise the theoretical foundation section to include a short lemma that verifies preservation of the stationary conditions for arbitrary p > 0, together with a brief remark on the absence of hidden dependencies introduced by joint optimization. Revision: partial.
Referee: CAFE reversal proof: the argument that rescaling by mwk-means weights reverses within-cluster dispersion ordering and suppresses noisy features assumes post-rescaling stability of the weights and cluster structure. No analysis is given of whether the rescaled data requires re-optimization of weights or whether the reversal holds after the first iteration, which directly affects whether the claimed suppression is guaranteed in the subsequent feature-extraction stage.

Authors: The referee correctly notes that the reversal proof is stated for the initial rescaling step. The proof shows that multiplying each feature by the mwk-means weight (computed on the original data) inverts the ordering of within-cluster dispersions, thereby suppressing high-dispersion features before any downstream extraction occurs. CAFE is explicitly positioned as a one-pass preprocessing transformation; re-optimization of weights on the rescaled data is neither required nor assumed for the suppression guarantee. We agree that a short discussion of this design choice would improve the manuscript. We will add a clarifying paragraph in the CAFE section stating that the reversal applies to the rescaled representation used by subsequent methods and that empirical results remain consistent without re-clustering. Revision: partial.
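The one-pass design described in this response can be sketched as a short pipeline: compute weights once on the original data, rescale, then hand the result to an ordinary extractor. Everything specific here is an assumption made for illustration: cluster labels are taken as given (standing in for the output of an mwk-means run), centroids are cluster means (the exact Minkowski centre only at p = 2), each point is rescaled by the weights of its own cluster, and PCA via SVD stands in for "traditional feature extraction". The names `cluster_weights`, `cafe_rescale` and `pca` are hypothetical.

```python
import numpy as np


def cluster_weights(X, labels, p):
    """Per-cluster weights w_kv proportional to D_kv ** (-1 / (p - 1)).

    Assumes every within-cluster dispersion is strictly positive.
    """
    clusters = np.unique(labels)
    W = np.empty((clusters.size, X.shape[1]))
    for row, k in enumerate(clusters):
        Xk = X[labels == k]
        # Dispersion around the cluster mean (assumption; exact centre only for p = 2).
        D = np.sum(np.abs(Xk - Xk.mean(axis=0)) ** p, axis=0)
        raw = (D / D.min()) ** (-1.0 / (p - 1.0))
        W[row] = raw / raw.sum()
    return clusters, W


def cafe_rescale(X, labels, p=2.0):
    """One pass: weights come from the original data and are never re-optimised."""
    clusters, W = cluster_weights(X, labels, p)
    X_scaled = np.empty_like(X, dtype=float)
    for row, k in enumerate(clusters):
        mask = labels == k
        X_scaled[mask] = X[mask] * W[row]
    return X_scaled


def pca(X, n_components):
    """Plain PCA via SVD of the centred data."""
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T


# Synthetic data: 2 informative features (tight clusters) plus 2 noisy features.
rng = np.random.default_rng(0)
n = 100
centres = np.array([[0.0, 0.0], [5.0, 5.0]])
informative = np.vstack([c + 0.3 * rng.standard_normal((n, 2)) for c in centres])
noise = 3.0 * rng.standard_normal((2 * n, 2))
X = np.hstack([informative, noise])
labels = np.repeat([0, 1], n)  # stand-in for labels from a clustering run

clusters, W = cluster_weights(X, labels, p=2.0)
X_cafe = cafe_rescale(X, labels, p=2.0)
Z = pca(X_cafe, n_components=2)

print("per-cluster weights:\n", np.round(W, 4))
```

In this toy setting the noisy columns receive much smaller weights than the informative ones in both clusters, so the extractor sees them at a reduced scale without any second clustering pass.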

Circularity Check

0 steps flagged

No circularity: derivations of power-mean reformulation, weight structure, and CAFE reversal are self-contained analysis of the mwk-means objective

The paper derives the power-mean representation directly from the mwk-means objective function, then uses that representation to characterize feature weights as depending only on relative dispersions via a power-law relation. This is a standard mathematical unpacking of an existing objective rather than a fit or self-referential definition. The subsequent proof that CAFE rescaling reverses dispersion ordering follows from those derived properties without reducing to a tautology or to a self-citation chain. No load-bearing step equates a claimed result to its own inputs by construction; the analysis remains independent of the experimental outcomes and does not rename known patterns or smuggle ansatzes via prior self-citations.

Axiom & Free-Parameter Ledger

1 free parameter · 1 axiom · 0 invented entities

The central claims rest on the power-mean representation of the mwk-means objective and on standard properties of Minkowski distances; no new physical entities are postulated.

free parameters (1)

Minkowski exponent p
Determines the order of the power-mean aggregation and controls selective versus uniform feature use; its value is part of the algorithm input.

axioms (1)

Domain assumption: the mwk-means objective equals a power-mean aggregation of within-cluster dispersions whose order is fixed by p.
Invoked to derive bounds, weight structure, and the reversal property for CAFE.

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel · unclear

We show that the mwk-means objective can be expressed as a power-mean aggregation of within-cluster dispersions, with the order determined by the Minkowski exponent p... $w_{lv}/w_{lu} = (D_{lu}/D_{lv})^{1/(p-1)}$

IndisputableMonolith/Foundation/BranchSelection.lean · branch_selection · unclear

This leads to explicit guarantees on the suppression of high-dispersion features

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
