Advancing clustering methods in physics education research: A case for mixture models
Pith reviewed 2026-05-22 00:26 UTC · model grok-4.3
The pith
Mixture models provide a probabilistic alternative to k-means (or k-modes, for categorical data) that accounts for classification errors and integrates subgroup membership into broader analyses in physics education research.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Mixture models, specifically latent class analysis for categorical data, serve as a model-based alternative to k-modes clustering. They account for classification errors and permit direct integration of subgroup membership into a broader latent variable framework, as shown through parallel analyses that address identical research questions.
What carries the argument
Latent class analysis, a mixture model that estimates the probability of each individual belonging to each latent class from observed response patterns rather than forcing a single assignment.
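To make the contrast with a single forced assignment concrete, here is a minimal sketch of how posterior class probabilities follow from Bayes' rule in a latent class model with binary indicators. The function name, the two-class parameter values, and the response pattern are all hypothetical and chosen only for illustration; they do not come from the paper's data or R code.

```python
from math import prod

def posterior_class_probs(response, class_props, item_probs):
    """Posterior P(c = k | u) for one binary response pattern under an LCA model.

    response:    list of 0/1 answers, one per indicator
    class_props: list of P(c = k), one per class (should sum to 1)
    item_probs:  item_probs[k][j] = P(u_j = 1 | c = k)
    """
    joint = []
    for pi_k, rho_k in zip(class_props, item_probs):
        # Conditional independence: multiply the per-item Bernoulli likelihoods.
        lik = prod(r if u == 1 else 1.0 - r for u, r in zip(response, rho_k))
        joint.append(pi_k * lik)
    total = sum(joint)
    return [j / total for j in joint]

# Hypothetical two-class model with three binary indicators (illustrative values).
class_props = [0.6, 0.4]
item_probs = [[0.9, 0.8, 0.7],   # class 0 endorses most items
              [0.2, 0.3, 0.1]]   # class 1 endorses few items

post = posterior_class_probs([1, 1, 0], class_props, item_probs)
# A hard partition would keep only the most likely class and discard the rest.
hard = max(range(len(post)), key=post.__getitem__)
```

The list `post` retains the full membership distribution for the respondent, whereas `hard` keeps only the single label that a hard partitioning method would report.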
If this is right
- Subgroup membership can be modeled jointly with other variables inside one framework instead of through separate post-hoc steps.
- Classification uncertainty is quantified and carried forward rather than treated as zero.
- Model fit to the observed data can be assessed directly.
- The same workflow applies to the categorical survey responses that dominate education research.
Where Pith is reading between the lines
- The same shift from hard clustering to mixture models could be tested in psychology or sociology datasets that also rely on survey-based subgroups.
- Researchers could examine whether mixture-model subgroups produce different predictions for student outcomes than k-means subgroups on held-out data.
- Extensions might combine mixture models with continuous variables or multilevel structures common in classroom studies.
Load-bearing premise
That the probabilistic structure of mixture models will produce practically more useful insights than hard partitioning when applied to typical physics education research datasets and questions.
What would settle it
A replication in which the mixture-model and k-modes analyses produce identical subgroup interpretations and reach the same substantive conclusions on the same dataset, or in which modeling subgroup membership jointly with other variables yields no measurable improvement over separate post-hoc analyses.
Original abstract
Clustering methods are often used in physics education research (PER) to identify subgroups of individuals within a population who share similar response patterns or characteristics. K-means (or k-modes, for categorical data) is one of the most commonly used clustering methods in PER. This algorithm, however, is not model-based: it relies on algorithmic partitioning and assigns individuals to subgroups with definite membership. Researchers must also conduct post-hoc analyses to relate subgroup membership to other variables. Mixture models offer a model-based alternative that accounts for classification errors and allows researchers to directly integrate subgroup membership into a broader latent variable framework. In this paper, we outline the theoretical similarities and differences between k-modes clustering and latent class analysis (one type of mixture model for categorical data). We also present parallel analyses using each method to address the same research questions in order to demonstrate these similarities and differences. We provide the data and R code to replicate the worked example presented in the paper for researchers interested in using mixture models.
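The algorithmic partitioning that the abstract describes can be sketched as follows. This is a simplified, assumption-laden illustration of a k-modes style procedure (Hamming distance for assignment, per-item modes for the update), not the klaR implementation or the authors' code; the function names, the default of 300 iterations, the seed, and the toy records are hypothetical.

```python
import random
from collections import Counter

def hamming(a, b):
    """Number of positions where two categorical records differ."""
    return sum(x != y for x, y in zip(a, b))

def k_modes(records, k, max_iter=300, seed=0):
    """Hard-partition categorical records into k clusters (k-modes style sketch)."""
    rng = random.Random(seed)
    modes = rng.sample(records, k)  # initialise the modes from the data
    labels = None
    for _ in range(max_iter):
        # Assignment step: every record goes to exactly one nearest mode.
        new_labels = [min(range(k), key=lambda c: hamming(r, modes[c]))
                      for r in records]
        if new_labels == labels:
            break  # the partition stopped changing
        labels = new_labels
        # Update step: each mode becomes the most common category per item.
        for c in range(k):
            members = [r for r, l in zip(records, labels) if l == c]
            if members:  # keep the old mode if a cluster is empty
                modes[c] = tuple(Counter(col).most_common(1)[0][0]
                                 for col in zip(*members))
    return labels, modes

# Toy survey responses (hypothetical): three yes/no items per respondent.
records = [("yes", "yes", "yes"), ("yes", "yes", "no"), ("yes", "yes", "yes"),
           ("no", "no", "no"), ("no", "no", "yes"), ("no", "no", "no")]
labels, modes = k_modes(records, k=2)
```

Each respondent ends up with exactly one integer label and no measure of how close the call was, which is the "definite membership" the abstract refers to. The result can depend on the random initialisation, so the sketch fixes a seed.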
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper claims that k-modes clustering, commonly used in PER, is limited by its algorithmic hard partitioning and requirement for post-hoc analyses, whereas latent class analysis (LCA) as a mixture model is model-based, accounts for classification uncertainty, and permits direct integration of subgroup membership into larger latent variable models. It outlines theoretical similarities and differences between the approaches and illustrates them via parallel analyses addressing identical research questions on the same data, with accompanying R code and data for replication.
Significance. If the parallel analyses convincingly show that LCA's probabilistic features produce more reliable or distinct inferences about subgroups and their relations to other variables, the work could meaningfully shift PER practice toward model-based clustering. The explicit reproducibility materials strengthen the contribution by lowering barriers to adoption.
major comments (2)
- [parallel analyses / empirical example] In the section presenting the parallel analyses, the manuscript reports broadly similar subgroup profiles and post-hoc relations under both methods but does not quantify or highlight any differences arising from LCA's soft assignments or explicit modeling of classification error; this leaves the claim of practical superiority as an untested assumption rather than a demonstrated outcome.
- [abstract and methods] The abstract and methods description provide insufficient detail on data characteristics (e.g., sample size, number and distribution of categorical items), model selection criteria, and fit diagnostics (e.g., BIC, entropy, or classification probabilities for the LCA solution), which are necessary to evaluate whether the mixture model is well-identified and whether the reported differences are robust.
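For readers unfamiliar with the diagnostics named in the second comment, the sketch below shows how BIC and relative entropy are commonly computed for a latent class solution. It assumes an unconstrained model with binary indicators when counting parameters, and the function names are hypothetical; it is not the paper's R or Mplus workflow.

```python
from math import log

def lca_bic(log_lik, n_classes, n_items, n_obs):
    """BIC for an unconstrained LCA with binary indicators (lower is better)."""
    # (K - 1) free class proportions plus K * J conditional item probabilities.
    n_params = (n_classes - 1) + n_classes * n_items
    return -2.0 * log_lik + n_params * log(n_obs)

def relative_entropy(posteriors):
    """Relative entropy in [0, 1]; values near 1 indicate well-separated classes.

    posteriors: one row per respondent, each row holding P(c = k | u) for all k.
    """
    n_obs = len(posteriors)
    n_classes = len(posteriors[0])
    if n_classes < 2:
        return 1.0  # a single class carries no classification uncertainty
    # Total Shannon entropy of the posterior rows; zero terms are skipped.
    uncertainty = -sum(p * log(p) for row in posteriors for p in row if p > 0.0)
    return 1.0 - uncertainty / (n_obs * log(n_classes))
```

In practice, BIC would be compared across solutions with different numbers of classes, and relative entropy would be reported for the retained solution alongside the average posterior probabilities.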
minor comments (2)
- [theoretical comparison] Notation for posterior class probabilities and item-response probabilities in the theoretical comparison section could be introduced more explicitly to aid readers without prior mixture-model experience.
- [results figures] Figure captions for the parallel-analysis results should include the exact number of classes retained and the criterion used for that choice.
Simulated Author's Rebuttal
We thank the referee for their constructive and detailed feedback, which has helped us strengthen the manuscript. We address each major comment below and have revised the paper to incorporate additional details and clarifications while preserving the core contribution of comparing k-modes and latent class analysis in physics education research.
- Referee: In the section presenting the parallel analyses, the manuscript reports broadly similar subgroup profiles and post-hoc relations under both methods but does not quantify or highlight any differences arising from LCA's soft assignments or explicit modeling of classification error; this leaves the claim of practical superiority as an untested assumption rather than a demonstrated outcome.
  Authors: We agree that the original parallel analyses section focused primarily on the broad similarities in subgroup profiles and relations, which was deliberate to show that both approaches can address the same research questions. To address this concern, we have revised the section to include quantitative metrics highlighting LCA-specific features, such as average posterior class probabilities, entropy values, and a brief discussion of how accounting for classification uncertainty can affect the precision of post-hoc relations. These additions demonstrate the practical value of the model-based approach without overstating superiority, as the profiles remain largely consistent in this dataset.
  Revision: yes
- Referee: The abstract and methods description provide insufficient detail on data characteristics (e.g., sample size, number and distribution of categorical items), model selection criteria, and fit diagnostics (e.g., BIC, entropy, or classification probabilities for the LCA solution), which are necessary to evaluate whether the mixture model is well-identified and whether the reported differences are robust.
  Authors: We appreciate this observation and have expanded both the abstract and methods sections in the revised manuscript. We now report the sample size, the number and distribution of the categorical items, the model selection process (including BIC comparisons across class solutions), and key fit diagnostics such as entropy and average classification probabilities for the selected LCA model. These revisions allow readers to better assess model identification and the robustness of the results.
  Revision: yes
Circularity Check
No circularity: standard methodological comparison with independent empirical illustration
The paper's core argument rests on established distinctions between algorithmic hard partitioning (k-modes) and model-based approaches (LCA) that account for classification uncertainty and permit direct integration into latent variable models. These distinctions are presented as theoretical background rather than derived from any fitted quantities within the paper. The parallel analyses serve as an empirical demonstration of similarities and differences on the same research questions, with data and R code supplied for reproducibility; no step reduces a claimed prediction or result to a parameter fitted from the same dataset or to a self-citation chain. The derivation chain is therefore self-contained against external methodological literature and does not exhibit any of the enumerated circularity patterns.
Axiom & Free-Parameter Ledger
free parameters (1)
- number of latent classes
axioms (1)
- domain assumption: Response patterns arise from a finite mixture of categorical distributions corresponding to latent classes.
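Written out for binary indicators, this assumption takes the standard latent class form below. The symbols are introduced here for illustration and are assumptions of this sketch: $K$ classes, $J$ indicators, $u_{ij}$ for respondent $i$'s answer to item $j$, $P(c = k)$ for the class proportions, and $P(u_j = 1 \mid c = k)$ for the conditional item probabilities. The product over items reflects the usual conditional independence assumption.

```latex
P(\mathbf{u}_i)
  = \sum_{k=1}^{K} P(c = k)
    \prod_{j=1}^{J}
      P(u_j = 1 \mid c = k)^{\,u_{ij}}
      \bigl[\,1 - P(u_j = 1 \mid c = k)\,\bigr]^{\,1 - u_{ij}}
```

The number of latent classes $K$, listed above as the free parameter, is fixed before estimation and chosen by comparing fitted solutions.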