Latent Impact and Differential Item Functioning Analysis for Asymmetric IRT Models
Pith reviewed 2026-05-08 08:06 UTC · model grok-4.3
The pith
An l1-regularized mixture model identifies latent impact and DIF items in asymmetric IRT models without observed groups or anchors.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
By combining latent class mixture modeling with asymmetric IRT and l1 regularization for DIF selection, the estimator simultaneously recovers the latent classes representing impact and identifies the sparse set of DIF items, all without requiring observed group labels or pre-specified anchor items.
What carries the argument
The ℓ1-regularised estimator applied to the DIF shift parameters in a latent class asymmetric IRT mixture model, which induces sparsity to select DIF items while estimating class-specific impact.
If this is right
- The method recovers latent class probabilities, item parameters, and DIF effects accurately in simulations across varied configurations.
- Applications to educational data can distinguish cases with both impact and DIF from those with mainly impact and little DIF.
- Analysis of measurement invariance becomes possible in settings lacking observed groups or known anchors.
- The framework accommodates asymmetric response processes that symmetric IRT models cannot capture well.
Where Pith is reading between the lines
- This could be extended to other latent variable models in psychology or sociology where unobserved heterogeneity affects item responses.
- Sensitivity analyses for the sparsity level would be needed if the few-DIF-items assumption is uncertain in a new dataset.
- Pairing the estimator with cross-validation on held-out responses could help guard against over-selection of DIF items.
Load-bearing premise
The number of items with differential functioning is relatively small.
What would settle it
A simulation where the true proportion of DIF items is large would show whether the estimator still recovers the latent classes and correctly selects the DIF items or instead overfits.
Figures
read the original abstract
Differential item functioning (DIF) arises alongside latent population heterogeneity in many applications, and both must be accounted for when assessing measurement invariance. In many practical settings, however, the comparison groups are unobserved and anchor items are unknown. A further challenge is that item response theory models traditionally assume symmetric link functions, yet empirical response processes may exhibit substantial asymmetry. This paper proposes a general framework for jointly analysing impact and DIF under asymmetric item response models. Unobserved group differences are represented by latent classes within a mixture item response model, while item-specific shifts capture DIF effects. Assuming the number of DIF items is relatively small, an $\ell_1$-regularised estimator is used to simultaneously identify the latent classes and select DIF items without requiring observed group labels or pre-specified anchor items. A simulation study evaluates recovery of impact, item parameters, and DIF effects across a range of configurations. The method is illustrated using two empirical applications from educational testing. In one dataset, the selected model reveals both impact and item-level DIF, whereas in the other, the results indicate substantial impact but little evidence of item-level DIF.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper proposes a mixture IRT framework for asymmetric link functions that jointly models latent impact (unobserved group heterogeneity) and DIF via item-specific shifts. An ℓ1-regularized estimator simultaneously identifies the latent classes and selects DIF items without observed group labels or pre-specified anchors, under the assumption that the number of DIF items is relatively small. Recovery is assessed in a simulation study across configurations, and the method is applied to two educational testing datasets, one showing both impact and DIF and the other showing impact with little DIF.
Significance. If the recovery and selection properties hold, the work offers a practical advance for measurement invariance analysis in settings with latent heterogeneity and asymmetric responses, where traditional methods requiring observed groups or anchors are infeasible. The joint estimation and regularization approach could improve fairness assessments in testing applications.
major comments (3)
- [§4] §4 (Simulation Study): recovery claims for impact, item parameters, and DIF effects are presented without error bars, standard errors, or detailed metrics on false positive/negative rates for DIF selection; this weakens assessment of the estimator's reliability across the tested configurations.
- [§3] §3 (Estimation procedure): the ℓ1-regularized estimator and class identification rest on the assumption that the number of DIF items is relatively small (stated in the abstract and method); no sensitivity analysis or robustness checks to violations of this assumption are reported, which is load-bearing for the central claim of simultaneous identification without anchors.
- [§3] §3, Eq. for the penalized likelihood: the regularization parameter and number of latent classes are treated as tuning parameters, but the manuscript provides limited guidance on their selection procedure and its effect on the resulting class recovery and DIF selection.
minor comments (2)
- [Abstract] Abstract: the two empirical datasets are described only generically as 'educational testing'; adding brief identifiers or sample sizes would improve context.
- [Notation] Notation: ensure consistent use of symbols for the asymmetric link function and the mixture weights across sections and equations.
Simulated Author's Rebuttal
We thank the referee for the constructive comments, which help clarify important aspects of the simulation study and estimation procedure. We respond to each major comment below and indicate the planned revisions.
read point-by-point responses
-
Referee: [§4] §4 (Simulation Study): recovery claims for impact, item parameters, and DIF effects are presented without error bars, standard errors, or detailed metrics on false positive/negative rates for DIF selection; this weakens assessment of the estimator's reliability across the tested configurations.
Authors: We agree that the simulation results would be strengthened by including variability measures and explicit selection performance metrics. In the revised manuscript, we will add standard deviations (or error bars) across the Monte Carlo replications for all reported recovery metrics on impact, item parameters, and DIF effects. We will also report false positive and false negative rates for DIF item selection under each configuration to provide a fuller picture of reliability. revision: yes
-
Referee: [§3] §3 (Estimation procedure): the ℓ1-regularized estimator and class identification rest on the assumption that the number of DIF items is relatively small (stated in the abstract and method); no sensitivity analysis or robustness checks to violations of this assumption are reported, which is load-bearing for the central claim of simultaneous identification without anchors.
Authors: The sparsity assumption on the number of DIF items is indeed central to the ℓ1-regularization strategy and the claim of anchor-free identification. We will add a sensitivity analysis in the simulation study (or an appendix) that varies the proportion of DIF items to examine how performance degrades when the assumption is moderately violated, thereby clarifying the practical scope of the method. revision: yes
-
Referee: [§3] §3, Eq. for the penalized likelihood: the regularization parameter and number of latent classes are treated as tuning parameters, but the manuscript provides limited guidance on their selection procedure and its effect on the resulting class recovery and DIF selection.
Authors: We will expand the relevant section to give more explicit guidance on selecting the regularization parameter and the number of latent classes. The revision will describe the concrete procedure employed (e.g., information criteria or cross-validation), report the chosen values, and include a brief discussion or supplementary results illustrating the sensitivity of class recovery and DIF selection to these choices. revision: yes
Circularity Check
No significant circularity; estimator builds on standard mixture and regularization methods
full rationale
The paper proposes a joint framework for latent impact and DIF under asymmetric IRT models using a mixture model with latent classes and an ℓ1-regularized estimator for simultaneous class identification and DIF item selection. This relies on established statistical techniques (mixture IRT, lasso-type regularization) without any derivation step that reduces a claimed result to a fitted parameter or self-citation by construction. The central claim (recovery of impact, parameters, and DIF without observed labels or anchors) is evaluated via simulation and empirical application rather than being tautological with the inputs. No self-definitional, fitted-input-as-prediction, or load-bearing self-citation patterns appear in the provided abstract or description. The assumption that the number of DIF items is small is stated explicitly as a modeling choice, not smuggled in as a derived result.
Axiom & Free-Parameter Ledger
free parameters (2)
- ℓ1 regularization parameter
- number of latent classes
axioms (2)
- ad hoc to paper The number of DIF items is relatively small
- domain assumption Latent classes can represent unobserved group differences in response processes
Reference graph
Works this paper leans on
-
[1]
Bolt, Daniel M. and Cohen, Allan S. and Wollack, James A. , title =. Journal of Educational Measurement , year =
-
[2]
British Journal of Mathematical and Statistical Psychology , year=
Defining asymmetry in item response theory , author=. British Journal of Mathematical and Statistical Psychology , year=
-
[3]
British Journal of Mathematical and Statistical Psychology , volume=
Identifiability analysis of the fixed-effects one-parameter logistic positive exponent model , author=. British Journal of Mathematical and Statistical Psychology , volume=. 2025 , publisher=
work page 2025
-
[4]
Large-scale Assessments in Education , volume=
Investigating item complexity as a source of cross-national DIF in TIMSS math and science , author=. Large-scale Assessments in Education , volume=. 2024 , publisher=
work page 2024
-
[5]
The Journal of Machine Learning Research , volume=
Model selection through sparse maximum likelihood estimation for multivariate Gaussian or binary data , author=. The Journal of Machine Learning Research , volume=. 2008 , publisher=
work page 2008
-
[6]
Structural Equation Modeling: a Multidisciplinary Journal , volume=
Simplifying the assessment of measurement invariance over multiple background variables: Using regularized moderated nonlinear factor analysis to detect differential item functioning , author=. Structural Equation Modeling: a Multidisciplinary Journal , volume=. 2020 , publisher=
work page 2020
-
[7]
A statistical test for differential item pair functioning , author=. Psychometrika , volume=. 2015 , publisher=
work page 2015
- [8]
-
[9]
Journal of Educational Measurement , volume=
Detection of test collusion via Kullback--Leibler divergence , author=. Journal of Educational Measurement , volume=. 2013 , publisher=
work page 2013
-
[10]
Applied Psychological Measurement , volume=
Graph theory approach to detect examinees involved in test collusion , author=. Applied Psychological Measurement , volume=. 2021 , publisher=
work page 2021
-
[11]
Journal of Educational and Behavioral Statistics , volume=
Measuring student ability, classifying schools, and detecting item bias at school level, based on student-level dichotomous items , author=. Journal of Educational and Behavioral Statistics , volume=. 2014 , publisher=
work page 2014
-
[12]
Statistical Theories of Mental Test Scores , pages=
Some latent trait models and their use in inferring an examinee's ability , author=. Statistical Theories of Mental Test Scores , pages=. 1968 , address=
work page 1968
-
[13]
Marginal maximum likelihood estimation of item parameters: Application of an
Bock, R Darrell and Aitkin, Murray , journal=. Marginal maximum likelihood estimation of item parameters: Application of an. 1981 , publisher=
work page 1981
-
[14]
Bock, R Darrell and Zimowski, Michele F , booktitle=. Multiple group. 1997 , publisher=
work page 1997
-
[15]
Journal of Educational and Behavioral Statistics , volume=
A mixture item response model for multiple-choice data , author=. Journal of Educational and Behavioral Statistics , volume=. 2001 , publisher=
work page 2001
-
[16]
Applied Psychological Measurement , volume=
An iterative procedure for linking metrics and assessing item bias in item response theory , author=. Applied Psychological Measurement , volume=. 1988 , publisher=
work page 1988
-
[17]
Educational and Psychological Measurement , volume=
A Monte Carlo study of an iterative Wald test procedure for DIF analysis , author=. Educational and Psychological Measurement , volume=. 2017 , publisher=
work page 2017
-
[18]
arXiv preprint arXiv:2108.08604 , year=
Item Response Theory--A Statistical Framework for Educational and Psychological Measurement , author=. arXiv preprint arXiv:2108.08604 , year=
-
[19]
A Skew Item Response Model , author=. Bayesian Analysis , volume=
-
[20]
Journal of Educational and Behavioral Statistics , volume=
Bayesian estimation of the logistic positive exponent IRT model , author=. Journal of Educational and Behavioral Statistics , volume=. 2010 , publisher=
work page 2010
-
[21]
Journal of Educational and Behavioral Statistics , pages=
Item pool quality control in educational testing: change point model, compound risk, and sequential detection , author=. Journal of Educational and Behavioral Statistics , pages=. 2022 , publisher=
work page 2022
-
[22]
DIF statistical inference without knowing anchoring items , author=. psychometrika , volume=. 2023 , publisher=
work page 2023
-
[23]
Joint maximum likelihood estimation for high-dimensional exploratory item factor analysis , author=. Psychometrika , volume=. 2019 , publisher=
work page 2019
-
[24]
The Annals of Applied Statistics , volume=
Detection of two-way outliers in multivariate data and application to cheating detection in educational tests , author=. The Annals of Applied Statistics , volume=. 2022 , publisher=
work page 2022
-
[25]
Cho, Sun-Joo and Suh, Youngsuk and Lee, Woo-yeol , journal=. An. 2016 , publisher=
work page 2016
-
[26]
Cho, Sun-Joo and Cohen, Allan S , journal=. A multilevel mixture. 2010 , publisher=
work page 2010
-
[27]
Handbook of quantitative methods for detecting cheating on tests , author=. 2017 , publisher=
work page 2017
-
[28]
The effects of purification of matching criterion on the identification of
Clauser, Brian and Mazor, Kathy and Hambleton, Ronald K , journal=. The effects of purification of matching criterion on the identification of. 1993 , publisher=
work page 1993
-
[29]
Journal of Educational Measurement , volume=
A mixture model analysis of differential item functioning , author=. Journal of Educational Measurement , volume=. 2005 , publisher=
work page 2005
-
[30]
Detecting intervention effects using a multilevel latent transition analysis with a mixture
Cho, Sun-Joo and Cohen, Allan S and Bottge, Brian , journal=. Detecting intervention effects using a multilevel latent transition analysis with a mixture. 2013 , publisher=
work page 2013
-
[31]
Applied Psychological Measurement , volume=
Latent transition analysis with a mixture item response theory measurement model , author=. Applied Psychological Measurement , volume=. 2010 , publisher=
work page 2010
-
[32]
After differential item functioning is detected:
Cho, Sun-Joo and Suh, Youngsuk and Lee, Woo-yeol , journal=. After differential item functioning is detected:. 2016 , publisher=
work page 2016
-
[33]
International Journal of Pure and Applied Mathematics , volume=
On Mixture Regression Shrinkage and Selection Via the MR-Lasso , author=. International Journal of Pure and Applied Mathematics , volume=
-
[34]
Journal of the American Statistical Association , volume=
The adaptive lasso and its oracle properties , author=. Journal of the American Statistical Association , volume=. 2006 , publisher=
work page 2006
-
[35]
In defense of the indefensible: A very naive approach to high-dimensional inference , author=. Statistical Science , volume=. 2021 , publisher=
work page 2021
-
[36]
The Annals of Statistics , volume=
Nearly unbiased variable selection under minimax concave penalty , author=. The Annals of Statistics , volume=
-
[37]
Journal of the American statistical Association , volume=
Variable selection via nonconcave penalized likelihood and its oracle properties , author=. Journal of the American statistical Association , volume=. 2001 , publisher=
work page 2001
-
[38]
Robitzsch, Alexander , journal=. Regularized mixture. 2022 , publisher=
work page 2022
-
[39]
Computational Statistics & Data Analysis , volume=
Model-based clustering of high-dimensional data: A review , author=. Computational Statistics & Data Analysis , volume=. 2014 , publisher=
work page 2014
-
[40]
Advances in Data Analysis and Classification , volume=
A LASSO-penalized BIC for mixture model selection , author=. Advances in Data Analysis and Classification , volume=. 2014 , publisher=
work page 2014
-
[41]
Optimization Methods & Software , volume=
Covariance selection for nonchordal graphs via chordal embedding , author=. Optimization Methods & Software , volume=. 2008 , publisher=
work page 2008
-
[42]
International Journal of Testing , volume=
Differential item functioning: A mixture distribution conceptualization , author=. International Journal of Testing , volume=. 2002 , publisher=
work page 2002
-
[43]
Applied Psychological Measurement , volume=
Explanatory secondary dimension modeling of latent differential item functioning , author=. Applied Psychological Measurement , volume=. 2011 , publisher=
work page 2011
-
[44]
Maximum likelihood from incomplete data via the
Dempster, Arthur P and Laird, Nan M and Rubin, Donald B , journal=. Maximum likelihood from incomplete data via the. 1977 , publisher=
work page 1977
-
[45]
Detection of Differential Item Functioning with Nonlinear Regression: A
Drabinov. Detection of Differential Item Functioning with Nonlinear Regression: A. Journal of Educational Measurement , volume=. 2017 , publisher=
work page 2017
-
[46]
Journal of Educational Measurement , volume=
Demonstrating the utility of the standardization approach to assessing unexpected differential item performance on the Scholastic Aptitude Test , author=. Journal of Educational Measurement , volume=. 1986 , publisher=
work page 1986
-
[47]
Handbook of quantitative methods for detecting cheating on tests , pages=
Detecting preknowledge and item compromise: Understanding the status quo , author=. Handbook of quantitative methods for detecting cheating on tests , pages=. 2016 , publisher=
work page 2016
- [48]
-
[49]
Fidalgo, AM and Mellenbergh, Gideon J and Mu. Effects of amount of. Methods of Psychological Research Online , volume=. 2000 , publisher=
work page 2000
-
[50]
Applied Psychological Measurement , volume=
IRT models for ability-based guessing , author=. Applied Psychological Measurement , volume=. 2006 , publisher=
work page 2006
-
[51]
Educational and Psychological Measurement , volume=
Investigation of specific learning disability and testing accommodations based differential item functioning using a multilevel multidimensional mixture item response theory model , author=. Educational and Psychological Measurement , volume=. 2013 , publisher=
work page 2013
-
[52]
Bayesian Item Response Modeling: Theory and Applications , author=. 2010 , publisher=
work page 2010
-
[53]
Frederickx, Sofie and Tuerlinckx, Francis and De Boeck, Paul and Magis, David , journal=. 2010 , publisher=
work page 2010
-
[54]
Sparse inverse covariance estimation with the graphical lasso , author=. Biostatistics , volume=. 2008 , publisher=
work page 2008
-
[55]
The Elements of Statistical Learning: Data Mining, Inference, and Prediction , author=. 2009 , publisher=
work page 2009
- [56]
-
[57]
Differential item functioning and the
Holland, Paul W and Thayer, Dorothy T , journal=. Differential item functioning and the. 1986 , publisher=
work page 1986
-
[58]
International Journal of Selection and Assessment , volume=
Determinants, detection and amelioration of adverse impact in personnel selection procedures: Issues, evidence and lessons learned , author=. International Journal of Selection and Assessment , volume=. 2001 , publisher=
work page 2001
-
[59]
Journal of the American statistical Association , volume=
Estimation of a model with multiple indicators and multiple causes of a single latent variable , author=. Journal of the American statistical Association , volume=. 1975 , publisher=
work page 1975
-
[60]
Journal of Educational Measurement , volume=
Detection of differential item functioning in multiple groups , author=. Journal of Educational Measurement , volume=. 1995 , publisher=
work page 1995
-
[61]
Anchor selection strategies for
Kopf, Julia and Zeileis, Achim and Strobl, Carolin , journal=. Anchor selection strategies for. 2015 , publisher=
work page 2015
-
[62]
A framework for anchor methods and an iterative forward approach for
Kopf, Julia and Zeileis, Achim and Strobl, Carolin , journal=. A framework for anchor methods and an iterative forward approach for. 2015 , publisher=
work page 2015
-
[63]
SIAM Journal on Optimization , volume=
Proximal Newton-type methods for minimizing composite functions , author=. SIAM Journal on Optimization , volume=. 2014 , publisher=
work page 2014
-
[64]
Multivariate Behavioral Research , volume=
Detecting social desirability bias using factor mixture models , author=. Multivariate Behavioral Research , volume=. 2010 , publisher=
work page 2010
-
[65]
Applications of Item Response Theory to Practical Testing Problems , author=. 1980 , publisher=
work page 1980
-
[66]
Frederic M. Lord , title =. Basic problems in cross-cultural psychology , editor =. 1977 , publisher =
work page 1977
-
[67]
Multivariate Behavioral Research , volume=
Improvement in detection of differential item functioning using a mixture item response theory model , author=. Multivariate Behavioral Research , volume=. 2010 , publisher=
work page 2010
-
[68]
Journal of the National Cancer Institute , volume=
Statistical aspects of the analysis of data from retrospective studies of disease , author=. Journal of the National Cancer Institute , volume=. 1959 , publisher=
work page 1959
-
[69]
Journal of Educational and Behavioral Statistics , volume=
Detection of differential item functioning using the lasso approach , author=. Journal of Educational and Behavioral Statistics , volume=. 2015 , publisher=
work page 2015
-
[70]
Applied Psychological Measurement , volume=
A Bayesian method for the detection of item preknowledge in computerized adaptive testing , author=. Applied Psychological Measurement , volume=. 2003 , publisher=
work page 2003
-
[71]
High-dimensional graphs and variable selection with the lasso , author=
-
[72]
Applied Psychological Measurement , volume=
A mixture Rasch model with item response time components , author=. Applied Psychological Measurement , volume=. 2010 , publisher=
work page 2010
-
[73]
Statistical approaches to measurement invariance , author=. 2012 , publisher=
work page 2012
-
[74]
Modeling item responses when different subjects employ different solution strategies , author=. Psychometrika , volume=. 1990 , publisher=
work page 1990
-
[75]
Journal of Educational Statistics , volume=
A method for studying the homogeneity of test items with respect to other relevant variables , author=. Journal of Educational Statistics , volume=. 1985 , publisher=
work page 1985
-
[76]
Latent variable modeling in heterogeneous populations , author=. Psychometrika , volume=. 1989 , publisher=
work page 1989
-
[77]
Some uses of structural equation modeling in validity studies: Extending
Muth. Some uses of structural equation modeling in validity studies: Extending. Test validity , pages=. 2013 , publisher=
work page 2013
-
[78]
Muthen, Bengt and Lehman, James , journal=. Multiple group. 1985 , publisher=
work page 1985
-
[79]
The Annals of Statistics , pages=
Asymptotic properties of criteria for selection of variables in multiple regression , author=. The Annals of Statistics , pages=. 1984 , publisher=
work page 1984
-
[80]
Handbook of quantitative methods for detecting cheating on tests , pages=
Detecting candidate preknowledge and compromised content using differential person and item functioning , author=. Handbook of quantitative methods for detecting cheating on tests , pages=. 2016 , publisher=
work page 2016
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.