Modeling within-department homogeneity in research quality rankings: an application to the Italian ISPD
Pith reviewed 2026-05-13 18:28 UTC · model grok-4.3
The pith
Modeling intra-department correlation by size yields an adjusted ISPD that ranks Italian departments more fairly than the original index.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Within-department homogeneity among the standardized scores, modeled as a function of department size, motivates an adjusted ISPD that is claimed to rank departments more fairly; the Betoidal distribution makes the model estimable from coarsened public data.
What carries the argument
Parametric model of intra-departmental correlation as a function of size, estimated by maximum likelihood and paired with the Betoidal distribution for rounded and truncated observations.
Load-bearing premise
Intra-departmental correlation among scores depends only on department size through a specific parametric form, and the Betoidal distribution accurately captures the rounding and truncation in the public data.
What would settle it
A simulation in which true department qualities are known but the size-dependent correlation is misspecified, or real data in which the adjusted index fails to improve recovery of held-out quality measures relative to the original ISPD.
Original abstract
In this paper, we consider the academic department ranking system of Italy, which is based on a performance index named Indice Standardizzato di Performance Dipartimentale (ISPD). While critiques to the ISPD have been moved for its marked tendency to polarization, we here formalize a yet unexplored determinant of this phenomenon, that is, the presence of within-department homogeneity among the standardized scores used to build the index. We account for this intra-departmental correlation by modeling it as a function of departments' size. The proposed model, estimated via Maximum Likelihood, allows to build a fairer ranking procedure via the definition of a properly adjusted version of the ISPD. The estimation framework is also adapted to fit publicly available data, which are coarsened by rounding and/or left-truncated. To this end, a novel probability distribution termed Betoidal is introduced. Empirical evidence in favor of the proposed model is found in the 2017 and 2022 data. Moreover, a simulation study shows that the adjusted index significantly overcomes not only the original ISPD, but also other more data-demanding competing proposals.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper proposes modeling within-department homogeneity in standardized research performance scores as a parametric function of department size. It introduces a novel 'Betoidal' distribution to handle rounding and left-truncation in publicly available ISPD data, estimates the model by maximum likelihood, and defines an adjusted ISPD intended to produce fairer department rankings. Empirical fits are reported on 2017 and 2022 Italian data, and a simulation study claims the adjusted index outperforms the original ISPD and other competing proposals.
Significance. If the modeling assumptions prove robust, the work supplies a statistically principled correction for size-induced polarization bias in research rankings. The adaptation of the likelihood to coarsened data via the Betoidal distribution addresses a practical constraint of public indices, and the simulation evidence of improved ranking performance could inform policy adjustments in academic evaluation systems.
major comments (3)
- [§3] §3 (Betoidal distribution): the novel distribution is introduced to encode rounding and left-truncation, yet no comparison to a standard truncated-beta likelihood, no residual diagnostics, and no sensitivity checks to the rounding mechanism are provided; this directly affects the reliability of the ML parameter estimates used for the adjusted ISPD.
- [§5] §5 (Empirical application): the adjusted ISPD is constructed from parameters estimated on the identical 2017 and 2022 datasets used to evaluate its 'fairer' property, so the reported improvement is internal to the fitted model rather than benchmarked against an external criterion.
- [§6] §6 (Simulation study): superiority is demonstrated under data generated from the proposed model; no robustness results are shown when the size-dependent correlation function is misspecified or when intra-department homogeneity follows a different structure, which is central to the claim that the adjustment removes polarization bias.
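The truncated-beta benchmark mentioned in the first major comment can be sketched as follows. This is a minimal illustration, not the paper's Betoidal distribution: it assumes scores rescaled to [0, 1], symmetric rounding to a grid of spacing `width`, and a left-truncation point `trunc`. All function names, parameter names, and the midpoint-rule CDF are hypothetical choices made for this sketch.

```python
import math

def beta_pdf(x, a, b):
    """Density of a Beta(a, b) variable at x in (0, 1)."""
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    return math.exp(log_norm + (a - 1.0) * math.log(x)
                    + (b - 1.0) * math.log(1.0 - x))

def beta_cdf(x, a, b, steps=2000):
    """Midpoint-rule approximation of the Beta(a, b) CDF.

    Adequate for a, b >= 1; shapes below 1 have endpoint
    singularities that this simple rule handles poorly.
    """
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    h = x / steps
    return h * sum(beta_pdf((j + 0.5) * h, a, b) for j in range(steps))

def coarsened_loglik(obs, a, b, width=0.01, trunc=0.0):
    """Log-likelihood of rounded, left-truncated observations.

    Each rounded value y contributes the probability mass of the
    interval [y - width/2, y + width/2], clipped to [trunc, 1] and
    renormalised by P(X > trunc).
    """
    denom = 1.0 - beta_cdf(trunc, a, b)
    total = 0.0
    for y in obs:
        lo = max(y - width / 2.0, trunc)
        hi = min(y + width / 2.0, 1.0)
        mass = beta_cdf(hi, a, b) - beta_cdf(lo, a, b)
        total += math.log(mass / denom)
    return total
```

With a uniform Beta(1, 1), a single observation at 0.5, `width=0.01` and `trunc=0.2`, the contribution is log(0.01 / 0.8), which gives a quick sanity check. Maximizing `coarsened_loglik` over `a` and `b` would provide the kind of baseline fit against which a Betoidal fit could be compared.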
minor comments (3)
- Notation for the size-dependent correlation parameters should be defined once and used consistently; currently it varies between the model equations and the estimation section.
- Add a brief discussion of related literature on intra-cluster correlation adjustments in ranking indices or performance metrics.
- The simulation tables would benefit from reporting coverage probabilities or bias metrics in addition to ranking accuracy to strengthen the comparison.
Simulated Authors' Rebuttal
We thank the referee for the constructive and detailed comments. We address each major point below, indicating where revisions will be made to strengthen the manuscript.
Point-by-point responses
- Referee: [§3] §3 (Betoidal distribution): the novel distribution is introduced to encode rounding and left-truncation, yet no comparison to a standard truncated-beta likelihood, no residual diagnostics, and no sensitivity checks to the rounding mechanism are provided; this directly affects the reliability of the ML parameter estimates used for the adjusted ISPD.
Authors: We agree that additional validation of the Betoidal distribution is warranted to support the reliability of the ML estimates. In the revised manuscript we will add: (i) a direct comparison of the Betoidal likelihood against a standard truncated-beta specification on the same data, (ii) residual diagnostics (e.g., PIT histograms and QQ plots adapted for the coarsened support), and (iii) sensitivity analyses varying the rounding interval and truncation threshold. These results will be reported in an expanded subsection of §3. revision: yes
- Referee: [§5] §5 (Empirical application): the adjusted ISPD is constructed from parameters estimated on the identical 2017 and 2022 datasets used to evaluate its 'fairer' property, so the reported improvement is internal to the fitted model rather than benchmarked against an external criterion.
Authors: We acknowledge that the empirical application evaluates the adjusted index on the same data used for estimation, rendering the reported reduction in polarization an in-sample illustration rather than an external validation. The primary evidence for superiority remains the simulation study in §6. In revision we will explicitly clarify this distinction in §5, emphasize that the empirical section demonstrates practical consequences of the adjustment on real rankings, and add a brief discussion of the simulation results as the out-of-sample benchmark. revision: partial
- Referee: [§6] §6 (Simulation study): superiority is demonstrated under data generated from the proposed model; no robustness results are shown when the size-dependent correlation function is misspecified or when intra-department homogeneity follows a different structure, which is central to the claim that the adjustment removes polarization bias.
Authors: The simulation study is intentionally conducted under the data-generating process implied by our model to isolate the performance of the adjustment when assumptions hold. We recognize that robustness to misspecification is important for the broader claim. In the revised version we will include additional simulation scenarios in which the intra-department correlation is generated from alternative structures (constant correlation, random-effects, and non-monotonic size dependence) and report the resulting ranking performance of the adjusted ISPD under these misspecifications. revision: yes
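The misspecification scenarios listed in the last response can be made concrete with a small sketch of how different correlation-by-size curves inflate the spread of a department's scaled average score. It relies on the identity V = 1 + (N − 1)ρ from the paper's appendix and, as a simplifying assumption, treats the scaled average as Gaussian with mean zero. The three functional forms and all parameter values below are hypothetical illustrations, not the paper's specification; the random-effects scenario is not sketched.

```python
import math
from statistics import NormalDist

def rho_constant(n, rho0=0.05):
    """Constant intra-department correlation (hypothetical value)."""
    return rho0

def rho_decaying(n, rho0=0.3, power=1.0):
    """Correlation shrinking with department size (hypothetical form)."""
    return rho0 / n ** power

def rho_non_monotonic(n, peak=0.1, n_peak=50.0):
    """Correlation that rises up to size n_peak, then falls (hypothetical)."""
    x = n / n_peak
    return peak * x * math.exp(1.0 - x)

def scaled_average_sd(n, rho_fn):
    """Standard deviation of the scaled average: sqrt(1 + (n - 1) * rho(n))."""
    return math.sqrt(1.0 + (n - 1) * rho_fn(n))

def tail_probability(n, rho_fn, threshold=1.645):
    """P(|scaled average| > threshold), assuming a zero-mean Gaussian."""
    sd = scaled_average_sd(n, rho_fn)
    return 2.0 * (1.0 - NormalDist(0.0, sd).cdf(threshold))
```

Comparing `tail_probability(n, rho_constant)` for small and large `n` shows the polarization mechanism: under constant correlation, larger departments land in the extremes more often, whereas a correlation that decays like 1/n keeps the spread bounded.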
Circularity Check
Adjusted ISPD fairness defined via ML fit on the same data used for ranking evaluation
specific steps
- fitted input called prediction [Abstract]
"The proposed model, estimated via Maximum Likelihood, allows to build a fairer ranking procedure via the definition of a properly adjusted version of the ISPD. [...] a simulation study shows that the adjusted index significantly overcomes not only the original ISPD, but also other more data-demanding competing proposals."
The adjusted ISPD is defined using parameters estimated on the identical data to which it is applied; its claimed superiority (both empirical and in simulation) therefore reduces to the consequences of that fit rather than an external criterion for fairness.
full rationale
The paper's core claim is that the ML-estimated model yields a 'fairer' adjusted ISPD. This adjustment is constructed directly from parameters fitted to the 2017/2022 data, and the simulation study demonstrating superiority is performed under the model's own assumptions (intra-department correlation as a function of size, Betoidal for coarsening). While the simulation offers a partial external check, the 'fairer' property lacks an independent benchmark outside the fitted model, producing moderate circularity of the fitted-input-called-prediction type. Neither a load-bearing self-citation nor a self-definitional reduction of the full derivation chain is present.
Axiom & Free-Parameter Ledger
free parameters (1)
- size-dependent correlation parameters
axioms (1)
- domain assumption: standardized researcher scores within a department exhibit correlation that depends only on department size
invented entities (1)
- Betoidal distribution: no independent evidence
Reference graph
Works this paper leans on
- [1] MUR. Elenco dei Dipartimenti ammessi alla selezione dei Dipartimenti di eccellenza 2018 – 2022 (link). Technical report,
- [2] MUR. Elenco dei Dipartimenti ammessi alla selezione dei Dipartimenti di eccellenza 2023 – 2027 (link). Technical report,
- [3] G. Poggi and C. A. Nappi. Il voto standardizzato per l’esercizio VQR 2004-2010. RIV: rassegna italiana di valutazione, 59(2):34–58,
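- [4] The variance of the scaled average $\tilde{Z} = \sum_{i \in \mathcal{N}} Z_i / \sqrt{N}$ is given by
$$V(\tilde{Z}) = \frac{1}{N} \sum_{i \in \mathcal{N}} V(Z_i) + \frac{1}{N} \sum_{i \in \mathcal{N}} \sum_{i' \in \mathcal{N} \setminus i} \mathrm{Cov}(Z_i, Z_{i'}) = \frac{1}{N} \Big( N + \sum_{i \in \mathcal{N}} \sum_{i' \in \mathcal{N} \setminus i} \rho_{ii'} \Big),$$
where $\rho_{ii'} = \mathrm{Corr}(Z_i, Z_{i'})$. Letting
$$\rho = \frac{1}{N(N-1)} \sum_{i \in \mathcal{N}} \sum_{i' \in \mathcal{N} \setminus i} \rho_{ii'}$$
be the average pairwise intra-departmental correlation, we have $V(\tilde{Z}) = 1 + (N-1)\rho$. Hence, unless $\rho$ decreases sufficiently ...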
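- [5] $= \int_{-\infty}^{+\infty} \{\Phi(z)\}^2 \, \frac{1}{\sigma} \, \phi\!\left(\frac{z}{\sigma}\right) dz = \int_{-\infty}^{+\infty} \{\Phi(k\sigma)\}^2 \, \phi(k) \, dk = \frac{1}{\pi} \operatorname{atan} \sqrt{1 + 2\sigma^2}$, where the last equality follows from identities reported in Owen (1980). Equivalently, E(X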
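The two identities excerpted in entries [4] and [5] can be checked numerically. The sketch below is an illustration under stated assumptions, not the paper's code. For [4] it simulates equicorrelated standard normal scores through a shared department factor, a convenient construction assumed here, and compares the Monte Carlo variance of the scaled average with 1 + (N − 1)ρ. For [5] it compares a trapezoid approximation of the integral with the arctangent closed form. All function names are hypothetical.

```python
import math
import random
import statistics
from statistics import NormalDist

def simulated_variance(n, rho, reps=20000, seed=0):
    """Monte Carlo variance of sum(Z_i) / sqrt(n).

    Assumption: Z_i = sqrt(rho) * U + sqrt(1 - rho) * E_i with U and E_i
    independent N(0, 1), so every pair has correlation rho.
    """
    rng = random.Random(seed)
    a, b = math.sqrt(rho), math.sqrt(1.0 - rho)
    draws = []
    for _ in range(reps):
        shared = rng.gauss(0.0, 1.0)  # department-level factor U
        total = sum(a * shared + b * rng.gauss(0.0, 1.0) for _ in range(n))
        draws.append(total / math.sqrt(n))
    return statistics.pvariance(draws)

def closed_form_variance(n, rho):
    """Variance of the scaled average from entry [4]: 1 + (N - 1) * rho."""
    return 1.0 + (n - 1) * rho

def owen_closed_form(sigma):
    """Right-hand side of entry [5]: atan(sqrt(1 + 2 sigma^2)) / pi."""
    return math.atan(math.sqrt(1.0 + 2.0 * sigma ** 2)) / math.pi

def owen_numeric(sigma, lo=-10.0, hi=10.0, steps=4000):
    """Trapezoid approximation of the integral of Phi(k sigma)^2 phi(k) dk."""
    std = NormalDist()
    h = (hi - lo) / steps
    total = 0.0
    for j in range(steps + 1):
        k = lo + j * h
        weight = 0.5 if j in (0, steps) else 1.0
        total += weight * std.cdf(k * sigma) ** 2 * std.pdf(k)
    return total * h
```

For σ = 1 the closed form equals 1/3, the second moment of a uniform variable on [0, 1], which is a useful check because Φ applied to a standard normal variable is uniform.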