Modeling within-department homogeneity in research quality rankings: an application to the Italian ISPD

Giorgio E. Montanari; Marco Doretti

arxiv: 2604.03073 · v1 · submitted 2026-04-03 · 📊 stat.ME


Pith reviewed 2026-05-13 18:28 UTC · model grok-4.3

keywords: ISPD · department ranking · intra-departmental correlation · adjusted index · Betoidal distribution · maximum likelihood estimation · research quality · Italy

The pith

Modeling intra-department correlation by size yields an adjusted ISPD that ranks Italian departments more fairly than the original index.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper models how standardized research scores tend to be more alike within smaller Italian academic departments, treating this homogeneity as a parametric function of department size. Maximum-likelihood estimation of the resulting model produces an adjusted version of the Indice Standardizzato di Performance Dipartimentale that reduces the polarization observed in current rankings. A new Betoidal distribution is introduced to accommodate the rounding and left-truncation present in publicly released data. Fits to the 2017 and 2022 Italian evaluation rounds support the size-dependent correlation structure. Simulations further indicate that the adjusted index recovers underlying department quality more accurately than the unadjusted ISPD or other data-intensive alternatives.

Core claim

The presence of within-department homogeneity among standardized scores, modeled as a function of department size, leads to a new adjusted ISPD that provides fairer rankings, with the Betoidal distribution enabling estimation from coarsened public data.

What carries the argument

Parametric model of intra-departmental correlation as a function of size, estimated by maximum likelihood and paired with the Betoidal distribution for rounded and truncated observations.

Load-bearing premise

Intra-departmental correlation among scores depends only on department size through a specific parametric form, and the Betoidal distribution accurately captures the rounding and truncation in the public data.

What would settle it

A simulation in which true department qualities are known but the size-dependent correlation is misspecified, or real data in which the adjusted index fails to improve recovery of held-out quality measures relative to the original ISPD.

Figures

Figures reproduced from arXiv: 2604.03073 by Giorgio E. Montanari, Marco Doretti.

**Figure 2.** Two settings with $\rho_A = \rho_B = 0.05$, $N_A = 75$, $N_B = 150$, and (a) $z_A = z_B = 2$, and (b) $z_A = z_B = -2$. In (a), $\mathrm{ISPD}_A = \mathrm{ISPD}_B = 97.72\%$, with the correct performance measures being $P(Z_A \le 2) = 82.19\%$ and $P(Z_B \le 2) = 75.43\%$. In (b), $\mathrm{ISPD}_A = \mathrm{ISPD}_B = 2.28\%$, with $P(Z_A \le -2) = 17.81\%$ and $P(Z_B \le -2) = 24.57\%$.

**Figure 3.** (a) Betoidal density for various levels of …

**Figure 4.** Density of $X \sim \mathrm{Betoidal}(\sigma)$ and $Y \sim \mathrm{Beta}(a, a)$ with $V(X) = V(Y)$ and (a) $\sigma = 0.5$ ($a = 3.4005$), and (b) $\sigma = 2.5$ ($a = 0.2568$). When only the upper tail of the ISPD values is observed, as in the 2022 case, it is necessary to consider a left-truncated version of the distribution. To this end, we indicate by $X^\star \sim \text{LT-Betoidal}(\sigma, x^\star)$ a Betoidal random variable truncated below a given value $x^\star$. Its support is …

**Figure 5.** Empirical histograms of (a) ISPD and (b) ISPD …

**Figure 6.** Empirical histograms of scaled averages (divided by the corresponding theoretical standard deviation) for …
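The percentages in the Figure 2 caption can be checked numerically. The sketch below is an illustration, not the authors' code: it assumes the scaled average of a department of size N is normal with variance 1 + (N − 1)ρ (the variance expression quoted among the extracted references further down), and the function names are hypothetical. Under those assumptions it should reproduce the caption's values to two decimals.

```python
from math import erf, sqrt


def phi_cdf(x: float) -> float:
    """Standard normal CDF, written with the error function."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))


def naive_index(z: float) -> float:
    """Index obtained by treating the scaled average as N(0, 1)."""
    return phi_cdf(z)


def corrected_index(z: float, n: int, rho: float) -> float:
    """Same probability after dividing z by sqrt(1 + (n - 1) * rho).

    Assumption: the scaled average has variance 1 + (n - 1) * rho,
    where rho is the average pairwise intra-departmental correlation.
    """
    return phi_cdf(z / sqrt(1.0 + (n - 1) * rho))


# Settings taken from the Figure 2 caption: rho = 0.05, N_A = 75, N_B = 150.
for label, size in (("A", 75), ("B", 150)):
    for z in (2.0, -2.0):
        naive = 100 * naive_index(z)
        corrected = 100 * corrected_index(z, size, 0.05)
        print(f"dept {label}, z = {z:+.0f}: naive {naive:.2f}%, corrected {corrected:.2f}%")
```

The two departments receive the same naive value, while the corrected probabilities differ because the larger department's scaled average has a larger variance.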

Original abstract

In this paper, we consider the academic department ranking system of Italy, which is based on a performance index named Indice Standardizzato di Performance Dipartimentale (ISPD). While critiques to the ISPD have been moved for its marked tendency to polarization, we here formalize a yet unexplored determinant of this phenomenon, that is, the presence of within-department homogeneity among the standardized scores used to build the index. We account for this intra-departmental correlation by modeling it as a function of departments' size. The proposed model, estimated via Maximum Likelihood, allows to build a fairer ranking procedure via the definition of a properly adjusted version of the ISPD. The estimation framework is also adapted to fit publicly available data, which are coarsened by rounding and/or left-truncated. To this end, a novel probability distribution termed Betoidal is introduced. Empirical evidence in favor of the proposed model is found in the 2017 and 2022 data. Moreover, a simulation study shows that the adjusted index significantly overcomes not only the original ISPD, but also other more data-demanding competing proposals.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

The paper's main contribution is a size-dependent model for within-department score correlation plus a new Betoidal distribution to handle coarsened ISPD data, used to produce an adjusted ranking.

The core advance here is treating intra-department homogeneity in standardized scores as a parametric function of department size, then using maximum likelihood to adjust the Italian ISPD index accordingly. They also introduce the Betoidal distribution to accommodate the rounding and left-truncation in the publicly released data. Both the 2017 and 2022 datasets show support for the size-dependent correlation, and the simulations indicate the adjusted index recovers rankings better than the raw ISPD or some competing approaches under the assumed data-generating process.

Referee Report

3 major / 3 minor

Summary. The paper proposes modeling within-department homogeneity in standardized research performance scores as a parametric function of department size. It introduces a novel 'Betoidal' distribution to handle rounding and left-truncation in publicly available ISPD data, estimates the model by maximum likelihood, and defines an adjusted ISPD intended to produce fairer department rankings. Empirical fits are reported on 2017 and 2022 Italian data, and a simulation study claims the adjusted index outperforms the original ISPD and other competing proposals.

Significance. If the modeling assumptions prove robust, the work supplies a statistically principled correction for size-induced polarization bias in research rankings. The adaptation of the likelihood to coarsened data via the Betoidal distribution addresses a practical constraint of public indices, and the simulation evidence of improved ranking performance could inform policy adjustments in academic evaluation systems.

major comments (3)

§3 (Betoidal distribution): the novel distribution is introduced to encode rounding and left-truncation, yet no comparison to a standard truncated-beta likelihood, no residual diagnostics, and no sensitivity checks to the rounding mechanism are provided; this directly affects the reliability of the ML parameter estimates used for the adjusted ISPD.
§5 (Empirical application): the adjusted ISPD is constructed from parameters estimated on the identical 2017 and 2022 datasets used to evaluate its 'fairer' property, so the reported improvement is internal to the fitted model rather than benchmarked against an external criterion.
§6 (Simulation study): superiority is demonstrated under data generated from the proposed model; no robustness results are shown when the size-dependent correlation function is misspecified or when intra-department homogeneity follows a different structure, which is central to the claim that the adjustment removes polarization bias.

minor comments (3)

Notation for the size-dependent correlation parameters should be defined once and used consistently; currently it varies between the model equations and the estimation section.
Add a brief discussion of related literature on intra-cluster correlation adjustments in ranking indices or performance metrics.
The simulation tables would benefit from reporting coverage probabilities or bias metrics in addition to ranking accuracy to strengthen the comparison.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the constructive and detailed comments. We address each major point below, indicating where revisions will be made to strengthen the manuscript.

Referee: §3 (Betoidal distribution): the novel distribution is introduced to encode rounding and left-truncation, yet no comparison to a standard truncated-beta likelihood, no residual diagnostics, and no sensitivity checks to the rounding mechanism are provided; this directly affects the reliability of the ML parameter estimates used for the adjusted ISPD.

Authors: We agree that additional validation of the Betoidal distribution is warranted to support the reliability of the ML estimates. In the revised manuscript we will add: (i) a direct comparison of the Betoidal likelihood against a standard truncated-beta specification on the same data, (ii) residual diagnostics (e.g., PIT histograms and QQ plots adapted for the coarsened support), and (iii) sensitivity analyses varying the rounding interval and truncation threshold. These results will be reported in an expanded subsection of §3.
revision: yes

Referee: §5 (Empirical application): the adjusted ISPD is constructed from parameters estimated on the identical 2017 and 2022 datasets used to evaluate its 'fairer' property, so the reported improvement is internal to the fitted model rather than benchmarked against an external criterion.

Authors: We acknowledge that the empirical application evaluates the adjusted index on the same data used for estimation, rendering the reported reduction in polarization an in-sample illustration rather than an external validation. The primary evidence for superiority remains the simulation study in §6. In revision we will explicitly clarify this distinction in §5, emphasize that the empirical section demonstrates practical consequences of the adjustment on real rankings, and add a brief discussion of the simulation results as the out-of-sample benchmark.
revision: partial

Referee: §6 (Simulation study): superiority is demonstrated under data generated from the proposed model; no robustness results are shown when the size-dependent correlation function is misspecified or when intra-department homogeneity follows a different structure, which is central to the claim that the adjustment removes polarization bias.

Authors: The simulation study is intentionally conducted under the data-generating process implied by our model to isolate the performance of the adjustment when assumptions hold. We recognize that robustness to misspecification is important for the broader claim. In the revised version we will include additional simulation scenarios in which the intra-department correlation is generated from alternative structures (constant correlation, random-effects, and non-monotonic size dependence) and report the resulting ranking performance of the adjusted ISPD under these misspecifications.
revision: yes
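One of the alternative structures named in the last response, constant correlation, is simple to simulate. The sketch below is only an illustration under stated assumptions and is not the authors' simulation design: standardized scores are generated from a shared department-level normal component plus independent noise, so every pair of scores has the same correlation ρ, and the empirical variance of the scaled average is compared with 1 + (N − 1)ρ. The function names, seed, and replication count are hypothetical.

```python
import random
from math import sqrt
from statistics import pvariance


def scaled_average(n: int, rho: float, rng: random.Random) -> float:
    """One draw of sum(Z_i) / sqrt(n) with Corr(Z_i, Z_j) = rho for i != j."""
    shared = rng.gauss(0.0, 1.0)  # component common to the whole department
    total = 0.0
    for _ in range(n):
        # Each score has unit variance: rho from the shared part, 1 - rho from noise.
        total += sqrt(rho) * shared + sqrt(1.0 - rho) * rng.gauss(0.0, 1.0)
    return total / sqrt(n)


def simulated_variance(n: int, rho: float, reps: int = 4000, seed: int = 0) -> float:
    """Monte Carlo estimate of the variance of the scaled average."""
    rng = random.Random(seed)
    draws = [scaled_average(n, rho, rng) for _ in range(reps)]
    return pvariance(draws)


for size in (75, 150):
    theory = 1 + (size - 1) * 0.05
    print(size, round(simulated_variance(size, 0.05), 2), "theory:", round(theory, 2))
```

With a fixed ρ the variance grows linearly in department size, which is the mechanism behind the polarization the paper discusses; a size-dependent ρ would change that growth rate.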

Circularity Check

1 step flagged

Adjusted ISPD fairness defined via ML fit on the same data used for ranking evaluation

specific steps

fitted input called prediction [Abstract]
"The proposed model, estimated via Maximum Likelihood, allows to build a fairer ranking procedure via the definition of a properly adjusted version of the ISPD. [...] a simulation study shows that the adjusted index significantly overcomes not only the original ISPD, but also other more data-demanding competing proposals."

The adjusted ISPD is defined using parameters estimated on the identical data to which it is applied; its claimed superiority (both empirical and in simulation) therefore reduces to the consequences of that fit rather than an external criterion for fairness.

full rationale

The paper's core claim is that the ML-estimated model yields a 'fairer' adjusted ISPD. This adjustment is constructed directly from parameters fitted to the 2017/2022 data, and the simulation study demonstrating superiority is performed under the model's own assumptions (intra-department correlation as a function of size, Betoidal for coarsening). While the simulation offers a partial external check, the 'fairer' property lacks an independent benchmark outside the fitted model, producing moderate circularity of the fitted-input-called-prediction type. No load-bearing self-citation and no self-definitional reduction of the full derivation chain is present.

Axiom & Free-Parameter Ledger

1 free parameter · 1 axiom · 1 invented entity

The central claim rests on the assumption that department-level homogeneity is a smooth function of size and that the Betoidal distribution correctly captures the data coarsening process; both are introduced without external validation.

free parameters (1)

size-dependent correlation parameters
Estimated by maximum likelihood from the 2017 and 2022 datasets; their values determine the adjusted ISPD.

axioms (1)

domain assumption: Standardized researcher scores within a department exhibit correlation that depends only on department size
Invoked to justify the model specification in the abstract.

invented entities (1)

Betoidal distribution (no independent evidence)
purpose: To model left-truncated and rounded performance scores
Newly defined probability distribution introduced to fit the coarsened public data.
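To make the role of the coarsening concrete, here is a minimal sketch of a likelihood for rounded and left-truncated scores. It rests on an assumption that this page does not state outright: that a Betoidal(σ) variable is X = Φ(Z) with Z ~ N(0, σ²), which is what the integrand in the extracted moment formula suggests. The rounding step, the truncation point, the sample scores, and all function names are hypothetical, and scores are placed on the unit scale.

```python
from math import log
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal: provides Phi and its inverse


def betoidal_cdf(x: float, sigma: float) -> float:
    """CDF of X = Phi(Z), Z ~ N(0, sigma^2) (assumed definition)."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    return STD_NORMAL.cdf(STD_NORMAL.inv_cdf(x) / sigma)


def rounded_lt_loglik(values, sigma, x_star=0.0, step=0.01):
    """Log-likelihood of values rounded to a grid of width `step`
    and observed only above the truncation point `x_star`."""
    tail_mass = 1.0 - betoidal_cdf(x_star, sigma)
    total = 0.0
    for r in values:
        lower = max(x_star, r - step / 2)  # rounding cell, clipped to the support
        upper = min(1.0, r + step / 2)
        cell = betoidal_cdf(upper, sigma) - betoidal_cdf(lower, sigma)
        total += log(cell / tail_mass)
    return total


# Hypothetical rounded scores, truncated below 0.5 (not real ISPD data).
scores = [0.55, 0.62, 0.71, 0.88, 0.93, 0.97, 1.00]
grid = [0.5 + 0.25 * k for k in range(11)]  # candidate sigma values 0.5 ... 3.0
sigma_hat = max(grid, key=lambda s: rounded_lt_loglik(scores, s, x_star=0.5))
print("grid-search estimate of sigma:", sigma_hat)
```

A grid search is used only to keep the sketch dependency-free; a numerical optimizer would be the natural choice in practice.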

Reference graph

Works this paper leans on

5 extracted references · 5 canonical work pages

[1]

Elenco dei Dipartimenti ammessi alla selezione dei Dipartimenti di eccellenza 2018 – 2022 (link)

MUR. Elenco dei Dipartimenti ammessi alla selezione dei Dipartimenti di eccellenza 2018 – 2022 (link). Technical report,

work page 2018
[2]

Elenco dei Dipartimenti ammessi alla selezione dei Dipartimenti di eccellenza 2023 – 2027 (link)

MUR. Elenco dei Dipartimenti ammessi alla selezione dei Dipartimenti di eccellenza 2023 – 2027 (link). Technical report,

work page 2023
[3]

Poggi and C

G. Poggi and C. A. Nappi. Il voto standardizzato per l’esercizio VQR 2004-2010. RIV: rassegna italiana di valutazione, 59(2):34–58,

work page 2004
[4]

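Letting $\rho = \frac{1}{N(N-1)} \sum_{i \in \mathcal{N}} \sum_{i' \in \mathcal{N} \setminus \{i\}} \rho_{ii'}$ be the average pairwise intra-departmental correlation, we have $V(\tilde{Z}) = 1 + (N-1)\rho$

The variance of the scaled average $\tilde{Z} = \sum_{i \in \mathcal{N}} Z_i / \sqrt{N}$ is given by

$$V(\tilde{Z}) = \frac{1}{N} \sum_{i \in \mathcal{N}} V(Z_i) + \frac{1}{N} \sum_{i \in \mathcal{N}} \sum_{i' \in \mathcal{N} \setminus \{i\}} \mathrm{Cov}(Z_i, Z_{i'}) = \frac{1}{N} \left[ N + \sum_{i \in \mathcal{N}} \sum_{i' \in \mathcal{N} \setminus \{i\}} \rho_{ii'} \right],$$

where $\rho_{ii'} = \mathrm{Corr}(Z_i, Z_{i'})$. Letting $\rho = \frac{1}{N(N-1)} \sum_{i \in \mathcal{N}} \sum_{i' \in \mathcal{N} \setminus \{i\}} \rho_{ii'}$ be the average pairwise intra-departmental correlation, we have $V(\tilde{Z}) = 1 + (N-1)\rho$. Hence, unless $\rho$ decreases sufficiently ...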

work page 2011
[5]

Equivalently, E(X

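$= \int_{-\infty}^{+\infty} \{\Phi(z)\}^2 \, \frac{1}{\sigma} \, \phi\!\left(\frac{z}{\sigma}\right) dz = \int_{-\infty}^{+\infty} \{\Phi(k\sigma)\}^2 \, \phi(k) \, dk = \frac{1}{\pi} \operatorname{atan} \sqrt{1 + 2\sigma^2}$, where the last equality follows from identities reported in Owen (1980). Equivalently, E(X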

work page 1980
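The closed form in entry [5] can be tied back to the Figure 4 caption with a few lines of arithmetic. The sketch below assumes that the integral is the second moment of X = Φ(Z) with Z ~ N(0, σ²), and that X has mean 1/2 by symmetry; both are inferences from the integrand, not statements found in the excerpt. It then solves for the shape a of a Beta(a, a) variable with the same variance, using the standard fact that Beta(a, a) has variance 1 / (4(2a + 1)). Function names are hypothetical.

```python
from math import atan, pi, sqrt


def betoidal_second_moment(sigma: float) -> float:
    """Closed form (1 / pi) * atan(sqrt(1 + 2 * sigma^2)) from entry [5]."""
    return atan(sqrt(1.0 + 2.0 * sigma ** 2)) / pi


def betoidal_variance(sigma: float) -> float:
    """Second moment minus squared mean, assuming the mean is 1/2."""
    return betoidal_second_moment(sigma) - 0.25


def matching_beta_shape(sigma: float) -> float:
    """Shape a such that Beta(a, a) has the same variance.

    Var[Beta(a, a)] = 1 / (4 * (2a + 1)), so a = (1 / (4 v) - 1) / 2.
    """
    v = betoidal_variance(sigma)
    return (1.0 / (4.0 * v) - 1.0) / 2.0


print(round(betoidal_second_moment(1.0), 4))  # sigma = 1 gives the uniform case
print(round(matching_beta_shape(0.5), 4))     # compare with panel (a) of Figure 4
```

For σ = 1 the second moment is 1/3, as expected for a uniform variable, and for σ = 0.5 the matched shape agrees with the value a = 3.4005 reported for panel (a) of Figure 4, which lends some support to the assumed reading of the formula.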
