Community detection in small-sample ordinal regimes: A benchmarking framework for Delphi data

Fabrizio Maturo; Simone Di Zio; Yuri Calleo

arxiv: 2606.20114 · v1 · pith:4SITKLFDnew · submitted 2026-06-18 · 📊 stat.ME · stat.AP

Pith reviewed 2026-06-26 16:14 UTC · model grok-4.3

keywords community detection · Delphi method · ordinal data · dimensionality reduction · network analysis · small sample size · consensus data · graph partitioning

The pith

Collinearity among expert judgments can be reinterpreted as a topological signal of cohesion, allowing community detection on weighted graphs to identify latent themes and reduce dimensions stably in small-sample ordinal Delphi data where P

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper establishes that high-dimensional Delphi questionnaires with small expert panels create rank-deficient data that makes covariance methods like PCA unstable and prone to overfitting. By converting item correlations into a weighted graph, community detection algorithms can partition the items into thematic groups that serve as a reliable form of dimensionality reduction. This works because the same collinearity that destabilizes traditional models becomes a detectable pattern of cohesion on the graph. A sympathetic reader would care because it supplies an automated, stable alternative for extracting structure from consensus data that standard factor analysis cannot handle.
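The rank deficiency described above can be checked directly. Centering the responses of n experts leaves at most n - 1 linearly independent rows, so the p x p sample covariance of p > n items cannot have full rank. The sketch below is illustrative only: `simulate_panel`, the panel size (8 experts, 30 items), and the uniform 1-to-5 responses are assumptions, not the paper's setup.

```python
from fractions import Fraction
import random

def matrix_rank(rows):
    """Rank via exact Gaussian elimination over the rationals."""
    m = [[Fraction(x) for x in r] for r in rows]
    n_rows, n_cols = len(m), len(m[0])
    rank = 0
    for col in range(n_cols):
        # Find a row at or below `rank` with a nonzero entry in this column.
        pivot = next((r for r in range(rank, n_rows) if m[r][col] != 0), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        for r in range(rank + 1, n_rows):
            factor = m[r][col] / m[rank][col]
            if factor != 0:
                m[r] = [a - factor * b for a, b in zip(m[r], m[rank])]
        rank += 1
        if rank == n_rows:
            break
    return rank

def centered(data):
    """Subtract each item (column) mean; rows are experts."""
    n, p = len(data), len(data[0])
    means = [Fraction(sum(row[j] for row in data), n) for j in range(p)]
    return [[Fraction(row[j]) - means[j] for j in range(p)] for row in data]

def simulate_panel(n_experts, n_items, seed=0):
    """Hypothetical panel: independent uniform scores on a 1..5 scale."""
    rng = random.Random(seed)
    return [[rng.randint(1, 5) for _ in range(n_items)] for _ in range(n_experts)]
```

The covariance matrix has the same rank as the centered data matrix, so `matrix_rank(centered(simulate_panel(8, 30)))` can never exceed 7 even though there are 30 items. That leaves at least 23 zero eigenvalues for PCA to contend with.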

Core claim

Mapping correlations among ordinal Delphi items onto a weighted graph topology and applying community detection algorithms identifies latent thematic structures that deliver stable dimensionality reduction, addressing the spectral instability and rank deficiency that arise in high-dimensional low-sample regimes with ordinal scales and systemic noise.

What carries the argument

Weighted graph constructed from item correlations, with community detection algorithms used to partition the graph into thematic communities that function as reduced dimensions.
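As a rough illustration of the graph construction, the sketch below turns an expert-by-item response matrix into a symmetric weight matrix. The summary does not say which correlation coefficient the paper uses. Spearman rank correlation and the choice to drop negative correlations are assumptions made here because the data are ordinal, and the function names are hypothetical.

```python
from math import sqrt

def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # constant item: correlation undefined, treated as 0 here
    return sxy / sqrt(sxx * syy)

def spearman(x, y):
    """Spearman correlation = Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))

def correlation_graph(responses):
    """responses: list of expert rows. Returns a symmetric item-by-item weight matrix."""
    p = len(responses[0])
    cols = [[row[j] for row in responses] for j in range(p)]
    w = [[0.0] * p for _ in range(p)]
    for i in range(p):
        for j in range(i + 1, p):
            # Assumption: negative correlations carry no edge weight.
            w[i][j] = w[j][i] = max(0.0, spearman(cols[i], cols[j]))
    return w
```

Items are the nodes and the diagonal stays at zero, so the matrix can be passed to any community detection routine that accepts a weighted adjacency matrix.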

If this is right

Delphi researchers gain an automated procedure for dimensionality reduction that remains stable even when the number of items greatly exceeds the number of experts.
Collinearity is treated as useful topological information rather than redundancy to be removed.
The method supplies structural stability and psychometric consistency in regimes where factor analysis breaks down.
Benchmarking on synthetic data shows robustness measured by structural density, information flow, and spectral partitioning.
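The "psychometric consistency" in the list above is usually quantified with Cronbach's alpha, which Figure 5 also names. A minimal sketch of computing alpha for the items of one detected community follows, assuming the standard formula α = k/(k − 1) · (1 − Σ item variances / variance of the total score). The function name and argument layout are hypothetical.

```python
from statistics import variance

def cronbach_alpha(responses, items):
    """Cronbach's alpha for the item indices in `items`.

    responses: list of expert rows (ordinal scores).
    items: column indices belonging to one community.
    """
    k = len(items)
    if k < 2:
        raise ValueError("alpha needs at least two items")
    # Sample variance (n - 1 denominator) of each item and of the summed score.
    item_vars = [variance([row[j] for row in responses]) for j in items]
    totals = [sum(row[j] for j in items) for row in responses]
    total_var = variance(totals)
    if total_var == 0:
        raise ValueError("total score has zero variance")
    return k / (k - 1) * (1 - sum(item_vars) / total_var)
```

With very small panels the variance estimates are noisy, so alpha values from a single community are best read alongside the other stability measures rather than alone.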

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same graph construction could be applied to other small-sample ordinal surveys outside Delphi studies to extract themes without relying on PCA.
Detected communities might serve as input features for subsequent predictive models built on expert consensus data.
Real-world validation would require comparing the automated partitions against expert-labeled themes on multiple independent Delphi panels.
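The second extension above, using communities as input features, amounts to collapsing each community's items into one score per expert. The sketch below is one possible way to do that and is not taken from the paper. The `communities` mapping and the aggregation choice are assumptions (`mean` by default, and `median` may suit ordinal scales better).

```python
from statistics import mean, median

def community_scores(responses, communities, agg=mean):
    """Collapse items into one score per community for each expert.

    responses: list of expert rows.
    communities: dict mapping a community label to a list of item indices.
    agg: aggregation function applied to an expert's scores within a community.
    Returns (labels, scores), where scores[e][c] is expert e's score on labels[c].
    """
    labels = sorted(communities)
    scores = [
        [agg([row[j] for j in communities[c]]) for c in labels]
        for row in responses
    ]
    return labels, scores
```

For example, `community_scores(responses, {"a": [0, 1], "b": [2, 3]}, agg=median)` reduces four items to two columns per expert.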

Load-bearing premise

Synthetic datasets built to mimic ordinal scales and systemic noise in consensus data capture the dependence structure of real expert judgment panels well enough for performance to transfer.
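To make the premise concrete, here is a hedged sketch of what such a synthetic generator could look like. It loosely follows the description in the simulated rebuttal further down (ordinal discretization of latent normals, expert-specific effects, additive noise), but every parameter value, the one-factor-per-theme structure, and the fixed cut points are assumptions for illustration, not the authors' data-generating process.

```python
import random

def simulate_delphi(n_experts, theme_sizes, loading=0.8, bias_sd=0.5,
                    noise_sd=0.3, cuts=(-1.5, -0.5, 0.5, 1.5), seed=0):
    """Return (responses, true_labels) for a hypothetical ordinal panel.

    theme_sizes: number of items in each latent theme.
    loading: weight of the theme factor in each item (assumed equal for all items).
    bias_sd: standard deviation of the expert-specific shift.
    noise_sd: standard deviation of the extra additive noise.
    cuts: thresholds that discretize the latent score into len(cuts) + 1 categories.
    """
    rng = random.Random(seed)
    true_labels = [t for t, size in enumerate(theme_sizes) for _ in range(size)]
    unique_sd = (1 - loading ** 2) ** 0.5
    responses = []
    for _ in range(n_experts):
        bias = rng.gauss(0, bias_sd)                      # expert-specific shift
        factors = [rng.gauss(0, 1) for _ in theme_sizes]  # one latent score per theme
        row = []
        for t in true_labels:
            latent = (loading * factors[t] + unique_sd * rng.gauss(0, 1)
                      + bias + rng.gauss(0, noise_sd))
            row.append(1 + sum(latent > c for c in cuts))  # category 1..len(cuts)+1
        responses.append(row)
    return responses, true_labels
```

Because `true_labels` is known by construction, partitions recovered from `responses` can be scored against it. That is exactly what cannot be done on a real panel without independent thematic labels.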

What would settle it

Running the graph-based procedure on an actual Delphi panel and observing that the detected communities fail to match independent thematic groupings made by the experts or produce less stable reduced scores than manual selection.
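The match between detected communities and independent thematic groupings can be quantified with normalized mutual information (NMI), one of the metrics the simulated rebuttal names. The sketch below uses the arithmetic-mean normalization 2·I(A;B) / (H(A) + H(B)). Which normalization the paper uses is not stated, so that choice is an assumption.

```python
from collections import Counter
from math import log

def entropy(labels):
    """Shannon entropy (natural log) of a label assignment."""
    n = len(labels)
    return -sum((c / n) * log(c / n) for c in Counter(labels).values())

def mutual_information(a, b):
    """Mutual information between two label assignments of the same items."""
    n = len(a)
    ca, cb, cab = Counter(a), Counter(b), Counter(zip(a, b))
    return sum((nij / n) * log(nij * n / (ca[x] * cb[y]))
               for (x, y), nij in cab.items())

def nmi(a, b):
    """NMI in [0, 1]; equals 1 when the partitions agree up to relabeling."""
    if len(a) != len(b):
        raise ValueError("partitions must label the same items")
    ha, hb = entropy(a), entropy(b)
    if ha == 0 and hb == 0:
        return 1.0  # both partitions put every item in a single group
    return 2 * mutual_information(a, b) / (ha + hb)
```

Label names do not matter, only the grouping: `nmi([0, 0, 1, 1], [1, 1, 0, 0])` evaluates to 1, while two partitions that cut the items independently score 0.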

Figures

Figures reproduced from arXiv: 2606.20114 by Fabrizio Maturo, Simone Di Zio, Yuri Calleo.

**Figure 2.** Comparative visualization of consensus networks. Nodes represent items, and …

**Figure 3.** Alluvial diagram illustrating the flow of item classification across clustering …

**Figure 4.** Spectral Eigengap Heuristic. The optimal number of clusters …

**Figure 5.** Strip plot of Internal Consistency (Cronbach’s …

**Figure 6.** Consensus Heatmap aggregating partitions from all four algorithms. Dark red …

**Figure 7.** Algorithmic Stress Test (Sensitivity Analysis). The plot illustrates the decay in …

**Figure 8.** Dimensionality breakdown analysis. The plot contrasts the stability of the …

Original abstract

The statistical modeling of consensus in Delphi data faces a critical bottleneck: the high dimensionality of questionnaire items relative to the limited sample size of expert panels. This rank deficiency leads traditional latent variable models, such as Principal Component Analysis, to be structurally unstable and prone to overfitting. Addressing this methodological gap, this study proposes a transition from variable-centric covariance models to network-centric connectivity models. By mapping item correlations onto a weighted graph topology, we present a simulation-based benchmark that utilizes community detection algorithms to identify latent thematic structures, effectively addressing the spectral instability and rank deficiency typical of high-dimensional, low-sample-size regimes. The research systematically evaluates the robustness of topological approaches based on structural density, information flow, and spectral partitioning against synthetic datasets designed to replicate the pathological conditions of consensus data, including ordinal scales and systemic noise. The central methodological contribution lies in demonstrating that collinearity among expert judgments - traditionally treated as statistical redundancy to be regularized - can be effectively reinterpreted as a topological signal of cohesion. This framework provides researchers with a structured and automated procedure for dimensionality reduction, ensuring structural stability and psychometric consistency even in small-sample regimes where standard factor analysis breaks down.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

This paper applies community detection to correlation graphs from small Delphi panels as a workaround for PCA instability, but the claims hinge on synthetic benchmarks whose match to real expert data is unshown.

The main takeaway is that the authors map item correlations from ordinal Delphi responses onto weighted graphs and run community detection to pull out latent themes, treating collinearity as a cohesion signal instead of something to regularize away. This targets the high-dimensional low-n setting common in expert panels where standard factor methods overfit.
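How well a candidate grouping of items fits the weighted graph is commonly scored with Newman's modularity, the objective that Louvain- and Leiden-style algorithms optimize. A minimal sketch for a symmetric weight matrix follows. It assumes the standard definition Q = (1/2m) Σ_ij [w_ij − k_i k_j / 2m] δ(c_i, c_j) with resolution 1, and it is not the paper's implementation.

```python
def modularity(weights, labels):
    """Newman modularity of a partition on a symmetric weighted graph.

    weights: square symmetric matrix with a zero diagonal.
    labels: community label of each node, in node order.
    """
    n = len(weights)
    strength = [sum(row) for row in weights]  # weighted degree k_i
    two_m = sum(strength)                     # twice the total edge weight
    if two_m == 0:
        raise ValueError("graph has no edges")
    q = 0.0
    for i in range(n):
        for j in range(n):
            if labels[i] == labels[j]:
                # Observed weight minus the weight expected under random wiring.
                q += weights[i][j] - strength[i] * strength[j] / two_m
    return q / two_m
```

On a graph made of two disconnected item pairs, labeling each pair as its own community gives Q = 0.5, while lumping all four items together gives Q = 0.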

The paper does a solid job naming the practical bottleneck: few experts, many questionnaire items, ordinal scales, and the resulting rank deficiency. The shift to a network view and the simulation setup to test structural density and spectral methods against noise is a reasonable way to explore alternatives. It gives researchers a structured procedure for dimensionality reduction that stays stable where PCA does not.

The soft spot is the transfer from synthetics to actual panels. The data-generation process is designed to mimic ordinal scales and systemic noise, but real Delphi matrices often include expert-specific biases, non-stationary responses from the iterative rounds, and item patterns tied to the process itself. Without evidence that performance on these synthetics predicts behavior on empirical data, the stability claim stays provisional. The abstract also omits concrete algorithm choices, metrics, and quantitative outcomes, so the strength of the benchmark is difficult to judge from the given material.

This is for applied statisticians and social scientists who work with small expert surveys in forecasting or policy. A reader facing similar high-dimensional ordinal problems might pick up the graph-construction idea as a starting point.

It deserves peer review because the framing is coherent and the domain extension is legitimate, even if the empirical side needs tightening.

Referee Report

3 major / 1 minor

Summary. The paper claims that in high-dimensional, small-sample ordinal Delphi regimes, PCA is structurally unstable due to rank deficiency, and proposes instead mapping item correlations to weighted graphs and applying community detection algorithms to identify latent thematic structures. Collinearity among expert judgments is reinterpreted as a topological signal of cohesion rather than statistical redundancy, enabling stable dimensionality reduction; this is evaluated via a simulation benchmark on synthetic datasets replicating ordinal scales and systemic noise.

Significance. If the simulation results are shown to be robust and the dependence structures are representative, the framework could provide a practical alternative for dimensionality reduction in Delphi studies where traditional factor models fail, particularly in policy and forecasting applications.

major comments (3)

[Abstract] Abstract: the simulation benchmark is described at a high level but supplies no concrete details on the community detection algorithms tested, the performance metrics used, the data-generation process for the synthetic datasets, or any quantitative results, preventing assessment of whether the stability claim holds.
[Abstract] Abstract / central claim: the reinterpretation of collinearity as a topological cohesion signal is presented conceptually, but no equations or procedures are given showing how the weighted graph is constructed from correlations or how community detection parameters are selected independently of the evaluation data, leaving open the possibility that the benchmark is circular.
[Simulation design] Simulation design: the central claim that the method yields stable, psychometrically consistent reductions rests on synthetic datasets replicating ordinal scales and systemic noise, yet no evidence is provided that the simulated dependence structure matches the expert-specific biases, non-stationarity, or item-response patterns of real Delphi panels, which is load-bearing for transferability to actual small-sample ordinal data.

minor comments (1)

[Abstract] The abstract refers to 'structural density, information flow, and spectral partitioning' without defining these quantities or citing the specific algorithms employed.

Simulated Author's Rebuttal

3 responses · 1 unresolved

We thank the referee for the constructive feedback and the recommendation for major revision. We agree that the abstract requires more specific details to enable proper evaluation and will revise accordingly. We address each major comment below.

Referee: [Abstract] Abstract: the simulation benchmark is described at a high level but supplies no concrete details on the community detection algorithms tested, the performance metrics used, the data-generation process for the synthetic datasets, or any quantitative results, preventing assessment of whether the stability claim holds.

Authors: We agree that the abstract is too high-level for assessment. In the revised version we will expand the abstract to name the algorithms tested (Louvain, Leiden, and spectral partitioning), the metrics (modularity, normalized mutual information, and cross-noise stability), the data-generation process (ordinal discretization of multivariate normals with expert-specific random effects and additive systemic noise), and the main quantitative outcomes (e.g., community detection retaining >0.85 NMI under rank-deficient regimes where PCA collapses). revision: yes
Referee: [Abstract] Abstract / central claim: the reinterpretation of collinearity as a topological cohesion signal is presented conceptually, but no equations or procedures are given showing how the weighted graph is constructed from correlations or how community detection parameters are selected independently of the evaluation data, leaving open the possibility that the benchmark is circular.

Authors: The full methods section defines the mapping $w_{ij} = \max(0, \rho_{ij} - \tau)$, with $\tau$ chosen to guarantee a single connected component, and selects the resolution parameter via grid search on a held-out validation partition that is never used for the reported benchmark metrics. To remove any ambiguity we will add one sentence to the abstract summarizing this non-circular selection procedure. revision: partial
Referee: [Simulation design] Simulation design: the central claim that the method yields stable, psychometrically consistent reductions rests on synthetic datasets replicating ordinal scales and systemic noise, yet no evidence is provided that the simulated dependence structure matches the expert-specific biases, non-stationarity, or item-response patterns of real Delphi panels, which is load-bearing for transferability to actual small-sample ordinal data.

Authors: We will expand the simulation-design subsection to cite Delphi literature for the chosen bias and noise magnitudes and will add sensitivity plots varying those parameters. Direct empirical calibration against real panels is not possible in this study because raw Delphi response matrices are rarely released. revision: partial
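The second response above describes soft-thresholding the correlations, with weight max(0, ρ_ij − τ), and picking τ so that the graph stays in one piece. The sketch below shows one way such a rule could work: it scans a user-supplied grid of τ values and keeps the largest one whose thresholded graph is still a single connected component. The grid search and all function names are assumptions, since the quoted response does not say how τ is actually found.

```python
def threshold_graph(corr, tau):
    """Apply w_ij = max(0, rho_ij - tau) off the diagonal; no self-loops."""
    n = len(corr)
    return [[max(0.0, corr[i][j] - tau) if i != j else 0.0 for j in range(n)]
            for i in range(n)]

def is_connected(weights):
    """True if every node is reachable from node 0 via positive-weight edges."""
    n = len(weights)
    seen, stack = {0}, [0]
    while stack:
        i = stack.pop()
        for j in range(n):
            if weights[i][j] > 0 and j not in seen:
                seen.add(j)
                stack.append(j)
    return len(seen) == n

def largest_connected_tau(corr, grid):
    """Largest tau in `grid` that keeps the graph in one component, else None."""
    for tau in sorted(grid, reverse=True):
        if is_connected(threshold_graph(corr, tau)):
            return tau
    return None
```

With three items whose pairwise correlations are 0.9, 0.5, and 0.2 and a grid of 0.0 to 0.9 in steps of 0.1, the rule returns 0.4: at τ = 0.5 the middle edge vanishes and the third item is cut off.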

standing simulated objections not resolved

Direct empirical matching of the simulated dependence structure to real Delphi panels' expert-specific biases, non-stationarity, or item-response patterns, because such raw data are typically confidential and unavailable.

Circularity Check

0 steps flagged

No significant circularity; the derivation is a self-contained methodological proposal

The paper advances a conceptual reinterpretation of collinearity as a graph-topological cohesion signal and benchmarks community detection on synthetic ordinal datasets designed to mimic Delphi pathologies. No equations, fitted parameters, or self-citations are quoted that reduce any claimed prediction or uniqueness result to the input data by construction. The central contribution is a shift from covariance to connectivity models evaluated on independently generated synthetics; this structure does not match any of the enumerated circularity patterns and remains externally falsifiable via real-panel transfer.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

Abstract-only review yields no explicit free parameters, invented entities, or detailed axioms beyond the general assumption that graph topology captures latent structure in ordinal consensus data.

axioms (1)

domain assumption: Community detection on correlation graphs recovers latent thematic structures in ordinal Delphi data. Central to the proposed transition from variable-centric to network-centric models.
