Revealing posturographic features associated with the risk of falling in patients with Parkinsonian syndromes via machine learning
Pith reviewed 2026-05-24 21:25 UTC · model grok-4.3
The pith
The ts-AUC multivariate test detects posturographic differences between fallers and non-fallers in Parkinsonian patients where standard multiple testing finds none.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The ts-AUC test showed a statistically significant difference (p-value = 0.01) between the faller and non-faller groups in the multidimensional posturographic feature space, while multiple testing with p-value adjustment did not; the difference was restricted to the open-eyes protocol, with fallers exhibiting increased antero-posterior movements and increased posturographic area.
What carries the argument
ts-AUC, the non-parametric multivariate two-sample test applied directly to the high-dimensional feature vectors derived from statokinesigrams.
If this is right
- The open-eyes Romberg protocol alone separates fallers from non-fallers on the chosen feature set.
- Fallers display reliably larger antero-posterior sway and overall posturographic area than non-fallers.
- Multivariate machine-learning tests can be treated as direct extensions of classical statistical tools for multifactorial clinical data.
- Standard univariate testing with correction can miss group differences that are visible when all features are considered jointly.
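The last point can be illustrated with a small sketch of a step-down p-value adjustment. The page does not say which adjustment the authors used, so the Holm procedure here is an assumption, and the raw p-values are hypothetical numbers chosen only to show how several individually modest effects can all disappear after correction.

```python
def holm_adjust(p_values):
    """Return Holm step-down adjusted p-values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # The k-th smallest p-value is multiplied by (m - k), capped at 1.
        candidate = min(1.0, (m - rank) * p_values[idx])
        # Enforce monotonicity of the adjusted values.
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted


# Hypothetical raw p-values for ten univariate feature tests (illustrative only).
raw = [0.012, 0.020, 0.031, 0.045, 0.08, 0.15, 0.22, 0.40, 0.61, 0.90]
adj = holm_adjust(raw)
n_raw_below_alpha = sum(p < 0.05 for p in raw)
n_adjusted_below_alpha = sum(p < 0.05 for p in adj)
```

In this toy case four raw p-values fall below 0.05 but none do after adjustment, which is the kind of outcome a joint multivariate test is meant to avoid.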
Where Pith is reading between the lines
- The identified features could be tested as inputs to prospective fall-prediction models if longitudinal outcome data become available.
- The same ts-AUC approach could be applied to other high-dimensional sensor recordings in neurology where many correlated variables are collected simultaneously.
- The eyes-open condition may expose compensatory strategies that are masked when vision is removed.
Load-bearing premise
The clinical classification of patients into fallers versus non-fallers is accurate and independent of the posturographic measurements.
What would settle it
An independent replication using the identical ts-AUC procedure on statokinesigram recordings from a new cohort of Parkinsonian patients, grouped by the same clinical criteria, that yields a non-significant result at the reported threshold.
Original abstract
Falling in Parkinsonian syndromes (PS) is associated with postural instability and constitutes a common cause of disability among PS patients. Current posturographic practices record the body's center-of-pressure displacement (statokinesigram) while the patient stands on a force platform. Statokinesigrams, after appropriate signal processing, can offer numerous posturographic features, which, however, challenges efforts to obtain valid statistics via standard univariate approaches. In this work, we present the ts-AUC, a non-parametric multivariate two-sample test, which we employ to analyze statokinesigram differences between PS patients who are fallers (PSF) and non-fallers (PSNF). We included 123 PS patients who were classified as PSF or PSNF based on clinical assessment and underwent a simple Romberg test (eyes open/eyes closed). We analyzed posturographic features using both multiple testing with p-value adjustment and the ts-AUC. While the ts-AUC showed a significant difference between groups (p-value = 0.01), multiple testing did not show any such difference. Interestingly, a significant difference between the two groups was found only using the open-eyes protocol. PSF showed significantly increased antero-posterior movements as well as increased posturographic area, compared to PSNF. Our study demonstrates the superiority of the ts-AUC test compared to standard statistical tools in distinguishing PSF and PSNF in the multidimensional feature space. This result highlights more generally the fact that machine learning-based statistical tests can be seen as a natural extension of classical statistical approaches and should be considered, especially when dealing with multifactorial assessments.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript introduces ts-AUC, a non-parametric multivariate two-sample test, and applies it to posturographic features extracted from statokinesigrams of 123 Parkinsonian syndrome patients classified as fallers (PSF) or non-fallers (PSNF) via clinical assessment. Using the Romberg test (eyes open/closed), the authors report that ts-AUC detects a significant group difference (p=0.01) in the open-eyes protocol, where multiple testing with p-value adjustment finds none, with PSF showing increased antero-posterior movements and posturographic area; they conclude that ts-AUC is superior to standard univariate tools for multidimensional posturographic data.
Significance. If the type-I error control of ts-AUC can be established for high-dimensional correlated features, the approach could usefully extend classical statistics to clinical settings where multiple comparisons obscure group differences. The empirical observation that significance appears only in the open-eyes condition and aligns with known postural instability markers is potentially actionable for fall-risk assessment, but the absence of methodological validation limits the strength of this contribution.
major comments (3)
- [Abstract] Abstract: the headline claim that ts-AUC yields p=0.01 (while adjusted univariate testing does not) is load-bearing for the superiority argument, yet the abstract supplies neither the definition of the ts-AUC statistic, its null distribution, nor any Monte-Carlo calibration on data whose covariance matches the empirical 123-patient feature set; without this, the reported p-value cannot be interpreted as calibrated evidence of a genuine difference.
- [Methods] Methods (patient classification and feature extraction): the two-sample framing assumes that the clinical faller/non-faller label is independent of the posturographic features and free of leakage; no discussion or sensitivity analysis addresses possible dependence between the label and the statokinesigram-derived variables.
- [Abstract] Abstract and Results: no sample-size justification or power calculation is provided for the 123-patient cohort, nor is it shown that the open-eyes versus closed-eyes contrast survives any form of multiplicity correction across the two protocols.
minor comments (2)
- [Abstract] The abstract refers to ts-AUC as both a 'non-parametric multivariate two-sample test' and a 'machine learning-based statistical test'; the precise relationship to standard ML classifiers or kernel methods should be clarified.
- The description of 'increased posturographic area' and 'increased antero-posterior movements' would benefit from explicit feature definitions or references to the processing pipeline used to derive them.
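To make the second minor comment concrete, here is a minimal sketch of how the two named features are often computed from a centre-of-pressure trajectory. The definitions are assumptions, since the page does not give the authors' own: antero-posterior movement is taken as the peak-to-peak excursion, and posturographic area as the 95% confidence ellipse under a bivariate normal approximation. The function names `ap_range` and `ellipse_area_95` are hypothetical.

```python
import math


def ap_range(cop_ap):
    """Peak-to-peak antero-posterior excursion (same unit as the input)."""
    return max(cop_ap) - min(cop_ap)


def ellipse_area_95(cop_ml, cop_ap):
    """Area of the 95% confidence ellipse of the centre-of-pressure samples."""
    n = len(cop_ml)
    mean_ml = sum(cop_ml) / n
    mean_ap = sum(cop_ap) / n
    var_ml = sum((x - mean_ml) ** 2 for x in cop_ml) / (n - 1)
    var_ap = sum((y - mean_ap) ** 2 for y in cop_ap) / (n - 1)
    cov = sum((x - mean_ml) * (y - mean_ap)
              for x, y in zip(cop_ml, cop_ap)) / (n - 1)
    # Determinant of the 2x2 sample covariance matrix, clipped at zero.
    det = max(var_ml * var_ap - cov ** 2, 0.0)
    # 95% quantile of a chi-square distribution with 2 degrees of freedom.
    chi2_95 = -2.0 * math.log(0.05)
    return math.pi * chi2_95 * math.sqrt(det)


# Toy trajectory (illustrative only): medio-lateral and antero-posterior samples.
cop_ml = [0.0, 1.0, 0.0, -1.0]
cop_ap = [2.0, 0.0, -2.0, 0.0]
sway_range = ap_range(cop_ap)
sway_area = ellipse_area_95(cop_ml, cop_ap)
```

Other conventions exist (convex-hull area, RMS displacement, path length), which is why the referee's request for explicit definitions matters for reproducibility.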
Simulated Author's Rebuttal
We thank the referee for the detailed and constructive report. We address each major comment below. Where the manuscript requires clarification or additional analysis, we indicate the planned revisions.
- Referee: [Abstract] Abstract: the headline claim that ts-AUC yields p=0.01 (while adjusted univariate testing does not) is load-bearing for the superiority argument, yet the abstract supplies neither the definition of the ts-AUC statistic, its null distribution, nor any Monte-Carlo calibration on data whose covariance matches the empirical 123-patient feature set; without this, the reported p-value cannot be interpreted as calibrated evidence of a genuine difference.
  Authors: We agree that the abstract should be more self-contained. The methods section defines ts-AUC as a non-parametric multivariate two-sample test whose statistic is the area under the ROC curve obtained from a classifier trained to discriminate the two groups; the p-value is obtained by permutation of group labels. To strengthen the claim, the revised manuscript will add a short Monte-Carlo study in the methods that generates synthetic data with the same dimension and empirical covariance structure as the 123-patient feature set and verifies type-I error control at the nominal level. This calibration will be referenced in the abstract.
  Revision: yes
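The procedure described in this response (train a scorer, compute a held-out AUC, compare it against a label-permutation null) can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the mean-difference linear scorer is a deliberately simple stand-in for whatever classifier the paper uses, the single 50/50 split is an assumption, and all function names are hypothetical.

```python
import random


def auc(pos_scores, neg_scores):
    """Fraction of (positive, negative) pairs ranked correctly; ties count 1/2."""
    if not pos_scores or not neg_scores:
        return 0.5  # undefined case: fall back to chance level
    wins = 0.0
    for p in pos_scores:
        for q in neg_scores:
            wins += 1.0 if p > q else 0.5 if p == q else 0.0
    return wins / (len(pos_scores) * len(neg_scores))


def fit_mean_difference(X, y):
    """Stand-in scorer: weights are the difference between the two class means."""
    dim = len(X[0])
    sums = {0: [0.0] * dim, 1: [0.0] * dim}
    counts = {0: 0, 1: 0}
    for row, label in zip(X, y):
        counts[label] += 1
        for j, value in enumerate(row):
            sums[label][j] += value
    if counts[0] == 0 or counts[1] == 0:
        return [0.0] * dim
    return [sums[1][j] / counts[1] - sums[0][j] / counts[0] for j in range(dim)]


def held_out_auc(X, y, train_idx, test_idx):
    """Fit the scorer on the training half, then compute the AUC on the test half."""
    w = fit_mean_difference([X[i] for i in train_idx], [y[i] for i in train_idx])

    def score(row):
        return sum(wj * xj for wj, xj in zip(w, row))

    pos = [score(X[i]) for i in test_idx if y[i] == 1]
    neg = [score(X[i]) for i in test_idx if y[i] == 0]
    return auc(pos, neg)


def auc_permutation_test(X, y, n_perm=500, seed=0):
    """Return (observed held-out AUC, permutation p-value) for one fixed split."""
    rng = random.Random(seed)
    idx = list(range(len(y)))
    rng.shuffle(idx)
    half = len(idx) // 2
    train_idx, test_idx = idx[:half], idx[half:]
    observed = held_out_auc(X, y, train_idx, test_idx)
    exceed = 0
    for _ in range(n_perm):
        y_perm = list(y)
        rng.shuffle(y_perm)  # break any link between features and labels
        if held_out_auc(X, y_perm, train_idx, test_idx) >= observed:
            exceed += 1
    return observed, (1 + exceed) / (1 + n_perm)


# Synthetic illustration (not patient data): two 5-dimensional groups, one shifted.
data_rng = random.Random(1)
X = ([[data_rng.gauss(0.0, 1.0) for _ in range(5)] for _ in range(40)]
     + [[data_rng.gauss(2.0, 1.0) for _ in range(5)] for _ in range(40)])
y = [0] * 40 + [1] * 40
observed_auc, p_value = auc_permutation_test(X, y)
```

Refitting the scorer inside every permutation keeps the null distribution matched to the full pipeline. The same function could be called repeatedly on synthetic null data to estimate the type-I error rate that the response promises to report.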
- Referee: [Methods] Methods (patient classification and feature extraction): the two-sample framing assumes that the clinical faller/non-faller label is independent of the posturographic features and free of leakage; no discussion or sensitivity analysis addresses possible dependence between the label and the statokinesigram-derived variables.
  Authors: The faller/non-faller labels were obtained from clinical history and neurological assessment performed independently of the force-platform recordings. Nevertheless, we acknowledge the absence of explicit discussion. The revised methods will include a paragraph stating the temporal and procedural separation between clinical labeling and statokinesigram acquisition, together with a brief sensitivity check that recomputes ts-AUC after randomly flipping a small fraction of labels to simulate possible misclassification.
  Revision: yes
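The label-flipping sensitivity check proposed here could look roughly like the sketch below. It is written against any user-supplied `statistic(X, labels)` callable; the toy `mean_gap` statistic, the flip fractions, and the repeat count are all assumptions made for illustration, and in practice the ts-AUC statistic or its p-value would be plugged in instead.

```python
import random


def flip_labels(labels, fraction, seed=0):
    """Return a copy of binary labels with the given fraction flipped at random."""
    rng = random.Random(seed)
    n_flip = round(fraction * len(labels))
    flipped = list(labels)
    for i in rng.sample(range(len(labels)), n_flip):
        flipped[i] = 1 - flipped[i]
    return flipped


def sensitivity_curve(statistic, X, labels, fractions, n_repeats=20):
    """Average statistic(X, noisy_labels) over repeated random relabelings."""
    curve = {}
    for fraction in fractions:
        values = [statistic(X, flip_labels(labels, fraction, seed=r))
                  for r in range(n_repeats)]
        curve[fraction] = sum(values) / len(values)
    return curve


def mean_gap(X, labels):
    """Toy one-dimensional statistic: difference between the two group means."""
    pos = [x for x, label in zip(X, labels) if label == 1]
    neg = [x for x, label in zip(X, labels) if label == 0]
    return sum(pos) / len(pos) - sum(neg) / len(neg)


# Toy data (illustrative only): perfectly separated one-dimensional groups.
X = [0.0] * 50 + [1.0] * 50
labels = [0] * 50 + [1] * 50
curve = sensitivity_curve(mean_gap, X, labels, [0.0, 0.05, 0.10])
```

A result that degrades gradually as the flip fraction grows suggests the finding does not hinge on a handful of borderline clinical classifications.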
- Referee: [Abstract] Abstract and Results: no sample-size justification or power calculation is provided for the 123-patient cohort, nor is it shown that the open-eyes versus closed-eyes contrast survives any form of multiplicity correction across the two protocols.
  Authors: We agree that a sample-size justification is missing. The revised results will report a post-hoc power estimate based on the observed effect size in the open-eyes condition. For the two-protocol multiplicity issue, we will apply a Bonferroni correction (threshold 0.025) to the reported p=0.01; the open-eyes result remains significant while the closed-eyes result does not. This correction will be stated in both the abstract and results.
  Revision: yes
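The arithmetic behind the 0.025 threshold is short enough to spell out. The sketch assumes a family-wise level of 0.05, which the response implies but does not state, and it leaves out the closed-eyes p-value because this page does not report it.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold that keeps the family-wise level at alpha."""
    return alpha / n_tests


alpha = 0.05        # assumed family-wise level (implied by the 0.025 threshold)
n_protocols = 2     # eyes open and eyes closed
threshold = bonferroni_threshold(alpha, n_protocols)

p_open_eyes = 0.01  # value reported in the abstract
open_eyes_significant = p_open_eyes < threshold
```

Equivalently, the adjusted p-value is 2 × 0.01 = 0.02, which is still below 0.05.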
Circularity Check
No circularity: empirical application of introduced test statistic
The paper introduces ts-AUC as a new non-parametric multivariate test and applies it to posturographic features from two groups defined by independent clinical assessment. The reported p=0.01 is an output of running the test on the observed data; no derivation, equation, or fitted parameter reduces the significance result to the input labels or features by construction. No self-citation chains, ansatzes, or renamings of known results appear in the load-bearing steps. The analysis remains an empirical comparison whose validity hinges on external properties of the test (type-I control) rather than internal definitional equivalence.