
arxiv: 1907.06614 · v1 · pith:OPJPI3BRnew · submitted 2019-07-15 · 💻 cs.LG · stat.AP · stat.ML

Revealing posturographic features associated with the risk of falling in patients with Parkinsonian syndromes via machine learning

Pith reviewed 2026-05-24 21:25 UTC · model grok-4.3

classification 💻 cs.LG · stat.AP · stat.ML
keywords Parkinsonian syndromes · posturography · fall risk · multivariate test · machine learning · statokinesigram · Romberg test · ts-AUC

The pith

The ts-AUC multivariate test detects posturographic differences between fallers and non-fallers in Parkinsonian patients where standard multiple testing finds none.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper tests whether a new non-parametric multivariate two-sample procedure called ts-AUC can identify differences in the full set of posturographic features extracted from statokinesigrams recorded during Romberg tests. In 123 Parkinsonian syndrome patients split into fallers and non-fallers by clinical assessment, ts-AUC returned a significant group difference at p=0.01 while multiple univariate tests with adjustment did not; the distinction appeared only in the eyes-open condition and involved greater antero-posterior displacement plus larger sway area among fallers. The work therefore positions machine-learning-style multivariate tests as practical extensions of classical statistics when the data are high-dimensional and the measurements are correlated.

Core claim

The ts-AUC test showed a statistically significant difference (p-value = 0.01) between the faller and non-faller groups in the multidimensional posturographic feature space, while multiple testing with p-value adjustment did not; the difference was restricted to the open-eyes protocol, with fallers exhibiting increased antero-posterior movements and increased posturographic area.

What carries the argument

ts-AUC, the non-parametric multivariate two-sample test applied directly to the high-dimensional feature vectors derived from statokinesigrams.
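
A minimal sketch of a test in this family, not the authors' implementation. It assumes a mean-difference linear scorer fitted on one stratified half of the data, the Mann-Whitney estimate of the AUC on the other half, and a p-value obtained by permuting the held-out labels. The function names (`auc`, `fit_mean_difference_scorer`, `ts_auc_sketch`) and all defaults are hypothetical; the paper's actual scoring rule and null calibration are not given on this page.

```python
import random


def auc(pos_scores, neg_scores):
    """Mann-Whitney estimate of the AUC: P(pos > neg) + 0.5 * P(tie)."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


def fit_mean_difference_scorer(X, y):
    """Return x -> w.x with w = mean(group 1) - mean(group 0) (assumed scorer)."""
    dim = len(X[0])

    def mean(label):
        rows = [x for x, lab in zip(X, y) if lab == label]
        return [sum(r[j] for r in rows) / len(rows) for j in range(dim)]

    w = [a - b for a, b in zip(mean(1), mean(0))]
    return lambda x: sum(wj * xj for wj, xj in zip(w, x))


def ts_auc_sketch(X, y, n_perm=999, seed=0):
    """Return (held-out AUC, one-sided permutation p-value).

    X is a list of feature vectors, y a list of 0/1 group labels.
    Each group needs at least two members so both halves are non-empty.
    """
    rng = random.Random(seed)
    train, test = [], []
    for label in (0, 1):  # stratified split: half of each group per side
        idx = [i for i, lab in enumerate(y) if lab == label]
        rng.shuffle(idx)
        train += idx[: len(idx) // 2]
        test += idx[len(idx) // 2:]

    score = fit_mean_difference_scorer([X[i] for i in train], [y[i] for i in train])
    scores = [score(X[i]) for i in test]
    labels = [y[i] for i in test]

    def statistic(labs):
        pos = [s for s, lab in zip(scores, labs) if lab == 1]
        neg = [s for s, lab in zip(scores, labs) if lab == 0]
        return auc(pos, neg)

    observed = statistic(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(labels)  # permute held-out labels only; the scorer stays fixed
        if statistic(labels) >= observed:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)
```

Holding out half of the sample keeps the scorer independent of the labels being permuted, which is what makes the permutation p-value valid under the null in this sketch. The price is reduced power on small cohorts.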

If this is right

  • The open-eyes Romberg protocol alone separates fallers from non-fallers on the chosen feature set.
  • Fallers display reliably larger antero-posterior sway and overall posturographic area than non-fallers.
  • Multivariate machine-learning tests can be treated as direct extensions of classical statistical tools for multifactorial clinical data.
  • Standard univariate testing with correction can miss group differences that are visible when all features are considered jointly.
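
For concreteness, the step-down adjustment of Holm [29] is sketched below. This is an assumption: the page only says "multiple testing with p-value adjustment" and does not state which procedure the authors used (Holm [29] and Šidák [30] both appear in the reference list). The function name `holm_adjust` and the twenty p-values of 0.02 are hypothetical, chosen to show how many modest univariate signals can all fail the corrected threshold.

```python
def holm_adjust(p_values):
    """Holm step-down adjusted p-values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, i in enumerate(order):
        # The k-th smallest p-value (k = rank + 1) is multiplied by m - k + 1.
        candidate = min(1.0, (m - rank) * p_values[i])
        # Enforce monotonicity: adjusted values never decrease along the ordering.
        running_max = max(running_max, candidate)
        adjusted[i] = running_max
    return adjusted


# Hypothetical input: 20 features, each with an unadjusted p-value of 0.02.
adjusted = holm_adjust([0.02] * 20)
# No feature survives at the 0.05 level after adjustment.
print(all(a > 0.05 for a in adjusted))
```

A joint test looks at all features at once and so does not pay this per-feature penalty, which is the contrast the last bullet draws.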

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The identified features could be tested as inputs to prospective fall-prediction models if longitudinal outcome data become available.
  • The same ts-AUC approach could be applied to other high-dimensional sensor recordings in neurology where many correlated variables are collected simultaneously.
  • The eyes-open condition may expose compensatory strategies that are masked when vision is removed.

Load-bearing premise

The clinical classification of patients into fallers versus non-fallers is accurate and independent of the posturographic measurements.

What would settle it

An independent replication using the identical ts-AUC procedure on statokinesigram recordings from a new cohort of Parkinsonian patients, grouped by the same clinical criteria, that yields a non-significant result at the reported threshold.

Figures

Figures reproduced from arXiv: 1907.06614 by Argyris Kalogeratos, Damien Ricard, Ioannis Bargiotas, Myrto Limnios, Nicolas Vayatis, Pierre-Paul Vidal.

Figure 1: Examples of statokinesigrams from fallers and non-fallers. The x-axis is the …
Figure 2: Scheme of the ts-AUC algorithm. In order to find the AUC …
Figure 3: The importance of features as estimated by applying the approach of [27] using …
Figure 4: Radar chart comparing fallers and non-fallers based on the mean (o) …
Figure 5: The average performance of two-sample testing approaches with smaller popula…
Figure 6: The average performance of two-sample testing approaches with smaller non…
Original abstract

Falling in Parkinsonian syndromes (PS) is associated with postural instability and constitutes a common cause of disability among PS patients. Current posturographic practices record the body's center-of-pressure displacement (statokinesigram) while the patient stands on a force platform. Statokinesigrams, after appropriate signal processing, can offer numerous posturographic features, which, however, challenges efforts to obtain valid statistics via standard univariate approaches. In this work, we present the ts-AUC, a non-parametric multivariate two-sample test, which we employ to analyze statokinesigram differences among PS patients that are fallers (PSF) and non-fallers (PSNF). We included 123 PS patients who were classified into PSF or PSNF based on clinical assessment and underwent a simple Romberg test (eyes open/eyes closed). We analyzed posturographic features using both multiple testing with p-value adjustment and the ts-AUC. While the ts-AUC showed significant difference between groups (p-value = 0.01), multiple testing did not show any such difference. Interestingly, significant difference between the two groups was found only using the open-eyes protocol. PSF showed significantly increased antero-posterior movements as well as increased posturographic area, compared to PSNF. Our study demonstrates the superiority of the ts-AUC test compared to standard statistical tools in distinguishing PSF and PSNF in the multidimensional feature space. This result highlights more generally the fact that machine learning-based statistical tests can be seen as a natural extension of classical statistical approaches and should be considered, especially when dealing with multifactorial assessments.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 2 minor

Summary. The manuscript introduces ts-AUC, a non-parametric multivariate two-sample test, and applies it to posturographic features extracted from statokinesigrams of 123 Parkinsonian syndrome patients classified as fallers (PSF) or non-fallers (PSNF) via clinical assessment. Using the Romberg test (eyes open/closed), the authors report that ts-AUC detects a significant group difference (p=0.01) in the open-eyes protocol, where multiple testing with p-value adjustment finds none, with PSF showing increased antero-posterior movements and posturographic area; they conclude that ts-AUC is superior to standard univariate tools for multidimensional posturographic data.

Significance. If the type-I error control of ts-AUC can be established for high-dimensional correlated features, the approach could usefully extend classical statistics to clinical settings where multiple comparisons obscure group differences. The empirical observation that significance appears only in the open-eyes condition and aligns with known postural instability markers is potentially actionable for fall-risk assessment, but the absence of methodological validation limits the strength of this contribution.

major comments (3)
  1. [Abstract] Abstract: the headline claim that ts-AUC yields p=0.01 (while adjusted univariate testing does not) is load-bearing for the superiority argument, yet the abstract supplies neither the definition of the ts-AUC statistic, its null distribution, nor any Monte-Carlo calibration on data whose covariance matches the empirical 123-patient feature set; without this, the reported p-value cannot be interpreted as calibrated evidence of a genuine difference.
  2. [Methods] Methods (patient classification and feature extraction): the two-sample framing assumes that the clinical faller/non-faller label is independent of the posturographic features and free of leakage; no discussion or sensitivity analysis addresses possible dependence between the label and the statokinesigram-derived variables.
  3. [Abstract] Abstract and Results: no sample-size justification or power calculation is provided for the 123-patient cohort, nor is it shown that the open-eyes versus closed-eyes contrast survives any form of multiplicity correction across the two protocols.
minor comments (2)
  1. [Abstract] The abstract refers to ts-AUC as both a 'non-parametric multivariate two-sample test' and a 'machine learning-based statistical test'; the precise relationship to standard ML classifiers or kernel methods should be clarified.
  2. The description of 'increased posturographic area' and 'increased antero-posterior movements' would benefit from explicit feature definitions or references to the processing pipeline used to derive them.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the detailed and constructive report. We address each major comment below. Where the manuscript requires clarification or additional analysis, we indicate the planned revisions.

Point-by-point responses
  1. Referee: [Abstract] Abstract: the headline claim that ts-AUC yields p=0.01 (while adjusted univariate testing does not) is load-bearing for the superiority argument, yet the abstract supplies neither the definition of the ts-AUC statistic, its null distribution, nor any Monte-Carlo calibration on data whose covariance matches the empirical 123-patient feature set; without this, the reported p-value cannot be interpreted as calibrated evidence of a genuine difference.

    Authors: We agree that the abstract should be more self-contained. The methods section defines ts-AUC as a non-parametric multivariate two-sample test whose statistic is the area under the ROC curve obtained from a classifier trained to discriminate the two groups; the p-value is obtained by permutation of group labels. To strengthen the claim, the revised manuscript will add a short Monte-Carlo study in the methods that generates synthetic data with the same dimension and empirical covariance structure as the 123-patient feature set and verifies type-I error control at the nominal level. This calibration will be referenced in the abstract. revision: yes

  2. Referee: [Methods] Methods (patient classification and feature extraction): the two-sample framing assumes that the clinical faller/non-faller label is independent of the posturographic features and free of leakage; no discussion or sensitivity analysis addresses possible dependence between the label and the statokinesigram-derived variables.

    Authors: The faller/non-faller labels were obtained from clinical history and neurological assessment performed independently of the force-platform recordings. Nevertheless, we acknowledge the absence of explicit discussion. The revised methods will include a paragraph stating the temporal and procedural separation between clinical labeling and statokinesigram acquisition, together with a brief sensitivity check that recomputes ts-AUC after randomly flipping a small fraction of labels to simulate possible misclassification. revision: yes

  3. Referee: [Abstract] Abstract and Results: no sample-size justification or power calculation is provided for the 123-patient cohort, nor is it shown that the open-eyes versus closed-eyes contrast survives any form of multiplicity correction across the two protocols.

    Authors: We agree that a sample-size justification is missing. The revised results will report a post-hoc power estimate based on the observed effect size in the open-eyes condition. For the two-protocol multiplicity issue, we will apply a Bonferroni correction (threshold 0.025) to the reported p=0.01; the open-eyes result remains significant while the closed-eyes result does not. This correction will be stated in both the abstract and results. revision: yes
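
The calibration check promised in response 1 can be sketched as follows. This is a toy version under stated assumptions, not the authors' study: it draws both groups from the same one-dimensional standard normal (the empirical covariance of the 123-patient feature set is not available here), uses a two-sided permutation test on the Mann-Whitney AUC, and counts how often the test rejects at level 0.05. All names and defaults (`permutation_p_value`, `type_one_error_rate`, 200 simulations, 8 subjects per group, 99 permutations) are hypothetical.

```python
import random


def auc(pos, neg):
    """Mann-Whitney estimate of the AUC, counting ties as one half."""
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def permutation_p_value(pos, neg, n_perm, rng):
    """Two-sided permutation p-value for |AUC - 0.5| on one-dimensional scores."""
    observed = abs(auc(pos, neg) - 0.5)
    pooled = list(pos) + list(neg)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # reassign group membership at random
        stat = abs(auc(pooled[: len(pos)], pooled[len(pos):]) - 0.5)
        if stat >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


def type_one_error_rate(n_sim=200, n_per_group=8, n_perm=99, alpha=0.05, seed=0):
    """Fraction of null simulations (identical groups) in which the test rejects."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(n_sim):
        a = [rng.gauss(0.0, 1.0) for _ in range(n_per_group)]
        b = [rng.gauss(0.0, 1.0) for _ in range(n_per_group)]
        if permutation_p_value(a, b, n_perm, rng) <= alpha:
            rejections += 1
    return rejections / n_sim


print(type_one_error_rate())
```

A well-calibrated test should reject in roughly 5% of null simulations or fewer. A markedly higher rate on data matching the real feature covariance would mean the reported p=0.01 cannot be taken at face value.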

Circularity Check

0 steps flagged

No circularity: empirical application of introduced test statistic

full rationale

The paper introduces ts-AUC as a new non-parametric multivariate test and applies it to posturographic features from two groups defined by independent clinical assessment. The reported p=0.01 is an output of running the test on the observed data; no derivation, equation, or fitted parameter reduces the significance result to the input labels or features by construction. No self-citation chains, ansatzes, or renamings of known results appear in the load-bearing steps. The analysis remains an empirical comparison whose validity hinges on external properties of the test (type-I control) rather than internal definitional equivalence.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

The abstract supplies no explicit free parameters, axioms, or invented entities. The method is described only at the level of 'non-parametric multivariate two-sample test' without further specification.



Reference graph

Works this paper leans on

40 extracted references · 40 canonical work pages

  1. [1]

    Preventing falls in elderly persons,

    M.E. Tinetti, “Preventing falls in elderly persons,” New England journal of medicine, vol. 348, no. 1, pp. 42–49, 2003

  2. [2]

    Postural stability in the elderly: a comparison between fallers and non-fallers,

    I. Melzer, N. Benjuya, and J. Kaplanski, “Postural stability in the elderly: a comparison between fallers and non-fallers,” Age and ageing, vol. 33, no. 6, pp. 602–607, 2004

  3. [3]

    The costs of fatal and non-fatal falls among older adults,

    J.A. Stevens, P.S. Corso, E.A. Finkelstein, and T.R. Miller, “The costs of fatal and non-fatal falls among older adults,” Injury prevention: journal of the International Society for Child and Adolescent Injury Prevention, vol. 12, no. 5, pp. 290–295, 2006

  4. [4]

    Multiple timescales in postural dynamics associated with vision and a secondary task are revealed by wavelet analysis,

    J.R. Chagdes, S. Rietdyk, J.M. Haddad, H.N. Zelaznik, A. Raman, C.K. Rhea, and T.A. Silver, “Multiple timescales in postural dynamics associated with vision and a secondary task are revealed by wavelet analysis,” Experimental Brain Research, vol. 197, no. 3, pp. 297–310, 2009

  5. [5]

    Postural sway as a marker of progression in parkinson’s disease: a pilot longitudinal study,

    M. Mancini, P. Carlson-Kuhta, C. Zampieri, J.G. Nutt, L. Chiari, and F.B. Horak, “Postural sway as a marker of progression in parkinson’s disease: a pilot longitudinal study,” Gait & posture, vol. 36, no. 3, pp. 471–476, 2012

  6. [6]

    Do multiple outcome measures require p-value adjustment?,

    R.J. Feise, “Do multiple outcome measures require p-value adjustment?,” BMC Medical Research Methodology, vol. 2, no. 1, pp. 8, 2002

  7. [7]

    The misuse and abuse of statistics in biomedical research,

    M.S. Thiese, Z.C. Arnold, and S.D. Walker, “The misuse and abuse of statistics in biomedical research,” Biochemia Medica, vol. 25, no. 1, pp. 5–11, Feb 2015

  8. [8]

    What’s wrong with Bonferroni adjustments,

    T.V. Perneger, “What’s wrong with Bonferroni adjustments,” British Medical Journal, vol. 316, no. 7139, pp. 1236–1238, Apr 1998

  9. [9]

    Trap of trends to statistical significance: likelihood of near significant p value becoming more significant with extra data,

    J. Wood, N. Freemantle, M. King, and I. Nazareth, “Trap of trends to statistical significance: likelihood of near significant p value becoming more significant with extra data,” Bmj, vol. 348, pp. g2215, 2014

  10. [10]

    AUC optimization and the two-sample problem,

    N. Vayatis, M. Depecker, and S.J. Clémençon, “AUC optimization and the two-sample problem,” in Advances in Neural Information Processing Systems, 2009, pp. 360–368

  11. [11]

    Ranking and scoring using empirical risk minimization,

    S. Clémençon, G. Lugosi, and N. Vayatis, “Ranking and scoring using empirical risk minimization,” in Proceedings of the International Conference on Computational Learning Theory, 2005, pp. 1–15

  12. [12]

    A kernel two-sample test,

    A. Gretton, K.M. Borgwardt, M.J. Rasch, B. Schölkopf, and A. Smola, “A kernel two-sample test,” Journal of Machine Learning Research, vol. 13, no. Mar, pp. 723–773, 2012

  13. [13]

    Generalization bounds for the area under the ROC curve,

    S. Agarwal, T. Graepel, R. Herbrich, S. Har-Peled, and D. Roth, “Generalization bounds for the area under the ROC curve,” Journal of Machine Learning Research, vol. 6, no. Apr, pp. 393–425, 2005

  14. [14]

    AUC optimization vs. error rate minimization,

    C. Cortes and M. Mohri, “AUC optimization vs. error rate minimization,” in Advances in Neural Information Processing Systems, 2004, pp. 313–320

  15. [15]

    Center-of-pressure parameters used in the assessment of postural control,

    R.M. Palmieri, C.D. Ingersoll, M.B. Stone, and B.A. Krause, “Center-of-pressure parameters used in the assessment of postural control,” Journal of Sport Rehabilitation, vol. 11, no. 1, pp. 51–66, 2002

  16. [16]

    A non linear scoring approach for evaluating balance: classification of elderly as fallers and non-fallers,

    J. Audiffren, I. Bargiotas, N. Vayatis, P.-P. Vidal, and D. Ricard, “A non linear scoring approach for evaluating balance: classification of elderly as fallers and non-fallers,” Plos One, vol. 11, no. 12, 2016

  17. [17]

    On the importance of local dynamics in statokinesigram: A multivariate approach for postural control evaluation in elderly,

    I. Bargiotas, J. Audiffren, N. Vayatis, P.-P. Vidal, S. Buffat, A.P. Yelnik, and D. Ricard, “On the importance of local dynamics in statokinesigram: A multivariate approach for postural control evaluation in elderly,” PloS one, vol. 13, no. 2, pp. e0192868, 2018

  18. [18]

    Validity and reliability of the nintendo wii balance board for assessment of standing balance,

    R.A. Clark, A.L. Bryant, Y. Pua, P. McCrory, K. Bennell, and M. Hunt, “Validity and reliability of the nintendo wii balance board for assessment of standing balance,” Gait & posture, vol. 31, no. 3, pp. 307–310, 2010

  19. [19]

    Validating and calibrating the nintendo wii balance board to derive reliable center of pressure measures,

    J.M. Leach, M. Mancini, R.J. Peterka, T.L. Hayes, and F.B. Horak, “Validating and calibrating the nintendo wii balance board to derive reliable center of pressure measures,” Sensors, vol. 14, no. 10, pp. 18244–18267, 2014

  20. [20]

    Balance impairment in radiation induced leukoencephalopathy patients is coupled with altered visual attention in natural tasks,

    I. Bargiotas, A. Moreau, A. Vienne, F. Bompaire, M. Baruteau, M. de Laage, M. Campos, D. Psimaras, N. Vayatis, C. Labourdette, P.-P. Vidal, D. Ricard, and S. Buffat, “Balance impairment in radiation induced leukoencephalopathy patients is coupled with altered visual attention in natural tasks,” Frontiers in Neurology, vol. 9, pp. 1185, 2019

  21. [21]

    Preprocessing the nintendo wii board signal to derive more accurate descriptors of statokinesigrams,

    J. Audiffren and E. Contal, “Preprocessing the nintendo wii board signal to derive more accurate descriptors of statokinesigrams,” Sensors, vol. 16, no. 8, pp. 1208, 2016

  22. [22]

    Defining a fall and reasons for falling: comparisons among the views of seniors, health care providers, and the research literature,

    A.A. Zecevic, A.W. Salmoni, M. Speechley, and A.A. Vandervoort, “Defining a fall and reasons for falling: comparisons among the views of seniors, health care providers, and the research literature,” The Gerontologist, vol. 46, no. 3, pp. 367–376, 2006

  23. [23]

    Assessment of postural instability in patients with Parkinson’s disease,

    J.W. Błaszczyk, R. Orawiec, D. Duda-Kłodowska, and G. Opala, “Assessment of postural instability in patients with Parkinson’s disease,” Experimental Brain Research, vol. 183, no. 1, pp. 107–114, 2007

  24. [24]

    Dynamic parameters of balance which correlate to elderly persons with a history of falls,

    J.W. Muir, D.P. Kiel, M. Hannan, J. Magaziner, and C.T. Rubin, “Dynamic parameters of balance which correlate to elderly persons with a history of falls,” Plos One, vol. 8, no. 8, pp. e70566, 2013

  25. [25]

    Random forests,

    L. Breiman, “Random forests,” Machine learning, vol. 45, no. 1, pp. 5–32, 2001

  26. [26]

    Out-of-bag estimation,

    J.A. Doornik and H. Hansen, “Out-of-bag estimation,” Tech. Rep., Technical report, Dept. of Statistics, Univ. of California, Berkeley, 1996

  27. [27]

    Variable selection using random forests,

    R. Genuer, J.-M. Poggi, and C. Tuleau-Malot, “Variable selection using random forests,” Pattern Recognition Letters, vol. 31, no. 14, pp. 2225–2236, 2010

  28. [28]

    Generative models and model criticism via optimized maximum mean discrepancy,

    D.J. Sutherland, H-Y Tung, H. Strathmann, S. De, A. Ramdas, A. Smola, and A. Gretton, “Generative models and model criticism via optimized maximum mean discrepancy,” in Proceedings of the International Conference on Learning Representations, 2017

  29. [29]

    A simple sequentially rejective multiple test procedure,

    S. Holm, “A simple sequentially rejective multiple test procedure,” Scandinavian Journal of Statistics, pp. 65–70, 1979

  30. [30]

    Rectangular confidence regions for the means of multivariate normal distributions,

    Z. Šidák, “Rectangular confidence regions for the means of multivariate normal distributions,” Journal of the American Statistical Association, vol. 62, no. 318, pp. 626–633, 1967

  31. [31]

    ISway: a sensitive, valid and reliable measure of postural control,

    M. Mancini, A. Salarian, P. Carlson-Kuhta, C. Zampieri, L. King, L. Chiari, and F.B. Horak, “ISway: a sensitive, valid and reliable measure of postural control,” Journal of Neuroengineering and Rehabilitation, vol. 9, no. 1, pp. 1, 2012

  32. [32]

    Balance dysfunction in parkinson’s disease,

    S. Rinalduzzi, C. Trompetto, L. Marinelli, A. Alibardi, P. Missori, F. Fattapposta, F. Pierelli, and A. Currà, “Balance dysfunction in parkinson’s disease,” BioMed Research International, vol. 2015, 2015

  33. [33]

    Predictors of future falls in parkinson disease,

    G.K. Kerr, C.J. Worringham, M.H. Cole, P.F. Lacherez, J.M. Wood, and P.A. Silburn, “Predictors of future falls in parkinson disease,” Neurology, vol. 75, no. 2, pp. 116–124, 2010

  34. [34]

    Postural sway and falls in parkinson’s disease: a regression approach,

    M. Matinolli, J.T. Korpelainen, R. Korpelainen, K.A. Sotaniemi, M. Virranniemi, and V.V. Myllylä, “Postural sway and falls in parkinson’s disease: a regression approach,” Movement Disorders, vol. 22, no. 13, pp. 1927–1935, 2007

  35. [35]

    Clinical and physiological assessments for elucidating falls risk in parkinson’s disease,

    M.D. Latt, S.R. Lord, J.G. Morris, and V.S. Fung, “Clinical and physiological assessments for elucidating falls risk in parkinson’s disease,” Movement Disorders, vol. 24, no. 9, pp. 1280–1289, 2009

  36. [36]

    On the overestimation of random forest’s out-of-bag error,

    S. Janitza and R. Hornung, “On the overestimation of random forest’s out-of-bag error,” PloS one, vol. 13, no. 8, pp. e0201904, 2018

  37. [37]

    Falls prediction in elderly people: a 1-year prospective study,

    J. Swanenburg, E.D. de Bruin, D. Uebelhart, and T. Mulder, “Falls prediction in elderly people: a 1-year prospective study,” Gait & Posture, vol. 31, no. 3, pp. 317–321, 2010

  38. [38]

    The relationship between precision-recall and roc curves,

    J. Davis and M. Goadrich, “The relationship between precision-recall and roc curves,” in Proceedings of the International Conference on Machine Learning. ACM, 2006, pp. 233–240

  39. [39]

    Biomedical instruments versus toys: a preliminary comparison of force platforms and the nintendo Wii balance board-biomed 2011.,

    G. Pagnacco, E. Oggero, and C. Wright, “Biomedical instruments versus toys: a preliminary comparison of force platforms and the nintendo Wii balance board-biomed 2011.,” Biomedical Sciences Instrumentation, vol. 47, pp. 12–17, 2011

  40. [40]

    We-measure: Toward a low-cost portable posturography for patients with multiple sclerosis using the commercial wii balance board,

    L. Castelli, L. Stocchi, M. Patrignani, G. Sellitto, M. Giuliani, and L. Prosperini, “We-measure: Toward a low-cost portable posturography for patients with multiple sclerosis using the commercial wii balance board,” Journal of the Neurological Sciences, vol. 359, no. 1-2, pp. 440–444, 2015