
arxiv: 2402.13103 · v1 · submitted 2024-02-20 · cs.LG · math.ST · stat.TH

Multivariate Functional Linear Discriminant Analysis for the Classification of Short Time Series with Missing Data

Pith reviewed 2026-05-24 03:30 UTC · model grok-4.3

classification: cs.LG · math.ST · stat.TH
keywords: multivariate functional data · linear discriminant analysis · missing data · time series classification · ECM algorithm · incomplete observations · functional classification

The pith

A multivariate version of functional linear discriminant analysis classifies short time series containing missing values.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper develops MUDRA as a multivariate extension of functional linear discriminant analysis to address classification of short time series where data may be incomplete. It pairs this model with an expectation/conditional-maximization algorithm that estimates parameters while accounting for dependencies across features. Testing on the Articulary Word Recognition data set shows stronger predictive performance than existing methods, with the advantage growing as the proportion of missing entries increases. The approach supports interpretable multiclass classification and dimension reduction even when large fractions of the observations are absent.

Core claim

MUDRA extends FLDA to the multivariate setting by incorporating an ECM algorithm that jointly handles missing values and estimates statistical dependencies between features. On the Articulary Word Recognition data set this yields higher classification accuracy than state-of-the-art alternatives, with the gap widening under substantial missingness, while preserving interpretability of the resulting decision rules.

What carries the argument

MUDRA, the multivariate functional linear discriminant analysis model, together with its ECM algorithm for parameter inference under missing data.
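
For orientation, the univariate FLDA model that MUDRA generalizes (James and Hastie, listed in the reference graph below) is usually written as shown here. This is a sketch taken from that earlier paper, not from the MUDRA manuscript; the multivariate parameterization is not given on this page and is not reproduced. The symbols are those of the univariate model: Y_{ij} is the vector of observed values of curve j in class i, S_{ij} is a spline basis matrix evaluated at that curve's observed time points, \lambda_0 is the overall mean coefficient vector, \Lambda \alpha_i is the low-rank class effect, and \gamma_{ij} and \varepsilon_{ij} are the curve-level deviation and the measurement noise.

```latex
Y_{ij} = S_{ij}\left(\lambda_0 + \Lambda \alpha_i + \gamma_{ij}\right) + \varepsilon_{ij},
\qquad \gamma_{ij} \sim \mathcal{N}(0, \Gamma),
\qquad \varepsilon_{ij} \sim \mathcal{N}(0, \sigma^2 I)
```

Because S_{ij} has one row per observed time point, irregular sampling and missing time points change only the shape of the design matrix, which is what makes the functional formulation a natural starting point for incomplete series.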

If this is right

  • Classification and dimension reduction become feasible for multivariate functional data even when large portions of the observations are absent.
  • Performance gains relative to prior methods increase with the amount of missing data.
  • The resulting models remain interpretable, supporting use in domains that require explanation of decisions.
  • The ECM procedure provides a general route for fitting functional discriminant models under incompleteness.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Similar ECM-based handling of missingness could be adapted to other functional data models beyond linear discriminant analysis.
  • The method may prove especially relevant for sensor-derived series in clinical monitoring where dropouts are common.
  • Extending the approach to non-stationary or longer series would test whether the tractability assumption holds beyond short time windows.

Load-bearing premise

Statistical dependencies between features can be estimated in a computationally tractable way by the ECM algorithm even when values are missing.

What would settle it

A replication on the Articulary Word Recognition data set or a comparable multivariate time-series benchmark in which MUDRA shows no accuracy gain over baselines when missingness is introduced would falsify the performance claim.
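
A replication of this kind needs a controlled way to introduce missingness. The sketch below masks a chosen fraction of entries completely at random. This mechanism is an assumption, since the page does not state which missingness mechanism the paper uses, and the function names are hypothetical helpers rather than anything from the authors' code.

```python
import math
import random


def mask_completely_at_random(series, rate, seed=0):
    """Return a copy of `series` with a fraction `rate` of entries set to NaN.

    `series` is a list of time points, each a list of feature values.
    Assumption: entries go missing completely at random (MCAR).
    """
    if not 0.0 <= rate <= 1.0:
        raise ValueError("rate must lie in [0, 1]")
    rng = random.Random(seed)  # local generator so results are reproducible
    cells = [(t, f) for t in range(len(series)) for f in range(len(series[t]))]
    n_masked = round(rate * len(cells))
    masked = [row[:] for row in series]  # copy so the input stays intact
    for t, f in rng.sample(cells, n_masked):
        masked[t][f] = math.nan
    return masked


def missing_fraction(series):
    """Fraction of entries in `series` that are NaN."""
    total = sum(len(row) for row in series)
    missing = sum(math.isnan(v) for row in series for v in row)
    return missing / total
```

Sweeping `rate` over a grid and scoring each classifier on the masked copies would show whether the accuracy gap behaves as claimed. Repeating the sweep with a non-random mechanism, such as dropping whole features, would be a stricter test.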

Figures

Figures reproduced from arXiv: 2402.13103 by Clémence Réda, Olaf Wolkenhauer, Orell Trautmann, Rahul Bordoloi, Saptarshi Bej.

Figure 1: Example of a short multivariate time series with irregular sampling intervals and missing data. Each color corresponds to a class, and each plot represents a single feature. Some features or time points might be missing across individuals, for instance, Feature 2 is not measured for the individual in Class 1, and Feature 1 is not measured at the same time points between Class 1 and Class 2. For instance, t…
Figure 2: Top plot: curves of feature 1 (on the left) and feature 2 (right) across the three classes. Bottom plot: corresponding estimated curves by MUDRA with r = 3, b = 7
Figure 3: Synthetic experiments with b = 7. Left: MSE between estimated and true functional points for r ∈ {1, 2, 3, 4}. Right: F1–scores on the classification task on the synthetic data set for r ∈ {1, 2, 3, 4}. 3.2 Time–Series Dimension Reduction on a Benchmark Data set We also applied the MUDRA algorithm to the real-world data set “Articulary Word Recognition” from Dau et al. (2019). Further details about the sim…
Figure 4: ROCKET versus MUDRA with a ridge regression–based classifier on the “Articulary Word Recognition” data set with b = 9 and for r ∈ [1, 9]. mation in short–time–series classification. We applied MUDRA or ROCKET to perform a dimension reduction on the data set. Resulting features were fed to a ridge regression–based classifier. Such a procedure allows to test how informative the resulting representations are,…
Figure 5: Runtimes for ROCKET and MUDRA on the complete real–life data set for r = 7 and b = 9 (N = 1,000 iterations per algorithm). As evidenced by
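
The evaluation described in the Figure 4 caption (reduce each series to a few features, then feed them to a ridge regression-based classifier) can be sketched as follows. This is a minimal standard-library illustration under stated assumptions, not the authors' implementation: the inputs stand in for features already produced by some reducer such as MUDRA or ROCKET, the classifier regresses ±1 class indicators with an unpenalized intercept, and every function name is hypothetical.

```python
def solve(a, b):
    """Solve a @ x = b by Gauss-Jordan elimination with partial pivoting.

    `a` is an n x n list of lists and `b` an n x k list of lists.
    """
    n = len(a)
    m = [row_a[:] + row_b[:] for row_a, row_b in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        p = m[col][col]
        m[col] = [v / p for v in m[col]]
        for r in range(n):
            if r != col:
                factor = m[r][col]
                m[r] = [vr - factor * vc for vr, vc in zip(m[r], m[col])]
    return [row[n:] for row in m]


def fit_ridge_classifier(X, y, n_classes, alpha=1.0):
    """Fit one ridge regression per class on +1/-1 indicator targets."""
    Xb = [row + [1.0] for row in X]  # trailing 1.0 is the intercept column
    d = len(Xb[0])
    Y = [[1.0 if c == label else -1.0 for c in range(n_classes)] for label in y]
    # Normal equations (X^T X + alpha * I) W = X^T Y, intercept left unpenalized
    A = [[sum(r[i] * r[j] for r in Xb) + (alpha if i == j and i < d - 1 else 0.0)
          for j in range(d)] for i in range(d)]
    B = [[sum(r[i] * t[c] for r, t in zip(Xb, Y)) for c in range(n_classes)]
         for i in range(d)]
    return solve(A, B)  # d x n_classes weight matrix


def predict(W, X):
    """Assign each row of X to the class with the largest regression score."""
    n_classes = len(W[0])
    labels = []
    for row in X:
        xb = row + [1.0]
        scores = [sum(x * W[i][c] for i, x in enumerate(xb)) for c in range(n_classes)]
        labels.append(max(range(n_classes), key=scores.__getitem__))
    return labels
```

Keeping the classifier this simple is the point of the procedure: with a linear model on top, differences in accuracy can be attributed mostly to how informative the reduced representation is.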
Original abstract

Functional linear discriminant analysis (FLDA) is a powerful tool that extends LDA-mediated multiclass classification and dimension reduction to univariate time-series functions. However, in the age of large multivariate and incomplete data, statistical dependencies between features must be estimated in a computationally tractable way, while also dealing with missing data. There is a need for a computationally tractable approach that considers the statistical dependencies between features and can handle missing values. We here develop a multivariate version of FLDA (MUDRA) to tackle this issue and describe an efficient expectation/conditional-maximization (ECM) algorithm to infer its parameters. We assess its predictive power on the "Articulary Word Recognition" data set and show its improvement over the state-of-the-art, especially in the case of missing data. MUDRA allows interpretable classification of data sets with large proportions of missing data, which will be particularly useful for medical or psychological data sets.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

0 major / 2 minor

Summary. The paper develops MUDRA, a multivariate extension of functional linear discriminant analysis (FLDA) for multiclass classification of short time series with missing data. It specifies a model that captures cross-feature statistical dependencies via basis expansions and introduces an efficient ECM algorithm for parameter inference under missingness. The approach is evaluated on the Articulary Word Recognition dataset, where it reports improved predictive performance relative to existing methods, particularly under missing data, while enabling interpretable classification.

Significance. If the empirical results are robust, MUDRA addresses a practical gap by providing a tractable multivariate FLDA variant that jointly handles feature dependencies and missing values through the ECM procedure. This is relevant for domains such as medical or psychological time-series data. The manuscript supplies the model specification, basis-expansion details, and ECM update rules that support feasibility for short series and moderate feature counts.

minor comments (2)
  1. [Abstract] The claim of improvement over the state-of-the-art would be strengthened by briefly indicating the baselines, the missingness mechanism, and whether error bars or cross-validation details are reported in the experiments section.
  2. Notation for the multivariate functional observations and the missingness indicator should be introduced once and used consistently to aid readability.

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for their positive summary of our manuscript on MUDRA and for recommending minor revision. The assessment correctly captures the contribution of extending FLDA to the multivariate setting with missing data via an ECM algorithm. No major comments were provided in the report.

Circularity Check

0 steps flagged

No significant circularity in derivation chain

full rationale

The paper introduces MUDRA as a multivariate extension of FLDA, specifies the model via basis expansions and an ECM algorithm for parameter inference from observed data (including missing values), and evaluates predictive performance on the external Articulary Word Recognition dataset. No load-bearing step reduces a claimed prediction or uniqueness result to a fitted quantity by construction, nor does any central premise rest solely on self-citation chains. The derivation supplies explicit update rules and is tested against independent benchmarks, making the work self-contained against external data.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract provides insufficient detail to enumerate free parameters, axioms, or invented entities; no explicit model equations or assumptions are stated.

pith-pipeline@v0.9.0 · 5715 in / 1172 out tokens · 33059 ms · 2026-05-24T03:30:10.160058+00:00 · methodology


Reference graph

Works this paper leans on

23 extracted references · 23 canonical work pages · 2 internal anchors

  1. [1]

    doi: 10.1145/361573.361582

    ISSN 0001-0782. doi: 10.1145/361573.361582. URL https://doi.org/10.1145/361573.361582. Place: New York, NY, USA Publisher: Association for Computing Machinery. Agnieszka Bier, Agnieszka Jastrzebska, and Pawel Olszewski. Variable-Length Multivariate Time Series Classification Using ROCKET: A Case Study of Incident Detection. IEEE Access, 10:95701–95715,

  2. [2]

    doi: 10.1109/ACCESS.2022.3203523

    ISSN 2169-3536. doi: 10.1109/ACCESS.2022.3203523. URL https://ieeexplore.ieee.org/document/9874797/. Rasmus Bro. PARAFAC. Tutorial and applications. Chemometrics and Intelligent Laboratory Systems, 38(2):149–171, October

  3. [3]

    doi: 10.1016/S0169-7439(97)00032-4

    ISSN 0169-7439. doi: 10.1016/S0169-7439(97)00032-4. URL https://www.sciencedirect.com/science/article/pii/S0169743997000324. Hoang Anh Dau, Anthony Bagnall, Kaveh Kamgar, Chin-Chia Michael Yeh, Yan Zhu, Shaghayegh Gharghabi, Chotirat Ann Ratanamahatana, and Eamonn Keogh. The UCR time series archive. IEEE/CAA Journal of Automatica Sinica, 6(6):1293–1305, November

  4. [4]

    doi: 10.1109/JAS.2019.1911747

    ISSN 2329-9274. doi: 10.1109/JAS.2019.1911747. URL https://ieeexplore.ieee.org/document/8894743. Conference Name: IEEE/CAA Journal of Automatica Sinica. Angus Dempster, François Petitjean, and Geoffrey I. Webb. ROCKET: exceptionally fast and accurate time series classification using random convolutional kernels. Data Min Knowl Disc, 34(5):1454–1495, September

  5. [5]

    doi: 10.1007/s10618-020-00701-z

    ISSN 1573-756X. doi: 10.1007/s10618-020-00701-z. URL https://doi.org/10.1007/s10618-020-00701-z. Ronald A Fisher. The use of multiple measurements in taxonomic problems. Annals of Eugenics, 7(2):179–188,

  6. [6]

    doi: 10.1007/s42081-023-00226-x

    ISSN 2520-8764. doi: 10.1007/s42081-023-00226-x. URL https://doi.org/10.1007/s42081-023-00226-x. Sugnet Gardner-Lubbe. Linear discriminant analysis for multiple functional data analysis. J Appl Stat, 48(11):1917–1933,

  7. [7]

    doi: 10.1080/02664763.2020.1780569

    ISSN 0266-4763 1360-0532. doi: 10.1080/02664763.2020.1780569. Place: England. Hunter Glanz and Luis Carvalho. An Expectation-Maximization Algorithm for the Matrix Normal Distribution, September

  8. [8]

    An Expectation-Maximization Algorithm for the Matrix Normal Distribution

    URL http://arxiv.org/abs/1309.6609. Gene Golub, Stephen Nash, and Charles Van Loan. A Hessenberg-Schur method for the problem AX + XB = C. IEEE Transactions on Automatic Control, 24(6):909–913,

  9. [9]

    doi: 10.1007/978-0-387-78189-1_6

    ISBN 978-0-387-78189-1. doi: 10.1007/978-0-387-78189-1_6. URL https://doi.org/10.1007/978-0-387-78189-1_6. Gareth M. James and Trevor J. Hastie. Functional Linear Discriminant Analysis for Irregularly Sampled Curves. Journal of the Royal Statistical Society. Series B (Statistical Methodology), 63(3):533–550,

  10. [10]

    URL http://www.jstor.org/stable/2680587

    ISSN 13697412, 14679868. URL http://www.jstor.org/stable/2680587. Publisher: [Royal Statistical Society, Wiley]. Andrew T. Jebb, Louis Tay, Wei Wang, and Qiming Huang. Time series analysis for psychological research: examining and forecasting change. Frontiers in Psychology, 6,

  11. [11]

    TensorLy: Tensor Learning in Python

    ISSN 1664-1078. URL https://www.frontiersin.org/journals/psychology/articles/10.3389/fpsyg.2015.00727. Jean Kossaifi, Yannis Panagakis, Anima Anandkumar, and Maja Pantic. Tensorly: Tensor learning in python. arXiv preprint arXiv:1610.09555,

  12. [12]

    doi: 10.3102/10769986221149140

    ISSN 1076-9986. doi: 10.3102/10769986221149140. URL https://doi.org/10.3102/10769986221149140. Publisher: American Educational Research Association. Jason Lines, Sarah Taylor, and Anthony Bagnall. HIVE-COTE: The Hierarchical Vote Collective of Transformation-Based Ensembles for Time Series Classification. In 2016 IEEE 16th International Conference on Data ...

  13. [13]

    doi: 10.1109/ICDM.2016

  14. [14]

    ISSN: 2374-8486

    URL https://ieeexplore.ieee.org/document/7837946. ISSN: 2374-8486. Connor Mclaughlin and Lili Su. Fedlda: Personalized federated learning through collaborative linear discriminant analysis. In International Workshop on Federated Learning in the Age of Foundation Models in Conjunction with NeurIPS 2023,

  15. [15]

    doi: 10.1201/9780203749289

    ISBN 978-0-203-74928-9. doi: 10.1201/9780203749289. Viet-Dung Nguyen, Karim Abed-Meraim, and Nguyen Linh-Trung. Fast adaptive parafac decomposition algorithm with linear complexity. In 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pages 6235–6239. IEEE,

  16. [16]

    URL http://www.jstor.org/stable/2246049

    ISSN 08834237. URL http://www.jstor.org/stable/2246049. Publisher: Institute of Mathematical Statistics. Nilam Ram and Kevin J. Grimm. Growth Mixture Modeling: A Method for Identifying Differences in Longitudinal Change Among Unobserved Groups. Int J Behav Dev, 33(6):565–576,

  17. [17]

    doi: 10.1177/0165025409343765

    ISSN 0165-0254. doi: 10.1177/0165025409343765. URL https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3718544/. C Radhakrishna Rao. The utilization of multiple measurements in problems of biological classification. Journal of the Royal Statistical Society. Series B (Methodological), 10(2):159–203,

  18. [18]

    doi: 10.1007/s10618-020-00727-3

    ISSN 1573-756X. doi: 10.1007/s10618-020-00727-3. URL https://doi.org/10.1007/s10618-020-00727-3. Youcef Saad and Martin H. Schultz. GMRES: A Generalized Minimal Residual Algorithm for Solving Nonsymmetric Linear Systems. SIAM J. Sci. and Stat. Comput., 7(3):856–869, July

  19. [19]

    Gmres: A generalized minimal residual algorithm for solving nonsymmetric linear systems,

    ISSN 0196-5204. doi: 10.1137/0907058. URL https://epubs.siam.org/doi/10.1137/0907058. Publisher: Society for Industrial and Applied Mathematics. Yue Song, Nicu Sebe, and Wei Wang. Fast differentiable matrix square root. arXiv preprint arXiv:2201.08663,

  20. [20]

    Symposium i anvendt statistik 2016, pages 108–123,

  21. [21]

    Inexact gmres iterations and relaxation strategies with fast-multipole boundary element method. Advances in Computational Mathematics, 48(3):32, 2022a

    Tingyu Wang, Simon K Layton, and Lorena A Barba. Inexact gmres iterations and relaxation strategies with fast-multipole boundary element method. Advances in Computational Mathematics, 48(3):32, 2022a. Will Ke Wang, Ina Chen, Leeor Hershkovich, Jiamu Yang, Ayush Shetty, Geetika Singh, Yihang Jiang, Aditya Kotla, Jason Zisheng Shang, Rushil Yerrabelli, Al...

  22. [22]

    URL http://www.jstor.org/stable/27590579

    ISSN 01621459. URL http://www.jstor.org/stable/27590579. Publisher: [American Statistical Association, Taylor & Francis, Ltd.]. Jinsung Yoon, William R. Zame, and Mihaela van der Schaar. Estimating Missing Data in Temporal Data Streams Using Multi-Directional Recurrent Neural Networks. IEEE Transactions on Biomedical Engineering, 66(5):1477–1490, May

  23. [23]

    doi: 10.1109/TBME.2018.2874712

    ISSN 1558-2531. doi: 10.1109/TBME.2018.2874712. URL https://ieeexplore.ieee.org/document/8485748. Conference Name: IEEE Transactions on Biomedical Engineering. Fa Zhu, Junbin Gao, Jian Yang, and Ning Ye. Neighborhood linear discriminant analysis. Pattern Recognition, 123:108422,