Dementia classification from spontaneous speech using wrapper-based feature selection

Marko Niemelä; Mikaela von Bonsdorff; Sami Äyrämö; Tommi Kärkkäinen

arXiv: 2502.03484 · v2 · submitted 2025-02-04 · eess.AS · cs.LG · cs.SD

Pith reviewed 2026-05-23 04:04 UTC · model grok-4.3

classification eess.AS · cs.LG · cs.SD

keywords dementia classification · spontaneous speech · acoustic features · wrapper feature selection · minimal learning machine · cognitive assessment

The pith

Acoustic features from entire speech recordings support competitive dementia classification while lowering computation.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper reports that acoustic features extracted from complete spontaneous speech recordings, rather than only from speech-active segments, can be used for dementia classification without loss of accuracy. Because each recording then yields fewer feature vectors, less data has to be processed. Wrapper-based feature selection then ranks the acoustic characteristics by their importance for distinguishing healthy speech from that of people with dementia. Among the evaluated models, the Extreme Minimal Learning Machine reaches competitive accuracy at substantially lower computational cost, which the authors attribute to its model formulation and learning procedure. A sympathetic reader would care because dementia diagnosis currently requires extensive clinical work, and a scalable speech-based method could expand access to early assessment.

Core claim

The authors claim that acoustic features taken from full recordings using standard extraction tools, when paired with classifier-based wrapper feature selection, produce dementia classification performance that matches results from speech-segment-only features, while the Extreme Minimal Learning Machine exhibits competitive accuracy at markedly reduced computational cost as an inherent result of its model structure and learning procedure.

What carries the argument

Classifier-based wrapper feature selection applied to acoustic feature vectors drawn from complete recordings to rank and retain diagnostically relevant characteristics.
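
As a rough illustration of what a classifier-based wrapper does, the sketch below runs greedy forward selection: at each step it adds the feature whose inclusion gives the best validation accuracy for a given classifier. This is an editorial sketch, not the authors' procedure. The ridge classifier, the single hold-out split, the function names, and the synthetic data are all assumptions made for the example; the paper's actual search strategy and scoring are not described on this page.

```python
import numpy as np

def ridge_fit(X, y, lam=1.0):
    """Closed-form ridge regression on +/-1 targets (bias via an appended column)."""
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])
    A = Xb.T @ Xb + lam * np.eye(Xb.shape[1])
    return np.linalg.solve(A, Xb.T @ y)

def ridge_predict(X, w):
    """Classify by the sign of the ridge regression output."""
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])
    return np.where(Xb @ w >= 0.0, 1, -1)

def holdout_accuracy(X_tr, y_tr, X_va, y_va, cols):
    """Train on the chosen columns and score on the validation split."""
    w = ridge_fit(X_tr[:, cols], y_tr)
    return float(np.mean(ridge_predict(X_va[:, cols], w) == y_va))

def forward_wrapper_selection(X_tr, y_tr, X_va, y_va, max_features):
    """Greedy forward selection driven by the classifier's validation accuracy."""
    selected, history = [], []
    remaining = list(range(X_tr.shape[1]))
    while remaining and len(selected) < max_features:
        scores = [(holdout_accuracy(X_tr, y_tr, X_va, y_va, selected + [j]), j)
                  for j in remaining]
        # Highest accuracy wins; ties go to the lowest feature index.
        best_score, best_j = max(scores, key=lambda s: (s[0], -s[1]))
        selected.append(best_j)
        remaining.remove(best_j)
        history.append(best_score)
    return selected, history

# Synthetic stand-in for recording-level acoustic features (hypothetical data).
rng = np.random.default_rng(0)
n = 200
y = rng.choice([-1, 1], size=n)
X = rng.normal(size=(n, 8))
X[:, 3] += 1.5 * y  # only feature 3 carries label information
selected, history = forward_wrapper_selection(
    X[:120], y[:120], X[120:], y[120:], max_features=3)
print(selected, history)
```

The order in which features enter `selected` serves as the importance ranking. In a real evaluation the selection would need to run inside each training fold, since choosing features on data that is later used for testing inflates the reported accuracy.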

If this is right

Fewer feature vectors need processing, directly lowering computational requirements.
Classification accuracy stays competitive despite the inclusion of non-speech material.
The Extreme Minimal Learning Machine provides an efficient option among tested models due to its built-in properties.
The resulting framework remains interpretable while functioning as a supportive assessment tool.
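
To make the third point more concrete, here is a minimal sketch of a distance-based ridge model in the spirit of the Extreme Minimal Learning Machine: inputs are mapped to their Euclidean distances from a set of reference points, and the output weights come from one regularised least-squares solve. This is an editorial reading, not the authors' implementation. The random choice of reference points, the value of `beta`, the function names, and the synthetic data are assumptions.

```python
import numpy as np

def distance_basis(X, refs):
    """Euclidean distances from each row of X to each reference point."""
    diff = X[:, None, :] - refs[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=2))

def fit_distance_ridge(X, Y, refs, beta=1e-2):
    """Solve the ridge-regularised least-squares problem on the distance basis."""
    H = distance_basis(X, refs)
    A = H.T @ H + beta * np.eye(H.shape[1])
    return np.linalg.solve(A, H.T @ Y)

def predict_labels(X, refs, W):
    """Pick the class whose output column is largest."""
    return np.argmax(distance_basis(X, refs) @ W, axis=1)

# Two well-separated synthetic classes (hypothetical data, not ADReSS or Pitt).
rng = np.random.default_rng(1)
n_per_class = 60
X = np.vstack([rng.normal(-2.0, 1.0, size=(n_per_class, 5)),
               rng.normal(+2.0, 1.0, size=(n_per_class, 5))])
labels = np.repeat([0, 1], n_per_class)
Y = np.eye(2)[labels]  # one-hot targets

# Assumption: reference points are a random subset of the training inputs.
ref_idx = rng.choice(len(X), size=20, replace=False)
refs = X[ref_idx]

W = fit_distance_ridge(X, Y, refs)
train_accuracy = float(np.mean(predict_labels(X, refs, W) == labels))
print(W.shape, train_accuracy)
```

In this sketch training involves no iterative optimisation, only a linear solve whose size equals the number of reference points. That is one plausible reading of why the low cost is described as a property of the formulation, but the page does not give the timing figures needed to confirm it.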

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The approach could remove the separate step of detecting and trimming speech activity in applied systems.
Comparable full-recording methods might be examined for detecting other conditions that alter speech patterns.
Reduced computation could allow testing on hardware with limited resources or in real-time settings.

Load-bearing premise

Acoustic features extracted from entire recordings, including non-speech segments, contain enough diagnostically relevant information to match the performance obtained from speech-active segments alone.

What would settle it

A side-by-side test on the same recordings showing that accuracy falls when full recordings are used instead of speech-only segments.
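
One way such a side-by-side test could be scored is a paired comparison of the two pipelines' predictions on the same recordings, for instance with an exact McNemar test. The sketch below is an editorial illustration only: the function name and the ten hypothetical predictions are invented for the example, and the paper is not known to use this test.

```python
from math import comb

def mcnemar_exact(y_true, pred_a, pred_b):
    """Two-sided exact McNemar test on paired predictions from two pipelines."""
    # b: pipeline A correct and B wrong; c: A wrong and B correct.
    b = sum(1 for t, a, p in zip(y_true, pred_a, pred_b) if a == t and p != t)
    c = sum(1 for t, a, p in zip(y_true, pred_a, pred_b) if a != t and p == t)
    n = b + c
    if n == 0:
        return b, c, 1.0  # the pipelines never disagree on correctness
    k = min(b, c)
    # Binomial tail under the null that disagreements split 50/50.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return b, c, min(1.0, 2.0 * tail)

# Hypothetical predictions for ten recordings (1 = dementia, 0 = control).
y_true    = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
pred_full = [1, 1, 1, 0, 1, 0, 0, 0, 1, 0]  # features from full recordings
pred_vad  = [1, 1, 0, 0, 1, 0, 0, 1, 1, 0]  # features from speech-only segments
b, c, p_value = mcnemar_exact(y_true, pred_full, pred_vad)
print(b, c, p_value)
```

A non-significant result here would not by itself show that the two representations are equivalent; an equivalence claim would need a margin fixed in advance and enough recordings to detect a drop of that size.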

Figures

Figures reproduced from arXiv: 2502.03484 by Marko Niemelä, Mikaela von Bonsdorff, Sami Äyrämö, Tommi Kärkkäinen.

**Figure 2.** Confusion matrix for dementia diagnosis based on Ridge regression.

Original abstract

Dementia encompasses a group of syndromes that impair cognitive functions such as memory, reasoning, and the ability to perform daily activities. As populations globally age, over 10 million new dementia diagnoses are reported annually. Currently, clinical diagnosis of dementia remains challenging due to overlapping symptoms, the need to exclude alternative conditions and the requirement for a comprehensive clinical evaluation and cognitive assessment. This underscores the growing need to develop feasible and accurate methods for detecting cognitive deficiencies. Recent advances in machine learning have highlighted spontaneous speech as a promising noninvasive, cost-effective, and scalable biomarker for dementia detection. In this study, spontaneous speech recordings from the ADReSS and Pitt Corpus datasets are analyzed, consisting of picture description tasks performed by cognitively healthy individuals and people with Alzheimer's disease. Unlike prior approaches that focus solely on speech-active segments, acoustic features are extracted from entire recordings using the openSMILE toolkit. This representation reduces the number of feature vectors and improves computational efficiency without compromising classification performance. Classification models with classifier-based wrapper feature selection are employed to estimate feature importance and identify diagnostically relevant acoustic characteristics. Among the evaluated models, the Extreme Minimal Learning Machine achieved competitive classification accuracy with substantially lower computational cost, reflecting an inherent property of the model formulation and learning procedure. Overall, the results demonstrate that the proposed framework is computationally efficient, interpretable, and well suited as a supportive tool for speech-based dementia assessment.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note: a private letter to a colleague

The paper's efficiency claim about full recordings rests on an untested assumption rather than a direct comparison.

Your colleague should know that this paper's efficiency story depends on an assumption that hasn't been checked with a direct comparison. The work extracts openSMILE acoustic features from complete recordings of picture description tasks in the ADReSS and Pitt corpora, rather than trimming to speech segments only. They apply wrapper feature selection with several classifiers and single out the Extreme Minimal Learning Machine for its speed and decent accuracy. This combination is a reasonable extension of earlier speech biomarker studies, and the focus on computational cost is practical given the scale of potential screening.

What stands out as solid is the EMLM result itself. The model formulation allows fast training, and if the numbers hold, it could be a useful option when resources are limited. The wrapper approach also lets them rank features, which adds some transparency about what acoustic cues are driving the decisions.

The main weakness is the lack of evidence for the 'no compromise' claim. The abstract states that full recordings reduce vectors without hurting performance, but there is no side-by-side experiment on the same files with and without voice activity detection. That makes the central efficiency argument rest on an untested modeling choice rather than observed data. The abstract also gives no actual accuracy figures, cross-validation details, or statistical tests, so the 'competitive' label is hard to evaluate from what's shown.

This paper is aimed at applied researchers in clinical speech processing who already know the standard datasets and toolkits. It won't change the field, but someone building a pipeline might pick up the full-recording idea or try EMLM if they need speed.

I would recommend sending it for peer review. The topic is relevant, the methods are standard but applied in a slightly different way, and a referee could push for the missing ablation and numbers. It is not ready as is, but it is coherent enough to deserve that step.

Referee Report

2 major / 1 minor

Summary. The manuscript proposes extracting acoustic features from entire spontaneous speech recordings (including non-speech segments) using openSMILE on the ADReSS and Pitt corpora for dementia classification. It applies wrapper-based feature selection across multiple classifiers and identifies the Extreme Minimal Learning Machine (EMLM) as achieving competitive accuracy with substantially lower computational cost due to its formulation, claiming the full-recording approach reduces feature vectors without compromising performance.

Significance. If the performance equivalence holds, the work provides a scalable, efficient alternative to VAD-based pipelines for speech-based dementia assessment, with the EMLM results offering a concrete efficiency advantage. The wrapper selection for interpretability is a secondary strength.

major comments (2)

[Abstract and §3 (feature extraction)] The central claim that full-recording openSMILE features yield equivalent diagnostic utility 'without compromising classification performance' is asserted but unsupported by any side-by-side ablation against speech-active segments on the same ADReSS/Pitt files; this equivalence is load-bearing for both the efficiency narrative and the EMLM results.
[Results] The abstract and reader's summary report no quantitative accuracies, cross-validation protocol details, baseline comparisons to prior ADReSS work, or statistical significance tests; without these, the 'competitive' claim cannot be evaluated.

minor comments (1)

[Abstract] The abstract lacks any numerical performance or cost metrics, which should be added for a self-contained summary.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive comments. We respond point-by-point to the major concerns below.

Referee: [Abstract and §3 (feature extraction)] The central claim that full-recording openSMILE features yield equivalent diagnostic utility 'without compromising classification performance' is asserted but unsupported by any side-by-side ablation against speech-active segments on the same ADReSS/Pitt files; this equivalence is load-bearing for both the efficiency narrative and the EMLM results.

Authors: We acknowledge that the manuscript does not contain a direct ablation comparing full-recording features to VAD-segmented features on the identical files. The claim of no performance compromise is supported by the observed competitive accuracies relative to published VAD-based results on the same corpora, but a within-study comparison is absent. In revision we will add an explicit discussion of this point in §3 and, if feasible within the experimental setup, include a limited comparison using a standard VAD pipeline on the ADReSS and Pitt recordings.
revision: yes

Referee: [Results] The abstract and reader's summary report no quantitative accuracies, cross-validation protocol details, baseline comparisons to prior ADReSS work, or statistical significance tests; without these, the 'competitive' claim cannot be evaluated.

Authors: The Results section of the full manuscript reports the quantitative accuracies, the 10-fold cross-validation protocol, direct numerical comparisons to prior ADReSS studies, and statistical significance testing. The abstract, however, omits these figures. We will revise the abstract to include the key performance numbers and ensure the Results section makes the protocol, baselines, and tests fully explicit.
revision: yes
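
For readers unfamiliar with the protocol named in the rebuttal, the sketch below builds stratified 10-fold splits so that each fold keeps roughly the same class balance. It is an editorial illustration: the function name, the seed, and the balanced 100-recording label list are assumptions, and the manuscript's exact fold construction is not given on this page.

```python
import random
from collections import defaultdict

def stratified_kfold(labels, k=10, seed=0):
    """Yield (train_idx, test_idx) pairs with class proportions roughly preserved."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for i, lab in enumerate(labels):
        by_class[lab].append(i)
    folds = [[] for _ in range(k)]
    for idxs in by_class.values():
        rng.shuffle(idxs)
        # Deal each class's recordings round-robin across the folds.
        for pos, i in enumerate(idxs):
            folds[pos % k].append(i)
    for f in range(k):
        test = sorted(folds[f])
        train = sorted(i for g in range(k) if g != f for i in folds[g])
        yield train, test

# Hypothetical balanced label list: 50 controls (0) and 50 dementia cases (1).
labels = [0] * 50 + [1] * 50
splits = list(stratified_kfold(labels, k=10))
for train, test in splits[:2]:
    print(len(train), len(test), sum(labels[i] for i in test))
```

Two details matter when such a protocol is used to back a 'competitive' claim. Feature selection and hyperparameter tuning should be repeated inside each training fold, and if a corpus contains several recordings from the same speaker, folds would need to be grouped by speaker rather than by recording.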

Circularity Check

0 steps flagged

No significant circularity; standard empirical ML pipeline

The paper reports an experimental pipeline: openSMILE feature extraction from full recordings on ADReSS/Pitt data, wrapper feature selection, and classifier comparison (including EMLM). The efficiency claim and 'without compromising performance' statement are presented as outcomes of their runs rather than definitions or fitted inputs renamed as predictions. No equations, self-citations, uniqueness theorems, or ansatzes appear in the provided text that would reduce any central result to its own inputs by construction. Comparisons are internal to the same datasets and models, which is conventional and externally falsifiable via replication on the public corpora. The derivation chain is self-contained experimental reporting.

Axiom & Free-Parameter Ledger

2 free parameters · 2 axioms · 0 invented entities

The central claim depends on standard assumptions in machine learning for audio classification and the validity of the chosen datasets; no new entities are introduced, but the feature selection process introduces data-dependent choices.

free parameters (2)

selected acoustic features: chosen via the wrapper method based on classification performance on the training data
hyperparameters of classification models: tuned to achieve optimal performance on the datasets

axioms (2)

domain assumption: acoustic features from full audio recordings contain diagnostically relevant information for dementia without requiring speech segmentation (the abstract states that this representation improves efficiency without compromising performance)
domain assumption: the ADReSS and Pitt Corpus datasets are representative for evaluating dementia classification from speech (used as the basis for analysis)

pith-pipeline@v0.9.0 · 5797 in / 1497 out tokens · 76483 ms · 2026-05-23T04:04:19.545177+00:00 · methodology

Reference graph

Works this paper leans on

81 extracted references · 81 canonical work pages · 1 internal anchor

[1]

W. H. Organization, et al., Global action plan on the public health response to dementia 2017–2025, World Health Organization, 2017

work page 2017
[2]

Y. Wang, M. L. Haaksma, I. H. Ramakers, F. R. Verhey, W. M. van de Flier, P. Scheltens, I. van Maurik, M. G. Olde Rikkert, J.-M. S. Le- outsakos, R. J. Melis, Cognitive and functional progression of dementia in two longitudinal studies, International journal of geriatric psychiatry 34 (11) (2019) 1623–1632. 23

work page 2019
[3]

Ngandu, J

T. Ngandu, J. Lehtisalo, A. Solomon, E. Lev¨ alahti, S. Ahtiluoto, R. An- tikainen, L. B¨ ackman, T. H¨ anninen, A. Jula, T. Laatikainen, et al., A 2 year multidomain intervention of diet, exercise, cognitive training, and vascular risk monitoring versus control to prevent cognitive decline in at-risk elderly people (FINGER): a randomised controlled trial...

work page 2015
[4]

Kulmala, T

J. Kulmala, T. Ngandu, S. Havulinna, E. Lev¨ alahti, J. Lehtisalo, A. Solomon, R. Antikainen, T. Laatikainen, P. Pippola, M. Peltonen, et al., The effect of multidomain lifestyle intervention on daily function- ing in older people, Journal of the American Geriatrics Society 67 (6) (2019) 1138–1144

work page 2019
[5]

Kivipelto, F

M. Kivipelto, F. Mangialasche, H. M. Snyder, R. Allegri, S. Andrieu, H. Arai, L. Baker, S. Belleville, H. Brodaty, S. M. Brucki, et al., World- wide fingers network: a global approach to risk reduction and prevention of dementia, Alzheimer’s & dementia 16 (7) (2020) 1078–1094

work page 2020
[6]

C. R. Jack Jr, D. A. Bennett, K. Blennow, M. C. Carrillo, B. Dunn, S. B. Haeberlein, D. M. Holtzman, W. Jagust, F. Jessen, J. Karlawish, et al., Nia-aa research framework: toward a biological definition of alzheimer’s disease, Alzheimer’s & dementia 14 (4) (2018) 535–562

work page 2018
[7]

Hanyu, T

H. Hanyu, T. Asano, T. Iwamoto, M. Takasaki, H. Shindo, K. Abe, Magnetization transfer measurements of the hippocampus in patients with alzheimer’s disease, vascular dementia, and other types of demen- tia, American journal of neuroradiology 21 (7) (2000) 1235–1242

work page 2000
[8]

F.-Y. Chiu, Y. Yen, Imaging biomarkers for clinical applications in neuro-oncology: current status and future perspectives, Biomarker Re- search 11 (1) (2023) 35

work page 2023
[9]

P. N. Young, M. Estarellas, E. Coomans, M. Srikrishna, H. Beaumont, A. Maass, A. V. Venkataraman, R. Lissaman, D. Jim´ enez, M. J. Betts, et al., Imaging biomarkers in neurodegeneration: current and future practices, Alzheimer’s research & therapy 12 (2020) 1–17

work page 2020
[10]

M. A. Ebrahimighahnavieh, S. Luo, R. Chiong, Deep learning to detect alzheimer’s disease from neuroimaging: A systematic literature review, Computer methods and programs in biomedicine 187 (2020) 105242. 24

work page 2020
[11]

mini-mental state

M. F. Folstein, S. E. Folstein, P. R. McHugh, “mini-mental state”: a practical method for grading the cognitive state of patients for the clin- ician, Journal of psychiatric research 12 (3) (1975) 189–198

work page 1975
[12]

Z. S. Nasreddine, N. A. Phillips, V. B´ edirian, S. Charbonneau, V. White- head, I. Collin, J. L. Cummings, H. Chertkow, The montreal cognitive assessment, moca: a brief screening tool for mild cognitive impairment, Journal of the American Geriatrics Society 53 (4) (2005) 695–699

work page 2005
[13]

J. C. Morris, A. Heyman, R. C. Mohs, J. P. Hughes, G. van Belle, G. Fil- lenbaum, E. D. Mellits, C. Clark, The consortium to establish a registry for alzheimer’s disease (cerad). part i. clinical and neuropsychological assessment of alzheimer’s disease., Neurology 39 (9) (1989) 1159–1165

work page 1989
[14]

Zorluoglu, M

G. Zorluoglu, M. E. Kamasak, L. Tavacioglu, P. O. Ozanar, A mobile application for cognitive screening of dementia, Computer methods and programs in biomedicine 118 (2) (2015) 252–262

work page 2015
[15]

Reilly, J

J. Reilly, J. E. Peelle, S. M. Antonucci, M. Grossman, Anomia as a marker of distinct semantic memory impairments in alzheimer’s disease and semantic dementia., Neuropsychology 25 (4) (2011) 413

work page 2011
[16]

Ivanova, I

O. Ivanova, I. Mart´ ınez-Nicol´ as, E. Garc´ ıa-Pi˜ nuela, J. J. G. Meil´ an, Defying syntactic preservation in alzheimer’s disease: what type of im- pairment predicts syntactic change in dementia (if it does) and why?, Frontiers in Language Sciences 2 (2023) 1199107

work page 2023
[17]

K. C. Fraser, J. A. Meltzer, F. Rudzicz, Linguistic features identify alzheimer’s disease in narrative speech, Journal of Alzheimer’s disease 49 (2) (2015) 407–422

work page 2015
[18]

De la Fuente Garcia, C

S. De la Fuente Garcia, C. W. Ritchie, S. Luz, Artificial intelligence, speech, and language processing approaches to monitoring alzheimer’s disease: a systematic review, Journal of Alzheimer’s Disease 78 (4) (2020) 1547–1574

work page 2020
[19]

Beltrami, G

D. Beltrami, G. Gagliardi, R. Rossini Favretti, E. Ghidoni, F. Tam- burini, L. Calz` a, Speech analysis by natural language processing tech- niques: a possible tool for very early detection of cognitive decline?, Frontiers in aging neuroscience 10 (2018) 369. 25

work page 2018
[20]

Lopez-de Ipina, J

K. Lopez-de Ipina, J. Alonso-Hernandez, J. Sole-Casals, C. M. Travieso- Gonzalez, A. Ezeiza, M. Faundez-Zanuy, P. M. Calvo, B. Beitia, Fea- ture selection for automatic analysis of emotional response based on nonlinear speech modeling suitable for diagnosis of alzheimer’s disease, Neurocomputing 150 (2015) 392–401

work page 2015
[21]

Tanaka, H

H. Tanaka, H. Adachi, N. Ukita, M. Ikeda, H. Kazui, T. Kudo, S. Naka- mura, Detecting dementia through interactive computer avatars, IEEE journal of translational engineering in health and medicine 5 (2017) 1– 11

work page 2017
[22]

S. Luz, F. Haider, S. de la Fuente Garcia, D. Fromm, B. MacWhinney, Alzheimer’s dementia recognition through spontaneous speech, Frontiers in computer science 3 (2021) 780169

work page 2021
[23]

J. J. G. Meil´ an, F. Mart´ ınez-S´ anchez, J. Carro, D. E. L´ opez, L. Millian- Morell, J. M. Arana, Speech in alzheimer’s disease: can temporal and acoustic parameters discriminate dementia?, Dementia and geriatric cognitive disorders 37 (5-6) (2014) 327–334

work page 2014
[24]

K. R. Scherer, T. Johnstone, G. Klasmeyer, Vocal expression of emotion, Handbook of affective sciences (2003) 433–456

work page 2003
[25]

Mart´ ınez-Nicol´ as, T

I. Mart´ ınez-Nicol´ as, T. E. Llorente, F. Mart´ ınez-S´ anchez, J. J. G. Meil´ an, Ten years of research on automatic voice and speech analysis of people with alzheimer’s disease and mild cognitive impairment: a systematic review article, Frontiers in Psychology 12 (2021) 620251

work page 2021
[26]

Pistono, M

A. Pistono, M. Jucla, E. J. Barbeau, L. Saint-Aubert, B. Lemesle, B. Calvet, B. K¨ opke, M. Puel, J. Pariente, Pauses during autobiograph- ical discourse reflect episodic memory processes in early alzheimer’s dis- ease, Journal of Alzheimer’s disease 50 (3) (2016) 687–698

work page 2016
[27]

Pappagari, J

R. Pappagari, J. Cho, L. Moro-Velazquez, N. Dehak, Using state of the art speaker recognition and natural language processing technologies to detect alzheimer’s disease and assess its severity., in: Interspeech, 2020, pp. 2177–2181

work page 2020
[28]

Meghanani, C

A. Meghanani, C. S. Anoop, A. Ramakrishnan, An exploration of log- mel spectrogram and mfcc features for alzheimer’s dementia recognition 26 from spontaneous speech, in: 2021 IEEE spoken language technology workshop (SLT), IEEE, 2021, pp. 670–677

work page 2021
[29]

R. J. Morris, C. Oh, P. Franklin, Second formant transitions for acous- tic analysis to differentiate among dementia types, The Journal of the Acoustical Society of America 154 (4 supplement) (2023) A206–A206

work page 2023
[30]

M. M. Parlak, G. Saylam, M. A. Babademez, ¨O. B. Munis, S. A. Tokg¨ oz, Voice analysis results in individuals with alzheimer’s disease: How do age and cognitive status affect voice parameters?, Brain and Behavior 13 (11) (2023) e3271

work page 2023
[31]

Eyben, K

F. Eyben, K. R. Scherer, B. W. Schuller, J. Sundberg, E. Andr´ e, C. Busso, L. Y. Devillers, J. Epps, P. Laukka, S. S. Narayanan, et al., The geneva minimalistic acoustic parameter set (gemaps) for voice re- search and affective computing, IEEE transactions on affective comput- ing 7 (2) (2015) 190–202

work page 2015
[33]

Eyben, M

F. Eyben, M. W¨ ollmer, B. Schuller, Opensmile: the munich versatile and fast open-source audio feature extractor, in: Proceedings of the 18th ACM international conference on Multimedia, 2010, pp. 1459–1462

work page 2010
[34]

Degottex, J

G. Degottex, J. Kane, T. Drugman, T. Raitio, S. Scherer, Covarep—a collaborative voice analysis repository for speech technologies, in: 2014 ieee international conference on acoustics, speech and signal processing (icassp), IEEE, 2014, pp. 960–964

work page 2014
[35]

Snyder, D

D. Snyder, D. Garcia-Romero, G. Sell, D. Povey, S. Khudanpur, X- vectors: Robust dnn embeddings for speaker recognition, in: 2018 IEEE international conference on acoustics, speech and signal processing (ICASSP), IEEE, 2018, pp. 5329–5333

work page 2018
[36]

I. T. Jolliffe, Principal component analysis and factor analysis, Principal component analysis (2002) 150–166

work page 2002
[37]

Cummins, Y

N. Cummins, Y. Pan, Z. Ren, J. Fritsch, V. S. Nallanthighal, H. Chris- tensen, D. Blackburn, B. W. Schuller, M. Magimai-Doss, H. Strik, et al., A comparison of acoustic and linguistics methodologies for alzheimer’s 27 dementia recognition, in: Interspeech 2020, ISCA-International Speech Communication Association, 2020, pp. 2182–2186

work page 2020
[38]

Gangamohan, B

P. Gangamohan, B. Yegnanarayana, A robust and alternative approach to zero frequency filtering method for epoch extraction., in: INTER- SPEECH, 2017, pp. 2297–2300

work page 2017
[39]

Z. Li, F. Liu, W. Yang, S. Peng, J. Zhou, A survey of convolutional neu- ral networks: analysis, applications, and prospects, IEEE transactions on neural networks and learning systems 33 (12) (2021) 6999–7019

work page 2021
[40]

J. F. Gemmeke, D. P. Ellis, D. Freedman, A. Jansen, W. Lawrence, R. C. Moore, M. Plakal, M. Ritter, Audio set: An ontology and human- labeled dataset for audio events, in: 2017 IEEE international conference on acoustics, speech and signal processing (ICASSP), IEEE, 2017, pp. 776–780

work page 2017
[41]

J. Koo, J. H. Lee, J. Pyo, Y. Jo, K. Lee, Exploiting multi-modal features from pre-trained networks for alzheimer’s dementia recognition, arXiv preprint arXiv:2009.04070 (2020)

work page arXiv 2009
[42]

C. M. Bishop, N. M. Nasrabadi, Pattern recognition and machine learn- ing, Vol. 4, Springer, 2006

work page 2006
[43]

Dehak, P

N. Dehak, P. J. Kenny, R. Dehak, P. Dumouchel, P. Ouellet, Front-end factor analysis for speaker verification, IEEE Transactions on Audio, Speech, and Language Processing 19 (4) (2010) 788–798

work page 2010
[44]

Pompili, T

A. Pompili, T. Rolland, A. Abad, The inesc-id multi-modal system for the adress 2020 challenge, arXiv preprint arXiv:2005.14646 (2020)

work page arXiv 2020
[45]

D. A. Reynolds, T. F. Quatieri, R. B. Dunn, Speaker verification us- ing adapted gaussian mixture models, Digital signal processing 10 (1-3) (2000) 19–41

work page 2000
[46]

Nagrani, J

A. Nagrani, J. S. Chung, A. Zisserman, Voxceleb: a large-scale speaker identification dataset, arXiv preprint arXiv:1706.08612 (2017)

work page arXiv 2017
[47]

Schmitt, B

M. Schmitt, B. Schuller, openxbow–introducing the passau open-source crossmodal bag-of-words toolkit, Journal of Machine Learning Research 18 (96) (2017) 1–5. 28

work page 2017
[48]

Chopra, R

S. Chopra, R. Hadsell, Y. LeCun, Learning a similarity metric discrim- inatively, with application to face verification, in: 2005 IEEE com- puter society conference on computer vision and pattern recognition (CVPR’05), Vol. 1, IEEE, 2005, pp. 539–546

work page 2005
[49]

M. S. S. Syed, Z. S. Syed, M. Lech, E. Pirogova, Automated screening for alzheimer’s dementia through spontaneous speech., in: Interspeech, Vol. 2020, 2020, pp. 2222–6

work page 2020
[50]

J. Chen, Y. Wang, D. Wang, A feature study for classification-based speech separation at low signal-to-noise ratios, IEEE/ACM Transactions on Audio, Speech, and Language Processing 22 (12) (2014) 1993–2002

work page 2014
[51]

Kohonen, The self-organizing map, Proceedings of the IEEE 78 (9) (1990) 1464–1480

T. Kohonen, The self-organizing map, Proceedings of the IEEE 78 (9) (1990) 1464–1480

work page 1990
[52]

Y. Tian, L. He, Z.-y. Li, W.-l. Wu, W.-Q. Zhang, J. Liu, Speaker ver- ification using fisher vector, in: The 9th International Symposium on Chinese Spoken Language Processing, IEEE, 2014, pp. 419–422

work page 2014
[53]

A. P. Dempster, N. M. Laird, D. B. Rubin, Maximum likelihood from incomplete data via the em algorithm, Journal of the royal statistical society: series B (methodological) 39 (1) (1977) 1–22

work page 1977
[54]

Martinc, S

M. Martinc, S. Pollak, Tackling the adress challenge: A multimodal approach to the automated recognition of alzheimer’s dementia., in: In- terspeech, 2020, pp. 2157–2161

work page 2020
[55]

Balagopalan, B

A. Balagopalan, B. Eyre, F. Rudzicz, J. Novikova, To bert or not to bert: comparing speech and language-based approaches for alzheimer’s disease detection, arXiv preprint arXiv:2008.01551 (2020)

work page arXiv 2008
[56]

J. H. Friedman, Greedy function approximation: a gradient boosting machine, Annals of statistics (2001) 1189–1232

work page 2001
[57]

Edwards, C

E. Edwards, C. Dognin, B. Bollepalli, M. K. Singh, V. Analytics, Multi- scale system for alzheimer’s dementia recognition through spontaneous speech., in: Interspeech, 2020, pp. 2197–2201

work page 2020
[58]

Haider, S

F. Haider, S. De La Fuente, S. Luz, An assessment of paralinguistic acoustic features for detection of alzheimer’s dementia in spontaneous 29 speech, IEEE Journal of Selected Topics in Signal Processing 14 (2) (2019) 272–281

work page 2019
[59]

Rohanian, J

M. Rohanian, J. Hough, M. Purver, Multi-modal fusion with gating using audio, lexical and disfluency features for alzheimer’s dementia recognition from spontaneous speech, arXiv preprint arXiv:2106.09668 (2021)

work page arXiv 2021
[60]

D. E. Rumelhart, G. E. Hinton, R. J. Williams, Learning representations by back-propagating errors, nature 323 (6088) (1986) 533–536

work page 1986
[61]

Gated Multimodal Units for Information Fusion

J. Arevalo, T. Solorio, M. Montes-y G´ omez, F. A. Gonz´ alez, Gated mul- timodal units for information fusion, arXiv preprint arXiv:1702.01992 (2017)

work page internal anchor Pith review Pith/arXiv arXiv 2017
[62]

J. T. Becker, F. Boiler, O. L. Lopez, J. Saxton, K. L. McGonigle, The natural history of alzheimer’s disease: description of study cohort and accuracy of diagnosis, Archives of neurology 51 (6) (1994) 585–594

work page 1994
[63]

M. W. Fong, R. Van Patten, R. P. Fucetola, The factor structure of the boston diagnostic aphasia examination, Journal of the International Neuropsychological Society 25 (7) (2019) 772–776

work page 2019
[64]

EBU-Recommendation, Loudness normalisation and permitted max- imum level of audio signals, Eur

R. EBU-Recommendation, Loudness normalisation and permitted max- imum level of audio signals, Eur. Broadcast. Union (2011)

work page 2011
[65]

Eyben, Real-time speech and music classification by large audio fea- ture space extraction, Springer, 2015

F. Eyben, Real-time speech and music classification by large audio fea- ture space extraction, Springer, 2015

work page 2015
[66]

A. E. Hoerl, R. W. Kennard, Ridge regression: Biased estimation for nonorthogonal problems, Technometrics 12 (1) (1970) 55–67

work page 1970
[67]

K¨ arkk¨ ainen, Extreme minimal learning machine: Ridge regression with distance-based basis, Neurocomputing 342 (2019) 33–48

T. K¨ arkk¨ ainen, Extreme minimal learning machine: Ridge regression with distance-based basis, Neurocomputing 342 (2019) 33–48

work page 2019
[68]

H¨ am¨ al¨ ainen, A

J. H¨ am¨ al¨ ainen, A. S. C. Alencar, T. K¨ arkk¨ ainen, C. L. C. Mattos, H. S. Amauri, J. P. P. Gomes, Minimal learning machine: Theoretical re- sults and clustering-based reference point selection, Journal of Machine Learning Research 21 (239) (2020) 1–29. 30

work page 2020
[69]

T. K¨ arkk¨ ainen, Assessment of feature saliency of mlp using analytic sensitivity., in: European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning, 2015, pp. 273–278

work page 2015
[70]

T. K¨ arkk¨ ainen, S.¨Ayr¨ am¨ o, On computation of spatial median for ro- bust data mining, Evolutionary and Deterministic Methods for Design, Optimization and Control with Applications to Industrial and Societal Problems, EUROGEN, Munich (2005) 14

work page 2005
[71]

Cortes, Support-vector networks, Machine Learning (1995)

C. Cortes, Support-vector networks, Machine Learning (1995)

work page 1995
[72]

Ruppert, The elements of statistical learning: data mining, inference, and prediction (2004)

D. Ruppert, The elements of statistical learning: data mining, inference, and prediction (2004)

work page 2004
[73]

Wolfowitz, Non-parametric statistical inference, in: Proceedings of the [First] Berkeley Symposium on Mathematical Statistics and Proba- bility, Vol

J. Wolfowitz, Non-parametric statistical inference, in: Proceedings of the [First] Berkeley Symposium on Mathematical Statistics and Proba- bility, Vol. 1, University of California Press, 1949, pp. 93–114

work page 1949
[74]

Hastie, J

T. Hastie, J. Qian, Glmnet vignette, Retrieved June 9 (2016) (2014) 1–30

work page 2016
[75]

Platt, Sequential minimal optimization: A fast algorithm for train- ing support vector machines, Tech

J. Platt, Sequential minimal optimization: A fast algorithm for train- ing support vector machines, Tech. rep., Microsoft Research Technical Report (1998)

work page 1998
[76]

L. Breiman, Random forests, Machine learning 45 (2001) 5–32

[77]

M. A. Hall, Correlation-based feature selection for machine learning, Ph.D. thesis, The University of Waikato (1999)

[78]

I. Guyon, J. Weston, S. Barnhill, V. Vapnik, Gene selection for cancer classification using support vector machines, Machine learning 46 (2002) 389–422

[79]

K. Ahn, M. Cho, S. W. Kim, K. E. Lee, Y. Song, S. Yoo, S. Y. Jeon, J. L. Kim, D. H. Yoon, H.-J. Kong, Deep learning of speech data for early detection of Alzheimer’s disease in the elderly, Bioengineering 10 (9) (2023) 1093

[80]

L. Huang, H. Yang, Y. Che, J. Yang, Automatic speech analysis for detecting cognitive decline of older adults, Frontiers in Public Health 12 (2024) 1417966

[81]

P. Mahajan, V. Baths, Acoustic and language based deep learning approaches for Alzheimer’s dementia detection from spontaneous speech, Frontiers in Aging Neuroscience 13 (2021) 623607


[1]

World Health Organization, et al., Global action plan on the public health response to dementia 2017–2025, World Health Organization, 2017

[2]

Y. Wang, M. L. Haaksma, I. H. Ramakers, F. R. Verhey, W. M. van de Flier, P. Scheltens, I. van Maurik, M. G. Olde Rikkert, J.-M. S. Leoutsakos, R. J. Melis, Cognitive and functional progression of dementia in two longitudinal studies, International journal of geriatric psychiatry 34 (11) (2019) 1623–1632

[3]

T. Ngandu, J. Lehtisalo, A. Solomon, E. Levälahti, S. Ahtiluoto, R. Antikainen, L. Bäckman, T. Hänninen, A. Jula, T. Laatikainen, et al., A 2 year multidomain intervention of diet, exercise, cognitive training, and vascular risk monitoring versus control to prevent cognitive decline in at-risk elderly people (FINGER): a randomised controlled trial...

[4]

J. Kulmala, T. Ngandu, S. Havulinna, E. Levälahti, J. Lehtisalo, A. Solomon, R. Antikainen, T. Laatikainen, P. Pippola, M. Peltonen, et al., The effect of multidomain lifestyle intervention on daily functioning in older people, Journal of the American Geriatrics Society 67 (6) (2019) 1138–1144

[5]

M. Kivipelto, F. Mangialasche, H. M. Snyder, R. Allegri, S. Andrieu, H. Arai, L. Baker, S. Belleville, H. Brodaty, S. M. Brucki, et al., World-wide FINGERS network: a global approach to risk reduction and prevention of dementia, Alzheimer’s & dementia 16 (7) (2020) 1078–1094

[6]

C. R. Jack Jr, D. A. Bennett, K. Blennow, M. C. Carrillo, B. Dunn, S. B. Haeberlein, D. M. Holtzman, W. Jagust, F. Jessen, J. Karlawish, et al., NIA-AA research framework: toward a biological definition of Alzheimer’s disease, Alzheimer’s & dementia 14 (4) (2018) 535–562

[7]

H. Hanyu, T. Asano, T. Iwamoto, M. Takasaki, H. Shindo, K. Abe, Magnetization transfer measurements of the hippocampus in patients with Alzheimer’s disease, vascular dementia, and other types of dementia, American journal of neuroradiology 21 (7) (2000) 1235–1242

[8]

F.-Y. Chiu, Y. Yen, Imaging biomarkers for clinical applications in neuro-oncology: current status and future perspectives, Biomarker Research 11 (1) (2023) 35

[9]

P. N. Young, M. Estarellas, E. Coomans, M. Srikrishna, H. Beaumont, A. Maass, A. V. Venkataraman, R. Lissaman, D. Jiménez, M. J. Betts, et al., Imaging biomarkers in neurodegeneration: current and future practices, Alzheimer’s research & therapy 12 (2020) 1–17

[10]

M. A. Ebrahimighahnavieh, S. Luo, R. Chiong, Deep learning to detect Alzheimer’s disease from neuroimaging: A systematic literature review, Computer methods and programs in biomedicine 187 (2020) 105242

[11]

M. F. Folstein, S. E. Folstein, P. R. McHugh, “mini-mental state”: a practical method for grading the cognitive state of patients for the clinician, Journal of psychiatric research 12 (3) (1975) 189–198

[12]

Z. S. Nasreddine, N. A. Phillips, V. Bédirian, S. Charbonneau, V. Whitehead, I. Collin, J. L. Cummings, H. Chertkow, The Montreal cognitive assessment, MoCA: a brief screening tool for mild cognitive impairment, Journal of the American Geriatrics Society 53 (4) (2005) 695–699

[13]

J. C. Morris, A. Heyman, R. C. Mohs, J. P. Hughes, G. van Belle, G. Fillenbaum, E. D. Mellits, C. Clark, The consortium to establish a registry for Alzheimer’s disease (CERAD). Part I. Clinical and neuropsychological assessment of Alzheimer’s disease, Neurology 39 (9) (1989) 1159–1165

[14]

G. Zorluoglu, M. E. Kamasak, L. Tavacioglu, P. O. Ozanar, A mobile application for cognitive screening of dementia, Computer methods and programs in biomedicine 118 (2) (2015) 252–262

[15]

J. Reilly, J. E. Peelle, S. M. Antonucci, M. Grossman, Anomia as a marker of distinct semantic memory impairments in Alzheimer’s disease and semantic dementia, Neuropsychology 25 (4) (2011) 413

[16]

O. Ivanova, I. Martínez-Nicolás, E. García-Piñuela, J. J. G. Meilán, Defying syntactic preservation in Alzheimer’s disease: what type of impairment predicts syntactic change in dementia (if it does) and why?, Frontiers in Language Sciences 2 (2023) 1199107

[17]

K. C. Fraser, J. A. Meltzer, F. Rudzicz, Linguistic features identify Alzheimer’s disease in narrative speech, Journal of Alzheimer’s disease 49 (2) (2015) 407–422

[18]

S. De la Fuente Garcia, C. W. Ritchie, S. Luz, Artificial intelligence, speech, and language processing approaches to monitoring Alzheimer’s disease: a systematic review, Journal of Alzheimer’s Disease 78 (4) (2020) 1547–1574

[19]

D. Beltrami, G. Gagliardi, R. Rossini Favretti, E. Ghidoni, F. Tamburini, L. Calzà, Speech analysis by natural language processing techniques: a possible tool for very early detection of cognitive decline?, Frontiers in aging neuroscience 10 (2018) 369

[20]

K. Lopez-de Ipina, J. Alonso-Hernandez, J. Sole-Casals, C. M. Travieso-Gonzalez, A. Ezeiza, M. Faundez-Zanuy, P. M. Calvo, B. Beitia, Feature selection for automatic analysis of emotional response based on nonlinear speech modeling suitable for diagnosis of Alzheimer’s disease, Neurocomputing 150 (2015) 392–401

[21]

H. Tanaka, H. Adachi, N. Ukita, M. Ikeda, H. Kazui, T. Kudo, S. Nakamura, Detecting dementia through interactive computer avatars, IEEE journal of translational engineering in health and medicine 5 (2017) 1–11

[22]

S. Luz, F. Haider, S. de la Fuente Garcia, D. Fromm, B. MacWhinney, Alzheimer’s dementia recognition through spontaneous speech, Frontiers in computer science 3 (2021) 780169

[23]

J. J. G. Meilán, F. Martínez-Sánchez, J. Carro, D. E. López, L. Millian-Morell, J. M. Arana, Speech in Alzheimer’s disease: can temporal and acoustic parameters discriminate dementia?, Dementia and geriatric cognitive disorders 37 (5-6) (2014) 327–334

[24]

K. R. Scherer, T. Johnstone, G. Klasmeyer, Vocal expression of emotion, Handbook of affective sciences (2003) 433–456

[25]

I. Martínez-Nicolás, T. E. Llorente, F. Martínez-Sánchez, J. J. G. Meilán, Ten years of research on automatic voice and speech analysis of people with Alzheimer’s disease and mild cognitive impairment: a systematic review article, Frontiers in Psychology 12 (2021) 620251

[26]

A. Pistono, M. Jucla, E. J. Barbeau, L. Saint-Aubert, B. Lemesle, B. Calvet, B. Köpke, M. Puel, J. Pariente, Pauses during autobiographical discourse reflect episodic memory processes in early Alzheimer’s disease, Journal of Alzheimer’s disease 50 (3) (2016) 687–698

[27]

R. Pappagari, J. Cho, L. Moro-Velazquez, N. Dehak, Using state of the art speaker recognition and natural language processing technologies to detect Alzheimer’s disease and assess its severity, in: Interspeech, 2020, pp. 2177–2181

[28]

A. Meghanani, C. S. Anoop, A. Ramakrishnan, An exploration of log-mel spectrogram and MFCC features for Alzheimer’s dementia recognition from spontaneous speech, in: 2021 IEEE spoken language technology workshop (SLT), IEEE, 2021, pp. 670–677

[29]

R. J. Morris, C. Oh, P. Franklin, Second formant transitions for acoustic analysis to differentiate among dementia types, The Journal of the Acoustical Society of America 154 (4 supplement) (2023) A206–A206

[30]

M. M. Parlak, G. Saylam, M. A. Babademez, Ö. B. Munis, S. A. Tokgöz, Voice analysis results in individuals with Alzheimer’s disease: How do age and cognitive status affect voice parameters?, Brain and Behavior 13 (11) (2023) e3271

[31]

F. Eyben, K. R. Scherer, B. W. Schuller, J. Sundberg, E. André, C. Busso, L. Y. Devillers, J. Epps, P. Laukka, S. S. Narayanan, et al., The Geneva minimalistic acoustic parameter set (GeMAPS) for voice research and affective computing, IEEE transactions on affective computing 7 (2) (2015) 190–202

[33]

F. Eyben, M. Wöllmer, B. Schuller, openSMILE: the Munich versatile and fast open-source audio feature extractor, in: Proceedings of the 18th ACM international conference on Multimedia, 2010, pp. 1459–1462

[34]

G. Degottex, J. Kane, T. Drugman, T. Raitio, S. Scherer, COVAREP—a collaborative voice analysis repository for speech technologies, in: 2014 IEEE international conference on acoustics, speech and signal processing (ICASSP), IEEE, 2014, pp. 960–964

[35]

D. Snyder, D. Garcia-Romero, G. Sell, D. Povey, S. Khudanpur, X-vectors: Robust DNN embeddings for speaker recognition, in: 2018 IEEE international conference on acoustics, speech and signal processing (ICASSP), IEEE, 2018, pp. 5329–5333

[36]

I. T. Jolliffe, Principal component analysis and factor analysis, Principal component analysis (2002) 150–166

[37]

N. Cummins, Y. Pan, Z. Ren, J. Fritsch, V. S. Nallanthighal, H. Christensen, D. Blackburn, B. W. Schuller, M. Magimai-Doss, H. Strik, et al., A comparison of acoustic and linguistics methodologies for Alzheimer’s dementia recognition, in: Interspeech 2020, ISCA-International Speech Communication Association, 2020, pp. 2182–2186

[38]

P. Gangamohan, B. Yegnanarayana, A robust and alternative approach to zero frequency filtering method for epoch extraction, in: INTERSPEECH, 2017, pp. 2297–2300

[39]

Z. Li, F. Liu, W. Yang, S. Peng, J. Zhou, A survey of convolutional neural networks: analysis, applications, and prospects, IEEE transactions on neural networks and learning systems 33 (12) (2021) 6999–7019

[40]

J. F. Gemmeke, D. P. Ellis, D. Freedman, A. Jansen, W. Lawrence, R. C. Moore, M. Plakal, M. Ritter, Audio set: An ontology and human-labeled dataset for audio events, in: 2017 IEEE international conference on acoustics, speech and signal processing (ICASSP), IEEE, 2017, pp. 776–780

[41]

J. Koo, J. H. Lee, J. Pyo, Y. Jo, K. Lee, Exploiting multi-modal features from pre-trained networks for Alzheimer’s dementia recognition, arXiv preprint arXiv:2009.04070 (2020)

[42]

C. M. Bishop, N. M. Nasrabadi, Pattern recognition and machine learning, Vol. 4, Springer, 2006

[43]

N. Dehak, P. J. Kenny, R. Dehak, P. Dumouchel, P. Ouellet, Front-end factor analysis for speaker verification, IEEE Transactions on Audio, Speech, and Language Processing 19 (4) (2010) 788–798

[44]

A. Pompili, T. Rolland, A. Abad, The INESC-ID multi-modal system for the ADReSS 2020 challenge, arXiv preprint arXiv:2005.14646 (2020)

[45]

D. A. Reynolds, T. F. Quatieri, R. B. Dunn, Speaker verification using adapted Gaussian mixture models, Digital signal processing 10 (1-3) (2000) 19–41

[46]

A. Nagrani, J. S. Chung, A. Zisserman, VoxCeleb: a large-scale speaker identification dataset, arXiv preprint arXiv:1706.08612 (2017)

[47]

M. Schmitt, B. Schuller, openXBOW–introducing the Passau open-source crossmodal bag-of-words toolkit, Journal of Machine Learning Research 18 (96) (2017) 1–5

[48]

S. Chopra, R. Hadsell, Y. LeCun, Learning a similarity metric discriminatively, with application to face verification, in: 2005 IEEE computer society conference on computer vision and pattern recognition (CVPR’05), Vol. 1, IEEE, 2005, pp. 539–546

[49]

M. S. S. Syed, Z. S. Syed, M. Lech, E. Pirogova, Automated screening for Alzheimer’s dementia through spontaneous speech, in: Interspeech, Vol. 2020, 2020, pp. 2222–6

[50]

J. Chen, Y. Wang, D. Wang, A feature study for classification-based speech separation at low signal-to-noise ratios, IEEE/ACM Transactions on Audio, Speech, and Language Processing 22 (12) (2014) 1993–2002

[51]

T. Kohonen, The self-organizing map, Proceedings of the IEEE 78 (9) (1990) 1464–1480

[52]

Y. Tian, L. He, Z.-y. Li, W.-l. Wu, W.-Q. Zhang, J. Liu, Speaker verification using Fisher vector, in: The 9th International Symposium on Chinese Spoken Language Processing, IEEE, 2014, pp. 419–422

[53]

A. P. Dempster, N. M. Laird, D. B. Rubin, Maximum likelihood from incomplete data via the EM algorithm, Journal of the royal statistical society: series B (methodological) 39 (1) (1977) 1–22

[54]

M. Martinc, S. Pollak, Tackling the ADReSS challenge: A multimodal approach to the automated recognition of Alzheimer’s dementia, in: Interspeech, 2020, pp. 2157–2161

[55]

A. Balagopalan, B. Eyre, F. Rudzicz, J. Novikova, To BERT or not to BERT: comparing speech and language-based approaches for Alzheimer’s disease detection, arXiv preprint arXiv:2008.01551 (2020)

[56]

J. H. Friedman, Greedy function approximation: a gradient boosting machine, Annals of statistics (2001) 1189–1232

[57]

E. Edwards, C. Dognin, B. Bollepalli, M. K. Singh, V. Analytics, Multi-scale system for Alzheimer’s dementia recognition through spontaneous speech, in: Interspeech, 2020, pp. 2197–2201

[58]

F. Haider, S. De La Fuente, S. Luz, An assessment of paralinguistic acoustic features for detection of Alzheimer’s dementia in spontaneous speech, IEEE Journal of Selected Topics in Signal Processing 14 (2) (2019) 272–281

[59]

M. Rohanian, J. Hough, M. Purver, Multi-modal fusion with gating using audio, lexical and disfluency features for Alzheimer’s dementia recognition from spontaneous speech, arXiv preprint arXiv:2106.09668 (2021)

[60]

D. E. Rumelhart, G. E. Hinton, R. J. Williams, Learning representations by back-propagating errors, Nature 323 (6088) (1986) 533–536

[61]

J. Arevalo, T. Solorio, M. Montes-y-Gómez, F. A. González, Gated multimodal units for information fusion, arXiv preprint arXiv:1702.01992 (2017)

[62]

J. T. Becker, F. Boller, O. L. Lopez, J. Saxton, K. L. McGonigle, The natural history of Alzheimer’s disease: description of study cohort and accuracy of diagnosis, Archives of neurology 51 (6) (1994) 585–594

[63]

M. W. Fong, R. Van Patten, R. P. Fucetola, The factor structure of the Boston diagnostic aphasia examination, Journal of the International Neuropsychological Society 25 (7) (2019) 772–776

[64]

R. EBU-Recommendation, Loudness normalisation and permitted maximum level of audio signals, Eur. Broadcast. Union (2011)

[65]

F. Eyben, Real-time speech and music classification by large audio feature space extraction, Springer, 2015

[66]

A. E. Hoerl, R. W. Kennard, Ridge regression: Biased estimation for nonorthogonal problems, Technometrics 12 (1) (1970) 55–67

[67]

T. Kärkkäinen, Extreme minimal learning machine: Ridge regression with distance-based basis, Neurocomputing 342 (2019) 33–48
