On improving learning capability of ELM and an application to brain-computer interface
Pith reviewed 2026-05-24 21:39 UTC · model grok-4.3
The pith
Replacing SVD with Hessenberg or Householder decomposition in ELM improves training speed or accuracy for EEG-based brain-computer interfaces.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
ELM reaches high performance quickly on benchmark datasets, but its performance declines on large real-life data because of the slow convergence of SVD. The study addresses this by replacing SVD with five methods it describes as more efficient: lower-upper (LU) triangularization, Hessenberg decomposition, Schur decomposition, the modified Gram-Schmidt algorithm and Householder reflection. On electroencephalography (EEG) based brain-computer interface classification, subject-based results indicate that Hessenberg decomposition should be preferred when training pace is the priority and Householder reflection when classification performance is the priority.
What carries the argument
Replacing the SVD pseudoinverse computation in ELM with one of five alternative decompositions for EEG BCI classification tasks.
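To make the swap concrete, here is a minimal sketch of the ELM output-weight step with two interchangeable solvers. It is not the paper's implementation: the function names, the sigmoid activation, the layer sizes and the synthetic data are all assumptions chosen for illustration. `numpy.linalg.pinv` computes the pseudoinverse through an SVD, and `numpy.linalg.qr` calls LAPACK routines that use Householder reflections.

```python
import numpy as np

def hidden_layer(X, W, b):
    """Sigmoid hidden-layer output matrix H of shape (N, L)."""
    return 1.0 / (1.0 + np.exp(-(X @ W + b)))

def solve_svd(H, T):
    """Output weights via the SVD-based Moore-Penrose pseudoinverse."""
    return np.linalg.pinv(H) @ T

def solve_qr(H, T):
    """Output weights via reduced QR (Householder reflections in LAPACK)."""
    Q, R = np.linalg.qr(H, mode="reduced")
    return np.linalg.solve(R, Q.T @ T)

rng = np.random.default_rng(0)
N, d, L, m = 200, 8, 20, 2            # samples, inputs, hidden nodes, outputs (assumed sizes)
X = rng.standard_normal((N, d))        # synthetic stand-in for EEG features
T = rng.standard_normal((N, m))        # synthetic targets
W = rng.standard_normal((d, L))        # random, untrained input weights
b = rng.standard_normal(L)             # random, untrained biases

H = hidden_layer(X, W, b)
beta_svd = solve_svd(H, T)
beta_qr = solve_qr(H, T)

# Largest entrywise disagreement between the two sets of output weights.
max_gap = float(np.max(np.abs(beta_svd - beta_qr)))
print(max_gap)
```

On a well-conditioned, full-column-rank H the two solvers should agree to rounding error; whether that also holds for the EEG-derived matrices is exactly the open question raised further down.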
If this is right
- Hessenberg decomposition should be chosen when training pace is the priority in ELM for BCI.
- Householder reflection should be chosen when classification performance is the priority.
- The alternative methods are more efficient than SVD for large data applications.
- These replacements address the decline in ELM performance on real-life large datasets.
Where Pith is reading between the lines
- The speed and accuracy benefits could be evaluated on other types of large classification datasets to see if the preferences hold.
- These decomposition choices might allow ELM to be deployed in time-sensitive real-world systems beyond BCI.
Load-bearing premise
The five alternative decompositions produce numerically stable and equivalent solutions to the original ELM pseudoinverse problem on the EEG data without introducing new numerical artifacts or changing generalization behavior.
What would settle it
A test where the classification performance or training time of ELM using one of the alternative methods on the EEG BCI data differs substantially from the SVD version would indicate the methods are not equivalent replacements.
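A sketch of such a test, under stated assumptions: the data are synthetic two-class features rather than EEG, the tanh activation and all sizes are arbitrary, and the helper names (`train_and_score`, `svd_solver`, `qr_solver`) are hypothetical. It times each solver on the same hidden-layer matrix and compares held-out accuracy.

```python
import time
import numpy as np

def random_features(X, W, b):
    """Hidden-layer output H for fixed random weights (tanh activation)."""
    return np.tanh(X @ W + b)

def train_and_score(solver, H_train, T_train, H_test, y_test):
    """Time one solver on the training block, then score it on held-out data."""
    start = time.perf_counter()
    beta = solver(H_train, T_train)
    elapsed = time.perf_counter() - start
    predictions = np.argmax(H_test @ beta, axis=1)
    return float(np.mean(predictions == y_test)), elapsed

def svd_solver(H, T):
    return np.linalg.pinv(H) @ T

def qr_solver(H, T):
    Q, R = np.linalg.qr(H)             # Householder-based reduced QR
    return np.linalg.solve(R, Q.T @ T)

rng = np.random.default_rng(1)
n_train, n_test, d, L = 400, 200, 10, 30               # assumed sizes
y = rng.integers(0, 2, size=n_train + n_test)          # synthetic binary labels
X = rng.standard_normal((n_train + n_test, d)) + 1.5 * y[:, None]  # class-dependent shift
W, b = rng.standard_normal((d, L)), rng.standard_normal(L)
H = random_features(X, W, b)
T = np.eye(2)[y]                                        # one-hot targets

results = {
    name: train_and_score(solver, H[:n_train], T[:n_train], H[n_train:], y[n_train:])
    for name, solver in {"svd": svd_solver, "qr": qr_solver}.items()
}
accuracy_gap = abs(results["svd"][0] - results["qr"][0])
print(results, accuracy_gap)
```

A large `accuracy_gap` on real data would point to the solvers returning different output weights, not just running at different speeds.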
Original abstract
As a type of pseudoinverse learning, extreme learning machine (ELM) is able to achieve high performances in a rapid pace on benchmark datasets. However, when it is applied to real life large data, decline related to low-convergence of singular value decomposition (SVD) method occurs. Our study aims to resolve this issue via replacing SVD with theoretically and empirically much efficient 5 number of methods: lower upper triangularization, Hessenberg decomposition, Schur decomposition, modified Gram Schmidt algorithm and Householder reflection. Comparisons were made on electroencephalography based brain-computer interface classification problem to decide which method is the most useful. Results of subject-based classifications suggested that if priority was given to training pace, Hessenberg decomposition method, whereas if priority was given to performances Householder reflection method should be preferred.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript claims that replacing SVD with five alternative decompositions (LU triangularization, Hessenberg, Schur, modified Gram-Schmidt, Householder reflection) for computing ELM output weights yields faster training on large data while preserving or improving accuracy, and reports subject-wise EEG BCI results favoring Hessenberg decomposition when training speed is prioritized and Householder reflection when classification performance is prioritized.
Significance. If the five methods are shown to produce numerically equivalent solutions to the SVD-based Moore-Penrose pseudoinverse on rectangular hidden-layer matrices, the empirical timing and accuracy trade-offs on real BCI data would provide a practical contribution for deploying ELM in latency-sensitive settings.
major comments (3)
- [Theoretical background / ELM formulation] The central claim requires that each of the five decompositions solves the identical least-squares problem min ||H beta - T||_2 as the SVD pseudoinverse. No section derives the explicit mapping from LU, Hessenberg, Schur, modified Gram-Schmidt or Householder to the Moore-Penrose solution for rectangular H; Hessenberg and Schur are eigenvalue reductions and are not standard for rectangular least-squares.
- [Experiments and results] Table of subject-based classification results and timing comparisons: without reported verification (e.g., residual norms ||H beta_alt - T|| versus ||H beta_SVD - T|| or condition-number diagnostics on the EEG H matrices), it is impossible to determine whether accuracy differences arise from computational efficiency or from altered rank handling / numerical stability.
- [Discussion / conclusions] The recommendation ordering (Hessenberg for pace, Householder for performance) is load-bearing on the assumption that all methods produce beta vectors with identical generalization behavior; the manuscript provides no cross-method comparison of the resulting decision boundaries or leave-one-subject-out statistics that would confirm equivalence.
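For reference, the equivalence requested in the first major comment is standard for QR-based routes (Householder, modified Gram-Schmidt) when H has full column rank. The sketch below uses Q and R as notation that does not appear in the manuscript excerpt, and it applies column by column when T has several columns.

```latex
\begin{aligned}
H &= QR, \qquad Q \in \mathbb{R}^{N \times L},\; Q^{\top} Q = I_L,\; R \in \mathbb{R}^{L \times L} \text{ upper triangular and nonsingular},\\
H^{+} &= (H^{\top} H)^{-1} H^{\top} = (R^{\top} Q^{\top} Q R)^{-1} R^{\top} Q^{\top} = R^{-1} R^{-\top} R^{\top} Q^{\top} = R^{-1} Q^{\top},\\
\hat{\beta} &= H^{+} T = R^{-1} Q^{\top} T = \arg\min_{\beta} \lVert H\beta - T \rVert_2 .
\end{aligned}
```

The argument breaks down when H is rank deficient, because R is then singular; in that case the SVD pseudoinverse returns the minimum-norm solution and a plain QR solve does not.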
minor comments (2)
- [Abstract] The phrase 'theoretically and empirically much efficient' is imprecise; the theoretical justification for each method's equivalence to SVD should be stated explicitly.
- [ELM formulation] Notation: dimensions of the hidden-layer matrix H (N x L) and target matrix T should be stated once at the outset so readers can immediately see that the problem is rectangular.
Simulated Author's Rebuttal
We thank the referee for the careful review and valuable comments. We address each major point below, clarifying our approach and indicating revisions to improve the manuscript.
Point-by-point responses
- Referee: [Theoretical background / ELM formulation] The central claim requires that each of the five decompositions solves the identical least-squares problem min ||H beta - T||_2 as the SVD pseudoinverse. No section derives the explicit mapping from LU, Hessenberg, Schur, modified Gram-Schmidt or Householder to the Moore-Penrose solution for rectangular H; Hessenberg and Schur are eigenvalue reductions and are not standard for rectangular least-squares.
Authors: We acknowledge that the manuscript does not include explicit derivations showing how each decomposition yields the Moore-Penrose solution for rectangular hidden-layer matrices H. Modified Gram-Schmidt and Householder reflections are standard routes to QR factorization and thus to least-squares solutions, and LU factorization can solve the square normal equations H^T H beta = H^T T; we will add a new subsection deriving the explicit mappings for these three. For Hessenberg and Schur we recognize that they are eigenvalue-oriented and not directly applicable to rectangular least-squares without additional reduction steps; the revised text will state this limitation and restrict claims accordingly.
Revision: yes
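As an illustration of one of the QR routes mentioned in this response, the following is a minimal sketch of modified Gram-Schmidt used to solve the least-squares problem. It assumes H has full column rank; the function names and the synthetic test matrices are hypothetical and are not taken from the manuscript.

```python
import numpy as np

def modified_gram_schmidt(H):
    """Reduced QR of a full-column-rank H (N x L) by modified Gram-Schmidt."""
    V = np.array(H, dtype=float)       # working copy whose columns get orthogonalized
    n_cols = V.shape[1]
    Q = np.zeros_like(V)
    R = np.zeros((n_cols, n_cols))
    for j in range(n_cols):
        R[j, j] = np.linalg.norm(V[:, j])
        Q[:, j] = V[:, j] / R[j, j]
        # Remove the new direction from every remaining column immediately;
        # this is what distinguishes the modified from the classical variant.
        for k in range(j + 1, n_cols):
            R[j, k] = Q[:, j] @ V[:, k]
            V[:, k] -= R[j, k] * Q[:, j]
    return Q, R

def back_substitute(R, Y):
    """Solve R B = Y for upper-triangular, nonsingular R."""
    n = R.shape[0]
    B = np.zeros_like(Y, dtype=float)
    for i in range(n - 1, -1, -1):
        B[i] = (Y[i] - R[i, i + 1:] @ B[i + 1:]) / R[i, i]
    return B

rng = np.random.default_rng(2)
H = rng.standard_normal((50, 6))       # synthetic hidden-layer matrix
T = rng.standard_normal((50, 2))       # synthetic targets

Q, R = modified_gram_schmidt(H)
beta_mgs = back_substitute(R, Q.T @ T)
beta_ref = np.linalg.lstsq(H, T, rcond=None)[0]   # reference least-squares solution
print(np.max(np.abs(beta_mgs - beta_ref)))
```

The columns of Q computed this way lose orthogonality as H becomes ill conditioned, which is one reason the residual and condition-number checks requested by the referee matter.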
- Referee: [Experiments and results] Table of subject-based classification results and timing comparisons: without reported verification (e.g., residual norms ||H beta_alt - T|| versus ||H beta_SVD - T|| or condition-number diagnostics on the EEG H matrices), it is impossible to determine whether accuracy differences arise from computational efficiency or from altered rank handling / numerical stability.
Authors: We agree that residual-norm and condition-number diagnostics are necessary to confirm numerical equivalence. In the revision we will add a new table reporting ||H beta_alt - T||_2 for each method versus the SVD baseline, together with the 2-norm condition numbers of the EEG-derived H matrices, allowing readers to separate speed gains from possible stability differences.
Revision: yes
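The promised diagnostics are cheap to compute. The sketch below shows one way to do it on synthetic data; the `diagnostics` helper, the three example solvers and the matrix sizes are assumptions for illustration, not the authors' code.

```python
import numpy as np

def diagnostics(H, T, betas):
    """Return cond_2(H) and the residual norm of H @ beta - T for each named solution."""
    cond = float(np.linalg.cond(H))    # 2-norm condition number (ratio of extreme singular values)
    # For a matrix T this is the Frobenius norm, which reduces to the
    # ordinary 2-norm when T has a single column.
    residuals = {name: float(np.linalg.norm(H @ beta - T)) for name, beta in betas.items()}
    return cond, residuals

rng = np.random.default_rng(3)
H = rng.standard_normal((120, 15))     # synthetic stand-in for an EEG-derived H
T = rng.standard_normal((120, 2))

Q, R = np.linalg.qr(H)
betas = {
    "svd": np.linalg.pinv(H) @ T,                       # SVD pseudoinverse baseline
    "householder_qr": np.linalg.solve(R, Q.T @ T),      # QR route
    "normal_equations": np.linalg.solve(H.T @ H, H.T @ T),  # LU solve of the square system
}
cond, residuals = diagnostics(H, T, betas)
print(cond, residuals)
```

If the residuals of two methods differ by more than rounding error on the same H and T, the methods are not returning the same least-squares solution, and any accuracy difference cannot be attributed to speed alone.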
- Referee: [Discussion / conclusions] The recommendation ordering (Hessenberg for pace, Householder for performance) is load-bearing on the assumption that all methods produce beta vectors with identical generalization behavior; the manuscript provides no cross-method comparison of the resulting decision boundaries or leave-one-subject-out statistics that would confirm equivalence.
Authors: The subject-wise accuracy and timing tables already form the empirical basis for the ordering. To strengthen the claim we will add, in the revised discussion, pairwise comparisons of the Euclidean norms of the obtained beta vectors and the variance of the leave-one-subject-out accuracies across the five methods. Full visualization of decision boundaries is outside the present scope but the added statistics will make the generalization-equivalence assumption explicit and testable.
Revision: partial
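A minimal sketch of the statistics named in this response, with hypothetical helper names and synthetic matrices. It reports the Euclidean norm of each beta, the norm of every pairwise difference, and a sample-variance helper intended for per-subject leave-one-subject-out accuracies; no accuracy values are shown because none are available here.

```python
import itertools
import numpy as np

def beta_norms(betas):
    """Euclidean (Frobenius) norm of each method's output weights."""
    return {name: float(np.linalg.norm(beta)) for name, beta in betas.items()}

def pairwise_gaps(betas):
    """Norm of the difference between every pair of output-weight matrices."""
    return {
        (a, b): float(np.linalg.norm(betas[a] - betas[b]))
        for a, b in itertools.combinations(sorted(betas), 2)
    }

def loso_variance(accuracies):
    """Sample variance of per-held-out-subject accuracies for one method."""
    return float(np.var(np.asarray(accuracies, dtype=float), ddof=1))

rng = np.random.default_rng(4)
H = rng.standard_normal((100, 12))     # synthetic hidden-layer matrix
T = rng.standard_normal((100, 2))      # synthetic targets
Q, R = np.linalg.qr(H)
betas = {
    "svd": np.linalg.pinv(H) @ T,
    "householder_qr": np.linalg.solve(R, Q.T @ T),
}
norms = beta_norms(betas)
gaps = pairwise_gaps(betas)
print(norms, gaps)
```

Comparing norms alone can hide differences, since two distinct vectors can have the same length; the pairwise difference norm is the stricter check.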
- Explicit mapping of Hessenberg and Schur decompositions onto the Moore-Penrose pseudoinverse for rectangular matrices (these methods are eigenvalue reductions and lack a standard least-squares formulation for non-square H).
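One way such a reduction could look is to apply the square factorizations to the normal equations instead of to H itself. This is offered only as an assumption, since the source does not say how the manuscript applies Hessenberg or Schur; the symbols A, Q, M, Z, U and y are introduced here for illustration.

```latex
\begin{aligned}
A &= H^{\top} H \in \mathbb{R}^{L \times L}, \qquad A\beta = H^{\top} T \quad (\text{normal equations, } H \text{ of full column rank}),\\
\text{Hessenberg:}\quad A &= Q M Q^{\top} \;\Rightarrow\; M y = Q^{\top} H^{\top} T, \quad \beta = Q y,\\
\text{Schur:}\quad A &= Z U Z^{\top} \;\Rightarrow\; \beta = Z U^{-1} Z^{\top} H^{\top} T,\\
\kappa_2(A) &= \kappa_2(H)^2 .
\end{aligned}
```

Because A is symmetric, M is tridiagonal and U is diagonal. The last line is the caveat: forming H^T H squares the condition number, so a route through the normal equations can lose accuracy compared with SVD or QR applied directly to H.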
Circularity Check
Empirical comparison of solvers; no derivation reduces to inputs by construction
full rationale
The manuscript is an empirical benchmark of five matrix factorizations (LU, Hessenberg, Schur, modified Gram-Schmidt, Householder) versus SVD for the ELM least-squares step on EEG data. No equation or claim is shown to be equivalent to its own inputs; the reported accuracy and timing tables are direct measurements, not predictions derived from fitted parameters or self-citations. The central preference ordering rests on observed runtimes and subject-wise accuracies rather than any self-definitional or load-bearing self-citation step. The skeptic's concern about numerical equivalence of the solvers is a correctness/assumption issue, not a circularity issue.
Axiom & Free-Parameter Ledger
Lean theorems connected to this paper
- IndisputableMonolith/Foundation/AbsoluteFloorClosure.lean · reality_from_one_distinction · unclear
unclear: Relation between the paper passage and the cited Recognition theorem.
replacing SVD with ... lower upper triangularization, Hessenberg decomposition, Schur decomposition, modified Gram Schmidt algorithm and Householder reflection
- IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel · unclear
unclear: Relation between the paper passage and the cited Recognition theorem.
Hw = t ... H^+ = (H^T H)^{-1} H^T
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.