Fast semi-supervised discriminant analysis for binary classification of large data-sets

Hugo Ceulemans; Jaak Simm; Joerg Kurt Wegner; Joris Tavernier; Karl Meerbergen; Yves Moreau

arxiv: 1709.04794 · v2 · pith:HM5INLNFnew · submitted 2017-09-14 · 💻 cs.AI · cs.NA · cs.PF

Joris Tavernier, Jaak Simm, Karl Meerbergen, Joerg Kurt Wegner, Hugo Ceulemans, Yves Moreau

classification 💻 cs.AI · cs.NA · cs.PF

keywords methods · data · semi-supervised · algorithms · analysis · discriminant · krylov · scalable

High-dimensional data requires scalable algorithms. We propose and analyze three related, scalable algorithms for semi-supervised discriminant analysis (SDA). These methods are based on Krylov subspace methods, which exploit data sparsity and the shift-invariance of Krylov subspaces. In addition, we improve the problem definition by adding centralization to the semi-supervised setting. The proposed methods are evaluated on an industry-scale data set from a pharmaceutical company to predict compound activity on target proteins. The results show that SDA achieves good predictive performance and that our methods require only a few seconds, significantly improving computation time over the previous state of the art.
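
The shift-invariance mentioned in the abstract can be illustrated with a small numerical check. The sketch below is not one of the paper's three algorithms. It is a plain Arnoldi iteration in pure Python, applied to a hypothetical 5x5 matrix `A`, start vector `b`, and shift `sigma` chosen only for illustration. It shows that the Krylov subspaces of `A` and `A + sigma*I` share the same orthonormal basis, and that only the diagonal of the small Hessenberg matrix changes (by `sigma`).

```python
import math


def matvec(A, x):
    """Dense matrix-vector product for a list-of-rows matrix."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]


def dot(x, y):
    return sum(a * b for a, b in zip(x, y))


def arnoldi(A, b, k):
    """Run k Arnoldi steps with modified Gram-Schmidt.

    Returns Q (k+1 orthonormal basis vectors of the Krylov subspace)
    and H (the (k+1) x k upper Hessenberg matrix).
    Assumes no breakdown occurs, i.e. every H[j+1][j] is nonzero.
    """
    beta = math.sqrt(dot(b, b))
    Q = [[bi / beta for bi in b]]
    H = [[0.0] * k for _ in range(k + 1)]
    for j in range(k):
        w = matvec(A, Q[j])
        for i in range(j + 1):
            H[i][j] = dot(Q[i], w)
            w = [wi - H[i][j] * qi for wi, qi in zip(w, Q[i])]
        H[j + 1][j] = math.sqrt(dot(w, w))
        Q.append([wi / H[j + 1][j] for wi in w])
    return Q, H


# Hypothetical example data (not taken from the paper).
A = [
    [2.0, 1.0, 0.0, 0.0, 0.0],
    [1.0, 3.0, 1.0, 0.0, 0.0],
    [0.0, 1.0, 4.0, 1.0, 0.0],
    [0.0, 0.0, 1.0, 5.0, 1.0],
    [0.0, 0.0, 0.0, 1.0, 6.0],
]
b = [1.0, 1.0, 1.0, 1.0, 1.0]
sigma = 0.5
k = 3

n = len(A)
A_shifted = [
    [A[i][j] + (sigma if i == j else 0.0) for j in range(n)] for i in range(n)
]

Q, H = arnoldi(A, b, k)
Qs, Hs = arnoldi(A_shifted, b, k)

# Largest entrywise difference between the two bases.
max_basis_diff = max(
    abs(q - qs) for v, vs in zip(Q, Qs) for q, qs in zip(v, vs)
)
# The shifted Hessenberg diagonal should equal the original diagonal plus sigma.
max_diag_diff = max(abs(Hs[j][j] - (H[j][j] + sigma)) for j in range(k))

print("max basis difference:", max_basis_diff)
print("max diagonal mismatch:", max_diag_diff)
```

Both printed values should be at the level of floating-point round-off. In practice this property means that a basis computed once can be reused for several shifts, which is presumably one reason it is useful for scalability here.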
