Finding the needle in high-dimensional haystack: A tutorial on canonical correlation analysis

Cedric Huchuan Xia; Danielle S. Bassett; Danilo Bzdok; Hao-Ting Wang; Janaina Mourao-Miranda; Jonathan Smallwood; Theodore D. Satterthwaite

arXiv: 1812.02598 · v1 · pith:NFJ6BZHZnew · submitted 2018-12-06 · stat.ML · cs.LG · stat.AP



keywords: analysis, canonical, correlation, data, associations, becoming, beginning, behavioral



Since the beginning of the 21st century, the size, breadth, and granularity of data in biology and medicine have grown rapidly. In neuroscience, for example, studies with thousands of subjects are becoming more common; they provide extensive phenotyping at the behavioral, neural, and genomic levels, with hundreds of variables. The complexity of such big data repositories offers new opportunities and poses new challenges for investigating brain, cognition, and disease. Canonical correlation analysis (CCA) is a prototypical family of methods for wrestling with and harvesting insight from such rich datasets. This doubly multivariate tool can simultaneously consider two variable sets from different modalities to uncover essential hidden associations. Our primer discusses the rationale, promises, and pitfalls of CCA in biomedicine.
