arXiv: 1009.3601 · v1 · pith:2SHIFCRUnew · submitted 2010-09-19 · stat.ML · math.ST · stat.AP · stat.TH

Pair-Wise Cluster Analysis

keywords: analysis · cluster · presented · problem · setup · above · algorithm · aligned

This paper studies the problem of learning clusters that are consistently present in different (continuously valued) representations of observed data. Our setup differs slightly from the standard approach to (co-)clustering, because a form of "labeling" becomes available in this setup: a cluster is only interesting if it has a counterpart in the alternative representation. The contribution of this paper is twofold: (i) the problem setting is explored and an analysis in terms of the PAC-Bayesian theorem is presented; (ii) a practical kernel-based algorithm is derived that exploits the inherent relation to Canonical Correlation Analysis (CCA), together with its extension to multiple views. A content-based information retrieval (CBIR) case study on the multilingual aligned Europarl document dataset supports these findings.
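To make the relation between paired clusters and CCA concrete, here is a minimal sketch. It is not the paper's kernel-based algorithm: it uses plain linear CCA on synthetic two-view data, and the function name `linear_cca`, the regularizer `reg`, and the toy data are all assumptions made for illustration. The idea shown is only that a cluster present in both views appears as a strongly correlated pair of projections, whose signs can be read as a shared two-cluster assignment.

```python
import numpy as np

def linear_cca(X, Y, reg=1e-6):
    """First pair of canonical directions for centered views X and Y.

    Hypothetical helper for illustration; `reg` is a small ridge term
    added to keep the covariance matrices well conditioned.
    """
    n = X.shape[0]
    Cxx = X.T @ X / n + reg * np.eye(X.shape[1])
    Cyy = Y.T @ Y / n + reg * np.eye(Y.shape[1])
    Cxy = X.T @ Y / n
    # Whiten each view with Cholesky factors: Cxx = Lx Lx^T, Cyy = Ly Ly^T.
    Lx = np.linalg.cholesky(Cxx)
    Ly = np.linalg.cholesky(Cyy)
    # M = Lx^{-1} Cxy Ly^{-T}; its singular values are the canonical correlations.
    M = np.linalg.solve(Lx, Cxy)
    M = np.linalg.solve(Ly, M.T).T
    U, s, Vt = np.linalg.svd(M)
    # Map the leading singular vectors back to the original coordinates.
    wx = np.linalg.solve(Lx.T, U[:, 0])
    wy = np.linalg.solve(Ly.T, Vt[0])
    return wx, wy, s[0]

# Synthetic data (assumption): one hidden binary cluster label z drives both views.
rng = np.random.default_rng(0)
n, d = 500, 5
z = rng.choice([-1.0, 1.0], size=n)
a = rng.normal(size=d)
a *= 2.0 / np.linalg.norm(a)   # cluster direction in view 1
b = rng.normal(size=d)
b *= 2.0 / np.linalg.norm(b)   # cluster direction in view 2
X = np.outer(z, a) + rng.normal(size=(n, d))
Y = np.outer(z, b) + rng.normal(size=(n, d))

# Center both views before estimating covariances.
Xc = X - X.mean(axis=0)
Yc = Y - Y.mean(axis=0)

wx, wy, rho = linear_cca(Xc, Yc)

# Read a two-cluster assignment off the sign of each view's projection.
labels_x = np.sign(Xc @ wx)
labels_y = np.sign(Yc @ wy)
agreement = np.mean(labels_x == labels_y)

print(f"leading canonical correlation: {rho:.3f}")
print(f"fraction of points assigned consistently across views: {agreement:.3f}")
```

A kernel-based variant would replace the covariance matrices with regularized Gram matrices, and the multi-view extension mentioned in the abstract would need a different objective; neither is attempted in this sketch.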
