
arxiv: 1401.5389 · v1 · pith:XQ3HLXLVnew · submitted 2014-01-16 · cs.IR · cs.CL · cs.LG

Which Clustering Do You Want? Inducing Your Ideal Clustering with Minimal Feedback

classification: cs.IR, cs.CL, cs.LG
keywords: clustering, documents, along, algorithm, dimension, user, cluster, focused

While traditional research on text clustering has largely focused on grouping documents by topic, it is conceivable that a user may want to cluster documents along other dimensions, such as the author's mood, gender, age, or sentiment. Without knowing the user's intention, a clustering algorithm will only group documents along the most prominent dimension, which may not be the one the user desires. To address the problem of clustering documents along the user-desired dimension, previous work has focused on learning a similarity metric from data manually annotated with the user's intention or on having a human construct a feature space interactively during the clustering process. With the goal of reducing reliance on human knowledge for fine-tuning the similarity function or selecting the relevant features required by these approaches, we propose a novel active clustering algorithm, which allows a user to easily select the dimension along which she wants to cluster the documents by inspecting only a small number of words. We demonstrate the viability of our algorithm on a variety of commonly used sentiment datasets.
