arxiv: 1802.05733 · v1 · pith:RMIZDRAQnew · submitted 2018-02-15 · 💻 cs.LG · stat.ML

Fair Clustering Through Fairlets

keywords clustering · fair · fairlets · problem · algorithms · approximately · center · cluster

We study the question of fair clustering under the disparate impact doctrine, where each protected class must have approximately equal representation in every cluster. We formulate the fair clustering problem under both the $k$-center and the $k$-median objectives, and show that even with two protected classes the problem is challenging, as the optimum solution can violate common conventions: for instance, a point may no longer be assigned to its nearest cluster center! En route we introduce the concept of fairlets, which are minimal sets that satisfy fair representation while approximately preserving the clustering objective. We show that any fair clustering problem can be decomposed into first finding good fairlets, and then using existing machinery for traditional clustering algorithms. While finding good fairlets can be NP-hard, we proceed to obtain efficient approximation algorithms based on minimum cost flow. We empirically quantify the value of fair clustering on real-world datasets with sensitive attributes.
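
The fairlet idea in the abstract can be illustrated with a minimal sketch for two protected classes. Everything here is an assumption made for illustration: the function names `balance` and `greedy_fairlets`, the labels `"red"` and `"blue"`, the toy points, and the balance measure itself, which takes the smaller of the two class ratios and is not defined in the abstract. The pairing step is a simple greedy nearest-neighbour heuristic, not the minimum cost flow algorithm the abstract refers to, and it carries no approximation guarantee.

```python
from math import dist


def balance(colors):
    """Balance of a two-class cluster: min(#red/#blue, #blue/#red).

    Returns 0.0 when either class is absent (an assumption of this sketch).
    """
    red = sum(1 for c in colors if c == "red")
    blue = sum(1 for c in colors if c == "blue")
    if red == 0 or blue == 0:
        return 0.0
    return min(red / blue, blue / red)


def greedy_fairlets(points, colors):
    """Pair every red point with its nearest still-unused blue point.

    Each returned pair (red_index, blue_index) is a perfectly balanced
    fairlet. This is a greedy heuristic, NOT a minimum cost flow.
    """
    reds = [i for i, c in enumerate(colors) if c == "red"]
    blues = [i for i, c in enumerate(colors) if c == "blue"]
    if len(reds) != len(blues):
        raise ValueError("this sketch assumes equally many red and blue points")
    unused = set(blues)
    fairlets = []
    for r in reds:
        # Break distance ties by index so the result is deterministic.
        b = min(unused, key=lambda j: (dist(points[r], points[j]), j))
        unused.remove(b)
        fairlets.append((r, b))
    return fairlets


# Hypothetical toy data: two tight red/blue pairs far apart from each other.
points = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.0, 5.2)]
colors = ["red", "blue", "red", "blue"]
fairlets = greedy_fairlets(points, colors)
```

Since every fairlet contains one point of each class, any cluster formed as a union of whole fairlets stays perfectly balanced. A second stage could then pick one representative per fairlet and run an ordinary $k$-center or $k$-median routine on the representatives.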

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Alternatives to the Laplacian for Scalable Spectral Clustering with Group Fairness Constraints

    cs.LG 2025-10 unverdicted novelty 4.0

    Fair-SMW uses the SMW identity and alternative Laplacians to produce group-fair spectral clustering that is twice as fast and twice as balanced as prior methods on SBM and real network data.