Scalable Betweenness Centrality Maximization via Sampling

Ahmad Mahmoody; Charalampos E. Tsourakakis; Eli Upfal

read the original abstract

Betweenness centrality is a fundamental centrality measure in social network analysis. Given a large-scale network, how can we find the most central nodes? This question is of key importance to numerous important applications that rely on betweenness centrality, including community detection and understanding graph vulnerability. Despite the large amount of work on designing scalable approximation algorithms for betweenness centrality, estimating it on large-scale networks remains a computational challenge. In this paper, we study the Betweenness Centrality Maximization problem: given a graph $G=(V,E)$ and a positive integer $k$, find a set $S^* \subseteq V$ that maximizes betweenness centrality subject to the cardinality constraint $|S^*| \leq k$. We present an efficient randomized algorithm that provides a $(1-1/e-\epsilon)$-approximation with high probability, where $\epsilon>0$. Our results improve the current state-of-the-art result by Yoshida~\cite{yoshida2014almost}. Furthermore, we provide theoretical evidence for the validity of a crucial assumption in the literature of betweenness centrality estimation, namely that in real-world networks $O(|V|^2)$ shortest paths pass through the top-$k$ central nodes, where $k$ is a constant. On the experimental side, we perform an extensive experimental analysis of our method on real-world networks, demonstrate its accuracy and scalability, and study different properties of central nodes. Finally, we provide three graph mining applications of our method.

Scalable Betweenness Centrality Maximization via Sampling

discussion (0)