A linear streaming algorithm for community detection in very large networks
read the original abstract
In this paper, we introduce a novel community detection algorithm in graphs, called SCoDA (Streaming Community Detection Algorithm), based on an edge streaming setting. This algorithm has an extremely low memory footprint and a lightning-fast execution time as it only stores two integers per node and processes each edge strictly once. The approach is based on the following simple observation: if we pick an edge uniformly at random in the network, this edge is more likely to connect two nodes of the same community than two nodes of distinct communities. We exploit this idea to build communities by local changes at each edge arrival. Using theoretical arguments, we relate the ability of SCoDA to detect communities to usual quality metrics of these communities like the conductance. Experimental results performed on massive real-life networks ranging from one million to more than one billion edges shows that SCoDA runs more than ten times faster than existing algorithms and leads to similar or better detection scores on the largest graphs.
This paper has not been read by Pith yet.
Forward citations
Cited by 1 Pith paper
-
Toward a universal foundation model for graph-structured data
A pretrained graph model using feature-agnostic structural prompts matches or exceeds supervised baselines and shows strong zero-shot and few-shot transfer on held-out biomedical graphs, with a 21.8% ROC-AUC gain on SagePPI.
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.