A Survey of Densest Subgraph Discovery on Large Graphs
Pith reviewed 2026-05-24 08:17 UTC · model grok-4.3
The pith
Densest subgraph discovery methods on large graphs can be classified into several groups for systematic review and comparison across roughly 50 papers.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
This survey first highlights the importance of densest subgraph discovery in real-world applications and the challenges involved, then classifies existing solutions into several groups covering around 50 papers, reviews the solutions in each group, analyzes and compares the models, and identifies promising future research directions.
What carries the argument
The classification of DSD solutions into several groups that organizes the literature for review, comparison, and identification of future directions.
If this is right
- Researchers obtain a clearer map of existing densest subgraph models and can select methods appropriate to graph size and application.
- Connections between DSD and related problems such as network flow become easier to exploit when the reviewed solutions are compared side by side.
- Identified future directions supply concrete starting points for new work in the database, data mining, and network communities.
- Practitioners gain guidance on which algorithms handle the scale and constraints of their specific graphs.
Where Pith is reading between the lines
- The survey's grouping may highlight under-explored areas such as dynamic or streaming graphs where new methods could be developed.
- Links between DSD and community detection suggest hybrid algorithms that combine ideas from both areas but are not yet covered.
- Testing the classification against papers published after the survey would reveal whether the groups remain stable or need expansion.
Load-bearing premise
The selected set of approximately 50 papers and the chosen grouping into categories provide a representative and non-overlapping coverage of the DSD literature without major omissions or misclassifications of key methods.
What would settle it
The discovery of one or more substantial DSD papers published in SIGMOD, PVLDB, TODS, or WWW that fit none of the survey's groups or were omitted would show the classification is incomplete.
Figures
read the original abstract
With the prevalence of graphs for modeling complex relationships among objects, the topic of graph mining has attracted a great deal of attention from both academic and industrial communities in recent years. As one of the most fundamental problems in graph mining, the densest subgraph discovery (DSD) problem has found a wide spectrum of real applications, such as discovery of filter bubbles in social media, finding groups of actors propagating misinformation in social media, social network community detection, graph index construction, regulatory motif discovery in DNA, fake follower detection, and so on. Theoretically, DSD closely relates to other fundamental graph problems, such as network flow and bipartite matching. Triggered by these applications and connections, DSD has garnered much attention from the database, data mining, theory, and network communities. In this survey, we first highlight the importance of DSD in various real-world applications and the unique challenges that need to be addressed. Subsequently, we classify existing DSD solutions into several groups, which cover around 50 research papers published in many well-known venues (e.g., SIGMOD, PVLDB, TODS, WWW), and conduct a thorough review of these solutions in each group. Afterwards, we analyze and compare the models and solutions in these works. Finally, we point out a list of promising future research directions. It is our hope that this survey not only helps researchers have a better understanding of existing densest subgraph models and solutions, but also provides insights and identifies directions for future study.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript is a survey on densest subgraph discovery (DSD) that first motivates the problem via applications in social media analysis, misinformation detection, community detection, and bioinformatics, then classifies roughly 50 papers from venues including SIGMOD, PVLDB, TODS, and WWW into several groups, reviews the solutions in each group, compares the models, and lists future research directions.
Significance. A well-executed survey with representative coverage would organize a fragmented literature across database, data mining, theory, and network communities and help identify open problems; the manuscript's explicit mention of cross-community attention and concrete applications is a strength, but the absence of a documented selection protocol limits its immediate utility as a reference.
major comments (2)
- [Abstract] Abstract: the central claim that the survey 'classify[es] existing DSD solutions into several groups, which cover around 50 research papers ... and conduct[s] a thorough review' is load-bearing, yet the text provides no description of the literature search protocol, databases queried, inclusion/exclusion criteria, or date range; without these, it is impossible to assess whether the selected set is representative or whether major omissions or misclassifications have occurred.
- [Abstract] Abstract / Introduction: the assertion of 'non-overlapping' groups and 'thorough review' cannot be evaluated because the manuscript does not list the groups, the assignment criteria, or any cross-check against an exhaustive bibliography of DSD papers; this directly affects the reliability of the subsequent comparison and future-directions sections.
minor comments (1)
- [Abstract] The abstract lists example applications but does not indicate whether each cited paper is later reviewed in the corresponding group; adding forward references would improve traceability.
Simulated Author's Rebuttal
We thank the referee for the constructive comments, which highlight opportunities to improve the transparency of our survey methodology. We agree that additional details on literature selection and classification criteria will strengthen the manuscript and will incorporate revisions to address both major points.
read point-by-point responses
-
Referee: [Abstract] Abstract: the central claim that the survey 'classify[es] existing DSD solutions into several groups, which cover around 50 research papers ... and conduct[s] a thorough review' is load-bearing, yet the text provides no description of the literature search protocol, databases queried, inclusion/exclusion criteria, or date range; without these, it is impossible to assess whether the selected set is representative or whether major omissions or misclassifications have occurred.
Authors: We acknowledge this limitation in the current version. The abstract and introduction do not describe the search protocol. In the revision we will add a dedicated subsection (likely in Section 1 or a new Section 2) that specifies: (i) the primary venues and databases considered (SIGMOD, PVLDB, TODS, WWW, VLDBJ, KDD, ICDE, and arXiv preprints), (ii) the approximate date range (papers up to early 2023), and (iii) inclusion criteria focused on works that propose novel DSD models, algorithms, or theoretical results with empirical evaluation. This will allow readers to judge coverage and potential gaps. revision: yes
-
Referee: [Abstract] Abstract / Introduction: the assertion of 'non-overlapping' groups and 'thorough review' cannot be evaluated because the manuscript does not list the groups, the assignment criteria, or any cross-check against an exhaustive bibliography of DSD papers; this directly affects the reliability of the subsequent comparison and future-directions sections.
Authors: We agree that the abstract should preview the classification scheme. The body of the manuscript organizes papers into groups (exact vs. approximate, static vs. dynamic, homogeneous vs. heterogeneous, etc.) with assignment based on the dominant technical approach. In the revision we will: (a) explicitly list the groups and their assignment criteria in the abstract/introduction, (b) add a summary table mapping each of the ~50 papers to its group, and (c) clarify that groups are intended to be non-overlapping by primary technique while noting that some papers could fit multiple categories. We will also state the scope limitations and that the list is representative rather than exhaustive. revision: yes
Circularity Check
Survey paper with no derivations or self-referential reductions
full rationale
This paper is a literature survey that classifies and reviews ~50 external papers on densest subgraph discovery. It contains no mathematical derivations, fitted parameters, predictions, or ansatzes. The central claims rest on summarizing published external work rather than any internal chain that reduces to the paper's own inputs or self-citations. No load-bearing step matches any of the enumerated circularity patterns.
Axiom & Free-Parameter Ledger
Lean theorems connected to this paper
-
IndisputableMonolith/Cost/FunctionalEquation.leanwashburn_uniqueness_aczel unclear?
unclearRelation between the paper passage and the cited Recognition theorem.
classify existing DSD solutions into several groups, which cover around 50 research papers... thorough review... analyze and compare the models and solutions
-
IndisputableMonolith/Foundation/RealityFromDistinction.leanreality_from_one_distinction unclear?
unclearRelation between the paper passage and the cited Recognition theorem.
exact solutions... Goldberg... Charikar LP... Fang CoreExact; approximation... PeelApp, CoreApp, Bahmani MapReduce
-
IndisputableMonolith/Foundation/AlexanderDuality.leanalexander_duality_circle_linking unclear?
unclearRelation between the paper passage and the cited Recognition theorem.
[x,y]-core... maximum cn-pair... 2-approximation
What do these tags mean?
- matches
- The paper's claim is directly supported by a theorem in the formal canon.
- supports
- The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends
- The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses
- The paper appears to rely on the theorem as machinery.
- contradicts
- The paper's claim conflicts with a theorem or certificate in the canon.
- unclear
- Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
Reference graph
Works this paper leans on
-
[1]
In: LATIN, Springer, pp 598–612
Abello J, Resende MG, Sudarsky S (2002) Mas- sive quasi-clique detection. In: LATIN, Springer, pp 598–612
work page 2002
-
[2]
Albert R, Jeong H, Barab´ asi AL (1999) Diameter of the world-wide web. nature 401(6749):130–131
work page 1999
-
[3]
In: Social net- works: Analysis and case studies, Springer, pp 105–125
Amelio A, Pizzuti C (2014) Overlapping commu- nity discovery methods: a survey. In: Social net- works: Analysis and case studies, Springer, pp 105–125
work page 2014
-
[4]
Andersen R (2010) A local algorithm for finding dense subgraphs. ACM TALG 6(4):1–12
work page 2010
-
[5]
Andersen R, Chellapilla K (2009) Finding dense subgraphs with size bounds. In: WAW, Springer, pp 25–37
work page 2009
-
[6]
Angadi A, Varma PS (2015) Overlapping commu- nity detection in temporal networks. IJST 8(31)
work page 2015
-
[7]
Angel A, Koudas N, Sarkas N, Srivastava D, Svendsen M, Tirthapura S (2014) Dense subgraph maintenance under streaming edge weight up- dates for real-time story identification. VLDB J 23(2):175–199
work page 2014
-
[8]
Theory of computing 8(1):121– 164
Arora S, Hazan E, Kale S (2012) The multiplica- tive weights update method: a meta-algorithm and applications. Theory of computing 8(1):121– 164
work page 2012
-
[9]
Jour- nal of Algorithms 34(2):203–221
Asahiro Y, Iwama K, Tamaki H, Tokuyama T (2000) Greedily finding a dense subgraph. Jour- nal of Algorithms 34(2):203–221
work page 2000
-
[10]
Bahmani B, Kumar R, Vassilvitskii S (2012) Densest subgraph in streaming and mapreduce. PVLDB 5(5)
work page 2012
-
[11]
Bahmani B, Goel A, Munagala K (2014) Efficient primal-dual graph algorithms for mapreduce. In: WAW, Springer, pp 59–78
work page 2014
-
[12]
Balalau OD, Bonchi F, Chan TH, Gullo F, Sozio M (2015) Finding subgraphs with maximum total density and limited overlap. In: WSDM, pp 379– 388
work page 2015
-
[13]
Operations Research 59(1):133–142
Balasundaram B, Butenko S, Hicks IV (2011) Clique relaxations in social network analysis: The maximum k-plex problem. Operations Research 59(1):133–142
work page 2011
-
[14]
An O(m) Algorithm for Cores Decomposition of Networks
Batagelj V, Zaversnik M (2003) An o (m) algo- rithm for cores decomposition of networks. arXiv preprint cs/0310049
work page internal anchor Pith review Pith/arXiv arXiv 2003
-
[15]
SIAM journal on imaging sciences 2(1):183–202
Beck A, Teboulle M (2009) A fast iterative shrinkage-thresholding algorithm for linear in- verse problems. SIAM journal on imaging sciences 2(1):183–202
work page 2009
-
[16]
Behrouz A, Hashemi F, Lakshmanan LVS (2022) Firmtruss community search in multilayer net- works. PVLDB 16(3):505–518
work page 2022
-
[17]
Beutel A, Xu W, Guruswami V, Palow C, Falout- sos C (2013) Copycatch: stopping group attacks by spotting lockstep behavior in social networks. In: WWW, pp 119–130
work page 2013
-
[18]
Bhaskara A, Charikar M, Chlamtac E, Feige U, Vijayaraghavan A (2010) Detecting high log- densities: an o (n 1 /4) approximation for densest k-subgraph. In: STOC, pp 201–210
work page 2010
-
[19]
Bhaskara A, Charikar M, Guruswami V, Vija- yaraghavan A, Zhou Y (2012) Polynomial inte- grality gaps for strong sdp relaxations of densest k-subgraph. In: SODA, SIAM, pp 388–405
work page 2012
-
[20]
Bhattacharya S, Henzinger M, Nanongkai D, Tsourakakis C (2015) Space-and time-efficient al- gorithm for maintaining dense subgraphs on one- pass dynamic streams. In: STOC, pp 173–182
work page 2015
-
[21]
Billionnet A, Roupin F (2008) A deterministic ap- proximation algorithm for the densest k-subgraph problem. IJOR 3(3):301–314
work page 2008
-
[22]
Blondel VD, Guillaume JL, Lambiotte R, Lefebvre E (2008) Fast unfolding of communities in large networks. JSTAT 2008(10):P10,008
work page 2008
-
[23]
Bonchi F, Gullo F, Kaltenbrunner A, Volkovich Y (2014) Core decomposition of uncertain graphs. In: Macskassy SA, Perlich C, Leskovec J, Wang W, Ghani R (eds) SIGKDD, ACM, pp 1316–1325 22 Wensheng Luo 1 Chenhao Ma1 Yixiang Fang1 Laks V. S. Lakshmanan 2
work page 2014
-
[24]
Discrete Applied Mathe- matics 305:34–47
Bonchi F, Garc´ ıa-Soriano D, Miyauchi A, Tsourakakis CE (2021) Finding densest k- connected subgraphs. Discrete Applied Mathe- matics 305:34–47
work page 2021
-
[25]
Boob D, Sawlani S, Wang D (2019) Faster width- dependent algorithm for mixed packing and cov- ering lps. NIPS 32
work page 2019
-
[26]
Boob D, Gao Y, Peng R, Sawlani S, Tsourakakis C, Wang D, Wang J (2020) Flowless: Extracting densest subgraphs without flow computations. In: WWW
work page 2020
-
[27]
Borodin A, Lee HC, Ye Y (2012) Max-sum di- versification, monotone submodular functions and dynamic updates. In: PODS, pp 155–166
work page 2012
-
[28]
In: WALCOM, Springer, pp 114–125
Bourgeois N, Giannakos A, Lucarelli G, Milis I, Paschos VT (2013) Exact and approximation al- gorithms for densest k-subgraph. In: WALCOM, Springer, pp 114–125
work page 2013
-
[29]
Buehrer G, Chellapilla K (2008) A scalable pat- tern mining approach to web graph compression with communities. In: WSDM, pp 95–106
work page 2008
-
[30]
Information Systems 39:233–255
Calders T, Dexters N, Gillis JJ, Goethals B (2014) Mining frequent itemsets in a stream. Information Systems 39:233–255
work page 2014
-
[31]
Chang L, Qiao M (2020) Deconstruct densest sub- graphs. In: WWW, pp 2747–2753
work page 2020
-
[32]
Chang L, Yu JX, Qin L, Lin X, Liu C, Liang W (2013) Efficiently computing k-edge connected components via graph decomposition. In: SIG- MOD, pp 205–216
work page 2013
-
[33]
In: APPROX, Springer, pp 84–95
Charikar M (2000) Greedy approximation algo- rithms for finding dense components in a graph. In: APPROX, Springer, pp 84–95
work page 2000
-
[34]
Chekuri C, Quanrud K, Torres MR (2022) Densest subgraph: Supermodularity, iterative peeling, and flow. In: SODA, SIAM, pp 1531–1555
work page 2022
-
[35]
Chen J, Saad Y (2010) Dense subgraph extraction with application to community detection. TKDE 24(7):1216–1230
work page 2010
-
[36]
Cheng J, Ke Y, Chu S, ¨Ozsu MT (2011) Effi- cient core decomposition in massive networks. In: ICDE, IEEE, pp 51–62
work page 2011
-
[37]
Ching A, Edunov S, Kabiljo M, Logothetis D, Muthukrishnan S (2015) One trillion edges: Graph processing at facebook-scale. PVLDB 8(12):1804– 1815
work page 2015
-
[38]
Cohen E, Halperin E, Kaplan H, Zwick U (2003) Reachability and distance queries via 2-hop labels. SIAM J Comput 32(5):1338–1355
work page 2003
-
[39]
National security agency technical report 16(3.1)
Cohen J (2008) Trusses: Cohesive subgraphs for social network analysis. National security agency technical report 16(3.1)
work page 2008
-
[40]
Conte A, De Matteis T, De Sensi D, Grossi R, Marino A, Versari L (2018) D2k: scalable com- munity detection in massive networks via small- diameter k-plexes. In: SIGKDD, pp 1272–1281
work page 2018
-
[41]
Dai Q, Li RH, Qin H, Liao M, Wang G (2022) Scal- ing up maximal k-plex enumeration. In: CIKM, pp 345–354
work page 2022
-
[42]
Dai Y, Qiao M, Chang L (2022) Anchored densest subgraph. In: SIGMOD, pp 1200–1213
work page 2022
-
[43]
Danisch M, Chan THH, Sozio M (2017) Large scale density-friendly graph decomposition via convex programming. In: WWW, pp 233–242
work page 2017
-
[44]
Danisch M, Balalau O, Sozio M (2018) Listing k- cliques in sparse real-world graphs. In: WWW, pp 589–598
work page 2018
-
[45]
In: ISDC, Springer, pp 151–165
Das Sarma A, Lall A, Nanongkai D, Trehan A (2012) Dense subgraphs on dynamic networks. In: ISDC, Springer, pp 151–165
work page 2012
-
[46]
Ding D, Li H, Huang Z, Mamoulis N (2017) Ef- ficient fault-tolerant group recommendation using alpha-beta-core. In: CIKM, pp 2047–2050
work page 2017
-
[47]
In: Theoretical com- puter science, Springer, pp 218–240
Dinitz Y (2006) Dinitz’algorithm: The original version and even’s version. In: Theoretical com- puter science, Springer, pp 218–240
work page 2006
-
[48]
Applied Network Science 6(1):1–17
Dondi R, Hosseinzadeh MM, Guzzi PH (2021) A novel algorithm for finding top-k weighted over- lapping densest connected subgraphs in dual net- works. Applied Network Science 6(1):1–17
work page 2021
-
[49]
Dondi R, Hosseinzadeh MM, Mauri G, Zoppis I (2021) Top-k overlapping densest subgraphs: ap- proximation algorithms and computational com- plexity. J Comb Optim 41(1):80–104
work page 2021
-
[50]
Epasto A, Lattanzi S, Sozio M (2015) Efficient densest subgraph computation in evolving graphs. In: WWW, pp 300–310
work page 2015
-
[51]
Expert P, Evans TS, Blondel VD, Lambiotte R (2011) Uncovering space-independent communi- ties in spatial networks. PNAS 108(19):7663–7668
work page 2011
-
[52]
Fang Y, Cheng R, Li X, Luo S, Hu J (2017) Effec- tive community search over large spatial graphs. PVLDB 10(6):709–720
work page 2017
-
[53]
Fang Y, Wang Z, Cheng R, Wang H, Hu J (2018) Effective and efficient community search over large directed graphs. TKDE 31(11):2093–2107
work page 2018
-
[54]
Fang Y, Yu K, Cheng R, Lakshmanan LV, Lin X (2019) Efficient algorithms for densest subgraph discovery. PVLDB 12(11):1719–1732
work page 2019
-
[55]
Fang Y, Huang X, Qin L, Zhang Y, Zhang W, Cheng R, Lin X (2020) A survey of community search over big graphs. VLDBJ 29(1):353–392
work page 2020
-
[56]
Fang Y, Luo W, Ma C (2022) Densest sub- graph discovery on large graphs: Applications, challenges, and techniques. PVLDB 15(12):3766– 3769
work page 2022
-
[57]
Farag´ o A, R Mojaveri Z (2019) In search of the densest subgraph. Algorithms 12(8):157
work page 2019
-
[58]
Algorithmica 29(3):410–421 A Survey of Densest Subgraph Discovery on Large Graphs 23
Feige U, Peleg D, Kortsarz G (2001) The dense k-subgraph problem. Algorithmica 29(3):410–421 A Survey of Densest Subgraph Discovery on Large Graphs 23
work page 2001
-
[59]
Bioinfor- matics 22(14):e150–e157
Fratkin E, Naughton BT, Brutlag DL, Bat- zoglou S (2006) Motifcut: regulatory motifs find- ing with maximum density subgraphs. Bioinfor- matics 22(14):e150–e157
work page 2006
-
[60]
Galbrun E, Gionis A, Tatti N (2016) Top-k over- lapping densest subgraphs. DMKD 30(5):1134– 1165
work page 2016
-
[61]
Galimberti E, Bonchi F, Gullo F (2017) Core de- composition and densest subgraph in multilayer networks. In: CIKM, pp 1807–1816
work page 2017
-
[62]
Galimberti E, Bonchi F, Gullo F, Lanciano T (2020) Core decomposition in multilayer networks: theory, algorithms, and applications. TKDD 14(1):1–40
work page 2020
-
[63]
Galimberti E, Bonchi F, Gullo F, Lanciano T (2020) Core decomposition in multilayer networks: Theory, algorithms, and applications. TKDD
work page 2020
-
[64]
Giatsidis C, Thilikos DM, Vazirgiannis M (2013) D-cores: measuring collaboration of directed graphs based on degeneracy. KAIS 35(2):311–343
work page 2013
-
[65]
In: VLDB, Citeseer, pp 721–732
Gibson D, Kumar R, Tomkins A (2005) Discover- ing large dense subgraphs in massive graphs. In: VLDB, Citeseer, pp 721–732
work page 2005
-
[66]
Gionis A, Tsourakakis CE (2015) Dense subgraph discovery: Kdd 2015 tutorial. In: SIGKDD, pp 2313–2314
work page 2015
-
[67]
Gionis A, Junqueira FP, Leroy V, Serafini M, We- ber I (2013) Piggybacking on social networks. In: VLDB, vol 6, pp 409–420
work page 2013
-
[68]
Girvan M, Newman ME (2002) Community struc- ture in social and biological networks. PNAS 99(12):7821–7826
work page 2002
-
[69]
University of California Berkeley
Goldberg AV (1984) Finding a maximum density subgraph. University of California Berkeley
work page 1984
-
[70]
In: COM- PLEX NETWORKS, Springer, pp 116–127
Gonzales S, Migler T (2019) The densest k sub- graph problem in b-outerplanar graphs. In: COM- PLEX NETWORKS, Springer, pp 116–127
work page 2019
-
[71]
New journal of Physics 12(10):103,018
Gregory S (2010) Finding overlapping communi- ties in networks by label propagation. New journal of Physics 12(10):103,018
work page 2010
-
[72]
Hajibagheri A, Alvari H, Hamzeh A, Hashemi S (2012) Community detection in social networks using information diffusion. In: ASONAM, IEEE, pp 702–703
work page 2012
- [73]
-
[74]
Hashemi F, Behrouz A, Lakshmanan LVS (2022) Firmcore decomposition of multilayer networks. In: Laforest F, Troncy R, Simperl E, Agarwal D, Gionis A, Herman I, M´ edini L (eds) WWW, ACM, pp 1589–1600
work page 2022
-
[75]
Henderson K, Eliassi-Rad T, Papadimitriou S, Faloutsos C (2010) Hcdf: A hybrid community dis- covery framework. In: SDM, SIAM, pp 754–765
work page 2010
-
[76]
Hooi B, Song HA, Beutel A, Shah N, Shin K, Faloutsos C (2016) Fraudar: Bounding graph fraud in the face of camouflage. In: SIGKDD, pp 895–904
work page 2016
-
[77]
Hu J, Wu X, Cheng R, Luo S, Fang Y (2016) Querying minimal steiner maximum-connected subgraphs in large graphs. In: CIKM, pp 1241– 1250
work page 2016
-
[78]
Hu J, Cheng R, Chang KCC, Sankar A, Fang Y, Lam BY (2019) Discovering maximal motif cliques in large heterogeneous information networks. In: ICDE, IEEE, pp 746–757
work page 2019
-
[79]
Hu S, Wu X, Chan TH (2017) Maintaining dens- est subsets efficiently in evolving hypergraphs. In: CIKM, pp 929–938
work page 2017
-
[80]
In: ¨Ozcan F, Koutrika G, Madden S (eds) SIGMOD, ACM, pp 77–90
Huang X, Lu W, Lakshmanan LVS (2016) Truss decomposition of probabilistic graphs: Semantics and algorithms. In: ¨Ozcan F, Koutrika G, Madden S (eds) SIGMOD, ACM, pp 77–90
work page 2016
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.