pith. sign in

arxiv: 1907.06768 · v2 · pith:7WB3BUG3new · submitted 2019-07-15 · 💻 cs.DC

Partitioning Graphs for the Cloud using Reinforcement Learning

Pith reviewed 2026-05-24 20:57 UTC · model grok-4.3

classification 💻 cs.DC
keywords graph partitioningreinforcement learninglabel propagationvertex-centricshared-memory parallelismload balancinglarge-scale graphscloud computing
0
0 comments X

The pith

Revolver is a scalable graph partitioning algorithm that uses asynchronous reinforcement learning in a vertex-centric framework to achieve localized partitions without sacrificing load balance.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Revolver partitions large-scale graphs on a single shared-memory machine using an asynchronous framework that combines reinforcement learning and label propagation. Each vertex is assigned an autonomous agent that selects a suitable partition based on local neighborhood information. This vertex-centric approach distributes the computation across all vertices without requiring global coordination. Comprehensive tests on nine real-world graphs demonstrate that Revolver produces partitions comparable in localization to three state-of-the-art methods while maintaining load balance. The design targets efficient partitioning for cloud environments.

Core claim

Revolver employs an asynchronous processing framework leveraging reinforcement learning and label propagation to adaptively partition a graph in a vertex-centric manner, where each vertex has an autonomous agent responsible for selecting its partition, enabling parallel partitioning on shared-memory machines that produces localized partitions without load imbalance.

What carries the argument

Asynchronous vertex-centric reinforcement learning combined with label propagation, where each vertex acts as an independent agent choosing its partition based on local information.

If this is right

  • The algorithm scales to large graphs on a single shared-memory machine.
  • It achieves comparable localization to popular graph partitioners.
  • Load balance across partitions is preserved without post-processing.
  • Computation is distributed across vertices using local neighborhood data.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • This approach might extend to dynamic graphs where partitions need to adapt over time.
  • It could lower communication overhead in cloud-based graph analytics by keeping related vertices together.
  • Testing on even larger graphs or different hardware could reveal scalability limits.

Load-bearing premise

An asynchronous vertex-centric reinforcement learning plus label propagation framework will converge to high-quality partitions on real-world graphs without requiring global coordination or post-processing steps that could reintroduce load imbalance.

What would settle it

Running Revolver on the nine real-world graphs and comparing the resulting partition localization and load balance metrics directly against the three state-of-the-art partitioners; if Revolver shows worse results in either metric, the claim would be falsified.

Figures

Figures reproduced from arXiv: 1907.06768 by Mohammad Hammoud, Mohammad Hasanzadeh Mofrad, Rami Melhem.

Figure 1
Figure 1. Figure 1: LA with k actions are allocated for nodes of graph. The rectangles represent LA actions associated with partitions. partitions for vertices in parallel; the action set of a learning automaton is the same as the range of available partitions and the probability of actions P is initialized to 1/k, 4) scores that are generated from multiple passes of (10) are evaluated by (13) to form the weight vector W, 5) … view at source ↗
Figure 2
Figure 2. Figure 2: An example of a 3-action learning automaton, LA [PITH_FULL_IMAGE:figures/full_fig_p005_2.png] view at source ↗
Figure 3
Figure 3. Figure 3: Average local edges (bars scaled according to the left y axis) and max normalized load (lines scaled according to the right y axis) of Revolver, [PITH_FULL_IMAGE:figures/full_fig_p007_3.png] view at source ↗
Figure 4
Figure 4. Figure 4: Revolver convergence for LJ with 32 partitions across 290 steps. [PITH_FULL_IMAGE:figures/full_fig_p008_4.png] view at source ↗
read the original abstract

In this paper, we propose Revolver, a parallel graph partitioning algorithm capable of partitioning large-scale graphs on a single shared-memory machine. Revolver employs an asynchronous processing framework, which leverages reinforcement learning and label propagation to adaptively partition a graph. In addition, it adopts a vertex-centric view of the graph where each vertex is assigned an autonomous agent responsible for selecting a suitable partition for it, distributing thereby the computation across all vertices. The intuition behind using a vertex-centric view is that it naturally fits the graph partitioning problem, which entails that a graph can be partitioned using local information provided by each vertex's neighborhood. We fully implemented and comprehensively tested Revolver using nine real-world graphs. Our results show that Revolver is scalable and can outperform three popular and state-of-the-art graph partitioners via producing comparable localized partitions, yet without sacrificing the load balance across partitions.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 1 minor

Summary. The paper proposes Revolver, an asynchronous vertex-centric graph partitioning algorithm that combines reinforcement learning with label propagation. Each vertex acts as an autonomous agent making local partition decisions on a shared-memory machine. The central empirical claim is that Revolver scales to large graphs and outperforms three popular state-of-the-art partitioners on nine real-world graphs by achieving comparable edge-cut quality while preserving load balance across partitions.

Significance. If the empirical comparisons are supported by detailed, reproducible metrics and the RL+label-propagation framework demonstrably maintains balance without hidden global steps, the work would provide a novel local-decision RL approach to graph partitioning that could be relevant for cloud-scale workloads. The use of nine real-world graphs and the vertex-centric asynchronous design are positive aspects of the experimental setup.

major comments (3)
  1. [Abstract, §3] Abstract and §3 (method description): the central claim that the method produces balanced partitions 'without sacrificing the load balance across partitions' and 'without requiring global coordination' rests on the unstated assumption that purely local per-vertex RL decisions plus label propagation suffice to enforce global balance. No description is given of the reward function, convergence criteria, or any mechanism that prevents partition-size drift under asynchrony; this directly undermines assessment of the 'without sacrificing load balance' result.
  2. [Abstract, results section] Abstract and results section: the claim of outperformance on nine graphs supplies no quantitative metrics (edge-cut ratios, balance ratios, runtime), no baseline implementation details, and no description of the three SOTA partitioners or the RL hyperparameters. Without these, the empirical support for the central claim cannot be evaluated.
  3. [§4] §4 (experimental evaluation): if any post-processing or global aggregation step is used to restore balance after local RL updates, this must be explicitly stated and measured, because its presence would contradict the 'asynchronous vertex-centric' and 'no global coordination' framing that underpins the scalability claim.
minor comments (1)
  1. [§3] Notation for the RL state, action, and reward should be introduced with explicit equations rather than prose only.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the constructive feedback. The comments identify important areas for improving clarity on the balance mechanism and experimental details. We address each point below and will revise the manuscript accordingly.

read point-by-point responses
  1. Referee: [Abstract, §3] Abstract and §3 (method description): the central claim that the method produces balanced partitions 'without sacrificing the load balance across partitions' and 'without requiring global coordination' rests on the unstated assumption that purely local per-vertex RL decisions plus label propagation suffice to enforce global balance. No description is given of the reward function, convergence criteria, or any mechanism that prevents partition-size drift under asynchrony; this directly undermines assessment of the 'without sacrificing load balance' result.

    Authors: We agree that §3 would benefit from expanded detail on these aspects. The reward function combines a local edge-cut term with a load-imbalance penalty based on neighborhood partition counts, and label propagation is used to propagate stable decisions. Convergence is determined by local stability thresholds per vertex. We will add a dedicated subsection in the revision describing the reward function, convergence criteria, and an argument (supported by the vertex-centric update rule) that asynchrony does not produce measurable drift because each update only considers current local state. This will directly support the balance claim. revision: yes

  2. Referee: [Abstract, results section] Abstract and results section: the claim of outperformance on nine graphs supplies no quantitative metrics (edge-cut ratios, balance ratios, runtime), no baseline implementation details, and no description of the three SOTA partitioners or the RL hyperparameters. Without these, the empirical support for the central claim cannot be evaluated.

    Authors: The full experimental section contains tables reporting edge-cut ratios, balance ratios (max/min partition size), and runtimes for all nine graphs against the three baselines, along with hyperparameter settings. However, the abstract and high-level results summary are indeed terse. In the revision we will insert key quantitative values into the abstract and add a short paragraph in the results section naming the baselines (METIS, KaHIP, Scotch) and listing the main RL hyperparameters used. revision: yes

  3. Referee: [§4] §4 (experimental evaluation): if any post-processing or global aggregation step is used to restore balance after local RL updates, this must be explicitly stated and measured, because its presence would contradict the 'asynchronous vertex-centric' and 'no global coordination' framing that underpins the scalability claim.

    Authors: Revolver performs no post-processing or global aggregation step; balance is maintained solely by the per-vertex RL decisions and asynchronous label propagation. We will add an explicit sentence in §4 confirming the absence of any such step and will reference the per-graph balance ratios already reported to demonstrate that balance holds without it. revision: yes

Circularity Check

0 steps flagged

No circularity; empirical claims rest on external graph comparisons

full rationale

The paper introduces Revolver as a vertex-centric asynchronous RL + label propagation algorithm and reports its performance via direct experimental comparison against three existing partitioners on nine real-world graphs. No equations, fitted parameters, self-citations, or uniqueness theorems appear in the provided text. The central claims are therefore falsifiable against independent external implementations and datasets, satisfying the self-contained criterion.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract-only review supplies no explicit free parameters, axioms, or invented entities; the RL reward structure and convergence assumptions are not stated.

pith-pipeline@v0.9.0 · 5676 in / 1114 out tokens · 15881 ms · 2026-05-24T20:57:14.099922+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Reference graph

Works this paper leans on

44 extracted references · 44 canonical work pages

  1. [1]

    La3: a scalable link-and locality-aware linear algebra-based graph analytics system,

    Y . Ahmad, O. Khattab, A. Malik, A. Musleh, M. Hammoud, M. Kutlu, M. Shehata, and T. Elsayed, “La3: a scalable link-and locality-aware linear algebra-based graph analytics system,” Proceedings of the VLDB Endowment, vol. 11, no. 8, pp. 920–933, 2018

  2. [2]

    Performance analysis of vertex centric graph algorithms on the azure cloud platform,

    M. Redekopp, Y . Simmhan, and V . K. Prasanna, “Performance analysis of vertex centric graph algorithms on the azure cloud platform,” in Workshop on Parallel Algorithms and Software for Analysis of Massive Graphs. Citeseer, 2011

  3. [3]

    Pow- ergraph: Distributed graph-parallel computation on natural graphs

    J. E. Gonzalez, Y . Low, H. Gu, D. Bickson, and C. Guestrin, “Pow- ergraph: Distributed graph-parallel computation on natural graphs.” in OSDI, vol. 12, no. 1, 2012, p. 2

  4. [4]

    (2018) Amazon elastic compute cloud (amazon ec2)

    Amazon.com, Inc. (2018) Amazon elastic compute cloud (amazon ec2). [Online]. Available: https://aws.amazon.com/ec2/

  5. [5]

    (2018) Microsoft azure cloud services

    Microsoft Corporation. (2018) Microsoft azure cloud services. [Online]. Available: https://azure.microsoft.com/

  6. [6]

    (2018) Google app engine platfrorm

    Google LLC. (2018) Google app engine platfrorm. [Online]. Available: https://cloud.google.com/appengine/

  7. [7]

    Gunrock: A high-performance graph processing library on the gpu,

    Y . Wang, A. Davidson, Y . Pan, Y . Wu, A. Riffel, and J. D. Owens, “Gunrock: A high-performance graph processing library on the gpu,” in PPoPP, 2016, p. 11

  8. [8]

    Ligra: a lightweight graph processing framework for shared memory,

    J. Shun and G. E. Blelloch, “Ligra: a lightweight graph processing framework for shared memory,” in ACM Sigplan Notices, vol. 48, no. 8, 2013, pp. 135–146

  9. [9]

    Numa-aware graph-structured ana- lytics,

    K. Zhang, R. Chen, and H. Chen, “Numa-aware graph-structured ana- lytics,” in ACM SIGPLAN Notices , vol. 50, no. 8, 2015, pp. 183–193

  10. [10]

    A lightweight infrastructure for graph analytics,

    D. Nguyen, A. Lenharth, and K. Pingali, “A lightweight infrastructure for graph analytics,” in 24th ACM SOSP , 2013, pp. 456–471

  11. [11]

    Graphchi: Large-scale graph computation on just a pc

    A. Kyrola, G. E. Blelloch, C. Guestrin et al. , “Graphchi: Large-scale graph computation on just a pc.” in OSDI, vol. 12, 2012, pp. 31–46

  12. [12]

    X-stream: Edge-centric graph processing using streaming partitions,

    A. Roy, I. Mihailovic, and W. Zwaenepoel, “X-stream: Edge-centric graph processing using streaming partitions,” in SOSP, 2013, p. 472

  13. [13]

    Gridgraph: Large-scale graph processing on a single machine using 2-level hierarchical partitioning

    X. Zhu, W. Han, and W. Chen, “Gridgraph: Large-scale graph processing on a single machine using 2-level hierarchical partitioning.” in USENIX Annual Technical Conference, 2015, pp. 375–386

  14. [14]

    Mosaic: Processing a trillion-edge graph on a single machine,

    S. Maass, C. Min, S. Kashyap, W. Kang, M. Kumar, and T. Kim, “Mosaic: Processing a trillion-edge graph on a single machine,” in 12th ACM EuroSys, 2017, pp. 527–543

  15. [15]

    One trillion edges: Graph processing at facebook-scale,

    A. Ching, S. Edunov, M. Kabiljo, D. Logothetis, and S. Muthukrishnan, “One trillion edges: Graph processing at facebook-scale,” Proceedings of the VLDB Endowment , vol. 8, no. 12, pp. 1804–1815, 2015

  16. [16]

    Pregel: a system for large-scale graph processing,

    G. Malewicz, M. H. Austern, A. J. Bik, J. C. Dehnert, I. Horn, N. Leiser, and G. Czajkowski, “Pregel: a system for large-scale graph processing,” in 2010 ACM SIGMOD International Conference on Management of data, 2010, pp. 135–146

  17. [17]

    Distributed graphlab: a framework for machine learning and data mining in the cloud,

    Y . Low, D. Bickson, J. Gonzalez, C. Guestrin, A. Kyrola, and J. M. Hellerstein, “Distributed graphlab: a framework for machine learning and data mining in the cloud,” Proceedings of the VLDB Endowment , vol. 5, no. 8, pp. 716–727, 2012

  18. [18]

    Powerlyra: Differentiated graph computation and partitioning on skewed graphs,

    R. Chen, J. Shi, Y . Chen, and H. Chen, “Powerlyra: Differentiated graph computation and partitioning on skewed graphs,” in 10th EuroSys, 2015

  19. [19]

    Sync or async: Time to fuse for distributed graph-parallel computation,

    C. Xie, R. Chen, H. Guan, B. Zang, and H. Chen, “Sync or async: Time to fuse for distributed graph-parallel computation,”ACM SIGPLAN Notices, vol. 50, no. 8, pp. 194–204, 2015

  20. [20]

    Goffish: A sub-graph centric frame- work for large-scale graph analytics,

    Y . Simmhan, A. Kumbhare, C. Wickramaarachchi, S. Nagarkar, S. Ravi, C. Raghavendra, and V . Prasanna, “Goffish: A sub-graph centric frame- work for large-scale graph analytics,” in European Conference on Parallel Processing. Springer, 2014, pp. 451–462

  21. [21]

    From think like a vertex to think like a graph,

    Y . Tian, A. Balmin, S. A. Corsten, S. Tatikonda, and J. McPherson, “From think like a vertex to think like a graph,” Proceedings of the VLDB Endowment, vol. 7, no. 3, pp. 193–204, 2013

  22. [22]

    Distributed programming for the cloud: Models, challenges, and analytics engines

    M. Hammoud and M. F. Sakr, “Distributed programming for the cloud: Models, challenges, and analytics engines.” 2014

  23. [23]

    Ja-be-ja: A distributed algorithm for balanced graph par- titioning,

    F. Rahimian, A. H. Payberah, S. Girdzijauskas, M. Jelasity, and S. Haridi, “Ja-be-ja: A distributed algorithm for balanced graph par- titioning,” in 7th IEEE SASO , 2013, pp. 51–60

  24. [24]

    Fennel: Streaming graph partitioning for massive scale graphs,

    C. Tsourakakis, C. Gkantsidis, B. Radunovic, and M. V ojnovic, “Fennel: Streaming graph partitioning for massive scale graphs,” in 7th ACM WSDM, 2014, pp. 333–342

  25. [25]

    Spinner: Scalable graph partitioning in the cloud,

    C. Martella, D. Logothetis, A. Loukas, and G. Siganos, “Spinner: Scalable graph partitioning in the cloud,” in ICDE, 2017, p. 1083

  26. [26]

    A bridging model for parallel computation,

    L. G. Valiant, “A bridging model for parallel computation,” Communi- cations of the ACM , vol. 33, no. 8, pp. 103–111, 1990

  27. [27]

    Detecting network communities by propagating labels under constraints,

    M. J. Barber and J. W. Clark, “Detecting network communities by propagating labels under constraints,” Physical Review E, vol. 80, 2009

  28. [28]

    K. S. Narendra and M. A. Thathachar, Learning automata: an introduc- tion. Courier Corporation, 2012

  29. [29]

    Adaptive cooperative particle swarm optimizer,

    M. Hasanzadeh, M. R. Meybodi, and M. M. Ebadzadeh, “Adaptive cooperative particle swarm optimizer,” Applied Intelligence , vol. 39, no. 2, pp. 397–420, 2013

  30. [30]

    Grid resource discovery based on distributed learning automata,

    M. Hasanzadeh and M. R. Meybodi, “Grid resource discovery based on distributed learning automata,” Computing, vol. 96, pp. 909–922, 2014

  31. [31]

    Distributed optimization grid resource discovery,

    M. Hasanzadeh and M. R. Meybodi, “Distributed optimization grid resource discovery,” The Journal of Supercomputing , vol. 71, no. 1, pp. 87–120, 2015

  32. [32]

    On achieving intelligent traffic-aware consolidation of virtual machines in a data center using learning automata,

    A. Jobava, A. Yazidi, B. J. Oommen, and K. Begnum, “On achieving intelligent traffic-aware consolidation of virtual machines in a data center using learning automata,” Journal of Computational Science, vol. 24, pp. 290–312, 2018

  33. [33]

    A new learning automata based sampling algorithm for social networks,

    A. Rezvanian and M. R. Meybodi, “A new learning automata based sampling algorithm for social networks,” International Journal of Com- munication Systems, vol. 30, no. 5, pp. 1–21, 2017

  34. [34]

    Cellular edge detection: Combining cellular automata and cellular learning au- tomata,

    M. H. Mofrad, S. Sadeghi, A. Rezvanian, and M. R. Meybodi, “Cellular edge detection: Combining cellular automata and cellular learning au- tomata,” AEU-International Journal of Electronics and Communications, vol. 69, no. 9, pp. 1282–1290, 2015

  35. [35]

    Learning automata cluster- ing,

    M. Hasanzadeh-Mofrad and A. Rezvanian, “Learning automata cluster- ing,” Journal of Computational Science , vol. 24, pp. 379–388, 2018

  36. [36]

    An efficient heuristic procedure for partitioning graphs,

    B. W. Kernighan and S. Lin, “An efficient heuristic procedure for partitioning graphs,” The Bell system technical journal , vol. 49, no. 2, pp. 291–307, 1970

  37. [37]

    Spectral k-way ratio-cut partitioning and clustering,

    P. K. Chan, M. D. Schlag, and J. Y . Zien, “Spectral k-way ratio-cut partitioning and clustering,” ransactions on computer-aided design of integrated circuits and systems , vol. 13, no. 9, pp. 1088–1096, 1994

  38. [38]

    A fast and high quality multilevel scheme for partitioning irregular graphs,

    G. Karypis and V . Kumar, “A fast and high quality multilevel scheme for partitioning irregular graphs,” SIAM Journal on scientific Computing, vol. 20, no. 1, pp. 359–392, 1998

  39. [39]

    Near linear time algorithm to detect community structures in large-scale networks,

    U. N. Raghavan, R. Albert, and S. Kumara, “Near linear time algorithm to detect community structures in large-scale networks,” Physical review E, vol. 76, no. 3, p. 036106, 2007

  40. [40]

    Probability matching, the magnitude of reinforcement, and classifier system bidding,

    D. E. Goldberg, “Probability matching, the magnitude of reinforcement, and classifier system bidding,” Machine Learning , vol. 5, no. 4, pp. 407–425, 1990

  41. [41]

    Revolver: vertex-centric graph partitioning using reinforcement learning,

    M. H. Mofrad, R. Melhem, and M. Hammoud, “Revolver: vertex-centric graph partitioning using reinforcement learning,” in 2018 IEEE 11th International Conference on Cloud Computing (CLOUD) . IEEE, 2018, pp. 818–821

  42. [42]

    SNAP Datasets: Stanford large network dataset collection,

    J. Leskovec and A. Krevl, “SNAP Datasets: Stanford large network dataset collection,” http://snap.stanford.edu/data

  43. [43]

    The webgraph framework i: Compression techniques,

    P. Boldi and S. Vigna, “The webgraph framework i: Compression techniques,” in 13th ACM WWW , 2004, pp. 595–601

  44. [44]

    9th DIMACS implementation challenge,

    C. Demetrescu, A. V . Goldberg, and D. S. Johnson, “9th DIMACS implementation challenge,” http://www.diag.uniroma1.it/challenge9/