pith. sign in

arxiv: 1906.12018 · v1 · pith:OGNE57DLnew · submitted 2019-06-28 · 💻 cs.DB · cs.DC

Pruned Landmark Labeling Meets Vertex Centric Computation: A Surprisingly Happy Marriage!

Pith reviewed 2026-05-25 13:49 UTC · model grok-4.3

classification 💻 cs.DB cs.DC
keywords pruned landmark labelingvertex-centric computationgraph reachability queriesparallel graph algorithmsmulticore processingbatch execution modeldirected and weighted graphs
0
0 comments X

The pith

BVC-PLL adapts pruned landmark labeling to vertex-centric computation and achieves lower computational and memory costs than the original sequential PLL under a reasonable assumption.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper establishes that the sequential pruned landmark labeling algorithm can be recast in a vertex-centric model despite its apparent sequential dependencies. It introduces VC-PLL to resolve the mismatch and then BVC-PLL, which adds a batch execution model to eliminate redundant work. Theoretical comparison shows that BVC-PLL incurs lower computation and memory-access costs than PLL when the assumption holds, and experiments confirm the sequential version runs more than twice as fast while also scaling in parallel. The same approach extends directly to directed and weighted graphs and to hierarchical parallelism on multicore hardware.

Core claim

By replacing the sequential traversal of PLL with a vertex-centric computation model and introducing batch execution, BVC-PLL produces exactly the same landmark labels yet exhibits strictly lower computational and memory-access costs than the original PLL under a reasonable assumption, and therefore can serve as a faster sequential algorithm while also admitting efficient parallel execution.

What carries the argument

The batch execution model inside the vertex-centric computation of pruned landmark labeling, which groups vertex updates to remove redundant distance and label checks.

If this is right

  • BVC-PLL extends without modification to directed graphs and to weighted graphs while preserving the same correctness and cost advantages.
  • The batch vertex-centric formulation admits direct use of hierarchical parallelism on modern multicore architectures with SIMD units.
  • The sequential BVC-PLL implementation runs more than twice as fast as the original PLL on real-world graphs.
  • The parallel version of BVC-PLL demonstrates both high efficiency and good scalability as the number of cores increases.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same batching technique could reduce redundant work in other label-propagation or distance-oracle algorithms that currently rely on sequential traversals.
  • Lower memory-access costs may translate into better cache behavior on graphs too large to fit in main memory, allowing single-machine processing of larger instances.
  • If the reasonable assumption turns out to hold for the majority of real-world graphs, practitioners could replace existing PLL implementations with BVC-PLL for both sequential and parallel use without changing query results.

Load-bearing premise

An unspecified reasonable assumption must hold for the claimed reduction in computational and memory-access costs of BVC-PLL relative to PLL.

What would settle it

Any concrete graph on which the total number of distance comparisons or random memory accesses performed by BVC-PLL exceeds the corresponding count for the original sequential PLL.

Figures

Figures reproduced from arXiv: 1906.12018 by Bin Ren, Feodor Dragan, Gagan Agrawal, Ruoming Jin, Wendell Wu, Zhen Peng.

Figure 1
Figure 1. Figure 1: 2-Hop Labeling and PLL Example Vertex Labels A {(A, 0), (D, 1), (I, 2)} B {(B, 0), (D, 1), (E, 1), (F, 2), (I, 2)} C {(C, 0), (D, 1), (E, 1), (F, 1), (I, 2), (D, 2)} D {(D, 0), (I, 1)} E {(E, 0), (D, 1), (I, 1)} F {(F, 0), (I, 1)} G {(G, 0), (F, 1), (L, 1)(I, 2)} H {(H, 0), (A, 1), (I, 1)} I {(I, 0)} J {(J, 0), (I, 1)} K {(K, 0), (J, 1), (F, 2), (I, 2)} L {(L, 0), (F, 1), (I, 2)} [PITH_FULL_IMAGE:figures/… view at source ↗
Figure 2
Figure 2. Figure 2: A VC-PLL Running Example to v is one step longer than the shortest path between u and v, and u will be pruned. Please refer to Appendix for proof. In addition, at each of these two iterations, if u has not been or is not in the label of v, then different neighbors of v may send the same (u, d(u, v)) messages to v. Additional Cost of Distance Check: Lemma 2. The set consisting of all pairs (u, v) for distan… view at source ↗
Figure 4
Figure 4. Figure 4: illustrates the key idea in the proof of Theorem 3. Assuming 9 vertices a, b, · · · , i in one batch being added into L(v) in PLL labeling, its total distance check cost is 36 no matter which order they are received in (visualized as the area under the diagonal stairs). Now assuming they arrive in three groups as shown in [PITH_FULL_IMAGE:figures/full_fig_p008_4.png] view at source ↗
Figure 5
Figure 5. Figure 5: The scalability of BVC-PLL (unweighted) and HOLLYWOOD) with an average speedup 1.58X over PLL. This observation is consistent with our theoretical analysis in Subsection 4.3. In the next subsection, we will perform a more detailed cost breakdown and comparison [PITH_FULL_IMAGE:figures/full_fig_p010_5.png] view at source ↗
Figure 6
Figure 6. Figure 6: Performance Analysis Graph PLL BVC-PLL Ratio pos. neg. pos. neg. pos. neg. GNUTELLA 22 23 19 24 1.18 0.96 DBLP 46 36 41 38 1.14 0.96 WIKITALK 32 19 16 19 2.00 1.00 YOUTUBE 108 138 80 138 1.35 1.00 TREC WT10G 314 103 300 104 1.05 0.99 SKITTER 260 513 208 508 1.25 1.01 CATS-DOGS 44 244 37 244 1.19 1.00 FLICKR 893 2,003 830 2,028 1.08 0.99 HOLLYWOOD 6,667 38,343 6,455 38,487 1.03 1.00 INDOCHINA 6,265 1,886 5,… view at source ↗
Figure 8
Figure 8. Figure 8: The scalability of BVC-PLL (weighted) 1 2 4 8 16 20 1 4 16 64 256 1024 4096 Labeling Time (s) Thread ParaPLL BVC-PLL-S BVC-PLL-N SP-S SP-N 20 40 60 80 100 Speedup [PITH_FULL_IMAGE:figures/full_fig_p012_8.png] view at source ↗
Figure 9
Figure 9. Figure 9: Parallel Weighted Performance: BVC-PLL vs. ParaPLL on GNUTELLA. 6, 58], and has been applied successfully in industry prac￾tice. A more detailed review on this topic can be found in a recent survey [10]. We note that the effectiveness of these approaches rely on the essential properties of road networks, such as the ones that are almost planar, have low vertex degree, are weighted, are spatial, or have a h… view at source ↗
read the original abstract

In this paper, we study how the Pruned Landmark Labeling (PPL) algorithm can be parallelized in a scalable fashion, producing the same results as the sequential algorithm. More specifically, we parallelize using a Vertex-Centric (VC) computational model on a modern SIMD powered multicore architecture. We design a new VC-PLL algorithm that resolves the apparent mismatch between the inherent sequential dependence of the PLL algorithm and the Vertex- Centric (VC) computing model. Furthermore, we introduce a novel batch execution model for VC computation and the BVC-PLL algorithm to reduce the computational inefficiency in VC-PLL. Quite surprisingly, the theoretical analysis reveals that under a reasonable assumption, BVC-PLL has lower computational and memory access costs than PLL and indicates it may run faster than PLL as a sequential algorithm. We also demonstrate how BVC-PLL algorithm can be extended to handle directed graphs and weighted graphs and how it can utilize the hierarchical parallelism on a modern parallel computing architecture. Extensive experiments on real-world graphs not only show the sequential BVC-PLL can run more than two times faster than the original PLL, but also demonstrates its parallel efficiency and scalability.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper presents VC-PLL, a vertex-centric reformulation of the sequential Pruned Landmark Labeling (PLL) algorithm for reachability queries that is claimed to produce identical results, and BVC-PLL, a batched variant. Under an unspecified 'reasonable assumption,' BVC-PLL is shown to have strictly lower computational and memory-access costs than PLL (potentially enabling faster sequential execution). The work extends the approach to directed and weighted graphs, exploits hierarchical parallelism, and reports >2× sequential speedup plus good parallel scalability on real-world graphs.

Significance. If the cost bounds hold under the stated assumption and the observed speedups are attributable to the algorithmic redesign rather than implementation details, the result would improve the practical construction of landmark labels for large-scale reachability queries on multicore hardware while preserving exactness.

major comments (2)
  1. [Abstract / theoretical analysis] Abstract and the theoretical cost analysis (likely §4 or §5): the central claim that 'under a reasonable assumption, BVC-PLL has lower computational and memory access costs than PLL' is load-bearing for both the sequential speedup assertion and the motivation for BVC-PLL. The assumption must be stated explicitly, shown to hold on the evaluated graphs (or on the graph families where the 2× speedup is reported), and the cost inequality must be derived without circularity. If the assumption encodes a property (e.g., label-growth rate or access pattern) that fails on the test instances, the theoretical advantage disappears and the experimental result requires a different explanation.
  2. [Experiments] Experimental section (likely §6): the reported >2× sequential speedup of BVC-PLL over PLL must be accompanied by controls that isolate the effect of the batch execution model from other implementation choices (SIMD tuning, memory layout, etc.). Without such controls, it is unclear whether the observed runtime improvement validates the cost analysis or arises from orthogonal engineering factors.
minor comments (2)
  1. [Algorithm description] Notation for the batch size parameter and the precise definition of 'batch execution model' should be introduced earlier and used consistently when comparing VC-PLL and BVC-PLL.
  2. [VC-PLL definition] The paper should clarify whether the vertex-centric formulation preserves the exact pruning behavior of the original PLL or only the final label sets; a short proof sketch or invariant would strengthen the 'identical results' claim.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback. We address the two major comments point-by-point below and commit to revisions that strengthen the presentation of the theoretical claims and experimental validation.

read point-by-point responses
  1. Referee: [Abstract / theoretical analysis] Abstract and the theoretical cost analysis (likely §4 or §5): the central claim that 'under a reasonable assumption, BVC-PLL has lower computational and memory access costs than PLL' is load-bearing for both the sequential speedup assertion and the motivation for BVC-PLL. The assumption must be stated explicitly, shown to hold on the evaluated graphs (or on the graph families where the 2× speedup is reported), and the cost inequality must be derived without circularity. If the assumption encodes a property (e.g., label-growth rate or access pattern) that fails on the test instances, the theoretical advantage disappears and the experimental result requires a different explanation.

    Authors: We agree that the assumption and its verification are central and currently underspecified. In the revised manuscript we will (i) state the assumption explicitly and formally in the cost-analysis section, (ii) provide a self-contained derivation of the strict inequality that avoids any circular reference to experimental outcomes, and (iii) add a short subsection (or appendix table) that checks the assumption on every graph used for the reported speedups. If the assumption is shown to hold, the theoretical advantage is directly linked to the observed results; if any graph violates it, we will note the discrepancy and discuss alternative explanations for the runtime improvement. revision: yes

  2. Referee: [Experiments] Experimental section (likely §6): the reported >2× sequential speedup of BVC-PLL over PLL must be accompanied by controls that isolate the effect of the batch execution model from other implementation choices (SIMD tuning, memory layout, etc.). Without such controls, it is unclear whether the observed runtime improvement validates the cost analysis or arises from orthogonal engineering factors.

    Authors: We accept that the current experiments do not fully isolate the batch-execution contribution. In the revision we will add an ablation-style comparison (or additional micro-benchmark) in which the only systematic difference between the two implementations is the batch versus non-batch vertex-centric schedule, while memory layout, SIMD intrinsics, and compiler flags remain identical. The results of this controlled comparison will be reported alongside the existing end-to-end timings. revision: yes

Circularity Check

0 steps flagged

No circularity; cost comparison rests on external assumption, not self-definition or fitted inputs

full rationale

The paper introduces BVC-PLL via vertex-centric redesign and batch execution, then invokes an external 'reasonable assumption' to claim strictly lower computational and memory-access costs versus PLL. No quoted equations reduce the claimed inequality to a fitted parameter, self-citation chain, or definitional renaming. The assumption is presented as an independent premise whose validity is left for verification outside the derivation; the algorithmic redesign itself contains independent content. This is the common honest case of a self-contained proposal whose central claim is falsifiable on real graphs rather than forced by construction.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

The paper rests on standard graph-theoretic definitions of pruned landmark labeling and the vertex-centric computation model; no free parameters, invented entities, or ad-hoc axioms are visible in the abstract.

axioms (1)
  • domain assumption The vertex-centric model can be adapted to PLL while exactly preserving the same labeling results as the sequential algorithm.
    Abstract states that VC-PLL produces the same results; this premise is required for the parallel version to be correct.

pith-pipeline@v0.9.0 · 5747 in / 1347 out tokens · 28951 ms · 2026-05-25T13:49:50.638454+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Reference graph

Works this paper leans on

97 extracted references · 97 canonical work pages · 2 internal anchors

  1. [1]

    the Scatter function, which describes how each vertex uses its vertex value and edge value to propagate a message to its neighbors; and 2) Gather function, which describes how each vertex computes a new value based on its original value and all the new messages it received. Each phase can traverse in parallel their corresponding vertex sets: ActiveVertice...

  2. [2]

    1) iterates following the vertex rank (order): at the ith iteration, the vertex u with rank π(u) = i will be distributed to all other vertices in the graph using a BFS process

    BASIC VERTEX­CENTRIC ALGORITHM Recall that the PLL algorithm (Alg. 1) iterates following the vertex rank (order): at the ith iteration, the vertex u with rank π(u) = i will be distributed to all other vertices in the graph using a BFS process. The key condition to add u into the label of v, L(v), is the distance check for the canonical labeling criterion:...

  3. [3]

    Scatter phase (Lines 3 to 7, also referred to as the push model): all active vertices with new labels perform 4 Algorithm 3 VC-PLL for G = ( V, E) with Order π {Init.: ( L(v): label; δL(v): new label from each iteration)} 1: ActiveV ertices← V ;∀ v∈ V, δL (v)←{ (v, 0)}, L(v)← δL(v) 2: while ActiveV ertices̸=∅ do {Scatter Phase:} 3: for all a∈ ActiveVertic...

  4. [4]

    # $"% &"' (

    Gather phase (Line 8-16): all vertices that receive a new message ( v.messages ̸= ∅) perform a vertex Gather function (Lines 9-15): For a vertex v, it traverses all its re- ceived messages (distance label from its neighbors), and fo r each unique vertex (u, d(u, v)) across the set of messages, it confirms the distance check for the canonical labeling crite...

  5. [5]

    Though VC-PLL can be described in a natural Vertex- Centric computational scheme, it also demonstrates certai n limitations of the original vertex-centric model assumpti ons:

    BATCHED VERTEX­CENTRIC ALG. Though VC-PLL can be described in a natural Vertex- Centric computational scheme, it also demonstrates certai n limitations of the original vertex-centric model assumpti ons:

  6. [6]

    Indeed, the additional computational costs of VC-PLL compared with PLL (Subsection 3.3) can be traced back to these limitations

    Typically, the vertex value (and message) is fixed in VC, whereas in VC-PLL, each vertex value (and message) is a continuously growing list (or set); 2) In the Gather functio n, the computation needs remote memory access for checking distance conditions (Line 11 in Algorithm 3): in most cases, u is not a neighbor of v, and when we use L(u) for distance che...

  7. [7]

    Since the number of negative checks is the same for PLL and BVC-PLL, we expect their overall cost will be fairly close to each other

    can help provide almost O(1) pruning. Since the number of negative checks is the same for PLL and BVC-PLL, we expect their overall cost will be fairly close to each other. Putting It T ogether: Assuming that PLL and BVC-PLL have a similar cost for negative distance checks, theoreti- cally, BVC-PLL may have smaller computational cost than that of PLL (due ...

  8. [8]

    VC-PLL and BVC-PLL can be easily extended to handle directed graphs by considering these two labels as separate computations

    V ARIANTS AND IMPLEMENTATION 8 5.1 Generalization Directed Graphs: For directed graph, each vertex v is assigned with two labels Lin(v) and Lout(v). VC-PLL and BVC-PLL can be easily extended to handle directed graphs by considering these two labels as separate computations. Specifically, in the Scatter function, the new labels δLin and δLout will be sent o...

  9. [9]

    deg.” denotes average degree. “dia

    EV ALUATION In this section, we perform a detailed evaluation of BVC- PLL, focusing on answering the following questions: 1) How does BVC-PLL algorithm perform against the original PLL in a sequential setting (single thread; no parallelism)? Sp ecif- ically, the theoretical analysis indicates it may run faste r, but we conduct experiments to confirm this. ...

  10. [10]

    There have been quite a list of efforts in designing parallel Dijkstra and BFS algorithms [56, 49, 47]

    RELATED WORK Online and Parallel Shortest Path Distance Com- putation: The standard (single source) shortest path com- putation method is Dijkstra’s algorithm [26] for weighted graphs and Breadth-First Search (BFS) traversal for un- weighted graphs. There have been quite a list of efforts in designing parallel Dijkstra and BFS algorithms [56, 49, 47]. Part...

  11. [11]

    We have achieved this by mapping the algorithm to a vertex-centric model

    CONCLUSION In this paper, we proposed VC-PLL, which, to the best of our knowledge, is the first scalable parallelization of Prun ed Landmark Labeling (PLL) that is able to produce the same result as the sequential method. We have achieved this by mapping the algorithm to a vertex-centric model. We also introduced a new batched execution mechanism for VC-PL...

  12. [12]

    Graph minors. ii. algorithmic aspects of tree-width. Journal of Algorithms , 7(3):309 – 322, 1986

  13. [13]

    Abraham, D

    I. Abraham, D. Delling, A. V. Goldberg, and R. F. Werneck. A hub-based labeling algorithm for shortest paths in road networks. In SEA’11, 2011

  14. [14]

    Abraham, D

    I. Abraham, D. Delling, A. V. Goldberg, and R. F. Werneck. Hierarchical hub labelings for shortest paths. In Proceedings of the 20th Annual European Conference on Algorithms , ESA’12, 2012

  15. [15]

    Abraham, A

    I. Abraham, A. Fiat, A. V. Goldberg, and R. F. Werneck. Highway dimension, shortest paths, and provably efficient algorithms. In SODA’10, 2010

  16. [16]

    A. B. Adcock, B. D. Sullivan, and M. W. Mahoney. Tree decompositions and social graphs. Internet Mathematics, 12(5):315–361, 2016

  17. [17]

    Akiba, Y

    T. Akiba, Y. Iwata, K.-i. Kawarabayashi, and Y. Kawata. Fast shortest-path distance queries on road networks by pruned highway labeling. In Proceedings of the Meeting on Algorithm Engineering & Expermiments, pages 147–154, 2014

  18. [18]

    Akiba, Y

    T. Akiba, Y. Iwata, and Y. Yoshida. Fast Exact Shortest-Path Distance Queries on Large Networks by Pruned Landmark Labeling. In Proceedings of the 2013 ACM SIGMOD International Conference on Management of Data , pages 349–360. ACM, 2013

  19. [19]

    Babenko, A

    M. Babenko, A. V. Goldberg, A. Gupta, and V. Nagarajan. Algorithms for hub label optimization. ACM Trans. Algorithms , 13(1), Nov. 2016

  20. [20]

    Babenko, A

    M. Babenko, A. V. Goldberg, H. Kaplan, R. Savchenko, and M. Weller. On the complexity of hub labeling (extended abstract). In Mathematical Foundations of Computer Science 2015 , pages 62–74. Springer Berlin Heidelberg, 2015

  21. [21]

    H. Bast, D. Delling, A. Goldberg, M. Müller-Hannemann, T. Pajor, P. Sanders, D. Wagner, and R. F. Werneck. Route Planning in Transportation Networks, pages 19–80. Springer International Publishing, Cham, 2016

  22. [22]

    H. Bast, S. Funke, P. Sanders, and D. Schultes. Fast Routing in Road Networks with Transit Nodes. Science, 316:566–, Apr. 2007

  23. [23]

    Bauer, D

    R. Bauer, D. Delling, P. Sanders, D. Schieferdecker, D. Schultes, and D. Wagner. Combining hierarchical and goal-directed speed-up techniques for dijkstra’s algorithm. J. Exp. Algorithmics , 15, March 2010

  24. [24]

    Besta, M

    M. Besta, M. Podstawski, L. Groner, E. Solomonik, and T. Hoefler. To Push or To Pull: On Reducing Communication and Synchronization in Graph Computations. In Proceedings of the 26th International Symposium on High-Performance Parallel and Distributed Computing , pages 93–104. ACM, 2017

  25. [25]

    U. Brandes. A faster algorithm for betweenness centrality. Journal of Mathematical Sociology , 25:163–177, 2001

  26. [26]

    S. Cao, W. Lu, and Q. Xu. Grarep: Learning Graph Representations with Global Structural Information. In Proceedings of the 24th ACM International on Conference on Information and Knowledge Management, pages 891–900. ACM, 2015

  27. [27]

    Chang, J

    L. Chang, J. X. Yu, L. Qin, H. Cheng, and M. Qiao. The exact distance to destination in undirected world. The VLDB Journal , 21(6):869–888, Dec 2012

  28. [28]

    Cheng, Z

    J. Cheng, Z. Shang, H. Cheng, H. Wang, and J. Yu. K-reach: Who is in your small world. PVLDB, 5(11):1292–1303, 2012

  29. [29]

    Cheng, J

    J. Cheng, J. X. Yu, X. Lin, H. Wang, and P. S. Yu. Fast computation of reachability labeling for large graphs. In EDBT, 2006

  30. [30]

    Cheng, J

    J. Cheng, J. X. Yu, X. Lin, H. Wang, and P. S. Yu. Fast computing reachability labelings for large graphs with high compression rate. In EDBT, 2008

  31. [31]

    Cheng, J

    J. Cheng, J. X. Yu, and P. S. Yu. Graph pattern matching: A join/semijoin approach. IEEE Trans. Knowl. Data Eng. , 23(7):1006–1021, 2011

  32. [32]

    Cohen, E

    E. Cohen, E. Halperin, H. Kaplan, and U. Zwick. Reachability and distance queries via 2-hop labels. In Proceedings of the 13th annual ACM-SIAM Symposium on Discrete algorithms , pages 937–946, 2002

  33. [33]

    Das Sarma, S

    A. Das Sarma, S. Gollapudi, and R. Najork, M.and Panigrahy. A sketch-based distance oracle for web-scale graphs. In WSDM ’10 , 2010

  34. [34]

    Dathathri, G

    R. Dathathri, G. Gill, L. Hoang, H.-V. Dang, A. Brooks, N. Dryden, M. Snir, and K. Pingali. Gluon: A Communication-Optimizing Substrate for Distributed Heterogeneous Graph Analytics. In Proceedings of the 39th ACM SIGPLAN Conference on Programming Language Design and Implementation, pages 752–768. ACM, 2018

  35. [35]

    Delling, A

    D. Delling, A. V. Goldberg, T. Pajor, and R. F. Werneck. Robust exact distance queries on massive networks. Microsoft Research, USA, Tech. Rep , 2, 2014

  36. [36]

    Dhulipala, G

    L. Dhulipala, G. E. Blelloch, and J. Shun. Theoretically efficient parallel graph algorithms can be fast and scalable. In Proceedings of the 30th on Symposium on Parallelism in Algorithms and Architectures, SPAA ’18, pages 393–404, 2018

  37. [37]

    E. W. Dijkstra. A note on two problems in connexion with graphs. Numerische Mathematik , 1(1):269–271, December 1959

  38. [38]

    Ferizovic

    D. Ferizovic. Parallel Pruned Landmark Labeling for Shortest Path Queries on Unit-Weight Networks . PhD thesis, National Research Center, 2015

  39. [39]

    Gavoille, D

    C. Gavoille, D. Peleg, S. Pérennes, and R. Raz. Distance labeling in graphs. J. Algorithms , 53(1):85–112, 2004

  40. [40]

    Geisberger, P

    R. Geisberger, P. Sanders, D. Schultes, and D. Delling. Contraction hierarchies: faster and simpler hierarchical routing in road networks. In Proceedings 13 of the 7th international conference on Experimental algorithms, pages 319–333, 2008

  41. [41]

    Geisberger, P

    R. Geisberger, P. Sanders, D. Schultes, and D. Delling. Contraction hierarchies: faster and simpler hierarchical routing in road networks. In Proceedings of the 7th international conference on Experimental algorithms, 2008

  42. [42]

    A. V. Goldberg and C. Harrelson. Computing the shortest path: A search meets graph theory. In SODA ’05, 2005

  43. [43]

    J. E. Gonzalez, Y. Low, H. Gu, D. Bickson, and C. Guestrin. PowerGraph: Distributed Graph-Parallel Computation on Natural Graphs. In OSDI, volume 12, page 2, 2012

  44. [44]

    Gubichev, S

    A. Gubichev, S. Bedathur, S. Seufert, and G. Weikum. Fast and accurate estimation of shortest paths in large graphs. In Proceedings of the 19th ACM international conference on Information and knowledge management, CIKM ’10, pages 499–508, 2010

  45. [45]

    R. J. Gutman. Reach-based routing: A new approach to shortest path algorithms optimized for road networks. In ALENEX/ANALC, pages 100–111, 2004

  46. [46]

    W. Han, D. Mawhirter, B. Wu, and M. Buland. Graphie: Large-Scale Asynchronous Graph Traversals on Just a GPU. In 2017 26th International Conference on Parallel Architectures and Compilation Techniques (PACT),, pages 233–245. IEEE, 2017

  47. [47]

    Jiang, A

    M. Jiang, A. W.-C. Fu, R. C.-W. Wong, and Y. Xu. Hop doubling label indexing for point-to-point distance querying on scale-free networks. Proc. VLDB Endow., 7(12), Aug. 2014

  48. [48]

    R. Jin, N. Ruan, Y. Xiang, and V. E. Lee. A highway-centric labeling approach for answering distance queries on large sparse graphs. In SIGMOD, 2012

  49. [49]

    R. Jin, N. Ruan, B. You, and H. Wang. Hub-accelerator: Fast and exact shortest path computation in large social networks. CoRR, abs/1305.0507, 2013

  50. [50]

    Jin and G

    R. Jin and G. Wang. Simple, fast, and scalable reachability oracle. Proceedings of the VLDB Endowment, 6(14):1978–1989, 2013

  51. [51]

    R. Jin, Y. Xiang, N. Ruan, and D. Fuhry. 3-hop: a high-compression indexing scheme for reachability query. In SIGMOD’09, 2009

  52. [52]

    N. Jing, Y. Huang, and E. A. Rundensteiner. Hierarchical encoded path views for path query processing: An optimal model and its performance evaluation. TKDE, 10(3):409–432, 1998

  53. [53]

    Jung and S

    S. Jung and S. Pramanik. An efficient path computation model for hierarchically structured topographical road maps. TKDE, 14(5):1029–1046, 2002

  54. [54]

    Khorasani, K

    F. Khorasani, K. Vora, R. Gupta, and L. N. Bhuyan. CuSha: Vertex-centric Graph Processing on GPUs. In Proceedings of the 23rd international symposium on High-performance parallel and distributed computing , pages 239–252. ACM, 2014

  55. [55]

    Ko and W.-S

    S. Ko and W.-S. Han. TurboGraph++: A Scalable and Fast Graph Analytics System. In Proceedings of the 2018 International Conference on Management of Data, pages 395–410. ACM, 2018

  56. [56]

    Kriegel, P

    H. Kriegel, P. Kröger, M. Renz, and T. Schmidt. Hierarchical graph embedding for efficient query processing in very large traffic networks. In SSDBM ’08, 2008

  57. [57]

    Kyrola, G

    A. Kyrola, G. E. Blelloch, and C. Guestrin. Graphchi: Large-Scale Graph Computation on Just a PC. In 10th USENIX Symposium on Operating Systems Design and Implementation (OSDI) . USENIX, 2012

  58. [58]

    C. E. Leiserson and T. B. Schardl. A work-efficient parallel breadth-first search algorithm (or how to cope with the nondeterminism of reducers). In Proceedings of the Twenty-second Annual ACM Symposium on Parallelism in Algorithms and Architectures , SPAA ’10, 2010

  59. [59]

    Y. Li, M. L. Yiu, N. M. Kou, et al. An experimental study on hub labeling based shortest path algorithms. Proceedings of the VLDB Endowment , 11(4):445–457, 2017

  60. [60]

    Liu and H

    H. Liu and H. H. Huang. Enterprise: breadth-first graph traversal on gpus. In SC ’15: Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis , 2015

  61. [61]

    H. Liu, H. H. Huang, and Y. Hu. iBFS: Concurrent Breadth-First Search on GPUs. In Proceedings of the 2016 International Conference on Management of Data, pages 403–416. ACM, 2016

  62. [62]

    Y. Low, J. E. Gonzalez, A. Kyrola, D. Bickson, C. E. Guestrin, and J. Hellerstein. Graphlab: A New Framework for Parallel Machine Learning. arXiv preprint arXiv:1408.2041, 2014

  63. [63]

    C. C. M. Potamias, F. Bonchi and A. Gionis. Fast shortest path distance estimation in large networks. In CIKM ’09 , 2009

  64. [64]

    Maass, C

    S. Maass, C. Min, S. Kashyap, W. Kang, M. Kumar, and T. Kim. Mosaic: Processing a Trillion-Edge Graph on a Single Machine. In Proceedings of the Twelfth European Conference on Computer Systems , pages 527–543. ACM, 2017

  65. [65]

    Malewicz, M

    G. Malewicz, M. H. Austern, A. J. Bik, J. C. Dehnert, I. Horn, N. Leiser, and G. Czajkowski. Pregel: a system for large-scale graph processing. In Proceedings of the 2010 international conference on Management of data , SIGMOD ’10, 2010

  66. [66]

    R. R. McCune, T. Weninger, and G. Madey. Thinking Like a Vertex: a Survey of Vertex-centric Frameworks for Large-scale Distributed Graph Processing. ACM Computing Surveys (CSUR) , 48(2):25, 2015

  67. [67]

    Meyer and P

    U. Meyer and P. Sanders. Delta-stepping: A parallel single source shortest path algorithm. In Proceedings of the 6th Annual European Symposium on Algorithms, ESA ’98, pages 393–404, 1998

  68. [68]

    Nguyen, A

    D. Nguyen, A. Lenharth, and K. Pingali. A Lightweight Infrastructure for Graph Analytics. In Proceedings of the Twenty-Fourth ACM Symposium on Operating Systems Principles , pages 456–471. ACM, 2013

  69. [69]

    Ouyang, L

    D. Ouyang, L. Qin, L. Chang, X. Lin, Y. Zhang, and Q. Zhu. When hierarchy meets 2-hop-labeling: Efficient shortest distance queries on road networks. In Proceedings of the 2018 International Conference on Management of Data , SIGMOD ’18, pages 709–724, 2018. 14

  70. [70]

    Pai and K

    S. Pai and K. Pingali. A Compiler for Throughput Optimization of Graph Algorithms on GPUs. In Proceedings of the 2016 ACM SIGPLAN International Conference on Object-Oriented Programming, Systems, Languages, and Applications , pages 1–19. ACM, 2016

  71. [71]

    S. B. K. A. D. Patterson. Direction-Optimizing Breadth-First Search. SC12, November, pages 10–16, 2012

  72. [72]

    M. Qiao, H. Cheng, L. Chang, and J. X. Yu. Approximate shortest distance computing: A query-dependent local landmark scheme. In ICDE, 2012

  73. [73]

    K. Qiu, Y. Zhu, J. Yuan, J. Zhao, X. Wang, and T. Wolf. ParaPLL: Fast Parallel Shortest-path Distance Query on Large-scale Weighted Graphs. In Proceedings of the 47th International Conference on Parallel Processing, page 2. ACM, 2018

  74. [74]

    Quamar, A

    A. Quamar, A. Deshpande, and J. Lin. NScale: Neighborhood-Centric Large-Scale Graph Analytics in the Cloud. The VLDB Journal–The International Journal on Very Large Data Bases , 25(2):125–150, 2016

  75. [75]

    A. Roy, I. Mihailovic, and W. Zwaenepoel. X-Stream: Edge-centric Graph Processing using Streaming Partitions. In Proceedings of the Twenty-Fourth ACM Symposium on Operating Systems Principles , pages 472–488. ACM, 2013

  76. [76]

    Salihoglu and J

    S. Salihoglu and J. Widom. Optimizing graph algorithms on pregel-like systems. Proc. VLDB Endow., 7(7), Mar. 2014

  77. [77]

    Samet, J

    H. Samet, J. Sankaranarayanan, and H. Alborzi. Scalable network distance browsing in spatial databases. In SIGMOD’08, 2008

  78. [78]

    Sanders and D

    P. Sanders and D. Schultes. Highway hierarchies hasten exact shortest path queries. In 17th Eur. Symp. Algorithms (ESA) , 2005

  79. [79]

    Sankaranarayanan, H

    J. Sankaranarayanan, H. Alborzi, and H. Samet. Distance join queries on spatial networks. In Proceedings of the 14th annual ACM international symposium on Advances in geographic information systems, GIS ’06, 2006

  80. [80]

    Sankaranarayanan, H

    J. Sankaranarayanan, H. Samet, and H. Alborzi. Path oracles for spatial networks. PVLDB, 2, August 2009

Showing first 80 references.