Pruned Landmark Labeling Meets Vertex Centric Computation: A Surprisingly Happy Marriage!

Bin Ren; Feodor Dragan; Gagan Agrawal; Ruoming Jin; Wendell Wu; Zhen Peng

arxiv: 1906.12018 · v1 · pith:OGNE57DLnew · submitted 2019-06-28 · 💻 cs.DB · cs.DC

Pruned Landmark Labeling Meets Vertex Centric Computation: A Surprisingly Happy Marriage!

Ruoming Jin , Zhen Peng , Wendell Wu , Feodor Dragan , Gagan Agrawal , Bin Ren This is my paper

Pith reviewed 2026-05-25 13:49 UTC · model grok-4.3

classification 💻 cs.DB cs.DC

keywords pruned landmark labelingvertex-centric computationgraph reachability queriesparallel graph algorithmsmulticore processingbatch execution modeldirected and weighted graphs

0 comments

The pith

BVC-PLL adapts pruned landmark labeling to vertex-centric computation and achieves lower computational and memory costs than the original sequential PLL under a reasonable assumption.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper establishes that the sequential pruned landmark labeling algorithm can be recast in a vertex-centric model despite its apparent sequential dependencies. It introduces VC-PLL to resolve the mismatch and then BVC-PLL, which adds a batch execution model to eliminate redundant work. Theoretical comparison shows that BVC-PLL incurs lower computation and memory-access costs than PLL when the assumption holds, and experiments confirm the sequential version runs more than twice as fast while also scaling in parallel. The same approach extends directly to directed and weighted graphs and to hierarchical parallelism on multicore hardware.

Core claim

By replacing the sequential traversal of PLL with a vertex-centric computation model and introducing batch execution, BVC-PLL produces exactly the same landmark labels yet exhibits strictly lower computational and memory-access costs than the original PLL under a reasonable assumption, and therefore can serve as a faster sequential algorithm while also admitting efficient parallel execution.

What carries the argument

The batch execution model inside the vertex-centric computation of pruned landmark labeling, which groups vertex updates to remove redundant distance and label checks.

If this is right

BVC-PLL extends without modification to directed graphs and to weighted graphs while preserving the same correctness and cost advantages.
The batch vertex-centric formulation admits direct use of hierarchical parallelism on modern multicore architectures with SIMD units.
The sequential BVC-PLL implementation runs more than twice as fast as the original PLL on real-world graphs.
The parallel version of BVC-PLL demonstrates both high efficiency and good scalability as the number of cores increases.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same batching technique could reduce redundant work in other label-propagation or distance-oracle algorithms that currently rely on sequential traversals.
Lower memory-access costs may translate into better cache behavior on graphs too large to fit in main memory, allowing single-machine processing of larger instances.
If the reasonable assumption turns out to hold for the majority of real-world graphs, practitioners could replace existing PLL implementations with BVC-PLL for both sequential and parallel use without changing query results.

Load-bearing premise

An unspecified reasonable assumption must hold for the claimed reduction in computational and memory-access costs of BVC-PLL relative to PLL.

What would settle it

Any concrete graph on which the total number of distance comparisons or random memory accesses performed by BVC-PLL exceeds the corresponding count for the original sequential PLL.

Figures

Figures reproduced from arXiv: 1906.12018 by Bin Ren, Feodor Dragan, Gagan Agrawal, Ruoming Jin, Wendell Wu, Zhen Peng.

**Figure 1.** Figure 1: 2-Hop Labeling and PLL Example Vertex Labels A {(A, 0), (D, 1), (I, 2)} B {(B, 0), (D, 1), (E, 1), (F, 2), (I, 2)} C {(C, 0), (D, 1), (E, 1), (F, 1), (I, 2), (D, 2)} D {(D, 0), (I, 1)} E {(E, 0), (D, 1), (I, 1)} F {(F, 0), (I, 1)} G {(G, 0), (F, 1), (L, 1)(I, 2)} H {(H, 0), (A, 1), (I, 1)} I {(I, 0)} J {(J, 0), (I, 1)} K {(K, 0), (J, 1), (F, 2), (I, 2)} L {(L, 0), (F, 1), (I, 2)} [PITH_FULL_IMAGE:figures/… view at source ↗

**Figure 2.** Figure 2: A VC-PLL Running Example to v is one step longer than the shortest path between u and v, and u will be pruned. Please refer to Appendix for proof. In addition, at each of these two iterations, if u has not been or is not in the label of v, then different neighbors of v may send the same (u, d(u, v)) messages to v. Additional Cost of Distance Check: Lemma 2. The set consisting of all pairs (u, v) for distan… view at source ↗

**Figure 4.** Figure 4: illustrates the key idea in the proof of Theorem 3. Assuming 9 vertices a, b, · · · , i in one batch being added into L(v) in PLL labeling, its total distance check cost is 36 no matter which order they are received in (visualized as the area under the diagonal stairs). Now assuming they arrive in three groups as shown in [PITH_FULL_IMAGE:figures/full_fig_p008_4.png] view at source ↗

**Figure 5.** Figure 5: The scalability of BVC-PLL (unweighted) and HOLLYWOOD) with an average speedup 1.58X over PLL. This observation is consistent with our theoretical analysis in Subsection 4.3. In the next subsection, we will perform a more detailed cost breakdown and comparison [PITH_FULL_IMAGE:figures/full_fig_p010_5.png] view at source ↗

**Figure 6.** Figure 6: Performance Analysis Graph PLL BVC-PLL Ratio pos. neg. pos. neg. pos. neg. GNUTELLA 22 23 19 24 1.18 0.96 DBLP 46 36 41 38 1.14 0.96 WIKITALK 32 19 16 19 2.00 1.00 YOUTUBE 108 138 80 138 1.35 1.00 TREC WT10G 314 103 300 104 1.05 0.99 SKITTER 260 513 208 508 1.25 1.01 CATS-DOGS 44 244 37 244 1.19 1.00 FLICKR 893 2,003 830 2,028 1.08 0.99 HOLLYWOOD 6,667 38,343 6,455 38,487 1.03 1.00 INDOCHINA 6,265 1,886 5,… view at source ↗

**Figure 8.** Figure 8: The scalability of BVC-PLL (weighted) 1 2 4 8 16 20 1 4 16 64 256 1024 4096 Labeling Time (s) Thread ParaPLL BVC-PLL-S BVC-PLL-N SP-S SP-N 20 40 60 80 100 Speedup [PITH_FULL_IMAGE:figures/full_fig_p012_8.png] view at source ↗

**Figure 9.** Figure 9: Parallel Weighted Performance: BVC-PLL vs. ParaPLL on GNUTELLA. 6, 58], and has been applied successfully in industry practice. A more detailed review on this topic can be found in a recent survey [10]. We note that the effectiveness of these approaches rely on the essential properties of road networks, such as the ones that are almost planar, have low vertex degree, are weighted, are spatial, or have a h… view at source ↗

read the original abstract

In this paper, we study how the Pruned Landmark Labeling (PPL) algorithm can be parallelized in a scalable fashion, producing the same results as the sequential algorithm. More specifically, we parallelize using a Vertex-Centric (VC) computational model on a modern SIMD powered multicore architecture. We design a new VC-PLL algorithm that resolves the apparent mismatch between the inherent sequential dependence of the PLL algorithm and the Vertex- Centric (VC) computing model. Furthermore, we introduce a novel batch execution model for VC computation and the BVC-PLL algorithm to reduce the computational inefficiency in VC-PLL. Quite surprisingly, the theoretical analysis reveals that under a reasonable assumption, BVC-PLL has lower computational and memory access costs than PLL and indicates it may run faster than PLL as a sequential algorithm. We also demonstrate how BVC-PLL algorithm can be extended to handle directed graphs and weighted graphs and how it can utilize the hierarchical parallelism on a modern parallel computing architecture. Extensive experiments on real-world graphs not only show the sequential BVC-PLL can run more than two times faster than the original PLL, but also demonstrates its parallel efficiency and scalability.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper presents VC-PLL, a vertex-centric reformulation of the sequential Pruned Landmark Labeling (PLL) algorithm for reachability queries that is claimed to produce identical results, and BVC-PLL, a batched variant. Under an unspecified 'reasonable assumption,' BVC-PLL is shown to have strictly lower computational and memory-access costs than PLL (potentially enabling faster sequential execution). The work extends the approach to directed and weighted graphs, exploits hierarchical parallelism, and reports >2× sequential speedup plus good parallel scalability on real-world graphs.

Significance. If the cost bounds hold under the stated assumption and the observed speedups are attributable to the algorithmic redesign rather than implementation details, the result would improve the practical construction of landmark labels for large-scale reachability queries on multicore hardware while preserving exactness.

major comments (2)

[Abstract / theoretical analysis] Abstract and the theoretical cost analysis (likely §4 or §5): the central claim that 'under a reasonable assumption, BVC-PLL has lower computational and memory access costs than PLL' is load-bearing for both the sequential speedup assertion and the motivation for BVC-PLL. The assumption must be stated explicitly, shown to hold on the evaluated graphs (or on the graph families where the 2× speedup is reported), and the cost inequality must be derived without circularity. If the assumption encodes a property (e.g., label-growth rate or access pattern) that fails on the test instances, the theoretical advantage disappears and the experimental result requires a different explanation.
[Experiments] Experimental section (likely §6): the reported >2× sequential speedup of BVC-PLL over PLL must be accompanied by controls that isolate the effect of the batch execution model from other implementation choices (SIMD tuning, memory layout, etc.). Without such controls, it is unclear whether the observed runtime improvement validates the cost analysis or arises from orthogonal engineering factors.

minor comments (2)

[Algorithm description] Notation for the batch size parameter and the precise definition of 'batch execution model' should be introduced earlier and used consistently when comparing VC-PLL and BVC-PLL.
[VC-PLL definition] The paper should clarify whether the vertex-centric formulation preserves the exact pruning behavior of the original PLL or only the final label sets; a short proof sketch or invariant would strengthen the 'identical results' claim.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback. We address the two major comments point-by-point below and commit to revisions that strengthen the presentation of the theoretical claims and experimental validation.

read point-by-point responses

Referee: [Abstract / theoretical analysis] Abstract and the theoretical cost analysis (likely §4 or §5): the central claim that 'under a reasonable assumption, BVC-PLL has lower computational and memory access costs than PLL' is load-bearing for both the sequential speedup assertion and the motivation for BVC-PLL. The assumption must be stated explicitly, shown to hold on the evaluated graphs (or on the graph families where the 2× speedup is reported), and the cost inequality must be derived without circularity. If the assumption encodes a property (e.g., label-growth rate or access pattern) that fails on the test instances, the theoretical advantage disappears and the experimental result requires a different explanation.

Authors: We agree that the assumption and its verification are central and currently underspecified. In the revised manuscript we will (i) state the assumption explicitly and formally in the cost-analysis section, (ii) provide a self-contained derivation of the strict inequality that avoids any circular reference to experimental outcomes, and (iii) add a short subsection (or appendix table) that checks the assumption on every graph used for the reported speedups. If the assumption is shown to hold, the theoretical advantage is directly linked to the observed results; if any graph violates it, we will note the discrepancy and discuss alternative explanations for the runtime improvement. revision: yes
Referee: [Experiments] Experimental section (likely §6): the reported >2× sequential speedup of BVC-PLL over PLL must be accompanied by controls that isolate the effect of the batch execution model from other implementation choices (SIMD tuning, memory layout, etc.). Without such controls, it is unclear whether the observed runtime improvement validates the cost analysis or arises from orthogonal engineering factors.

Authors: We accept that the current experiments do not fully isolate the batch-execution contribution. In the revision we will add an ablation-style comparison (or additional micro-benchmark) in which the only systematic difference between the two implementations is the batch versus non-batch vertex-centric schedule, while memory layout, SIMD intrinsics, and compiler flags remain identical. The results of this controlled comparison will be reported alongside the existing end-to-end timings. revision: yes

Circularity Check

0 steps flagged

No circularity; cost comparison rests on external assumption, not self-definition or fitted inputs

full rationale

The paper introduces BVC-PLL via vertex-centric redesign and batch execution, then invokes an external 'reasonable assumption' to claim strictly lower computational and memory-access costs versus PLL. No quoted equations reduce the claimed inequality to a fitted parameter, self-citation chain, or definitional renaming. The assumption is presented as an independent premise whose validity is left for verification outside the derivation; the algorithmic redesign itself contains independent content. This is the common honest case of a self-contained proposal whose central claim is falsifiable on real graphs rather than forced by construction.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

The paper rests on standard graph-theoretic definitions of pruned landmark labeling and the vertex-centric computation model; no free parameters, invented entities, or ad-hoc axioms are visible in the abstract.

axioms (1)

domain assumption The vertex-centric model can be adapted to PLL while exactly preserving the same labeling results as the sequential algorithm.
Abstract states that VC-PLL produces the same results; this premise is required for the parallel version to be correct.

pith-pipeline@v0.9.0 · 5747 in / 1347 out tokens · 28951 ms · 2026-05-25T13:49:50.638454+00:00 · methodology

discussion (0)

Reference graph

Works this paper leans on

97 extracted references · 97 canonical work pages · 2 internal anchors

[1]

the Scatter function, which describes how each vertex uses its vertex value and edge value to propagate a message to its neighbors; and 2) Gather function, which describes how each vertex computes a new value based on its original value and all the new messages it received. Each phase can traverse in parallel their corresponding vertex sets: ActiveVertice...

work page
[2]

1) iterates following the vertex rank (order): at the ith iteration, the vertex u with rank π(u) = i will be distributed to all other vertices in the graph using a BFS process

BASIC VERTEXCENTRIC ALGORITHM Recall that the PLL algorithm (Alg. 1) iterates following the vertex rank (order): at the ith iteration, the vertex u with rank π(u) = i will be distributed to all other vertices in the graph using a BFS process. The key condition to add u into the label of v, L(v), is the distance check for the canonical labeling criterion:...

work page
[3]

Scatter phase (Lines 3 to 7, also referred to as the push model): all active vertices with new labels perform 4 Algorithm 3 VC-PLL for G = ( V, E) with Order π {Init.: ( L(v): label; δL(v): new label from each iteration)} 1: ActiveV ertices← V ;∀ v∈ V, δL (v)←{ (v, 0)}, L(v)← δL(v) 2: while ActiveV ertices̸=∅ do {Scatter Phase:} 3: for all a∈ ActiveVertic...

work page
[4]

# $"% &"' (

Gather phase (Line 8-16): all vertices that receive a new message ( v.messages ̸= ∅) perform a vertex Gather function (Lines 9-15): For a vertex v, it traverses all its re- ceived messages (distance label from its neighbors), and fo r each unique vertex (u, d(u, v)) across the set of messages, it conﬁrms the distance check for the canonical labeling crite...

work page
[5]

Though VC-PLL can be described in a natural Vertex- Centric computational scheme, it also demonstrates certai n limitations of the original vertex-centric model assumpti ons:

BATCHED VERTEXCENTRIC ALG. Though VC-PLL can be described in a natural Vertex- Centric computational scheme, it also demonstrates certai n limitations of the original vertex-centric model assumpti ons:

work page
[6]

Indeed, the additional computational costs of VC-PLL compared with PLL (Subsection 3.3) can be traced back to these limitations

Typically, the vertex value (and message) is ﬁxed in VC, whereas in VC-PLL, each vertex value (and message) is a continuously growing list (or set); 2) In the Gather functio n, the computation needs remote memory access for checking distance conditions (Line 11 in Algorithm 3): in most cases, u is not a neighbor of v, and when we use L(u) for distance che...

work page
[7]

Since the number of negative checks is the same for PLL and BVC-PLL, we expect their overall cost will be fairly close to each other

can help provide almost O(1) pruning. Since the number of negative checks is the same for PLL and BVC-PLL, we expect their overall cost will be fairly close to each other. Putting It T ogether: Assuming that PLL and BVC-PLL have a similar cost for negative distance checks, theoreti- cally, BVC-PLL may have smaller computational cost than that of PLL (due ...

work page
[8]

VC-PLL and BVC-PLL can be easily extended to handle directed graphs by considering these two labels as separate computations

V ARIANTS AND IMPLEMENTATION 8 5.1 Generalization Directed Graphs: For directed graph, each vertex v is assigned with two labels Lin(v) and Lout(v). VC-PLL and BVC-PLL can be easily extended to handle directed graphs by considering these two labels as separate computations. Speciﬁcally, in the Scatter function, the new labels δLin and δLout will be sent o...

work page
[9]

deg.” denotes average degree. “dia

EV ALUATION In this section, we perform a detailed evaluation of BVC- PLL, focusing on answering the following questions: 1) How does BVC-PLL algorithm perform against the original PLL in a sequential setting (single thread; no parallelism)? Sp ecif- ically, the theoretical analysis indicates it may run faste r, but we conduct experiments to conﬁrm this. ...

work page
[10]

There have been quite a list of eﬀorts in designing parallel Dijkstra and BFS algorithms [56, 49, 47]

RELATED WORK Online and Parallel Shortest Path Distance Com- putation: The standard (single source) shortest path com- putation method is Dijkstra’s algorithm [26] for weighted graphs and Breadth-First Search (BFS) traversal for un- weighted graphs. There have been quite a list of eﬀorts in designing parallel Dijkstra and BFS algorithms [56, 49, 47]. Part...

work page
[11]

We have achieved this by mapping the algorithm to a vertex-centric model

CONCLUSION In this paper, we proposed VC-PLL, which, to the best of our knowledge, is the ﬁrst scalable parallelization of Prun ed Landmark Labeling (PLL) that is able to produce the same result as the sequential method. We have achieved this by mapping the algorithm to a vertex-centric model. We also introduced a new batched execution mechanism for VC-PL...

work page
[12]

Graph minors. ii. algorithmic aspects of tree-width. Journal of Algorithms , 7(3):309 – 322, 1986

work page 1986
[13]

Abraham, D

I. Abraham, D. Delling, A. V. Goldberg, and R. F. Werneck. A hub-based labeling algorithm for shortest paths in road networks. In SEA’11, 2011

work page 2011
[14]

Abraham, D

I. Abraham, D. Delling, A. V. Goldberg, and R. F. Werneck. Hierarchical hub labelings for shortest paths. In Proceedings of the 20th Annual European Conference on Algorithms , ESA’12, 2012

work page 2012
[15]

Abraham, A

I. Abraham, A. Fiat, A. V. Goldberg, and R. F. Werneck. Highway dimension, shortest paths, and provably eﬃcient algorithms. In SODA’10, 2010

work page 2010
[16]

A. B. Adcock, B. D. Sullivan, and M. W. Mahoney. Tree decompositions and social graphs. Internet Mathematics, 12(5):315–361, 2016

work page 2016
[17]

Akiba, Y

T. Akiba, Y. Iwata, K.-i. Kawarabayashi, and Y. Kawata. Fast shortest-path distance queries on road networks by pruned highway labeling. In Proceedings of the Meeting on Algorithm Engineering & Expermiments, pages 147–154, 2014

work page 2014
[18]

Akiba, Y

T. Akiba, Y. Iwata, and Y. Yoshida. Fast Exact Shortest-Path Distance Queries on Large Networks by Pruned Landmark Labeling. In Proceedings of the 2013 ACM SIGMOD International Conference on Management of Data , pages 349–360. ACM, 2013

work page 2013
[19]

Babenko, A

M. Babenko, A. V. Goldberg, A. Gupta, and V. Nagarajan. Algorithms for hub label optimization. ACM Trans. Algorithms , 13(1), Nov. 2016

work page 2016
[20]

Babenko, A

M. Babenko, A. V. Goldberg, H. Kaplan, R. Savchenko, and M. Weller. On the complexity of hub labeling (extended abstract). In Mathematical Foundations of Computer Science 2015 , pages 62–74. Springer Berlin Heidelberg, 2015

work page 2015
[21]

H. Bast, D. Delling, A. Goldberg, M. Müller-Hannemann, T. Pajor, P. Sanders, D. Wagner, and R. F. Werneck. Route Planning in Transportation Networks, pages 19–80. Springer International Publishing, Cham, 2016

work page 2016
[22]

H. Bast, S. Funke, P. Sanders, and D. Schultes. Fast Routing in Road Networks with Transit Nodes. Science, 316:566–, Apr. 2007

work page 2007
[23]

Bauer, D

R. Bauer, D. Delling, P. Sanders, D. Schieferdecker, D. Schultes, and D. Wagner. Combining hierarchical and goal-directed speed-up techniques for dijkstra’s algorithm. J. Exp. Algorithmics , 15, March 2010

work page 2010
[24]

Besta, M

M. Besta, M. Podstawski, L. Groner, E. Solomonik, and T. Hoeﬂer. To Push or To Pull: On Reducing Communication and Synchronization in Graph Computations. In Proceedings of the 26th International Symposium on High-Performance Parallel and Distributed Computing , pages 93–104. ACM, 2017

work page 2017
[25]

U. Brandes. A faster algorithm for betweenness centrality. Journal of Mathematical Sociology , 25:163–177, 2001

work page 2001
[26]

S. Cao, W. Lu, and Q. Xu. Grarep: Learning Graph Representations with Global Structural Information. In Proceedings of the 24th ACM International on Conference on Information and Knowledge Management, pages 891–900. ACM, 2015

work page 2015
[27]

Chang, J

L. Chang, J. X. Yu, L. Qin, H. Cheng, and M. Qiao. The exact distance to destination in undirected world. The VLDB Journal , 21(6):869–888, Dec 2012

work page 2012
[28]

Cheng, Z

J. Cheng, Z. Shang, H. Cheng, H. Wang, and J. Yu. K-reach: Who is in your small world. PVLDB, 5(11):1292–1303, 2012

work page 2012
[29]

Cheng, J

J. Cheng, J. X. Yu, X. Lin, H. Wang, and P. S. Yu. Fast computation of reachability labeling for large graphs. In EDBT, 2006

work page 2006
[30]

Cheng, J

J. Cheng, J. X. Yu, X. Lin, H. Wang, and P. S. Yu. Fast computing reachability labelings for large graphs with high compression rate. In EDBT, 2008

work page 2008
[31]

Cheng, J

J. Cheng, J. X. Yu, and P. S. Yu. Graph pattern matching: A join/semijoin approach. IEEE Trans. Knowl. Data Eng. , 23(7):1006–1021, 2011

work page 2011
[32]

Cohen, E

E. Cohen, E. Halperin, H. Kaplan, and U. Zwick. Reachability and distance queries via 2-hop labels. In Proceedings of the 13th annual ACM-SIAM Symposium on Discrete algorithms , pages 937–946, 2002

work page 2002
[33]

Das Sarma, S

A. Das Sarma, S. Gollapudi, and R. Najork, M.and Panigrahy. A sketch-based distance oracle for web-scale graphs. In WSDM ’10 , 2010

work page 2010
[34]

Dathathri, G

R. Dathathri, G. Gill, L. Hoang, H.-V. Dang, A. Brooks, N. Dryden, M. Snir, and K. Pingali. Gluon: A Communication-Optimizing Substrate for Distributed Heterogeneous Graph Analytics. In Proceedings of the 39th ACM SIGPLAN Conference on Programming Language Design and Implementation, pages 752–768. ACM, 2018

work page 2018
[35]

Delling, A

D. Delling, A. V. Goldberg, T. Pajor, and R. F. Werneck. Robust exact distance queries on massive networks. Microsoft Research, USA, Tech. Rep , 2, 2014

work page 2014
[36]

Dhulipala, G

L. Dhulipala, G. E. Blelloch, and J. Shun. Theoretically eﬃcient parallel graph algorithms can be fast and scalable. In Proceedings of the 30th on Symposium on Parallelism in Algorithms and Architectures, SPAA ’18, pages 393–404, 2018

work page 2018
[37]

E. W. Dijkstra. A note on two problems in connexion with graphs. Numerische Mathematik , 1(1):269–271, December 1959

work page 1959
[38]

Ferizovic

D. Ferizovic. Parallel Pruned Landmark Labeling for Shortest Path Queries on Unit-Weight Networks . PhD thesis, National Research Center, 2015

work page 2015
[39]

Gavoille, D

C. Gavoille, D. Peleg, S. Pérennes, and R. Raz. Distance labeling in graphs. J. Algorithms , 53(1):85–112, 2004

work page 2004
[40]

Geisberger, P

R. Geisberger, P. Sanders, D. Schultes, and D. Delling. Contraction hierarchies: faster and simpler hierarchical routing in road networks. In Proceedings 13 of the 7th international conference on Experimental algorithms, pages 319–333, 2008

work page 2008
[41]

Geisberger, P

R. Geisberger, P. Sanders, D. Schultes, and D. Delling. Contraction hierarchies: faster and simpler hierarchical routing in road networks. In Proceedings of the 7th international conference on Experimental algorithms, 2008

work page 2008
[42]

A. V. Goldberg and C. Harrelson. Computing the shortest path: A search meets graph theory. In SODA ’05, 2005

work page 2005
[43]

J. E. Gonzalez, Y. Low, H. Gu, D. Bickson, and C. Guestrin. PowerGraph: Distributed Graph-Parallel Computation on Natural Graphs. In OSDI, volume 12, page 2, 2012

work page 2012
[44]

Gubichev, S

A. Gubichev, S. Bedathur, S. Seufert, and G. Weikum. Fast and accurate estimation of shortest paths in large graphs. In Proceedings of the 19th ACM international conference on Information and knowledge management, CIKM ’10, pages 499–508, 2010

work page 2010
[45]

R. J. Gutman. Reach-based routing: A new approach to shortest path algorithms optimized for road networks. In ALENEX/ANALC, pages 100–111, 2004

work page 2004
[46]

W. Han, D. Mawhirter, B. Wu, and M. Buland. Graphie: Large-Scale Asynchronous Graph Traversals on Just a GPU. In 2017 26th International Conference on Parallel Architectures and Compilation Techniques (PACT),, pages 233–245. IEEE, 2017

work page 2017
[47]

Jiang, A

M. Jiang, A. W.-C. Fu, R. C.-W. Wong, and Y. Xu. Hop doubling label indexing for point-to-point distance querying on scale-free networks. Proc. VLDB Endow., 7(12), Aug. 2014

work page 2014
[48]

R. Jin, N. Ruan, Y. Xiang, and V. E. Lee. A highway-centric labeling approach for answering distance queries on large sparse graphs. In SIGMOD, 2012

work page 2012
[49]

R. Jin, N. Ruan, B. You, and H. Wang. Hub-accelerator: Fast and exact shortest path computation in large social networks. CoRR, abs/1305.0507, 2013

work page internal anchor Pith review Pith/arXiv arXiv 2013
[50]

Jin and G

R. Jin and G. Wang. Simple, fast, and scalable reachability oracle. Proceedings of the VLDB Endowment, 6(14):1978–1989, 2013

work page 1978
[51]

R. Jin, Y. Xiang, N. Ruan, and D. Fuhry. 3-hop: a high-compression indexing scheme for reachability query. In SIGMOD’09, 2009

work page 2009
[52]

N. Jing, Y. Huang, and E. A. Rundensteiner. Hierarchical encoded path views for path query processing: An optimal model and its performance evaluation. TKDE, 10(3):409–432, 1998

work page 1998
[53]

Jung and S

S. Jung and S. Pramanik. An eﬃcient path computation model for hierarchically structured topographical road maps. TKDE, 14(5):1029–1046, 2002

work page 2002
[54]

Khorasani, K

F. Khorasani, K. Vora, R. Gupta, and L. N. Bhuyan. CuSha: Vertex-centric Graph Processing on GPUs. In Proceedings of the 23rd international symposium on High-performance parallel and distributed computing , pages 239–252. ACM, 2014

work page 2014
[55]

Ko and W.-S

S. Ko and W.-S. Han. TurboGraph++: A Scalable and Fast Graph Analytics System. In Proceedings of the 2018 International Conference on Management of Data, pages 395–410. ACM, 2018

work page 2018
[56]

Kriegel, P

H. Kriegel, P. Kröger, M. Renz, and T. Schmidt. Hierarchical graph embedding for eﬃcient query processing in very large traﬃc networks. In SSDBM ’08, 2008

work page 2008
[57]

Kyrola, G

A. Kyrola, G. E. Blelloch, and C. Guestrin. Graphchi: Large-Scale Graph Computation on Just a PC. In 10th USENIX Symposium on Operating Systems Design and Implementation (OSDI) . USENIX, 2012

work page 2012
[58]

C. E. Leiserson and T. B. Schardl. A work-eﬃcient parallel breadth-ﬁrst search algorithm (or how to cope with the nondeterminism of reducers). In Proceedings of the Twenty-second Annual ACM Symposium on Parallelism in Algorithms and Architectures , SPAA ’10, 2010

work page 2010
[59]

Y. Li, M. L. Yiu, N. M. Kou, et al. An experimental study on hub labeling based shortest path algorithms. Proceedings of the VLDB Endowment , 11(4):445–457, 2017

work page 2017
[60]

Liu and H

H. Liu and H. H. Huang. Enterprise: breadth-ﬁrst graph traversal on gpus. In SC ’15: Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis , 2015

work page 2015
[61]

H. Liu, H. H. Huang, and Y. Hu. iBFS: Concurrent Breadth-First Search on GPUs. In Proceedings of the 2016 International Conference on Management of Data, pages 403–416. ACM, 2016

work page 2016
[62]

Y. Low, J. E. Gonzalez, A. Kyrola, D. Bickson, C. E. Guestrin, and J. Hellerstein. Graphlab: A New Framework for Parallel Machine Learning. arXiv preprint arXiv:1408.2041, 2014

work page internal anchor Pith review Pith/arXiv arXiv 2041
[63]

C. C. M. Potamias, F. Bonchi and A. Gionis. Fast shortest path distance estimation in large networks. In CIKM ’09 , 2009

work page 2009
[64]

Maass, C

S. Maass, C. Min, S. Kashyap, W. Kang, M. Kumar, and T. Kim. Mosaic: Processing a Trillion-Edge Graph on a Single Machine. In Proceedings of the Twelfth European Conference on Computer Systems , pages 527–543. ACM, 2017

work page 2017
[65]

Malewicz, M

G. Malewicz, M. H. Austern, A. J. Bik, J. C. Dehnert, I. Horn, N. Leiser, and G. Czajkowski. Pregel: a system for large-scale graph processing. In Proceedings of the 2010 international conference on Management of data , SIGMOD ’10, 2010

work page 2010
[66]

R. R. McCune, T. Weninger, and G. Madey. Thinking Like a Vertex: a Survey of Vertex-centric Frameworks for Large-scale Distributed Graph Processing. ACM Computing Surveys (CSUR) , 48(2):25, 2015

work page 2015
[67]

Meyer and P

U. Meyer and P. Sanders. Delta-stepping: A parallel single source shortest path algorithm. In Proceedings of the 6th Annual European Symposium on Algorithms, ESA ’98, pages 393–404, 1998

work page 1998
[68]

Nguyen, A

D. Nguyen, A. Lenharth, and K. Pingali. A Lightweight Infrastructure for Graph Analytics. In Proceedings of the Twenty-Fourth ACM Symposium on Operating Systems Principles , pages 456–471. ACM, 2013

work page 2013
[69]

Ouyang, L

D. Ouyang, L. Qin, L. Chang, X. Lin, Y. Zhang, and Q. Zhu. When hierarchy meets 2-hop-labeling: Eﬃcient shortest distance queries on road networks. In Proceedings of the 2018 International Conference on Management of Data , SIGMOD ’18, pages 709–724, 2018. 14

work page 2018
[70]

Pai and K

S. Pai and K. Pingali. A Compiler for Throughput Optimization of Graph Algorithms on GPUs. In Proceedings of the 2016 ACM SIGPLAN International Conference on Object-Oriented Programming, Systems, Languages, and Applications , pages 1–19. ACM, 2016

work page 2016
[71]

S. B. K. A. D. Patterson. Direction-Optimizing Breadth-First Search. SC12, November, pages 10–16, 2012

work page 2012
[72]

M. Qiao, H. Cheng, L. Chang, and J. X. Yu. Approximate shortest distance computing: A query-dependent local landmark scheme. In ICDE, 2012

work page 2012
[73]

K. Qiu, Y. Zhu, J. Yuan, J. Zhao, X. Wang, and T. Wolf. ParaPLL: Fast Parallel Shortest-path Distance Query on Large-scale Weighted Graphs. In Proceedings of the 47th International Conference on Parallel Processing, page 2. ACM, 2018

work page 2018
[74]

Quamar, A

A. Quamar, A. Deshpande, and J. Lin. NScale: Neighborhood-Centric Large-Scale Graph Analytics in the Cloud. The VLDB Journal–The International Journal on Very Large Data Bases , 25(2):125–150, 2016

work page 2016
[75]

A. Roy, I. Mihailovic, and W. Zwaenepoel. X-Stream: Edge-centric Graph Processing using Streaming Partitions. In Proceedings of the Twenty-Fourth ACM Symposium on Operating Systems Principles , pages 472–488. ACM, 2013

work page 2013
[76]

Salihoglu and J

S. Salihoglu and J. Widom. Optimizing graph algorithms on pregel-like systems. Proc. VLDB Endow., 7(7), Mar. 2014

work page 2014
[77]

Samet, J

H. Samet, J. Sankaranarayanan, and H. Alborzi. Scalable network distance browsing in spatial databases. In SIGMOD’08, 2008

work page 2008
[78]

Sanders and D

P. Sanders and D. Schultes. Highway hierarchies hasten exact shortest path queries. In 17th Eur. Symp. Algorithms (ESA) , 2005

work page 2005
[79]

Sankaranarayanan, H

J. Sankaranarayanan, H. Alborzi, and H. Samet. Distance join queries on spatial networks. In Proceedings of the 14th annual ACM international symposium on Advances in geographic information systems, GIS ’06, 2006

work page 2006
[80]

Sankaranarayanan, H

J. Sankaranarayanan, H. Samet, and H. Alborzi. Path oracles for spatial networks. PVLDB, 2, August 2009

work page 2009

Showing first 80 references.

[1] [1]

the Scatter function, which describes how each vertex uses its vertex value and edge value to propagate a message to its neighbors; and 2) Gather function, which describes how each vertex computes a new value based on its original value and all the new messages it received. Each phase can traverse in parallel their corresponding vertex sets: ActiveVertice...

work page

[2] [2]

1) iterates following the vertex rank (order): at the ith iteration, the vertex u with rank π(u) = i will be distributed to all other vertices in the graph using a BFS process

BASIC VERTEXCENTRIC ALGORITHM Recall that the PLL algorithm (Alg. 1) iterates following the vertex rank (order): at the ith iteration, the vertex u with rank π(u) = i will be distributed to all other vertices in the graph using a BFS process. The key condition to add u into the label of v, L(v), is the distance check for the canonical labeling criterion:...

work page

[3] [3]

Scatter phase (Lines 3 to 7, also referred to as the push model): all active vertices with new labels perform 4 Algorithm 3 VC-PLL for G = ( V, E) with Order π {Init.: ( L(v): label; δL(v): new label from each iteration)} 1: ActiveV ertices← V ;∀ v∈ V, δL (v)←{ (v, 0)}, L(v)← δL(v) 2: while ActiveV ertices̸=∅ do {Scatter Phase:} 3: for all a∈ ActiveVertic...

work page

[4] [4]

# $"% &"' (

Gather phase (Line 8-16): all vertices that receive a new message ( v.messages ̸= ∅) perform a vertex Gather function (Lines 9-15): For a vertex v, it traverses all its re- ceived messages (distance label from its neighbors), and fo r each unique vertex (u, d(u, v)) across the set of messages, it conﬁrms the distance check for the canonical labeling crite...

work page

[5] [5]

Though VC-PLL can be described in a natural Vertex- Centric computational scheme, it also demonstrates certai n limitations of the original vertex-centric model assumpti ons:

BATCHED VERTEXCENTRIC ALG. Though VC-PLL can be described in a natural Vertex- Centric computational scheme, it also demonstrates certai n limitations of the original vertex-centric model assumpti ons:

work page

[6] [6]

Indeed, the additional computational costs of VC-PLL compared with PLL (Subsection 3.3) can be traced back to these limitations

Typically, the vertex value (and message) is ﬁxed in VC, whereas in VC-PLL, each vertex value (and message) is a continuously growing list (or set); 2) In the Gather functio n, the computation needs remote memory access for checking distance conditions (Line 11 in Algorithm 3): in most cases, u is not a neighbor of v, and when we use L(u) for distance che...

work page

[7] [7]

Since the number of negative checks is the same for PLL and BVC-PLL, we expect their overall cost will be fairly close to each other

can help provide almost O(1) pruning. Since the number of negative checks is the same for PLL and BVC-PLL, we expect their overall cost will be fairly close to each other. Putting It T ogether: Assuming that PLL and BVC-PLL have a similar cost for negative distance checks, theoreti- cally, BVC-PLL may have smaller computational cost than that of PLL (due ...

work page

[8] [8]

VC-PLL and BVC-PLL can be easily extended to handle directed graphs by considering these two labels as separate computations

V ARIANTS AND IMPLEMENTATION 8 5.1 Generalization Directed Graphs: For directed graph, each vertex v is assigned with two labels Lin(v) and Lout(v). VC-PLL and BVC-PLL can be easily extended to handle directed graphs by considering these two labels as separate computations. Speciﬁcally, in the Scatter function, the new labels δLin and δLout will be sent o...

work page

[9] [9]

deg.” denotes average degree. “dia

EV ALUATION In this section, we perform a detailed evaluation of BVC- PLL, focusing on answering the following questions: 1) How does BVC-PLL algorithm perform against the original PLL in a sequential setting (single thread; no parallelism)? Sp ecif- ically, the theoretical analysis indicates it may run faste r, but we conduct experiments to conﬁrm this. ...

work page

[10] [10]

There have been quite a list of eﬀorts in designing parallel Dijkstra and BFS algorithms [56, 49, 47]

RELATED WORK Online and Parallel Shortest Path Distance Com- putation: The standard (single source) shortest path com- putation method is Dijkstra’s algorithm [26] for weighted graphs and Breadth-First Search (BFS) traversal for un- weighted graphs. There have been quite a list of eﬀorts in designing parallel Dijkstra and BFS algorithms [56, 49, 47]. Part...

work page

[11] [11]

We have achieved this by mapping the algorithm to a vertex-centric model

CONCLUSION In this paper, we proposed VC-PLL, which, to the best of our knowledge, is the ﬁrst scalable parallelization of Prun ed Landmark Labeling (PLL) that is able to produce the same result as the sequential method. We have achieved this by mapping the algorithm to a vertex-centric model. We also introduced a new batched execution mechanism for VC-PL...

work page

[12] [12]

Graph minors. ii. algorithmic aspects of tree-width. Journal of Algorithms , 7(3):309 – 322, 1986

work page 1986

[13] [13]

Abraham, D

I. Abraham, D. Delling, A. V. Goldberg, and R. F. Werneck. A hub-based labeling algorithm for shortest paths in road networks. In SEA’11, 2011

work page 2011

[14] [14]

Abraham, D

I. Abraham, D. Delling, A. V. Goldberg, and R. F. Werneck. Hierarchical hub labelings for shortest paths. In Proceedings of the 20th Annual European Conference on Algorithms , ESA’12, 2012

work page 2012

[15] [15]

Abraham, A

I. Abraham, A. Fiat, A. V. Goldberg, and R. F. Werneck. Highway dimension, shortest paths, and provably eﬃcient algorithms. In SODA’10, 2010

work page 2010

[16] [16]

A. B. Adcock, B. D. Sullivan, and M. W. Mahoney. Tree decompositions and social graphs. Internet Mathematics, 12(5):315–361, 2016

work page 2016

[17] [17]

Akiba, Y

T. Akiba, Y. Iwata, K.-i. Kawarabayashi, and Y. Kawata. Fast shortest-path distance queries on road networks by pruned highway labeling. In Proceedings of the Meeting on Algorithm Engineering & Expermiments, pages 147–154, 2014

work page 2014

[18] [18]

Akiba, Y

T. Akiba, Y. Iwata, and Y. Yoshida. Fast Exact Shortest-Path Distance Queries on Large Networks by Pruned Landmark Labeling. In Proceedings of the 2013 ACM SIGMOD International Conference on Management of Data , pages 349–360. ACM, 2013

work page 2013

[19] [19]

Babenko, A

M. Babenko, A. V. Goldberg, A. Gupta, and V. Nagarajan. Algorithms for hub label optimization. ACM Trans. Algorithms , 13(1), Nov. 2016

work page 2016

[20] [20]

Babenko, A

M. Babenko, A. V. Goldberg, H. Kaplan, R. Savchenko, and M. Weller. On the complexity of hub labeling (extended abstract). In Mathematical Foundations of Computer Science 2015 , pages 62–74. Springer Berlin Heidelberg, 2015

work page 2015

[21] [21]

H. Bast, D. Delling, A. Goldberg, M. Müller-Hannemann, T. Pajor, P. Sanders, D. Wagner, and R. F. Werneck. Route Planning in Transportation Networks, pages 19–80. Springer International Publishing, Cham, 2016

work page 2016

[22] [22]

H. Bast, S. Funke, P. Sanders, and D. Schultes. Fast Routing in Road Networks with Transit Nodes. Science, 316:566–, Apr. 2007

work page 2007

[23] [23]

Bauer, D

R. Bauer, D. Delling, P. Sanders, D. Schieferdecker, D. Schultes, and D. Wagner. Combining hierarchical and goal-directed speed-up techniques for dijkstra’s algorithm. J. Exp. Algorithmics , 15, March 2010

work page 2010

[24] [24]

Besta, M

M. Besta, M. Podstawski, L. Groner, E. Solomonik, and T. Hoeﬂer. To Push or To Pull: On Reducing Communication and Synchronization in Graph Computations. In Proceedings of the 26th International Symposium on High-Performance Parallel and Distributed Computing , pages 93–104. ACM, 2017

work page 2017

[25] [25]

U. Brandes. A faster algorithm for betweenness centrality. Journal of Mathematical Sociology , 25:163–177, 2001

work page 2001

[26] [26]

S. Cao, W. Lu, and Q. Xu. Grarep: Learning Graph Representations with Global Structural Information. In Proceedings of the 24th ACM International on Conference on Information and Knowledge Management, pages 891–900. ACM, 2015

work page 2015

[27] [27]

Chang, J

L. Chang, J. X. Yu, L. Qin, H. Cheng, and M. Qiao. The exact distance to destination in undirected world. The VLDB Journal , 21(6):869–888, Dec 2012

work page 2012

[28] [28]

Cheng, Z

J. Cheng, Z. Shang, H. Cheng, H. Wang, and J. Yu. K-reach: Who is in your small world. PVLDB, 5(11):1292–1303, 2012

work page 2012

[29] [29]

Cheng, J

J. Cheng, J. X. Yu, X. Lin, H. Wang, and P. S. Yu. Fast computation of reachability labeling for large graphs. In EDBT, 2006

work page 2006

[30] [30]

Cheng, J

J. Cheng, J. X. Yu, X. Lin, H. Wang, and P. S. Yu. Fast computing reachability labelings for large graphs with high compression rate. In EDBT, 2008

work page 2008

[31] [31]

Cheng, J

J. Cheng, J. X. Yu, and P. S. Yu. Graph pattern matching: A join/semijoin approach. IEEE Trans. Knowl. Data Eng. , 23(7):1006–1021, 2011

work page 2011

[32] [32]

Cohen, E

E. Cohen, E. Halperin, H. Kaplan, and U. Zwick. Reachability and distance queries via 2-hop labels. In Proceedings of the 13th annual ACM-SIAM Symposium on Discrete algorithms , pages 937–946, 2002

work page 2002

[33] [33]

Das Sarma, S

A. Das Sarma, S. Gollapudi, and R. Najork, M.and Panigrahy. A sketch-based distance oracle for web-scale graphs. In WSDM ’10 , 2010

work page 2010

[34] [34]

Dathathri, G

R. Dathathri, G. Gill, L. Hoang, H.-V. Dang, A. Brooks, N. Dryden, M. Snir, and K. Pingali. Gluon: A Communication-Optimizing Substrate for Distributed Heterogeneous Graph Analytics. In Proceedings of the 39th ACM SIGPLAN Conference on Programming Language Design and Implementation, pages 752–768. ACM, 2018

work page 2018

[35] [35]

Delling, A

D. Delling, A. V. Goldberg, T. Pajor, and R. F. Werneck. Robust exact distance queries on massive networks. Microsoft Research, USA, Tech. Rep , 2, 2014

work page 2014

[36] [36]

Dhulipala, G

L. Dhulipala, G. E. Blelloch, and J. Shun. Theoretically eﬃcient parallel graph algorithms can be fast and scalable. In Proceedings of the 30th on Symposium on Parallelism in Algorithms and Architectures, SPAA ’18, pages 393–404, 2018

work page 2018

[37] [37]

E. W. Dijkstra. A note on two problems in connexion with graphs. Numerische Mathematik , 1(1):269–271, December 1959

work page 1959

[38] [38]

Ferizovic

D. Ferizovic. Parallel Pruned Landmark Labeling for Shortest Path Queries on Unit-Weight Networks . PhD thesis, National Research Center, 2015

work page 2015

[39] [39]

Gavoille, D

C. Gavoille, D. Peleg, S. Pérennes, and R. Raz. Distance labeling in graphs. J. Algorithms , 53(1):85–112, 2004

work page 2004

[40] [40]

Geisberger, P

R. Geisberger, P. Sanders, D. Schultes, and D. Delling. Contraction hierarchies: faster and simpler hierarchical routing in road networks. In Proceedings 13 of the 7th international conference on Experimental algorithms, pages 319–333, 2008

work page 2008

[41] [41]

Geisberger, P

R. Geisberger, P. Sanders, D. Schultes, and D. Delling. Contraction hierarchies: faster and simpler hierarchical routing in road networks. In Proceedings of the 7th international conference on Experimental algorithms, 2008

work page 2008

[42] [42]

A. V. Goldberg and C. Harrelson. Computing the shortest path: A search meets graph theory. In SODA ’05, 2005

work page 2005

[43] [43]

J. E. Gonzalez, Y. Low, H. Gu, D. Bickson, and C. Guestrin. PowerGraph: Distributed Graph-Parallel Computation on Natural Graphs. In OSDI, volume 12, page 2, 2012

work page 2012

[44] [44]

Gubichev, S

A. Gubichev, S. Bedathur, S. Seufert, and G. Weikum. Fast and accurate estimation of shortest paths in large graphs. In Proceedings of the 19th ACM international conference on Information and knowledge management, CIKM ’10, pages 499–508, 2010

work page 2010

[45] [45]

R. J. Gutman. Reach-based routing: A new approach to shortest path algorithms optimized for road networks. In ALENEX/ANALC, pages 100–111, 2004

work page 2004

[46] [46]

W. Han, D. Mawhirter, B. Wu, and M. Buland. Graphie: Large-Scale Asynchronous Graph Traversals on Just a GPU. In 2017 26th International Conference on Parallel Architectures and Compilation Techniques (PACT),, pages 233–245. IEEE, 2017

work page 2017

[47] [47]

Jiang, A

M. Jiang, A. W.-C. Fu, R. C.-W. Wong, and Y. Xu. Hop doubling label indexing for point-to-point distance querying on scale-free networks. Proc. VLDB Endow., 7(12), Aug. 2014

work page 2014

[48] [48]

R. Jin, N. Ruan, Y. Xiang, and V. E. Lee. A highway-centric labeling approach for answering distance queries on large sparse graphs. In SIGMOD, 2012

work page 2012

[49] [49]

R. Jin, N. Ruan, B. You, and H. Wang. Hub-accelerator: Fast and exact shortest path computation in large social networks. CoRR, abs/1305.0507, 2013

work page internal anchor Pith review Pith/arXiv arXiv 2013

[50] [50]

Jin and G

R. Jin and G. Wang. Simple, fast, and scalable reachability oracle. Proceedings of the VLDB Endowment, 6(14):1978–1989, 2013

work page 1978

[51] [51]

R. Jin, Y. Xiang, N. Ruan, and D. Fuhry. 3-hop: a high-compression indexing scheme for reachability query. In SIGMOD’09, 2009

work page 2009

[52] [52]

N. Jing, Y. Huang, and E. A. Rundensteiner. Hierarchical encoded path views for path query processing: An optimal model and its performance evaluation. TKDE, 10(3):409–432, 1998

work page 1998

[53] [53]

Jung and S

S. Jung and S. Pramanik. An eﬃcient path computation model for hierarchically structured topographical road maps. TKDE, 14(5):1029–1046, 2002

work page 2002

[54] [54]

Khorasani, K

F. Khorasani, K. Vora, R. Gupta, and L. N. Bhuyan. CuSha: Vertex-centric Graph Processing on GPUs. In Proceedings of the 23rd international symposium on High-performance parallel and distributed computing , pages 239–252. ACM, 2014

work page 2014

[55] [55]

Ko and W.-S

S. Ko and W.-S. Han. TurboGraph++: A Scalable and Fast Graph Analytics System. In Proceedings of the 2018 International Conference on Management of Data, pages 395–410. ACM, 2018

work page 2018

[56] [56]

Kriegel, P

H. Kriegel, P. Kröger, M. Renz, and T. Schmidt. Hierarchical graph embedding for eﬃcient query processing in very large traﬃc networks. In SSDBM ’08, 2008

work page 2008

[57] [57]

Kyrola, G

A. Kyrola, G. E. Blelloch, and C. Guestrin. Graphchi: Large-Scale Graph Computation on Just a PC. In 10th USENIX Symposium on Operating Systems Design and Implementation (OSDI) . USENIX, 2012

work page 2012

[58] [58]

C. E. Leiserson and T. B. Schardl. A work-eﬃcient parallel breadth-ﬁrst search algorithm (or how to cope with the nondeterminism of reducers). In Proceedings of the Twenty-second Annual ACM Symposium on Parallelism in Algorithms and Architectures , SPAA ’10, 2010

work page 2010

[59] [59]

Y. Li, M. L. Yiu, N. M. Kou, et al. An experimental study on hub labeling based shortest path algorithms. Proceedings of the VLDB Endowment , 11(4):445–457, 2017

work page 2017

[60] [60]

Liu and H

H. Liu and H. H. Huang. Enterprise: breadth-ﬁrst graph traversal on gpus. In SC ’15: Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis , 2015

work page 2015

[61] [61]

H. Liu, H. H. Huang, and Y. Hu. iBFS: Concurrent Breadth-First Search on GPUs. In Proceedings of the 2016 International Conference on Management of Data, pages 403–416. ACM, 2016

work page 2016

[62] [62]

Y. Low, J. E. Gonzalez, A. Kyrola, D. Bickson, C. E. Guestrin, and J. Hellerstein. Graphlab: A New Framework for Parallel Machine Learning. arXiv preprint arXiv:1408.2041, 2014

work page internal anchor Pith review Pith/arXiv arXiv 2041

[63] [63]

C. C. M. Potamias, F. Bonchi and A. Gionis. Fast shortest path distance estimation in large networks. In CIKM ’09 , 2009

work page 2009

[64] [64]

Maass, C

S. Maass, C. Min, S. Kashyap, W. Kang, M. Kumar, and T. Kim. Mosaic: Processing a Trillion-Edge Graph on a Single Machine. In Proceedings of the Twelfth European Conference on Computer Systems , pages 527–543. ACM, 2017

work page 2017

[65] [65]

Malewicz, M

G. Malewicz, M. H. Austern, A. J. Bik, J. C. Dehnert, I. Horn, N. Leiser, and G. Czajkowski. Pregel: a system for large-scale graph processing. In Proceedings of the 2010 international conference on Management of data , SIGMOD ’10, 2010

work page 2010

[66] [66]

R. R. McCune, T. Weninger, and G. Madey. Thinking Like a Vertex: a Survey of Vertex-centric Frameworks for Large-scale Distributed Graph Processing. ACM Computing Surveys (CSUR) , 48(2):25, 2015

work page 2015

[67] [67]

Meyer and P

U. Meyer and P. Sanders. Delta-stepping: A parallel single source shortest path algorithm. In Proceedings of the 6th Annual European Symposium on Algorithms, ESA ’98, pages 393–404, 1998

work page 1998

[68] [68]

Nguyen, A

D. Nguyen, A. Lenharth, and K. Pingali. A Lightweight Infrastructure for Graph Analytics. In Proceedings of the Twenty-Fourth ACM Symposium on Operating Systems Principles , pages 456–471. ACM, 2013

work page 2013

[69] [69]

Ouyang, L

D. Ouyang, L. Qin, L. Chang, X. Lin, Y. Zhang, and Q. Zhu. When hierarchy meets 2-hop-labeling: Eﬃcient shortest distance queries on road networks. In Proceedings of the 2018 International Conference on Management of Data , SIGMOD ’18, pages 709–724, 2018. 14

work page 2018

[70] [70]

Pai and K

S. Pai and K. Pingali. A Compiler for Throughput Optimization of Graph Algorithms on GPUs. In Proceedings of the 2016 ACM SIGPLAN International Conference on Object-Oriented Programming, Systems, Languages, and Applications , pages 1–19. ACM, 2016

work page 2016

[71] [71]

S. B. K. A. D. Patterson. Direction-Optimizing Breadth-First Search. SC12, November, pages 10–16, 2012

work page 2012

[72] [72]

M. Qiao, H. Cheng, L. Chang, and J. X. Yu. Approximate shortest distance computing: A query-dependent local landmark scheme. In ICDE, 2012

work page 2012

[73] [73]

K. Qiu, Y. Zhu, J. Yuan, J. Zhao, X. Wang, and T. Wolf. ParaPLL: Fast Parallel Shortest-path Distance Query on Large-scale Weighted Graphs. In Proceedings of the 47th International Conference on Parallel Processing, page 2. ACM, 2018

work page 2018

[74] [74]

Quamar, A

A. Quamar, A. Deshpande, and J. Lin. NScale: Neighborhood-Centric Large-Scale Graph Analytics in the Cloud. The VLDB Journal–The International Journal on Very Large Data Bases , 25(2):125–150, 2016

work page 2016

[75] [75]

A. Roy, I. Mihailovic, and W. Zwaenepoel. X-Stream: Edge-centric Graph Processing using Streaming Partitions. In Proceedings of the Twenty-Fourth ACM Symposium on Operating Systems Principles , pages 472–488. ACM, 2013

work page 2013

[76] [76]

Salihoglu and J

S. Salihoglu and J. Widom. Optimizing graph algorithms on pregel-like systems. Proc. VLDB Endow., 7(7), Mar. 2014

work page 2014

[77] [77]

Samet, J

H. Samet, J. Sankaranarayanan, and H. Alborzi. Scalable network distance browsing in spatial databases. In SIGMOD’08, 2008

work page 2008

[78] [78]

Sanders and D

P. Sanders and D. Schultes. Highway hierarchies hasten exact shortest path queries. In 17th Eur. Symp. Algorithms (ESA) , 2005

work page 2005

[79] [79]

Sankaranarayanan, H

J. Sankaranarayanan, H. Alborzi, and H. Samet. Distance join queries on spatial networks. In Proceedings of the 14th annual ACM international symposium on Advances in geographic information systems, GIS ’06, 2006

work page 2006

[80] [80]

Sankaranarayanan, H

J. Sankaranarayanan, H. Samet, and H. Alborzi. Path oracles for spatial networks. PVLDB, 2, August 2009

work page 2009