BitPath -- Label Order Constrained Reachability Queries over Large Graphs
read the original abstract
In this paper we focus on the following constrained reachability problem over edge-labeled graphs like RDF -- "given source node x, destination node y, and a sequence of edge labels (a, b, c, d), is there a path between the two nodes such that the edge labels on the path satisfy a regular expression "*a.*b.*c.*d.*". A "*" before "a" allows any other edge label to appear on the path before edge "a". "a.*" forces at least one edge with label "a". ".*" after "a" allows zero or more edge labels after "a" and before "b". Our query processing algorithm uses simple divide-and-conquer and greedy pruning procedures to limit the search space. However, our graph indexing technique -- based on "compressed bit-vectors" -- allows indexing large graphs which otherwise would have been infeasible. We have evaluated our approach on graphs with more than 22 million edges and 6 million nodes -- much larger compared to the datasets used in the contemporary work on path queries.
This paper has not been read by Pith yet.
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.