pith. machine review for the scientific record.

arxiv: 1406.3411 · v1 · pith:AXLQSRTZnew · submitted 2014-06-13 · 💻 cs.SI · physics.soc-ph

VoG: Summarizing and Understanding Large Graphs

keywords: graph, description, graphs, subgraphs, vocabulary, large, length, measure

How can we succinctly describe a million-node graph with a few simple sentences? How can we measure the "importance" of a set of discovered subgraphs in a large graph? These are exactly the problems we focus on. Our main ideas are to construct a "vocabulary" of subgraph-types that often occur in real graphs (e.g., stars, cliques, chains), and from a set of subgraphs, find the most succinct description of a graph in terms of this vocabulary. We measure success in a well-founded way by means of the Minimum Description Length (MDL) principle: a subgraph is included in the summary if it decreases the total description length of the graph. Our contributions are three-fold: (a) formulation: we provide a principled encoding scheme to choose vocabulary subgraphs; (b) algorithm: we develop VoG, an efficient method to minimize the description cost, and (c) applicability: we report experimental results on multi-million-edge real graphs, including Flickr and the Notre Dame web graph.
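The MDL inclusion rule stated in the abstract can be illustrated with a small sketch. The code below is not the paper's encoding scheme or algorithm: the cost function is a deliberately simplified, hypothetical stand-in (one bit for the structure type, log2(n) bits per listed node, and 2·log2(n) bits per edge that the model gets wrong), and the names `implied_edges`, `total_cost`, and `greedy_summary` are invented for illustration. It only shows the idea that a candidate subgraph joins the summary when it lowers the total description length.

```python
from itertools import combinations
from math import log2


def implied_edges(kind, nodes):
    """Edges a vocabulary structure claims to exist (toy vocabulary: clique, star)."""
    nodes = list(nodes)
    if kind == "clique":
        # A clique claims every pair of its nodes is connected.
        return {frozenset(pair) for pair in combinations(nodes, 2)}
    if kind == "star":
        # Assumption: the first listed node is the hub, the rest are spokes.
        hub, spokes = nodes[0], nodes[1:]
        return {frozenset((hub, s)) for s in spokes}
    raise ValueError(f"unknown structure type: {kind}")


def total_cost(edges, model, n):
    """Toy two-part cost in bits: L(model) + L(error | model). Hypothetical encoding."""
    node_bits = log2(n)
    # 1 bit selects one of the two structure types; each node id costs log2(n) bits.
    model_bits = sum(1 + len(nodes) * node_bits for _, nodes in model)
    covered = set().union(*(implied_edges(k, ns) for k, ns in model))
    # Error: real edges the model misses plus edges it wrongly claims.
    error = edges ^ covered
    return model_bits + len(error) * 2 * node_bits


def greedy_summary(edges, candidates, n):
    """Keep a candidate only if adding it decreases the total description length."""
    model = []
    best = total_cost(edges, model, n)
    for cand in candidates:
        cost = total_cost(edges, model + [cand], n)
        if cost < best:
            model.append(cand)
            best = cost
    return model, best


if __name__ == "__main__":
    # 8-node toy graph: a 4-clique on {0,1,2,3} and a star with hub 4.
    edges = implied_edges("clique", (0, 1, 2, 3)) | implied_edges("star", (4, 5, 6, 7))
    candidates = [
        ("clique", (0, 1, 2, 3)),
        ("star", (4, 5, 6, 7)),
        ("clique", (4, 5, 6, 7)),  # poor fit: claims edges that do not exist
    ]
    model, bits = greedy_summary(edges, candidates, n=8)
    print(model, bits)
```

In this toy run the clique and the star are kept because each lowers the cost, while the ill-fitting second clique is rejected because the spurious edges it claims cost more to correct than the structure saves.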

