Subgraph Ensembles and Motif Discovery Using a New Heuristic for Graph Isomorphism

Kim Baskerville; Maya Paczuski

read the original abstract

A new heuristic based on vertex invariants is developed to rapidly distinguish non-isomorphic graphs to a desired level of accuracy. The method is applied to sample subgraphs from an E.coli protein interaction network, and as a probe for discovery of extended motifs. The network's structure is described using statistical properties of its $N$-node subgraphs for $N\leq 14$. The Zipf plots for subgraph occurrences are robust power laws that do not change when rewiring the network while fixing the degree sequence -- although the specific subgraphs may exchange ranks. However the exponent depends on $N$. The study of larger subgraphs highlights some striking patterns for various $N$. Motifs, or connected pieces that are over-abundant in the ensemble of subgraphs, have more edges, for a given number of nodes, than antimotifs and generally display a bipartite structure or tend towards a complete graph. In contrast, antimotifs, which are under-abundant connected pieces, are mostly trees or contain at most a single, small loop. The extension to directed graphs is straightforward.

Subgraph Ensembles and Motif Discovery Using a New Heuristic for Graph Isomorphism

discussion (0)