Network Embedding: on Compression and Learning
Pith reviewed 2026-05-25 01:44 UTC · model grok-4.3
The pith
Compressing graphs via neighborhood similarity speeds up random-walk embeddings without accuracy loss.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
NECL uses a neighborhood similarity based graph compression to produce a smaller graph without losing much information about the global structure of the graph and the local proximity of the vertices. Network embedding is performed on this compressed graph rather than the original to lower the embedding cost. It acts as a meta-strategy that enhances the efficiency of state-of-the-art random walk based graph embedding algorithms without decreasing their effectiveness, as shown in experiments on real-world networks.
What carries the argument
Neighborhood similarity based graph compression that merges or reduces parts of the graph based on similar neighborhoods to preserve structure and proximity information.
If this is right
- Embedding time including walking and learning phases is reduced by 23-57% on average.
- Classification accuracy on single-label and multi-label tasks stays the same as on the original graph.
- The method applies to any random walk based embedding algorithm as a preprocessing step.
- Experiments confirm this on networks including DBLP, BlogCatalog, Cora, and Wiki.
Where Pith is reading between the lines
- Similar compression could be tested on embedding methods not based on random walks to see if the benefit generalizes.
- Applying the method to graphs with millions of nodes would test its scalability for very large social networks.
- Other downstream tasks such as community detection or link prediction may see comparable efficiency gains.
Load-bearing premise
The compression based on neighborhood similarity preserves sufficient global structure and local proximity information so that embeddings from the compressed graph perform as well as those from the original on classification tasks.
What would settle it
A significant decrease in classification accuracy for embeddings learned on the compressed graph compared to the original graph on one of the tested real-world networks would falsify the claim.
Figures
read the original abstract
Recently, network embedding that encodes structural information of graphs into a vector space has become popular for network analysis. Although recent methods show promising performance for various applications, the huge sizes of graphs may hinder a direct application of existing network embedding method to them. This paper presents NECL, a novel efficient Network Embedding method with two goals. 1) Is there an ideal Compression of a network? 2) Will the compression of a network significantly boost the representation Learning of the network? For the first problem, we propose a neighborhood similarity based graph compression method that compresses the input graph to get a smaller graph without losing any/much information about the global structure of the graph and the local proximity of the vertices in the graph. For the second problem, we use the compressed graph for network embedding instead of the original large graph to bring down the embedding cost. NECL is a general meta-strategy to improve the efficiency of all of the state-of-the-art graph embedding algorithms based on random walks, including DeepWalk and Node2vec, without losing their effectiveness. Extensive experiments on large real-world networks validate the efficiency of NECL method that yields an average improvement of 23 - 57% embedding time, including walking and learning time without decreasing classification accuracy as evaluated on single and multi-label classification tasks on real-world graphs such as DBLP, BlogCatalog, Cora and Wiki.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper proposes NECL, a meta-strategy for network embedding that first applies a neighborhood-similarity-based compression to the input graph and then runs existing random-walk embedding algorithms (DeepWalk, Node2vec, etc.) on the compressed graph. The central claims are that the compression preserves global structure and local proximity sufficiently well that downstream node-classification accuracy remains unchanged while embedding time (walking plus learning) drops by 23–57 % on four real-world networks (DBLP, BlogCatalog, Cora, Wiki).
Significance. If the empirical accuracy parity holds under controlled conditions and the method is shown to be robust across compression ratios, NECL would supply a practical, algorithm-agnostic preprocessing step that improves scalability of random-walk embeddings without requiring changes to the core embedding routines. The absence of any parameter-free derivation or machine-checked guarantee, however, keeps the result at the level of an empirical heuristic rather than a foundational advance.
major comments (2)
- [Abstract / Method] Abstract and method description: the claim that the compressed graph yields embeddings that are 'effectively interchangeable' with those from the original graph for random-walk algorithms rests on the unproven assumption that neighborhood-similarity compression leaves finite-length walk distributions and stationary probabilities sufficiently close. No direct comparison of walk statistics, embedding-space distances, or higher-order connectivity is reported, which is load-bearing for the 'without losing their effectiveness' assertion.
- [Experiments] Experiments: the abstract states 'without decreasing classification accuracy' on single- and multi-label tasks, yet supplies no information on how the similarity threshold or target compression ratio was selected, whether multiple random seeds were used, or whether statistical significance tests were performed. This prevents assessment of whether the reported parity is robust or an artifact of particular hyper-parameter choices.
minor comments (1)
- [Abstract] The abstract contains the repeated phrase 'without losing any/much information'; a more precise statement of what structural invariants are (or are not) preserved would improve clarity.
Simulated Author's Rebuttal
Thank you for the opportunity to respond to the referee's report. We address the major comments below and propose revisions where appropriate to strengthen the manuscript.
read point-by-point responses
-
Referee: [Abstract / Method] Abstract and method description: the claim that the compressed graph yields embeddings that are 'effectively interchangeable' with those from the original graph for random-walk algorithms rests on the unproven assumption that neighborhood-similarity compression leaves finite-length walk distributions and stationary probabilities sufficiently close. No direct comparison of walk statistics, embedding-space distances, or higher-order connectivity is reported, which is load-bearing for the 'without losing their effectiveness' assertion.
Authors: Our claims are limited to the preservation of effectiveness as measured by node classification accuracy, which is the standard evaluation for network embedding methods. We do not assert that the compressed graph produces identical walk distributions or embeddings. The experiments show that classification performance is maintained across the tested datasets and tasks. We will revise the wording in the abstract and introduction to emphasize that the method maintains task performance without claiming theoretical equivalence of the embeddings themselves. revision: partial
-
Referee: [Experiments] Experiments: the abstract states 'without decreasing classification accuracy' on single- and multi-label tasks, yet supplies no information on how the similarity threshold or target compression ratio was selected, whether multiple random seeds were used, or whether statistical significance tests were performed. This prevents assessment of whether the reported parity is robust or an artifact of particular hyper-parameter choices.
Authors: We acknowledge the lack of these details in the current version. The manuscript specifies the compression ratios applied to each network but does not detail the selection process or statistical analysis. In the revised manuscript, we will include a description of how the similarity threshold was determined to achieve the reported compression levels while preserving key graph properties, report averages over multiple random seeds for the embedding process, and perform statistical significance tests (e.g., Wilcoxon signed-rank test) to validate that accuracy differences are not significant. revision: yes
Circularity Check
No circularity: method is a preprocessing heuristic validated empirically
full rationale
The paper presents NECL as a neighborhood-similarity compression preprocessing step applied to existing random-walk embedding algorithms (DeepWalk, Node2vec). No equations, derivations, or fitted parameters are described that reduce the claimed preservation of walk statistics or downstream accuracy to a quantity defined from the same data. Effectiveness is asserted via experimental accuracy parity on four datasets rather than by construction or self-citation chains. The central claim therefore remains an independent empirical hypothesis rather than a tautology.
Axiom & Free-Parameter Ledger
Forward citations
Cited by 1 Pith paper
-
Persistence Homology of Networks: Methods and Applications
A conceptual review of persistent homology techniques, filtrations, algorithms, and applications for complex networks, including a proposed unified framework.
Reference graph
Works this paper leans on
-
[1]
edges) between individual units (i.e
INTRODUCTION Many real-world data can be modeled as networks to capture the interaction (i.e. edges) between individual units (i.e. vertices). Node classification, community detection and link prediction are some applications of network analysis in many different areas such as social networks and biological networks. Node classification is to find the label o...
-
[2]
Is there an ideal compression of a network?
-
[3]
Will the compression of a network significantly boost the representation learning of the network? As a solution to these problems, we propose NECL, a novel network embedding method. For the first problem, we propose a neighborhood similarity based graph compression method that compresses the input graph to get a relatively smaller graph without losing any/m...
-
[4]
NETWORK EMBEDDING USING SIMILARITY BASED COMPRESSION In this section We first give preliminary information about network embedding and graph compressing, then we describe our neighborhood similarity based graph compression algorithm and how to use compressed graph towards optimizing the efficiency of network embedding. 2.1 Preliminaries In this section, we b...
-
[5]
We first provide an overview of the datasets and embedding methods used for experiments
EXPERIMENTS We perform experimental studies to evaluate the efficiency and effectiveness of our algorithms on challenging multi-class and multi-label classification tasks in severa l real-world networks. We first provide an overview of the datasets and embedding methods used for experiments. We further show the performance of algorithms and also the improvemen...
-
[6]
RELATED WORK In this section, we briefly discuss the related work in the areas of networks embedding and graph compression. Network embedding. Previous researchers consider the 7 Table 2: Performance comparison of the single-label classi fication tasks for the similarity threshold λ = 0.5 and training ratio 5% for Cora and Wiki Cora (5%) Wiki (5%) NECL(DW) ...
work page 2000
-
[7]
CONCLUSIONS We propose a novel efficient network embedding method NECL which preserves the local structural features of the vertices. To overcome the efficiency limitations of the state-of-the-art methods, we use the idea of the graph compressing layout to network representation learning methods. We combine related vertices of a network into super-nodes which...
-
[8]
M. Adler and M. Mitzenmacher. Towards compressing web graphs. In Proceedings of Data Compression Conference, DCC 2001. , pages 203–212. IEEE, 2001
work page 2001
-
[9]
E. Akbas and P. Zhao. Attributed graph clustering: An attribute-aware graph embedding approach. In Proceedings of the 2017 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining 2017 , ASONAM ’17, pages 305–308, New York, NY, USA, 2017. ACM
work page 2017
-
[10]
E. Akbas and P. Zhao. Graph Clustering Based on Attribute-Aware Graph Embedding, pages 109–131. Springer International Publishing, Cham, 2019
work page 2019
-
[11]
M. Belkin and P. Niyogi. Laplacian eigenmaps and spectral techniques for embedding and clustering. In Proceedings of the 14th International Conference on Neural Information Processing Systems: Natural and Synthetic, NIPS’01, pages 585–591, Cambridge, MA, USA, 2001. MIT Press
work page 2001
- [12]
-
[13]
H. Cai, V. W. Zheng, and K. C.-C. Chang. A comprehensive survey of graph embedding: Problems, techniques, and applications. IEEE Transactions on Knowledge and Data Engineering , 30(9):1616–1637, 2018
work page 2018
-
[14]
S. Cao, W. Lu, and Q. Xu. Grarep: Learning graph representations with global structural information. In Proceedings of the 24th ACM International on Conference on Information and Knowledge Management, CIKM ’15, pages 891–900, New York, NY, USA, 2015. ACM
work page 2015
-
[15]
S. Cao, W. Lu, and Q. Xu. Deep neural networks for learning graph representations. In Thirtieth AAAI Conference on Artificial Intelligence , 2016
work page 2016
-
[16]
H. Chen, B. Perozzi, R. Al-Rfou, and S. Skiena. A tutorial on network embeddings. arXiv preprint arXiv:1808.02590, 2018
work page internal anchor Pith review Pith/arXiv arXiv 2018
-
[17]
H. Chen, B. Perozzi, Y. Hu, and S. Skiena. Harp: Hierarchical representation learning for networks. In Thirty-Second AAAI Conference on Artificial Intelligence, 2018
work page 2018
-
[18]
P. Cui, X. Wang, J. Pei, and W. Zhu. A survey on network embedding. IEEE Transactions on Knowledge and Data Engineering , 2018
work page 2018
- [19]
-
[20]
A. C. Gilbert and K. Levchenko. Compressing network graphs. In Proceedings of the LinkKDD workshop at the 10th ACM Conference on KDD, International conference on Knowledge discovery and data mining , volume 124, 2004
work page 2004
-
[21]
V. Gligorijevi´ c, M. Barot, and R. Bonneau. deepnf: Deep network fusion for protein function prediction. bioRxiv, 2017
work page 2017
-
[22]
P. Goyal and E. Ferrara. Graph embedding techniques, applications, and performance: A survey. Knowledge-Based Systems , 151:78–94, 2018
work page 2018
-
[23]
A. Grover and J. Leskovec. node2vec: Scalable feature learning for networks. In Proceedings of the 22nd ACM SIGKDD international conference on Knowledge discovery and data mining , pages 855–864. ACM, 2016
work page 2016
-
[24]
W. L. Hamilton, R. Ying, and J. Leskovec. Representation learning on graphs: Methods and applications. arXiv preprint arXiv:1709.05584 , 2017
work page internal anchor Pith review Pith/arXiv arXiv 2017
-
[25]
S. Letovsky and S. Kasif. Predicting protein function from protein/protein interaction data: a probabilistic approach. Bioinformatics, 19(suppl 1):i197–i204, 2003
work page 2003
-
[26]
Y. Liu, T. Safavi, A. Dighe, and D. Koutra. Graph summarization methods and applications: A survey. ACM Comput. Surv. , 51(3):62:1–62:34, June 2018
work page 2018
-
[27]
G. R. Lopes, M. M. Moro, L. K. Wives, and J. P. M. De Oliveira. Collaboration recommendation on academic social networks. In International conference on conceptual modeling , pages 190–199. Springer, 2010
work page 2010
-
[28]
T. Mikolov, K. Chen, G. Corrado, and J. Dean. Efficient estimation of word representations in vector space. In Proceedings of Workshop at ICLR, 2013. , 2013
work page 2013
-
[29]
T. Mikolov, I. Sutskever, K. Chen, G. S. Corrado, and J. Dean. Distributed representations of words and phrases and their compositionality. In Advances in neural information processing systems , pages 3111–3119, 2013
work page 2013
-
[30]
M. Niepert, M. Ahmed, and K. Kutzkov. Learning convolutional neural networks for graphs. In Proceedings of the 33rd International Conference on International Conference on Machine Learning - Volume 48, ICML’16, pages 2014–2023. JMLR.org, 2016
work page 2014
-
[31]
B. Perozzi, R. Al-Rfou, and S. Skiena. Deepwalk: Online learning of social representations. In Proceedings of the 20th ACM SIGKDD international conference on Knowledge discovery and data mining , pages 701–710. ACM, 2014
work page 2014
-
[32]
J. Qiu, Y. Dong, H. Ma, J. Li, K. Wang, and J. Tang. Network embedding as matrix factorization: Unifying deepwalk, line, pte, and node2vec. In Proceedings of the Eleventh ACM International Conference on Web Search and Data Mining , pages 459–467. ACM, 2018
work page 2018
-
[33]
S. T. Roweis and L. K. Saul. Nonlinear dimensionality reduction by locally linear embedding. science, 290(5500):2323–2326, 2000
work page 2000
- [34]
-
[35]
T. Suel and J. Yuan. Compressing the graph structure of the web. pages 213–222, 02 2001
work page 2001
-
[36]
J. Tang, M. Qu, M. Wang, M. Zhang, J. Yan, and Q. Mei. Line: Large-scale information network embedding. In Proceedings of the 24th international conference on world wide web, WWW15 , pages 1067–1077. International World Wide Web Conferences Steering Committee, 2015
work page 2015
-
[37]
H. Wang, J. Wang, J. Wang, M. Zhao, W. Zhang, F. Zhang, X. Xie, and M. Guo. Graphgan: Graph representation learning with generative adversarial nets. In Thirty-Second AAAI Conference on Artificial Intelligence, 2018
work page 2018
-
[38]
S. Wold, K. Esbensen, and P. Geladi. Principal component analysis. Chemometrics and intelligent laboratory systems, 2(1-3):37–52, 1987
work page 1987
-
[39]
R. Ying, J. You, C. Morris, X. Ren, W. L. Hamilton, and J. Leskovec. Hierarchical graph representation learning with differentiable pooling. In Proceedings of the 32Nd International Conference on Neural Information Processing Systems , NIPS’18, pages 4805–4815, USA, 2018. Curran Associates Inc
work page 2018
- [40]
-
[41]
F. Zhou. Graph compressiony. In Department of Computer Science and Helsinki Institute for Information Technology HIIT , page 112, 2015. 12
work page 2015
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.