pith. sign in

arxiv: 1907.02811 · v2 · pith:PRQDPUOCnew · submitted 2019-07-05 · 💻 cs.SI · cs.LG

Network Embedding: on Compression and Learning

Pith reviewed 2026-05-25 01:44 UTC · model grok-4.3

classification 💻 cs.SI cs.LG
keywords network embeddinggraph compressionrandom walksDeepWalkNode2vecnode classificationgraph representation learning
0
0 comments X

The pith

Compressing graphs via neighborhood similarity speeds up random-walk embeddings without accuracy loss.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper presents NECL, a method that compresses large graphs using neighborhood similarity to create a smaller graph that retains global structure and local vertex proximities. This compressed graph is then fed to random-walk based embedding algorithms such as DeepWalk and Node2vec. The approach reduces the time for embedding computation by 23 to 57 percent on average. A reader would care because it provides a general way to handle very large networks where direct embedding is too costly, while keeping the learned representations effective for tasks like node classification.

Core claim

NECL uses a neighborhood similarity based graph compression to produce a smaller graph without losing much information about the global structure of the graph and the local proximity of the vertices. Network embedding is performed on this compressed graph rather than the original to lower the embedding cost. It acts as a meta-strategy that enhances the efficiency of state-of-the-art random walk based graph embedding algorithms without decreasing their effectiveness, as shown in experiments on real-world networks.

What carries the argument

Neighborhood similarity based graph compression that merges or reduces parts of the graph based on similar neighborhoods to preserve structure and proximity information.

If this is right

  • Embedding time including walking and learning phases is reduced by 23-57% on average.
  • Classification accuracy on single-label and multi-label tasks stays the same as on the original graph.
  • The method applies to any random walk based embedding algorithm as a preprocessing step.
  • Experiments confirm this on networks including DBLP, BlogCatalog, Cora, and Wiki.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Similar compression could be tested on embedding methods not based on random walks to see if the benefit generalizes.
  • Applying the method to graphs with millions of nodes would test its scalability for very large social networks.
  • Other downstream tasks such as community detection or link prediction may see comparable efficiency gains.

Load-bearing premise

The compression based on neighborhood similarity preserves sufficient global structure and local proximity information so that embeddings from the compressed graph perform as well as those from the original on classification tasks.

What would settle it

A significant decrease in classification accuracy for embeddings learned on the compressed graph compared to the original graph on one of the tested real-world networks would falsify the claim.

Figures

Figures reproduced from arXiv: 1907.02811 by Esra Akbas, Mehmet Aktas.

Figure 1
Figure 1. Figure 1: Example of graph compressing on Les Miserables net [PITH_FULL_IMAGE:figures/full_fig_p002_1.png] view at source ↗
Figure 2
Figure 2. Figure 2: (a), while the transition probability from n1 to its neighbors is 1 ∣N(n1)∣, after compressing, it becomes 1 ∣N(n1)∣−1 since number of neighbors decrease by one. In order to avoid this problem, we give weights to edges of super-nodes based on the number of merged edges within the compression. For example, the super-edge between (a) (b) [PITH_FULL_IMAGE:figures/full_fig_p004_2.png] view at source ↗
Figure 3
Figure 3. Figure 3: Detailed single-label classification result on Co [PITH_FULL_IMAGE:figures/full_fig_p008_3.png] view at source ↗
Figure 4
Figure 4. Figure 4: Detailed single-label classification result on Wi [PITH_FULL_IMAGE:figures/full_fig_p008_4.png] view at source ↗
Figure 5
Figure 5. Figure 5: Detailed multi-label classification result on DBL [PITH_FULL_IMAGE:figures/full_fig_p009_5.png] view at source ↗
Figure 6
Figure 6. Figure 6: Detailed multi-label classification result on Blo [PITH_FULL_IMAGE:figures/full_fig_p009_6.png] view at source ↗
Figure 7
Figure 7. Figure 7: The ratio of vertices/edges of the compressed grap [PITH_FULL_IMAGE:figures/full_fig_p010_7.png] view at source ↗
read the original abstract

Recently, network embedding that encodes structural information of graphs into a vector space has become popular for network analysis. Although recent methods show promising performance for various applications, the huge sizes of graphs may hinder a direct application of existing network embedding method to them. This paper presents NECL, a novel efficient Network Embedding method with two goals. 1) Is there an ideal Compression of a network? 2) Will the compression of a network significantly boost the representation Learning of the network? For the first problem, we propose a neighborhood similarity based graph compression method that compresses the input graph to get a smaller graph without losing any/much information about the global structure of the graph and the local proximity of the vertices in the graph. For the second problem, we use the compressed graph for network embedding instead of the original large graph to bring down the embedding cost. NECL is a general meta-strategy to improve the efficiency of all of the state-of-the-art graph embedding algorithms based on random walks, including DeepWalk and Node2vec, without losing their effectiveness. Extensive experiments on large real-world networks validate the efficiency of NECL method that yields an average improvement of 23 - 57% embedding time, including walking and learning time without decreasing classification accuracy as evaluated on single and multi-label classification tasks on real-world graphs such as DBLP, BlogCatalog, Cora and Wiki.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 1 minor

Summary. The paper proposes NECL, a meta-strategy for network embedding that first applies a neighborhood-similarity-based compression to the input graph and then runs existing random-walk embedding algorithms (DeepWalk, Node2vec, etc.) on the compressed graph. The central claims are that the compression preserves global structure and local proximity sufficiently well that downstream node-classification accuracy remains unchanged while embedding time (walking plus learning) drops by 23–57 % on four real-world networks (DBLP, BlogCatalog, Cora, Wiki).

Significance. If the empirical accuracy parity holds under controlled conditions and the method is shown to be robust across compression ratios, NECL would supply a practical, algorithm-agnostic preprocessing step that improves scalability of random-walk embeddings without requiring changes to the core embedding routines. The absence of any parameter-free derivation or machine-checked guarantee, however, keeps the result at the level of an empirical heuristic rather than a foundational advance.

major comments (2)
  1. [Abstract / Method] Abstract and method description: the claim that the compressed graph yields embeddings that are 'effectively interchangeable' with those from the original graph for random-walk algorithms rests on the unproven assumption that neighborhood-similarity compression leaves finite-length walk distributions and stationary probabilities sufficiently close. No direct comparison of walk statistics, embedding-space distances, or higher-order connectivity is reported, which is load-bearing for the 'without losing their effectiveness' assertion.
  2. [Experiments] Experiments: the abstract states 'without decreasing classification accuracy' on single- and multi-label tasks, yet supplies no information on how the similarity threshold or target compression ratio was selected, whether multiple random seeds were used, or whether statistical significance tests were performed. This prevents assessment of whether the reported parity is robust or an artifact of particular hyper-parameter choices.
minor comments (1)
  1. [Abstract] The abstract contains the repeated phrase 'without losing any/much information'; a more precise statement of what structural invariants are (or are not) preserved would improve clarity.

Simulated Author's Rebuttal

2 responses · 0 unresolved

Thank you for the opportunity to respond to the referee's report. We address the major comments below and propose revisions where appropriate to strengthen the manuscript.

read point-by-point responses
  1. Referee: [Abstract / Method] Abstract and method description: the claim that the compressed graph yields embeddings that are 'effectively interchangeable' with those from the original graph for random-walk algorithms rests on the unproven assumption that neighborhood-similarity compression leaves finite-length walk distributions and stationary probabilities sufficiently close. No direct comparison of walk statistics, embedding-space distances, or higher-order connectivity is reported, which is load-bearing for the 'without losing their effectiveness' assertion.

    Authors: Our claims are limited to the preservation of effectiveness as measured by node classification accuracy, which is the standard evaluation for network embedding methods. We do not assert that the compressed graph produces identical walk distributions or embeddings. The experiments show that classification performance is maintained across the tested datasets and tasks. We will revise the wording in the abstract and introduction to emphasize that the method maintains task performance without claiming theoretical equivalence of the embeddings themselves. revision: partial

  2. Referee: [Experiments] Experiments: the abstract states 'without decreasing classification accuracy' on single- and multi-label tasks, yet supplies no information on how the similarity threshold or target compression ratio was selected, whether multiple random seeds were used, or whether statistical significance tests were performed. This prevents assessment of whether the reported parity is robust or an artifact of particular hyper-parameter choices.

    Authors: We acknowledge the lack of these details in the current version. The manuscript specifies the compression ratios applied to each network but does not detail the selection process or statistical analysis. In the revised manuscript, we will include a description of how the similarity threshold was determined to achieve the reported compression levels while preserving key graph properties, report averages over multiple random seeds for the embedding process, and perform statistical significance tests (e.g., Wilcoxon signed-rank test) to validate that accuracy differences are not significant. revision: yes

Circularity Check

0 steps flagged

No circularity: method is a preprocessing heuristic validated empirically

full rationale

The paper presents NECL as a neighborhood-similarity compression preprocessing step applied to existing random-walk embedding algorithms (DeepWalk, Node2vec). No equations, derivations, or fitted parameters are described that reduce the claimed preservation of walk statistics or downstream accuracy to a quantity defined from the same data. Effectiveness is asserted via experimental accuracy parity on four datasets rather than by construction or self-citation chains. The central claim therefore remains an independent empirical hypothesis rather than a tautology.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Only the abstract is available; no explicit free parameters, axioms, or invented entities are described.

pith-pipeline@v0.9.0 · 5770 in / 1123 out tokens · 38893 ms · 2026-05-25T01:44:57.463273+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Persistence Homology of Networks: Methods and Applications

    math.AT 2019-07 unverdicted novelty 1.0

    A conceptual review of persistent homology techniques, filtrations, algorithms, and applications for complex networks, including a proposed unified framework.

Reference graph

Works this paper leans on

41 extracted references · 41 canonical work pages · cited by 1 Pith paper · 2 internal anchors

  1. [1]

    edges) between individual units (i.e

    INTRODUCTION Many real-world data can be modeled as networks to capture the interaction (i.e. edges) between individual units (i.e. vertices). Node classification, community detection and link prediction are some applications of network analysis in many different areas such as social networks and biological networks. Node classification is to find the label o...

  2. [2]

    Is there an ideal compression of a network?

  3. [3]

    Will the compression of a network significantly boost the representation learning of the network? As a solution to these problems, we propose NECL, a novel network embedding method. For the first problem, we propose a neighborhood similarity based graph compression method that compresses the input graph to get a relatively smaller graph without losing any/m...

  4. [4]

    2.1 Preliminaries In this section, we briefly discuss the necessary preliminaries for our new meta-strategy for graph embedding

    NETWORK EMBEDDING USING SIMILARITY BASED COMPRESSION In this section We first give preliminary information about network embedding and graph compressing, then we describe our neighborhood similarity based graph compression algorithm and how to use compressed graph towards optimizing the efficiency of network embedding. 2.1 Preliminaries In this section, we b...

  5. [5]

    We first provide an overview of the datasets and embedding methods used for experiments

    EXPERIMENTS We perform experimental studies to evaluate the efficiency and effectiveness of our algorithms on challenging multi-class and multi-label classification tasks in severa l real-world networks. We first provide an overview of the datasets and embedding methods used for experiments. We further show the performance of algorithms and also the improvemen...

  6. [6]

    Network embedding

    RELATED WORK In this section, we briefly discuss the related work in the areas of networks embedding and graph compression. Network embedding. Previous researchers consider the 7 Table 2: Performance comparison of the single-label classi fication tasks for the similarity threshold λ = 0.5 and training ratio 5% for Cora and Wiki Cora (5%) Wiki (5%) NECL(DW) ...

  7. [7]

    To overcome the efficiency limitations of the state-of-the-art methods, we use the idea of the graph compressing layout to network representation learning methods

    CONCLUSIONS We propose a novel efficient network embedding method NECL which preserves the local structural features of the vertices. To overcome the efficiency limitations of the state-of-the-art methods, we use the idea of the graph compressing layout to network representation learning methods. We combine related vertices of a network into super-nodes which...

  8. [8]

    Adler and M

    M. Adler and M. Mitzenmacher. Towards compressing web graphs. In Proceedings of Data Compression Conference, DCC 2001. , pages 203–212. IEEE, 2001

  9. [9]

    Akbas and P

    E. Akbas and P. Zhao. Attributed graph clustering: An attribute-aware graph embedding approach. In Proceedings of the 2017 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining 2017 , ASONAM ’17, pages 305–308, New York, NY, USA, 2017. ACM

  10. [10]

    Akbas and P

    E. Akbas and P. Zhao. Graph Clustering Based on Attribute-Aware Graph Embedding, pages 109–131. Springer International Publishing, Cham, 2019

  11. [11]

    Belkin and P

    M. Belkin and P. Niyogi. Laplacian eigenmaps and spectral techniques for embedding and clustering. In Proceedings of the 14th International Conference on Neural Information Processing Systems: Natural and Synthetic, NIPS’01, pages 585–591, Cambridge, MA, USA, 2001. MIT Press

  12. [12]

    Bhagat, G

    S. Bhagat, G. Cormode, and S. Muthukrishnan. Node classification in social networks. In Social network data analytics , pages 115–148. Springer, 2011

  13. [13]

    H. Cai, V. W. Zheng, and K. C.-C. Chang. A comprehensive survey of graph embedding: Problems, techniques, and applications. IEEE Transactions on Knowledge and Data Engineering , 30(9):1616–1637, 2018

  14. [14]

    S. Cao, W. Lu, and Q. Xu. Grarep: Learning graph representations with global structural information. In Proceedings of the 24th ACM International on Conference on Information and Knowledge Management, CIKM ’15, pages 891–900, New York, NY, USA, 2015. ACM

  15. [15]

    S. Cao, W. Lu, and Q. Xu. Deep neural networks for learning graph representations. In Thirtieth AAAI Conference on Artificial Intelligence , 2016

  16. [16]

    H. Chen, B. Perozzi, R. Al-Rfou, and S. Skiena. A tutorial on network embeddings. arXiv preprint arXiv:1808.02590, 2018

  17. [17]

    H. Chen, B. Perozzi, Y. Hu, and S. Skiena. Harp: Hierarchical representation learning for networks. In Thirty-Second AAAI Conference on Artificial Intelligence, 2018

  18. [18]

    P. Cui, X. Wang, J. Pei, and W. Zhu. A survey on network embedding. IEEE Transactions on Knowledge and Data Engineering , 2018

  19. [19]

    Fan, K.-W

    R.-E. Fan, K.-W. Chang, C.-J. Hsieh, X.-R. Wang, and C.-J. Lin. Liblinear: A library for large linear classification. Journal of machine learning research , 9(Aug):1871–1874, 2008

  20. [20]

    A. C. Gilbert and K. Levchenko. Compressing network graphs. In Proceedings of the LinkKDD workshop at the 10th ACM Conference on KDD, International conference on Knowledge discovery and data mining , volume 124, 2004

  21. [21]

    Gligorijevi´ c, M

    V. Gligorijevi´ c, M. Barot, and R. Bonneau. deepnf: Deep network fusion for protein function prediction. bioRxiv, 2017

  22. [22]

    Goyal and E

    P. Goyal and E. Ferrara. Graph embedding techniques, applications, and performance: A survey. Knowledge-Based Systems , 151:78–94, 2018

  23. [23]

    Grover and J

    A. Grover and J. Leskovec. node2vec: Scalable feature learning for networks. In Proceedings of the 22nd ACM SIGKDD international conference on Knowledge discovery and data mining , pages 855–864. ACM, 2016

  24. [24]

    W. L. Hamilton, R. Ying, and J. Leskovec. Representation learning on graphs: Methods and applications. arXiv preprint arXiv:1709.05584 , 2017

  25. [25]

    Letovsky and S

    S. Letovsky and S. Kasif. Predicting protein function from protein/protein interaction data: a probabilistic approach. Bioinformatics, 19(suppl 1):i197–i204, 2003

  26. [26]

    Y. Liu, T. Safavi, A. Dighe, and D. Koutra. Graph summarization methods and applications: A survey. ACM Comput. Surv. , 51(3):62:1–62:34, June 2018

  27. [27]

    G. R. Lopes, M. M. Moro, L. K. Wives, and J. P. M. De Oliveira. Collaboration recommendation on academic social networks. In International conference on conceptual modeling , pages 190–199. Springer, 2010

  28. [28]

    Mikolov, K

    T. Mikolov, K. Chen, G. Corrado, and J. Dean. Efficient estimation of word representations in vector space. In Proceedings of Workshop at ICLR, 2013. , 2013

  29. [29]

    Mikolov, I

    T. Mikolov, I. Sutskever, K. Chen, G. S. Corrado, and J. Dean. Distributed representations of words and phrases and their compositionality. In Advances in neural information processing systems , pages 3111–3119, 2013

  30. [30]

    Niepert, M

    M. Niepert, M. Ahmed, and K. Kutzkov. Learning convolutional neural networks for graphs. In Proceedings of the 33rd International Conference on International Conference on Machine Learning - Volume 48, ICML’16, pages 2014–2023. JMLR.org, 2016

  31. [31]

    Perozzi, R

    B. Perozzi, R. Al-Rfou, and S. Skiena. Deepwalk: Online learning of social representations. In Proceedings of the 20th ACM SIGKDD international conference on Knowledge discovery and data mining , pages 701–710. ACM, 2014

  32. [32]

    J. Qiu, Y. Dong, H. Ma, J. Li, K. Wang, and J. Tang. Network embedding as matrix factorization: Unifying deepwalk, line, pte, and node2vec. In Proceedings of the Eleventh ACM International Conference on Web Search and Data Mining , pages 459–467. ACM, 2018

  33. [33]

    S. T. Roweis and L. K. Saul. Nonlinear dimensionality reduction by locally linear embedding. science, 290(5500):2323–2326, 2000

  34. [34]

    Sharan, I

    R. Sharan, I. Ulitsky, and R. Shamir. Network-based 11 prediction of protein function. Molecular systems biology, 3(1):88, 2007

  35. [35]

    Suel and J

    T. Suel and J. Yuan. Compressing the graph structure of the web. pages 213–222, 02 2001

  36. [36]

    J. Tang, M. Qu, M. Wang, M. Zhang, J. Yan, and Q. Mei. Line: Large-scale information network embedding. In Proceedings of the 24th international conference on world wide web, WWW15 , pages 1067–1077. International World Wide Web Conferences Steering Committee, 2015

  37. [37]

    H. Wang, J. Wang, J. Wang, M. Zhao, W. Zhang, F. Zhang, X. Xie, and M. Guo. Graphgan: Graph representation learning with generative adversarial nets. In Thirty-Second AAAI Conference on Artificial Intelligence, 2018

  38. [38]

    S. Wold, K. Esbensen, and P. Geladi. Principal component analysis. Chemometrics and intelligent laboratory systems, 2(1-3):37–52, 1987

  39. [39]

    R. Ying, J. You, C. Morris, X. Ren, W. L. Hamilton, and J. Leskovec. Hierarchical graph representation learning with differentiable pooling. In Proceedings of the 32Nd International Conference on Neural Information Processing Systems , NIPS’18, pages 4805–4815, USA, 2018. Curran Associates Inc

  40. [40]

    Zhang, J

    D. Zhang, J. Yin, X. Zhu, and C. Zhang. Network Representation Learning: A Survey. IEEE Transactions on Big Data , PP(c):1, 2017

  41. [41]

    F. Zhou. Graph compressiony. In Department of Computer Science and Helsinki Institute for Information Technology HIIT , page 112, 2015. 12