Regional based query in graph active learning

Roy Abel; Yoram Louzoun

arxiv: 1906.08541 · v1 · pith:UAZQ5XVLnew · submitted 2019-06-20 · 💻 cs.LG · stat.ML

Regional based query in graph active learning

Roy Abel , Yoram Louzoun This is my paper

Pith reviewed 2026-05-25 19:50 UTC · model grok-4.3

classification 💻 cs.LG stat.ML

keywords active learninggraph neural networksnode classificationuncertainty samplingpage rankregional queries

0 comments

The pith

Neighbor-based uncertainty criteria outperform single-node uncertainty for active learning on graphs once labeled nodes reach one per average degree.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper develops two selection criteria for choosing which nodes to query in graph node classification with graph convolution networks. Because node classes tend to correlate with neighbors, it replaces the usual per-node uncertainty with measures that look at uncertainty in neighboring nodes. One criterion extends standard uncertainty sampling regionally; the other adapts the page-rank algorithm to the same goal. The page-rank version is shown to be best when very few nodes are labeled, while the regional uncertainty version surpasses all prior methods once the labeled fraction reaches roughly one over the average degree. The approach is presented as potentially applicable to any classification setting that supplies a distance between samples.

Core claim

We propose two regional query criteria for active learning on graphs. The page-rank extension is optimal at low fractions of tagged nodes. When the tagged fraction grows to one over the average degree, the regional uncertainty performs better than all existing methods.

What carries the argument

Regional uncertainty, which selects nodes by the uncertainty in the class of their neighbors rather than by their own class uncertainty.

If this is right

The page-rank extension gives optimal performance when the fraction of tagged nodes is low.
Regional uncertainty becomes the best choice once the labeled fraction reaches one over the average degree.
The same regional criteria can be applied to any classification problem supplied with a distance metric between samples.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Active learning on graphs may gain more from local structure than current per-sample uncertainty methods capture.
A switching rule that uses page-rank at very low label counts and regional uncertainty at moderate counts could combine the strengths of both.
Defining suitable distances on non-graph data might let the same neighbor-based selection improve active learning outside networks.

Load-bearing premise

The class of each node is often related to the class of its neighbors.

What would settle it

An experiment on a graph whose node labels are assigned independently of neighbor labels, showing that neither proposed method beats standard uncertainty sampling, would falsify the advantage.

Figures

Figures reproduced from arXiv: 1906.08541 by Roy Abel, Yoram Louzoun.

**Figure 2.** Figure 2: Accuracy, Micro and Macro F1 as a function of training set fraction in 3 datasets for different learning methods. We tested for three reported datasets multiple precision estimates as a function of the training set fraction. We have tested four algorithms: GXBoost, FFN, GCN and RF. For each algorithm, we tested three types of input, the neighbors tag, topological features of the node and the combination of… view at source ↗

**Figure 3.** Figure 3: Upper plot. Accuracy as a function of sampling fraction for different datasets and different local AL methods. results lower than the random sampling show AL algorithms that are worse than random sampling. The two subplots surrounded by a box are the results with the BOW. Lower plot difference between random sampling and each of the AL methods at the last time point sampled (15 % when no input is used and … view at source ↗

**Figure 4.** Figure 4: Upper plot and lower plot follow figure 3. The only difference is that the upper plot in this figure is the difference with random, so positive values represent higher accuracy than the one obtained in random sampling, while negative accuracy are worse than random sampling. adaptive page rank. Those two measures can be combined with existing active learning techniques to achieve even better performance. Wh… view at source ↗

**Figure 5.** Figure 5: Distance to randomly sampled nodes as a function of the fraction of sampled fraction. Within a sampling fraction of 5 %, the average distance to a sampled node reaches a fixed distance. [2] D. Berberidis and G. B. Giannakis, “Data-adaptive active sampling for efficient graph-cognizant classification,” IEEE Transactions on Signal Processing, vol. 66, no. 19, pp. 5167–5179, 2018. [3] X. Zhu, Z. Ghahramani, a… view at source ↗

read the original abstract

Graph convolution networks (GCN) have emerged as the leading method to classify node classes in networks, and have reached the highest accuracy in multiple node classification tasks. In the absence of available tagged samples, active learning methods have been developed to obtain the highest accuracy using the minimal number of queries to an oracle. The current best active learning methods use the sample class uncertainty as selection criteria. However, in graph based classification, the class of each node is often related to the class of its neighbors. As such, the uncertainty in the class of a node's neighbor may be a more appropriate selection criterion. We here propose two such criteria, one extending the classical uncertainty measure, and the other extending the page-rank algorithm. We show that the latter is optimal when the fraction of tagged nodes is low, and when this fraction grows to one over the average degree, the regional uncertainty performs better than all existing methods. While we have tested this methods on graphs, such methods can be extended to any classification problem, where a distance metrics can be defined between the input samples. All the code used can be accessed at : https://github.com/louzounlab/graph-al All the datasets used can be accessed at : https://github.com/louzounlab/DataSets

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

The paper adds two concrete query rules for graph active learning (regional uncertainty and PageRank adaptation) plus released code, with a claimed crossover in performance as label fraction grows.

read the letter

The main new pieces are the regional uncertainty selector that pulls in neighbor uncertainty and the PageRank-based rule that the authors say works best at very low label fractions. They report that the PageRank version is preferable early and regional uncertainty beats existing methods once the labeled fraction reaches roughly one over average degree. Both build on the standard homophily assumption that node labels correlate with neighbors, which is a reasonable basis for graph data. The abstract presents these as extensions rather than entirely new frameworks, and the crossover claim is specific enough to test directly. What the paper does well is ship the full code and the datasets used, so the reported crossover can be reproduced without needing to reimplement from scratch. That removes one common barrier to checking whether the gains hold up. The argument stays within the usual active-learning setup for node classification and does not introduce circular definitions or free parameters that would make the optimality claims unfalsifiable. The soft spots are modest but real. The abstract supplies no numbers, no list of baselines, and no mention of statistical tests, so the size and consistency of the improvement remain unclear until the experiments are examined. The suggestion that the same ideas could apply to any setting with a distance metric is left undeveloped and adds little. Readers focused on label-efficient node classification on networks will get the most from this. Someone already running active learning on graphs could try the two selectors on their own data in an afternoon thanks to the released code. It is worth sending to peer review because the ideas are straightforward, the reproducibility is present, and the central claim is narrow enough that referees can evaluate it on its own terms without needing major new theory.

Referee Report

2 major / 3 minor

Summary. The manuscript proposes two neighborhood-aware active learning query strategies for GCN-based node classification on graphs. One extends classical uncertainty sampling to a regional measure over a node's neighbors; the other extends the PageRank algorithm. The central empirical claim is that the PageRank variant is optimal at low labeled-node fractions, while the regional-uncertainty variant outperforms existing methods once the labeled fraction reaches approximately one over the average degree. The approach rests on the standard homophily assumption and is accompanied by public code and datasets.

Significance. If the reported crossover in performance holds under proper controls, the work supplies a practical, easily implemented improvement to active learning on graphs that exploits local structure rather than single-node uncertainty. The public release of code and datasets is a clear strength, directly supporting reproducibility of the claimed transition between the two proposed criteria.

major comments (2)

[Experiments] Experiments section (and associated figures/tables): the abstract asserts that the PageRank extension is 'optimal' at low label fractions and that regional uncertainty 'performs better than all existing methods' at the 1/degree threshold, yet no table or figure is referenced that reports accuracy-vs-label-fraction curves, lists the exact baselines (random, uncertainty, core-set, etc.), or supplies error bars or statistical tests across repeated trials; without these the load-bearing performance claims cannot be evaluated.
[Section 3] Section 3 (method definitions): the regional uncertainty and PageRank-extension criteria are described only at a high level; an explicit equation or pseudocode for each selection score is required so that the claimed optimality statements can be verified independently of the released code.

minor comments (3)

[Abstract] Abstract: 'this methods' should be 'these methods'.
[Abstract] The final sentence claims the methods 'can be extended to any classification problem where a distance metric can be defined'; this extension is asserted but never demonstrated or even sketched, so the sentence should be qualified or removed.
[Introduction] The manuscript should add a short related-work paragraph citing recent graph active-learning baselines (e.g., AGE, ANRMAB, or core-set methods) to situate the two new criteria.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive comments on our manuscript. We address each major comment below and commit to revisions that improve clarity and verifiability while preserving the core contributions.

read point-by-point responses

Referee: [Experiments] Experiments section (and associated figures/tables): the abstract asserts that the PageRank extension is 'optimal' at low label fractions and that regional uncertainty 'performs better than all existing methods' at the 1/degree threshold, yet no table or figure is referenced that reports accuracy-vs-label-fraction curves, lists the exact baselines (random, uncertainty, core-set, etc.), or supplies error bars or statistical tests across repeated trials; without these the load-bearing performance claims cannot be evaluated.

Authors: We agree that the experimental claims require clearer support in the text. The manuscript contains figures and tables comparing accuracy versus labeled fraction for the proposed methods against baselines including random sampling and uncertainty sampling, with results averaged over repeated trials. We will revise the Experiments section to explicitly reference these figures/tables, list all baselines, report error bars, and note the number of trials. The public code repository already enables independent reproduction of the curves. revision: yes
Referee: [Section 3] Section 3 (method definitions): the regional uncertainty and PageRank-extension criteria are described only at a high level; an explicit equation or pseudocode for each selection score is required so that the claimed optimality statements can be verified independently of the released code.

Authors: We acknowledge that the current description in Section 3 is high-level. We will add explicit mathematical definitions for the regional uncertainty score (extending per-node uncertainty over neighbors) and the PageRank-extension score, together with pseudocode for the selection procedure. These additions will allow direct verification of the optimality claims without sole reliance on the released code. revision: yes

Circularity Check

0 steps flagged

No significant circularity in derivation chain

full rationale

The paper's central claims rest on the standard homophily assumption in graphs (node labels correlated with neighbors) combined with empirical comparisons of selection criteria at varying label fractions. No equations, derivations, or first-principles results are shown that reduce the claimed optimality of the page-rank extension or regional uncertainty to fitted parameters, self-definitions, or self-citation chains. The methods are presented as extensions of existing uncertainty and PageRank ideas, with results supported by publicly available code and datasets for direct reproduction. This is a standard empirical active-learning study without load-bearing circular steps.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

The central claims rest on the domain assumption that neighbor class information improves query selection; no free parameters or invented entities are mentioned.

axioms (1)

domain assumption The class of each node is often related to the class of its neighbors.
Explicitly invoked in the abstract as the motivation for shifting from single-node to regional or neighbor-based criteria.

pith-pipeline@v0.9.0 · 5748 in / 1111 out tokens · 22051 ms · 2026-05-25T19:50:24.843713+00:00 · methodology

discussion (0)

Reference graph

Works this paper leans on

44 extracted references · 44 canonical work pages · 5 internal anchors

[1]

A variance minimization criterion to active learning on graphs,

M. Ji and J. Han, “A variance minimization criterion to active learning on graphs,” in Artiﬁcial Intelligence and Statistics , pp. 556–564, 2012. Table II RESULTS algorithm Cora CiteSeer accuracy macro-F1 micro-F1 accuracy macro-F1 micro-F1 our random 0.803 0.779 0.803 0.697 0.627 0.697 Kipf (random) [11] 0.801 - - 0.679 - - HNE [30] 0.59* - - - - - TV/MS...

work page 2012
[2]

Data-adaptive active sampling for efﬁcient graph-cognizant classiﬁcation,

D. Berberidis and G. B. Giannakis, “Data-adaptive active sampling for efﬁcient graph-cognizant classiﬁcation,” IEEE Transactions on Signal Processing, vol. 66, no. 19, pp. 5167–5179, 2018

work page 2018
[3]

Semi-supervised learning using gaussian ﬁelds and harmonic functions,

X. Zhu, Z. Ghahramani, and J. D. Lafferty, “Semi-supervised learning using gaussian ﬁelds and harmonic functions,” inProceedings of the 20th International conference on Machine learning (ICML-03) , pp. 912–919, 2003

work page 2003
[4]

Combining active learning and semi-supervised learning using gaussian ﬁelds and harmonic functions,

X. Zhu, J. Lafferty, and Z. Ghahramani, “Combining active learning and semi-supervised learning using gaussian ﬁelds and harmonic functions,” in ICML 2003 workshop on the continuum from labeled to unlabeled data in machine learning and data mining , vol. 3, 2003

work page 2003
[5]

Beyond the point cloud: from transductive to semi-supervised learning,

V . Sindhwani, P. Niyogi, and M. Belkin, “Beyond the point cloud: from transductive to semi-supervised learning,” in Proceedings of the 22nd international conference on Machine learning, pp. 824–831, ACM, 2005

work page 2005
[6]

Semi-supervised learning on riemannian manifolds,

M. Belkin and P. Niyogi, “Semi-supervised learning on riemannian manifolds,” Machine learning, vol. 56, no. 1-3, pp. 209–239, 2004

work page 2004
[7]

Normalized cuts and image segmentation,

J. Shi and J. Malik, “Normalized cuts and image segmentation,” De- partmental Papers (CIS), p. 107, 2000

work page 2000
[8]

Community detection in net- works with node attributes,

J. Yang, J. McAuley, and J. Leskovec, “Community detection in net- works with node attributes,” in2013 IEEE 13th International Conference on Data Mining , pp. 1151–1156, IEEE, 2013

work page 2013
[9]

Topological similarity as a proxy to content similarity,

Y . Rosen and Y . Louzoun, “Topological similarity as a proxy to content similarity,” Journal of Complex Networks, vol. 4, no. 1, pp. 38–60, 2015

work page 2015
[10]

Edge sign prediction based on a combination of network structural topology and sign propagation,

R. Naaman, K. Cohen, and Y . Louzoun, “Edge sign prediction based on a combination of network structural topology and sign propagation,” Journal of Complex Networks , vol. 7, no. 1, pp. 54–66, 2018

work page 2018
[11]

Semi-Supervised Classification with Graph Convolutional Networks

T. N. Kipf and M. Welling, “Semi-supervised classiﬁcation with graph convolutional networks,” arXiv preprint arXiv:1609.02907 , 2016

work page internal anchor Pith review Pith/arXiv arXiv 2016
[12]

Modeling relational data with graph convolutional net- works,

M. Schlichtkrull, T. N. Kipf, P. Bloem, R. Van Den Berg, I. Titov, and M. Welling, “Modeling relational data with graph convolutional net- works,” in European Semantic Web Conference, pp. 593–607, Springer, 2018

work page 2018
[13]

Topological based classification of paper domains using graph convolutional networks

I. Benami, K. Cohen, O. Nagar, and Y . Louzoun, “Topological based classiﬁcation of paper domains using graph convolutional networks,” arXiv preprint arXiv:1904.07787 , 2019

work page internal anchor Pith review Pith/arXiv arXiv 1904
[14]

Active learning literature survey,

B. Settles, “Active learning literature survey,” tech. rep., University of Wisconsin-Madison Department of Computer Sciences, 2009

work page 2009
[15]

Heterogeneous uncertainty sampling for supervised learning,

D. D. Lewis and J. Catlett, “Heterogeneous uncertainty sampling for supervised learning,” in Machine learning proceedings 1994 , pp. 148– 156, Elsevier, 1994

work page 1994
[16]

Reducing labeling effort for structured prediction tasks,

A. Culotta and A. McCallum, “Reducing labeling effort for structured prediction tasks,” in AAAI, vol. 5, pp. 746–751, 2005

work page 2005
[17]

An analysis of active learning strategies for sequence labeling tasks,

B. Settles and M. Craven, “An analysis of active learning strategies for sequence labeling tasks,” in Proceedings of the conference on empirical methods in natural language processing , pp. 1070–1079, Association for Computational Linguistics, 2008

work page 2008
[18]

Graph Convolutional Matrix Completion

R. v. d. Berg, T. N. Kipf, and M. Welling, “Graph convolutional matrix completion,” arXiv preprint arXiv:1706.02263 , 2017

work page internal anchor Pith review Pith/arXiv arXiv 2017
[19]

node2vec: Scalable feature learning for networks,

A. Grover and J. Leskovec, “node2vec: Scalable feature learning for networks,” in Proceedings of the 22nd ACM SIGKDD international conference on Knowledge discovery and data mining , pp. 855–864, ACM, 2016

work page 2016
[20]

Active hidden markov models for information extraction,

T. Scheffer, C. Decomain, and S. Wrobel, “Active hidden markov models for information extraction,” in International Symposium on Intelligent Data Analysis, pp. 309–318, Springer, 2001

work page 2001
[21]

A mathematical theory of communication,

C. E. Shannon, “A mathematical theory of communication,” Bell system technical journal, vol. 27, no. 3, pp. 379–423, 1948

work page 1948
[22]

Selective sampling for example-based word sense disambiguation,

A. Fujii, T. Tokunaga, K. Inui, and H. Tanaka, “Selective sampling for example-based word sense disambiguation,” Computational Linguistics, vol. 24, no. 4, pp. 573–597, 1998

work page 1998
[23]

Active learning using pre-clustering,

H. T. Nguyen and A. Smeulders, “Active learning using pre-clustering,” in Proceedings of the twenty-ﬁrst international conference on Machine learning, p. 79, ACM, 2004

work page 2004
[24]

Active learning with sampling by uncertainty and density for data annotations,

J. Zhu, H. Wang, B. K. Tsou, and M. Ma, “Active learning with sampling by uncertainty and density for data annotations,” IEEE Transactions on audio, speech, and language processing , vol. 18, no. 6, pp. 1323–1331, 2009

work page 2009
[25]

Using graph-based metrics with empirical risk min- imization to speed up active learning on networked data,

S. A. Macskassy, “Using graph-based metrics with empirical risk min- imization to speed up active learning on networked data,” in Proceedings of the 15th ACM SIGKDD international conference on Knowledge discovery and data mining , pp. 597–606, ACM, 2009

work page 2009
[26]

Batch mode active learning for node classiﬁcation in assortative and disassortative networks,

S. Ping, D. Liu, B. Yang, Y . Zhu, H. Chen, and Z. Wang, “Batch mode active learning for node classiﬁcation in assortative and disassortative networks,” IEEE Access, vol. 6, pp. 4750–4758, 2017

work page 2017
[27]

Active Learning for Graph Embedding

H. Cai, V . W. Zheng, and K. C.-C. Chang, “Active learning for graph embedding,” arXiv preprint arXiv:1705.05085 , 2017

work page internal anchor Pith review Pith/arXiv arXiv 2017
[28]

σ-optimality for active learning on gaussian random ﬁelds,

Y . Ma, R. Garnett, and J. Schneider, “σ-optimality for active learning on gaussian random ﬁelds,” in Advances in Neural Information Processing Systems, pp. 2751–2759, 2013

work page 2013
[29]

Active learning of gaussian processes with manifold-preserving graph reduction,

J. Zhou and S. Sun, “Active learning of gaussian processes with manifold-preserving graph reduction,” Neural Computing and Applic- ations, vol. 25, no. 7-8, pp. 1615–1625, 2014

work page 2014
[30]

ActiveHNE: Active Heterogeneous Network Embedding

X. Chen, G. Yu, J. Wang, C. Domeniconi, Z. Li, and X. Zhang, “Activehne: Active heterogeneous network embedding,” arXiv preprint arXiv:1905.05659, 2019

work page internal anchor Pith review Pith/arXiv arXiv 1905
[31]

Self-emergence of knowledge trees: Extraction of the wikipedia hierarchies,

L. Muchnik, R. Itzhack, S. Solomon, and Y . Louzoun, “Self-emergence of knowledge trees: Extraction of the wikipedia hierarchies,” Physical Review E, vol. 76, no. 1, p. 016106, 2007

work page 2007
[32]

Locating inﬂu- ential nodes in complex networks,

F. D. Malliaros, M.-E. G. Rossi, and M. Vazirgiannis, “Locating inﬂu- ential nodes in complex networks,” Scientiﬁc reports, vol. 6, p. 19307, 2016

work page 2016
[33]

The pagerank citation ranking: Bringing order to the web.,

L. Page, S. Brin, R. Motwani, and T. Winograd, “The pagerank citation ranking: Bringing order to the web.,” tech. rep., Stanford InfoLab, 1999

work page 1999
[34]

Automating the construction of internet portals with machine learning,

A. K. McCallum, K. Nigam, J. Rennie, and K. Seymore, “Automating the construction of internet portals with machine learning,” Information Retrieval, vol. 3, no. 2, pp. 127–163, 2000

work page 2000
[35]

Collective classiﬁcation in network data,

P. Sen, G. Namata, M. Bilgic, L. Getoor, B. Galligher, and T. Eliassi- Rad, “Collective classiﬁcation in network data,” AI magazine , vol. 29, no. 3, pp. 93–93, 2008

work page 2008
[36]

Citeseer: An automatic citation indexing system.,

C. L. Giles, K. D. Bollacker, and S. Lawrence, “Citeseer: An automatic citation indexing system.,” in ACM DL, pp. 89–98, 1998

work page 1998
[37]

Graph evolution: Dens- iﬁcation and shrinking diameters,

J. Leskovec, J. Kleinberg, and C. Faloutsos, “Graph evolution: Dens- iﬁcation and shrinking diameters,” ACM Transactions on Knowledge Discovery from Data (TKDD) , vol. 1, no. 1, p. 2, 2007

work page 2007
[38]

SNAP Datasets: Stanford large network dataset collection

J. Leskovec and A. Krevl, “SNAP Datasets: Stanford large network dataset collection.” http://snap.stanford.edu/data, June 2014

work page 2014
[39]

Local higher- order graph clustering,

H. Yin, A. R. Benson, J. Leskovec, and D. F. Gleich, “Local higher- order graph clustering,” in Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining , pp. 555–564, ACM, 2017

work page 2017
[40]

Cora citation network dataset – KONECT,

“Cora citation network dataset – KONECT,” Apr. 2017

work page 2017
[41]

Human wayﬁnding in information networks,

R. West and J. Leskovec, “Human wayﬁnding in information networks,” in Proceedings of the 21st international conference on World Wide Web, pp. 619–628, ACM, 2012

work page 2012
[42]

Wikispeedia: An online game for inferring semantic distances between concepts,

R. West, J. Pineau, and D. Precup, “Wikispeedia: An online game for inferring semantic distances between concepts,” in Twenty-First International Joint Conference on Artiﬁcial Intelligence , 2009

work page 2009
[43]

Relational active learning for link-based classiﬁc- ation,

L. K. McDowell, “Relational active learning for link-based classiﬁc- ation,” in 2015 IEEE International Conference on Data Science and Advanced Analytics (DSAA) , pp. 1–10, IEEE, 2015

work page 2015
[44]

Active learning for networked data,

M. Bilgic, L. Mihalkova, and L. Getoor, “Active learning for networked data,” in Proceedings of the 27th international conference on machine learning (ICML-10), pp. 79–86, 2010

work page 2010

[1] [1]

A variance minimization criterion to active learning on graphs,

M. Ji and J. Han, “A variance minimization criterion to active learning on graphs,” in Artiﬁcial Intelligence and Statistics , pp. 556–564, 2012. Table II RESULTS algorithm Cora CiteSeer accuracy macro-F1 micro-F1 accuracy macro-F1 micro-F1 our random 0.803 0.779 0.803 0.697 0.627 0.697 Kipf (random) [11] 0.801 - - 0.679 - - HNE [30] 0.59* - - - - - TV/MS...

work page 2012

[2] [2]

Data-adaptive active sampling for efﬁcient graph-cognizant classiﬁcation,

D. Berberidis and G. B. Giannakis, “Data-adaptive active sampling for efﬁcient graph-cognizant classiﬁcation,” IEEE Transactions on Signal Processing, vol. 66, no. 19, pp. 5167–5179, 2018

work page 2018

[3] [3]

Semi-supervised learning using gaussian ﬁelds and harmonic functions,

X. Zhu, Z. Ghahramani, and J. D. Lafferty, “Semi-supervised learning using gaussian ﬁelds and harmonic functions,” inProceedings of the 20th International conference on Machine learning (ICML-03) , pp. 912–919, 2003

work page 2003

[4] [4]

Combining active learning and semi-supervised learning using gaussian ﬁelds and harmonic functions,

X. Zhu, J. Lafferty, and Z. Ghahramani, “Combining active learning and semi-supervised learning using gaussian ﬁelds and harmonic functions,” in ICML 2003 workshop on the continuum from labeled to unlabeled data in machine learning and data mining , vol. 3, 2003

work page 2003

[5] [5]

Beyond the point cloud: from transductive to semi-supervised learning,

V . Sindhwani, P. Niyogi, and M. Belkin, “Beyond the point cloud: from transductive to semi-supervised learning,” in Proceedings of the 22nd international conference on Machine learning, pp. 824–831, ACM, 2005

work page 2005

[6] [6]

Semi-supervised learning on riemannian manifolds,

M. Belkin and P. Niyogi, “Semi-supervised learning on riemannian manifolds,” Machine learning, vol. 56, no. 1-3, pp. 209–239, 2004

work page 2004

[7] [7]

Normalized cuts and image segmentation,

J. Shi and J. Malik, “Normalized cuts and image segmentation,” De- partmental Papers (CIS), p. 107, 2000

work page 2000

[8] [8]

Community detection in net- works with node attributes,

J. Yang, J. McAuley, and J. Leskovec, “Community detection in net- works with node attributes,” in2013 IEEE 13th International Conference on Data Mining , pp. 1151–1156, IEEE, 2013

work page 2013

[9] [9]

Topological similarity as a proxy to content similarity,

Y . Rosen and Y . Louzoun, “Topological similarity as a proxy to content similarity,” Journal of Complex Networks, vol. 4, no. 1, pp. 38–60, 2015

work page 2015

[10] [10]

Edge sign prediction based on a combination of network structural topology and sign propagation,

R. Naaman, K. Cohen, and Y . Louzoun, “Edge sign prediction based on a combination of network structural topology and sign propagation,” Journal of Complex Networks , vol. 7, no. 1, pp. 54–66, 2018

work page 2018

[11] [11]

Semi-Supervised Classification with Graph Convolutional Networks

T. N. Kipf and M. Welling, “Semi-supervised classiﬁcation with graph convolutional networks,” arXiv preprint arXiv:1609.02907 , 2016

work page internal anchor Pith review Pith/arXiv arXiv 2016

[12] [12]

Modeling relational data with graph convolutional net- works,

M. Schlichtkrull, T. N. Kipf, P. Bloem, R. Van Den Berg, I. Titov, and M. Welling, “Modeling relational data with graph convolutional net- works,” in European Semantic Web Conference, pp. 593–607, Springer, 2018

work page 2018

[13] [13]

Topological based classification of paper domains using graph convolutional networks

I. Benami, K. Cohen, O. Nagar, and Y . Louzoun, “Topological based classiﬁcation of paper domains using graph convolutional networks,” arXiv preprint arXiv:1904.07787 , 2019

work page internal anchor Pith review Pith/arXiv arXiv 1904

[14] [14]

Active learning literature survey,

B. Settles, “Active learning literature survey,” tech. rep., University of Wisconsin-Madison Department of Computer Sciences, 2009

work page 2009

[15] [15]

Heterogeneous uncertainty sampling for supervised learning,

D. D. Lewis and J. Catlett, “Heterogeneous uncertainty sampling for supervised learning,” in Machine learning proceedings 1994 , pp. 148– 156, Elsevier, 1994

work page 1994

[16] [16]

Reducing labeling effort for structured prediction tasks,

A. Culotta and A. McCallum, “Reducing labeling effort for structured prediction tasks,” in AAAI, vol. 5, pp. 746–751, 2005

work page 2005

[17] [17]

An analysis of active learning strategies for sequence labeling tasks,

B. Settles and M. Craven, “An analysis of active learning strategies for sequence labeling tasks,” in Proceedings of the conference on empirical methods in natural language processing , pp. 1070–1079, Association for Computational Linguistics, 2008

work page 2008

[18] [18]

Graph Convolutional Matrix Completion

R. v. d. Berg, T. N. Kipf, and M. Welling, “Graph convolutional matrix completion,” arXiv preprint arXiv:1706.02263 , 2017

work page internal anchor Pith review Pith/arXiv arXiv 2017

[19] [19]

node2vec: Scalable feature learning for networks,

A. Grover and J. Leskovec, “node2vec: Scalable feature learning for networks,” in Proceedings of the 22nd ACM SIGKDD international conference on Knowledge discovery and data mining , pp. 855–864, ACM, 2016

work page 2016

[20] [20]

Active hidden markov models for information extraction,

T. Scheffer, C. Decomain, and S. Wrobel, “Active hidden markov models for information extraction,” in International Symposium on Intelligent Data Analysis, pp. 309–318, Springer, 2001

work page 2001

[21] [21]

A mathematical theory of communication,

C. E. Shannon, “A mathematical theory of communication,” Bell system technical journal, vol. 27, no. 3, pp. 379–423, 1948

work page 1948

[22] [22]

Selective sampling for example-based word sense disambiguation,

A. Fujii, T. Tokunaga, K. Inui, and H. Tanaka, “Selective sampling for example-based word sense disambiguation,” Computational Linguistics, vol. 24, no. 4, pp. 573–597, 1998

work page 1998

[23] [23]

Active learning using pre-clustering,

H. T. Nguyen and A. Smeulders, “Active learning using pre-clustering,” in Proceedings of the twenty-ﬁrst international conference on Machine learning, p. 79, ACM, 2004

work page 2004

[24] [24]

Active learning with sampling by uncertainty and density for data annotations,

J. Zhu, H. Wang, B. K. Tsou, and M. Ma, “Active learning with sampling by uncertainty and density for data annotations,” IEEE Transactions on audio, speech, and language processing , vol. 18, no. 6, pp. 1323–1331, 2009

work page 2009

[25] [25]

Using graph-based metrics with empirical risk min- imization to speed up active learning on networked data,

S. A. Macskassy, “Using graph-based metrics with empirical risk min- imization to speed up active learning on networked data,” in Proceedings of the 15th ACM SIGKDD international conference on Knowledge discovery and data mining , pp. 597–606, ACM, 2009

work page 2009

[26] [26]

Batch mode active learning for node classiﬁcation in assortative and disassortative networks,

S. Ping, D. Liu, B. Yang, Y . Zhu, H. Chen, and Z. Wang, “Batch mode active learning for node classiﬁcation in assortative and disassortative networks,” IEEE Access, vol. 6, pp. 4750–4758, 2017

work page 2017

[27] [27]

Active Learning for Graph Embedding

H. Cai, V . W. Zheng, and K. C.-C. Chang, “Active learning for graph embedding,” arXiv preprint arXiv:1705.05085 , 2017

work page internal anchor Pith review Pith/arXiv arXiv 2017

[28] [28]

σ-optimality for active learning on gaussian random ﬁelds,

Y . Ma, R. Garnett, and J. Schneider, “σ-optimality for active learning on gaussian random ﬁelds,” in Advances in Neural Information Processing Systems, pp. 2751–2759, 2013

work page 2013

[29] [29]

Active learning of gaussian processes with manifold-preserving graph reduction,

J. Zhou and S. Sun, “Active learning of gaussian processes with manifold-preserving graph reduction,” Neural Computing and Applic- ations, vol. 25, no. 7-8, pp. 1615–1625, 2014

work page 2014

[30] [30]

ActiveHNE: Active Heterogeneous Network Embedding

X. Chen, G. Yu, J. Wang, C. Domeniconi, Z. Li, and X. Zhang, “Activehne: Active heterogeneous network embedding,” arXiv preprint arXiv:1905.05659, 2019

work page internal anchor Pith review Pith/arXiv arXiv 1905

[31] [31]

Self-emergence of knowledge trees: Extraction of the wikipedia hierarchies,

L. Muchnik, R. Itzhack, S. Solomon, and Y . Louzoun, “Self-emergence of knowledge trees: Extraction of the wikipedia hierarchies,” Physical Review E, vol. 76, no. 1, p. 016106, 2007

work page 2007

[32] [32]

Locating inﬂu- ential nodes in complex networks,

F. D. Malliaros, M.-E. G. Rossi, and M. Vazirgiannis, “Locating inﬂu- ential nodes in complex networks,” Scientiﬁc reports, vol. 6, p. 19307, 2016

work page 2016

[33] [33]

The pagerank citation ranking: Bringing order to the web.,

L. Page, S. Brin, R. Motwani, and T. Winograd, “The pagerank citation ranking: Bringing order to the web.,” tech. rep., Stanford InfoLab, 1999

work page 1999

[34] [34]

Automating the construction of internet portals with machine learning,

A. K. McCallum, K. Nigam, J. Rennie, and K. Seymore, “Automating the construction of internet portals with machine learning,” Information Retrieval, vol. 3, no. 2, pp. 127–163, 2000

work page 2000

[35] [35]

Collective classiﬁcation in network data,

P. Sen, G. Namata, M. Bilgic, L. Getoor, B. Galligher, and T. Eliassi- Rad, “Collective classiﬁcation in network data,” AI magazine , vol. 29, no. 3, pp. 93–93, 2008

work page 2008

[36] [36]

Citeseer: An automatic citation indexing system.,

C. L. Giles, K. D. Bollacker, and S. Lawrence, “Citeseer: An automatic citation indexing system.,” in ACM DL, pp. 89–98, 1998

work page 1998

[37] [37]

Graph evolution: Dens- iﬁcation and shrinking diameters,

J. Leskovec, J. Kleinberg, and C. Faloutsos, “Graph evolution: Dens- iﬁcation and shrinking diameters,” ACM Transactions on Knowledge Discovery from Data (TKDD) , vol. 1, no. 1, p. 2, 2007

work page 2007

[38] [38]

SNAP Datasets: Stanford large network dataset collection

J. Leskovec and A. Krevl, “SNAP Datasets: Stanford large network dataset collection.” http://snap.stanford.edu/data, June 2014

work page 2014

[39] [39]

Local higher- order graph clustering,

H. Yin, A. R. Benson, J. Leskovec, and D. F. Gleich, “Local higher- order graph clustering,” in Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining , pp. 555–564, ACM, 2017

work page 2017

[40] [40]

Cora citation network dataset – KONECT,

“Cora citation network dataset – KONECT,” Apr. 2017

work page 2017

[41] [41]

Human wayﬁnding in information networks,

R. West and J. Leskovec, “Human wayﬁnding in information networks,” in Proceedings of the 21st international conference on World Wide Web, pp. 619–628, ACM, 2012

work page 2012

[42] [42]

Wikispeedia: An online game for inferring semantic distances between concepts,

R. West, J. Pineau, and D. Precup, “Wikispeedia: An online game for inferring semantic distances between concepts,” in Twenty-First International Joint Conference on Artiﬁcial Intelligence , 2009

work page 2009

[43] [43]

Relational active learning for link-based classiﬁc- ation,

L. K. McDowell, “Relational active learning for link-based classiﬁc- ation,” in 2015 IEEE International Conference on Data Science and Advanced Analytics (DSAA) , pp. 1–10, IEEE, 2015

work page 2015

[44] [44]

Active learning for networked data,

M. Bilgic, L. Mihalkova, and L. Getoor, “Active learning for networked data,” in Proceedings of the 27th international conference on machine learning (ICML-10), pp. 79–86, 2010

work page 2010