Topology-Aware Active Learning on Graphs

Adrien Weihs; Andrea L. Bertozzi; Harris Hardiman-Mostow; Jack Mauro

arxiv: 2510.25892 · v2 · submitted 2025-10-29 · 💻 cs.LG

Topology-Aware Active Learning on Graphs

Harris Hardiman-Mostow , Jack Mauro , Adrien Weihs , Andrea L. Bertozzi This is my paper

Pith reviewed 2026-05-18 02:32 UTC · model grok-4.3

classification 💻 cs.LG

keywords active learninggraphssemi-supervised learninggraph curvaturelabel propagationcoresetsexploration exploitation

0 comments

The pith

A graph active learning method uses Balanced Forman Curvature to build a coreset of initial labels, trigger the shift from exploration to exploitation, and apply localized rewiring to improve label propagation at low label rates.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper develops an active learning approach for graphs that directly addresses the exploration versus exploitation trade-off when label budgets are small. It selects a representative starting set of nodes via Balanced Forman Curvature to match the graph's cluster structure and adds a data-driven rule to decide when exploration is complete. The method then rewires edges locally around labeled nodes to bring in multiscale neighborhood information for better label spread while keeping the graph sparse. A sympathetic reader would care because this targets the practical problem of accurate node classification when obtaining labels is expensive.

Core claim

The central claim is that a graph-topological approach to active learning, which uses Balanced Forman Curvature for coreset construction to reflect cluster structure and for dynamic switching between exploration and exploitation, combined with a localized rewiring strategy that incorporates multiscale information around labeled nodes, consistently outperforms existing graph-based semi-supervised baselines at low label rates on benchmark classification tasks.

What carries the argument

Balanced Forman Curvature for selecting representative coresets and triggering the exploration-to-exploitation shift, paired with localized graph rewiring to enhance multiscale label propagation while preserving sparsity.

If this is right

Consistent outperformance over existing graph-based semi-supervised baselines at low label rates on benchmark tasks.
Replacement of hand-tuned heuristics with a data-driven curvature signal for deciding when to stop exploration.
Efficient addition of multiscale information around labeled nodes while keeping the overall graph sparse.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same curvature signal might help identify natural communities or bottlenecks even in graphs without labels.
The localized rewiring step could be tested as a general preprocessing tool for other label-propagation algorithms.
Performance gains may be largest on graphs with clear hierarchical or multiscale cluster structure.

Load-bearing premise

Balanced Forman Curvature reliably identifies representative clusters for coreset construction and the localized rewiring improves label propagation without distorting the graph structure or introducing bias.

What would settle it

Experiments on the same benchmark classification tasks at low label rates showing no consistent outperformance over standard graph-based semi-supervised methods after disabling the curvature-based coreset or rewiring steps.

Figures

Figures reproduced from arXiv: 2510.25892 by Adrien Weihs, Andrea L. Bertozzi, Harris Hardiman-Mostow, Jack Mauro.

**Figure 2.** Figure 2: Coreset points chosen by CC and DAC at different iterations of coreset selection on the Blobs dataset. CC selects [PITH_FULL_IMAGE:figures/full_fig_p010_2.png] view at source ↗

**Figure 3.** Figure 3: Illustration of our proposed stopping condition [PITH_FULL_IMAGE:figures/full_fig_p011_3.png] view at source ↗

**Figure 4.** Figure 4: Coreset and AL results for Curvature, DAC, and Random on several benchmarks. The left and right columns present [PITH_FULL_IMAGE:figures/full_fig_p014_4.png] view at source ↗

**Figure 5.** Figure 5: Coreset and AL accuracy comparison between our method and DAC (with different radii), where we use each method’s [PITH_FULL_IMAGE:figures/full_fig_p016_5.png] view at source ↗

**Figure 6.** Figure 6: Performance of PWLL-τ with the minimum norm uncertainty acquisition function under different decay schedules. Using curvature as a data-driven metric to inform the value of τ consistently ensures effective transitioning from exploration to exploitation, and removes the need to set K beforehand in a decay schedule. Moreover, the proposed schedule of K = 10 in [29] for FashionMNIST and CIFAR-10 appears to be… view at source ↗

**Figure 7.** Figure 7: Comparison of sequential active learning performance with GBSSL variants on MNIST, FashionMNIST, and CIFAR [PITH_FULL_IMAGE:figures/full_fig_p018_7.png] view at source ↗

read the original abstract

We propose a graph-topological approach to active learning that directly targets the core challenge of exploration versus exploitation under scarce label budgets. To guide exploration, we introduce a coreset construction algorithm based on Balanced Forman Curvature (BFC), which selects representative initial labels that reflect the graph's cluster structure. This method includes a data-driven stopping criterion that signals when the graph has been sufficiently explored. We further use BFC to dynamically trigger the shift from exploration to exploitation within active learning routines, replacing hand-tuned heuristics. To improve exploitation, we introduce a localized graph rewiring strategy that efficiently incorporates multiscale information around labeled nodes, enhancing label propagation while preserving sparsity. Experiments on benchmark classification tasks show that our methods consistently outperform existing graph-based semi-supervised baselines at low label rates.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

The paper sketches a BFC-driven active learning scheme for graphs that targets exploration-exploitation with coreset selection, dynamic switching, and localized rewiring, but the abstract alone leaves the empirical claims difficult to assess.

read the letter

The main point your colleague should know is that the authors are trying to improve active learning on graphs by leaning on Balanced Forman Curvature to guide both the initial selection of labels and the timing of when to move from broad exploration to focused exploitation. They introduce a coreset algorithm that picks representative nodes based on BFC to better capture the graph's cluster structure. There's also a data-driven stopping criterion to tell when enough exploration has happened. BFC is used again to dynamically decide the switch to exploitation, which replaces the usual hand-tuned parameters. On the exploitation side, they propose a localized rewiring that brings in multiscale information around the labeled nodes to help with label propagation, all while trying to keep the graph sparse. This approach does a decent job of addressing the exploration-exploitation tradeoff in a topology-aware way, which is relevant for semi-supervised learning on networks where labels are expensive. It builds directly on prior work with curvature concepts and applies them to this new setting in a specific combination. The soft spots come down to the limited information available. Since we only have the abstract, there's no view into the actual experimental protocol, the choice of baselines, the number of trials, or any error bars or statistical analysis. The claim that the methods consistently outperform existing baselines at low label rates is stated, but without those details it's not possible to verify how solid the evidence is or if there were any post-hoc adjustments. The assumption that BFC reliably identifies good clusters for the coreset and that the rewiring improves propagation without issues also needs checking against the full methods section. This paper would be of interest to researchers in graph machine learning who focus on active learning or who use geometric properties like curvature in their algorithms. A reader looking for new techniques in low-label regimes on graphs might find the ideas worth exploring. Overall, the thinking seems straightforward and engaged with the problem, even if the current writeup is high-level. I recommend sending it for peer review once the full paper with experiments and code is available, as the core ideas could be useful if the results check out.

Referee Report

1 major / 1 minor

Summary. The paper proposes a topology-aware active learning approach for graphs that uses Balanced Forman Curvature (BFC) to construct a coreset of representative initial labels reflecting cluster structure, incorporates a data-driven stopping criterion to signal sufficient exploration, employs BFC to dynamically trigger the shift from exploration to exploitation (replacing hand-tuned heuristics), and introduces a localized graph rewiring strategy to incorporate multiscale information around labeled nodes for improved label propagation while preserving sparsity. The central empirical claim is that the proposed methods consistently outperform existing graph-based semi-supervised baselines on benchmark classification tasks at low label rates.

Significance. If the empirical results are substantiated with full experimental details, the work could offer a meaningful contribution to active learning on graphs by integrating topological curvature measures to address the exploration-exploitation tradeoff in a more principled, data-driven manner. The extension of prior curvature concepts to coreset selection and dynamic switching, along with the sparsity-preserving rewiring, represents a potentially useful direction for low-label regimes.

major comments (1)

Abstract: the claim that the methods 'consistently outperform existing graph-based semi-supervised baselines at low label rates' is presented without any details on experimental setup, baselines, number of trials, statistical tests, label-rate definitions, or potential post-hoc analysis choices, rendering the central empirical claim impossible to evaluate from the provided manuscript.

minor comments (1)

Abstract: the acronym BFC is introduced without a brief inline definition or citation to the underlying Balanced Forman Curvature formulation, which may reduce accessibility for readers new to curvature-based graph methods.

Simulated Author's Rebuttal

1 responses · 0 unresolved

We thank the referee for their constructive comments and the recommendation for major revision. We address the single major comment below and will incorporate the suggested changes in the revised manuscript.

read point-by-point responses

Referee: [—] Abstract: the claim that the methods 'consistently outperform existing graph-based semi-supervised baselines at low label rates' is presented without any details on experimental setup, baselines, number of trials, statistical tests, label-rate definitions, or potential post-hoc analysis choices, rendering the central empirical claim impossible to evaluate from the provided manuscript.

Authors: We agree that the abstract, standing alone, does not supply sufficient experimental context to allow direct evaluation of the central claim. The full manuscript contains a dedicated Experiments section with these details (datasets, baselines, label-rate definitions, trial counts, and statistical procedures). To address the referee's concern, we will revise the abstract to concisely incorporate key elements of the experimental setup—specifically the benchmark datasets, primary baselines, the range of low label rates examined, the number of independent trials, and a brief reference to statistical evaluation—while preserving the abstract's length and focus. This change will make the empirical claim more transparent and evaluable from the abstract itself. revision: yes

Circularity Check

0 steps flagged

No significant circularity detected

full rationale

The provided abstract describes a new application of Balanced Forman Curvature for coreset selection, a data-driven stopping rule, and localized rewiring in active learning, with performance claims resting on benchmark experiments. No equations, derivations, fitted parameters renamed as predictions, or self-citation chains appear in the text. The central claims are empirical outperformance on external classification tasks rather than any first-principles result that reduces to its own inputs by construction. With only the abstract available and no load-bearing internal derivation visible, the work is self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

With only the abstract available, no explicit free parameters, axioms, or invented entities are identifiable. The work relies on Balanced Forman Curvature (presumably from prior literature) and standard assumptions of graph-based semi-supervised learning.

pith-pipeline@v0.9.0 · 5633 in / 1194 out tokens · 45681 ms · 2026-05-18T02:32:21.003978+00:00 · methodology

discussion (0)

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Foundation/AlexanderDuality.lean alexander_duality_circle_linking unclear

?

unclear
Relation between the paper passage and the cited Recognition theorem.

Ric(i, j) := −2 + 2/di + 2/dj + ... (BFC definition)

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

51 extracted references · 51 canonical work pages · 1 internal anchor

[1]

X. Zhu, Z. Ghahramani, J. Lafferty, Semi- supervised learning using gaussian fields and harmonic functions, in: Proceedings of the Twentieth International Conference on Inter- national Conference on Machine Learning, ICML’03, AAAI Press, 2003, p. 912–919

work page 2003
[2]

Calder, B

J. Calder, B. Cook, M. Thorpe, D. Slepcev, Poisson Learning: Graph Based Semi- Supervised Learning At Very Low Label Rates, in: Proceedings of the 37th Inter- national Conference on Machine Learning, PMLR, 2020, pp. 1306–1316, iSSN: 2640-3498. URLhttps://proceedings.mlr.press/ v119/calder20a.html

work page 2020
[3]

Chapman, B

J. Chapman, B. Chen, Z. Tan, J. Calder, K. Miller, A. L. Bertozzi, Novel batch active learning approach and its application to syn- thetic aperture radar datasets, in: Algorithms for Synthetic Aperture Radar Imagery XXX, Vol. 12520, SPIE, 2023, pp. 95–110

work page 2023
[4]

Brown, R

J. Brown, R. O’Neill, J. Calder, A. L. Bertozzi, Utilizing contrastive learning for graph-based active learning of sar data, in: Algorithms for Synthetic Aperture Radar Imagery XXX, Vol. 12520, SPIE, 2023, pp. 179–193

work page 2023
[5]

Enwright, H

J. Enwright, H. Hardiman-Mostow, J. Calder, A. Bertozzi, Deep semi-supervised label prop- agation for sar image classification, in: Algo- rithms for Synthetic Aperture Radar Imagery XXX, Vol. 12520, SPIE, 2023, pp. 158–170

work page 2023
[6]

Calder, A

K.Miller, J.Mauro, J.Setiadi, X.Baca, Z.Shi, J. Calder, A. L. Bertozzi, Graph-based active learning for semi-supervised classification of sar data, in: Algorithms for Synthetic Aper- ture Radar Imagery XXIX, Vol. 12095, SPIE, 2022, pp. 126–139

work page 2022
[7]

G. Iyer, J. Chanussot, A. L. Bertozzi, A graph- based approach for data fusion and segmen- tation of multimodal images, IEEE Transac- tions on Geoscience and Remote Sensing 59 (5) (2020) 4419–4429

work page 2020
[8]

A. L. Bertozzi, X. Luo, A. M. Stuart, K. C. Zygalakis, Uncertainty quantification in graph-based classification of high dimensional data, SIAM/ASA Journal on Uncertainty Quantification 6 (2) (2018) 568–595.arXiv: https://doi.org/10.1137/17M1134214, doi:10.1137/17M1134214. URLhttps://doi.org/10.1137/ 17M1134214

work page doi:10.1137/17m1134214 2018
[9]

von Luxburg, A tutorial on spectral clus- tering, Statistics and Computing 17 (4) (2007) 395–416.doi:10.1007/s11222-007-9033-z

U. von Luxburg, A tutorial on spectral clus- tering, Statistics and Computing 17 (4) (2007) 395–416.doi:10.1007/s11222-007-9033-z. 19 URLhttps://doi.org/10.1007/ s11222-007-9033-z

work page doi:10.1007/s11222-007-9033-z 2007
[10]

Alexander J

D. Slepcev, M. Thorpe, Analysis ofp- laplacian regularization in semisupervised learning, SIAM J. Math. Anal. 51 (3) (2019) 2085–2120.doi:10.1137/17M115222X. URLhttps://doi.org/10.1137/ 17M115222X

work page doi:10.1137/17m115222x 2019
[11]

M. M. Dunlop, D. Slepcev, A. M. Stuart, M. Thorpe, Large data and zero noise limits of graph-based semi-supervised learning algo- rithms, Applied and Computational Harmonic Analysis 49 (2) (2020) 655–697.doi:https: //doi.org/10.1016/j.acha.2019.03.005. URLhttps://www.sciencedirect.com/ science/article/pii/S1063520318301398

work page doi:10.1016/j.acha.2019.03.005 2020
[12]

J. Calder, Consistency of lipschitz learning with infinite unlabeled data and finite labeled data, SIAM Journal on Mathematics of Data Science 1 (4) (2019) 780–812.arXiv: https://doi.org/10.1137/18M1199241, doi:10.1137/18M1199241. URLhttps://doi.org/10.1137/ 18M1199241

work page doi:10.1137/18m1199241 2019
[13]

Weihs, M

A. Weihs, M. Thorpe, Consistency of frac- tional graph-laplacian regularization in semisupervised learning with finite labels, SIAM Journal on Mathematical Anal- ysis 56 (4) (2024) 4253–4295.arXiv: https://doi.org/10.1137/23M1559087, doi:10.1137/23M1559087. URLhttps://doi.org/10.1137/ 23M1559087

work page doi:10.1137/23m1559087 2024
[14]

Z. Shi, S. Osher, W. Zhu, Weighted nonlocal laplacian on interpolation from sparse data, Journal of Scientific Computing 73 (2) (2017) 1164–1177. doi:10.1007/s10915-017-0421-z. URLhttps://doi.org/10.1007/ s10915-017-0421-z

work page doi:10.1007/s10915-017-0421-z 2017
[15]

Jeff Calder, Dejan Slepčev, and Matthew Thorpe

J. Calder, D. Slepcev, Properly-weighted graph laplacian for semi-supervised learning, Applied Mathematics & Op- timization 82 (3) (2020) 1111–1159. doi:10.1007/s00245-019-09637-3. URLhttps://doi.org/10.1007/ s00245-019-09637-3

work page doi:10.1007/s00245-019-09637-3 2020
[16]

L. Fei, Y. Xu, X. Fang, J. Yang, Low rank rep- resentation with adaptive distance penalty for semi-supervised subspace classification, Pat- tern recognition 67 (2017) 252–262

work page 2017
[17]

X. Qiao, C. Chen, W. Wang, Q. Peng, A. Gha- far, Efficient l2,1-norm graph for robust semi- supervised classification, Pattern Recognition 169 (2026) 111890

work page 2026
[18]

A. J. Smola, R. Kondor, Kernels and regu- larization on graphs, in: B. Schölkopf, M. K. Warmuth (Eds.), Learning Theory and Kernel Machines, Springer Berlin Heidelberg, Berlin, Heidelberg, 2003, pp. 144–158

work page 2003
[19]

Zhang, R

T. Zhang, R. K. Ando, Analysis of spectral kernel design based semi-supervised learning, in: Y. Weiss, B. Schölkopf, J. Platt (Eds.), Advances in Neural Information Processing Systems, Vol. 18, MIT Press, 2005. URLhttps://proceedings.neurips. cc/paper_files/paper/2005/file/ 35c5a2cb362c4d214156f930e7d13252-Paper. pdf

work page 2005
[20]

Karhadkar, P

K. Karhadkar, P. K. Banerjee, G. Mont- ufar, FoSR: First-order spectral rewiring for addressing oversquashing in GNNs, in: The Eleventh International Conference on Learning Representations, 2023. URLhttps://openreview.net/forum?id= 3YjQfCLdrzz

work page 2023
[21]

Topping, F

J. Topping, F. D. Giovanni, B. P. Chamber- lain, X. Dong, M. M. Bronstein, Understand- ing over-squashing and bottlenecks on graphs via curvature, in: International Conference on Learning Representations, 2022. URLhttps://openreview.net/forum?id= 7UmjRGzp-A

work page 2022
[22]

J. H. Giraldo, K. Skianis, T. Bouwmans, F. D. Malliaros, On the trade-off between over-smoothing and over-squashing in deep graph neural networks, in: Proceedings of the 32nd ACM International Conference on Infor- mation and Knowledge Management, CIKM ’23, Association for Computing Machinery, New York, NY, USA, 2023, p. 566–576. doi:10.1145/3583780.361499...

work page doi:10.1145/3583780.3614997 2023
[23]

Weihs, A

A. Weihs, A. L. Bertozzi, M. Thorpe, Higher- order regularization on hypergraphs, in prepa- ration (2025)

work page 2025
[24]

X. Zhu, J. Lafferty, Z. Ghahramani, Combin- ing active learning and semi-supervised learn- ing using gaussian fields and harmonic func- tions, in: ICML 2003 workshop on the contin- uumfromlabeledtounlabeleddatainmachine learning and data mining, Vol. 3, 2003, pp. 58– 65

work page 2003
[25]

Miller, A

K. Miller, A. L. Bertozzi, Model-Change Ac- tive Learning in Graph-Based Semi-Supervised Learning, Communications on Applied Math- ematics and Computation 6 (2) (2024) 1270– 1298.doi:10.1007/s42967-023-00328-z

work page doi:10.1007/s42967-023-00328-z 2024
[26]

M. Ji, J. Han, A variance minimization crite- rion to active learning on graphs, in: N. D. Lawrence, M. Girolami (Eds.), Proceedings of the Fifteenth International Conference on Artificial Intelligence and Statistics, Vol. 22 of Proceedings of Machine Learning Research, PMLR, La Palma, Canary Islands, 2012, pp. 556–564. URLhttps://proceedings.mlr.press/ v...

work page 2012
[27]

Settles, Active learning literature survey (2009)

B. Settles, Active learning literature survey (2009)

work page 2009
[28]

B. Chen, K. Miller, A. L. Bertozzi, J. Schwenk, Batch active learning for multispectral and hy- perspectral image segmentation using similar- ity graphs, Communications on Applied Math- ematics and Computation 6 (2) (2024) 1013– 1033

work page 2024
[29]

Miller, J

K. Miller, J. Calder, Poisson reweighted lapla- cian uncertainty sampling for graph-based ac- tive learning, SIAM Journal on Mathematics of Data Science 5 (4) (2023) 1160–1190

work page 2023
[30]

A. Feng, M. Weber, Graph Pooling via Ricci Flow, Transactions on Machine Learning Re- search (2024)

work page 2024
[31]

Merkurjev, D

E. Merkurjev, D. D. Nguyen, G.-W. Wei, Multiscale laplacian learning, Applied In- telligence 53 (12) (2023) 15727–15746. doi:10.1007/s10489-022-04333-2. URLhttps://doi.org/10.1007/ s10489-022-04333-2

work page doi:10.1007/s10489-022-04333-2 2023
[32]

Weihs, A

A. Weihs, A. L. Bertozzi, M. Thorpe, Analysis of semi-supervised learning on hypergraphs, in preparation (2025)

work page 2025
[33]

Ollivier, A survey of ricci curvature for met- ric spaces and markov chains, in: Probabilistic approach to geometry, Vol

Y. Ollivier, A survey of ricci curvature for met- ric spaces and markov chains, in: Probabilistic approach to geometry, Vol. 57, Mathematical Society of Japan, 2010, pp. 343–382

work page 2010
[34]

Y. Tian, Z. Lubberts, M. Weber, Curvature- based clustering on graphs, Journal of Machine Learning Research 26 (52) (2025) 1–67

work page 2025
[35]

A. Ng, M. Jordan, Y. Weiss, On spectral clustering: Analysis and an algorithm, in: T. Dietterich, S. Becker, Z. Ghahramani (Eds.), Advances in Neural Information Pro- cessing Systems, Vol. 14, MIT Press, 2001. URLhttps://proceedings.neurips. cc/paper_files/paper/2001/file/ 801272ee79cfde7fa5960571fee36b9b-Paper. pdf

work page 2001
[36]

Garcia Trillos, D

N. Garcia Trillos, D. S. cev, A variational approach to the consistency of spectral clus- tering, Applied and Computational Harmonic Analysis 45 (2) (2018) 239–281.doi:https: //doi.org/10.1016/j.acha.2016.09.003. URLhttps://www.sciencedirect.com/ science/article/pii/S106352031630063X

work page doi:10.1016/j.acha.2016.09.003 2018
[37]

Hoffmann, B

F. Hoffmann, B. Hosseini, A. A. Oberai, A. M. Stuart, Spectral analysis of weighted laplacians arising in data clustering, Ap- plied and Computational Harmonic Anal- ysis 56 (2022) 189–249.doi:https: //doi.org/10.1016/j.acha.2021.07.004. URLhttps://www.sciencedirect.com/ science/article/pii/S1063520321000622

work page doi:10.1016/j.acha.2021.07.004 2022
[38]

X. Zhou, M. Belkin, Semi-supervised learning by higher order regularization, in: G. Gordon, D. Dunson, M. Dudík (Eds.), Proceedings of the Fourteenth International Conference on Artificial Intelligence and Statistics, Vol. 15 of Proceedings of Machine Learning Research, PMLR, Fort Lauderdale, FL, USA, 2011, pp. 892–900. URLhttps://proceedings.mlr.press/ v...

work page 2011
[39]

Z. Wu, S. Pan, F. Chen, G. Long, C. Zhang, S. Y. Philip, A comprehensive survey on graph neural networks, IEEE transactions on neural 21 networks and learning systems 32 (1) (2020) 4–24

work page 2020
[40]

Y. Ma, X. Liu, N. Shah, J. Tang, Is homophily a necessity for graph neural networks?, in: International Conference on Learning Repre- sentations, 2022. URLhttps://openreview.net/forum?id= ucASPPD9GKN

work page 2022
[41]

Y. Ma, R. Garnett, J. Schneider,σ-optimality for active learning on gaussian random fields, Advances in Neural Information Processing Systems 26 (2013)

work page 2013
[42]

Cloninger, H

A. Cloninger, H. N. Mhaskar, Cautious active clustering, Applied and Computational Har- monic Analysis 54 (2021) 44–74

work page 2021
[43]

J. M. Murphy, M. Maggioni, Unsupervised clustering and active learning of hyperspectral images with nonlinear diffusion, IEEE Trans- actions on Geoscience and Remote Sensing 57 (3) (2018) 1829–1845

work page 2018
[44]

C. J. B. Yann LeCun, Corinna Cortes, Mnist handwritten digit database,http://yann. lecun.com/exdb/mnist/(2010)

work page 2010
[45]

H. Xiao, K. Rasul, R. Vollgraf, Fashion- mnist: a novel image dataset for benchmarking machine learning algorithms, arXiv preprint arXiv:1708.07747 (2017)

work page internal anchor Pith review Pith/arXiv arXiv 2017
[46]

Krizhevsky, G

A. Krizhevsky, G. Hinton, et al., Learning multiple layers of features from tiny images, Toronto, ON, Canada (2009)

work page 2009
[47]

D.P.Kingma, M.Welling, Auto-encodingvari- ational bayes, in: 2nd International Confer- ence on Learning Representations, ICLR 2014, Banff, AB, Canada, April 14-16, 2014, Confer- ence Track Proceedings, 2014

work page 2014
[48]

T. Chen, S. Kornblith, M. Norouzi, G. Hinton, A simple framework for contrastive learning of visual representations, in: H. D. III, A. Singh (Eds.), Proceedings of the 37th International Conference on Machine Learning, Vol. 119 of Proceedings of Machine Learning Research, PMLR, 2020, pp. 1597–1607. URLhttps://proceedings.mlr.press/ v119/chen20j.html

work page 2020
[49]

Calder, Graphlearning python package (Jan

J. Calder, Graphlearning python package (Jan. 2022).doi:10.5281/zenodo.5850940. URLhttps://doi.org/10.5281/zenodo. 5850940

work page doi:10.5281/zenodo.5850940 2022
[50]

2921–2926

G.Cohen, S.Afshar, J.Tapson, A.VanSchaik, Emnist: Extending mnist to handwritten let- ters, in: 2017 international joint conference on neural networks (IJCNN), IEEE, 2017, pp. 2921–2926

work page 2017
[51]

reduc- tion parameter

J. Wang, Y. Guo, L. Yang, Y. Wang, Heterophily-aware graph attention network, Pattern Recognition 156 (2024) 110738. Appendix A. Reduction Parameter In order to speed up CC, we proposed a “reduc- tion parameter”r, which reduces the search space by a factor ofrby only considering the top1/r nodes by degree (see Remark 3.3). This aligns with the definition ...

work page 2024

[1] [1]

X. Zhu, Z. Ghahramani, J. Lafferty, Semi- supervised learning using gaussian fields and harmonic functions, in: Proceedings of the Twentieth International Conference on Inter- national Conference on Machine Learning, ICML’03, AAAI Press, 2003, p. 912–919

work page 2003

[2] [2]

Calder, B

J. Calder, B. Cook, M. Thorpe, D. Slepcev, Poisson Learning: Graph Based Semi- Supervised Learning At Very Low Label Rates, in: Proceedings of the 37th Inter- national Conference on Machine Learning, PMLR, 2020, pp. 1306–1316, iSSN: 2640-3498. URLhttps://proceedings.mlr.press/ v119/calder20a.html

work page 2020

[3] [3]

Chapman, B

J. Chapman, B. Chen, Z. Tan, J. Calder, K. Miller, A. L. Bertozzi, Novel batch active learning approach and its application to syn- thetic aperture radar datasets, in: Algorithms for Synthetic Aperture Radar Imagery XXX, Vol. 12520, SPIE, 2023, pp. 95–110

work page 2023

[4] [4]

Brown, R

J. Brown, R. O’Neill, J. Calder, A. L. Bertozzi, Utilizing contrastive learning for graph-based active learning of sar data, in: Algorithms for Synthetic Aperture Radar Imagery XXX, Vol. 12520, SPIE, 2023, pp. 179–193

work page 2023

[5] [5]

Enwright, H

J. Enwright, H. Hardiman-Mostow, J. Calder, A. Bertozzi, Deep semi-supervised label prop- agation for sar image classification, in: Algo- rithms for Synthetic Aperture Radar Imagery XXX, Vol. 12520, SPIE, 2023, pp. 158–170

work page 2023

[6] [6]

Calder, A

K.Miller, J.Mauro, J.Setiadi, X.Baca, Z.Shi, J. Calder, A. L. Bertozzi, Graph-based active learning for semi-supervised classification of sar data, in: Algorithms for Synthetic Aper- ture Radar Imagery XXIX, Vol. 12095, SPIE, 2022, pp. 126–139

work page 2022

[7] [7]

G. Iyer, J. Chanussot, A. L. Bertozzi, A graph- based approach for data fusion and segmen- tation of multimodal images, IEEE Transac- tions on Geoscience and Remote Sensing 59 (5) (2020) 4419–4429

work page 2020

[8] [8]

A. L. Bertozzi, X. Luo, A. M. Stuart, K. C. Zygalakis, Uncertainty quantification in graph-based classification of high dimensional data, SIAM/ASA Journal on Uncertainty Quantification 6 (2) (2018) 568–595.arXiv: https://doi.org/10.1137/17M1134214, doi:10.1137/17M1134214. URLhttps://doi.org/10.1137/ 17M1134214

work page doi:10.1137/17m1134214 2018

[9] [9]

von Luxburg, A tutorial on spectral clus- tering, Statistics and Computing 17 (4) (2007) 395–416.doi:10.1007/s11222-007-9033-z

U. von Luxburg, A tutorial on spectral clus- tering, Statistics and Computing 17 (4) (2007) 395–416.doi:10.1007/s11222-007-9033-z. 19 URLhttps://doi.org/10.1007/ s11222-007-9033-z

work page doi:10.1007/s11222-007-9033-z 2007

[10] [10]

Alexander J

D. Slepcev, M. Thorpe, Analysis ofp- laplacian regularization in semisupervised learning, SIAM J. Math. Anal. 51 (3) (2019) 2085–2120.doi:10.1137/17M115222X. URLhttps://doi.org/10.1137/ 17M115222X

work page doi:10.1137/17m115222x 2019

[11] [11]

M. M. Dunlop, D. Slepcev, A. M. Stuart, M. Thorpe, Large data and zero noise limits of graph-based semi-supervised learning algo- rithms, Applied and Computational Harmonic Analysis 49 (2) (2020) 655–697.doi:https: //doi.org/10.1016/j.acha.2019.03.005. URLhttps://www.sciencedirect.com/ science/article/pii/S1063520318301398

work page doi:10.1016/j.acha.2019.03.005 2020

[12] [12]

J. Calder, Consistency of lipschitz learning with infinite unlabeled data and finite labeled data, SIAM Journal on Mathematics of Data Science 1 (4) (2019) 780–812.arXiv: https://doi.org/10.1137/18M1199241, doi:10.1137/18M1199241. URLhttps://doi.org/10.1137/ 18M1199241

work page doi:10.1137/18m1199241 2019

[13] [13]

Weihs, M

A. Weihs, M. Thorpe, Consistency of frac- tional graph-laplacian regularization in semisupervised learning with finite labels, SIAM Journal on Mathematical Anal- ysis 56 (4) (2024) 4253–4295.arXiv: https://doi.org/10.1137/23M1559087, doi:10.1137/23M1559087. URLhttps://doi.org/10.1137/ 23M1559087

work page doi:10.1137/23m1559087 2024

[14] [14]

Z. Shi, S. Osher, W. Zhu, Weighted nonlocal laplacian on interpolation from sparse data, Journal of Scientific Computing 73 (2) (2017) 1164–1177. doi:10.1007/s10915-017-0421-z. URLhttps://doi.org/10.1007/ s10915-017-0421-z

work page doi:10.1007/s10915-017-0421-z 2017

[15] [15]

Jeff Calder, Dejan Slepčev, and Matthew Thorpe

J. Calder, D. Slepcev, Properly-weighted graph laplacian for semi-supervised learning, Applied Mathematics & Op- timization 82 (3) (2020) 1111–1159. doi:10.1007/s00245-019-09637-3. URLhttps://doi.org/10.1007/ s00245-019-09637-3

work page doi:10.1007/s00245-019-09637-3 2020

[16] [16]

L. Fei, Y. Xu, X. Fang, J. Yang, Low rank rep- resentation with adaptive distance penalty for semi-supervised subspace classification, Pat- tern recognition 67 (2017) 252–262

work page 2017

[17] [17]

X. Qiao, C. Chen, W. Wang, Q. Peng, A. Gha- far, Efficient l2,1-norm graph for robust semi- supervised classification, Pattern Recognition 169 (2026) 111890

work page 2026

[18] [18]

A. J. Smola, R. Kondor, Kernels and regu- larization on graphs, in: B. Schölkopf, M. K. Warmuth (Eds.), Learning Theory and Kernel Machines, Springer Berlin Heidelberg, Berlin, Heidelberg, 2003, pp. 144–158

work page 2003

[19] [19]

Zhang, R

T. Zhang, R. K. Ando, Analysis of spectral kernel design based semi-supervised learning, in: Y. Weiss, B. Schölkopf, J. Platt (Eds.), Advances in Neural Information Processing Systems, Vol. 18, MIT Press, 2005. URLhttps://proceedings.neurips. cc/paper_files/paper/2005/file/ 35c5a2cb362c4d214156f930e7d13252-Paper. pdf

work page 2005

[20] [20]

Karhadkar, P

K. Karhadkar, P. K. Banerjee, G. Mont- ufar, FoSR: First-order spectral rewiring for addressing oversquashing in GNNs, in: The Eleventh International Conference on Learning Representations, 2023. URLhttps://openreview.net/forum?id= 3YjQfCLdrzz

work page 2023

[21] [21]

Topping, F

J. Topping, F. D. Giovanni, B. P. Chamber- lain, X. Dong, M. M. Bronstein, Understand- ing over-squashing and bottlenecks on graphs via curvature, in: International Conference on Learning Representations, 2022. URLhttps://openreview.net/forum?id= 7UmjRGzp-A

work page 2022

[22] [22]

J. H. Giraldo, K. Skianis, T. Bouwmans, F. D. Malliaros, On the trade-off between over-smoothing and over-squashing in deep graph neural networks, in: Proceedings of the 32nd ACM International Conference on Infor- mation and Knowledge Management, CIKM ’23, Association for Computing Machinery, New York, NY, USA, 2023, p. 566–576. doi:10.1145/3583780.361499...

work page doi:10.1145/3583780.3614997 2023

[23] [23]

Weihs, A

A. Weihs, A. L. Bertozzi, M. Thorpe, Higher- order regularization on hypergraphs, in prepa- ration (2025)

work page 2025

[24] [24]

X. Zhu, J. Lafferty, Z. Ghahramani, Combin- ing active learning and semi-supervised learn- ing using gaussian fields and harmonic func- tions, in: ICML 2003 workshop on the contin- uumfromlabeledtounlabeleddatainmachine learning and data mining, Vol. 3, 2003, pp. 58– 65

work page 2003

[25] [25]

Miller, A

K. Miller, A. L. Bertozzi, Model-Change Ac- tive Learning in Graph-Based Semi-Supervised Learning, Communications on Applied Math- ematics and Computation 6 (2) (2024) 1270– 1298.doi:10.1007/s42967-023-00328-z

work page doi:10.1007/s42967-023-00328-z 2024

[26] [26]

M. Ji, J. Han, A variance minimization crite- rion to active learning on graphs, in: N. D. Lawrence, M. Girolami (Eds.), Proceedings of the Fifteenth International Conference on Artificial Intelligence and Statistics, Vol. 22 of Proceedings of Machine Learning Research, PMLR, La Palma, Canary Islands, 2012, pp. 556–564. URLhttps://proceedings.mlr.press/ v...

work page 2012

[27] [27]

Settles, Active learning literature survey (2009)

B. Settles, Active learning literature survey (2009)

work page 2009

[28] [28]

B. Chen, K. Miller, A. L. Bertozzi, J. Schwenk, Batch active learning for multispectral and hy- perspectral image segmentation using similar- ity graphs, Communications on Applied Math- ematics and Computation 6 (2) (2024) 1013– 1033

work page 2024

[29] [29]

Miller, J

K. Miller, J. Calder, Poisson reweighted lapla- cian uncertainty sampling for graph-based ac- tive learning, SIAM Journal on Mathematics of Data Science 5 (4) (2023) 1160–1190

work page 2023

[30] [30]

A. Feng, M. Weber, Graph Pooling via Ricci Flow, Transactions on Machine Learning Re- search (2024)

work page 2024

[31] [31]

Merkurjev, D

E. Merkurjev, D. D. Nguyen, G.-W. Wei, Multiscale laplacian learning, Applied In- telligence 53 (12) (2023) 15727–15746. doi:10.1007/s10489-022-04333-2. URLhttps://doi.org/10.1007/ s10489-022-04333-2

work page doi:10.1007/s10489-022-04333-2 2023

[32] [32]

Weihs, A

A. Weihs, A. L. Bertozzi, M. Thorpe, Analysis of semi-supervised learning on hypergraphs, in preparation (2025)

work page 2025

[33] [33]

Ollivier, A survey of ricci curvature for met- ric spaces and markov chains, in: Probabilistic approach to geometry, Vol

Y. Ollivier, A survey of ricci curvature for met- ric spaces and markov chains, in: Probabilistic approach to geometry, Vol. 57, Mathematical Society of Japan, 2010, pp. 343–382

work page 2010

[34] [34]

Y. Tian, Z. Lubberts, M. Weber, Curvature- based clustering on graphs, Journal of Machine Learning Research 26 (52) (2025) 1–67

work page 2025

[35] [35]

A. Ng, M. Jordan, Y. Weiss, On spectral clustering: Analysis and an algorithm, in: T. Dietterich, S. Becker, Z. Ghahramani (Eds.), Advances in Neural Information Pro- cessing Systems, Vol. 14, MIT Press, 2001. URLhttps://proceedings.neurips. cc/paper_files/paper/2001/file/ 801272ee79cfde7fa5960571fee36b9b-Paper. pdf

work page 2001

[36] [36]

Garcia Trillos, D

N. Garcia Trillos, D. S. cev, A variational approach to the consistency of spectral clus- tering, Applied and Computational Harmonic Analysis 45 (2) (2018) 239–281.doi:https: //doi.org/10.1016/j.acha.2016.09.003. URLhttps://www.sciencedirect.com/ science/article/pii/S106352031630063X

work page doi:10.1016/j.acha.2016.09.003 2018

[37] [37]

Hoffmann, B

F. Hoffmann, B. Hosseini, A. A. Oberai, A. M. Stuart, Spectral analysis of weighted laplacians arising in data clustering, Ap- plied and Computational Harmonic Anal- ysis 56 (2022) 189–249.doi:https: //doi.org/10.1016/j.acha.2021.07.004. URLhttps://www.sciencedirect.com/ science/article/pii/S1063520321000622

work page doi:10.1016/j.acha.2021.07.004 2022

[38] [38]

X. Zhou, M. Belkin, Semi-supervised learning by higher order regularization, in: G. Gordon, D. Dunson, M. Dudík (Eds.), Proceedings of the Fourteenth International Conference on Artificial Intelligence and Statistics, Vol. 15 of Proceedings of Machine Learning Research, PMLR, Fort Lauderdale, FL, USA, 2011, pp. 892–900. URLhttps://proceedings.mlr.press/ v...

work page 2011

[39] [39]

Z. Wu, S. Pan, F. Chen, G. Long, C. Zhang, S. Y. Philip, A comprehensive survey on graph neural networks, IEEE transactions on neural 21 networks and learning systems 32 (1) (2020) 4–24

work page 2020

[40] [40]

Y. Ma, X. Liu, N. Shah, J. Tang, Is homophily a necessity for graph neural networks?, in: International Conference on Learning Repre- sentations, 2022. URLhttps://openreview.net/forum?id= ucASPPD9GKN

work page 2022

[41] [41]

Y. Ma, R. Garnett, J. Schneider,σ-optimality for active learning on gaussian random fields, Advances in Neural Information Processing Systems 26 (2013)

work page 2013

[42] [42]

Cloninger, H

A. Cloninger, H. N. Mhaskar, Cautious active clustering, Applied and Computational Har- monic Analysis 54 (2021) 44–74

work page 2021

[43] [43]

J. M. Murphy, M. Maggioni, Unsupervised clustering and active learning of hyperspectral images with nonlinear diffusion, IEEE Trans- actions on Geoscience and Remote Sensing 57 (3) (2018) 1829–1845

work page 2018

[44] [44]

C. J. B. Yann LeCun, Corinna Cortes, Mnist handwritten digit database,http://yann. lecun.com/exdb/mnist/(2010)

work page 2010

[45] [45]

H. Xiao, K. Rasul, R. Vollgraf, Fashion- mnist: a novel image dataset for benchmarking machine learning algorithms, arXiv preprint arXiv:1708.07747 (2017)

work page internal anchor Pith review Pith/arXiv arXiv 2017

[46] [46]

Krizhevsky, G

A. Krizhevsky, G. Hinton, et al., Learning multiple layers of features from tiny images, Toronto, ON, Canada (2009)

work page 2009

[47] [47]

D.P.Kingma, M.Welling, Auto-encodingvari- ational bayes, in: 2nd International Confer- ence on Learning Representations, ICLR 2014, Banff, AB, Canada, April 14-16, 2014, Confer- ence Track Proceedings, 2014

work page 2014

[48] [48]

T. Chen, S. Kornblith, M. Norouzi, G. Hinton, A simple framework for contrastive learning of visual representations, in: H. D. III, A. Singh (Eds.), Proceedings of the 37th International Conference on Machine Learning, Vol. 119 of Proceedings of Machine Learning Research, PMLR, 2020, pp. 1597–1607. URLhttps://proceedings.mlr.press/ v119/chen20j.html

work page 2020

[49] [49]

Calder, Graphlearning python package (Jan

J. Calder, Graphlearning python package (Jan. 2022).doi:10.5281/zenodo.5850940. URLhttps://doi.org/10.5281/zenodo. 5850940

work page doi:10.5281/zenodo.5850940 2022

[50] [50]

2921–2926

G.Cohen, S.Afshar, J.Tapson, A.VanSchaik, Emnist: Extending mnist to handwritten let- ters, in: 2017 international joint conference on neural networks (IJCNN), IEEE, 2017, pp. 2921–2926

work page 2017

[51] [51]

reduc- tion parameter

J. Wang, Y. Guo, L. Yang, Y. Wang, Heterophily-aware graph attention network, Pattern Recognition 156 (2024) 110738. Appendix A. Reduction Parameter In order to speed up CC, we proposed a “reduc- tion parameter”r, which reduces the search space by a factor ofrby only considering the top1/r nodes by degree (see Remark 3.3). This aligns with the definition ...

work page 2024