Image Classification with Hierarchical Multigraph Networks

Boris Knyazev; Graham W. Taylor; Mohamed R. Amer; Xiao Lin

arxiv: 1907.09000 · v1 · pith:NL253WI5new · submitted 2019-07-21 · 💻 cs.CV · cs.LG

Pith reviewed 2026-05-24 18:32 UTC · model grok-4.3

classification 💻 cs.CV cs.LG

keywords: graph convolutional networks, image classification, superpixels, multigraph networks, hierarchical graphs, MNIST, CIFAR-10, PASCAL

The pith

Hierarchical multigraph networks built from superpixels let GCNs compete with CNNs on image classification, in some cases outperforming them.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper argues that graph convolutional networks (GCNs) can be adapted for image classification by representing each image as a hierarchical multigraph whose nodes are superpixels and whose edges encode multiple relations at different scales. This design exploits two properties of GCNs that standard CNNs do not share: they handle irregular inputs naturally, and they model multirelational structure directly. The authors identify concrete design choices that allow these networks to reach accuracy that is, in some cases, better than that of CNNs on the MNIST, CIFAR-10, and PASCAL datasets. A reader would care because the approach replaces the rigid pixel grid with a more flexible graph representation that could lower computational cost while preserving accuracy. The central demonstration is therefore that domain knowledge for vision can be injected into GCNs through careful graph construction rather than through hardcoded convolutional filters.

Core claim

By constructing hierarchical multigraphs from image superpixels, with edges that represent multiple relations, and applying graph convolutional networks to them, it is possible to classify images at accuracies that in some cases exceed those of convolutional neural networks on MNIST, CIFAR-10, and PASCAL.

What carries the argument

Hierarchical multigraph networks, in which image superpixels become nodes connected by multiple edge types across resolution levels, carrying the spatial and relational information needed for classification.
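
To make "superpixels become nodes" concrete, here is a minimal sketch of how node features might be computed. It assumes a superpixel label map is already available from some segmentation method (SLIC is one option) and uses mean intensity plus centre-of-mass coordinates as features, following the Figure 3 caption. The function name `superpixel_node_features` and the grayscale simplification are illustrative choices, not the authors' implementation.

```python
from collections import defaultdict


def superpixel_node_features(image, labels):
    """Return {label: (mean_intensity, centroid_row, centroid_col)}.

    image  : 2-D list of grayscale intensities (RGB is simplified away here)
    labels : 2-D list of the same shape; labels[r][c] is the superpixel id,
             assumed to come from any segmentation method
    """
    # Per-superpixel accumulators: intensity sum, row sum, column sum, pixel count.
    sums = defaultdict(lambda: [0.0, 0.0, 0.0, 0])
    for r, (img_row, lab_row) in enumerate(zip(image, labels)):
        for c, (value, lab) in enumerate(zip(img_row, lab_row)):
            acc = sums[lab]
            acc[0] += value
            acc[1] += r
            acc[2] += c
            acc[3] += 1
    return {lab: (s / n, rs / n, cs / n) for lab, (s, rs, cs, n) in sums.items()}


# Toy 4x4 image split into two superpixels: left half is 0, right half is 1.
image = [[0.0, 0.0, 1.0, 1.0]] * 4
labels = [[0, 0, 1, 1]] * 4
features = superpixel_node_features(image, labels)
```

Each entry of `features` is one graph node; the 16 pixels of the toy image collapse to 2 nodes, which is the source of the node-count reduction discussed in this review.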

If this is right

GCNs can operate directly on irregular image representations without requiring a fixed rectangular grid.
Multiple edge relations let the model capture distinct kinds of pixel or region interactions in one forward pass.
Hierarchical construction reduces the number of nodes relative to raw pixels and thereby lowers memory and compute demands.
Best-practice design choices for superpixel graphs and edge types transfer across the three evaluated datasets.
The same graph-construction recipe can be applied to other vision tasks that benefit from irregular or multirelational inputs.
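
The two kinds of relation mentioned above can be sketched as two adjacency matrices over the same node set. The paper's edge equations (Eq. 4 and 7 in the Figure 3 caption) are not reproduced on this page, so the forms below are assumptions: a Gaussian kernel on centroid distance for spatial edges, and intersection over union between superpixel masks for hierarchical edges. All names and the value of `sigma` are hypothetical.

```python
import math


def centroid(mask):
    """Centre of mass of a superpixel mask given as a set of (row, col) pixels."""
    n = len(mask)
    return (sum(r for r, _ in mask) / n, sum(c for _, c in mask) / n)


def spatial_edge(p, q, sigma=1.0):
    """Gaussian affinity between two centroids (assumed form, not from the source)."""
    d2 = sum((a - b) ** 2 for a, b in zip(p, q))
    return math.exp(-d2 / sigma ** 2)


def hierarchical_edge(mask_a, mask_b):
    """Intersection over union of two masks (assumed form, not from the source)."""
    union = len(mask_a | mask_b)
    return len(mask_a & mask_b) / union if union else 0.0


# Two fine-scale superpixels (left and right halves of a 2x4 strip) and one
# coarse-scale superpixel that covers the whole strip.
left = {(r, c) for r in range(2) for c in range(0, 2)}
right = {(r, c) for r in range(2) for c in range(2, 4)}
whole = left | right
nodes = [left, right, whole]

# One adjacency matrix per relation type, both indexed by the same node list.
A_spatial = [[spatial_edge(centroid(a), centroid(b), sigma=2.0) for b in nodes]
             for a in nodes]
A_hier = [[hierarchical_edge(a, b) for b in nodes] for a in nodes]
```

In this toy case the two fine superpixels do not overlap, so their hierarchical edge is zero, while each overlaps the coarse superpixel by half. The spatial relation still connects all three nodes with a weight that decays with distance.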

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same superpixel multigraph approach might be tested on video or 3-D data where the underlying structure is already non-grid.
If superpixel quality varies with image content, an adaptive segmentation step could become necessary for consistent performance.
Because the model never sees the raw pixel lattice, it offers a natural testbed for measuring how much translation invariance is truly required for a given dataset.
Scaling the hierarchy to deeper levels or larger images would reveal whether the accuracy gains persist or saturate.

Load-bearing premise

Superpixel graphs together with the chosen multirelational edges preserve enough spatial layout and semantic content from the original pixel image to support high-accuracy classification.

What would settle it

Running the proposed multigraph model on MNIST, CIFAR-10, and PASCAL and finding that its accuracy falls below a standard CNN baseline on every dataset would disprove the claim of comparability or superiority.
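
The criterion above hinges on the word "every": a single dataset where the graph model is at least as accurate as the CNN leaves the claim standing. A small sketch of that check follows; the function name is hypothetical and the accuracy values are placeholders for illustration, not results from the paper.

```python
def claim_refuted(gcn_acc, cnn_acc):
    """Return True only if the GCN is below the CNN baseline on every shared dataset."""
    datasets = sorted(gcn_acc.keys() & cnn_acc.keys())
    if not datasets:
        raise ValueError("no datasets in common")
    return all(gcn_acc[name] < cnn_acc[name] for name in datasets)


# Placeholder accuracies for illustration only; these are NOT results from the paper.
gcn = {"MNIST": 0.90, "CIFAR-10": 0.60, "PASCAL": 0.40}
cnn_all_better = {"MNIST": 0.95, "CIFAR-10": 0.70, "PASCAL": 0.50}
cnn_mixed = {"MNIST": 0.95, "CIFAR-10": 0.70, "PASCAL": 0.35}

refuted = claim_refuted(gcn, cnn_all_better)  # CNN ahead everywhere
not_refuted = claim_refuted(gcn, cnn_mixed)   # GCN ahead on one dataset
```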

Figures

Figures reproduced from arXiv: 1907.09000 by Boris Knyazev, Graham W. Taylor, Mohamed R. Amer, Xiao Lin.

**Figure 1.** Examples of the original images (a-c), defined on a regular grid, and their superpixel representations (d-f) for MNIST (a,d), CIFAR-10 (b,e) and PASCAL (c,f); N is the number of superpixels (nodes in our graphs). GCNs can learn both from images and superpixels due to their flexibility, whereas standard CNNs can learn only from images defined on a regular grid (a-c). The challenge of generalizing convoluti…

**Figure 2.** An example of the “PC” relation type fusion based on a trainable projection (Eq. 3). We first project features onto a common multirelational space, where we fuse them using a fusion operator, such as summation or concatenation. In this work, relation types 1 and 2 can denote spatial and hierarchical (or learned) edges. We also allow for three or more relation types. Convolution is an essential computationa…

**Figure 3.** (top) We compute superpixels at several scales and combine all of them into a single set. (bottom) We then build a graph, where each node corresponds to a superpixel from this set and has features, such as mean RGB color and coordinates of the centres of masses. Using Eq. 4 and 7, we compute spatial (a) and hierarchical (c) edges. Nodes 0 to 300 correspond to the first level of the hierarchy (first scale …

**Figure 4.** Image classification pipeline using our model. Each m-th graph convolutional layer in our model takes the graph G_m = (V_m, E^(r)) and returns a graph with the same nodes and edges. Node features become increasingly global after each subsequent layer as the receptive field increases, while edges are propagated without changes. As a result, after several graph convolutional layers, each node in the graph cont…

**Figure 6.** (a) Number of trainable parameters, # params, in a graph convolutional layer as a function of the number of relations, R. Fusion methods based on trainable projections, including those proposed in our work, have “# params” comparable to the baseline concatenation method while being more powerful in terms of classification (see Tables 1 and 2). (b) Comparison of single edge types, where learned and hierarc…
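
The captions for Figures 2 and 4 describe a layer that projects features for each relation type, fuses them by summation or concatenation, and leaves the edges unchanged. The sketch below is one plausible reading under stated assumptions: each relation propagates features with its own adjacency matrix, applies its own weight matrix, and the results are fused before a ReLU. It is not the paper's Eq. 3, which this page does not reproduce, and every name in it is hypothetical.

```python
def matmul(A, B):
    """Plain-Python matrix product of two 2-D lists."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def relu(X):
    return [[max(0.0, v) for v in row] for row in X]


def multigraph_layer(X, adjacencies, projections, fusion="sum"):
    """One multirelational graph convolution (illustrative sketch).

    X           : N x C_in node features
    adjacencies : list of R adjacency matrices, each N x N
    projections : list of R weight matrices, each C_in x C_out
    fusion      : "sum" or "concat" over the R projected outputs
    """
    per_relation = [matmul(matmul(A, X), W) for A, W in zip(adjacencies, projections)]
    if fusion == "sum":
        fused = [[sum(vals) for vals in zip(*rows)] for rows in zip(*per_relation)]
    elif fusion == "concat":
        fused = [[v for row in rows for v in row] for rows in zip(*per_relation)]
    else:
        raise ValueError("fusion must be 'sum' or 'concat'")
    return relu(fused)


# Three nodes with one feature each and two relation types.
X = [[1.0], [2.0], [3.0]]
A1 = [[1, 1, 0], [1, 1, 1], [0, 1, 1]]  # e.g. spatial neighbours with self-loops
A2 = [[0, 0, 1], [0, 0, 1], [1, 1, 0]]  # e.g. node 2 is the parent of nodes 0 and 1
W1 = [[1.0, 0.0]]
W2 = [[0.0, 1.0]]

out_sum = multigraph_layer(X, [A1, A2], [W1, W2], fusion="sum")
out_cat = multigraph_layer(X, [A1, A2], [W1, W2], fusion="concat")
```

The layer returns only new node features; the same `A1` and `A2` would be passed to the next layer, which matches the caption's statement that edges are propagated without changes. Note that concatenation multiplies the output width by the number of relations, whereas summation keeps it fixed.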

Original abstract

Graph Convolutional Networks (GCNs) are a class of general models that can learn from graph structured data. Despite being general, GCNs are admittedly inferior to convolutional neural networks (CNNs) when applied to vision tasks, mainly due to the lack of domain knowledge that is hardcoded into CNNs, such as spatially oriented translation invariant filters. However, a great advantage of GCNs is the ability to work on irregular inputs, such as superpixels of images. This could significantly reduce the computational cost of image reasoning tasks. Another key advantage inherent to GCNs is the natural ability to model multirelational data. Building upon these two promising properties, in this work, we show best practices for designing GCNs for image classification; in some cases even outperforming CNNs on the MNIST, CIFAR-10 and PASCAL image datasets.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

The paper builds hierarchical multigraph GCNs on superpixels and claims they can match or beat CNNs on MNIST/CIFAR/PASCAL, but the abstract supplies zero numbers or controls to support that.

The core claim here is that GCNs on hierarchical superpixel multigraphs can sometimes outperform CNNs on standard image benchmarks by injecting domain knowledge through graph construction rather than fixed filters. That is the one thing worth noting up front. What the work actually does is take existing GCN machinery, apply it to irregular superpixel inputs, add multirelational edges, and outline some design choices for making this work on vision data. The advantage they emphasize, handling irregular inputs and multirelational structure, is real and could matter for compute savings or richer relational modeling. The abstract is clear that this is an application of prior GCN ideas rather than a new theoretical framework.

The main soft spot is the performance claim itself. The abstract states outperformance without any accuracy numbers, baselines, error bars, or experimental details, so there is no way to judge whether the result holds or what controls were used. The stress-test concern about superpixels discarding fine spatial detail that CNNs exploit is reasonable on its face; nothing in the provided text shows how the multigraph construction recovers absolute layout or prevents boundary errors from hurting accuracy on CIFAR-10 or PASCAL. If the full paper has reproducible tables with proper CNN baselines and ablation on the graph construction, that would change the picture.

As it stands, the paper is for readers already working on graph-based vision pipelines who want to see one concrete way to adapt GCNs to images. It is not yet strong enough for a serious referee without the missing experimental section. I would not bring it to reading group or cite it until the numbers are checked.

Referee Report

2 major / 1 minor

Summary. The paper proposes hierarchical multigraph networks (HMGNs) for image classification. Images are segmented into superpixels forming graphs with multiple relation types; GCN layers operate on these irregular structures to perform classification. The central claim is that this approach incorporates domain knowledge via multirelational edges and can, in some cases, outperform standard CNNs on MNIST, CIFAR-10, and PASCAL while reducing computational cost.

Significance. If the empirical results are rigorously validated with proper controls and baselines, the work would indicate that GCNs on superpixel multigraphs can encode sufficient spatial and semantic information to compete with translation-equivariant CNN filters on standard vision benchmarks. This could support more efficient irregular-domain models for image tasks.

major comments (2)

[Abstract] The headline claim of outperforming CNNs on MNIST, CIFAR-10, and PASCAL supplies no numerical results, baselines, error bars, or experimental controls. This absence makes the central empirical assertion unevaluable and directly undermines assessment of whether the multigraph construction preserves the information CNNs exploit.
[Abstract] The assumption that superpixel graphs plus multirelational edges preserve sufficient spatial and semantic information (stated in the abstract as addressing the lack of hardcoded CNN filters) is load-bearing for the outperformance claim. No explicit positional encoding or coordinate features are described to recover absolute layout lost when superpixel boundaries merge or split regions, and the irregular topology lacks the fixed-grid equivariance of CNN kernels.

minor comments (1)

Notation for the multirelational edge types and hierarchical pooling steps should be defined more clearly with explicit equations.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the careful reading and constructive comments on our manuscript. We address each major comment below and indicate where revisions will be made to the abstract and related sections.

Referee: [Abstract] The headline claim of outperforming CNNs on MNIST, CIFAR-10, and PASCAL supplies no numerical results, baselines, error bars, or experimental controls. This absence makes the central empirical assertion unevaluable and directly undermines assessment of whether the multigraph construction preserves the information CNNs exploit.

Authors: We agree that the abstract should supply concrete numbers to make the claim evaluable. The revised abstract will include our reported accuracies (with standard deviations across runs) on MNIST, CIFAR-10 and PASCAL, together with the CNN baselines used in the experimental section. This will allow readers to assess the empirical support directly from the abstract.

revision: yes

Referee: [Abstract] The assumption that superpixel graphs plus multirelational edges preserve sufficient spatial and semantic information (stated in the abstract as addressing the lack of hardcoded CNN filters) is load-bearing for the outperformance claim. No explicit positional encoding or coordinate features are described to recover absolute layout lost when superpixel boundaries merge or split regions, and the irregular topology lacks the fixed-grid equivariance of CNN kernels.

Authors: The manuscript constructs multiple edge relation types from superpixel adjacency and appearance features; these relations are intended to encode relative spatial layout without requiring a fixed grid. The Figure 3 caption also lists the coordinates of each superpixel's centre of mass among the node features, so position information is available to the model even though the abstract does not mention it. We will add a short clarification in the methods section on how the chosen relations and coordinate features capture positional information.

revision: partial

Circularity Check

0 steps flagged

No circularity; empirical design with no derivation chain or self-referential predictions

The provided abstract and text describe an empirical method for applying GCNs to superpixel graphs for image classification, with performance evaluated on MNIST, CIFAR-10 and PASCAL. No equations, first-principles derivations, fitted parameters renamed as predictions, or load-bearing self-citations appear. Claims rest on experimental results rather than any mathematical reduction to inputs by construction. The derivation chain is therefore self-contained and non-circular.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract-only review supplies no explicit free parameters, axioms, or invented entities; any such elements would require the full manuscript.


[1] [1]

Slic superpixels compared to state-of-the-art superpixel methods

Radhakrishna Achanta, Appu Shaji, Kevin Smith, Aurelien Lucchi, Pascal Fua, and Sabine Süsstrunk. Slic superpixels compared to state-of-the-art superpixel methods. IEEE transactions on pattern analysis and machine intelligence, 34(11):2274–2282, 2012

work page 2012

[2] [2]

Contour detection and hi- erarchical image segmentation

Pablo Arbelaez, Michael Maire, Charless Fowlkes, and Jitendra Malik. Contour detection and hi- erarchical image segmentation. IEEE transactions on pattern analysis and machine intelligence, 33(5):898–916, 2010

work page 2010

[3] [3]

Relational inductive biases, deep learning, and graph networks

Peter W Battaglia, Jessica B Hamrick, Victor Bapst, Alvaro Sanchez-Gonzalez, Vinicius Zam- baldi, Mateusz Malinowski, Andrea Tacchetti, David Raposo, Adam Santoro, Ryan Faulkner, et al. Relational inductive biases, deep learning, and graph networks. arXiv preprint arXiv:1806.01261, 2018

work page internal anchor Pith review Pith/arXiv arXiv 2018

[4] [4]

Translating embeddings for modeling multi-relational data

Antoine Bordes, Nicolas Usunier, Alberto Garcia-Duran, Jason Weston, and Oksana Yakhnenko. Translating embeddings for modeling multi-relational data. In Advances in neural information processing systems, pages 2787–2795, 2013

work page 2013

[5] [5]

Ge- ometric deep learning: going beyond euclidean data

Michael M Bronstein, Joan Bruna, Yann LeCun, Arthur Szlam, and Pierre Vandergheynst. Ge- ometric deep learning: going beyond euclidean data. IEEE Signal Processing Magazine, 34(4): 18–42, 2017

work page 2017

[6] [6]

Spectral networks and locally connected networks on graphs

Joan Bruna, Wojciech Zaremba, Arthur Szlam, and Yann LeCun. Spectral networks and locally connected networks on graphs. InInternational Conference on Learning Representations (ICLR), 2014

work page 2014

[7] [7]

Iterative visual reasoning beyond convo- lutions

Xinlei Chen, Li-Jia Li, Li Fei-Fei, and Abhinav Gupta. Iterative visual reasoning beyond convo- lutions. In Proc. CVPR, 2018

work page 2018

[8] [8]

Convolutional neural networks on graphs with fast localized spectral ﬁltering

Michaël Defferrard, Xavier Bresson, and Pierre Vandergheynst. Convolutional neural networks on graphs with fast localized spectral ﬁltering. In Advances in Neural Information Processing Systems, pages 3844–3852, 2016

work page 2016

[9] [9]

Weighted graph cuts without eigenvectors a multilevel approach

Inderjit S Dhillon, Yuqiang Guan, and Brian Kulis. Weighted graph cuts without eigenvectors a multilevel approach. IEEE transactions on pattern analysis and machine intelligence , 29(11), 2007

work page 2007

[10] [10]

The pascal visual object classes (voc) challenge

Mark Everingham, Luc Van Gool, Christopher KI Williams, John Winn, and Andrew Zisserman. The pascal visual object classes (voc) challenge. International journal of computer vision, 88(2): 303–338, 2010

work page 2010

[11] [11]

Splinecnn: Fast geomet- ric deep learning with continuous b-spline kernels

Matthias Fey, Jan Eric Lenssen, Frank Weichert, and Heinrich Müller. Splinecnn: Fast geomet- ric deep learning with continuous b-spline kernels. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 869–877, 2018

work page 2018

[12] [12]

Graph U-Net

Hongyang Gao and Shuiwang Ji. Graph U-Net. In Proceedings of the 36th International Confer- ence on Machine Learning (ICML), 2019

work page 2019

[13] [13]

Neural message passing for quantum chemistry

Justin Gilmer, Samuel S Schoenholz, Patrick F Riley, Oriol Vinyals, and George E Dahl. Neural message passing for quantum chemistry. In Proceedings of the 34th International Conference on Machine Learning (ICML), pages 1263–1272, 2017

work page 2017

[14] [14]

Representation Learning on Graphs: Methods and Applications

William L Hamilton, Rex Ying, and Jure Leskovec. Representation learning on graphs: Methods and applications. arXiv preprint arXiv:1709.05584, 2017. 12 B. KNY AZEV , X. LIN, M.R. AMER, G.W. TA YLOR: IMAGE CLASS., HIER. MULTIGRAPHS

work page internal anchor Pith review Pith/arXiv arXiv 2017

[15] [15]

Deep Convolutional Networks on Graph-Structured Data

Mikael Henaff, Joan Bruna, and Yann LeCun. Deep convolutional networks on graph-structured data. arXiv preprint arXiv:1506.05163, 2015

work page internal anchor Pith review Pith/arXiv arXiv 2015

[16] [16]

Batch normalization: Accelerating deep network training by reducing internal covariate shift

Sergey Ioffe and Christian Szegedy. Batch normalization: Accelerating deep network training by reducing internal covariate shift. In Proceedings of the 32nd International Conference on Machine Learning (ICML), 2015

work page 2015

[17] [17]

Graph-based isometry invariant representation learning

Renata Khasanova and Pascal Frossard. Graph-based isometry invariant representation learning. In Proceedings of the 34th International Conference on Machine Learning-Volume 70 , pages 1847–1856. JMLR. org, 2017

work page 2017

[18] [18]

Adam: A method for stochastic optimization

Diederik P Kingma and Jimmy Ba. Adam: A method for stochastic optimization. InInternational Conference on Learning Representations (ICLR), 2015

work page 2015

[19] [19]

Semi-supervised classiﬁcation with graph convolutional net- works

Thomas N Kipf and Max Welling. Semi-supervised classiﬁcation with graph convolutional net- works. In International Conference on Learning Representations (ICLR), 2017

work page 2017

[20] [20]

Spectral multigraph net- works for discovering and fusing relationships in molecules

Boris Knyazev, Xiao Lin, Mohamed R Amer, and Graham W Taylor. Spectral multigraph net- works for discovering and fusing relationships in molecules. In NeurIPS Workshop on Machine Learning for Molecules and Materials, 2018

work page 2018

[21] [21]

On valid optimal assignment kernels and applications to graph classiﬁcation

Nils M Kriege, Pierre-Louis Giscard, and Richard Wilson. On valid optimal assignment kernels and applications to graph classiﬁcation. In Advances in Neural Information Processing Systems, pages 1623–1631, 2016

work page 2016

[22] [22]

Learning multiple layers of features from tiny images

Alex Krizhevsky. Learning multiple layers of features from tiny images. Technical report, Cite- seer, 2009

work page 2009

[23] [23]

Gradient-based learning applied to document recognition

Yann LeCun, Léon Bottou, Yoshua Bengio, and Patrick Haffner. Gradient-based learning applied to document recognition. Proceedings of the IEEE, 86(11):2278–2324, 1998

work page 1998

[24] [24]

Semantic object parsing with graph LSTM

Xiaodan Liang, Xiaohui Shen, Jiashi Feng, Liang Lin, and Shuicheng Yan. Semantic object parsing with graph LSTM. In European Conference on Computer Vision, pages 125–143. Springer, 2016

work page 2016

[25] [25]

Visual relationship detection with language priors

Cewu Lu, Ranjay Krishna, Michael Bernstein, and Li Fei-Fei. Visual relationship detection with language priors. In European Conference on Computer Vision, 2016

work page 2016

[26] [26]

Geometric deep learning on graphs and manifolds using mixture model CNNs

Federico Monti, Davide Boscaini, Jonathan Masci, Emanuele Rodola, Jan Svoboda, and Michael M Bronstein. Geometric deep learning on graphs and manifolds using mixture model CNNs. In Proc. CVPR, volume 1, page 3, 2017

work page 2017

[27] [27]

Learning convolutional neural networks for graphs

Mathias Niepert, Mohamed Ahmed, and Konstantin Kutzkov. Learning convolutional neural networks for graphs. In Proceedings of the 33rd International Conference on Machine Learning (ICML), pages 2014–2023, 2016

work page 2016

[28] [28]

Attribute-graph: A graph based approach to image ranking

Nikita Prabhu and R Venkatesh Babu. Attribute-graph: A graph based approach to image ranking. In Proceedings of the IEEE International Conference on Computer Vision, pages 1071–1079, 2015

work page 2015

[29] [29]

Modeling relational data with graph convolutional networks

Michael Schlichtkrull, Thomas N Kipf, Peter Bloem, Rianne van den Berg, Ivan Titov, and Max Welling. Modeling relational data with graph convolutional networks. In European Semantic Web Conference, pages 593–607. Springer, 2018

work page 2018

[30] [30]

Weisfeiler-Lehman graph kernels

Nino Shervashidze, Pascal Schweitzer, Erik Jan van Leeuwen, Kurt Mehlhorn, and Karsten M Borgwardt. Weisfeiler-Lehman graph kernels. Journal of Machine Learning Research, 12(Sep):2539–2561, 2011

work page 2011

[31] [31]

Dynamic edge-conditioned filters in convolutional neural networks on graphs

Martin Simonovsky and Nikos Komodakis. Dynamic edge-conditioned filters in convolutional neural networks on graphs. In Proc. CVPR, 2017

work page 2017

[32] [32]

GraphVAE: Towards generation of small graphs using variational autoencoders

Martin Simonovsky and Nikos Komodakis. GraphVAE: Towards generation of small graphs using variational autoencoders. In International Conference on Artificial Neural Networks, pages 412–

work page

[33] [33]

Striving for Simplicity: The All Convolutional Net

Jost Tobias Springenberg, Alexey Dosovitskiy, Thomas Brox, and Martin Riedmiller. Striving for simplicity: The all convolutional net. arXiv preprint arXiv:1412.6806, 2014

work page 2014

[34] [34]

Dropout: a simple way to prevent neural networks from overfitting

Nitish Srivastava, Geoffrey Hinton, Alex Krizhevsky, Ilya Sutskever, and Ruslan Salakhutdinov. Dropout: a simple way to prevent neural networks from overfitting. The Journal of Machine Learning Research, 15(1):1929–1958, 2014

work page 2014

[35] [35]

Graph attention networks

Petar Veličković, Guillem Cucurull, Arantxa Casanova, Adriana Romero, Pietro Liò, and Yoshua Bengio. Graph attention networks. In International Conference on Learning Representations (ICLR), 2018

work page 2018

[36] [36]

Deep graph infomax

Petar Veličković, William Fedus, William L Hamilton, Pietro Liò, Yoshua Bengio, and R Devon Hjelm. Deep graph infomax. In International Conference on Learning Representations (ICLR), 2019

work page 2019

[37] [37]

How powerful are graph neural networks?

Keyulu Xu, Weihua Hu, Jure Leskovec, and Stefanie Jegelka. How powerful are graph neural networks? In International Conference on Learning Representations (ICLR), 2019

work page 2019

[38] [38]

Deep graph kernels

Pinar Yanardag and SVN Vishwanathan. Deep graph kernels. In Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 1365–

work page

[39] [39]

Hierarchical graph representation learning with differentiable pooling

Zhitao Ying, Jiaxuan You, Christopher Morris, Xiang Ren, Will Hamilton, and Jure Leskovec. Hierarchical graph representation learning with differentiable pooling. In Advances in Neural Information Processing Systems, pages 4805–4815, 2018

work page 2018