Temporal Motif Signatures for Temporal Graph Neural Networks
Pith reviewed 2026-06-28 17:05 UTC · model grok-4.3
The pith
A compact 13-feature motif map captures predictive patterns that temporal GNNs miss.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Real temporal interaction streams carry predictive structure in short-horizon motif patterns -- repetition, reciprocity, star diversity, triadic flow -- that vanilla temporal graph neural networks (TGNNs) often fail to expose to their edge scorers. We show this concretely on MOOC interaction prediction, where a small four-feature family of past-window star counts already delivers most of the lift over a strong static GNN. Across a wide set of real and synthetic temporal datasets we find that motif activity organizes consistently along three scale-stable axes (dyadic recency/reciprocity, star diversity, triadic flow), and we use this empirical structure to design a compact 13-coordinate, leakage-safe, candidate-local motif feature map h(u, v, t) that linearly embeds into any static or temporal encoder without architectural changes.
What carries the argument
The 13-coordinate leakage-safe candidate-local motif feature map h(u, v, t) derived from three scale-stable axes of motif activity.
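As a rough illustration of what a leakage-safe, candidate-local map can look like, the sketch below computes five past-window coordinates for a candidate pair (u, v) at time t, spread over the three axes. The `Event` record, the coordinate names, and the window length `delta` are assumptions made for illustration; this page does not list the paper's actual 13 coordinates, so this should not be read as the authors' definition of h.

```python
from collections import namedtuple

# Hypothetical event record: a directed interaction src -> dst at time ts.
Event = namedtuple("Event", ["src", "dst", "ts"])


def motif_features(events, u, v, t, delta):
    """Illustrative candidate-local motif features for the pair (u, v) at time t.

    Only events with t - delta <= ts < t are used, so nothing at or after
    the query time can enter the features.
    """
    window = [e for e in events if t - delta <= e.ts < t]

    # Dyadic axis: repetition of u -> v and reciprocity v -> u.
    repeat_uv = sum(1 for e in window if e.src == u and e.dst == v)
    recip_vu = sum(1 for e in window if e.src == v and e.dst == u)

    # Star axis: number of distinct partners around each endpoint.
    star_out_u = len({e.dst for e in window if e.src == u})
    star_in_v = len({e.src for e in window if e.dst == v})

    # Triadic axis: intermediaries w with some u -> w strictly before some w -> v.
    first_uw, last_wv = {}, {}
    for e in window:
        if e.src == u and e.dst not in (u, v):
            first_uw[e.dst] = min(e.ts, first_uw.get(e.dst, e.ts))
        if e.dst == v and e.src not in (u, v):
            last_wv[e.src] = max(e.ts, last_wv.get(e.src, e.ts))
    triadic_flow = sum(
        1 for w, ts in first_uw.items() if w in last_wv and ts < last_wv[w]
    )

    return {
        "repeat_uv": repeat_uv,
        "recip_vu": recip_vu,
        "star_out_u": star_out_u,
        "star_in_v": star_in_v,
        "triadic_flow": triadic_flow,
    }


# Toy stream: the last event happens after the query time t=5 and is ignored.
events = [
    Event("a", "b", 1),
    Event("b", "a", 2),
    Event("a", "c", 3),
    Event("c", "b", 4),
    Event("a", "b", 10),
]
features = motif_features(events, "a", "b", t=5, delta=10)
```

The strict inequality `ts < t` is the design choice that matters here: the candidate edge itself, and anything simultaneous with it, stays outside the window.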
Load-bearing premise
Motif activity organizes consistently along three scale-stable axes across real and synthetic temporal datasets.
What would settle it
A temporal dataset in which the three axes fail to organize observed motif counts or in which the 13 features produce no lift on any baseline TGNN for link prediction.
Original abstract
Real temporal interaction streams carry predictive structure in short-horizon motif patterns -- repetition, reciprocity, star diversity, triadic flow -- that vanilla temporal graph neural networks (TGNNs) often fail to expose to their edge scorers. We show this concretely on MOOC interaction prediction, where a small four-feature family of past-window star counts already delivers most of the lift over a strong static GNN. Across a wide set of real and synthetic temporal datasets we find that motif activity organizes consistently along three scale-stable axes (dyadic recency/reciprocity, star diversity, triadic flow), and we use this empirical structure to design a compact 13-coordinate, leakage-safe, candidate-local motif feature map h(u, v, t) that linearly embeds into any static or temporal encoder without architectural changes. A temporal Weisfeiler-Leman (WL) analysis places the augmentation relative to the first level of an anchored temporal-WL hierarchy and exhibits a candidate-anchored pair on which motif features distinguish. We demonstrate empirically that the same augmentation consistently lifts performance across heterogeneous tasks: TGB link-property prediction across all five baselines, edge classification on Bitcoin Alpha/OTC and MOOC, and graph-level classification of synthetic temporal generators.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript claims that motif activity in temporal graphs organizes consistently along three scale-stable axes (dyadic recency/reciprocity, star diversity, triadic flow) across real and synthetic datasets. It uses this structure to define a compact 13-coordinate, leakage-safe, candidate-local feature map h(u,v,t) that can be linearly embedded into any static or temporal GNN without architectural changes. A temporal Weisfeiler-Leman analysis situates the map relative to the first level of an anchored temporal-WL hierarchy, and empirical results are said to show consistent performance lifts on TGB link-property prediction (five baselines), edge classification (Bitcoin Alpha/OTC, MOOC), and graph-level classification of synthetic generators.
Significance. If the three-axis organization generalizes and the performance improvements prove robust under proper controls, the work would supply a lightweight, architecture-agnostic motif augmentation for TGNNs. The explicit temporal-WL placement is a strength, as it provides a theoretical anchor for the candidate-anchored distinction that the features are claimed to capture.
major comments (3)
- [Abstract, §3] Abstract and §3: The central claim that motif activity 'organizes consistently along three scale-stable axes' is derived from patterns observed on the same class of datasets later used for evaluation. No quantitative measure of axis stability (e.g., correlation of axis loadings across held-out datasets or sensitivity to motif horizon) is supplied, which is load-bearing for the generalization of the fixed 13-coordinate map.
- [§4] §4 (feature map definition): The 13 coordinates of h(u,v,t) are fixed from the empirical axes; the leakage-safe construction is asserted but no explicit argument or experiment demonstrates that the coordinate selection itself does not encode test-distribution information, leaving the 'candidate-local' guarantee unverified for new temporal graphs whose motif scale stability differs.
- [§5] §5 (empirical evaluation): The abstract asserts 'consistent' lifts across all five TGB baselines and multiple tasks, yet the provided text supplies no quantitative tables, error bars, statistical tests, or ablation isolating the contribution of each axis. This prevents assessment of whether the reported improvements are statistically reliable or driven by the specific datasets used to discover the axes.
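One way the axis-stability measure requested in the first major comment could be made concrete is sketched below: given one loading vector per axis per dataset, report the mean absolute cosine similarity of each axis across all dataset pairs. The function names, the input layout, and the toy loadings are hypothetical, and the sketch assumes the axes have already been matched in the same order across datasets; it is not the authors' analysis.

```python
import math
from itertools import combinations


def cosine(a, b):
    """Cosine similarity between two equal-length, nonzero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


def axis_stability(loadings_by_dataset):
    """Mean |cosine| of each axis across all pairs of datasets.

    loadings_by_dataset maps a dataset name to a list of loading vectors,
    one per axis, in the same axis order for every dataset (an assumption).
    The absolute value is taken because the sign of a loading vector is
    arbitrary in factor-style decompositions.
    """
    names = sorted(loadings_by_dataset)
    n_axes = len(loadings_by_dataset[names[0]])
    scores = []
    for axis in range(n_axes):
        sims = [
            abs(cosine(loadings_by_dataset[p][axis], loadings_by_dataset[q][axis]))
            for p, q in combinations(names, 2)
        ]
        scores.append(sum(sims) / len(sims))
    return scores


# Made-up loadings over three motif coordinates for two datasets.
toy_loadings = {
    "dataset_1": [[1, 0, 0], [0, 1, 0], [0, 0, 1]],
    "dataset_2": [[-1, 0, 0], [0, 1, 0], [0, 1, 1]],
}
stability = axis_stability(toy_loadings)
```

Scores near 1 for every axis would support the "scale-stable" claim on the datasets compared; repeating the computation with loadings fitted at different motif horizons would cover the sensitivity test the referee also mentions.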
minor comments (2)
- [§2, §4] Notation for the temporal WL hierarchy and the precise definition of 'anchored' pairs should be clarified with a small example in §2 or §4.
- [§4] The manuscript would benefit from an explicit statement of the temporal horizon(s) used to count motifs when constructing the 13 features.
Simulated Author's Rebuttal
We appreciate the referee's insightful comments on our manuscript. Below we provide point-by-point responses to the major comments, outlining clarifications and planned revisions.
- Referee: [Abstract, §3] Abstract and §3: The central claim that motif activity 'organizes consistently along three scale-stable axes' is derived from patterns observed on the same class of datasets later used for evaluation. No quantitative measure of axis stability (e.g., correlation of axis loadings across held-out datasets or sensitivity to motif horizon) is supplied, which is load-bearing for the generalization of the fixed 13-coordinate map.
  Authors: The identification of the three axes was based on observations across a diverse set of real and synthetic datasets, as described in §3. While the manuscript emphasizes the consistency observed, we agree that providing quantitative measures of stability would strengthen the claim. In the revision, we will include analyses such as the correlation of axis loadings across held-out dataset partitions and sensitivity tests to the motif horizon parameter.
  Revision: yes
- Referee: [§4] §4 (feature map definition): The 13 coordinates of h(u,v,t) are fixed from the empirical axes; the leakage-safe construction is asserted but no explicit argument or experiment demonstrates that the coordinate selection itself does not encode test-distribution information, leaving the 'candidate-local' guarantee unverified for new temporal graphs whose motif scale stability differs.
  Authors: The h(u,v,t) map is designed to be candidate-local, relying solely on information available at time t for the candidate pair without access to future events or test set statistics. The coordinates are fixed globally based on the empirical structure rather than being dataset-specific. We will expand §4 with a formal argument for leakage-safety and an additional experiment applying the fixed map to a new temporal graph with differing motif characteristics to verify the guarantee.
  Revision: yes
- Referee: [§5] §5 (empirical evaluation): The abstract asserts 'consistent' lifts across all five TGB baselines and multiple tasks, yet the provided text supplies no quantitative tables, error bars, statistical tests, or ablation isolating the contribution of each axis. This prevents assessment of whether the reported improvements are statistically reliable or driven by the specific datasets used to discover the axes.
  Authors: The manuscript contains quantitative results for the TGB link prediction tasks across the five baselines as well as the other tasks. To better demonstrate reliability, we will add error bars, statistical tests for significance, and ablations that isolate the contribution of each axis (dyadic, star, triadic) in the revised §5.
  Revision: yes
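The leakage-safety property discussed in the second response can be probed empirically with an invariance check: features for (u, v, t) should not change when events at or after t are appended to the stream. The sketch below is a minimal version of such a check under assumed names (`past_window_counts`, `leaky_counts`, `is_leakage_safe`) and an assumed `(src, dst, ts)` tuple layout; passing it is a necessary condition only, not a substitute for the formal argument the authors promise.

```python
def past_window_counts(events, u, v, t, delta):
    """Repetition and reciprocity counts from events in [t - delta, t)."""
    window = [(s, d, ts) for (s, d, ts) in events if t - delta <= ts < t]
    repeat_uv = sum(1 for s, d, _ in window if (s, d) == (u, v))
    recip_vu = sum(1 for s, d, _ in window if (s, d) == (v, u))
    return (repeat_uv, recip_vu)


def leaky_counts(events, u, v, t, delta):
    """Deliberately flawed variant: the closed window [t - delta, t] admits
    events that occur exactly at the query time."""
    window = [(s, d, ts) for (s, d, ts) in events if t - delta <= ts <= t]
    repeat_uv = sum(1 for s, d, _ in window if (s, d) == (u, v))
    recip_vu = sum(1 for s, d, _ in window if (s, d) == (v, u))
    return (repeat_uv, recip_vu)


def is_leakage_safe(feature_fn, events, future_events, u, v, t, delta):
    """Return True if appending events with ts >= t leaves the features unchanged."""
    if any(ts < t for _, _, ts in future_events):
        raise ValueError("future_events must all have ts >= t")
    before = feature_fn(events, u, v, t, delta)
    after = feature_fn(events + future_events, u, v, t, delta)
    return before == after


history = [("a", "b", 1), ("b", "a", 2)]
future = [("a", "b", 5), ("a", "b", 7)]  # includes the candidate edge at t=5

safe_ok = is_leakage_safe(past_window_counts, history, future, "a", "b", t=5, delta=10)
leaky_ok = is_leakage_safe(leaky_counts, history, future, "a", "b", t=5, delta=10)
```

Running the same check over many sampled candidates and cut points would turn it into the kind of experiment the referee asks for, although it says nothing about whether the choice of coordinates was informed by test data.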
Circularity Check
No significant circularity in the derivation chain
The paper's derivation proceeds from empirical observation of motif patterns on real and synthetic datasets to the design of a fixed 13-coordinate feature map h(u,v,t), followed by independent empirical validation of performance lifts across TGB tasks, edge classification, and synthetic generators. No step reduces by construction to its inputs via self-definition, fitted parameters renamed as predictions, or load-bearing self-citations; the feature map is a data-motivated but non-adaptive construction whose utility is tested on held-out evaluations rather than forced by the initial observations.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption: Motif activity in temporal interaction streams organizes consistently along three scale-stable axes (dyadic recency/reciprocity, star diversity, triadic flow).
Reference graph
Works this paper leans on
- [1] Jian Gao, Jianshe Wu, and JingYi Ding. Hyperevent: A strong baseline for dynamic link prediction via relative structural encoding. arXiv preprint arXiv:2507.11836.
- [2] Shenyang Huang, Farimah Poursafaei, Jacob Danovitch, Matthias Fey, Weihua Hu, Emanuele Rossi, Jure Leskovec, Michael Bronstein, Guillaume Rabusseau, and Reihaneh Rabbany. Temporal graph benchmark for machine learning on temporal graphs. Advances in Neural Information Processing Systems, 36:2056–2073, 2023.
- [3] Jianmo Ni, Jiacheng Li, and Julian McAuley. Justifying recommendations using distantly-labeled reviews and fine-grained aspects. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), pages 188–197, 2019.
- [4] Hao Peng, Jianxin Li, Qiran Gong, Senzhang Wang, Yuanxing Ning, and Philip S Yu. Graph convolutional neural networks via motif-based attention. arXiv preprint arXiv:1811.08270.
- [5] Emanuele Rossi, Ben Chamberlain, Fabrizio Frasca, Davide Eynard, Federico Monti, and Michael Bronstein. Temporal graph networks for deep learning on dynamic graphs. arXiv preprint arXiv:2006.10637.
- [6] Rakshit Trivedi, Mehrdad Farajtabar, Prasenjeet Biswal, and Hongyuan Zha. Representation learning over dynamic graphs. arXiv preprint arXiv:1803.04051.
- [7] Jiafeng Xiong, Ahmad Zareie, and Rizos Sakellariou. A survey of link prediction in temporal networks. arXiv preprint arXiv:2502.21185.
- [8] Keyulu Xu, Weihua Hu, Jure Leskovec, and Stefanie Jegelka. How powerful are graph neural networks? arXiv preprint arXiv:1810.00826.
- [9] Zhaoning Yu and Hongyang Gao. Motifexplainer: a motif-based graph neural network explainer. arXiv preprint arXiv:2202.00519.
[10]
For heavy-tailed count coordinates (all except m2_ba_since_last), we apply the per-coordinate transform x7→ log10(1 +x)before passinghto the linear embeddingM. 14 B.5 Neighbor cap and subsampling rule To bound the cost of evaluating Axis-A2 and Axis-A3 features in the presence of high-degree hubs, we apply a global distinct-neighbor cap C∈N . If |N dir a ...
2017
-
[11]
C.6 Temporal-k-WL and the hierarchy theorem We define temporal-k-WL and prove Theorem C.7
bucketization. C.6 Temporal-k-WL and the hierarchy theorem We define temporal-k-WL and prove Theorem C.7. Definition C.6(Temporal- k-WL).Fix k≥1 . Atemporal k-tupleat reference time t is an element (⃗ v, ⃗ s)∈Vk ×[t−∆, t] k. Its initial color C(0) ⃗ v,⃗ sencodes the isomorphism type of the ≤k -node subgraph induced by events in Wpast t (∆) between the nod...
2019
-
[12]
Differences are within the variation expected from seed counts, hyperparameter retuning, and minor pipeline conventions (negative sampling protocol, evaluator version)
against the official TGB leader- board / reported figures [Huang et al., 2023, Gastinger et al., 2024] for the same baselines, to make the faithfulness of our reproductions explicit. Differences are within the variation expected from seed counts, hyperparameter retuning, and minor pipeline conventions (negative sampling protocol, evaluator version). Where...
2023
-
[13]
F PaySim stress test The PaySim fraud-detection stream is included in this paper as a stress test rather than as a headline empirical result. Two structural properties of the dataset put it outside the regime where motif augmentation is expected to help: (i) the fraud subgraph has negligible triadic flow, so the four A3 coordinates of h are dataset-level ...
1968
-
[14]
H.2 Motif-based GNN architectures Several families of GNN architectures couple motifs to message passing or attention
is the immediate predecessor of our feature family; in contrast to that paper’s aggregate enumeration, we use temporal motifcounts per candidate edgein a past-only window. H.2 Motif-based GNN architectures Several families of GNN architectures couple motifs to message passing or attention. MotifNet [Monti et al., 2018] builds one adjacency matrix per moti...
2018
-
[15]
features of the neighborhood at time t
analyze the expressive power of temporal graph networks and provide a related but distinct WL-style bound. Our temporal-WL hierarchy (Definitions C.1 and C.6) is self-contained, modeled on Morris et al. [2019], and makes the connection between motif order and WL rung explicit (Theorem C.7). Feature-augmented expressivity beyond 1-WL in static settings was...
2019
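The per-coordinate transform quoted in entry [10] is simple enough to sketch. The snippet below applies log10(1 + x) to every coordinate except `m2_ba_since_last`, the one name the excerpt gives as exempt; the other coordinate name, the dictionary layout, and the `compress_counts` helper are hypothetical, and the neighbor cap is left out because the excerpt is truncated before its rule is stated.

```python
import math

# The excerpt names this coordinate as the only one left untransformed.
UNTRANSFORMED = {"m2_ba_since_last"}


def compress_counts(features):
    """Apply x -> log10(1 + x) to heavy-tailed count coordinates.

    features maps coordinate names to non-negative raw values; names other
    than m2_ba_since_last are placeholders for illustration.
    """
    return {
        name: value if name in UNTRANSFORMED else math.log10(1.0 + value)
        for name, value in features.items()
    }


raw = {"star_out_u": 99, "m2_ba_since_last": 3.5}
scaled = compress_counts(raw)
```

The transform keeps zero counts at zero and compresses hub-sized counts into a narrow range, which is presumably why it is applied before the linear embedding rather than after.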