Beyond Leakage and Complexity: Towards Realistic and Efficient Information Cascade Prediction

Bin Tong; Bo Zheng; Guan Wang; Jie Peng; Qiang Wang; Rui Wang; Zhewei Wei

arxiv: 2510.25348 · v2 · pith:BWUHUWZFnew · submitted 2025-10-29 · 💻 cs.LG · cs.SI

Jie Peng, Rui Wang, Qiang Wang, Zhewei Wei, Bin Tong, Guan Wang, Bo Zheng

Pith reviewed 2026-05-21 20:15 UTC · model grok-4.3

classification 💻 cs.LG cs.SI

keywords information cascade prediction · temporal leakage · popularity prediction · e-commerce dataset · temporal walks · GRU encoding · time-aware attention · leak-free evaluation

The pith

A lightweight model predicts information cascade popularity more accurately than complex methods when future data is withheld from training.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper shows that standard random splits in cascade prediction allow models to peek at future events, producing overly optimistic results that do not match real forecasting. It replaces those splits with strict time-ordered windows, builds a new e-commerce dataset that tracks actual purchases after promotion, and introduces a simple model that walks through time steps, selects related cascades by overlap, and encodes them with gated recurrent units plus attention. Under this stricter setup the new model matches or exceeds prior work on four datasets while training and running orders of magnitude faster, especially when the task is to forecast later-stage conversions such as sales.

Core claim

Under time-ordered evaluation that prevents future leakage, the CasTemp framework models cascade dynamics with temporal walks, Jaccard-based selection of neighboring cascades, and GRU encoding equipped with time-aware attention, delivering state-of-the-art accuracy on four datasets together with orders-of-magnitude speedups and strong performance on second-stage conversion prediction.
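
To make the term "temporal walk" concrete, here is a minimal Python sketch of a time-respecting walk, in which every step may only follow an interaction that happened earlier than the previous one. The event format `(src, dst, t)`, the backward direction, the uniform choice among candidates, and the name `temporal_walk` are assumptions made for illustration; the summary above does not say how CasTemp samples its walks.

```python
import random
from collections import defaultdict


def temporal_walk(edges, start, start_time, max_len, rng):
    """Sample one time-respecting walk backward from (start, start_time).

    edges: iterable of (src, dst, t) interaction events (hypothetical format).
    Each step follows an event strictly earlier than the previous step,
    so the walk never touches information from the future.
    """
    # Index events by node, treating each interaction as undirected.
    by_node = defaultdict(list)
    for src, dst, t in edges:
        by_node[src].append((dst, t))
        by_node[dst].append((src, t))

    walk = [(start, start_time)]
    node, now = start, start_time
    for _ in range(max_len):
        # Only events strictly before the current time are eligible.
        candidates = [(nbr, t) for nbr, t in by_node[node] if t < now]
        if not candidates:
            break
        node, now = rng.choice(candidates)
        walk.append((node, now))
    return walk


edges = [("a", "b", 1), ("b", "c", 2), ("c", "d", 3)]
print(temporal_walk(edges, "d", 4, 5, random.Random(0)))
# -> [('d', 4), ('c', 3), ('b', 2), ('a', 1)]
```

In this toy chain every step has exactly one eligible event, so the output does not depend on the random seed.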

What carries the argument

CasTemp, a lightweight framework that models cascade dynamics through temporal walks, Jaccard-based neighbor selection for inter-cascade dependencies, and GRU-based encoding with time-aware attention.
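
The Jaccard-based neighbor selection can be sketched as follows, assuming each cascade is represented by the set of users who took part in it and that the standard Jaccard coefficient |A ∩ B| / |A ∪ B| is used as the weight. The function names, the top-k cutoff, and the rule of dropping zero-overlap cascades are illustrative assumptions, not details confirmed by the paper.

```python
def jaccard(a, b):
    """Jaccard coefficient of two sets; defined as 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def top_k_neighbors(cascade_users, target, k):
    """Return up to k (cascade_id, weight) pairs most similar to `target`.

    cascade_users: dict mapping cascade id -> set of participating user ids.
    Cascades with no shared users are skipped (an assumption of this sketch).
    """
    target_users = cascade_users[target]
    scored = []
    for cid, users in cascade_users.items():
        if cid == target:
            continue
        weight = jaccard(target_users, users)
        if weight > 0:
            scored.append((cid, weight))
    # Highest weight first; ties broken by cascade id for determinism.
    scored.sort(key=lambda pair: (-pair[1], pair[0]))
    return scored[:k]


cascades = {"c1": {1, 2, 3}, "c2": {2, 3, 4}, "c3": {3}, "c4": {9}}
print(top_k_neighbors(cascades, "c1", 2))
```

Here `c2` shares two of four distinct users with `c1` (weight 0.5), `c3` shares one of three, and `c4` shares none and is dropped.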

If this is right

Cascade prediction tasks can now be evaluated under realistic forecasting conditions that match deployment.
E-commerce platforms gain a dataset that links early diffusion signals directly to later purchase outcomes.
Lightweight temporal-walk models can replace heavy graph neural networks for large-scale cascade analysis.
Second-stage conversion prediction becomes a practical target for monetization and inventory planning.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

The same time-window protocol could be tested on other temporal social-media tasks such as rumor detection or trend forecasting.
Rich conversion-labeled datasets similar to Taoke may be needed in non-commerce domains to move beyond simple popularity counts.
The efficiency gains suggest that scaling to networks with millions of cascades becomes feasible without specialized hardware.

Load-bearing premise

Chronological partitioning of data into consecutive windows fully removes any access to future information and produces evaluations that match real-world forecasting needs.
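
A minimal sketch of such chronological partitioning, assuming each cascade is identified by a start timestamp and that the cut points between windows are given. The boundary values and the function name are hypothetical; the paper's actual window sizes are not stated here.

```python
import bisect


def time_ordered_windows(cascades, boundaries):
    """Partition cascades into consecutive windows by start time.

    cascades: list of (cascade_id, start_time) pairs.
    boundaries: sorted cut times; window i holds cascades with
    boundaries[i-1] <= start_time < boundaries[i].
    Returns len(boundaries) + 1 lists of cascade ids, earliest window first.
    """
    windows = [[] for _ in range(len(boundaries) + 1)]
    for cid, start in sorted(cascades, key=lambda c: c[1]):
        windows[bisect.bisect_right(boundaries, start)].append(cid)
    return windows


cascades = [("a", 5), ("b", 12), ("c", 25), ("d", 10)]
print(time_ordered_windows(cascades, [10, 20]))
# -> [['a'], ['d', 'b'], ['c']]
```

This sketch assigns a cascade by its start time only. A cascade whose events continue past a boundary would still carry later information into an earlier window unless its events are also truncated at the cut, which is part of what the premise above has to guarantee.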

What would settle it

Re-running the same models on the identical datasets but using random cascade splits instead of time windows, and checking whether CasTemp loses its reported advantage or whether other methods suddenly match or exceed it.
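
Before re-running any model, a cheap diagnostic is to measure how much future data each split exposes. The sketch below compares a random split with a time-ordered one on synthetic timestamps; the function names and the 20% test ratio are assumptions for illustration, and the data are placeholders rather than anything from the paper.

```python
import random


def future_exposure(train, test):
    """Fraction of training cascades that start after the earliest test cascade."""
    earliest_test = min(t for _, t in test)
    later = sum(1 for _, t in train if t > earliest_test)
    return later / len(train)


def random_split(cascades, test_ratio, rng):
    """Shuffle cascades and hold out a random subset (ignores time)."""
    shuffled = cascades[:]
    rng.shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_ratio))
    return shuffled[n_test:], shuffled[:n_test]


def time_split(cascades, test_ratio):
    """Hold out the most recent cascades as the test set."""
    ordered = sorted(cascades, key=lambda c: c[1])
    n_test = max(1, int(len(ordered) * test_ratio))
    return ordered[:-n_test], ordered[-n_test:]


# Synthetic cascades: (id, start_time), one per time step.
cascades = [(i, i) for i in range(100)]
train_t, test_t = time_split(cascades, 0.2)
train_r, test_r = random_split(cascades, 0.2, random.Random(0))
print("time-ordered exposure:", future_exposure(train_t, test_t))  # 0.0
print("random-split exposure:", future_exposure(train_r, test_r))
```

A time-ordered split gives an exposure of exactly zero by construction, while a random split almost always leaves a large share of training cascades that postdate some test cascade.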

Figures

Figures reproduced from arXiv: 2510.25348 by Bin Tong, Bo Zheng, Guan Wang, Jie Peng, Qiang Wang, Rui Wang, Zhewei Wei.

**Figure 2.** Illustration of the Taoke dataset. New dataset: the private-domain recommendation scenario features an e-commerce platform's product promotion and forwarding process that is entirely consistent with the cascade propagation process.

**Figure 3.** Conceptual illustration of CasTemp, highlighting the integration of inter-cascade competition graph, temporal ...

**Figure 4.** The training time per epoch of each baseline.

**Figure 5.** The results of the ablation study on Twitter and APS.

**Figure 6.** The results of the ablation study on Weibo and Taoke.

Original abstract

Information cascade popularity prediction is a key problem in analyzing content diffusion in social networks. However, current related works suffer from three critical limitations: (1) temporal leakage in current evaluation--random cascade-based splits allow models to access future information, yielding unrealistic results; (2) feature-poor datasets that lack downstream conversion signals (e.g., likes, comments, or purchases), which limits more practical applications; (3) computational inefficiency of complex graph-based methods that require days of training for marginal gains. We systematically address these challenges from three perspectives: task setup, dataset construction, and model design. First, we propose a time-ordered splitting strategy that chronologically partitions data into consecutive windows, ensuring models are evaluated on genuine forecasting tasks without future information leakage. Second, we introduce Taoke, a large-scale e-commerce cascade dataset featuring rich promoter/product attributes and ground-truth purchase conversions--capturing the complete diffusion lifecycle from promotion to monetization. Third, we develop CasTemp, a lightweight framework that efficiently models cascade dynamics through temporal walks, Jaccard-based neighbor selection for inter-cascade dependencies, and GRU-based encoding with time-aware attention. Under leak-free evaluation, CasTemp achieves state-of-the-art performance across four datasets with orders-of-magnitude speedup. Notably, it excels at predicting second-stage popularity conversions--a practical task critical for real-world applications.
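
The abstract names "time-aware attention" without giving its form. One common way to make attention time-aware is to subtract a penalty proportional to each state's age before the softmax, and the sketch below does exactly that over a list of encoder hidden states. The linear decay, the `decay` parameter, and the function name are assumptions of this sketch and should not be read as CasTemp's actual formulation.

```python
import math


def time_aware_attention(hidden_states, timestamps, query, t_now, decay):
    """Attention over hidden states with a linear recency penalty.

    score_i = <query, h_i> - decay * (t_now - t_i), followed by a softmax.
    Returns (context_vector, weights). Hypothetical form, for illustration.
    """
    scores = []
    for h, t in zip(hidden_states, timestamps):
        similarity = sum(q * x for q, x in zip(query, h))
        scores.append(similarity - decay * (t_now - t))

    # Numerically stable softmax.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]

    # Weighted sum of hidden states.
    dim = len(hidden_states[0])
    context = [
        sum(w * h[d] for w, h in zip(weights, hidden_states))
        for d in range(dim)
    ]
    return context, weights


states = [[1.0, 0.0], [1.0, 0.0]]  # identical content, different ages
context, weights = time_aware_attention(states, [0.0, 9.0], [1.0, 0.0], 10.0, 0.1)
print(weights)  # the more recent state receives the larger weight
```

With identical content in both states, the only thing separating the weights is recency, which is the behavior the penalty is meant to produce.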

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note

private letter to a colleague

The paper adds a new e-commerce dataset with conversion labels and pushes time-window splits for more realistic cascade evaluation, but shared promoters across windows could still allow indirect leakage.

The main new pieces are the Taoke dataset, which includes promoter and product attributes plus actual purchase conversions, and the CasTemp model built around temporal walks, Jaccard neighbor selection, and a GRU with time-aware attention. They also replace random splits with consecutive chronological windows to block direct future information. These choices directly target the three problems called out in the abstract: leakage, feature-poor data, and slow graph methods.

The efficiency angle and the conversion-prediction task are practical, and the dataset itself could be useful for other groups working on monetization forecasts in social or e-commerce settings. Credit is due for shipping a concrete alternative to the usual heavy baselines and for trying to make the evaluation setup closer to real deployment.

The soft spot is the leakage concern. Even with time-ordered windows, the same promoters, products, or users often appear in multiple periods, so patterns learned early can still correlate with later outcomes through shared attributes or recurring entities. The abstract does not mention ablations that mask those overlaps or test for residual signals, so the SOTA and speedup claims rest partly on an unverified assumption. If the full experiments include proper baseline details, error bars, and significance checks, that would help; without them the results stay hard to judge.

This work is aimed at researchers who already follow cascade prediction and want something lighter and more applied than the current graph-heavy literature. The dataset and the evaluation critique give it enough substance that a referee could usefully examine the leakage question and the experimental controls. I would send it for peer review.

Referee Report

2 major / 1 minor

Summary. The paper identifies three limitations in information cascade popularity prediction: temporal leakage from random splits, feature-poor datasets lacking conversion signals, and inefficiency of complex graph models. It proposes a chronological consecutive-window split for leak-free evaluation, introduces the Taoke e-commerce dataset with promoter/product attributes and ground-truth purchases, and develops CasTemp, a lightweight model based on temporal walks, Jaccard neighbor selection for inter-cascade dependencies, and GRU encoding with time-aware attention. The central claim is that CasTemp achieves SOTA performance across four datasets with orders-of-magnitude speedup under this evaluation, particularly for second-stage conversion prediction.

Significance. If the results hold, the work offers a more realistic forecasting-oriented evaluation protocol, a valuable new dataset capturing full diffusion-to-monetization lifecycles, and an efficient model that could make cascade prediction practical for large-scale applications. The focus on second-stage popularity conversions addresses a gap with direct real-world utility in e-commerce settings.

major comments (2)

[Abstract and §4] Abstract and experimental section: The SOTA and speedup claims are presented without reported details on exact baselines, error bars, statistical significance, or full hyperparameter settings, limiting verification of the performance gains under the proposed leak-free protocol.
[§3.1] §3.1 (Task Setup): The chronological consecutive-window split is asserted to eliminate future information access, but no ablation or analysis addresses potential indirect leakage via shared promoters, products, or users across windows that could correlate features and inflate performance. This directly underpins the 'leak-free' SOTA claim.

minor comments (1)

[Figures/Tables] Figure and table captions could more explicitly state the evaluation protocol (e.g., window sizes and overlap handling) to improve reproducibility.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their insightful comments, which have helped us improve the clarity and rigor of our work. Below, we provide detailed responses to each major comment.

Referee: [Abstract and §4] Abstract and experimental section: The SOTA and speedup claims are presented without reported details on exact baselines, error bars, statistical significance, or full hyperparameter settings, limiting verification of the performance gains under the proposed leak-free protocol.

Authors: We agree that additional details are necessary for full verification. In the revised manuscript, we will expand the experimental section to include exact baseline implementations, report performance with error bars from multiple random seeds, include statistical significance tests (e.g., paired t-tests), and provide complete hyperparameter settings in an appendix. This will substantiate the SOTA and speedup claims under the leak-free protocol.

revision: yes

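
The paired t-test mentioned in this response can be sketched with the standard library alone, assuming the two models are run with the same seeds so that their scores pair up. The function name is hypothetical, and the numbers in the usage lines are placeholders chosen for easy arithmetic, not results from the paper.

```python
import math
from statistics import mean, stdev


def paired_t_statistic(scores_a, scores_b):
    """Return (mean difference, t statistic, degrees of freedom).

    scores_a[i] and scores_b[i] must come from the same seed or window.
    The p-value is left to a t-distribution table or a statistics library.
    """
    if len(scores_a) != len(scores_b) or len(scores_a) < 2:
        raise ValueError("need at least two paired runs")
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    d_mean = mean(diffs)
    d_std = stdev(diffs)  # sample standard deviation (divides by n - 1)
    if d_std == 0:
        raise ValueError("all paired differences are identical; t is undefined")
    n = len(diffs)
    return d_mean, d_mean / (d_std / math.sqrt(n)), n - 1


# Placeholder scores for two models over four seeds.
model_a = [1.0, 2.0, 3.0, 4.0]
model_b = [0.0, 0.0, 1.0, 1.0]
d_mean, t_stat, dof = paired_t_statistic(model_a, model_b)
print(d_mean, round(t_stat, 3), dof)  # -> 2.0 4.899 3
```

Pairing by seed removes seed-to-seed variance from the comparison, which matters when only a handful of runs are affordable.
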
Referee: [§3.1] §3.1 (Task Setup): The chronological consecutive-window split is asserted to eliminate future information access, but no ablation or analysis addresses potential indirect leakage via shared promoters, products, or users across windows that could correlate features and inflate performance. This directly underpins the 'leak-free' SOTA claim.

Authors: This is a valid point regarding potential indirect leakage. While the consecutive-window split prevents direct access to future cascades, shared entities could introduce correlations. We will add a new subsection in §3.1 analyzing the degree of overlap in promoters, products, and users between training and test windows. Furthermore, we will conduct an ablation study where we remove or mask features from shared entities to measure any performance inflation. If the impact is minimal, it supports the leak-free nature; otherwise, we will discuss implications for the evaluation protocol.

revision: yes
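
The overlap analysis and masking ablation promised here could start from something like the sketch below. It assumes each record is a dict with an entity field such as `promoter`; the field names, the placeholder token, and both function names are hypothetical.

```python
def entity_overlap(train_entities, test_entities):
    """Share of distinct test-window entities already seen in training."""
    train_set, test_set = set(train_entities), set(test_entities)
    if not test_set:
        return 0.0
    return len(train_set & test_set) / len(test_set)


def mask_shared_entities(records, seen_entities, field, placeholder="<MASKED>"):
    """Copy records, replacing `field` with a placeholder when the entity
    also appeared in the training window (for a masking ablation)."""
    seen = set(seen_entities)
    return [
        {**r, field: placeholder} if r[field] in seen else dict(r)
        for r in records
    ]


train_promoters = ["p1", "p2", "p3"]
test_records = [
    {"promoter": "p2", "clicks": 7},
    {"promoter": "p4", "clicks": 3},
]
print(entity_overlap(train_promoters, [r["promoter"] for r in test_records]))  # -> 0.5
print(mask_shared_entities(test_records, train_promoters, "promoter"))
```

Comparing model accuracy on the masked and unmasked test records would give a first estimate of how much performance depends on recurring entities.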

Circularity Check

0 steps flagged

No significant circularity; claims rely on new dataset, model, and split rather than reducing to inputs by construction

The paper introduces a time-ordered consecutive-window splitting strategy, constructs a new dataset Taoke with promoter/product attributes and purchase conversions, and proposes the CasTemp model using temporal walks, Jaccard neighbor selection, and GRU encoding with time-aware attention. Performance claims (SOTA under leak-free evaluation, speedup, second-stage conversion prediction) are presented as empirical outcomes on these new elements across four datasets. No equations or steps in the provided text reduce a claimed prediction or result to a fitted parameter, self-definition, or self-citation chain by construction; the evaluation protocol is explicitly proposed as an improvement rather than derived tautologically from prior results.

Axiom & Free-Parameter Ledger

1 free parameter · 0 axioms · 0 invented entities

The central claims rest on standard machine learning assumptions for temporal sequence modeling and the validity of the new dataset construction; no explicit free parameters or invented entities are detailed in the abstract.

free parameters (1)

model hyperparameters
GRU and attention parameters tuned during training on the datasets.

pith-pipeline@v0.9.0 · 5781 in / 991 out tokens · 63105 ms · 2026-05-21T20:15:47.528395+00:00 · methodology

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel
Relation between the paper passage and the cited Recognition theorem: unclear
Passage: Jaccard similarity ... w_ij = |U_i ∩ U_j| / |U_i ∪ U_j| ... temporal random walks ... GRU-based sequential encoder with time-aware attention

IndisputableMonolith/Foundation/ArithmeticFromLogic.lean · LogicNat induction and orbit embedding
Relation between the paper passage and the cited Recognition theorem: unclear
Passage: time-ordered splitting strategy that chronologically partitions data into consecutive windows

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
