ID and Graph View Contrastive Learning with Multi-View Attention Fusion for Sequential Recommendation
Pith reviewed 2026-05-10 12:05 UTC · model grok-4.3
The pith
Fusing ID sequences and interaction graphs through three contrastive objectives and attention improves sequential recommendations.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
MVCrec integrates complementary signals from sequential ID-based views and graph-based views using three contrastive objectives (within the sequential view, within the graph view, and across views), combined with a multi-view attention fusion module that employs global and local attention to estimate the likelihood of a target user purchasing a target item.
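To make the idea of a contrastive objective concrete, here is a minimal sketch of an InfoNCE-style loss with in-batch negatives. It assumes the paper's three objectives follow this common form, which the review does not confirm; the function name `info_nce`, the temperature value, and the toy embeddings are illustrative choices, not taken from the paper.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def normalize(v):
    norm = math.sqrt(dot(v, v)) or 1.0
    return [x / norm for x in v]

def info_nce(anchors, positives, temperature=0.2):
    """Mean InfoNCE loss. anchors[i] and positives[i] form the positive pair;
    every other row of `positives` acts as an in-batch negative."""
    a = [normalize(v) for v in anchors]
    p = [normalize(v) for v in positives]
    total = 0.0
    for i, ai in enumerate(a):
        logits = [dot(ai, pj) / temperature for pj in p]
        m = max(logits)  # subtract the max for a numerically stable log-sum-exp
        log_denom = m + math.log(sum(math.exp(l - m) for l in logits))
        total += log_denom - logits[i]
    return total / len(a)

# Toy user embeddings from two views (values are illustrative only).
id_view = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
graph_view_aligned = [[0.9, 0.1], [0.1, 0.9], [-0.9, 0.1]]
# Same rows, but paired with the wrong users.
graph_view_shuffled = [graph_view_aligned[1], graph_view_aligned[2], graph_view_aligned[0]]

loss_aligned = info_nce(id_view, graph_view_aligned)
loss_shuffled = info_nce(id_view, graph_view_shuffled)
print(loss_aligned, loss_shuffled)
```

Under this assumption, the same function could serve all three objectives: two augmentations of one sequence for the within-sequence term, two augmentations of the graph for the within-graph term, and one embedding from each view for the cross-view term.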
What carries the argument
The multi-view attention fusion module, which combines global and local attention mechanisms on representations learned from the three contrastive objectives.
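As a rough illustration of attention-based fusion, the sketch below weights two per-view user embeddings by their scaled dot-product affinity with a target item and then scores the fused vector. It does not reproduce the paper's global and local attention design, which this review does not specify; `fuse_views`, `purchase_score`, and all vectors are hypothetical.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]

def fuse_views(view_embeddings, target_item):
    """Attention-weighted sum of per-view user embeddings, queried by the target item."""
    dim = len(target_item)
    scores = [dot(v, target_item) / math.sqrt(dim) for v in view_embeddings]
    weights = softmax(scores)
    fused = [sum(w * v[k] for w, v in zip(weights, view_embeddings)) for k in range(dim)]
    return fused, weights

def purchase_score(fused_user, target_item):
    """Sigmoid of the dot product, read here as a purchase likelihood."""
    return 1.0 / (1.0 + math.exp(-dot(fused_user, target_item)))

# Illustrative embeddings: one user seen through two views, plus one candidate item.
user_id_view = [0.8, 0.1, 0.0]
user_graph_view = [0.1, 0.7, 0.2]
item = [0.0, 1.0, 0.0]

fused, weights = fuse_views([user_id_view, user_graph_view], item)
score = purchase_score(fused, item)
print(weights, score)
```

Inspecting `weights` per user is the kind of diagnostic that could show when one view dominates the other.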
If this is right
- The three contrastive objectives produce user and item representations that encode both sequential order and relational structure.
- The attention fusion step yields higher accuracy in next-item prediction than either view alone.
- The approach delivers gains without any auxiliary features, relying solely on interaction data.
- Empirical tests show consistent outperformance of 11 baselines on five real-world datasets, with peak lifts of 14.44 percent in NDCG@10.
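For readers unfamiliar with the metrics quoted above, the following sketch computes HitRatio@k and NDCG@k for next-item prediction with a single held-out target per user. It assumes the usual leave-one-out protocol and that the reported percentages are relative lifts over the strongest baseline; neither is confirmed in this review, and the helper names are illustrative.

```python
import math

def rank_of_target(scores, target_index):
    """1-based rank of the target item under descending score.
    Ties are resolved in the target's favor (only strictly higher scores count)."""
    target_score = scores[target_index]
    return 1 + sum(1 for s in scores if s > target_score)

def hit_ratio_at_k(rank, k=10):
    return 1.0 if rank <= k else 0.0

def ndcg_at_k(rank, k=10):
    # With one relevant item the ideal DCG is 1, so NDCG reduces to 1 / log2(rank + 1).
    return 1.0 / math.log2(rank + 1) if rank <= k else 0.0

def relative_lift_percent(model_value, baseline_value):
    return 100.0 * (model_value - baseline_value) / baseline_value

# Toy example: four candidate items, the held-out target is item 2.
rank = rank_of_target([0.1, 0.9, 0.4, 0.7], target_index=2)
print(rank, hit_ratio_at_k(rank), ndcg_at_k(rank))
```

Dataset-level figures are the mean of these per-user values over all test users.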
Where Pith is reading between the lines
- The same multi-view contrastive pattern could be tested on session-based recommendation tasks where graph edges represent co-occurrences within short windows.
- If attention weights in the fusion module are inspected per user, they might reveal cases where graph structure matters more than sequence order for certain user types.
- Applying the framework to datasets that do include side information would test whether the multi-view benefit persists or becomes smaller once richer features are available.
Load-bearing premise
That ID sequences and interaction graphs supply reliably complementary signals rather than redundant ones, so that the contrastive objectives and fusion module can combine them effectively when only interaction data is present.
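The premise is easier to weigh with the two views side by side. The sketch below derives both from one interaction log: time-ordered item sequences per user, and an undirected user-item bipartite adjacency. The bipartite form of the graph is an assumption, since the review does not state how the paper constructs its graph, and `build_views` and the toy log are hypothetical.

```python
from collections import defaultdict

def build_views(interactions):
    """interactions: iterable of (user, item, timestamp) tuples."""
    # Sequential (ID) view: each user's items in chronological order.
    sequences = defaultdict(list)
    for user, item, _ts in sorted(interactions, key=lambda r: (r[0], r[2])):
        sequences[user].append(item)
    # Graph view (assumed form): undirected user-item bipartite adjacency.
    user_neighbors = defaultdict(set)
    item_neighbors = defaultdict(set)
    for user, items in sequences.items():
        for item in items:
            user_neighbors[user].add(item)
            item_neighbors[item].add(user)
    return dict(sequences), dict(user_neighbors), dict(item_neighbors)

log = [("u1", "a", 1), ("u1", "b", 2), ("u2", "b", 1), ("u2", "c", 3), ("u1", "c", 5)]
sequences, user_neighbors, item_neighbors = build_views(log)
print(sequences)
print(item_neighbors)
```

The sequence view preserves order but treats users independently, while the adjacency discards order but links users through shared items. Whether that difference yields non-redundant embeddings after training is exactly what the premise leaves open.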
What would settle it
An experiment in which single-view models using only ID contrastive learning or only graph contrastive learning match or exceed the full MVCrec performance across the same five benchmark datasets would show the cross-view component adds no value.
Original abstract
Sequential recommendation has become increasingly prominent in both academia and industry, particularly in e-commerce. The primary goal is to extract user preferences from historical interaction sequences and predict items a user is likely to engage with next. Recent advances have leveraged contrastive learning and graph neural networks to learn more expressive representations from interaction histories: graphs capture relational structure between nodes, while ID-based representations encode item-specific information. However, few studies have explored multi-view contrastive learning between ID and graph perspectives to jointly improve user and item representations, especially in settings where only interaction data is available without auxiliary information. To address this gap, we propose Multi-View Contrastive learning for sequential recommendation (MVCrec), a framework that integrates complementary signals from both sequential (ID-based) and graph-based views. MVCrec incorporates three contrastive objectives: within the sequential view, within the graph view, and across views. To effectively fuse the learned representations, we introduce a multi-view attention fusion module that combines global and local attention mechanisms to estimate the likelihood of a target user purchasing a target item. Comprehensive experiments on five real-world benchmark datasets demonstrate that MVCrec consistently outperforms 11 state-of-the-art baselines, achieving improvements of up to 14.44% in NDCG@10 and 9.22% in HitRatio@10 over the strongest baseline. Our code and datasets are available at https://github.com/sword-Lz/MMCrec.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper proposes MVCrec, a sequential recommendation framework that learns user and item representations by combining an ID-based sequential view with a graph-based view derived from the same interaction data. It introduces three contrastive objectives (intra-ID, intra-graph, and cross-view) plus a multi-view attention fusion module that integrates global and local attention to predict the next item. The central empirical claim is that this multi-view approach yields consistent gains over 11 baselines on five real-world datasets, with maximum improvements of 14.44% in NDCG@10 and 9.22% in HitRatio@10.
Significance. If the reported gains are robust and demonstrably attributable to complementary signals from the two views rather than added capacity or regularization, the work would provide a concrete recipe for multi-view contrastive learning in interaction-only settings, which is a common practical constraint. The public code release strengthens reproducibility.
major comments (3)
- [§3.3, Table 3] §3.3 and §4.3: The ablation table (Table 3) removes individual contrastive losses but does not report a variant that keeps intra-ID and intra-graph losses while removing only the cross-view loss; without this, it is impossible to isolate whether the cross-view term (the load-bearing component of the multi-view claim) contributes beyond standard single-view contrastive regularization.
- [§3.1, §4.4] §3.1 and §4.4: Both the ID sequence and the graph are constructed exclusively from the identical user-item interaction sequences with no auxiliary features. The paper asserts complementarity but provides no direct evidence (e.g., cosine similarity between view embeddings, mutual information estimates, or visualization of distinct clusters) that the two views encode non-redundant information; this leaves open the possibility that the fusion module and cross-view loss add little beyond increased model capacity.
- [§4.2, Table 2] §4.2: The main results (Table 2) report point estimates without standard deviations across multiple random seeds or data splits, and without statistical significance tests (e.g., paired t-test or Wilcoxon). Given the modest absolute gains on some datasets, this weakens the claim that MVCrec “consistently outperforms” the strongest baselines.
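One of the redundancy diagnostics named in the comments above, cosine similarity between view embeddings, can be sketched as follows. The function names are illustrative, and the check assumes both views are projected into a shared space (for example by the cross-view objective); without that, raw cosine values across two encoders are not directly comparable.

```python
import math

def cosine(u, v):
    num = sum(a * b for a, b in zip(u, v))
    den = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return num / den if den else 0.0

def mean_cross_view_cosine(id_embeddings, graph_embeddings):
    """Average cosine similarity between each entity's ID-view and graph-view embedding."""
    sims = [cosine(u, v) for u, v in zip(id_embeddings, graph_embeddings)]
    return sum(sims) / len(sims)

# Two extreme toy cases: identical views versus orthogonal views.
identical = mean_cross_view_cosine([[1.0, 0.0], [0.0, 2.0]], [[1.0, 0.0], [0.0, 2.0]])
orthogonal = mean_cross_view_cosine([[1.0, 0.0], [0.0, 2.0]], [[0.0, 3.0], [1.0, 0.0]])
print(identical, orthogonal)
```

A mean close to 1 would suggest the views have collapsed onto the same representation. A lower value is necessary but not sufficient evidence of complementarity, since dissimilar embeddings can also reflect noise.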
minor comments (2)
- [§2] §2: The related-work discussion cites several graph-based and contrastive sequential models but does not explicitly contrast the proposed cross-view objective with prior multi-view contrastive methods (e.g., those using separate user and item graphs).
- [§3.2] Notation: The symbols for the three contrastive losses (L_id, L_graph, L_cross) are introduced in §3.2 but the weighting hyper-parameters λ1, λ2, λ3 are only mentioned in the experimental setup; a single equation collecting all terms would improve clarity.
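A single collected objective of the kind requested in the last comment might look like the following. This is an assumed form, not the paper's equation: the recommendation loss $\mathcal{L}_{\mathrm{rec}}$ and the pairing of each $\lambda_i$ with a particular contrastive term are hypothetical.

```latex
\mathcal{L}
  = \mathcal{L}_{\mathrm{rec}}
  + \lambda_1 \, \mathcal{L}_{\mathrm{id}}
  + \lambda_2 \, \mathcal{L}_{\mathrm{graph}}
  + \lambda_3 \, \mathcal{L}_{\mathrm{cross}}
```

Written this way, setting $\lambda_3 = 0$ while keeping $\lambda_1, \lambda_2 > 0$ corresponds to the ablation variant requested in the first major comment.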
Simulated Author's Rebuttal
We thank the referee for the detailed and constructive comments on our manuscript. We address each major comment point by point below. We will make revisions to the paper as indicated to improve clarity and rigor.
- Referee: [§3.3, Table 3] §3.3 and §4.3: The ablation table (Table 3) removes individual contrastive losses but does not report a variant that keeps intra-ID and intra-graph losses while removing only the cross-view loss; without this, it is impossible to isolate whether the cross-view term (the load-bearing component of the multi-view claim) contributes beyond standard single-view contrastive regularization.
  Authors: We agree with this assessment. The current ablation study removes losses individually but lacks the specific combination requested. We will add this variant to Table 3 in the revised manuscript, keeping the intra-ID and intra-graph losses while removing only the cross-view loss. This will help isolate the contribution of the cross-view term to the overall performance gains.
  Revision: yes
- Referee: [§3.1, §4.4] §3.1 and §4.4: Both the ID sequence and the graph are constructed exclusively from the identical user-item interaction sequences with no auxiliary features. The paper asserts complementarity but provides no direct evidence (e.g., cosine similarity between view embeddings, mutual information estimates, or visualization of distinct clusters) that the two views encode non-redundant information; this leaves open the possibility that the fusion module and cross-view loss add little beyond increased model capacity.
  Authors: This point is well-taken. Although our results show improvements over single-view contrastive learning methods, direct evidence of non-redundancy would be beneficial. In the revision, we will include an analysis of the cosine similarity between ID-view and graph-view embeddings and provide visualizations (e.g., t-SNE) of the learned representations to illustrate that the views capture complementary information.
  Revision: yes
- Referee: [§4.2, Table 2] §4.2: The main results (Table 2) report point estimates without standard deviations across multiple random seeds or data splits, and without statistical significance tests (e.g., paired t-test or Wilcoxon). Given the modest absolute gains on some datasets, this weakens the claim that MVCrec “consistently outperforms” the strongest baselines.
  Authors: We appreciate the suggestion for more rigorous statistical analysis. In the updated manuscript, we will report results averaged over multiple random seeds with standard deviations in Table 2. We will also include p-values from paired t-tests to demonstrate that the observed improvements are statistically significant, addressing concerns about the robustness of the gains.
  Revision: yes
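The paired t-test promised in the last response can be sketched as below. The per-seed NDCG@10 values are hypothetical placeholders, not results from the paper, and the function returns only the t statistic and degrees of freedom; converting these to a p-value requires a t-distribution table or a statistics library.

```python
import math
import statistics

def paired_t_statistic(model_scores, baseline_scores):
    """Paired t statistic over matched runs (same seed or split for both systems).
    Requires at least two pairs and non-identical differences."""
    diffs = [m - b for m, b in zip(model_scores, baseline_scores)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample standard deviation (n - 1 denominator)
    t = mean_diff / (sd_diff / math.sqrt(n))
    return t, n - 1, mean_diff, sd_diff

# Hypothetical NDCG@10 values over five seeds (illustrative only).
model_runs = [0.052, 0.054, 0.051, 0.055, 0.053]
baseline_runs = [0.050, 0.051, 0.050, 0.052, 0.050]

t_stat, dof, mean_diff, sd_diff = paired_t_statistic(model_runs, baseline_runs)
print(t_stat, dof, mean_diff, sd_diff)
```

Pairing by seed matters: it removes run-to-run variance shared by both systems, which an unpaired comparison of means would leave in.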
Circularity Check
No circularity: empirical results on held-out test sets are independent of model equations
The paper proposes MVCrec as a new architecture combining ID-based sequential representations, graph views, three contrastive losses, and a multi-view attention fusion module. Its central claims consist of measured performance gains (NDCG@10, HitRatio@10) on five held-out benchmark test sets against 11 external baselines. These quantities are obtained by standard training and evaluation protocols rather than by algebraic derivation or parameter fitting that would force the reported improvements. No equations are presented that define a target quantity in terms of itself, no fitted parameters are relabeled as predictions, and no load-bearing uniqueness theorems or ansatzes are imported via self-citation. The derivation chain is therefore self-contained: model design followed by empirical measurement on independent data.
Axiom & Free-Parameter Ledger
axioms (1)
- Domain assumption: Contrastive losses can align representations from two different views of the same user-item interaction data.