
arxiv: 1907.03459 · v2 · pith:2Q7ZOWBJnew · submitted 2019-07-08 · 💻 cs.IR

Joint Neural Collaborative Filtering for Recommender Systems

Pith reviewed 2026-05-25 01:05 UTC · model grok-4.3

classification 💻 cs.IR
keywords recommender systems · neural collaborative filtering · joint training · deep feature learning · deep interaction modeling · implicit feedback · explicit feedback · point-wise and pair-wise loss

The pith

Joint training lets deep feature learning and interaction modeling in neural recommenders refine each other end to end.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper presents J-NCF as a single neural architecture that runs deep feature learning on the user-item rating matrix and feeds those representations into a second deep network for modeling non-linear interactions. These two processes train together so that each improves the other rather than operating in isolation. A combined loss function incorporates both implicit and explicit feedback along with point-wise and pair-wise terms. Experiments across MovieLens and Amazon datasets report gains in hit rate and NDCG compared with prior neural collaborative filtering methods. The work also checks behavior on sparse data and inactive users.

Core claim

J-NCF applies a joint neural network that couples deep feature learning and deep interaction modeling with a rating matrix. Deep feature learning extracts feature representations of users and items with a deep learning architecture based on a user-item rating matrix. Deep interaction modeling captures non-linear user-item interactions with a deep neural network using the feature representations generated by the deep feature learning process as input. J-NCF enables the deep feature learning and deep interaction modeling processes to optimize each other through joint training, which leads to improved recommendation performance. In addition, a new loss function takes both implicit and explicit feedback, as well as point-wise and pair-wise loss, into account.
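
The combined loss mentioned in the core claim can be sketched as follows. This is an illustration under stated assumptions, not the paper's formulation: the squared-error point-wise term, the BPR-style logistic pair-wise term, the mixing weight `alpha`, the rating scale `max_rating`, and all function names are hypothetical choices made for this example.

```python
import math


def pointwise_loss(pred: float, target: float) -> float:
    """Squared error between a predicted score and a target in [0, 1]."""
    return (pred - target) ** 2


def pairwise_loss(pred_pos: float, pred_neg: float) -> float:
    """BPR-style loss -log(sigmoid(pred_pos - pred_neg)), written stably."""
    diff = pred_pos - pred_neg
    if diff >= 0:
        return math.log1p(math.exp(-diff))
    return -diff + math.log1p(math.exp(diff))


def combined_loss(pred_pos: float, pred_neg: float, rating: float,
                  max_rating: float = 5.0, alpha: float = 0.5) -> float:
    """Mix point-wise and pair-wise terms for one (user, pos item, neg item) triple.

    Assumptions (not taken from the paper):
      * explicit feedback enters as the normalised rating of the observed item;
      * implicit feedback enters as a target of 0 for the unobserved item;
      * alpha weights the point-wise term, (1 - alpha) the pair-wise term.
    """
    point = (pointwise_loss(pred_pos, rating / max_rating)
             + pointwise_loss(pred_neg, 0.0))
    pair = pairwise_loss(pred_pos, pred_neg)
    return alpha * point + (1.0 - alpha) * pair


# A perfect point-wise fit still pays a pair-wise cost that shrinks as the
# margin between the observed and unobserved item grows.
loss = combined_loss(pred_pos=1.0, pred_neg=0.0, rating=5.0)
```

The weight `alpha` plays the role of the loss weighting coefficients: setting it to 1 or 0 recovers a purely point-wise or purely pair-wise objective, which is one way to run the loss comparison the paper reports.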

What carries the argument

The joint neural network that couples deep feature learning on the rating matrix with deep interaction modeling that takes those features as input.
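
A minimal forward-pass sketch of this coupling, using only the Python standard library. The layer sizes, the use of concatenation to combine user and item features, the sigmoid output, and every name below are assumptions made for illustration; the paper's actual architecture may differ.

```python
import math
import random


def init_layer(n_in, n_out, rng):
    """Return (weights, biases) for one dense layer with small random weights."""
    scale = 1.0 / math.sqrt(n_in)
    weights = [[rng.uniform(-scale, scale) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out


def mlp(x, layers, relu_on_last=True):
    """Apply dense layers in sequence, with ReLU after each layer."""
    for idx, (weights, biases) in enumerate(layers):
        x = [sum(w * v for w, v in zip(row, x)) + b
             for row, b in zip(weights, biases)]
        if relu_on_last or idx < len(layers) - 1:
            x = [max(0.0, v) for v in x]
    return x


def predict(ratings, user, item, user_net, item_net, interaction_net):
    """Score one user-item pair from the rating matrix."""
    user_row = ratings[user]                    # the user's ratings of all items
    item_col = [row[item] for row in ratings]   # all users' ratings of the item
    user_feat = mlp(user_row, user_net)         # deep feature learning (user)
    item_feat = mlp(item_col, item_net)         # deep feature learning (item)
    # Deep interaction modeling on the combined features (concatenation assumed).
    logit = mlp(user_feat + item_feat, interaction_net, relu_on_last=False)[0]
    return 1.0 / (1.0 + math.exp(-logit))


rng = random.Random(0)
# Toy rating matrix, 3 users x 4 items, 0 = unobserved (made-up values).
ratings = [[5.0, 0.0, 3.0, 0.0],
           [4.0, 0.0, 0.0, 1.0],
           [0.0, 2.0, 0.0, 5.0]]
n_users, n_items = len(ratings), len(ratings[0])
user_net = [init_layer(n_items, 8, rng), init_layer(8, 4, rng)]
item_net = [init_layer(n_users, 8, rng), init_layer(8, 4, rng)]
interaction_net = [init_layer(8, 8, rng), init_layer(8, 1, rng)]
score = predict(ratings, 0, 1, user_net, item_net, interaction_net)
```

Training is omitted here. In a jointly trained version, a single loss on `score` would update `user_net`, `item_net`, and `interaction_net` together.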

If this is right

  • Recommendation accuracy rises on MovieLens 100K, MovieLens 1M, and Amazon Movies by the reported margins in HR@10 and NDCG@10.
  • The model maintains competitive results when data is sparse or users have few ratings.
  • The loss function allows the model to use both point-wise and pair-wise signals from implicit and explicit feedback at once.
  • Scalability and sensitivity tests show the architecture remains practical across varying dataset sizes.
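
For readers unfamiliar with the two metrics, the sketch below shows how HR@K and NDCG@K are commonly computed when each test user has a single held-out item. Whether the paper follows exactly this protocol is an assumption here, and the function names are hypothetical.

```python
import math


def rank_of_target(scores, target):
    """0-based rank of `target` when items are sorted by descending score."""
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked.index(target)


def hit_ratio_at_k(scores, target, k=10):
    """1.0 if the held-out item appears in the top k, else 0.0."""
    return 1.0 if rank_of_target(scores, target) < k else 0.0


def ndcg_at_k(scores, target, k=10):
    """With one relevant item the ideal DCG is 1, so NDCG = 1 / log2(rank + 2)."""
    rank = rank_of_target(scores, target)
    return 1.0 / math.log2(rank + 2) if rank < k else 0.0


# Made-up scores for one user; item "b" is the held-out item.
scores = {"a": 0.9, "b": 0.8, "c": 0.1}
hr = hit_ratio_at_k(scores, "b", k=10)   # 1.0, "b" is ranked second
ndcg = ndcg_at_k(scores, "b", k=10)      # 1 / log2(3), about 0.631
```

Averaging these per-user values over all test users gives the dataset-level HR@10 and NDCG@10. HR only asks whether the item made the list, while NDCG also rewards placing it higher.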

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same joint-training pattern could be tested on sequential or session-based recommendation tasks where feature and interaction signals interact strongly.
  • End-to-end coupling may reduce the need for separate validation of feature quality before interaction modeling begins.
  • If the joint objective proves stable, similar coupling could be applied to other paired deep networks in ranking or retrieval settings.

Load-bearing premise

Joint end-to-end training of the two networks will produce mutual gains instead of instability, one component dominating, or overfitting that erases the benefit.

What would settle it

A controlled comparison on the same datasets where separately trained feature extraction and interaction models reach equal or higher HR@10 and NDCG@10 than the jointly trained J-NCF.

Figures

Figures reproduced from arXiv: 1907.03459 by Fei Cai, Honghui Chen, Maarten de Rijke, Wanyu Chen.

Figure 1: Structure of the J-NCF model. Black arrows indicate the forward propagation for ...
Figure 2: Distribution of users with varying numbers of interactions in the ML100K, ML1M, ...
Figure 3: Performance of the J-NCF models applied with different loss functions where the ...
Figure 4: Performance of Top-N item recommendation where N ranges from 1 to 10. The left ...
Figure 5: Recommendation performance across different numbers of iterations. The left ...
Figure 6: Recommendation performance across datasets with different levels of sparsity.
Figure 7: Performance of Top-N item recommendation where N ranges from 1 to 10, tested ...

Original abstract

We propose a J-NCF method for recommender systems. The J-NCF model applies a joint neural network that couples deep feature learning and deep interaction modeling with a rating matrix. Deep feature learning extracts feature representations of users and items with a deep learning architecture based on a user-item rating matrix. Deep interaction modeling captures non-linear user-item interactions with a deep neural network using the feature representations generated by the deep feature learning process as input. J-NCF enables the deep feature learning and deep interaction modeling processes to optimize each other through joint training, which leads to improved recommendation performance. In addition, we design a new loss function for optimization, which takes both implicit and explicit feedback, point-wise and pair-wise loss into account. Experiments on several real-word datasets show significant improvements of J-NCF over state-of-the-art methods, with improvements of up to 8.24% on the MovieLens 100K dataset, 10.81% on the MovieLens 1M dataset, and 10.21% on the Amazon Movies dataset in terms of HR@10. NDCG@10 improvements are 12.42%, 14.24% and 15.06%, respectively. We also conduct experiments to evaluate the scalability and sensitivity of J-NCF. Our experiments show that the J-NCF model has a competitive recommendation performance with inactive users and different degrees of data sparsity when compared to state-of-the-art baselines.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 2 minor

Summary. The paper proposes J-NCF, which couples a deep feature learning network (extracting user/item representations from the rating matrix) with a deep interaction modeling network (capturing non-linear interactions from those features) via joint end-to-end training, plus a new loss combining implicit/explicit feedback and pointwise/pairwise terms. It claims this mutual optimization yields improved recommendation performance and reports HR@10 gains of up to 10.81% and NDCG@10 gains of up to 15.06% over baselines across MovieLens 100K, MovieLens 1M, and Amazon Movies, with additional tests on scalability and sparsity.

Significance. If the reported gains are attributable to the joint training mechanism rather than the new loss alone, the work would demonstrate a concrete benefit of end-to-end optimization between feature extraction and interaction modeling in neural recommenders, extending prior NCF approaches. The use of public datasets and standard top-K metrics (HR, NDCG) aids reproducibility, though the absence of isolating experiments weakens the evidential basis for the central mutual-optimization claim.

major comments (3)
  1. [Experiments] Experiments section: no ablation is reported that trains the deep feature learning and deep interaction modeling components separately, freezes the feature extractor during interaction training, or replaces the proposed multi-feedback loss with standard BPR/MSE while retaining the joint architecture; without these, the gains (e.g., 8.24% HR@10 on MovieLens 100K) cannot be attributed to the claimed mutual optimization rather than the loss function.
  2. [Experiments] Experiments section: baseline implementations, hyperparameter search ranges, random seeds, and statistical significance tests (e.g., paired t-tests or Wilcoxon on the reported HR@10/NDCG@10 deltas) are not described, preventing verification that the improvements over state-of-the-art methods are robust and due to the joint-training component.
  3. [Model description] Model description (joint training paragraph): the claim that 'J-NCF enables the deep feature learning and deep interaction modeling processes to optimize each other through joint training' is load-bearing for the contribution, yet the architecture description provides no analysis of gradient flow, potential dominance of one network, or instability that could negate the mutual benefit.
minor comments (2)
  1. [Abstract] Abstract: 'real-word datasets' should read 'real-world datasets'.
  2. [Experiments] The paper mentions sensitivity experiments with 'inactive users' and 'different degrees of data sparsity' but does not define the exact thresholds or user/activity bins used.
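
Major comment 2 asks for paired significance tests on the per-user metric differences. As a rough sketch of what such a check could look like, the block below implements a paired sign-flip permutation test. This technique is swapped in for the paired t-test or Wilcoxon test named in the report because it needs only the Python standard library. The function name and the sample data are hypothetical.

```python
import random


def paired_permutation_test(a, b, n_resamples=10000, seed=0):
    """Two-sided p-value for the mean paired difference between a and b.

    Under the null hypothesis the sign of each per-user difference is
    arbitrary, so signs are flipped at random and the resampled mean
    differences are compared with the observed one.
    """
    if len(a) != len(b):
        raise ValueError("a and b must hold one value per user, in the same order")
    diffs = [x - y for x, y in zip(a, b)]
    n = len(diffs)
    observed = abs(sum(diffs) / n)
    rng = random.Random(seed)
    at_least_as_extreme = 0
    for _ in range(n_resamples):
        flipped = sum(d if rng.random() < 0.5 else -d for d in diffs)
        if abs(flipped / n) >= observed - 1e-12:
            at_least_as_extreme += 1
    # Add-one correction keeps the estimate strictly positive.
    return (at_least_as_extreme + 1) / (n_resamples + 1)


# Made-up per-user HR@10 indicators for two models, for illustration only.
model_a = [1, 1, 1, 0, 1, 1, 1, 1, 0, 1, 1, 1]
model_b = [0, 1, 0, 0, 1, 0, 1, 0, 0, 1, 0, 1]
p_value = paired_permutation_test(model_a, model_b)
```

The inputs must be aligned per user, which is why reporting the evaluation split and random seeds matters for anyone trying to rerun such a test.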

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the constructive feedback. The comments highlight important aspects for strengthening the attribution of gains to joint training and improving reproducibility. We respond to each major comment below and indicate planned revisions.

read point-by-point responses
  1. Referee: [Experiments] Experiments section: no ablation is reported that trains the deep feature learning and deep interaction modeling components separately, freezes the feature extractor during interaction training, or replaces the proposed multi-feedback loss with standard BPR/MSE while retaining the joint architecture; without these, the gains (e.g., 8.24% HR@10 on MovieLens 100K) cannot be attributed to the claimed mutual optimization rather than the loss function.

    Authors: We agree that the absence of such ablations limits the ability to isolate the contribution of joint training from the new multi-feedback loss. In the revised manuscript we will add these ablation experiments, including separate training of the components, freezing the feature extractor, and replacing the loss with BPR/MSE while retaining the joint architecture, to better support the mutual-optimization claim. revision: yes

  2. Referee: [Experiments] Experiments section: baseline implementations, hyperparameter search ranges, random seeds, and statistical significance tests (e.g., paired t-tests or Wilcoxon on the reported HR@10/NDCG@10 deltas) are not described, preventing verification that the improvements over state-of-the-art methods are robust and due to the joint-training component.

    Authors: We acknowledge the need for these details to ensure reproducibility and robustness. The revised version will include explicit descriptions of baseline implementations, hyperparameter search ranges, random seeds, and statistical significance tests on the performance improvements. revision: yes

  3. Referee: [Model description] Model description (joint training paragraph): the claim that 'J-NCF enables the deep feature learning and deep interaction modeling processes to optimize each other through joint training' is load-bearing for the contribution, yet the architecture description provides no analysis of gradient flow, potential dominance of one network, or instability that could negate the mutual benefit.

    Authors: The end-to-end training allows gradients from the interaction modeling loss to update the feature learning network and vice versa. We will expand the model description section to discuss gradient flow through the coupled networks. A comprehensive empirical analysis of dominance or instability would require new experiments; we can provide a partial discussion based on observed training behavior but may not fully resolve all aspects within the current scope. revision: partial
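
The gradient-flow point in the third response can be illustrated on a toy model. The sketch below is not the J-NCF architecture. It assumes a one-parameter "feature" stage `h = w_f * x` feeding a one-parameter "interaction" stage `y_hat = w_i * h` with squared-error loss, and compares joint training with training against a frozen feature stage. All names and data are hypothetical.

```python
def gradients(w_f, w_i, x, y):
    """Chain-rule gradients of L = (w_i * (w_f * x) - y) ** 2."""
    h = w_f * x                       # output of the feature stage
    err = w_i * h - y                 # prediction error of the interaction stage
    grad_w_f = 2.0 * err * w_i * x    # loss gradient reaching the feature stage
    grad_w_i = 2.0 * err * h          # loss gradient of the interaction stage
    return grad_w_f, grad_w_i


def train(w_f, w_i, data, lr=0.02, epochs=200, freeze_features=False):
    """Plain SGD; with freeze_features=True only the interaction stage learns."""
    for _ in range(epochs):
        for x, y in data:
            grad_w_f, grad_w_i = gradients(w_f, w_i, x, y)
            if not freeze_features:
                w_f -= lr * grad_w_f
            w_i -= lr * grad_w_i
    return w_f, w_i


data = [(1.0, 2.0), (2.0, 4.0)]       # toy targets following y = 2 * x
joint_w_f, joint_w_i = train(0.5, 0.5, data)
frozen_w_f, frozen_w_i = train(0.5, 0.5, data, freeze_features=True)
```

In the joint run the loss gradient reaches `w_f` through `w_i`, so both stages move. In the frozen run `w_f` stays at its initial value and `w_i` has to compensate alone. On this toy problem both runs fit the data, so the sketch shows the mechanism only. Whether joint training yields better recommendations is exactly what the requested ablation would need to test.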

Circularity Check

0 steps flagged

No circularity; empirical claims rest on external benchmarks

full rationale

The paper introduces a joint neural architecture for feature learning and interaction modeling plus a composite loss, then reports HR@10 and NDCG@10 gains on public datasets (MovieLens, Amazon) against published external baselines. No equations, self-citations, or derivation steps are shown that reduce the claimed mutual optimization or performance numbers to quantities fitted or defined inside the model itself. The architecture and loss are presented as design choices whose value is tested empirically rather than derived by construction from the inputs.

Axiom & Free-Parameter Ledger

2 free parameters · 1 axiom · 0 invented entities

The model relies on standard deep-learning assumptions and a collection of architecture and optimization hyperparameters that are not enumerated in the abstract.

free parameters (2)
  • network depth and width
    Number of layers and hidden units in both the feature and interaction networks must be chosen to fit the data.
  • loss weighting coefficients
    Relative weights among implicit, explicit, pointwise, and pairwise terms are selected during training.
axioms (1)
  • domain assumption: Deep neural networks can capture non-linear user-item interactions when given learned feature vectors as input.
    Invoked by the design of the interaction modeling component.



