arXiv: 1810.09401 · v1 · pith:ZE4RVX62new · submitted 2018-10-22 · 💻 cs.IR · cs.LG · stat.ML

Alternating Linear Bandits for Online Matrix-Factorization Recommendation

classification 💻 cs.IR cs.LG stat.ML
keywords online selected algorithm time alternating bandits collaborative cumulative

We consider the problem of collaborative filtering in the online setting, where items are recommended to users over time. At each time step, the user (selected by the environment) consumes an item (selected by the agent) and provides a rating of the selected item. In this paper, we propose a novel algorithm for online matrix factorization recommendation that combines linear bandits and alternating least squares. In this formulation, the bandit feedback is equal to the difference between the ratings of the best and selected items. We evaluate the performance of the proposed algorithm over time using both cumulative regret and average cumulative NDCG. Simulation results on three synthetic datasets and three real-world datasets for online collaborative filtering indicate the superior performance of the proposed algorithm over two state-of-the-art online algorithms.
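
The interaction loop described in the abstract can be illustrated with a toy simulation. The following is only a minimal sketch under stated assumptions, not the authors' algorithm: it pairs a generic UCB-style linear bandit (item factors as features) with alternating ridge-regression updates of the user and item factors, learns from the observed rating, and tracks regret as the gap between the best and the selected item's rating. All names (`run_toy`, `ridge_solve`, `split`, `ndcg_at_k`), the noise-free low-rank ground truth, and the parameters `alpha` and `lam` are hypothetical. The NDCG helper uses the standard linear-gain form, which may differ from the exact metric used in the paper.

```python
import numpy as np

def ridge_solve(X, y, lam):
    """Ridge solution of min ||X w - y||^2 + lam * ||w||^2."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

def split(pairs):
    """Turn a list of (index, rating) pairs into index and rating arrays."""
    idx = np.array([p[0] for p in pairs], dtype=int)
    val = np.array([p[1] for p in pairs], dtype=float)
    return idx, val

def run_toy(n_users=5, n_items=8, d=3, steps=200, alpha=1.0, lam=1.0, seed=0):
    rng = np.random.default_rng(seed)
    # Assumed ground truth: noise-free low-rank rating matrix.
    R = rng.normal(size=(n_users, d)) @ rng.normal(size=(n_items, d)).T
    U_hat = rng.normal(scale=0.1, size=(n_users, d))  # estimated user factors
    V_hat = rng.normal(scale=0.1, size=(n_items, d))  # estimated item factors
    user_obs = [[] for _ in range(n_users)]  # per user: (item, rating)
    item_obs = [[] for _ in range(n_items)]  # per item: (user, rating)
    regret = np.zeros(steps)
    for t in range(steps):
        u = int(rng.integers(n_users))  # environment selects the user
        # Linear-bandit step: UCB score using item factors as features.
        seen, _ = split(user_obs[u])
        A = lam * np.eye(d) + V_hat[seen].T @ V_hat[seen]
        bonus = np.sqrt(np.einsum("id,de,ie->i", V_hat, np.linalg.inv(A), V_hat))
        i = int(np.argmax(V_hat @ U_hat[u] + alpha * bonus))  # agent selects the item
        r = R[u, i]  # user provides a rating
        regret[t] = R[u].max() - r  # best rating minus selected rating
        user_obs[u].append((i, r))
        item_obs[i].append((u, r))
        # Alternating least squares: refit user u, then item i.
        items, y = split(user_obs[u])
        U_hat[u] = ridge_solve(V_hat[items], y, lam)
        users, y = split(item_obs[i])
        V_hat[i] = ridge_solve(U_hat[users], y, lam)
    return np.cumsum(regret)

def ndcg_at_k(relevance_in_ranked_order, k):
    """Standard NDCG@k with linear gain and log2 position discount."""
    rel_all = np.asarray(relevance_in_ranked_order, dtype=float)
    rel = rel_all[:k]
    discounts = 1.0 / np.log2(np.arange(2, rel.size + 2))
    dcg = float(rel @ discounts)
    ideal = np.sort(rel_all)[::-1][:k]
    idcg = float(ideal @ discounts)
    return dcg / idcg if idcg > 0 else 0.0
```

Calling `run_toy()` returns the cumulative regret curve, one value per time step; a curve that flattens over time would suggest the agent is converging on each user's best item in this toy setting.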

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. The Bandit's Blind Spot: The Critical Role of User State Representation in Recommender Systems

    cs.IR 2026-04 unverdicted novelty 5.0

    Variations in user state embeddings for CMAB recommenders can improve performance more than changing the bandit algorithm, with no embedding or aggregation strategy dominating across datasets.