Alternating Linear Bandits for Online Matrix-Factorization Recommendation

Hamid Dadkhahi; Sahand Negahban

arxiv: 1810.09401 · v1 · pith:ZE4RVX62new · submitted 2018-10-22 · 💻 cs.IR · cs.LG · stat.ML

classification 💻 cs.IR · cs.LG · stat.ML

keywords: online, selected, algorithm, time, alternating, bandits, collaborative, cumulative

We consider the problem of collaborative filtering in the online setting, where items are recommended to users over time. At each time step, a user (selected by the environment) consumes an item (selected by the agent) and provides a rating of that item. In this paper, we propose a novel algorithm for online matrix-factorization recommendation that combines linear bandits and alternating least squares. In this formulation, the bandit feedback is equal to the difference between the ratings of the best and selected items. We evaluate the performance of the proposed algorithm over time using both cumulative regret and average cumulative NDCG. Simulation results on three synthetic datasets and three real-world datasets for online collaborative filtering indicate that the proposed algorithm outperforms two state-of-the-art online algorithms.
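The interaction loop described in the abstract can be illustrated with a small simulation. The sketch below is not the authors' algorithm: it assumes a LinUCB-style selection rule over the current item factors, a ridge-regularized alternating least-squares refit of the active user and the selected item, and a synthetic nonnegative low-rank rating matrix. All names (`run_sketch`, `ridge_solve`, `ndcg_at_k`) and parameters (`alpha`, `lam`, `noise`, `k`) are hypothetical. It tracks the two metrics the abstract names, cumulative regret (best rating minus selected rating) and the running average of NDCG@k.

```python
import numpy as np

def ndcg_at_k(true_ratings, ranked_items, k):
    """NDCG@k of a ranking against nonnegative true ratings (linear gain)."""
    k = min(k, len(ranked_items))
    discounts = 1.0 / np.log2(np.arange(2, k + 2))
    dcg = float(np.sum(true_ratings[ranked_items[:k]] * discounts))
    idcg = float(np.sum(np.sort(true_ratings)[::-1][:k] * discounts))
    return dcg / idcg if idcg > 0 else 0.0

def ridge_solve(features, targets, lam):
    """Ridge regression; returns the solution and the regularized Gram matrix."""
    d = features.shape[1]
    gram = lam * np.eye(d) + features.T @ features
    return np.linalg.solve(gram, features.T @ targets), gram

def run_sketch(n_users=20, n_items=30, rank=3, steps=500,
               alpha=0.5, lam=1.0, noise=0.1, k=5, seed=0):
    rng = np.random.default_rng(seed)
    # Assumed synthetic ground truth: nonnegative low-rank ratings.
    R = rng.uniform(size=(n_users, rank)) @ rng.uniform(size=(n_items, rank)).T

    U = np.zeros((n_users, rank))                     # estimated user factors
    V = rng.normal(scale=0.1, size=(n_items, rank))   # estimated item factors
    user_obs = [([], []) for _ in range(n_users)]     # (items, ratings) per user
    item_obs = [([], []) for _ in range(n_items)]     # (users, ratings) per item

    regret, ndcg_sum = 0.0, 0.0
    cum_regret, avg_ndcg = [], []
    for t in range(steps):
        i = int(rng.integers(n_users))                # environment picks the user
        items = np.array(user_obs[i][0], dtype=int)
        ratings = np.array(user_obs[i][1], dtype=float)

        # Linear-bandit step: item factors act as contexts for user i.
        u_hat, gram = ridge_solve(V[items], ratings, lam)
        gram_inv = np.linalg.inv(gram)
        width = np.sqrt(np.maximum(np.einsum("jd,de,je->j", V, gram_inv, V), 0.0))
        scores = V @ u_hat
        j = int(np.argmax(scores + alpha * width))    # agent picks the item

        r = R[i, j] + noise * rng.normal()            # observed noisy rating
        regret += R[i].max() - R[i, j]                # best minus selected rating
        ndcg_sum += ndcg_at_k(R[i], np.argsort(-scores), k)
        cum_regret.append(regret)
        avg_ndcg.append(ndcg_sum / (t + 1))

        user_obs[i][0].append(j); user_obs[i][1].append(r)
        item_obs[j][0].append(i); item_obs[j][1].append(r)

        # Alternating least squares: refit user i, then the selected item j.
        U[i], _ = ridge_solve(V[np.array(user_obs[i][0], dtype=int)],
                              np.array(user_obs[i][1]), lam)
        V[j], _ = ridge_solve(U[np.array(item_obs[j][0], dtype=int)],
                              np.array(item_obs[j][1]), lam)
    return np.array(cum_regret), np.array(avg_ndcg)
```

Calling `run_sketch()` returns two arrays of length `steps`. Because each step adds a nonnegative term, the cumulative regret curve is non-decreasing, and a flattening curve would indicate that the agent is selecting near-best items.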

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

The Bandit's Blind Spot: The Critical Role of User State Representation in Recommender Systems
cs.IR 2026-04 unverdicted novelty 5.0

Variations in user state embeddings for CMAB recommenders can improve performance more than changing the bandit algorithm, with no embedding or aggregation strategy dominating across datasets.