Reinforcement learning based recommender systems: A survey. ACM Computing Surveys, 55(7):1–38, 2022.
3 Pith papers cite this work.
citing papers explorer
- PREFER: Personalized Review Summarization with Online Preference Learning
  PREFER is an online preference learning system that generates personalized review summaries and improves alignment with user interests in simulations on Amazon review data.
- Corruption-Tolerant Asynchronous Q-Learning with Near-Optimal Rates
  A novel robust asynchronous Q-learning algorithm achieves finite-time convergence rates that match clean-data bounds up to an additive term proportional to the corruption fraction, with a matching information-theoretic lower bound.
- Time-Constrained Recommendations: Reinforcement Learning Strategies for E-Commerce
  Reinforcement learning policies for time-constrained slate recommendations improve engagement over contextual bandits in e-commerce settings.