Learning Automata Based Q-learning for Content Placement in Cooperative Caching
An optimization problem of content placement in cooperative caching is formulated, with the aim of maximizing the sum mean opinion score (MOS) of mobile users. First, a supervised feed-forward back-propagation connectionist-model-based neural network (SFBC-NN) is invoked for user mobility and content popularity prediction. In particular, practical data collected from a GPS-tracker app on smartphones is used to test the accuracy of mobility prediction. Then, a learning automata-based Q-learning (LAQL) algorithm for cooperative caching is proposed, in which learning automata (LA) are invoked for Q-learning to obtain an optimal action selection in a random and stationary environment. It is proven that the LA-based action selection scheme enables every state to select the optimal action with arbitrarily high probability, provided that Q-learning eventually converges to the optimal Q value. To characterize the performance of the proposed algorithms, the sum MOS of users is used to define the reward function. Extensive simulations reveal that: 1) the prediction error of SFBC-NN decreases as the number of iterations and nodes increases; 2) the proposed LAQL achieves a significant performance improvement over traditional Q-learning; 3) the cooperative caching scheme outperforms non-cooperative caching and random caching by 3% and 4%, respectively.
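To make the LA-based action selection idea concrete, here is a minimal sketch of tabular Q-learning in which each state keeps an action-probability vector that a learning automaton nudges toward the currently greedy action. This is an illustration under assumptions, not the paper's algorithm: the abstract does not give the update rule, so a linear reward-inaction step is used as a stand-in. The names `la_update`, `laql`, `reward_fn`, and `la_rate`, and all default parameter values, are hypothetical. In a caching setting, the reward returned by `reward_fn` would be the sum MOS of users.

```python
import random


def la_update(probs, target, rate):
    """Linear reward-inaction step (assumed scheme): move probability mass
    toward `target` while keeping the vector normalized."""
    return [p + rate * (1.0 - p) if a == target else p * (1.0 - rate)
            for a, p in enumerate(probs)]


def laql(reward_fn, n_states, n_actions, episodes=2000,
         alpha=0.1, gamma=0.9, la_rate=0.05, seed=0):
    """Tabular Q-learning with learning-automata action selection.

    reward_fn(state, action, rng) -> (reward, next_state) is a hypothetical
    environment callback supplied by the caller.
    """
    rng = random.Random(seed)
    q = [[0.0] * n_actions for _ in range(n_states)]
    # Each state starts with a uniform action-probability vector.
    probs = [[1.0 / n_actions] * n_actions for _ in range(n_states)]
    state = 0
    for _ in range(episodes):
        # Sample an action from the automaton's probability vector.
        action = rng.choices(range(n_actions), weights=probs[state])[0]
        reward, next_state = reward_fn(state, action, rng)
        # Standard Q-learning temporal-difference update.
        td_target = reward + gamma * max(q[next_state])
        q[state][action] += alpha * (td_target - q[state][action])
        # Automaton step: reinforce the action that is greedy under Q.
        greedy = max(range(n_actions), key=lambda a: q[state][a])
        probs[state] = la_update(probs[state], greedy, la_rate)
        state = next_state
    return q, probs
```

As Q approaches its optimum, the greedy action stabilizes and its selection probability in each state tends toward one, which mirrors the abstract's claim at a high level. A small `la_rate` keeps exploration alive longer; a large one risks locking onto an action before its Q value is reliable.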
Forward citations
Cited by 1 Pith paper
- Cache-Aided NOMA Mobile Edge Computing: A Reinforcement Learning Approach
Proposes NOMA cache-aided MEC with LSTM task popularity prediction and BLA-based multi-agent Q-learning for joint optimization of offloading, caching and resources, claiming outperformance over benchmarks and optimali...