ISBN 978-1-60558-907-7
2 Pith papers cite this work.
years
2019: 2
representative citing papers
- Advantage-Weighted Regression: Simple and Scalable Off-Policy Reinforcement Learning
AWR learns policies via advantage-weighted supervised regression on actions, achieving competitive off-policy performance on Gym tasks and strong results from static data alone.
- Deep Learning Assisted Sum-Product Detection Algorithm for Faster-than-Nyquist Signaling
A simplified convolutional neural network is inserted as a function node in the sum-product algorithm factor graph for FTN signaling to model residual ISI, with modified message updates enabling turbo equalization and up to 2.5 dB BER gain.