Improving stochastic policy gradients in continuous control with deep reinforcement learning using the Beta distribution

Po-Wei Chou, Daniel Maturana, Sebastian Scherer · 2017

1 Pith paper cites this work. Polarity classification is still indexing.

representative citing papers

The hidden risks of temporal resampling in clinical reinforcement learning

cs.LG · 2026-02-06 · conditional · novelty 6.0

Resampling clinical time series into uniform bins for offline RL reduces performance by up to 60% and causes retrospective evaluations to overestimate returns by 1.5-3x versus unprocessed data.

citing papers explorer

Showing 1 of 1 citing paper.

The hidden risks of temporal resampling in clinical reinforcement learning · cs.LG · 2026-02-06 · conditional · none · ref 63
