Learning and Policy Search in Stochastic Dynamical Systems with Bayesian Neural Networks

2016 · stat.ML · arXiv 1605.07127

3 Pith papers cite this work. Polarity classification is still indexing.


abstract

We present an algorithm for model-based reinforcement learning that combines Bayesian neural networks (BNNs) with random roll-outs and stochastic optimization for policy learning. The BNNs are trained by minimizing $\alpha$-divergences, allowing us to capture complicated statistical patterns in the transition dynamics, e.g. multi-modality and heteroskedasticity, which are usually missed by other common modeling approaches. We illustrate the performance of our method by solving a challenging benchmark where model-based approaches usually fail and by obtaining promising results in a real-world scenario for controlling a gas turbine.
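The roll-out idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: `toy_step_model` is a hypothetical stand-in for drawing one sample from a learned stochastic dynamics model (here with hand-written bimodal, state-dependent noise rather than a BNN), and `policy`, `rollout_cost`, the scalar state, and the quadratic cost are all assumptions made for illustration. The sketch only shows how averaging the cost over random roll-outs gives a Monte Carlo estimate of how well a policy performs under a stochastic model.

```python
import math
import random


def toy_step_model(state, action, rng):
    """Hypothetical stand-in for one sample from a learned stochastic model."""
    mode = rng.choice([-0.1, 0.1])       # two noise modes (multi-modality)
    scale = 0.05 * (1.0 + abs(state))    # noise grows with |state| (heteroskedasticity)
    return state + 0.1 * action + mode + rng.gauss(0.0, scale)


def policy(state, weight):
    """Assumed one-parameter policy; the action is squashed into (-1, 1)."""
    return math.tanh(weight * state)


def rollout_cost(weight, s0, horizon, n_rollouts, rng):
    """Average summed quadratic cost over random roll-outs of the model."""
    total = 0.0
    for _ in range(n_rollouts):
        state = s0
        for _ in range(horizon):
            state = toy_step_model(state, policy(state, weight), rng)
            total += state ** 2          # assumed cost: squared distance from 0
    return total / n_rollouts


if __name__ == "__main__":
    estimate = rollout_cost(weight=-1.0, s0=1.0, horizon=5,
                            n_rollouts=20, rng=random.Random(0))
    print(estimate)
```

A stochastic optimizer would then adjust `weight` to reduce this estimate; that step, and the training of the model itself, are left out of the sketch.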

representative citing papers

On Divergence Measures for Training GFlowNets

cs.LG · 2024-10-12 · unverdicted · novelty 6.0

Introduces statistically efficient estimators for Renyi-α, Tsallis-α, reverse and forward KL divergences with REINFORCE and score-matching control variates for faster GFlowNet training.

Uncertainty-aware Model-based Policy Optimization

cs.LG · 2019-06-25 · unverdicted · novelty 5.0

Introduces a framework that learns an uncertainty-aware dynamics model and optimizes the policy via automatic differentiation through the model, reporting competitive asymptotic performance with significantly lower sample complexity than baselines on continuous control benchmarks.

Calibrated Model-Based Deep Reinforcement Learning

cs.LG · 2019-06-19 · unverdicted · novelty 5.0

Augmenting model-based RL agents with calibrated predictive uncertainties improves planning, sample efficiency, and exploration on continuous control tasks.

