Addressing function approximation error in actor-critic methods

· 2018

9 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

Safety-Constrained Reinforcement Learning with Post-Training Reachability Verification for Robot Navigation

cs.RO · 2026-05-13 · unverdicted · novelty 6.0

CVaR-constrained TD3 policies for robot navigation show larger safety margins and higher post-training reachability verification rates than average-cost baselines across simulated scenarios and real-robot tests.

Self-Predictive Representation for Autonomous UAV Object-Goal Navigation

cs.RO · 2026-04-22 · unverdicted · novelty 6.0

AmelPredSto, a stochastic self-predictive representation model, outperforms other state representation learning approaches when combined with actor-critic RL for object-goal navigation in UAVs.

PriPG-RL: Privileged Planner-Guided Reinforcement Learning for Partially Observable Systems with Anytime-Feasible MPC

cs.LG · 2026-04-09 · unverdicted · novelty 6.0

PriPG-RL trains RL policies for POMDPs by distilling knowledge from a privileged anytime-feasible MPC planner into a P2P-SAC policy, improving sample efficiency and performance in partially observable robotic navigation.

Safety-critical Control Under Partial Observability: Reach-Avoid POMDP meets Belief Space Control

cs.RO · 2026-03-11 · unverdicted · novelty 6.0

A modular belief-space controller using learned Belief Control Lyapunov Functions for information gathering and conformal-prediction Belief Control Barrier Functions for safety reduces reach-avoid POMDP synthesis to fast quadratic programs.

Constraint-Aware Reinforcement Learning via Adaptive Action Scaling

cs.RO · 2025-10-13 · unverdicted · novelty 6.0

A separate regulator module adaptively scales actions in RL to reduce constraint violations while preserving exploration, yielding up to 126x fewer violations and over 10x higher returns on Safety Gym tasks.

Zero-Shot, Safe and Time-Efficient UAV Navigation via Potential-Based Reward Shaping, Control Lyapunov and Barrier Functions

eess.SY · 2026-05-03 · unverdicted · novelty 5.0

PBRS-augmented RL trained in simple settings transfers zero-shot to complex UAV environments when wrapped with a CLF-CBF-QP safety filter, yielding shorter missions and formal safety guarantees.

On Safer Reinforcement Learning for Sedation and Analgesia in Intensive Care

cs.LG · 2026-01-30 · unverdicted · novelty 5.0

Offline RL for ICU sedation shows that adding 30-day mortality to the objective yields policies whose clinician agreement correlates negatively with mortality, unlike pain-only versions.

Morphology-Aware Graph Reinforcement Learning for Tensegrity Robot Locomotion

cs.RO · 2025-10-30 · unverdicted · novelty 5.0

A GNN-augmented SAC policy that encodes tensegrity topology as a graph improves sample efficiency and enables zero-shot sim-to-real locomotion on a 3-bar tensegrity robot.

LLM-Guided Task- and Affordance-Level Exploration in Reinforcement Learning

cs.RO · 2025-09-20 · unverdicted · novelty 5.0

LLM-TALE steers RL exploration using LLM-generated plans at task and affordance levels with online suboptimality correction, improving sample efficiency and success rates on pick-and-place tasks without human supervision.

citing papers explorer

Showing 9 of 9 citing papers.

Safety-Constrained Reinforcement Learning with Post-Training Reachability Verification for Robot Navigation
cs.RO · 2026-05-13 · unverdicted · none · ref 15

Self-Predictive Representation for Autonomous UAV Object-Goal Navigation
cs.RO · 2026-04-22 · unverdicted · none · ref 37

PriPG-RL: Privileged Planner-Guided Reinforcement Learning for Partially Observable Systems with Anytime-Feasible MPC
cs.LG · 2026-04-09 · unverdicted · none · ref 6

Safety-critical Control Under Partial Observability: Reach-Avoid POMDP meets Belief Space Control
cs.RO · 2026-03-11 · unverdicted · none · ref 78

Constraint-Aware Reinforcement Learning via Adaptive Action Scaling
cs.RO · 2025-10-13 · unverdicted · none · ref 28

Zero-Shot, Safe and Time-Efficient UAV Navigation via Potential-Based Reward Shaping, Control Lyapunov and Barrier Functions
eess.SY · 2026-05-03 · unverdicted · none · ref 23

On Safer Reinforcement Learning for Sedation and Analgesia in Intensive Care
cs.LG · 2026-01-30 · unverdicted · none · ref 48

Morphology-Aware Graph Reinforcement Learning for Tensegrity Robot Locomotion
cs.RO · 2025-10-30 · unverdicted · none · ref 16

LLM-Guided Task- and Affordance-Level Exploration in Reinforcement Learning
cs.RO · 2025-09-20 · unverdicted · none · ref 27
