Soft actor-critic: Off-policy maximum entropy deep reinforcement learning with a stochastic actor
10 Pith papers cite this work.
citing papers explorer
- Safety-Constrained Reinforcement Learning with Post-Training Reachability Verification for Robot Navigation
  CVaR-constrained TD3 policies for robot navigation show larger safety margins and higher post-training reachability verification rates than average-cost baselines across simulated scenarios and real-robot tests.
- Visual-Tactile Peg-in-Hole Assembly Learning from Peg-out-of-Hole Disassembly
  A visual-tactile RL method learns peg-in-hole assembly from reversed peg-out-of-hole disassembly trajectories, reaching 87.5% success on seen objects and 77.1% on unseen objects while lowering contact forces.
- EvoGymCM: Harnessing Continuous Material Stiffness for Soft Robot Co-Design
  EvoGymCM introduces continuous material stiffness as a first-class variable in soft robot co-design, with reactive and invariant settings that improve task performance over discrete baselines.
- PriPG-RL: Privileged Planner-Guided Reinforcement Learning for Partially Observable Systems with Anytime-Feasible MPC
  PriPG-RL trains RL policies for POMDPs by distilling knowledge from a privileged anytime-feasible MPC planner into a P2P-SAC policy, improving sample efficiency and performance in partially observable robotic navigation.
- Constraint-Aware Reinforcement Learning via Adaptive Action Scaling
  A separate regulator module adaptively scales actions in RL to reduce constraint violations while preserving exploration, yielding up to 126x fewer violations and over 10x higher returns on Safety Gym tasks.
- Meta-Learning for Rapid Adaptation in Reference Tracking of Uncertain Nonlinear Systems
  A meta-learning framework adapts iMAML for rapid controller tuning on uncertain nonlinear systems, using offline source data and limited online target adaptation, and is demonstrated with neural state-space and DQN variants.
- Efficient Reinforcement Learning using Linear Koopman Dynamics for Nonlinear Robotic Systems
  Koopman-learned linear dynamics enable an online actor-critic RL method that improves sample efficiency and closed-loop performance on nonlinear robotic systems compared with model-free and other model-based baselines.
- Biologically Inspired Event-Based Perception and Sample-Efficient Learning for High-Speed Table Tennis Robots
  Event-based perception combined with progressive low-to-high speed training improves robotic table tennis return accuracy by 35.8% using the same number of training episodes.
- Do We Really Need Immediate Resets? Rethinking Collision Handling for Efficient Robot Navigation
- CART: Context-Aware Terrain Adaptation using Temporal Sequence Selection for Legged Robots