Soft actor-critic: Off-policy maximum entropy deep reinforcement learning with a stochastic actor

· 2018

10 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

Safety-Constrained Reinforcement Learning with Post-Training Reachability Verification for Robot Navigation

cs.RO · 2026-05-13 · unverdicted · novelty 6.0

CVaR-constrained TD3 policies for robot navigation show larger safety margins and higher post-training reachability verification rates than average-cost baselines across simulated scenarios and real-robot tests.

Visual-Tactile Peg-in-Hole Assembly Learning from Peg-out-of-Hole Disassembly

cs.RO · 2026-04-22 · unverdicted · novelty 6.0

A visual-tactile RL method learns peg-in-hole assembly from reversed peg-out-of-hole disassembly trajectories, reaching 87.5% success on seen objects and 77.1% on unseen objects while lowering contact forces.

EvoGymCM: Harnessing Continuous Material Stiffness for Soft Robot Co-Design

cs.RO · 2026-04-09 · unverdicted · novelty 6.0

EvoGymCM introduces continuous material stiffness as a first-class variable in soft robot co-design, with reactive and invariant settings that improve task performance over discrete baselines.

PriPG-RL: Privileged Planner-Guided Reinforcement Learning for Partially Observable Systems with Anytime-Feasible MPC

cs.LG · 2026-04-09 · unverdicted · novelty 6.0

PriPG-RL trains RL policies for POMDPs by distilling knowledge from a privileged anytime-feasible MPC planner into a P2P-SAC policy, improving sample efficiency and performance in partially observable robotic navigation.

Constraint-Aware Reinforcement Learning via Adaptive Action Scaling

cs.RO · 2025-10-13 · unverdicted · novelty 6.0

A separate regulator module adaptively scales actions in RL to reduce constraint violations while preserving exploration, yielding up to 126x fewer violations and over 10x higher returns on Safety Gym tasks.

Meta-Learning for Rapid Adaptation in Reference Tracking of Uncertain Nonlinear Systems

cs.AI · 2026-05-21 · unverdicted · novelty 5.0

A meta-learning framework adapts iMAML for rapid controller tuning on uncertain nonlinear systems, using offline source data and limited online target adaptation; it is demonstrated with neural state-space and DQN variants.

Efficient Reinforcement Learning using Linear Koopman Dynamics for Nonlinear Robotic Systems

cs.RO · 2026-04-21 · unverdicted · novelty 5.0

Koopman-learned linear dynamics enable an online actor-critic RL method that improves sample efficiency and closed-loop performance on nonlinear robotic systems compared with model-free and other model-based baselines.

Biologically Inspired Event-Based Perception and Sample-Efficient Learning for High-Speed Table Tennis Robots

cs.RO · 2026-04-06 · unverdicted · novelty 5.0

Event-based perception combined with progressive low-to-high speed training improves robotic table tennis return accuracy by 35.8% using the same number of training episodes.

Do We Really Need Immediate Resets? Rethinking Collision Handling for Efficient Robot Navigation

cs.RO · 2026-05-04

CART: Context-Aware Terrain Adaptation using Temporal Sequence Selection for Legged Robots

cs.RO · 2026-04-15

citing papers explorer

Showing 10 of 10 citing papers.

Safety-Constrained Reinforcement Learning with Post-Training Reachability Verification for Robot Navigation
cs.RO · 2026-05-13 · unverdicted · none · ref 19

Visual-Tactile Peg-in-Hole Assembly Learning from Peg-out-of-Hole Disassembly
cs.RO · 2026-04-22 · unverdicted · none · ref 23

EvoGymCM: Harnessing Continuous Material Stiffness for Soft Robot Co-Design
cs.RO · 2026-04-09 · unverdicted · none · ref 24

PriPG-RL: Privileged Planner-Guided Reinforcement Learning for Partially Observable Systems with Anytime-Feasible MPC
cs.LG · 2026-04-09 · unverdicted · none · ref 7

Constraint-Aware Reinforcement Learning via Adaptive Action Scaling
cs.RO · 2025-10-13 · unverdicted · none · ref 27

Meta-Learning for Rapid Adaptation in Reference Tracking of Uncertain Nonlinear Systems
cs.AI · 2026-05-21 · unverdicted · none · ref 27

Efficient Reinforcement Learning using Linear Koopman Dynamics for Nonlinear Robotic Systems
cs.RO · 2026-04-21 · unverdicted · none · ref 36

Biologically Inspired Event-Based Perception and Sample-Efficient Learning for High-Speed Table Tennis Robots
cs.RO · 2026-04-06 · unverdicted · none · ref 47

Do We Really Need Immediate Resets? Rethinking Collision Handling for Efficient Robot Navigation
cs.RO · 2026-05-04 · unreviewed · ref 24

CART: Context-Aware Terrain Adaptation using Temporal Sequence Selection for Legged Robots
cs.RO · 2026-04-15 · unreviewed · ref 43
