Betting mechanisms can yield provably more accurate and efficient estimates of real-world robot behavior than Monte Carlo sampling under specified conditions, with practical approximations demonstrated on synthetic data and a robotic manipulator task.
12 Pith papers cite this work.
citing papers explorer
- Betting for Sim-to-Real Performance Evaluation
  Betting mechanisms can yield provably more accurate and efficient estimates of real-world robot behavior than Monte Carlo sampling under specified conditions, with practical approximations demonstrated on synthetic data and a robotic manipulator task.
- HALO: Hybrid Auto-encoded Locomotion with Learned Latent Dynamics, Poincaré Maps, and Regions of Attraction
  HALO learns latent reduced-order models with Poincaré maps for hybrid locomotion dynamics, allowing Lyapunov-based regions of attraction to be lifted from latent space to the full-order system.
- Bounded Ratio Reinforcement Learning
  BRRL derives an analytic optimal policy for regularized constrained RL that guarantees monotonic improvement and yields the BPO algorithm that matches or exceeds PPO.
- ViserDex: Visual Sim-to-Real for Robust Dexterous In-hand Reorientation
  A framework using 3D Gaussian Splatting for visual domain randomization enables robust monocular RGB-based dexterous in-hand reorientation on real hardware for multiple objects under varied lighting.
- PriPG-RL: Privileged Planner-Guided Reinforcement Learning for Partially Observable Systems with Anytime-Feasible MPC
  PriPG-RL trains RL policies for POMDPs by distilling knowledge from a privileged anytime-feasible MPC planner into a P2P-SAC policy, improving sample efficiency and performance in partially observable robotic navigation.
- FlashSAC: Fast and Stable Off-Policy Reinforcement Learning for High-Dimensional Robot Control
  FlashSAC improves training speed and final performance of off-policy RL on high-dimensional robot tasks by reducing update frequency, increasing model scale, and bounding norms to limit critic error accumulation.
- Isaac Lab: A GPU-Accelerated Simulation Framework for Multi-Modal Robot Learning
  Isaac Lab is a unified GPU-native platform combining high-fidelity physics, photorealistic rendering, multi-frequency sensors, domain randomization, and learning pipelines for scalable multi-modal robot policy training.
- RANDPOL: Parameter-Efficient End-to-End Quadruped Locomotion via Randomized Policy Learning
  RANDPOL achieves effective quadruped locomotion by training only the final linear readout of a randomly initialized and fixed neural network policy, matching PPO results with reduced parameters and enabling zero-shot sim-to-real transfer on Unitree Go2.
- Terrain Consistent Reference-Guided RL for Humanoid Navigation Autonomy
  Terrain-consistent reference modulation during RL training yields SE(2)-controllable humanoid locomotion policies that improve tracking in simulation and enable over 70 m closed-loop autonomous navigation on rough terrain and stairs on the Unitree G1 with onboard computation.
- Robotic Strawberry Harvesting with Robust Vision and Deep Reinforcement Learning based Sim-to-Real Control
  A modified YOLO segmentation model plus sim-trained PPO control yields 84.3% overall success harvesting 281 strawberries in greenhouse trials on a real UR10e manipulator.
- The Unified Autonomy Stack: Toward a Blueprint for Generalizable Robot Autonomy
  An open-sourced Unified Autonomy Stack fuses LiDAR, radar, vision and inertial data with sampling-based planning and control barrier functions to deliver resilient autonomy on aerial and ground robots in challenging real-world settings.
- SERNF: Sample-Efficient Real-World Dexterous Policy Fine-Tuning via Action-Chunked Critics and Normalizing Flows