archive
Every paper Pith has read. Search by title, abstract, or pith.
2900 papers in cs.RO · page 10
-
Unified VLA model beats human drivers on driving benchmark
MindVLA-U1: VLA Beats VA with Unified Streaming Architecture for Autonomous Driving
-
Streaming intent produces controllable driving plans in end-to-end model
Action Emergence from Streaming Intent
-
Streaming intent steers driving AI to distinct plans
Action Emergence from Streaming Intent
-
Benchmark finds successful robot tasks often unsafe
SafeManip: A Property-Driven Benchmark for Temporal Safety Evaluation in Robotic Manipulation
-
Specialized heads boost VLA robot success in and out of domain
GuidedVLA: Specifying Task-Relevant Factors via Plug-and-Play Action Attention Specialization
-
IMU suit teleoperates humanoid robot in real time with stable motions
Real-Time Whole-Body Teleoperation of a Humanoid Robot Using IMU-Based Motion Capture with Sim2Sim and Sim2Real Validation
-
Stereo event cameras track 3D hand poses at 30 mm error
EgoEV-HandPose: Egocentric 3D Hand Pose Estimation and Gesture Recognition with Stereo Event Cameras
-
One diffusion policy learns both search and insertion
SI-Diff: A Framework for Learning Search and High-Precision Insertion with a Force-Domain Diffusion Policy
-
Timestep modulation turns diffusion pretraining into efficient robot exploration
TMRL: Diffusion Timestep-Modulated Pretraining Enables Exploration for Efficient Policy Finetuning
-
Symmetry prior speeds up bimanual robot learning
Morphologically Equivariant Flow Matching for Bimanual Mobile Manipulation
-
Three height bands give 49-FPS LiDAR pedestrian detection
TriBand-BEV: Real-Time LiDAR-Only 3D Pedestrian Detection via Height-Aware BEV and High-Resolution Feature Fusion
-
Virtual objectives stabilize twist retargeting for dexterous hands
DexTwist: Dexterous Hand Retargeting for Twist Motion via Mixed Reality-based Teleoperation
-
Mixture of inverse models turns robot video predictions into actions
From Imagined Futures to Executable Actions: Mixture of Latent Actions for Robot Manipulation
-
Bidirectional pose-action loop boosts robot manipulation
X-Imitator: Spatial-Aware Imitation Learning via Bidirectional Action-Pose Interaction
-
Premover cuts robot task time 13.6% by acting on incomplete commands
Premover: Fast Vision-Language-Action Control by Acting Before Instructions Are Complete
-
OrbiSim turns world models into differentiable physics engines
OrbiSim: World Models as Differentiable Physics Engines for Embodied Intelligence
-
World models merge with action generation for embodied AI
World Action Models: The Next Frontier in Embodied AI
-
QOED focuses robot exploration on identifiable parameters
Learning What Matters: Adaptive Information-Theoretic Objectives for Robot Exploration
-
INDI yields lower position errors than geometric NDI on hexarotors
Control of Fully Actuated Aerial Vehicles: A Comparison of Model-based and Sensor-based Dynamic Inversion
-
Robot execution and AI chat boost student code reflection
RoboBlockly Studio: Conversational Block Programming with Embodied Robot Feedback for Computational Thinking
-
Motion statecharts execute semantic tasks on eight robot platforms
Closing the Motion Execution Gap: From Semantic Motion Task Constraints to Kinematic Control
-
Robot blocks unsafe merges at blind intersections
Cooperative Robotics Reinforced by Collective Perception for Traffic Moderation
-
Pre-planned graph branches let robots recover from failures instantly
From Reaction to Anticipation: Proactive Failure Recovery through Agentic Task Graph for Robotic Manipulation
-
LLM evolution designs superior robot navigation rewards
EvoNav: Evolutionary Reward Function Design for Robot Navigation with Large Language Models
-
Multi-view latents and manifold actions boost VLA robotic success
Learning Action Manifold with Multi-view Latent Priors for Robotic Manipulation
-
Body regions dictate robot affective touch strategies
Mapping Embodied Affective Touch Strategies on a Humanoid Robot
-
New sampler prunes robot vision tokens to under 10% with no accuracy loss
See What Matters: Differentiable Grid Sample Pruning for Generalizable Vision-Language-Action Model
-
Grid sampler trims VLA tokens to under 10% with full success
See What Matters: Differentiable Grid Sample Pruning for Generalizable Vision-Language-Action Model
-
Online imitation learning improves navigation via privileged planner labels
NavOL: Navigation Policy with Online Imitation Learning
-
Robots dream short futures to dodge manipulation failures
DreamAvoid: Critical-Phase Test-Time Dreaming to Avoid Failures in VLA Policies
-
Surfaces guide soft gripper to grasp paper sheets
Introducing Environmental Constraints to Grasping Strategies for Paper-Like Flexible Materials Using a Soft Gripper
-
Geometry tuning lets Rainbow DQN master cooperative insertions
Rainbow Deep Q-Learning with Kinematics-Aware Design for Cooperative Delta and 3-RRS Parallel Robot Insertion
-
IEKF and smoother cut long-term error versus MUSE on quadruped data
A Proprioceptive-Only Benchmark for Quadruped State Estimation: ATE, RPE, and Runtime Trade-offs Between Filters and Smoothers
-
One prompt generates full robot learning pipelines
Nautilus: From One Prompt to Plug-and-Play Robot Learning
-
SkyPart discovers semantic parts in drone and satellite images using competing learnable…
Weather-Robust Cross-View Geo-Localization via Prototype-Based Semantic Part Discovery
-
Learnable prototypes separate layout from texture in geo-matching
Weather-Robust Cross-View Geo-Localization via Prototype-Based Semantic Part Discovery
-
Planner achieves zero tip error for continuum robots on arms
Sampling-Based Follow-the-Leader Motion Planning for Manipulator-Mounted Continuum Robots
-
Lightweight Python layer lets users swap robot bodies with little code
RIO: Flexible Real-Time Robot I/O for Cross-Embodiment Robot Learning
-
Diffusion model upgrades low-cost IMU to virtual high-grade data
Overcoming the Intrinsic Performance Limitations of MEMS IMU via Diffusion-Based Generative Learning
-
Benchmark shows intent resolution bottlenecks LLM household agents
PRISM: Planning and Reasoning with Intent in Simulated Embodied Environments
-
Single-agent demos plus cost produce coordinated multi-agent policies
Coordinated Diffusion: Generating Multi-Agent Behavior Without Multi-Agent Demonstrations
-
Liveness operator cuts truncation bias in robot policy evaluation
Offline Policy Evaluation for Manipulation Policies via Discounted Liveness Formulation
-
PPO reformulated to beat SAC in multi-task RL
TOPPO: Rethinking PPO for Multi-Task Reinforcement Learning with Critic Balancing
-
Quadratic cost correction lifts VLA success 28.8% in dynamic scenes
Overcoming Dynamics-Blindness: Training-Free Pace-and-Path Correction for VLA Models
-
Training-free fix lifts VLA success rates up to 28.8% in dynamic scenes
Overcoming Dynamics-Blindness: Training-Free Pace-and-Path Correction for VLA Models
-
Mode discovery prevents collapse in RL-tuned generative policies
Behavioral Mode Discovery for Fine-tuning Multimodal Generative Policies
-
MRF joint aligner reduces collisions in multi-agent paths
JACoP: Joint Alignment for Compliant Multi-Agent Prediction
-
Kairos cuts physical AI task latency by 32-66 percent
Kairos: A Scalable Serving System for Physical AI
-
RL policy learns safe sparse timing via Lyapunov shield
Learning When to Act: Communication-Efficient Reinforcement Learning via Run-Time Assurance
-
Spinning single-propeller drone reduces visibility via motion blur
Computational Design of a Low-Visibility UAV Using a Human-Aligned Perceptual Metric