archive
Every paper Pith has read.
2900 papers in cs.RO · page 1
-
Token selection speeds geometry transformers over 85 percent
Good Token Hunting: A Hitchhiker's Guide to Token Selection for Visual Geometry Transformers
-
Simulation-trained robot harvests strawberries at 84.3% success
Robotic Strawberry Harvesting with Robust Vision and Deep Reinforcement Learning based Sim-to-Real Control
-
Point tracks improve robot world-action models
Point Tracking Improves World Action Models
-
Object sensors lift imitation learning success by 14-25 points
Instrumentation for Imitation Learning: Enhancing Training Datasets for Clothes Hanger Insertion
-
Framework holds robot fleet network traffic to constant order
SFG-ROS: A Resource-Aware Framework for Dense Multi-Agent Perception
-
Direct retargeting from video skips kinematic steps for feasible humanoid motions
Direct Dynamic Retargeting for Humanoid Imitation Learning from Videos
-
Any2Any moves tracking models to new robots at 1% cost
Any2Any: Efficient Cross-Embodiment Transfer for Humanoid Whole-Body Tracking
-
RL policy lands drones on moving ships without platform tracking
Vision-Based Agile Landing on Turbulent Waters
-
125 samples suffice for ANN inverse kinematics accuracy
How Many Training Samples Are Needed for the Inverse Kinematics Solutions by Artificial Neural Networks
-
Static noise analysis sets thresholds for three-channel tactile reflex
TactileReflex: Noise-Statistics-Driven Vision-Tactile Reflex Control for Force-Sensitive Manipulation
-
Semantic routing lifts robot manipulation efficiency
Semantically Structured Mixture-of-Experts for Compositional Robotic Manipulation
-
One stack handles UAV capture
Droneulator: A Portable UAV Simulator for Agricultural Workflows with RotorPy and Godot 4
-
Robots explore multi-floor buildings with hypothetical graphs
Multi-Floor Exploration for Ground Robots via an Incremental Reachable Graph and Structural Priors
-
Assembling trajectories from primitives reduces error ratio to 1.07
Sparse Compositional Flow Matching by geometric assembly from motion primitives
-
Hybrid planner reaches 94.85 on NAVSIM
ChainFlow-VLA: Causal Flow Planning with Vision-Language Models
-
Four-layer network lets robots respond to humans at millisecond speed
6G Communication Networks Enabling Embodied Agents: Architecture and Prototype
-
Convex hull of historical prompts bridges new VLN domains
Turning Adaptation into Assets: Cross-Domain Bridging for Online Vision-Language Navigation
-
Shortest paths on convex graphs yield STL-satisfying trajectories
Signal Temporal Logic Motion Planning via Graphs of Convex Sets
-
Homography mapping yields linear bounds for camera motion verification
Lipschitz Optimization for Formal Verification of Homographies
-
VLMs reach only 5.5% success on implicit intent navigation
IntentionNav: A Benchmark for Intent-Driven Object Navigation from Implicit Human Instruction
-
VLM boosts robot map coverage by 24% in tests
Autonomous Frontier-Based Exploration with VLM Guidance
-
Semantic cues speed drone exploration 13.7 times on average
Semantic-Aware Guided Drone Exploration for Language-Conditioned 3D Indoor Mapping
-
New decoder lifts VLA robot success from 40.4% to 50.2%
$\pi_0$-EqM: Equilibrium Matching for Closed-Loop Vision-Language-Action Control
-
Four estimators cut IMU drift in legged robots with foot contacts
Four Simple Proprioceptive Estimators for Legged Robots
-
Gaussians track view disagreement for depth uncertainty
UfM*: Uncertainty from Motion* for DNN Depth Estimation Using Gaussians
-
One robot manipulates multi-robot RL outcomes via rewards and actions
PIMbot: A Self-Adaptive Attack Framework for Adversarial Manipulation of Multi-Robot Reinforcement Learning
-
Certified Cartesian steps eliminate joint-limit violations
Verified Task-Space Motion Planning Under Joint-Space Constraints
-
Active sensing serves task control
Active Sensing Subserves Task-Level Control
-
Robots detect underspecified features via demo variation and query for fixes
Robots That Know What to Ask: Recovering Misaligned Rewards through Targeted Explanations
-
Self-awareness module improves language-guided navigation
AwareVLN: Reasoning with Self-awareness for Vision-Language Navigation
-
Gestures raise robot object selection accuracy in cluttered scenes
GesVLA: Gesture-Aware Vision-Language-Action Model Embedded Representations
-
Multi-agent RL drones beat humans with half the collisions
Superhuman Safe and Agile Racing through Multi-Agent Reinforcement Learning
-
Learned prep pose cuts parking planning time over 80%
N3P: Accelerated Automated Parking via a Learning-Based Naturalistic Three-Stage Scheme
-
Drone swarm recovers masked AES keys at 0.25 m standoff
TriSweep: A Four-Drone Swarm Framework for Electromagnetic Side-Channel Analysis
-
UAV scouts cut ground robot travel costs by 32-38 percent
Scout-Assisted Planning for Heterogeneous Robot Teams under Partially Known Environments
-
Symmetry compositions across robot spaces boost policy generalization
Symmetries Here and There, Combined Everywhere: Cross-space Symmetry Compositions in Robotics
-
Pure-Python library adds SE(3) operations for robotics without heavy dependencies
SE3Kit: A Lightweight Python Library for Specialized Geometric Primitives in Robotics
-
Agentic-VLA speeds VLA convergence 2.4x with adaptive rewards
Agentic-VLA: Efficient Online Adaptation for Vision-Language-Action Models
-
Dual-interval motion cues decouple ego-motion for UAV detection
Decoupling Ego-Motion from Target Dynamics via Dual-Interval Motion Cues for UAV Detection
-
Branching MPC allows separate plans for each vehicle intention
Branch-Stochastic Model Predictive Control for Motion Planning under Multi-Modal Uncertainty with Scenario Clustering
-
Residual stress learning narrows real-to-sim gap in dynamics
MoSA: Motion-constrained Stress Adaptation for Mitigating Real-to-Sim Gap in Continuum Dynamics via Learning Residual Anisotropy
-
Joint token diffusion policy scales language humanoid control
SCRIPT: Scalable Diffusion Policy with Multi-stage Training for Language-driven Physics-based Humanoid Control
-
Robotic units deliver full-body virtual immersion
Quantifying Full-Body Immersion
-
Multimodal policies fail differently depending on latent or generative setup
Understanding Multimodal Failure in Action-Chunking Behavioral Cloning
-
LLM driving planner cuts lag from 3s to near zero
Steins;Gate Drive: Semantic Safety Arbitration over Structured Futures for Latency-Decoupled LLM Planning
-
Pre-VLA lifts VLA success rates from 31% to 38%
Pre-VLA: Preemptive Runtime Verification for Reliable Vision-Language-Action and World-Model Rollouts
-
Terminal constraints keep UAV visual servoing stable with lost features
Terminal Constraint Model Predictive Control for Image-Based Visual Servoing of UAVs with Kalman Filter-Based Moment Loss Compensation
-
Convex structure speeds dual control to 83 microseconds
Real-Time Auto-Optimization in Unknown Environments via Structure-Exploiting Dual Control for Exploration and Exploitation
-
GenRe generalizes 3D urban scenes to new viewpoints in minutes
Diffusion-guided Generalizable Enhancer for Urban Scene Reconstruction
-
Reasoning makes AI surgical copilots think ahead
How can reasoning capability empower the AI copilot robot in endoscopic surgery