WildBox provides over 237k 3D wildlife annotations from drone video. Benchmarks show zero-shot 3D detection at 0 AP, while fine-tuned models reach 8.68 AP-BEV and 13.17 AP3D, with depth estimation causing most errors.
hub Mixed citations
Lee, Matthew Tan, Yuke Zhu, and Jeannette Bohg
Mixed citation behavior. Most common role is background (67%).
representative citing papers
Uni-Mo generates 7,488 language-annotated quadruped motions via LLM prompts and video diffusion, lifts them to 3D trajectories, and trains policies achieving 96.7% real-robot success on 392 sampled motions.
VLM-PBRS trains a potential function from small-VLM preferences to enable PBRS in RL, improving sample efficiency in Meta-World and Franka Kitchen without reward hacking.
MPC-Injection biases off-policy RL locomotion policies toward controller-induced behavior basins by injecting MPC transitions into the replay buffer.
A matrix-free, GPU-compatible PyTorch implementation of phase-field fracture with explicit dynamics, custom differentiable implicit damage solve, benchmarks on dynamic and quasi-static cases, and inverse recovery of fracture energy G_c via L-BFGS.
WireCraft is a new configurable simulation benchmark for industrial DLO manipulation with three task families, dual physics models, and shared evaluation of RL, IL, and VLA policies showing high success under privileged state but bottlenecks for vision-based methods.
FARM creates an open-vocabulary relational spatial memory that improves object retrieval recall by 164-224% over prior methods on 44k language queries across 67 scenes while running at 5-10 Hz.
CHORUS adapts a single VLA backbone for decentralized control of diverse robot teams, achieving 64-point gains over from-scratch decentralized baselines and outperforming centralized methods in real-world tasks using only local observations.
NDR-SHKF replaces the static forgetting factor in Sage-Husa Kalman Filters with a learned vector-valued memory attenuation policy from a bifurcated recurrent network trained end-to-end on whitened innovations to minimize estimation error.
RAT reformulates regularized natural policy gradients as vanilla gradients with a transformed advantage, computed efficiently via randomized block Kaczmarz iterations on on-policy data.
DiffLNS uses a discrete diffusion initializer to produce warm-start plans that lift LNS2 success rates to 95.8% across 20 congested MAPF settings, generalizing from 96 to 312 agents.
Pose graph optimization is recast as damped Riemannian dynamics on Lie groups, enabling a fully distributed algorithm with a semi-implicit integrator that converges under both synchronous and asynchronous communication.
LLM-Foraging uses off-the-shelf LLMs for decentralized tactical decisions in CPFA-based swarm foraging, collecting more resources than GA-tuned baselines across 36 varied configurations while showing greater consistency.
SPARCS uses a differentiable contact model and sparse Hessian solver to jointly optimize shapes and poses of up to five interacting objects, producing physically valid simulation-ready reconstructions.
Combines offline behavioral cloning with online Real-Time Recurrent RL fine-tuning on LrcSSM models to adapt autonomous driving policies to distribution shifts, validated in simulation and on a real 1:10-scale robot with event camera.
AID trains diffusion policies via behavior cloning on existing MAIPP planners followed by RL fine-tuning to achieve faster execution and higher information gain in multi-agent coordination.
Guided RL using Bezier curves and a UARM model enables efficient, explainable omnidirectional jumping in quadruped robots.
INSANE releases multiple MAV datasets with cross-environment trajectories, rich multi-IMU and camera suites, high-rate vibration data, and sub-centimeter RTK GNSS ground truth for localization research.
CI-MSE improves Spearman's rank correlation between offline validation error and real rollout performance from -0.61 (raw MSE) to -0.87 across policy checkpoints in simulation and real-world robot manipulation experiments.
Causal Spectral Policy decomposes actions spectrally into coarse motion predicted from observations and language, plus conditional fine corrections, outperforming baselines on precision manipulation tasks.
MAGR-BB matches exhaustive search accuracy on multi-agent Blocksworld while reducing hypothesis evaluations by orders of magnitude via RL scoring inside factorized branch-and-bound.
HilDA pre-trains LiDAR backbones via multi-layer and global distillation from vision models plus temporal occupancy diffusion, yielding SOTA results on detection, flow, and occupancy tasks.
A post-hoc predictive safety filter adjusts RL policy contact locations for quadruped robots via sampling-based optimization on a full-physics model, reducing safety violations in cluttered environments with minimal performance deviation.
HORIZON is a recoverability-governed checkpointed frontier curriculum for on-policy physical-domain scaling on quadruped locomotion that identifies three regularities: uneven widening, non-monotonic composition, and the necessity of joint on-policy interaction.
citing papers explorer
- AID: Agent Intent from Diffusion for Multi-Agent Informative Path Planning
AID trains diffusion policies via behavior cloning on existing MAIPP planners followed by RL fine-tuning to achieve faster execution and higher information gain in multi-agent coordination.
- Guided Reinforcement Learning for Omnidirectional 3D Jumping in Quadruped Robots
Guided RL using Bezier curves and a UARM model enables efficient, explainable omnidirectional jumping in quadruped robots.
- Toward Seamless Physical Human-Humanoid Interaction: Insights from Control, Intent, and Modeling with a Vision for What Comes Next
A literature review of pHHI that proposes a taxonomy of interaction types by modality and engagement level while outlining pathways to integrate control, intent, and modeling for more seamless humanoid-human collaboration.
- Online Adaptive Probabilistic Safety Certificate with Language Guidance
A framework integrates user language and probabilistic environment estimates into adaptive safety certificates that guarantee long-term safety for stochastic systems via probabilistic invariance.
- STL-Based Motion Planning and Uncertainty-Aware Risk Analysis for Human-Robot Collaboration with a Multi-Rotor Aerial Vehicle
The paper proposes an STL-based optimization planner with uncertainty-aware risk analysis and event-triggered replanning for safe human-drone collaboration, demonstrated in simulations of an object handover task.
- NOOUGAT: Towards Unified Online and Offline Multi-Object Tracking
NOOUGAT unifies online and offline multi-object tracking with a GNN that processes non-overlapping subclips fused by an Autoregressive Long-term Tracking layer, reporting SOTA gains on DanceTrack, SportsMOT, and MOT20.
- Linking Exteroception and Proprioception through Improved Contact Modeling for Soft Growing Robots
Soft growing robots map unknown 2D environments by characterizing collision deformations, building a geometry-based simulator, and using Monte Carlo sampling to select optimal deployments that approach ideal actions.
- Improving Action Smoothness for a Cascaded Online Learning Flight Control System
Adds temporal smoothness and low-pass filtering to cut oscillations in cascaded online learning flight controllers, shown via FFT and simulations.
- Visual Hand Gesture Recognition with Deep Learning: A Comprehensive Review of Methods, Datasets, Challenges and Future Research Directions
A literature review that categorizes deep learning approaches for visual hand gesture recognition, summarizes state-of-the-art methods across tasks, reviews datasets and metrics, and identifies challenges and future directions.
- House of Dextra: Cross-embodied Co-design for Dexterous Hands
- QuickLAP: Quick Language-Action Preference Learning for Semi-Autonomous Agents