Bidirectional attention network for monocular depth estimation
Mixed citation behavior. Most common role is background (67%).
citing papers explorer
-
Learned Memory Attenuation in Sage-Husa Kalman Filters for Robust UAV State Estimation
NDR-SHKF replaces the static forgetting factor in Sage-Husa Kalman Filters with a learned vector-valued memory attenuation policy from a bifurcated recurrent network trained end-to-end on whitened innovations to minimize estimation error.
-
Randomized Advantage Transformation (RAT): Computing Natural Policy Gradients via Direct Backpropagation
RAT reformulates regularized natural policy gradients as vanilla gradients with a transformed advantage, computed efficiently via randomized block Kaczmarz iterations on on-policy data.
-
Discrete Diffusion for Complex and Congested Multi-Agent Path Finding with Sparse Social Attention
DiffLNS uses a discrete diffusion initializer to produce warm-start plans that lift LNS2 success rates to 95.8% across 20 congested MAPF settings, generalizing from 96 to 312 agents.
-
Distributed Pose Graph Optimization via Continuous Riemannian Dynamics
Pose graph optimization is recast as damped Riemannian dynamics on Lie groups, enabling a fully distributed algorithm with a semi-implicit integrator that converges under both synchronous and asynchronous communication.
-
LLM-Foraging: Large Language Models for Decentralized Swarm Robot Foraging
LLM-Foraging uses off-the-shelf LLMs for decentralized tactical decisions in CPFA-based swarm foraging, collecting more resources than GA-tuned baselines across 36 varied configurations while showing greater consistency.
-
Simulation-Ready Cluttered Scene Estimation via Physics-aware Joint Shape and Pose Optimization
SPARCS uses a differentiable contact model and sparse Hessian solver to jointly optimize shapes and poses of up to five interacting objects, producing physically valid simulation-ready reconstructions.
-
Adaptive Control in Autonomous Driving via Real-Time Recurrent RL
Combines offline behavioral cloning with online Real-Time Recurrent RL fine-tuning on LrcSSM models to adapt autonomous driving policies to distribution shifts, validated in simulation and on a real 1:10-scale robot with an event camera.
-
AID: Agent Intent from Diffusion for Multi-Agent Informative Path Planning
AID trains diffusion policies via behavior cloning on existing MAIPP planners followed by RL fine-tuning to achieve faster execution and higher information gain in multi-agent coordination.
-
Guided Reinforcement Learning for Omnidirectional 3D Jumping in Quadruped Robots
Guided RL using Bézier curves and a UARM model enables efficient, explainable omnidirectional jumping in quadruped robots.
-
INSANE: Cross-Domain UAV Data Sets with Increased Number of Sensors for developing Advanced and Novel Estimators
INSANE releases multiple MAV datasets with cross-environment trajectories, rich multi-IMU and camera suites, high-rate vibration data, and sub-centimeter RTK GNSS ground truth for localization research.
-
VBT-MPC: Vision-Based Tactile MPC for Contour Following
VBT-MPC performs robotic contour following by running MPC directly in vision-based tactile contour feature space and is tested on varied geometries in simulation and real experiments.
-
ARC-RL: A Reinforcement Learning Playground Inspired by ARC Raiders
ARC-RL is a new suite of four MuJoCo continuous-control environments featuring game-inspired hexapod and quadruped morphologies, a single closed-form multi-component reward function, CPG demonstrators, and empirical comparisons of online and offline-to-online RL algorithms.
-
BoolXLLM: LLM-Assisted Explainability for Boolean Models
BoolXLLM augments an existing Boolean rule learner with LLMs for feature selection, discretization thresholds, and natural-language rule translation to improve interpretability while preserving accuracy.
-
TouchAnything: Diffusion-Guided 3D Reconstruction from Sparse Robot Touches
TouchAnything reconstructs accurate 3D object geometries from only a few tactile contacts by optimizing for consistency with a pretrained visual diffusion prior.
-
From Local Matches to Global Masks: Template-Guided Instance Detection and Segmentation in Open-World Scenes
L2G-Det detects and segments novel object instances in open scenes by using local template patch matches to generate points that prompt an augmented SAM for global masks.
-
House of Dextra: Cross-embodied Co-design for Dexterous Hands
A co-design framework learns task-specific hand shapes and complementary control policies, supporting design, training, fabrication, and deployment of new dexterous hands in under 24 hours.
-
QuickLAP: Quick Language-Action Preference Learning for Semi-Autonomous Agents
QuickLAP fuses LLM-extracted language observations with physical feedback in a closed-form Bayesian update to cut reward learning error by over 70% in a driving simulator and improve user preference in a 15-person study.
-
Learning Multi-Modal Whole-Body Control for Real-World Humanoid Robots
A single learned controller called MHC enables real humanoid robots to execute diverse whole-body behaviors from multi-modal inputs via masked target trajectories.
-
FusionSense: Tri-Stage Near-Sensor Learning for Runtime-Adaptive Multimodal Edge Intelligence
FusionSense uses server-side fusion learning, filter-out-safe labels, and edge compaction to enable runtime-adaptive multimodal sensing that cuts energy up to 33x while preserving task quality on RGB+Depth data.
-
Visibility-Aware Mobile Grasping in Dynamic Environments
A visibility-aware mobile grasping system with iterative whole-body planning and behavior-tree subgoal generation achieves 68.8% success in unknown static environments and 58% in dynamic environments, outperforming a baseline by 22.8% and 18%, respectively.
-
From Spherical to Gaussian: A Comparative Analysis of Point Cloud Cropping Strategies in Large-Scale 3D Environments
Gaussian and related cropping strategies for point cloud subclouds improve 3D neural network performance over spherical cropping on large outdoor scenes.
-
BIEVR-LIO: Robust LiDAR-Inertial Odometry through Bump-Image-Enhanced Voxel Maps
BIEVR-LIO improves robustness of LiDAR-inertial odometry by representing maps as voxel-wise oriented height images and sampling points only from geometrically informative regions.
-
Hierarchical Awareness Adapters with Hybrid Pyramid Feature Fusion for Dense Depth Prediction
A multilevel perceptual CRF model using Swin Transformer, HPF fusion, HA adapters, and dynamic scaling attention achieves state-of-the-art monocular depth estimation on NYU Depth v2, KITTI, and MatterPort3D with reduced error and fast inference.
-
ARROW: Augmented Replay for RObust World models
ARROW adds a distribution-matching long-term replay buffer to DreamerV3 and shows reduced forgetting versus same-size baselines on Atari and Procgen continual RL benchmarks.
-
Toward Seamless Physical Human-Humanoid Interaction: Insights from Control, Intent, and Modeling with a Vision for What Comes Next
A literature review of pHHI that proposes a taxonomy of interaction types by modality and engagement level while outlining pathways to integrate control, intent, and modeling for more seamless humanoid-human collaboration.
-
Online Adaptive Probabilistic Safety Certificate with Language Guidance
A framework integrates user language and probabilistic environment estimates into adaptive safety certificates that guarantee long-term safety for stochastic systems via probabilistic invariance.
-
STL-Based Motion Planning and Uncertainty-Aware Risk Analysis for Human-Robot Collaboration with a Multi-Rotor Aerial Vehicle
The paper proposes an STL-based optimization planner with uncertainty-aware risk analysis and event-triggered replanning for safe human-drone collaboration, demonstrated in simulations of an object handover task.
-
NOOUGAT: Towards Unified Online and Offline Multi-Object Tracking
NOOUGAT unifies online and offline multi-object tracking with a GNN that processes non-overlapping subclips fused by an Autoregressive Long-term Tracking layer, reporting SOTA gains on DanceTrack, SportsMOT, and MOT20.
-
Linking Exteroception and Proprioception through Improved Contact Modeling for Soft Growing Robots
Soft growing robots map unknown 2D environments by characterizing collision deformations, building a geometry-based simulator, and using Monte Carlo sampling to select optimal deployments that approach ideal actions.
-
4D Radar Semantic Segmentation of People in Field Conditions Using Temporal Multi-View Networks
TMVA4D uses CNN and ConvLSTM encoders on multi-view 2D projections of 4D radar point clouds for semantic segmentation of people, reporting Dice 75.9% and IoU 61.2% in field tests.
-
A Systematic Survey on Deep Learning Architectures for Point Cloud Classification and Segmentation
A systematic literature survey that categorizes deep learning architectures for point cloud classification, part segmentation, and semantic segmentation, evaluates them on benchmarks, and discusses innovations, limitations, and future directions.
-
The Unified Autonomy Stack: Toward a Blueprint for Generalizable Robot Autonomy
An open-sourced Unified Autonomy Stack fuses LiDAR, radar, vision and inertial data with sampling-based planning and control barrier functions to deliver resilient autonomy on aerial and ground robots in challenging real-world settings.
-
Smoothing Out the Edges: Continuous-Time Estimation with Gaussian Process Motion Priors on Factor Graphs
The paper recasts Gaussian-process continuous-time estimation in factor-graph language and supplies three GTSAM implementations to lower the barrier to adoption.
-
Explainable Planning for Hybrid Systems
A comprehensive study on generating explanations for automated planning in hybrid systems.
-
Improving Action Smoothness for a Cascaded Online Learning Flight Control System
Adds temporal smoothness and low-pass filtering to cut oscillations in cascaded online learning flight controllers, shown via FFT and simulations.
-
Optimal Gait Control for a Tendon-driven Soft Quadruped Robot by Model-based Reinforcement Learning
Develops and tests a model-based RL controller with post-training for gait in a tendon-driven soft quadruped, reporting improved efficiency and robustness over benchmarks.
-
Visual Hand Gesture Recognition with Deep Learning: A Comprehensive Review of Methods, Datasets, Challenges and Future Research Directions
A literature review that categorizes deep learning approaches for visual hand gesture recognition, summarizes state-of-the-art methods across tasks, reviews datasets and metrics, and identifies challenges and future directions.