Canonical reference
Monocular visual-inertial odometry in low-textured environments with smooth gradients: A fully dense direct filtering approach
Canonical reference. 80% of citing Pith papers cite this work as background.
representative citing papers
Timed reward machines extend reward machines with timing constraints, allowing model-free RL algorithms to learn policies that satisfy precise temporal requirements on standard benchmarks.
INSANE releases multiple MAV datasets with cross-environment trajectories, rich multi-IMU and camera suites, high-rate vibration data, and sub-centimeter RTK GNSS ground truth for localization research.
citing papers explorer
-
Think While You Map: Asynchronous Vision-Language Agents for Incremental 3D Scene Graphs
An asynchronous architecture decouples incremental voxel-based mapping from VLM-based semantic enrichment to produce queryable open-vocabulary 3D scene graphs that match or exceed prior methods on segmentation and grounding benchmarks.
-
Event-based Gaze Control System for Accurate Real-time Spin Estimation in Professional Ball Games
An event-camera system with active gaze control and contrast-maximization spin estimation achieves real-time performance in table tennis with 8.8% magnitude error, 6.4° axis error, 3 ms latency, and 750 Hz throughput.
-
SA-LIVO: Efficient LiDAR-Inertial-Visual Odometry with Subspace-Aware Degeneracy Handling
SA-LIVO uses eigendecomposition of the joint information matrix with linear-clamp soft gates per eigendirection for efficient degeneracy-aware LiDAR-inertial-visual odometry.
-
Asymptotically Optimal Ergodic Coverage on Generalized Motion Fields
A flow-adaptive ergodic coverage formulation using MMD that preserves guarantees over evolving domains and supports open-loop planning for robots in flows.
-
BEVCALIB: LiDAR-Camera Calibration via Geometry-Guided Bird's-Eye View Representations
BEVCALIB performs LiDAR-camera calibration from raw data by fusing camera and LiDAR bird's-eye view features with a novel feature selector and reports state-of-the-art accuracy on KITTI and nuScenes.
-
Residual Feature Integration is Sufficient to Prevent Negative Transfer
Residual feature integration with a trainable target-side encoder provably prevents negative transfer, achieving convergence rates no worse than training from scratch under informative target distributions.
-
A Fast Convergent Algorithm for Solving Non-convex Partially-Decoupled Generalized Nash Equilibrium Problems
The FALCON algorithm solves non-convex partially-decoupled GNEPs via SCP and potential games, claiming global convergence to open-loop Nash equilibria under mild assumptions.
-
Language as a Sensor: Calibrated Spatial Belief Estimation in 3D Scenes from Natural Language
Introduces LSM, which outputs calibrated multimodal spatial distributions from language plus a scene graph; these are fused via VL-Map to improve 3D target localization on the VLA-3D benchmark and a real robot.
-
Safety-Critical LiDAR-Inertial Odometry with On-Manifold Deterministic Protection Level
A LiDAR-inertial odometry pipeline using on-manifold ellipsoidal set-membership filtering to output feasible sets as deterministic protection levels under unknown-but-bounded point-cloud noise.
-
RigidFormer: Learning Rigid Dynamics using Transformers
RigidFormer learns mesh-free rigid dynamics from point clouds using object-centric anchors, Anchor-Vertex Pooling, Anchor-based RoPE, and differentiable Kabsch alignment to enforce rigidity.
-
Dr-BA: Separable Optimization for Direct Radar Bundle Adjustment & Localization
Dr-BA delivers a separable optimization approach for direct radar bundle adjustment and cross-session localization using full spinning-radar intensity images, achieving state-of-the-art performance on over 200 km of on-road data.
-
Physical Knot Classification Beyond Accuracy: A Benchmark and Diagnostic Study
A new knot classification benchmark and topology-aware supervision methods yield small specificity gains but confirm that appearance bias remains the dominant failure mode.
-
Iteratively Learning Muscle Memory for Legged Robots to Master Adaptive and High Precision Locomotion
Integrates iterative learning control with a torque library to enable high-precision adaptive locomotion on bipedal and quadrupedal robots, reducing tracking errors by up to 85% and achieving over 30x faster control rates.
-
The Hive Mind is a Single Reinforcement Learning Agent
A bee hive mind driven by weighted-voter imitation is equivalent to a single RL agent that uses a new multi-armed bandit rule called Maynard-Cross Learning.
-
UECP: Uncertainty-Enhanced Collaborative Perception
UECP replaces detection-correlated confidence maps with a LiDAR point-density uncertainty map and introduces Uncertainty-Aware Pyramid Fusion to improve collaborative perception.
-
KITE: Decoupling Kinematics and Interaction for Zero-Shot Cross-Embodiment Manipulation
KITE decouples task reasoning from embodiment-specific control via learned latent interaction intents to enable zero-shot transfer across structurally different robots.
-
Discrete Geometric Modeling and Extended State Estimation of Continuum Robots
A fully discrete strain-based model for continuum robot dynamics via Lie group variational integrators, combined with an EKF-based observer for states and disturbances, validated on hardware.
-
Learning-Based Adaptive Control for Surgical Robotic Exposure Task on Deformable Tissues
A simulation-trained deep deformation model combined with online adaptive control enables zero-shot autonomous tissue retraction for ROI exposure in robotic surgery.
-
Introducing Environmental Constraints to Grasping Strategies for Paper-Like Flexible Materials Using a Soft Gripper
Systematic grasping strategies for paper-like materials are developed and tested with a soft gripper by exploiting environmental constraints to improve force control and success rates.
-
Safe reinforcement learning with online filtering for fatigue-predictive human-robot task planning and allocation in production
PF-CD3Q uses online particle filtering to estimate fatigue parameters and constrains a deep Q-learning agent to solve fatigue-aware human-robot task planning as a CMDP.
-
PRISM: Color-Stratified Point Cloud Sampling
PRISM downsamples point clouds by stratifying on RGB color bins with a maximum capacity k per bin to preserve high chromatic diversity regions over homogeneous surfaces.
-
Robust and Safe Multi-Agent Reinforcement Learning with Communication for Autonomous Vehicles: From Simulation to Hardware
RSR-RSMARL is a robust safe MARL framework with V2V communication and CBF safety shields that supports zero-shot sim-to-real transfer and improves coordination on 1/10-scale vehicle hardware.
-
Optimizing Agricultural Drone Operations: From Launch and Recovery Siting to Tiered Routing Strategies
A p-median heuristic for facility siting and 6-8 cluster tiered routing together reduce drone operation planning time by 1-3 orders of magnitude with a loss of 4% or less in serviced area.
-
On-Device Robotic Planning: Eliminating Inference Redundancy for Efficient Decision-Making
REIS reduces inference redundancy in embodied robotic planning via lightweight gating and routing while preserving task performance on ALFRED and real robots.
-
DigiForest: Digital Analytics and Robotics for Sustainable Forestry
DigiForest integrates heterogeneous autonomous robots for data collection, automated tree trait extraction, a decision support system for growth forecasting, and autonomous harvesters for selective logging, with real-world tests in European forests.
-
Trajectory Prediction for Autonomous Driving: Progress, Limitations, and Future Directions
A survey of trajectory prediction techniques for autonomous vehicles that proposes a taxonomy, overviews the prediction pipeline, and highlights remaining research gaps.
-
Attentive Dilated Convolution for Automatic Sleep Staging using Force-directed Layout
AttDiCNN reaches 98.56%, 99.66%, and 99.08% accuracy on EDFX, HMC, and NCH sleep datasets via force-directed visibility graph EEG representations and a three-module attentive dilated CNN architecture.
-
Foundations of Future Communication Systems: Innovations in Communication - A Report
The report assembles abstracts of invited talks, presentations, and posters from the FFCS conference on foundational limits and emerging paradigms in communication.