PointNet: Deep Learning on Point Sets for 3D Classification and Segmentation
12 Pith papers cite this work. Polarity classification is still indexing.
verdicts: UNVERDICTED 12
roles: background 1
polarities: background 1
citing papers explorer
- Resolving Long-Tail Ambiguity in Unsupervised 3D Point Cloud Segmentation with Language Priors
  LangTail uses entity-level semantic priors from language models aligned via contrastive learning in a hierarchical clustering setup to resolve long-tail ambiguity, yielding +13.5, +12.9, and +8.9 mIoU gains on ScanNet-v2, S3DIS, and nuScenes.
- DockAnywhere: Data-Efficient Visuomotor Policy Learning for Mobile Manipulation via Novel Demonstration Generation
  DockAnywhere lifts single demonstrations to diverse docking points via structure-preserving augmentation and point-cloud spatial editing to improve viewpoint generalization in visuomotor policies for mobile manipulation.
- CoLA-Flow Policy: Temporally Coherent Imitation Learning via Continuous Latent Action Flow Matching for Robotic Manipulation
  CoLA-Flow Policy encodes action sequences into a continuous latent space and learns an explicit flow there, yielding near-single-step inference with up to 93.7% smoother trajectories and 25-point higher task success than raw-action flow baselines.
- Geometry-Aware Cross Modal Alignment for Light Field-LiDAR Semantic Segmentation
  Proposes the first light field-LiDAR semantic segmentation dataset and the Mlpfseg network, which improves mIoU by 1.71 over image-only and 2.38 over point-cloud-only baselines via feature completion and depth perception modules.
- HITL-D: Human In The Loop Diffusion Assisted Shared Control
  HITL-D combines diffusion policies with human input for shared robotic control, reducing required joystick axes and improving speed and workload in manipulation tasks per a 12-participant study.
- Real2Sim: A Physics-driven and Editable Gaussian Splatting Framework for Autonomous Driving Scenes
  Real2Sim reconstructs editable dynamic driving scenes as temporally continuous Gaussians integrated with a differentiable MPM physics solver for high-fidelity simulation of interactions and collisions.
- Safety-critical Control Under Partial Observability: Reach-Avoid POMDP meets Belief Space Control
  A modular belief-space controller using learned Belief Control Lyapunov Functions for information gathering and conformal-prediction Belief Control Barrier Functions for safety reduces reach-avoid POMDP synthesis to fast quadratic programs.
- MapRF: Weakly Supervised Online HD Map Construction via NeRF-Guided Self-Training
  MapRF reaches about 75% of fully supervised HD map accuracy on Argoverse 2 and nuScenes by generating view-consistent pseudo labels via a NeRF conditioned on map predictions and refining them with Map-to-Ray Matching in self-training.
- Channel-Level Relation to Attentive Aggregation with Neighborhood-Homogeneity Constraint for Point Cloud Analysis
  PointCRA reduces information loss in deep point cloud networks by treating temporal trend variation as an extra evaluation dimension alongside spatial and channel attention, guided by a neighborhood homogeneity constraint.
- FastGrasp: Learning-based Whole-body Control method for Fast Dexterous Grasping with Mobile Manipulators
  FastGrasp uses two-stage RL with CVAE for diverse grasp candidates from point clouds and tactile sensing for impact adjustments to achieve robust fast whole-body grasping in sim and real-world settings.
- A Unified Non-Parametric and Interpretable Point Cloud Analysis via t-FCW Graph Representation
  Empowered t-FCW graph representation provides a unified non-parametric and interpretable method for point cloud analysis with high efficiency on ModelNet40 classification.
- EdgeLPR: On the Deep Neural Network trade-off between Precision and Performance in LiDAR Place Recognition
  FP16 quantization preserves accuracy in BEV-based LiDAR place recognition at lower cost while INT8 degradation depends on the network architecture.