Title resolution pending

2025 · arXiv 2509.21986

3 Pith papers cite this work. Polarity classification is still indexing.

Title metadata for this work has not finished resolving. The hub is built from the citation graph; the title resolver retries DOI and OpenAlex on its next pass.

representative citing papers

ActiveMimic: Egocentric Video Pretraining with Active Perception

cs.RO · 2026-06-04 · unverdicted · novelty 6.0

ActiveMimic pretrains on egocentric human video by recovering and modeling active camera motion as viewpoint actions, matching robot-data pretraining performance on real-world tasks.

AFUN: Towards an Affordance Foundation Model for Functionality Understanding

cs.RO · 2026-06-01 · unverdicted · novelty 6.0

AFUN predicts task-conditional functional masks and 3D post-contact motion curves from RGB-D and language, trained via a standardized multi-source data pipeline, and reports large gains over baselines on segmentation, contact prediction, and motion tasks.

From Human Videos to Robot Manipulation: A Survey on Scalable Vision-Language-Action Learning with Human-Centric Data

cs.RO · 2026-05-18 · unverdicted · novelty 3.0

The paper surveys four classes of techniques that derive action-related supervision from human videos for VLA robot models and identifies three open challenges in episode structuring, embodiment grounding, and evaluation.

citing papers explorer

Showing 3 of 3 citing papers after filters.

ActiveMimic: Egocentric Video Pretraining with Active Perception
cs.RO · 2026-06-04 · unverdicted · none · ref 13
AFUN: Towards an Affordance Foundation Model for Functionality Understanding
cs.RO · 2026-06-01 · unverdicted · none · ref 84
From Human Videos to Robot Manipulation: A Survey on Scalable Vision-Language-Action Learning with Human-Centric Data
cs.RO · 2026-05-18 · unverdicted · none · ref 75
