FIDeL detects failures in imitation learning by building compact nominal representations via optimal transport, applying conformal prediction thresholds, and using VLMs for semantic filtering, outperforming baselines by 5.3% AUROC and 17.38% accuracy on the new BotFails dataset.
Primal Wasserstein Imitation Learning. arXiv preprint arXiv:2006.04678
2 Pith papers cite this work. Polarity classification is still indexing.
TimeRewarder: Learning Dense Reward from Passive Videos via Frame-wise Temporal Distance
TimeRewarder derives step-wise progress rewards from frame-wise temporal distances in passive videos and uses them to guide RL, achieving high success rates on Meta-World tasks with fewer interactions than prior methods or hand-designed rewards.