Learning transferable visual models from natural language supervision
3 Pith papers cite this work. Polarity classification is still indexing.
Years: 2026 (3)
Verdicts: UNVERDICTED (3)
citing papers explorer
- ARM: Advantage Reward Modeling for Long-Horizon Manipulation
  ARM trains reward models on Progressive/Regressive/Stagnant labels to enable adaptive reweighting in offline RL, reaching 99.4% success on towel-folding with minimal human intervention.
- Recall to Predict: Grounding Motion Forecasting in Interpretable Motion Bank
  A differentiable motion forecasting model retrieves and refines interpretable trajectory anchors from a contrastively learned motion bank to improve transparency without sacrificing multi-modal accuracy.
- ID-Sim: An Identity-Focused Similarity Metric
  ID-Sim is a new similarity metric that aims to capture human selective sensitivity to identities by training on curated real and generative synthetic data and validating against human annotations on recognition, retrieval, and generative tasks.