CURL: Contrastive Unsupervised Representations for Reinforcement Learning
3 Pith papers cite this work. Polarity classification is still indexing.
citation summary
years: 2026 (3)
roles: background (2)
polarities: background (2)
representative citing papers
RC-aux corrects spatiotemporal mismatch in reconstruction-free latent world models by adding multi-horizon prediction and reachability supervision, improving planning performance on goal-conditioned pixel-control tasks.
A procedure builds provably minimal Markovian states from a longitudinal causal graph, but deep RL requires multi-order historical state exposure (MOSE) to realize gains over minimal or fixed-window baselines.
citing papers explorer
- What to Ignore, What to React: Visually Robust RL Fine-Tuning of VLA Models
  PAIR-VLA adds invariance and sensitivity objectives over paired visual variants during PPO fine-tuning of VLA models, yielding 9-16% average gains on ManiSkill3 under distractors, textures, poses, viewpoints, and lighting shifts.
- Predictive but Not Plannable: RC-aux for Latent World Models
  RC-aux corrects spatiotemporal mismatch in reconstruction-free latent world models by adding multi-horizon prediction and reachability supervision, improving planning performance on goal-conditioned pixel-control tasks.
- Integrating Causal DAGs in Deep RL: Activating Minimal Markovian States with Multi-Order Exposure
  A procedure builds provably minimal Markovian states from a longitudinal causal graph, but deep RL requires multi-order historical state exposure (MOSE) to realize gains over minimal or fixed-window baselines.