Counterfactual transport flows enable conservative, instance-specific trajectory refinement in offline RL by constructing local preference pairs in latent space from offline data and learning refinement directions controlled by a strength parameter.
arXiv preprint arXiv:2311.03630
Counterfactual Transport Flows for Offline Conservative Trajectory Refinement