SeqRejectron constructs a stopping rule with a small set of validator policies to achieve horizon-free sample complexity for selective imitation learning under arbitrary dynamics shifts.
Advances in Neural Information Processing Systems
7 Pith papers cite this work. Polarity classification is still indexing.
Years: 2026 (7)
Roles: background (1)
Polarities: background (1)
Representative citing papers:
DOSER detects out-of-distribution (OOD) actions via diffusion-model denoising error and applies selective regularization based on predicted transitions, proving a gamma-contraction with performance bounds and outperforming prior methods on offline RL benchmarks.
DRIFT enables stable offline-to-online fine-tuning of CTMC policies in discrete RL via advantage-weighted discrete flow matching, path-space regularization, and candidate-set approximation.
Geometric Pareto Control embeds Pareto solutions in a Lie group submanifold and navigates via Riemannian gradient flow to achieve 100% feasibility and low suboptimality in control tasks without retraining.