Advances in Neural Information Processing Systems
5 Pith papers cite this work. Polarity classification is still indexing.
Years: 2026 (5)
Verdicts: UNVERDICTED (5)
Representative citing papers
citing papers explorer
- Learning myopic mixed-integer nonlinear model predictive control from expert demonstrations
  A myopic MINMPC framework learns a value function offline via inverse optimization from expert data, allowing short horizons with near-optimal performance and strict integer feasibility online for hybrid systems.
- Wahkon: A Statistically Principled Deep RKHS Superposition Network
  Wahkon unifies Kolmogorov superposition with RKHS regularization to produce a deep network whose penalized estimator is exactly the MAP under a hierarchical GP prior and achieves minimax-optimal rates.
- Ensemble Distributionally Robust Bayesian Optimisation
  A tractable ensemble distributionally robust Bayesian optimization method achieves improved sublinear regret bounds under context uncertainty.
- ERPPO: Entropy Regularization-based Proximal Policy Optimization
  ERPPO adds a DSA-based ambiguity estimator to MAPPO and switches between L1 and L2 entropy regularization to improve exploration and stability in non-stationary multi-dimensional observations.
- RASP-Tuner: Retrieval-Augmented Soft Prompts for Context-Aware Black-Box Optimization in Non-Stationary Environments
  RASP-Tuner matches or beats GP-UCB and CMA-ES regret on seven of nine synthetic non-stationary tasks while running 8-12 times faster per step.