Econometrica: Journal of the Econometric Society
5 Pith papers cite this work. Polarity classification is still indexing.
citation-role summary
citation-polarity summary
years: 2026 (5)
verdicts: UNVERDICTED (5)
roles: background (1)
polarities: background (1)
citing papers explorer
- Towards Efficient and Expressive Offline RL via Flow-Anchored Noise-conditioned Q-Learning
  FAN achieves state-of-the-art offline RL performance on robotic tasks by anchoring flow policies and using single-sample noise-conditioned Q-learning, with proven convergence and reduced runtimes.
- From Prediction to Practice: A Task-Aware Evaluation Framework for Blood Glucose Forecasting
  A new evaluation framework shows that blood glucose forecasting models with high overall accuracy often fail at timely hypoglycemia detection in high-risk periods and at predicting effects of changed insulin doses.
- QHyer: Q-conditioned Hybrid Attention-mamba Transformer for Offline Goal-conditioned RL
  QHyer replaces return-to-go with a state-conditioned Q-estimator and adds a gated hybrid attention-mamba backbone to achieve state-of-the-art performance in offline goal-conditioned RL on both Markovian and non-Markovian datasets.
- Refining Covariance Matrix Estimation in Stochastic Gradient Descent Through Bias Reduction
  A novel bias-reduced online covariance estimator for SGD achieves a convergence rate of n^((α-1)/2) · √(log n) without second-order derivatives.
- Distributional Off-Policy Evaluation with Deep Quantile Process Regression
  DQPOPE estimates the entire return distribution in off-policy evaluation via deep quantile process regression, providing statistical advantages over standard single-value methods with equivalent sample sizes.