URL http://arxiv.org/abs/ 2105.11066

doi: 10 · 2007 · arXiv 2105.11066

2 Pith papers cite this work. Polarity classification is still indexing.

2 Pith papers citing it

read on arXiv browse 2 citing papers

citation-role summary

background 1

citation-polarity summary

background 1

representative citing papers

Priced Motion Through Optimal Faces: A Normal-Fan Geometry for Non-Stationary Adversarial MDPs

cs.LG · 2026-06-27 · unverdicted · novelty 7.0

Introduces priced face-crossing via normal-fan geometry on occupancy polytopes to decompose dynamic regret into intrinsic motion cost plus within-face error in non-stationary adversarial MDPs.

Reference-Sampled Boltzmann Projection for KL-Regularized RLVR: Target-Matched Weighted SFT, Finite One-Shot Gaps, and Policy Mirror Descent

cs.LG · 2026-05-04 · unverdicted · novelty 7.0

Reference-sampled weighted SFT with prompt-normalized Boltzmann weights induces the same policy as fixed-reference KL-regularized RLVR, with BOLT as the estimator and a finite one-shot error decomposition separating coverage, variance, and other terms.

citing papers explorer

Showing 2 of 2 citing papers after filters.

Priced Motion Through Optimal Faces: A Normal-Fan Geometry for Non-Stationary Adversarial MDPs cs.LG · 2026-06-27 · unverdicted · none · ref 38
Introduces priced face-crossing via normal-fan geometry on occupancy polytopes to decompose dynamic regret into intrinsic motion cost plus within-face error in non-stationary adversarial MDPs.
Reference-Sampled Boltzmann Projection for KL-Regularized RLVR: Target-Matched Weighted SFT, Finite One-Shot Gaps, and Policy Mirror Descent cs.LG · 2026-05-04 · unverdicted · none · ref 55
Reference-sampled weighted SFT with prompt-normalized Boltzmann weights induces the same policy as fixed-reference KL-regularized RLVR, with BOLT as the estimator and a finite one-shot error decomposition separating coverage, variance, and other terms.

URL http://arxiv.org/abs/ 2105.11066

citation-role summary

citation-polarity summary

fields

years

verdicts

roles

polarities

representative citing papers

citing papers explorer