In Findings of the Association for Computational Linguistics: EMNLP 2024, pages 15503–15514

A reductions approach to risk-sensitive reinforcement learning with optimized certainty equivalents · 2024 · arXiv 2403.06323

3 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

Frictive Policy Optimization for LLMs: Epistemic Intervention, Risk-Sensitive Control, and Reflective Alignment

cs.CL · 2026-04-28 · unverdicted · novelty 7.0

The paper introduces Frictive Policy Optimization as a risk-sensitive epistemic control framework for LLM alignment that treats interventions like clarification, verification, and refusal as explicit actions to improve downstream belief quality rather than immediate rewards.

When Does Trajectory-Level Supervision Permit Efficient Offline Reinforcement Learning?

stat.ML · 2026-06-16 · unverdicted · novelty 6.0

Proposes OPAC for trajectory-level offline RL, achieving Õ(H^{2}√(C_sa(π*)/n)) bounds with a matching lower bound, plus conditions for tractability in generalized nonlinear outcome settings.

Sample Complexity for Markov Decision Processes and Stochastic Optimal Control with Static Risk Measures

math.OC · 2026-04-06 · unverdicted · novelty 4.0

State augmentation allows dynamic programming and sample complexity bounds for MDPs and optimal control under static risk measures including CVaR.
