The paper introduces Frictive Policy Optimization, a risk-sensitive epistemic control framework for LLM alignment. It treats interventions such as clarification, verification, and refusal as explicit actions that aim to improve downstream belief quality rather than immediate reward.
In Findings of the Association for Computational Linguistics: EMNLP 2024, pages 15503–15514
2 Pith papers cite this work. Polarity classification is still indexing.
Citing papers by year: 2026 (2)
Verdicts: unverdicted (2)
citing papers explorer
- Sample Complexity for Markov Decision Processes and Stochastic Optimal Control with Static Risk Measures
State augmentation allows dynamic programming and sample complexity bounds for MDPs and optimal control under static risk measures including CVaR.