In Findings of the Association for Computational Linguistics: EMNLP 2024, pages 15503–15514
2 Pith papers cite this work. Polarity classification is still indexing.
Citing years: 2026 (2 papers). Verdict status: unverdicted (2). Two representative citing papers are listed below.
Citing papers
- Frictive Policy Optimization for LLMs: Epistemic Intervention, Risk-Sensitive Control, and Reflective Alignment
  The paper introduces Frictive Policy Optimization as a risk-sensitive epistemic control framework for LLM alignment that treats interventions like clarification, verification, and refusal as explicit actions to improve downstream belief quality rather than immediate rewards.
- Sample Complexity for Markov Decision Processes and Stochastic Optimal Control with Static Risk Measures
  State augmentation allows dynamic programming and sample complexity bounds for MDPs and optimal control under static risk measures including CVaR.
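The state-augmentation idea in the second abstract can be illustrated with a one-step toy. By the Rockafellar–Uryasev representation, CVaR_alpha(C) = min over y of [ y + E[(C - y)+] / alpha ], so carrying the threshold y as an extra state coordinate turns the static risk objective into one amenable to dynamic programming. The sketch below is not from either paper; the action names, probabilities, costs, and alpha are illustrative.

```python
# Sketch of the static-CVaR idea via threshold augmentation:
# CVaR_a(C) = min_y [ y + E[(C - y)+] / a ]   (Rockafellar-Uryasev form).
# One-step toy decision; actions, probabilities, and costs are made up.

alpha = 0.1  # tail level: average over the worst 10% of outcomes

# action -> list of (probability, cost) outcomes
actions = {
    "safe":  [(1.00, 1.0)],                 # always costs 1
    "risky": [(0.95, 0.0), (0.05, 10.0)],   # cheap on average, heavy tail
}

def expected_excess(action, y):
    """E[(C - y)+], the value carried by the augmented threshold state y."""
    return sum(p * max(c - y, 0.0) for p, c in actions[action])

def cvar(action, grid):
    """CVaR_alpha of the action's cost, minimizing over a threshold grid."""
    return min(y + expected_excess(action, y) / alpha for y in grid)

grid = [0.0, 1.0, 5.0, 10.0]
cvars = {a: cvar(a, grid) for a in actions}
means = {a: sum(p * c for p, c in out) for a, out in actions.items()}

risk_sensitive_choice = min(cvars, key=cvars.get)   # "safe"  (CVaR 1.0 vs 5.0)
risk_neutral_choice = min(means, key=means.get)     # "risky" (mean 0.5 vs 1.0)
```

The grid over y stands in for the augmented state dimension; in the multi-step setting the same threshold is threaded through the Bellman recursion, which is what makes dynamic programming, and hence sample-complexity analysis, possible for static risk measures such as CVaR.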