Occurrence-level attribution in finite-horizon adaptive learning is defined via a conditional interventional target, shown to be unrecoverable from replay data in general but identifiable in a specific structural class from logged data.
Learning from the right rollouts: Data attribution for PPO-based LLM post-training
2 Pith papers cite this work. Polarity classification is still indexing.
2
Pith papers citing it
years
2026 2representative citing papers
citing papers explorer
-
Data Attribution in Adaptive Learning
Occurrence-level attribution in finite-horizon adaptive learning is defined via a conditional interventional target, shown to be unrecoverable from replay data in general but identifiable in a specific structural class from logged data.
- Prefix Teach, Suffix Fade: Local Teachability Collapse in Strong-to-Weak On-Policy Distillation