Mesa-optimization arises when learned models act as optimizers with objectives that can differ from their training loss, creating alignment risks in advanced machine learning.
Thoughts on human models.MIRI, 2019
1 Pith paper cite this work. Polarity classification is still indexing.
1
Pith paper citing it
fields
cs.AI 1years
2019 1verdicts
ACCEPT 1representative citing papers
citing papers explorer
-
Risks from Learned Optimization in Advanced Machine Learning Systems
Mesa-optimization arises when learned models act as optimizers with objectives that can differ from their training loss, creating alignment risks in advanced machine learning.